Hobione's Weblog

Living & Breathing in Web 2.0 Era

Regain search engine on Glassfish app server

“Regain is a Java search engine based on Jakarta Lucene. It provides indexing and searching files for plenty of formats (currently HTML, XML, Excel, Powerpoint, Word, PDF and RTF). A TagLibrary eases integrating search results in your JSP based web page.” You may download the production version from http://regain.sourceforge.net/ or the beta version from http://www.assembla.com/spaces/regain2/documents.

Our NAS division’s intranet web site needed a search feature along the lines of Google search. After doing some research, I came across the Regain search engine. It is pretty simple to install on a server (a Solaris box in our case). Regain does not search a database directly, but if a page has a link that goes to database-backed content, Regain will crawl that link and store the file’s metadata in the index. Hibernate Search or Lucene can do full-scale database search; I have not used them for that yet, but that’s what I hear.

Here are the steps to install Regain on Glassfish (first unzip regain_v1.5.0-preview-80717-1556_server.zip):

  1. Create a directory under ../SUNWappserver/domains/domain1
    1. Name it regain
    2. Copy the crawler directory from c:\regain\runtime (where the zip was extracted) and paste it into domain1/regain/
  2. Under the crawler directory, you will find CrawlerConfiguration.xml
    1. Modify the xml file with your domain name etc.
    2. Create a directory called searchindex under crawler
  3. Now the fun part: building the index
    1. Change directory (cd) to domain1/regain/crawler
    2. Run java -jar regain-crawler.jar, or java -jar regain-crawler.jar --help to see more options
  4. Copy the conf directory, its subdirectories and the xml file (all three --> conf\regain\SearchConfiguration.xml) from the downloaded zip file (found under regain\runtime\search) and paste them directly under domain1/application
  5. Modify the SearchConfiguration.xml file, mainly lines 74, 80 and 83
  6. Deploy the regain.war file via the app server’s beautiful admin GUI.
  7. Modify the web.xml (domain1\applications\j2ee-modules\regain\WEB-INF). You have to tell the regain webapp where to look for the search configuration file:
<!-- The location of the configuration file -->
<context-param>
  <param-name>searchConfigFile</param-name>
  <param-value>../conf/regain/SearchConfiguration.xml</param-value>
</context-param>

The regain webapp’s web.xml points to SearchConfiguration.xml, and SearchConfiguration.xml knows where to find the searchindex directory that the query results come from (three steps in all).
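If search comes up empty, the first link in that chain is worth checking. Here is a minimal sketch, using only the JDK's built-in XML parser, that reads a context-param out of a web.xml and prints its value. The class name WebXmlParam, the helper readContextParam and the trimmed-down SAMPLE descriptor are made up for illustration; they are not part of Regain or Glassfish.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class WebXmlParam {

    // Hypothetical, trimmed-down web.xml holding only the context-param from step 7.
    static final String SAMPLE =
        "<web-app>"
      + "<context-param>"
      + "<param-name>searchConfigFile</param-name>"
      + "<param-value>../conf/regain/SearchConfiguration.xml</param-value>"
      + "</context-param>"
      + "</web-app>";

    // Returns the value of the named context-param, or null if it is not declared.
    static String readContextParam(String webXml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(webXml.getBytes(StandardCharsets.UTF_8)));
        NodeList params = doc.getElementsByTagName("context-param");
        for (int i = 0; i < params.getLength(); i++) {
            Element param = (Element) params.item(i);
            String paramName =
                param.getElementsByTagName("param-name").item(0).getTextContent().trim();
            if (name.equals(paramName)) {
                return param.getElementsByTagName("param-value").item(0).getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Prints the path the webapp would be told to load its search configuration from.
        System.out.println(readContextParam(SAMPLE, "searchConfigFile"));
    }
}
```

To check a deployed copy, you would read the real web.xml into a string instead of SAMPLE. A null result would mean the searchConfigFile parameter is not declared at all.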

  8. Open a browser and go to http://yourdomainname/contextName, e.g. http://axous2.abc.aaa.info/search
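Before running the crawler, it can help to confirm that steps 1 and 2 left the files where the later steps expect them. The sketch below only checks the paths named in those steps (regain/crawler/regain-crawler.jar, CrawlerConfiguration.xml and the searchindex directory) relative to a domain directory you pass in. The class name RegainLayoutCheck and its missing helper are hypothetical, and the check says nothing about whether the configuration contents are correct.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class RegainLayoutCheck {

    // Returns the expected paths (relative to the domain directory) that do not exist.
    static List<String> missing(Path domainDir) {
        String[] expected = {
            "regain/crawler/regain-crawler.jar",
            "regain/crawler/CrawlerConfiguration.xml",
            "regain/crawler/searchindex"
        };
        List<String> result = new ArrayList<>();
        for (String rel : expected) {
            if (!Files.exists(domainDir.resolve(rel))) {
                result.add(rel);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The first argument is the path to domain1; defaults to the current directory.
        Path domainDir = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> gaps = missing(domainDir);
        if (gaps.isEmpty()) {
            System.out.println("Crawler layout looks complete.");
        } else {
            for (String rel : gaps) {
                System.out.println("Missing: " + rel);
            }
        }
    }
}
```

Run it with the path to domain1 as the only argument. Anything it lists as missing points back to step 1 or step 2.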

Happy searching!

Home page: http://regain.sourceforge.net/
Open Source Full Text Search Engines Written In Java: http://www.manageability.org/blog/stuff/full-text-lucene-jxta-search-engine-java-xml


August 1, 2008 Posted by | GlassFish, Search Engine | 24 Comments