Solr with Apache Tomcat (http://wiki.apache.org/solr/SolrTomcat)

March 14, 2012

Simple Example Install

Installing Tomcat 6

Apache Tomcat is a web application server for Java servlets. These are instructions for manually installing Tomcat 6 on Linux, recommended because distribution Tomcats are either old or quirky.

Create the solr user. As solr, extract the Tomcat 6.0 download into /opt/tomcat6, hereafter referred to as the $CATALINA_HOME directory.
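
As a rough sketch of the extraction step, the shell function below unpacks a Tomcat tarball so that its contents land directly in the target directory. The function name extract_tomcat and the tarball name in the usage comment are assumptions made for illustration; --strip-components is available in GNU tar and bsdtar.

```shell
# Unpack a Tomcat tarball into the target directory, dropping the tarball's
# single top-level folder so that bin/, conf/, etc. sit directly under it.
extract_tomcat() (
  tarball="$1"
  catalina_home="$2"
  mkdir -p "$catalina_home" || exit 1
  tar -xzf "$tarball" -C "$catalina_home" --strip-components=1
)

# Usage, run as the solr user (the tarball name is hypothetical):
#   extract_tomcat apache-tomcat-6.0.x.tar.gz /opt/tomcat6
```

After this, $CATALINA_HOME/bin/catalina.sh should exist if the tarball had the usual layout.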

Edit $CATALINA_HOME/conf/tomcat-users.xml to enable the manager login as user “tomcat” with password “tomcat” (insecure):

 

<role rolename="manager"/>
<role rolename="admin"/>
<user username="tomcat" password="tomcat" roles="manager,admin"/>

Start Tomcat with $CATALINA_HOME/bin/catalina.sh run. Tomcat runs on the port defined in $CATALINA_HOME/conf/server.xml, which defaults to port 8080.

The startup script tomcat6 can be placed in /etc/init.d/tomcat6 on CentOS/RedHat/Fedora so that you can start Tomcat using service tomcat6 start. Use chkconfig to enable the tomcat6 service to start on boot.

Building Solr

Skip this section if you have a binary distribution of Solr. These instructions are for building Solr from source, for example if you have a nightly tarball or have checked out the trunk from Subversion at http://svn.apache.org/repos/asf/lucene/dev/trunk. They assume that JDK 1.6 is already installed.

In the source directory, run ant dist to build the .war file under dist. Build the example for the Solr tutorial by running ant example. Change to the example directory, run java -jar start.jar and visit localhost:8983/solr/admin to test that the example works with the Jetty container.

Installing Solr instances under Tomcat

Assuming that Solr and its example are built, this is how to install the Solr example as an instance under Tomcat.

Copy the example/solr directory from the source to an installation directory such as /opt/solr/example/solr, hereafter $SOLR_HOME. Copy the .war file dist/apache-solr-*.war into $SOLR_HOME as solr.war.
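
The two copy steps can be sketched as a small shell function. The function name install_solr_home and its argument order are assumptions for illustration; the paths in the usage comment are the ones used in this walkthrough, except the source directory, which is hypothetical.

```shell
# Copy the example Solr home and the built .war into an installation directory.
# Arguments: <Solr source root> <target Solr home>
install_solr_home() (
  src_root="$1"
  solr_home="$2"

  mkdir -p "$solr_home" || exit 1
  # "example/solr/." copies the directory's contents, including conf/.
  cp -R "$src_root/example/solr/." "$solr_home/" || exit 1

  # Expand the wildcard; if several .war files match, the first one is used.
  set -- "$src_root"/dist/apache-solr-*.war
  if [ ! -f "$1" ]; then
    echo "no apache-solr-*.war found under $src_root/dist" >&2
    exit 1
  fi
  cp "$1" "$solr_home/solr.war"
)

# Usage (the source directory is hypothetical):
#   install_solr_home /path/to/solr-source /opt/solr/example/solr
```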

The configuration file $SOLR_HOME/conf/solrconfig.xml in the example sets dataDir for the index to ./solr/data, relative to the current working directory. That is correct when running the Jetty server provided with the example, but incorrect for Tomcat running as a service. Modify dataDir to specify the full path to $SOLR_HOME/data:

  •   <dataDir>${solr.data.dir:/opt/solr/example/solr/data}</dataDir>

The dataDir can also be temporarily overridden with the JAVA_OPTS environment variable prior to starting Tomcat:

  •   export JAVA_OPTS="$JAVA_OPTS -Dsolr.data.dir=/opt/solr/example/solr/data"

Create a Tomcat Context fragment to point docBase to the $SOLR_HOME/solr.war file and solr/home to $SOLR_HOME:

 

<?xml version="1.0" encoding="utf-8"?>
<Context docBase="/opt/solr/example/solr/solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="/opt/solr/example/solr" override="true"/>
</Context>

Symlink or place the file in $CATALINA_HOME/conf/Catalina/localhost/solr-example.xml, where Tomcat will automatically pick it up. Tomcat deletes the file on undeploy (which happens automatically if the configuration is invalid).

Repeat the above steps with different installation directories to run multiple instances of Solr side-by-side.
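
When repeating these steps for several instances, writing the Context fragment by hand gets tedious. The function below is a minimal sketch that generates the same fragment shown above for a given instance name. The function name write_context_fragment, its argument order, and the instance names in the usage comment are assumptions for illustration. Paths are inserted as-is, so they must not contain XML special characters such as & or ".

```shell
# Write a Tomcat Context fragment for one Solr instance.
# Arguments: <context name> <path to solr.war> <Solr home> <CATALINA_HOME>
write_context_fragment() (
  name="$1"
  war="$2"
  solr_home="$3"
  catalina_home="$4"

  fragment_dir="$catalina_home/conf/Catalina/localhost"
  mkdir -p "$fragment_dir" || exit 1

  # The file name (without .xml) becomes the context path, e.g. /solr1.
  cat > "$fragment_dir/$name.xml" <<EOF
<?xml version="1.0" encoding="utf-8"?>
<Context docBase="$war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="$solr_home" override="true"/>
</Context>
EOF
)

# Usage for two side-by-side instances (names and paths are hypothetical):
#   write_context_fragment solr1 /opt/solr/one/solr/solr.war /opt/solr/one/solr /opt/tomcat6
#   write_context_fragment solr2 /opt/solr/two/solr/solr.war /opt/solr/two/solr /opt/tomcat6
```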

If Tomcat is not already running, start it with service tomcat6 start or $CATALINA_HOME/bin/startup.sh. The Solr admin should be available at http://<host>:8080/solr-example/admin.

Single Solr Instance

If you are sure that you will only ever run one instance of Solr, you can do away with the Context fragment by placing the .war in $CATALINA_HOME/webapps/solr-example.war and setting the Solr home through a global environment variable prior to starting Tomcat:

 

export JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=/opt/solr/example"

Troubleshooting

Login to Tomcat Management page does not work

$CATALINA_HOME/conf/tomcat-users.xml may be missing the correct user line.

Tomcat Manager does not list Solr

The Context fragment may be invalid. Examine $CATALINA_HOME/logs/catalina.out.

Exceptions when visiting Solr admin

View $CATALINA_HOME/logs/catalina.out for a better view of the exceptions. Probably caused by an incorrect path in solrconfig.xml or the Context fragment, or by an unclean build (run ant clean and rebuild the source).

HTTP 500 error

If, when installing Solr 3.5, you get an HTTP 500 error and the exception message begins with

org.apache.solr.common.SolrException: Error loading class 'solr.VelocityResponseWriter' at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:389)...

the problem is caused by incorrect <lib> locations in $SOLR_HOME/conf/solrconfig.xml and can be fixed by reading through this note in the mailing list archive: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201112.mbox/%3C7FA7F0B0-935A-4D01-A389-DB3B7EDA0464@gmail.com%3E

Supplementary note on this problem

The mailing list note explains that, to avoid this error on Tomcat 6.0, you can do one of three things: comment out <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter" enable="${solr.velocity.enabled:true}"/>; change ${solr.velocity.enabled:true} to ${solr.velocity.enabled:false}; or correct the relative directories in the solrconfig.xml entries such as <lib dir="../../contrib/extraction/lib" /> and <lib dir="../../contrib/clustering/lib" />. In most cases solrconfig.xml was copied from example\solr, so the directory locations in it no longer match the actual layout. When editing them, note that a path such as dir="../../contrib/extraction/lib" is resolved relative to the Solr instance directory ($SOLR_HOME), not relative to $SOLR_HOME/conf/ or to the solrconfig.xml file itself.

Optional Configuration

Logging

For information about controlling JDK logging (also known as java.util.logging) in Tomcat, consult the Tomcat docs: http://tomcat.apache.org/tomcat-6.0-doc/logging.html

URI Charset Config

If you are going to query Solr using international characters (>127) using HTTP-GET, you must configure Tomcat to conform to the URI standard by accepting percent-encoded UTF-8.

Edit Tomcat’s conf/server.xml and add the following attribute to the correct Connector element: URIEncoding="UTF-8".

 

<Server ...>
 <Service ...>
   <Connector ... URIEncoding="UTF-8"/>
 </Service>
</Server>

This is only an issue when sending non-ASCII characters in a query request. No configuration is needed for Solr/Tomcat to return non-ASCII characters in a response, or to accept non-ASCII characters in an HTTP-POST body.
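
To make "percent-encoded UTF-8" concrete, here is a minimal shell sketch that encodes a string byte by byte. The function name percent_encode_utf8 is an assumption, and so is the select URL in the usage comment. The sketch encodes every byte, including plain ASCII letters, which is more verbose than necessary but still valid percent-encoding.

```shell
# Percent-encode a string byte by byte (every byte becomes %XX).
percent_encode_utf8() (
  printf '%s' "$1" |
    od -An -v -tx1 |          # hex-dump the raw bytes, e.g. " c3 a9"
    tr -d ' \t\n' |           # join them into one run of hex digits
    sed 's/\(..\)/%\1/g' |    # prefix each byte with '%'
    tr 'a-f' 'A-F'            # upper-case the hex digits
  echo
)

# In a UTF-8 terminal, "é" is the two bytes C3 A9:
#   percent_encode_utf8 'é'    # prints %C3%A9
#
# Usage in a GET query (host, port and handler path are assumptions):
#   curl "http://localhost:8080/solr-example/select?q=$(percent_encode_utf8 'café')"
```

With URIEncoding="UTF-8" set on the Connector, Tomcat decodes %C3%A9 back to a single "é" rather than two Latin-1 characters.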

Configuring Solr Home with JNDI

A Tomcat context fragment can be used to configure the JNDI property needed to specify your Solr Home directory.

Just put a context fragment file under $CATALINA_HOME/conf/Catalina/localhost that looks something like this:

$ cat /tomcat55/conf/Catalina/localhost/solr.xml

 

<Context docBase="/some/path/solr.war" debug="0" crossContext="true" >
   <Environment name="solr/home" type="java.lang.String" value="/my/solr/home" override="true" />
</Context>

A few things to keep in mind:

  • The “conf/Catalina/localhost” directory may not exist by default in your installation. You may have to create it.
  • “/some/path/solr.war” is the absolute path to wherever you want to keep the Solr war, using the appropriate syntax for your Operating System. In Tomcat 5.5 and later, the war file must be stored outside of the webapps directory for this to work. Otherwise, this entire Context element is ignored.
  • “/my/solr/home” should be the path to where you have created your Solr Home directory, using the appropriate syntax for your Operating System.
  • Prior to Tomcat 5.5, a “path” attribute was required for Context elements. Starting with 5.5, the path attribute must not be used except when statically defining a Context in server.xml, because the path is inferred from the Context fragment filename.

Enabling Longer Query Requests

If you try to submit too long a GET query to Solr, then Tomcat will reject your HTTP request on the grounds that the HTTP header is too large; symptoms may include an HTTP 400 Bad Request error or (if you execute the query in a web browser) a blank browser window.

If you need to enable longer queries, you can set the maxHttpHeaderSize attribute on the HTTP Connector element in your server.xml file. According to the Tomcat 5.5 documentation, the default value is 4 KB (see http://tomcat.apache.org/tomcat-5.5-doc/config/http.html).
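
Before raising the limit, it can help to check how large a query URL actually is. The function below is a rough sketch that compares a URL's size in bytes against a limit, defaulting to the 4 KB figure quoted above. The function name url_fits_header_limit is an assumption, and the check is only a lower bound, because the header buffer also has to hold the other request headers.

```shell
# Succeed (exit status 0) when the URL is smaller than the limit in bytes.
# Arguments: <url> [limit in bytes, default 4096]
url_fits_header_limit() (
  url="$1"
  limit="${2:-4096}"
  # wc -c counts bytes, so multi-byte characters are measured correctly.
  bytes=$(( $(printf '%s' "$url" | wc -c) ))
  [ "$bytes" -lt "$limit" ]
)

# Usage (the URL is hypothetical):
#   if ! url_fits_header_limit "http://localhost:8080/solr-example/select?q=..."; then
#     echo "query too long for a GET request; raise maxHttpHeaderSize or use POST" >&2
#   fi
```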

 

Multiple Solr Webapps

Tomcat context fragments make configuring multiple Solr webapps (with JNDI) in a single Tomcat server easy.

Just follow the previous instructions for “Configuring Solr Home with JNDI” to create a separate context fragment file under $CATALINA_HOME/conf/Catalina/localhost for each solr webapp you want to run:

$ cat /tomcat55/conf/Catalina/localhost/solr1.xml

 

<Context docBase="/some/path/solr.war" debug="0" crossContext="true" >
   <Environment name="solr/home" type="java.lang.String" value="/some/path/solr1home" override="true" />
</Context>

$ cat /tomcat55/conf/Catalina/localhost/solr2.xml

 

<Context docBase="f:/solr.war" debug="0" crossContext="true" >
   <Environment name="solr/home" type="java.lang.String" value="/some/path/solr2home" override="true" />
</Context>

Don’t put anything related to Solr under the webapps directory.

The solr home directories are configured via JNDI in the context fragment, and in the examples above will be /some/path/solr1home and /some/path/solr2home. The URLs to the two webapps will be http://host:port/solr1 and http://host:port/solr2.

Tomcat on Windows

Single Solr app

  • Download and install Tomcat for Windows using the MSI installer. Install it with the tcnative.dll file. Say you installed it in c:\tomcat\

  • Check if Tomcat is installed correctly by going to http://localhost:8080/

  • Change the c:\tomcat\conf\server.xml file to add the URIEncoding Connector element as shown above.
  • Download and unzip the Solr distribution zip file into (say) c:\temp\solrZip\
  • Create the “solr home” directory where you intend the application server to work, say c:\web\solr\
  • Copy the contents of the example\solr directory c:\temp\solrZip\example\solr\ to c:\web\solr\
  • Stop the Tomcat service
  • Copy the *solr*.war file from c:\temp\solrZip\dist\ to the Tomcat webapps directory c:\tomcat\webapps\
  • Rename the *solr*.war file solr.war
  • Configure Tomcat to recognize the solr home directory you created, by adding the Java Options -Dsolr.solr.home=c:\web\solr and -Dsolr.velocity.enabled=false
    • either use the system tray icon to add the java option
    • or manually edit the environment script c:\tomcat\bin\setenv.bat and add it to JAVA_OPTS
  • Note: for Tomcat 7 and Solr 3.4 (the latest version as of 2011-09-23), the setenv.bat option above may not work. In that case, skip it and put this code fragment
    in $CATALINA_HOME/conf/Catalina/localhost/solr.xml instead:

    <?xml version="1.0" encoding="UTF-8"?>
    <Context docBase="C:\apache-tomcat-7.0.21\webapps\solr.war" debug="0" crossContext="true" >
        <Environment name="solr/home" type="java.lang.String" value="C:\solr" override="true" />
    </Context>
  • Start the Tomcat service
  • Go to the solr admin page to verify that the installation is working. It will be at http://localhost:8080/solr/admin

Multiple Solr apps

  • Download and install Tomcat for Windows using the MSI installer. Install it with the tcnative.dll file. Say you installed it in c:\tomcat\

  • Check if Tomcat is installed correctly by going to http://localhost:8080/

  • Change the c:\tomcat\conf\server.xml file to add the URIEncoding Connector element as shown above.
  • Download and unzip the Solr distribution zip file into (say) c:\temp\solrZip\
  • Say you need two apps in c:\web\solr1 and c:\web\solr2; create these two directories
  • Copy the contents of the example\solr directory c:\temp\solrZip\example\solr\ to c:\web\solr1\ and to c:\web\solr2\
  • Stop the Tomcat service
  • Copy the *solr*.war file from c:\temp\solrZip\dist\ to the Tomcat lib directory c:\tomcat\lib\
  • Rename the *solr*.war file solr.war
  • Make a new text file in c:\tomcat\conf\Catalina\localhost called solr1.xml with the following code fragment
    <Context docBase="c:\tomcat\lib\solr.war" debug="0" crossContext="true" >
       <Environment name="solr/home" type="java.lang.String" value="c:\web\solr1" override="true" />
    </Context>
  • Make a new text file in c:\tomcat\conf\Catalina\localhost called solr2.xml with the following code fragment
    <Context docBase="c:\tomcat\lib\solr.war" debug="0" crossContext="true" >
       <Environment name="solr/home" type="java.lang.String" value="c:\web\solr2" override="true" />
    </Context>
  • Start the Tomcat service
  • Go to the solr admin pages for the 2 webapps to verify that the installation is working. They will be at http://localhost:8080/solr1/admin and http://localhost:8080/solr2/admin

64-bit Note

The MSI installer that installs Tomcat as a Windows service isn’t prepared to support 64-bit Windows out of the box. There are some straightforward workarounds, though. See http://stackoverflow.com/questions/211446/how-to-run-tomcat-6-on-winxp-64-bit

TODO

Indicate how to index in tomcat (rather than built-in jetty support via start.jar).

External Resources

http://www.ibm.com/developerworks/java/library/j-solr1/

Troubleshooting Errors

You may get an error like the following:

 

SEVERE: Exception starting filter SolrRequestFilter
java.lang.NoClassDefFoundError: Could not initialize class org.apache.solr.core.SolrConfig
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:76)
.........
Caused by: java.lang.RuntimeException: XPathFactory#newInstance() failed to create an XPathFactory for the default object model: http://java.sun.com/jaxp/xpath/dom with the XPathFactoryConfigurationException: javax.xml.xpath.XPathFactoryConfigurationException: No XPathFctory implementation found for the object model: http://java.sun.com/jaxp/xpath/dom
        at javax.xml.xpath.XPathFactory.newInstance(Unknown Source)

This happens because your Tomcat instance does not have the Xalan jar file on its classpath. It took me some digging to find this, and I thought it might be useful for others. The location varies from distribution to distribution, but I essentially just added the jar file (via a symlink) to the shared/lib directory under the Tomcat directory.
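
The symlink step might look like the sketch below. The function name link_jar_into_shared_lib is an assumption, and the jar path in the usage comment is hypothetical because the real location depends on your distribution.

```shell
# Symlink a jar into shared/lib under the Tomcat directory.
# Arguments: <absolute path to jar> <CATALINA_HOME>
link_jar_into_shared_lib() (
  jar="$1"
  catalina_home="$2"
  mkdir -p "$catalina_home/shared/lib" || exit 1
  ln -s "$jar" "$catalina_home/shared/lib/$(basename "$jar")"
)

# Usage (the jar location is hypothetical):
#   link_jar_into_shared_lib /usr/share/java/xalan.jar /opt/tomcat6
```

Restart Tomcat afterwards so the jar is picked up.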

Lucene.Net ultra fast search for MVC or WebForms site => made easy!

March 7, 2012

Introduction

Have you ever heard of Lucene.Net? If not, let me introduce it briefly.

Lucene.Net is a line-by-line port of the popular Apache Lucene, a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially one where you want something close to Google search results: not just search results, but very fast search results, or maybe just insanely fast search results, but only in your app and on your terms!

So, while it is technically possible, though somewhat challenging, to integrate the original Apache Lucene into your .NET application, and it will give you insanely fast search, it will take quite a while and will probably force you to cut corners here and there, making your site way too complex and error prone. So, unless you absolutely have to have the fastest search on the planet (and beat Google along the way), you shouldn’t go this way, as for the majority of .NET applications Lucene.Net will be madly fast anyway.

The main purpose of Lucene.Net is to be easy to integrate into any .NET application and to provide most of the speed and flexibility of the original Java-based library. And it does that pretty well! You will learn that in this article. Even the original Apache Lucene documentation applies to Lucene.Net 99% of the time!

You might ask: “why bother with Lucene.Net? My SQL Server returns search results pretty fast anyway…”. Yeah I thought that too, until I tried Lucene.Net.

First of all, I discovered that Lucene is still faster than a SQL query. And it stays fast when searching for some text or phrase, no matter how many words are in your search. For example, suppose you search for a sentence of five different words in some text or description and want the results ordered by relevance, in the same way that major web search engines do. How would you do it in SQL? Or .NET? Your code might get very complex, and your search queries long and complicated. That can add up to a slow, turtle-like kind of search.

The good news is that Lucene.Net solves most of those problems for you! No need to write complicated search logic anymore! All you need to do is integrate it correctly into your application! And that is what this article is about!

So, if you are interested in trying Lucene.Net for your .NET web site or application, continue reading, and prepare to embrace some love for Lucene!

Small disclaimer, sort of:

I am not an expert in Lucene, and this article is not only about Lucene; it is rather about how to make it work in your .NET app/site. Hence, there will not be any advanced Lucene topics covered (at least initially), only what is needed to get it working.

Refactoring: Improving the Design of Existing Code.

March 7, 2012

Many programmers seem to get caught up on the idea of refactoring.

Most of us are familiar with the Boy Scout rule which says:

Always leave code better than when you found it

But do you actually apply it in your day to day work?

I’ve found that for myself the answer to this question is sometimes “no.”

I was first influenced on refactoring when I read Martin Fowler’s book Refactoring: Improving the Design of Existing Code. I have since realized how important it becomes when we maintain code. I feel lucky to have come across it; I was introduced to this book by my former project manager, Mr. James Edward. He gave me one important piece of advice before he left the company:

“Remember always break into smaller piece as much as you can.”

Why we don’t follow the rule

Personally, I know that there are many reasons why I have failed to follow the Boy Scout rule in my own day to day coding activities.

How often do you say to yourself something like:

“Yeah, I’ve just got to get this code checked in.  I don’t have time to clean it up.”

Or

“Refactoring this correctly would take too long, and I want to make sure I do it right.”

At first glance, these seem like perfectly valid excuses, but the problem is that the cumulative effect of this kind of thinking is exactly what causes code rot.

“Code rot” is when code from your application begins to become brittle and hard to maintain.

As software developers we should really strive to prevent code rot, and regularly refactoring and cleaning up code is like brushing the teeth of our application.

There are definitely a large number of excuses I come up with for not refactoring code, but I would say that the number one mental block is this idea of perfection and needing to do it right.

Small refactorings are good!

One thing I try to tell myself is that small refactorings are good and I don’t need to solve the whole problem all at once.

We shouldn’t let the fact that we can’t completely clean up a section of code, or refactor it to the final structure we want, prevent us from putting that code on a bus headed in that direction.

Many programmers tend to have the perfect solution mindset which requires us to find the 100% best solution to a problem and think that the 95% effective one is no good.

This mindset can be a huge stumbling block to productivity and it can also be a big hindrance to keeping our campsite clean.

It is often helpful to embrace that a series of small changes can be more beneficial than one large change of the same resulting magnitude, even if the small changes end up requiring more total work.

The reason for this is two-fold:

  1. Big changes tend to be put off, so they rarely actually get done.
  2. Small changes usually are more natural and evolve the code in an organically correct direction.

Going backwards to go forwards

I even find that many times I take one step backwards in order to go two forwards.  Refactoring sometimes has to just progress naturally as you make something clearer, only to undo it a bit later with another change that ends up making more sense once you can actually see the problem being solved more clearly.

Think about solving a Rubik’s Cube.  If you have ever attempted to solve one of these things, you know that sometimes you have to wreck that perfect wall of green in order to start getting the white blocks in place.  Many times it is impossible to solve a Rubik’s Cube without traversing backwards in the solution first.

The point is, don’t be afraid to get out there and make something clearer, or go in a direction that seems like it will at least improve the code.

You don’t have to find the perfect design pattern to refactor the code into in order to start making changes.

  • Start by changing this variable name to be a bit more clear.
  • Extract those lines of code into a method that makes that functionality more clear.
  • Get rid of some duplication here and there.
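
Here is a tiny sketch of what those three small steps can look like together, using a shell script as the example. The "before" lines, the function name prepare_instance_dir, and the directories are all made up for illustration.

```shell
# Before: the same steps are repeated, and "d" says nothing about its contents.
#   d=/tmp/solr1; mkdir -p "$d/data"; echo "prepared $d"
#   d=/tmp/solr2; mkdir -p "$d/data"; echo "prepared $d"

# After: a descriptive variable name, one extracted function, no duplication.
prepare_instance_dir() (
  instance_dir="$1"
  mkdir -p "$instance_dir/data" || exit 1
  echo "prepared $instance_dir"
)

# Usage:
#   for instance_dir in /tmp/solr1 /tmp/solr2; do
#     prepare_instance_dir "$instance_dir"
#   done
```

None of these changes is the final design, but each one leaves the script a little easier to read than before.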

The active code reader

When I am in my “zone” and doing things right, I am even refactoring code while I am reading it.

There is no better way to understand some code than to refactor it.

Think about it, how do we learn?

We read something or are taught it and then we rephrase it differently to confirm understanding.

“Let me get this straight, are you saying… blah blah blah?”

“Oh, now I get it, if I do blah and blah then blah blah?”

Why shouldn’t we do this with code?

I know some of you are really scared by this idea, and you’re saying “nope, don’t just go touching code you don’t understand!  You are not getting anywhere near my code base.”

But, give it a shot, what is the worst that is going to happen?  You are going to refactor yourself into a dead end and have to revert your changes?

More likely than not, you will end up learning the code better and improving it.  Two for one!

One more analogy then I’m done

I promise!

Ever solved a crossword puzzle?

Did you sit there and immediately fill in all the answers one by one?

Perhaps you filled in some answers that you knew.  Perhaps the short ones first, then you went back over all the clues again and suddenly found that with some letters filled in you could better guess the clues.

Most likely you made several sweeps like this until you finally solved the puzzle or gave up in disgust wondering why you wasted an hour of your life and who the heck studies books on geography that could actually solve this puzzle.

Do you think it would be any different with code?  Making small refactorings is like filling in the clues you know the answer to in a crossword puzzle.  As you keep refactoring, the answers to other puzzles about the code and which way it should go become clearer and clearer.

Don’t try and solve the whole puzzle one by one in a single pass.

Where Am I?

You are currently viewing the archives for March, 2012 at Naik Vinay.