Solr Search Integration

Last modified by Vincent Massol on 2024/02/26 17:56

 XWiki
 Implementation
 Completed

Description

This has now been implemented by the XWiki Development Team in http://jira.xwiki.org/browse/XWIKI-5676

Project Description

 The idea of the project is to use Apache SOLR as the search engine for XWiki. XWiki is using Lucene as a core component for Wiki Search. Lucene is little hard to configure and doesn't support features like facet search, hit highlighting, customizing search relevancy using boost index out of the box. Solr stands out in its minimal configuration to implement the search engine.Few libraries and a couple of XML configuration files are sufficient to implement a well to do engine. Configuring multiple languages is easy in SOLR compared to Lucene.
          
Using SOLR, one can customize the indexing process by using required analyzers with selected tokenizers and set of filters on the dataset to generate highly customizable relevancy index. Through the front end, the user can select or configure the fields to be searched for and their weight which contributes to the document score. The SOLR component offers flexibility in the following ways.

  • Based on SOLR's schema and complementary information, the indexing process is customizable to only index and store as little information as needed. 
  • Through code customizability (exploiting the possibility of Groovy code in pages), the transformation of a user-query to a SOLR query should be
       adjustable, far beyond the simple text-parsing (enabling, for example, the prohibition of some spaces, or the conversion to multiple fields based on input
       parameters).
  • This component supports calibrating the search engine's parameter (such as the Dismax' parser coefficients, the analyzers usefulness, ...)
       by classical quantitative methods such as precision-and-recall, which any wiki master, or a collaborator, should be able to exploit and report.

Approach

 XWiki-platform-search module is split into api and solr implementation. We can have a single Wiki Search API exposed to xwiki and have multiple components with different implementations of SOLR like Embedded Solr server, a remote solr server hosted on a different source, indexing and querying the database instead of XWiki document objects.. etc..

Solr API

The following are the APIs proposed for Solr which can be used by any one of the XWiki's scripting language. 

Following is the link to the search API. The design is aimed such that it could b used by lucene or solr component.

https://dl.dropbox.com/u/2027256/apidocs/xwiki-platform-search-api/apidocs/index.html

Following is the link to the API of the Embedded Solr Server Implementation.

https://dl.dropbox.com/u/2027256/apidocs/xwiki-platform-search-solrj/apidocs/index.html


 

Get Connected