EMu with Solr indexing

EMu 8.0 introduces an indexing method that allows Apache Solr to be used for searching in place of the default Texpress-based indexing. When Solr indexing is enabled, Solr does not replace Texpress as the underlying database engine used by EMu; it replaces only the Texpress searching component. Texpress effectively outsources searching to Solr, and Solr passes the results back to Texpress for distribution. All other database activities, including record insertion, modification and deletion, as well as record locking and sorting, continue to be performed by Texpress. The image below shows the relationship between EMu, Texpress and Solr when Solr indexing is enabled:

EMu relationship diagram

Solr indexing is optional and can be set on a per-module basis. The default EMu installation continues to use Texpress indexing, as it provides a robust and well-tested platform, but it is anticipated that many institutions will adopt Solr indexing given the advantages it provides over Texpress indexing, including:

  • significantly lower disk usage;
  • fast range-based searching for numeric, date, time, latitude and longitude values;
  • fast wildcard (pattern) searching without the need for partial indexes;
  • elimination of false matches (the number of matching records is always correct);
  • no need for nightly / weekly maintenance to rebuild indexes;
  • no configured limit on the number of records in a module; and
  • no bit-slicing phase when a re-index is performed.
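
To make the range and wildcard points above concrete, here is a minimal sketch of how such searches look in Solr's standard query syntax. It illustrates Solr itself, not how EMu builds its queries internally; the field names (`latitude`, `surname`) and the helper functions are hypothetical and are not taken from EMu.

```python
from urllib.parse import urlencode


def range_query(field, low, high):
    """Build an inclusive Solr range clause: field:[low TO high]."""
    return f"{field}:[{low} TO {high}]"


def wildcard_query(field, pattern):
    """Build a Solr wildcard clause.

    In Solr's standard query parser, * matches any run of characters
    and ? matches a single character.
    """
    return f"{field}:{pattern}"


def select_params(query, rows=10):
    """URL-encode the parameters for a Solr /select request."""
    return urlencode({"q": query, "rows": rows})


# Hypothetical field names, for illustration only.
lat_clause = range_query("latitude", -40, -30)
name_clause = wildcard_query("surname", "Smi*")
params = select_params(f"{lat_clause} AND {name_clause}")
```

The resulting `params` string would be appended to a Solr core's `/select` URL; the core name and host depend on the installation and are deliberately left out here.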

Note: Solr typically provides faster and more efficient searches than Texpress; however, where a search matches millions of records, performance may be slower. A query that is taking too long can be aborted, in which case the records matched up to the point of the abort are returned.
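
Solr itself offers a comparable mechanism through its `timeAllowed` request parameter (in milliseconds): when the limit is reached, Solr returns the matches found so far and flags the response with `partialResults` in the response header. Whether EMu's abort option is implemented this way is an assumption, not something stated here; the sketch below only shows the Solr side, and the sample response is made up for illustration.

```python
from urllib.parse import urlencode


def timed_select_params(query, time_allowed_ms):
    """URL-encode a Solr query with a time limit in milliseconds."""
    return urlencode({"q": query, "timeAllowed": time_allowed_ms})


def is_partial(response):
    """Return True if a parsed Solr JSON response was cut short.

    Solr sets responseHeader.partialResults to true when timeAllowed
    expired before the search finished.
    """
    header = response.get("responseHeader", {})
    return bool(header.get("partialResults", False))


# Hypothetical parsed response, for illustration only.
sample_response = {
    "responseHeader": {"status": 0, "partialResults": True},
    "response": {"numFound": 1200, "docs": []},
}
params = timed_select_params("*:*", 5000)
cut_short = is_partial(sample_response)
```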

Read on: