General Solr Features

  • Based on Apache Lucene Search Library
  • Complete query capabilities, such as keyword, Boolean and +/- queries. Also supports proximity searches, wildcards, fielded searches, term/field/document weights, find-similar, spell-checking, auto-suggest, multi-lingual and more.
  • Sort by relevancy, date, or any field. Strong out-of-the-box relevancy ranking means little custom tweaking is needed except in extreme cases.
  • Query term highlighting in result sets.
  • Faceting of results and control over number and types of facets displayed.
  • Scalable, with low-overhead indexes, rapid incremental indexing, and distributed search. Index replication reduces system downtime.
  • HTML-based administration interface (LucidWorks builds on this, however).
  • Open interfaces based on standards such as XML, JSON, CSV and HTTP to load data and get results.
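
As a rough illustration of the HTTP interface described above, the sketch below assembles a search URL that combines a +/- keyword query with sorting, highlighting and faceting. The parameter names (`q`, `sort`, `wt`, `hl`, `hl.fl`, `facet`, `facet.field`) are standard Solr request parameters; the host, the `collection1` core name and the `category` and `body` field names are assumptions made for this example and will differ per installation.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical host and core name; adjust for a real deployment.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"


def build_search_url(keywords, sort_field="score", facet_field=None, highlight_field=None):
    """Assemble a Solr select URL; this only builds the string, it sends nothing."""
    params = [
        ("q", keywords),                 # keyword / Boolean / +/- query text
        ("sort", f"{sort_field} desc"),  # relevancy by default, or any sortable field
        ("wt", "json"),                  # ask for a JSON response
    ]
    if highlight_field:
        params += [("hl", "true"), ("hl.fl", highlight_field)]
    if facet_field:
        params += [("facet", "true"), ("facet.field", facet_field)]
    return SOLR_SELECT_URL + "?" + urlencode(params)


# "category" and "body" are illustrative field names, not part of any real schema.
url = build_search_url('+solr -"legacy search"', facet_field="category", highlight_field="body")
print(url)
```

Fetching the resulting URL with any HTTP client should return the matching documents along with highlighting and facet sections, provided the schema actually defines the fields named here.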

Other features have been added over successive releases.

System Configuration and Setup

Feature Solr 1.4.1 Solr 3.6 Solr 4.0
Wizard-based installation
Built-in version-to-version migration tools
REST API for configuration

Connectors

Feature Solr 1.4.1 Solr 3.6 Solr 4.0
Websites
SolrXML
Local filesystems
SMB (Windows Shares)
HDFS, S3
JDBC
SharePoint
Load CSV
Post JSON
Create custom connectors

Indexing Content

Feature Solr 1.4.1 Solr 3.6 Solr 4.0
UI for configuration of data sources
UI dashboard for monitoring crawl and index processes
Scheduling content indexing
Transform XML Solr docs via XSLT for indexing
Various fixes for multi-threaded DIH
Near-Realtime
New Lucene index format
Batch crawling
Embedded Luke to analyze indexes

Index Management

Feature Solr 1.4.1 Solr 3.6 Solr 4.0
UI for adding and updating fields
TermsComponent for access to indexed terms
Lockless Writes (DocumentsWriterPerThread)
Automaton based auto-complete/suggest
Data-specific indexing codecs
Per-field similarity definition

Query Handling

Feature Solr 1.4.1 Solr 3.6 Solr 4.0
eDismax query parser
Spell Check
Unary operators (+, -, !) are not treated as operators when followed by whitespace
Mixed range operators (such as { and ] ) are legal
Open-ended ranges (such as [a TO *])
KStemmer
Spellcheck without separate index
Locale-sensitive range queries
Realtime Get to retrieve the latest version of a document without a commit
Hunspell Stemmer
Language Identification
Pseudo-Joins to select a set of documents based on their relationship to a second set of documents
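
To make a few of the query-syntax items above more concrete, here is a small sketch that builds range clauses (including mixed and open-ended brackets) and a join clause as plain strings. The bracket syntax and the `{!join from=... to=...}` local-params form follow Solr's query syntax; the helper names and the `title`, `price`, `manu_id` and `id` fields are hypothetical and exist only for this example.

```python
def range_clause(field, lower=None, upper=None, include_lower=True, include_upper=True):
    """Return a range clause; None for a bound means an open end (*)."""
    open_bracket = "[" if include_lower else "{"    # [ is inclusive, { is exclusive
    close_bracket = "]" if include_upper else "}"
    lo = "*" if lower is None else str(lower)
    hi = "*" if upper is None else str(upper)
    return f"{field}:{open_bracket}{lo} TO {hi}{close_bracket}"


def join_clause(from_field, to_field, inner_query):
    """Return a join clause that maps matches of inner_query from one field to another."""
    return f"{{!join from={from_field} to={to_field}}}{inner_query}"


print(range_clause("title", "a"))                           # title:[a TO *]
print(range_clause("price", 10, 20, include_lower=False))   # price:{10 TO 20]
print(join_clause("manu_id", "id", "name:apple"))           # {!join from=manu_id to=id}name:apple
```

These helpers do no escaping, so they suit fixed, trusted values only; user-supplied terms would need escaping of query-syntax characters before being placed in a clause.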

Results and Relevance

Feature Solr 1.4.1 Solr 3.6 Solr 4.0
Spatial search, filtering, boosting, sorting
Grouping (Field Collapsing)
User alerts for new content
Suggester (Auto-complete)
Popularity-based ranking (Click Scoring Relevance Framework)
Per-field/flexible scoring
Numeric range faceting
Numeric range faceting in distributed search
CSV response writer
Ruby, JSON, XML and Python response writers
Sort by output of a FunctionQuery
Highlighting improvements (such as FastVectorHighlighter)
SolrJ parses grouped and range responses
Pseudo-field support to return extra data such as function query values, score explanation and field aliasing with stored fields
Relevance Function Queries
Pivot Faceting
Conditional function queries
Per-segment field faceting for improved performance with Near-Real Time search
New ranking algorithms (BM25, LM, DFR, etc.)
Highlight a facet term instead of query term when browsing facet results

Security

Feature Solr 1.4.1 Solr 3.6 Solr 4.0
HTTPS encryption between components
Access Control List integration for document access control
LDAP or Active Directory for user authentication
Solr filter queries (fq) for document access control
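
The filter-query approach to document access control can be sketched roughly as follows: each document carries a field listing the groups allowed to see it, and every search adds an `fq` clause restricting results to the current user's groups. `fq` is a standard Solr parameter; the `acl` field name, the group names and the `acl_filter` helper are assumptions for illustration, not part of Solr or of any LucidWorks API.

```python
from urllib.parse import urlencode, parse_qs


def acl_filter(groups, acl_field="acl"):
    """Build an fq value matching documents visible to any of the given groups."""
    if not groups:
        # Refuse to build a filter that could accidentally match everything.
        raise ValueError("at least one group is required")
    # Quote each group as a phrase, escaping backslashes and double quotes.
    quoted = ['"' + g.replace("\\", "\\\\").replace('"', '\\"') + '"' for g in groups]
    return f"{acl_field}:(" + " OR ".join(quoted) + ")"


# Hypothetical user query and group memberships.
fq = acl_filter(["staff", "finance"])
query_string = urlencode([("q", "budget"), ("fq", fq)])
print(fq)            # acl:("staff" OR "finance")
print(query_string)
```

The filter has to be added on the server side, after the user has been authenticated, since a client that can set its own `fq` can also leave it out.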

Deployment and Monitoring

Feature Solr 1.4.1 Solr 3.6 Solr 4.0
MBeans for indexing activity
MBeans for search activity
MBeans for filter, result and document caches
MBeans for crawling activity
Zabbix integration
Nagios integration
Built-in QPS Dashboard
Searchable log files
StatsComponent support for dates and strings

Scaling and Performance

Feature Solr 1.4.1 Solr 3.6 Solr 4.0
SolrCloud Phase 1 (distributed search, cluster state, read-side fault tolerance, and centralized configurations)
SolrCloud Phase 2 (distributed indexing, real-time GET, read & write fault tolerance, and cluster elasticity)
Distributed search
Memory improvements for searching and sorting
Post filters and filter cache controls
Fast FuzzyQuery
Tiered merge policy
Distributed search grouping (distributed field collapsing)
