Introducing Lucidworks View!

April 12, 2016

Latest

Better Feature Engineering with Spark, Solr, and Lucene Analyzers

This blog post is about new features in the Lucidworks spark-solr open source toolkit. For an introduction to the spark-solr project, see Solr as an Apache Spark SQL DataSource. Performing text analysis in Spark: The Lucidworks spark-solr open source toolkit now contains tools to break down full text into words (a.k.a. tokens) using Lucene’s text […]

April 13, 2016

Apache Solr 6 Is Released! Here’s What’s New:

Happy Friday – Apache Solr 6 was just released! From the official announcement: “Solr 6.0.0 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html. See the CHANGES.txt. Solr 6.0 Release Highlights: Improved defaults for “Similarity” used in Solr, in order to provide better default experience for new users. Improved “Similarity” defaults for users upgrading: DefaultSimilarityFactory has been removed, implicit […]

April 8, 2016

Secure Fusion: Leveraging LDAP

This is the third in a series of articles on securing your data in Lucidworks Fusion. Secure Fusion: SSL Configuration covers transport layer security and Secure Fusion: Authentication and Authorization covers general application-level security mechanisms in Fusion. This article shows you how Fusion can be configured to use an LDAP server for authentication and authorization. […]

March 23, 2016

Secure Fusion: Authentication and Authorization

This is the second in a series of articles on securing your data in Lucidworks Fusion. Here’s Part One. This post covers Fusion’s basic application-level security mechanisms. At the application layer, Fusion delivers security via: Authentication – users must sign on using a username and password. Authorization – each username is associated with one or […]

March 10, 2016

2015 Solr Developer Survey

Our annual snapshot of the vibrant Solr community and the amazing things developers all over the world are doing with Apache Solr.

February 24, 2016

Secure Fusion: SSL Configuration

This is the first in a series of articles on securing your data in Lucidworks Fusion. The first step in securing your data is to make sure that all data sent to and from Fusion is encrypted by using HTTPS and SSL instead of regular HTTP. Because this encryption happens at the transport layer, not […]

February 23, 2016

Solr’s DateRangeField, How Does It Perform?

Solr’s DateRangeField: I have to credit David Smiley as co-author here. First of all, he’s largely responsible for the spatial functionality, and second, he’s been very generous explaining some details here. Mistakes are my responsibility, of course. Solr has had a new DateRangeField for quite some time (since 5.0, see SOLR-6103). DateRangeFields are based on […]

February 13, 2016

Welcome Trey Grainger!

Trey Grainger has joined Lucidworks as SVP of Engineering, where he’ll be heading up our engineering efforts for both open source Apache Lucene/Solr and our Lucidworks Fusion platform.

February 10, 2016

Fusion plus Solr Suggesters for More Search, Less Typing

The Solr suggester search component was previously discussed on this blog in the post Solr Suggester by Solr committer Erick Erickson. This post shows how to add a Solr suggester component to a Fusion query pipeline in order to provide the kind of auto-complete functionality expected from a modern search app. By auto-complete we mean […]

February 4, 2016

Jake Mannix Joins Lucidworks as Principal Data Engineer

We are pleased to welcome Jake Mannix to the Lucidworks team as our new Principal Data Engineer. Jake’s past work includes: Working on data pipelining with Apache Spark to scale a semantic search engine at the Allen Institute for Artificial Intelligence. Jake was tech lead for Twitter’s Data Science / Data Engineering team, building both the […]

January 20, 2016

Top Blog Posts of 2015

The best of the blog from the past year including Solr authentication, Hadoop connectors, facets and stats, Docker, Spark, and increasing indexing and performance.

January 13, 2016

Erik Hatcher, Developer on Fire, Profiled by Dave Rael

Erik Hatcher, Lucidworks engineer, Apache Lucene committer, and co-author of both Lucene in Action and Java Development with Ant, is profiled on Dave Rael’s Developer on Fire podcast. Subscribe or listen on iTunes or via the podcast’s feed.

January 9, 2016

Open Source Hadoop Connectors for Solr

Lucidworks is happy to announce that several of our connectors for indexing content from Hadoop to Solr are now open source. We have six of them, with support for Spark, Hive, Pig, HBase, Storm and HDFS, all available on GitHub. All of them work with Solr 5.x, and include options for Kerberos-secured environments if required. HDFS for Solr […]

December 17, 2015

Data Security and Human Insecurities: How Scammers Take Advantage

Lucidworks CEO Will Hayes’s latest Forbes column looks at the ways scammers take advantage of the big holes in big data to prey on all of us: “The immense amount of data we expose about ourselves make it incredibly easy to get targeted. … These profiles make it easier than ever for up-to-no-gooders to target us […]

December 16, 2015

Query Autofiltering IV: A Novel Approach to Natural Language Processing

This is my fourth blog post on a technique that I call Query Autofiltering. The basic idea is that we can use meta information stored within the Solr/Lucene index itself (in the form of string or non-tokenized text fields) to generate a knowledge base from which we can parse user queries and map phrases within […]

November 19, 2015

Solr on Docker

It is now even easier to get started with Solr: you can run Solr on Docker with a single command.

November 3, 2015

Stump The Chump: Austin Winners

Last week was another great Stump the Chump session at Lucene/Solr Revolution in Austin. After a nice weekend of playing tourist and eating great BBQ, today I’m back at my computer and happy to announce last week’s winners: Barani Bikshandi ($100 Amazon gift certificate) Carlos Eduardo Sponchiado (Sponch) ($50 Amazon gift certificate) Aditya Varun Chadha […]

October 19, 2015

Focusing on Search Quality at Lucene/Solr Revolution 2015

I just got back from Lucene/Solr Revolution 2015 in Austin on a big high. There were a lot of exciting talks at the conference this year, but one thing that was particularly exciting to me was the focus that I saw on search quality (accuracy and relevance), on the problem of inferring user intent from […]

October 19, 2015

Data As a Virtuous Cycle

Deck from Lucidworks CEO Will Hayes’s opening remarks on the first day of Lucene/Solr Revolution 2015.

October 16, 2015

Quantifying Performance Gains When Batching Indexing Updates to Solr

Batching when indexing is good: For quite some time it’s been part of the lore that one should batch updates when indexing from SolrJ (the post tool too, but I digress). I recently had the occasion to write a test that put some numbers to this general understanding. As usual, YMMV. The interesting bit isn’t that the absolute […]

October 5, 2015

Approaching Join Index in Apache Lucene

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Mikhail Khludnev’s session on joins and block-joins in Lucene. Lucene works great with independent text documents, but real-life problems often require handling relations between documents. Aside from several workarounds, […]

October 2, 2015

Building a Large Scale SEO/SEM Application with Apache Solr

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Rahul Jain’s session on indexing large scale SEO/SEM data. Search engine optimization (SEO) is the process of affecting the visibility of a website or a web page in a search engine’s […]

October 2, 2015
CC BY 2.0 http://www.cgpgrey.com/

Lasso Some Prizes by Stumping The Chump in Austin Texas

Professional Rodeo riders typically only have a few seconds to prove themselves and win big prizes. But you’ve still got two whole weeks to prove you can Stump The Chump with your tough Lucene/Solr questions, and earn both bragging rights and one of these prizes… 1st Prize: $100 Amazon gift certificate 2nd Prize: $50 Amazon […]

September 30, 2015

How StubHub De-Dupes with Apache Solr

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting StubHub engineer Neeraj Jain’s session on de-duping in Solr. StubHub handles a large number of events and related documents. Use of Solr within StubHub has grown from search for events/tickets to content […]

September 29, 2015

Pushing the Limits of Apache Solr at Bloomberg

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Anirudha Jadhav’s session on going beyond the conventional constraints of Solr. The goal of the presentation is to delve into the implementation of Solr, with a focus on how to optimize […]

September 25, 2015

How Bloomberg Scales Apache Solr in a Multi-tenant Environment

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Bloomberg engineer Harry Hight’s session on scaling Solr in a multi-tenant environment. Bloomberg Vault is a hosted communications archive and search solution, with over 2.5 billion documents in a 45TB Solr […]

September 23, 2015

How Getty Images Executes Managed Search with Apache Solr

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Jacob Graves’s session on how they configure Apache Solr for managed search at Getty Images. The problem is to create a framework for business users that will: Hide technical complexity Allows control […]

September 22, 2015

How CareerBuilder Executes Semantic and Multilingual Strategies with Apache Lucene/Solr

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Trey Grainger’s session on multilingual search at CareerBuilder. When searching on text, choosing the right CharFilters, Tokenizer, stemmers and other TokenFilters for each supported language is critical. Additional tools of the […]

September 18, 2015

Lucidworks Fusion 2.1 Now Available!

Our last release brought a slew of new features as well as a new user experience. With Fusion 2.1 LTS, we have polished these features, and tweaked the visual appearance and the interactions.

September 16, 2015

Tuning Apache Solr for Log Analysis

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Radu Gheorghe’s session on tuning Solr for analyzing logs. Performance tuning is always nice for keeping your applications snappy and your costs down. This is especially the case for logs, social […]

September 15, 2015

Searching and Querying Knowledge Graphs with Solr/SIREn: a Reference Architecture

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Renaud Delbru and Giovanni Tummarello’s session on querying knowledge graphs with Solr/SIREn: Knowledge Graphs have recently gained press coverage as information giants like Google, Facebook, Yahoo and Microsoft announced having deployed […]

September 14, 2015

Stump The Chump: Meet The Panel Keeping Austin Weird

As previously mentioned: On October 15th, Lucene/Solr Revolution 2015 will once again be hosting “Stump The Chump” in which I (The Chump) will be answering tough Solr questions — submitted by users like you — live, on stage, sight unseen. Today, I’m happy to announce the Panel of experts that will be challenging me with […]

September 14, 2015

How Cloudera Secures Solr with Apache Sentry

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Cloudera’s Gregory Chanan’s session on securing Solr with Apache Sentry. Apache Solr, unlike other enterprise Big Data applications that it is increasingly deployed alongside, provides minimal security features out of the box. This limitation makes […]

September 10, 2015

Min/Max On Multi-Valued Field For Functions & Sorting

One of the new features added in Solr 5.3 is the ability to specify that you want Solr to use either the min or max value of a multi-valued numeric field, either to use directly (perhaps as a sort) or to incorporate into a larger, more complex function. For example: Suppose you were periodically […]

September 10, 2015

Search-Time Parallelism at Etsy: An Experiment With Apache Lucene

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Shikhar Bhushan’s experiments at Etsy with search-time parallelism. Is it possible to gain the parallelism benefit of sharding your data into multiple indexes, without actually sharding? Isn’t your Lucene […]

September 8, 2015

Using Thoth as a Real-Time Solr Monitor and Search Analysis Engine

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Damiano Braga and Praneet Mhatre’s session on how Trulia uses Thoth and Solr for real-time monitoring and analysis. Managing a large and diversified Solr search infrastructure can be challenging and there […]

September 4, 2015