Lucidworks Fusion 2.4 Ready For Download

June 9, 2016


Latest


Large Scale Log Analytics with Solr

As we count down to the annual Lucene/Solr Revolution conference in Boston this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Sematext’s Radu Gheorghe and Rafał Kuć’s talk, “Large Scale Log Analytics with Solr”. Radu and Rafał will also be presenting at Lucene/Solr Revolution 2016. This talk is about searching and analyzing […]

August 25, 2016

Solr Troubleshooting: Treemap Approach

As we count down to the annual Lucene/Solr Revolution conference in Boston this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting a talk from the newest Solr committer, Alexandre Rafalovitch. Alexandre will also be presenting at Lucene/Solr Revolution 2016. Solr is too big of a product to troubleshoot as if it were […]

August 23, 2016

Where Search Meets Machine Learning

As we count down to the annual Lucene/Solr Revolution conference in Boston this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Verizon’s Joaquin Delgado and Diana Hu’s talk, “Where Search Meets Machine Learning”. Joaquin and Diana discuss ML-Scoring, an open source framework they’ve created that tightly integrates machine learning models into popular […]

August 22, 2016

Learning to Rank in Solr

As we count down to the annual Lucene/Solr Revolution conference in Boston this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Bloomberg’s Michael Nilsson and Diego Ceccarelli’s talk, “Learning to Rank in Solr”. In information retrieval systems, learning to rank is used to re-rank the top X retrieved documents using trained machine […]

August 17, 2016

Lessons from Sharding Solr at Etsy

As we count down to the annual Lucene/Solr Revolution conference in Boston this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Gregg Donovan’s session, “Lessons from Sharding Solr at Etsy”. Gregg covers the following lessons learned at Etsy while sharding Solr: How to enable SolrJ to handle distributed search fanout and merge; […]

August 16, 2016

Solr as SparkSQL DataSource, Part II

Solr as a SparkSQL DataSource Part II Co-authored with Kiran Chitturi, Lucidworks Data Engineer Last August, we introduced you to Lucidworks’ spark-solr open source project for integrating Apache Spark and Apache Solr, see: Part I. To recap, we introduced Solr as a SparkSQL Data Source and focused mainly on read / query operations. In this […]

August 16, 2016

Parallel Computing in SolrCloud

As we count down to the annual Lucene/Solr Revolution conference in Boston this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Joel Bernstein’s session about Parallel Computing in SolrCloud. This presentation provides a deep dive into SolrCloud’s parallel computing capabilities – breaking down the framework into four main areas: shuffling, worker collections, […]

August 15, 2016

Pivoting to the Query: Using Pivot Facets to build a Multi-Field Suggester

Suggesters, also known as autocomplete, typeahead or “predictive search”, are powerful ways to accelerate the conversation between user and search application. Querying a search application is a little like a guessing game – the user formulates a query that they hope will bring back what they want – but sometimes there is an element of […]

August 12, 2016

Queue Based Indexing & Collection Management at Gannett

As we count down to the annual Lucene/Solr Revolution conference in Boston this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Devansh Dhutia’s session on how Gannett manages schema changes to large Solr collections. Deploying schema changes to Solr collections with large volumes of data can be problematic when the reindex activity can […]

August 11, 2016

10 Things You Don’t Want to Miss at Lucene/Solr Revolution 2016

Are we the only ones who feel like this summer is flying by? While the thought of saying goodbye to warm summer days pains us a little inside, we are excited that Lucene/Solr Revolution 2016 is just two short months away! The conference will be held October 11-14 in Boston, MA. If you haven’t secured […]

August 10, 2016

Search Hub 2.0 Public Beta

Introduction For quite some time now, Lucidworks has been hosting a community site named Search Hub (aka LucidFind) that consists of a searchable archive of a number of Apache Software Foundation mailing lists, source code repositories and wiki pages, as well as related content that we’ve deemed beneficial. Previously, we’ve had three goals in building […]

June 27, 2016

Welcome Jeff Depa!

We’re happy to announce another new addition to the Lucidworks team! Please welcome Jeff Depa, our new Senior Vice President of Worldwide Field Operations in May 2016 (full press release: Lucidworks Appoints Search Veterans to Senior Team). Jeff will lead the company’s day-to-day field operations, including its rapidly growing sales, alliances and channels, systems engineering and […]

May 24, 2016

Secure Fusion: Single Sign-On

Single Sign-On (SSO) mechanisms allow a user to use the same ID and password to gain access to a connected system or systems. In a web services or distributed computing environment, single sign-on can only be achieved by registering information about the sign-on authority with all systems that require its services. The previous article in […]

May 9, 2016

Better Feature Engineering with Spark, Solr, and Lucene Analyzers

This blog post is about new features in the Lucidworks spark-solr open source toolkit. For an introduction to the spark-solr project, see Solr as an Apache Spark SQL DataSource Performing text analysis in Spark The Lucidworks spark-solr open source toolkit now contains tools to break down full text into words a.k.a. tokens using Lucene’s text […]

April 13, 2016

Apache Solr 6 Is Released! Here’s What’s New:

Happy Friday – Apache Solr 6 just released!  From the official announcement: “Solr 6.0.0 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html “See the CHANGES.txt “Solr 6.0 Release Highlights: Improved defaults for “Similarity” used in Solr, in order to provide better default experience for new users. Improved “Similarity” defaults for users upgrading: DefaultSimilarityFactory has been removed, implicit […]

April 8, 2016

Secure Fusion: Leveraging LDAP

This is the third in a series of articles on securing your data in Lucidworks Fusion. Secure Fusion: SSL Configuration covers transport layer security and Secure Fusion: Authentication and Authorization covers general application-level security mechanisms in Fusion. This article shows you how Fusion can be configured to use an LDAP server for authentication and authorization. […]

March 23, 2016

Secure Fusion: Authentication and Authorization

This is the second in a series of articles on securing your data in Lucidworks Fusion. Here’s Part One. This post covers Fusion’s basic application-level security mechanisms. At the application layer, Fusion delivers security via: Authentication – users must sign on using a username and password. Authorization – each username is associated with one or […]

March 10, 2016

2015 Solr Developer Survey

Our annual snapshot of the vibrant Solr community and the amazing things developers all over the world are doing with Apache Solr.

February 24, 2016

Secure Fusion: SSL Configuration

This is the first in a series of articles on securing your data in Lucidworks Fusion. The first step in securing your data is to make sure that all data sent to and from Fusion is encrypted by using HTTPS and SSL instead of regular HTTP. Because this encryption happens at the transport layer, not […]

February 23, 2016

Solr’s DateRangeField, How Does It Perform?

Solr’s DateRangeField I have to credit David Smiley as co-author here. First of all, he’s largely responsible for the spatial functionality and second he’s been very generous explaining some details here. Mistakes are my responsibility of course. Solr has had a new DateRangeField for quite some time (since 5.0, see SOLR-6103). DateRangeFields are based on […]

February 13, 2016

Welcome Trey Grainger!

Trey Grainger has joined as Lucidworks SVP of Engineering where he’ll be heading up our engineering efforts for both open source Apache Lucene/Solr and our Lucidworks Fusion platform.

February 10, 2016

Fusion plus Solr Suggesters for More Search, Less Typing

The Solr suggester search component was previously discussed on this blog in the post Solr Suggester by Solr committer Erick Erickson. This post shows how to add a Solr suggester component to a Fusion query pipeline in order to provide the kind of auto-complete functionality expected from a modern search app. By auto-complete we mean […]

February 4, 2016

Jake Mannix Joins Lucidworks as Principal Data Engineer

We are pleased to welcome Jake Mannix to the Lucidworks team as our new Principal Data Engineer. Jake’s past work includes: Working on data pipelining with Apache Spark to scale a semantic search engine at the Allen Institute for Artificial Intelligence. Jake was tech lead for Twitter’s Data Science / Data Engineering team, building both the […]

January 20, 2016

Top Blog Posts of 2015

The best of the blog from the past year including Solr authentication, Hadoop connectors, facets and stats, Docker, Spark, and increasing indexing and performance.

January 13, 2016

Erik Hatcher, Developer on Fire, Profiled by Dave Rael

Lucidworks engineer, Apache Lucene committer, and co-author of Lucene in Action as well as co-author of Java Development with Ant is profiled on Dave Rael’s Developer on Fire podcast. Subscribe or listen on iTunes or via the podcast’s feed.

January 9, 2016

Open Source Hadoop Connectors for Solr

Lucidworks is happy to announce that several of our connectors for indexing content from Hadoop to Solr are now open source. We have six of them, with support for Spark, Hive, Pig, HBase, Storm and HDFS, all available on GitHub. All of them work with Solr 5.x, and include options for Kerberos-secured environments if required. HDFS for Solr […]

December 17, 2015

Data Security and Human Insecurities: How Scammers Take Advantage

Lucidworks CEO Will Hayes’s latest Forbes column looks at the ways scammers take advantage of the big holes in big data to prey on all of us: “The immense amount of data we expose about ourselves make it incredibly easy to get targeted. … These profiles make it easier than ever for up-to-no-gooders to target us […]

December 16, 2015

Query Autofiltering IV: – A Novel Approach to Natural Language Processing

This is my fourth blog post on a technique that I call Query Autofiltering. The basic idea is that we can use meta information stored within the Solr/Lucene index itself (in the form of string or non-tokenized text fields) to generate a knowledge base from which we can parse user queries and map phrases within […]

November 19, 2015

Solr on Docker

It is now even easier to get started with Solr: you can run Solr on Docker with a single command.

November 3, 2015

Stump The Chump: Austin Winners

Last week was another great Stump the Chump session at Lucene/Solr Revolution in Austin. After a nice weekend of playing tourist and eating great BBQ, today I’m back at my computer and happy to announce last week’s winners: Barani Bikshandi ($100 Amazon gift certificate) Carlos Eduardo Sponchiado (Sponch) ($50 Amazon gift certificate) Aditya Varun Chadha […]

October 19, 2015

Focusing on Search Quality at Lucene/Solr Revolution 2015

I just got back from Lucene/Solr Revolution 2015 in Austin on a big high. There were a lot of exciting talks at the conference this year, but one thing that was particularly exciting to me was the focus that I saw on search quality (accuracy and relevance), on the problem of inferring user intent from […]

October 19, 2015

Data As a Virtuous Cycle

Deck from Lucidworks CEO Will Hayes’s opening remarks on the first day of Lucene/Solr Revolution 2015.

October 16, 2015

Quantifying Performance Gains When Batching Indexing Updates to Solr

Batching when indexing is good: For quite some time it’s been part of the lore that one should batch updates when indexing from SolrJ (the post tool too, but I digress). I recently had the occasion to write a test that put some numbers to this general understanding. As usual, YMMV. The interesting bit isn’t that the absolute […]

October 5, 2015

Approaching Join Index in Apache Lucene

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Mikhail Khludnev’s session on joins and block-joins in Lucene. Lucene works great with independent text documents, but real-life problems often require handling relations between documents. Aside from several workarounds, […]

October 2, 2015

Building a Large Scale SEO/SEM Application with Apache Solr

As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Rahul Jain’s session on indexing large scale SEO/SEM data. Search engine optimization (SEO) is the process of affecting the visibility of a website or a web page in a search engine’s […]

October 2, 2015