Search Results for: document/9b4e2ebc5859462b/gsoc_timeline_and_misc_advice_to_help_you_have_a_winning_proposal

Getting Started with Lucene Setup

acronyms and abbreviations can lead to better results. You may already have a synonym list or abbreviation expansions that can be leveraged Priorities/Importance: Do you have any a priori knowledge…

Content Extraction with Tika

interface for parsing multiple formats. The Tika API hides the technical differences of the various parser implementations. This means that you don’t have to learn and consume one API for…

Scaling Lucene and Solr

be to have a master for adding and updating documents on, and then n slave servers that you would replicate the master index to (actually just the changed files in…

Solr Cloud Document Routing

grouping’s ngroups feature and joins require documents to be co-located in the same core or VM. For example, to take advantage of the ngroups feature in grouping, documents need to…

When the mapping gets tough, the tough use JavaScript

well as logging stages, and helper stages for composing pipelines. And yet, sometimes ready-made isn’t good enough. Sometimes you need a bespoke processing stage, tailored to your data and your…

Interview with Ian Holsman of Relegence (AOL)

certain fields and insert the document, and that’s it. You don’t have to worry about the bounding box calculations. There’s no sine, cosine calculations in your code anymore. It’s just…

Exploring Lucene's Indexing Code: Part 2

Using some basic IR knowledge, we know that addDocument is going to use our Analyzer to break up each field in the given document, and use the resulting terms to…

Optimizing Findability in Lucene and Solr

results in the first five or ten documents. The downside is you are executing two queries per every one input, plus you have to load and process the top documents…

Debugging Search Application Relevance Issues

…to a large audience and let them kick the tires. You should only do this once you are reasonably confident things work well. It is imperative to have good logging…

Noob* Notes: Fusion First Look

One input was processed and the index now contains 165 documents. I have managed to index fresh content! Search and Results As a first test, I do a keyword search…