Enhancing Lucene Search

October 23, 2012

Tags: Sitecore, Sitecore Symposium

So I'm attending the "Enhancing search for Lucene" session because I've been using Lucene lightly for years but haven't had the time to fully vet it for more mainstream use. I love it for its speed and flexibility, but I have lingering questions about the memory and CPU costs of rebuilding large indexes on production servers. Overall I just want to get a feel for how other people are using it and any pitfalls they've run into.

The talk starts with a use case, which as a developer always makes sense to me because a real project carries a level of nuance that is hard to replicate with hypotheticals. The challenges they focused on were indexing new information quickly and being able to filter useful information out of large document counts with little or no metadata.

To help users filter their large pool of documents, they're using additive filters in the right column on information like date ranges and authors, which can be combined to further refine the result list. They're also lazy loading the more detailed result information to decrease the HTML footprint, and with it the page load time, and caching those individual requests to speed up repeated use.
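
To make the lazy loading and caching idea concrete, here is a minimal sketch. It is in Python rather than the C# a Sitecore solution would use, and every name in it (`DOCUMENTS`, `result_summaries`, `load_details`) is hypothetical rather than taken from the presentation. The first page render only carries ids and titles, and the detail lookup is cached so repeated requests for the same result skip the expensive work.

```python
from functools import lru_cache

# Hypothetical stand-in for the full documents behind the search results.
DOCUMENTS = {
    1: {"title": "Annual report", "body": "Full text of the annual report..."},
    2: {"title": "Press release", "body": "Full text of the press release..."},
}


def result_summaries() -> list[dict]:
    """Initial result list: ids and titles only, to keep the page small."""
    return [{"id": doc_id, "title": doc["title"]} for doc_id, doc in DOCUMENTS.items()]


@lru_cache(maxsize=256)
def load_details(doc_id: int) -> str:
    """Detail lookup, requested on demand; repeat calls are served from cache."""
    return DOCUMENTS[doc_id]["body"]
```

In a real site the detail lookup would be a separate HTTP request from the browser, and the cache would need to be invalidated when the item is republished; the sketch leaves both out.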

When the presenters dove into their breakdown, they revealed that they were using their own custom search indexer, which is a great example of the flexibility of Lucene. They also did a great job of making their filter code modular by creating an interface that all filters implement. The sublayout data source is how they differentiate pages that reuse the search functionality with different result sets. One of the most interesting pieces is the user interface within Sitecore that actually allows an editor to configure a new search page. They get a lot of points for accommodating that level of user control.
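
As a rough illustration of the filter interface idea, here is a minimal sketch in Python (a Sitecore solution would be C#). The names `SearchFilter`, `AuthorFilter`, `DateRangeFilter` and `apply_filters` are my own assumptions, not the presenters' code. It also filters in memory for simplicity, where a Lucene-backed version would more likely translate each filter into a query clause.

```python
from dataclasses import dataclass
from datetime import date
from typing import Protocol


@dataclass(frozen=True)
class Document:
    title: str
    author: str
    published: date


class SearchFilter(Protocol):
    """Hypothetical contract that every filter implements."""

    def matches(self, doc: Document) -> bool: ...


@dataclass(frozen=True)
class AuthorFilter:
    author: str

    def matches(self, doc: Document) -> bool:
        return doc.author == self.author


@dataclass(frozen=True)
class DateRangeFilter:
    start: date
    end: date

    def matches(self, doc: Document) -> bool:
        return self.start <= doc.published <= self.end


def apply_filters(docs: list[Document], filters: list[SearchFilter]) -> list[Document]:
    """Additive filtering: a document must satisfy every active filter."""
    return [d for d in docs if all(f.matches(d) for f in filters)]
```

Because the calling code only depends on `matches`, a new filter type can be added without touching `apply_filters`, which is the modularity the interface buys.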

Overall it was a nice solution, and they broke it down in a very MVC way. I also got my question answered: they never needed to run a full index rebuild and relied only on the incremental updates from the publishing events, which was all I needed to hear. I have plans for further integration with Lucene in a number of ways, and this was able to assuage my fears about the continual maintenance in a production environment.
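
For anyone wondering what "incremental updates instead of a full rebuild" means in practice, here is a toy sketch in Python. It is not Lucene or Sitecore code: `IncrementalIndex` and its method names are hypothetical, and the index is a plain in-memory dictionary. The point is only that a publish event touches the one item that changed and leaves the rest of the index alone.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy inverted index that is updated one item at a time."""

    def __init__(self) -> None:
        self._terms_by_doc: dict[str, set[str]] = {}
        self._docs_by_term: dict[str, set[str]] = defaultdict(set)

    def on_item_published(self, doc_id: str, text: str) -> None:
        """Re-index only the published item."""
        self.on_item_deleted(doc_id)  # drop terms from any earlier version
        terms = set(text.lower().split())
        self._terms_by_doc[doc_id] = terms
        for term in terms:
            self._docs_by_term[term].add(doc_id)

    def on_item_deleted(self, doc_id: str) -> None:
        """Remove one item's terms without touching other items."""
        for term in self._terms_by_doc.pop(doc_id, set()):
            self._docs_by_term[term].discard(doc_id)

    def search(self, term: str) -> set[str]:
        return set(self._docs_by_term.get(term.lower(), set()))
```

The cost of each update scales with the size of the published item rather than the size of the whole index, which is why this approach eases the memory and CPU worry I came in with.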
