Thursday, October 13 • 10:50am - 11:30am
Loading 350M documents into a large Solr cluster in 8 hours or less
This session is a case study showing how a large set of XML documents can be loaded into a multi-collection Solr cluster in a fast, efficient, and controlled way.

The presenter will show how Solr is used within his organization and then explain how his team started out loading content into their SolrCloud with the standard post.jar tool, which has some hidden limitations.

You will see how this led to their current solution, which consists of multiple cloud-aware "content posting" worker processes controlled by a clever master-less queuing system in ZooKeeper. The presenter will also cover how to load content into a busy Solr cluster without significantly affecting the response times of running queries.
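
To make the idea of a master-less queue concrete, here is a minimal sketch. It is not the presenter's implementation: the abstract does not describe the design, so the path layout (/queue, /claims), the InMemoryStore class, and the worker function are all hypothetical. The store is an in-memory stand-in for ZooKeeper that mimics only one property, namely that creating a node fails if the node already exists. Workers use that property to claim batches without any coordinating master.

```python
import threading


class NodeExistsError(Exception):
    """Raised when a node already exists (mimics ZooKeeper's create semantics)."""


class InMemoryStore:
    """Hypothetical stand-in for ZooKeeper: atomic create-if-absent on paths."""

    def __init__(self):
        self._nodes = {}
        self._lock = threading.Lock()

    def create(self, path, data=b""):
        # The check and the write happen under one lock, so only one
        # caller can ever succeed in creating a given path.
        with self._lock:
            if path in self._nodes:
                raise NodeExistsError(path)
            self._nodes[path] = data

    def children(self, parent):
        # Return the names of the direct children of `parent`, sorted.
        prefix = parent.rstrip("/") + "/"
        with self._lock:
            return sorted(
                p[len(prefix):]
                for p in self._nodes
                if p.startswith(prefix) and "/" not in p[len(prefix):]
            )


def worker(name, store, post_batch, processed):
    """Claim unclaimed batches one by one; no master assigns the work."""
    for task in store.children("/queue"):
        try:
            # Whoever creates the claim node first owns the batch.
            store.create(f"/claims/{task}", name.encode())
        except NodeExistsError:
            continue  # another worker already claimed this batch
        post_batch(task)  # placeholder for posting documents to Solr
        processed.append((name, task))


def run_demo(num_tasks=20, num_workers=4):
    store = InMemoryStore()
    for i in range(num_tasks):
        store.create(f"/queue/batch-{i:04d}")

    processed = []
    threads = [
        threading.Thread(
            target=worker,
            args=(f"worker-{n}", store, lambda task: None, processed),
        )
        for n in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed


if __name__ == "__main__":
    result = run_demo()
    print(f"{len(result)} batches processed, each exactly once")
```

Against a real ZooKeeper ensemble, the claim nodes would presumably be ephemeral, so that a claim disappears when the worker that holds it loses its session and another worker can pick the batch up again. The sketch also leaves out throttling, which is what would keep query response times stable while loading.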

Dion Olsthoorn

Senior Software Engineer, Wolters Kluwer
Dion Olsthoorn works as a Software Engineer for Wolters Kluwer, a publisher of professional content. He’s currently working on Ovid®, an online information delivery platform for medical research, where he and his team are responsible for building, enhancing, and maintaining a large Solr cluster. Dion has 20+ years of experience in software development (mostly web) and specializes in enterprise search systems.

Thursday October 13, 2016 10:50am - 11:30am
Back Bay A Sheraton Boston