Friday, October 14 • 1:10pm - 1:50pm
HHypermap: Heatmap Analytics of a Billion Tweets
The Harvard Center for Geographic Analysis has established the HHypermap (Harvard Hypermap) system, composed of multiple open-source projects aimed at searching vast amounts of spatial data. This talk centers on a SolrCloud-based system that can do realtime search on a billion tweets, with heatmap analytics of sentiment analysis results. The open-source system is designed to be suitable for social media data sets or sensor data.

Harvard CGA commissioned Apache Lucene/Solr's heatmap faceting capability in 2015, and this work continues in 2016. The first new part is computing numeric stats per cell (not just doc counts), which can be used for a variety of applications. The second part is improving Lucene's grid cell indexing scheme to cater to heatmaps, allowing heatmap generation to be very fast for large data sets.
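
To make "numeric stats per cell" concrete, here is a minimal sketch in plain Python of what a heatmap with per-cell statistics computes: each point is assigned to a grid cell, and the cell keeps a doc count plus a sum and mean of a numeric value (for example, a sentiment score). This is an illustration only and not the Lucene/Solr implementation; the function name heatmap_stats, the grid parameters, and the sample sentiment scores are all assumptions made for the example.

```python
from collections import defaultdict


def heatmap_stats(points, min_x, min_y, max_x, max_y, cols, rows):
    """Aggregate (lon, lat, value) points into a cols x rows grid.

    Hypothetical helper for illustration. Returns a dict mapping
    (col, row) to {"count", "sum", "mean"} for every non-empty cell.
    """
    cell_w = (max_x - min_x) / cols
    cell_h = (max_y - min_y) / rows
    acc = defaultdict(lambda: [0, 0.0])  # cell -> [count, running sum]

    for lon, lat, value in points:
        # Skip points that fall outside the requested bounding box.
        if not (min_x <= lon <= max_x and min_y <= lat <= max_y):
            continue
        # Clamp so points exactly on the max edge land in the last cell.
        col = min(int((lon - min_x) / cell_w), cols - 1)
        row = min(int((lat - min_y) / cell_h), rows - 1)
        acc[(col, row)][0] += 1
        acc[(col, row)][1] += value

    return {
        cell: {"count": n, "sum": s, "mean": s / n}
        for cell, (n, s) in acc.items()
    }


# Made-up sample data: (longitude, latitude, sentiment score).
sample = [
    (-71.06, 42.36, 0.5),
    (-71.05, 42.35, -0.1),
    (2.35, 48.85, 0.8),
]
# World extent split into 10-degree cells (36 columns x 18 rows).
grid = heatmap_stats(sample, -180, -90, 180, 90, cols=36, rows=18)
```

A plain doc-count heatmap would keep only the count per cell; carrying a sum (or min/max) alongside it is what lets a cell be colored by average sentiment rather than by tweet volume.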

This talk discusses the system design and architecture, as well as the spatial details of how Lucene/Solr was improved.

David Smiley

Search Developer & Consultant, D W Smiley LLC
David Smiley is a well-recognized Apache Lucene/Solr expert. He wrote the first book on Solr (currently in its 3rd edition), is a Lucene/Solr committer and PMC member who works on improving both projects, speaks at conferences about them, does training, and offers part time independent...

Friday October 14, 2016 1:10pm - 1:50pm
Independence Sheraton Boston