Thursday, October 13 • 2:20pm - 3:00pm
Evolving the Optimal Relevancy Scoring Model at Dice.com
A popular conference topic in recent years is using machine-learned ranking (MLR) to re-rank the top results of a Solr query to improve relevancy. However, such approaches fail to first ensure that the search engine has an optimal query configuration, without which the re-ranked results may not contain the most relevant items for each query (lowering recall). Solr offers many configuration options that control how documents are ranked and scored for relevancy to a user's query, including which boost to assign to each field and how strongly to boost phrasal matches. Companies commonly tune these parameters by hand to optimize relevancy, but this process is highly subjective and not guaranteed to produce optimal results. We will show a data-driven approach to relevancy tuning that uses optimization algorithms, such as evolutionary algorithms, to evolve a query configuration that optimizes the relevancy of the results returned, using data captured from our query logs. We will also discuss how we experimented with evolving a custom similarity algorithm to outperform BM25 and TF-IDF similarity on our dataset. Finally, we'll discuss the dangers of positive feedback loops when training machine-learned ranking models.
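
To make the idea of evolving a query configuration concrete, here is a minimal sketch and not the method presented in the talk. It assumes a toy setting in which each candidate document already has a per-field match score and the query log records which document was clicked; the field names, the data, the `mean_reciprocal_rank` fitness function, and the `evolve` helper are all hypothetical, and no Solr calls are made. A simple evolutionary loop (keep the best boost vectors, mutate them with Gaussian noise) searches for field boosts that rank the clicked documents highest.

```python
import random

# Hypothetical toy data: per-field match scores for each candidate document,
# ordered as in FIELDS, plus the clicked document taken from a query log.
FIELDS = ["title", "skills", "description"]
QUERY_LOG = [
    {"docs": {"a": [0.9, 0.2, 0.1], "b": [0.1, 0.3, 0.9], "c": [0.2, 0.1, 0.3]},
     "clicked": "a"},
    {"docs": {"d": [0.8, 0.1, 0.2], "e": [0.2, 0.2, 0.8], "f": [0.1, 0.6, 0.1]},
     "clicked": "d"},
    {"docs": {"g": [0.3, 0.9, 0.1], "h": [0.2, 0.1, 0.9], "i": [0.4, 0.2, 0.2]},
     "clicked": "g"},
]


def mean_reciprocal_rank(boosts):
    """Fitness: average of 1/rank of the clicked document under these boosts."""
    total = 0.0
    for entry in QUERY_LOG:
        # Score each document as the boost-weighted sum of its field scores.
        ranked = sorted(
            entry["docs"].items(),
            key=lambda kv: sum(b * s for b, s in zip(boosts, kv[1])),
            reverse=True,
        )
        rank = [doc_id for doc_id, _ in ranked].index(entry["clicked"]) + 1
        total += 1.0 / rank
    return total / len(QUERY_LOG)


def evolve(generations=30, population_size=20, survivors=5, sigma=0.5, seed=0):
    """Evolve a vector of field boosts that maximizes mean reciprocal rank."""
    rng = random.Random(seed)
    population = [[rng.uniform(0, 5) for _ in FIELDS]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the fittest boost vectors unchanged (elitism).
        population.sort(key=mean_reciprocal_rank, reverse=True)
        parents = population[:survivors]
        # Mutation: refill the population with noisy, non-negative copies.
        children = []
        while len(parents) + len(children) < population_size:
            parent = rng.choice(parents)
            children.append([max(0.0, w + rng.gauss(0, sigma)) for w in parent])
        population = parents + children
    best = max(population, key=mean_reciprocal_rank)
    return best, mean_reciprocal_rank(best)


if __name__ == "__main__":
    best_boosts, score = evolve()
    print(dict(zip(FIELDS, best_boosts)), score)
```

In a real system the fitness function would instead run logged queries against the search engine with the candidate boosts and score the returned results against logged relevance signals; a held-out set of queries would be needed to check that the evolved configuration generalizes.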

Simon Hughes

Chief Data Scientist, Dice.com
I am currently the Chief Data Scientist at Dice.com, the technology professional recruiting site. I am also a PhD candidate at DePaul University, studying machine learning and natural language processing. At Dice, I have developed multiple recommender engines using Solr for recommending jobs and candidates, which are currently live on our site, and have optimized the accuracy and relevancy of our job and candidate search. I have also...

Thursday October 13, 2016 2:20pm - 3:00pm