
Solr Integration in the Anserini Information Retrieval Toolkit

Date

2019

Authors

Clancy, Ryan
Eskildsen, Toke
Ruest, Nick
Lin, Jimmy

Abstract

Anserini is an open-source information retrieval toolkit built around Lucene to facilitate replicable research. In this demonstration, we examine different architectures for Solr integration in order to address two current limitations of the system: the lack of an interactive search interface and support for distributed retrieval. Two architectures are explored: In the first approach, Anserini is used as a frontend to index directly into a running Solr instance. In the second approach, Lucene indexes built directly with Anserini can be copied into a Solr installation and placed under its management. We discuss the tradeoffs associated with each architecture and report the results of a performance evaluation comparing indexing throughput. To illustrate the additional capabilities enabled by Anserini/Solr integration, we present a search interface built using the open-source Blacklight discovery interface.

Keywords

Distributed retrieval, SolrCloud, Solr, Information systems, Search engine architectures and scalability, Information retrieval, Anserini, Blacklight, Lucene

Citation

Ryan Clancy, Toke Eskildsen, Nick Ruest, and Jimmy Lin. “Solr Integration in the Anserini Information Retrieval Toolkit.” Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (2019).