Capreolus is a toolkit for running end-to-end neural ad hoc retrieval experiments. It provides fine-grained control over the entire experimental pipeline through interchangeable, configurable modules, so researchers can both experiment with existing methods (e.g., rerankers, first-stage rankers, and indexes) and implement their own methods within a configurable pipeline.
Pre-trained language models like BERT have been demonstrated to be both effective and efficient ranking methods when combined with approximate nearest neighbor search, which can quickly match dense representations of queries and documents. However, pre-trained language models alone do not fully capture information about uncommon entities. EVA overcomes this limitation by enriching existing dense query and document representations with entity information from an external source. To handle documents that contain many loosely related entities, EVA employs a strategy for creating multiple entity representations that reflect different views of a document. For example, a document about a scientist may cover aspects of her personal life and recent work, which correspond to different views of the entity.
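To make the idea of multiple views concrete, here is a minimal Python sketch of how a query could be scored against a document that has several entity-enriched representations. It is an illustration under stated assumptions, not EVA's actual method: the helper names (`enrich`, `score_document`), the use of element-wise addition to combine text and entity vectors, the max-over-views scoring rule, and the toy three-dimensional vectors are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product, the usual similarity in dense retrieval."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def enrich(text_vec: Vector, entity_vec: Vector) -> List[float]:
    """Combine a text vector with an entity vector.

    Assumption: element-wise addition stands in for whatever
    combination the real model uses.
    """
    if len(text_vec) != len(entity_vec):
        raise ValueError("vectors must have the same dimension")
    return [t + e for t, e in zip(text_vec, entity_vec)]


def score_document(query_vec: Vector, doc_views: Sequence[Vector]) -> Tuple[float, int]:
    """Score a query against every view and keep the best one.

    Assumption: the document score is the maximum similarity over views.
    Returns (best_score, index_of_best_view).
    """
    if not doc_views:
        raise ValueError("a document needs at least one view")
    scores = [dot(query_vec, view) for view in doc_views]
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], best


# Toy data (made up): one document about a scientist with two entity views.
doc_text = [0.2, 0.1, 0.0]
view_personal_life = [0.0, 0.9, 0.1]
view_recent_work = [0.9, 0.0, 0.1]
doc_views = [
    enrich(doc_text, view_personal_life),
    enrich(doc_text, view_recent_work),
]

# A query about the scientist's research, enriched with its own entity vector.
query = enrich([0.1, 0.0, 0.0], [0.8, 0.0, 0.1])

best_score, best_view = score_document(query, doc_views)
print(f"best view index: {best_view}")
```

In this toy setup the query about the scientist's research matches the "recent work" view rather than the "personal life" view, which is the behavior that multiple views are meant to enable. In a real system each view would be indexed as its own vector for approximate nearest neighbor search, and the per-document aggregation would happen after retrieval.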
To learn more, read the EVA paper or view the code on GitHub.