We, Benedikt Grundmann and Daniel Fischer, developed the Alwis suite as part of our bachelor theses. The suite consists of several applications that together provide everything needed to compare cluster-based retrieval schemes. We provide tools for corpus preprocessing, i.e. bundling text files into a stream, filtering unwanted characters and terms out of such streams, and building sparse matrices from them. Furthermore, we provide tools to calculate models from the sparse matrices using different cluster-based retrieval schemes; we have implemented LSI, PLSI and Spectral Clustering.
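The preprocessing steps named above (bundling, filtering, building a sparse matrix) can be illustrated with a minimal Python sketch. This is not the Alwis implementation: the function names, the stop-word list, the letters-only filter rule and the dictionary-based matrix layout are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; the actual filter rules of Alwis are not stated here.
STOP_WORDS = {"the", "a", "of", "and", "is"}


def bundle(documents):
    """Bundle (name, text) pairs into one stream of (doc_index, text) records."""
    for index, (_name, text) in enumerate(documents):
        yield index, text


def filter_terms(stream):
    """Keep only runs of letters, lowercase them, and drop stop words."""
    for doc_index, text in stream:
        for term in re.findall(r"[a-z]+", text.lower()):
            if term not in STOP_WORDS:
                yield doc_index, term


def build_sparse_matrix(term_stream):
    """Build a sparse term-document count matrix as {term: {doc_index: count}}."""
    matrix = defaultdict(lambda: defaultdict(int))
    for doc_index, term in term_stream:
        matrix[term][doc_index] += 1
    # Convert to plain dicts so that only non-zero entries are stored.
    return {term: dict(row) for term, row in matrix.items()}


# Two tiny in-memory "files" stand in for a real corpus.
documents = [
    ("a.txt", "The cat sat; the cat slept."),
    ("b.txt", "A dog and a cat."),
]
matrix = build_sparse_matrix(filter_terms(bundle(documents)))
```

Only non-zero counts are stored, which is the point of a sparse representation: most terms occur in only a few documents of a corpus. A matrix of this kind would then be the input to a model calculation such as LSI.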
The main application is a visualization tool which enables the user to interactively explore the calculated models. It emphasizes the comparison of single, specific queries across several concept-based retrieval schemes. Nevertheless, the user can also measure the quality of a concept-based retrieval scheme using "classical" measurement techniques such as precision-recall curves.
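As a reminder of what a precision-recall curve measures, the following Python sketch computes the curve's points for a single query. It is a generic textbook formulation, not code taken from Alwis; the function name and the document identifiers are hypothetical.

```python
def precision_recall_points(ranking, relevant):
    """Return (recall, precision) pairs, one per relevant document found.

    ranking  -- document ids in the order returned by the retrieval scheme
    relevant -- set of document ids judged relevant for the query
    """
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            recall = hits / len(relevant)   # share of relevant documents found so far
            precision = hits / rank         # share of retrieved documents that are relevant
            points.append((recall, precision))
    return points


# Hypothetical ranking for one query, with two relevant documents.
ranking = ["d3", "d1", "d7", "d2"]
relevant = {"d3", "d2"}
points = precision_recall_points(ranking, relevant)
```

Plotting such points for several retrieval schemes on the same query set gives a coarse, aggregate comparison, which complements the per-query exploration that the visualization tool focuses on.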
Our work was supervised by Holger Bast and his colleague Ingmar Weber.
Please see the respective bachelor theses for more details. The theoretical foundations and models are discussed in Benedikt Grundmann's paper (PDF, 261 KB). The design and the implementation are discussed in Daniel Fischer's paper (PDF, 1100 KB).
Alwis is published under the GNU General Public License (GPL).
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.
For a complete version of the GPL please see here or visit the Free Software Foundation website.
In addition to the bachelor theses mentioned above, more detailed information on using the Alwis suite can be found in the Alwis documentation. The documentation is available
There is a binary distribution of Alwis for Microsoft Windows. The fully automated installers will do all the work for you. Two installers are available: the complete installer contains all programs belonging to Alwis as well as sample corpora and models, while the compact installer contains only the programs, without sample corpora and models, to reduce the download size.
Alternatively, you may choose to build Alwis yourself from the source, e.g. on platforms for which no binary distribution is available. You will find detailed build instructions in the manual. The source package itself does not include the sample corpora and models.
The previously released versions of Alwis are available for download as well. Since the sample corpora and models have not changed significantly, only the packages without sample corpora and models are maintained for previous versions.