
WP3: Test Data Preparation and Procedures


Work within WP3 has been completed. Deliverable D3.2 (Performance Testing Plan) has been submitted to the Commission. It outlines the methodology and tools that will be used to evaluate the effectiveness of the prototypes being built. Large, so-called "TREC-style" tests will be employed for near-final and final prototypes, and simpler tests for intermediate prototypes. The plan is to use well-established, proven effectiveness measures such as precision and recall, together with known-item searches and overlap measures. Thanks to extensive past research, the interpretation of these measures is well understood, and using them keeps our results comparable with those of similar evaluations.
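
As an illustration of the measures named above, the following is a minimal Python sketch, not the project's actual evaluation tooling. The function names, the document identifiers, and the choice of a Jaccard-style ratio as the "overlap measure" are all assumptions made for this example; D3.2 may define these measures differently.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a single query.

    retrieved: document ids returned by the system (hypothetical ids)
    relevant:  document ids judged relevant for the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def known_item_rank(ranking, target):
    """1-based rank of the single known item, or None if it was not retrieved."""
    for position, doc_id in enumerate(ranking, start=1):
        if doc_id == target:
            return position
    return None


def overlap(result_a, result_b):
    """Jaccard-style overlap of two result sets (an assumed definition)."""
    set_a, set_b = set(result_a), set(result_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical ranked result list and relevance judgements for one query.
ranking = ["d1", "d2", "d3", "d4"]
relevant = {"d2", "d4", "d7"}

precision, recall = precision_recall(ranking, relevant)
rank = known_item_rank(ranking, "d3")
shared = overlap(ranking, ["d3", "d4", "d5", "d6"])
```

In this toy case two of the four retrieved documents are relevant, so precision is 0.5, and two of the three relevant documents were found, so recall is 2/3. The known item "d3" sits at rank 3, and the two result lists share two of six distinct documents.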

Deliverable D3.1 (Test Collection) originally included only the publicly available OHSUMED test collection, for which the corresponding set of queries was translated from English into German. However, to allow evaluation on a truly bilingual corpus (one in which both queries and documents are available in two languages), the test collection has been extended with a MUCHMORE-specific bilingual test collection based on a parallel corpus of scientific medical journal abstracts obtained through the Springer Link web site. The remaining work on this corpus (pooling and relevance assessments) will be carried out in work packages WP8 (Cross-Lingual Information Access) and WP9 (Performance Evaluation).


Last modified: December 2001