WP8: Cross-Lingual Information Access


Work in WP8 concentrated on applying and further developing several methods for concept-based cross-lingual information retrieval. These include fully automatic methods that do not depend on any manually constructed resources, alongside methods that rely on domain-specific resources such as UMLS.

EIT applied its method for generating a similarity thesaurus to the Springer corpus, using a frequency-analysis tool to identify domain-specific terms.
CSLI worked on applying a bilingual, vector-based word-space model to the Springer corpus. An experimental demo that allows word-space-based access to this and other corpora is available.
CMU implemented a hierarchical k-nearest-neighbor (HkNN) classifier that relies on the MeSH hierarchy of concepts (MeSH is part of the UMLS Metathesaurus).
DFKI worked on automatic semantic annotation of the Springer corpus with UMLS terms and semantic relations. EIT adapted its retrieval algorithm to process these terms and semantic relations. In this context, DFKI and EIT together set up an experimental demo that gives cross-lingual access to semantically annotated Springer abstracts.
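
As an illustration of the general idea behind a bilingual, vector-based word space, the sketch below builds one from a tiny document-aligned corpus and looks up cross-lingual neighbours by cosine similarity. It is a minimal sketch only and not the implementation used by any WP8 partner: the function names (build_word_space, cosine, nearest_translations), the raw-count weighting, the whitespace tokenisation, and the three toy English-German document pairs are all assumptions made for this example.

```python
import math
from collections import defaultdict


def build_word_space(aligned_docs):
    """Map each (language, word) pair to a sparse vector over aligned document ids.

    aligned_docs is a list of (english_text, german_text) pairs; both sides of a
    pair share one dimension, which is what makes the space bilingual.
    """
    space = defaultdict(lambda: defaultdict(float))
    for doc_id, (en_text, de_text) in enumerate(aligned_docs):
        for lang, text in (("en", en_text), ("de", de_text)):
            for token in text.lower().split():
                space[(lang, token)][doc_id] += 1.0  # raw counts (assumed weighting)
    return space


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(dim, 0.0) for dim, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_translations(space, word, src="en", tgt="de", k=3):
    """Return the k target-language words closest to a source-language word."""
    query = space.get((src, word))
    if query is None:
        return []  # word never seen in the source language
    scored = [
        (cosine(query, vec), other)
        for (lang, other), vec in space.items()
        if lang == tgt
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(other, round(score, 3)) for score, other in scored[:k]]


# Hypothetical toy corpus standing in for aligned abstracts.
aligned = [
    ("heart surgery", "herz operation"),
    ("heart failure", "herz versagen"),
    ("knee surgery", "knie operation"),
]
space = build_word_space(aligned)
print(nearest_translations(space, "heart", k=2))
```

Because words from both languages are indexed by the same aligned documents, a query term in one language can retrieve related terms in the other without a bilingual dictionary. A realistic system would add term weighting, stop-word filtering, and dimensionality reduction, none of which are shown here.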
Last modified: December 2001
