Work in WP8 concentrated on the application and further
development of several methods for concept-based cross-lingual
information retrieval. These include fully automatic
methods that are independent of any manually constructed
resources, as well as methods that rely on domain-specific
resources such as UMLS.
EIT applied its method for generating
a similarity thesaurus to the Springer corpus, using
a frequency analysis tool for finding domain-specific
terms.
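
As a rough illustration of how a frequency analysis can surface
domain-specific terms, the sketch below compares smoothed relative
frequencies in a domain corpus against a general reference corpus.
This is an assumption about the general technique, not a description
of the EIT tool; the function name, the add-one smoothing, and the
min_ratio threshold are hypothetical.

```python
from collections import Counter


def domain_specific_terms(domain_tokens, reference_tokens, min_ratio=2.0):
    """Return (term, ratio) pairs for terms over-represented in the domain corpus.

    Hypothetical sketch: the ratio is the add-one-smoothed relative
    frequency in the domain corpus divided by the smoothed relative
    frequency in the reference corpus.
    """
    dom = Counter(domain_tokens)
    ref = Counter(reference_tokens)
    dom_total = sum(dom.values())
    ref_total = sum(ref.values())
    vocab_size = len(set(dom) | set(ref))

    scored = []
    for term, count in dom.items():
        p_dom = (count + 1) / (dom_total + vocab_size)
        p_ref = (ref[term] + 1) / (ref_total + vocab_size)
        ratio = p_dom / p_ref
        if ratio >= min_ratio:
            scored.append((term, ratio))
    # Highest ratio first; ties are broken alphabetically.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Toy corpora, invented for illustration only.
DOMAIN = "the patient had a myocardial infarction the infarction was treated".split()
REFERENCE = "the weather was fine and the train was late".split()

if __name__ == "__main__":
    for term, ratio in domain_specific_terms(DOMAIN, REFERENCE):
        print(term, round(ratio, 2))
```

Terms selected this way could then serve as the vocabulary over
which a similarity thesaurus is built.
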
CSLI worked on applying a bilingual,
vector-based word-space model to the Springer corpus.
An experimental demo
that allows word-space-based access to this and other
corpora is available.
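
The word-space idea can be sketched as follows, assuming that words
of both languages are represented by their occurrence counts over
aligned document pairs and compared by cosine similarity. The toy
data, the language tags, and all function names are hypothetical and
are not taken from the CSLI system.

```python
import math
from collections import defaultdict


def build_word_space(aligned_pairs, langs=("de", "en")):
    """Map (language, word) to a vector of counts, one dimension per aligned pair."""
    n_docs = len(aligned_pairs)
    space = defaultdict(lambda: [0] * n_docs)
    for idx, pair in enumerate(aligned_pairs):
        for lang, text in zip(langs, pair):
            for token in text.lower().split():
                space[(lang, token)][idx] += 1
    return dict(space)


def cosine(u, v):
    """Cosine similarity of two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(space, word, target_lang, top_n=3):
    """Return the top_n words of target_lang closest to the given (language, word) key."""
    query = space[word]
    scored = [
        (cosine(query, vec), w)
        for (lang, w), vec in space.items()
        if lang == target_lang
    ]
    # Highest similarity first; ties are broken alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(w, score) for score, w in scored[:top_n]]


# Toy aligned German/English abstracts, invented for illustration only.
ALIGNED_PAIRS = [
    ("herz infarkt", "heart infarction"),
    ("herz chirurgie", "heart surgery"),
    ("leber chirurgie", "liver surgery"),
]

if __name__ == "__main__":
    space = build_word_space(ALIGNED_PAIRS)
    print(nearest(space, ("de", "herz"), "en"))
```

Because both languages share the same document dimensions, a query
word in one language can be matched directly against words of the
other language without a dictionary.
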
CMU implemented a hierarchical k-nearest
neighbor (HkNN) classifier, which relies on the
MeSH concept hierarchy (MeSH is part of the UMLS
Metathesaurus).
DFKI worked on automatic semantic
annotation of the Springer corpus with UMLS terms
and semantic relations. EIT adapted its retrieval
algorithm to process these terms and semantic relations.
In this context, DFKI and EIT together set up an experimental
demo that gives cross-lingual access to semantically
annotated Springer abstracts.