In the context of WP7.1 (Multilingual Terminology Extraction),
each partner involved in this work package adapted its
bilingual term extraction tools to the project-specific
requirements. This includes
adaptation to the German/English pair, integration of a
bilingual thesaurus, sentence alignment of the Springer
corpus, the development of domain-specific morphological
resources and the development of new tools for bilingual
terminology extraction from comparable corpora. In addition,
WP7.1 covers the assessment of thesauri in different
languages. Some of the tools developed for bilingual terminology
extraction will be adapted to this end.
Developments in WP7.2 (Relation Extraction) include a quantitative
and qualitative evaluation of the distribution and usefulness
of UMLS-based semantic relations in the Springer corpus.
The evaluation showed that most relations were not very
useful. Therefore, further work will concentrate on extracting
novel relations through a co-occurrence analysis over concepts
within abstracts and individual sentences. Also, the extraction
and tagging of semantic relations will be improved by the
use of grammatical relations and by sense disambiguation.
With respect to grammatical relation tagging, an evaluation
corpus was developed that covers around 600 sentences from
the German part of the Springer corpus. Sentences are manually
annotated with grammatical relations, such as subject, object,
indirect object and various PP-argument roles.
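
The co-occurrence analysis planned for WP7.2 can be illustrated with a
minimal sketch. It assumes that each abstract has already been split into
sentences and that each sentence has been mapped to a list of concept
identifiers. The identifiers, the sample data and the function name
cooccurrence_counts are hypothetical and are not taken from the project
tools. The sketch only counts unordered concept pairs per sentence and per
abstract; any weighting or filtering of the resulting pairs is left open.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(units):
    """Count how often each unordered concept pair shares a unit.

    `units` is an iterable of concept lists, one list per sentence
    or per abstract (hypothetical input format).
    """
    counts = Counter()
    for concepts in units:
        # Deduplicate and sort so each pair is counted once per unit
        # and always appears in the same order.
        for pair in combinations(sorted(set(concepts)), 2):
            counts[pair] += 1
    return counts


# Hypothetical data: two abstracts, each a list of sentences,
# each sentence a list of placeholder concept identifiers.
abstracts = [
    [["C_aspirin", "C_headache"], ["C_aspirin", "C_bleeding"]],
    [["C_aspirin", "C_headache", "C_fever"]],
]

# Sentence level: every sentence is one unit.
sentence_units = [sentence for abstract in abstracts for sentence in abstract]
# Abstract level: all concepts of an abstract form one unit.
abstract_units = [
    [concept for sentence in abstract for concept in sentence]
    for abstract in abstracts
]

sentence_counts = cooccurrence_counts(sentence_units)
abstract_counts = cooccurrence_counts(abstract_units)

for pair, n in abstract_counts.most_common():
    print(pair, "abstracts:", n, "sentences:", sentence_counts[pair])
```

Pairs that co-occur within an abstract but never within a single sentence,
such as the two placeholder concepts C_bleeding and C_headache here, show
why the two window sizes can yield different candidate relations.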