Sonderforschungsbereich 732:

Project D11-N (2014-2018)

A Crosslingual Approach to the Analysis of Compound Nouns

Principal Investigators: Sabine Schulte im Walde, Lonneke van der Plas
Researchers: Stefan Müller, Patrick Ziering 

This project proposes a compositional approach to the analysis of noun-noun (N-N) compounds, based on an interdependent three-level model: splitting the compound, capturing the meaning of its components, and identifying the covert relation that holds between them.

Ambiguity arises at all levels and is highest at the level where the implicit relation is uncovered. At the level of compound splitting, the German compound Kühlerwartung illustrates the problem with its two possible split points: Kühl-erwartung (‘cool expectation’) vs. Kühler-wartung (‘radiator maintenance’). The open-ended range of covert relations that can hold between the constituents of a compound becomes apparent in the following examples: a chocolate cake is a cake made of chocolate, a wedding cake is a cake made for a wedding, and a cupcake is a cake made using a metal cup.
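The splitting ambiguity in the German example above can be made concrete with a minimal sketch. It is an illustration only, not the project's method: the function name candidate_splits and the four-word toy lexicon are hypothetical, and a realistic splitter would also need corpus frequencies, linking elements and a way to rank the candidates.

```python
def candidate_splits(compound, lexicon):
    """Return every (modifier, head) pair whose two parts are both in the lexicon."""
    word = compound.lower()
    return [
        (word[:i], word[i:])
        for i in range(1, len(word))
        if word[:i] in lexicon and word[i:] in lexicon
    ]


# Hypothetical toy lexicon, chosen only to cover the example compound.
TOY_LEXICON = {"kühl", "kühler", "erwartung", "wartung"}

splits = candidate_splits("Kühlerwartung", TOY_LEXICON)
for modifier, head in splits:
    print(f"{modifier}-{head}")
```

Both split points survive the lexicon lookup, so the choice between them has to come from additional evidence, such as the meaning of the components or the context in which the compound occurs.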

Crosslingual approaches are promising for semantic analysis because languages vary in regular ways in how they express the same meaning. For example, whereas English leaves the compound relation covert, French uses prepositions that correlate with the relation type: chocolate cake (a cake made of chocolate) is translated as gâteau au chocolat, whereas wedding cake (a cake made for a wedding) is gâteau de mariage. We will use multilingual data throughout the project, both in analysis and in evaluation.

We will work towards a wide-coverage, integrated approach, using automatic, knowledge-lean, corpus-based methods.