Metaphor is a pervasive linguistic device that compares two seemingly unlike things based on their similarities. People use metaphors in everyday conversation as well as in literature to convey their meaning more effectively. The importance of metaphor in human language has generated interest in computational modeling of metaphor in the natural language processing field. Properly handling metaphor enables more accurate and human-like language systems: applications such as machine translation, sentiment analysis, and dialog agents can all benefit from metaphor processing.
The first three parts of the issue report introduce prior computational approaches to metaphor detection: selectional preference violation, lexical abstractness and concreteness, and lexical cohesion violation. The most popular approach is based on selectional preference violation. Selectional preferences capture the fact that a predicate prefers certain classes of arguments; for example, the verb "eat" prefers an animate being such as a person or animal in its subject position and food in its object position. Metaphors, especially verb metaphors, tend to violate these preferences, and this linguistic property has been widely used to detect metaphors, either on its own or in combination with other approaches. Next, the idea behind lexical abstractness and concreteness is that metaphorically used words tend to appear in unusual combinations with respect to concreteness. For example, a concrete noun is generally modified by a concrete adjective, but many metaphorical expressions pair an abstract noun with a concrete adjective. Lastly, unlike the previous two approaches, which only use information within a sentence, the lexical cohesion approach uses wider context. Lexical cohesion refers to the tendency of the words in a text to be semantically tied to the text's topic; because metaphorically used words usually come from another domain, they tend to break the lexical cohesion of the text.
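The first two of these ideas can be sketched in a few lines of code. The following is a minimal, illustrative sketch, not any system described in the report: the verb-object counts and the concreteness ratings are made-up toy values standing in for corpus statistics and for psycholinguistic norms of the kind real systems estimate from data.

```python
# Toy sketch of two metaphor-detection heuristics.
# All counts and ratings below are invented for illustration.

# Hypothetical corpus counts: how often each verb takes an object
# from each semantic class.
object_counts = {
    "eat": {"food": 950, "idea": 5, "time": 10},
}

def preference_strength(verb, noun_class, counts):
    """Relative frequency of the noun class among the verb's objects,
    i.e. an estimate of P(class | verb)."""
    total = sum(counts[verb].values())
    return counts[verb].get(noun_class, 0) / total

def violates_preference(verb, noun_class, counts, threshold=0.05):
    """Flag a verb-object pair as a metaphor candidate when the
    observed object class is rare for this verb."""
    return preference_strength(verb, noun_class, counts) < threshold

# Hypothetical concreteness ratings on a 1-5 scale
# (higher = more concrete).
concreteness = {"bright": 4.2, "idea": 1.8, "lamp": 4.8}

def concreteness_mismatch(adj, noun, ratings, gap=2.0):
    """Flag an adjective-noun pair when a concrete adjective modifies
    a much more abstract noun."""
    return ratings[adj] - ratings[noun] > gap

# "eat an idea" violates eat's strong preference for food objects;
# "bright idea" pairs a concrete adjective with an abstract noun.
print(violates_preference("eat", "idea", object_counts))      # prints True
print(concreteness_mismatch("bright", "idea", concreteness))  # prints True
```

Real systems replace the toy dictionaries with preferences induced from parsed corpora and with large-scale concreteness norms, but the decision rules follow the same shape: score the lexical combination and flag it when it falls outside the expected range.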
The next part describes key challenges in metaphor detection research. First, because metaphor is subjective, it is difficult to annotate data consistently, which makes it hard to build a large corpus for training machine learning models. Second, the existing metaphor datasets contain mostly dead metaphors. Models trained on such data could be beneficial for some language studies, but for downstream applications such as machine translation or sentiment analysis, they could be less useful. Third, the current approaches only target specific kinds of metaphor. Because most of them rely on some form of lexical expectation violation, they cannot detect metaphors whose figurative meaning unfolds across sentences, such as extended metaphors.
In summary, metaphor is central to human language, and computationally modeling it will benefit the development of more effective and more human-like language applications. However, the current state-of-the-art metaphor detection approaches have limitations that keep them from being deployed in real-world applications. Building corpora suited to the target application, as well as developing novel approaches that address diverse types of metaphor, is called for.