Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French





1 Computer Science Department, Stanford; 2 Linguistics Department

Abstract: Multiword expressions (MWEs), a known nuisance for both linguistics and NLP, blur the lines between syntax and semantics. Previous work on MWE identification has relied primarily on surface statistics, which perform poorly for longer MWEs and cannot model discontinuous expressions. To address these problems, we show that even the simplest parsing models can effectively identify MWEs of arbitrary length, and that Tree Substitution Grammars achieve the best results. Our experiments show a 36.4% F1 absolute improvement for French over an n-gram surface statistics baseline, currently the predominant method for MWE identification. Our models are useful for several NLP tasks in which MWE pre-grouping has improved accuracy.
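To make the contrast with surface statistics concrete, the following is a minimal sketch of one common n-gram association measure, pointwise mutual information (PMI) over adjacent word pairs. It is an illustration only and is not the baseline system evaluated in the paper; the function name `pmi_bigrams`, the `min_count` threshold, and the toy French sentences are all assumptions introduced here.

```python
import math
from collections import Counter


def pmi_bigrams(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration; not the paper's baseline.
    Pairs seen fewer than `min_count` times are skipped.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy corpus (an assumption): "mis ... en place" appears once contiguously
# and once split by an intervening noun phrase.
tokens = (
    "il a mis en place un plan . "
    "elle a mis le plan en place . "
    "la mise en place du plan"
).split()

scores = pmi_bigrams(tokens)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

In this toy corpus the adjacent pair ("en", "place") recurs and receives a score, but ("mis", "en") is seen only once because the second sentence separates the verb from "en place". A purely contiguous n-gram statistic therefore has no direct way to treat the discontinuous occurrence as the same expression, which is the limitation the abstract attributes to surface statistics.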

Keywords: lexicon-grammar, multiword expression





Authors: Spence Green, Marie-Catherine De Marneffe, John Bauer, Christopher D. Manning

Source: https://hal.archives-ouvertes.fr/






