Identifying and extracting terms from textual sources is an indispensable task in information retrieval and question answering systems. Experiments indicate that multi-word terms are the best candidates to represent a specific domain in Arabic. In this research, we assume that Multi-Word Terms (MWTs) consist of words with similar contextual representations, and we propose a hybrid method for extracting multi-word terms from Arabic texts that combines linguistic and semantic approaches based on word embeddings. We use linguistic and morphosyntactic analysis of the Arabic language to find candidate terms, and we use the cosine similarity between the distributed representations of words to rank the candidate terms. The proposed methodology has been tested in a case study carried out in the environmental domain, with promising results.
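The ranking step can be illustrated with a minimal sketch. The abstract states only that cosine similarity between word embeddings is used to rank candidates; the aggregation below (mean pairwise similarity over a candidate's words), the function names, and the toy three-dimensional vectors are all assumptions made for illustration, not the authors' implementation.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector has no direction; treat as dissimilar
    return dot / (norm_u * norm_v)


def score_candidate(words, embeddings):
    """Mean pairwise cosine similarity of a candidate's words.

    Assumption: this aggregation is hypothetical. Candidates with fewer
    than two words, or with an out-of-vocabulary word, score 0.0.
    """
    pairs = list(combinations(words, 2))
    if not pairs or any(w not in embeddings for w in words):
        return 0.0
    total = sum(cosine_similarity(embeddings[a], embeddings[b]) for a, b in pairs)
    return total / len(pairs)


def rank_candidates(candidates, embeddings):
    """Sort candidate terms from highest to lowest score."""
    return sorted(candidates, key=lambda c: score_candidate(c, embeddings), reverse=True)


# Toy data (made up): real embeddings would be trained on an Arabic corpus.
POLLUTION = "تلوث"
AIR = "الهواء"
BOOK = "كتاب"

toy_embeddings = {
    POLLUTION: [0.9, 0.1, 0.0],
    AIR: [0.8, 0.2, 0.1],
    BOOK: [0.0, 0.1, 0.9],
}

# Candidates would come from the linguistic and morphosyntactic filter.
candidates = [(POLLUTION, BOOK), (POLLUTION, AIR)]
ranked = rank_candidates(candidates, toy_embeddings)
```

With these toy vectors, the candidate whose words point in similar directions ("air pollution") ranks above the unrelated pair, which mirrors the assumption that the words of an MWT share similar contextual representations.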
El-Khadir Lamrani, El Habib Ben Lahmer and Abdelaziz Marzak. Using Semantic Similarity with Word Embeddings for Arabic Multi-Words Term Extraction.
DOI: https://doi.org/10.36478/jeasci.2018.10092.10100
URL: https://www.makhillpublications.co/view-article/1816-949x/jeasci.2018.10092.10100