Journal of Engineering and Applied Sciences

Using Semantic Similarity with Word Embeddings for Arabic Multi-Words Term Extraction

El-Khadir Lamrani, El Habib Ben Lahmer and Abdelaziz Marzak
Page: 10092-10100 | Received 21 Sep 2022, Published online: 21 Sep 2022

Abstract

Identifying and extracting terms from textual sources is an indispensable task in information retrieval and question answering systems. Experiments indicate that multi-word terms are the best candidates for representing a specific domain in Arabic. In this research, we assume that Multi-Word Terms (MWTs) consist of words with similar contextual representations, and we propose a hybrid method for extracting multi-word terms from Arabic texts that combines linguistic and semantic approaches based on word embeddings. Linguistic and morphosyntactic analysis of the Arabic language is used to find candidate terms, and cosine similarity between the distributed representations of words is used to rank them. The proposed methodology has been tested in a case study carried out in the environmental domain, with promising results.
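
The ranking step described in the abstract can be illustrated with a minimal sketch. It assumes that candidate terms have already been produced by the linguistic filter and that each word has a pre-trained embedding. The function names, the toy three-dimensional vectors, and the choice to score a candidate by the mean pairwise cosine similarity of its words are all illustrative assumptions, not the authors' published implementation.

```python
from itertools import combinations

import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def score_candidate(words, embeddings):
    """Mean pairwise cosine similarity of the words in one candidate term.

    Assumption: averaging over word pairs is one plausible way to turn
    word-level similarity into a term-level score.
    """
    pairs = list(combinations(words, 2))
    total = sum(cosine_similarity(embeddings[a], embeddings[b]) for a, b in pairs)
    return total / len(pairs)


def rank_candidates(candidates, embeddings):
    """Sort candidate terms from highest to lowest score."""
    return sorted(
        candidates,
        key=lambda words: score_candidate(words, embeddings),
        reverse=True,
    )


# Hypothetical toy embeddings; real vectors would come from a model
# trained on an Arabic corpus.
toy_embeddings = {
    "تلوث": np.array([0.9, 0.1, 0.2]),    # "pollution"
    "الهواء": np.array([0.8, 0.2, 0.3]),  # "the air"
    "جديد": np.array([0.1, 0.9, 0.1]),    # "new"
}

# Hypothetical candidates, as if returned by the morphosyntactic filter.
candidates = [("تلوث", "جديد"), ("تلوث", "الهواء")]

ranked = rank_candidates(candidates, toy_embeddings)
for words in ranked:
    print(" ".join(words), round(score_candidate(words, toy_embeddings), 3))
```

With these toy vectors, the candidate whose words occupy nearby regions of the embedding space is ranked first, which reflects the assumption that the words of a genuine multi-word term share similar contextual representations.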

How to cite this article:

El-Khadir Lamrani, El Habib Ben Lahmer and Abdelaziz Marzak. Using Semantic Similarity with Word Embeddings for Arabic Multi-Words Term Extraction.
DOI: https://doi.org/10.36478/jeasci.2018.10092.10100
URL: https://www.makhillpublications.co/view-article/1816-949x/jeasci.2018.10092.10100