TY  - JOUR
T1  - A Comparative Analysis of Euclidean Distance and Cosine Similarity Measure for Automated Essay-Type Grading
AU  - Oduntan, Odunayo E.
AU  - Obe, Olumide O.
AU  - Falohun, Adeleye S.
AU  - Adeyanju, Ibrahim A.
JO  - Journal of Engineering and Applied Sciences
VL  - 13
IS  - 11
SP  - 4198
EP  - 4204
PY  - 2018
DA  - 2001/08/19
SN  - 1816-949x
DO  - jeasci.2018.4198.4204
UR  - https://makhillpublications.co/view-article.php?doi=jeasci.2018.4198.4204
KW  - reduced vector
KW  - modified principal component algorithm
KW  - automated essay-type grading system
KW  - Euclidean distance measure
KW  - cosine similarity measure
KW  - Evaluation
AB  - Evaluating students’ performance is unavoidable in any educational setting, and the score allocated to a student’s response depends on how close the supplied answer is to the expected answer. This study compares the effectiveness of the cosine similarity measure and the Euclidean distance measure, both of which are used to score similarity in an Automated Essay-Type Grading System (AETGS). In the AETGS, the contents of the marking schemes were transcribed into electronic form with a text editor and saved as .txt files, while students’ answers were assumed to be in .txt format. The .txt documents were pre-processed by removing stopwords with a standard stopword list and by reducing morphological variants with Porter’s stemming algorithm. N-gram terms were derived for each student’s response and for the Marking Scheme (MS) using the vector space model. A Document Term Matrix (DTM) was generated with the N-gram terms of the MS as columns and the students’ responses as rows. A modified principal component analysis algorithm was used to reduce the sparseness of the DTM and obtain a reduced vector representation of the students’ answers and the marking scheme. The reduced vector representation of each student’s answer was graded against the mark assigned to each question in the marking scheme, using the cosine similarity measure and the Euclidean distance measure. The developed AETGS was implemented in MATLAB 8.1 (R2013a). The effect of the two measures on the developed system was evaluated using the Pearson correlation coefficient for two courses: CMP401 (Organization of Programming Languages) and CMP205 (Operating System I). The results showed that the cosine similarity measure has a higher positive correlation than the Euclidean distance measure.
ER  - 
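The two measures compared in the abstract can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation: the vectors below are hypothetical stand-ins for the reduced representations of a marking scheme and of student answers, and the record does not state how either measure is converted into a mark, so that step is left out.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b; 0.0 means identical vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical reduced vectors (illustrative values, not data from the paper).
marking_scheme = [2.0, 1.0, 0.0, 1.0]
short_answer = [2.0, 1.0, 0.0, 1.0]  # same terms, same weights
long_answer = [4.0, 2.0, 0.0, 2.0]   # same proportions, every weight doubled
off_topic = [0.0, 0.0, 3.0, 0.0]     # shares no weighted term with the scheme

for name, answer in [("short", short_answer),
                     ("long", long_answer),
                     ("off-topic", off_topic)]:
    print(name,
          round(cosine_similarity(marking_scheme, answer), 3),
          round(euclidean_distance(marking_scheme, answer), 3))
```

In this toy setup the longer answer points in the same direction as the marking scheme, so its cosine similarity matches that of the identical answer, while its Euclidean distance is nonzero because the magnitudes differ. Cosine similarity depends only on direction and Euclidean distance also depends on vector length, which is a general property of the two measures rather than a finding taken from the paper.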