Proposal of Japanese Vocabulary Difficulty Level Dictionaries for Automated Essay Scoring Support System Using Rubric

  • 1 School of Contemporary International Studies, Nagoya University of Foreign Studies, Iwasaki-cho, Nisshin 470-0197, Japan;
    2 School of Media and Design, Nagoya University of Arts and Sciences, Nagoya 470-0196, Japan;
    3 Faculty of Science and Engineering, Nanzan University, Nagoya 466-8673, Japan

Received date: 2018-12-18

  Revised date: 2019-07-16

  Online published: 2020-12-29

Supported by

This work was supported by JSPS KAKENHI (Nos. 18K11589, 17K00432).

Abstract

We are developing a Moodle plug-in that serves as an automated essay scoring (AES) support system for the basic education of university students. The system evaluates essays against a rubric with five evaluation viewpoints: “Contents, Structure, Evidence, Style, and Skill”. Vocabulary level is one of the scoring items under Skill. It is calculated using the Japanese Language Learners’ Dictionaries constructed by Sunakawa et al. Because these dictionaries do not fully cover the words used in student-level essays, the accuracy of vocabulary level scoring suffers. In this paper, we propose constructing comprehensive Japanese vocabulary difficulty level dictionaries using Japanese Wikipedia as the corpus. We apply Latent Dirichlet Allocation (LDA) to the Wikipedia corpus and obtain the word appearance probability as one of the indexes of word difficulty. For words that rarely appear, we use the TF-IDF value instead of the LDA value. As a result, we constructed highly comprehensive Japanese vocabulary difficulty level dictionaries. We confirmed that, using the constructed dictionaries, the vocabulary level can be scored for all words in the test dataset.
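
The fallback from an LDA-derived value to a TF-IDF value for rare words can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `min_df` threshold, the particular TF-IDF variant, and the assumption that LDA word probabilities are already available as a plain dictionary are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def tf_idf_scores(docs):
    """Return (max TF-IDF per word, document frequency per word).

    docs is a list of tokenized documents (lists of words).
    The TF-IDF variant here (relative term frequency times log(N / df))
    is an assumption; the abstract does not specify the formula.
    """
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each word once per document

    scores = {}
    for doc in docs:
        tf = Counter(doc)
        for word, count in tf.items():
            value = (count / len(doc)) * math.log(n_docs / df[word])
            # keep the highest score the word reaches in any document
            scores[word] = max(scores.get(word, 0.0), value)
    return scores, df


def difficulty_index(word, lda_prob, tfidf, df, min_df=2):
    """Choose the source of the difficulty index for one word.

    lda_prob is assumed to map words to an appearance probability
    obtained beforehand from a topic model. Words seen in fewer than
    min_df documents (a hypothetical cut-off for "rarely appears")
    fall back to their TF-IDF value.
    """
    if df.get(word, 0) >= min_df and word in lda_prob:
        return ("lda", lda_prob[word])
    if word in tfidf:
        return ("tfidf", tfidf[word])
    return ("unknown", None)
```

With this arrangement every word that occurs in the corpus receives some index, which reflects the coverage goal stated in the abstract; how the two kinds of values are mapped onto a common difficulty scale is left open here.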

Cite this article

Megumi Yamamoto, Nobuo Umemura, Hiroyuki Kawano. Proposal of Japanese Vocabulary Difficulty Level Dictionaries for Automated Essay Scoring Support System Using Rubric[J]. Journal of the Operations Research Society of China, 2020, 8(4): 601-617. DOI: 10.1007/s40305-019-00270-z

References

[1] Corpus Survey: Well-known and influential corpora, http://www.lancaster.ac.uk/-fass/projects/corpus/cbls/corpora.asp (2018-04-30)
[2] Kyoto University Text Corpus Version 4.0, http://shachi.org/resources/4227 (2018-04-30) (in Japanese)
[3] Introduction to the BCCWJ, http://pj.ninjal.ac.jp/corpus_center/bccwj/freq-list.html (2018-04-30)
[4] Attali, Y., Burstein, J.: Automated essay scoring with e-rater® V.2. J. Technol. Learn. Assess. 4(3), 3-30 (2006)
[5] Breland, H.M.: Word frequency and word difficulty: a comparison of counts in four corpora. Psychol. Sci. 7(2), 96-99 (1996)
[6] Ishioka, T., Kameda, M.: Automated Japanese essay scoring system based on articles written by experts. In: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL, pp. 233-240 (2006)
[7] Yamamoto, M., Umemura, N., Kawano, H.: Automated essay scoring system based on Rubric. Stud. Comput. Intell. Appl. Comput. Inf. Technol. 727, 177-190 (2017)
[8] Yamamoto, M., Umemura, N., Kawano, H.: Implementation of automated report scoring system based on Rubric. In: The 79th National Convention of Information Processing Society of Japan, Proceeding DVD (2017) (in Japanese)
[9] Yamamoto, M., Umemura, N., Kawano, H.: Development and evaluation of the plugin for automated scoring of reports. In: Proceedings of Moodle Moot Japan 2017 Annual Conference, pp. 16-21 (2017) (in Japanese)
[10] Sunakawa, Y., Lee, J., Takahara, M.: The construction of a database to support the compilation of Japanese learners' dictionaries. Acta Linguist. Asiat. 2(2), 97-115 (2012)
[11] Qi, L., Xu, Z., Yang, Q.: Preface. J. Oper. Res. Soc. Chin. 5(1), 1-2 (2017)
[12] Li, Q., He, Y., Wu, L., Wang, R.: Robust PCA for ground moving target indication in wide-area surveillance radar system. J. Oper. Res. Soc. Chin. 1(1), 135-153 (2013)
[13] Kanasugi, T., Kasahara, K., Inago, N., Amano, S.: Selection of a basic vocabulary based on word familiarity ratings. In: IEICE, vol. 19, no. 6, pp. 502-510 (2004) (in Japanese)
[14] Kondo, T., Amano, S.: Lexical properties of Japanese, Nihongo-no Goitokusei: significance and problems. IEICE Technical Report. Thought Lang. 100, 1-8 (2000) (in Japanese)
[15] Kajiwara, T., Komachi, M.: Simple PPDB: Japanese. In: Proceedings of the Twenty-third Annual Meeting of the Association for Natural Language Processing, pp. 529-532 (2017) (in Japanese)
[16] Takigawa, M., Yamana, H.: A proposal of word weighting method in specific field and its adoption for the estimation of users' expertise appearing in their tweets. In: FIT2016, pp. 1-7 (2016) (in Japanese)
[17] Robertson, S., Zaragoza, H.: The probabilistic relevance framework: BM25 and beyond. Found. Trends Inf. Retr. 3(4), 333-389 (2009)
[18] Iwata, T.: Topic Models. Kodansha, Tokyo (2016) (in Japanese)
[19] Matsukawa, H., Oyama, M., Negishi, C., Arai, Y., Iwasaki, C., Hotta, H.: Analysis of free descriptions of course evaluation questionnaires using topic model. In: Japan Journal of Educational Technology, vol. 41, no. 3, pp. 233-244 (2017) (in Japanese)