The GDELT Project

ML-SENTICON's 5 Language Editions Now Available in GCAM

Over the last few weeks we've added the Spanish adaptation of ANEW and Hedonometer's Arabic, Chinese, French, German, Hindi, Indonesian, Korean, Pashto, Portuguese, Russian, Spanish, and Urdu dictionaries to GCAM's non-English emotional dictionaries. Today we're excited to announce the addition of "ML-SENTICON: Multilingual layered sentiment lexicons at lemma level." ML-SENTICON includes dictionaries for English, Spanish, Catalan, Basque, and Galician. What sets it apart is that it is "multi-layered, allowing applications to trade off between the amount of available words and the accuracy of the estimations." In essence, where other emotional dictionaries lump all of the words for a given emotion together, ML-SENTICON groups them into successive layers. Each layer adds more words, but with lower confidence than the layer before it, making it possible to tune emotional scoring for recall versus precision.
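To make the layering idea concrete, here is a minimal Python sketch of how an application might trade recall for precision by choosing how many layers to use. Everything in it is an assumption made for illustration: the three-layer toy lexicon, its lemmas and polarity values, and the `build_lexicon` and `score` helpers are hypothetical and do not reflect ML-SENTICON's actual file format, layer count, or scores, nor how GCAM computes its fields.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (invented values, not ML-SENTICON data).
# Each layer maps a lemma to a polarity in [-1, 1]. Layer 1 is the
# smallest and most reliable; later layers add more lemmas whose
# polarity estimates are assumed to be less reliable.
LAYERS: List[Dict[str, float]] = [
    {"excellent": 0.875, "terrible": -0.875},  # layer 1
    {"good": 0.5, "bad": -0.5},                # layer 2
    {"fine": 0.25, "odd": -0.25},              # layer 3
]


def build_lexicon(max_layer: int) -> Dict[str, float]:
    """Merge layers 1..max_layer into a single lemma -> polarity map."""
    lexicon: Dict[str, float] = {}
    for layer in LAYERS[:max_layer]:
        for lemma, polarity in layer.items():
            # If a lemma appeared in an earlier layer, keep that value.
            lexicon.setdefault(lemma, polarity)
    return lexicon


def score(lemmas: List[str], max_layer: int) -> Tuple[float, int]:
    """Return (mean polarity of matched lemmas, number of matches)."""
    lexicon = build_lexicon(max_layer)
    hits = [lexicon[lemma] for lemma in lemmas if lemma in lexicon]
    mean = sum(hits) / len(hits) if hits else 0.0
    return mean, len(hits)


if __name__ == "__main__":
    doc = ["the", "food", "be", "good", "but", "odd"]
    for max_layer in (1, 2, 3):
        mean, matched = score(doc, max_layer)
        print(f"layers 1..{max_layer}: matched={matched}, mean={mean}")
```

On this sample input, using only layer 1 matches no lemmas, while using all three layers matches two. That is the tradeoff in miniature: a smaller, higher-confidence vocabulary covers less text, and a larger one covers more text with less reliable scores.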

For more information, read the full paper describing the dictionary.  When using any of these scores, cite them as "Cruz, Fermin L., Jose A. Troyano, Beatriz Pontes, F. Javier Ortega. Building layered, multilingual sentiment lexicons at synset and lemma levels, Expert Systems with Applications, 2014."