.. _th: Sentiment lexicon-based model ================================= This model introduces a Sentiment Lexicon-based model into EvoMSA. The idea is to count the number of positive and negative words that appear on an affective lexicon. This model is appropriately described `here `_. This model has been implemented for Arabic, English, and Spanish, and can be used as follows: For example, let us read a dataset to train EvoMSA. >>> from EvoMSA import base >>> from microtc.utils import tweet_iterator >>> import os >>> tweets = os.path.join(os.path.dirname(base.__file__), 'tests', 'tweets.json') >>> D = list(tweet_iterator(tweets)) >>> X = [x['text'] for x in D] >>> y = [x['klass'] for x in D] Once the dataset is load, EvoMSA using lexicon model is trained as follows: >>> from EvoMSA.base import EvoMSA >>> evo = EvoMSA(TH=True, lang='es').fit(X, y) >>> evo.predict(['buenos dias']) Particularly, the following classes implement the lexicon-based model: * :py:class:`EvoMSA.model.ThumbsUpDownAr` * :py:class:`EvoMSA.model.ThumbsUpDownEn` * :py:class:`EvoMSA.model.ThumbsUpDownEs` These models can be tested as follow: >>> from EvoMSA.model import ThumbsUpDownEn >>> th = ThumbsUpDownEn() >>> th['good morning']