StackGeneralization API

class StackGeneralization[source]

The idea behind stack generalization is to train an estimator on the predictions made by the base classifiers or regressors.

Parameters:
  • decision_function_models (List of BoW or DenseBoW) – Models that represent the text by calling their decision function.

  • transform_models (List of BoW or DenseBoW) – Models that represent the text by calling their transform method.

>>> from EvoMSA import DenseBoW, BoW, StackGeneralization
>>> from microtc.utils import tweet_iterator
>>> from EvoMSA.tests.test_base import TWEETS
>>> emoji = DenseBoW(lang='es', dataset=False, keyword=False)
>>> dataset = DenseBoW(lang='es', emoji=False, keyword=False)
>>> bow = BoW(lang='es')
>>> stacking = StackGeneralization(decision_function_models=[bow],
...                                transform_models=[dataset, emoji])
>>> stacking.fit(list(tweet_iterator(TWEETS)))
>>> stacking.predict(['Buenos dias']).tolist()
['P']
__new__(**kwargs)
__init__(decision_function_models: list = [], transform_models: list = [], decision_function_name: str = 'predict_proba', estimator_class=<class 'sklearn.naive_bayes.GaussianNB'>, estimator_kwargs={}, n_jobs: int = 1, **kwargs) → None[source]
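
The meta-estimator defaults to GaussianNB and receives estimator_kwargs at construction; decision_function_name selects the method used to read its outputs. In principle, any scikit-learn estimator can be supplied instead; the sketch below uses LogisticRegression purely as an illustration (an assumption, not a requirement of the API).

>>> from EvoMSA import BoW, StackGeneralization
>>> from microtc.utils import tweet_iterator
>>> from EvoMSA.tests.test_base import TWEETS
>>> from sklearn.linear_model import LogisticRegression
>>> D = list(tweet_iterator(TWEETS))
>>> bow = BoW(lang='es')
>>> # LogisticRegression and max_iter=1000 are illustrative choices only
>>> stacking = StackGeneralization(decision_function_models=[bow],
...                                estimator_class=LogisticRegression,
...                                estimator_kwargs=dict(max_iter=1000)).fit(D)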
fit(*args, **kwargs) → StackGeneralization[source]
>>> from EvoMSA import DenseBoW, BoW, StackGeneralization
>>> from microtc.utils import tweet_iterator
>>> from EvoMSA.tests.test_base import TWEETS
>>> D = list(tweet_iterator(TWEETS))
>>> emoji = DenseBoW(lang='es', dataset=False, keyword=False)
>>> dataset = DenseBoW(lang='es', emoji=False, keyword=False)
>>> bow = BoW(lang='es')
>>> stacking = StackGeneralization(decision_function_models=[bow],
...                                transform_models=[dataset, emoji]).fit(D)
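
The method returns the fitted instance, so the call can be chained as above; predictions are then obtained exactly as in the class-level example (the expected label below mirrors that example).

>>> stacking.predict(['Buenos dias']).tolist()
['P']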
transform(D: List[List | dict], y=None) → ndarray[source]

Represent the texts in D in the vector space.

Parameters:

  • D (List of texts or dictionaries.) – Texts to be transformed. If it is a list of dictionaries, the text is taken from the field given by the key attribute.

>>> from EvoMSA import DenseBoW, BoW, StackGeneralization
>>> from microtc.utils import tweet_iterator
>>> from EvoMSA.tests.test_base import TWEETS
>>> D = list(tweet_iterator(TWEETS))
>>> emoji = DenseBoW(lang='es', dataset=False, keyword=False)
>>> dataset = DenseBoW(lang='es', emoji=False, keyword=False)
>>> bow = BoW(lang='es')
>>> df_models = [dataset, emoji, bow]
>>> stacking = StackGeneralization(decision_function_models=df_models).fit(D)
>>> stacking.transform(['buenos días'])
array([[-1.56701076, -0.95614898, -0.39118087, 0.45360793, -1.65985598,
        -1.08645745, -0.67770805, 0.9703371, -1.40547817, -1.01340492,
        -0.57912169, 0.90450232]])
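
Each model in decision_function_models contributes the columns of its decision function (in the multiclass case, typically one column per class). Assuming the TWEETS development set has four polarity labels, the three models above account for the 12 columns of the representation:

>>> stacking.transform(['buenos días']).shape
(1, 12)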
property decision_function_models

These models create the vector space by calling their decision function.

property transform_models

These models create the vector space by calling their transform method.
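
Both properties expose the model lists given at construction time, which is a quick way to inspect how the stacked representation is assembled. Continuing the transform example above (a sketch; only the list lengths are checked, and transform_models defaults to an empty list):

>>> len(stacking.decision_function_models)
3
>>> len(stacking.transform_models)
0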