Text Classifier Competitions

Text classification (TC) is a Natural Language Processing (NLP) task focused on identifying a text's label. A standard approach is to pose text classification as a supervised learning problem. In supervised learning, everything starts with a dataset composed of pairs of inputs and outputs; in this case, the inputs are texts, and the outputs are the associated labels. The aim is for the developed algorithm to automatically assign a label to any given text, independently of whether that text appeared in the original dataset. The feasible classes are only those found in the original dataset. In some circumstances, the method can also report the confidence of its prediction, so the user can decide whether to use or discard it.

Following a supervised learning approach requires the input to be in a representation amenable to the learning algorithm; usually, this is a vector. One of the most common methods to represent a text as a vector is the Bag of Words (BoW) model, which relies on a fixed vocabulary: each component of the vector corresponds to an element of the vocabulary, and a non-zero value indicates the presence of that element in the text, as illustrated below.
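To make the BoW encoding concrete, here is a minimal sketch using scikit-learn's CountVectorizer; this is an illustration only, since EvoMSA builds its BoW models on the INGEOTEC B4MSA/microtc stack rather than on scikit-learn:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus; each text becomes a vector over the fixed vocabulary.
corpus = ["good movie", "bad movie", "good plot bad acting"]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)  # sparse matrix, one row per text

print(vectorizer.get_feature_names_out())
# ['acting' 'bad' 'good' 'movie' 'plot']
print(X.toarray())
# [[0 0 1 1 0]
#  [0 1 0 1 0]
#  [1 1 1 0 1]]
```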

The text classifier's performance depends on the quality of the representation and on the classifier used. Deciding which representation and algorithm to use is daunting; in this contribution, we describe a set of classifiers that can be used, out of the box, on a new text classification problem. These classifiers are based on the BoW model. In addition, one of the methods, namely DenseBoW, represents the text in two stages. The first stage uses a set of BoW models and classifiers trained on self-supervised problems, where each task predicts the presence of a particular token. Consequently, the text is represented by a vector in which each component is associated with a token, and the value encodes the presence of that token. The BoW and DenseBoW models were combined using a stacked generalization approach, namely StackGeneralization; a usage sketch follows.
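The sketch below shows how these three classes are typically combined, following our reading of the EvoMSA 2.0 quickstart; the file dataset.json is a placeholder, and keyword arguments such as lang and decision_function_models are assumptions that should be verified against the library's documentation:

```python
from EvoMSA import BoW, DenseBoW, StackGeneralization
from microtc.utils import tweet_iterator

# Placeholder dataset: a JSON-lines file where each record has
# a 'text' field and a 'klass' (label) field.
D = list(tweet_iterator('dataset.json'))

# Base classifiers: a BoW model and the two-stage DenseBoW model.
bow = BoW(lang='es')
dense = DenseBoW(lang='es')

# StackGeneralization combines the base models' decision functions
# with a meta-classifier (stacked generalization).
stack = StackGeneralization(decision_function_models=[bow, dense]).fit(D)
stack.predict(['Buenos días'])
```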

The text classifiers presented here have been tested in many text classification competitions without modification. The aim is to offer a better understanding of how these algorithms perform in a new scenario, and of the performance gap with respect to an algorithm tailored to the problem at hand. We tested 13 different configurations on each task of each competition, and the configuration with the best performance was submitted to the contest. The best configuration was selected using either k-fold cross-validation or a validation set, depending on the information provided by the challenge; the selection step is sketched below.
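As a generic illustration of this selection procedure (the toy texts, labels, and the two candidate pipelines below are ours; the actual systems compare 13 EvoMSA configurations), the best candidate can be chosen by cross-validated macro-\(f_1\):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data standing in for a competition's training set.
X = ["great product", "awful service", "loved it", "terrible quality",
     "really good", "very bad", "excellent value", "worst purchase"]
y = ["pos", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]

# Two stand-in candidates; the actual systems evaluate 13 configurations.
candidates = {
    "tfidf+svm": make_pipeline(TfidfVectorizer(), LinearSVC()),
    "tfidf+nb": make_pipeline(TfidfVectorizer(), MultinomialNB()),
}

# Select the configuration with the best cross-validated macro-f1
# (cv=2 only because the toy dataset is tiny).
scores = {name: np.mean(cross_val_score(model, X, y, cv=2,
                                        scoring="f1_macro"))
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores[best])
```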

Results

Following an unconventional approach, the performance of EvoMSA 2.0 in different competitions is presented before the parameters used and the challenges are described. The table below lists, for each competition, the score of the winning system, the score of EvoMSA 2.0, and the relative difference between the two (the winner's score minus EvoMSA 2.0's, as a percentage of EvoMSA 2.0's score).

EvoMSA 2.0 Performance in different competitions.

| Competition | Edition | Score | Winner | EvoMSA 2.0 | Difference |
|---|---|---|---|---|---|
| HaSpeeDe3 (textual) | 2023 | macro-\(f_1\) | 0.9128 | 0.8845 (Conf.) | 3.2% |
| HaSpeeDe3 (XReligiousHate) | 2023 | macro-\(f_1\) | 0.6525 | 0.5522 (Conf.) | 18.2% |
| HODI | 2023 | macro-\(f_1\) | 0.81079 | 0.71527 (Conf.) | 13.4% |
| ACTI | 2023 | Accuracy | 0.85712 | 0.78207 (Conf.) | 9.6% |
| PoliticIT (Global) | 2023 | | 0.824057 | 0.762001 | 8.1% |
| PoliticIT (Gender) | 2023 | macro-\(f_1\) | 0.824287 | 0.732259 (Conf.) | 12.6% |
| PoliticIT (Ideology Binary) | 2023 | macro-\(f_1\) | 0.928223 | 0.848525 (Conf.) | 9.4% |
| PoliticIT (Ideology Multiclass) | 2023 | macro-\(f_1\) | 0.751477 | 0.705220 (Conf.) | 6.6% |
| PoliticEs (Global) | 2023 | | 0.811319 | 0.777584 | 4.3% |
| PoliticEs (Gender) | 2023 | macro-\(f_1\) | 0.829633 | 0.711549 (Conf.) | 16.6% |
| PoliticEs (Profession) | 2023 | macro-\(f_1\) | 0.860824 | 0.837945 (Conf.) | 2.7% |
| PoliticEs (Ideology Binary) | 2023 | macro-\(f_1\) | 0.896715 | 0.891394 (Conf.) | 0.6% |
| PoliticEs (Ideology Multiclass) | 2023 | macro-\(f_1\) | 0.691334 | 0.669448 (Conf.) | 3.3% |
| DA-VINCIS | 2023 | \(f_1\) | 0.9264 | 0.8903 (Conf.) | 4.1% |
| DA-VINCIS | 2022 | \(f_1\) | 0.7817 | 0.7510 (Conf.) | 4.1% |
| Rest-Mex (Global) | 2023 | see overview | 0.7790190145 | 0.7375714730 | 5.6% |
| Rest-Mex (Polarity) | 2023 | see overview | 0.621691991 | 0.554880778 (Conf.) | 12.0% |
| Rest-Mex (Type) | 2023 | see overview | 0.99032231 | 0.980539122 (Conf.) | 1.0% |
| Rest-Mex (Country) | 2023 | see overview | 0.942028113 | 0.927052594 (Conf.) | 1.6% |
| HOMO-MEX | 2023 | macro-\(f_1\) | 0.8847 | 0.8050 (Conf.) | 9.9% |
| HOPE (ES) | 2023 | macro-\(f_1\) | 0.9161 | 0.5214 (Conf.) | 75.7% |
| HOPE (EN) | 2023 | macro-\(f_1\) | 0.5012 | 0.4651 (Conf.) | 7.8% |
| DIPROMATS (ES) | 2023 | \(f_1\) | 0.8089 | 0.7485 (Conf.) | 8.1% |
| DIPROMATS (EN) | 2023 | \(f_1\) | 0.8090 | 0.7255 (Conf.) | 11.5% |
| HUHU | 2023 | \(f_1\) | 0.820 | 0.775 (Conf.) | 5.8% |
| TASS | 2017 | macro-\(f_1\) | 0.577 | 0.525 (Conf.) | 9.9% |
| EDOS (A) | 2023 | macro-\(f_1\) | 0.8746 | 0.7890 (Conf.) | 10.8% |
| EDOS (B) | 2023 | macro-\(f_1\) | 0.7326 | 0.5413 (Conf.) | 35.3% |
| EDOS (C) | 2023 | macro-\(f_1\) | 0.5606 | 0.3388 (Conf.) | 65.5% |

Competitions