Dr. Daniela Moctezuma

Since September 2014, Daniela has worked at CONACYT as a researcher assigned to CentroGEO. She is part of the INGEOTEC research group.

Daniela's principal research interests are Computer Vision, Pattern Recognition, Intelligent Video Surveillance, and Text Classification.

INGEOTEC's research interest is text categorization viewed as a supervised learning problem, that is, as a classification task. For this problem, we have developed two text modeling techniques that represent text in a vector space model and use a Support Vector Machine as the classifier: B4MSA, a sentiment analysis classifier, and microTC, a general text classifier. In addition, we have been working on novel classifiers based on Genetic Programming (EvoDAG).
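The idea of representing text in a vector space and classifying it with a linear Support Vector Machine can be illustrated with a small, pure-Python sketch. This is not the B4MSA or microTC implementation: the whitespace tokenizer, the TF-IDF weighting, the subgradient-descent training loop, the toy data, and all function names below are assumptions chosen only to keep the example self-contained.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def fit_idf(docs):
    """Compute smoothed inverse document frequencies from the training texts."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def vectorize(text, idf):
    """Map a text to a sparse, L2-normalized TF-IDF vector (term -> weight)."""
    tf = Counter(t for t in tokenize(text) if t in idf)  # unknown terms are dropped
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm > 0 else {}


def score(w, x):
    """Dot product between the weight vector and a sparse document vector."""
    return sum(w.get(t, 0.0) * v for t, v in x.items())


def train_linear_svm(vectors, labels, lam=0.01, eta=0.1, epochs=20):
    """Linear SVM without bias, trained by subgradient descent on the
    L2-regularized hinge loss. Labels must be +1 or -1."""
    w = {}
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            margin = y * score(w, x)
            # Regularization step: shrink every weight slightly.
            for t in w:
                w[t] *= 1.0 - eta * lam
            # Hinge-loss step: only examples inside the margin push the weights.
            if margin < 1.0:
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + eta * y * v
    return w


# Toy training data (hypothetical): +1 = positive, -1 = negative.
train_texts = [
    "good great happy",
    "love this great movie",
    "bad awful sad",
    "hate this awful movie",
]
train_labels = [1, 1, -1, -1]

idf = fit_idf(train_texts)
weights = train_linear_svm([vectorize(t, idf) for t in train_texts], train_labels)


def predict(text):
    """Return +1 or -1; a text with no known terms scores 0 and maps to +1."""
    return 1 if score(weights, vectorize(text, idf)) >= 0 else -1


print(predict("great happy"), predict("awful sad"))
```

In practice a library implementation of the vectorizer and the SVM would replace the hand-written pieces; the sketch only shows how the two stages fit together.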

We have participated in several sentiment analysis competitions, including:

  • SemEval'17 (English and Arabic). INGEOTEC obtained 6th place (39 participants) in English (see Results) and 4th place (9 participants) in Arabic (see Results).
  • SENTIPOLC'16 (Italian). INGEOTEC obtained 5th place (15 participants) in subjectivity classification and 9th place (15 participants) in polarity classification (see Proceedings).
  • TASS'16 (Spanish). INGEOTEC obtained 3rd place in both the 3 and 5 polarity-level tasks (see Proceedings).
  • TASS'15 (Spanish). This was our first competition; INGEOTEC obtained 12th place (17 participants) in 5 polarity levels and 10th place (17 participants) in 3 polarity levels (see Proceedings).

Current Students

  • M.C. Abel Coronado (PhD student working on remote sensing problems)

Past Students

  • César Pérez Fernández
    Thesis: Reconocimiento facial con Kinect (Facial recognition with Kinect). Degree: Technical Engineering in Computer Systems, Universidad Rey Juan Carlos, Spain, 2013.

  • Fernando Flores García
    Thesis: Influencia de la aplicación de filtros de Gabor para la detección de personas (Influence of applying Gabor filters on person detection). Degree: Computer Engineering, Universidad Rey Juan Carlos, Spain, 2013.

Software

A Baseline for Multilingual Sentiment Analysis (B4MSA)

B4MSA is a Python sentiment analysis classifier for Twitter-like short texts. It can be used to build a first approximation of a sentiment classifier for any given language. It is almost language-independent, but it can also take advantage of the particularities of a language.

It is written in Python and makes use of NLTK, scikit-learn, and gensim to create simple but effective sentiment classifiers.
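As an illustration of what mostly language-independent processing of Twitter-like short texts can look like, the following sketch normalizes a message and extracts character n-grams. It does not reproduce B4MSA's API or internals; the placeholder tokens `_url` and `_usr`, the function names, and the choice of character trigrams are assumptions made for this example.

```python
import re

URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"@\w+")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # three or more repeats of one character


def normalize(text):
    """Apply simple normalizations that do not depend on the language."""
    text = text.lower()
    text = URL_RE.sub("_url", text)      # replace links with a placeholder
    text = USER_RE.sub("_usr", text)     # replace user mentions with a placeholder
    text = REPEAT_RE.sub(r"\1\1", text)  # squeeze long runs to two characters
    return " ".join(text.split())        # collapse repeated whitespace


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


message = "@Ana I looooove this!!! http://t.co/abc"
clean = normalize(message)
print(clean)
print(char_ngrams(clean)[:5])
```

Because character n-grams need no language-specific tokenizer, features like these can serve as a starting point for any language, while language-aware steps (for example, stemming or stopword removal) can be layered on top where resources exist.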

microTC

microTC follows a minimalistic approach to text classification. It is designed to tackle text-classification problems in an agnostic way, being both domain and language independent.
Currently, we only produce single-label classifiers, but support for multi-label problems is on the roadmap.

microTC is intentionally simple, so only a small number of features were implemented. However, it uses some complex tools from gensim, numpy, and scikit-learn.

CentroGEO

117 Circuito Tecnopolo Norte Col. Tecnopolo Pocitos II, C.P. 20313, Aguascalientes, Ags, México. Tel. +52 (449) 994 51 50 Ext. 5230

email: dmoctezuma at centrogeo.edu.mx