Feature weighting approaches in sentiment analysis of short text

Author(s)
Kummer, Olena
Editor(s)
Savoy, Jacques  
Institut d'informatique  
Date issued
2012
Subjects
Sentiment Analysis; Opinion Detection; Natural Language Processing; Machine Learning; Data Mining; Feature Selection; Text Classification
Abstract
In this thesis, we propose a supervised classification scheme based on statistical scores computed for textual features. More specifically, we consider binary classification (opinionated or factual, positive or negative) of short texts in the domains of movie reviews and newspaper articles. We analyze the performance of the proposed models on corpora whose training categories are of unequal size.

Based on our participation in different evaluation campaigns, we analyze the advantages and disadvantages of classification schemes that use Z scores to classify a sentence into more than two categories, e.g. positive, negative, neutral and factual. As a new feature weighting measure, we present an adaptation of the Kullback-Leibler divergence, called the KL score. After comparing the performance of different weighting measures on training corpora of unequal size, we chose the two best-performing scores, the Z score and the KL score. We therefore propose a new classification model based on computing the normalized Z score and KL score of each feature for each classification category. One advantage of this model is its flexibility to incorporate external scores, for example from sentiment dictionaries.

Experiments on Chinese and Japanese datasets show that the proposed scheme, without any language-specific techniques, performs at a level comparable to that obtained on the English datasets. The advantage of the approaches analyzed in this thesis is that they can serve as quick and easily interpretable baselines for short text classification.
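To make the idea of per-category feature scores concrete, here is a minimal sketch of a Z-score-based classifier. The abstract gives no formulas, so everything below is an assumption rather than the thesis's actual definition: the score is a binomial-style standardization of a term's count in a category against its corpus-wide relative frequency, classification sums the scores of the terms in a text, and the names `z_scores` and `classify` are hypothetical. The normalization step and the KL score mentioned in the abstract are not shown.

```python
import math
from collections import Counter


def z_scores(docs_by_category):
    """Return {category: {term: z}} for tokenized training documents.

    Assumed formula (not taken from the thesis):
        z = (tf_c - n_c * p) / sqrt(n_c * p * (1 - p))
    where tf_c is the term's count in category c, n_c is the number of
    tokens in c, and p is the term's relative frequency in the whole corpus.
    """
    # Term counts per category.
    counts = {
        cat: Counter(tok for doc in docs for tok in doc)
        for cat, docs in docs_by_category.items()
    }
    # Term counts over the whole corpus.
    total = Counter()
    for cnt in counts.values():
        total.update(cnt)
    n_all = sum(total.values())

    scores = {}
    for cat, cnt in counts.items():
        n_c = sum(cnt.values())
        scores[cat] = {}
        for term, tf_all in total.items():
            p = tf_all / n_all
            var = n_c * p * (1 - p)
            # Guard against a zero variance (e.g. a one-term vocabulary).
            scores[cat][term] = (
                (cnt[term] - n_c * p) / math.sqrt(var) if var > 0 else 0.0
            )
    return scores


def classify(tokens, scores):
    """Pick the category with the highest sum of term scores.

    Terms unseen in training contribute 0 to every category.
    """
    sums = {
        cat: sum(term_scores.get(tok, 0.0) for tok in tokens)
        for cat, term_scores in scores.items()
    }
    return max(sums, key=sums.get)


# Toy training data, invented purely for illustration.
train = {
    "pos": [["good", "great", "film"], ["great", "fun"]],
    "neg": [["bad", "boring", "film"], ["bad", "plot"]],
}
scores = z_scores(train)
print(classify(["great", "fun", "film"], scores))
print(classify(["bad", "plot"], scores))
```

Because each term carries an explicit signed score per category, a prediction can be traced back to the terms that drove it, which is the kind of interpretability the abstract attributes to these baselines. External scores, such as dictionary weights, could in principle be added to the per-term sums in the same way.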
Notes
Doctoral thesis: Université de Neuchâtel, 2012
Publication type
doctoral thesis
Identifiers
https://libra.unine.ch/handle/20.500.14713/30143
DOI
10.35662/unine-thesis-2292
File(s)
Name: 00002292.pdf
Type: Main Article
Size: 1 MB
Format: Adobe PDF
