Feature weighting approaches in sentiment analysis of short text

Author(s)
Kummer, Olena
Editor(s)
Savoy, Jacques 
Institut d'informatique 
Keywords
  • Sentiment Analysis
  • Opinion Detection
  • Natural Language Proc...
  • Machine Learning
  • Data Mining
  • Feature Selection
  • Text Classification

Abstract
In this thesis, we propose a supervised classification scheme based on statistical scores computed for textual features. More specifically, we consider binary classification (opinionated or factual, positive or negative) of short texts in the domains of movie reviews and newspaper articles. We analyze the performance of the proposed models on corpora whose training categories are of unequal size.

Based on our participation in several evaluation campaigns, we analyze the advantages and disadvantages of classification schemes that use Z scores to classify a sentence into more than two categories, e.g. positive, negative, neutral and factual. As a new feature weighting measure, we give an adaptation of the Kullback-Leibler divergence, called the KL score. After comparing different weighting measures on training corpora with categories of unequal size, we chose the two best-performing scores, the Z score and the KL score. We then propose a new classification model based on the normalized Z score and the KL score computed for the features of each classification category. One advantage of this model is its flexibility to incorporate external scores, for example from sentiment dictionaries.

Experiments on Chinese and Japanese datasets show that the proposed scheme performs at a level comparable to the results obtained on the English datasets, without using any language-specific techniques. The advantage of the approaches analyzed in this thesis is that they can serve as quick and easily interpretable baselines for short text classification.
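The abstract does not give the formulas, so the following Python sketch is only an illustration of how per-category feature scores of this kind could drive a classifier. It assumes a common binomial form of the Z score (observed term count in a category against the count expected from the whole corpus) and a smoothed per-term Kullback-Leibler contribution; neither is confirmed to be the thesis's exact definition, the normalization step is omitted, and the toy sentences, function names and the `alpha` smoothing parameter are all hypothetical.

```python
import math
from collections import Counter

def z_scores(cat_counts, corpus_counts):
    """Assumed Z score: (observed - expected) / std. dev. under a binomial model."""
    n_cat = sum(cat_counts.values())
    n_all = sum(corpus_counts.values())
    scores = {}
    for term, total in corpus_counts.items():
        p = total / n_all                      # term probability in the whole corpus
        sd = math.sqrt(n_cat * p * (1 - p))
        scores[term] = (cat_counts.get(term, 0) - n_cat * p) / sd if sd > 0 else 0.0
    return scores

def kl_scores(cat_counts, other_counts, vocab, alpha=1.0):
    """Assumed KL score: per-term contribution p * log(p / q), add-alpha smoothed."""
    n_cat = sum(cat_counts.values()) + alpha * len(vocab)
    n_oth = sum(other_counts.values()) + alpha * len(vocab)
    scores = {}
    for term in vocab:
        p = (cat_counts.get(term, 0) + alpha) / n_cat    # P(term | category)
        q = (other_counts.get(term, 0) + alpha) / n_oth  # P(term | other categories)
        scores[term] = p * math.log(p / q)
    return scores

def train(docs_by_cat):
    """Build {category: {term: score}} tables for both measures."""
    counts = {c: Counter(t for d in docs for t in d.lower().split())
              for c, docs in docs_by_cat.items()}
    corpus = sum(counts.values(), Counter())
    z = {c: z_scores(counts[c], corpus) for c in counts}
    kl = {c: kl_scores(counts[c], corpus - counts[c], set(corpus)) for c in counts}
    return z, kl

def classify(tokens, table):
    """Pick the category whose summed feature scores are highest."""
    return max(table, key=lambda c: sum(table[c].get(t, 0.0) for t in tokens))

# Hypothetical toy data, not taken from the thesis.
toy = {
    "pos": ["great film", "great acting", "good film"],
    "neg": ["bad film", "boring plot", "bad acting"],
}
z_table, kl_table = train(toy)
print(classify("great film".split(), z_table))   # uses Z scores
print(classify("bad plot".split(), kl_table))    # uses KL scores
```

Because each category keeps its own score table, scores from an external resource such as a sentiment dictionary could in principle be added to a table before classification, and every decision can be traced back to the individual term scores.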
Notes
Doctoral thesis: Université de Neuchâtel, 2012
URI
https://libra.unine.ch/handle/123456789/9650
Publication type
Resource Types::text::thesis::doctoral thesis
File(s) to download
 main article: 00002292.pdf (1 MB)