Searching strategies for the Hungarian language

Author(s)
Savoy, Jacques 
Institut d'informatique 
Publication date
2008
In
Information Processing & Management
Vol.
44
No.
1
From page
310
To page
324
Keywords
  • Hungarian information...
  • Hungarian language
  • CLEF
  • evaluation
  • decompounding
  • n-gram indexing
  • TEXT RETRIEVAL
  • PROBABILISTIC MODELS
  • INFORMATION
  • CLEF-2003
  • ALGORITHM
Abstract
This paper reports on the underlying IR problems encountered when dealing with the complex morphology and compound constructions found in the Hungarian language. It describes evaluations carried out on two general stemming strategies for this language, and also demonstrates that a light stemming approach could be quite effective. Based on searches done on the CLEF test collection, we find that a more aggressive suffix-stripping approach may produce better mean average precision (MAP). When compared to an IR scheme without stemming or one based on only a light stemmer, we find the differences to be statistically significant. Comparing probabilistic, vector-space and language models, we find that the Okapi model results in the best retrieval effectiveness. The resulting MAP is found to be about 35% better than the classical tf idf approach, particularly for very short requests. Finally, we demonstrate that applying an automatic decompounding procedure for both queries and documents significantly improves IR performance (+10%), compared to word-based indexing strategies. (c) 2007 Elsevier Ltd. All rights reserved.
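As a reading aid for the MAP figures quoted in the abstract, the following is a minimal sketch of how mean average precision is commonly computed over a set of queries. The function names, document identifiers and toy rankings are illustrative assumptions; they are not taken from the paper or from the CLEF evaluation tools.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against the relevant doc ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision measured at the rank of each relevant document.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over all judged queries."""
    if not qrels:
        return 0.0
    total = sum(
        average_precision(runs.get(query_id, []), relevant)
        for query_id, relevant in qrels.items()
    )
    return total / len(qrels)


# Hypothetical toy data: two queries with made-up document ids.
toy_runs = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5"]}
toy_qrels = {"q1": {"d1", "d3"}, "q2": {"d5"}}
toy_map = mean_average_precision(toy_runs, toy_qrels)
```

Under this definition, a relative gain such as the +10% reported for decompounding would correspond to the ratio of two such MAP values computed on the same queries and relevance judgments.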
URI
https://libra.unine.ch/handle/123456789/6467
Publication type
Resource Types::text::journal::journal article