Indexing and stemming approaches for the Czech language

Author(s)
Dolamic, Ljiljana
Savoy, Jacques  
Institut d'informatique  
Date issued
2009
In
Information Processing and Management
Vol
45
No
6
From page
714
To page
720
Subjects
Czech language; Stemming; Evaluation; Slavic languages
Abstract
This paper describes and evaluates various stemming and indexing strategies for the Czech language. Based on a Czech test collection, we designed and evaluated two stemming approaches, a light one and a more aggressive one. We compared them with a no-stemming scheme as well as a language-independent approach (n-gram). To evaluate the suggested solutions we used various IR models, including Okapi, Divergence from Randomness (DFR), a statistical language model (LM), and the classical tf idf vector-space approach. We found that the Divergence from Randomness paradigm tends to provide better retrieval effectiveness than the Okapi, LM or tf idf models; the performance differences, however, were statistically significant only with the last two IR approaches. Ignoring stemming generally reduces the MAP by more than 40%, and these differences are always significant. Finally, although our more aggressive stemmer tends to show the best performance, the differences in performance with the light stemmer are not statistically significant.
Publication type
journal article
Identifiers
https://libra.unine.ch/handle/20.500.14713/65500
DOI
10.1016/j.ipm.2009.06.001
File(s)
Name
Dolamic_Ljiljana-Indexing_and_stemming_approaches_for_the_czech_language-20130108.pdf
Type
Main Article
Size
623.32 KB
Format
Adobe PDF
