Indexing and searching strategies for the Russian language

Author(s)
Dolamic, Ljiljana
Savoy, Jacques  
Institut d'informatique  
Date issued
2009
In
Journal of the American Society for Information Science and Technology, Wiley, 2009, vol. 60, no. 12, pp. 2540-2547
Abstract
This paper describes and evaluates various stemming and indexing strategies for the Russian language. We design and evaluate two stemming approaches, a light and a more aggressive one, and compare these stemmers to the Snowball stemmer, to no stemming, and also to a language-independent approach (n-gram). To evaluate the suggested stemming strategies, we apply various probabilistic information retrieval (IR) models, including the Okapi, the Divergence from Randomness (DFR), and a statistical language model (LM), as well as two vector-space approaches, namely, the classical tf idf scheme and the dtu-dtn model. We find that the vector-space dtu-dtn and the DFR models tend to result in better retrieval effectiveness than the Okapi, LM, or tf idf models, while only the latter two IR approaches result in statistically significant performance differences. Ignoring stemming generally reduces the mean average precision (MAP) by more than 50%, and these differences are always significant. When applying an n-gram approach, performance differences are usually lower than with an approach involving stemming. Finally, our light stemmer tends to perform best, although performance differences between the light, aggressive, and Snowball stemmers are not statistically significant.
Publication type
journal article
Identifiers
https://libra.unine.ch/handle/20.500.14713/60165
DOI
10.1002/asi.21191
File(s)
Name
Dolamic_Ljiljana_-_Indexing_and_Searching_Strategies_for_the_Russian_20091209.pdf
Type
Main Article
Size
335.44 KB
Format
Adobe PDF
