Information retrieval with Hindi, Bengali, and Marathi languages: evaluation and analysis

Author(s)
Savoy, Jacques  
Institut d'informatique  
Akasereh, Mitra
Dolamic, Ljiljana
Date issued
2013
In
Multilingual Information Access in South Asian Languages, Springer
From page
334
To page
352
Subjects
Hindi, Bengali, and Marathi information retrieval; retrieval effectiveness with Indian languages; FIRE evaluation campaign; automatic indexing
Abstract
Our first objective in participating in the FIRE evaluation campaigns is to analyze the retrieval effectiveness of various indexing and search strategies when dealing with corpora written in Hindi, Bengali, and Marathi. As a second goal, we developed new and more aggressive stemming strategies for both Marathi and Hindi during this second campaign. We compared their retrieval effectiveness with both a light stemming strategy and a language-independent n-gram approach. As another language-independent indexing strategy, we evaluated the trunc-n method, in which the indexing term is formed by considering only the first n letters of each word. To evaluate these solutions, we used various IR models, including models derived from the Divergence from Randomness (DFR) and Language Model (LM) paradigms, as well as Okapi and the classical tf idf vector-processing approach.

For the three languages studied, our experiments tend to show that IR models derived from the DFR paradigm produce the best overall results. For these languages, our experiments also demonstrate that either an aggressive stemming procedure or the trunc-n indexing approach produces better retrieval effectiveness than other word-based or language-independent n-gram approaches. Applying the Z-score as a data fusion operator after blind query expansion also tends to improve the MAP of the merged run over the best single IR system.
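The two language-independent indexing strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the default n = 4, the whitespace tokenization, the choice to build n-grams within single words only, and the handling of words shorter than n are all assumptions made for illustration. Note also that Python slices by Unicode code point, so for Devanagari or Bengali script a "letter" in this sketch is a code point, which may not match the unit used in the paper.

```python
def trunc_n(word: str, n: int = 4) -> str:
    """trunc-n: keep only the first n characters of a word."""
    return word[:n]


def char_ngrams(word: str, n: int = 4) -> list[str]:
    """Overlapping character n-grams of one word.

    Assumption: a word no longer than n is kept whole as a single term.
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def index_terms(text: str, n: int = 4, mode: str = "trunc") -> list[str]:
    """Turn a text into indexing terms with trunc-n or character n-grams.

    Assumption: words are separated by whitespace; no stopword removal
    or normalization is applied here.
    """
    words = text.split()
    if mode == "trunc":
        return [trunc_n(w, n) for w in words]
    return [gram for w in words for gram in char_ngrams(w, n)]
```

With these definitions, `index_terms("information retrieval", 4)` yields one truncated term per word, while `mode="ngram"` yields several overlapping terms per word, which is why n-gram indexes tend to be larger than trunc-n indexes built from the same text.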
Publication type
journal article
Identifiers
https://libra.unine.ch/handle/20.500.14713/65722
DOI
10.1007/978-3-642-40087-2_30
File(s)
Name
Information_retrieval-Savoy_J.-20160121.pdf
Type
Main Article
Size
8.02 MB
Format
Adobe PDF
