Project Title
Multilingual and Contextual Information Retrieval
Internal ID
32695
Principal Investigator
Savoy, Jacques  
Status
Completed
Start Date
January 1, 2007
End Date
March 31, 2010
Organisations
Institut d'informatique  
Identifiers
https://libra.unine.ch/handle/20.500.14713/2077
-
https://libra.unine.ch/handle/123456789/1937
Keywords
Information retrieval (IR); multilingual IR (MLIR); contextual retrieval; cross-lingual IR (CLIR); web search; dedicated IR; digital library
Description
This research proposal has three main objectives. First, we want to design, implement and evaluate information retrieval (IR) systems for various East European languages (non-English monolingual IR). More specifically, we design and evaluate linguistic tools for new and less frequently spoken languages, such as Hungarian, Polish, Czech and Turkish. We also translate short queries from one language into another (most likely English, the lingua franca) before accessing information written in the various other languages.

Second, we undertake a more elaborate investigation of contextual IR systems used to retrieve information in a specific domain (e.g., biomedicine, law, enterprise, weblogs), instead of evaluating IR systems using newspaper test collections. In this part of the project we investigate the most appropriate response to user information needs, which vary from “classical” document searches to new requests such as known-item searches (“where is the last e-mail sent to Paul?”), the pros and cons of a given argument, searches for an expert in a given domain based on e-mails or other enterprise intranet document repositories, etc. Specific user requirements could also be considered by identifying document length (varying from a short bibliographic notice to a large novel), the level of information needed (whole document, paragraph, single sentence or short summary), and the degree of editorial control (from newspaper articles to e-mails or weblogs). In this second part we also investigate and evaluate the impact of orthographic and vocabulary variations as well as the influence of extra-document information (e.g., document contexts, temporal information, links between documents within web or legal corpora).

Third, we integrate the above two research objectives into a common task in order to perform searches in a multilingual collection, starting with relatively well-edited web pages (e.g., information made available by European governments in the EuroGOV corpus) and extending to less structured and less “polished” web pages (e.g., weblogs written in at least three different languages) or enterprise e-mails.