Features Combination for Extracting Gene Functions from MEDLINE
Author(s)
Date issued
2005
In
Lecture Notes in Computer Science (LNCS), Springer, 2005, vol. 3408, pp. 112-126
Abstract
This paper describes and evaluates a summarization system that extracts textual descriptions of gene function (called GeneRIFs) from MEDLINE records. The inputs for this task are a locus (a gene in the LocusLink database) and a pointer to the MEDLINE record supporting the GeneRIF. The proposed approach merges two independent phrase extraction strategies. The first strategy (LASt) uses argumentative, positional and structural features to suggest a GeneRIF. The second strategy (LogReg) uses statistical properties to select the most appropriate sentence as the GeneRIF. On the TREC-2003 genomics collection, each strategy is already competitive on its own (52.78% for LASt and 52.28% for LogReg). Combining the two clearly improves extraction, achieving a Dice score of over 55%.
Publication type
journal article
File(s)
Name
Ruch_Patrick_-_Features_Combination_for_Extracting_Gene_Functions_20100209.pdf
Type
Main Article
Size
544.93 KB
Format
Adobe PDF
