Files in This Item:
File: bioinformatics_btv237.pdf
Size: 334.71 kB
Format: Adobe PDF
Title: MeSHLabeler: Improving the accuracy of large-scale MeSH indexing by integrating diverse evidence
Authors: Liu, Ke
Peng, Shengwen
Wu, Junqiu
Zhai, Chengxiang
Mamitsuka, Hiroshi
Zhu, Shanfeng
Author's alias: 馬見塚, 拓
Issue Date: 10-Jun-2015
Publisher: Oxford University Press
Journal title: Bioinformatics
Volume: 31
Issue: 12
Start page: i339
End page: i347
Abstract: Motivation: Medical Subject Headings (MeSHs) are used by the National Library of Medicine (NLM) to index almost all citations in MEDLINE, which greatly facilitates applications of biomedical information retrieval and text mining. To reduce the time and financial cost of manual annotation, NLM has developed a software package, Medical Text Indexer (MTI), for assisting MeSH annotation, which uses k-nearest neighbors (KNN), pattern matching and indexing rules. Other types of information, such as predictions by MeSH classifiers (trained separately), can also be used for automatic MeSH annotation. However, existing methods cannot effectively integrate multiple types of evidence for MeSH annotation. Methods: We propose a novel framework, MeSHLabeler, to integrate multiple types of evidence for accurate MeSH annotation by using 'learning to rank'. Evidence includes numerous predictions from MeSH classifiers, KNN, pattern matching, MTI and the correlation between different MeSH terms. Each MeSH classifier is trained independently, and thus prediction scores from different classifiers are not directly comparable. To address this issue, we have developed an effective score normalization procedure to improve the prediction accuracy. Results: MeSHLabeler won first place in Task 2A of the 2014 BioASQ challenge, achieving a Micro F-measure of 0.6248 for 9,040 citations provided by the BioASQ challenge. Note that this is around 9.15% higher, in relative terms, than the 0.5724 obtained by MTI.
Rights: © The Author 2015. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact
DOI(Published Version): 10.1093/bioinformatics/btv237
PubMed ID: 26072501
Appears in Collections: Journal Articles

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.