Please use this identifier to cite or link to this item: http://hdl.handle.net/10867/45

Title: Impact of Ngrams-based indexing on XML retrieval
Authors: Ben Aouicha, Mohamed
Tmar, Mohamed
Boughanem, Mohand
Keywords: Ngrams
XML retrieval
Issue Date: 2009
Publisher: S3T 2009
Abstract: In this paper we present a statistical approach to term clustering. The approach is based on a statistical analysis of the Ngrams shared by a pair of terms and is inspired by the tf × idf criterion commonly used in information retrieval. Being statistical, it is completely independent of the lexical and grammatical characteristics of the language in which the documents to be indexed are written. Classical indexing is often based on stemming, which reduces a term to its stem (radical). Stemming opens up many possibilities for customized information access. We argue that the same can be achieved by building term clusters and performing information retrieval on the basis of these clusters. We apply the approach to XML retrieval and report experiments on a dataset provided by INEX that show its impact compared with the Porter stemming method.
URI: http://hdl.handle.net/10867/45
ISBN: 978-954-9526-62-2
Appears in Collections: FP7-S3T2009
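
The abstract describes clustering terms by the Ngrams they share, with a weighting inspired by tf × idf. The record does not give the paper's actual formulas, so the following Python sketch is only an illustration of the general idea under stated assumptions: character trigrams, an idf-style weight per Ngram computed over the vocabulary, a weighted-overlap similarity, and a greedy single-pass clustering. All function names, the similarity measure, and the threshold value are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def char_ngrams(term, n=3):
    """Return the set of character n-grams of a term (no padding)."""
    term = term.lower()
    if len(term) < n:
        return {term}
    return {term[i:i + n] for i in range(len(term) - n + 1)}


def ngram_idf(vocabulary, n=3):
    """Assumed idf-style weight: log(N / number of terms containing the n-gram)."""
    counts = Counter()
    for term in vocabulary:
        counts.update(char_ngrams(term, n))
    total = len(vocabulary)
    return {gram: math.log(total / c) for gram, c in counts.items()}


def similarity(a, b, idf, n=3):
    """Weighted overlap: idf mass of shared n-grams over idf mass of all n-grams."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    union = sum(idf.get(g, 0.0) for g in grams_a | grams_b)
    if union == 0.0:
        return 0.0
    shared = sum(idf.get(g, 0.0) for g in grams_a & grams_b)
    return shared / union


def cluster_terms(vocabulary, threshold=0.25, n=3):
    """Greedy single pass: a term joins the first cluster whose seed is similar enough."""
    idf = ngram_idf(vocabulary, n)
    clusters = []
    for term in vocabulary:
        for cluster in clusters:
            if similarity(term, cluster[0], idf, n) >= threshold:
                cluster.append(term)
                break
        else:
            # No existing cluster was close enough: the term seeds a new one.
            clusters.append([term])
    return clusters
```

Calling cluster_terms on a small vocabulary such as ["retrieval", "retrieve", "retrieving", "index", "indexing", "indexed"] groups the inflected forms without any language-specific rules, which is the role a stemmer such as Porter's would otherwise play. The threshold is a tuning parameter chosen here for illustration only.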

Files in This Item:

File: S3T2009_08_MBenAouicha_MTmar_MBoughanem.pdf
Size: 324.46 kB
Format: Adobe PDF
