Exploiting Index Pruning Methods for Clustering XML Collections

Altingovde İ. S., Atilgan D., Ulusoy O.

8th International Workshop of the Initiative for the Evaluation of XML Retrieval, Brisbane, Australia, 7-9 December 2009, vol. 6203, pp. 379-386, (Full Text Paper)

Publication Type: Conference Paper / Full Text Paper
Volume: 6203
DOI: 10.1007/978-3-642-14556-8_37
City of Publication: Brisbane
Country of Publication: Australia
Pages: pp. 379-386
Open Archive Collection: AVESİS Open Access Collection
Affiliated with Middle East Technical University: No

Abstract

In this paper, we first employ the well-known Cover-Coefficient Based Clustering Methodology (C3M) for clustering XML documents. Next, we apply index pruning techniques from the literature to reduce the size of the document vectors. Our experiments show that, in certain cases, it is possible to prune up to 70% of the collection (or, more specifically, the underlying document vectors) and still generate a clustering structure that yields the same quality as that of the original collection in terms of a set of evaluation metrics.
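
As a rough illustration of what pruning a document vector can mean, the sketch below drops the lowest-weighted terms of a single vector. The abstract does not say which pruning techniques were applied, so the keep-the-heaviest-terms rule, the `prune_vector` name, the dict representation, and the sample weights are all assumptions made for this example, not the method used in the paper.

```python
def prune_vector(vector, prune_percent):
    """Drop the lowest-weighted terms of one document vector.

    vector: dict mapping term -> weight (hypothetical representation).
    prune_percent: integer percentage of terms to drop, from 0 to 100.
    """
    if not 0 <= prune_percent <= 100:
        raise ValueError("prune_percent must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    drop = (len(vector) * prune_percent) // 100
    keep = len(vector) - drop
    # Rank by descending weight; break ties alphabetically for determinism.
    ranked = sorted(vector.items(), key=lambda item: (-item[1], item[0]))
    return dict(ranked[:keep])


# Made-up weights for a single document, for illustration only.
doc = {
    "xml": 0.9, "cluster": 0.7, "index": 0.4, "the": 0.1, "of": 0.05,
    "and": 0.05, "a": 0.04, "in": 0.03, "to": 0.02, "is": 0.01,
}
pruned = prune_vector(doc, 70)
```

With these sample weights, pruning 70% of a ten-term vector leaves the three heaviest terms, and a clustering algorithm would then run on the shortened vectors instead of the full ones.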