A hierarchical representation of form documents for identification and retrieval


Duygulu P., Atalay V.

International Journal on Document Analysis and Recognition, vol.5, no.1, pp.17-27, 2003 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 5 Issue: 1
  • Publication Date: 2003
  • DOI: 10.1007/s100320100077
  • Journal Name: International Journal on Document Analysis and Recognition
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC
  • Pages: pp.17-27
  • Affiliated with Middle East Technical University: Yes

Abstract

In this paper, we present a logical representation of form documents for use in identification and retrieval. A hierarchical structure is proposed to represent the structure of a form by using lines and the XY-tree approach. The approach is top-down, and no domain knowledge such as the preprinted or filled-in data is used. The representation handles geometrical modifications and slight variations. Logically identical forms are associated with the same or similar hierarchical structures. Identification and retrieval of similar forms are performed by computing the edit distances between the generated trees. © 2002 Springer-Verlag Berlin Heidelberg.
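
To make the idea of comparing forms by tree edit distance concrete, the following is a minimal sketch and not the authors' implementation. It assumes each form has already been reduced to an ordered tree whose nodes carry hypothetical labels ("H" for a region split by horizontal lines, "V" for a region split by vertical lines, "C" for an undivided cell). As a stand-in for whatever edit distance the paper actually uses, it applies a simple top-down (Selkow-style) ordered tree edit distance, in which a relabel costs 1 and inserting or deleting a subtree costs its number of nodes. The names `tree_distance`, `retrieve`, and the sample forms are illustrative assumptions.

```python
from functools import lru_cache

# A node is (label, children), where children is a tuple of nodes.
# Labels are hypothetical: "H" / "V" = region split by horizontal /
# vertical lines, "C" = undivided cell (leaf).


def size(node):
    """Number of nodes in the subtree rooted at `node`."""
    _, children = node
    return 1 + sum(size(c) for c in children)


@lru_cache(maxsize=None)
def tree_distance(a, b):
    """Top-down ordered tree edit distance (assumed cost model).

    Relabeling a node costs 1; inserting or deleting a whole subtree
    costs its size. Children are aligned with a sequence edit distance.
    """
    (label_a, kids_a), (label_b, kids_b) = a, b
    m, n = len(kids_a), len(kids_b)
    # d[i][j] = cost of aligning the first i children of a with the first j of b
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(kids_a[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(kids_b[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(kids_a[i - 1]),   # delete subtree from a
                d[i][j - 1] + size(kids_b[j - 1]),   # insert subtree from b
                d[i - 1][j - 1] + tree_distance(kids_a[i - 1], kids_b[j - 1]),
            )
    return (0 if label_a == label_b else 1) + d[m][n]


def retrieve(query, database):
    """Return form names from `database` ordered by distance to `query`."""
    return sorted(database, key=lambda name: tree_distance(query, database[name]))


# Illustrative trees (not taken from the paper).
cell = ("C", ())
form_a = ("H", (cell, ("V", (cell, cell)), cell))
form_b = ("H", (cell, ("V", (cell, cell, cell)), cell))  # one extra column
form_c = ("V", (cell, cell))                             # different layout

database = {"form_b": form_b, "form_c": form_c}
ranking = retrieve(form_a, database)
```

Under this cost model, `form_b` differs from `form_a` by a single inserted cell, so it ranks ahead of `form_c`, whose root split and overall layout differ. Identification would then amount to accepting the nearest stored tree when its distance falls below some chosen threshold, which is a design choice left open here.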