7th Annual Document Recognition and Retrieval Conference, San-Jose, Costa Rica, 26-27 January 2000, vol. 3967, pp. 128-139
In this paper, we present a logical representation of form documents for use in identification and retrieval. We propose a hierarchical structure that represents the logical structure of a form using its lines. The approach is top-down and uses no domain knowledge, such as the preprinted or filled-in data. Logically identical forms are associated with the same hierarchical structure. The representation can handle geometrical modifications and slight variations.
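As a rough illustration of the idea, and not the algorithm from the paper, the sketch below builds a tree top-down by cutting a region along lines that span it completely. It assumes the lines have already been extracted as axis-aligned segments. The names `build_tree` and `TOL`, the segment format, and the tolerance value are all hypothetical choices made for this example.

```python
# Illustrative sketch only: a top-down, line-based hierarchy for a form.
# Assumed input format (hypothetical):
#   horizontal line: (y, x_start, x_end)
#   vertical line:   (x, y_start, y_end)
#   region:          (x0, y0, x1, y1)

TOL = 2  # assumed pixel tolerance for "spans the whole region"


def build_tree(region, h_lines, v_lines):
    """Recursively split a region along lines that cross it completely."""
    x0, y0, x1, y1 = region

    # Horizontal lines strictly inside the region that span its full width.
    cuts = sorted(
        y for (y, a, b) in h_lines
        if y0 + TOL < y < y1 - TOL and a <= x0 + TOL and b >= x1 - TOL
    )
    if cuts:
        bounds = [y0] + cuts + [y1]
        return ("H", tuple(
            build_tree((x0, top, x1, bottom), h_lines, v_lines)
            for top, bottom in zip(bounds, bounds[1:])
        ))

    # Otherwise, vertical lines strictly inside that span its full height.
    cuts = sorted(
        x for (x, a, b) in v_lines
        if x0 + TOL < x < x1 - TOL and a <= y0 + TOL and b >= y1 - TOL
    )
    if cuts:
        bounds = [x0] + cuts + [x1]
        return ("V", tuple(
            build_tree((left, y0, right, y1), h_lines, v_lines)
            for left, right in zip(bounds, bounds[1:])
        ))

    # No spanning line left: this region is a leaf cell.
    return "cell"


# Form A: a header row above two side-by-side cells.
tree_a = build_tree((0, 0, 200, 100), [(30, 0, 200)], [(100, 30, 100)])

# Form B: the same layout, stretched to a different size.
tree_b = build_tree((0, 0, 400, 300), [(90, 0, 400)], [(200, 90, 300)])

# Form C: a different layout (two full-height columns).
tree_c = build_tree((0, 0, 200, 100), [], [(100, 0, 100)])
```

Because the tree records only the direction and order of the cuts, not their coordinates, forms A and B yield the same structure despite their different sizes, while form C yields a different one. A real system would also need line extraction and handling of skew and broken lines, which this sketch leaves out.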