A heuristic algorithm for hierarchical representation of form documents

Duygulu, P
Atalay, Mehmet Volkan
Dincel, E
In this paper, we aim to develop a logical representation for form documents. We propose a hierarchical structure that represents the logical layout of a form using its lines. The approach is top-down and uses no domain knowledge, such as the preprinted or filled-in data. Forms that are logically the same but physically different are associated with the same hierarchical tree. The representation can handle geometric modifications and slight variations.
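
To make the idea of a line-based, top-down hierarchy concrete, the following is a minimal illustrative sketch and not the authors' algorithm, which the abstract does not specify. It assumes the form's lines have already been extracted as axis-aligned segments, and it recursively splits a region at any line that spans it completely, in the manner of a recursive X-Y cut. The names `Line`, `build_tree`, and the tolerance `tol` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Line:
    """Axis-aligned segment: horizontal if y1 == y2, vertical if x1 == x2."""
    x1: float
    y1: float
    x2: float
    y2: float


Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def build_tree(box: Box, lines, tol: float = 1.0):
    """Recursively split `box` at lines that span it fully (assumed heuristic).

    Returns a nested tuple that records only the split direction and the
    number of children, so absolute coordinates do not affect the result.
    """
    left, top, right, bottom = box

    # Horizontal lines strictly inside the box that cover its full width.
    h_cuts = sorted({
        ln.y1 for ln in lines
        if ln.y1 == ln.y2
        and top + tol < ln.y1 < bottom - tol
        and min(ln.x1, ln.x2) <= left + tol
        and max(ln.x1, ln.x2) >= right - tol
    })
    if h_cuts:
        edges = [top, *h_cuts, bottom]
        return ("H", tuple(build_tree((left, a, right, b), lines, tol)
                           for a, b in zip(edges, edges[1:])))

    # Vertical lines strictly inside the box that cover its full height.
    v_cuts = sorted({
        ln.x1 for ln in lines
        if ln.x1 == ln.x2
        and left + tol < ln.x1 < right - tol
        and min(ln.y1, ln.y2) <= top + tol
        and max(ln.y1, ln.y2) >= bottom - tol
    })
    if v_cuts:
        edges = [left, *v_cuts, right]
        return ("V", tuple(build_tree((a, top, b, bottom), lines, tol)
                           for a, b in zip(edges, edges[1:])))

    return "cell"  # no spanning line left: this region is a leaf


# Two made-up forms with the same logical layout but different geometry:
# a header row on top of a body that is divided into two columns.
tree_a = build_tree((0, 0, 100, 100),
                    [Line(0, 30, 100, 30), Line(50, 30, 50, 100)])
tree_b = build_tree((0, 0, 200, 80),
                    [Line(0, 20, 200, 20), Line(120, 20, 120, 80)])
print(tree_a)
print(tree_a == tree_b)
```

In this sketch both forms reduce to the same nested tuple, which illustrates how logically identical but physically different forms can map to one tree. A real system would also need line extraction from the scanned image and tolerance for skewed or broken lines, neither of which is modeled here.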