ICDAR 2003

Competitions

Table Location

NEW: Training data are now available.
To obtain the ground-truth files, contact Simone Marinai (simone@dsi.unifi.it).

The aim of this competition is to locate tables in document images. 

Data Sets:  Image files will not be distributed to competitors, because they belong to widely available data sets:

1)  images on the UWIII CD-ROM (about 1400 images);
2)  images of papers in the IEEE on-line digital library (about 570 images).
Input format:   files in the second set are in PDF format. Several tools can convert them to image formats, for instance the pdftopbm tool.
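The conversion step could be scripted along the following lines. This is only a sketch: it swaps in the `pdftoppm` command (from Xpdf/Poppler) with its `-mono` option, which writes one PBM file per page, rather than the pdftopbm tool named above. It assumes `pdftoppm` is installed and on the PATH. The function names, the output layout and the 300 dpi resolution are illustrative choices, not contest requirements.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(pdf_path: str, out_root: str, dpi: int = 300) -> List[str]:
    """Build a pdftoppm command line that renders each page as a bilevel PBM.

    The 300 dpi default is an assumption, not a value set by the contest.
    """
    return ["pdftoppm", "-mono", "-r", str(dpi), pdf_path, out_root]


def convert_pdf(pdf_path: str, out_dir: str, dpi: int = 300) -> List[Path]:
    """Convert one PDF to page images and return the generated PBM files."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(pdf_path).stem
    # pdftoppm appends "-<page number>.pbm" to the output root.
    subprocess.run(build_command(pdf_path, str(out / stem), dpi), check=True)
    return sorted(out.glob(stem + "-*.pbm"))
```

Calling `convert_pdf("paper.pdf", "images")` would then leave the page images in the `images` directory.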
Output format: the expected outputs are the coordinates of the tables (if any) in the image. Pages without tables will also be included, and for these the system should report no tables. The required output format will be specified.

Ground truth data for both sets will be distributed to participants. Competitors can provide results for both sets or for only one.
Sample ground truth (with some simple information) can be downloaded here as a zip file.

Contest participation:   Interested groups should contact Simone Marinai (Email: simone@dsi.unifi.it).

Contest organization:   The contest will be organized as follows:

- Distribution of sample ground-truth (already available);
- Specification of trial datasets and corresponding ground-truth data (now available);
- Specification of test datasets and corresponding ground-truth data (April 2003);
- Collection and evaluation of results in a fixed format (details to follow in April 2003).

Performance evaluation:   we will collect two main pieces of information for each system:

- A chart reporting the number of found, missed, split, merged and false tables;
- An overall measure based on the Table Location Index that was introduced in "Trainable table location in document images" - F. Cesarini, S. Marinai, L. Sarti, G. Soda (ICPR 2002, Vol. 3, pp. 236-240).
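To make the first item concrete, the sketch below counts found, missed, split, merged and false tables from bounding boxes. The matching rule is an assumption made for illustration: two boxes correspond whenever they overlap at all, and split and merged cases are counted independently. The contest's actual definitions, and the Table Location Index itself, are those of the cited paper and are not reproduced here. The names `Box`, `overlap_area` and `classify_tables` are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) with x0 < x1 and y0 < y1


def overlap_area(a: Box, b: Box) -> int:
    """Area of the intersection of two axis-aligned boxes (0 if disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)


def classify_tables(truth: List[Box], detected: List[Box]) -> Counter:
    """Count found, missed, split, merged and false tables.

    Assumed rule: any non-zero overlap links a ground-truth table
    to a detected table.
    """
    dets_per_truth = [
        [j for j, d in enumerate(detected) if overlap_area(t, d) > 0]
        for t in truth
    ]
    truths_per_det = [
        [i for i, t in enumerate(truth) if overlap_area(t, d) > 0]
        for d in detected
    ]
    counts = Counter(found=0, missed=0, split=0, merged=0, false=0)
    for dets in dets_per_truth:
        if not dets:
            counts["missed"] += 1   # no detection touches this table
        elif len(dets) > 1:
            counts["split"] += 1    # one table covered by several detections
        elif len(truths_per_det[dets[0]]) == 1:
            counts["found"] += 1    # one-to-one correspondence
    for truths in truths_per_det:
        if not truths:
            counts["false"] += 1    # detection touches no real table
        elif len(truths) > 1:
            counts["merged"] += 1   # one detection covers several tables
    return counts


truth = [(0, 0, 100, 50), (0, 100, 100, 150), (0, 200, 100, 250),
         (0, 300, 100, 350), (0, 400, 100, 450)]
detected = [(0, 0, 100, 50), (0, 100, 100, 120), (0, 130, 100, 150),
            (0, 190, 100, 360), (200, 0, 300, 50)]
result = classify_tables(truth, detected)
```

In this toy page, `result` holds one table in each of the five categories. A stricter rule, such as a minimum overlap ratio, could replace the `> 0` tests without changing the structure.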

The main interest of the competition is to understand what is easy and what is difficult in this task, not to determine a winner.

Acknowledgement:  thanks to Matt Hurst for the preliminary organization of a table understanding exercise (planned at DAS 2002) and for helpful suggestions.
 
 


Page updated by Simone Marinai and Lorenzo Sarti

Dante research group



 

Hosted with kind thanks to the University of Essex, © 2002.
