Abstract
Several features for neural-network-based document region identification are tested, with a focus on features for identifying headline and subheadline regions. The neural-network-based region identification algorithm is a key component of a document recognition system that segments a document into regions, classifies them as text, graphic, photo, or other region types, and then uses this classification to guide the processing and analysis of the image. The input data are unusually challenging: low-quality images of newspaper documents obtained from microfilmed archives. Experiments on several newspaper documents show that the features used are capable of robust and accurate headline identification.
Original language | English |
---|---|
Pages | 2283-2287 |
Number of pages | 5 |
State | Published - 2003 |
Event | International Joint Conference on Neural Networks 2003 - Portland, OR, United States; Duration: 20 Jul 2003 → 24 Jul 2003 |
Conference
Conference | International Joint Conference on Neural Networks 2003 |
---|---|
Country/Territory | United States |
City | Portland, OR |
Period | 20/07/03 → 24/07/03 |