Abstract
This paper presents an offline handwriting recognition system for Bangla script that detects characters and diacritics sequentially with Faster R-CNN. The approach is entirely segmentation-free: characters and their associated diacritics are detected separately by two networks, named C-Net and D-Net, both prepared by transfer learning from VGG-16. The essay scripts from the Boise State Bangla Handwriting Dataset, together with standard data augmentation techniques, were used for training and testing. The F1 scores of C-Net and D-Net are 89.6% and 93.2%, respectively. The two detection modules were then fused into a word recognition unit with a character error rate (CER) of 11.2% and a word error rate (WER) of 24.4%. A spell checker further reduced these errors to 8.9% and 21.5%, respectively. The same method is likely to be equally effective on several other Abugida scripts similar to Bangla.
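For readers unfamiliar with the two error metrics reported above, the following is a minimal sketch of how CER and WER are commonly computed from Levenshtein (edit) distance. It is not taken from the paper: the function names `edit_distance`, `cer`, and `wer` are illustrative, and the sketch assumes that characters are counted as Unicode code points and that words are split on whitespace. The paper may count Bangla characters and diacritics differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from ref
                curr[j - 1] + 1,     # insertion into ref
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits / reference length.

    Assumption: one character = one Unicode code point.
    """
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits / number of reference words.

    Assumption: words are separated by whitespace.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)
```

Under these assumptions, `cer("kitten", "sitting")` gives 0.5 (three edits over six reference characters), and `wer("the cat sat", "the cat sit")` gives one third (one wrong word out of three). Because both rates are normalized by the reference length, they can exceed 100% when the hypothesis contains many insertions.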
Original language | American English |
---|---|
Title of host publication | 2019 International Conference on Document Analysis and Recognition (ICDAR) |
State | Published - 2019 |
Event | ICDAR 2019: 15th International Conference on Document Analysis and Recognition - Sydney, Australia; Duration: 23 Sep 2019 → … |
Keywords
- Bangla handwriting recognition
- character spotting
- handwriting recognition using Faster R-CNN
- offline handwriting recognition
- segmentation-free handwriting recognition
EGS Disciplines
- Electrical and Computer Engineering