Introducing the XXX Bangla Handwriting Dataset and an Efficient Offline Recognizer of Isolated Bangla Characters

Nishatul Majid, Elisa H. Barney Smith

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

This paper presents a publicly accessible Bangla offline handwriting dataset, along with a benchmark based on a simple and robust recognition scheme for isolated handwritten characters. The dataset, named the XXX Bangla Handwriting Dataset, contains 2 pages. The first has a 104-word, 364-character essay that uses 49 basic characters, all 11 vowel diacritics and 32 high-frequency consonant conjuncts. The second page contains 84 isolated units covering all basic characters, numbers, vowel diacritics and several high-frequency conjuncts. The initial release is based on the voluntary contributions of 100 different writers. A distinguishing feature of this database is that all of its contents are tagged with the associated ground truth information at different levels of the component hierarchy, such as characters, words and lines. It is expected to be useful for research on offline Bangla handwriting recognition, particularly with segmentation-based approaches. Furthermore, a basic character recognition method is presented in which features are extracted based on zonal pixel counts, structural strokes and grid points with U-SURF descriptors modeled with a bag of features. The highest classification accuracy, obtained with an SVM classifier using a cubic kernel, is 95.4%; this result uses the isolated characters from the XXX dataset together with 3 other datasets to ensure the versatility and robustness of the process.
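To make the zonal pixel count feature concrete, the following is a minimal sketch and not the authors' implementation. It assumes a binarized character image given as a list of rows (1 for ink, 0 for background) and an evenly divided zone grid; the function name `zonal_pixel_counts`, the grid size and the toy glyph are hypothetical choices made for illustration only.

```python
def zonal_pixel_counts(image, rows=4, cols=4):
    """Count foreground pixels in each cell of a rows x cols grid.

    `image` is a binary image as a list of equal-length rows
    (truthy = ink). The grid size is an illustrative assumption,
    not a value taken from the paper.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    counts = [0] * (rows * cols)
    for y, line in enumerate(image):
        # Map the pixel row to a zone row; clamp to the last zone.
        zone_row = min(y * rows // height, rows - 1)
        for x, pixel in enumerate(line):
            if pixel:
                zone_col = min(x * cols // width, cols - 1)
                counts[zone_row * cols + zone_col] += 1
    return counts


# Toy 4x4 "glyph" with ink in the top-left and bottom-right corners.
glyph = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
features = zonal_pixel_counts(glyph, rows=2, cols=2)
print(features)
```

A vector like this would be only one part of the feature set described in the abstract, which also includes structural strokes and U-SURF descriptors. One possible way to realize a cubic-kernel SVM on such vectors is scikit-learn's `SVC(kernel="poly", degree=3)`; whether the authors used that library is not stated here.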

Original language: American English
Title of host publication: Proceedings: 2018 16th International Conference on Frontiers in Handwriting Recognition: ICFHR 2018
State: Published - 1 Jan 2018

Keywords

  • Bangla character recognition
  • Bangla handwriting database
  • Bangla handwriting recognition

EGS Disciplines

  • Electrical and Computer Engineering
