Handwriting Recognition in Historical Documents

2012-07-17
Research

This talk addresses recent advances in pattern recognition methods for handwriting recognition in historical documents.

Tuesday, July 17th, 2012

Starts at 11:00 CEST
TU Wien, Campus Favoritenstraße
Seminarraum FAV 01 A
1040 Vienna, Favoritenstraße 11
Stiege 4, 1. Stock, links, Raum HE0102

This talk addresses recent advances in pattern recognition methods for handwriting recognition in historical documents. The aim of these methods is to automatically extract textual content from digitized manuscript images. Once their textual content is available, millions of historical manuscript images could be integrated into digital libraries, which would help to preserve our cultural heritage by making it readily accessible to researchers and the public.

Two state-of-the-art strategies for modeling and recognizing characters, words, and sentences are discussed: first, a generative strategy using hidden Markov models (HMMs), and second, a discriminative strategy using a special form of recurrent neural network (NN). These learning-based systems are generic in the sense that they can learn character appearance models for arbitrary alphabetic languages, as long as a number of training samples are provided. They operate at the level of text lines, avoiding prior word and character segmentation, which is error-prone in the presence of touching characters, broken characters, variable word spacing, and difficult image conditions stemming, for example, from paper texture, damaged parchment, faded ink, and ink bleed-through.
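
To illustrate how an HMM can label a whole text line without segmenting it first, the following is a minimal textbook sketch of Viterbi decoding. It is not the system presented in the talk: the two character states, the feature symbols "x" and "y", and all probabilities are made-up assumptions, and each character is modeled by a single state for brevity.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence for a sequence of frame observations."""
    # best[t][s]: log-probability of the best path that ends in state s at frame t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t-1][s]: best predecessor of state s at frame t
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}

# Hypothetical toy model: two characters, two discrete feature symbols per frame.
states = ["a", "b"]
log_start = logs({"a": 0.5, "b": 0.5})
log_trans = {"a": logs({"a": 0.6, "b": 0.4}), "b": logs({"a": 0.4, "b": 0.6})}
log_emit = {"a": logs({"x": 0.9, "y": 0.1}), "b": logs({"x": 0.1, "y": 0.9})}

# One feature symbol per image column, read left to right along the text line.
frames = ["x", "x", "y", "y", "y"]
path = viterbi(frames, states, log_start, log_trans, log_emit)

# Collapse runs of the same state into one character. This toy step cannot
# represent doubled letters; multi-state character models would be needed for that.
text = "".join(s for i, s in enumerate(path) if i == 0 or s != path[i - 1])
print(path, text)
```

The character boundary between the second and third frame falls out of the decoding itself, which is the sense in which line-level recognition avoids a separate segmentation step.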

Four subproblems of handwriting recognition in historical documents are addressed in this talk: ground truth creation, automatic transcription, keyword spotting, and transcription alignment. Experimental results are presented for several historical scripts and languages. The IAM historical document database (IAM-HistDB) includes Latin texts from the 9th century written in Carolingian minuscule (Saint Gall database), medieval German texts from the 13th century written in Gothic minuscule (Parzival database), and longhand English texts from the 18th century (George Washington database). The experimental results are promising in terms of accuracy, speed, and cost for indexing historical documents in digital libraries.

Note

This talk is organized by the Computer Vision Group at the Institute of Computer Aided Automation.

Speakers

Andreas Fischer, University of Fribourg, Switzerland
