Handwritten word recognition with context-dependent Hidden Markov Models: application to French, English and Arabic handwriting.
Reconnaissance de mots manuscrits cursifs par modèles de Markov cachés en contexte : application





1 A2iA SA
2 LTCI - Laboratoire Traitement et Communication de l'Information

Abstract : This thesis develops a handwritten word recognition system that can be trained on and applied to any handwriting style and any alphabet. The approach is analytic: words are divided into subparts (characters or graphemes) that have to be modelled. The division is made implicitly by sliding windows, which transform the word images into sequences. Hidden Markov Models (HMMs), widely known as one of the most powerful tools for sequence modelling, are chosen to model the characters. Each character is represented by a Bakis-type HMM, which enables the model to absorb variations in handwriting. A word model is built by concatenating the models of its component characters. In this thesis, the choice is made to strengthen the HMM modelling by acting directly within the models. To this end, a new approach is proposed that uses context knowledge: each character model depends on its context (its preceding and following characters). This new character model is named a trigraph. Taking the character's environment into account allows more precise and more effective models to be built. However, it multiplies the number of HMM parameters to be learned, often from a restricted amount of observation data. An original method for parameter grouping is proposed in this thesis to overcome this issue: state-based clustering, performed on each state position and based on binary decision trees. This type of clustering is new in the handwriting recognition field. It has many advantages, including parameter reduction. Moreover, the use of decision trees allows the HMMs to keep one of their most interesting attributes: independence between the training and testing lexicons.
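The modelling steps named in the abstract (a Bakis-type HMM per character, a trigraph per character in context, a word model built by concatenation) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the `#` word-boundary symbol, the three states per character, the single allowed skip and the uniform initial probabilities are all assumptions made for illustration.

```python
from typing import List, Tuple

Trigraph = Tuple[str, str, str]  # (preceding char, char, following char)


def bakis_transitions(n_states: int, max_skip: int = 1) -> List[List[float]]:
    """Left-to-right (Bakis) transition matrix with uniform starting values.

    From state i the model may stay in i, advance to i+1, or skip ahead by
    up to `max_skip` extra states. Backward transitions are impossible.
    The exit transition to the next character model is omitted here.
    """
    matrix = []
    for i in range(n_states):
        reachable = range(i, min(i + max_skip + 2, n_states))
        row = [0.0] * n_states
        for j in reachable:
            row[j] = 1.0 / len(reachable)
        matrix.append(row)
    return matrix


def word_to_trigraphs(word: str, boundary: str = "#") -> List[Trigraph]:
    """Expand a word into one context-dependent unit per character.

    The boundary symbol standing in for the missing neighbour at the word
    edges is an assumption of this sketch.
    """
    padded = boundary + word + boundary
    return [
        (padded[i - 1], padded[i], padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


def word_state_sequence(word: str, states_per_char: int = 3) -> List[Tuple[Trigraph, int]]:
    """Concatenate trigraph models: list every (trigraph, state position)."""
    return [
        (trigraph, position)
        for trigraph in word_to_trigraphs(word)
        for position in range(states_per_char)
    ]


if __name__ == "__main__":
    for trigraph in word_to_trigraphs("the"):
        print(trigraph)
    for row in bakis_transitions(4):
        print(row)
```

Because every state is labelled by its trigraph and its position, the sketch also shows why the parameter count grows quickly: each distinct (preceding, character, following) triple gets its own states, which is the problem the position-wise state clustering described in the abstract is meant to address.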

Résumé (translated from French) : The objective of this thesis is to develop a handwritten word recognition system that can be trained on and applied to different handwriting styles. The approach is analytic: words are split into subparts (characters) to be modelled. The splitting is done implicitly through sliding windows, which transform word images into sequences. Hidden Markov Models (HMMs) are the method chosen to learn the character models. Each character is represented by a Bakis-type HMM, which absorbs variations in handwriting between writers. Words are then rebuilt by concatenating the models that compose them. In this thesis, the choice is made to improve the HMM modelling of characters by acting at the very core of the models. To this end, a new approach is proposed that uses context in the modelling: a character is modelled according to its context, and its model is called a trigraph. Taking a character's environment into account, however, multiplies the HMM parameters to be learned from an often limited amount of observation data. An original parameter grouping method is proposed in this work: position-wise state clustering using binary decision trees. This type of clustering, new to handwriting recognition systems, lets the system reduce the number of parameters while preserving one of the main attractions of HMMs: the use of a test lexicon independent of the training lexicon.


Keywords : handwriting recognition, off-line, HMM, contextual models, binary trees

Mots-clés (translated from French) : handwriting recognition, off-line, HMM, context-dependent models, clustering, binary trees, multiscript





Author: Anne-Laure Bianne Bernard

Source: https://hal.archives-ouvertes.fr/


