3.3 SEGMENTATION
The preprocessing stage yields a clean document, in the sense that it retains maximal shape information with maximal compression and minimal noise on a normalized image. The next stage segments the document into its subcomponents and extracts the relevant features to feed to the training and recognition stages. Segmentation is an important stage because the accuracy with which words, lines, or characters can be separated directly affects the recognition rate of the script. There are two types of segmentation:
1. External segmentation, which is the isolation of various writing units, such as paragraphs, sentences, or words, prior to recognition.
2. Internal segmentation, which is the isolation of letters, especially in cursively written words.
This project makes use of external segmentation, which decomposes the page layout into its logical parts and thereby saves computation during document analysis. Page layout analysis is accomplished in two stages. The first stage is structural analysis, which is concerned with segmenting the image into blocks of document components (paragraphs, rows, words, etc.).
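To make the idea of external segmentation concrete, the sketch below splits a binarized page into text lines and then splits one line into blocks separated by blank columns. It uses projection profiles (row and column ink counts), which is a common technique but an assumption here; the text does not state which method the project uses. The function names `segment_runs`, `segment_lines`, and `segment_columns` and the toy page are hypothetical.

```python
import numpy as np

def segment_runs(profile, min_ink=1):
    """Return (start, end) pairs of consecutive positions with ink count >= min_ink.

    `end` is exclusive, so each pair can be used directly as a slice.
    """
    runs = []
    start = None
    for i, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = i                      # a run of ink begins
        elif count < min_ink and start is not None:
            runs.append((start, i))        # the run ended at the previous position
            start = None
    if start is not None:                  # run reaches the end of the profile
        runs.append((start, len(profile)))
    return runs

def segment_lines(binary_page):
    """Split a binary page (1 = ink, 0 = background) into text-line row ranges."""
    return segment_runs(binary_page.sum(axis=1))   # horizontal projection profile

def segment_columns(binary_line):
    """Split one text line into column ranges separated by blank columns."""
    return segment_runs(binary_line.sum(axis=0))   # vertical projection profile

# Toy page (assumed data): two blocks on the first line, one on the second.
page = np.zeros((8, 10), dtype=int)
page[1:3, 1:4] = 1
page[1:3, 6:9] = 1
page[5:7, 2:8] = 1

lines = segment_lines(page)
first_line = page[lines[0][0]:lines[0][1]]
words_first_line = segment_columns(first_line)
print(lines)
print(words_first_line)
```

On real handwriting, a blank column can also occur between letters of the same word, so a practical version would likely need a minimum gap width to tell word boundaries from letter boundaries.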
3.4 FEATURE EXTRACTION
During or after the segmentation procedure, the feature set used in the training and recognition stages is extracted. Feature sets play one of the most important roles in a recognition system. A good feature set should represent the characteristics of a class that help distinguish it from other classes, while remaining invariant to variations within the class. In this project, the Gabor transform is used, which is a variation of the windowed Fourier transform. In this case, the window does not have a sharply bounded, fixed size; instead, it is defined by a Gaussian function.
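As an illustration of a Gaussian-windowed Fourier analysis, the sketch below builds a two-dimensional Gabor kernel (a cosine carrier multiplied by a Gaussian envelope) and uses it to compute one feature per orientation. The parameter names and values (`size`, `sigma`, `theta`, `wavelength`, `gamma`, `psi`), the mean-absolute-response feature, and the FFT-based circular convolution are all assumptions made for the example, not details taken from the project.

```python
import numpy as np

def gabor_kernel(size=15, sigma=3.0, theta=0.0, wavelength=6.0, gamma=1.0, psi=0.0):
    """Real part of a 2-D Gabor kernel: Gaussian envelope times a cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the carrier oscillates along direction `theta`.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength + psi)
    return envelope * carrier

def gabor_features(image, thetas):
    """Mean absolute filter response for each orientation (assumed feature choice)."""
    image_fft = np.fft.fft2(image)
    features = []
    for theta in thetas:
        kernel = gabor_kernel(theta=theta)
        # Circular convolution via the FFT; the kernel is zero-padded to the image size.
        response = np.fft.ifft2(image_fft * np.fft.fft2(kernel, s=image.shape)).real
        features.append(np.abs(response).mean())
    return np.array(features)

# Toy image (assumed data): vertical stripes with a period of 6 pixels.
cols = np.arange(36)
image = np.tile(np.cos(2.0 * np.pi * cols / 6.0), (36, 1))

kernel = gabor_kernel()
features = gabor_features(image, thetas=[0.0, np.pi / 2.0])
print(features)
```

The filter tuned to the stripe direction (`theta = 0`) responds much more strongly than the perpendicular one, which is the property that makes such responses usable as orientation-sensitive features.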
3.5 TRAINING AND RECOGNITION TECHNIQUES
The training and recognition stages of the Yoruba handwriting recognition system are based on neural networks. A neural network is defined as a computing architecture that consists of a massively parallel interconnection of simple neural processors. Because of its parallel nature, it can perform computations at a higher rate than classical techniques. Because of its adaptive nature, it can adapt to changes in the data and learn the characteristics of the input signal. It is used in pattern recognition by defining nonlinear regions in the feature space. A neural network contains many nodes. The output from one node is fed to another node in the network, and the final decision depends on the complex interaction of all nodes.
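The way interconnected nodes define a nonlinear region can be shown with a very small feedforward network. The sketch below is an assumption-laden toy, not the project's network: the architecture (two inputs, two hidden nodes, one output), the sigmoid activation, and the hand-picked weights are all hypothetical, and a real system would learn its weights from training data. The network separates the XOR pattern, which no single linear boundary can do.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Feed the input through each (weights, bias) layer in turn."""
    activation = x
    for weights, bias in layers:
        # The output of every node in one layer is fed to every node in the next.
        activation = sigmoid(activation @ weights + bias)
    return activation

# Hand-picked weights (assumed, not trained):
# hidden node 1 acts like OR, hidden node 2 like NAND, the output node like AND.
layers = [
    (np.array([[20.0, -20.0],
               [20.0, -20.0]]), np.array([-10.0, 30.0])),
    (np.array([[20.0],
               [20.0]]), np.array([-30.0])),
]

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
outputs = forward(inputs, layers)
predictions = (outputs[:, 0] > 0.5).astype(int)
print(predictions)
```

Neither hidden node alone classifies the four points correctly; the decision emerges only from combining their outputs, which mirrors the point that the final decision depends on the interaction of all nodes.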
