Stephen V. Rice, Ph.D.     Computer Scientist,  Software Engineer,  Author,  Teacher

Pattern Recognition


Computers outperform humans at many tasks, but lag behind us when it comes to recognizing patterns.  For example, humans adeptly identify objects in a scene, but the most advanced computer systems struggle with the task.  With little effort, we detect even the most subtle variations in human speech, while speech-recognition systems labor to determine which words have been uttered.  Computers lag behind not only humans: a dog can navigate terrain better than the most sophisticated robot; and a bat's ability to echolocate flying insects is well beyond today's computer pattern-recognition systems.

 

In a pattern-recognition system, characteristics or "features" of an unknown pattern are analyzed, and the pattern is placed or "classified" by the system into one of several classes.  It is an error for the system to place the pattern into the wrong class.  A simple speech recognizer might classify a spoken word as "yes" or "no", considering only these two possibilities or classes.  On the other hand, a large-vocabulary speech recognizer determines which word in a dictionary has been spoken, clearly a much harder problem with a greater chance of making an error.
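To make the idea of features and classes concrete, here is a minimal sketch of a two-class "yes"/"no" recognizer using a nearest-centroid rule. The two numeric features and the centroid values are invented for illustration, and the function name is hypothetical; a real speech recognizer would use far richer features and a more capable classifier.

```python
import math

def nearest_centroid(features, centroids):
    """Classify a feature vector as the class whose centroid is closest to it."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

# Hypothetical class centroids in a two-feature space (values are made up).
centroids = {"yes": (0.30, 0.80), "no": (0.20, 0.30)}

# Features measured from an unknown spoken word (also made up).
unknown = (0.28, 0.70)

predicted = nearest_centroid(unknown, centroids)
print(predicted)
```

If the unknown word was in fact "no", this classification would count as an error. Adding more classes to the dictionary of centroids gives the classifier more ways to be wrong, which is one reason large-vocabulary recognition is harder.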

 

In optical character recognition (OCR), a computer system analyzes a scanned document page and attempts to identify the characters in the page image.  If the characters are printed clearly, an OCR system will accurately recognize them.  However, consider the following examples.  You will have no trouble recognizing the phrases, "operational efficiencies" and "DESIGN ISSUES", but an OCR system will struggle mightily.  The OCR system does not expect a lowercase "o" to appear in pieces, for example, or the characters "ES" to be fused together.

You may have seen such phrases on Web sites and been asked to type the words that appear.  This is known as a CAPTCHA, a mechanism for determining whether a human or computer is accessing the Web site.  A human can decipher the phrase, but a computer cannot.  When a human types the phrase correctly, the Web site knows that a human is using the site.  This mechanism cleverly exploits humans' superiority in pattern recognition to block computers known as "bots" from using the site.

 

From 2006 to 2008, I taught a graduate course on Pattern Recognition at the University of Mississippi.  Topics included linear and nonlinear classifiers, decision trees, Bayes decision theory, k-nearest neighbors, neural networks, feature selection and invariants, similarity/dissimilarity measures, optical character recognition, speech recognition, clustering algorithms, and system evaluation.  For their class project, students surveyed one of the following topics: face recognition, fingerprint recognition, gesture recognition, handwriting recognition, iris recognition, medical image analysis, motion analysis in video, satellite image change detection, speaker recognition, and target recognition in radar images.

 

I began working in pattern recognition in 1991 when I joined the Information Science Research Institute (ISRI) at the University of Nevada, Las Vegas (UNLV).  The U.S. Department of Energy tasked ISRI with evaluating the performance of commercial OCR systems.  From 1992 to 1996, I conducted the UNLV Annual Test of OCR Accuracy.  These were the first large-scale independent evaluations of commercial OCR systems, and they attracted international interest and participation.  We not only measured the accuracy of commercial OCR systems but also charted their progress from year to year, and these competitions spurred improvements in OCR accuracy.  These tests had a profound impact on the OCR industry and influenced a series of mergers and acquisitions among competing firms.

 

For these tests, I developed a number of new OCR performance measures using sequence-comparison algorithms.  I documented these measures and algorithms in my doctoral dissertation, and I implemented them in a suite of software tools which is used today by Google and other companies to measure the accuracy of OCR systems.
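As an illustration of how sequence comparison can yield an accuracy measure, the sketch below computes the edit distance between the correct text and an OCR output, then derives a character accuracy from it. This is a common textbook formulation given under stated assumptions; it is not a reproduction of the specific measures or tools mentioned above, and the function names are hypothetical.

```python
def edit_distance(truth, ocr):
    """Minimum number of character insertions, deletions, and substitutions
    needed to turn the OCR output into the correct text (Levenshtein distance)."""
    prev = list(range(len(ocr) + 1))
    for i, t in enumerate(truth, 1):
        curr = [i]
        for j, o in enumerate(ocr, 1):
            cost = 0 if t == o else 1
            curr.append(min(prev[j] + 1,          # character missing from OCR output
                            curr[j - 1] + 1,      # extra character in OCR output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]

def character_accuracy(truth, ocr):
    """Fraction (n - errors) / n, where n is the length of the correct text."""
    n = len(truth)
    if n == 0:
        raise ValueError("correct text must not be empty")
    return (n - edit_distance(truth, ocr)) / n

errors = edit_distance("DESIGN ISSUES", "DESIGN ISSUFS")
accuracy = character_accuracy("DESIGN ISSUES", "DESIGN ISSUFS")
print(errors, round(accuracy, 4))
```

Note that this accuracy can be negative when the OCR output contains many spurious characters, since the number of errors may exceed the length of the correct text.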

 

To many people working in pattern recognition, it is more glamorous to design pattern-recognition systems than to test them.  However, an inaccurate pattern-recognition system is not useful, no matter how elegant its design.  Conducting meaningful tests of system performance is a nontrivial undertaking that is essential for assessing and improving the accuracy of a pattern-recognition system.

 

In 1991, I developed a "voting" OCR system.  This system presented the same document page image to multiple commercial OCR systems, aligned their text outputs to locate differences of opinion regarding the identity of characters in the page image, and resolved these differences by majority vote.  The resulting output contained fewer errors than the output of any individual participating system.  This work inspired the development of new OCR systems that contain multiple OCR systems within them, for example, PrimeOCR from Prime Recognition, Inc.
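The majority-vote step can be sketched as follows. This is a simplified illustration, not the 1991 system itself: it assumes the outputs have already been aligned to equal length, with a hypothetical gap symbol "~" marking positions where a system produced no character. The alignment step, which is the harder part, is omitted.

```python
from collections import Counter

GAP = "~"  # hypothetical placeholder for "no character at this position"

def vote(aligned_outputs):
    """Resolve disagreements among pre-aligned OCR outputs by majority vote.

    Ties go to the candidate that appears first in system order, because
    Counter.most_common keeps first-encountered order among equal counts.
    """
    if len({len(s) for s in aligned_outputs}) != 1:
        raise ValueError("outputs must be non-empty and aligned to equal length")
    result = []
    for column in zip(*aligned_outputs):
        winner, _ = Counter(column).most_common(1)[0]
        if winner != GAP:  # a winning gap means "no character here"
            result.append(winner)
    return "".join(result)

# Three made-up outputs, each with a different single-character error.
outputs = ["DESIGN ISSUFS", "DESIGN ISSUES", "DE5IGN ISSUES"]
voted = vote(outputs)
print(voted)
```

In this example each system makes an error at a different position, so the other two outvote it every time. Voting helps less when the systems tend to make the same mistakes on the same characters.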

 

In 1999, I published a research monograph entitled "Optical Character Recognition: An Illustrated Guide to the Frontier".  This book categorizes and depicts sources of error in machine-printed character recognition, and paints a picture of the state of the art, which has benefited OCR users, OCR researchers, and CAPTCHA developers.

 

Since 1997, I have worked in the area of audio pattern recognition and have developed pattern-recognition systems for locating similar sounds in audio databases and for real-time detection of abnormal machine sounds.  Please see Computer Audio for details.

 

Selected Writings


S. V. Rice, G. Nagy, and T. A. Nartker, Optical Character Recognition: An Illustrated Guide to the Frontier, Kluwer Academic Publishers, Norwell, MA, 1999

 

S. V. Rice, Measuring the Accuracy of Page-Reading Systems, Doctoral Dissertation, University of Nevada, Las Vegas, 1996

 

S. V. Rice, F. R. Jenkins, and T. A. Nartker, "The Fifth Annual Test of OCR Accuracy," presented at the Fifth Annual Symposium on Document Analysis and Information Retrieval, Las Vegas, NV, 1996

 

J. Kanai, S. V. Rice, T. A. Nartker, and G. Nagy, "Automated Evaluation of OCR Zoning," IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(1), 1995

 

S. V. Rice, J. Kanai, and T. A. Nartker, "An Algorithm for Matching OCR-Generated Text Strings," International Journal of Pattern Recognition and Artificial Intelligence, 8(5), 1994