An algorithm is a sequence of steps that can be followed by a computer to perform a task or solve a problem. The computer cannot by itself determine the algorithm: a human must present the algorithm to the computer in a computer program. One of the challenges faced by human software developers is the choice of algorithms in their computer programs. New algorithms may be designed or existing algorithms may be selected. The implementation of these algorithms in computer software may be simple or complex.
When choosing among multiple algorithms that solve the same problem, the efficiency of the algorithms may be of great importance. Of particular interest is the running time, or speed, of an algorithm, but other forms of efficiency may also be of concern, such as memory use and ease of implementation. In "analysis of algorithms", the running time of an algorithm is determined and specified as a function of the size of the input.
An algorithm operates on data stored in memory or on disk, and typically depends on the manner in which this data is organized. This organization is called a data structure, and like algorithms, there are many data structures to choose from, with various trade-offs.
From 2005 to 2010, I taught a senior-level undergraduate course on Algorithms and Data Structures, and a follow-on course at the graduate level, at the University of Mississippi. Topics included analysis of algorithms, searching and sorting, binary search trees, binary heaps, hash tables, B-trees, algorithm paradigms (divide-and-conquer, greedy, dynamic programming), graph algorithms, sequence comparison, and intractability (NP-completeness). The algorithms and data structures covered in these courses comprise an essential toolbox for students interested in a career in software development.
I love the beauty and elegance of well-designed algorithms and data structures. The first classes I took on the subject were in 1979, as an undergraduate at Western Michigan University, and then as a graduate student in Edward Reingold's class at the University of Illinois. My favorite class at Illinois was taught in 1980 by Franco Preparata. It was one of the first classes ever taught on computational geometry and analyzed algorithms for solving geometric problems. Working in industry in the 1980s, I designed and implemented algorithms and data structures in commercial software products.
In the 1990s at the University of Nevada, Las Vegas, I became interested in sequence-comparison algorithms. When an optical character recognition (OCR) system processes a document page image, it produces text output which may contain errors. Using sequence-comparison algorithms, I aligned the OCR-generated output with the correct output to locate the errors and measure the accuracy of the OCR system. The techniques I developed are described in my doctoral dissertation. I also developed a voting system that aligned the output from multiple OCR systems and took a majority vote to produce a more accurate single output. While working on sequence-comparison algorithms, I discovered important theoretical properties of sequence alignments and I proved the relationship between two well-known problems in computer science: the string-editing problem and the longest-common-subsequence problem. I published these results in Algorithmica in 1997.
Starting in 1997, I began working on algorithms for comparing and searching sounds. This led to the creation of FindSounds.com, the first Web search engine for sound effects, which debuted in August 2000. Efficient algorithms and data structures are essential for providing fast query processing in a Web search engine. Users are unwilling to wait more than a few seconds to receive search results.
In 2005, I removed an inefficiency in the Simscript simulation engine by combining two well-known data structures, an AVL tree and a doubly-linked list, into a single data structure called a "braided AVL tree". I published a paper on this data structure in 2007.
S. V. Rice, "Braided AVL Trees for Efficient Event Sets and Ranked Sets in the Simscript III Simulation Programming Language," in Proceedings of the International Conference on High Level Simulation Languages and Applications, San Diego, CA, 2007 (pdf)
S. V. Rice, H. Bunke, and T. A. Nartker, "Classes of Cost Functions for String Edit Distance," Algorithmica, 18(2), 1997
S. V. Rice, Measuring the Accuracy of Page-Reading Systems, Doctoral Dissertation, University of Nevada, Las Vegas, 1996 (pdf)
S. V. Rice, J. Kanai, and T. A. Nartker, "An Algorithm for Matching OCR-Generated Text Strings," International Journal of Pattern Recognition and Artificial Intelligence, 8(5), 1994