Original: http://cstheory.stackexchange.com/questions/19759/core-algorithms-deployed/19773
To demonstrate the importance of algorithms (e.g. to students and professors who don't do theory or are even from entirely different fields) it is sometimes useful to have ready at hand a list of examples where core algorithms have been deployed in commercial, governmental, or widely-used software/hardware. I am looking for such examples that satisfy the following criteria:
Update: Thanks again for the great answers and links! Some people comment that it is hard to satisfy the criteria because core algorithms are so pervasive that it's hard to point to a specific use. I see the difficulty. But I think it is worthwhile to come up with specific examples because in my experience telling people: "Look, algorithms are important because they are just about everywhere!" does not work.



Algorithms that are the main driver behind a system are, in my opinion, easier to find in non-algorithms courses for the same reason theorems with immediate applications are easier to find in applied mathematics rather than pure mathematics courses. It is rare for a practical problem to have the exact structure of the abstract problem in a lecture. To be argumentative, I see no reason why fashionable algorithms course material such as Strassen's multiplication, the AKS primality test, or the Moser-Tardos algorithm is relevant for low-level practical problems of implementing a video database, an optimizing compiler, an operating system, a network congestion control system, or any other system. The value of these courses is learning that there are intricate ways to exploit the structure of a problem to find efficient solutions. Advanced algorithms is also where one meets simple algorithms whose analysis is non-trivial. For this reason, I would not dismiss simple randomized algorithms or PageRank. I think you can choose any large piece of software and find basic and advanced algorithms implemented in it. As a case study, I've done this for the Linux kernel and shown a few examples from Chromium. Basic Data Structures and Algorithms in the Linux kernel. Links are to the source code on GitHub.
Data Structures and Algorithms in the Chromium Web Browser. Links are to the source code on Google Code. I'm only going to list a few. I would suggest using the search feature to look up your favourite algorithm or data structure.
There are also such data structures and algorithms in the third-party code included in the Chromium code.
Programming Language Libraries
I think they are worth considering. Programming language designers thought it was worth the time and effort of some engineers to implement these data structures and algorithms so others would not have to. The existence of libraries is part of the reason we can find basic data structures reimplemented in software that is written in C, but less so for Java applications.
Allocation and Scheduling Algorithms. I find these interesting because even though they are called heuristics, the policy you use dictates the type of algorithm and data structure that are required, so one needs to know about stacks and queues.
Core utils in *nix systems
Cryptographic Algorithms. This could be a very long list. Cryptographic algorithms are implemented in all software that can perform secure communications or transactions.
Compilers
Compression and Image Processing
Conflict-Driven Clause Learning. Since the year 2000, the running time of SAT solvers on industrial benchmarks (usually from the hardware industry, though other sources are used too) has decreased nearly exponentially every year. A very important part of this development is the Conflict-Driven Clause Learning algorithm, which combines the Boolean Constraint Propagation algorithm in the original paper of Davis, Logemann, and Loveland with the technique of clause learning that originated in constraint programming and artificial intelligence research. For specific, industrial modelling, SAT is considered an easy problem (see this discussion). To me, this is one of the greatest success stories in recent times because it combines algorithmic advances spread over several years, clever engineering ideas, experimental evaluation, and a concerted communal effort to solve the problem. The CACM article by Malik and Zhang is a good read. This algorithm is taught in many universities (I have attended four where that was the case) but typically in a logic or formal methods class. Applications of SAT solvers are numerous. IBM, Intel, and many other companies have their own SAT solver implementations. The package manager in openSUSE also uses a SAT solver.
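As an illustration of the propagation step, here is a minimal sketch of Boolean Constraint Propagation (unit propagation) in Python. This is only the inner loop shared by DPLL and CDCL, not a full clause-learning solver, and all names are my own. Clauses are lists of nonzero integers: positive for a variable, negative for its negation.

```python
def unit_propagate(clauses, assignment):
    """Boolean Constraint Propagation: while some clause has no satisfied
    literal and exactly one unassigned literal, that literal is forced.
    Returns the extended assignment, or None on a conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned, satisfied = [], False
            for lit in clause:
                var, want = abs(lit), lit > 0
                if var in assignment:
                    if assignment[var] == want:
                        satisfied = True      # clause already satisfied
                        break
                else:
                    unassigned.append(lit)
            if satisfied:
                continue
            if not unassigned:
                return None                   # conflict: clause falsified
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0  # unit clause forces a value
                changed = True
    return assignment
```

In a CDCL solver this loop runs after every decision, and the conflict case triggers clause learning and backjumping rather than a plain failure.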


PageRank is one of the best-known such algorithms. Developed by Google co-founder Larry Page and co-authors, it formed the basis of Google's original search engine and is widely credited with helping them achieve better search results than their competitors at the time. We imagine a "random surfer" starting at some webpage and repeatedly clicking a random link to take him to a new page. The question is, "What fraction of the time will the surfer spend at each page?" The more time the surfer spends at a page, the more important the page is considered. More formally, we view the internet as a graph where pages are nodes and links are directed edges. We can then model the surfer's action as a random walk on a graph, or equivalently as a Markov chain with transition matrix M. After dealing with some issues to ensure that the Markov chain is ergodic (where does the surfer go if a page has no outgoing links?), we compute the amount of time the surfer spends at each page as the steady-state distribution of the Markov chain. The algorithm itself is in some sense trivial: we just compute M^k π_0 for large k and arbitrary initial distribution π_0. This just amounts to repeated matrix-matrix or matrix-vector multiplication. The algorithmic content is mainly in the setup (ensuring ergodicity, proving that an ergodic Markov chain has a unique steady-state distribution) and convergence analysis (dependence on the spectral gap of M).
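The power iteration described above fits in a few lines of Python. The damping factor and dangling-page handling below are the standard fixes for ergodicity; the function and parameter names are mine:

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration M^k pi_0 on the damped random-surfer Markov chain.

    links: dict mapping each page to the list of pages it links to.
    With probability 1 - damping the surfer teleports to a uniformly
    random page, which makes the chain ergodic."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # arbitrary initial distribution
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            out = links[p]
            if out:                              # distribute rank along out-links
                share = damping * rank[p] / len(out)
                for q in out:
                    new[q] += share
            else:                                # dangling page: spread uniformly
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank
```

On a two-page cycle `{'a': ['b'], 'b': ['a']}` this converges to equal ranks of 0.5, as symmetry demands.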


I would mention the widely-used software CPLEX (or similar) implementation of the Simplex method/algorithm for solving linear programming problems. It is the (?) most used algorithm in economics and operations research. "If one would take statistics about which mathematical problem is using up most of the computer time in the world, then (not counting database handling problems like sorting and searching) the answer would probably be linear programming." (L. Lovász, A new linear programming algorithm - better or worse than the simplex method? Math. Intelligencer 2 (3) (1979/80) 141-146.) The Simplex algorithm also has great influence in theory; see, for instance, the (Polynomial) Hirsch Conjecture. I guess a typical undergraduate or Ph.D. class in algorithms deals with the Simplex algorithm (including basic algorithms from linear algebra like Gaussian elimination). (Other successful algorithms, including Quicksort for sorting, are listed in Algorithms from the Book.)


As I understand it, the National Resident Matching Program was for a long time just a straight application of the Gale-Shapley algorithm for the stable marriage problem. It has since been slightly updated to handle some extra details like spousal assignments (aka the "two-body problem"), etc…
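For reference, a minimal sketch of the basic Gale-Shapley proposal algorithm (without the NRMP's extensions such as couples) might look like this; the names are illustrative and complete preference lists are assumed:

```python
def gale_shapley(proposer_prefs, reviewer_prefs):
    """Stable matching: proposers propose in preference order;
    each reviewer holds the best offer seen so far.
    Assumes complete preference lists on both sides."""
    free = list(proposer_prefs)                  # proposers not yet matched
    next_choice = {p: 0 for p in proposer_prefs}
    engaged = {}                                 # reviewer -> proposer
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in reviewer_prefs.items()}
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]    # best reviewer not yet tried
        next_choice[p] += 1
        if r not in engaged:
            engaged[r] = p
        elif rank[r][p] < rank[r][engaged[r]]:   # reviewer prefers new proposer
            free.append(engaged[r])
            engaged[r] = p
        else:
            free.append(p)                       # rejected; p proposes again later
    return {p: r for r, p in engaged.items()}
```

The classical result is that this terminates with a stable matching that is optimal for the proposing side, which is exactly why the choice of who proposes mattered in the NRMP debate.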


If you're also including PhD-level stuff, many (most?) graduate CS programs include some course in coding theory. If you have a course in coding theory, you will definitely cover the Reed-Solomon code, which is integral to how compact discs work, and Huffman encoding, which is used in the JPEG, MP3, and ZIP file formats. Depending on the orientation of the course, you may also cover Lempel-Ziv, which is used in the GIF format. Personally, I got Lempel-Ziv in an undergraduate algorithms course, but I think that might be atypical.
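Huffman encoding in particular is a short greedy algorithm. Here is a minimal sketch (symbol frequencies in, codewords out), with names of my own choosing:

```python
import heapq
import itertools

def huffman_codes(freqs):
    """Optimal prefix code: repeatedly merge the two least frequent
    subtrees, prefixing '0' to one side and '1' to the other.
    freqs: dict mapping symbol -> frequency (at least two symbols)."""
    tie = itertools.count()                     # break frequency ties deterministically
    heap = [(f, next(tie), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)       # two cheapest subtrees
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]                           # symbol -> codeword
```

For frequencies a:3, b:2, c:1 the most frequent symbol gets a one-bit code and the rest two bits, which is what the optimality guarantee predicts for this input.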


GNU grep is a command-line tool for searching one or more input files for lines containing a match to a specified pattern. It is well-known that grep is very fast! Here's a quote from its author Mike Haertel (taken from here):


More generally, the Kanellakis Prize is awarded by the ACM precisely for such theoretical discoveries that have had a major impact in practice. The 2012 award was for locality-sensitive hashing, which has become a go-to method for dimensionality reduction in data mining for near-neighbor problems (and is relatively easy to teach, at least the algorithm itself).


The Count-Min Sketch and Count Sketch, from data streaming algorithms, are used in industrial systems for network traffic analysis and analysis of very large unstructured data. These are data structures that summarize the frequency of a huge number of items in a tiny amount of space. Of course they do that approximately, and the guarantee is that, with high probability, the frequency of any item is approximated to within an ε-fraction of the total "mass" of all items (the mass is the first or second moment of the frequency vector). This is good enough to find "trending" items, which is what you need in a lot of applications. Some examples of industrial uses of these data structures are:
Here is also a site that collects information about applications of Count-Min. As far as teaching, I know that basic sketching techniques are taught at Princeton in undergraduate discrete math courses. I was taught the Count-Min sketch in my first algorithms course. In any case, the analysis of Count-Min is simpler than the analysis for almost any other randomized algorithm: it's a straightforward application of pairwise independence and Markov's inequality. If this is not standard material in most algorithms courses, I think it's for historical reasons.
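Since the analysis is that simple, the data structure is short too. Here is a minimal Count-Min sketch in Python, using the standard (a·x + b) mod p construction for pairwise-independent hashes; the class and parameter names are mine:

```python
import random

class CountMinSketch:
    """d rows of w counters, each row with its own pairwise-independent
    hash. With w = ceil(e/eps) and d = ceil(ln(1/delta)), the estimate
    overcounts by at most eps * (total mass) with probability 1 - delta."""

    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width, self.depth = width, depth
        self.p = (1 << 31) - 1                  # a Mersenne prime
        # (a*x + b) mod p mod w is a pairwise-independent hash family
        self.hashes = [(rng.randrange(1, self.p), rng.randrange(self.p))
                       for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, x):
        a, b = self.hashes[row]
        return ((a * hash(x) + b) % self.p) % self.width

    def add(self, x, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, x)] += count

    def estimate(self, x):
        # counters only overcount, so the minimum over rows is tightest
        return min(self.table[row][self._index(row, x)]
                   for row in range(self.depth))
```

The one-sided error is visible in the code: collisions can only add to a counter, so `estimate` never undercounts, and taking the row minimum plus Markov's inequality per row gives the guarantee stated above.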


Check out Jens Vygen's project BonnTools for Chip Design: http://www.or.uni-bonn.de/~vygen/projects.html I have heard some talks on this and also looked at some of their papers. They use Raghavan-Thompson style randomized rounding as well as the multiplicative weights update method for solving large-scale multicommodity flow LPs. However, like any big project, they need to do some engineering as well, but the methodology is very much based on well-known algorithms.


In the last decade algorithms have been used to increase the number (and quality, I think?) of kidney transplants through various kidney donor matching programs. I’ve been having trouble finding the latest news on this, but here are at least a few pointers:
This is one of my favorite examples not just of the success of algorithms, but of the importance of studying algorithms for NP-complete problems. They can literally save lives, and have already done so!


Viterbi's algorithm, which is still widely used in speech recognition and multiple other applications: http://en.wikipedia.org/wiki/Viterbi_algorithm The algorithm itself is basic dynamic programming. From Wikipedia: "The Viterbi algorithm was proposed by Andrew Viterbi in 1967 as a decoding algorithm for convolutional codes over noisy digital communication links.[1] The algorithm has found universal application in decoding the convolutional codes used in both CDMA and GSM digital cellular, dial-up modems, satellite, deep-space communications, and 802.11 wireless LANs. It is now also commonly used in speech recognition, speech synthesis, keyword spotting, computational linguistics, and bioinformatics. For example, in speech-to-text (speech recognition), the acoustic signal is treated as the observed sequence of events, and a string of text is considered to be the 'hidden cause' of the acoustic signal. The Viterbi algorithm finds the most likely string of text given the acoustic signal."
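The dynamic program is short enough to sketch directly. This is a minimal Viterbi implementation over an explicit HMM, with illustrative names and probabilities given as nested dicts:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Dynamic programming over an HMM: V[t][s] holds the probability of
    the best state sequence ending in state s that explains obs[:t+1],
    together with that sequence."""
    V = [{s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            # best predecessor: maximize prob of reaching s while emitting obs[t]
            prob, path = max(
                (V[t - 1][prev][0] * trans_p[prev][s] * emit_p[s][obs[t]],
                 V[t - 1][prev][1] + [s])
                for prev in states)
            V[t][s] = (prob, path)
    return max(V[-1].values())                  # (best prob, best path)
```

Run on the textbook healthy/fever example (observations normal, cold, dizzy), it recovers the expected hidden sequence H, H, F.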




Funny that I saw this question today and then coincidentally clicked this link on the many uses of the Fourier Transform. 
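For the curious, the classic radix-2 Cooley-Tukey FFT behind most of those uses is itself only a few lines. This sketch assumes the input length is a power of two:

```python
import cmath

def fft(x):
    """Radix-2 Cooley-Tukey FFT (len(x) must be a power of two):
    split into even/odd-indexed halves, recurse, then combine the
    half-transforms with twiddle factors."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out
```

The recursion turns the naive O(n²) DFT into O(n log n), which is the entire reason the transform is everywhere.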

Singular value decomposition (SVD) has a strong connection to statistical factor analysis or principal components analysis, is comprehensible within an undergraduate linear algebra or statistics class, and has many important theoretical properties. It also plays a role in image compression algorithms. It was a key element in the winning entries in the $1M Netflix prize contest (one of the world's largest data-mining competitions in history) and is now implemented on their site to predict user ratings. It is also known to be highly related to Hebbian self-organizing neural networks, which originate in biological theory. There is some connection also to gradient descent, which is used widely in machine learning and artificial neural networks and as a very universally applied optimization technique, of which Newton's method is a basic second-order relative. There is a gradient-descent algorithm for obtaining the SVD.
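One way to see the connection to iterative methods is power iteration for the top singular triple. This is an illustrative pure-Python sketch (names are mine), not how production SVD libraries work:

```python
def top_singular(A, iterations=200):
    """Power iteration on A^T A: alternately multiply by A and A^T,
    normalizing each time, to converge to the top singular value sigma
    and the corresponding singular vectors u, v (A given as a list of rows)."""
    m, n = len(A), len(A[0])
    v = [1.0] * n                               # arbitrary nonzero start vector
    for _ in range(iterations):
        # u = A v, normalized
        u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
        norm_u = sum(x * x for x in u) ** 0.5
        u = [x / norm_u for x in u]
        # v = A^T u; its norm converges to sigma
        v = [sum(A[i][j] * u[i] for i in range(m)) for j in range(n)]
        sigma = sum(x * x for x in v) ** 0.5
        v = [x / sigma for x in v]
    return sigma, u, v
```

On a diagonal matrix such as diag(3, 1) this converges to σ = 3 with v aligned to the first axis; deflating (subtracting σ·u·vᵀ) and repeating yields the remaining singular triples.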

Finding an Eulerian path is at the base of genome assembly, a task commonly performed when working with full genomes (in bioinformatics, medicine, forensics, ecology). UPDATE: Forgot this obvious one: UPS, FedEx, and USPS all have to solve large instances of the Traveling Salesman Problem every night. It saves them a lot of time and money to send the drivers on an optimal route. UPDATE 2: The minimum feedback vertex set problem is used for deadlock resolution in many OSes.
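A standard way to find such a path is Hierholzer's algorithm: walk until stuck, then splice in detours at vertices that still have unused edges. A minimal sketch for directed graphs (illustrative names, no input validation):

```python
from collections import defaultdict

def eulerian_path(edges):
    """Hierholzer's algorithm on a directed graph given as (u, v) pairs.
    Assumes an Eulerian path exists; returns it as a list of vertices."""
    graph = defaultdict(list)
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    for u, v in edges:
        graph[u].append(v)
        out_deg[u] += 1
        in_deg[v] += 1
    # start at a vertex with one more outgoing than incoming edge, if any
    start = edges[0][0]
    for v in list(graph):
        if out_deg[v] - in_deg[v] == 1:
            start = v
    stack, path = [start], []
    while stack:
        v = stack[-1]
        if graph[v]:
            stack.append(graph[v].pop())   # keep walking along unused edges
        else:
            path.append(stack.pop())       # stuck: vertex is finished
    return path[::-1]
```

In the genome-assembly setting the vertices are (k-1)-mers and the edges are the sequenced k-mer reads, so an Eulerian path spells out a candidate genome.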


I like this system for saving the maximum number of lives in the UK with kidney transplants, based on maximum matching algorithms: Paired and Altruistic Kidney Donation. They match up people needing kidneys who have a nonmatching friend/relative willing to donate, with other people in the same situation, in a maximal way. Then on donation day, all the donors donate at the same time, followed by a speedy transport of kidneys around the country to the recipients. 

The Knuth-Morris-Pratt string search is widely used, specific, and taught in undergraduate/graduate CS.
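A minimal sketch of the algorithm, for concreteness (the failure table is the heart of it; names are mine):

```python
def kmp_search(text, pattern):
    """Return all match positions of pattern in text in O(|text| + |pattern|).
    fail[i] = length of the longest proper prefix of pattern[:i+1] that is
    also a suffix, letting the scan resume without re-reading text."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):            # build the failure table
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    matches, k = [], 0
    for i, c in enumerate(text):                # scan the text once
        while k and c != pattern[k]:
            k = fail[k - 1]
        if c == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]                     # allow overlapping matches
    return matches
```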


I'm rather surprised that with all the fancy algorithms above, nobody mentioned the venerable Lempel-Ziv family of compression algorithms (invented in 1977/78).
Update Apparently it was mentioned briefly already. 

This relatively new book is worth considering as a complete/detailed answer to the question in convenient, extended/collected form, and it could be used as supplemental material for an algorithms class. [Some of these have already been mentioned; the strong overlap itself is notable.]



Thinking of very basic algorithms, it is nice to show them appearing in real life:
A. Many groups use a kind of covering-tree algorithm to communicate, by dividing telephone lists hierarchically among people.
B. Cars at an intersection usually use a round-robin algorithm (in a voluntary way).
C. Most places, such as banks and hospitals, serve their clients in FIFO order.


Bresenham's line algorithm is the single most useful algorithm I've come across. It is easy to understand, and I've used it for lots of applications, from line drawing to a complex spliner for a 3D casting engine to a complex polygon renderer, as well as for complex animation and scaling uses.
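A minimal integer-only sketch of the algorithm, in the generalized form that covers all octants (names are mine):

```python
def bresenham(x0, y0, x1, y1):
    """Integer-only line rasterization: step along both axes, using an
    error term to decide which axis advances, so no floating point is
    needed. Returns the list of pixels from (x0, y0) to (x1, y1)."""
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1                   # step direction per axis
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    points = []
    while True:
        points.append((x0, y0))
        if (x0, y0) == (x1, y1):
            break
        e2 = 2 * err
        if e2 >= dy:                            # error favors an x step
            err += dy
            x0 += sx
        if e2 <= dx:                            # error favors a y step
            err += dx
            y0 += sy
    return points
```

The same "accumulate an integer error term" trick generalizes to circles and ellipses, which is part of why it shows up in so many rendering contexts.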

A fascinating algorithmic problem arises in the medical application of CT scans. In Computed Tomography (CT), the body is exposed to X-rays from different angles. At one end of the scanner are the X-ray transmitters and at the other end the sensors. From such a series of scans, an image is reconstructed for the physician to examine! The filtered back projection algorithm is the basis for reconstruction of an image from a set of scans. This algorithm is actually a form of an approximation problem in which the "signal" is sampled below the Nyquist rate. This algorithm is in use "behind the scenes" at all hospitals, and basic filtered back projection utilizes undergraduate math such as Fourier transforms to achieve the Fourier Slice Theorem.

An example of FFT: I once helped port an FFT algorithm to a different system language. The algorithm was being used to determine line breaks in coaxial delivery of cable TV/internet/phone. Basically a technician would request a signal to be sent to the customer's box; at the same time they would pull up a real-time display of the statistics for the specific customer, such as QoS, dB, …. The technician could use the data and a graph to determine, within a few feet, where between the house and the pole a partial break existed (or multiple breaks, as I was told). As mentioned above, FFT is widely used, but this was one of the more blatant and obvious uses (in terms of why and how) I have seen in practice. Sorry I had to keep it at a high level.

Wikipedia has a decent collection of algorithms/applications classified more or less in a list. Microsoft provides the top cited papers, but without any explicit explanation of the area of computer science or the application. There is also a chronological list of best paper awards from different CS conferences, http://jeffhuang.com/best_paper_awards.html, compiled by Prof. Huang. Spectral clustering is an elegant clustering algorithm; a well-known instance is the normalized cuts algorithm introduced by Jianbo Shi and Jitendra Malik for image segmentation. It has also been well developed in data clustering applications, and is a good intersection between the two communities.

Customizable Route Planning (Daniel Delling, Andrew V. Goldberg, Thomas Pajor, and Renato F. Werneck) http://research.microsoft.com/apps/pubs/default.aspx?id=145688 powers Bing Maps: http://www.bing.com/blogs/site_blogs/b/maps/archive/2012/01/05/bingmapsnewroutingengine.aspx 

Rosetta Code lists applied algorithms by Programming Task (692) and by Programming Language (518) with Semantic MediaWiki. 

Two other personal favorite examples are firmly rooted in computer science but maybe easily overlooked by abstractionist theoreticians. They have undergone enormous/transformative advances and had major-to-massive practical impact in daily life over the last few decades. Already an entire generation has grown up not knowing the world without them. Basically, the category is modelling and simulation.


Maybe all the major/preferred algorithms of interest to this audience have been mentioned at this point. However, a few more deserve mention for completeness, along with some analysis of what is considered a significant algorithm. In CS and IT there is a phenomenon noticed long ago in AI called "moving the goalposts": a psychological phenomenon where the field advances relatively quickly but people quickly adjust mentally to "the new normal" and come to regard real or even breakthrough advances as mundane or unremarkable in retrospect, i.e. downplayed or minimized once accomplished. This is highly visible in this question in the way algorithms move from R&D into "deployment". Quoting the author of the question in later comments:
But this is problematic and basically a TCS-centric redefinition of the word "algorithm". Presumably the interesting algorithms are advanced. Does that mean that once a problem is reduced to an advanced algorithm, it is no longer "interesting"? And "advanced" is clearly a moving target. So there is a way to define "algorithms" narrowly or broadly. The TCS definition seems to change with context, but note that even in TCS there is a trend toward the broad definition, e.g. in the so-called "algorithmic lens". Sometimes the most ubiquitous algorithms are also the most overlooked! The internet and the WWW form a large environment, almost an ecology, for algorithms. Still relatively young at only about two decades old (invented ~1991), the WWW has grown massively and exponentially in a short amount of time; its growth has probably outpaced even the famous exponential Moore's law. The internet and WWW are supported by many sophisticated algorithms. The internet has complex routing algorithms built into routers (again powering multibillion-dollar corporations such as Cisco), and some advanced theory is applicable there, e.g. in routing algorithms. These algorithms were a subject of emerging, cutting-edge research decades ago, but are now so fine-tuned and well understood that they are somewhat invisible. We should not so soon forget that decades ago, leading researchers were not even sure the internet would work or was possible (seen in early packet-switching research, a radical new design pattern at the time, departing from the prior circuit switching), and even a few years ago there were fears that it would fail to scale at some point and begin to fail due to overwhelming spikes in volume. The internet also uses sophisticated error detection/correction. It is probably the largest, most fault-tolerant system ever built by humans, and still growing. Next, there is a strong case to make that the algorithms powering the WWW are advanced.
HTTP and web servers are highly tuned/optimized and also use advanced security/encryption protocols (HTTPS). The rendering logic of a web page has become extremely advanced in HTML5 and CSS3, along with the JavaScript programming language. The relatively new CSS has various principles similar to OOP programming, such as reusability and inheritance. Speaking of typesetting, TeX is an important, internally complex scientific typesetting system (not so different from a programming language) invented by Knuth that can now be rendered on web pages (and is used in possibly hundreds of thousands of scientific papers or more). Another relatively new area of algorithms building on the internet, still emerging, is those based on collective intelligence. Stack Exchange software itself is an example of a sophisticated collective-intelligence system. Social networking also exhibits the key features of collective intelligence, and features are continually being added to increase that intelligence (for example, Facebook "Likes" are only a few years old). The field of rating systems is based on collaborative-filtering algorithms and is still evolving based on new research and applications. So, in short, these are all revolutionary successes transforming daily human experience, quite far beyond mere "field goals". As the title of the question states, they are all core algorithms deployed, now so ubiquitous and invisible as to be, in the IT expression, "part of the plumbing".


An amazingly successful (hardware) algorithm is the power-on reset. Without a system whereby a computer is in a known state when power is applied, nothing else happens right. Power-on reset is why anything works at all that has a CPU in it, whether considered embedded or otherwise. Next time you're at the watering hole for programmers and computer scientists, raise your glass of cherry soda to the power-on reset.

