Real-world Applications of Optical Character Recognition

Christopher Marshall

Data science technical writer

SUMMARY

Optical character recognition (OCR) has come a long way since the 1970s, when OCR systems were esoteric services with limited reach in the business world. Today, the picture is very different: millions of people around the world use OCR every day for a wide range of purposes.

We’ve previously explored what OCR is and how to use it. In this post, we explore three real-world use cases illustrating how OCR has helped modernize businesses and improve accessibility on a global scale.

Stadtwerke München: parking made easy in Munich

Stadtwerke München, a municipal company owned by the city of Munich, was one of the first major government-led organizations to adopt a novel form of OCR: license plate scanning. The company offers public services such as public transport and parking management, so parking inspection is an important part of its work. Manual parking inspection can be quite a chore, however. To ensure accuracy, parking inspectors had to type each license plate number into the system twice. Not only was this cumbersome, it also tied up resources that could have been deployed elsewhere.

Stadtwerke München found its solution in Anyline, an OCR app that lets parking inspectors capture license plate numbers accurately in under one second, saving considerable time and effort during parking management. A spokesperson for Stadtwerke München said OCR has “made it much faster for our parking attendants to collect license plate data. This efficiency improvement at the base of our parking control operations will help us to improve parking control as a whole within Munich.”

Cambridge Assessment: automated examinations

Cambridge Assessment is another organization successfully using OCR to automate complex tasks. It runs some of the world’s biggest examinations, including GCSEs, IGCSEs, and Cambridge A Levels, and processes over one million physical papers per year. Examination papers are handwritten by students, and Cambridge examiners work remotely from all over the world, which makes it infeasible to send them physical copies of answer scripts. To solve this problem, Cambridge Assessment turned to technology.

Simply scanning the papers and sending the images won’t do the job: it gets the papers to examiners electronically, but marking remains laborious, because examiners still have to go through every answer script and sift out the relevant information. Instead, Cambridge Assessment uses OCR technology to scan each paper that passes through its system. This allows for proper documentation and sorting of answers. For example, once papers are scanned into the system, computers can group sets of papers by age range, location, examination type, examination level, and more. Nobody needs to sit at a sorting center grouping papers by hand.

Additionally, OCR allows answers to be separated by question. If an examiner wants to mark “question 2” from a given paper, OCR technology can pull out every “question 2” answer from all of the papers so that the examiner can mark them in one batch. This is far faster than having an examiner open each document, scroll to question 2, and mark the answers one by one.
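The per-question separation described above can be sketched in a few lines of Python. This is only an illustration, not Cambridge Assessment’s actual pipeline: it assumes the OCR step has already produced plain text for each script, and that every answer begins with a marker line such as “Question 2”. The marker format, the script IDs, and both function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: each answer starts with a line like "Question 2".
# Real exam scripts would need a marker pattern matched to their layout.
QUESTION_MARKER = re.compile(r"^\s*Question\s+(\d+)\s*$", re.IGNORECASE | re.MULTILINE)


def split_by_question(ocr_text):
    """Split one script's OCR text into {question_number: answer_text}."""
    answers = {}
    matches = list(QUESTION_MARKER.finditer(ocr_text))
    for i, match in enumerate(matches):
        start = match.end()
        # An answer runs until the next marker, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(ocr_text)
        answers[int(match.group(1))] = ocr_text[start:end].strip()
    return answers


def group_by_question(scripts):
    """Collect answers from many scripts as {question_number: [(script_id, answer), ...]}."""
    grouped = defaultdict(list)
    for script_id, ocr_text in scripts.items():
        for number, answer in split_by_question(ocr_text).items():
            grouped[number].append((script_id, answer))
    return dict(grouped)


# Made-up OCR output for two scripts.
scripts = {
    "script-001": "Question 1\nParis\nQuestion 2\nx = 4",
    "script-002": "Question 1\nLyon\nQuestion 2\nx = 5",
}
grouped = group_by_question(scripts)
print(grouped[2])  # every "question 2" answer, ready to mark in one batch
```

Keeping the script ID next to each answer means the marks can be written back to the right candidate after the examiner has worked through a whole question.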

Google Translate: the one on every smartphone

One of the most widely used OCR applications is the camera feature in Google Translate, which revolutionized the way we communicate across languages. Instead of typing text into the app, you can simply point your mobile phone camera at the text and it will be scanned into Google Translate via OCR.

This is extremely useful, and it can sometimes be the only viable method of translation. Suppose you’re in a foreign country and you see a signboard. A conventional translation app only helps if you can type in what the sign says, and some languages don’t use the same script as English, so you may not be able to enter the correct characters at all. Bringing OCR to translation changes everything: you no longer need to know the language, or even its alphabet, to understand it. Google’s OCR identifies the text and passes it into the translator, which tells you almost instantly what has been written. For millions of travelers worldwide, this has been a game-changer, especially in countries with vastly differing language scripts (China, Japan, Russia, India, etc.).

Not only does Google Translate’s OCR feature unlock a new level of worldwide communication, it also removes a lot of inefficiency. There is no more fumbling with character input when you want to read a sign or a foreign menu: OCR does it in seconds with a single scan. Even so, the time saved is secondary to the new travel possibilities that such instantaneous translation opens up.

The three features OCR always brings

You’ll notice that the individual use cases of OCR differ (parking management, education assistance, worldwide communication, and more), but the key improvements that OCR brings are common to all of them: speed, efficiency, and accessibility. These are the three benefits OCR can be expected to bring to an existing text input system.

Do you know of any creative or powerful examples of OCR in the real world, or do you need help using OCR in your business? Reach out to us and let us know.

We’re going to continue exploring various techniques and use cases in the world of NLP over the coming weeks, so stay tuned to our blog to learn more.