Named entity recognition (NER) is a key component of natural language processing (NLP). It involves locating mentions of entities in raw text and classifying them into predefined categories, such as “person”, “company”, “place”, and “time”. As we explored in a previous post, NER use cases extend across any situation where a high-level overview of a large quantity of text can provide valuable insight.
In this post, we’re going to explore in more detail some examples of NER out there in the wild, looking at how companies and organizations have successfully deployed the technique to help them achieve their goals.
Before we analyze the use cases in more detail, let’s clarify some of the key concepts necessary to assess the effectiveness of an NER model:
Booking.com, a leading hotel-booking platform, is in “constant communication” with its customers. These interactions across its platform provide a “plethora of different textual information”. The company looked to make the most of this goldmine of information, acknowledging that, “first and foremost, we should recognise the entities in the text”.
To do this, they compared three models:
The team put together a small use case that allowed them to compare the performance of the three methods:
The top 10% of clicked destinations were used to build a sample dataset for the prototype models presented.
They identified three entities for labeling:
The full dataset, built from different combinations of destinations, facilities, and property types, came to around 200,000 rows, and 20% of it was held out as a test set to evaluate the models.
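To make the 80/20 split concrete, here is a minimal sketch in plain Python. The function name `train_test_split`, the fixed seed, and the stand-in integer rows are all assumptions for illustration; the source does not say how the Booking.com team actually split their data.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction of them as a test set."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # First n_test shuffled rows become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]

# Stand-in for roughly 200,000 labeled rows (hypothetical data).
rows = list(range(200_000))
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))  # prints "160000 40000"
```

Shuffling before splitting matters when rows are generated combinatorially, since otherwise the test set could end up holding only the last few destinations or property types.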
Ultimately, they concluded that L2S (learning to search) was by far the best model for their purposes, scoring better overall on precision, recall, and F1 score, with structural SVM in second place. You can check out the individual scores of each method here.
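For readers who want to see how precision, recall, and F1 are typically computed for NER, the sketch below scores predictions at the entity level using exact span-and-label matches. This is one common convention, not necessarily the one the Booking.com team used, and the label names and example query are hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Entity-level scores: a prediction counts only if span and label both match."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Entities as (start_token, end_token_exclusive, label) for the hypothetical
# query "hotel in Amsterdam with pool".
gold = [(0, 1, "PROPERTY_TYPE"), (2, 3, "DESTINATION"), (4, 5, "FACILITY")]
# The model gets two entities right but tags "with" instead of "pool".
predicted = [(0, 1, "PROPERTY_TYPE"), (2, 3, "DESTINATION"), (3, 4, "FACILITY")]

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here two of the three predicted entities are correct and two of the three gold entities are found, so precision, recall, and F1 all come out at two thirds.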
The data science team recognized the potential power of the technique to “help tackle [their] problem by providing a better understanding of the various textual inputs of our customers”. They noted that NER techniques “fit well with the top priority of applied Data Science in Booking.com—to enhance the experience and satisfaction of Booking.com customers.”
A research project at German fashion ecommerce company Zalando introduced an NLP framework named Flair. This fashion-forward (and openly available) framework demonstrates impressive F1 scores across a range of NLP tasks. It also builds on PyTorch, which makes it straightforward to train your own models.
Zalando demonstrated the framework with a design twist, using NER to identify fashion brands, occasions, seasons, colors and clothing items.
Learn more at the Zalando research blog.
The European Holocaust Research Infrastructure project (EHRI) used NER to automatically parse thousands of transcribed oral testimonials and recognize entities such as names, places, organizations, and events, with the aim of making a museum catalog more accessible and interactive.
They faced some unique NER challenges, as the oral testimonials they were working on included wandering thoughts, varied references to people and places (including places that would be in one country one year and another the next), unusual names rarely found in sources such as Wikipedia, and Eastern European names with more than one transliteration into Latin characters.
As they needed precise recognition on a messy and complicated database, they quickly identified the need for “detailed guidelines specifying positive and negative examples” for labeling training data.
As they began to identify places, they also employed entity linking to tie place names to specific place IDs in a database, allowing them to display transcripts alongside an interactive map of mentioned places. Their use of NER also allowed for record extraction, and they noted that “certain probabilistic inferences can be made based on place hierarchy and proximity”.
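The entity-linking step can be pictured as a lookup from a normalized place name to a database ID. The sketch below is a deliberately simple illustration and not EHRI's implementation: the `GAZETTEER` table, its place IDs, and the `normalize` helper are all hypothetical, and a production system would need fuzzier matching and disambiguation.

```python
import unicodedata

def normalize(name):
    """Strip diacritics and case so spelling variants share one lookup key."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold().strip()

# Hypothetical gazetteer: several historical or transliterated names
# point at the same (made-up) place ID.
GAZETTEER = {
    "lviv": "place-001",
    "lwow": "place-001",
    "lemberg": "place-001",
    "krakow": "place-002",
    "cracow": "place-002",
}

def link_places(mentions):
    """Pair each recognized place mention with a place ID, or None if unknown."""
    return [(mention, GAZETTEER.get(normalize(mention))) for mention in mentions]

linked = link_places(["Lwów", "Lemberg", "Kraków", "Atlantis"])
for mention, place_id in linked:
    print(mention, "->", place_id)
```

Mapping every variant to a single ID is what lets a transcript mention be pinned to one point on a map, whichever name the speaker used. Mentions that return `None` are candidates for manual review.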
They found that “continuous research in the area of natural language processing along with the ability to quickly and affordably offload computationally intensive tasks to the cloud has made it more accessible than ever to explore practical applications of these methods on large inputs.”
Learn more about the process and discoveries here.
Do you know of any creative or powerful examples of NER in the real world, or do you need help using NER in your business? Reach out to us and let us know.
We’re going to continue exploring various techniques and use cases in the world of NLP over the coming weeks, so stay tuned to our blog to learn more.