AI-automated Redaction Could Save Self-driving Cars from a Privacy Nightmare

Sina Youn

Privacy Tech Lead

SUMMARY

As autonomous vehicles (AVs) become more prevalent, the amount of data generated by self-driving cars is growing rapidly. This data includes images and videos captured by the car's sensors, which can contain sensitive personal information such as faces or license plates.

Although the number and type of sensors vary depending on the system, most self-driving cars rely on a similar set of technologies. Light detection and ranging (LIDAR) allows vehicles to quickly and accurately calculate distances to objects; radio detection and ranging (radar) helps cars understand object angles, ranges, and velocities; sound navigation and ranging (sonar) is used to detect and communicate with objects; and cameras allow vehicles to see and interpret the environment around them.

This list isn’t exhaustive; the autonomous vehicle sensor ecosystem includes a wide variety of technologies. Many of these sensors generate data that must be captured, processed, transmitted, and stored, and images and videos are among the largest and most important data types. A 30 frame per second video recording can create anywhere from 1.5GB per hour for 720p to 42GB per hour for 4K, and self-driving cars have a lot of cameras. For example, Tesla’s Autopilot utilizes 8 cameras, while GM’s self-driving Cruise cars have 14.
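To get a feel for these numbers, the per-camera figures can be multiplied by a camera count. The following is a rough back-of-the-envelope sketch in Python, not a measurement. It assumes every camera records continuously at the same resolution and that the 4K figure is an hourly rate; neither assumption is stated for any particular vehicle, and the function name is purely illustrative.

```python
# Per-camera hourly volumes quoted in the text (30 fps recording).
GB_PER_HOUR_720P = 1.5
GB_PER_HOUR_4K = 42.0


def vehicle_gb_per_hour(num_cameras: int, gb_per_camera_hour: float) -> float:
    """Raw video volume for one vehicle.

    Assumption: all cameras record continuously at the same resolution.
    """
    return num_cameras * gb_per_camera_hour


# Camera counts mentioned in the text: 8 and 14.
for cameras in (8, 14):
    low = vehicle_gb_per_hour(cameras, GB_PER_HOUR_720P)
    high = vehicle_gb_per_hour(cameras, GB_PER_HOUR_4K)
    print(f"{cameras} cameras: {low:.0f} to {high:.0f} GB per hour")

# 8 cameras: 12 to 336 GB per hour
# 14 cameras: 21 to 588 GB per hour
```

Even at the low end, a single vehicle under these assumptions produces tens of gigabytes of video for every hour on the road, which is why manual review of the footage does not scale.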

Artificial intelligence can be used to automatically redact sensitive information from visual data. There are many different ways to approach this problem, but one effective method is to use a deep learning neural network. This type of artificial intelligence can be trained to automatically detect and blur faces, license plates, and other sensitive information in images and videos.

This approach has the advantage of being completely automated, meaning that it can be applied quickly to large datasets with minimal manual effort. Additionally, deep learning neural networks generally become more accurate as they are trained on more data. Automated redaction of sensitive information from visual data is not always perfect, but it is a powerful tool that can help protect the privacy of individuals captured by self-driving cars.
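The redaction step itself can be illustrated with a minimal Python sketch. It covers only the second half of the pipeline: it assumes a detector (not shown here, and standing in for the neural network) has already returned bounding boxes for faces or license plates. It also uses pixelation rather than a true blur to keep the example dependency-free. The names `redact`, `pixelate_region`, and the box layout are hypothetical and do not describe any specific product.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale pixels, row-major
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def pixelate_region(image: Image, box: Box, block: int = 4) -> None:
    """Replace each block x block tile inside the box with its mean value, in place."""
    height = len(image)
    width = len(image[0]) if image else 0
    top, left, bottom, right = box
    # Clamp the box to the image so out-of-range detections cannot raise.
    top, bottom = max(0, top), min(height, bottom)
    left, right = max(0, left), min(width, right)
    for y0 in range(top, bottom, block):
        for x0 in range(left, right, block):
            y1, x1 = min(y0 + block, bottom), min(x0 + block, right)
            tile = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            mean = sum(tile) // len(tile)
            for y in range(y0, y1):
                for x in range(x0, x1):
                    image[y][x] = mean


def redact(image: Image, detections: List[Box], block: int = 4) -> Image:
    """Return a copy of the image with every detected box pixelated."""
    redacted = [row[:] for row in image]
    for box in detections:
        pixelate_region(redacted, box, block)
    return redacted


# Synthetic 8x8 frame and one hypothetical detection covering its center.
frame = [[(x * 7 + y * 13) % 256 for x in range(8)] for y in range(8)]
detections = [(2, 2, 6, 6)]
out = redact(frame, detections)
```

Returning a copy rather than editing the frame in place is a deliberate choice in this sketch: the original stays available until the redacted version has been checked, after which it can be discarded.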

Self-driving systems have an insatiable thirst for data

Self-driving cars need to be able to handle anything that a human driver might encounter. Data collection is essential for autonomous vehicles to learn how to handle edge cases, which are rare or unusual scenarios that fall outside of a system's normal operating conditions. A child darting into the street or an animal crossing the road is difficult to predict, but a self-driving car must still be able to handle it.

Training systems to effectively respond to edge cases requires that companies pursuing self-driving technology collect as much data, in as wide a variety of scenarios, as possible. There are a few ways companies can collect data to train self-driving cars. The first is to do it themselves. Companies like Waymo, Uber, and Cruise have all been working on their own self-driving technologies for years and have amassed large amounts of data that they can use to train their systems. These companies often use their own self-driving vehicles to collect data in the real world, which can be expensive and time-consuming.

Another way to collect data is to partner with other companies. For example, BMW has partnered with Mobileye, a company that makes vision systems for self-driving cars, to collect data that will be used to train BMW's autonomous vehicles. This data includes information about things like lane markings, traffic signs, and obstacles on the road.

Automotive original equipment manufacturers (OEMs) and tier 1 suppliers can also purchase data from third-party companies. This data is usually collected by human drivers using specially equipped test vehicles fitted with sensors and cameras. The data is then processed and labeled so that it can be used to train self-driving systems. This is often seen as a more efficient way to collect data, since it doesn't require companies to build or purchase their own test vehicles.

No matter which method they use, companies need to be sure that the data they're collecting is high quality and representative of the real world. If self-driving cars are only trained on a limited amount of data, they may not be able to handle unusual situations when they encounter them.

Data privacy laws complicate data processing

Data privacy laws vary from country to country, but they typically impose strict restrictions on how data can be collected, used, and stored. These laws make it difficult for companies to collect the data they need to develop and operate self-driving cars. For example, data collected by sensors on self-driving cars must often be processed in order to be useful. This processing can involve cleaning, filtering, and merging data from multiple sources. Data privacy laws typically restrict how this processing can be done, making it more difficult and expensive.

Some argue that data privacy laws are necessary to protect people's personal information. Others argue that these same laws hamper innovation and make it difficult to develop new technologies like self-driving cars. The debate is sure to continue as we grapple with how to best balance data privacy with other important concerns.

The EU's General Data Protection Regulation (GDPR), for instance, makes it difficult for self-driving car makers to transfer training data to third-party processors based in countries with lower labor costs. The regulation, which took effect in May 2018, requires companies to have a legal basis for processing personal data. Ideally, they should get explicit consent from individuals before collecting, using, or sharing that data.

Self-driving car makers typically collect large amounts of data from test vehicles in order to train their algorithms. This data includes information about the car's surroundings as well as the actions of the drivers and passengers. It also includes personal information such as faces, license plates, and addresses captured by cameras mounted on the interior and exterior of the vehicle.

In theory, obtaining explicit consent for processing self-driving car data could mean getting consent from every driver, passenger, and pedestrian whose data is collected before that data can be lawfully transferred to a third party for processing. This would be a difficult and costly process that would require tracking down each individual. In addition, self-driving car makers would need to ensure that the third parties they transfer data to are GDPR compliant. Each data subject would also have the right to have their data deleted, even if they previously gave consent, which would result in an ongoing, complex, and costly process.

If self-driving car makers are unable to outsource data processing, they may be forced to do all of it themselves. This could significantly increase the cost of developing autonomous vehicles and advanced driver assistance systems.

An international data privacy framework for AVs could help

The advent of connected AVs is bringing about novel ethical challenges for industries and governments around the world. These developments will likely lead to an evolution in how we think about car ownership as well as rights over data collected from these cars, which could have significant economic implications on a global scale.

Although true autonomous driving capabilities are not yet available for consumers to purchase (despite what some automakers might say), people in Beijing, San Francisco, and other cities around the world will soon be able to book taxis with no one at the wheel. As ride-hailing services roll out fleets of fully autonomous taxis, people will become increasingly familiar and comfortable with machines at the wheel. Many people will first encounter an AV through a ride-hailing service rather than purchasing their own.

Autonomous features are contingent on connectivity, so all autonomous cars will also be connected vehicles, which are already quite common. Data privacy regulations are currently fragmented from country to country, and even from state to state within the U.S. Harmonizing these regulations makes practical sense, simply because a mishmash of contradictory requirements or standards would make it impossible to operate a car across several jurisdictions. To help mitigate this, the U.S. Department of Transportation (USDOT) will collaborate with states and other organizations to prevent patchwork rules that could hinder AVs from crossing state lines.

Similar regulations specific to autonomous vehicles are likely to be implemented in the European Union (EU), and a global framework that helps unify protections across disparate jurisdictions is highly probable as well. As regulations mature, companies will bear an increasingly heavy responsibility to make the right decisions about the sensitive data that autonomous vehicles collect and use.

When it comes to data privacy, it's better to be safe than sorry

When it comes to personal information, we should always err on the side of caution. Self-driving car companies should follow privacy-by-design principles and adopt techniques such as federated learning, homomorphic encryption, or anonymization. By doing so, they can protect our faces and license plates from being collected and used without our consent.

Anonymization is a quick and effective way to protect our data. It's a process of obscuring personally identifiable information in order to prevent identification of individuals. In the context of autonomous vehicles, anonymization can be used to protect the faces and license plates of people captured by the car's camera.

Self-driving car companies can use AI to automatically anonymize image and video data. Automated anonymization is a fast and efficient way to protect our privacy without sacrificing data quality. It's a win-win for both car companies and consumers.

More AI-automated redaction resources

Read our blog: Solving Data Privacy Concerns for Self-Driving Cars
Download the white paper: Protecting Sensitive Information at Scale with AI
Explore super.AI’s solution: Super.Redact
