Being an innovator is hard. Being an AI innovator is even harder.

Super.AI was built for the people who are blazing the trail. If Steve Jobs had had to label data, we’d like to think he would have used super.AI.


How do I use it?

The assembly line of the digital age

Secret #1 - Data Programming

Data programs break a complex task into many simple tasks

Just like an assembly line, the simpler the tasks, the more you can automate with AI.

Quality: 34% avg. higher quality
Latency: 57% avg. faster outcomes
Cost: 62% avg. lower cost

The Execution Engine

Secret #2 - The AI Compiler

Higher quality & cost efficiency over time as more data gets routed to AI.

Data program executes small tasks (like an assembly line)

Router channels the small tasks to the best human or machine to make predictions

Combiner takes each prediction and intelligently assembles them into a single label

AI Trainer automatically trains new AI algorithms and qualifies new humans and AI based on your data

QA system measures label quality, relabels data if needed, and updates the router

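
To make the loop concrete, here is a minimal sketch of such an execution engine in Python. All names, trust scores, and costs are illustrative assumptions, not super.AI’s actual API:

```python
import random

# Illustrative labeler pool: humans and models, each with an estimated trust score.
LABELERS = [
    {"name": "model_a", "kind": "ai", "trust": 0.80, "cost": 0.01},
    {"name": "human_1", "kind": "human", "trust": 0.95, "cost": 0.50},
    {"name": "human_2", "kind": "human", "trust": 0.90, "cost": 0.40},
]

def route(task, labelers, n=2):
    """Router: pick the n labelers with the best trust-per-cost for this task."""
    ranked = sorted(labelers, key=lambda l: l["trust"] / l["cost"], reverse=True)
    return ranked[:n]

def predict(labeler, task):
    """Stand-in for a human answer or a model inference call."""
    return random.choice(["cat", "dog"])

def combine(predictions):
    """Combiner: trust-weighted vote over the individual predictions."""
    scores = {}
    for labeler, answer in predictions:
        scores[answer] = scores.get(answer, 0.0) + labeler["trust"]
    label = max(scores, key=scores.get)
    confidence = scores[label] / sum(scores.values())
    return label, confidence

def execute(task, quality_target=0.9):
    """Data program: route, predict, combine, and relabel until quality is met."""
    predictions = [(lab, predict(lab, task)) for lab in route(task, LABELERS)]
    label, confidence = combine(predictions)
    if confidence < quality_target:
        # QA system: below target, escalate by adding another labeler.
        extra = route(task, LABELERS, n=3)[-1]
        predictions.append((extra, predict(extra, task)))
        label, confidence = combine(predictions)
    return label, confidence

print(execute({"image": "img_001.jpg"}))
```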

Automate labeling

Secret #3 - Meta AI

The AI Compiler automatically trains AI models for you over time.

This keeps the quality the same but drastically lowers the cost.
You can also use the model in production.

Cost: 62% average cost reduction

Compare

Routes to the mathematically optimal combination of labelers

Each time you add more qualified human labelers, train more AI, and update the router and combiner, you improve the quality and lower cost.
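
As a toy illustration of what routing to a mathematically optimal combination can mean, the sketch below brute-forces the cheapest set of labelers whose expected quality clears a target. The pool, the independence assumption, and all numbers are hypothetical:

```python
from itertools import combinations

# Hypothetical labeler pool: (name, accuracy, cost per task).
POOL = [("model_a", 0.80, 0.01), ("human_1", 0.95, 0.50), ("human_2", 0.90, 0.40)]

def expected_quality(members):
    """Rough quality proxy: probability that at least one member is correct,
    assuming independent errors. A real combiner model would be richer."""
    p_all_wrong = 1.0
    for _, acc, _ in members:
        p_all_wrong *= (1.0 - acc)
    return 1.0 - p_all_wrong

def optimal_combination(pool, quality_target=0.98):
    """Cheapest subset of labelers whose expected quality meets the target."""
    best = None
    for r in range(1, len(pool) + 1):
        for subset in combinations(pool, r):
            if expected_quality(subset) >= quality_target:
                cost = sum(c for _, _, c in subset)
                if best is None or cost < best[1]:
                    best = (subset, cost)
    return best

print(optimal_combination(POOL))  # pairs a cheap model with one strong human
```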

Quality

It’s important to ensure that every part of the pipeline maintains high quality.

Qualification

To ensure high-quality outputs, labelers must pass a rigorous suite of qualification tests before they can start working.

1. Skill Tree

They must pass skill tests such as understanding English, drawing image annotations, and following instructions.

2. Template qualification

They must qualify on the data program template. This is useful if you haven’t yet uploaded enough ground-truth data.

3. Instance qualification

The final test is to be able to pass data program tasks based on your specific data. This is the most important qualification.

Dynamic ground truth

We regularly send labelers audit tasks in order to verify they continue to maintain high quality standards.

Tasks we already know the answers to (ground truth) are sent to labelers to randomly check that they are still giving high-quality answers.
Labelers who perform very well are sent fewer checks; labelers who are questionable are sent more. The frequency is determined by reinforcement learning (based on a partially observable Markov decision process).
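
The sketch below captures the idea with a much simpler heuristic than the actual RL policy: the audit probability rises as recent accuracy falls. The floor and ceiling values are made up:

```python
import random

def audit_probability(recent_accuracy, floor=0.02, ceiling=0.5):
    """More accurate labelers are audited less often; shaky ones more often.
    (A simple heuristic standing in for the RL/POMDP policy.)"""
    return min(max(1.0 - recent_accuracy, floor), ceiling)

def maybe_send_audit(recent_accuracy):
    """Randomly interleave a ground-truth audit task with normal work."""
    return random.random() < audit_probability(recent_accuracy)

print(audit_probability(0.99))  # strong labeler: 2% of tasks are audits (the floor)
print(audit_probability(0.70))  # questionable labeler: 30% of tasks are audits
```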

Consistency score

Human labelers can get fatigued or distracted, causing temporary drops in performance. A cheap way to detect this is to send them the same task again but with permuted answers.

The consistency score measures how stable a labeler’s quality is over time, and it works even when no ground truth is available. For instance, if your last three scores are 7.0, 7.0, and 7.0, your average score is 7.0. If your last three scores are 10.0, 10.0, and 1.0, your average is still 7.0, but your consistency score will be significantly lower. Sending the same task again with some permutations is a good way to surface this kind of drop.
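
A minimal way to compute such a score, assuming a simple dispersion penalty (the production formula may differ):

```python
from statistics import mean, pstdev

def consistency_score(scores):
    """Penalize variability: identical scores give 1.0, volatile scores less.
    (Illustrative formula; the production metric may differ.)"""
    if len(scores) < 2:
        return 1.0
    return 1.0 / (1.0 + pstdev(scores))

steady = [7.0, 7.0, 7.0]
volatile = [10.0, 10.0, 1.0]
print(mean(steady), consistency_score(steady))      # 7.0, 1.0
print(mean(volatile), consistency_score(volatile))  # 7.0, ~0.19
```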

Performance incentives

To keep labelers motivated and constantly improving, reward is directly tied to value.

Labelers get compensated based on the value they add for you. This creates a deep incentive to get the answer correct, improve their skills, and level up. High performers get more responsibility over time.

Generative Model

The generative model creates a model of what an ideal labeler looks like and removes labelers who deviate too far from it.

The generative model models correlations between labelers, estimates a latent trust score, and predicts how well a labeler will perform on a specific task given the other labelers working on that task. If two labelers are very similar in their predictions, it doesn’t make sense to combine them in a consensus task, but it does make sense to use them to predict and assess quality.
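
A rough sketch of the idea using plain agreement statistics; the real model infers trust and correlations jointly rather than via these one-shot proxies. The answer matrix is invented:

```python
import numpy as np

# Hypothetical answer matrix: rows = labelers, columns = shared tasks (0/1 answers).
answers = np.array([
    [1, 0, 1, 1, 0, 1],  # labeler A
    [1, 0, 1, 1, 0, 1],  # labeler B (perfectly correlated with A)
    [1, 1, 0, 1, 0, 0],  # labeler C (more independent)
])

# Pairwise agreement as a simple stand-in for the learned correlation structure:
# A and B agree 100% of the time, so combining them adds little information.
agreement = (answers[:, None, :] == answers[None, :, :]).mean(axis=2)
print(agreement.round(2))

# A crude latent-trust proxy: how often each labeler matches the majority answer.
# The real model infers trust jointly rather than one-vs-rest like this.
majority = (answers.sum(axis=0) > answers.shape[0] / 2).astype(int)
trust = (answers == majority).mean(axis=1)
print(trust.round(2))
```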


Consensus

Multiple labelers are combined to make a high-quality label. The more trustworthy and uncorrelated the labelers are, and the more they agree with each other, the higher the quality of the output.

The more trustworthy and uncorrelated the labelers are, and the more they agree, the higher the quality of the end result. If labelers disagree and the entropy of this disagreement goes above a threshold (based on your desired performance), then the task goes down a further processing funnel to guarantee quality.
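
A minimal sketch of entropy-based escalation, with an unweighted vote and a made-up threshold (the production system weights votes by trust):

```python
from collections import Counter
from math import log2

def disagreement_entropy(votes):
    """Shannon entropy (bits) of the vote distribution."""
    counts = Counter(votes)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

def consensus(votes, entropy_threshold=0.6):
    """Accept the majority label, or escalate when disagreement is too high."""
    h = disagreement_entropy(votes)
    if h > entropy_threshold:
        return None, h  # escalate to the QA funnel
    return Counter(votes).most_common(1)[0][0], h

print(consensus(["cat", "cat", "cat"]))         # ('cat', 0.0)
print(consensus(["cat", "dog", "cat", "dog"]))  # (None, 1.0) -> escalate
```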

Incremental Relabeling

Tasks are incrementally relabeled until the quality is determined to be high enough. The number of relabels is determined dynamically.

Not all tasks and data are at the same difficulty level. Some tasks can be answered confidently by one labeler; some need a combination of multiple labelers. The incremental relabeling system determines how many labelers are needed for a particular task to get a high-quality result.
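
A compact sketch of the loop, assuming a trust-weighted vote as the combiner; labeler names, the stand-in `ask` function, and the confidence target are all illustrative:

```python
def incremental_relabel(task, labelers, ask, confidence_target=0.9):
    """Add one labeler at a time until the combined confidence meets the target.

    labelers: list of (name, trust) pairs, best first.
    ask: callable (name, task) -> label; stands in for a real human or model.
    """
    weights, used, confidence = {}, 0, 0.0
    for name, trust in labelers:
        answer = ask(name, task)
        weights[answer] = weights.get(answer, 0.0) + trust
        used += 1
        top = max(weights, key=weights.get)
        confidence = weights[top] / sum(weights.values())
        if confidence >= confidence_target:
            return top, confidence, used
    return None, confidence, used  # target not reached: escalate to QA

# Easy task: the first trusted labeler is enough, so only one label is bought.
print(incremental_relabel("t1", [("h1", 0.95), ("h2", 0.9)], lambda n, t: "cat"))
```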

Anomaly Detection

Labels that deviate too far from expected patterns are automatically flagged and escalated for further review.

Expert Reviews

A group of experts is used to determine whether a task is ultimately of high enough quality to be used as a final label.

After a task passes through the QA pipeline, it will either be resolved as high enough quality or eventually be passed to an expert review team to ensure ultimate quality.
This is a committee of trained experts who follow a control procedure to make a final determination on a task. You can also use your own experts. The goal is to use this committee as little as possible.

ML Likelihood classifier

A maximum-likelihood estimator is used to flag a data point if its label falls too far outside the expected response.

A likelihood classifier is used to create a distribution of the expected bounds of outputs. If the actual output deviates too far from the expected statistical distribution, the task is flagged for further review.
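
A minimal sketch assuming a Gaussian fit to past outputs; the real system may fit a different distribution, and the threshold here is invented:

```python
from math import erf, sqrt
from statistics import mean, pstdev

def likelihood_flag(history, value, p_threshold=0.01):
    """Fit a normal distribution to past outputs and flag values in the far
    tails. (The Gaussian assumption and threshold are illustrative.)"""
    mu, sigma = mean(history), pstdev(history)
    z = abs(value - mu) / sigma
    p = 1.0 - erf(z / sqrt(2))  # two-sided tail probability
    return p < p_threshold, p

history = [4.1, 3.9, 4.0, 4.2, 3.8, 4.0]  # e.g. bounding-box widths seen so far
print(likelihood_flag(history, 4.1))       # within expected bounds: not flagged
print(likelihood_flag(history, 9.0))       # far outside: flagged for review
```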


Instructions are created as modules. Each module can be audited and tested separately. If tasks in a data program categorically drop below the quality baseline, then instructions are sent through a master data program, broken down into their smallest pieces and run to check for consistency.

If the automatic escalation process does not yield higher quality instructions, or an anomaly is detected, the instructions are sent to customer success representatives who are experts in writing and debugging instructions.

An NLP classifier is run to check for proper syntactic and semantic structure. The classifier is trained on millions of successful instructions and can automatically detect the most common errors and give an assessment of what to fix. You can set it to fix things automatically.

Data linter

The data linter automatically identifies potential issues and suggests fixes for common data errors in structured and unstructured data.

The data linting routines automatically 1) flag potential issues in the data (e.g. numerical features with widely different scales), and 2) suggest fixes or potentially useful feature transformations. They work on structured and unstructured data.
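
A tiny sketch of the flavor of checks involved, using invented heuristics and thresholds:

```python
import numpy as np

def lint_numeric_features(table):
    """Flag numeric columns with suspicious scales and suggest a fix.
    `table` maps column name -> list of values. Heuristics are illustrative."""
    findings = []
    scales = {}
    for name, values in table.items():
        col = np.asarray(values, dtype=float)
        scales[name] = np.abs(col).max()
        if (col >= 0).all() and col.max() / max(col.min(), 1e-9) > 1e3:
            findings.append(f"{name}: heavy right skew, consider a log transform")
    if max(scales.values()) / max(min(scales.values()), 1e-9) > 1e3:
        findings.append("columns differ by >1000x in scale, consider standardizing")
    return findings

table = {"age": [25, 32, 47], "income": [30_000, 250_000, 42_000_000]}
print(lint_numeric_features(table))
```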

Noise model

The noise model has been trained on billions of data points to flag data that isn’t clean enough to meet your desired quality.

The noise model is trained on billions of unstructured data points (images, audio, video, etc.) to determine a proper range for the signal-to-noise ratio of your particular data. It merges Gaussian additive noise, Poisson noise, and multiplicative noise to identify bad data. In addition to this prior distribution of noise, the noise models are trained on your successfully labeled data to decrease the variance of the proper signal-to-noise ratio for your data.
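
A crude stand-in for the learned noise model: estimate a signal-to-noise ratio from local residuals and flag images below a threshold. The blur, threshold, and test data are all illustrative:

```python
import numpy as np

def estimate_snr_db(image):
    """Rough SNR estimate: signal = mean intensity, noise = residual after a
    3x3 box blur. Stands in for the learned noise model."""
    img = image.astype(float)
    padded = np.pad(img, 1, mode="edge")
    blurred = sum(
        padded[i:i + img.shape[0], j:j + img.shape[1]]
        for i in range(3) for j in range(3)
    ) / 9.0
    noise = img - blurred
    snr = img.mean() / (noise.std() + 1e-9)
    return 20 * np.log10(snr)

def flag_noisy(image, min_snr_db=15.0):
    """Flag images whose estimated SNR falls below the acceptable range."""
    return estimate_snr_db(image) < min_snr_db

rng = np.random.default_rng(0)
clean = np.full((64, 64), 128.0)
noisy = clean + rng.normal(0, 60, size=(64, 64))
print(flag_noisy(clean), flag_noisy(noisy))  # False (keep), True (flag)
```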

Master Quality Data Program

The master quality data program executes a workflow that uses humans + AI to assess whether the data is of high enough quality to meet your desired quality. If not, you will be alerted and suggestions will be made to help fix the data.

If the certainty of the noise model or the data linter is not yet high enough, the data gets passed to the master quality data program to be audited by a crowd of expert labelers. If the data is determined to be below the threshold needed for the desired quality, you’ll get alerted and suggestions will be made on how to fix it.


Data Engine

Fastest way to train your machine learning models.

super.AI’s Data Engine makes efficient use of your data by sampling only the most important parts of it and finding similar data.
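
One common way to sample only the most important parts of a dataset is uncertainty sampling; the sketch below is a generic example of that idea, not necessarily the Data Engine’s actual strategy:

```python
import numpy as np

def pick_most_informative(probs, k=2):
    """Uncertainty sampling: send to labelers the items the current model is
    least sure about (predicted probability closest to 0.5)."""
    uncertainty = -np.abs(np.asarray(probs) - 0.5)
    return np.argsort(uncertainty)[-k:]

model_confidence = [0.99, 0.52, 0.95, 0.48, 0.90]  # P(class=1) per unlabeled item
print(pick_most_informative(model_confidence))      # items 1 and 3 go to labelers
```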

Dashboard

Central hub of training data and models.

The dashboard is a system of record for all your data. Collaborate with role-based access controls, view labeler performance, flag and review labels, and use it as a central source of truth.

You can customize and iterate in real time.

Real world data changes, so it’s important to be able to update your project in real time.

Flexibility

Billions of custom data and task variations.

Case Studies

You are in great company

Old & New

A new programming paradigm for AI

Making access to AI available to everyone

50 years ago

Fifty years ago, only the smartest people in the world could use computers. They were expensive and could only be programmed in binary. Over time, people built abstractions: assembly, compiled languages, interpreted languages, GUIs. Now almost anyone can use a computer. This changed the world.

Today

Only the smartest and most well-funded companies can effectively use AI. Just like 50 years ago, when computers only understood binary, today’s AI only understands labeled data. At super.AI, we built abstractions on top of labeled data: AI assembly language, the AI compiler, and data programming.

Vision

But we aren't even close to being done. We made big steps with the AI Compiler and Data Programming, but our mission is to ultimately make AI available for everyone.

Old way → New way
GUI → AI for everyone
API → Marketplace
Compiled language → Data program
Assembly language → Labeling primitives
Machine language → Labeled data

Old way

Label data randomly.
Pray it works.
Label more data if it doesn’t look good.

Quality · Speed · Satisfaction · Flexibility
Failure rate: 90% of enterprise AI projects

New way

Decompose the problem into a reusable data program. The AI compiler automatically:
– Finds and labels only the most valuable data
– Routes only to optimal labelers
– Improves over time
– Updates labels incrementally

Quality · Speed · Satisfaction · Flexibility
Efficiency: 100%. The fastest way to train AI.


Get a demo
