IDP Reimagined: How UDP Solutions Are Addressing the Limitations of First-Generation IDP


Learn how you can consolidate point AI solutions and leverage crowd-sourced training and validation resources to achieve higher quality results at lower costs than first-generation IDP and other point AI solutions.

First-generation IDP solutions had the following limitations:

Processed structured and semi-structured documents only
Required extensive training and setup for new document types, and substantial ongoing maintenance
Did not guarantee outcomes

Next-generation UDP/IDP solutions are addressing these limitations by providing:

A unified AI platform that leverages the best available AI to process any data type (documents, images, video, etc.) and address any AI use case
Touchless automation via a composable platform that can easily reuse AI components to build complex AI apps in days and that learns continuously to increase automation rates
Guaranteed results, not just AI confidence levels that require constant tuning.

This recording of IDP Reimagined: How UDP Solutions Are Addressing the Limitations of First-Generation IDP was originally streamed as part of the Intelligent Automation Network's 2nd Annual IDP Solution Showcase. It features super.AI CEO, Brad Cordova, and super.AI Marketing VP, Manish Rai.

Full transcript:

Speaker 1 [00:00:02] Hi everyone. Welcome to our third session of the day, IDP Reimagined: How UDP Solutions Are Addressing the Limitations of First-Generation IDP. Our speakers are Brad Cordova, the founder and CEO of super.AI, and Manish Rai, the VP of Marketing at super.AI. I think it's safe to say we saved the best for last. So I'm going to hand things over to Manish and Brad. Hi, guys.

Manish Rai [00:00:30] Hello, everyone. Thank you for having us. For those who don't know me, my name is Manish Rai. I'm VP of Marketing at super.AI. I've been in Silicon Valley for around 20 years, and I've been living automation for the last five years: three years at Automation Anywhere, a year at Appian, and now at super.AI. When you look at the intelligent automation market, RPA took off, and RPA has been taking care of mostly structured data. However, when you look at all the available data out there, there are 175 zettabytes of data, from estimates out there. A zettabyte is ten to the power of 21 bytes, a one followed by 21 zeroes. That's a lot of data. And according to most analysts that you have probably heard, 90% is unstructured. So we have only scratched the surface of unstructured data. It's the unstructured data that's growing exponentially. If you look at the bottom, the straight line shows how structured data is growing at a more steady pace. So there's a huge opportunity out there. If you can make sense of this unstructured data, you can dramatically transform your customer experience and dramatically lower your costs. On the other hand, if you don't take advantage of the data, there are nimble startups sitting there looking to disrupt every little segment of the business. So, coming back to the topic of today, we're going to be talking about how next-generation unstructured data processing, or IDP, solutions are addressing the limitations of the first generation. Once we started taking care of most of the structured data with RPA, a new crop of intelligent document processing solutions started showing up about five years ago. The first version of these solutions, simply put, added a new, modern user interface on top of OCR.
They added some machine learning on top of it, and they became simpler to use for structured forms and, to some extent, semi-structured documents as well. They still had some limitations. First, for shared services organizations looking to bring in AI, you're still faced with the challenge that you have to find a different point solution to solve every problem. You may need IDP for documents, but what about all the other data types, images, videos, etc.? You might need another point solution for them. If you want to redact information, you need another type of solution, each with its own complex sourcing, procurement, training and setup mechanism. The second area where we found IDP solutions lacking was exception handling. They worked well when the document types were limited and you had limited variability. But the fact is, when you're handling invoices and purchase orders, you're getting new types of invoices and purchase orders all the time. And then the AI models that have been trained start drifting, and your accuracy rates start going down over time. So there's a lot of burden in exception handling, and often you have to find your own people to train those initial models by labeling the data during setup. Then you may have to have your own validators: when the confidence is low, you route the document to somebody to validate that information. You have to find those people, train those people, and manage that workforce. It is expensive and time consuming. At the same time, you need to be constantly tuning the systems, which brings us to the next point: no outcome guarantee. Most of the early solutions that used AI expose a confidence score to you. You have a confidence level or confidence score at the page, document, or field level. So now it's left up to you. Should I set an 80% confidence level to get the desired outcome? What happens at 85?
We found that a lot of people had to experiment, and these confidence levels had to be constantly tuned until they got the right level of results. It's time consuming and it's complex. And when humans were involved, it was a very simple interface. You sent it to a human, and it was lacking a lot of the gamification that people expect today to keep a remote workforce engaged. If somebody was not responding within a given period of time, there were no escalation mechanisms built in to make sure you could get to a result in a limited amount of time. So those were the issues, and I think that is an opportunity. Brad and team started tackling this and built for all data types, not just documents. So with that, I'd like to turn it over to Brad to talk a little bit about what super.AI is and how it is different, and also to show you a demonstration of our platform.
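
To make the threshold-tuning burden described above concrete, here is a minimal sketch of what "should I use 80% or 85%?" looks like in practice. It is not taken from any product: the `Prediction` type, the `sweep` function, and the sample data are all hypothetical and made up for illustration. It shows how each candidate threshold trades the share of documents handled automatically against the accuracy of what gets through.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    confidence: float  # model's self-reported confidence, 0.0 to 1.0
    correct: bool      # whether the prediction matched the ground truth


def sweep(predictions, thresholds):
    """For each threshold, return (threshold, automation rate, accuracy).

    Automation rate is the share of items accepted without human review;
    accuracy is measured only over the accepted items (None if none pass).
    """
    rows = []
    for t in thresholds:
        accepted = [p for p in predictions if p.confidence >= t]
        automation = len(accepted) / len(predictions)
        accuracy = (
            sum(p.correct for p in accepted) / len(accepted) if accepted else None
        )
        rows.append((t, automation, accuracy))
    return rows


# Made-up validation sample, for illustration only.
sample = [
    Prediction(0.97, True), Prediction(0.91, True), Prediction(0.88, True),
    Prediction(0.86, True), Prediction(0.82, False), Prediction(0.79, False),
    Prediction(0.74, True), Prediction(0.60, False),
]

rows = sweep(sample, [0.80, 0.85, 0.90])
for t, automation, accuracy in rows:
    print(f"threshold={t:.2f} automated={automation:.0%} accuracy={accuracy:.0%}")
```

On this toy sample, raising the threshold from 0.80 to 0.85 improves accuracy but sends more items to human review, and the right setting shifts whenever the document mix or the model changes, which is why the tuning never really ends.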

Brad Cordova [00:07:02] Thanks, Manish, and thank you all for joining today. I'm Brad, the CEO and founder of super.AI. Just a quick background: I did my PhD at MIT in symbolic AI and machine learning, and then I founded another company. We got it to unicorn status and sold it last year. So I've seen the AI space from academia and from industry, and it's not difficult, at least in retrospect, to see patterns that are successful and patterns that aren't. We've encoded those in the super.AI platform to take IDP to the next level. So what are these things? I'll go over them in slides, and then we'll jump into the demo. If you have any questions, feel free to jump in. The first piece of what you need in a next-generation IDP solution is a unified AI platform. If you don't have this, it's like not having your iPhone in 2022. Previously, if you wanted a calculator, you'd go buy a calculator; if you wanted something to play music, you'd go buy a radio. Now all these things are in our phones. It's the same idea for AI. Why would you go buy a bunch of separate things when you can have a unified platform? And it's even easier in the field of AI, because if you had a Venn diagram of the overlap between applications, it's huge. This is the first time in history that we've had a technology where you train, you validate. I was a physicist, and every experiment in physics needed a totally different experimental setup. You would have to invest millions and millions of dollars, sometimes billions, to do that, whereas every AI model has the exact same experimental framework. It's amazing. So the first piece of this unified AI platform that we built is any application. As Manish said, you can do document extraction, redaction, classification, transaction enhancement.
We have this app store of hundreds of different applications, which all share a very similar, basically identical, infrastructure. So you have any app, but then you have any AI. The world is changing unbelievably fast. Every day it's sometimes hard to keep up. If you try to create the best machine learning model, even if you succeed, probably tomorrow or next week somebody is going to come up with something better. So you need your platform to keep up with this fast-paced change in the world of AI. Our platform allows you to use any AI, whether it's, let's say, Google, AWS, Microsoft, an open source model, or internal models that you built. We can build custom models, and we have a large partner network. You want to be able to use any AI, and we make this easy. And then finally, any data type, whether it's documents, images, emails, audio, video, satellite, and more. So you want a platform where you can have any app, any AI, and any data type. The second piece is what we call touchless automation. There are a lot of machine learning models out there, and there's a lot of data out there. So you want to start with pre-trained AI, to start with a high level of automation. For example, if you look at signatures, it doesn't matter what kind of document it is. Whether it's an invoice or a bill of lading, a signature is a signature is a signature. We've trained these signature detectors on hundreds of millions of different documents. Regardless of where the signature is located or what type of document it is, you start with a high level of automation for signatures, and this is just one of many fields. So you start with the pre-trained AI. But then again, every document is different. Maybe the signature is on the bottom left on this document and on the bottom right on a different document.
If you only had pre-trained AI, of course you could start with a high level of automation, but you'd be capping yourself too low, because you're not learning the specific information about your documents. So the second piece is composable AI, which can take these pre-trained AIs or other AIs. It's a massive neural network that learns the specifics of how things are laid out on a document, for example, and allows you to get a really high level of automation over time. Pre-training allows you to start with a high level of automation, but it reaches a ceiling. Composable AI takes you to the next step and customizes it to your exact documents. This allows you to do what we call automatic automation: you automatically go from humans in the beginning to close to 100% AI eventually, with built-in exception handling, labeling, validation, and all the infrastructure that comes along with it. Our customers love this because you don't need to hire rare, expensive PhDs to get this done. It happens automatically. And then finally we have the quality, cost and speed guarantee. You may notice that this isn't just a quality guarantee, and it's not just a speed guarantee. It's all of these at once. We find this to be very important, and it starts with 150-plus quality assurance mechanisms. We check data quality: if you have trash in, then you get trash out. We look at worker quality and test quality. I'll show you an example of this later. Then we have routing and combining. This is really important because AI is good at some things, humans are good at other things, and software is good at other things. So it's a false dichotomy, or I guess trichotomy in this case, to say, "I'm only going to do this with AI." It's good at some things, but humans are just great at other things, and software is good at other things. So we use all of these at the same time.
We route to different workers of different species, and then we combine them. Again, I'll show you an example. The third point is a digital assembly line. Henry Ford taught us not to build a car all at once, because this is slow, error prone, and expensive. He taught us to break the building of things into small steps. For example, one task could be putting a wheel on a car, another is putting a mirror on a car, etc. The same thing holds true in the digital world, and we have different efforts that make this even more powerful. So those are the three things that make super.AI different: the unified AI platform, touchless automation, and the quality, cost and speed guarantee. Let's jump into a demo to make this more concrete. I'm going to share my screen here. I think it takes a while to share, so I will just wait a second. What I'm showing you here is a document data extraction application that's built on this platform. Of course, there are many different applications in the marketplace, but I will choose this one for now. The first step of any application (we also call them data programs) is to customize which fields you want extracted. As I mentioned with these pre-trained models, we have pre-trained documents. For example, if the system classified this as a bill of lading, then by default it would extract things such as company, agent number, and the signature. Each of these represents a pre-trained model, and this bill of lading represents the composable AI, which has been trained on many different bills of lading. Or if it was classified as a warrant, it would extract the warrant number, the police report number, etc. So we have claim forms. We have hundreds of different documents which are pre-trained. And of course, if you don't want to use this, you can delete that.
You can add new pre-trained model fields, or you can add custom model fields and upload a model for that. But we'll just use this for now. Okay. The second piece of this, as we mentioned, is that in an industrial, enterprise-grade setting you want to guarantee the quality, cost and speed. So the first step is to set the bounds of what's acceptable for our project. Let's just say I want a minimum accuracy of 93.7%, and that's fine for me. So I'm going to filter the system, and our system will route and combine and do all these things in such a way that it will at least be above 93.7% quality. And then you can filter the rest. Maybe I care more about time, so I want to filter the workers by time. And again, we have AI workers, software workers, and human workers. We have a crowd around the world. We have partners who can do that. You can integrate Google models, etc. But maybe I want to sort by cost: as long as it's over 93.7%, I want the cheapest thing. Or maybe you don't know exactly how you want to filter it, but you just say, okay, I care slightly more about cost than about time. So you can filter in a more fuzzy way. I will just choose this one for now, for the sake of moving forward, and I will start processing. What we'll do here is look at the interactive view, open up the kimono, and see how this happens behind the scenes. The first step is that you pull data from your source, whether it's an RPA system, an on-prem database, or a cloud database, and it is fed into this digital assembly line. Then, as we mentioned, we decompose this complex task into simpler tasks. In this case, it'll first classify the document, and it'll classify each page, because sometimes you'll have a 100-page document with many different types of documents.
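
The filtering step in the demo (set an accuracy floor, then prefer cheaper or faster options) can be sketched in a few lines. This is only an illustration of the idea, not the platform's router, which the talk describes as a reinforcement learning algorithm that routes to several workers at once. The `Worker` type, the `choose_worker` function, the `cost_weight` parameter, and every number below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    accuracy: float  # estimated accuracy, 0.0 to 1.0
    cost: float      # cost per task, arbitrary units
    seconds: float   # expected time per task


def choose_worker(workers, min_accuracy, cost_weight=0.5):
    """Pick one worker that meets the accuracy floor.

    Among eligible workers, minimize a blend of normalized cost and time.
    cost_weight=1.0 means "cheapest", 0.0 means "fastest", and values in
    between express a fuzzy preference such as "slightly more about cost".
    """
    eligible = [w for w in workers if w.accuracy >= min_accuracy]
    if not eligible:
        raise ValueError("no worker meets the accuracy floor")
    max_cost = max(w.cost for w in eligible) or 1.0
    max_secs = max(w.seconds for w in eligible) or 1.0

    def score(w):
        return (cost_weight * w.cost / max_cost
                + (1 - cost_weight) * w.seconds / max_secs)

    return min(eligible, key=score)


# Made-up worker pool, for illustration only.
workers = [
    Worker("ai_model", accuracy=0.92, cost=0.01, seconds=1.0),
    Worker("ai_plus_human_review", accuracy=0.95, cost=0.10, seconds=60.0),
    Worker("expert_human", accuracy=0.99, cost=0.50, seconds=30.0),
]

print(choose_worker(workers, 0.937, cost_weight=1.0).name)  # cheapest eligible
print(choose_worker(workers, 0.937, cost_weight=0.0).name)  # fastest eligible
print(choose_worker(workers, 0.937, cost_weight=0.6).name)  # slightly cost-leaning
```

The point of the sketch is the ordering of the two decisions: the accuracy bound is a hard constraint applied first, and cost and time are soft preferences weighed only among the options that survive it.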
Then you run a detection algorithm that detects each of those fields that were identified in the onboarding, and then it transcribes them. And then you do fuzzy matching. Fuzzy matching just means, let's say you have two dates: which date is the birth date, and which date is the expiration date? You want a fuzzy matching algorithm to help you do that. So now we start executing this workflow, and we start with the classification task. This router, which is a reinforcement learning algorithm, decides, based on the quality, cost and speed you chose, how it's going to route to the workers. In this case, it looks like it routed to two AIs and one software function. It determined that that was enough to get to the 93% accuracy. I'm just going to switch the output view here so we can see what they do, and I'll go to the next step. Now we're waiting for the workers to work on this task. It looks like the first one said bankbook, then the vision transformer. Okay, it looks like they all said bankbook, but they don't all need to agree. Sometimes it could route to even dozens of different workers, and they may not all agree, which is why we have this combiner. The combiner is a generative model. It takes the output of all the workers and looks at how trustworthy they are. For example, maybe you want to trust an expert human more, so you assign them a higher weight in this generative model. It outputs a single output, and at this stage the output is really, really high quality. Of course, in this case it was bankbook, which is not surprising, but sometimes it is surprising. Again, we mentioned earlier that we want to actually guarantee the quality, cost and speed, so we run these 150-plus different quality assurance algorithms. We look at, for example, the data: was the data high quality enough? If it wasn't, this would be an X. Or we have an anomaly detector.
If someone answers too quickly or too slowly, it doesn't mean they're incorrect; it just means there's an anomaly. We also have a consistency score: we'll send the same task again and compare the outputs, because sometimes a human will get into this autopilot mode where they're just clicking. We want to avoid many things like this. In this case, none of the QA checks failed. Then we go to training the AI on this classification task. This is the composable AI I mentioned. We take this really high quality output and train an AI model, so that eventually it's 100% AI. And as you saw, this happens automatically. I'm just going to hit the play button here and let it run. Now we do the exact same thing on the detection task, where we route to certain workers. As you saw here (I'm just going to go back a step), this was a slightly more complex task, and this is a really old document. So it decided to route to one AI, but it also routed to two humans, and they all handled separate things, because maybe in the past they were better at detecting signatures or stamps. This is the assembly line idea, where you decompose the task into small pieces. Now we combine, so they work together, and we do QA. Now we move to the transcription task, where everything gets transcribed. That's complete. Again, you combine it, run QA, you train, and then you do fuzzy matching. I will just pause it, and we can look at the metrics here. This is the automatic automation I was talking about. In the beginning, the system estimated that only 13.7% would be handled by AI and software bots. But even after just a few iterations it's already 41% automated, and you can see that the cost and time are automatically decreasing because of that, and even the quality is increasing because it gets better at routing and combining. This is the power of automatic automation.
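
The combining step from the demo, where several workers answer and more trustworthy workers count for more, can be approximated with a trust-weighted vote. This is a deliberately simpler stand-in: the talk describes the actual combiner as a generative model, which this is not. The `combine` function, the worker names, and the trust weights are hypothetical.

```python
from collections import defaultdict


def combine(answers, trust, default_trust=0.5):
    """Merge worker answers into one label using trust-weighted voting.

    answers: worker name -> label that worker produced
    trust:   worker name -> weight (higher means more trusted)
    Returns the winning label and its share of the total weight.
    """
    totals = defaultdict(float)
    for worker, label in answers.items():
        totals[label] += trust.get(worker, default_trust)
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / sum(totals.values())


# Made-up answers and weights, for illustration only.
answers = {"model_a": "bankbook", "model_b": "bankbook", "rule_based": "passport"}
trust = {"model_a": 0.8, "model_b": 0.9, "rule_based": 0.4}

label, share = combine(answers, trust)
print(label, round(share, 2))
```

The weight share gives a rough agreement signal: when it is low, the workers disagreed, which is the kind of case where routing to additional workers would make sense.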
So I will pause there. Of course, this is just one of many applications. You can try these out; you can sign up for a free seven-day trial. There are many applications you can do. With that, I will turn it back over to Manish.
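
The fuzzy matching step mentioned in the demo (deciding which extracted date is the birth date and which is the expiration date) can be sketched with Python's standard `difflib` module. This assumes the detector returns each value together with the noisy label text printed next to it, which the talk does not confirm; the field names, the `match_field` helper, and the 0.6 cutoff are hypothetical choices for illustration.

```python
import difflib

# Canonical field names the application wants to fill (hypothetical).
CANONICAL_FIELDS = ["date of birth", "expiration date"]


def match_field(label_text, cutoff=0.6):
    """Map a noisy, OCR-style label to the closest canonical field.

    Returns None when nothing is similar enough, so the item can be
    sent to a human instead of being guessed.
    """
    matches = difflib.get_close_matches(
        label_text.lower(), CANONICAL_FIELDS, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


# Made-up extraction results: (label text as read from the page, value).
extracted = [
    ("Date of Blrth", "1984-03-12"),
    ("Expiry Date", "2027-09-30"),
    ("Signature", "<image region>"),
]

for label_text, value in extracted:
    print(f"{label_text!r} -> {match_field(label_text)} = {value}")
```

Returning None for weak matches keeps the step honest: an unmatched label becomes an exception for review rather than a silent mislabel.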

Manish Rai [00:21:56] Thanks, Brad. What you saw is what's happening under the hood. But at the end of the day, if you're running a CoE or a shared services center, what you get is an application in days, where simply there's an input, which could be a file, a document, or an image you upload, and you get an output in a structured format. So our platform becomes a step in your RPA or low-code platform. It takes any kind of unstructured data and gives you a structured format, so you can achieve end-to-end automation. When you look at use cases, clearly the top priority of many shared services and GBS organizations today is documents. They're finding that first-generation IDP promised a lot, and sometimes it's not able to handle all the different types of documents at volume as well as they hoped. In that case, I hope you will consider this as a solution for invoices and bills of lading. We have many customers who are using us now for those types of documents. Once you've moved beyond documents, you find there are a lot of different types of emails, be it customer service, helpdesk, or case-related emails. And when you move into specialized areas such as banking, there could be interbank settlement. As part of KYC, you might have ID images. As part of video surveillance, you might have to recognize nameplates. We are seeing a lot of use cases in visual inspection. So if you want to expand the scope of your GBS and serve more of the business units inside your organization, be it in supply chain or in manufacturing, where a lot of visual inspection is involved, we can take images and determine cracks in boilers, determine rust, or even handle visual-inspection-related documents. Sometimes those documents that come in are handwritten, in various formats. Those inspection processes can be made much more streamlined.
We have video use cases coming to us, from finding defects in jewelry to finding other kinds of defects in products through video images. We can label one frame inside a video, very easily take 30-second or longer videos, and train it very quickly to identify those defects, and so on and so forth, with audio and text files as you move beyond that. So you saw the power of the platform in what Brad presented. Basically, what you're looking at here, on the left-hand side of the image, is one of the hardest types of invoices. We picked that up; I think it's a Canadian transportation invoice. It has logos, it has handwritten notes, it has tables. What Brad showed you is that you pick the quality, cost and speed trade-off, and we break it down into smaller tasks, such as detect the header, detect the signature, determine the logo, the address, the table, and so forth. We'll use a different AI that's specialized for each section of this document. And based on your quality, cost or speed, we'll pick the best worker. If you want costs to be low, maybe AI is the best at keeping the cost low, but if you want 99.9% quality guaranteed, you may need to route it to humans and have them agree on the output to give you that 99.9% guarantee. We'll make that determination. The platform then automatically combines all that output and gives you a single output in a structured format. Here is an example of our prebuilt application for redaction. We can redact images, documents and videos: for example, blurring faces, license plate numbers, and brands, and in documents, taking out national ID numbers, Social Security numbers, credit card numbers, names and addresses. We have the ability to anonymize those as well. And here is an example of a visual inspection use case, in this case detecting corrosion. But we have many, many different use cases for visual inspection. It could be determining cracks.
It could be a video file instead of an image, or drone footage, where you might need it. Our platform is capable of addressing all of those use cases. So with that, I'll hand it back to Brad to summarize our differentiators. One thing, Brad, I do want to point out before I hand it over to you: a lot of these IDP platforms out there are lacking a crowdsourced workforce that can help you train your AI, label the data that you have, or train your models, along with all the sophisticated gamification built in to keep the crowdsourced workers engaged. There is escalation built in: if a crowdsourced worker is not available, we can escalate to another worker. There are quality mechanisms to determine the quality of those workers. We have a lot of sophistication built in to give you peace of mind. Rather than having to source all the people to set up the system and do ongoing exception handling, we can provide that workforce for you in a crowdsourced manner, which is cost effective, with all the quality control mechanisms to guarantee the outcome you desire. So over to you, Brad.
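
For the document redaction use case mentioned in this segment (taking out Social Security numbers and credit card numbers), a minimal pattern-based sketch looks like the following. It is an assumption-laden illustration, not the product's method: the talk does not say how redaction is implemented, and plain regular expressions like these only catch well-formatted US-style values. The `PATTERNS` table and `redact` function are hypothetical.

```python
import re

# Simple patterns for illustration; real documents need far more robust detection.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),           # e.g. 123-45-6789
    "card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),     # 16 digits, optional separators
}


def redact(text):
    """Replace every match of each pattern with a labeled placeholder."""
    for name, pattern in PATTERNS.items():
        text = pattern.sub(f"[{name.upper()} REDACTED]", text)
    return text


original = "SSN 123-45-6789, card 4111 1111 1111 1111."
print(redact(original))
```

Names, addresses, faces, and license plates cannot be handled this way, which is where trained detection models of the kind described in the talk come in.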

Brad Cordova [00:28:31] Yeah, thanks, Manish. We started by going over the history of using structured data, and how the first versions were just a UI over OCR. We kind of fault those, but we also learned a lot from them. They were just an older technology. Sometimes I hate putting those down, because those were the necessary stepping stones to get us here. And yes, while this is the next evolution, and this makes the whole process much easier, cheaper, faster, and higher quality, I really respect those other platforms for getting us to be able to do this stuff. And yes, this does have an amazing amount of features that you saw here. I hate selling it, because there's already so much hype in the AI space, and everyone is saying they do everything. So what we'd love is for you to just try it and see. There's all this stuff about a unified AI platform and touchless automation, but at the end of the day, it just needs to make your life easier. Not everyone cares about this. They want to go home to their family and not have to think about this stuff. I think that's really our goal here: to allow humans to be more human, and to save them money and time. That's really what we're doing here. So instead of doing yet another sales pitch, just try it. Our customers are really happy with what we do. We love to save them a lot of time. We love to deliver high quality results. And we absolutely love to save people money. Manish and I are here pitching this, but it's really the hardworking people in the offices that make this possible. This really came together because we were able to hire the best people: some of the people on the original team of Google AI, and two of the people who built the deep learning platform at Microsoft. Our chief product officer was a top head of product.
Our principal engineer built the deep learning infrastructure at SAP. So we're the ones here talking, but it was this confluence of ideas that allowed us to build this platform. We're really grateful for that, and we're really grateful to you for listening today. We wish you the best in your automation process, whether that's with super.AI or anything else. Good luck. And if there are any questions, we can jump into those.

Speaker 1 [00:31:14] Thanks, guys. So we are out of time today. But if you can type in your question really quickly, obviously Brad and Manish are happy to answer it right now. Or if you look to the right of your screen, you can see that both have provided their LinkedIn information. So whether you have questions now or six months from now, connect with Manish and Brad; they'd be more than happy to help you. Otherwise, thank you so much for attending our event these past few days. We loved having you. Once again, I'm Elizabeth Mixon, the editor in chief of the Intelligent Automation Network. If you have any feedback, questions, anything like that, just write back to your registration email and we'd be happy to answer them. Otherwise, thanks everyone, and enjoy your day. Thanks, Manish and Brad.

Brad Cordova [00:32:06] All right.

Manish Rai [00:32:07] Thank you, Elizabeth. Thank you, everyone.
