MuleSoft Intelligent Document Processing (IDP) enables organizations to automate the extraction, processing, and analysis of data from physical documents. By leveraging machine learning and natural language processing, IDP transforms unstructured data - such as PDFs, images, and scanned documents - into structured data that can be seamlessly integrated into business processes.
Under the hood, MuleSoft IDP uses Amazon Textract to extract data from the document, while Einstein AI analyzes the content based on the provided prompts. The extracted information is then returned in a structured format (e.g., JSON) so it can be integrated into your business workflows.
The configuration settings for this process are called "document actions," which can be customized for different extraction needs. Once set up, the document action is published to MuleSoft Exchange, where it can be accessed through a RESTful API or used in applications like Anypoint Studio or RPA Builder for further automation.
For more information, please refer to Integrating IDP with Anypoint Studio and Automate Document Processing with RPA.
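To make the "structured format" step more concrete, here is a minimal Python sketch of how a downstream application might consume an extraction result. The payload shape is an assumption made for illustration: the names `documentName`, `fields`, `value`, and `confidence`, and the `Invoice` record, are hypothetical and not taken from the MuleSoft IDP API. Check the published document action in Exchange for the actual response schema.

```python
import json
from dataclasses import dataclass

# Hypothetical extraction result. The field names and nesting are
# assumptions for illustration, not the documented IDP response schema.
SAMPLE_RESULT = """
{
  "documentName": "invoice-001.pdf",
  "fields": {
    "invoiceNumber": {"value": "INV-1001", "confidence": 0.98},
    "totalAmount": {"value": "1250.00", "confidence": 0.91},
    "vendorName": {"value": "Northern Trail Outfitters", "confidence": 0.95}
  }
}
"""


@dataclass
class Invoice:
    """Typed record that the rest of a workflow could rely on."""
    invoice_number: str
    total_amount: float
    vendor_name: str


def parse_invoice(raw: str) -> Invoice:
    """Convert the (assumed) JSON payload into an Invoice record."""
    fields = json.loads(raw)["fields"]
    return Invoice(
        invoice_number=fields["invoiceNumber"]["value"],
        total_amount=float(fields["totalAmount"]["value"]),
        vendor_name=fields["vendorName"]["value"],
    )


invoice = parse_invoice(SAMPLE_RESULT)
print(invoice)
```

Mapping the JSON into a typed record early keeps any schema differences contained in one function, so the rest of the workflow does not depend on the exact response layout.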
Traditional Optical Character Recognition (OCR) tools have been widely used for extracting data from documents, but they require systematic selection and delimitation of document areas before data extraction. When dealing with dynamic documents lacking a consistent template, OCR solutions need training for each template, requiring significant maintenance.
MuleSoft Intelligent Document Processing (IDP) goes beyond OCR by utilizing document analysis and extraction through a Large Language Model (LLM), a type of Artificial Intelligence (AI) trained to dynamically identify where desired information resides in any document. This approach requires less maintenance compared to OCR. With consistent and concise queries, MuleSoft IDP can dynamically locate information without configuring specific document areas before extraction.
Demo: Invoice Processing
Using MuleSoft IDP, data is extracted from invoices in English, allowing for swift, automated processing with options for manual review when confidence in data extraction is low.
Creating and Publishing the Document Action
Visualizing the Exchange asset, and calling the API through Postman
Triggering the manual review
When IDP isn’t confident about the extracted data, a manual review can be triggered for further verification.
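The idea behind confidence-based review can be sketched in a few lines of Python. This is an illustration only, not how MuleSoft IDP implements the feature: the threshold value, the per-field `confidence` key, and the function name `fields_needing_review` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical cutoff; in practice the threshold would be chosen
# to suit the document type and the cost of an extraction error.
REVIEW_THRESHOLD = 0.80


def fields_needing_review(
    fields: Dict[str, dict], threshold: float = REVIEW_THRESHOLD
) -> List[str]:
    """Return the names of fields whose confidence is below the threshold.

    A field with no confidence score is treated as needing review.
    """
    return sorted(
        name
        for name, field in fields.items()
        if field.get("confidence", 0.0) < threshold
    )


# Assumed per-field structure: a value plus a confidence score in [0, 1].
sample_fields = {
    "invoiceNumber": {"value": "INV-1001", "confidence": 0.98},
    "totalAmount": {"value": "1250.00", "confidence": 0.62},
    "vendorName": {"value": "Northern Trail Outfitters", "confidence": 0.95},
}

flagged = fields_needing_review(sample_fields)
if flagged:
    print("Send to manual review:", ", ".join(flagged))
else:
    print("All fields passed; continue automated processing.")
```

Treating a missing score as low confidence is a deliberately conservative choice in this sketch: an unscored field goes to a reviewer instead of passing silently.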
Demo: Custom Document Processing
IDP can handle multiple languages, as seen in the extraction of information from Portuguese documents, making it an ideal solution for global businesses with diverse document needs.
Creating and Publishing the Document Action
Visualizing the Exchange asset, and calling the API through Postman
MuleSoft’s Intelligent Document Processing solution brings speed, accuracy, and security to your document workflows. Reach out to discover how this solution can be implemented to transform your business workflows or explore our MuleSoft services to unlock the full potential of MuleSoft for your organization.
Contact us today to discover how we can tailor Salesforce solutions to your unique business needs.
Disclaimer: Northern Trail Outfitters is a fictional company used for illustrative purposes in this article.
Leonardo is a Technical Architect with expertise in Salesforce and MuleSoft, dedicated to designing innovative solutions that drive digital transformation. With a strong focus on aligning technology with business needs, Leonardo delivers practical, effective implementations to help organizations achieve their goals.