September 4, 2024

How to Extract Insights from Documents with AI

Manu Suarez

Companies handle thousands of documents daily, such as reports, contracts, and vendor offers.

A closer look shows employees in these companies manually extracting information and populating PowerPoint presentations, Word documents, and Excel files. For a typical corporate worker, this process can consume roughly 30% of their time.

The good news is that, in most cases, the same type of information is extracted over and over, which makes the task a particularly good fit for Large Language Models (LLMs).

Why? LLMs excel at reading documents and extracting information. While a human reads at an average of 250 words per minute (about 4 words per second), an LLM can process thousands of words per second or more, a difference of several orders of magnitude.
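As a quick back-of-the-envelope check of that comparison, the sketch below divides an LLM throughput by the human reading speed of 250 words per minute. The two LLM throughputs are hypothetical values chosen for illustration, not measurements of any particular model.

```python
# Back-of-the-envelope comparison of reading speeds.
HUMAN_WORDS_PER_MINUTE = 250  # average human reading speed quoted in the text
human_words_per_second = HUMAN_WORDS_PER_MINUTE / 60


def speedup(llm_words_per_second: float) -> float:
    """Return how many times faster than a human reader a throughput is."""
    return llm_words_per_second / human_words_per_second


# Hypothetical LLM throughputs, for illustration only.
for rate in (1_000, 1_000_000):
    print(f"{rate:>9,} words/s is about {speedup(rate):,.0f}x a human reader")
```

At these assumed rates the ratio comes out to about 240x and 240,000x respectively.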

Document analysis, especially for long and structured documents, is one of the most successful applications of LLMs in business processes.

Let’s walk through how to build this use case in Stack AI.

Building an AI Agent to extract insights

Drop a Documents node from the Data Loader section onto the canvas. This node extracts and parses the entire content of the document, so the model receives all of it rather than a selection.

Knowledge bases, on the other hand, filter the content returned based on the user's message. The advantage of knowledge bases is that they can be as large as you need them to be.

In this case, the Documents node is the most appropriate, because we want the model to read the whole report.
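The difference between the two approaches can be sketched in a few lines. This is a simplified illustration, not how Stack AI implements either node: the function names are hypothetical, and the knowledge-base side scores chunks by naive word overlap, where real systems typically use embeddings.

```python
def load_full_document(text: str) -> str:
    """Documents-node style: pass the whole document to the model."""
    return text


def retrieve_chunks(text: str, query: str, chunk_size: int = 40, top_k: int = 2) -> list[str]:
    """Knowledge-base style: return only the chunks most related to the query."""
    words = text.split()
    chunks = [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]
    query_terms = set(query.lower().split())
    # Rank chunks by how many query words they share (naive stand-in for embeddings).
    ranked = sorted(
        chunks,
        key=lambda chunk: len(query_terms & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


report = "Revenue growth was 12 percent. Headcount stayed flat. The office moved to Madrid."
print(load_full_document(report))  # everything reaches the model
print(retrieve_chunks(report, "revenue growth", chunk_size=5, top_k=1))  # only the best match
```

With full loading nothing is filtered out, which suits a single report that must be read end to end. Retrieval keeps the context small, which matters once the collection is too large to pass in whole.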

Click Settings and select the Advanced Data Extraction and/or Advanced Text in Image Extraction options to retrieve information from tables, images, and similar content. These options slow down processing but increase accuracy.

Drop a Large Language Model node onto the canvas and specify in its prompt section how it should process the document. In this case, we first ask it to extract any information regarding growth rates.

Reference the node that supplies the report content. In our case, it’s doc-0, the label of the Documents node we dropped in.
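Conceptually, referencing a node means its text is substituted into the prompt before the model runs. The sketch below illustrates that idea only. The curly-brace placeholder syntax, the prompt wording, and the fill_prompt helper are assumptions made for illustration, so check the builder for the exact way to reference doc-0.

```python
# Hypothetical prompt: the {doc-0} placeholder stands for the Documents node's text.
PROMPT_TEMPLATE = (
    "Extract every statement about growth rates from the report below.\n"
    "Quote each figure and the period it refers to.\n\n"
    "Report:\n{doc-0}"
)


def fill_prompt(template: str, node_outputs: dict[str, str]) -> str:
    """Replace each {node-id} placeholder with the text produced by that node."""
    prompt = template
    for node_id, text in node_outputs.items():
        prompt = prompt.replace("{" + node_id + "}", text)
    return prompt


prompt = fill_prompt(PROMPT_TEMPLATE, {"doc-0": "Revenue grew 12% in 2023."})
print(prompt)
```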

Drop as many Large Language Models as you need. We recommend having one per task; in this case, one per insight or analysis to be performed. Add one output node to each Large Language Model.
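The one-model-per-task layout can be pictured as several independent prompts run over the same document, each producing its own output. In this minimal sketch only the growth-rate task comes from this guide. The other two tasks, the run_tasks helper, and the fake_llm stand-in are hypothetical and exist so the example runs without a model.

```python
from typing import Callable

# One instruction per task; each maps to its own LLM node and output node.
TASK_PROMPTS = {
    "growth_rates": "Extract any information regarding growth rates.",
    "risks": "List the main risks mentioned in the report.",  # hypothetical task
    "summary": "Summarize the report in five bullet points.",  # hypothetical task
}


def run_tasks(document: str, call_llm: Callable[[str], str]) -> dict[str, str]:
    """Run one independent LLM call per task and collect one output per task."""
    return {
        name: call_llm(f"{instruction}\n\nReport:\n{document}")
        for name, instruction in TASK_PROMPTS.items()
    }


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return f"[model answer to: {prompt.splitlines()[0]}]"


outputs = run_tasks("Revenue grew 12% in 2023.", fake_llm)
for name, answer in outputs.items():
    print(name, "->", answer)
```

Keeping each prompt focused on a single question makes the outputs easier to review and lets you change one analysis without touching the others.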

Click the Expose as input option in the Documents node to make it available in the user interface. This will allow users to upload their documents.

When ready, click Publish so that external users can access it.

Select the Right User Interface

Once in the Export tab, select the Form interface. Customize the fields of the Form based on your specific report and business process. Click Save Interface.

When done, click the URL at the top or share it with your colleagues. Your AI application is now ready for users to interact with it.

As you interact with the application, you will see a field for each output node that was added in the workflow builder. Simply click the download button and choose your preferred format.

Wrapping up

Extracting information from long reports by hand is very time-consuming. This guide covers the general steps for automating any document extraction process your business might need.

Create a free Stack AI account and explore our AI tool tutorials.