AI in insurance: OCR AI in input management | Update 2024

Digitizing processes with input management systems is nothing new for insurance companies. These systems handle incoming mail from receipt through to archiving. The primary aim is to prepare data in a structured form so that it can be passed on to downstream systems, such as an ERP system. However, these tools are often outdated and very expensive.

AI-driven input management through OCR and NLP

Extending input management by combining artificial intelligence (AI) techniques such as automatic text recognition (OCR) and natural language processing (NLP) is already used today for more than 62 % of customer interactions in insurance companies [1]. Intelligent OCR applies keywording and extracts text fields or entire sections of text from documents or emails, raising the accuracy of rule-based approaches by 6 %, to 93 %. Insurance companies also save time by using intelligent automation approaches such as hyperautomation.

How does AI OCR work?

Figure 1. Automated document processing with OCR technologies.

Figure 1 shows the sequence of automated document processing with OCR. In general, such systems all follow the same structure:

  1. Input

    The data input (the document) is taken from a database or from one of the front-end systems, such as a robotic process automation (RPA) bot, an email inbox, or others. More low-code and no-code providers are covered in our following article.

  2. Preprocessing

    The files are pre-processed so that they can be handled regardless of file type, scan quality and number of pages.

  3. Intelligent detection

    Neural-based automatic document classification technology enables sorting of documents by type (e.g., driver's license, bank statement, tax form, contract, invoice) and user-defined subcategories (e.g., vendor A invoices, vendor B invoices) by identifying text content and image patterns.

  4. Assignment and categorization

    The neural classifier determines the document type and selects the matching document definition for further content processing.

  5. Subject data extraction

    After recognizing certain fields, the structured or semi-structured text is extracted from the document and exported to the target system.
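
The five steps above can be sketched as a small pipeline. This is only a minimal illustration and not the implementation of any particular product: the names `Document`, `preprocess`, `classify`, `extract_fields` and `process`, the keyword lists and the regular expressions are all assumptions made for this example, and a plain keyword count stands in for the neural classifier described in step 3.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword lists; a real system would use a trained classifier.
KEYWORDS = {
    "invoice": ("invoice", "amount due"),
    "claim": ("claim", "damage"),
}

# Hypothetical document definitions: which fields to extract per document type.
FIELD_PATTERNS = {
    "invoice": {"invoice_number": r"Invoice\s+No\.?\s*(\w+)"},
    "claim": {"policy_number": r"Policy\s+No\.?\s*(\w+)"},
}


@dataclass
class Document:
    text: str                      # text after input and preprocessing
    doc_type: str = "unknown"      # result of detection and assignment
    fields: dict = field(default_factory=dict)  # extracted subject data


def preprocess(raw: str) -> str:
    """Step 2: normalize whitespace so later steps see uniform text."""
    return " ".join(raw.split())


def classify(text: str) -> str:
    """Steps 3-4: pick the document type with the most keyword hits."""
    lowered = text.lower()
    scores = {t: sum(k in lowered for k in kws) for t, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def extract_fields(text: str, doc_type: str) -> dict:
    """Step 5: apply the document definition selected for this type."""
    result = {}
    for name, pattern in FIELD_PATTERNS.get(doc_type, {}).items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            result[name] = match.group(1)
    return result


def process(raw: str) -> Document:
    """Run preprocessing, classification and extraction on one input text."""
    text = preprocess(raw)
    doc_type = classify(text)
    return Document(text=text, doc_type=doc_type,
                    fields=extract_fields(text, doc_type))
```

Calling `process("Invoice No. A123 ...")` would return a `Document` whose `doc_type` and `fields` can then be exported to the target system; documents that match no keyword stay `"unknown"` and carry no fields.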

If desired or required, AI OCR enables human review by setting a confidence threshold. This human feedback, also called human-in-the-loop, helps the AI to learn continuously. It can be provided flexibly and individually via the Document Validation UI and can be built into each process. If the specified threshold is not reached, a manual check is performed before the data is exported to the target system. The final output of this process can be an XML, JSON, CSV, XLSX/XLS, TXT or hOCR file.
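
The confidence threshold described above could be applied roughly as follows. This is a hedged sketch: the function name `route`, the threshold value of 0.85, the field names and the sample values are all made up for illustration, and the JSON output is just one of the export formats mentioned.

```python
import json

CONFIDENCE_THRESHOLD = 0.85  # hypothetical value; would be tuned per process


def route(fields: dict[str, tuple[str, float]],
          threshold: float = CONFIDENCE_THRESHOLD) -> dict:
    """Split extracted fields into accepted values and ones needing review.

    `fields` maps a field name to (value, confidence), confidence in [0, 1].
    """
    accepted = {n: v for n, (v, c) in fields.items() if c >= threshold}
    needs_review = sorted(n for n, (_, c) in fields.items() if c < threshold)
    return {
        "fields": accepted,
        "needs_review": needs_review,
        # Any field below the threshold triggers a manual check before export.
        "status": "manual_check" if needs_review else "export",
    }


# Illustrative input only: one confident field, one uncertain field.
result = route({
    "policy_number": ("P-1001", 0.97),
    "claim_amount": ("1,250.00", 0.62),
})
print(json.dumps(result, indent=2))
```

Here the uncertain `claim_amount` field sends the whole document to manual checking, while a document whose fields all reach the threshold would be exported directly.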

Scope of Input Management functions

1. letter mail

Incoming letter mail is received by the appropriate postal logistics providers.

2. letter sorting

The received letters are sorted according to the criteria "open" or "do not open".

3. letter opening

Letters classified as "open" are opened. Letter opening technologies are usually used for this purpose.

4. fine sorting

All sorting and preparation activities for the subsequent digitization of letter mail are subsumed in the fine sorting area. This includes sorting according to, for example, special formats, clients, process types and process subtypes, but also scan preparation: unstapling, preparation of individual pages, smoothing, insertion of separator sheets or application of barcodes for process/document separation.

5. scanning

Scanning is the process of converting analog paper-based documents into digital file formats using MFPs, desktop or production scanners.

6. mobile scanning

With mobile apps, paper-based documents can be scanned directly into Input Management by the customer or partner.

7. fax import

During fax import, the faxes are imported directly from the fax server. In addition, image enhancement takes place, e.g. in the area of compression and scaling.

8. email import

Electronic files from e-mail systems (Exchange, Lotus Notes) can be imported into Input Management via e-mail import.

9. email preparation

E-mails are converted in such a way that the e-mail body and the e-mail attachments can be analyzed separately in the subsequent process steps. Often, this also requires conversion of the e-mail attachments into a machine-readable format.


10. OCR

Optical Character Recognition (OCR) is a method of converting text that exists only as an image, rather than as machine-readable characters, into an encoded character string that a computer can process. In addition, Optical Mark Recognition (OMR) can be used to recognize marks (e.g. checkboxes) and Optical Barcode Recognition (OBR) can be used to recognize barcodes and data matrix codes.

11. web/portal/file import

Via the web/portal import, electronic files can be imported from internet pages or internet portals into the Input Management. Via the file import, electronic files can be imported from the file system into the Input Management.

12. voice messages

Voice messages are imported into Input Management, for example from a telephone system.

13. voice to text

Voice to Text converts spoken words into processable strings.

14. classification

Automatic assignment of a document type or document class to a scanned document.

15. extraction

Machine reading of document fields from scanned documents.

16. plausibility check

Error-tolerant checking of document fields captured via extraction against reference databases.

17. data enrichment

Enrichment of extracted document fields using reference databases.

18. manual correction

Document types or document fields not recognized in the classification and extraction process are manually post-processed and the data is completed.

19. special processing & 1st level

Missing data required for further processing is obtained manually through queries. 1st level processing, also called broad processing or simple processing, means that simple business transactions are completed on a case-by-case basis. For example, the processing of a return shipment, including the necessary address determination, falls into this area.

20. handover electronic mailbox

Metadata and the document are handed over to the electronic inbox for further transaction processing.

21. handover electronic archive

Metadata and the document are transferred to the electronic archive for audit-proof archiving.

22. e-mail response management

Metadata, the document and the e-mail in its original format are transferred to a system for automated e-mail response. By using email response management, incoming emails can be processed and answered more efficiently, optimizing communication with customers and partners.
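
Steps 16 and 17 above (plausibility check and data enrichment) can be illustrated with a short sketch. It assumes a tiny in-memory reference database and uses the fuzzy string matching from Python's standard `difflib` module as the error-tolerant comparison; the customer records, the function name `check_and_enrich` and the cutoff of 0.8 are hypothetical.

```python
from difflib import get_close_matches
from typing import Optional

# Hypothetical reference database: customer name -> master data.
CUSTOMERS = {
    "Anna Schmidt": {"customer_id": "C-001", "city": "Freiburg"},
    "Jonas Weber": {"customer_id": "C-002", "city": "Hamburg"},
}


def check_and_enrich(extracted_name: str, cutoff: float = 0.8) -> Optional[dict]:
    """Return master data for the closest reference entry, or None.

    The plausibility check tolerates small OCR errors; if no entry is
    similar enough, the field is left for manual correction.
    """
    matches = get_close_matches(extracted_name, list(CUSTOMERS), n=1, cutoff=cutoff)
    if not matches:
        return None  # implausible value: hand over to manual correction
    name = matches[0]
    # Data enrichment: add the reference record to the corrected name.
    return {"name": name, **CUSTOMERS[name]}
```

A name read with a single OCR error, such as `"Anna Schmldt"`, would still be matched to the reference entry and enriched with its master data, while a value with no similar entry returns `None` and falls through to step 18.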

Process automation with hyperautomation in insurance

In view of the pandemic and the resulting economic crisis, it has become increasingly important to optimize and stabilize processes in insurance companies. The further development of automation technologies such as OCR, RPA (robotic process automation) and AI results in economically and technologically advanced solutions for process automation, known as hyperautomation. Many companies aim to increase service quality or boost sales and to make existing processes more robust for their digital future. Hyperautomation enables the automation of processes beyond rule-based standard applications.

Automatic fraud detection through AI in insurance companies

The insurance industry is increasingly struggling with fraud, which causes billions of euros in damage every year. According to the German Insurance Association, 10 % of claims paid out in Germany go to the accounts of fraudsters [2]. Detecting fraud attempts more reliably requires technical solutions that continually adapt to new circumstances and fraud patterns and go beyond the rule-based approaches of input management, whose error rate is high and which require additional manual effort. AI and OCR can be used to check claims for conspicuous content patterns and automatically detect anomalies. With an average claim amount of approximately €3,000 and 1,029 fraud cases detected using AI, an insurance company could achieve a savings potential of roughly €3.1 million (€3,000 × 1,029 ≈ €3.09 million).
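
The savings figure follows from a simple multiplication, which the sketch below reproduces from the two numbers cited above. It assumes that every detected fraud case avoids exactly one payout of the average claim amount; the function name `savings_potential` is hypothetical.

```python
AVERAGE_CLAIM_EUR = 3_000      # average claim amount cited in the text
DETECTED_FRAUD_CASES = 1_029   # detected fraud cases cited in the text


def savings_potential(avg_claim_eur: int, detected_cases: int) -> int:
    """Assumes every detected fraud case avoids one average claim payout."""
    return avg_claim_eur * detected_cases


total = savings_potential(AVERAGE_CLAIM_EUR, DETECTED_FRAUD_CASES)
print(f"Savings potential: EUR {total:,} (about {total / 1e6:.1f} million)")
```

The product is €3,087,000, which rounds to the €3.1 million quoted.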

AI in insurance individualizes the customer approach

Individualization and personalization are among the megatrends of the 2020s. Standard solutions rarely inspire customers, and the demand for an individual customer approach is growing. Insurance companies can treat this development as a major opportunity for cross-selling and up-selling by using an AI-based solution for support. Based on customer information, individual e-mails can be generated automatically and the quality of communication sustainably increased. Automatically generated texts can no longer be distinguished from manually written ones, and the response rate can be increased from approximately 1.5 % to up to 35 %. The AI application learns automatically from new input, closing knowledge gaps and making new connections on its own. Pre-trained language models such as GPT-3 are powerful text generators that independently write coherent texts and are used for successful customer engagement [3].

Understanding documents better through AI in insurance

The transfer of insurance documents between insurers, brokers and other partners is largely standardized by BiPRO Standard 430, but not automated [4]. AI processes data in millions of documents and helps employees find cross-selling potential in customer portfolios and save money in contract negotiation and input management. By using AI, content in documents can be retrieved in a structured way. Work steps such as typing, renaming, filing and validating are almost completely eliminated. This makes it possible to process these documents purely digitally, enrich them with known master data and harmonize them across systems. AI software learns to understand and structure information from documents 24 times faster than a human. As a result, insurance companies benefit from faster and more efficient processing of their documents.


[1] Capgemini Research Institute (2020). Smart Money.

[2] Friedrich, S. (2018). Du Lügst! in GDV Positions magazine, issue 3/2018, pages 24-26.

[3] Tan, B., Yang, Z., Al-Shedivat, M., Xing, E. P., & Hu, Z. (2020). Progressive generation of long text.

[4] BiPRO e.V. (2021). Standard 430. 

Photo by Adrianna Calvo

Florian Zyprian
