Frequently asked questions

How does the Konfuzio AI work?

Our AI software uses supervised learning to learn regularities and apply them automatically to new incoming documents. The algorithms learn the regularities in your expert knowledge from examples. Because the results of the learning process can be compared with the known, correct results, the learning is "supervised" and can be checked particularly well.

What added value does AI software offer compared to classic OCR solutions?

The software does away entirely with the manual creation of rules or layouts. This manual activity, which is common with older providers, is taken over by algorithms. This saves IT experts' resources because the software learns from non-IT experts, while it can still be customized. The software does not use any kind of swarm intelligence, so your expert knowledge is not shared.

What ready-made document models do you offer?

You obtain the AI software from us without pre-trained models. If you wish to use ready-made AI, we will refer you to partners who have invested upfront and offer ready-made models.

How many examples does the AI software need to learn?

The AI needs approx. 20 examples per field in order to learn a so-called label. You receive an individual confidence value for each extracted piece of information.
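
As an illustration, the confidence value can be used to decide which extracted values are accepted automatically and which are sent to manual review. This is a minimal sketch, assuming each extracted value is available as a dictionary with the hypothetical keys `label`, `value` and `confidence` (between 0 and 1). The key names and the threshold of 0.8 are assumptions for illustration, not the actual result format or a recommendation.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical structure: these keys are assumptions for illustration,
# not the actual result format of the software.
Extraction = Dict[str, Any]


def split_by_confidence(
    extractions: List[Extraction], threshold: float = 0.8
) -> Tuple[List[Extraction], List[Extraction]]:
    """Return (accepted, needs_review) based on each confidence value."""
    accepted = [e for e in extractions if e["confidence"] >= threshold]
    needs_review = [e for e in extractions if e["confidence"] < threshold]
    return accepted, needs_review


results = [
    {"label": "delivery_note_number", "value": "LS-2041", "confidence": 0.97},
    {"label": "article_number", "value": "A-77", "confidence": 0.55},
]
accepted, needs_review = split_by_confidence(results)
```

With the sample data, the delivery note number is accepted and the article number is routed to review.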

Who is the AI software designed for?

The software is designed to meet the needs of IT departments, software vendors, IT system integrators and data scientists.

Do I get the data and source code of the AI models?

The data to train your own models is accessible via the API or the Python package. The source code is only accessible for individual model developments that go beyond pure low-code training via the web interface. If you are interested, a knowledge transfer is possible to enable you to train your own models.

Is the OCR software offered as a cloud-hosted service?

Yes, Helm & Nagel GmbH offers the AI software as a SaaS variant in the Open Telekom Cloud (OTC) on servers in Germany and the Netherlands.

Is the SaaS variant GDPR compliant?

Our technical and organizational measures (TOMs) apply. The Data Processing Agreement between Helm & Nagel GmbH and you as a customer governs full compliance of the processing of your personal data with Article 27/28 GDPR. Both the Data Processing Agreement and the non-disclosure agreement (NDA) also extend to the customer's affiliated companies. You have to check yourself whether the data you want to have processed is personal data.

Is an on-premise solution also offered?

Yes, we offer on-site installation in a Kubernetes cluster or as a Docker image on Red Hat Linux servers. No graphics card is needed. The servers require at least eight 2.6 GHz processors with the AVX2 CPU instruction set extension and 64 GB RAM.
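
A Linux host can be checked roughly against these requirements before installation. This is a minimal sketch, assuming the text of `/proc/cpuinfo` and `/proc/meminfo` is passed in and that "processors" means the logical CPUs listed there. The function name `check_host` is hypothetical and not part of any official installer. The clock speed is not checked, because `/proc/cpuinfo` only reports the current frequency.

```python
import re

MIN_CORES = 8  # "at least eight ... processors" from the requirements above


def check_host(cpuinfo: str, meminfo: str) -> dict:
    """Summarize CPU count, AVX2 support and RAM from /proc-style text."""
    # Each logical CPU has one "processor : N" line.
    cores = len(re.findall(r"^processor\s*:", cpuinfo, flags=re.MULTILINE))
    # The "flags" line lists the instruction set extensions of a CPU.
    flag_lines = re.findall(r"^flags\s*:(.*)$", cpuinfo, flags=re.MULTILINE)
    has_avx2 = any("avx2" in line.split() for line in flag_lines)
    # MemTotal is given in kB; it is usually slightly below the installed
    # RAM, so compare it with the 64 GB requirement by eye.
    match = re.search(r"^MemTotal:\s*(\d+)\s*kB", meminfo, flags=re.MULTILINE)
    ram_gb = int(match.group(1)) / (1024 ** 2) if match else 0.0
    return {
        "cores": cores,
        "cores_ok": cores >= MIN_CORES,
        "avx2": has_avx2,
        "ram_gb": round(ram_gb, 1),
    }


# Small sample input in the format of the two /proc files.
sample_cpuinfo = (
    "processor\t: 0\nflags\t\t: fpu sse avx avx2\n\n"
    "processor\t: 1\nflags\t\t: fpu sse avx avx2\n"
)
sample_meminfo = "MemTotal:       65842180 kB\nMemFree:         1024 kB\n"
report = check_host(sample_cpuinfo, sample_meminfo)
```

On a real host, the two arguments would be the contents of `/proc/cpuinfo` and `/proc/meminfo`, read with `open(...).read()`.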

Can I get a trial account?

Yes, please contact us via the contact form.

Where can I find the price list?

You can find the prices under https://konfuzio.com/de/aktuelle-preisliste/. The page is protected with a password. Please contact us to get the password.

How is the billing done?

You will receive the invoices by e-mail.

Is it possible for rules to be stored?

No, the AI software does not replace a business rule engine (BRE) or workflow engine.

What is the effort involved in training?

You need about 2 hours for each field you want to extract. A field can be, for example, the delivery note number or the article number. If you wish, Helm & Nagel GmbH will train your model on your behalf.

When does the AI learn?

The fully automated training of the AI can be started via API or web interface.

What scan resolution is required?

There are no limitations. Please test the text recognition of your scans with a trial account. You will be surprised how well even vehicle documents and low-resolution scans are recognized.

Which technologies are used for the offered interfaces?

The interfaces are built on the established Django REST framework, which makes it very easy to provide and consume interfaces in many formats. The application can therefore be linked quickly to third-party systems and can also be used when developing modern single-page applications or native mobile apps.

What are the requirements for using the interfaces?

For more information on how to use the interfaces, see Swagger RESTful API Documentation Specification.

Does OCR recognize handwriting?

Yes. However, we ask you to consider whether handwritten input could be replaced by a web form instead of automating handwriting recognition.

Are check boxes supported?

The OCR recognizes text and handwriting. Recognition of check boxes cannot be guaranteed.

How quickly can the processing data be provided?

Measured on 100 pages, text recognition takes an average of 1.4 seconds. In addition, each field to be recognized takes about 0.1 seconds.
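
These figures allow a rough estimate of the processing time. This is a minimal sketch, assuming that both values apply per page and add up linearly; the answer above does not state this explicitly, so treat the result as an order of magnitude only.

```python
OCR_SECONDS_PER_PAGE = 1.4  # average text recognition time (assumed per page)
SECONDS_PER_FIELD = 0.1     # time per field to be recognized (assumed per page)


def estimate_seconds(pages: int, fields: int) -> float:
    """Estimate the processing time for a document under the assumptions above."""
    return pages * (OCR_SECONDS_PER_PAGE + fields * SECONDS_PER_FIELD)


# A 3-page document with 10 fields to be recognized.
estimate = estimate_seconds(pages=3, fields=10)
```

For a 3-page document with 10 fields, this gives roughly 7.2 seconds.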

How do I get the results?

You receive the results either synchronously or asynchronously via the REST API as JSON or via manual download as CSV. In addition, it is possible to define an individual webhook URL for each document.
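
A webhook URL has to point to a service that accepts the results. This is a minimal sketch of such a receiver using only the Python standard library, assuming the results arrive as a JSON object in an HTTP POST request. The payload format, the port, and the names `parse_payload`, `WebhookHandler` and `make_server` are assumptions for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # parsed payloads, kept in memory for this sketch only


def parse_payload(body: bytes) -> dict:
    """Decode a JSON request body; no specific payload fields are assumed."""
    payload = json.loads(body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            received.append(parse_payload(self.rfile.read(length)))
            self.send_response(204)  # accepted, no response body
        except ValueError:
            self.send_response(400)  # body was not a valid JSON object
        self.end_headers()


def make_server(port: int = 8000) -> HTTPServer:
    """Create the receiver; call serve_forever() on the result to run it."""
    return HTTPServer(("0.0.0.0", port), WebhookHandler)
```

Calling `make_server().serve_forever()` starts the receiver. A production service would additionally need TLS and a check that the request really comes from the expected sender.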

Can documents be automatically separated by the AI?

There is currently no reliable approach worldwide for separating documents with AI. We are working on one and offer rule-based separation until further notice.

How do I interpret the statistical values?

The calculation of the statistical values is based on the relevant literature. Wikipedia offers an introduction at https://en.wikipedia.org/wiki/F-score. We disclose the source code for the calculation of the evaluation.
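
As an illustration of the F-score described in the linked article, the following sketch computes precision, recall and the F1 score from the number of correct extractions (true positives), wrong extractions (false positives) and missed values (false negatives). It follows the standard textbook definition and is not the evaluation source code mentioned above.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Example: 80 correct values, 20 wrong values, 10 missed values.
precision, recall, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
```

In this example, the precision is 0.8, the recall is about 0.89 and the F1 score is about 0.84.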

Is there a separation between the training and productive environments?

Yes, all software changes are tested automatically and rolled out on a test system if there are no errors. Before each release, we perform extensive integration tests in the staging environment. After all tests have been successfully completed, we make the changes available to you. If, contrary to expectations, you find an error, please let us know at https://konfuzio.com/support/.

For which loads is the system used productively?

The SaaS variant scales automatically and offers a base capacity of 10,000 pages per hour. On-site installations offer a capacity of at least 6,000 pages per hour and are fully scalable as a cluster.

How is monitoring carried out during operation?

With the SaaS variant, we take over the monitoring of the application for you. With an on-site installation, monitoring is done via Grafana and a Kubernetes dashboard. You also have access to the processing status of each document via the API.

What detection rates can be guaranteed?

On request, we guarantee the accuracy measured on your documents with the record status "Test".

Where does AI find application in software?

An example is the prediction of item numbers, unit prices and total prices for the line items of a newly received invoice. It is often assumed that only one "AI" is used. Unfortunately, document processing is much more complex, so we need to use different AI algorithms.
In most cases, a digital document does not contain any text, but only provides a visual representation of the content. We use OCR to extract characters, words, lines, tables, and paragraphs from this image.
To give these text elements a subject context, an NER algorithm labels individual characters, words, sentences or paragraphs. This alone opens up a wide variety of applications, from the recognition of a currency to contract analysis. Coming back to the invoice, this algorithm finds the various item numbers, unit prices and total prices.
In the last step, another AI algorithm assigns the item numbers, unit prices and total prices it found to a business concept, in this case a single line item. This is particularly challenging because the item number, unit price, and total price of a line item are not necessarily within a single line. To enable a particularly reliable assignment, we have developed an AI that takes relative spatial and textual information into account to recognize a subject context. It can thus, for example, split item numbers, unit prices and total prices into separate line items and separate them from vendor- and customer-related information.
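
The grouping problem in the last step can be illustrated with a deliberately simple heuristic. This sketch is not the AI described above. It assumes hypothetical entities with a label and a position on the page, sorts them in reading order, and starts a new line item at every item number. The class `Entity`, the label names and the coordinates are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entity:
    label: str  # e.g. "item_number", "unit_price", "total_price"
    text: str
    x: float    # left edge of the bounding box
    y: float    # top edge of the bounding box, increasing downwards


def group_line_items(entities: List[Entity]) -> List[Dict[str, str]]:
    """Group entities into line items; each item number opens a new item."""
    items: List[Dict[str, str]] = []
    # Reading order: top to bottom, then left to right.
    for entity in sorted(entities, key=lambda e: (e.y, e.x)):
        if entity.label == "item_number":
            items.append({"item_number": entity.text})
        elif items:
            # Attach the value to the current item; keep the first value per
            # label. Entities before the first item number are ignored.
            items[-1].setdefault(entity.label, entity.text)
    return items


entities = [
    Entity("item_number", "A-100", x=40, y=200),
    Entity("unit_price", "12.50", x=300, y=200),
    Entity("total_price", "25.00", x=400, y=212),  # wrapped onto the next line
    Entity("item_number", "B-200", x=40, y=230),
    Entity("unit_price", "3.00", x=300, y=230),
    Entity("total_price", "9.00", x=400, y=230),
]
line_items = group_line_items(entities)
```

The heuristic handles the wrapped total price in the sample, but it fails as soon as item numbers are missing or columns are ordered differently, which is why a learned approach that uses spatial and textual information is needed in practice.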