Frequently asked questions
Our AI software uses supervised learning to learn regularities and apply them automatically to new incoming documents. The algorithms learn the regularities in your expert knowledge from examples. Because the correct results for these examples are known, the output of the learning process can be compared against them, i.e. "supervised", particularly well.
The software dispenses entirely with the creation of rules or layouts. This manual activity, which is common with older providers, is taken over by algorithms. This saves the time of IT experts, because the software learns from non-IT experts, while customization remains possible. The software does not use any kind of swarm intelligence, so your expert knowledge is not shared.
You obtain the AI software from us without pre-trained models. If you wish to use ready-made AI, we will refer you to partners who have invested in advance and offer ready-made models.
For each field, the AI needs approx. 20 examples to learn a so-called label. You receive an individual confidence value for each piece of information extracted.
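As a rough illustration of how per-field confidence values could be used downstream, the following sketch sends extractions below a threshold to manual review. The record layout (`label`, `value`, `confidence`), the sample values and the threshold of 0.8 are assumptions made for this example, not the actual output format of the software.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    label: str         # hypothetical field name, e.g. "article_number"
    value: str         # text that was read from the document
    confidence: float  # assumed to lie between 0 and 1


def split_by_confidence(extractions, threshold=0.8):
    """Return (accepted, needs_review) based on a confidence threshold."""
    accepted = [e for e in extractions if e.confidence >= threshold]
    needs_review = [e for e in extractions if e.confidence < threshold]
    return accepted, needs_review


# Made-up sample data for illustration only.
results = [
    Extraction("delivery_note_number", "LS-2041", 0.97),
    Extraction("article_number", "A-1O3", 0.55),
]
accepted, needs_review = split_by_confidence(results)
```

A suitable threshold depends on how costly a wrong value is compared with a manual check, so it is best chosen per field.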
The software is designed to meet the needs of IT departments, software vendors, IT system integrators and data scientists.
The data to train your own models is accessible via the API or the Python package. The source code is only accessible for individual model developments that go beyond pure low-code training via the web interface. If you are interested, a knowledge transfer is possible to enable you to train your own models.
Yes, Helm & Nagel GmbH offers the AI software as a SaaS variant in the Open Telekom Cloud (OTC) on servers in Germany and the Netherlands.
Our technical and organizational measures (TOMs) apply. The Data Processing Agreement between Helm & Nagel GmbH and you as a customer governs the processing of personal data in accordance with Article 27/28 GDPR. Both the Data Processing Agreement and the non-disclosure agreement (NDA) extend to the customer's affiliated companies. You must check for yourself whether the data you want to process is personal data.
Yes, on-site installation is possible in a Kubernetes cluster or as a Docker image on Red Hat Linux servers. No graphics card is required; the servers need at least eight 2.6 GHz processors with the AVX2 instruction set extension and 64 GB RAM.
Yes, please contact us via the contact form.
You can find the prices at https://konfuzio.com/de/aktuelle-preisliste/. The page is password-protected. Please contact us to get the password.
You will receive the invoices by e-mail.
No, the AI software does not replace a business rule engine (BRE) or workflow engine.
You need about 2 hours for each field you want to extract. A field can be, for example, the delivery note number or the article number. If you wish, Helm & Nagel GmbH will train the model on your behalf.
The fully automated training of the AI can be started via API or web interface.
There are no limitations. Please test the text recognition on your own scans using a trial account. You will be surprised how well even vehicle documents and low-resolution scans are recognized.
Because the application is built on the established Django REST framework, it is very easy to provide and consume interfaces in many formats. The application can therefore be linked quickly to third-party systems and can also be used when developing modern single-page applications or native mobile apps.
You can find the API documentation at https://app.konfuzio.com/api/.
For more information on how to use the interfaces, see Swagger RESTful API Documentation Specification.
Yes. However, we ask you to consider whether the handwritten input could be replaced by a web form instead of being automated.
The OCR recognizes text and handwriting. Recognition of text boxes cannot be guaranteed.
Measured on 100 pages, text recognition takes an average of 1.4 seconds. In addition, each field to be recognized takes about 0.1 seconds.
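The figures above can be turned into a back-of-the-envelope estimate. The sketch below assumes that 1.4 seconds is the average text recognition time per page and that the 0.1 seconds per field applies once per document; both readings are assumptions, and real timings depend on the documents and the hardware.

```python
OCR_SECONDS_PER_PAGE = 1.4  # assumed: average per page, as measured on 100 pages
SECONDS_PER_FIELD = 0.1     # assumed: applied once per field and document


def estimate_seconds(pages: int, fields: int) -> float:
    """Rough processing time for one document, processed sequentially."""
    return pages * OCR_SECONDS_PER_PAGE + fields * SECONDS_PER_FIELD


# Example: a 3-page document with 10 fields, roughly 5.2 seconds.
estimate = estimate_seconds(pages=3, fields=10)
```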
You receive the results either synchronously or asynchronously via the REST API as JSON or via manual download as CSV. In addition, it is possible to define an individual webhook URL for each document.
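As an illustration of how a JSON result might be processed after it arrives via the API or a webhook, the following sketch flattens a result into CSV rows. The payload structure shown here (`document_id`, `extractions`, `label`, `value`, `confidence`) is hypothetical; please take the real schema from the API documentation.

```python
import csv
import io
import json

# Hypothetical payload; the real response schema may differ.
payload = json.loads("""
{
  "document_id": 123,
  "extractions": [
    {"label": "invoice_number", "value": "RE-1001", "confidence": 0.98},
    {"label": "total", "value": "119.00", "confidence": 0.91}
  ]
}
""")


def to_csv(doc: dict) -> str:
    """Write one CSV row per extracted field of a single document."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["document_id", "label", "value", "confidence"])
    for item in doc["extractions"]:
        writer.writerow(
            [doc["document_id"], item["label"], item["value"], item["confidence"]]
        )
    return buffer.getvalue()


csv_text = to_csv(payload)
```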
To date, there is no reliable AI-based approach to document separation. We are working on one and offer rule-based separation until further notice.
The calculation of the statistical values is based on the relevant literature. Wikipedia offers an introduction at https://en.wikipedia.org/wiki/F-score. We disclose the source code for the calculation of the evaluation.
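For orientation, the textbook definitions of precision, recall and the F-score can be sketched as follows. This is a generic implementation of the formulas from the literature, not the disclosed source code of the evaluation; the function name and the example counts are chosen for illustration.

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0):
    """Return (precision, recall, F-beta) from true/false positive counts.

    tp: correctly extracted values
    fp: extracted values that are wrong
    fn: values that should have been extracted but were missed
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta ** 2
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Example: 8 correct extractions, 2 wrong ones, 2 missed ones.
precision, recall, f1 = f_score(tp=8, fp=2, fn=2)
```

With `beta=1.0`, precision and recall are weighted equally (the F1 score); a larger `beta` gives more weight to recall.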
Yes, all software changes are tested automatically and rolled out on a test system if there are no errors. Before each release, we perform extensive integration tests in the staging environment. After all tests have been successfully completed, we make the changes available to you. If you nevertheless find an error, please let us know at https://konfuzio.com/support/.
The SaaS variant scales automatically and offers a base capacity of 10,000 pages per hour. On-site installations offer a capacity of at least 6,000 pages per hour and are fully scalable as a cluster.
With the SaaS variant, we take over the monitoring of the application for you. With an on-site installation, monitoring is done via Grafana and a Kubernetes dashboard. You also have access to the processing status of each document via the API.
On request, we guarantee the accuracy that is measured on your documents with the status "Test".
An example is the prediction of the item numbers, unit prices and total prices for the line items of a newly received invoice. It is often assumed that only one "AI" is used. Unfortunately, document processing is much more complex, so we need to use several different AI algorithms.
In most cases, a digital document does not contain any text, but only provides a visual representation of the content. We use OCR to extract characters, words, lines, tables, and paragraphs from this image.
To give these text elements a subject context, a NER algorithm assigns labels to individual characters, words, sentences or paragraphs. This step alone opens up a wide variety of applications, from the recognition of a currency to contract analysis. Coming back to the case of the invoice, this algorithm finds the various article numbers, unit prices and total prices.
In the last step, another AI algorithm assigns the article numbers, unit prices and total prices it has found to a business concept, in this case a single line item. This is particularly challenging because the item number, unit price, and total price of a line item are not necessarily within a single line. To make this assignment particularly reliable, we have developed an AI that takes relative spatial and textual information into account to recognize a subject context. It can thus, for example, group article numbers, unit prices and total prices into separate line items and separate them from vendor- and customer-related information.
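To illustrate why position on the page helps with this grouping, the sketch below assigns already labelled entities to line items by their vertical position. It is a deliberately simple heuristic (each article number opens a new line item, and the entities that follow further down the page are attached to it), not the AI described above. The `Entity` structure, the label names and the coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    label: str  # e.g. "article_number", "unit_price", "total_price"
    text: str
    y: float    # vertical position on the page, 0 = top


def group_line_items(entities, anchor_label="article_number"):
    """Group entities into line items using their vertical order.

    Each anchor entity opens a new item; every other entity joins the most
    recently opened item. Entities above the first anchor are skipped.
    """
    # Sort top to bottom; on the same height the anchor comes first.
    ordered = sorted(entities, key=lambda e: (e.y, e.label != anchor_label))
    items = []
    for entity in ordered:
        if entity.label == anchor_label:
            items.append({})
        if items:  # skip header data such as the vendor name
            items[-1][entity.label] = entity.text
    return items


entities = [
    Entity("vendor_name", "ACME GmbH", 40),
    Entity("unit_price", "10.00", 200),
    Entity("article_number", "A-100", 200),
    Entity("total_price", "20.00", 212),  # wrapped onto the next text line
    Entity("article_number", "B-200", 240),
    Entity("unit_price", "5.00", 240),
    Entity("total_price", "5.00", 240),
]
line_items = group_line_items(entities)
```

The second text line of the first item (the total price at `y=212`) is still attached to article `A-100`, which a purely line-based rule would miss. Real layouts with multiple columns or footers below the table would need more than this heuristic.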