Document AI Evaluation Framework

In this blog post, we present an evaluation framework for Document AI that operates at three levels: the NER (Named Entity Recognition) level, the concept level and the document level. We illustrate the framework with a concrete use case, financial KIDs (Key Information Documents), which contain concepts such as issuers and assets.

You are reading an auto-translated version of the original German post.

Evaluation at NER level

When evaluating at the NER level, the focus is on how accurately individual named entities such as persons, places or organizations are recognized. For our example of financial KIDs, named entities such as issuer names, locations or product names are relevant.

• Accuracy: the share of recognized named entities that are actually correct (in information-retrieval terms, the precision). A high value means that most of the recognized entities are correct.
• False positive rate: the share of recognized entities that were incorrectly identified as correct. A low value is desirable, as false positive results should be avoided.
• False negative rate: the share of actually present entities that were not recognized. A low false negative rate means that most of the relevant entities were found.
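The NER-level metrics above can be sketched as set comparisons between predicted and gold-standard entities. This is a minimal illustration, not our production code; entity types and values are invented, and entities are assumed to be represented as (type, text) pairs.

```python
def ner_metrics(predicted, gold):
    """NER-level evaluation: compare predicted vs. gold entities for one document.

    Entities are (entity_type, text) pairs, e.g. ("ISSUER", "Example Bank AG").
    Returns (accuracy, false_positive_rate, false_negative_rate).
    """
    pred, gold = set(predicted), set(gold)
    tp = len(pred & gold)   # correctly recognized entities
    fp = len(pred - gold)   # recognized, but not actually present
    fn = len(gold - pred)   # present, but not recognized
    accuracy = tp / len(pred) if pred else 0.0
    fp_rate = fp / len(pred) if pred else 0.0
    fn_rate = fn / len(gold) if gold else 0.0
    return accuracy, fp_rate, fn_rate


# Illustrative example: one entity correct, one location wrong.
gold = {("ISSUER", "Example Bank AG"), ("LOCATION", "Frankfurt")}
pred = {("ISSUER", "Example Bank AG"), ("LOCATION", "Berlin")}
print(ner_metrics(pred, gold))  # (0.5, 0.5, 0.5)
```

Note that this sketch requires exact string matches; a real pipeline would typically also handle partial span overlaps and text normalization.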

Evaluation at concept level

At the concept level, several named entities are combined into one concept, e.g. an issuer with first name, surname, date of birth, street, house number and zip code. In our example, concepts such as issuer data or asset details are relevant.

• Accuracy: the share of recognized concepts that are correct, including all associated entities. A high value means that most of the recognized concepts are fully correct.
• False positive rate: the share of recognized concepts that were incorrectly identified as correct. A low value indicates that false positive results have been minimized.
• False negative rate: the share of actually present concepts that were not recognized. A low false negative rate means that most of the relevant concepts were found.
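A concept counts as correct only if every one of its constituent entities matches the gold standard. As a minimal sketch, assuming concepts are dictionaries of entity fields keyed by a concept id (the field names are illustrative):

```python
def concept_metrics(predicted, gold):
    """Concept-level evaluation for one document.

    predicted/gold map a concept id to a dict of entity fields,
    e.g. {"issuer": {"name": "Example Bank AG", "zip": "60311"}}.
    A concept is correct only if ALL of its fields match exactly.
    Returns (accuracy, false_positive_rate, false_negative_rate).
    """
    correct = {cid for cid, fields in predicted.items() if gold.get(cid) == fields}
    tp = len(correct)
    fp = len(predicted) - tp            # extracted concepts with any wrong field
    fn = len([cid for cid in gold if cid not in correct])  # gold concepts missed
    accuracy = tp / len(predicted) if predicted else 0.0
    fp_rate = fp / len(predicted) if predicted else 0.0
    fn_rate = fn / len(gold) if gold else 0.0
    return accuracy, fp_rate, fn_rate


# One wrong field (zip code) makes the whole issuer concept count as wrong.
gold = {"issuer": {"name": "Example Bank AG", "zip": "60311"}}
pred = {"issuer": {"name": "Example Bank AG", "zip": "60312"}}
print(concept_metrics(pred, gold))  # (0.0, 1.0, 1.0)
```

The all-or-nothing matching is the key design choice here: a single wrong zip code fails the entire concept, which makes concept-level scores strictly harder than NER-level scores.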

Evaluation at document level

At the document level, a document counts as correct only if all concepts with all of their entities were found. If this score reaches 100 %, all documents can be processed fully automatically.

• Accuracy: the share of documents that were processed correctly, i.e. where all relevant concepts and entities were recognized. A high value means that most documents were processed correctly.
• False positive rate: the share of documents that were incorrectly identified as correctly processed. A low value is desirable, as false positive results should be avoided.
• False negative rate: the share of documents that were not processed correctly because relevant concepts or entities were missed. A low false negative rate indicates that most of the relevant information was successfully captured.
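Document-level accuracy then reduces to an exact comparison of all extracted concepts against the gold standard, per document. A minimal sketch under the same assumed data shapes as above:

```python
def document_correct(pred_concepts, gold_concepts):
    """A document is correct only if every gold concept is extracted with
    all entities exactly right, and nothing spurious was added."""
    return pred_concepts == gold_concepts


def document_accuracy(documents):
    """documents: list of (predicted_concepts, gold_concepts) pairs,
    one pair per document. Returns the share of fully correct documents."""
    results = [document_correct(p, g) for p, g in documents]
    return sum(results) / len(results) if documents else 0.0


# Two illustrative documents: one fully correct, one with a wrong entity.
docs = [
    ({"issuer": {"name": "Example Bank AG"}}, {"issuer": {"name": "Example Bank AG"}}),
    ({"issuer": {"name": "Other Bank"}}, {"issuer": {"name": "Example Bank AG"}}),
]
print(document_accuracy(docs))  # 0.5
```

A document-level accuracy of 1.0 over a representative test set corresponds to the fully automatic processing mentioned above; anything below it tells you how many documents still need human review.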

By applying this evaluation framework at all three levels, we can comprehensively assess the performance of a Document AI system and verify that all relevant information is captured correctly. This is crucial for automating document handling processes while maintaining accuracy and efficiency.

        "
        "
        Charlotte Goetz Avatar

        Latest articles