OCR SDK - Automatically rotate PDF documents

Konfuzio

Konfuzio's API-based intelligent document processing (IDP) solution combines OCR and AI to read documents reliably and intelligently. Through the Software Development Kit (SDK), ready-made modules can be used to build individual solutions and adapt them to the needs of each company. One example is the automatic rotation of images and digitized documents.

Rotate PDF: Small step, big effect 

This step sounds simple, but it makes processing submitted documents much easier, especially at large volumes. Manually and repeatedly turning and flipping invoices, applications and the like costs valuable time and delays document processing. An OCR SDK can eliminate these delays by rotating documents automatically.

Correct rotation is therefore important for a smooth and effective flow of digital processes and forms the basis for many downstream tasks. Text recognition and information processing, for example, depend on correctly oriented documents. Intelligent document classification components can reduce manual input and interventions of this kind through automation while significantly increasing data quality.

Figure: A scanned document before and after automatic rotation.

By automatically correcting the orientation of PDFs, images and other documents, an OCR SDK can increase processing efficiency in companies. Besides speeding up processes, this significantly reduces administrative costs and frees employees from simple, tedious and repetitive tasks. Assume 100,000 scanned documents, 10% of which are submitted in the wrong orientation: employees then have to check and rotate about 10,000 documents, either before or after upload. That is a huge time commitment and a waste of valuable staff resources. The benefits of automatic rotation are therefore:

  • Reduction of manual effort
  • Acceleration and optimization of processes
  • Higher data quality of archive documents through archivable PDF files
  • Saving resources
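
The volume estimate above can be turned into a rough effort calculation. The following Python sketch reproduces the figures from the text (100,000 documents, 10% of them misoriented). The handling time of 5 seconds per document is purely an assumed value for illustration and does not come from Konfuzio.

```python
def misoriented_documents(total_documents: int, misoriented_share: float) -> int:
    """Number of documents that arrive in the wrong orientation."""
    return round(total_documents * misoriented_share)


def manual_rotation_hours(documents: int, seconds_per_document: float) -> float:
    """Working hours spent rotating the given number of documents by hand."""
    return documents * seconds_per_document / 3600


# Figures from the text: 100,000 scans, 10% in the wrong orientation.
to_rotate = misoriented_documents(100_000, 0.10)

# ASSUMPTION: 5 seconds of manual handling per document (illustrative only).
hours = manual_rotation_hours(to_rotate, seconds_per_document=5)

print(f"{to_rotate} documents to rotate, about {hours:.1f} hours of manual work")
```

Replacing the assumed handling time with a measured value from your own process gives a more realistic estimate.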

OCR SDK advantages 

With the AI-based Software Development Kit (SDK), this rotation can be implemented and Konfuzio's OCR solution can be adapted to a company's individual needs. Through training, the AI adapts to the documents common in each company, so text from PDFs, paper documents, images and scans can be extracted and converted into structured information even more reliably.

In addition, the OCR SDK can ensure compatibility with various platforms and operating systems. Konfuzio OCR runs on different platforms and offers almost all relevant interfaces, so it can be integrated easily even into complex processes. The advantages of the OCR SDK at a glance:

  • Mature, flexible solution purchased directly from the manufacturer
  • Solution templates for optimizing processes that suffer from media breaks
  • Independent custom development for your own customers and users
  • Fast innovation cycles thanks to extensive documentation that is updated daily

Tesseract OCR as an alternative?

Many companies use Tesseract as their OCR solution. The software is widely popular, but it has a number of weaknesses that cause errors in processes and workflows. These mainly concern the accuracy of character and text recognition and the handling of unclear input, including documents in the wrong orientation. Tesseract does not solve the problem that manual rotation costs employees enormous effort and slows down processes. More information about Tesseract's weaknesses and possible alternatives can be found on the page of the Frankfurter Allgemeine Zeitung.

Besides the OCR SDK, Konfuzio provides a REST API that lets you extend your own software with Konfuzio's data capture and high-quality text extraction. Once documents are uploaded via the Konfuzio interface, the OCR API provides intelligent text recognition for 70+ languages and digital extraction of all relevant information from the text.

OCR SDK: Rotate and save PDF

One feature of the OCR SDK is the automatic rotation and alignment of incoming documents. Whether a document is submitted as a scan or as an image, Konfuzio's software first converts it into an archivable PDF and then converts the contained information into machine-readable JSON. Before extracting the information, the software detects the orientation of the scanned document and rotates it into the correct position during upload. The text is thus correctly aligned and can be accurately understood by the AI.

How to rotate PDF pages automatically?

  1. Processing of the entire document (PDF, JPEG, JPG or TIF)

  2. Division into individual pages

  3. Each page is automatically rotated

  4. Combination of all pages into one PDF
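
The four steps above can be sketched in Python. This is a minimal illustration of the control flow only, not Konfuzio's implementation: the `Page` model, the function names and the `stub_detector` are hypothetical, and a real system would detect orientation from the page image rather than from a stored angle.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Page:
    """Hypothetical page model; `rotation` is the clockwise angle in degrees
    by which the content is currently turned away from upright."""
    number: int
    rotation: int = 0


def split_pages(document: List[Page]) -> List[Page]:
    """Step 2: divide the document into individual pages."""
    return list(document)


def correction_angle(detected_rotation: int) -> int:
    """Clockwise angle that brings a page back to upright."""
    return (360 - detected_rotation) % 360


def rotate_page(page: Page, angle: int) -> Page:
    """Step 3: apply a clockwise rotation to one page."""
    return replace(page, rotation=(page.rotation + angle) % 360)


def auto_rotate(document: List[Page], detect: Callable[[Page], int]) -> List[Page]:
    """Steps 1 to 4: take the document, split it, rotate each page, recombine."""
    pages = split_pages(document)
    upright = [rotate_page(p, correction_angle(detect(p))) for p in pages]
    return upright  # Step 4: pages recombined in their original order


# ASSUMPTION: a real detector would analyse the page image; this stand-in
# simply reads the stored angle so the sketch stays self-contained.
def stub_detector(page: Page) -> int:
    return page.rotation


scanned = [Page(1, 0), Page(2, 90), Page(3, 180), Page(4, 270)]
fixed = auto_rotate(scanned, stub_detector)
print([p.rotation for p in fixed])
```

Passing the detector in as a function keeps the pipeline independent of how orientation is actually recognized, so the stub can be swapped for a real model without changing the other steps.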

Other options: Handwriting recognition and JSON format

In addition to reliably extracting relevant information from scanned text and automatically rotating submitted documents, Konfuzio's OCR SDK offers further features. These include the recognition and processing of handwritten text and the conversion of document content from PDF into the machine-readable JSON format.
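
To illustrate why machine-readable JSON is useful downstream, the following Python sketch parses a small result and pulls out individual values. The structure and field names (`pages`, `applied_rotation`, `text`) and the sample values are invented for this example and are not Konfuzio's actual response schema.

```python
import json

# ASSUMPTION: illustrative structure only; not Konfuzio's actual response schema.
response_text = """
{
  "pages": [
    {"number": 1, "applied_rotation": 90, "text": "Invoice No. 2024-001"},
    {"number": 2, "applied_rotation": 0, "text": "Total: 119.00 EUR"}
  ]
}
"""

result = json.loads(response_text)

# Collect the pages that had to be rotated and the full extracted text.
rotated_pages = [p["number"] for p in result["pages"] if p["applied_rotation"] != 0]
full_text = "\n".join(p["text"] for p in result["pages"])

print("Rotated pages:", rotated_pages)
print(full_text)
```

Because the result is plain JSON, any application can consume it with standard tooling and without PDF-specific libraries.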

FAQ

Link the SDK to the OCR API?

It is possible to connect the SDK to the cloud-based or on-premises OCR API of the Konfuzio server. After the document is scanned, a request is processed through the API and the JSON response is returned to the application.
In some cases, it is also possible to implement OCR on the device itself. This is a custom addition, so an additional investment must be considered in this case.

What programming languages are supported for the SDK?

The publicly available SDK is designed for Python.

Is there SDK documentation?

Yes, the SDK documentation is available online alongside the server documentation. Please check our open SDK documentation for the latest version.

Is it possible to test the OCR SDK?

Since we believe that seeing is believing, you can not only view the Konfuzio SDK as a PyPI package but also test the source code on GitHub. Please contact us to learn more and get your license to test the SDK.

What is the difference between the SDK and the REST API?

The difference between the REST API and the SDK is simple. The API is a service that lets you send a document and get structured data back within seconds. The open-source SDK contains comprehensive components that can be used to process high-quality images and scans.

How does the SDK pricing work?

Access to the SDK is free of charge.
