Konfuzio's API-based IDP solution combines OCR and AI to read documents reliably and intelligently. With the Software Development Kit (SDK), ready-made modules can be used to build custom solutions tailored to each company's needs. One example is the automatic rotation of images and digitized documents.
Rotate PDF: Small step, big effect
This step sounds simple, but it makes processing submitted documents much easier, especially at large volumes. Manually rotating and flipping invoices, applications and the like costs valuable time and delays document processing. An OCR SDK can eliminate these delays by rotating documents automatically.
Correct rotation is therefore important for a smooth and effective flow of digital processes and forms the basis for many downstream tasks. For example, text recognition and information processing depend on documents being correctly oriented. Intelligent document classification components can reduce manual input and interventions like these through automation while significantly increasing data quality.
[Figure: a scanned document before and after automatic rotation]
By automatically correcting the orientation of documents (PDFs, scans, images), an OCR SDK can increase processing efficiency in companies. Besides speeding up processes, this significantly reduces administrative costs and keeps employees from being held up by simple, tedious and repetitive tasks. Assume 100,000 scanned documents, 10% of which are submitted in the wrong orientation: employees then have to check and rotate about 10,000 documents, either before or after the upload. That is a huge time commitment and a waste of valuable staff resources. The benefits of automatic rotation are therefore:
- Reduction of manual effort
- Acceleration and optimization of processes
- Higher data quality of archive documents through archivable PDF files
- Saving resources
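The example of 100,000 scanned documents can be turned into a rough calculation. The sketch below is only an illustration: the 15 seconds of manual work per document is an assumed figure, not a number from Konfuzio, and the function name is hypothetical.

```python
from typing import Tuple


def manual_rotation_effort(
    total_documents: int,
    misoriented_share: float,
    seconds_per_document: float,
) -> Tuple[int, float]:
    """Return (documents to rotate, hours of manual work)."""
    documents_to_rotate = round(total_documents * misoriented_share)
    hours = documents_to_rotate * seconds_per_document / 3600
    return documents_to_rotate, hours


# 100,000 documents and a 10% share come from the text above;
# 15 seconds per document is an assumption for illustration only.
documents, hours = manual_rotation_effort(100_000, 0.10, 15)
print(f"{documents} documents, about {hours:.1f} hours of manual rotation")
```

Changing the assumed handling time shows how quickly the manual effort grows with document volume.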
OCR SDK advantages
With the AI-based Software Development Kit (SDK), this rotation can be implemented and Konfuzio's OCR solution adapted to a company's individual needs. Through professional training of the AI, the software adapts to the documents common in each company, so that text from PDFs, paper documents, images or scans can be extracted and converted into structured information even more reliably.
In addition, the OCR SDK can ensure compatibility with various platforms and operating systems. The Konfuzio OCR works on different platforms and offers almost all relevant interfaces. In this way, Konfuzio's OCR software can be integrated easily even into complex processes. The advantages of the OCR SDK at a glance:
- Mature, flexible software purchased directly from the manufacturer
- Solution templates for optimizing processes that involve media breaks
- Independent custom developments for your customers and users
- Fast innovation cycles thanks to extensive documentation that is updated daily
Tesseract OCR as an alternative?
Many companies use Tesseract as their OCR solution. The software is widely used, but it has a number of weaknesses that produce errors in processes and workflows. These mainly concern the accuracy of character and text recognition and the handling of unclear source material, including documents in the wrong orientation. Used on its own, Tesseract does not solve the problem that manual rotation causes enormous effort for employees and slows down processes. More information about the weaknesses of Tesseract and possible alternatives can be found on the page of the Frankfurter Allgemeine Zeitung.
Besides the OCR SDK, a REST API lets you extend your own software with data capture from Konfuzio. Konfuzio also provides this API to deliver high-quality text extraction. Once documents are uploaded via the Konfuzio interface, the OCR API provides intelligent text recognition for 70+ languages and digital extraction of all relevant information from the respective text.
OCR SDK: Rotate and save PDF
One feature of the OCR SDK is the automatic rotation and alignment of incoming documents. Whether they are submitted as a scan or an image, Konfuzio's software first converts them into an archivable PDF format and then converts the contained information into machine-readable JSON. Before extracting the information, the software detects the orientation of the scanned document and rotates it into the correct position during upload. The text is thus correctly aligned and can be accurately understood by the AI.
How to rotate PDF pages automatically?
- Processing of the entire document (PDF, JPEG, JPG or TIF)
- Division into individual pages
- Each page is automatically rotated
- Combination of all pages into one PDF
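The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not Konfuzio's implementation: pages are modelled as small pixel grids instead of real PDF pages, and the orientation detector relies on a hypothetical marker pixel in the top-left corner, whereas a real system would infer orientation from the recognized text. All names are assumptions made for this example.

```python
from typing import Callable, List

# A page is modelled as rows of pixel values: a toy stand-in for a raster image.
Page = List[List[int]]

MARKER = 9  # hypothetical pixel value found in the top-left corner of an upright page


def rotate_90_clockwise(page: Page) -> Page:
    """Rotate a page by 90 degrees clockwise."""
    return [list(row) for row in zip(*page[::-1])]


def rotate_page(page: Page, degrees_clockwise: int) -> Page:
    """Rotate a page clockwise by a multiple of 90 degrees."""
    if degrees_clockwise % 90 != 0:
        raise ValueError("only multiples of 90 degrees are supported")
    for _ in range((degrees_clockwise // 90) % 4):
        page = rotate_90_clockwise(page)
    return page


def detect_rotation(page: Page) -> int:
    """Toy detector: report how far the page is rotated clockwise from upright."""
    corners = {
        0: page[0][0],      # marker top-left: page is upright
        90: page[0][-1],    # marker top-right
        180: page[-1][-1],  # marker bottom-right
        270: page[-1][0],   # marker bottom-left
    }
    for degrees, value in corners.items():
        if value == MARKER:
            return degrees
    raise ValueError("orientation marker not found")


def auto_rotate(
    pages: List[Page],
    detector: Callable[[Page], int] = detect_rotation,
) -> List[Page]:
    """Rotate every page upright and return the pages in their original order."""
    corrected = []
    for page in pages:  # the document has already been split into pages
        correction = (360 - detector(page)) % 360
        corrected.append(rotate_page(page, correction))
    return corrected  # a real pipeline would now merge these into one PDF


upright = [[9, 1, 2], [3, 4, 5]]
document = [rotate_page(upright, angle) for angle in (0, 90, 180, 270)]
fixed = auto_rotate(document)
```

Passing the detector as a parameter keeps the rotation logic independent of how orientation is actually determined, so the toy detector could be swapped for an OCR-based one.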
Other options: Handwriting recognition and JSON format
In addition to reliably extracting relevant information from scanned text and automating the rotation of submitted documents, Konfuzio's OCR SDK offers further features. These include the recognition and processing of handwritten text as well as the conversion of documents from PDF into the machine-readable JSON format.
FAQ
It is possible to connect the SDK to the cloud-based or on-premises OCR API of the Konfuzio server. After the document is scanned, a request is processed through the API and the JSON response is returned to the application.
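The request and response flow described here could look roughly like the following sketch. Everything specific in it is an assumption for illustration: the URL, the token header and the shape of the JSON response are placeholders and do not describe the actual Konfuzio API. The example uses only the Python standard library and does not send the request.

```python
import json
import urllib.request
from typing import List

# Hypothetical endpoint and token: placeholders, not real Konfuzio values.
API_URL = "https://ocr.example.com/api/documents/"
API_TOKEN = "replace-with-your-token"


def build_upload_request(
    pdf_bytes: bytes, url: str = API_URL, token: str = API_TOKEN
) -> urllib.request.Request:
    """Prepare a POST request that carries the scanned document."""
    return urllib.request.Request(
        url,
        data=pdf_bytes,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/pdf",
        },
    )


def extract_page_texts(response_body: str) -> List[str]:
    """Read the text of each page from an assumed JSON response layout."""
    payload = json.loads(response_body)
    return [page["text"] for page in payload.get("pages", [])]


# Assumed response layout, invented for this example.
SAMPLE_RESPONSE = json.dumps(
    {"pages": [{"number": 1, "text": "Invoice"}, {"number": 2, "text": "Total"}]}
)

request = build_upload_request(b"%PDF-1.4 example bytes")
# Sending would be: urllib.request.urlopen(request), omitted here on purpose.
texts = extract_page_texts(SAMPLE_RESPONSE)
```

For real integrations, the endpoint, authentication scheme and response fields should be taken from the official API documentation.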
In some cases, it is also possible to implement OCR on the device itself. This is a custom addition, so an additional investment must be considered in this case.
The publicly available SDK is suitable for Python.
Yes, the SDK documentation is available online alongside the server documentation. Please check our open SDK documentation for the latest version.
Since we believe that you can only believe what you see, you can not only view the Konfuzio SDK as a PyPI package but also test the source code on GitHub. Please contact us to learn more and get your license to test the SDK.
The difference between the REST API and the SDK is quite simple. The API is a service that allows you to send a document and get structured data back within seconds. The open source SDK contains comprehensive components that can be used to process high quality images and scans.
Access to the SDK is free of charge.