The AI-based processing of documents first requires training with suitable test data. This data is often subject to the GDPR and must therefore be deleted from the cloud afterwards. However, this would also mean losing valuable annotations that may be useful for later training runs. The Konfuzio snapshot therefore allows local, GDPR-compliant storage of all this annotated data so that it can be re-imported and reused at a later date.
The Snapshot feature is a powerful tool for managing your projects and data in Konfuzio. A snapshot allows you to save the state of a project at a certain point in time and restore it later.
In this blog post, we will take you step-by-step through the various snapshot features and show you how to use the Konfuzio snapshot effectively. For general information on snapshots, see our background article in the AI Blog category "Snapshot feature for more data security".
This article was written in German, automatically translated into other languages and editorially reviewed. We welcome feedback at the end of the article.
What is a Konfuzio snapshot?
A snapshot is a saved state of your project at a specific point in time. Similar to a backup, the Konfuzio Snapshot allows you to back up data or transfer it between projects or installations. With snapshots, you can quickly recreate the complete setup of a project without having to start from scratch.
Konfuzio Snapshot Modes
Konfuzio offers several snapshot modes that you can use individually or in combination. These include:
Snapshot record mode
The record mode (Dataset mode) allows you to save documents, labels, label sets, annotation sets and annotations. This is especially useful if you have invested a lot of time in creating annotations and categorizing documents. All properties of the contained values are carried over into the snapshot 1:1.
Create and share records
The metadata-based mapping of saved states allows you to create your own data sets and make them available in other environments as well. Compared with FUNSD, which was introduced in 2019, revised and extended data sets for form analysis are used (FUNSD+). This enables access to a significantly larger number of documents and their automated data extraction, including the processing of complex individual cases. In Dataset mode, snapshots take into account all the tabulations and annotations made in the process. This saves users from having to start from scratch every time they edit. The same applies to migration to separate installations.
Snapshot AI mode
Snapshot AI mode lets you save all your active AI models. This is ideal if you want to export the AI features of a project without including the associated documents. Again, all properties of the included values are preserved.
Snapshot bundle or combination mode
By combining different Konfuzio snapshot modes, new functions of this feature have emerged, enabling you to create more comprehensive snapshots.
This combined mode includes all values from the previous Konfuzio Snapshot modes - except those that are exclusive to the respective modes.
By creating such a combined or bundled snapshot, you practically create a save point or backup of your entire project, ignoring the values that are mutually exclusive.
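The idea of using the modes individually or in combination can be illustrated with a small Python sketch. It is only a mental model, not Konfuzio code: the names `SnapshotMode`, `CONTENTS` and `snapshot_contents` are hypothetical, and the sketch treats a combined snapshot as a plain union of both modes. The mode-exclusive values mentioned above are not modeled, because this article does not list them.

```python
from enum import Flag, auto


class SnapshotMode(Flag):
    """Hypothetical flags for the two snapshot modes described above."""
    RECORD = auto()  # documents and annotation data
    AI = auto()      # active AI models


# What each mode saves, according to the mode descriptions in this article.
CONTENTS = {
    SnapshotMode.RECORD: {
        "documents", "labels", "label sets", "annotation sets", "annotations",
    },
    SnapshotMode.AI: {"active AI models"},
}


def snapshot_contents(modes: SnapshotMode) -> set[str]:
    """Return the union of everything the selected modes save.

    Simplification: values that are exclusive to a single mode are
    not removed here, since the article does not enumerate them.
    """
    included: set[str] = set()
    for mode, items in CONTENTS.items():
        if mode in modes:
            included |= items
    return included


combined = snapshot_contents(SnapshotMode.RECORD | SnapshotMode.AI)
ai_only = snapshot_contents(SnapshotMode.AI)
```

In this model, `ai_only` contains just the AI models, which matches the use case of exporting the AI features of a project without its documents.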
AI types support in Konfuzio snapshot
The Konfuzio snapshot currently supports the following AI types:
- Splitting AI and
- Categorization AI.
Note, however, that a Categorization AI that does not appear automatically after restoring a snapshot must be recreated. To do this, you can upload a single document to the restored Snapshot project, provided no other Categorization AI is active.
Categorization AI support - cut-off date and version number
Ensuring the integration of Categorization AIs into Konfuzio Snapshot requires an understanding of the technical updates made to the Categorization AIs during the development of the Snapshot feature. This applies to both SaaS users who need to adhere to a specific deadline and self-hosted Konfuzio users who should be using a specific version.
The specified cut-off date for SaaS users is October 4, 2023, and the special version for self-hosted Konfuzio users was also released on October 4, 2023.
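If you keep your own list of Categorization AIs, a simple date comparison shows which ones fall before the cut-off. The following Python sketch assumes that the creation date of the AI is the relevant criterion and that an AI created on October 4, 2023 itself does not count as "before" the cut-off. Both points are assumptions, and the function name `predates_cutoff` is purely illustrative.

```python
from datetime import date

# Cut-off date stated in this article.
CUTOFF = date(2023, 10, 4)


def predates_cutoff(created_on: date) -> bool:
    """Return True if the given creation date lies before the cut-off.

    Assumption: the cut-off day itself is treated as "not before".
    """
    return created_on < CUTOFF


# Example with made-up creation dates:
older_ai = predates_cutoff(date(2023, 9, 15))
newer_ai = predates_cutoff(date(2023, 11, 1))
```

Categorization AIs for which the check returns `True` are the ones to review against the instructions for the cut-off date.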
Details on the actions required for Categorization AIs from before the cut-off date and for self-hosted installations running an older version can be found here.
Konfuzio snapshot creation how to - tutorial
To create a snapshot, perform the following steps. These are identical for both the web-based application and self-hosted installations.
- In Konfuzio, navigate to the Snapshot section found in the left sidebar under DATA > Snapshots.
- Click the Add Snapshot button and select one or both modes you want to use.
- After you select the mode or modes, click "Save" to start the process of creating the snapshot.
The time needed to create the Konfuzio snapshot depends on the size of your project and the selected modes. After the process is complete, the status of the snapshot is displayed as "Snapshot created".
Konfuzio snapshot recovery how to - tutorial
Restoring a snapshot is an important part of the process. Here are the steps you need to follow:
- In Konfuzio, navigate to the Snapshot section.
- From the list, select the snapshot you want to restore with the status "Snapshot created".
- Select the Snapshot Restore to New Project option from the drop-down menu.
The recovery process begins. The duration depends on the snapshot size.
There are different statuses during recovery, including
- "Queuing for Snapshot Restoration",
- "Snapshot Restoration in progress...",
- "Snapshot restored." and
- "Contact support."
Your snapshot is only fully restored when the status "Snapshot restored." is reached.
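If you want to track a restore from a script, the status sequence above maps naturally onto a small polling loop. The following Python sketch makes several assumptions: `get_status` is a hypothetical callable that you would have to supply yourself (this article does not describe an API for it), the exact spelling of the status texts may differ from what is displayed (so trailing periods are ignored), and "Contact support" is interpreted as a restore that will not finish on its own.

```python
import time
from typing import Callable


def _normalize(status: str) -> str:
    """Ignore surrounding whitespace and trailing periods in a status text."""
    return status.strip().rstrip(".")


RESTORED = _normalize("Snapshot restored.")
NEEDS_SUPPORT = _normalize("Contact support.")
PENDING = {
    _normalize("Queuing for Snapshot Restoration"),
    _normalize("Snapshot Restoration in progress..."),
}


def wait_for_restore(
    get_status: Callable[[], str],
    interval_s: float = 5.0,
    max_checks: int = 120,
) -> bool:
    """Poll until the restore finishes.

    Returns True once the status is "Snapshot restored" and False if the
    status asks you to contact support. `get_status` is hypothetical and
    must return the current status text.
    """
    for _ in range(max_checks):
        status = _normalize(get_status())
        if status == RESTORED:
            return True
        if status == NEEDS_SUPPORT:
            return False
        if status not in PENDING:
            raise ValueError(f"unexpected status: {status!r}")
        time.sleep(interval_s)
    raise TimeoutError("restore did not finish within the allotted checks")
```

The loop deliberately treats only "Snapshot restored" as success, in line with the rule that a snapshot is not fully restored before that status is reached.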
Create backup copies
It is not always immediately obvious what consequences the correction or adjustment of data will have for their further processing.
It is therefore important that such steps can be undone if necessary. This is easily done by storing the snapshot in a new project that can be called up again at any time.
This may also be necessary in critical legal cases when the authorities demand to see a certain interim status. Another use case for such copies is the training of new employees. They use the snapshots to learn how to handle the corresponding projects without any consequences in case of errors.
Restore snapshot between different installations
It is possible to restore snapshots between different self-hosted installations, provided that these installations are connected to a shared data store (e.g. S3 or Azure). This allows you to manage projects and data across different installations.
Konfuzio Snapshot Recovery Process - Overview
Step-by-step instructions for restoring a snapshot from one environment to another via the web interface can be found here.
Konfuzio Snapshot management
After recovery, it is important to know how to best manage your snapshots. Here is some useful information as well as actions to manage snapshots in Konfuzio:
It is not possible to change the contents of a snapshot after it is created. If you want to make changes, it is recommended to create a new snapshot and delete the old one.
You can delete a Snapshot and all its associated data by selecting it from the Snapshot List view and performing the Delete Selected Snapshot action. However, this should be done with care because deleted data cannot be recovered.
Downloading snapshot data is not currently available for SaaS users, but is supported for self-hosted installations. For more information, see here.
To use all Konfuzio snapshot features, users must be assigned the appropriate permissions via roles. These include:
- "can view snapshot",
- "can add snapshot",
- "can change snapshot",
- "can delete snapshot",
- "can view snapshotrestore",
- "can add snapshotrestore",
- "can change snapshotrestore" and
- "can delete snapshotrestore".
Contact our experts via the contact form for more information about Konfuzio snapshot permissions and the requirements for them.
In practice, snapshots also prove particularly useful for automated processes that involve data sharing and require regular annotation. This is the case, for example, with AI-based document management: on the one hand, certain regulations apply to the processing of business documents in terms of audit-proof archiving, e.g. the recoverability of the original. On the other hand, certain optimizations of the data are necessary to increase the degree of automation and the knowledge gained.
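The permission names listed above can be used for a quick check of whether a role covers all snapshot features. The Python sketch below takes the eight permission names from the list as plain strings. How you obtain the permissions actually granted to a user is outside its scope, and the function name `missing_permissions` is hypothetical.

```python
from typing import Iterable

# Permission names as listed in this article.
REQUIRED_PERMISSIONS = frozenset({
    "can view snapshot",
    "can add snapshot",
    "can change snapshot",
    "can delete snapshot",
    "can view snapshotrestore",
    "can add snapshotrestore",
    "can change snapshotrestore",
    "can delete snapshotrestore",
})


def missing_permissions(granted: Iterable[str]) -> list[str]:
    """Return the required permissions that are not in `granted`, sorted."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Example: a role that can only view and create snapshots.
gaps = missing_permissions({"can view snapshot", "can add snapshot"})
```

An empty result means the role has every permission from the list; otherwise `gaps` names what still needs to be assigned.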
More flexibility in the processing of documents
The AI-powered document software from Konfuzio offers you an integrated snapshot feature with different modes to ensure maximum flexibility in this respect. A look at the versatile snapshot use cases shows why users rely on the snapshot feature.
Overall, the Konfuzio Snapshot feature provides an efficient way to manage your projects and data. Use this powerful feature to optimize your work in Konfuzio.
If you need more information, visit the Konfuzio documentation or contact our team directly: