Skilja

The End of OCR: How Word-Level Understanding Changes Everything

Alexander — Thu, 30 Oct 2025 16:10:19 +0000

For more than half a century, OCR—Optical Character Recognition—has meant one thing: machines deciphering single characters. From early template-matching in the 1960s to the statistical engines of the 2000s, OCR has always approached reading as a mechanical process. It segmented text into character-shaped fragments and tried to analyze what each one meant, letter by letter. But that era is now ending. Modern systems don’t read characters at all—not individually. They read words, phrases, and even meaning. OCR has evolved into something fundamentally different, and this shift delivers a new level of accuracy, fluidity, and naturalness that previous generations could not reach.

This is what Skilja has done with Lesa, our deep-learning, transformer-based system designed to read like humans do: holistically, contextually, and intelligently.

A Brief Look Back: From Characters to Context

Traditional OCR started as an analog process (hence the O for Optical) and went through several stages:

1950s–1980s: Rigid template matching, effective only with perfect typewritten pages.
1990s: Feature extraction and early machine learning, better but still brittle.
2000s–2010s: Statistical modeling and improved analysis workflows, good enough for books, printed forms, and constrained hand print, but always character-bound.

Even at its best, classical OCR remained a guessing game. It mistook 1 for l, turned smudges into glyphs, and struggled with anything outside its narrow expectations. Especially handwriting.

The problem could not be solved by better algorithms, because the problem was the characters themselves. Humans read words, not characters, as anyone who has watched a child learn to read will confirm.

The Deep Learning Shift: From Decoding Shapes to Understanding Language

Transformers changed everything. Instead of interpreting characters, transformer-based models interpret sequences, context, and linguistic probability. They don’t see text as isolated shapes but as parts of sentences, paragraphs, and concepts.

This allows Lesa to:

recognize entire words, not just letters,
use surrounding text as context,
maintain coherence across full pages,
and adapt to different visual styles.

Reading becomes a language-understanding task, not a pixel-decoding task.

Lesa: Built for Word-Level Intelligence

Lesa is designed from the ground up to treat a document as a linguistic landscape. Using a transformer architecture trained on diverse text and real-world images, Lesa doesn’t ask, “What is this character?” It asks, “What does this word say—and how does it fit into the sentence?” This matters since now we have:

Far fewer errors: No more character-by-character error cascades.
Natural output: Clean spacing, correct punctuation, coherent text.
Font and layout robustness: Works across stylized text, messy receipts, tables, multi-column pages.

One of the most transformative outcomes of word-level understanding is that constrained handwriting (the kind found in forms, notes, medical records, delivery slips, and corporate paperwork) now works much better. Messy capital letters, box-filled handwriting, and half-printed forms become highly readable. Handwriting recognition is now within reach and can be applied routinely.

Calling this the “end of OCR” isn’t hype. It’s a technical recognition of what has changed:

Traditional OCR = character recognition
Modern OCR = language understanding

Lesa is part of a new generation of systems that read documents the way humans do—by interpreting words, not decoding symbols.

OCR as we knew it is over. Something better has replaced it. Lesa represents this new era.

Auto Classification and Bias

skiljaadmin — Tue, 22 Jul 2025 12:01:16 +0000

Personal bias and individual opinions are a big issue in standardized business processing if they influence the outcome of a process and the decisions made. Nobody wants to be subject to random changes in the outcome of a personal request – and yet it happens, because humans have a bias in how they see facts, based on their education, cultural background and even the mood they happen to be in at a certain time of the week. So in addition to different people making different decisions, you can even expect the same person to make different decisions during the week. You simply look at a task differently on Monday morning than on Friday evening. The reason is so-called priming, which happens to all of us day by day through our experience, knowledge, physical condition, context and many other small factors.

In a recent article on the meaning of words we showed how the sound of words influences our perception. There are many more linguistic associations that influence the way we think and behave, and which introduce bias. If, for example, I tell you that I am driving north across hilly terrain, would you expect the trip to be mostly uphill or downhill? In fact most people associate movement north with uphill and movement south with downhill. An interesting study by psychologists Leif D. Nelson from UC San Diego and Joseph Simmons from Yale shows that these associations can actually be measured and produce some strange biases: people think it will take longer to travel north than south, that it will cost more to ship to a northern than to a southern location, and that a moving company will charge more for northward movement than for southward movement. A similar study concluded that people assume property is more valuable when it sits in the northern part of town. Of course these opinions stem from the decision by the ancient Greeks to plot the map of the world with the northern parts above the south. But it also shows us clearly how much we are biased by our language, and of course north/south is only one of many linguistic associations we are exposed to.

Ancient mapmakers introduced the north/south association unwittingly, but lawyers do have an intention when they describe car accidents. While the defense might call a car accident “contact”, the plaintiff might say one car “smashed” the other. Elizabeth Loftus and John Palmer showed in a classic experiment that these labels really matter. They had a group of students watch the same series of traffic accidents and then asked them to estimate the speed of the cars when the accident occurred. When the scene was described in a way that the cars “contacted” one another, the students' average speed estimate was thirty-two miles an hour, whereas the estimate was forty miles an hour when they were told that the cars “smashed” one another. In another experiment, 14% of participants incorrectly remembered seeing shattered glass when told that the cars “hit” one another, whereas 32% of participants made the same error when told the cars “smashed” into one another. This shows that even a single word can change how people remember an event they witnessed only minutes earlier – making it very clear how priming can bias our decisions.

This brings us back to auto-classification. A classifier like the Skilja Content Classifier is trained with representative samples that are collected by a group of people. If applied to a set of documents it will then make the same decision over and over again. It represents – through machine learning – the average opinion about the content of a document and will repeat it without tiring, the same on Monday morning and Friday evening, at a speed of several hundred thousand pages per hour. It will make errors – based on statistics – but not more than a human. And the errors are reproducible and can be corrected if necessary. If you have ever thought about compliance, this is a good example, because compliance does not say that you cannot make errors. It says that you need a reproducible, documented procedure for how you store and treat your documents. Auto-classification can help to achieve this goal. It is a great tool for boosting productivity. But it is even more helpful for avoiding bias, irreproducible results and non-compliance.

Intelligent Document Processing and Enterprise Security

Alexander — Sun, 22 Jun 2025 11:15:06 +0000

For those of us who have long worked in the area of Intelligent Document Processing (IDP), or Capture as it was simply called before, it is very pleasant to observe that IDP, which has been around for a long time, is attracting more and more interest in CEO-level discussions and is now seen as an integral part of process optimization.

This is on the one hand due to the rise of AI technologies and the resulting understanding of what can be achieved with algorithms that mimic human understanding. AI, having arrived in the mainstream (and even dominating mainstream discussions), is now generally understood to be capable of performing cognitive tasks that humans perform. We have known and preached this for a long time, but it is a good development that our former niche is becoming standard.

On the other hand IDP is becoming more and more integrated into the main business processes. In the past Capture was almost always departmental. It was only allowed in a (badly lit) corner of an enterprise, mainly because no Capture system was able to fully integrate into enterprise IT and, most importantly, comply with all security rules established for enterprises.

This has changed for the better in the past few years. At Skilja we have invested a lot of effort in strictly following all security requirements so that Vinna (our enterprise platform) can become a part of enterprise IT. On the one hand, this means that during development we run frequent security screenings and penetration tests to ensure the utmost security of the software. For example, Vinna and Skilja software have been Veracode Verified for many years. On the other hand, we independently make sure to follow all defined industry standards very closely.

The most important aspect of being allowed to run in an enterprise is authentication and authorization. Typically an enterprise will not (or only grudgingly) allow an application to store user names or passwords outside of its internal identity management software. A platform should not have its own user management; this is a no-go for many customers. Therefore Vinna has from the beginning used roles that are then mapped to users in the enterprise user directory. In Vinna 3.0 the password still had to be entered by the user and was sent (encrypted) to the authentication backend.

Since version 3.1 our Vinna platform has used the OAuth2 protocol for the authorization of users. OAuth2 itself does not directly deal with the authentication of users and clients. Instead, the authentication backend is required to grant authorization and thus access. OAuth2 is supported by many backends, including Microsoft Azure AD and Keycloak. All communication with the backend (Resource Server) is bundled in the Skilja Authorization Server, which is used by all Vinna Platform services, Clients and Activities.

Vinna provides authentication via different methods:

Authorization Code Flow with PKCE via a web browser
User name and password authentication via the password grant flow
Client authentication via the client credentials grant flow

Authorization Code Flow with PKCE (proof key for code exchange) is the current best practice when logging into any client application because it avoids entrusting a client application with user credentials.

A user who wants to log into a web site is redirected to the login page of the Skilja Authorization Server. If the user has not yet authenticated to the Skilja Authorization Server, they enter a user name and password into the login form. After the credentials have been verified, the user is redirected back to the web site where they started, along with an authorization code. The web site then exchanges this authorization code with the Authorization Server for an access token. The authorization code part prevents user credentials from ever being entered into a potentially non-trusted client application. It also keeps access tokens out of the redirect URLs shown in the browser's address bar, thereby preventing accidental token leakage by copying and pasting the URL. The PKCE part ensures that an intercepted authorization code cannot be exchanged for a token by anyone who does not hold the matching code verifier.
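The PKCE part of this flow can be illustrated with a short Python sketch. It only shows how a client could derive the code verifier and the S256 code challenge defined in RFC 7636; the function names are hypothetical and are not taken from Vinna or the Skilja Authorization Server.

```python
import base64
import hashlib
import secrets


def new_code_verifier() -> str:
    """Create a random, URL-safe code verifier (hypothetical helper name)."""
    # 32 random bytes encode to 43 base64url characters once padding is removed,
    # which is the minimum verifier length allowed by RFC 7636.
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")


def code_challenge_s256(code_verifier: str) -> str:
    """Derive the S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = new_code_verifier()
challenge = code_challenge_s256(verifier)
# The client sends `challenge` with the authorization request and keeps
# `verifier` secret until it exchanges the authorization code for a token.
```

The authorization server can recompute the challenge from the verifier it receives in the token request, so a stolen authorization code alone is of no use.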

The client credentials grant flow listed above allows client credentials to be registered with the authorization service, along with the claims that each client ID may receive. This type of authorization is intended for machine-to-machine communication and not for typical user interactions. In this flow, a client ID and matching client secret are sent to the authorization service, which issues an access token. Once the access token expires, the client ID and client secret can be used again to obtain a new access token. Also, in some cases a refresh token is made available that can be used to acquire a new access token.
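As a rough sketch of what such a machine-to-machine request looks like on the wire, the following Python builds (but does not send) a standard OAuth2 client credentials token request. The token URL, the function names and the 30-second leeway are assumptions for illustration only; they are not the actual endpoints or API of the Skilja Authorization Server.

```python
import base64
import urllib.parse
import urllib.request

# Hypothetical token endpoint; the real URL depends on the deployment.
TOKEN_URL = "https://auth.example.invalid/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build a POST request for the OAuth2 client credentials grant."""
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    # Client authentication via HTTP Basic (simplified: assumes the ID and
    # secret contain no characters that would need extra form-encoding).
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def token_expired(obtained_at: float, expires_in: float, now: float, leeway: float = 30.0) -> bool:
    """Return True when a new token should be requested (with a safety margin)."""
    return now >= obtained_at + expires_in - leeway


request = build_token_request("my-service", "my-secret")
```

A service would send this request with `urllib.request.urlopen(request)`, cache the returned access token, and build a fresh request once `token_expired` reports that the token is about to run out.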

Overall this new architecture of authorization and authentication allows an enterprise to integrate Vinna, and all IDP activities that run within Vinna, into its environment and make them available to all users. This opens up many more options for using IDP in its processes than if they were separated in their own network.

Happy 12th birthday Skilja

Alexander — Thu, 25 Jan 2024 17:54:00 +0000

Unbelievable – but 24.01.2024 was the twelfth birthday of Skilja. Meaning that next year Skilja will become a teenager.

12 years ago we founded Skilja in the middle of the German winter with the goal to create the best possible Document Understanding or IDP (as others call it) solution. And now we look back and we are proud that we have achieved the goal – at least in our view. A lot has happened on the way. We have created:

A leading classification technology with statistical, semantic and LLM classifiers, with more than 1,000 customers worldwide
An extraction framework that lets a customer handle anything from forms and invoices to totally freeform contracts in the same environment, from one designer, with no technology disruption
Our own recognition engine – AI based – as we were not happy with existing engines
A powerful process platform, used by more than 250 enterprises worldwide on premise and in the cloud, with some of them processing 500,000 documents per day.

And of course we look back on a large number of great projects and successes in the market and a lot of happy customers. We did not achieve this alone, but only with the help of our partners, who joined our partner network one by one over the last 12 years. Many thanks to them for their commitment, their dedication and their encouragement.

Sometimes we also ask ourselves – how did we do that? In the end we are just a medium-sized company with (now) 24 developers and engineers. The answer – I believe – is FOCUS. We were always very clear about what we wanted to achieve, focused our energy on these ideas and did not allow ourselves to be distracted. This is the key to our success, together with our absolutely customer-centric approach (which might seem to conflict with focus, but in reality does not if it is handled and communicated correctly). And of course we have a tremendous team and pool of talent, for which I am very grateful. It also seems that we made some correct technological decisions along the way which pay off now – focus on services, web interfaces, machine learning approaches, ease of use, high quality of software development with the most advanced tools…

Skilja was created as an independent technology provider with a strong partner network, and our plan is to remain so in the years to come. Independence is not a value in itself, but it is the core of our ability to innovate. We do not need to look at quarterly results but can define long-term development goals, decide on them and follow them to the end. This makes us strong and successful and will continue to do so.

Thank you all for making that possible.

Alexander Goerke, CEO

How Meanings of Words Change

skiljaadmin — Fri, 02 Dec 2022 11:57:32 +0000

We all know that our language is fluid and words can change their meaning over time. Words become extinct and new words are created, but more often existing words are adapted to new circumstances. It is interesting to see how this happens over the course of years, but sometimes words change their meaning overnight.

In a study on “Statistically Significant Detection of Linguistic Change”, published last year (available online from arxiv.org), the researchers Vivek Kulkarni, Rami Al-Rfou, Bryan Perozzi and Steven Skiena used data mining to find out how the way we use words reveals the linguistic earthquakes that constantly change our language. Their findings are very interesting for anybody who works professionally with text analytics, as they reveal a lot about how semantics in our language works. Kulkarni et al. tracked these linguistic changes by mining the corpus of words stored in databases such as Google Books, movie reviews from Amazon and of course Twitter.

In pre-internet times the usage and meaning of words changed relatively slowly. This can be seen in the metamorphosis of the word “gay” from its social meaning in the fifties to the purely sexual-orientation meaning in our time. This is nicely displayed in the word cloud view below:

A faster change occurred in the 1970s to the word “mouse”, when it gained the new meaning of “computer input device” and later, the word “windows” was used internationally as the name of the Microsoft operating system within a few years.

Today the meaning of a word can change almost instantly. Before October 2012, the word “sandy” was an adjective meaning “covered in or consisting mostly of sand”. Then Hurricane “Sandy” approached. Almost overnight, this word gained an additional meaning as a proper noun for one of the costliest storms in US history.

Now this might not sound like a big deal for us – but just imagine the insurance industry and the thousands of e-mails it suddenly receives referring to damage caused by Sandy! If insurers are using a static or rule-based classification system, they might easily miss the point of these.

So this is a big challenge for automatic classification systems that have been trained by a machine learning algorithm on a specific set of documents containing words with a specific meaning. When these meanings suddenly change, or the usage of words suddenly widens, the classifier will make wrong decisions. The only way to cope with this problem in a live production system is continuous learning. The system must learn from user corrections – supervised learning – but also from good classifications that contain some new aspects and features. This is called unsupervised learning and is important because the number of documents that can be used for training the system this way is much higher than the number of manual corrections. Through unsupervised learning – or better, enhancement, which by the way also includes forgetting – the classification system will be able to cope with changed meanings over time. Abrupt changes as mentioned above will lead to a drop in the classification rate, from which the classifier will recover within a few days – like humans, who also need a short time to adapt.

On the Benefits of Page Classification

skiljaadmin — Thu, 29 Sep 2022 14:37:47 +0000

Classification deals with the categorization of objects. In our process automation and digitization world, we often think of the objects as complete documents that need to be classified. Of course, it is important to understand what the type of a document is, and automatic classification can determine exactly this. But documents in a business context are normally complex and not homogeneous. When you receive a multipage document, you will typically browse through it to see what is in it and understand what it is about. A document in an envelope or a manila folder that lands on your desk may consist of an opening letter, some notes, then the really important document, for example a court order, and maybe some attached standard forms. To understand which process to initiate and what to do with the document, you will therefore look at the pages and decide from their content what this is all about. Maybe even two or more processes originate from different pages within one document, where you might need to answer a request from one page and execute a payment from another page.

Laera Classifier – Page Classification for Claims Processing

This is exactly what page classification in document understanding is able to provide automatically. Instead of looking at the document as a whole, the algorithm classifies page by page and derives decisions from the results. This is much more granular than taking only the complete document. And it is different from automatic document separation, which physically splits the document. Of course, separation is another option based on page results, but it is error prone and risky, as the document may be ripped apart incorrectly. Often this is not necessary at all; it is sufficient to structure and digitize the document page-wise to achieve the intended process goals.

Page classification requires a solid infrastructure and understanding of physical documents. We provide this with the Laera Classification Framework that inherently understands structured documents. Going even further would be paragraph and sentence classification but this will be a topic for another article. In Laera you can simply define a page classification scheme alongside the document classification. And you can even use the page classification results to determine the document type (e.g. by majority rule or by priority rule).
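To make the majority rule and the priority rule mentioned above concrete, here is a minimal Python sketch. The function names and the page type labels are hypothetical, and this is not the Laera configuration or API; it only illustrates the two ways of deriving a document type from page results.

```python
from collections import Counter
from typing import Optional


def document_type_by_majority(page_types: list[str]) -> str:
    """Pick the page type that occurs most often in the document."""
    # On a tie, Counter.most_common keeps the type that was seen first.
    return Counter(page_types).most_common(1)[0][0]


def document_type_by_priority(page_types: list[str], priority: list[str]) -> Optional[str]:
    """Pick the first type from the priority list that occurs on any page."""
    present = set(page_types)
    for doc_type in priority:
        if doc_type in present:
            return doc_type
    return None  # none of the prioritized types was found


pages = ["cover_letter", "court_order", "standard_form", "standard_form", "standard_form"]
by_majority = document_type_by_majority(pages)
by_priority = document_type_by_priority(pages, ["court_order", "standard_form", "cover_letter"])
```

In this example the majority rule yields "standard_form", while the priority rule yields "court_order" because a single important page is allowed to decide the type of the whole document.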

An example of a real-life project that has been in production for more than a year is shown above.

In this case the customer receives thousands of car insurance claims per day. These are 10 to 50 page documents that contain many different kinds of pages, for example:

Covering letter or e-mail (“Anschreiben”)
Attorney’s letter
Expert report (“Gutachten”)
Repair cost calculation (“Kalkulation”)
Declaration of Assignment (“Abtretungserklärung”)
Photos

Laera Classifier is able to automatically determine all of these types with a rate in the high 90% range. Photo detection tags all photos and hides them from the subsequent recognition steps, as they would otherwise unnecessarily block OCR and extraction steps. The page classification results make it possible to structure and reorder the document in an optimal way for subsequent extraction of data from the different page types. Being able to define specific extraction for each page type leads to a significant increase in extraction quality and speed. It also greatly eases the task for the clerks in the subsequent process steps, as they already receive a structured document (in this case an assembled PDF) with tags and always in the same order.

In this way page classification plays an important role in streamlining the process, getting a bit closer to the way a person would look at the document and work from it.

Document Separation Revisited

Alexander — Thu, 08 Sep 2022 09:50:54 +0000

One of the frequently overlooked and really difficult problems in document automation, which is also really annoying in daily processing, is the automatic separation of a stack of documents into single meaningful documents and assignment to a document class. In traditional scanning processes this is often achieved by manual preparation of the paper and sticking a barcode as a document separator on each first page. But this is labor intensive and error prone. In addition as we are going more and more digital, even with paper based processes, normally the processing facility does not have access to the paper any more. So the goal would be to simply scan the whole stack and have it separated by an intelligent algorithm.

Fortunately this is readily available today, for example from the Skilja technology stack as a built-in feature of the Laera classifier. This does not mean it is easy. It requires quite some experience and infrastructure to manage several interdependent steps of classification and separation in a stable and reliable way. This is what Laera provides out of the box.

How does document structuring work in principle? Well, in exactly the same way (our credo!) as a human would do it. Go through the stack page by page, determine what page type it is, if it is related to the previous page or if a new topic/form starts. Then check page numbers for security if they are present. If in doubt, go back one or a few pages to check back and then make your decision to separate.

Laera Document Separation

In an AI classifier such as Laera, this is built into a sequence of algorithms. The system is trained on a sample that is already correctly separated. Laera learns for each page whether it is a first, a middle, an end or a single page. The user does not have to specify this explicitly, as the Laera AI finds it out automatically from the samples and hides this complexity from the users. The training interface just requires you to drop the single documents into the training set. It is not required to specify an exact number of pages (or a range) for each document type; Laera automatically takes into account that these can vary for each document type. However, if you know the page count you can also restrict the allowed pages, for example for forms that are always a single page.

Laera will then learn the structure and apply it to the whole document stack of unseparated single pages during runtime. Each page is analyzed. A second classifier (we could call it a “meta-classifier”) will then take these results and find the most probable separation based on the trained model. So even if a first page has not been identified as a first page there is a chance that the meta-classifier still will see it as more probable to be a first page and correctly separate. A third classifier will then determine the document type for the separated documents. As usual in Laera all this is very fast and a separation of a stack of 200 pages with 150 document types takes less than 30 seconds in total.
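The idea of a second stage that picks the most probable separation from per-page results can be sketched in Python. This is explicitly not Laera's algorithm, which is not described here; it swaps in a plain Viterbi search over the four page roles (first, middle, last, single), and the role names, scores and function names are all assumptions for illustration.

```python
import math

ROLES = ("first", "middle", "last", "single")
# A document is either one "single" page, or "first", any number of
# "middle" pages, then "last". This table lists which role may follow which.
ALLOWED_NEXT = {
    "first": ("middle", "last"),
    "middle": ("middle", "last"),
    "last": ("first", "single"),
    "single": ("first", "single"),
}
START_ROLES = ("first", "single")
END_ROLES = ("last", "single")


def _log(p):
    return math.log(p) if p > 0 else float("-inf")


def most_probable_roles(page_scores):
    """Viterbi search: best valid role sequence given per-page role probabilities."""
    if not page_scores:
        return []
    best = {r: (_log(page_scores[0].get(r, 0.0)) if r in START_ROLES else float("-inf"))
            for r in ROLES}
    back = []
    for scores in page_scores[1:]:
        step, pointers = {}, {}
        for role in ROLES:
            # Best predecessor among the roles that may be followed by `role`.
            prev = max((p for p in ROLES if role in ALLOWED_NEXT[p]), key=lambda p: best[p])
            step[role] = best[prev] + _log(scores.get(role, 0.0))
            pointers[role] = prev
        back.append(pointers)
        best = step
    role = max(END_ROLES, key=lambda r: best[r])
    path = [role]
    for pointers in reversed(back):
        role = pointers[role]
        path.append(role)
    return path[::-1]


def split_documents(roles):
    """Turn a role sequence into (first_page, last_page) index pairs."""
    documents, start = [], 0
    for index, role in enumerate(roles):
        if role in END_ROLES:
            documents.append((start, index))
            start = index + 1
    return documents


scores = [
    {"first": 0.80, "middle": 0.10, "last": 0.05, "single": 0.05},
    {"first": 0.05, "middle": 0.10, "last": 0.80, "single": 0.05},
    {"first": 0.35, "middle": 0.45, "last": 0.10, "single": 0.10},  # looks like "middle"
    {"first": 0.05, "middle": 0.10, "last": 0.80, "single": 0.05},
]
roles = most_probable_roles(scores)
documents = split_documents(roles)
```

The third page scores highest as a middle page when viewed alone, but a middle page cannot follow a last page, so the sequence search labels it as a first page and the stack is split into two two-page documents.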

The example below shows the results from separation of a mortgage application stack with 153 pages and classification into 244 document types. The horizontal lines indicate the found separators and the “New page” column shows the new numbering of pages in the separated documents.

Laera Mortgage Separation Result

The detail view of the separation result for one page nicely shows how the separation algorithm came to a decision for the first page of a URLA, supported in addition by the detected page count on the page (“Page 1 of 9”).

Laera Mortgage Separation Details

Training of this model takes about 10 minutes so it is easy to frequently test and refine it. All this can be done by the end user and does not need an AI engineer.

Quality is very important, and Laera makes sure to bias towards precision so that as few errors as possible are made, and allows the workflow to show unconfident separations to a user for decision. In a project that was done 18 months ago for a large Swiss insurance company, Laera achieved an automation rate of 87% with an error rate (false positives) of 0.14%. Still, of course, each separation result needs to be checked, and the correction results will be used by Laera online learning to improve the model.

But overall the reduction of work in separation and the increase in quality is very measurable and yields huge benefits. All this is available either on premise or as cloud service to be used through RPA or RESTful API in any backend. Let us know if you are interested and we can show you a demo. Also a setup with your own documents is easily achievable with little effort. Contact is info (at) skilja.com.

Confusion Matrix

skiljaadmin — Wed, 10 Aug 2022 11:01:43 +0000

Understanding the quality of an automatic classification system is crucial for its acceptance and any attempt to improve it over time. Quality means that we need to look at errors and at the recognition rate. In classification terms these values are called precision and recall. Precision gives the percentage of documents that have been classified correctly with respect to all documents assigned to a class by the classifier, a/(a+b); recall is the percentage of documents correctly classified into a class with respect to the total number of documents that should be in this class, a/(a+c). In a previous post (Measuring Classification Quality) we have already discussed these and how important they are. It is easy to depict them in a graphical visualization:

While these values might appear a little abstract, their advantage is that they are independent of the size of the set. But it might be more intuitive to talk about the actual number of documents that are imported into a class from other classes (set b) or exported and lost from the class (set c). It then becomes obvious that recall and precision are related and, taken over all classes, have the same value if no threshold is applied – as every document that is imported into a class must have been lost from another class. It also makes it easy to look at particular problem classes with a lot of imports (attractors) or exports (donors).
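The two definitions can be checked with a few lines of Python. The function names and the sample numbers are made up for illustration; a is the number of correctly classified documents of a class, b the number imported from other classes and c the number exported to other classes.

```python
def precision(a: int, b: int) -> float:
    """Share of documents assigned to the class that really belong to it: a / (a + b)."""
    return a / (a + b) if (a + b) else 0.0


def recall(a: int, c: int) -> float:
    """Share of documents belonging to the class that were found: a / (a + c)."""
    return a / (a + c) if (a + c) else 0.0


# Hypothetical class: 90 correct, 10 imported from other classes, 30 exported.
class_precision = precision(90, 10)
class_recall = recall(90, 30)
```

With these numbers the class has a precision of 0.9 and a recall of 0.75: most of what lands in the class is correct, but a quarter of its documents are lost to other classes.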

For a classification system these values can be depicted in a so-called confusion matrix (also known as a contingency table or an error matrix), showing all relations between classes at a glance.

Our classification designer in the Skilja Content Classification system has a built in visualization that lets you easily see the migration of documents into other classes. As an example we have used our popular Reuters news wire test set and arranged the classes in 7 hierarchical groups. If you run a 90:10 split benchmark on all 5917 documents (which fortunately only takes a few seconds because the SCC is so incredibly fast) the confusion matrix obtained for the 51 classes looks as follows:

Each column of the matrix represents the instances in a predicted class, while each row represents the instances in an actual class. Of course the user interface allows you to zoom in to look at the details.

The correctly classified documents are summed up on the diagonal, the exports are on the upper right side and the imports on the lower left side. In our case you see quite a few exports from the class “acq”, which is news on acquisitions, to “earn”, which is earnings. But this is to be expected, as these classes are close in topic and a report on an acquisition often talks about the same topics (shares, revenue, board) as one on earnings. The user can now use this display to click on the box of the 57 exported documents, open them in a list and review them to improve classification if desired. This makes it easy to drill down into the results and see exactly what can be improved. You will never achieve 100% precision, but remember that manual human classification also only achieves 95% on average, as proven in experiments.
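For readers who want to reproduce such a matrix outside the designer, here is a small Python sketch with rows as actual classes and columns as predicted classes. The function names and the tiny data set are invented for illustration and do not come from the Reuters benchmark or from SCC.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


def exports(matrix, i):
    """Documents of class i that were classified into some other class."""
    return sum(matrix[i]) - matrix[i][i]


def imports(matrix, i):
    """Documents of other classes that were classified into class i."""
    return sum(row[i] for row in matrix) - matrix[i][i]


classes = ["acq", "earn", "trade"]
actual = ["acq", "acq", "acq", "earn", "earn", "trade"]
predicted = ["acq", "earn", "earn", "earn", "earn", "trade"]
matrix = confusion_matrix(actual, predicted, classes)
total_exports = sum(exports(matrix, i) for i in range(len(classes)))
total_imports = sum(imports(matrix, i) for i in range(len(classes)))
```

Here "acq" exports two documents and "earn" imports the same two, so the totals of exports and imports over all classes are equal, as they must be when every document is assigned to exactly one class.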

When the classes are organized in a hierarchy, the confusion matrix by Skilja also allows you to collapse the nodes and look at upper levels only. In this case the values of the hidden subclasses are summed up and shown for the parent class.

The diagonal has two values now. For example, 4,320 of the finance documents have been correctly classified but 175 have been exported/imported within the finance category. Often you are only interested in the migration between the main parent classes, while errors under one parent are less problematic.

Typically an organisation can assign a cost to each export and import. The cost can be different for each pair of classes where this happens. Migrations within a set of subclasses are often not very expensive if they relate, for example, to documents that are processed in the same department anyway. On the other hand, an import into a class that leads to an automatic payment can be very expensive. This can be mitigated by assigning different thresholds to such classes, which SCC allows. The confusion matrix allows you to find out where these need to be applied. But the matrix can also be exported, and you can apply your own cost matrix to the results to determine which improvements make sense. We are currently working with a real client to create a case study that shows these numbers in a real-world example at an insurance company. When available, this study will be published here. Stay tuned!
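Applying a cost matrix to an exported confusion matrix is simple arithmetic, as this Python sketch shows. The numbers, class order and function names are invented for illustration; the export format of SCC is not assumed here, only a square matrix with actual classes as rows and predicted classes as columns.

```python
def total_cost(confusion, cost):
    """Sum of count times cost over all off-diagonal cells (misclassifications)."""
    size = len(confusion)
    return sum(confusion[i][j] * cost[i][j]
               for i in range(size) for j in range(size) if i != j)


def costliest_migrations(confusion, cost):
    """Off-diagonal cells as (cost, actual_index, predicted_index), most expensive first."""
    size = len(confusion)
    cells = [(confusion[i][j] * cost[i][j], i, j)
             for i in range(size) for j in range(size) if i != j]
    return sorted(cells, reverse=True)


# Hypothetical classes: 0 = letters, 1 = notes, 2 = payment orders.
confusion = [
    [50, 5, 1],
    [2, 40, 0],
    [0, 3, 30],
]
# An import into the payment class (column 2) is assumed to be expensive.
cost = [
    [0, 1, 20],
    [1, 0, 20],
    [5, 5, 0],
]
overall = total_cost(confusion, cost)
ranking = costliest_migrations(confusion, cost)
```

In this made-up example a single letter classified as a payment order costs more than the five letters classified as notes, which is the kind of cell where a stricter threshold pays off first.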

Vinna 3.0 Released

Alexander — Tue, 24 Aug 2021 10:17:26 +0000

We are proud to announce the release of Vinna 3.0, our open 4th generation Document Processing Platform. After 18 months of intensive development we are very happy that we can now provide even more value to our customers. We invested a lot in taking our customers’ feedback back to our engineers to create a completely new, modern UI with an improved backend that supports enterprise performance, scalability and security requirements. Process Editor and Process Monitor are completely redesigned and both are now available in English as well as German. To avoid any pain for our many existing customers, special effort has been spent on compatibility with Vinna 2.4 so all projects can be smoothly upgraded. You can manage a 2.4 runtime from the 3.0 design time to upgrade step by step without disrupting production, and the transfer of old process versions into 3.0 has also been very thoroughly tested.

New Process Editor UI

The new design makes creating processes with no coding, no scripting and no configuration file editing as easy as it should be. Vinna 3.0 comes with the new BPMN process editor, with improved speed and usability. It is now built with Angular 10, and all functions are componentized and can be integrated separately.

Plenty of new features improve the design and runtime management of processes. Cooperate better with your team members by writing comments directly on activity instances. Work together on designing the process. All activities can be configured through the UI, either through a standard dialog or through an activity’s own extended dialogs, which can even bring up their own web UI. When a process is locked, you can now immediately see by whom. Besides many graphical changes, e.g. in the view of processes, document types and variables, it is now also possible to switch all views to lists and search in all trees and lists. You can see all environments to which a process (version) has been published, and we allow deletion only when no published version exists.

Vinna 3.0 Process Designer

Process Version Management

If you have ever been in charge of managing a production system, you know how important staging and versioning are. “Never touch a running system” is a common rule, but in the end it leads to legacy problems because nothing can be updated any more. Key to any enterprise-critical production system is version management that allows full control over what is changed, combined of course with thorough testing in staging steps. Therefore, version management and staging have been a central part of the Vinna architecture from the start and have been further improved in version 3.0. You can now create major and minor versions (1.0, 1.1, 2.0, …) of processes. The latest version you edit is always marked as a draft version, so you can’t break anything! A process is always deployed as a specific, selected version. So it is easy to work on major changes to a process and already test them while at the same time creating hot fixes (patches) for existing production processes if necessary. And you can even change variables in each of your runtime environments separately for each version.

Vinna 3.0 Process Version Management

Environment Management

An environment is the runtime system to which a process is published. Many environments (on premise, private cloud, public cloud) can be managed from the same Designer. The environment executes the process by hosting and running the activities in as many Activity Servers as needed. In Vinna 3.0 we now have “transient” Activity Servers that auto-start with a VM or in a Docker container, do their work and shut down again when not needed. Together with the separation of the Activity Server configuration from the instance, you can easily assign arbitrary resources to a project to scale up dynamically in peak hours, or reduce hardware cost by using just as many servers as you need. An overview of all assigned activities and Activity Servers across an environment fulfills a long-standing request.

Process Monitor

The 3.0 runtime backend is fully backward compatible and introduces many invisible changes related to scaling, performance and security. Process Monitor, the GUI for monitoring the runtime, has also been completely redesigned. It comes with many usability enhancements, including new controls for grouping, filtering and customizing the UI for business operators. There is now a new tab for directly previewing the documents in a work item with all their data. Licenses can now be reviewed and managed in either runtime or design time through a common license view that includes status and a report on click rates.

Vinna 3.0 Process Monitor with grouped work item list.

Vinna is an open and process-oriented platform that allows users to define a process exactly the way it is best operated in their company. The design allows full flexibility in the data model, with a hierarchical document model supporting batches, folders, documents and pages. The documents are processed as work items in the flow and passed through activities. The activities are either standard tasks like OCR or Classification, or custom tasks integrated into the platform. Any number of activities can be defined in the process as micro services, including arbitrary routing decisions based on intermediate results. The architecture of Vinna is service oriented (SOA), and the runtime is easily deployed in the cloud (Microsoft Azure, AWS or private cloud), on premise, or in mixed environments where the data storage is kept in house and processing happens outside.

All communication between services and databases is transaction based, securely encrypted and uses standard REST protocols over HTTP and HTTPS. Three powerful HTML-based graphical user interfaces are provided for defining, managing and monitoring processes. Vinna is available for small projects but also incorporates all enterprise features needed for large production systems. The biggest Vinna customer now processes 100M documents p.a. in one system, which is roughly 400,000 documents per working day.

A whitepaper and data sheets are available. If you are interested in further details, please contact us at info(at)skilja.com to obtain your copy.

Process as a Service

skiljaadmin — Wed, 05 May 2021 13:57:54 +0000

Imagine that you have created a powerful process for superb document automation using all kinds of advanced recognition, image processing and AI technologies. With these technologies it is possible today to automate almost any document-driven process that involves repetitive cognitive tasks like classification, indexing and decision making. VINNA by Skilja is a powerful platform that enables and orchestrates the LAERA components by Skilja to perform all these miracles. In addition, VINNA plugs in many powerful tools from other technology companies for tasks like barcode recognition, office format conversion, email normalization, PDF/A generation, etc.

Now that this process is built and works well, the question arises of how to integrate the new capabilities into your line-of-business applications and existing processes. File import and export is insecure and outdated. Full integration requires IT effort that might not be available.

Fortunately there is a solution 🙂: by plugging in an Event Driven Activity (EDA) you can make ANY process accessible through a standard web service protocol. Simply adding the EDA to an existing process makes it available through a RESTful service call. The EDA can be the only start point of a process, but it can also be added alongside existing starters like file importers, message queues or IMAP collectors that pump documents into the same process.

Typically you add one EDA Starter and one or several EDA Reporters or Listeners. These are accessed through a simple web protocol by the EDA Consumer. The Consumer can either be a web page (as in the example below) or a Windows application like an RPA client that automates the cognitive task scheduling. Both are provided as sample source code with your installation. At any stage of the process the Consumer can optionally be updated via events on the progress of the work item. When processing is finished, the results are retrieved either through an event or from a queue that is queried via REST.
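To give a feel for the Consumer side, here is a minimal, purely illustrative sketch of the submit-then-query pattern. It is not the sample code shipped with the installation. The base URL, the path layout, the bearer-token header and the "state" field are all assumptions made for this example, and the service is replaced by a stand-in so the snippet runs offline.

```python
import time
import urllib.request

# Hypothetical base URL and path layout; the real EDA endpoints are not shown here.
BASE_URL = "https://vinna.example.com/eda"


def build_start_request(process, token, document):
    """Build (without sending) the POST that hands a document to the EDA starter."""
    return urllib.request.Request(
        url=f"{BASE_URL}/processes/{process}/workitems",
        data=document,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )


def wait_for_result(fetch_status, work_item_id, attempts=10, delay=0.0):
    """Query the result queue until the work item is reported as finished.

    fetch_status is any callable returning a dict with a (hypothetical)
    "state" key; in production it would wrap a REST call.
    """
    for _ in range(attempts):
        status = fetch_status(work_item_id)
        if status.get("state") == "finished":
            return status
        time.sleep(delay)
    raise TimeoutError(f"work item {work_item_id} not finished after {attempts} polls")


# Offline demonstration: canned replies stand in for the service.
request = build_start_request("classification", "demo-token", b"%PDF-1.7 ...")
replies = iter([
    {"state": "processing"},
    {"state": "finished", "result": {"class": "earn"}},
])
result = wait_for_result(lambda _id: next(replies), "demo-1")
print(request.get_method(), request.full_url)
print(result["result"])
```

Passing the status lookup in as a callable keeps the Consumer logic independent of the transport, so the same loop works whether results arrive from a REST-queried queue or from a test double.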

VINNA Process as a Service schematically

The process can be deployed anywhere and there is zero setup. Simply use the URL of the service and, of course, the credentials for the secure and encrypted communication with the authentication service. The process can sit anywhere in the cloud and be used by any client worldwide that has access.

Through EDA, VINNA is used completely in slave mode and the Consumer is shielded from any complexity of the process.

The process can contain any number of steps and routes, including manual correction, approvals or even sending tasks to the crowd. The Consumer will see none of this; it is simply notified of the results through a standard API, irrespective of the process. So when the process is changed and enhanced, the Consumer can stay as it is. This is the true power of process as a service: “technology under the hood”, especially when combined with online learning that will continuously improve the results.

Vinna Process as a Service for RPA with Classification Example

A nice example is shown in the graphic above. The Consumer (in this case any RPA client) chooses to use a classification service by selecting the process (through EDA) and the classification project that should be used. In this case both the classification activity and the classification Web designer access the same classification model in the database. Therefore an admin user can even modify the classification model and the taxonomy, or create a new one, because the process itself and the call to use it stay unchanged. This process also contains a manual correction step (Batch Review) that is conditionally used in case of uncertain results. The system can even send a link to a correction task as a web page, so users can make decisions on or corrections to submitted tasks themselves without having to install anything. The corrected results in turn will then be aggregated and used for online learning. If several hundred users are using this process, it will quickly optimize the results automatically without any further effort. And at all times the actual processing can happen anywhere in the world on any cloud server.

This is the power of process as a service.