Modern systems do not read characters, at least not one at a time. They read words, phrases, and even meanings. OCR has evolved into something fundamentally different, and this shift brings a new level of accuracy, fluency, and naturalness that earlier generations could not achieve.
This is exactly what Skilja has achieved with Lesa, our transformer-based deep learning system, which is designed to read the way humans do: holistically, contextually, and intelligently.
Happy 12th birthday Skilja
Unbelievable, but 24.01.2024 was the twelfth birthday of Skilja, which means that next year Skilja will become a teenager.
12 years ago we founded Skilja in the middle of the German winter with the goal of creating the best possible Document Understanding, or IDP (as others call it), solution. Now we look back and are proud to have achieved that goal, at least in our view. A lot has happened along the way.
Intelligent Document Processing and Enterprise Security
For those of us who have historically worked in the area of Intelligent Document Processing (IDP), or Capture as it was simply called before, it is a very pleasant observation that IDP, which has been around for a long time, is attracting more and more interest in general CEO discussions and is now seen as an integral part of process optimization. IDP is becoming more and more integrated into the main business processes. In the past, Capture was almost always departmental. It was only allowed in a (badly lit) corner of the enterprise, mainly because no Capture system was able to fully integrate into enterprise IT and, most importantly, comply with all the security rules established for enterprises.
Document Separation Revisited
One of the frequently overlooked and really difficult problems in document automation, and a real annoyance in daily processing, is the automatic separation of a stack of documents into single meaningful documents and the assignment of each to a document class. The goal is to simply scan the whole stack and have it separated by an intelligent algorithm. Fortunately, this is readily available today from the Skilja technology stack as a built-in feature of the Laera classifier. That does not mean it is easy. It takes considerable experience and infrastructure to manage several interdependent steps of classification and separation in a stable and reliable way. This is what Laera provides out of the box.
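To make the separation problem concrete, here is a minimal toy sketch in Python. It is not Laera's algorithm or API. It assumes a hypothetical page-level classifier has already produced, for every scanned page, a document class and a flag saying whether the page looks like the first page of a document. The names `Document`, `separate_stack` and `page_predictions` are invented for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Document:
    doc_class: str
    pages: List[int]  # zero-based page indices within the scanned stack


def separate_stack(page_predictions: List[Tuple[str, bool]]) -> List[Document]:
    """Group consecutive pages into documents.

    page_predictions holds one (doc_class, is_first_page) pair per page,
    as a hypothetical page-level classifier might produce them.
    """
    documents: List[Document] = []
    for index, (doc_class, is_first_page) in enumerate(page_predictions):
        # Start a new document at the very first page, at a predicted
        # first page, or whenever the predicted class changes.
        starts_new = (
            not documents
            or is_first_page
            or doc_class != documents[-1].doc_class
        )
        if starts_new:
            documents.append(Document(doc_class, [index]))
        else:
            documents[-1].pages.append(index)
    return documents


# Assumed predictions for a six-page stack (illustrative data only).
stack = [
    ("invoice", True),
    ("invoice", False),
    ("letter", True),
    ("invoice", True),
    ("invoice", False),
    ("invoice", False),
]
result = separate_stack(stack)
for doc in result:
    print(doc.doc_class, doc.pages)
```

In practice the per-page predictions are uncertain and depend on each other, which is where the interdependent classification and separation steps mentioned above come in. This sketch simply trusts each prediction.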
Vinna 3.0 Released
We are proud to announce the release of Vinna 3.0, our open 4th generation Document Processing Platform. We created a totally new and modern UI, with an improved backend to support enterprise performance, scalability and security requirements. Process Editor and Process Monitor have been completely redesigned with the latest web technologies. Vinna is an open and process-oriented platform that allows users to define a process exactly the way it is best operated in their company. The architecture of Vinna is service-oriented (SOA), and the runtime is easily deployed in the cloud (Microsoft Azure, AWS or private cloud), on premise, or in mixed environments where data storage is kept in house and processing happens outside.
Classification and Context
This is a situation that probably sounds familiar to you: you meet a person and you are sure that you know him or her well and that you have already seen him or her many times, but you cannot remember who it is. More precisely, you cannot put the person into a context....
Auto-Classification Technologies and RFID Smart Docs
Editor’s note: This is a guest post from Cláudio Chaves from TCG Brazil. Recent advances in auto-classification technologies, as described in this blog, have provided a substantial manual labor reduction for several companies related to physical...
What is a good classifier? (2/4)
In our small series about classification quality we have used the precision-recall graph to show the difference between a very good and a so-so classifier in a recent post that you can find here. This graphical representation is very common and easy to understand....
OCR on Historical Documents
Skilja is proud to announce that we have received a grant from the European Union supporting a research and development project to improve OCR on historical documents. The grant is provided through the Eurostars program of the European Union. This program supports...
Visual Classifiers From Random Images
Now this is an interesting experiment that leads us very close to the touch point between machine classification and human imagination. As described in previous posts, auto-classification algorithms use features that are extracted from the objects to be...
Visiting Docville – October 2014
Now already a tradition, the fifth meeting of Docville took place in Brussels this week. Docville is a networking & exchange initiative for executives from the international Information Management ecosystem (Capture, ECM, BPM, BI and BPO),...
Intelligence Everywhere: Top 10 Technology Trends 2015
Gartner has just published their outlook on the strategic technical developments that will be important for enterprises in the next three years. Well, it seems that we at Skilja are in the right place with what we are doing. We have known this for a long time and our...
What is a good classifier? (1/4)
Auto-classification is able to assign categories, and hence meaning, to documents with unprecedented speed and quality. The technology for auto-classification has been developed over the last 15 years, from the first tentative rule-based systems to elaborate...
Read Faster with Text Streaming
An interesting new approach to human document understanding is presented by the Boston-based startup company "Spritz". They believe that human understanding of text (i.e. reading) is slowed down by the movement of the eye across the text. Therefore they have developed a new...
2014 – Keep Going
Hello everybody, it is a New Year and I want to give all our great readers our best wishes for this year 2014! Our small company, Skilja, is now going into its third year already, growing and doing well. Providing excellence in software and consulting is our goal and...
Practical Semantics
As I pointed out in previous posts, the difficulty for software in really understanding the English language lies mainly in the ambiguity of words and their related meanings. The technology to resolve this difficulty is called semantic analysis. I recently...
Complex Predicates
Editor’s note: This is a guest post from Prof. Dr. Jürgen Lenerz from the University of Cologne. As we have seen (read the previous post here), predicates such as be red, be sleeping, snores, be a horse etc. may be conceived of as functions from individuals in...
Understanding the Weather
If you live in Europe you will agree with me that the weather this spring has been the worst in a long time. In fact it was freezing and snowing all through March and right into April. And it has not been much better on the East Coast of the U.S. We experienced some...
Visiting AIIM 2013
AIIM is the community that provides education, research, and best practices on information management and collaboration. This year the AIIM community met in New Orleans, the city of music and French Creole architecture. New Orleans is also called the “Big...
Why Semantics is Important for Classification
Many automatic classification systems out there today use a pure bag-of-words approach to find the relevant features that determine the meaning of a document. Few use correlation and collocation to account for the fact that words have a different meaning based...