Skilja Blog

Document Separation Revisited

One of the frequently overlooked and really difficult problems in document automation, and a real annoyance in daily processing, is the automatic separation of a stack of documents into single meaningful documents and their assignment to a document class. The goal is to simply scan the whole stack and have it separated by an intelligent algorithm. Fortunately, this is readily available today from the Skilja technology stack as a built-in feature of the Laera classifier. This does not mean it is easy. It requires considerable experience and infrastructure to manage several interdependent steps of classification and separation in a stable and reliable way. This is what Laera provides out of the box.

Confusion Matrix

Understanding the quality of an automatic classification system is crucial for its acceptance and any attempt to improve it over time. Quality means that we need to look at errors and at the recognition rate. In classification terms these values are...

Auto Classification and Bias

Personal bias and individual opinions are a big issue in standardized business processing if they happen to influence the outcome of a process and the decisions made. Nobody wants to be subject to random changes in the outcome of a personal request – and yet it...

Classification and Context

This is a situation that probably sounds familiar to you: You meet a person and you are sure that you know him or her well and that you have already seen him or her many times – but you cannot remember who it is. More precisely – you cannot put the person into a context....

What is a good classifier? (2/4)

In our small series about classification quality we have used the precision-recall graph to show the difference between a very good and a so-so classifier in a recent post that you can find here. This graphical representation is very common and easy to understand....

Classification methods

Classification tries to mimic human understanding. Several methods have been developed in the past to achieve what we as humans can do almost effortlessly. These methods can be divided into two groups. Rule-based classification: Rule-based systems are...

Taxonomy and Hierarchies

Classification can be defined as modeling real objects via a simplified mathematical representation consisting of a set of characteristic features. The goal of such a description is to collect objects that are quite similar into one group. A big benefit of the...

Measuring Classification Quality

Both for an active production system and when the classification scheme is being set up, it is very important to measure the quality of the classification. The goal is to create as few classification errors as possible (also called false positives), as these can severely...

Classification of – Chairs

If you have followed my presentations in the past, you know that document classification closely corresponds to concept creation in the human mind. The concepts represent classes of real-life objects. We are able to recognize concepts and group objects and group them...