<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Guest | Skilja</title>
	<atom:link href="https://skilja.com/de/author/guest/feed/" rel="self" type="application/rss+xml" />
	<link>https://skilja.com/de/</link>
	<description>Document Understanding – Deep Learning</description>
	<lastBuildDate>Fri, 24 Nov 2023 12:43:51 +0000</lastBuildDate>
	<language>de</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://skilja.com/wp-content/uploads/2021/06/cropped-skilja_logo_transparent_02-32x32.png</url>
	<title>Guest | Skilja</title>
	<link>https://skilja.com/de/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>IDP: Solving bot illiteracy in the digital workforce &#8211; Part 2</title>
		<link>https://skilja.com/de/idp-solving-bot-illiteracy-in-the-digital-workforce-part-2/</link>
		
		<dc:creator><![CDATA[Guest]]></dc:creator>
		<pubDate>Wed, 21 Oct 2020 10:12:00 +0000</pubDate>
				<category><![CDATA[Guest Post]]></category>
		<category><![CDATA[Fundamentals]]></category>
		<category><![CDATA[Market]]></category>
		<guid isPermaLink="false">https://skilja.com/?p=1387</guid>

					<description><![CDATA[Editor’s note: This is a guest post from Jupp Stöpetie. In this post we examine the role of Intelligent Document Processing (IDP) relative to Robotic Process Automation (RPA) and how these technologies drive Digital Transformation when combined. Part one looked at what is driving and enabling Digital Transformation. Part two is dedicated to RPA and [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p><span class="has-inline-color" style="color: #7cb955;"><strong>Editor’s note</strong>: </span>This is a guest post from Jupp Stöpetie.</p>



<p><em><em>In this post we examine the role of Intelligent Document Processing (IDP) relative to Robotic Process Automation (RPA) and how these technologies drive Digital Transformation <em>when combined</em>.</em> <em>Part one looked at what is driving and enabling Digital Transformation. Part two is dedicated to RPA and why IDP is essential.</em></em></p>



<h2 class="wp-block-heading"><strong>Robotic Process Automation </strong></h2>



<p>Over the past three or four years, RPA has become the tool of choice for many companies that want to automate repetitive tasks. RPA vendors claim that their systems are easy to set up and maintain without any need for coding. They promote the idea that RPA systems can be operated by business managers after some lightweight training: basically, every manager can create bots that replace human workers. This has led to the widespread adoption of RPA systems in businesses. There is no longer any need for complicated, lengthy and costly automation projects managed by IT departments; RPA puts automation and the use of AI in the hands of business managers. Today the workforce in companies is increasingly a combination of humans and robots. But of course things are more complicated once you look a bit deeper than the RPA marketing collateral. What happens, for example, when we need to retrieve information from a document? That is no problem for human workers. But bots have no humanlike reading skills. They can only “read” data from structured sources such as files and databases, because there it is easy to tell a bot exactly where to find the data.</p>



<h2 class="wp-block-heading"><strong>Why bots can&#8217;t read documents</strong></h2>



<p>A document can be seen as a container for content. The content is made up of static data plus explicit and hidden information describing the relationships between the data, which is what gives the document its meaning. A document also has metadata: the properties of the container itself. Content is represented in documents in a way that lets humans process it. This has an important implication: most documents were never designed to be read by bots. When data must be extracted, it doesn’t matter to human workers whether a document has a fixed structure like a form, a semi-structure like an invoice or no structure like a contract. With proper instructions, human workers will be able to find the data they are looking for, although processing time may vary significantly depending on how much structure there is. Bots, however, have no cognitive reading skills. And adding OCR and data capture technology to an RPA solution is often not enough to make bots really skilled at processing documents.</p>



<figure class="wp-block-image size-large"><img decoding="async" class="wp-image-711" src="https://skilja.com/wp-content/uploads/arlington-research-kN_kViDchA0-unsplash-1-1024x683.jpg" alt="" />
<figcaption>Photo by Arlington Research on <a href="https://unsplash.com/s/photos/business?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText" target="_blank" rel="noreferrer noopener">Unsplash</a></figcaption>
</figure>



<h2 class="wp-block-heading"><strong>Why OCR and Data Capture often do not offer the right reading skills for bots</strong></h2>



<p>The short answer: these technologies fall short because they were not developed for RPA. OCR was not designed to understand content; it is a technology for converting pixels into characters. Most OCR packages can also convert document images (scans) into text files while recreating the original layout. Data capture systems use OCR together with many other AI technologies, and were designed for extracting data from large volumes of documents with the highest possible accuracy. Neither OCR nor data capture was designed for RPA users who want to teach their bots how to read.</p>



<p>The users who usually set up data capture installations are engineers who know exactly how to use all the levers and parameters of these systems. They will often write scripts or even add custom-coded extensions in order to achieve the highest possible accuracy. The initial setup effort is high, but that makes sense: the ROI of a data capture installation doesn’t have to be fast, because these installations are almost always set up to run for many years.</p>



<p>Batch-oriented data capture systems, although very powerful when set up correctly, are therefore not the first choice of RPA users who need to add document processing capabilities to their bots. These users are looking for simple, easy, fast and flexible functionality. The volumes they need to process are small. They also have a greater need for systems that learn on the job, because the whole spectrum of variability in the documents to be processed will not be available at the start of the project. Often they also need a higher level of intelligence, because they want to automate tasks that were formerly performed by humans. And when they design new processes, RPA users want to create intelligent bots that behave just like humans. What RPA users cannot handle, and what would break the RPA paradigm, is adding reading skills to bots at the expense of a lot of investment and special technical skills such as solution design, coding and production testing.</p>



<p>Note that where RPA systems are used for processing large volumes of documents, it does make sense to use data capture systems to extract the data, because in these cases efficiency, which here means accuracy, will most likely play a significant role. Processing large volumes of documents is data capture’s sweet spot.</p>



<p>What RPA users really need when they have to process documents is something that may look a bit like OCR and data capture <strong>but is much smarter because it draws on a much broader set of AI technologies</strong>, and that is at the same time easier to use.</p>



<h2 class="wp-block-heading">IDP: a new product category for a new market</h2>



<p>The massive spread of RPA installations in companies over the past three to four years has led to increasingly high demand for easy, simple, flexible yet powerful intelligent data capture solutions. This fast-growing demand has spawned a new generation of companies that have gone down a different path, using different technologies from those the incumbents in the data capture market have relied on for more than 20 years. These new systems are all based on the idea that deep neural networks and other forms of machine learning (ML) are better and much easier ways to fulfill the needs of RPA users: all you need is a lot of samples, you train your neural networks, and off you go. What is unclear at this stage, however, is whether ML can actually deliver the accuracy that is needed when bots with humanlike reading skills execute mission-critical tasks. When deep neural networks make mistakes, you cannot simply go in and correct them; these systems are black boxes. It is noteworthy that all incumbents are updating their existing offerings by adding ML technologies, especially with the goal of becoming better at processing unstructured documents. And it does not seem a ridiculous assumption that these companies, with their many years of experience developing document processing systems, have an advantage over competitors that are fully ML-focused. It looks quite plausible that incumbents are better placed to marry old and new AI technologies, based on their solid understanding of how to build robust document processing systems.</p>



<p>All these efforts by old and new companies to develop document processing skills for RPA bots have led to a new category of products that caters to the digital transformation market rather than to the traditional capture market. The emergence of these new products prompted Everest Group, a leading management consulting and research firm, to define a new product category: <strong>Intelligent Document Processing or IDP</strong>.</p>



<p><a href="https://www.everestgrp.com/2019-12-understanding-enterprise-grade-idp-solutions-market-insights-52033.html">Everest Group defines IDP</a> as any software product or solution that captures data from documents (e.g., email, text, pdf, and scanned documents), categorizes, and extracts relevant data for further processing using AI technologies such as computer vision, OCR, Natural Language Processing (NLP), and machine/deep learning. These solutions are typically non-invasive and can be integrated with internal applications, systems, and other automation platforms.</p>



<figure class="wp-block-image size-large"><img decoding="async" class="wp-image-721" src="https://skilja.com/wp-content/uploads/markus-spiske-3Tf1J8q9bBA-unsplash-1024x683.jpg" alt="" />
<figcaption>Photo by Markus Spiske on <a href="https://unsplash.com/s/photos/business?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText" target="_blank" rel="noreferrer noopener">Unsplash</a></figcaption>
</figure>



<h2 class="wp-block-heading"><strong>About OCR and Online-learning</strong></h2>



<p>In many blogs about IDP, authors set up a contrast between OCR as the old-fashioned, unintelligent way of extracting data and modern AI-based extraction methods that are state of the art and intelligent. First of all, as I pointed out, OCR is not data extraction. OCR is one of many AI technologies that are used in data capture systems. And there are different ways to build intelligence into systems so that they can find and interpret data in documents. Machine learning may perform better on unstructured documents. When dealing with forms and semi-structured documents, however, systems that use templates, classifiers and other AI technologies including machine learning will almost always outperform systems that are based solely on machine learning. Note that these comprehensive systems, which use ML in a smart way, learn from two sources: automated feedback when users correct mistakes, and the results of successful, correct classification and extraction. From these they generate additional knowledge (expanding the space) as well as statistics on the usage of existing knowledge. See <a href="https://skilja.com/de/the-magic-of-online-learning/" target="_blank" rel="noreferrer noopener">The Magic of Online-Learning</a>.</p>



<h2 class="wp-block-heading"><strong>What is the ideal IDP solution?</strong></h2>



<p>The challenge for intelligent document processing is that there seems to be no single ideal approach. Depending on the type of documents, the volumes that have to be processed and the importance of accuracy, the best choice will be ML, more traditional AI approaches, or a blend of the two. For example, when document volumes are big, accuracy is important. When a shared service centre processes 50 million documents a year, improving accuracy from, say, 95% to 96% is significant: the number of documents that have to be corrected drops from 2.5 million to 2 million, a reduction of 500,000. Another case where accuracy is critical is straight-through processing.</p>
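
<p>The arithmetic behind the shared service centre example can be checked with a few lines of Python. This is only a minimal sketch: it assumes that accuracy is measured per document (a document is either fully correct or needs correction), and the function name <code>documents_needing_correction</code> is hypothetical and not part of any product mentioned here.</p>

```python
def documents_needing_correction(volume: int, accuracy_pct: int) -> int:
    """Return how many documents still need manual correction.

    Assumption: accuracy is a per-document rate given in whole percent.
    Integer arithmetic is used so the result is exact.
    """
    return volume * (100 - accuracy_pct) // 100


volume = 50_000_000  # documents per year, taken from the example in the text

before = documents_needing_correction(volume, 95)
after = documents_needing_correction(volume, 96)
saved = before - after

print(before, after, saved)  # prints: 2500000 2000000 500000
```

<p>One percentage point looks small, but at this volume it removes half a million manual corrections per year, which is why high-volume installations tend to optimise for accuracy first.</p>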



<p>It seems that the best option for customers is to adopt an IDP platform that lets them operate different solutions, or combinations of such solutions, while shielding users from their complexity. Vinna is such a platform; it even allows crowdsourcing (human-in-the-loop verification) to be added without users ever being aware of the complexity of what goes on under the hood.</p>



<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</p>



<div class="wp-block-image">
<figure class="alignleft size-large is-resized"><img decoding="async" class="wp-image-654" src="https://skilja.com/wp-content/uploads/0-3.jpeg" alt="Jupp Stöpetie" width="170" height="170" /></figure>
</div>



<p class="has-text-align-left">As CEO of ABBYY Europe, Jupp Stöpetie established ABBYY&#8217;s presence in the Western European markets, growing the brand and its market presence to a leadership position over more than 25 years. His experience includes founding and growing companies and managing all levels of business operations, sales, and marketing. Jupp left ABBYY in spring 2020 and now works as an independent consultant based in Munich, Germany.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>IDP: Solving bot illiteracy in the digital workforce &#8211; Part 1</title>
		<link>https://skilja.com/de/idp-solving-bot-illiteracy-in-the-digital-workforce-part-1/</link>
		
		<dc:creator><![CDATA[Guest]]></dc:creator>
		<pubDate>Wed, 15 Jul 2020 10:11:47 +0000</pubDate>
				<category><![CDATA[Guest Post]]></category>
		<category><![CDATA[Fundamentals]]></category>
		<category><![CDATA[Market]]></category>
		<category><![CDATA[Process]]></category>
		<guid isPermaLink="false">https://skilja.com/?p=1518</guid>

					<description><![CDATA[Editor’s note:&#160;This is a guest post from Jupp Stöpetie. In this post we examine the role of Intelligent Document Processing (IDP) relative to Robotic Process Automation (RPA) and how these technologies drive Digital Transformation when combined. In part one we look at what is driving and enabling&#160;Digital Transformation. Part two is then dedicated to RPA [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p><span style="color:#7cb955" class="has-inline-color"><strong>Editor’s note</strong>:&nbsp;</span>This is a guest post from Jupp Stöpetie.</p>



<p><em>In this post we examine the role of Intelligent Document Processing (IDP) relative to Robotic Process Automation (RPA) and how these technologies drive Digital Transformation <em>when combined</em>. </em> <em style="font-style: italic;">In part one we look at what is driving and enabling&nbsp;Digital Transformation. Part two is then dedicated to RPA and why IDP is essential.</em></p>



<h2 class="wp-block-heading">Digital Transformation</h2>



<p>Companies all over the world are redesigning and digitizing their businesses at an ever faster pace and they have many reasons for doing so:</p>



<ul class="wp-block-list"><li>They want to serve their customers faster and better.</li><li>They understand that automation and the use of AI technologies improve innovation, agility, scalability and cost-efficiency.</li><li>They appreciate that, in contrast to labor-intensive manual processes, digital processes<ul><li>take less time to design</li><li>need less capital investment</li><li>are faster to deploy</li><li>and are (much) less costly to run.</li></ul></li></ul>



<p>The above is commonly referred to as Digital Transformation. Citing Salesforce’s definition: “Digital Transformation (DX) is the process of using digital technologies to create new — or modify existing — business processes, culture, and customer experiences to meet changing business and market requirements.“&nbsp;</p>



<p>The main drivers of Digital Transformation are ongoing globalisation and the Fourth Industrial Revolution, which have led to dramatically increased levels of competition. DX greatly improves the agility of businesses, so they can adapt to changes in the market much faster than ever before. Changing or even completely redesigning digital processes is a lot easier and cheaper than changing manual ones, which significantly increases a business’s competitiveness. Note that even terminating digitised processes costs much less than terminating processes that involved a lot of capital investment and labor. Basically, businesses have no choice: they must become digital. Those that are slow to change find themselves in an increasingly disadvantageous position. New businesses nowadays will always start with a digital concept in mind and will avoid manual processes wherever that is possible and makes sense.&nbsp;</p>



<figure class="wp-block-image size-large"><img decoding="async" src="https://skilja.com/wp-content/uploads/mike-kononov-lFv0V3_2H6s-unsplash-1024x576.jpg" alt="" class="wp-image-707"/><figcaption><span class="has-inline-color has-black-color">Photo by&nbsp;Mike Kononov&nbsp;on&nbsp;</span><a rel="noreferrer noopener" href="https://unsplash.com/s/photos/business?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText" target="_blank"><span class="has-inline-color has-black-color">Unsplash</span></a></figcaption></figure>



<h2 class="wp-block-heading"><strong>Market Data on Digital Transformation</strong></h2>



<p>Worldwide spending on technologies and services that enable the digital transformation (DX) of business practices, products, and organizations is forecast to reach $2.3 trillion in 2023, according to a new update to the International Data Corporation (<a href="http://www.idc.com/">IDC</a>) <a href="http://www.idc.com/getdoc.jsp?containerId=IDC_P32575" target="_blank" rel="noreferrer noopener">Worldwide Semiannual Digital Transformation Spending Guide</a>. DX spending is expected to expand steadily throughout the 2019-2023 forecast period, achieving a five-year compound annual growth rate of 17.1%. &#8220;We are approaching an important milestone in DX investment with our forecast showing the DX share of total worldwide technology investment hitting 53% in 2023,&#8221; said <a href="https://www.idc.com/getdoc.jsp?containerId=PRF005039" target="_blank" rel="noreferrer noopener">Craig Simpson</a>, research manager with IDC&#8217;s <a href="https://www.idc.com/promo/customerinsights" target="_blank" rel="noreferrer noopener">Customer Insights and Analysis Group</a>.</p>



<p>In its July report, Grand View Research stated that the global Robotic Process Automation market, a segment of the overall DX market, was valued at USD 1.40 billion in 2019 and is projected to grow at a compound annual growth rate (CAGR) of 40.6% from 2020 to 2027.</p>
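
<p>To get a feel for what growth rates like these imply, compound growth can be sketched in a few lines of Python. The function name <code>project_value</code> is hypothetical, and the assumption that the 40.6% rate compounds once a year for eight years starting from the 2019 base is made here for illustration only; the results are not figures from either report.</p>

```python
def project_value(base, cagr_pct, years):
    """Compound `base` at `cagr_pct` percent per year for `years` years."""
    value = base
    for _ in range(years):
        value *= 1 + cagr_pct / 100
    return value


# RPA market: USD 1.40 billion in 2019, 40.6% CAGR (figures from the text).
# Assumption: eight compounding steps take the 2019 base to 2027.
rpa_2027 = project_value(1.40, 40.6, 8)
print(f"Illustrative RPA market size in 2027: USD {rpa_2027:.1f} billion")

# DX spending: growth multiple if a 17.1% CAGR is sustained for five years.
dx_multiplier = project_value(1.0, 17.1, 5)
print(f"Five-year growth multiple at 17.1% CAGR: {dx_multiplier:.2f}x")
```

<p>The comparison shows why the RPA segment attracts so much attention: at these rates it grows far faster than DX spending as a whole, albeit from a much smaller base.</p>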



<h2 class="wp-block-heading"><strong>Why is Digital Transformation taking place now?</strong></h2>



<p>In the last decade we have seen a tsunami of digital transformation projects, driven by companies’ accelerating desire to increase competitiveness and innovation. But of course one could argue that businesses have always had that desire. So the question is: why is all this happening now? Haven’t companies been automating for decades already? Yes, but not at the current pace.&nbsp;</p>



<p><em>Note: at this stage it is unclear how the Covid-19 pandemic will influence market dynamics. It seems however unlikely that the need for companies to digitize their businesses will slow down. On the contrary, it is much more likely that the opposite will happen.</em></p>



<p>What has made things totally different from, say, ten years ago, and has really enabled Digital Transformation to take place, is the enormous progress in both computer science and the computer industry. Compared with ten years ago we see:</p>



<ul class="wp-block-list"><li>an enormous increase in computational power</li><li>the rise of super-powerful algorithms that need incredible amounts of computational power, which is now available</li><li>vastly improved connectivity, in both speed and access points, which is improving at an accelerating pace (5G)</li><li>smartphones: in 2009, 170 million smartphones were sold; in 2020, an estimated 1.5 billion units will be sold, bringing the estimated number of people with a smartphone to 3.5 billion</li><li>huge amounts of money, disproportionate compared with the rest of the economy, that have been invested in tech companies; successful tech companies have rewarded their investors with multiples that dwarf those of any other industry</li><li>and, last but not least, a relentlessly growing appetite among consumers and businesses for more, better and faster service anywhere at any time.</li></ul>



<h2 class="wp-block-heading"><strong>Some examples of how robotic process automation and document processing drive Digital Transformation in companies</strong></h2>



<p>A large international pharmaceutical company wanted to capture all details from its purchase orders (POs) and perform lookups and validation against its ERP system. The tasks were automated using an RPA system. An intelligent document processing system was needed to read the data from the POs.</p>



<p>A large financial services company wanted to use RPA to automate its Know Your Customer (KYC) process, which involves capturing and verifying driver&#8217;s licences, passports, etc. Early in the process the company found that it also needed an intelligent document processing system.</p>



<p>A global logistics company wanted to automate its invoice processing (millions of documents). Its RPA system was found to be up to the challenge, but the company initially backed off because of the complexity of extracting data from millions of semi-structured invoices with a wide range of layouts. Only after a proof of concept showed that there was an intelligent document processing system on the market that was up to the challenge did the company proceed with the project.</p>



<p>In part 2 of this post we will discuss how the combination of RPA with IDP works and which challenges need to be considered.</p>



<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</p>



<div class="wp-block-image"><figure class="alignleft size-large is-resized"><img loading="lazy" decoding="async" src="https://skilja.com/wp-content/uploads/0-3.jpeg" alt="Jupp Stöpetie" class="wp-image-654" width="170" height="170"/></figure></div>



<p class="has-text-align-left">As CEO of ABBYY Europe, Jupp Stöpetie established ABBYY&#8217;s presence in the Western European markets, growing the brand and its market presence to a leadership position over more than 25 years. His experience includes founding and growing companies and managing all levels of business operations, sales, and marketing. Jupp left ABBYY in spring 2020 and now works as an independent consultant based in Munich, Germany.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
