Mar 02

In the competitive document industry, companies that can create and implement an effective document strategy from the beginning of a document's life cycle to the end have a distinct advantage. One key stage in the life cycle is capturing the critical information from the documents. Companies that can capture this information efficiently and cost-effectively have an edge over their competition.

Price, data quality and turnaround time are critical when processing the billions of documents that are produced each year. The sheer volume of documents companies produce poses a big challenge for them and for the document processing service providers who need to extract data from the documents while ensuring manageability and accuracy of information.


The traditional method of capturing data, manual keying, is expensive, time-consuming and prone to errors. For these reasons, corporations and document processing service providers have looked for new ways to streamline data capture in order to control costs and improve turnaround time and efficiency. One way the industry has sought to do this is through the use of automated recognition technology. Automated recognition solutions have made it possible for companies to automatically read a variety of text styles, including machine print, handprint and cursive. Advanced systems combine all three recognition capabilities to process all types of information on structured documents simultaneously, with increased speed, accuracy and cost savings.


Several options are available to companies looking to automate their data capture processes. One option is to outsource the process completely and adopt a 100 percent turnkey solution, where all information on a document is recognized and verified by a document processing service provider. Another is to adopt an on-site solution, in which the company deploys the hardware and software at its own facilities. A third is to institute a hybrid solution, where a provider receives the document images over the Internet, processes them through a recognition server and sends them back to the client’s facilities for final keying and verification.


Infrastructure, Support and Resources – Important Factors in Selecting the Right Model


To determine which option would provide the best fit, companies should consider the level of infrastructure, support staff and financial resources available to host a data capture and document processing solution. The company’s current situation and needs play a role in the level of processing that can be performed in-house. Security is another important consideration. If a company is unable to distribute critical information to an outside agency, or prefers to keep all data in its own hands at all times for security reasons, an on-site solution may be the only option.


On the flip side, some companies make a conscious decision to focus internal efforts on their core competencies and outsource processes that are not central to their business. Across a number of vertical industries, companies have begun to outsource document processing for this very reason. As a result, they have reduced the overhead and operating costs associated with manual in-house processing, while decreasing paper handling, management and storage costs.


In this case, outsourcing agencies help to facilitate the preparation, capture, warehousing, retrieval and utilization of documents. They manage the workflow process, so that the company only needs to receive the data, in the format it prefers, to use for future marketing, CRM and other business initiatives.


A third option, for companies that want to take advantage of data capture services and still maintain final keying and verification in-house, is a software/service model. This method allows companies to send images of the fields they want recognized over the Internet for data extraction. The information is processed automatically through a recognition server, and data that is not read is returned for keying and verification. Companies that incorporate this “hybrid” model benefit from the speed, convenience and efficiency of recognition services, while limiting the number of data entry operators and keeping hardware, overhead, training and management costs to a minimum.
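The split between automatically accepted data and data returned for keying can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the FieldResult structure, the confidence score and the 0.90 threshold are hypothetical and do not describe any particular recognition server.

```python
from dataclasses import dataclass


@dataclass
class FieldResult:
    """One recognized field (hypothetical structure, for illustration only)."""
    field_name: str
    text: str
    confidence: float  # assumed to range from 0.0 to 1.0


def route_fields(results, threshold=0.90):
    """Split results into accepted fields and fields returned for manual keying.

    The threshold is an assumed tuning value, not an industry standard.
    """
    accepted, needs_keying = [], []
    for result in results:
        if result.confidence >= threshold:
            accepted.append(result)
        else:
            needs_keying.append(result)
    return accepted, needs_keying


if __name__ == "__main__":
    batch = [
        FieldResult("last_name", "SMITH", 0.98),
        FieldResult("zip_code", "0211?", 0.41),
    ]
    accepted, needs_keying = route_fields(batch)
    print([r.field_name for r in accepted])
    print([r.field_name for r in needs_keying])
```

Raising the threshold sends more fields to the operators and lowers the risk of accepting a misread value; lowering it does the opposite. Where to set it depends on how costly an undetected error is for the application.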


Assessing the Technology in Light of the Company


Once a decision is made about the type of solution (software, service or something in between), it is important to consider a few other factors. Companies should carefully review performance, data quality and cost before deciding which document processing option is best. Key questions to ask include:


1) Are the technology and process proven?

2) Is the quality of data as good as or better than what I have today?

3) What kind of turnaround and throughput should I expect?

4) How much will it cost?


Understanding these factors and their implications for the overall expense of document processing is critical to every organization’s bottom line.


Increased Sophistication


While no solution can achieve 100 percent accuracy, recent improvements in technology have taken recognition of both machine print and handwriting to levels unheard of just a few years ago. The software identifies which type of text is being read and applies the correct engine to improve the speed and accuracy of the recognition process: OCR (Optical Character Recognition) for machine print, ICR (Intelligent Character Recognition) for handprint, or NHR (Natural Handwriting Recognition) for cursive and unconstrained handprint. Providers are also moving beyond pure character recognition and offering a more “holistic” approach that reads the entire word or phrase in addition to the individual characters in a field, incorporating “intelligence” into the software to improve recognition performance.


If at all possible, companies considering automation should ask the recognition provider to run a test sample of their own structured documents. Once the software has proven it will save time and money, an accurate estimate of the savings potential can be made.
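One way such an estimate might be put together is sketched below in Python. The model is deliberately simple and is an assumption, not a method taken from any vendor: it treats every field the software reads as a field that no longer has to be keyed, and it ignores verification effort. All parameter names and the figures in the usage line are hypothetical.

```python
def estimated_annual_savings(fields_per_year, cost_per_keyed_field,
                             read_rate, automation_cost_per_year):
    """Estimate yearly savings from automated recognition.

    read_rate is the fraction of fields (0.0 to 1.0) read automatically in
    the test sample; the remaining fields are assumed to be keyed manually.
    """
    if not 0.0 <= read_rate <= 1.0:
        raise ValueError("read_rate must be between 0.0 and 1.0")
    manual_cost = fields_per_year * cost_per_keyed_field
    residual_keying_cost = fields_per_year * (1.0 - read_rate) * cost_per_keyed_field
    return manual_cost - residual_keying_cost - automation_cost_per_year


if __name__ == "__main__":
    # Hypothetical figures: one million fields a year, 2 cents per keyed
    # field, a 75 percent read rate and 5,000 a year for the solution.
    print(estimated_annual_savings(1_000_000, 0.02, 0.75, 5_000))
```

The read rate is the number that a test run on the company's own documents supplies. Without it, the estimate rests on a guess.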


Improving the Quality of Data


Understanding the accuracy rates and performance of current data capture processes, whether manual entry or automated, is critical to gauging the effectiveness of new document processing solutions. Because of the complex system of checks and balances used throughout the process, automated recognition often results in higher accuracy levels than manual data entry.


Advanced features of recognition technology, such as the use of context, can also help to improve accuracy. Context information provides a range of probable meanings that can be applied to a field in order to compensate for the ambiguity of handwritten or printed text. This range can include numbers, dates, values, or field types. Similarly, recognition software can perform database cross-validation to enhance accuracy and read rates. Common uses include matching ZIP codes with appropriate mailing addresses or verifying the numeric amount (e.g., $108.35) against the written amount (e.g., One hundred eight and 35/100) on a check.
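The check example lends itself to a small Python sketch of cross-validation. It is a minimal illustration, assuming both amounts have already been recognized as text and that the written amount follows the "words and NN/100" pattern. The function names are hypothetical, and the word parser covers only plain English number words up to the millions.

```python
import re
from decimal import Decimal

UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11,
    "twelve": 12, "thirteen": 13, "fourteen": 14, "fifteen": 15,
    "sixteen": 16, "seventeen": 17, "eighteen": 18, "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1_000, "million": 1_000_000}


def words_to_int(words):
    """Convert English number words such as 'one hundred eight' to an int."""
    total, current = 0, 0
    for token in re.split(r"[\s-]+", words.lower().strip()):
        if not token or token == "and":
            continue
        if token in UNITS:
            current += UNITS[token]
        elif token in TENS:
            current += TENS[token]
        elif token == "hundred":
            current *= 100
        elif token in SCALES:
            total += current * SCALES[token]
            current = 0
        else:
            raise ValueError(f"unrecognized number word: {token!r}")
    return total + current


def written_amount(text):
    """Parse a written amount such as 'One hundred eight and 35/100'."""
    match = re.fullmatch(r"\s*(.*?)\s+and\s+(\d{1,2})/100\s*", text,
                         flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unexpected written amount: {text!r}")
    return Decimal(words_to_int(match.group(1))) + Decimal(int(match.group(2))) / 100


def numeric_amount(text):
    """Parse a numeric amount such as '$108.35'."""
    return Decimal(text.replace("$", "").replace(",", "").strip())


def amounts_agree(numeric_text, written_text):
    """Return True when the two recognized amounts describe the same value."""
    return numeric_amount(numeric_text) == written_amount(written_text)


if __name__ == "__main__":
    print(amounts_agree("$108.35", "One hundred eight and 35/100"))
```

When the two readings disagree, or one of them cannot be parsed, a system built along these lines would send the check to an operator rather than trust either value.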


If companies have specific information they want to verify against, such as a social security number or employee code, recognition software can incorporate a custom database against which all answers are compared. This makes the recognition process faster and more reliable, because the vocabulary is smaller and more specific to the application.
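The effect of a small, application-specific vocabulary can be sketched in Python with the standard library's difflib module. This is an illustration only: commercial recognition engines apply the vocabulary inside the recognition step itself, whereas the sketch corrects a raw reading afterwards. The employee codes and the 0.8 similarity cutoff are hypothetical.

```python
from difflib import get_close_matches


def match_to_vocabulary(reading, vocabulary, cutoff=0.8):
    """Return the vocabulary entry closest to a raw reading, or None.

    cutoff is an assumed similarity threshold between 0.0 and 1.0; readings
    that resemble no known entry closely enough are left unresolved.
    """
    matches = get_close_matches(reading, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    # Hypothetical custom database of valid employee codes.
    employee_codes = ["EMP-1042", "EMP-2077", "EMP-3310"]
    # The letter O was read where the digit 0 was written.
    print(match_to_vocabulary("EMP-1O42", employee_codes))
    print(match_to_vocabulary("XYZ", employee_codes))
```

A reading that matches nothing in the database is a useful signal in its own right, since it marks the field as one that needs a human to look at it.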


Performance and Scalability


Beyond accuracy and data quality, it is important to evaluate the volume of documents a system or outsourcing service can process, and the expected turnaround time. Document processing service providers should be able to manage current volumes and offer scalability to grow with your business. Most agencies have a standard turnaround time, which will help eliminate backlogs of documents, such as claims, and keep processing on track.


Understanding and modifying work processes to include data capture and document processing is essential to help organizations focus on their business and core competencies. By eliminating backlogs and time spent on the details of manual processing, these organizations can eliminate many of the headaches through faster turnaround times and lower, controlled operating costs.


By considering the existing infrastructure as well as the future goals of the business, managers can determine the best solution for their organization. When reviewing specific products or service vendors, the performance, data quality, throughput and turnaround, and cost of each solution should be assessed. Quality document processing vendors are willing to go the extra mile to prove how their products and services surpass the competition and how this can lead to greater returns. These providers understand that ultimately, it should be a win-win situation for everyone.

Article Source: http://ezinearticles.com/

Nov 02


Data mining is the process of using algorithms, software and tools to retrieve, collect, analyze and report information from a huge pool of data. When the results are used to forecast future outcomes, this is known as predictive analysis. Data mining is extremely useful these days, when information is abundantly available. The information obtained by data mining is used in decision-making for direct marketing, e-commerce, customer relationship management, healthcare, the oil and gas industry, scientific tests, genetics, telecommunications, financial services and utilities.

Web data mining is the automated extraction of data from the World Wide Web. The internet holds extensive data on almost any subject, which can be used effectively for making intelligent decisions. However, retrieving and sifting through such huge volumes of data is an arduous task. Hence, there are data mining tools that make this easier. The tools can pick out relevant data and interpret it as required.

There are several kinds of Web data mining: standard mining, data verification and custom mining. Web data mining products can perform a very wide range of functions, including search engine optimization and website promotion, multiple transformations and modular marketing indicators for CRM, web log reporting, tracking visitor patterns on websites, calculating visitor conversion ratios, reporting online customer behavior, analyzing click-throughs, providing real-time log analysis of campaign tracking, click paths, geographic pinpointing, keywords by search engine, web visitor analysis reports, content analysis, and extraction of web events such as campaign results and web traffic.
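To make one of these functions concrete, the following Python sketch computes a visitor conversion ratio from page-view records. It is a simplified illustration and does not reflect how any product listed here works: the record layout (a visitor identifier paired with a requested path) and the choice of an order-confirmation page as the conversion goal are assumptions.

```python
def conversion_ratio(page_views, goal_path):
    """Return the share of unique visitors who reached goal_path.

    page_views is an iterable of (visitor_id, path) pairs, an assumed
    simplification of what a web log analyzer extracts from server logs.
    """
    visitors = set()
    converted = set()
    for visitor_id, path in page_views:
        visitors.add(visitor_id)
        if path == goal_path:
            converted.add(visitor_id)
    if not visitors:
        return 0.0
    return len(converted) / len(visitors)


if __name__ == "__main__":
    # Hypothetical page views from four visitors, one of whom completes an order.
    views = [
        ("v1", "/home"), ("v1", "/product"), ("v1", "/order-confirmed"),
        ("v2", "/home"),
        ("v3", "/product"), ("v3", "/product"),
        ("v4", "/home"), ("v4", "/cart"),
    ]
    print(conversion_ratio(views, "/order-confirmed"))
```

Counting unique visitors rather than raw page views keeps a single visitor who reloads a page from distorting the ratio.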

Several Web data mining and web usage mining software applications are commercially available. Some of them are: AlterWind Log Analyzer Professional, Amadea Web Mining, ANGOSS KnowledgeWebMiner, Azure Web Log analyzer, Blue Martini Customer Interaction System’s Micro Marketing module, ClickTracks, ConversionTrack from Antssoft, Datanautics (formerly Accrue), eNuggets (real-time middleware), LiveStats from DeepMetrix, Megaputer WebAnalyst, MicroStrategy Web Traffic Analysis Module, NetGenesis Web Analytics, NetTracker family, Nihuo Web Log Analyzer, prudsys ECOMMINER, SAS Webhound, SPSS Web Mining for Clementine, WebLog Expert 2.0 for Windows, WebTrends (a suite for data mining of web traffic information), XAffinity(TM), XML Miner and 123LogAnalyzer. There are also free web data mining tools, such as AlterWind Log Analyzer Lite, Analog (from Dr. Stephen Turner), Visitator and WUM (Web Utilization Miner).

Data Mining provides detailed information on Data Mining, Data Mining Tutorials, Business Intelligence Data Mining, Web Data Mining and more. Data Mining is affiliated with Offshore Data Entry.
Article Source: http://EzineArticles.com/

Jun 25

I was interested to read about Angela’s experience trying to secure a briefing from Oracle on its collaboration-related offerings and activities. As Angela pointed out, the ‘Big O’ was the only large vendor that ‘should’ have a story in this space that declined to tell her what it was up to.

When I later commented on this (with a link to the above) via Twitter, someone else came back to me to say that they too had been having trouble getting Oracle to open up in this area.

I have to say that this doesn’t surprise me. It must be quite challenging for Oracle at the moment trying to figure out how to position in this space. The Oracle Collaboration Suite was launched a few years ago supposedly to save the world from flaky Microsoft Exchange installations and pretty much fell flat. Oracle believed its own rhetoric about the world hating Microsoft, so looked silly to most people when it aggressively launched an initiative that would only work if customers ditched their existing Microsoft messaging infrastructure, which was never going to happen.

In addition to some of the things Angela mentioned, we have also seen the portal wars in which Oracle has consistently been on the back foot, and lately, the march of Microsoft SharePoint and a range of collaboration and unified communications offerings from IBM under the Lotus and WebSphere brands that are largely messaging system agnostic.

Then, most recently, we have seen the BEA collaboration offerings thrown into the mix, which, before the acquisition, were beginning to look pretty good. BEA had a very sound grasp of the heterogeneous world in which customers live and was taking a very mature view of social media in the enterprise, for example. And, of course, it wasn’t encumbered by competitive obsession, which, as an aside, is arguably one of the biggest obstacles to Oracle being accepted as a truly strategic partner in many major accounts. Telling CIOs and business executives that they have been stupid over the years to waste their money on SAP, Microsoft and IBM, for example, is not the best way to win friends in high places. While competition is good, destructive messaging generally only appeals to junior-level activists. It is a huge turn-off in senior management circles.

Coming back to the original question, we should probably continue to expect Oracle to be tight-lipped on not just collaboration, but middleware strategy in general for a little while yet. I have personally been told on a couple of occasions to refer to the ‘official line on oracle.com’ when looking for clarity on open questions that we hear from Oracle’s customers (old or newly acquired). Irritating though this might be, and frustrating though it is to be fobbed off with ‘Mom and Apple Pie’ type feel-good policy statements, the truth is that there is little else Oracle can do until it gets its act together properly.

And to be fair, given some of the confusion that came about as a result of articulating nice-sounding stories around work-in-progress plans associated with its CRM and ERP acquisitions in the past (that later had to be ‘adjusted’), it is probably better for us to hang on until Oracle really has worked out what it is trying to do in collaboration, as it has in the enterprise application space.

Oracle is undoubtedly already aware that it needs to be careful that the collaboration and closely related unified communications markets do not slip away from it, and will be doing what it can to make sure it doesn’t get left behind again. In the meantime, it goes without saying that customers should challenge the company hard before making major commitments to it in these areas.