Tried & True Innovation

Mortgage lenders are looking for an edge. How do they get it? They can start by replacing a paper-driven mortgage process with an automated one. This is where industry specialists like Paradatec can help. For over two decades, Paradatec has focused on delivering the most efficient, accurate, and flexible freeform document classification and data extraction solution available anywhere. Specifically, Paradatec’s advanced OCR solutions offer significant efficiencies for classifying large quantities of differing document types and extracting key data elements from those documents. In the mortgage market, these out-of-the-box capabilities allow for the quick and accurate identification of nearly 500 unique documents in the typical mortgage file, along with the capture of over 6,000 data elements from those documents. Our editor talked to Mark Tinkham, the company’s Director of Business Alliances; Paul Fischer, the company’s Director of Professional Services; and Neil Fraser, the company’s Director of US Operations, about how lenders can use technology to improve the mortgage process. Here’s what they said:

Q: So, what specifically does Paradatec do that would be compelling to a mortgage servicing or lending operation?

MARK TINKHAM: Paradatec streamlines and monitors processes that otherwise require significant human labor. We minimize the need to manage large, costly staffs of trained loan file indexers and data entry operators. At the same time, we provide statistical feedback and measurement of accuracy and automation. We provide these efficiencies so that our clients are able to better focus on their customers, manage workload peaks and valleys more easily, and measure results over time.

A good basic example is our ability to automatically identify all the documents in a 500 to 1,000 page loan package, and capture every one of the hundreds of fields on every version of every TRID document (Loan Estimate and Closing Disclosure), every one of the dozens of fields on a Loan Application, Appraisal, Transmittal Summary, Note, Deed of Trust, 4506-T, Income Tax statement and whatever else a client may require.

Q: How does OCR (Optical Character Recognition) technology provide value in today’s Mortgage Industry?

PAUL FISCHER: There are vast differences between some of the lower-cost OCR technologies and the advanced OCR offered by Paradatec. The advantage of using our technology is a dramatically faster, more accurate, and less costly process for indexing and capturing data from mortgage loan documents.

The short answer to your question is: we provide our clients with an ability to do in seconds what many operations, using 100% human labor, take hours to do. And, at the same time, we provide results which are more accurate.

Our unique approach to OCR allows us to extend these broad benefits to originators’ and correspondent lenders’ indexing and data ingestion validation, and servicers’ loan onboarding processes.

Our capabilities, out of the box, today include rules to identify approximately 500 mortgage loan document types and extract more than 6,000 fields from those documents.

In addition, we have helped our clients with automating their compliance processes with HMDA loan audits, UCD creation and TRID capture solutions.

Q: Has the industry fully embraced your automation technology?

NEIL FRASER: We think lenders do understand the need for automation, but many may not be aware of the significant and unique competitive advantages our clients continue to realize.

Lately we have been spending more time sharing our many success stories and getting the word out that we can provide powerful efficiencies related to loan automation. These advantages range from compressing the time it takes to process borrower-provided documents to expediting the loan onboarding process and making compliance audits significantly more automated.

We offer an ability to dramatically reduce the manual efforts related to indexing loan documents, and capturing key data from those document images. Our sub-second per image processing speed is unique and it allows us to take an approach which others are unable to match due to their OCR performance. This speed and ability to scale our processes to tens of millions of images per day on a small hardware footprint are waking up the industry to the possibilities of how their operations will benefit.

So, we are seeing more and more lenders embracing our technology. And because we continue to add enhancements and find new ways to provide value with our technology, we believe our current and future clients will continue to find new and exciting ways to use our solutions.

Q: How is Paradatec’s OCR technology different than others?

MARK TINKHAM: Our extreme focus on OCR technology began more than twenty-five years ago, and since 2007 we have been applying our unique, sub-second per page, small hardware footprint OCR technology to the mortgage industry. With every implementation, we have continued to build more and more out-of-the-box capabilities specific to processing mortgage loan documents. Over the years we have seen various fads and splashy marketing campaigns touting various OCR technologies and approaches, which in reality were not effective.

Recently we’ve seen an increase in the hype around alternative automation strategies. One approach, which isn’t new and which we have seen in years past, is called visual classification: the image ‘fingerprint’ of a page is used for identification rather than the text itself. This approach is fast, and it is being used in an attempt to match our sub-second per page processing speed.

For documents that are graphically focused with minimal text, this may work fine, but mortgage files are loaded with text, and in many cases that text will be key to correctly identifying the document type. For example, many Deed and Rider signature pages can look similar, because the content often pushes the signature block onto its own page. Our clients want the delineation between these documents, and even between the various Riders, but at a ‘fingerprint’ level these pages can look quite similar, leading to many indexing errors. It’s only when the footer text is discovered and read as “PUD Rider,” “MERS Rider,” or “Deed of Trust” that the correct automated decision can be made, which our solution completes with sub-second speed.
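To make the footer-text point concrete, here is a minimal Python sketch of text-based classification for signature pages that look alike. It is an illustration only, not Paradatec’s implementation: the rule list, the `classify_by_footer` function, and the sample page text are all assumptions, and it presumes the page has already been run through OCR.

```python
import re

# Hypothetical footer phrases mapped to document types. A production rule
# set would be far larger; these three come from the examples in the text.
FOOTER_RULES = [
    (re.compile(r"\bPUD\s+Rider\b", re.IGNORECASE), "PUD Rider"),
    (re.compile(r"\bMERS\s+Rider\b", re.IGNORECASE), "MERS Rider"),
    (re.compile(r"\bDeed\s+of\s+Trust\b", re.IGNORECASE), "Deed of Trust"),
]


def classify_by_footer(page_text: str, footer_lines: int = 3) -> str:
    """Classify a page from the last few non-empty lines of its OCR text."""
    lines = [ln.strip() for ln in page_text.splitlines() if ln.strip()]
    footer = " ".join(lines[-footer_lines:])
    for pattern, doc_type in FOOTER_RULES:
        if pattern.search(footer):
            return doc_type
    return "Unknown"


# Two signature pages with the same layout but different footers (made-up text).
pud_page = "Borrower signature\n______________\nMULTISTATE PUD RIDER - Single Family\nPage 3 of 3"
deed_page = "Borrower signature\n______________\nDEED OF TRUST - Single Family\nPage 15 of 15"

print(classify_by_footer(pud_page))
print(classify_by_footer(deed_page))
```

Both sample pages share the same visual layout, so only the footer text separates them, which is the situation described above.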

Q: How do you ensure quality control and data accuracy?

PAUL FISCHER: We implement database validation of captured data, and reasonability rules for indexing and data capture. In addition, we provide a process for statistically random reviews and measurements of loan indexing and captured data along with an ability to track user efficiency over time. With the Paradatec Statistics database, our clients are able to generate an unlimited number of useful reports which track processing time, by loan, by user, by time period, even down to the document type and extracted data field level.
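As a rough illustration of what a statistically random review can look like, the Python sketch below draws a random sample of captured fields for human review and estimates accuracy with a margin of error. The function names and the normal-approximation interval are assumptions chosen for the example; the interview does not describe how the Paradatec Statistics database computes its measurements.

```python
import math
import random


def sample_for_review(field_ids, sample_size, seed=None):
    """Draw a simple random sample of captured fields for human review."""
    rng = random.Random(seed)
    pool = list(field_ids)
    return rng.sample(pool, min(sample_size, len(pool)))


def accuracy_estimate(correct, reviewed, z=1.96):
    """Return observed accuracy and a normal-approximation margin of error.

    z=1.96 corresponds to roughly a 95% confidence level.
    """
    p = correct / reviewed
    margin = z * math.sqrt(p * (1 - p) / reviewed)
    return p, margin


# Example: review 400 of 50,000 captured fields; reviewers confirm 392.
to_review = sample_for_review(range(50_000), 400, seed=7)
accuracy, margin = accuracy_estimate(correct=392, reviewed=len(to_review))
print(f"accuracy {accuracy:.1%} +/- {margin:.1%}")
```

Tracking the same estimate by document type, field, or time period would give the kind of trend reporting described above.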

In addition, we provide an ability to create a quality review and learning process from production output with our analytical tools. This process is performed as part of the testing and implementation stages, and provides deep insights into the accuracy and automation levels which have been achieved.

As part of an ongoing quality measurement and learning adjustment stage, our clients can be confident that their processes are continuing to perform at the highest levels of quality.

Q: Your company has released an Application Programming Interface (API). In layman’s terms, what does this do?

PARADATEC: We provide a Web Services API which allows end users to submit loan documents and data for validation to our workflow processes from virtually anywhere.

One example use case is our OnDemandOCR process, which uses our API to let lenders submit final Closing Disclosures remotely and receive a MISMO-formatted, GSE-compliant Uniform Closing Dataset (UCD) back as output for review and, ultimately, submission to the GSEs when loans are presented.

Another use case for our API will allow borrowers to submit documents as part of a loan origination. Our OnDemandOCR process will then identify the document or documents submitted, and automatically extract the key data fields from them.

Q: What are some other manual processes that you have automated within your clients’ operations?

NEIL FRASER: Since our focus on the mortgage industry began, we have continued to find new, and often dramatic, ways to enhance our clients’ processes.

A little over a year ago, we were asked to re-index approximately two million loans due to some compliance pressure our client was getting to make sure their loan portfolio accurately accounted for the necessary source documents. We were able to assist by processing over 1.2 billion document images in a matter of weeks. In other cases we have been asked to help meet new compliance obligations by significantly streamlining what would otherwise have been extremely costly, labor-intensive efforts.

Our new HMDA Audit capability enables our clients to quickly validate the data on their Loan Application Register (LAR) against the data found on the associated loan source documents. Each loan is processed at less than one second per page, and the data from each of the final source documents is compared to the values on the LAR. This process allows our clients to ensure compliance with Regulation C, which implements HMDA, before submission to the Federal Financial Institutions Examination Council (FFIEC).
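The comparison step in an audit like this can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the product’s logic: the field names (`loan_amount`, `property_state`, `interest_rate`) are hypothetical, and the normalization rules are one plausible way to keep formatting differences from being reported as errors.

```python
from decimal import Decimal, InvalidOperation


def normalize(value: str):
    """Reduce a value to a comparable form.

    Numeric strings become Decimal, so "$250,000.00" equals "250000".
    Everything else is upper-cased with whitespace collapsed.
    """
    text = value.strip()
    try:
        return Decimal(text.replace("$", "").replace(",", ""))
    except InvalidOperation:
        return " ".join(text.upper().split())


def compare_lar_to_source(lar_row: dict, extracted: dict) -> list:
    """Return (field, lar_value, source_value) for every mismatch.

    A field missing from the extracted data is reported with None.
    """
    differences = []
    for field, lar_value in lar_row.items():
        source_value = extracted.get(field)
        if source_value is None:
            differences.append((field, lar_value, None))
        elif normalize(lar_value) != normalize(source_value):
            differences.append((field, lar_value, source_value))
    return differences


# Hypothetical LAR row and values extracted from the source documents.
lar_row = {"loan_amount": "250000", "property_state": "tx", "interest_rate": "4.25"}
extracted = {"loan_amount": "$250,000.00", "property_state": "TX", "interest_rate": "4.375"}

for field, lar_value, source_value in compare_lar_to_source(lar_row, extracted):
    print(f"{field}: LAR={lar_value} source={source_value}")
```

Only the interest rate is flagged here; the loan amount and state differ in formatting but not in value.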

Our UCD Audit capability enables our clients and the GSEs to automatically compare the MISMO 3.3 data found in a Uniform Closing Dataset against the corresponding values found on the final Closing Disclosure embedded in that UCD. This process is performed at an average of one second per page, and each of the approximately 300 extracted fields is then compared. Differences found between the MISMO data and the extracted data are reported in a MISMO-compliant “differences” file. Along with this, we also produce a corrected UCD based on the embedded Closing Disclosure.

Our CCAR FR Y-14M offering helps our largest clients comply with the latest CFO attestation requirements related to the Dodd-Frank Stress Test rules for large financial institutions. This process uses our high-speed OCR capability and pre-built rules to classify documents, find the final version of key document types, and validate source document data against attestation data. This process can be performed in seconds per loan, and allows our clients to find and correct many of the inaccuracies typically found. In fact, because the original attestation data is typically key-entered by hand, and final document versions are often confused with non-final versions, prior attestation data is often incorrect. Without automation, this compliance risk mitigation step would be cost prohibitive.
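One simple way to picture the “find the final version” step is to keep, for each document type, the most recently issued copy in the loan file. The Python sketch below does exactly that. The `ClassifiedDoc` structure and the use of an issue date are assumptions for illustration; the interview does not say how the pre-built rules decide which version is final.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ClassifiedDoc:
    doc_type: str    # result of the classification step
    issued: date     # assumed to be extracted from the document itself
    first_page: int  # position of the document in the loan file


def final_versions(docs):
    """Keep the most recently issued document of each type."""
    latest = {}
    for doc in docs:
        current = latest.get(doc.doc_type)
        if current is None or doc.issued > current.issued:
            latest[doc.doc_type] = doc
    return latest


# A made-up loan file with two Closing Disclosures and one Note.
loan_file = [
    ClassifiedDoc("Closing Disclosure", date(2018, 3, 1), first_page=12),
    ClassifiedDoc("Note", date(2018, 3, 9), first_page=40),
    ClassifiedDoc("Closing Disclosure", date(2018, 3, 8), first_page=95),
]

finals = final_versions(loan_file)
print(finals["Closing Disclosure"].first_page)
```

Validation against the attestation data would then run only on the selected documents, so an earlier draft is not mistaken for the final one.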

Q: Paradatec has more than a decade of experience within the mortgage industry. What new initiatives and innovations have you recently brought to market or have coming up in the near future?

MARK TINKHAM: Some examples of new initiatives, new capabilities, and product features, some of which were mentioned earlier, include:

The Paradatec WriteUCD module for automated creation of GSE compliant UCDs from final Closing Disclosures.

Web Services API to enable our clients to seamlessly integrate our technology using our OnDemandOCR feature.

An ability to capture every field on every version of both the Closing Disclosure and the Loan Estimate in an average of one second per page.

Our Paradatec WritePDF module for creating fully indexed loans with data fields highlighted in a PDF which includes a table of contents which virtually maps a loan’s documents and key source data.

An ability to automatically identify and capture all the fields on the new HMDA compliant URLA and the new HMDA addendum to the old URLA.

Our new HMDA audit process, which can greatly streamline LAR validation for our clients.

Our UCD Audit capability has attracted some significant interest from the GSEs and some of our larger clients.

We’re developing a new handprint discovery feature that will provide large leaps in automation for our post-close clients, who need to validate the required initials and signatures on key loan documents.

Q: How do you see the mortgage industry and the mortgage process of the future evolving?

MARK TINKHAM: Like many other industries, the mortgage industry is evolving with the aid of technology. Staying competitive and reducing per-loan processing costs require the use of technology like ours. Industry leaders such as Amazon and Orbitz have made the self-service model, albeit in other market segments, much less daunting, and the time it takes to complete a transaction has decreased significantly through this evolution. While the magnitude of the buying decision for a home is obviously much greater than that of buying an airplane ticket or a pair of shoes, the consumer has become comfortable with online transactions to the point that a paper-bound process is viewed as slow and stodgy.


Mark Tinkham is Director of Business Alliances at Paradatec, Inc. Over the past twenty-five plus years, Mark has worked for technology companies that deliver innovative solutions to the financial services industry. For the past ten years, his primary focus has been bringing efficiencies to the mortgage market through industry leading Optical Character Recognition (OCR).


Mark Tinkham thinks:

1.) The digital mortgage won’t eliminate the need for manual data entry.

2.) Our UCD Audit process will be found to be an invaluable tool for those lenders selling loans to the GSEs.

3.) The 20 largest lenders and servicers will all embrace advanced OCR by 2020 out of necessity.


Paul Fischer is Director of Professional Services at Paradatec, Inc. For nearly 15 years he has focused on the design and installation of document capture, content management, and workflow automation systems for clients in a variety of industries. Since joining Paradatec in early 2013, his primary focus has been on helping mortgage clients improve their operational efficiencies with Paradatec’s advanced mortgage OCR solution.


Paul Fischer thinks:

1.) Cycle times and cost pressures will continue to drive automation initiatives in the mortgage origination and servicing space.

2.) Document ingestion for mortgage servicing rights (MSR) transfers will become an entirely automated process.

3.) Robotic process automation (RPA) will reduce manual labor by 20% and much more in many cases.


Neil Fraser is Director of US Operations at Paradatec, a mortgage OCR technology organization that automates the data entry operations of large lenders through intelligent document analysis. Neil was Paradatec’s first US employee and has grown the organization every year since the company incorporated here in 2002.


Neil Fraser thinks:

1.) Redaction of personally identifiable information (PII) will become ubiquitous for any mortgage documents leaving a lender.

2.) Audits involving regulations such as TRID, RESPA, HMDA, etc. will become automated.

3.) As more investors move back into the secondary markets, the need for an audit trail from documents to elements in a loan servicing system database will become a requirement.