Our Data Extraction Services fall into three broad categories. In the first, we scan documents or files for our clients and name each PDF file to reflect what the document is, according to the specifications for that document type. We then deliver the information back as a CSV, TXT, Excel or XML file, or as a PDF that has been named to identify its content. In conjunction with this, we generally apply OCR (text recognition) so that the full content of every document can be searched.
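As a rough illustration of the naming-and-index step described above, here is a minimal Python sketch. The naming pattern, the field names (`reference`, `date`) and the index columns are hypothetical; in practice they would come from the specification agreed for each document type.

```python
import csv
import io
import re


def build_pdf_name(doc_type, fields, pattern):
    """Fill a naming pattern and replace characters that are unsafe in file names."""
    name = pattern.format(doc_type=doc_type, **fields)
    # Collapse any run of characters other than letters, digits, '.', '_' or '-'.
    name = re.sub(r"[^A-Za-z0-9._-]+", "_", name).strip("_")
    return name + ".pdf"


def write_index(rows):
    """Return CSV text listing each named PDF alongside its index fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["file_name", "doc_type", "reference", "date"],  # assumed columns
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical document: an invoice with a reference number and a date.
fields = {"reference": "INV 0042", "date": "2023-07-01"}
file_name = build_pdf_name("Invoice", fields, "{doc_type}_{reference}_{date}")
index_csv = write_index([{"file_name": file_name, "doc_type": "Invoice", **fields}])
print(index_csv)
```

The same rows could just as easily be written out as TXT or XML; CSV is used here only because it needs nothing beyond the standard library.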
The second category is pure data extraction. Examples include shopping centre competition coupons, application forms and written examinations, where we extract information and map the results against a template and rules, or where we simply create a database for marketing purposes. We handle a large volume of work in this area.
We also work for research companies, where they conduct the questionnaire and we extract the information, because it is often handwritten. There is more use of soft forms and online questionnaires these days, but we handle all of the handwritten extraction. We then format the data to conform to an Excel spreadsheet template requested by the client, who can then undertake data mining or utilize the metadata created.
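Conforming extracted answers to a client template can be sketched roughly as follows. The column names and the blank-for-missing rule are assumptions made for illustration only; a real template and its rules would be supplied by the client.

```python
# Hypothetical column order taken from a client's spreadsheet template.
TEMPLATE_COLUMNS = ["respondent_id", "age", "postcode", "q1"]


def conform_to_template(record, columns, defaults=None):
    """Return one row in template order, ignoring fields the template lacks."""
    defaults = defaults or {}
    row = []
    for column in columns:
        # Missing answers fall back to a default, or to a blank cell.
        value = record.get(column, defaults.get(column, ""))
        row.append(str(value).strip())
    return row


# A keyed record as it might come back from handwritten extraction.
extracted = {"age": " 34 ", "respondent_id": 7, "notes": "not in template"}
row = conform_to_template(extracted, TEMPLATE_COLUMNS)
print(row)
```

Keeping the column order in one place means the same extracted records can be re-exported if the client later changes the template.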
The third category of data extraction is where we use sophisticated Forms Recognition software to recognize elements or specified zones within records. Often this is an accounts payable application, or questionnaires and similar documents, where we undertake optical character recognition, barcode recognition, etc.
In an accounts payable application, we look for the ABN, line item descriptions, the GST amount, etc. We can automate checksum verification on the total value, and we also handle line items for purchase order numbers and similar details. It's an area where we work right at the front of the accounts payable workflow: Microsystems opens the mail, scans each invoice and extracts the information from it, and we then format the extracted metadata. The images and metadata are then uploaded via SFTP to our clients for payment approval.
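Two of the checks mentioned above lend themselves to a short example. The sketch below validates an ABN using the publicly documented ABN check-digit algorithm (subtract 1 from the first digit, apply the weights, and test divisibility by 89), then reconciles an invoice total. The reconciliation rule, that GST-exclusive line amounts plus GST must equal the stated total, is an assumption about what the total check involves and not a description of our production rules.

```python
from decimal import Decimal

# Weights defined by the ABN check-digit algorithm.
ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)


def is_valid_abn(abn):
    """Return True if the 11-digit ABN passes the modulus-89 check."""
    cleaned = abn.replace(" ", "")
    if len(cleaned) != 11 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    digits = [int(ch) for ch in cleaned]
    digits[0] -= 1  # the algorithm subtracts 1 from the leading digit
    weighted_sum = sum(d * w for d, w in zip(digits, ABN_WEIGHTS))
    return weighted_sum % 89 == 0


def totals_reconcile(line_amounts, gst, total):
    """Assumed rule: GST-exclusive line amounts plus GST equal the invoice total."""
    subtotal = sum((Decimal(a) for a in line_amounts), Decimal("0"))
    return subtotal + Decimal(gst) == Decimal(total)


print(is_valid_abn("51 824 753 556"))
print(totals_reconcile(["100.00", "50.50"], "15.05", "165.55"))
```

Amounts are passed as strings and handled with `Decimal` so that currency values are compared exactly, without floating-point rounding. An invoice that fails either check could be routed for manual review before the metadata is uploaded.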