Textual analysis reveals corporate fraud
Textual analysis can identify language patterns in management communications that are inconsistent with either the company’s financial performance or the communications of other companies in the same industry. Such inconsistencies may indicate fraud.
Preliminary findings from research conducted by Patrick Fan and Greg Jenkins, associate professors of accounting and information systems in Virginia Tech’s Pamplin College of Business, suggest that textual analysis software might help auditors identify fraud in companies. The textual analysis technique — used extensively in the social sciences to scrutinize written and oral communication — can be used to identify language patterns in management communications which are inconsistent with either the company’s financial performance or with the communications of other companies in the same industry. Such inconsistencies may indicate fraud. “The results of our initial analysis suggest that our model has substantial predictive power,” said Jenkins. “When fraud is committed in companies, there appear to be patterns in corporate communications that imply wrongdoing.” The professors hope to develop their methodology, based on knowledge from auditing and information systems, into a more precise new computerized tool to help auditors and regulators detect fraud. They have received a grant of about $196,000 from PricewaterhouseCoopers for their two-year project, expected to be completed in 2009.
Fan, a specialist in data and text mining and business intelligence research, said that their model uses text-mining techniques to automatically identify word patterns that might be highly associated with financial fraud. By recognizing language patterns or trends that are inconsistent with either the company’s financial performance or communications issued by other companies in the same industry, the software would guide auditors to particular areas that may need further examination. Explaining the need to compare the company with others in the same industry, Jenkins said that in many instances of fraud, inconsistencies between a company’s communications and its financial performance may be difficult to discern. In such cases, benchmarking a company’s communication patterns against those of other companies in the same industry may help reveal unusual or unexpected differences. A company’s financial performance may be similar to that of its competitors, he said, “yet the language it is using to describe its prospects seems overly optimistic or overly specific or vague relative to others in the industry.” Developing the benchmark data itself, Jenkins said, is a tremendous challenge. He and Fan have compiled a list of cases of known fraud (companies that have been sanctioned by the Securities and Exchange Commission for committing fraud) and are completing identification of another set of companies, those in the same industries “whose financial statements have stood the test of time.”
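The industry benchmarking idea can be illustrated with a minimal sketch. It is not the researchers' model, whose details the article does not give. It assumes a small, made-up list of "optimistic" words and hypothetical helper names (`optimism_rate`, `peer_z_score`), and it scores how far one company's language sits from that of its peers.

```python
import re
from statistics import mean, stdev

# Hypothetical word list, chosen only for illustration; the researchers'
# actual vocabulary and features are not described in the article.
OPTIMISTIC_WORDS = {"strong", "record", "exceptional", "growth", "confident"}


def optimism_rate(text):
    """Return the number of optimistic words per 1,000 tokens in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in OPTIMISTIC_WORDS)
    return 1000.0 * hits / len(tokens)


def peer_z_score(company_text, peer_texts):
    """Return how many peer standard deviations the company's rate lies
    above (positive) or below (negative) the peer average."""
    peer_rates = [optimism_rate(text) for text in peer_texts]
    spread = stdev(peer_rates)  # requires at least two peer documents
    if spread == 0:
        # All peers have the same rate, so the score is undefined;
        # 0.0 is returned here as a neutral placeholder.
        return 0.0
    return (optimism_rate(company_text) - mean(peer_rates)) / spread


peers = [
    "growth was modest this year and costs rose",
    "we saw some growth but margins were flat",
    "results were mixed and demand was weak",
]
company = "record growth strong results exceptional confident outlook"
score = peer_z_score(company, peers)
```

A large positive score would mark the company's wording as unusually upbeat relative to its industry, which is the kind of difference an auditor might then examine. It would not, by itself, indicate fraud.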
The professors will use their methodology to compare large volumes of corporate communications (annual reports, letters to shareholders, and transcripts of analyst conference calls, for example) from these two groups of companies, which represent a variety of industries: technology, retail, energy, and consumer products. “We’re tracking tens of thousands of words from multiple companies and multiple periods,” Jenkins said. “We’re using computing power to go through and look at language to identify patterns (words and frequency of usage) that would be very difficult for a human reader to discern. Our findings so far show that there are systematic differences in textual communications between the two groups of companies.” He and Fan said that they envisage their software eventually serving as a decision-support tool that would improve the efficiency of the auditing process and, ultimately, enable the detection of financial fraud.
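As a rough illustration of comparing word usage between the two groups of companies, the sketch below ranks words by a smoothed log ratio of their relative frequencies. This is a generic text-mining technique swapped in for illustration, not the professors' method. The function names and the tiny sample documents are invented.

```python
import math
import re
from collections import Counter


def word_counts(documents):
    """Count lowercase word tokens across a list of documents."""
    counts = Counter()
    for document in documents:
        counts.update(re.findall(r"[a-z']+", document.lower()))
    return counts


def distinguishing_words(fraud_docs, control_docs, top_n=5):
    """Rank words by log(relative frequency in fraud group /
    relative frequency in control group), with add-one smoothing."""
    fraud = word_counts(fraud_docs)
    control = word_counts(control_docs)
    vocabulary = set(fraud) | set(control)
    # Add-one smoothing keeps the ratio finite for words seen in one group only.
    fraud_total = sum(fraud.values()) + len(vocabulary)
    control_total = sum(control.values()) + len(vocabulary)
    scores = {}
    for word in vocabulary:
        p_fraud = (fraud[word] + 1) / fraud_total
        p_control = (control[word] + 1) / control_total
        scores[word] = math.log(p_fraud / p_control)
    # Sort by descending score, breaking ties alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Invented sample text; real input would be annual reports, shareholder
# letters, and conference call transcripts.
fraud_group = ["unprecedented unprecedented growth", "unprecedented results"]
control_group = ["steady growth", "steady results"]
top = distinguishing_words(fraud_group, control_group, top_n=4)
```

Words with the highest positive scores appear disproportionately in the first group, and those with the most negative scores in the second. On real corpora such a ranking would only be a starting point for the kind of systematic differences the researchers describe.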