Contesting Bad Samples in California Sales Tax Audits

By Daniel M. Davis, CPA
Originally published in The Successful California Accountant

Probably no taxing agency uses audit sampling on a broader scale than the California State Board of Equalization (Board). The Board’s sales and use tax audits rarely span less than three years at a time, and huge volumes of documents can become subject to scrutiny. Detailed verification is neither practical nor desirable, so sampling is the only solution.

Most of the time, a well-designed statistical sample provides reliable information about the population it is drawn from. Unfortunately, not all tax audit samples are well-designed, or even statistical. It would be interesting, and probably disturbing, to learn how many unwarranted tax dollars are generated by bad audit sampling.

The Board is not the only regulatory authority using audit sampling, and it certainly has no monopoly on poor technique. The agency is the focus of this article because its auditors use sampling in nearly every area of procedure, and practitioners often find themselves at a disadvantage when contesting the results.

To be fair, the Board does produce some audits of professional quality based on valid samples, and its overall procedures are no worse than those of other taxing authorities. The defensive tactic discussed below would apply in virtually any situation where bad sampling has been used to the detriment of a client.

Even a well-designed sample may not reflect the same characteristics as its population. The primary advantage of statistical sampling is not that it guarantees an accurate reflection, but that it allows the user to quantify the probability that the sample will be representative within a given range. For purposes of this article, a “bad” tax audit sample is simply one that does not mirror the relevant characteristics of its population, regardless of the reason.

Some of the more common uses of sampling in sales tax audits are:

To compute one or more markup factors. Markup factors may be used to evaluate the reasonableness of recorded sales, or to develop “audited” sales figures.
To assign relative weights to various components of purchases. This might be done when costs of sales are to be marked up to arrive at audited sales; when ratios of taxable and nontaxable purchases are to be computed, either for analysis or markup purposes; or to purify purchases by adjusting for freight charges, supply items, or classification errors.
To test claimed deductions for nontaxable revenue, such as sales for resale, charges for repair or installation labor, or sales in interstate commerce.
To establish purchases subject to use tax. These might involve untaxed purchases of supplies and minor equipment from out-of-state vendors, or withdrawals of untaxed inventory for internal use. (Purchases of larger equipment items usually are investigated in detail rather than sampled.)

Although statistical methods are preferred, the use of judgment (nonstatistical) sampling is common in sales tax audits, particularly where small businesses are involved. Most often these judgment samples involve transactions within a defined time span, such as a month or a calendar quarter. Not surprisingly, projections based on judgment samples can, and often do, produce inflated audit assessments. Sometimes such assessments may be reduced simply by expanding the test period.

For example: assume one of your clients operates a small convenience store. Most items sold at the store, such as beer, cigarettes, and sodas, are subject to sales tax. Some nontaxable food products also are sold, but the client does not separately classify these items, and all purchases are recorded in a single account.

The client is selected for a routine three-year audit. The auditor wants to determine the markup achieved on sales of taxable merchandise. Since all merchandise purchases are recorded in one account, the auditor must find a way to factor out the nontaxable items. She decides to estimate the taxable ratio by tracing a month of recorded purchases to vendor invoices. For convenience, she picks a month near the end of the audit period and proceeds to segregate purchases for that month. The auditor finds that 85 percent of the purchases for the test month were of taxable merchandise. She applies that ratio to recorded merchandise costs for the three-year audit period in order to estimate the overall costs of taxable goods. When she compares these costs to taxable sales reported to the Board, she notes a markup of only 10 percent. Experience tells her this markup is too low, so she decides to proceed further.

Her next step is to estimate the markup your client should be achieving. This estimate will proceed from another judgment sample, in which the auditor will compare retail selling prices of taxable merchandise to cost. The selling prices will be compiled from observed shelf prices and/or discussions with your client. The costs will be determined from vendor invoices.

Common flaws in such markup computations are:

Failure to adjust for special sales, discounts and promotions, resulting in overstated “average” selling prices;
Over-representation of low-volume items carrying a high markup (such as toys), which skews the markup upward;
Under-representation of high-volume items carrying a low markup;
Erroneous assumptions and outright mistakes.

Any of these flaws can overstate the auditor’s taxable sales computation. A close review of the working papers, coupled with knowledge of your client’s industry and business practices, will enable you to evaluate the markup test and adjust it for inherent errors.

However, be aware of unnecessary controversy created by terminology. Business owners are accustomed to thinking in terms of gross margin (profit as a percentage of sales), whereas Board auditors think in terms of markup (profit as a percentage of the cost of sales). A markup of 50 percent reflects the same relative gross profit as a margin of 33 1/3 percent, but adverse parties often are unaware they are discussing two different aspects of the same variable, resulting in a heated argument over nothing.
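The relationship between the two measures is simple arithmetic, and a short Python sketch may help settle such arguments quickly. The function names are illustrative only and are not part of any Board procedure; exact fractions are used so that 33 1/3 percent is not blurred by rounding.

```python
from fractions import Fraction


def markup_to_margin(markup):
    """Convert markup (profit / cost of sales) to gross margin (profit / sales)."""
    return markup / (1 + markup)


def margin_to_markup(margin):
    """Convert gross margin (profit / sales) to markup (profit / cost of sales)."""
    return margin / (1 - margin)


# A 50 percent markup and a 33 1/3 percent margin describe the same profit.
print(markup_to_margin(Fraction(1, 2)))   # 1/3
print(margin_to_markup(Fraction(1, 3)))   # 1/2
```

Converting every figure to one measure before the discussion starts removes the terminology problem altogether.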

Returning to our example: assume the auditor’s test indicates your client should be realizing a markup of about 28 percent on taxable merchandise. Upon reviewing the test, you decide this percentage is reasonable. However, when the auditor applies this markup to the costs of taxable sales she previously computed, the results greatly exceed the taxable sales your client reported. Naturally, the auditor proposes to assess tax on the difference.

Now it is time to take a closer look at purchases. Based on a one-month test of vendor invoices, the auditor had computed a taxable ratio of 85 percent. In reviewing this test, you note an unusually high volume of cigarette purchases. When you examine disbursements for the following month, you find cigarette purchases have dropped to almost nothing.

A call to your client reveals he bought a large volume of cigarettes at a low cost during the test period, which then kept him in stock for the next calendar quarter. You follow up on this information by making your own purchases segregation test for the three subsequent months. When you combine your results with those of the original test month, you find the taxable ratio drops to 75 percent.
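The effect of expanding the test can be sketched in Python. The monthly dollar amounts below are hypothetical, chosen only so that the test month shows 85 percent and the four months together show 75 percent; the example itself gives no underlying amounts. The combined ratio is computed here from pooled dollars rather than by averaging the monthly percentages, which is one reasonable way to combine the tests.

```python
# Hypothetical (taxable purchases, total purchases) per month.
# Only the 85 percent and 75 percent ratios come from the example;
# the dollar amounts are invented for illustration.
test_month = (34_000, 40_000)
following_months = [(19_000, 28_000), (20_500, 29_000), (21_000, 29_000)]


def taxable_ratio(months):
    """Pooled ratio: total taxable purchases over total purchases."""
    taxable = sum(t for t, _ in months)
    total = sum(p for _, p in months)
    return taxable / total


original_ratio = taxable_ratio([test_month])
expanded_ratio = taxable_ratio([test_month] + following_months)
print(f"{original_ratio:.0%} -> {expanded_ratio:.0%}")
```

Pooling the dollars keeps a month with unusually heavy purchases from being weighted the same as a quiet one.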

Applying the new 75 percent ratio to recorded costs of sales naturally results in a lower figure for costs of taxable merchandise. When you compare these lower costs to taxable sales reported to the Board, you find the indicated markup has climbed from 10 percent to 25 percent.
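The arithmetic behind this shift can be checked with a minimal Python sketch. The recorded cost and reported sales figures are hypothetical, picked only so that the 85 percent ratio reproduces the 10 percent markup in the example; the function name is likewise illustrative.

```python
def indicated_markup(reported_taxable_sales, recorded_cost, taxable_ratio):
    """Markup implied by reported taxable sales over estimated taxable cost."""
    taxable_cost = recorded_cost * taxable_ratio
    return reported_taxable_sales / taxable_cost - 1


# Hypothetical three-year totals (not taken from the article).
RECORDED_COST = 1_000_000
REPORTED_TAXABLE_SALES = 935_000

markup_at_85 = indicated_markup(REPORTED_TAXABLE_SALES, RECORDED_COST, 0.85)
markup_at_75 = indicated_markup(REPORTED_TAXABLE_SALES, RECORDED_COST, 0.75)

print(f"Markup at 85 percent taxable ratio: {markup_at_85:.1%}")
print(f"Markup at 75 percent taxable ratio: {markup_at_75:.1%}")
```

With these assumed totals the second figure works out to roughly 24.7 percent, in line with the rounded 25 percent above. Because the taxable ratio sits in the denominator, a ten-point change in the ratio moves the indicated markup by considerably more than ten points.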

Upon reviewing your test expansion and its effect on the purchases ratio, the auditor decides to accept the sales your client reported after all. If you had not decided to expand the auditor’s test, your client would have been assessed tax, interest and penalty on sales he never made.

Variations on this theme are all too common in sales tax audits employing block samples. Statistical samples are less likely to incorporate such blatant distortions, but they often are ripe for refinements which can significantly reduce proposed assessments.

The Board has adopted certain minimum standards for statistical samples used to project audit liabilities. Departures from these standards are allowed (except for the three error minimum described below), but they must be justified by the auditor’s comments.

The standards are as follows:

A sample size of at least 300 units must be used, unless the auditor can support a smaller sample size.
A minimum of three errors must be found before an error rate may be projected. If fewer than the minimum are found, either the sample size must be increased or the items must be assessed on an actual (i.e., not projected) basis.
The sample should be drawn from the entire audit period if possible. If records for the entire audit period are not available, a sample should be drawn from the available periods (e.g., one or two years). The difference then would be projected to the remaining periods.

Any sample reflecting a confidence interval of 75 percent or more must be analyzed and justified in the audit working papers before the results are projected. (The confidence interval reflects how much the sample units vary from one another, and therefore how likely the sample is to be representative of the population. A smaller interval suggests the units in the population are more alike, which enhances the reliability of the sample.)

These minimum standards are not rigidly applied. Since most practitioners are not even aware of them, departures rarely are challenged in the field. And once the taxpayer under audit concurs, justifying a departure to the Board reviewer becomes a relatively simple matter for an experienced auditor.

Probably the most frequently ignored sampling standard is the one limiting the confidence interval. The 75 percent limit in itself allows for a high degree of variability, resulting in an increased possibility the sample will not be representative. The solution is to stratify the sample, generally by examining all units over a certain dollar amount (or of a certain type), which restricts the sampling process to smaller or otherwise more homogeneous transactions. (A further advantage of stratifying is that large transactions tend to be watched more carefully from inception, so they often are less subject to recording and reporting errors.)
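A rough Python sketch of this kind of stratification follows. It is a simulation under stated assumptions, not the Board's procedure: each item is a hypothetical (amount, error) pair whose error is already known, the dollar threshold and sample size are arbitrary, and the projection is a simple ratio of error dollars to sampled dollars.

```python
import random


def stratified_projection(population, threshold, sample_size, seed=0):
    """Return (actual_errors, projected_errors) for a two-stratum test.

    population: list of (amount, error_amount) pairs (simulated data).
    Items at or above the threshold are examined in full; the rest are sampled.
    """
    large = [item for item in population if item[0] >= threshold]
    small = [item for item in population if item[0] < threshold]

    # Large items are examined in detail, so their errors are actual amounts.
    actual_errors = sum(err for _, err in large)

    # Draw a random sample from the smaller, more homogeneous items.
    rng = random.Random(seed)
    sample = rng.sample(small, min(sample_size, len(small)))
    sample_dollars = sum(amt for amt, _ in sample)
    sample_errors = sum(err for _, err in sample)

    # Project the sample's error rate over the sampled stratum only.
    error_rate = sample_errors / sample_dollars if sample_dollars else 0.0
    projected_errors = error_rate * sum(amt for amt, _ in small)
    return actual_errors, projected_errors
```

The point of the design is that a single large, unusual item is counted once at its actual amount and never multiplied by a projection factor.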

Other possible adjustments are illustrated by the following true story of an audit of a manufacturer. The auditor had taken a statistical sample of the debits to an expense account for manufacturing supplies and had scheduled about 20 errors for projection. The errors were assumed to be purchases of supplies without tax, either from out-of-state suppliers or under resale certificates furnished to California vendors. Since these supplies supposedly had not been incorporated into finished goods but instead were consumed in the manufacturing process, the errors were presumed to be subject to use tax.

The individual amounts scheduled by the auditor ranged from a few dollars to a few hundred dollars, except for one $20,000 item which made up about 75 percent of the total. Projecting these errors would generate a $160,000 use tax liability.

Upon investigation, the $20,000 item proved to be a fixed asset purchase which had been posted to the wrong account. After some discussion, the auditor agreed to remove the item from the test and assess tax on it individually. The effect was to reduce the indicated liability from $160,000 to a little over $40,000.

None of the other sample errors appeared to be nonrecurring, but each was investigated, with the greatest attention devoted to the largest entries. As a result of additional documentation, nearly half the remaining exceptions were removed from the test. Since the items removed tended to be larger, the liability went down to about $16,000, or 10 percent of the original projection.
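The two reductions can be traced with a short Python sketch. It assumes the projected liability scales in proportion to the error dollars left in the sample, which is a simplification; the function names are illustrative, and the figures are the rounded ones given above.

```python
def liability_after_removal(projected_liability, share_of_error_dollars_removed):
    """Projected liability left after removing a share of the sample's error dollars."""
    return projected_liability * (1 - share_of_error_dollars_removed)


def implied_share_removed(liability_before, liability_after):
    """Share of error dollars that must have been removed to explain the drop."""
    return 1 - liability_after / liability_before


# Step 1: the $20,000 fixed asset item was about 75 percent of the error dollars.
after_fixed_asset = liability_after_removal(160_000, 0.75)

# Step 2: the further drop to about $16,000 implies the share of the
# remaining error dollars that the additional documentation cleared.
second_share = implied_share_removed(after_fixed_asset, 16_000)

print(f"After removing the fixed asset item: ${after_fixed_asset:,.0f}")
print(f"Share of remaining error dollars later removed: {second_share:.0%}")
```

Under this assumption, removing nearly half the remaining exceptions by count cleared about 60 percent of the remaining error dollars, consistent with the observation that the items removed tended to be larger.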

The additional exceptions were removed for the following reasons:

Some of the items had been bought from California vendors to whom a resale certificate had not been issued. Since the manufacturer had not claimed to be buying the supplies for resale, the transactions were subject to sales tax. Thus, the liability accrued to the vendors, and use tax could not be assessed.
In one case, the manufacturer actually had paid sales tax to the vendor, but the tax had been billed on a separate invoice.
Some of the supplies had been reported as taxable, but the individual items had been lumped into a total with other purchases subject to use tax. The auditor had not taken the time to analyze the total.
Two of the purchases were of property which had been incorporated into finished goods. These items were properly purchased for resale and were not subject to use tax.

Unless the effect is favorable or immaterial, the results of a tax audit sample rarely should be accepted at face value. Reducing a sample’s impact seldom requires special technical skill; significant changes most often result from questioning assumptions, investigating transactions a little more closely than the auditor, careful documentation, and a clear communication of findings. Your thorough review and strong presentation may well save your clients a good deal more than the cost of your time.