In the final post in this series about reasons companies think they can’t optimize prices, I’ll address perhaps the most frequently heard concern: “I have outliers in my transaction data or dirty data.”
Every company has dirty and imperfect data, so welcome to the club. “Garbage In, Garbage Out” is a root fear that goes back to the effective use of business intelligence tools. Because these tools typically consume all of your transaction data, they are only effective if that data is clean and accurate. This has led many people to conclude that all data-driven systems are equally sensitive to flaws in the data.
Yet for price optimization, we may only need to look at 70 percent of your transaction rows to get the information we need to set better prices, setting aside the 30 percent that has suspected or known problems. We can do this without compromising our goal because we’re looking for statistical patterns and price signals within the data, not trying to reconcile to the totals in your financial systems. For instance, given a sample of 5,000 transactions, I can calculate the average margin rate for a business. If I then randomly remove 1,500 of those transactions from the sample and recalculate, I will get a nearly identical average margin rate – the signal is preserved. The science of price optimization is advanced enough to account for outliers and to remove “suspicious” or bad transactions from the data set.
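To make the 5,000-transaction example concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the transactions are synthetic (random prices, margin rates drawn around 30 percent), the function names `simulate_transactions` and `average_margin_rate` are hypothetical, and "average margin rate" is taken to mean the simple mean of each transaction's (price − cost) / price. It is not the method any particular price optimization tool uses.

```python
import random
import statistics


def simulate_transactions(n, seed=0):
    """Build n synthetic (price, cost) pairs. Illustrative data only."""
    rng = random.Random(seed)
    transactions = []
    for _ in range(n):
        price = rng.uniform(10.0, 500.0)
        # Assumed margin rates: centered on 30%, clamped to the range [0, 0.9].
        margin_rate = min(max(rng.gauss(0.30, 0.08), 0.0), 0.9)
        cost = price * (1.0 - margin_rate)
        transactions.append((price, cost))
    return transactions


def average_margin_rate(transactions):
    """Mean of the per-transaction margin rates, (price - cost) / price."""
    return statistics.mean((price - cost) / price for price, cost in transactions)


transactions = simulate_transactions(5000)
full_rate = average_margin_rate(transactions)

# Randomly remove 1,500 transactions by keeping a random sample of 3,500.
kept = random.Random(1).sample(transactions, 3500)
subset_rate = average_margin_rate(kept)

print(f"All 5,000 transactions: {full_rate:.4f}")
print(f"Random 3,500 remaining: {subset_rate:.4f}")
print(f"Absolute difference:    {abs(full_rate - subset_rate):.4f}")
```

Note that the sketch removes rows at random, which is the case the example in the text describes. If the rows being set aside differ systematically from the rest (for instance, all of the problem rows come from one product line), the two averages could diverge more than they do here.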
Don’t sweat dirty data. I’ve never been involved in a project where we said, “Your data is too messy for us to use; let’s cancel the project.”




