Overlooking Quality Assurance in Your Data Analysis

Jul 14, 2023
Photo by Ryoji Iwata on Unsplash

When writing SQL code or creating a dashboard, you will rarely run into an error that prevents data from being displayed on your screen. Sure, these run-time errors happen, and as an analyst you’re forced to resolve them before any output can be displayed. But just because there isn’t a run-time error doesn’t mean that a bug doesn’t exist. Proper quality assurance checks for such bugs are a frequent oversight by data analysts, whether due to a lack of training or due to being rushed by stakeholders to complete an urgent task.

Throughout my career I’ve seen many analysts write SQL code, generate a result set, and then export the results for the stakeholder without conducting proper quality assurance checks. Sometimes the data being produced is accurate, so no harm is done. But other times bugs are produced. Sometimes the stakeholder finds the bugs, and other times nobody ever notices them at all. Either way, these bugs can cause significant problems for the business and the analyst involved.

When I was heading up analytics at eBay, I inherited a team of data analysts and was pulled into a troublesome business project. One of the product teams had rolled out a new initiative to get new sellers to register on the platform. To make it easier for new sellers to register, the Product team created a new user registration flow, which removed many restrictions that had previously been put in place to keep fraudulent sellers from registering new accounts. This new user flow allowed the Product team to achieve their goal of acquiring new sellers, but it came at the expense of the Losses team having to chase down more fraudulent sellers. While tradeoffs like this frequently occur within a business, the question was: was this new initiative good for the business, or would it cause more harm than good?

By the time I arrived on the team, this project had already been rolled out, and one of the analysts I had inherited had been heading up the analytics efforts for the project. She had been pulling data, putting reports together, and reviewing these reports with one of the primary stakeholders on the Losses team. Unfortunately for the business, she had uncovered some results that were quite concerning.

Her reports were showing that approximately 75% of the new sellers were defaulting on their invoices, which was costing the company millions of dollars. This new program to acquire new sellers was causing significant harm and losses for the business. As any good analyst should do, she brought this information to her stakeholder. They reviewed it many times and performed many different pivots on the data to quantify the total cost of the program so that they could inform the senior leadership team, ultimately requesting that this new product registration flow be shut down.

As you might expect, a bit of a conflict arose between the Losses team and the Product team. The Product team was achieving their goal of acquiring new customers, but it came at the expense of the Losses team. The Losses team was now in a battle to prove that 75% of the new customers were bad customers, and the Product team was trying to prove that they weren’t bad customers and that they were a net positive for the business.

As it turns out, it was the Product team that was right. The new product initiative did cause some incremental losses for the company, but nowhere near the 75% of the customer base that was being reported by this analyst and the Losses team. Had the Losses team gotten their way, we would have made the wrong decision to shut the program down. But the decision to keep the program was only made after the analyst and the stakeholder had already presented their business case to multiple VPs in the company. This caused serious damage to the personal brands of the stakeholder and the analyst involved. Both were forever tainted due to the lack of quality assurance checks.

On the surface, it seemed like quality assurance checks were performed. The analyst and the stakeholder were working together, they had pulled the data multiple times over several months, and they had discussed the findings with many other people. Nobody seemed to be able to find any issue with the data. The problem was that nobody was looking deeply enough at the data.

Even though the SQL code executed without an issue, there was a bug in one of the table joins, which resulted in duplicate data being pulled into the result set. Another issue stemmed from the way the stakeholder was using the data within an Excel pivot table. While both individuals made mistakes, and the stakeholder shouldn’t have used the data the way he did within Excel, the analyst shouldn’t have presented the output in the manner that she did. The way that she created the result set opened the door for the data to be potentially misused, which is exactly what happened.
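To make this failure mode concrete, here is a minimal sketch of how a one-to-many join can "fan out" and silently inflate a rate, along with the row-count reconciliation check that would have caught it. The tables, column names, and numbers are invented for illustration (the actual eBay schema isn’t described here); the example uses Python’s built-in sqlite3 module so it can be run as-is.

```python
import sqlite3

# Toy schema, invented for illustration: one row per seller, and a
# one-to-many invoices table. Seller 1 is the only defaulter.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE sellers  (seller_id INTEGER, defaulted INTEGER);
CREATE TABLE invoices (seller_id INTEGER, amount REAL);
INSERT INTO sellers  VALUES (1, 1), (2, 0), (3, 0), (4, 0);
-- Seller 1 has three invoices, so a naive join counts that seller three times.
INSERT INTO invoices VALUES (1, 100), (1, 200), (1, 50), (2, 75), (3, 80), (4, 60);
""")

# Buggy query: the one-to-many join fans out, so the default rate is inflated.
buggy_rate = cur.execute("""
    SELECT AVG(s.defaulted)
    FROM sellers s JOIN invoices i ON s.seller_id = i.seller_id
""").fetchone()[0]
print(buggy_rate)  # 0.5 -- a 50% "default rate" that is wrong

# QA check: the joined row count should reconcile with the seller count.
n_sellers = cur.execute("SELECT COUNT(*) FROM sellers").fetchone()[0]
n_joined = cur.execute("""
    SELECT COUNT(*)
    FROM sellers s JOIN invoices i ON s.seller_id = i.seller_id
""").fetchone()[0]
print(n_joined > n_sellers)  # True -- fan-out detected before delivery

# Fixed query: collapse to one row per seller before aggregating.
true_rate = cur.execute("""
    SELECT AVG(defaulted) FROM (
        SELECT s.seller_id, MAX(s.defaulted) AS defaulted
        FROM sellers s JOIN invoices i ON s.seller_id = i.seller_id
        GROUP BY s.seller_id
    )
""").fetchone()[0]
print(true_rate)  # 0.25 -- the actual default rate
```

A simple `COUNT(*)` comparison before and after the join, against an independently known row count, is often enough to surface this entire class of bug.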

Fortunately, we found the issues and fixed the mistakes before some terrible business decisions were made. Had it not been for the pushback from stakeholders and the massive discrepancy between the actual and the expected results, the company would have lost millions of dollars in revenue without even knowing it. But this is exactly what happens all the time in analytics environments without anyone being aware.

This is why it’s critical to not only have a solid understanding of SQL, the database, and all the tools of the trade, but also to know how to properly QA the results being generated by those tools. Unfortunately, given the fast pace of analytics environments and the frequent oversight of proper quality assurance checks, many bugs slip through the cracks, leading to very bad business decisions. When this occurs and inaccuracies are detected, the resulting bugs and inaccurate recommendations can cause lasting damage to your personal brand as a data analyst.


Brandon Southern, MBA, is a highly accomplished business professional and the founder of Analytics Mentor. With a remarkable 20-year career in the tech industry, Brandon has excelled in diverse roles encompassing analytics, software development, project management, and more. His expertise has been sought after by esteemed companies such as GameStop, VMWare, eBay, and Amazon.

Renowned for his ability to build world-class analytics organizations, Brandon has generated over $100 million in revenue and savings. His strategic insights and leadership have transformed businesses, delivering exceptional results. With a passion for empowering individuals and fostering growth, Brandon is dedicated to elevating the analytics landscape and driving success.

You can learn more about Brandon and Analytics Mentor at http://www.analyticsmentor.io/
