All Hail Autonomous Analytics

If You Can’t Measure It…

I studied econometrics at university, and as an econometrician, one part of the novel Improbable by Adam Fawer kept my mind busy for a long time. In the book, there is a conversation between two characters. The main character asks the other, “If you toss a coin, the result is random, right?” He continues: “The coin’s behaviour must obey the rules of Newtonian physics. If we calculate the speed of the wind, the force and angle of my finger, and the material properties of the coin, then we do not have to guess; we can know the outcome every time.”

If there is no unknown variable in an equation, it is no longer a forecast. The magical idea in that dialogue, for me, was quantifying details I had never thought were quantifiable. In school, we learnt how to manage the unknowns in the economic subjects we want to forecast. After I read the book, I often wondered: if technology ever matures to the point where we know all the variables of an economic subject, we will not have to forecast anything; we will simply know what the future looks like. Data technologies are improving very fast, but we are not at that level yet. Are we ever going to be?

Data science has reached an outstanding level of capability and is still improving. It shows off with image recognition, natural language processing, fraud detection, and more. There is no reason it should fail at predicting the sales count of a specific product for the next day. Forecasting techniques might work on questions about the next day’s stock market trend, whether it will go up or down, but forecasting the exact value of a share? I am not saying it is impossible; maybe we can do it someday. But Robert Lucas, at least, would not agree at the macroeconomic level.

Lucas is an economist best known for the Lucas Critique. The critique says that if you forecast an economic value under current conditions and make your plans based on that forecast, the forecast loses its foundation as soon as the conditions change. It sounds reasonable, right?

But this does not prevent us from benefiting from the power of data science. In theory, an organization with a fast data flow that is the first to respond to changing conditions can still forecast the future with high accuracy. The complete set of underlying data for an economic entity can be massive. Think of a company with a thousand different products, monitoring their sales every hour: it measures the sales of each product within a particular product group, city, age group, and day. Imagine it also quantifies every move its competitors make in advertising channels, every price change, and the impact of other influencers on sales. I cannot list everything here, so let us simply say they have every detail about their products, and a great data team on top of it. It is no longer surprising that they can make good forecasts.

Companies already run forecasts for growth, budgets, and demand planning. For instance, many organizations seek answers to questions like these every year (a toy version of the first check follows the list):

  • If next year’s demand is 15% higher than last year’s, is the stock level enough?
  • Will we need additional machines to support the demand?
  • Is the current warehouse capacity enough?
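
As a minimal sketch of the first question, the check reduces to comparing a growth-adjusted forecast against today’s stock and capacity. All numbers below are hypothetical:

```python
# Minimal sketch: does current stock cover an assumed 15% demand growth?
# All numbers are hypothetical.

last_year_demand = 120_000      # units sold last year
growth_assumption = 0.15        # the "+15%" scenario from the question above
current_stock = 90_000          # units on hand today
warehouse_capacity = 150_000    # maximum units the warehouse can hold

forecast_demand = last_year_demand * (1 + growth_assumption)

print(f"Forecast demand: {forecast_demand:,.0f} units")
print(f"Stock covers {current_stock / forecast_demand:.0%} of forecast demand")
print(f"Warehouse utilisation at forecast: {forecast_demand / warehouse_capacity:.0%}")
```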

If you have a long, high-quality time series, highly accurate forecasting results are not a dream. And if decision-makers can compare short-horizon forecasts against the actual data as it arrives, they can adjust their decisions to meet the end-of-year plans. Whether an organization has data at this granularity depends on its economic conditions and its vision.

…You Can Still Manage It

Even though we do not have all the data, we can still make good forecasts with what we already have. There are many kinds of machine learning algorithms for different data types, and they can help produce a good forecast.

Economic forecasts based on sophisticated algorithms can create exceptional results if you have granular, real-time, and high-quality time series data.

At this point, Autonomous Analytics systems are about to become a game-changer for companies with enough live data. These systems keep working when no human is watching: they correlate changes across different variables and adapt themselves as conditions shift. When they find an anomaly or an opportunity in the data, they share it with the right person. Thanks to natural language generation, they can inform us with clean, plain-text summaries in an email.
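
As a toy illustration of that last step (invented metric names, owners, and data structure; not any specific product’s API), the sketch below turns a list of detected anomalies into per-owner plain-text digests:

```python
# Hypothetical sketch: turn detected anomalies into plain-text digests.
# Metric names, owners, and the anomaly structure are all illustrative.

anomalies = [
    {"metric": "orders_berlin", "change": -0.22, "owner": "sales-ops@example.com"},
    {"metric": "ad_spend_search", "change": 0.35, "owner": "marketing@example.com"},
]

def summarize(anomaly: dict) -> str:
    direction = "dropped" if anomaly["change"] < 0 else "rose"
    return (f"Metric '{anomaly['metric']}' {direction} "
            f"{abs(anomaly['change']):.0%} versus its expected value.")

# Group messages by owner so each person gets one digest, not many alerts.
digests: dict[str, list[str]] = {}
for a in anomalies:
    digests.setdefault(a["owner"], []).append(summarize(a))

for owner, lines in digests.items():
    print(f"To: {owner}\n" + "\n".join(f"- {line}" for line in lines) + "\n")
```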

Autonomous Analytics can present, in a single paragraph, essential details in the data that we would never see ourselves. But why do we miss important anomalies, opportunities, or patterns in the data? The reasons are generally about human capacity. However many dashboards and reports you build for an analyst, there is a limit to how much data a person can process and correlate at once. We look at the data through the restrictions of our biases, our assumptions, and our processing capacity. Even if an analyst does notice an anomaly in one of your KPIs, finding the root cause can take hours because of those same restrictions.

We can set thresholds for alerts. Personally, I am fine with that solution, but let us take an example. Your organization has customer segments for marketing purposes, and you monitor their orders with a location-based approach. Your thresholds are +5% and -5% on the sales amount of a product: if today’s sales of a product are 7% lower than yesterday’s, you receive an alert. These thresholds were set by one of your subject matter experts. But if your time series has weekly seasonal behaviour, you will receive tons of alerts. Maybe you switch to a moving average of the sales amount. Still, if there is a general trend in the dataset, say your industry grows 15% yearly, and your moving-average interval is not set correctly, you will receive many false-positive alerts. And if there is an unexpected day inside the moving-average window, it will create another flood of alerts: think of the week after Black Friday, or a system outage. Moving-average formulas cannot exclude these problems by themselves. The sketch below shows the seasonality problem in miniature.
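
Here is a toy sketch with a synthetic sales series: a fixed ±5% day-over-day threshold fires on every normal weekend spike, while comparing against the same weekday one week earlier stays quiet until the pattern genuinely breaks. The data and thresholds are invented for illustration:

```python
# Sketch of the problem above: a fixed +/-5% day-over-day threshold fires on
# ordinary weekly seasonality, while a same-weekday comparison stays quiet.

sales = [100, 102, 98, 101, 99, 140, 150,   # week 1: weekend spike is normal
         101, 103, 97, 100, 98, 70, 149]    # week 2: day 12 is a genuine drop

THRESHOLD = 0.05  # +/-5%, as a subject matter expert might set it

for day in range(7, len(sales)):
    day_over_day = (sales[day] - sales[day - 1]) / sales[day - 1]
    week_over_week = (sales[day] - sales[day - 7]) / sales[day - 7]
    naive = abs(day_over_day) > THRESHOLD       # fires on normal weekends too
    seasonal = abs(week_over_week) > THRESHOLD  # quiet unless the pattern breaks
    print(f"day {day:2d}: d/d {day_over_day:+.0%} -> naive alert {naive}, "
          f"w/w {week_over_week:+.0%} -> seasonal alert {seasonal}")
```

Running it, the naive rule alerts on the ordinary Monday drop after the weekend spike (day 7) and on the rebound after the anomaly (day 13), while the week-over-week comparison alerts only on day 12, where the pattern actually breaks.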

Conclusion

When decision-makers open a dashboard or a report, they see only a very small part of all the company’s metrics. Even if they find high-level answers to their questions, they might ask more questions before making an important decision, and most of the time these new questions cannot be answered with the existing reports. Analysts and developers visualize only a limited number of pre-defined scenarios among countless possibilities. When an additional BI solution is needed, they add yet another report to the current list, but that does not scale. Traditional BI does not help when people dive deep into the data. Autonomous analytics systems work on and analyze the data even when you are not asking; they work on the most granular, reliable time series to find causal relationships and correlations. When you start to ask questions, the answers are already available.

Creating or adopting an Autonomous Analytics platform should be on the roadmap of any company that does not want to lose its edge. Online machine learning algorithms can analyze the data and extract the seasonality, the trend, and the impact of possible outliers. The algorithm can set its own thresholds based on the historical time series and create far more accurate alerts. One way to make such thresholds robust to past outliers is sketched below.
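
As one possible sketch of a self-set threshold (not any specific vendor’s algorithm), the snippet below scores a new value against the rolling median and median absolute deviation (MAD) of recent history. Because the median and MAD are robust statistics, a single past outlier such as a Black Friday spike does not inflate the threshold the way it would inflate a moving average; a full system would also remove seasonality and trend first.

```python
import numpy as np

# Sketch of self-set thresholds: score each new point against the median
# and MAD (median absolute deviation) of recent history. One past outlier
# (e.g. a Black Friday spike) barely moves these robust statistics.

def robust_alert(history: np.ndarray, new_value: float, z: float = 3.5) -> bool:
    """Return True if new_value deviates strongly from recent history."""
    median = np.median(history)
    mad = np.median(np.abs(history - median))
    if mad == 0:                     # flat history: alert on any change
        return new_value != median
    score = 0.6745 * (new_value - median) / mad  # ~z-score under normality
    return abs(score) > z

rng = np.random.default_rng(42)
history = 100 + rng.normal(0, 3, size=28)  # four weeks of a stable metric
history[20] = 400                          # one Black Friday-like spike

print(robust_alert(history, 103))   # False: within normal variation
print(robust_alert(history, 60))    # True: real drop; the spike did not mask it
```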

If you have a system that performs these analyses autonomously and in real time, you are lucky, because you can detect problems and opportunities before the competitors who rely on traditional BI methods. However, some companies can be luckier still: the ones that receive a causal-relationship summary when more than one metric turns anomalous at the same time. Thousands of anomalies can happen in a day, and yet many of them can be tied to one specific cause. If your autonomous analytics system is sophisticated enough, it can show you the most important correlations in an easy-to-read format and let you address the exact reason for the problem, or reach for the opportunity. Here is a great article on anomaly detection and root-cause analysis solutions: Ericsson — How to build robust anomaly detectors with machine learning, written by Nikita Butakov.
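
As a hypothetical illustration of that grouping step, the sketch below collapses anomalies that fire in the same time bucket into a single incident, on the simple assumption that simultaneous anomalies often share one root cause. Timestamps and metric names are invented:

```python
from collections import defaultdict

# Hypothetical sketch: collapse simultaneous anomalies into one incident
# by grouping on the time bucket in which they fired, then report groups.

anomalies = [
    ("2021-03-05 14:00", "checkout_errors"),
    ("2021-03-05 14:00", "orders_total"),
    ("2021-03-05 14:00", "payment_latency"),
    ("2021-03-06 09:00", "ad_clicks"),
]

incidents = defaultdict(list)
for timestamp, metric in anomalies:
    incidents[timestamp].append(metric)

for timestamp, metrics in incidents.items():
    if len(metrics) > 1:
        print(f"{timestamp}: {len(metrics)} metrics anomalous together -> "
              f"likely one root cause: {', '.join(metrics)}")
    else:
        print(f"{timestamp}: isolated anomaly in {metrics[0]}")
```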

Building an Autonomous Analytics platform in-house is possible, but it requires a lot of time, research, and a big budget. To learn more, check out this Anodot blog post: Why Ad Tech Needs a Real-Time Analysis & Anomaly Detection Solution.

Acknowledgements

I would like to thank Satrujeet Rath for proofreading this post and for his revision suggestions.
