Artificial Intelligence – sifting through the hype and the promise for forecasting

Today I sat through a full day of talks at the International Institute of Forecasters Foresight Practitioner Conference, which was focused on the topic of “Artificial Intelligence —The Hype and the Promise for Forecasting and Planning.” There was lots of skepticism (and humor) about the hype, along with some use cases of promising results. Different speakers attempted to define artificial intelligence (AI), with most citing machine learning as a subset of AI. I see very little true AI yet implemented, so I prefer to use the term machine learning, but for the purpose of this post I’ll refer to it as AI/ML. My favorite definition was from a slide of this tweet:

Difference between machine learning and AI:
If it is written in Python, it's probably machine learning
If it is written in PowerPoint, it's probably AI
Mat Velloso (@matvelloso), November 23, 2018

Are we right to expect so much from AI/ML?

Spyros Makridakis, the famed forecasting expert and organizer of the Makridakis Competitions (known as the M-competitions), spoke to us remotely from his post at the University of Nicosia in Cyprus. He quoted Amara’s Law, which says that we tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run. To illustrate the short-run half of that point he quoted three leaders of the machine learning world, Judea Pearl, Yoshua Bengio, and Ali Rahimi, all of whom hold narrower views of the current promise. However, to highlight the long-run perspective, Makridakis shared the 1828 prediction of economist Jean-Baptiste Say about using cars vs. horses: “Nevertheless no machine will ever be able to perform what even the worst horses can – the service of carrying people and goods through the bustle and throng of a great city.” Clearly cars have far exceeded that early prediction. Time will tell if AI/ML does as well.

Challenges of AI/ML

Some common challenges to the promise of AI/ML emerged throughout the day. These models require a LOT of data. Business owners invariably believe they have ready access to all the data needed and that it is in good shape, but both assumptions are almost always proven false, and a lot of time is wasted finding that out. Particular to forecasting, time series data is usually short, which is a problem for a hungry machine learning model. The models are black boxes, and people are leery of results they do not understand and quick to lose trust. Even well-constructed models are brittle, meaning they require a lot of fiddling with their knobs to set up and regular maintenance to ensure they still produce reliable results. People have been led to expect “automagical” results, meaning that AI/ML can perform miracles with minimal human intervention. Many of these models are compute intensive, which means they can take a long time to run. And if the math is hard, the change management needed to implement these models is harder and often overlooked, sometimes even dooming the results.

Some promising results

I can still remember not too many years ago when a forecasting expert I know was emphatic that neural networks, one of the most popular machine learning methods employed today, would never hold any promise for forecasting. In fact, for years only a few stubborn academics believed neural networks were worth studying at all. While neural networks showed some early success, they then fell out of favor during what has been called the AI Winter. My Canadian colleagues can be proud that this winter thawed primarily through the contributions of four professors funded by the Canadian Institute for Advanced Research (CIFAR): Geoffrey Hinton and Yann LeCun at the University of Toronto, Yoshua Bengio at the University of Montreal, and the University of Alberta’s Richard Sutton. As Yuyang (Bernie) Wang of Amazon AI shared in his talk today, in 2012 AlexNet (a neural network with multiple layers, developed by one of Hinton’s graduate students) finally won the ImageNet competition, marking the heating up of interest in neural networks, which has shown no signs of slowing. Bernie went on to describe how they use neural networks for all kinds of forecasting applications at Amazon AI. A particular kind of recurrent neural network, the long short-term memory (LSTM) network, adapts well to time series data, and Rob Stevens of First Analytics talked about some promising results his company has had using this model, after reading a paper by Walmart Labs on their success using LSTM.

A supply chain as a control system, with AI/ML as a control feature

Kinaxis cofounder Duncan Klett drew on his training as an engineer to lay out his premise that a supply chain is a control system. One example we all know is automobile cruise control, which adjusts the throttle to maintain the speed we set. A supply chain also fluctuates, but the “control” is our response, and delay is the killer in a supply chain, blowing it up and leading to the bullwhip effect. If we can build a digital model of the entire supply chain, he maintains, we can reduce delays and respond better instead of waiting to react. Our concurrent planning approach at Kinaxis provides this digital twin, and he explained how we use machine learning in RapidResponse in many ways: to predict things like lead times and yields, to automate routine work so the planner can focus on the exceptions, to figure out through feature engineering which factors are important, and to enrich the forecasts with an extended feature set. We are using Shapley values to increase the interpretability of the features and build trust with users. Better supply chain execution is the goal, such as reducing inventory and stockouts.
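For readers who have not met Shapley values before, here is a minimal sketch of the idea in Python. It is not how RapidResponse computes them (the talk did not go into that level of detail). The lead-time model, its coefficients, the feature names, and the all-zero baseline are invented purely for illustration. The function tries every combination of features, replaces the "missing" ones with a baseline value, and averages how much each feature changes the prediction when it is added.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for a single prediction.

    Features outside a coalition are replaced by their baseline value.
    The cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight shared by every coalition of this size that excludes i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without_i = [x[j] if j in coalition else baseline[j]
                             for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                values[i] += weight * (predict(with_i) - predict(without_i))
    return values


# Hypothetical lead-time model (in days), invented for this example:
# distance and order size add time, and congestion makes distance hurt more.
def predict_lead_time(features):
    distance_km, order_units, congested = features
    return (5.0 + 0.01 * distance_km + 0.002 * order_units
            + 0.005 * distance_km * congested)


x = [800.0, 1000.0, 1.0]     # the order we want explained
baseline = [0.0, 0.0, 0.0]   # assumed reference order

contributions = shapley_values(predict_lead_time, x, baseline)
for name, value in zip(["distance_km", "order_units", "congested"],
                       contributions):
    print(f"{name}: {value:+.2f} days")
```

The contributions always add up to the gap between this prediction and the baseline prediction (19 days vs. 5 days here), which is what makes them easy to explain to a planner. In this toy model the extra 4 days caused by distance and congestion acting together is split evenly between those two features. Real models have far too many features for brute force, so practical tools rely on approximations.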

What is the forecast for AI/ML?

I have to go back tomorrow for the conference conclusion to hear the talk, “Forecasting Artificial Intelligence,” as well as to listen to the panel of speakers field questions from the audience about all the topics discussed. But given the skepticism about the value of AI/ML for forecasting that I’ve heard from so many corners over many years, I was intrigued to hear genuine results reported today. If we can keep our expectations reasonable (no fairy dust allowed), then I agree with Makridakis’ positioning of Amara’s Law: in the long run we are likely to have underestimated the effects of AI/ML for forecasting…and beyond.