Forecasting 101: A Guide to Forecast Error Measurement Statistics and How to Use Them
Error measurement statistics play a critical role in tracking forecast accuracy, monitoring for exceptions, and benchmarking your forecasting process. Interpretation of these statistics can be tricky, particularly when working with low-volume data or when trying to assess accuracy across multiple items (e.g., SKUs, locations, customers, etc.). This installment of Forecasting 101 surveys common error measurement statistics, examines the pros and cons of each and discusses their suitability under a variety of circumstances.
The MAPE (Mean Absolute Percent Error) measures the size of the error in percentage terms. It is calculated as the average of the unsigned percentage error, as shown in the example below:
Many organizations focus primarily on the MAPE when assessing forecast accuracy. Most people are comfortable thinking in percentage terms, making the MAPE easy to interpret. It can also convey information when you don’t know the item’s demand volume. For example, telling your manager, "we were off by less than 4%" is more meaningful than saying "we were off by 3,000 cases," if your manager doesn’t know an item’s typical demand volume.
The MAPE is scale sensitive and should not be used when working with low-volume data. Notice that because "Actual" is in the denominator of the equation, the MAPE is undefined when Actual demand is zero. Furthermore, when the Actual value is not zero, but quite small, the MAPE will often take on extreme values. This scale sensitivity renders the MAPE close to worthless as an error measure for low-volume data.
The MAD (Mean Absolute Deviation) measures the size of the error in units. It is calculated as the average of the unsigned errors, as shown in the example below:
The MAD is a good statistic to use when analyzing the error for a single item. However, if you aggregate MADs over multiple items you need to be careful about high-volume products dominating the results—more on this later.
Less Common Error Measurement Statistics
The MAPE and the MAD are by far the most commonly used error measurement statistics. There are a slew of alternative statistics in the forecasting literature, many of which are variations on the MAPE and the MAD. A few of the more important ones are listed below:
MAD/Mean Ratio. The MAD/Mean ratio is an alternative to the MAPE that is better suited to intermittent and low-volume data. As stated previously, percentage errors cannot be calculated when the actual equals zero and can take on extreme values when dealing with low-volume data. These issues become magnified when you start to average MAPEs over multiple time series. The MAD/Mean ratio tries to overcome this problem by dividing the MAD by the Mean—essentially rescaling the error to make it comparable across time series of varying scales. The statistic is calculated exactly as the name suggests—it is simply the MAD divided by the Mean.
GMRAE. The GMRAE (Geometric Mean Relative Absolute Error) is used to measure out-of-sample forecast performance. It is calculated using the relative error between the naïve model (i.e., next period’s forecast is this period’s actual) and the currently selected model. A GMRAE of 0.54 indicates that the size of the current model’s error is only 54% of the size of the error generated using the naïve model for the same data set. Because the GMRAE is based on a relative error, it is less scale sensitive than the MAPE and the MAD.
SMAPE. The SMAPE (Symmetric Mean Absolute Percentage Error) is a variation on the MAPE that is calculated using the average of the absolute value of the actual and the absolute value of the forecast in the denominator. This statistic is preferred to the MAPE by some and was used as an accuracy measure in several forecasting competitions.
Measuring Error for a Single Item vs. Measuring Errors Across Multiple Items
Measuring forecast error for a single item is pretty straightforward. If you are working with an item which has reasonable demand volume, any of the aforementioned error measurements can be used, and you should select the one that you and your organization are most comfortable with—for many organizations this will be the MAPE or the MAD. If you are working with a low-volume item then the MAD is a good choice, while the MAPE and other percentage-based statistics should be avoided.
Calculating error measurement statistics across multiple items can be quite problematic.
Calculating an aggregated MAPE is a common practice. A potential problem with this approach is that the lower-volume items (which will usually have higher MAPEs) can dominate the statistic. This is usually not desirable. One solution is to first segregate the items into different groups based upon volume (e.g., ABC categorization) and then calculate separate statistics for each grouping. Another approach is to establish a weight for each item’s MAPE that reflects the item’s relative importance to the organization—this is an excellent practice.
Since the MAD is a unit error, calculating an aggregated MAD across multiple items only makes sense when using comparable units. For example if you measure the error in dollars than the aggregated MAD will tell you the average error in dollars.
Measuring forecast error can be a tricky business. The MAPE and MAD are the most commonly used error measurement statistics, however, both can be misleading under certain circumstances. The MAPE is scale sensitive and care needs to be taken when using the MAPE with low-volume items. All error measurement statistics can be problematic when aggregated over multiple items and as a forecaster you need to carefully think through your approach when doing so.
About the author:
Eric Stellwagen is Vice President and Co-founder of Business Forecast Systems, Inc. (BFS) and co-author of the Forecast Pro software product line. He consults widely in the area of practical business forecasting—spending 20-30 days a year presenting workshops on the subject—and frequently addresses professional groups such as the University of Tennessee’s Sales Forecasting Management Forum, APICS and the Institute for Business Forecasting. Recognized as a leading expert in the field, he has worked with numerous firms including Coca-Cola, Procter & Gamble, Merck, Blue Cross Blue Shield, Nabisco, Owens-Corning and Verizon, and is currently serving on the board of directors of the International Institute of Forecasters (IIF).