Data Revisions with FRED®
Compelling Question
How do revisions and updates improve data?
Description
FRED® provides access to current data from more than 100 sources. Some of those sources revise and update their data, and ALFRED® stores all previous versions. For new data users, this article describes why data are revised and updated; for advanced data users, it can serve as a reference.
Introduction
Sources revise data for several different reasons and at multiple moments in time. Occasionally, some data contain errors that must be corrected. Regularly, some data are revised to provide a more complete picture of current or very recent economic conditions. Periodically, some data are updated to produce a more accurate account of long-term trends and patterns.
Every time a new version, or vintage, of a data series is released, FRED displays the latest version, and the replaced version is archived in ALFRED. Economic researchers and consumers of data must reference the data vintage they use by citing the date they access the data.1 Otherwise, discrepancies between data vintages can undermine the credibility of their work.
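The relationship between an access date and a vintage can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how FRED or ALFRED store data: the `VINTAGES` archive, the `series_as_of` helper, and every value below are hypothetical.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical archive: each release (vintage) date maps to the full series
# as published on that date (observation date -> value). Values are made up.
VINTAGES = {
    date(2021, 1, 5): {date(2020, 12, 8): 10.0, date(2020, 12, 9): 11.0},
    date(2021, 1, 6): {date(2020, 12, 8): 10.0, date(2020, 12, 9): 44.0},
    date(2021, 1, 7): {date(2020, 12, 8): 10.0, date(2020, 12, 9): 11.2},
}


def series_as_of(vintages, access_date):
    """Return (vintage_date, series) for the latest vintage released
    on or before access_date."""
    released = sorted(vintages)
    idx = bisect_right(released, access_date)
    if idx == 0:
        raise LookupError(f"no vintage released on or before {access_date}")
    vintage_date = released[idx - 1]
    return vintage_date, vintages[vintage_date]


# A reader who accessed the data on January 6 saw a different value for
# December 9 than a reader who accessed it one day earlier or later.
vintage_date, series = series_as_of(VINTAGES, date(2021, 1, 6))
print(vintage_date, series[date(2020, 12, 9)])
```

Because the access date alone determines which vintage a reader saw, recording it is enough for someone else to recover the same numbers later.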
This article describes how headline economic data are revised and updated to provide a more complete and accurate description of economic conditions. High-quality data facilitate good decisionmaking, so the time and energy that data providers spend improving their data benefit everybody who uses those data.
Revising Data to Correct Errors
The process of collecting information to produce economic statistics is complex, and, despite quality-assurance checks along the way, sometimes erroneous data are released to the public. Not unlike proofreading a text to identify typos, plotting the data in a graph can help identify some of those errors.
Figure 1 shows three consecutive vintages of the same daily data series: The blue line shows the data released on January 5, 2021; the red line shows the data released the following day; and the green line shows the data released the day after that. Every day of the week a new data point is added, so each new line is longer than the previous one.2 Moreover, with the addition of each new observation, the data provider revised previously released data after new information came to light.
For much of the graph, all three lines match up, indicating there were no discrepancies between the vintages. There are some small differences in the data values for the last week of December 2020 between the January 5 (blue) and January 7 (green) data releases. However, the big red spike in the graph reflects the largest discrepancy in data values: The data for December 9, 2020, that were released on January 6, 2021, indicated a 300% increase in value over the vintage released on January 5, 2021. And all the data points in the January 6 release after December 9 had a constant value (almost indistinguishable from zero in the graph), which is also a stark discrepancy from the values in the releases both the day before and the day after. Whether this was the result of a coding error or some other reason, the next day's data vintage, on January 7, corrected these erroneously reported values.
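The same comparison can be made numerically rather than visually. The sketch below is a minimal illustration: the function names, the 50% threshold, and all data values are assumptions chosen so that one observation reproduces a 300% increase between vintages. It is not the procedure any data provider uses.

```python
def percent_change(old, new):
    """Percent change from old to new; None when it is undefined (old is zero)."""
    if old == new:
        return 0.0
    if old == 0:
        return None
    return (new - old) / abs(old) * 100.0


def flag_discrepancies(earlier, later, threshold_pct=50.0):
    """Compare two vintages on the observation dates they share and return
    the dates whose values differ by more than threshold_pct percent."""
    flagged = {}
    for obs_date in sorted(earlier.keys() & later.keys()):
        change = percent_change(earlier[obs_date], later[obs_date])
        # An undefined change (from zero to a nonzero value) is flagged too.
        if change is None or abs(change) > threshold_pct:
            flagged[obs_date] = change
    return flagged


# Made-up values for two consecutive daily releases of the same series.
jan5 = {"2020-12-08": 10.0, "2020-12-09": 11.0, "2020-12-10": 12.0}
jan6 = {"2020-12-08": 10.0, "2020-12-09": 44.0, "2020-12-10": 0.1, "2020-12-11": 0.1}

for obs_date, change in flag_discrepancies(jan5, jan6).items():
    print(obs_date, change)
```

Only dates present in both vintages are compared, so a newly added observation (here December 11) is never reported as a discrepancy.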
Updating the Data Collection Methodology
The process of collecting information to produce economic statistics evolves and improves over time. Changes in methodology frequently result in revised data.3 Not unlike comparing the text between two editions of a book, plotting the data produced using two different methodologies can help identify the scope and impact of the changes involved.4
Figure 2 shows two different vintages of the same monthly data series: The dashed blue line shows the data released on November 9, 2021, and the solid red line shows the data released on December 2, 2021—after a revision in the methodology. The gaps between the blue and red lines in the graph represent the differences in the reported statistic resulting from the revised methodology. You can observe relatively small differences in data values between vintages during some years and almost none during others. Overall, the impact of the change in methodology is most noticeable for the data reported since 2020.
Besides comparing data vintages, you can sometimes identify changes in data collection methodologies through a one-time large variation in the value of the data.5 In those cases, the historical record of the data themselves isn't revised, so the analysis of data values before and after the change in methodology must account for it.
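One rough way to look for such a one-time jump is to compare each period-to-period change with the typical change in the series. The sketch below is a simple heuristic under stated assumptions: the `find_level_breaks` name, the multiple of 5, and the sample values are all hypothetical, and a flagged point is only a prompt to check the source's documentation, not proof of a methodology change.

```python
from statistics import median


def find_level_breaks(values, multiple=5.0):
    """Return each index i where |values[i] - values[i-1]| exceeds
    `multiple` times the median absolute period-to-period change."""
    changes = [abs(b - a) for a, b in zip(values, values[1:])]
    if not changes:
        return []
    typical = median(changes)
    if typical == 0:
        # A mostly flat series: any nonzero change stands out.
        return [i + 1 for i, c in enumerate(changes) if c > 0]
    return [i + 1 for i, c in enumerate(changes) if c > multiple * typical]


# Made-up series with a single jump between the fourth and fifth observations.
series = [100.0, 101.0, 100.5, 101.5, 131.0, 132.0, 131.5, 132.5]
print(find_level_breaks(series))  # → [4]
```

Once a break is located, values before and after it are not directly comparable unless the analysis adjusts for the change.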
Improving the Accuracy and Completeness of the Data
The process of collecting information to produce economic statistics is time consuming. Because of that, there is frequently a tradeoff between the timeliness and the accuracy of information provided by the data. Not unlike a copyedited and revised draft of a manuscript, the final version of data provides more complete and precise information about current or very recent economic conditions.
Figure 3 shows 12 different vintages of the same monthly data series: The solid black line shows, as of January 7, 2022, the monthly changes in total nonfarm employment for each month in 2021. For each month in the graph, the three red dots show the initially announced value for the monthly change in total nonfarm employment, the first revision, and the second and final revision. The distance between the red dots and the black line shows the magnitude of the data revisions. In some cases the employment figures were revised up, and in other cases they were revised down.
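The arithmetic behind the distances in the graph is simple subtraction, as the following sketch shows. The `MonthlyEstimates` class and every number in it are hypothetical and are not actual employment figures.

```python
from dataclasses import dataclass


@dataclass
class MonthlyEstimates:
    month: str
    initial: float          # initially announced monthly change
    first_revision: float   # value after the first revision
    second_revision: float  # value after the second revision
    latest: float           # value in the most recent vintage

    def revision_path(self):
        """Change introduced at each step: initial -> first -> second -> latest."""
        steps = [self.initial, self.first_revision, self.second_revision, self.latest]
        return [later - earlier for earlier, later in zip(steps, steps[1:])]

    def total_revision(self):
        """Distance between the initial announcement and the latest value."""
        return self.latest - self.initial


# Made-up monthly changes (thousands of jobs), for illustration only.
estimates = [
    MonthlyEstimates("month A", 100.0, 120.0, 130.0, 150.0),
    MonthlyEstimates("month B", 300.0, 280.0, 270.0, 260.0),
]

for e in estimates:
    total = e.total_revision()
    direction = "up" if total > 0 else "down" if total < 0 else "unchanged"
    print(f"{e.month}: revised {direction} by {abs(total):.0f}; steps {e.revision_path()}")
```

A positive total corresponds to a month that was revised up, and a negative total to a month that was revised down.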
Monthly revisions to employment data are a matter of common practice.6 So are quarterly revisions to gross domestic product (GDP) and annual revisions to the consumer price index (CPI). These revisions reflect additional information collected from consumers, businesses, and government agencies since the initial (or preliminary) data were released.
Finally, headline economic indicators are periodically updated to reflect current economic practices and patterns. For example, the relative weights of goods and services purchased across the different categories of spending included in the CPI are updated every two years. At those times, the Bureau of Labor Statistics recalculates the CPI over the previous five years.7
Figure 4 shows eight different vintages of real GDP for the second and third quarter of 1991. The vintage dates included in the series names are the dates when the Bureau of Economic Analysis released a comprehensive update to the standards and methods it uses to measure economic activity. On those dates, the source also updated its GDP statistics by moving forward the reference year for its inflation adjustment and price measures.
The legend on the left axis of the graph lists the reference years for the eight comprehensive updates to the real GDP series available in FRED: 1987, 1992, 1996, 2000, 2005, 2009, 2012, and 2017. As the general price level rises over time, each successive bar is taller than the previous one. However, absent substantial changes to the types of economic activity being measured, the compounded annual rates of growth of real GDP are very similar across vintages.8
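A short calculation shows why the levels differ across reference years while the growth rates do not. The sketch treats a change of reference year as a pure rescaling of the levels, which is a simplification because comprehensive updates also change methods and source data. The levels and price factors below are hypothetical.

```python
def compounded_annual_rate(previous, current, periods_per_year=4):
    """Compounded annual rate of change, in percent, between two
    consecutive periods (quarters by default)."""
    return ((current / previous) ** periods_per_year - 1.0) * 100.0


# Hypothetical Q2 and Q3 levels of real GDP in the prices of an early
# reference year (billions of dollars).
q2, q3 = 5000.0, 5025.0

# Hypothetical ratios of the price level in later reference years to the
# price level in the first one.
price_factors = {"reference year A": 1.0, "reference year B": 1.25, "reference year C": 1.6}

for label, factor in price_factors.items():
    rate = compounded_annual_rate(q2 * factor, q3 * factor)
    print(f"{label}: Q2 = {q2 * factor:.1f}, Q3 = {q3 * factor:.1f}, growth = {rate:.2f}%")
```

Multiplying both quarters by the same factor raises the bars but cancels out of the ratio, so the growth rate is the same in every row.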
Summary
More accurate data facilitate better decisionmaking. When data sources employ time and resources to improve their data, everybody who uses those data benefits. Because there can be significant differences in data values across series vintages, it is important to provide a complete citation, including the date you accessed the series, when using data.
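For readers who retrieve data programmatically, a vintage can be pinned in the request itself. The sketch below only builds a request URL and does not contact any server. The endpoint and the `realtime_start`, `realtime_end`, `file_type`, and `api_key` parameter names reflect the public FRED API as commonly documented, but they should be treated as assumptions and checked against the current API documentation. The `vintage_request_url` helper and the placeholder key are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the FRED API for series observations.
FRED_OBSERVATIONS_URL = "https://api.stlouisfed.org/fred/series/observations"


def vintage_request_url(series_id, vintage_date, api_key):
    """Build a URL requesting a series as it was known on vintage_date
    (YYYY-MM-DD) by setting both real-time bounds to that date."""
    params = {
        "series_id": series_id,
        "realtime_start": vintage_date,
        "realtime_end": vintage_date,
        "file_type": "json",
        "api_key": api_key,
    }
    return f"{FRED_OBSERVATIONS_URL}?{urlencode(params)}"


# PAYEMS is the FRED series ID for total nonfarm employment.
print(vintage_request_url("PAYEMS", "2022-01-07", "YOUR_API_KEY"))
```

Storing the vintage date alongside the results gives the same information a written citation provides with its access date.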
Notes
Glossary
Methodology: Any formal procedure used to gather and measure information to produce data.
Tradeoff: An exchange or a compromise between two different outcomes or features.
Vintage: A version of a data series available at a particular moment in time.