Monday, September 27, 2021

Cross-checking ShadowStats

Last week I wrote about Balaji Srinivasan's idea of creating a decentralized version of the Billion Prices Project. The post got me thinking again about the topic of alternative inflation indexes.

One of the best-known alt-inflation indexes is John Williams' ShadowStats, often cited by gold bugs and bitcoin maximalists. As of August 2021, ShadowStats puts U.S. inflation at 13% versus official inflation of 5%, as illustrated in the chart below.

Source: ShadowStats

That's a huge gap. One of the two data series has to be wrong.

I've always dreamt about writing a blog post on ShadowStats, but never had the gumption or statistical chops for it. So I was happy to see economist Ed Dolan announce on Twitter that he was republishing a 2015 blog post in which he carefully critiqued ShadowStats. It's such a good article that I'm not going to bother writing my own ShadowStats post anymore.

ShadowStats attracts a lot of sneers from the econ commentariat. What makes Dolan's post so effective is that he gracefully takes Williams' arguments on their merits and then proceeds to analyze them. Put differently, he doesn't try to damn ShadowStats with straw man arguments. He steel-mans it (or steelwomens it).   

Anyways, do read the post. 

Dolan saves his best criticism for the end. When Dolan was writing his post in 2015, the gap between official inflation and ShadowStats inflation was a whopping 7 percentage points (see chart above). What Dolan finds is that the majority of this gap can be attributed to a simple double-counting error committed by Williams. Once this double-counting error is corrected, the ShadowStats inflation number shrinks. And so the gap between it and the official CPI is actually far less menacing than Williams' anti-government fans like to make out.

Dolan challenges Williams to correct his double-counting mistake. But you can see why Williams might find this difficult to do. He has been selling his data for many years on a subscription basis. Admitting that his product contains errors could anger his customer base.

The other part of Dolan's blog post that I want to draw attention to is a set of simple cross-checks he performs to see whether official inflation or ShadowStats is more accurate. For instance, taking grocery prices from a 1982 advertisement and projecting them forward with both inflation indexes, Dolan finds that the official CPI does a better job of predicting where modern grocery prices actually ended up.
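To make the mechanics of that kind of cross-check concrete, here is a minimal Python sketch. It is not Dolan's calculation and it uses none of his data: the base price, the flat annual rates, and the "observed" price are all made-up numbers, and `project_price` is a hypothetical helper. The idea is simply to compound an old price forward under each inflation series and see which projection lands closer to what the item actually costs now.

```python
def project_price(base_price, annual_rates):
    """Compound a base-year price forward through a list of annual inflation rates."""
    price = base_price
    for rate in annual_rates:
        price *= 1.0 + rate
    return price


# All numbers below are hypothetical, chosen only to illustrate the method.
base_price = 1.00        # price of an item in an old advertisement, in dollars
years = 33               # number of years to project forward
observed_today = 2.75    # made-up price of the same item today

# Two assumed inflation paths: a low "official-like" one and a high "shadow-like" one.
paths = {
    "official-like": [0.03] * years,
    "shadow-like": [0.10] * years,
}

projections = {name: project_price(base_price, rates) for name, rates in paths.items()}

# The series whose projection is nearest the observed price passes the cross-check.
closest = min(projections, key=lambda name: abs(projections[name] - observed_today))

for name, value in projections.items():
    print(f"{name}: projected price ${value:.2f}")
print("closest to observed price:", closest)
```

Small differences in the annual rate compound into enormous differences over three decades, which is why even a rough comparison against real shelf prices is such a telling test.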

It would be unfair to do just one set of cross-checks. Which is why Dolan does a bunch of them. It's worth reading through each one. ShadowStats does not come out well. (For instance, in order for ShadowStats to be right, you've got to believe that the U.S. economy has been in a recession for the last two decades.)

To finish my blog post off, I'm going to add to Dolan's list of cross-checks by adding one of my own. This cross-check is meant specifically for one of the main consumers of ShadowStats data: gold bugs.

If gold investors think ShadowStats data is right, and many of them do, then they also have to accept that gold has lost 91% of its value since January 1980 (see chart below of the gold price adjusted for ShadowStats inflation). Which means that the yellow metal is an awful hedge against inflation, and anyone who buys it for that reason is making a big mistake.

Source: Bullionstar

The far more reasonable position to take is that the ShadowStats data is wrong, and that gold has actually been a decent hedge against inflation since 1980. Using official inflation numbers rather than ShadowStats, the price of gold today is almost even with its 1980 level.
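The arithmetic behind this cross-check is simple enough to sketch in a few lines of Python. The 91% figure comes from the chart above; everything else here is an assumption for illustration, including the 2.5x nominal rise in the gold price and the helper names `real_change` and `implied_price_level`. The point is only to show how a nominal price change and a cumulative inflation figure combine into a real gain or loss.

```python
def real_change(nominal_start, nominal_end, price_level_multiple):
    """Fractional change in purchasing power.

    price_level_multiple is the end-period price level divided by the
    start-period price level (e.g. 3.0 means prices tripled).
    """
    return (nominal_end / nominal_start) / price_level_multiple - 1.0


def implied_price_level(nominal_start, nominal_end, real_change_fraction):
    """Price-level multiple that would produce a given real change."""
    return (nominal_end / nominal_start) / (1.0 + real_change_fraction)


# Hypothetical: suppose the nominal gold price rose 2.5-fold over the period.
gold_start, gold_end = 1.0, 2.5

# For gold to have lost 91% of its real value, prices must have risen by this multiple.
shadow_multiple = implied_price_level(gold_start, gold_end, -0.91)

# For gold to have roughly held its value, prices need only have risen 2.5-fold.
flat_multiple = implied_price_level(gold_start, gold_end, 0.0)

print(f"price-level multiple implied by a 91% real loss: {shadow_multiple:.1f}x")
print(f"price-level multiple implied by no real change: {flat_multiple:.1f}x")
```

Under these assumed inputs, accepting the 91% loss means accepting that the general price level rose more than ten times as much as it would have if gold had merely kept pace with inflation.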

So gold bugs, you can relax. You haven't lost your sanity -- gold is not an awful inflation hedge. Rather, ShadowStats is an awful measure of inflation.

Tuesday, September 14, 2021

A decentralized version of MIT's Billion Prices Project

Balaji Srinivasan, an angel investor, wants to kick-start an updated version of MIT's Billion Prices Project. He will invest $100,000 in the project that best envisions how to create a publicly-available decentralized inflation dashboard, one that relies on scraped data from retailer websites.

Many years ago I was a big fan of MIT's Billion Prices Project, so I perked up when I read about Srinivasan's contest. Created by economists Roberto Rigobon & Alberto Cavallo, the Billion Prices Project collected, or scraped, data from retailers' websites and used it to generate an alternative version of various government-tabled consumer price indexes. (I wrote about the Project here.) Members of the public could get access to Billion Prices U.S. data, albeit with a small delay.

This was incredibly useful! Because government consumer price indexes are published monthly, but websites can be scraped 24/7, the Billion Prices Project was far more responsive to price changes than government consumer price indexes are. It gave you insights into tomorrow's CPI announcement, today.

The Billion Prices Project also garnered attention because it revealed how Argentinean authorities had distorted official statistics to make inflation appear more muted than it really was. Conversely, the Billion Prices Project regularly confirmed the accuracy of U.S. Bureau of Labor Statistics' consumer price indexes, making it a useful tool for whacking gold bugs and inflation truthers over the head.  

While I like Srinivasan's general idea of bringing real-time scraped inflation data to the masses, I see three big problems.

The first problem is over-reliance on scraped data. Scraping is fast and cheap, but only a portion of the global economy's prices are scrape-able. Amazon and Walmart may sell almost every type of physical good under the sun here in Canada and the U.S., but they don't sell services. So while it's easy to find scraped prices of laptop computers, forget about prices for haircuts, rent, or healthcare.

That leaves a pretty big hole. Government statistical agencies such as the Bureau of Labor Statistics (BLS) or Statistics Canada are able to capture services prices because they send out human inspectors to check the prices of things like haircuts and back-rubs. Lacking price data on these items, Srinivasan's inflation dashboard will never be as accurate as the dashboards published by Statistics Canada or the BLS.

Consider too that goods in many developing and undeveloped countries are not available online. Amazon, for instance, isn't going to provide any clues into what is going on with vegetable prices in Afghanistan, or shoe prices in Yemen. Srinivasan says that he wants an "internationally useful" dashboard, but he's certainly not going to get one by relying on scraping alone. He's going to get a rich folks' dashboard.

Which leads into the second problem: the business model won't work. Compiling inflation indexes is costly, but Srinivasan wants his decentralized inflation dashboard to be made public, and presumably free. That's just not possible.

Rigobon & Cavallo's own Billion Prices Project is a good example of this dilemma.

Mere grants weren't enough to fund the Billion Prices Project. Yes, scraping may be cheaper than using physical data collectors, but it's still expensive to compile price indexes. Bills had to be paid. And so the whole Billion Prices Project sold out. It was folded into a company called PriceStats and sold as a proprietary product to rich investors and central banks.

At first PriceStats continued to offer some free public dashboards. But this was never going to last. Rigobon & Cavallo's data had commercial value because it was quicker than government data, and could be used by traders to beat the market. Making even a portion of that data available to the public destroyed its commercial value. And so over time the public-facing parts were all discontinued. The Billion Prices Project, at least the public service side of it, is effectively dead. 

How data from PriceStats/The Billion Prices Project overlapped with US consumer price indexes [source]

Srinivasan's proposal faces the same tradeoffs as the Billion Prices Project. Price data is expensive to collect, compile, store, and process. Government agencies like the BLS are funded by taxes, not profits, and so they can give it away for free. We all benefit from this public service. But the calculus is different for private companies. To fund data collection, they must implement some sort of pay-wall. Srinivasan wants to make a public inflation dashboard, much like the BLS does. But he can't. He's not a government. 

(And no, an inflation dashboard won't be able to rely on advertising revenues the way, say, Coinmarketcap does. Frenetic gamblers are addicted to checking coin prices. Inflation data doesn't attract eyeballs.)

The last problem with Srinivasan's project is the basket problem. The introductory page that describes the project focuses on how to scrape for data. But this omits one of the biggest challenges to compiling any consumer price index: determining what the consumer price basket actually is. That is, what exactly is the "basket" of goods and services that the average consumer consumes each month?  
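To see why the basket matters as much as the prices, consider this minimal Python sketch of a basket-weighted price index. It is a simplification, not the BLS's actual method, and every number in it is invented: the price changes, the category names, and both sets of weights are hypothetical, as is the `basket_index` helper. It feeds the same price changes through two different baskets.

```python
def basket_index(weights, price_relatives):
    """Weighted average of price relatives (current price / base price), base = 100.

    Only categories present in `weights` are used; weights are normalized
    so they do not need to sum to exactly 1.
    """
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[item] * price_relatives[item] for item in weights)
    return 100.0 * weighted_sum / total_weight


# Hypothetical price relatives: 1.06 means the price rose 6% since the base period.
price_relatives = {
    "laptops": 0.95,
    "groceries": 1.04,
    "rent": 1.06,
    "haircuts": 1.05,
}

# A basket limited to goods that are easy to scrape (weights are made up).
goods_only = {"laptops": 0.5, "groceries": 0.5}

# A basket that also covers services (weights are made up).
with_services = {"laptops": 0.05, "groceries": 0.15, "rent": 0.60, "haircuts": 0.20}

print(f"goods-only basket index: {basket_index(goods_only, price_relatives):.2f}")
print(f"basket with services index: {basket_index(with_services, price_relatives):.2f}")
```

With identical underlying prices, the first basket shows prices falling slightly while the second shows them rising by about 5%. Getting the weights right is at least as important as collecting the prices.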

Government statistics agencies such as the Bureau of Labor Statistics solve this problem by conducting national surveys. For instance, the BLS's baskets are based on interviews with 24,000 Americans each quarter about their spending habits. The BLS gets even more precise data by having 12,000 of those participants keep a detailed diary that lists all expenses for a week.

But that's an incredibly resource-intensive process.

To avoid having to run costly surveys in order to build a representative consumption basket, the Billion Prices Project had a simple solution: it borrowed the BLS's baskets. But Srinivasan's project has declared this solution to be out of bounds. The project's website describes inflation as a "government-caused problem," and so the project can't rely on "government statistics."

Which means that Srinivasan's project will have to build its own representative price basket using its own surveys. Unless it can bring the same amount of financial resources to bear as the BLS, I don't see how it can pull this off.

Alternatively, the project will have to use the BLS's "untrustworthy" data. But that means contradicting its stated philosophy.

To sum up, Srinivasan envisions his decentralized inflation dashboard as being a superior alternative to untrustworthy government dashboards. But government consumer price indexes are far better than he is making them out to be, given the huge amount of money, time, and expertise committed to statistics agencies. (Yes, there are exceptions like Argentina.) If any inflation dashboard is likely to be untrustworthy, I fear it will be Srinivasan's built-on-the-cheap dashboard.

(By the way, you'll notice I didn't discuss the decentralized aspect of the inflation dashboard. The project has enough challenges already, before even getting to the decentralized bit.)

All that being said, I'm in the same camp as Srinivasan. Scraped inflation data is neat and useful, and I think the public should be getting access to it. But my preferred solution is different than the one put forth by Srinivasan. Hey, BLS and Statistics Canada! When are you ever going to unveil some sort of free real-time consumer price index that relies on scraped data?


Srinivasan responds. Joe Weisenthal blogs.