One thing that sometimes gets mentioned in discussions about the relative scarcity of certain Calbee baseball card sets in Japan is that a poor potato harvest in a given year led Calbee to reduce sales of its baseball cards that year. This seems logical since the two are inextricably linked - for years all Calbee cards have only been sold with potato chips, so if there are fewer potatoes to make those chips it's not entirely inconceivable that this would result in fewer baseball cards as well.
This has always intrigued me since it suggests the existence of a potential correlation between two variables - potato and baseball card production - that would be unique to Japan and to the hobby. And it would add a neat piece of hobby trivia to all the other bits floating around out there.
So I decided to try to test statistically whether this correlation exists.
To do so I gathered data on both variables: potatoes and cards. For data on annual potato production in Japan I consulted the Potato Pro website's Japan page which allows you to search official figures on potato production. I hand collated the annual data from 1997 to 2013.
For data on Calbee baseball cards I could not locate official figures on annual production from Calbee, so I had to use a proxy. Yahoo Auctions listings in its baseball card category for Calbee cards are broken down on a year by year basis and may be a useful substitute for official figures in this regard, subject to certain limitations. Yahoo Auctions is the biggest auction site in Japan, effectively the country's equivalent of eBay, and is one where a large volume of baseball cards is bought and sold. At the time of writing there were 69,997 Calbee cards listed for sale, so it represents a relatively large pool of data. The number of Calbee cards available for sale on Yahoo Auctions in a given year category is not a perfect proxy for the number of cards produced in that year, but the numbers are large enough to suggest that differences in the number of cards originally produced would show up as differences in the number available on Yahoo Auctions today. I thus hand collated the data on the number of listings for cards available each year from 1997 to 2013.
I chose to set the year range from 1997 to 2013 for two reasons. 1997 was chosen as a cut off point since it marked the beginning of the "modern" style of Calbee card, and the availability of cards on Yahoo Auctions is likely to be a better reflection of the number of cards originally produced for cards after that year since the surviving population is less likely to have been affected by events like moms throwing them away, as cards from the 1970s to early 90s were (in Japan the collecting hobby developed a couple of decades later than its American counterpart). 2013 was selected as the upper limit simply because from 2014 onwards Yahoo Auctions stopped breaking listings down by year (for some reason). One other oddity worth noting is that for the years 2000 and 2001 for reasons that are unclear Yahoo Auctions lumped both years into a single category, so I assigned half to each year in my data. Another limitation to note is that while most of the listings are for single cards, some are for lots, something I haven't taken the time to weed out.
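For anyone who wants to replicate the 2000/2001 step, here is a minimal sketch of splitting a combined category evenly across the years it covers. The function name, the "2000-2001" key and the counts are all placeholders I made up for illustration, not the real Yahoo Auctions figures.

```python
def split_combined_years(listings, combined_key, years):
    """Spread one combined category's count evenly across the years it covers."""
    # Copy everything except the combined category.
    result = {key: count for key, count in listings.items() if key != combined_key}
    share = listings[combined_key] / len(years)
    for year in years:
        result[year] = share
    return result

# Placeholder counts for illustration only (not real listing numbers).
raw = {1999: 9000, "2000-2001": 5000, 2002: 1200}
by_year = split_combined_years(raw, "2000-2001", [2000, 2001])
print(by_year)
```

An even split is the simplest assumption available; it says nothing about how the listings were really distributed between the two years.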
What do the data tell us? I have tried to present them in the line chart at the top of this post. The orange line tells you the number of Calbee cards from a given year available on Yahoo Auctions, while the blue line tells you the domestic potato production that year (expressed in 10s of thousands of tons).
On the card side you can see there is a huge spike in 1999, then a massive drop in 2002 which recovered in 2003 and 2004, since when there has been a general declining trend though marked by fluctuations from year to year.
On the potato side there has been a decline in potato production between 1997 (3,390,000 tons) and 2013 (2,400,000 tons), though there is a fair bit of fluctuation year on year in there too (less pronounced however).
When I run a simple linear regression analysis with the baseball cards set as the dependent variable and potato production as the independent, the results suggest there is no meaningful correlation between the two (R squared = 0.050602). In other words, on this data potato production explains only about 5% of the variation in baseball card availability.
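For readers curious what that R squared figure is actually measuring, here is a rough sketch of the calculation for a simple two-variable regression, where R squared is just the squared correlation between the two series. This is not necessarily how my software computed it, and the numbers in the example are placeholders, not my collated data.

```python
def r_squared(x, y):
    """R-squared of a simple linear regression of y on x (with an intercept)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be the same length, with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables need some variation")
    # With one predictor, R-squared equals the squared Pearson correlation.
    return (sxy ** 2) / (sxx * syy)

# Placeholder numbers for illustration only (not the real figures).
potatoes = [300, 310, 290, 280, 270]       # independent variable
listings = [4000, 5200, 3100, 4500, 3900]  # dependent variable
print(round(r_squared(potatoes, listings), 3))
```

A value near 0 means the potato figures tell you almost nothing about the card figures; a value near 1 means the two move almost in lockstep.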
After running that though I realized that there was one factor which I had to adjust for. In the years 2007 and 2009-2013 Calbee distributed their cards two per bag. In the other years, they only distributed one per bag. This would suggest that the "two cards per bag" years were being over-represented in the Yahoo Auctions data since every two cards would represent one bag of potato chips in those years. To correct for this, I divided the number of Yahoo Auctions listings for those years in half.
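The adjustment is simple enough to sketch in a few lines. The set of two-per-bag years comes from the paragraph above; the function name and the listing counts are placeholders for illustration.

```python
# Years in which Calbee packed two cards per bag (as listed in the post).
TWO_PER_BAG_YEARS = {2007, 2009, 2010, 2011, 2012, 2013}

def bag_equivalent_listings(listings_by_year):
    """Halve the listing count for two-per-bag years so each unit is one bag."""
    return {
        year: count / 2 if year in TWO_PER_BAG_YEARS else count
        for year, count in listings_by_year.items()
    }

# Placeholder counts for illustration only (not real listing numbers).
sample = {2006: 3500, 2007: 4000, 2008: 3000, 2009: 3600}
adjusted = bag_equivalent_listings(sample)
print(adjusted)
```

Halving assumes that cards from two-per-bag years end up listed at the same rate per card as cards from other years, which is an assumption rather than something I can verify.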
Running the same regression using that data, the R-squared jumps to 0.328352. That is still low - especially given the small sample size involved (just 16 years of data) - and suggests that a lot of other factors which this simple two-variable model isn't capturing are more important than potato production in determining how many Calbee cards get made. But it's at least big enough that the relevance of potato production might be a bit more than background noise and could have some effect on Calbee card availability.
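One rough way to gauge whether an R-squared of this size could just be noise with so few data points is to convert it into a t statistic for the regression slope, which can then be looked up in a t table with n - 2 degrees of freedom. This is a sketch of that standard conversion, not something I ran as part of the original analysis; the function name is my own, and n is taken from the sample size mentioned above.

```python
import math

def slope_t_statistic(r_squared, n):
    """Magnitude of the t statistic for the slope in a simple linear regression.

    Uses the identity t^2 = R^2 * (n - 2) / (1 - R^2), which holds when there
    is a single predictor plus an intercept.
    """
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    return math.sqrt(r_squared * (n - 2) / (1 - r_squared))

# R-squared from the adjusted regression and the sample size quoted in the post.
t = slope_t_statistic(0.328352, 16)
print(t)
```

Even if the t statistic clears the usual threshold, that only says the pattern is unlikely to be pure chance under the model's assumptions; it doesn't tell you potatoes are the cause.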
From a statistical point of view I haven't exactly put this question to a very rigorous analysis here, but I think it was kind of an interesting exercise. In terms of its limitations, if you look at the data from individual years it's quite easy to see that potato production isn't always a major determinant of Calbee baseball card production in a given year. 2002 illustrates this point well - as you can see from the chart there was a huge drop in baseball card production that year, but at the same time potato production reached the second highest level in the data set (tied with 1998). This was an outlier though - 2002 was the year Japan co-hosted the World Cup and the greater interest in soccer that year reduced demand for baseball cards. For other years it's worth noting that a lot of the correlation comes from the period between 2004 and 2013 in which both baseball card availability and potato production slowly declined in tandem. It's entirely possible that this is pure coincidence and the two have nothing to do with each other. Or not, who knows? We need more data on other variables to try to untangle this mess.