GCP Tips & Tricks: The Cost Of Storing 3PB In 6 Million Files In GCS In Different Storage Classes

Last week we examined the surprisingly cost-effective economics of GCS' different storage classes. Let's look at a real-world example: the cost of storing 3PB across 6 million files in GCS in a single bucket in the US Multiregion in the four storage classes, along with the cost of reading all 3PB once every month.

The end result is that for applications that read all of their data once a month and store it for at least 3 months, Standard and Coldline cost nearly the same per month, with substantial savings for Coldline if less than the entire archive is read each month. If only a handful of files out of the 3PB archive are read each month, Coldline can be nearly 4 times cheaper than Standard, making it ideal for archives that must reprocess large datasets every few months but otherwise leave the data at rest.

To recap, Google Cloud Storage (GCS) offers four storage classes: Standard (the default, offering maximal-performance global object storage), Nearline (realtime access, lower at-rest storage costs, a 30-day minimum commitment and a per-GB fee to access data), Coldline (cheaper at rest than Nearline, with a higher access fee and a 90-day commitment) and Archive (cheapest at rest, with the highest access fee and a 365-day commitment). In essence, each progressively lower tier offers cheaper at-rest monthly fees at the cost of higher per-GB access fees and a longer minimum storage commitment (deleting or changing the data sooner than the minimum incurs the full price as if it had been stored for the entire period).
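
The minimum-commitment rule can be sketched in a few lines. This is a minimal illustration rather than GCS's actual billing logic: the function name and dictionary are hypothetical, the 30/90/365-day minimums come from the recap above, and Standard is assumed to have no minimum.

```python
# Minimum storage commitment per class, in days (from the recap above).
# Standard is assumed to have no minimum commitment.
MIN_STORAGE_DAYS = {"Standard": 0, "Nearline": 30, "Coldline": 90, "Archive": 365}


def billed_storage_days(storage_class: str, days_stored: int) -> int:
    """Days of at-rest storage billed for an object deleted after days_stored days.

    Deleting or changing an object before the minimum is billed as if it
    had been stored for the full minimum period.
    """
    return max(days_stored, MIN_STORAGE_DAYS[storage_class])


if __name__ == "__main__":
    for cls in MIN_STORAGE_DAYS:
        print(cls, billed_storage_days(cls, 45))
```

Under these assumptions, an object deleted after 45 days is billed for 45 days in Standard and Nearline, but for the full 90 days in Coldline and 365 days in Archive.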

Using the pricing table for the four storage classes, we arrive at the following table. This assumes 6 million files in a single US Multiregion bucket totaling 3PB in size. The Hosting Cost is the total at-rest monthly charge for storing the 3PB. The Class A Ops Cost is the charge for Class A operations on the 6 million files and is included in the Cost Host + Access total. The Access Once Cost is the total monthly charge for reading all 3PB once that month. This assesses the worst-case scenario in which all of the data still requires once-monthly processing. The Cost Host + Access is the combined total monthly cost of hosting 3PB and reading all 3PB once each month. The Cost Host No Access is the monthly cost of hosting all 3PB and not reading any of it that month. The two Cost Reduction No Access columns compare each class's no-access cost against Standard's, as a ratio and as a dollar difference.

| Storage Class | Hosting Cost ($) | Class A Ops Cost ($) | Access Once Cost ($) | Cost Host + Access ($) | Cost Host No Access ($) | Cost Reduction No Access (Ratio) | Cost Reduction No Access ($) |
|---|---|---|---|---|---|---|---|
| Standard | 78,000 | 60 | 0 | 78,060 | 78,000 | | |
| Nearline | 45,000 | 120 | 30,000 | 75,120 | 45,000 | 1.73 | 33,000 |
| Coldline | 21,000 | 240 | 60,000 | 81,240 | 21,000 | 3.71 | 57,000 |
| Archive | 7,200 | 600 | 150,000 | 157,800 | 7,200 | 10.83 | 70,800 |
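
The arithmetic behind the table can be reproduced with a short script. This is a sketch under stated assumptions, not an official price calculator: the per-GB and per-operation rates below are back-derived from the table's own totals (assuming 1PB = 1,000,000 GB and one Class A operation per file), and the names `RATES`, `monthly_costs` and `reduction_vs_standard` are hypothetical. Check the current GCS pricing page before relying on any of these rates.

```python
from decimal import Decimal

TOTAL_GB = 3_000_000    # 3PB, assuming 1PB = 1,000,000 GB
NUM_FILES = 6_000_000   # assumed: one Class A operation per file

# Rates in USD, back-derived from the table's totals (assumptions, not an
# official price list): (storage per GB-month, retrieval per GB,
# Class A operations per 10,000).
RATES = {
    "Standard": (Decimal("0.026"), Decimal("0"), Decimal("0.10")),
    "Nearline": (Decimal("0.015"), Decimal("0.01"), Decimal("0.20")),
    "Coldline": (Decimal("0.007"), Decimal("0.02"), Decimal("0.40")),
    "Archive": (Decimal("0.0024"), Decimal("0.05"), Decimal("1.00")),
}


def monthly_costs(storage_class: str) -> dict:
    """Recompute one table row for the given storage class."""
    storage, retrieval, class_a_per_10k = RATES[storage_class]
    hosting = storage * TOTAL_GB
    class_a = class_a_per_10k * NUM_FILES / 10_000
    access = retrieval * TOTAL_GB
    return {
        "hosting": hosting,
        "class_a": class_a,
        "access": access,
        # The table's Host + Access total includes the Class A column.
        "host_plus_access": hosting + class_a + access,
        "host_no_access": hosting,
    }


def reduction_vs_standard(storage_class: str) -> tuple:
    """No-access savings relative to Standard as (ratio, dollars)."""
    standard = monthly_costs("Standard")["host_no_access"]
    other = monthly_costs(storage_class)["host_no_access"]
    return round(standard / other, 2), standard - other


if __name__ == "__main__":
    for name in RATES:
        print(name, monthly_costs(name)["host_plus_access"])
```

`Decimal` is used instead of floats so that the recomputed totals match the table's figures exactly rather than to within rounding error.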

Notably, hosting all 3PB and reading the full 3PB archive once each month costs roughly the same for Standard, Nearline and Coldline, with Archive representing a major cost jump due to its significantly higher per-GB access fee. Hosting 3PB in Standard and in Coldline costs nearly the same per month if the entire archive is read once each month. But if only a handful of files are read each month, Coldline is almost four times cheaper than Standard.
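
To see where each class stops being cheaper than Standard, the comparison can be expressed as a break-even fraction of the archive read per month. This is a simplified sketch that assumes the access charge scales linearly with the fraction of data read, ignores Class A operation charges and minimum storage commitments, and uses the hosting and full-read figures from the table; the function names are hypothetical.

```python
STANDARD_HOSTING = 78_000  # monthly at-rest cost of 3PB in Standard, from the table

# (monthly hosting cost, cost of reading all 3PB once), from the table
CLASSES = {
    "Nearline": (45_000, 30_000),
    "Coldline": (21_000, 60_000),
    "Archive": (7_200, 150_000),
}


def monthly_cost(hosting: float, full_read: float, fraction_read: float) -> float:
    """Monthly cost when only fraction_read (0.0 to 1.0) of the archive is read."""
    return hosting + full_read * fraction_read


def break_even_fraction(hosting: float, full_read: float) -> float:
    """Fraction of the archive read per month at which the class matches Standard.

    Solves hosting + full_read * f = STANDARD_HOSTING for f. A result above
    1.0 means the class stays cheaper even when the full archive is read.
    """
    return (STANDARD_HOSTING - hosting) / full_read


if __name__ == "__main__":
    for name, (hosting, full_read) in CLASSES.items():
        f = break_even_fraction(hosting, full_read)
        print(f"{name}: break-even at {f:.1%} of the archive read per month")
```

Under these assumptions, Coldline matches Standard when 95% of the archive is read each month, Archive when about 47% is read, and Nearline remains cheaper than Standard even when the entire archive is read.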