-------------------------------------------------
Title: A stupid S3 story
Date: 2022-04-22
Device: laptop
Mood: sanguine
-------------------------------------------------

So here's a stupid story.

We had a client a few years ago: a media-sector
company with a big marketing department. We worked
with them for around four or five years, and over
that time we did around a thousand deployments of
their site. They were also pretty busy uploaders
of files to their S3 bucket.

So it turns out that due to a misconfiguration,
each time the application was deployed, it created
completely new versions of every single file in
their S3 bucket. So, a lot of deployments, a lot
of files.

Today I started running a job to clear out and
delete their S3 bucket, and found it had ballooned
to more than 14 million individual versioned
files. Just *counting* the files took about 4.5
hours (yeah yeah, I know I could have upped the
API concurrency and counted them more quickly, but
I didn't want to exhaust all our API rate limits
for just one housekeeping task).
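
For anyone curious, the counting side of it looks
roughly like this (just a sketch from memory,
assuming boto3 and already-configured AWS
credentials; the bucket name is made up):

    import boto3

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_object_versions")

    versions = 0
    delete_markers = 0

    # Each page holds at most 1,000 entries, so 14 million
    # versions means roughly 14,000 sequential API calls --
    # hence the 4.5 hours at a polite request rate.
    # "example-client-bucket" is a hypothetical name.
    for page in paginator.paginate(Bucket="example-client-bucket"):
        versions += len(page.get("Versions", []))
        delete_markers += len(page.get("DeleteMarkers", []))

    print(f"{versions} versions, {delete_markers} delete markers")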

It's all been deleted now, but I'm so fucking sure
there will be other buckets that share this
misconfiguration. And they're all going to need to
be found and dealt with.
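
Finding the rest will probably look something like
this (again just a sketch with boto3; the sampling
threshold is arbitrary):

    import boto3

    s3 = boto3.client("s3")

    for bucket in s3.list_buckets()["Buckets"]:
        name = bucket["Name"]
        status = s3.get_bucket_versioning(Bucket=name).get("Status")
        if status != "Enabled":
            continue
        # Sample the first 1,000 version entries; if a bucket
        # is already stuffed with noncurrent versions in that
        # sample, flag it for a proper look instead of counting
        # the whole thing.
        page = s3.list_object_versions(Bucket=name, MaxKeys=1000)
        noncurrent = sum(1 for v in page.get("Versions", [])
                         if not v["IsLatest"])
        if noncurrent > 100:  # arbitrary threshold
            print(f"{name}: {noncurrent} noncurrent versions in sample")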

I feel like a janitor cleaning up cloud vomit.

--C