Описание
The "Stateful App Storage Trap": We overprovisioned our self-managed Postgres/Kafka volumes for a huge ingestion job, and now we’re stuck paying for empty space.
Hey everyone,
Looking for some realistic engineering perspectives on a storage lifecycle problem that’s turning into a quiet standoff between our platform team and finance.
A few months ago, we had to run a large data re-indexing and compaction cycle on our self-managed Postgres and Kafka clusters running on AWS EBS. To avoid any disk-full incidents during ingestion, the on-call team did the safe thing and increased several EBS volumes from around 500GB to 2.5TB.
The ingestion finished, retention/vacuum jobs ran, and now the actual active data footprint is closer to 400GB again.
The problem is we’re now using less than 20% of the allocated storage, while still paying AWS for terabytes of mostly empty block storage.
Our company recently added Kubecost to audit Kubernetes and infra spend, and every Monday it flags these stateful volumes as high-priority waste. Finance sees the reports and asks why we don’t just shrink the volumes back down.
But as everyone here probably knows, expanding EBS is easy. Shrinking it safely is where things get ugly.
To reclaim the space, the team would have to manually scale down replicas, create smaller volumes, run rsync or restore backups, swap mounts/volume references, and coordinate a maintenance window with possible downtime or replication drift risks. For a critical database tier, the blast radius of touching live storage often feels worse than the savings.
So nothing happens, and the oversized volumes stay there.
How are other teams handling this?
Do you mostly ignore Kubecost/FinOps alerts when it comes to stateful storage because reliability matters more, or has anyone actually found a safer way to shrink/reclaim live block storage?
Is manual migration still the only approach people genuinely trust for this?
[link]
[handle]
Контакты работодателя (email/phone/telegram) скрыты из публичного превью —
отправьте резюме, чтобы мы связали вас напрямую.