War story

The disk filled up at 11pm and took the business with it

Adrian Sullivan

It is almost always a disk. Not a dramatic failure, not a hack, just a disk that quietly filled up, on a Tuesday night, while everyone was asleep.

This one was a distributor. Their database had been growing for years, as databases do, and the log file had a habit of ballooning during the nightly batch. For years there had been just enough room. Then one night there was not. The disk hit zero, the database stopped accepting writes, and the overnight run that loaded the next day’s orders failed silently. Nobody knew until the warehouse turned up at six and had nothing to pick.

The fix took twenty minutes once someone with the right access was awake and looking. The cost was a morning of a business that could not dispatch, a scramble of phone calls, and a very awkward explanation to some large customers about where their freight was.

What gets me, still, is how loudly the disk had been warning them. The free space had been trending towards zero for weeks. The graph, if anyone had been watching the graph, was a straight line heading for a cliff. It was all there, in plain sight, telling anyone who looked exactly what was going to happen and roughly when. Nobody looked, because the disk had always coped before.
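To make "exactly what was going to happen and roughly when" concrete, here is a minimal sketch of reading that graph with a straight-line fit. The function name `days_until_full` and the sample figures are made up for illustration; they are not the distributor's numbers. It also assumes free space shrinks roughly linearly, which a log file that balloons during a nightly batch will only approximate.

```python
from typing import Optional, Sequence


def days_until_full(days: Sequence[float], free_gb: Sequence[float]) -> Optional[float]:
    """Estimate days from the last sample until free space reaches zero.

    Fits an ordinary least-squares line through (day, free_gb) samples.
    Returns None when free space is flat or growing, i.e. no cliff in sight.
    """
    n = len(days)
    if n < 2 or n != len(free_gb):
        raise ValueError("need at least two samples, with matching lengths")

    mean_x = sum(days) / n
    mean_y = sum(free_gb) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("samples must cover more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, free_gb))

    slope = sxy / sxx  # GB gained (or lost, if negative) per day
    if slope >= 0:
        return None

    intercept = mean_y - slope * mean_x
    zero_day = -intercept / slope  # where the fitted line crosses zero
    return max(0.0, zero_day - days[-1])


# Hypothetical weekly readings: 50 GB free, dropping 10 GB a week.
sample_days = [0, 7, 14, 21, 28]
sample_free = [50, 40, 30, 20, 10]
remaining = days_until_full(sample_days, sample_free)
print(f"Roughly {remaining:.0f} days until the disk is full")
```

A real check would want an alert threshold well ahead of zero, and some allowance for the nightly spike rather than the daily average, but even this crude line would have flagged the problem weeks out.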

The absence of a problem is not the same as the absence of a risk. It is just a risk that has not collected yet.

We read the graphs nobody is watching, free of charge and with read-only access, and tell you which cliffs you are heading for before you arrive. Most of the time, the server told you. It just told you quietly, weeks ago, in a place nobody was looking.

