A scheduled job had been failing every night for about a year before anyone found out. Not a quiet, partial failure. A clean, total, red failure, every single night. The job was set up to email someone the moment it broke. It did exactly that. It emailed a person who had left the company eleven months earlier. This is a composite story. I have walked into some version of it more times than I can count.
The setup was sensible when it was built. Someone wired the overnight load to fire an alert on failure, and pointed that alert at their own work address, because they were the one who would fix it. Then they resigned. Their mailbox was closed, the way mailboxes are closed when people leave. The failure emails kept going out, on time, every night, into an address that bounced into nothing. The job kept reporting its trouble to the one person who could no longer hear it.
On the surface, everything looked healthy. The scheduler showed the job in the list. The dashboard nobody quite trusted showed a job, present and accounted for. Nobody had a red light in front of them, so nobody went looking. The data the job was supposed to move had just stopped moving. The downstream reports slowly drifted out of true while everyone read them as gospel.
The day it surfaced, the symptoms pointed everywhere except the cause. A number in a month-end report looked wrong. The first guess was a formula. The second guess was someone keying figures in by hand. People spent an afternoon arguing about a spreadsheet before anyone thought to ask whether the data behind it had arrived at all. It had not arrived in a year. The job had been telling someone the whole time. That someone just did not work there anymore.
Here is the dull lesson, and it is the whole point. The job failing was not the real problem. Jobs fail. The alerting did exactly what it was built to do. What was broken was that the alert had a single owner, and that owner was a person, not a role, and that person was gone. An alert that goes to a mailbox nobody reads is not an alert. It is silence with extra steps. Nobody checks for jobs that are failing into the void, because nothing on anyone's screen says there is anything to check.
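The lesson above comes down to two questions you can ask of every scheduled job: when did it last succeed, and does its alert reach a role rather than one person? Here is a minimal sketch of that audit in Python. Everything in it is an assumption made for illustration: the `Job` record, the `ROLE_ADDRESSES` list, the example addresses, and the two-day threshold. In practice the job records would come from whatever your scheduler exposes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Hypothetical shared, role-based mailboxes. Replace with your own.
ROLE_ADDRESSES = {"data-oncall@example.com", "ops-alerts@example.com"}


@dataclass
class Job:
    name: str
    last_success: Optional[datetime]  # None means no success on record
    alert_recipients: List[str]


def audit(jobs: List[Job], now: datetime,
          max_age: timedelta = timedelta(days=2)) -> List[Tuple[str, str]]:
    """Return (job name, problem) pairs for jobs a human should look at."""
    findings = []
    for job in jobs:
        # A job that has not succeeded recently is a problem even if
        # nobody ever received a failure alert for it.
        if job.last_success is None or now - job.last_success > max_age:
            findings.append((job.name, "no recent successful run"))
        # An alert with no role-based recipient depends on one person
        # staying in their job.
        if not any(r in ROLE_ADDRESSES for r in job.alert_recipients):
            findings.append((job.name, "alerts go only to individuals"))
    return findings


# Illustrative data only, loosely modelled on the story above.
now = datetime(2024, 6, 1, 9, 0)
jobs = [
    Job("overnight_load", datetime(2023, 6, 3, 2, 0),
        ["former.employee@example.com"]),
    Job("hourly_sync", datetime(2024, 6, 1, 8, 0),
        ["data-oncall@example.com"]),
]

for name, problem in audit(jobs, now):
    print(f"{name}: {problem}")
```

The point of the design is that it checks for the absence of success rather than waiting for a failure message, so it still works when the failure messages are going nowhere.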
A free, read-only health check finds the jobs that are failing while you sleep, and the ones whose only warning goes nowhere. It takes fifteen minutes and changes nothing you run. You get a graded plain-English report that says when each scheduled job last actually completed. Better to learn that on an ordinary morning than from a number that has quietly been wrong for a year.
Want to know if this is sitting in your estate? We run a read-only check and hand you a graded report in plain English.
Get your free health check