Mar 14, 2026·8 min read

Timezone bugs in global products: cron drift and fixes

Timezone bugs in global products break billing, emails, and reports. Learn why cron drift, DST, local time, and catch-up jobs need clear rules.

Why scheduling breaks across time zones

A schedule can look correct on the server and still feel broken to users. The server follows one clock. Your users follow their own wall clock, in their own city, with their own daylight saving rules.

That gap causes quiet damage first. Billing runs at the wrong local midnight. Reminder emails land an hour late. Daily reports include too much data or miss the first hour of the day. Teams often notice the bug only after support tickets pile up.

A few areas usually fail first:

  • invoices and subscription renewals near midnight
  • reminders tied to business hours
  • daily and monthly reports
  • booking windows and cutoff times

Daylight saving time makes all of this worse because local time does not move in a clean line. In spring, one local hour disappears. A job set for 02:30 may never run that day. In fall, one hour repeats. A job set for 01:30 may run twice unless you stop it.

A simple cron entry does not answer those questions for you. It only follows the clock you gave it. If the product says "send at 9:00 a.m. local time," you need more than server cron. You need a stored time zone, a rule for daylight saving shifts, and a way to decide what counts as one intended run.

Downtime adds another choice. If the system is down for three hours, do you skip missed jobs or replay them after recovery? Neither answer is always right. A missed reminder for a doctor appointment may still matter. A replayed billing charge can create a much bigger mess.

Most of these bugs come from missing rules, not broken code. The system often works exactly as written. The problem is that nobody wrote down how time should behave.

The clocks your product actually follows

Scheduling bugs usually start when a product mixes several clocks and treats them as one. They are not the same.

A server has its own time. A user lives in a local time zone. A business may follow another rule entirely, such as "every business day at 9:00 a.m. New York time." If you blur those together, timing problems show up fast.

Keep event timestamps in UTC. That gives you one stable record of when something happened, no matter where the server runs or where the user opens the app. Logs, audit trails, job history, and queued events are much easier to sort when they share the same base clock.

UTC is not enough for future schedules. If a user says, "Send this every day at 8:00 a.m.," you also need the user's IANA time zone, such as "America/New_York" or "Europe/Berlin." Do not store only "+02:00." An offset is a snapshot. It does not tell you what happens when daylight saving starts or ends.
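To see why an offset is only a snapshot, here is a minimal Python sketch. It assumes Python 3.9+ with the standard-library zoneinfo module and time zone data available on the system. The dates and the name `fixed_offset` are illustrative, not taken from any real product.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

berlin = ZoneInfo("Europe/Berlin")

# The same wall-clock time on two dates has two different UTC offsets.
winter = datetime(2026, 1, 15, 8, 0, tzinfo=berlin)
summer = datetime(2026, 7, 15, 8, 0, tzinfo=berlin)
winter_offset = winter.utcoffset()  # one hour ahead of UTC
summer_offset = summer.utcoffset()  # two hours ahead of UTC

# Hypothetical mistake: a "+02:00" offset captured in summer is
# reused to schedule an 8:00 run in winter.
fixed_offset = timezone(timedelta(hours=2))
stale = datetime(2026, 1, 15, 8, 0, tzinfo=fixed_offset)

# What a user in Berlin actually sees on the wall clock for that run.
experienced = stale.astimezone(berlin).strftime("%H:%M")

print(winter_offset, summer_offset, experienced)  # 1:00:00 2:00:00 07:00
```

The stored zone name lets the library apply the right offset for each date. The stored offset cannot.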

Display time and execution time should stay in separate lanes. You can show a meeting in the viewer's local time because that is easiest to read. But the rule that decides when the job runs should use the stored schedule and time zone, not whatever clock the browser or server happens to use at that moment.

A simple model helps:

  • Event time is when something actually happened, stored in UTC.
  • User time is what the person expects to see on screen.
  • Server time is the machine clock used to run code.
  • Business time is the rule the company cares about.

Those rules can point in different directions. A payroll job might run on London business time even if the server is in Virginia and the manager is traveling in Singapore. Once that is explicit, a lot of scheduling mistakes disappear.

Where cron helps and where it doesn't

Cron is good at one thing: running a command on a schedule on one machine. If you need to clean temp files at 03:00, rotate logs every Sunday, or send one internal report every morning, cron is a solid tool. It is small, predictable, and easy to inspect.

Problems start when teams expect cron to understand business time. Most cron setups use the machine's time zone unless you configure something else. That means "run at 09:00" often means 09:00 on that server, not 09:00 for the customer, the office, or the market you care about.

A quiet infrastructure change can shift timing without touching the cron line. Move a job to a new server, rebuild a container with a different time zone, or change host settings during maintenance, and the job still runs, just at the wrong hour.

Calendar edges cause trouble too. A rule that runs on the 31st simply skips shorter months. A job meant for local midnight gets strange when your product spans cities and regions. Midnight is not one global moment, and some local dates have missing or repeated times.

Cron also does not promise clean spacing between successful runs. Teams often think "every 5 minutes" means a perfect five-minute interval forever. Real systems have queues, retries, and slow workers. If one run starts late, the next may wait behind it, or a retry may push work into the next window. Over time, that drift leaks into reports, invoices, reminders, and sync jobs.

Cron works best when the rule around it is narrow and explicit. You need to define which time zone controls the schedule, what happens on missing dates like the 31st, whether late jobs should skip or catch up, and how retries affect the next planned run. If those rules are missing, machine behavior fills the gaps.
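As one illustration of writing down the "missing date" rule, the sketch below clamps a monthly run to the last day of shorter months. Clamping is only one possible policy (skipping the month is another), and the function name `monthly_run_date` is a hypothetical helper.

```python
import calendar
from datetime import date


def monthly_run_date(year: int, month: int, wanted_day: int) -> date:
    """Return the run date for a month, clamped to that month's last day."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(wanted_day, last_day))


print(monthly_run_date(2026, 1, 31))  # 2026-01-31
print(monthly_run_date(2026, 2, 31))  # 2026-02-28
print(monthly_run_date(2028, 2, 31))  # 2028-02-29 (leap year)
```

Whichever policy you pick, the point is that the code states it instead of inheriting it from the scheduler.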

What daylight saving does to schedules

Daylight saving time breaks the easy assumption that every day has 24 hours. On the spring change, many places jump from 01:59 to 03:00. If a job is set to run at 02:30 in local time, that run never happens because that minute does not exist.

The fall change creates the opposite problem. The clock repeats an hour, so 01:30 happens twice. A scheduler that follows local clock time may run the same task two times. A scheduler that follows elapsed time may run once, but at a moment users do not expect. Either choice can work. The team just has to choose it on purpose.
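A scheduler can detect both cases before it runs anything. The following is a rough sketch using Python's standard zoneinfo module (Python 3.9+). The helper name `classify_local_time` is an assumption for illustration, and the dates are the 2026 US clock changes.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def classify_local_time(naive: datetime, tz_name: str) -> str:
    """Classify a wall-clock time as 'skipped', 'repeated', or 'normal'."""
    tz = ZoneInfo(tz_name)
    first = naive.replace(tzinfo=tz, fold=0)   # earlier reading of the clock
    second = naive.replace(tzinfo=tz, fold=1)  # later reading of the clock

    # A skipped time does not survive a round trip through UTC.
    round_trip = first.astimezone(timezone.utc).astimezone(tz)
    if round_trip.replace(tzinfo=None) != naive:
        return "skipped"

    # A repeated time maps to two different UTC offsets.
    if first.utcoffset() != second.utcoffset():
        return "repeated"
    return "normal"


print(classify_local_time(datetime(2026, 3, 8, 2, 30), "America/New_York"))   # skipped
print(classify_local_time(datetime(2026, 11, 1, 1, 30), "America/New_York"))  # repeated
print(classify_local_time(datetime(2026, 6, 1, 9, 0), "America/New_York"))    # normal
```

Detection is the easy part. What the job does with a "skipped" or "repeated" answer is still a product decision.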

This gets expensive fast. A daily report can cover 23 hours in spring and 25 hours in fall. Totals change. Charts look odd. Alert thresholds can fire too early or too late. If the same report feeds payroll, commissions, or invoices, one missing or repeated hour can turn into a real dispute.

The same failures keep coming back:

  • reminders vanish when the scheduled minute is skipped
  • emails send twice when the hour repeats
  • daily summaries cover shorter or longer periods than expected
  • billing windows overlap or leave a gap

The safest move is to stop calling a job "daily" until you define what that means. Does it run by local clock time for each user? Does it run every 24 hours in UTC? If the local time is skipped, do you drop that run or move it to the next valid minute? If the local time repeats, do you run once or twice?

Money jobs need tighter rules. Define billing windows with exact UTC timestamps, then show them in local time in the UI. That keeps the math stable even when local clocks shift.
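A small sketch of that idea: compute the UTC boundaries of one local calendar day and store those, not the local labels. The function name `local_day_window_utc` is hypothetical, and the sketch assumes local midnight exists in the chosen zone, which holds for New York but not for every zone in history.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def local_day_window_utc(day: date, tz_name: str) -> tuple[datetime, datetime]:
    """Return the [start, end) UTC instants that bound one local calendar day."""
    tz = ZoneInfo(tz_name)
    start_local = datetime.combine(day, time(0, 0), tzinfo=tz)
    end_local = datetime.combine(day + timedelta(days=1), time(0, 0), tzinfo=tz)
    return start_local.astimezone(timezone.utc), end_local.astimezone(timezone.utc)


def window_hours(day: date, tz_name: str) -> float:
    """Length of the local day in elapsed hours."""
    start, end = local_day_window_utc(day, tz_name)
    return (end - start) / timedelta(hours=1)


print(window_hours(date(2026, 3, 8), "America/New_York"))   # 23.0
print(window_hours(date(2026, 11, 1), "America/New_York"))  # 25.0
print(window_hours(date(2026, 6, 1), "America/New_York"))   # 24.0
```

Because each window ends exactly where the next one starts, billing periods built this way neither overlap nor leave a gap, even on the 23-hour and 25-hour days.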

"Run every day at 02:30" is not a full requirement. One more sentence usually decides whether the system feels reliable or broken twice a year.

A simple failure story

A subscription store promised customers a renewal reminder at 9:00 a.m. in their own city. That sounded simple. The team stored the time the user picked and the UTC offset from signup day, then scheduled messages from that.

At first, nothing looked wrong. A customer in New York signed up in January, when the offset was UTC-5, so the app sent the reminder at 14:00 UTC. Every month after that, the system reused the same offset.

Then the seasonal clock change hit. New York moved to daylight saving time and became UTC-4, but the app still treated that customer as UTC-5 and kept sending at 14:00 UTC. At UTC-4, 14:00 UTC is 10:00 a.m. local time, so the reminder landed at 10:00 a.m. instead of 9:00 a.m.

Nothing crashed. No alert fired. The job ran exactly on time by its own rules, but the rule was wrong.

Support noticed first. People wrote in saying, "I asked for 9:00, why did this arrive at 10:00?" A few turned reminders off. Some canceled because the message showed up at a bad moment and felt sloppy. The team thought they had a messaging problem, but the real problem was scheduling.

The fix was simple. The app needed to save an actual time zone, such as "America/New_York," not just the offset seen on signup day. Then, before every send, it needed to recalculate what 9:00 a.m. local time meant on that date.

Offset is a snapshot. Time zone is a rule set. If you keep the snapshot and throw away the rule, drift is only a matter of time.
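The recalculation step might look like the sketch below. It assumes the app stores the user's chosen local time and an IANA zone name. `send_time_utc` is a hypothetical helper, not part of any specific framework.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo


def send_time_utc(local_day: date, local_time: time, tz_name: str) -> datetime:
    """Convert 'this wall-clock time on this date in this zone' to a UTC instant."""
    local = datetime.combine(local_day, local_time, tzinfo=ZoneInfo(tz_name))
    return local.astimezone(timezone.utc)


january = send_time_utc(date(2026, 1, 15), time(9, 0), "America/New_York")
july = send_time_utc(date(2026, 7, 15), time(9, 0), "America/New_York")

print(january.strftime("%H:%M"))  # 14:00
print(july.strftime("%H:%M"))     # 13:00
```

The UTC instant changes between January and July, and that is the point: the user's 9:00 a.m. stays fixed while the machine time moves underneath it.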

Writing scheduling rules that hold up

Start with a sentence that someone outside engineering can check. "Send the invoice reminder at 9:00 a.m. in the customer's local time on the first business day of the month" is better than a bare cron line. That plain sentence is the real rule.

Then turn it into a few decisions:

  • Decide whether the job follows UTC or a local clock. Backups, data syncs, and rate limit resets often fit UTC. Customer reminders, store openings, and payroll usually need local time.
  • Store the next intended run, not just the last successful run. Also store a rule version so you know which logic produced that schedule.
  • Define missing times. If a city skips 02:30 during the spring clock change, your system needs a written rule: skip it, run at 03:00, or run at the next valid minute.
  • Define repeated times. When 01:30 happens twice in the fall, decide whether the job runs once or twice. Most customer-facing jobs should run once.
  • Test one spring change and one fall change before release. If you only test normal weeks, you will miss the cases that matter.

That storage choice matters more than most teams expect. If your service goes down for three hours, a catch-up process should compare "what should have happened" with "what did happen." If you only keep the last run time, the schedule can drift a little after every outage until the gap becomes hard to explain.
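A toy simulation shows how that drift builds up. It assumes an hourly job that starts seven minutes late three times in a row. The variable names and the delay are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

interval = timedelta(hours=1)
delay = timedelta(minutes=7)  # assumed lateness of each start
first_intended = datetime(2026, 3, 1, 10, 0, tzinfo=timezone.utc)

drifting = first_intended  # next run derived from the last actual start
anchored = first_intended  # next run derived from the last intended time

for _ in range(3):
    actual_start = drifting + delay
    drifting = actual_start + interval  # lateness is carried forward
    anchored = anchored + interval      # lateness is not carried forward

print(drifting.strftime("%H:%M"))  # 13:21
print(anchored.strftime("%H:%M"))  # 13:00
```

With the intended time stored, a late start stays a one-time delay. Without it, every delay becomes part of the schedule.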

Take a small example. A shop sends a pickup reminder at 8:00 a.m. local time. If you schedule it in UTC, customers in Berlin and Chicago get different local behavior. If you schedule it in local time but ignore the spring clock change, some reminders never go out. If you replay blindly after recovery, some customers get two.
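The missing-time and repeated-time decisions from the list above can be written as explicit parameters. The sketch below is one possible shape, using Python's zoneinfo module. The function name, the policy names ("next_valid", "skip", "first", "both"), and the minute-by-minute search are all assumptions chosen for readability, and the search assumes schedules land on whole minutes.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def _is_skipped(naive: datetime, tz: ZoneInfo) -> bool:
    """True when the wall-clock time does not exist in this zone."""
    back = naive.replace(tzinfo=tz).astimezone(timezone.utc).astimezone(tz)
    return back.replace(tzinfo=None) != naive


def resolve_run_times(naive: datetime, tz_name: str,
                      on_skipped: str = "next_valid",
                      on_repeated: str = "first") -> list[datetime]:
    """Turn one intended wall-clock time into zero, one, or two UTC instants."""
    tz = ZoneInfo(tz_name)
    if _is_skipped(naive, tz):
        if on_skipped == "skip":
            return []
        candidate = naive
        while _is_skipped(candidate, tz):  # walk forward to the next valid minute
            candidate += timedelta(minutes=1)
        return [candidate.replace(tzinfo=tz).astimezone(timezone.utc)]

    first = naive.replace(tzinfo=tz, fold=0)
    second = naive.replace(tzinfo=tz, fold=1)
    if first.utcoffset() != second.utcoffset() and on_repeated == "both":
        return [first.astimezone(timezone.utc), second.astimezone(timezone.utc)]
    return [first.astimezone(timezone.utc)]


spring = resolve_run_times(datetime(2026, 3, 8, 2, 30), "America/New_York")
fall = resolve_run_times(datetime(2026, 11, 1, 1, 30), "America/New_York")
print(spring[0].strftime("%H:%M"))  # 07:00 UTC, which is 03:00 local
print(len(fall))                    # 1
```

Passing `on_skipped="skip"` or `on_repeated="both"` changes the outcome without touching the rest of the scheduler, which keeps the business choice visible in one place.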

Most scheduling bugs start when the business rule stays fuzzy. Write the rule in plain language, choose the clock on purpose, store the intended next run, and decide the weird cases before they happen.

How catch-up jobs should work after downtime

When a scheduler comes back after an outage, it should not guess. It needs a rule for every missed run. If you leave that choice to default cron behavior or ad hoc code, users get duplicate emails, late reports, or missing data with no clear reason.

Most teams only need three choices:

  • Skip the missed run if old work no longer matters, such as a live dashboard refresh.
  • Replay each missed run if every time slot matters, such as payroll, billing, or compliance exports.
  • Merge missed runs into one recovery run if users care about the final state, not each interval.

Pick one rule per job and keep it stable. A daily digest can merge. A monthly invoice job should usually replay. A fraud scan might skip runs older than a few hours because stale checks can waste money and confuse staff.

Set a hard replay window. That keeps one bad night from turning into a two-day backlog. You might replay up to 24 missed hourly runs or up to 7 missed daily runs, then skip anything older and mark it clearly. Without that cap, recovery traffic can hammer the database right when the system is already struggling.
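A replay cap can be a few lines. In this sketch, `plan_catch_up` is a hypothetical helper that keeps the most recent missed runs up to the cap and returns the older ones separately so they can be marked as skipped.

```python
from datetime import datetime, timedelta, timezone


def plan_catch_up(missed: list[datetime],
                  max_replay: int) -> tuple[list[datetime], list[datetime]]:
    """Split missed intended run times into (to_replay, to_skip), oldest first."""
    ordered = sorted(missed)
    if max_replay <= 0:
        return [], ordered
    return ordered[-max_replay:], ordered[:-max_replay]


# Assumed scenario: 30 hourly runs were missed, and the cap is 24.
start = datetime(2026, 3, 1, 0, 0, tzinfo=timezone.utc)
missed_runs = [start + timedelta(hours=h) for h in range(30)]

to_replay, to_skip = plan_catch_up(missed_runs, max_replay=24)
print(len(to_replay), len(to_skip))  # 24 6
```

Returning the skipped runs instead of dropping them silently makes it possible to record them and explain them later.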

Every replay should be idempotent. If the same scheduled run executes twice, the result should still be correct. Use a stable run identifier tied to the intended schedule time, not the recovery time. That one choice prevents duplicate charges, duplicate notifications, and repeated file exports.
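One way to build such an identifier is to hash the job name together with the intended time in UTC. The sketch below keeps completed IDs in an in-memory set purely for illustration. A real system would need durable, shared storage. The names `run_id` and `execute_once` are assumptions.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def run_id(job_name: str, intended: datetime) -> str:
    """Stable ID derived from the job and its intended (not actual) run time."""
    if intended.tzinfo is None:
        raise ValueError("intended run time must be timezone-aware")
    stamp = intended.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return hashlib.sha256(f"{job_name}|{stamp}".encode()).hexdigest()[:16]


completed: set[str] = set()  # illustration only: not shared across processes


def execute_once(job_name: str, intended: datetime, action) -> bool:
    """Run the action unless this intended run already completed."""
    rid = run_id(job_name, intended)
    if rid in completed:
        return False
    action()
    completed.add(rid)
    return True


charges = []
intended = datetime(2026, 3, 1, 2, 15, tzinfo=timezone.utc)
first_try = execute_once("billing", intended, lambda: charges.append("charge"))
replay = execute_once("billing", intended, lambda: charges.append("charge"))
print(first_try, replay, len(charges))  # True False 1
```

Because the ID comes from the intended time, a replay three hours later produces the same ID as the original attempt and is recognized as a duplicate.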

Record every decision. Store the intended run time, the actual start time, whether the system skipped, replayed, or merged the run, and why. When a customer asks what happened, support should not have to read code or guess from raw logs.
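The record itself can stay small. The following sketch shows one possible shape. The field names and the `Decision` values are assumptions that mirror the skip, replay, and merge choices described earlier, and the example times are invented.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class Decision(Enum):
    ON_TIME = "on_time"
    SKIPPED = "skipped"
    REPLAYED = "replayed"
    MERGED = "merged"


@dataclass(frozen=True)
class RunRecord:
    job_name: str
    intended_utc: datetime
    actual_start_utc: Optional[datetime]  # None when the run was skipped
    decision: Decision
    reason: str

    @property
    def delay(self) -> Optional[timedelta]:
        """How late the run started, or None if it never ran."""
        if self.actual_start_utc is None:
            return None
        return self.actual_start_utc - self.intended_utc


record = RunRecord(
    job_name="billing",
    intended_utc=datetime(2026, 3, 1, 2, 15, tzinfo=timezone.utc),
    actual_start_utc=datetime(2026, 3, 1, 3, 2, tzinfo=timezone.utc),
    decision=Decision.REPLAYED,
    reason="service outage 02:10 to 03:00 UTC",
)
print(record.decision.value, record.delay)  # replayed 0:47:00
```

A row like this answers the support question directly: what was planned, what happened, and why.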

A short support note also helps: "The service was down from 02:10 to 03:00 UTC. We replayed the two missed billing runs. Users may see delayed receipts, but no duplicate charges." Clear language lowers panic after recovery.

Mistakes that cause drift

Many teams still store "UTC+2" instead of "Europe/Berlin." That shortcut works for part of the year. Then the local clock changes and the 9:00 job starts firing at 8:00 or 10:00. Real time zones include rule changes. Offsets do not.

Server midnight causes another mess. Teams run a daily report at 00:00 on the machine that hosts the worker, then assume everyone has reached the same date. They have not. A store owner in Tokyo may already be in tomorrow while a customer in California is still finishing today. If a report should close at local midnight, the schedule needs the user's time zone, not the server's.

Retries can make things worse. Say a reminder email times out and the worker tries again three minutes later. Without a deduplication token, the user may get two copies. Then another worker may record the second attempt as the official send, and your history starts to drift from what users actually saw.

Distributed systems add one more trap. One worker loads a new schedule, another keeps the old version in memory, and both keep running. One node fires at 09:00, another at 09:15, and the mismatch stays hidden until customers complain. Shared schedule state, versioned config, and clear reload rules prevent that split.

The oldest mistake is still common: assuming every day has 24 hours. Some days have 23. Some have 25. During the spring change, 02:30 may not exist. During the fall change, 01:30 happens twice. If a user asked for "every day at 9:00 local time," do not quietly turn that into "every 24 hours." Those are different rules.
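The difference is easy to demonstrate. This sketch compares "24 elapsed hours later" with "the same wall-clock time on the next calendar day" across the 2026 spring change in New York. The variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/New_York")
last_run = datetime(2026, 3, 7, 9, 0, tzinfo=tz)  # the day before the spring change

# Rule A: every 24 elapsed hours. Do the arithmetic in UTC, then convert back.
elapsed_next = (last_run.astimezone(timezone.utc) + timedelta(hours=24)).astimezone(tz)

# Rule B: same wall-clock time on the next calendar day.
wall_clock_next = datetime.combine(
    last_run.date() + timedelta(days=1), last_run.time(), tzinfo=tz
)

print(elapsed_next.strftime("%Y-%m-%d %H:%M"))     # 2026-03-08 10:00
print(wall_clock_next.strftime("%Y-%m-%d %H:%M"))  # 2026-03-08 09:00
```

Both results are correct for their own rule. Only one of them matches what a user who asked for 9:00 expects.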

Most drift bugs look small at first. A shortcut in time zones, retries, worker config, or day length can sit quietly for months, then fail all at once when the clocks change or a server restarts.

Checks before release

Most timing bugs survive until launch because the rule looked obvious in one city and broke everywhere else. A short pre-release check catches a lot of pain, especially for email sends, invoices, reminders, and anything tied to local business hours.

Start with the rule itself. If someone asks what happens when daylight saving starts and ends, your team should answer in one sentence. "Run at 9:00 local time, skip missing times on DST start, and run once on the first 9:00 after DST end" is clear. "It runs every 24 hours" is a different rule.

Before release, check a few things:

  • Write one example for DST start and one for DST end using real dates and one named time zone.
  • Make sure only one worker can claim a scheduled run. A lock, lease, or unique job record is better than hope.
  • Test recovery after downtime. The system should know whether to skip, replay, or merge missed runs, and it should not send the same email twice or charge twice.
  • Give support a traceable path. They should see the rule, the planned run time, the actual execution time, and the final result in one place.
  • Run the same scenario for users in Tokyo, Berlin, and New York. If behavior changes, you should be able to explain why.

Consider a billing job at local midnight. One worker starts late after a restart, another sees the same unpaid invoice, and both try to create charges. If the job has a unique run ID and the charge request is idempotent, the second attempt does nothing. Without that guard, support gets a rough morning.
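A unique job record is one simple way to build that guard. The sketch below uses an in-memory SQLite database from Python's standard library so it runs anywhere. The table name `job_runs`, its columns, and the `claim_run` helper are assumptions, and a production system would use its own shared database.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE job_runs (
        job_name     TEXT NOT NULL,
        intended_utc TEXT NOT NULL,
        worker       TEXT NOT NULL,
        PRIMARY KEY (job_name, intended_utc)
    )
    """
)


def claim_run(db: sqlite3.Connection, job_name: str,
              intended: datetime, worker: str) -> bool:
    """Try to claim one intended run. Only the first claimant succeeds."""
    key = intended.astimezone(timezone.utc).isoformat()
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO job_runs (job_name, intended_utc, worker) VALUES (?, ?, ?)",
                (job_name, key, worker),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # another worker already holds this run


midnight = datetime(2026, 3, 1, 5, 0, tzinfo=timezone.utc)
print(claim_run(conn, "billing", midnight, "worker-a"))  # True
print(claim_run(conn, "billing", midnight, "worker-b"))  # False
```

The database enforces the rule, so the guard still holds when a worker restarts late or two workers wake up at the same moment.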

This is also where teams discover whether their logs are usable. If a customer says, "My reminder came an hour late," support should not need to dig through raw server logs across three systems. They need one record that shows the user's time zone, the stored rule, the computed next run, and what actually happened.

If you can explain those cases before launch, the schedule is probably ready. If you cannot, the code is not done yet.

What to do next

Start with an inventory. Teams usually remember the obvious jobs, like nightly reports or billing runs, but forget the quiet ones: renewal reminders, trial expiry, daily limit resets, payroll exports, and retries tied to local business hours. Write them all down and note which ones depend on local time.

For each job, record five things: who owns it; whether it follows UTC, customer local time, or office time; where the time zone comes from; what should happen during skipped or repeated DST hours; and what should happen after downtime.

Next, give product, support, and engineering one shared schedule format. Keep it plain enough that a support person can read it without guessing. A good rule sounds like this: "Run at 9:00 a.m. in the account time zone, once per calendar day, skip duplicates during fall back, and catch up only if the miss is under 6 hours."

Then run a tabletop test before the next release. Pick a spring-forward day, a fall-back day, and one outage that lasts longer than the job interval. Ask each team what should happen. If the answers differ, the spec is still too vague.

That is where most of these problems begin. The code may be fine, but the rule is fuzzy. That is how cron drift, duplicate sends, and missed runs reach production.

If your product runs across regions, a review from someone who has seen these failures before can save a lot of cleanup. Oleg Sotnikov at oleg.is does this kind of Fractional CTO work, including reviews of scheduling rules, recovery logic, and infrastructure choices. One outside pass is often cheaper than fixing a payroll miss or a day of duplicate customer messages.