Feb 10, 2025 · 8 min read

A known weirdness list that support teams actually use

A known weirdness list gives support teams one short page for repeat edge cases, safe workarounds, and clear escalation rules.

Why teams repeat the same incident

Most support teams solve strange issues for the first time in the worst place possible: a chat thread. An agent asks a question, an engineer replies, the customer gets unstuck, and everyone moves on. A week later, the same problem shows up again, but the answer is buried under fifty newer messages.

That is how repeat incidents start. The team did learn something, but only the people in that conversation learned it. Everyone else has to search old chats, rely on memory, or interrupt the same engineer again.

Engineers feel this quickly. They keep answering the same narrow questions in slightly different words: "Is this the payment sync bug or the browser issue?" "Do we retry now or wait?" "Is this safe to change on a live account?" None of those questions is hard by itself. The real problem is the interruption loop.

It gets worse when the fix is a workaround instead of a clean repair. Workarounds drift. One person remembers "refresh the record and retry." Another remembers "retry twice, then reopen the case." A third skips the safety check because they never saw that part. Before long, the team has three versions of the truth and no confidence in any of them.

Escalation breaks the same way. If nobody sets a clear stop point, agents invent their own rules. One escalates after five minutes. Another keeps trying random fixes for half an hour. Someone else tags whichever engineer is online. The same incident lands in different places and gets handled in different ways.

A small queue can hide this for a while. In a growing startup, it turns into folklore: "Ask Alex, he knows this one," or "Use the old fix, I think it still works." That is not support documentation. It is shared memory full of holes.

From the customer's side, the incident may feel rare. For the team, it is a recurring tax. Time disappears in chat, engineers lose focus, and agents lose confidence. If nobody captures these odd cases in one known weirdness list, the team keeps paying to relearn the same lesson.

What this page should do

A good known weirdness list is a live support tool, not a mini manual. When an agent is in an active ticket, they need one place to check quickly without digging through old chats, docs, and half-remembered notes.

This page is for recurring edge cases, not every product bug. Add the odd problems that keep coming back and already have a safe, tested response. Leave out one-off failures, open investigations, and anything the team still does not understand.

A common example is a bug that appears only with a certain account setting, browser state, or import format. It is annoying, real, and familiar, but it does not need a full write-up every time it appears in the queue.

Each entry should answer three questions right away: what the agent is seeing, what safe workaround they can try now, and where the stop point is for escalation.

That stop point needs plain language. If the workaround fails once, say whether the agent should hand the case to support engineering, product, or the on-call person. If the case touches billing, security, missing data, or account access, say that directly. Support should never have to guess when the risk goes up.

Keep the page short enough to read in under two minutes. In practice, that means short entries, simple wording, and only the cases that save time every week. If an agent needs a long explanation to understand a case, move the background into a separate internal note and keep the short version here.

A page like this lowers stress during busy hours because it replaces folklore with a shared answer. New agents use it to avoid common mistakes. Experienced agents use it to confirm the workaround and the handoff rule before they do anything risky.

If the page turns into a catalog of every strange bug anyone has seen, it stops helping. One or two screens is usually enough. Beyond that, people skim past the part they need.

What belongs on the list

A known weirdness list is not a bug backlog. It is a short page for repeat issues that confuse support, waste time, or trigger panic even though the fix is usually simple.

Every entry should help a support person answer three practical questions in under a minute: what am I looking at, what can I do safely, and when do I stop and call engineering? If the entry cannot do that, it is too vague.

Keep each entry practical

Start with the symptom in customer language. Write what the person reports, not an internal diagnosis. "My invoice PDF is blank" works better than "rendering failure in export service" because support hears the first version in real tickets.

Then add the pattern that usually triggers it. This is what stops guesswork. Maybe it happens after a password reset, only on older mobile devices, or when a user uploads a file over a certain size. A short note on the usual trigger helps support confirm the match quickly.

Next comes the workaround. Keep it limited to actions support can do safely without side effects. That might mean asking the customer to retry in 10 minutes, clearing a stuck session, or using a manual export path. If the workaround carries any risk, say so plainly.

Include a hard stop

Every entry needs an escalation rule with a clear stop line. Do not write "escalate if needed." Write the exact cases that require engineering, such as a failure after one retry, errors affecting payment, signs of data loss, or more than three reports in an hour.

Add an owner and a review date at the end of the entry. One person should keep it current. Without that, the page turns into rumor storage. A review date also tells support whether they can trust the note or whether it may describe a problem that changed months ago.

A simple format works because people actually read it:

customer symptom
usual trigger
approved workaround
escalation rule
owner and review date

If an entry does not need most of those parts, it probably does not belong on the page yet.
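
If you keep entries as structured data rather than free text, the five parts above map onto a small record. This is a minimal sketch, not a prescribed schema: the class name `WeirdnessEntry`, the field names, the owner, and the review date are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class WeirdnessEntry:
    """One entry on the known weirdness list (names are illustrative)."""
    symptom: str            # what the customer reports, in their words
    usual_trigger: str      # the pattern that usually causes it
    workaround: list[str]   # approved steps support can do safely, in order
    escalate_when: str      # the hard stop, written as an exact condition
    owner: str              # one person who keeps the entry current
    review_date: date       # when the entry should next be checked

    def missing_parts(self) -> list[str]:
        """Return the names of any parts that were left empty."""
        parts = {
            "symptom": self.symptom,
            "usual_trigger": self.usual_trigger,
            "workaround": self.workaround,
            "escalate_when": self.escalate_when,
            "owner": self.owner,
        }
        return [name for name, value in parts.items() if not value]


entry = WeirdnessEntry(
    symptom="My invoice PDF is blank",
    usual_trigger="",  # not confirmed yet, so the entry is not ready
    workaround=["Ask the customer to retry in 10 minutes"],
    escalate_when="The PDF is still blank after one retry",
    owner="support lead",            # hypothetical owner
    review_date=date(2025, 3, 10),   # hypothetical date
)
print(entry.missing_parts())  # ['usual_trigger']
```

A check like `missing_parts()` is one way to catch an entry that is still too vague before it reaches the page.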

Build the first version in one afternoon

Start with recent tickets, not memory. Use the last 30 to 60 days of support cases and look for repeats. Old stories always sound important, but fresh tickets show where the team is losing time right now.

Focus on the cases that caused delay, confusion, or risky guesses. If an agent had to ask around, try three replies, or make a judgment call without clear guidance, that case belongs in the first pass. Those are the moments when a known weirdness list saves real time.

You do not need a big project. One short working session is enough. Pull recent tickets from the help desk, group similar issues together, and mark the ones that led to long back and forth, unclear ownership, or unsafe workarounds. Then ask one support lead and one engineer to review the pile and merge duplicates. From there, write only five to ten entries in plain language and put the page in front of the team for a week of live use.
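
The grouping step can be as small as a counter over an export from the help desk. The sketch below assumes each ticket has already been given a plain issue label by hand and is available as a `(opened_on, issue_label)` pair; the function name `repeat_issues`, the sample labels, and the dates are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def repeat_issues(tickets, today, window_days=60, min_count=2):
    """Count issue labels for tickets opened inside the window; keep the repeats.

    `tickets` is a list of (opened_on, issue_label) pairs.
    Returns (label, count) pairs, most frequent first.
    """
    cutoff = today - timedelta(days=window_days)
    counts = Counter(label for opened_on, label in tickets if opened_on >= cutoff)
    return [(label, n) for label, n in counts.most_common() if n >= min_count]


tickets = [
    (date(2025, 1, 20), "blank export"),
    (date(2025, 1, 28), "old limit after upgrade"),
    (date(2025, 2, 3), "blank export"),
    (date(2025, 2, 5), "blank export"),
    (date(2025, 2, 6), "old limit after upgrade"),
    (date(2024, 10, 1), "login loop"),   # outside the 60-day window
    (date(2025, 2, 7), "login loop"),    # only one recent report
]

print(repeat_issues(tickets, today=date(2025, 2, 10)))
# [('blank export', 3), ('old limit after upgrade', 2)]
```

The output is only a shortlist. The support lead and the engineer still decide which repeats deserve an entry.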

Keep the scope tight. Ten solid entries beat fifty weak ones every time. A bloated page gets ignored, and then the same odd issue turns back into folklore.

When a support lead and an engineer review cases together, they usually spot two problems quickly. Support may use three names for the same issue. Engineering may know a limit or failure mode that never made it into support documentation. A short review fixes both.

Each early entry only needs the basics: what the customer sees, what support can safely try, and when the case must go to engineering. If a workaround carries risk, say so in one sentence. If support should never touch it, say that clearly too.

Then test the page during a real week of tickets. Ask agents to use it before they ping engineering. If an entry saves even 10 minutes or prevents one bad guess, keep it. If people skip an entry, rewrite it or remove it. The page works when people reach for it under pressure, not when it looks complete.

How to write each entry

Write each note for the person who is busy, half in context, and trying to help a customer in under five minutes. Start with the symptom they can see right now, not a guessed cause hidden behind engineering language. "Customer gets a blank export after clicking Download" is useful. "CSV worker fails after timeout" is not, unless support can actually confirm it.

Keep the fix just as plain. Write the workaround as short actions in the order someone should do them. If the order matters, show it. If a step is optional, leave it out. Support docs get ignored when every entry reads like a mini postmortem.

A good entry usually fits on a few lines:

Symptom: what the customer or agent sees
Workaround: 2 to 4 short actions, in order
Escalate when: the exact case where support should stop and send it to engineering
Include in ticket: the 2 or 3 facts engineering needs first

That third line matters more than most teams expect. A known weirdness list saves time only when it draws a clean line between support and engineering. Do not write "escalate if needed." Write the trigger. For example: escalate if the retry fails twice, if the account is on a paid plan, or if the same issue hits more than one user in an hour.
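
To see how concrete a trigger can be, here is the example rule above written as a small check. It is a sketch under assumptions: the `TicketState` fields and the function name `escalation_reasons` are made up for illustration, and your own thresholds will differ.

```python
from dataclasses import dataclass


@dataclass
class TicketState:
    failed_retries: int            # retries that did not fix the issue
    on_paid_plan: bool             # whether the account is on a paid plan
    users_affected_last_hour: int  # distinct users with this issue in the last hour


def escalation_reasons(state: TicketState) -> list[str]:
    """Return every trigger that applies; an empty list means support keeps the case."""
    reasons = []
    if state.failed_retries >= 2:
        reasons.append("retry failed twice")
    if state.on_paid_plan:
        reasons.append("account is on a paid plan")
    if state.users_affected_last_hour > 1:
        reasons.append("more than one user affected in an hour")
    return reasons


print(escalation_reasons(TicketState(0, False, 1)))  # []
print(escalation_reasons(TicketState(2, True, 1)))
# ['retry failed twice', 'account is on a paid plan']
```

If a rule cannot be written as conditions this plain, it is probably still "escalate if needed" in disguise.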

Keep each entry tight. Four to six lines is enough for most cases. If an entry needs three paragraphs, it probably hides two separate issues and should be split.

Also cut private shorthand. Notes like "ask Sam" or "same bug as last spring" help one person and confuse everyone else. If a new hire cannot use the entry on day one, rewrite it.

A simple example makes the point. Say the symptom is "Dashboard shows zero data after reconnecting Stripe." Then list the support steps and add a clear engineering line: "Escalate only if data is still missing 10 minutes after a successful reconnect and the customer confirms new payments came in today." That is short, specific, and easy to use.

A simple example from a support queue

A common case looks boring at first. A customer upgrades their plan, the payment goes through, but the app still shows the old usage limit. Support sees a worried message that says, "I paid, but nothing changed."

The first check is straightforward. The agent confirms that billing shows the new plan, then checks whether the account matches a delay pattern the team already knows about. In many systems, limits update quickly, but the last screen the customer sees can lag for a few minutes because of caching, session state, or a slow sync between services.

When that pattern fits, the reply should stay short and clear. The agent tells the customer to sign out, sign back in, and refresh once. That gives the system time to pull the updated limit instead of showing stale account data.

This is exactly why the page matters. Without a known weirdness list, one agent says "try again later," another opens an engineering ticket immediately, and a third starts checking logs for a bug that is not really a bug.

A useful entry for this case is simple:

What the customer sees: old limit after a successful upgrade
What support checks: billing status and the usual sync delay
Safe workaround: sign out, sign back in, then refresh once
When to escalate: the limit is still wrong after 15 minutes

That last line saves time. Fifteen minutes is long enough for the normal delay to clear, but short enough that a real issue does not sit in the queue all afternoon. When the timer runs out, support can escalate with confidence and include the billing status, timestamps, and what the customer already tried.
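
The same entry can be sketched as a tiny decision function. This assumes the 15 minutes are counted from the moment of the upgrade, which the entry itself does not specify, and that the agent has already confirmed the billing status; the names `next_step`, `ESCALATE_AFTER`, and `SAFE_WORKAROUND` and the sample times are hypothetical.

```python
from datetime import datetime, timedelta

ESCALATE_AFTER = timedelta(minutes=15)  # stop point from the entry
SAFE_WORKAROUND = ["sign out", "sign back in", "refresh once"]


def next_step(upgraded_at, now, limit_is_correct, billing_status, steps_tried):
    """Decide what support does next for the 'old limit after upgrade' case."""
    if limit_is_correct:
        return {"action": "close"}
    if now - upgraded_at < ESCALATE_AFTER:
        # Still inside the normal sync delay: stay with the safe workaround.
        return {"action": "workaround", "steps": SAFE_WORKAROUND}
    # Timer ran out: hand off with the facts engineering needs first.
    return {
        "action": "escalate",
        "include_in_ticket": {
            "billing_status": billing_status,
            "upgraded_at": upgraded_at.isoformat(),
            "checked_at": now.isoformat(),
            "steps_tried": steps_tried,
        },
    }


upgraded = datetime(2025, 2, 10, 10, 0)
early = next_step(upgraded, datetime(2025, 2, 10, 10, 5), False, "new plan active", [])
late = next_step(upgraded, datetime(2025, 2, 10, 10, 20), False, "new plan active", SAFE_WORKAROUND)
print(early["action"], late["action"])  # workaround escalate
```

The escalation branch carries the billing status, the timestamps, and the steps already tried, so engineering does not have to ask for them again.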

This sort of case earns a spot on the page when it stops being rare. If the same issue shows up three times in two weeks, write it down. That is usually enough to tell the difference between a one-off glitch and a pattern your support team should stop rediscovering.

Mistakes that make the page useless

The fastest way to ruin a known weirdness list is to let it become a second knowledge base. When a one-page guide turns into fifty pages, support stops checking it during live work. Keep deeper background somewhere else. This page should answer one question quickly: "Is this that odd issue we already know, and what is the safe next step?"

Vague notes cause the same damage. "Try again later" tells nobody what to check, how long to wait, or when to escalate. A useful entry names the symptom, the trigger, the workaround, and the point where support must hand it off. If an agent cannot follow the note without asking a coworker, the note is not finished.

The worst mistake is mixing safe workarounds with admin-only actions. If you place "clear user cache" beside "run a production script" at the same level, someone will choose the wrong one under pressure. Separate actions by role. Label what support can do, what only engineering can do, and what nobody should do without approval.
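
One way to make that separation hard to miss is to store the level next to each action. The sketch below is an assumption about how you might model it, not a required design; the third action and all the names (`ACTION_LEVELS`, `level_for`, `support_may_run`) are hypothetical.

```python
SUPPORT = "support can do"
ENGINEERING = "engineering only"
APPROVAL = "nobody without approval"

ACTION_LEVELS = {
    "clear user cache": SUPPORT,
    "run a production script": ENGINEERING,
    "edit billing records by hand": APPROVAL,  # hypothetical example
}


def level_for(action: str) -> str:
    """Look up who may perform an action; unlisted actions get the strictest level."""
    return ACTION_LEVELS.get(action.strip().lower(), APPROVAL)


def support_may_run(action: str) -> bool:
    return level_for(action) == SUPPORT


print(support_may_run("Clear user cache"))         # True
print(support_may_run("run a production script"))  # False
print(level_for("restart the sync worker"))        # nobody without approval
```

Defaulting unknown actions to the strictest level means a missing label fails safe instead of looking like permission.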

Two details get skipped all the time, and both matter: ownership and freshness. Every entry needs one person responsible for keeping it accurate, plus a review date so the team knows whether the note still reflects the current system. Without that, the page fills with ghosts: bugs fixed months ago, outdated paths, and warnings nobody trusts.
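
Both checks are easy to automate if entries carry an owner and a review date as data. This is a rough sketch that assumes entries are stored as dictionaries with `symptom`, `owner`, and `review_date` keys; those key names, the function name `review_problems`, and the sample values are illustrative.

```python
from datetime import date


def review_problems(entries, today):
    """Return (symptom, problem) pairs for entries with no owner or an overdue review."""
    problems = []
    for entry in entries:
        if not entry.get("owner"):
            problems.append((entry["symptom"], "no owner"))
        review_date = entry.get("review_date")
        if review_date is None or review_date < today:
            problems.append((entry["symptom"], "review overdue"))
    return problems


entries = [
    {"symptom": "My invoice PDF is blank", "owner": "support lead",
     "review_date": date(2025, 3, 1)},
    {"symptom": "Old limit after upgrade", "owner": "",
     "review_date": date(2025, 1, 15)},
]

print(review_problems(entries, today=date(2025, 2, 10)))
# [('Old limit after upgrade', 'no owner'), ('Old limit after upgrade', 'review overdue')]
```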

Location matters too. If the page lives in a quiet corner of a wiki, it may as well not exist. Put it where support already works: near the queue, inside the internal help panel, or pinned in the daily workflow. People use the page they can open in five seconds.

One more trap is writing for the person who created the workaround. Write for the next hire on a busy Tuesday. Short sentences help. Specific steps help. Clear stop signs help. If the note depends on memory, folklore comes back.

A quick check before you publish

A known weirdness list only helps if someone can use it under pressure. If a new support agent opens the page during a live ticket, they should understand its shape in about two minutes. If the page feels dense, vague, or stuffed with history, trim it.

Read the page once as if you joined the team this week. Then ask one simple question for every entry: can someone spot the symptom, try the safe workaround, and know when to stop without asking for help?

Use this quick check before the page goes live:

Keep each entry short enough to scan
Say exactly when support should stop trying fixes and send the case to engineering
Remove entries for bugs the team already fixed
Use the same words in support replies, internal notes, and engineering tickets
Name one person who reviews the page every week

That second point matters more than most teams think. A workaround without a stop line turns into guesswork. If the issue crosses a risk boundary, touches billing, or keeps failing after one safe retry, the page should say who takes over next.

Shared wording also saves time. When support says "duplicate invoice after retry" and engineering calls it "double charge race condition," tickets get messy. Pick one plain label and keep it everywhere. The same goes for status notes and escalation reasons.
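
If tickets are tagged in a tool, a small alias table can enforce the shared label. The following is only a sketch: the mapping holds the one pair of names mentioned above, and `CANONICAL_LABELS` and `canonical_label` are hypothetical names.

```python
# Maps internal or engineering names to the one plain label the team agreed on.
CANONICAL_LABELS = {
    "double charge race condition": "duplicate invoice after retry",
}


def canonical_label(label: str) -> str:
    """Normalize case and spacing, then swap known aliases for the shared label."""
    key = " ".join(label.lower().split())
    return CANONICAL_LABELS.get(key, key)


print(canonical_label("Double charge  race condition"))  # duplicate invoice after retry
print(canonical_label("Duplicate invoice after retry"))  # duplicate invoice after retry
```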

Be strict about removing stale entries. If engineering fixed a bug last month, delete the workaround or mark it closed right away. Otherwise, agents keep using an old path long after the system changed.

A simple test works well: hand the page to someone new, give them one real ticket, and stay quiet. If they pause to decode a term or ask whether they should escalate, the page still needs work. Clean pages get used. Messy ones turn back into folklore by next week.

What to do next

Start small. Pick five repeat cases from this week and turn them into the first version of your known weirdness list. Do not wait for a perfect template. One short page that people actually open beats a polished document nobody trusts.

Good first entries usually share several of these traits: the same question has reached support more than once, agents have given different answers for the same issue, a safe workaround exists, engineering only needs to step in under one clear condition, and new team members often get stuck on it.

Write those five entries while the details are still fresh. Keep each one short: what the customer sees, what support should check, the safe workaround, and the exact point where the case moves to engineering.

Then review the page after your next incident retro. That matters more than a calendar reminder. A retro shows where the page was missing, where the workaround was wrong, and where the escalation rule was too vague. If one incident created three Slack threads and two different answers, add that gap to the page the same day.

Watch a couple of simple numbers for the next two weeks. You want fewer unnecessary handoffs and faster first replies. If the page works, support agents should solve more cases on the first pass, and engineers should see fewer tickets that only need a known fix.
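
Those two numbers can come from a plain ticket export. The sketch below assumes each ticket records the minutes to first reply, whether it was handed to engineering, and whether engineering was actually needed; the field names, the function name `weekly_numbers`, and the sample figures are invented for illustration.

```python
from statistics import median


def weekly_numbers(tickets):
    """Summarize unnecessary handoffs and median first-reply time for a set of tickets.

    Raises statistics.StatisticsError if `tickets` is empty.
    """
    unnecessary = [
        t for t in tickets if t["handed_off"] and not t["needed_engineering"]
    ]
    return {
        "unnecessary_handoffs": len(unnecessary),
        "median_first_reply_minutes": median(t["first_reply_minutes"] for t in tickets),
    }


before = [
    {"first_reply_minutes": 30, "handed_off": True, "needed_engineering": False},
    {"first_reply_minutes": 45, "handed_off": True, "needed_engineering": True},
    {"first_reply_minutes": 20, "handed_off": False, "needed_engineering": False},
]
after = [
    {"first_reply_minutes": 10, "handed_off": False, "needed_engineering": False},
    {"first_reply_minutes": 25, "handed_off": True, "needed_engineering": True},
    {"first_reply_minutes": 15, "handed_off": False, "needed_engineering": False},
]

print(weekly_numbers(before))  # {'unnecessary_handoffs': 1, 'median_first_reply_minutes': 30}
print(weekly_numbers(after))   # {'unnecessary_handoffs': 0, 'median_first_reply_minutes': 15}
```

The median is used here instead of the mean so one unusually slow ticket does not hide an improvement.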

A small check helps here. Ask two agents to use the page during a real shift. If they still ask, "Do I send this to engineering?" the rule is not clear enough. Rewrite it in plain language. One sentence often works better than a paragraph.

If the line between support and engineering keeps moving, an outside review can help. Oleg Sotnikov at oleg.is works as a fractional CTO and startup advisor, and this kind of workflow cleanup is often enough to stop repeat incidents from turning into team folklore.

The page does not need to be complete. It needs to be current, easy to scan, and updated right after the next messy ticket.

Frequently Asked Questions

What is a known weirdness list?

A known weirdness list is a short internal page for repeat support issues that already have a safe, tested response. Agents use it during live tickets to spot the symptom, try the approved workaround, and know when to stop and send the case to engineering.

What should go on the page?

Put repeat edge cases on it when they confuse support, waste time, or lead to risky guesses. If the same odd issue shows up again and support can handle it safely in a standard way, it belongs on the page.

What should stay off the list?

Leave out one-off failures, open investigations, and bugs nobody understands yet. This page is not a backlog or a full manual, so skip anything that does not give support a clear, safe next step.

How short should the page be?

Most teams do best with one or two screens. If the page gets long, people stop checking it during busy shifts and go back to chat or memory.

What should each entry include?

Start with what the customer sees in plain language, then add the usual trigger, the safe workaround, and the exact escalation rule. Finish with an owner and a review date so someone keeps it current.

How do we write a clear escalation rule?

Write a hard stop, not a vague note. For example, tell support to escalate after one failed retry, after 15 minutes, or any time the case touches billing, security, missing data, or account access.

How do we build the first version fast?

Pull tickets from the last 30 to 60 days and group the repeat issues. Then ask one support lead and one engineer to review them together and write five to ten short entries in plain language.

Where should the page live?

Put it where support already works, like near the queue, inside the internal help panel, or pinned in the daily workflow. If agents need to dig through a wiki to find it, they will not use it under pressure.

How do we keep the list from going stale?

Give each entry one owner and a review date, and remove notes for bugs the team already fixed. After messy tickets or retros, update the page that day while the details are still fresh.

How can we tell if the page is actually working?

Watch for fewer repeat questions, fewer unnecessary handoffs, and faster first replies. If agents still ask whether they should send a case to engineering, the page needs a clearer stop line or a rewrite.