I’ll be honest: if you SSH into a server, grab the last 500 lines of a log, paste them into Claude Code or Codex, and ask “what’s wrong here?”, you’ll get a decent answer most of the time. These tools are good at reading log output and spotting patterns. If you’re already comfortable in the terminal and you’ve got an AI CLI installed, you’ve got a free incident investigation tool.
So why did I build SuperTerminal?
In a lot of cases, you don’t need it. If you investigate one or two incidents a month, Claude Code in your terminal is fine. I built SuperTerminal for the people who do it 15 times a month and are starting to notice the friction.
What works about AI CLIs for debugging
Claude Code, Codex, Gemini CLI, Aider, whatever your preferred tool is. They’re all good at the same thing: you give them context, they reason about it.
For incident debugging, the workflow looks like:
ssh prod-server-1
journalctl -u myapp --since "1 hour ago" | tail -500
# copy output
# paste into Claude Code
# "what's causing these errors?"
The AI reads the logs, identifies error patterns, suggests probable causes. It works because log interpretation is a text-reasoning task and LLMs are good at text reasoning.
Advantages:
- You already have the tool installed
- It costs nothing (or pennies in API calls)
- It works for any kind of debugging, not just incidents
- You control the prompt completely
- No vendor relationship, no account, no onboarding
I use these tools. They’re good.
Where it starts to break down
The friction shows up when debugging incidents becomes a regular part of your week.
You’re re-prompting from scratch every time. The AI doesn’t remember that last Tuesday’s payment failure was caused by the same connection pool issue. Every incident starts with you deciding what to check, which server to SSH into, what logs to pull. If the same failure mode fires 5 times in a quarter, you’ve done the same manual work 5 times.
You’re also the orchestrator. Claude Code reads what you give it. It doesn’t SSH into servers on its own or decide which service to check next. You run a command, copy the output, paste it, read the response, decide the next command, run it, paste again. For a complex incident spanning 3 services, that’s a lot of copy-pasting.
Context windows are a real limit too. A busy production server can generate hundreds of megabytes of logs per hour. You can’t paste all of it into Claude. So you’re making judgment calls about which 500 lines to show the AI, and if you pick wrong, the root cause isn’t in the window.
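One way to make that judgment call less arbitrary is to filter before you paste. The following is a minimal sketch, assuming the app logs through journald with priorities set and that repeated messages are identical once timestamps are dropped. `trim_for_prompt` is a hypothetical helper name, not part of any tool mentioned here.

```shell
# Hypothetical helper: collapse identical log messages and keep the most
# frequent ones, so the sample fits in a prompt. Reads messages on stdin.
trim_for_prompt() {
  local max_lines="${1:-500}"
  # sort groups identical lines, uniq -c counts them,
  # sort -rn puts the noisiest messages first.
  sort | uniq -c | sort -rn | head -n "$max_lines"
}

# Usage (assumes a systemd unit called myapp, as in the workflow above):
#   -p err   keeps entries at priority "err" and more severe
#   -o cat   prints only the message text, without timestamps
# journalctl -u myapp --since "1 hour ago" -p err -o cat | trim_for_prompt 200
```

This trades one bias for another: ranking by frequency surfaces the loudest errors, but a root cause that appears only once can still be cut by `head`.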
There’s no audit trail. Your Claude conversation is ephemeral. If your team lead asks “how did you diagnose this?” three weeks later, you don’t have a log.
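If you stay with the CLI approach, you can get part of the way to a record yourself. Here is a rough sketch of a wrapper that logs each command with its output. The function name `logged` and the `AUDIT_LOG` variable are assumptions made for illustration.

```shell
# Hypothetical wrapper: run a command, show its output, and append the
# command line, a UTC timestamp, the output, and the exit status to a file.
# AUDIT_LOG is an assumed variable name; it defaults to a file in $HOME.
logged() {
  local log="${AUDIT_LOG:-$HOME/incident-audit.log}"
  local out status
  out=$("$@" 2>&1)   # capture stdout and stderr together
  status=$?
  {
    printf '=== %s $ %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*"
    printf '%s\n' "$out"
    printf '=== exit %s\n' "$status"
  } >> "$log"
  printf '%s\n' "$out"
  return "$status"
}

# Usage:
# logged ssh prod-server-1 'journalctl -u myapp --since "1 hour ago" | tail -500'
```

This only covers the commands you ran and what they printed. The AI's responses still live in the chat session, so the reasoning behind the diagnosis is not captured.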
And nothing runs when you’re not there. An AI CLI can’t watch your logs while you sleep.
What I built to fix this
SuperTerminal handles the orchestration. You describe the incident, it SSHes into the servers, runs commands, feeds each output to the AI, and the AI decides what to check next. No copy-pasting between terminals.
The runbook model is the bigger difference. I wanted a way to build a diagnostic sequence once and reuse it. The first time you investigate a database connection pool issue, you’re building a runbook. The fifth time, you’re just running it.
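To make the idea concrete, here is a bare-bones sketch of what "build once, reuse" can look like in plain shell. It is an illustration only and not SuperTerminal's runbook format: the runbook is assumed to be a text file with one diagnostic command per line, and `run_runbook` and `RUNBOOK_EXEC` are hypothetical names.

```shell
# Hypothetical runbook runner. The runbook is a plain text file with one
# diagnostic command per line; blank lines and lines starting with # are
# skipped. RUNBOOK_EXEC is an assumed variable so the remote runner can be
# swapped out (it defaults to ssh).
run_runbook() {
  local host="$1" runbook="$2" step
  while IFS= read -r step || [ -n "$step" ]; do
    case "$step" in
      ''|'#'*) continue ;;
    esac
    printf '\n## [%s] %s\n' "$host" "$step"
    # </dev/null stops ssh from swallowing the rest of the runbook on stdin
    ${RUNBOOK_EXEC:-ssh} "$host" "$step" </dev/null 2>&1
  done < "$runbook"
}

# Usage (file name and host are placeholders):
# run_runbook prod-server-1 pool-exhaustion.runbook > findings.txt
```

The sketch runs a fixed sequence of commands. Nothing in it decides what to check next based on earlier output, which is the part the post describes the AI doing.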
Every investigation is logged. Commands executed, AI responses, timestamps. Searchable later. Your teammate can look up how you diagnosed the same failure last month.
Side by side
| | Claude Code / Codex | SuperTerminal |
|---|---|---|
| SSH orchestration | You do it manually | Automated (uses your SSH config) |
| AI provider | Whichever CLI you use | Your choice of 6 providers |
| Multi-server investigation | Copy-paste between terminals | AI traverses servers automatically |
| Runbook reuse | None (re-prompt every time) | Build once, reuse on every incident |
| Audit trail | Chat history (ephemeral) | Full execution log (persistent) |
| Cost | Free or per-token | Free while in beta |
| Setup | Already installed | 5 minutes (SSH config + AI key) |
| Flexibility | Unlimited (any prompt, any task) | Structured (incident investigation) |
| Learning curve | You already know it | Low (describe the problem in English) |
When the CLI is enough
If you investigate incidents rarely, a few per month, the re-prompting overhead doesn’t matter. If your incidents are usually single-server, single-log-file problems, one SSH and one paste to Claude gets you there. If you enjoy the control of doing it yourself and you’re fast at it, no tool is going to beat raw muscle memory.
When SuperTerminal is worth trying
SuperTerminal is worth trying if you’re investigating 10+ incidents a month and the same failure modes keep coming back. It also fits if your incidents span multiple servers and you’re tired of copy-pasting between terminals, if you have a team and want diagnosis workflows that are shareable, or if you want the investigation to happen before you wake up.
I built it partly because I got tired of re-prompting Claude every time the same database issue fired at 3am. The runbook exists now. It runs in two minutes. I go back to sleep.
Try it
SuperTerminal is free while in beta. If you’re already using Claude Code for incident debugging, you’ll see the difference on the first multi-server investigation. Get started here.