Runtime Hours: The Unit of Work in OutcomeDev
Runtime hours are human hours. Runs are bounded by the runtime window and message budget you set.
OutcomeDev already has a clean unit of time: runtime hours.
“Hours” only matter because they buy execution. In OutcomeDev, those hours buy a bounded sandbox window, set in the interface, where an agent can act on your behalf inside a real repo and produce inspectable results.
So we don’t need a new concept of time. We just need the right interpretation:
- Runtime hours are the same human hours.
- Agent hours are those same runtime hours used by an agent, on your behalf, inside a sandbox run.
Why runtime hours feel different with agents
The clock doesn’t change. The overhead changes.
Human work inside the same number of hours usually pays taxes like:
- context switching
- meetings and coordination
- waiting for access
- hunting for the “source of truth”
- re-explaining intent
Agents running inside a sandbox pay different taxes:
- tool availability
- permissions and sandbox limits
- deterministic constraints (rate limits, budgets, API quotas)
When an agent has tools and a proof loop, it spends more of the same hours turning intent into artifacts.
The real budget is time + messages
In OutcomeDev, you don’t only pay with time. You also pay with messages.
So the practical boundaries for a run are:
- runtime window (runtime hours)
- message budget (how many turns you allow)
The proof loop is what you use to judge progress and correctness:
- commands executed
- diffs produced
- checks run (lint, type-check, tests)
- artifacts shipped (PRs, previews, drafts)
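The two budgets and the proof loop can be sketched as a small data model. This is a minimal illustration, not OutcomeDev's actual API; every name here (`RunBudget`, `ProofLoop`, `has_proof`) is hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class RunBudget:
    runtime_hours: float   # the runtime window you set
    message_budget: int    # how many turns you allow

@dataclass
class ProofLoop:
    commands: list = field(default_factory=list)   # commands executed
    diffs: list = field(default_factory=list)      # diffs produced
    checks: list = field(default_factory=list)     # lint, type-check, tests
    artifacts: list = field(default_factory=list)  # PRs, previews, drafts

    def has_proof(self) -> bool:
        # Progress only counts if it left inspectable output behind.
        return bool(self.diffs or self.checks or self.artifacts)

budget = RunBudget(runtime_hours=5, message_budget=40)
proof = ProofLoop()
proof.commands.append("pytest -q")
proof.checks.append("tests passed")
print(proof.has_proof())  # True
```

The point of the model: a run is judged by what lands in the proof loop, not by how busy the agent looked while the budgets drained.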
A mental model you can actually use
Saying “spend 5 hours on support” is fine if 5 hours is the runtime window you set.
The important part is that the run has an explicit boundary and an execution goal, for example:
- Spend the next runtime window on support and reach Inbox Zero.
- Draft replies and produce a triage log.
- Create follow-up tasks for anything that needs engineering.
If you’re running an autonomous loop runner (like a Ralph loop), timeboxing it to the same runtime window and message budget keeps it from overrunning while still letting it execute end-to-end.
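That timeboxing can be sketched as a loop that stops at whichever boundary hits first: the runtime window or the message budget. A minimal sketch, assuming a hypothetical `step` callable that performs one agent turn and returns `True` when the goal (say, Inbox Zero) is reached:

```python
import time

def run_loop(step, runtime_hours: float, message_budget: int) -> int:
    """Timebox an autonomous loop runner: stop at the runtime window
    or the message budget, whichever comes first. Returns turns used."""
    deadline = time.monotonic() + runtime_hours * 3600
    messages_used = 0
    while time.monotonic() < deadline and messages_used < message_budget:
        done = step()          # one agent turn: act, then check proof
        messages_used += 1
        if done:               # execution goal reached early
            break
    return messages_used
```

Both boundaries are checked on every turn, so the loop can run end-to-end inside the window but can never exceed either budget.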
That’s operations as loops.
What makes a run reliable: tools + constraints
Reliable output comes from two things:
- tool access (DB, email, browser automation, CLIs)
- constraints that force proof (artifacts, checks, budgets)
Without tools, an agent is stuck in suggestion mode. Without constraints, an agent is stuck in vibe mode.
With both, you get compounding execution.
The practical payoff
Once you plan in runs and proof, you stop asking:
- “How do we build a system for every operational need?”
And you start asking:
- “What is the smallest workflow prompt that can be re-run daily and compound?”
That shift is the real adoption unlock: not smarter models, but newer paradigms of work.