What If All Knowledge Work Could Be Programmed?
TL;DR
Most knowledge work today is stuck in stone—static processes and templates that can’t keep up with a constantly changing environment. GPT-5 pushes open the door to treating all knowledge work like code: adaptable, versioned, measurable, and improvable.
Why this is more human, not less: Programming knowledge work isn't about replacing humans; it's about unleashing them. When we can command machines to handle the mechanistic, deterministic, programmable work, we are once again free to do more of the human work: applying our innate curiosity, imagination, and creativity to create the possibilities of tomorrow.
An Invitation to the Reader
What follows are my ideas, informed by my background in creative operations, strategy, and organizational design. I believe there’s something powerful here—something that could help us unlock the full potential of AI while preserving and elevating the uniquely human parts of work.
But I only bring one perspective. This is a first sketch of a shared conversation. If you work in operations, strategy, project management, or any role that coordinates complex workflows, I’d love your take:
What resonates with you?
What challenges or opportunities do you see?
What have I missed?
My hope is that we can explore this together—building an approach that lets machines do the machine work, and humans do the human work.
A Quick Glossary (for Non-Software Folks)
You don’t need to be a developer to follow this. Here are a few terms from the software world that will help you navigate the ideas in this article.
Branch – Creating a copy of a process or workflow so you can change it without affecting the original. Later, you can merge the changes back if they prove valuable.
Merge – Combining changes from a branch back into the main process after review.
Diff – A side-by-side comparison that shows exactly what changed between two versions.
PR (Pull Request) – A formal proposal to merge changes into the main version. In our analogy, this is a review step for a knowledge-work process.
Eval (Evaluation) – A test or set of criteria to measure whether a process or output meets the desired standards.
Repo (Repository) – A central place where all versions of processes, specs, and assets are stored, tracked, and made available to others.
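To make the analogy concrete, here's a minimal Python sketch (the checklist content and file labels are illustrative, not from any real system) that diffs two versions of a process written as plain text, producing the same view a reviewer would see in a PR:

```python
import difflib

# Two versions of the same onboarding checklist (illustrative content).
v1 = """Collect signed offer letter.
Provision laptop.
Schedule week-one check-in."""

v2 = """Collect signed offer letter.
Provision laptop and access badges.
Schedule week-one check-in.
Assign an onboarding buddy."""

# A unified diff shows exactly what changed between the two versions.
for line in difflib.unified_diff(
    v1.splitlines(), v2.splitlines(),
    fromfile="onboarding/v1", tofile="onboarding/v2", lineterm="",
):
    print(line)
```

Changed lines show up prefixed with - and +, which is all a reviewer needs to approve or reject the merge.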
Conversation starter: If you’re from outside software, which of these terms already has a close cousin in your world?
All Knowledge Work Is Code
It always has been.
Every plan, policy, playbook, or process is, at its core, a set of instructions for achieving a goal.
The problem? Until now, most of it has been stuck in stone.
Static processes.
Best practices written once and filed away.
Templates that were outdated the moment they were published.
Software developers have lived in a different world.
They’ve had programming languages, frameworks, and repositories.
They can branch, merge, commit, audit.
They can see exactly how something was built, who changed what, and why.
They can measure performance against a standard and refactor when the code falls short.
The rest of us have worked in half-code: brittle, slow to adapt, dependent on human memory and undocumented nuance. Updated by committee in costly cycles. Turned to stone while the terrain around it changes.
GPT-5 pushes open the door to changing that.
It’s not magic. It’s not perfect. But several advances cross a threshold that makes this vision worth actively building toward:
Higher instruction fidelity – Executes multi-step specs and preserves constraints with far greater accuracy.
Larger context windows – Can “see” an entire process, history, or playbook at once.
Multi-modal reasoning – Works across text, spreadsheets, diagrams, and other inputs in a single flow.
Improved orchestration – Coordinates with agents and external tools more reliably.
These aren’t party tricks—they’re the foundations for treating any knowledge work like code:
A city’s disaster-response plan.
A multinational’s M&A playbook.
A humanitarian aid supply chain.
A marketing brief, an annual plan, a rollout strategy.
Once a process can be expressed as an executable spec (context, constraints, instructions, and acceptance criteria), an AI agent can run it, adapt it, and improve it:
Adaptable in real time as conditions shift
Versioned so you can see exactly what changed and why
Measured so you can connect each change to outcomes and know what worked
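What might such a spec look like in practice? Here's a minimal sketch, assuming a simple Python shape (the ProcessSpec class, its fields, and the example brief are illustrative assumptions, not a standard):

```python
from dataclasses import dataclass, field
from typing import Callable

# A minimal sketch of an "executable spec." An agent would read the
# context and instructions, honor the constraints, and its output
# would only ship if every acceptance check passes.
@dataclass
class ProcessSpec:
    context: str                      # background the agent needs
    constraints: list[str]            # hard rules the output must respect
    instructions: list[str]           # ordered steps to execute
    acceptance: list[Callable[[str], bool]] = field(default_factory=list)

    def accept(self, output: str) -> bool:
        """Run every acceptance check against a candidate output."""
        return all(check(output) for check in self.acceptance)

launch_email = ProcessSpec(
    context="Announcing the Q3 release to existing customers.",
    constraints=["No pricing details", "Under 150 words"],
    instructions=["Draft subject line", "Draft body", "Add call to action"],
    acceptance=[
        lambda out: "price" not in out.lower(),  # constraint holds
        lambda out: len(out.split()) <= 150,     # length budget holds
    ],
)

draft = "Our Q3 release is here. Try the new dashboard today."
print(launch_email.accept(draft))  # True: passes both checks
```

The syntax isn't the point; the point is that context, constraints, instructions, and acceptance criteria become explicit, checkable artifacts instead of tribal knowledge.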
The Human Dividend: Unleashing Human Work
If knowledge work becomes programmable, the payoff isn’t just speed. It’s time, attention, and energy returned to the human domains that matter most.
What we unleash:
Curiosity — exploring new questions, reframing old ones, testing bold ideas.
Imagination — envisioning futures beyond today’s constraints.
Creativity — shaping concepts, artifacts, and experiences that didn’t exist before.
Judgment under uncertainty — making calls when data collides with values.
Taste & standards — knowing what “good” feels like in a specific context.
Relationship & trust — building the human glue of teams, culture, and community.
What we hand to the machines (on purpose):
Repetition, reconciliation, formatting, retrieval, synthesis at scale.
Execution against well-defined constraints and acceptance tests.
Monitoring and alerting when reality diverges from the spec.
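That last item is the most mechanical of all. A minimal sketch of drift monitoring, where the metric, target, and tolerance are illustrative assumptions:

```python
from typing import Optional

# A sketch of "monitor and alert when reality diverges from the spec."
def check_drift(spec_target: float, observed: float,
                tolerance: float = 0.10) -> Optional[str]:
    """Return an alert when the observed value drifts beyond the
    tolerance allowed by the spec; None means all is well."""
    drift = abs(observed - spec_target) / spec_target
    if drift > tolerance:
        return f"ALERT: {drift:.0%} off target, human review needed"
    return None

# A 15% drift against a 10% tolerance raises the alert.
print(check_drift(spec_target=100.0, observed=85.0))
```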
By encoding the repeatable, we expand the frontier of the uniquely human — making more space for the work that’s generative, relational, and imaginative.
Conversation starter: If you had 20% more of your team’s attention freed from repetitive work, what would you apply it to?
First-, Second-, and Third-Order Effects
These aren’t strictly sequential stages. Some will unfold in parallel, some will reinforce others, and some may only emerge once earlier shifts take root. The point is to notice them — and decide which ones to lean into.
First order: speed, adaptability, transparency.
Cycle times compress. Reuse goes up. Meetings disappear when review can happen on the spec itself.
Second order: structure and culture shift.
Meeting culture gives way to exception culture: we meet when the system can’t reconcile trade-offs or when values are implicated.
Knowledge work debt becomes visible: outdated specs, brittle handoffs, weak tests, stale sources.
Libraries emerge: tone packs, policy packs, channel packs, evaluation packs. Work becomes composable.
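A sketch of what that composability could look like, with illustrative pack names and contents:

```python
# Reusable constraint bundles ("packs") that can be mixed into any spec.
TONE_PACK = ["Plain language", "Active voice", "No jargon"]
POLICY_PACK = ["No unreleased features", "Include unsubscribe link"]
CHANNEL_PACK_EMAIL = ["Subject under 60 characters", "One call to action"]

def compose(*packs: list[str]) -> list[str]:
    """Flatten packs into one constraint list, dropping duplicates
    while preserving order."""
    seen, merged = set(), []
    for pack in packs:
        for rule in pack:
            if rule not in seen:
                seen.add(rule)
                merged.append(rule)
    return merged

# A new brief starts from proven modules instead of a blank page.
print(compose(TONE_PACK, POLICY_PACK, CHANNEL_PACK_EMAIL))
```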
Third order: human roles are redefined.
People (Humans!) move up the stack:
Spec Architects encode judgment, constraints, and trade-offs.
Curators maintain reusable modules that actually work.
Governors design gates, provenance, and red-teaming.
Evaluators design outcome tests that resist gaming.
Storytellers turn telemetry into meaning and decisions.
Conversation starter: Do you see these as stages or shifts that happen in parallel and feed into each other?
What Stays Human (On Purpose)
Not everything should be programmable.
Tacit and relational work—trust, negotiation, taste—remains human-led. Specs can scaffold, not substitute.
Value conflicts—brand, ethics, trade-offs under uncertainty—require human authorship.
Deliberate slowness—moments where ritual matters should resist automation by design.
Patterns (and Traps) You’ll See
As teams start making knowledge work programmable, certain patterns tend to accelerate success — and certain traps can quietly erode it. The difference often comes down to whether workflows are designed with clarity, versioning, and portability in mind from the start.
Patterns that (tend to) work
Brief-as-PRD – Treat a brief like a product requirements document: spell out context, constraints, and success criteria so the work can be executed, reviewed, and improved like code.
Plan-as-hypothesis – Plans become living experiments with explicit success criteria and the ability to pivot as conditions change.
Continuous integration for ops – Regularly run and test processes to catch issues early, not after they cause failures.
Decision gates by risk tier – Match the review process to the risk level: low-risk work auto-ships; higher-risk work requires human checkpoints (see the sketch after this list).
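Here's what a decision gate by risk tier might look like as a minimal Python sketch; the tiers, rules, and messages are illustrative assumptions each team would define for itself:

```python
from enum import Enum

class Risk(Enum):
    LOW = "low"        # e.g., internal status update
    MEDIUM = "medium"  # e.g., customer-facing copy
    HIGH = "high"      # e.g., legal or pricing changes

def gate(risk: Risk, checks_passed: bool) -> str:
    """Route a piece of work to the right level of review."""
    if not checks_passed:
        return "blocked: fix failing acceptance checks"
    if risk is Risk.LOW:
        return "auto-ship"
    if risk is Risk.MEDIUM:
        return "ship after one human review"
    return "ship after senior review and sign-off"

print(gate(Risk.LOW, checks_passed=True))   # auto-ship
print(gate(Risk.HIGH, checks_passed=True))  # senior sign-off required
```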
Traps to avoid
Prompt spaghetti – Disorganized, undocumented prompts and workflows that can’t be reused or improved. Fix with version control and shared repositories (see the registry sketch after this list).
Goodhart’s Law – “When a measure becomes a target, it ceases to be a good measure.” Use multiple metrics and keep humans in the loop.
Hidden dependencies – Workflows that rely on tools, data, or access no one else has. Fix by declaring dependencies and ensuring portability.
Vendor monoculture – Building entirely on one platform, risking lock-in. Fix with portability layers and periodic migration drills.
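And the fix for prompt spaghetti, sketched minimally: one shared, versioned registry instead of prompts scattered across chat histories. The structure and field names here are assumptions, not a standard:

```python
from datetime import datetime, timezone

# A shared, versioned prompt registry with provenance on every change.
REGISTRY: dict[str, list[dict]] = {}

def publish(name: str, prompt: str, author: str, note: str) -> int:
    """Store a new version of a prompt; returns the version number."""
    versions = REGISTRY.setdefault(name, [])
    versions.append({
        "version": len(versions) + 1,
        "prompt": prompt,
        "author": author,
        "note": note,                      # why it changed
        "at": datetime.now(timezone.utc),  # when it changed
    })
    return versions[-1]["version"]

def latest(name: str) -> str:
    """Fetch the current version; everyone pulls from the same source."""
    return REGISTRY[name][-1]["prompt"]

publish("weekly-digest", "Summarize this week's updates...", "dana", "initial")
publish("weekly-digest", "Summarize this week's updates in 5 bullets...",
        "dana", "tighten format")
print(latest("weekly-digest"))  # always the reviewed, current version
```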
Conversation starter: Which of these patterns could you pilot in your domain tomorrow?
Why GPT-5 Is the Threshold (Not the Finish Line)
It’s not that “the model is smarter.” It’s that the building blocks are starting to line up:
Instruction fidelity: specs are worth writing because the system can actually follow them.
Context span: the agent can hold the whole story—assets, policy, history—without losing the thread.
Multi-modal fluency: the work isn’t only text; the model can operate across the real artifact mix.
Tool/agent orchestration: from demo to dependable—hand-offs that actually happen.
That’s enough to treat meaningful chunks of knowledge work like code today, and to grow the programmable surface area quarter by quarter.
The Choice
We’ve never had the tools to unstick this work from stone.
Now, we do.
The question isn’t if. It’s whether we’ll architect it with intent—specs, evals, governance, and culture—or inherit it by default from whatever tools arrive first.
Either way, the code of how we work is about to be written.
Let’s write it so machines can do the machine work, and humans are unleashed to do the human work.