CTO / AI Engineer · 7+ Years AI Product Delivery
Co-founded MaybeAI, XCelsiorAI & Footprint Analytics. Built AI systems across FinTech, Web3 & SaaS serving 700k+ users and $800M AUM. Hands-on leader in credit scoring, anti-fraud, computer vision, and LLM-based agent platforms.
The new skill in AI isn't prompting — it's Context Engineering. As AI agents become mainstream, the difference between a cheap demo and a "magical" agent is the quality of context you provide.
Context Engineering is the discipline of designing and building dynamic systems that provide the right information and tools, in the right format, at the right time, so an LLM can accomplish a task.
Think of it as everything the model "sees" before it generates a response — not just the prompt string, but a complete system output.
Prompt Engineering focuses on crafting the perfect instruction string. Context Engineering is a system that runs before the LLM call — it's dynamic, task-specific, and covers both knowledge and capabilities.
Most agent failures today are not model failures — they're context failures. The agent doesn't have what it needs.
Imagine you ask an AI: "Check if we have inventory for SKU-123 and send an order confirmation if we do."
A "cheap demo" agent sees only the user message and outputs a robotic reply. A "magical" agent enriches context before calling the LLM: your calendar (shows you're busy), past emails (informal tone), contact list (key partner), and tools like check_inventory and send_email. The output becomes genuinely useful.
For data engineers: context engineering means your data pipeline, schema definitions, and domain knowledge all need to be legible to the AI at the right moment — not dumped as a massive doc, but surfaced dynamically.
OpenAI's term for the discipline of designing environments, feedback loops, and control systems that help AI agents accomplish complex, reliable software work at scale.
OpenAI's team built a product with 0 lines of manually-written code. Over 5 months, Codex wrote ~1 million lines across application logic, tests, CI, docs, and tooling — at roughly 1/10th the time it would have taken by hand.
The key shift: engineering work moved from writing code to designing environments, specifying intent, and building feedback loops. Humans steer. Agents execute.
Early progress was slower than expected — not because Codex was incapable, but because the environment was underspecified. The agent lacked tools, abstractions, and internal structure to make progress.
The fix was almost never "try harder." Instead: "What capability is missing, and how do we make it both legible and enforceable for the agent?"
One big AGENTS.md fails predictably: context is crowded out, everything becomes "important," the file rots, and drift is inevitable.
Instead: a short AGENTS.md (~100 lines) acts as a table of contents. The real knowledge lives in a structured docs/ directory treated as the system of record. Progressive disclosure — teach the agent where to look, don't overwhelm it up front.
Both Claude Code and OpenAI Codex are harness engineering tools — they take a repository context and drive a development loop: design → code → review → test → deploy. The quality of your AGENTS.md, docs structure, and feedback tooling determines how effective they are.
For SE/data teams: integrating these tools means investing in documentation hygiene, structured knowledge bases, and fast feedback loops (CI that gives agents signal they can act on).
Throughput changes merge philosophy: when agent output far exceeds human attention, corrections are cheap and waiting is expensive. Minimal blocking merge gates. Short-lived PRs. Test flakes addressed with follow-up runs.
Human judgment is still required — but it operates at a higher layer of abstraction: prioritize work, translate feedback into acceptance criteria, validate outcomes.