For developers: What should we do next in the age of AI

At this point, April 2026, everyone is familiar with AI integrated deeply into the software development process. Some worry that AI will eventually replace or devalue them. I know that's the common concern, I don't need and don’t want to brag about it. There are some practices I believe developers should do next.

Strengthen the tech stack and domain expertise

"AI can write all the code, why bother choosing the language?" You may have heard this already, but the answer is simple: a language is tied to its ecosystem, it takes years to become an expert in a tech stack, and we rely on that expertise to judge the solutions provided by AI.

The code looks fine, and all the tests pass, but is it safe to deploy? Not quite. AI makes several architecture decisions no one is aware of, like using long expiration days for the auth token in a fintech product, choosing a low-maintenance npm package from some unknown developer, or heavily investing time in edge cases that never exist in real life.

Taste and judgement

So, being fundamentally strong in your company's tech stack is even more valuable in the AI age. AI can write code, but we still need to review and sometimes rewrite it, and redirect AI toward better code. We are opinionated here, not just accepting any code. Our taste is proven and solid, let it dictate the way new code is written. You're familiar with the system you own, your team's coding conventions, your battle scars. Even though all conventions are written into some instructions, AI can still go off the rails sometimes.

Regarding domain knowledge, AI can't replace anyone outright yet. Even in a scenario of a many-years complex project system, even though we can implement new features or fix bugs tangled in layers of complexity, anyone on the team can just drop in Opus 4.6 and a few lines of requirements and AI can implement a lot of that, it's still not the end of your expertise. Companies still need people to drive development. The metrics below show exactly where that human edge lives.

Drive the output via TDD

This is real-life experience from the team I'm working in, and several others have experienced the same: nothing drives AI output better than the TDD approach. You can also think of spec-driven development in the same mindset, we must understand what we need to do first and force the AI to actually understand what it's going to build.

When you're directing thousands of lines of code generation, you need a forcing function that makes you actually understand what's being built. Tests are that forcing function. -- Martin Fowler

We don't even need to write the tests line by line. Given the existing tests in the system, we can instruct AI to build new tests based on existing patterns and feed it our requirements for the expected output. It needs to write the test first and run it, the test should fail, then write the code and run the test again, keeping fixing the code in a loop until all green. With AI, TDD is easier than ever.

Context Engineering

AI providers keep improving models to handle larger and larger context windows, but that doesn't mean we can put the whole codebase into one prompt. Models don't know where to focus their attention when the provided context is too large. At enterprises, codebases can be millions of lines of code, not to mention that logs and real production data can be large in some cases.

Also, code is not everything; logs, metrics, ticket systems, and human discussions are inputs as well. We need to carefully compose good context for a task: that can mean referencing the relevant ticket, including meeting notes on the solution, querying related data and logs, and embedding them into the prompt. This process is often called context engineering; in an agentic setup, it becomes one part of a broader “harness engineering” discipline that also covers tools, repositories, and feedback loops.

Observability and Debugging

I mapped my debugging skills into an agent and shared it with the team. In general, it's the skill of looking for related data in logs, metrics, and traces. In detail, it includes grabbing traceable information like traceId and entityId, finding the related logs to query in our Elasticsearch database, then locating the relevant code that produces those logs, then asking for a real piece of data, looking for related tickets and discussions around the issue. That is my expertise, collected over years of working on the system, that's not something AI can easily establish without a human to orchestrate it. Not to mention production access: we can't give it to AI directly. Who knows what a hallucinating AI agent might do.

As a result, this AI agent alone reduced my MTTR by 12x, from 1 hour to 5 minutes. In most cases, within 5 minutes I can identify what is wrong with the incident we're facing, compared to hours of investigation before the AI age.

But if I stop feeding my skills to AI, stop providing the correct data through my authority, and stop judging the AI's investigation results, the output becomes meaningless. As I keep improving, the AI also improves through my discoveries.

Metrics

Technical expertise sets the foundation, but metrics are how we keep score. Here's where the human edge becomes measurable.

Coding metrics

Thanks to AI, we can now produce more options and more solutions to the same problem. For code, there are some simple metrics we can measure right after solutions are provided, like Cognitive Complexity.

Cognitive Complexity specifically measures how difficult code is for humans to understand, heavily penalizing nested control structures like if statements, loops, and switches.

We can reinforce it through instructions, but sometimes I still need to edit the code manually to maintain readability. Other metrics we can use are Cyclomatic Complexity, Maximum Nesting Level, and Duplicated Lines (%).

Product metrics

As we, developers, step into higher-value work, the product-centric metrics are what we need to be aware of and contribute to after the coding step. Metrics like feature adoption, clicks, retention, conversion, and search matching rate are usually the KPIs of the product team, but understanding them will bring developers closer to the customer. It's what companies usually dream of: a product engineer.

A product engineer (or product-minded engineer) blends software engineering with product thinking. They code features while considering user needs, business impact, and data, beyond pure tech tasks.

Service Level Metrics

Service Level Metrics are the most important for developers. They're tied into the contract we signed with customers, so they must be satisfied, that's a direct impact on the company. Example:

#	Service	SLI (Measured)	SLO (Target)	SLA (Promise)
1	Web Latency	p95=200ms	<300ms 99%	<500ms 99%
2	Login Time	Avg=1.5s	<2s 95%	<3s 99%
3	Checkout	99.2% success	>=99.95%/7d	>=99.9%
4	Email Delivery	97% inboxed	>=99%/day	>=98% or refund
5	Support Tickets	Avg=30min	<45min 95%	<1hr response
6	Cache Hit	85% hits	>=90%/hour	>=85%
7	Uptime	99.92%	>=99.95%/mo	>=99.9% or refund

Take this example from the internet to understand it better:

During a traffic spike: SLI drops to 99.7% (outages from nested if bugs in your API). SLO breached → investigate, error budget exhausted → prioritize fixes over features. SLA holds (still >99.9%), avoiding penalties. Ties back to code reviews: Low Cognitive Complexity prevents such failures.

Metrics lead all the way back to decision-making when prompting, reviewing, and refining code to ensure business value. We haven't given AI this level of thinking yet, and AI also doesn't go to a client and sign the SLA contract, doesn't take charge of the penalties. Only humans are accountable for it. It leaves us with questions like:

Should we implement this feature at all? What is the benefit?
Among 3 proposed solutions, what technical debt can we tolerate?
The solution is clear, but it requires 48 hours of work for a low-value ticket, is it worth it?
Our biggest client can produce 30M events at peak, what do we need to do to ensure the SLA?

Ownership and Accountability Mindset

AI can't be assigned to PagerDuty and join an incident call at 3:00 AM. It can help in the incident investigation process, but do customers want to show up on a call with an AI agent during an incident? Definitely not. When SLAs are breached and penalties land, that weight falls on a human, and always will.

Based on the foundation of technical expertise and metrics-centric development above, we can see that the more AI absorbs the doing, the more valuable the person who takes responsibility becomes.

Communication and Collaboration

One of my favorite communication skills is translating the technical focused discussions into language that product people can understand, while keeping it relevant and precise. People love the summarize button, but who can ensure that summary is still correct? Only the person sitting between these layers can take responsibility for it.

At this point, I think we can all agree that we can't just prompt all the way to production, and on the other side, engineers can't only work on code when AI can do that part most efficiently. Pure software engineers are elevating to product engineers, who can orchestrate the technical components to satisfy requirements from the product and the customer, and can also communicate well between stakeholders.

I remember the scene from The Matrix Reloaded where Neo tells the Architect that the machines need human beings to survive. Even though AI has assimilated most human knowledge, without us, it would stagnate, waiting for new human discoveries. We are not becoming obsolete. We are becoming the ones who decide what gets built, why it matters, and who answers for it.

Developers, what should we do next in the age of AI?

Strengthen the tech stack and domain expertise

Taste and judgement

Drive the output via TDD

Context Engineering

Observability and Debugging

Metrics

Coding metrics

Product metrics

Service Level Metrics

Ownership and Accountability Mindset

Communication and Collaboration

Comments

More from this blog

Scaling database practices

How I built and scaled real‑time chat app from 2 to millions of users — without a rewrite?

We won Honestbee Logistics Challenge 2016

Migrating a decade of production data from an abandoned database (RethinkDB) to MongoDB

Command Palette

Strengthen the tech stack and domain expertise

Taste and judgement

Drive the output via TDD

Context Engineering

Observability and Debugging

Metrics

Coding metrics

Product metrics

Service Level Metrics

Ownership and Accountability Mindset

Communication and Collaboration

Comments

More from this blog