
LLMs, software engineering, and an average engineer's experience

A personal note on using LLMs as an average software engineer: what helps, what hurts, and where the workflow still feels difficult.

llms, software engineering, agentic development

I’ve been meaning to write this for some time now as I’ve gone through my own version of the journey all of us engineers/developers are currently on.

It’s safe to say the entire space has been flipped on its head. I find myself deeply fatigued reading giant posts full of em dashes, “it’s not this, it’s that” slop posts, predictions from different people about where we are going, news of mass tech layoffs, and tone-deaf CEOs talking about laying off people for AI like it’s some groundbreaking achievement. Frankly… I’m exhausted.

In this post I’m not going to predict the future or try to sound overly optimistic or pessimistic. I’m going to share my honest experience as the average engineer. I’m not some supergenius directing agent swarms, and I’m not going to grift and pretend that I am.

I read posts from people (some of whom appear to be quite respected) talking about how they set up massive overnight jobs for swarms of agents to complete, and I just can’t understand how they achieve this with any semblance of accuracy or effectiveness. Then again, I’m not a top engineer at a unicorn startup or a massively successful tech company, or some cracked 10x engineer, so maybe it is just a skill issue. I’m totally willing to entertain this idea.

Some key themes I’ll cover here:

  • The sycophancy of LLMs, Claude believes all your ideas are good ideas even if they aren’t
  • The loss of attention to detail and the exhaustion that comes with the supervision of LLM outputs
  • Mourning the loss of an old friend and the flow, where did it go?
  • Where I still find joy
  • Are we really more productive?

The sycophancy of LLMs, Claude believes all your ideas are good ideas even if they aren’t

Despite models getting better and better, this problem is still very real. I’d say it’s probably the biggest problem right now. A next-token predictor cannot understand consequences and cannot be held accountable for its bad advice. It is trained to give you what you want, even if what you want is not actually what you need.

Your judgement and deep domain knowledge are your most valuable traits, and you must not, at any cost, assume that AI will exercise them for you. The less you know about a topic, the more susceptible you are to its flattery, and nobody is safe.

There are business leaders employing hundreds of people who talk like they know everything because Claude agreed with every idea they had, judging themselves on how many tokens they managed to burn on the “tough problems” they are solving with Claude. I would say these people are more susceptible than anyone, as they largely lead people and are not as involved in the day-to-day of doing the work. Claude has made them feel like they know more than the people they hire to execute, and as a result they make terrible judgement calls and waste their employees’ time reviewing AI-generated slop.

I think a hard rule to hold yourself to should be: the less I know about the topic, the more questions I must ask, the more skeptical I should be, the more I should reason about the problem and the more I should lean on the people in the organisation who have strong knowledge in this domain. People first, the LLM second.

The LLM knows all of these things, but to get them out in a coherent way, domain knowledge of your own is absolutely essential. I believe an LLM can help you build domain knowledge if used the right way. Skepticism always, questions always. This is so, so important. I’ve been burned plenty of times by being lazy and taking things at face value, only to realize something was wrong down the line.

The loss of attention to detail and the exhaustion that comes with the supervision of LLM outputs

You spin up Claude Code, Codex or whatever your preferred LLM + harness is. It’s time to start planning a feature. You have a relatively good grasp of the problem and know the tech stack OK, but maybe you’re touching a few things you haven’t touched before.

You begin speaking to your agent about how to approach this problem and it comes back to you with what looks like a great plan. You’ve got the architecture of the solution down, you know what code it touches, what migrations need to be done, the general flow of your solution within the system. You’ve got a beautiful architecture doc, it feels like you’ve covered all the bases and even better you did it in record time. You get Claude to create all the corresponding tickets via MCP or CLI in ClickUp or GitLab or whatever you use. You feel good.

Things feel so well specced. Hell, you could probably pass these tickets off to your agent and it could almost execute them unattended.

You start your first session, pick up the first ticket and begin execution. You hop into plan mode and Claude spits back a multi-paragraph plan for execution of the ticket. You read it, but you’re starting to feel some fatigue. You’re not really absorbing every detail anymore; you’re starting to skim over details that don’t seem as important. You begin to defer more and more to the agent, even if you don’t realize it.

OK, time to execute. You switch to auto mode and let the agent do its thing, and you watch the diff start to fill out.

All done.

Changes applied, tests written and passing, things look like they work. Great stuff.

Now it’s time to review the code, and there is a lot of it. At first things are going well: you’re finding minor issues and listing them out for the agent to fix in its next pass. But the fatigue is growing, and you’re skimming more details and letting things through.

The fact is that code review sucks. It’s one of the most mundane and cognitively taxing activities you can do, and every engineer can attest to this.

You get to the end and open your PR. You feel a sense of accomplishment, but it’s small. You’re still feeling kinda robbed, and you are exhausted.

The process repeats: you make the plan, you read the plan, you execute the plan, you review code. But now you begin to notice things: design decisions that don’t make sense, a small detail the LLM hallucinated that has implications for the next phase of what you’re building. You’re now rethinking the entire approach.

These are things you were sure you had covered in your planning and output reviews, but in fact they were those little details you skimmed over earlier as the fatigue started to kick in.

The problem was everything happened so fast that you didn’t actually have time to reason about why you were doing something a certain way, you didn’t sit with yourself as you implemented it, you didn’t have those little lightbulb moments.

You read the output but you didn’t comprehend it, and how could you? There was just so much. This is one of my biggest challenges and I’m still actively figuring out how to shape my workflow to minimize it.

I try to break tasks down so my soft human brain can maintain context over them. I build out detailed rules to get the agent to follow conventions, keep its outputs simple, avoid over-engineering and stop describing concepts in a million words.

This helps a bit, but the deeper you go into the context window, the more the agent starts to forget the rules you have set for it. Hopefully this problem goes away as models improve.

I’m getting better at it and I feel my workflow is improving. The biggest piece of advice I can share from my experience is: just slow the F*$k down, take your time, you’re still moving waaaay faster than you did when you did everything manually. It’s ok to take your time, ask questions, reason about things and ensure you understand what your code is doing.

Mourning the loss of an old friend and the flow, where did it go?

I feel like I’ve almost forgotten the feeling of putting on my headphones, putting on some soft repetitive music and just losing myself in the code. Maybe I’m debugging an issue or writing some sort of change. I’m in the zone, slowly implementing each piece of the solution, having realizations and getting little hits of dopamine as I do the work.

This process is ultimately… dead.

Now it’s reading outputs, writing responses and waiting for the next output while the agent does all the work that otherwise would have held me in flow.

Don’t get me wrong, a lot of this was tedious… BUT it was a big part of keeping you in that zone, maintaining your mind on that single track.

Now, between agent responses, you’re trying not to get distracted. Perhaps you could go to another session, but then you’re breaking the flow. Even if that other session is tackling another part of the same job, there is still a context switch taking place that ultimately makes it impossible to maintain this flow.

I am totally OK with the idea that maybe I’m doing it wrong and there are things I could be doing better. I am open to suggestions and I’m trying every day to improve.

But to my old friend, I miss you dearly and while you slowly fade away into memory, I will look back fondly on a simpler time.

To be clear, this does not mean I am anti-change. I believe in moving forward and adapting to the times. This is a fact of life and I refuse to be a boomer wishing I could go back to the good ol’ days.

Which brings me to the next topic…

Where I still find joy

You take the good with the bad and what I can say is I generally know more since LLMs came around. I have spent hours going back and forth with models talking through implementations, asking questions about the why and what and it’s been immensely insightful. I know more about many topics now than I ever did before and I truly believe that if you push yourself to use an LLM as a teacher and force yourself to understand everything it outputs it can be an extremely powerful tool.

It’s like having a teacher you can ask a question at any time, you can ask the question as many times and as many ways as you like and they never get tired of you.

However, it can be equally destructive should you take everything at face value and defer your skepticism, reasoning and thinking entirely to the LLM. I find myself teetering between these two at times, generally depending on how fatigued I am.

I’ll reiterate: SLOW THE F*$K DOWN. The more I do this, the more I can find joy. The more I feel in control of what I’m doing, the more I feel the quality of my output is respectable.

Where I’ve found joy is in the time spent going deep into planning and implementation sessions, i.e. building detailed specs and then talking through the implemented code with the agent. When I really take my time thinking through the problems, I feel like I’m finding that flow again, but I won’t pretend it’s anywhere near as strong or rewarding as the old process was. I’m working on it though. Maybe I’ll get there one day, but for now I’ll keep chasing the dragon.

AI has also given us the ability to think bigger. There are many things we can create that we never would have had the time or capacity for in the past, and this is wildly exciting. You can now migrate entire codebases to new languages in days to weeks, build much larger applications faster, and so on.

Where it gets tough is watching an agent spit out work it would have taken you weeks or months to do yourself.

You have moments of questioning your usefulness. However, the more I work with agents, the more I realize using them is a deep skill. Go try to build OpenClaw or Hermes, or port the entire Bun codebase to Rust. Chances are you won’t get it right even with 1000s of agents.

The driver needs to know what they are doing, so this becomes the skill, and it goes really, really deep. I still feel immensely challenged every day. The challenge is just different.

Are we really more productive?

So this is an interesting topic and I think the answer to this question can be both yes and no.

Let me explain.

The case for Yes

I can implement solutions faster, without a doubt. If I understand the domain well, it’s easier to judge what looks right and what doesn’t, and there are plenty of tedious tasks and workflows I can quite easily delegate to an agent.

Maybe it’s sending an invoice, covering tedious admin on tickets, or implementing a solution in a system and tech stack I understand deeply.

I can understand how very senior “10x engineers” feel like superhumans. Their domain knowledge is so deep, and their ability to call bullshit so well developed, that for a large variety of tasks they can comprehend outputs and fine-tune the solution at a much higher rate than a less experienced engineer.

I am a believer that a well-tuned workflow in the right context can be very impactful and I’m slowly finding this out as I dive deeper.

I hope one day I can be like Peter Steinberger or Teknium when I’m big, but for now I haven’t discovered the secret sauce they seem to have their hands on.

It is, however, undeniable that you can do some crazy stuff if you know what you’re doing. It’s possible and people are proving this every day.

The case for No

In my more recent experience, this applies especially when working with non-technical people, but with some technical people as well.

It has become all too easy to generate slop and to have an agent make you feel like you know what you’re talking about. I have had people in leadership send me 20-page technical specs for a project they have no understanding of, expecting me to read the giant pile of slop they generated with Claude. It is completely incoherent and you can see the LLM has done its best to spit out something that represents the user’s desired outcome even if it means jumping through all sorts of hoops to make a solution sound even slightly feasible.

In a way this gives me some sense of security that my profession is not going anywhere anytime soon. The saying is very true: bullshit in, bullshit out. What I can get out of an LLM and what a non-technical person can get out of it are vastly different things.

The problem arises when non-technical humans feel they are being helpful, when in fact they are wasting my time by forcing me to review their incoherent slop output. It would be far better if you presented your problem to me and allowed me to go away and speak to the LLM on your behalf, exercise my expertise and judgement and come back to you with a coherent response.

I don’t claim to understand your domain, so please don’t pretend you now understand mine because an LLM made you feel smart. The slopocalypse is real and we need to think seriously about what we deem acceptable to do with LLMs. Time spent reading slop output is a PRODUCTIVITY KILLER and we need to do something about this.

Concepts like tokenmaxing are a plague on the world. We are killing trees to produce drivel. We should be approaching token use like we do fuel for our cars… with efficiency in mind. How can we achieve the same outcome with fewer tokens and fewer words? The job is still automated, the output is more comprehensible and it doesn’t waste our time.

I’ve had a similar experience with other engineers who produce plans and piles of tickets that look coherent until I am sent a PR with an implementation that has clearly not been reasoned about much. This requires a long back and forth in review and constant readjustment of the implementation, which ultimately makes things take longer than they would have had they put in the time to really think about the why and the what.

I am guilty of this myself and I’ve already learned this lesson more than once. It’s funny how it goes back to the same fundamental concept that existed in software development before AI and that I’ve mentioned twice already, SLOW THE F*&K DOWN!!

Anyway thanks for coming to my TED talk, I can confirm this post is 100% human generated.

I’ll share some rules/thoughts that are especially important to me:

  • Of course the biggest one: SLOW THE F*&K DOWN!!
  • Don’t defer your thinking, as tempting as it may be. AI cannot make your decisions for you and you are responsible for its output.
  • Human to human communication should be written by humans. AI is fine for building plans, spec docs and writing code, but when responding to your colleague, writing a post, a cover letter etc., do it yourself. I think we can all agree we read enough AI-generated text. Let’s keep sacred what should be sacred.
  • The above point also goes for code review: write your comments yourself, reason about the code yourself. You can use an agent to do a pass and help you find things you might have missed, and you can converse with it about your findings, but don’t make your co-worker read your agent’s output. That’s what CodeRabbit is for.