Codegen Log | This is not Code, this is Energy
LLM vendors are constantly improving code generation capabilities, and even in the Agent Era a one-shot LLM codegen still gets a nice headline. Actual, practical use, however, requires us to iterate, accrete, and shape codegen via validations and tests. Work is still work.
But just for amusement: how might would-be prosumers of LLM services coax out those fancy one-shot codegens?
First, The Very Obvious Thing: All that is not Shaped is Improvised
When running many like-prompted codegen reps, you immediately find that things not locked into functional shapes via specification and other scaffolding may be subject to LLM whim. This player piano fills in estimated arrangements as the prompt is exhausted and the notation thins out, yielding an aleatoric rendition rather than pure playback. Cue confusion, angst and noise.
- Hermetic builds are not generally expected with pure LLM codegen.
- But with the latest models & services, codegen results are increasingly nominal.
- …but sometimes they are not what you want.
- Sometimes improvs reveal interesting things you missed.
- And sometimes they are OK, but weird.
- And …sometimes they are just skibidi.
In the far dim past of last Winter
Playing with repeatability experiments, building a simple version of a familiar app type (a basic web server) from specs, testing & validating. Over and over and over. Watching code variances and compliance with build guidelines. Exciting.
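A minimal sketch of that rep loop, assuming an OpenAI-compatible client; the spec path, model name, and pytest gate are hypothetical stand-ins, not the actual rig:

```python
import subprocess
from pathlib import Path

from openai import OpenAI  # any OpenAI-compatible client

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
PROMPT = Path("specs/web-server-spec.md").read_text()  # hypothetical spec file

for rep in range(10):
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; use whatever model you're measuring
        messages=[{"role": "user", "content": PROMPT}],
    )
    out = Path(f"reps/rep_{rep}") / "server.py"
    out.parent.mkdir(parents=True, exist_ok=True)
    # (a real harness would strip markdown fences from the reply first)
    out.write_text(resp.choices[0].message.content)

    # Gate each rep with the same test suite; improvisation shows up as failures.
    check = subprocess.run(["pytest", str(out.parent)], capture_output=True)
    print(f"rep {rep}: {'pass' if check.returncode == 0 else 'FAIL'}")
```

Diffing the surviving reps against each other is where the variance-watching happens.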
Spring
Our toolchest is now stuffed with elaborate Cursor rules, GitHub Copilot instructions, a menagerie of process roles and formalisms, deutero-prompting, delicate phrasing, extensive guard-railing, lots of w00-ware and just-tries.
Which yield frequent bouts of nominal one-shot codegen. Yay! …bouts…
How can we tighten this up?
Hot Codegen Summer
Wielding a few standard process-management artifacts combined with Knowledge Graphs of RDF-encoded support documents (“Lore”) included in the prompt context.
AKA: PRDs & SBOMs, plus graphRAG / subgraph injection.
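To make “Lore” concrete, here's a minimal sketch using rdflib, with a hypothetical Turtle fragment standing in for a PRD; the vocabulary and requirement strings are invented for illustration:

```python
from rdflib import Graph

# Hypothetical Lore: a PRD requirement encoded as RDF Turtle.
LORE_TTL = """
@prefix lore: <https://example.org/lore#> .
lore:http-server lore:requirement "Serve /healthz returning 200 OK" ;
                 lore:constraint  "No third-party frameworks" .
"""

lore = Graph()
lore.parse(data=LORE_TTL, format="turtle")

# Serialize the graph and splice it into the prompt, so the model sees
# the spec as structured context rather than loose prose.
prompt = (
    "Generate a basic web server honoring this Lore:\n\n"
    + lore.serialize(format="turtle")
)
```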
You’re busy building a cybernetic LLM-independent PM + development methodology, like everyone else is doing.
Results? Not bad at all! Getting quite useful in fact. Pushing further…
Early Autumn
Assembling various house and domain Lore, accreting source and docs into a private RDF-based Knowledge Graph. From there your graphRAG apparatus (a ROLE, a tool, an agent or maybe just …you) extracts needed entities and relations into tuned subgraphs, which are used as context participants.
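A sketch of that extraction step, again with rdflib; the seed entity, the dependsOn predicate, and the SPARQL shape are illustrative assumptions, not the actual apparatus:

```python
from rdflib import Graph

# Hypothetical house Lore: two entities relevant to the task, one not.
KG_TTL = """
@prefix lore: <https://example.org/lore#> .
lore:http-server lore:dependsOn lore:tls-config ;
                 lore:requirement "Serve /healthz returning 200 OK" .
lore:tls-config  lore:requirement "TLS 1.3 only" .
lore:billing     lore:requirement "Irrelevant to this task" .
"""

kg = Graph()
kg.parse(data=KG_TTL, format="turtle")

# CONSTRUCT a tuned subgraph: the seed entity plus anything one
# dependsOn hop away, leaving unrelated Lore (billing) behind.
SUBGRAPH_QUERY = """
PREFIX lore: <https://example.org/lore#>
CONSTRUCT { ?s ?p ?o }
WHERE {
  { lore:http-server ?p ?o . BIND(lore:http-server AS ?s) }
  UNION
  { lore:http-server lore:dependsOn ?s . ?s ?p ?o }
}
"""

subgraph = Graph()
for triple in kg.query(SUBGRAPH_QUERY):
    subgraph.add(triple)

# The serialized subgraph becomes a context participant in the prompt.
context = subgraph.serialize(format="turtle")
```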
And then you upload them to the Great Service Provider in the Sky. Your privacy & confidentiality are protected by majestic TOS wordwalls, reputation and vibes. This somehow does not fill you with joy sparks.
How about local? Local would be better… Much model & hardware research follows.
So…maybe go retail with big flavors of Qwen, Mistral or DeepSeek on a top-maxxed 512GB Mac Studio?
All local, sandboxed & safe for source code and other confidential/private material. Also fences in agentic extravagance.
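As a sketch, the same harness code can simply point at a local OpenAI-compatible endpoint (llama.cpp's llama-server and Ollama both expose one); the port and model tag below are placeholders for whatever you actually pull:

```python
from openai import OpenAI

# Ollama's OpenAI-compatible endpoint; the api_key is ignored locally.
local = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

resp = local.chat.completions.create(
    model="qwen2.5-coder:32b",  # placeholder local model tag
    messages=[{"role": "user", "content": "Generate a basic web server."}],
)
print(resp.choices[0].message.content)
```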
Pretty cool…but unfortunately too costly* for personal use.
What, then, is a minimum viable personal local codegen LLM setup?
To be Continued…
* A price which, if we were looking back for silly historical concordances with the contemporary local LLM scene, would be oddly similar to that of the flagship 512K Mac of late 1984 (about $10K, inflation-corrected).
Bleeding edge cuts.