This is a case study about a street game, a failed build, a pivot, and a methodology that made the difference. It is also about a Burroughs-inflected cut-up ritual, a ghost installation that happened by itself on a VPS, and an iPhone bug fixed in minutes from a kitchen table.
It is a Human-in-the-Middle (HITM) case study. But it starts with the failure.
The Project
Spiegel des Universums ("Mirror of the Universe") is a street ritual game. Players receive a two-word glitch phrase — generated by a cut-up algorithm inspired by Burroughs and Gysin — and a randomly selected location in their city within a chosen radius, with navigation links for Apple Maps and Google Maps. They go there. They find or create the phrase in the environment. They photograph it, seal the portal with a spoken ritual, and submit their trace to a public photo pinboard. Then they get noodles.
The project exists as a full web application: Ghost CMS running on a VPS, a custom theme, a geolocation engine, an n8n orchestration layer handling photo uploads, gallery delivery, image serving, and admin deletion — all of it wired together, member-gated where appropriate, publicly accessible where not. There is a downloadable zine. There are printable Glitch Tag cards. There is a Mirrorling — a personal glitch gremlin the player keeps, buries, or lets whisper.
It is, by any measure, a complete, production-deployed, street-tested creative application. It was built in one day.
But first, three days of nothing.
Days 1–3: Conversational AI, No Architecture
The project began with a clear creative vision and no technical plan. The approach was straightforward: describe the idea to an LLM in conversation, let it generate suggestions, iterate from there.
What followed was three days of circular motion. The LLM generated options. Each option required a reaction. Each reaction produced a new set of options that partially contradicted the previous ones. Without a structural anchor — a spec, a set of constraints, a defined boundary between what the AI could touch and what required a human decision — every session started slightly further from clarity than the last.
This is the failure mode that HITM exists to prevent. Not incompetence. Not bad tools. Structural absence. The AI had no defined space to execute within, so it kept generating into the void, and the void kept expanding.
After three days, the project had conversations, fragments, and frustration. It had no codebase.
The Pivot: Forcing a Legible State
The move that changed everything was not a better prompt. It was asking the LLM to produce a summary and handoff document from everything that had been discussed. This forced consolidation — the chaotic conversational output had to be compressed into a structured, transferable state.
That document became the specification. Not because it was perfect, but because it was bounded. It said: here is what exists, here is what has been decided, here is what remains open. It was, in HITM terms, the markdown spec produced retroactively to rescue a build that had never had one.
With that document in hand, the decision was made to start fresh. New session. New tool. The handoff prompt, the spec, and the Zine — the original creative document that defined the game's world, rules, and aesthetic — were handed to Codex together.
The First Exchange: Why It Worked Differently
Before writing a single line of code, Codex was asked to read the materials and explain the project back in its own words. This is a HITM gate: nothing proceeds until the model demonstrates architectural understanding. Codex summarized the project accurately, then asked three questions:
"1. Frontend stack: Plain HTML/CSS/JS, or a framework like Astro/Next?
2. Ghost gating: Do you want the Play UI hosted on the static site and gated via Ghost Portal, or embedded directly inside Ghost?
3. Maps: Google Maps only, Apple Maps only, or user choice?"
These questions are worth pausing on. They are not implementation details. They are architectural decisions. A conversational LLM generating without constraints would have made these decisions silently, invisibly, and probably inconsistently across sessions. Codex surfaced them as human decisions before touching anything.
The answers given were:
"n8n is already installed on the VPS. We will develop locally, then deploy via Git. We still need to set up Ghost on the VPS. We will create a Ghost template from the HTML/CSS files. 1. Plain code, no framework. 2. What's the difference? The static site will be the Ghost site, no? 3. Both, user's choice."
The second answer is the one that matters most for teaching purposes. "What's the difference?" is not a gap in technical knowledge to be embarrassed about. It is a human making an architectural decision from their understanding of the project's purpose: the site is the site, Ghost is not a separate layer. Codex took that answer and resolved the implementation question within it. The human defined the boundary. The AI figured out what the boundary meant technically and executed accordingly.
That is the division of labour HITM is built on.
The Build: One Day
With constraints defined and architectural decisions made, the build proceeded in structured steps. Codex installed Ghost on the VPS autonomously. The custom theme was scaffolded from the agreed HTML/CSS/JS approach. The geolocation engine was built stepwise, with each stage reviewed before the next began. It determines a random point within the player's chosen radius, applies the rules that govern what counts as a valid glitch spot, and generates the correct Apple and Google Maps navigation links.
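The geolocation steps described above can be sketched roughly as follows. This is a minimal illustration in Python, not the production code, which lives in the plain-JS Ghost theme. The function names, the flat-earth approximation, the walking travel mode, and the fallback to Google Maps on non-Apple devices are all assumptions made for the example. The rules for what counts as a valid glitch spot are not described in this case study, so they are left out rather than guessed.

```python
import math
import random
from urllib.parse import urlencode

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def random_point_within(lat, lon, radius_m, rng=random):
    """Pick a point uniformly at random inside a disc around (lat, lon)."""
    # The square root spreads points evenly over the disc's area;
    # without it they would cluster near the centre.
    distance = radius_m * math.sqrt(rng.random())
    bearing = 2.0 * math.pi * rng.random()
    north_m = distance * math.cos(bearing)
    east_m = distance * math.sin(bearing)
    # Flat-earth conversion from metres to degrees (assumed adequate
    # for city-scale radii).
    new_lat = lat + math.degrees(north_m / EARTH_RADIUS_M)
    new_lon = lon + math.degrees(
        east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat)))
    )
    return new_lat, new_lon


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres, used only to sanity-check results."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def navigation_links(lat, lon):
    """Build walking-direction links for both map providers."""
    destination = f"{lat:.6f},{lon:.6f}"
    apple = "https://maps.apple.com/?" + urlencode(
        {"daddr": destination, "dirflg": "w"}, safe=","
    )
    google = "https://www.google.com/maps/dir/?" + urlencode(
        {"api": "1", "destination": destination, "travelmode": "walking"},
        safe=",",
    )
    return {"apple": apple, "google": google}


def default_map_provider(user_agent):
    """Guess which link to show first; the player can still pick either."""
    ua = user_agent.lower()
    if any(device in ua for device in ("iphone", "ipad", "ipod")):
        return "apple"
    return "google"  # assumed fallback for Android and everything else
```

A call such as `navigation_links(*random_point_within(52.52, 13.405, 2000))` would yield one link per provider for a point at most about two kilometres from the starting coordinates.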
The n8n workflows were delivered as JSON: four of them, handling photo upload, gallery data serving, image binary delivery, and admin deletion. Each was tested. Logs were sent back when something didn't behave as expected. Corrections were made within the same session.
The photo pinboard — member-gated upload, public gallery, admin delete with token authentication — was implemented as a full Ghost theme layer, with the n8n webhooks handling data and file operations underneath.
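The case study does not say how the admin token is checked inside the n8n workflow. As a hedged sketch of what such a check usually involves, the hypothetical helper below compares a supplied token against a configured one in constant time. The function name and the rule of rejecting empty tokens are assumptions for illustration.

```python
import hmac


def is_delete_authorized(supplied_token, expected_token):
    """Return True only when the caller presented the configured admin token."""
    if not supplied_token or not expected_token:
        # Reject empty values so a missing configuration never grants access.
        return False
    # compare_digest is designed so that its running time does not reveal
    # where the two values differ, which reduces timing-attack risk.
    return hmac.compare_digest(
        supplied_token.encode("utf-8"), expected_token.encode("utf-8")
    )
```

The design point is that a delete request is refused by default: only an exact match against a non-empty configured token lets it through.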
At the end of the day: a deployed, working application.
The Field Test: One iPhone, One Bug, A Few Minutes
The first street test surfaced a geolocation issue on one player's iPhone. The location service wasn't returning coordinates correctly in that browser context.
This is where the methodology shows its second-order value. Because the build had been structured, with every decision documented, every system boundary made explicit, and the state of the entire system captured in the handoff document, the bug was diagnosable. The log made sense. The fix was targeted. It took a few minutes at home, in the same session, without reconstructing context from scratch.
In a vibe-coded build, that iPhone bug might have required hours of archaeology through a codebase no one fully understood. In a HITM build, it required reading a log, identifying the constraint that had been violated, and correcting it within the established architecture.
The Handoff Document as Proof of Method
The handoff snapshot produced at the end of the build is itself a HITM artifact. It documents the production setup, every working n8n workflow endpoint and its purpose, the data schema, the file storage structure, the Docker mount configuration, the critical gotchas already encountered (with their solutions), the files most recently modified, and the recommended next work.
Any future session — with any LLM, or with a human developer — can resume from this document without losing architectural understanding. That is what HITM produces that conversational AI-assisted building does not: a system that remains legible to its owner.
What This Case Study Demonstrates
The methodology is domain-agnostic. Spiegel des Universums is not a SaaS product, a business tool, or a productivity application. It is a street art ritual with a cut-up phrase generator and a Mirrorling card. HITM worked here not because the project was technical but because the project had structure before execution began.
The control condition exists. Three days of conversational building produced nothing. One day of structured building produced a deployed, street-tested application. Same project. Comparable AI capability, in a different tool. Different architecture.
The human decisions were not technical. "Plain code, no framework." "The static site will be the Ghost site." "Both, user's choice." None of these required developer knowledge. They required understanding of the project's purpose, which is always a human domain.
The debug loop is a feature of the architecture, not a separate skill. The iPhone fix was fast because the build was legible. Legibility is not an accident in HITM — it is a structural output.
The handoff document closes the loop. A build that cannot be handed off — to a future self, to a collaborator, to a developer — is not a finished build. HITM produces handoff-ready output as a matter of discipline, not as an afterthought.
Appendix: The System That Was Built
For reference, the complete Spiegel des Universums production system:
Infrastructure: VPS, Ghost 6.x, n8n in Docker, Git-based deployment from local development.
Frontend: Plain HTML/CSS/JS custom Ghost theme. No framework.
Geolocation engine: User-facing distance selector, constrained random point generation, Apple Maps and Google Maps navigation link generation, iOS/Android detection for default map choice.
n8n workflows: Photo upload (validation, disk write, database insert), gallery data serving (JSON with full item schema), image binary delivery, admin deletion with token authentication.
Data: SQLite table `spiegel_pinboard` with columns for id, username, glitch phrase, coordinates, latitude, longitude, comment, and filename.
File storage: Docker-mounted volume, container path mapped to VPS host path.
Authentication: Ghost member gating for upload submission. Public gallery.
Downloadable artifacts: Zine (PDF), Glitch Tag cards, Mirrorling card.
Status: Production. Street-tested. Working.
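The `spiegel_pinboard` table listed above might look something like the following. Only the table name and the list of columns come from this case study. The column identifiers, types, constraints, the format of the `coordinates` string, and the helper names are assumptions made for the sketch.

```python
import sqlite3

# Column names and types are illustrative guesses based on the column list.
SCHEMA = """
CREATE TABLE IF NOT EXISTS spiegel_pinboard (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    username      TEXT NOT NULL,
    glitch_phrase TEXT NOT NULL,
    coordinates   TEXT,  -- assumed: display string such as "52.52,13.405"
    latitude      REAL,
    longitude     REAL,
    comment       TEXT,
    filename      TEXT NOT NULL
)
"""


def open_pinboard(path=":memory:"):
    """Open the database and make sure the pinboard table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_trace(conn, username, glitch_phrase, latitude, longitude, comment, filename):
    """Insert one submitted trace and return its new row id."""
    coordinates = f"{latitude},{longitude}"
    cursor = conn.execute(
        "INSERT INTO spiegel_pinboard "
        "(username, glitch_phrase, coordinates, latitude, longitude, comment, filename) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (username, glitch_phrase, coordinates, latitude, longitude, comment, filename),
    )
    conn.commit()
    return cursor.lastrowid
```

Parameterised queries keep user-supplied fields such as the comment out of the SQL text, which matters for a table fed by public submissions.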
Spiegel des Universums is live at spiegeldesuniversums.forkedtongue.fun. The Human-in-the-Middle methodology is documented at jba.schmidtpabst.com.