Most people use AI like a search engine with better grammar. I use it like a business partner.
That shift — from tool to partner — changed everything. Not because the models got smarter (though they did). But because I changed how I set things up. The memory architecture, the coding discipline, the execution model — it all compounds.
Here's the honest, technical version.
The Core Problem: AI Has No Memory
Every session starts from zero. No context about your stack, your decisions, your ongoing projects, your architecture choices. You either re-explain everything each time (slow), or you get generic output that doesn't fit your situation (useless).
Everything I've built is a solution to this problem.
Layer 1: CLAUDE.md — Persistent Context Files
The foundation is a file called CLAUDE.md. It's plaintext Markdown that Claude Code automatically loads at the start of every session. An employee handbook that gets read every morning before work begins.
Two levels:
Global ~/.claude/CLAUDE.md — Loaded in every session, everywhere:
- Identity and business context
- Hard rules that must never be broken (critical separations, explicit "NEVER" tables)
- Stack and tooling preferences
- Custom command vocabulary ("when I say X, do Y")
- Database location and key query patterns
- Directory structure
Project-level CLAUDE.md — Loaded only in that project's directory:
- Dev commands (pnpm dev, pnpm build, specific scripts)
- Architecture decisions specific to that codebase
- Current focus area and known gotchas
The design principle: load only what's needed, in order of criticality. The global file has a priority tier system — P1 (always read), P2 (daily work reference), P3 (search on demand). Not everything gets loaded into the context window; some things are pointers to look up.
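To make the tiers concrete, here is a stripped-down illustration of the global file's shape. The entries, paths, and stack choices below are placeholders, not my actual configuration:

```markdown
## P1 (always read)
- NEVER send marketing email through the transactional system, or vice versa.
- Stack defaults: pnpm, TypeScript, SQLite. Don't suggest alternatives unprompted.

## P2 (daily work reference)
- "standup" = query pipeline, open tasks, and the latest session summaries, then brief me.
- Business database: ~/data/business.db (placeholder path).

## P3 (search on demand)
- Deployment runbooks and client contexts live in the knowledge base; search them, don't load them.
```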
What Goes In a Good CLAUDE.md
The mistake most people make: they try to document everything. The right approach: document only what changes AI behaviour.
High-value entries:
- Rules that prevent catastrophic mistakes (e.g. "never mix up email systems — wrong system = disaster")
- Vocabulary mapping ("when I say 'standup', run these specific queries")
- Stack and tool preferences that would otherwise get generic answers
- Database schema patterns and quick query templates
- Paths to critical scripts
Useless entries:
- Information the AI already knows
- Decisions you haven't made yet
- Anything more than two levels deep in rarely-used detail
The file evolves. After any session where I re-explained something that should already be known, I add it.
Layer 2: Semantic Knowledge Search
For context too large for CLAUDE.md but too important to ignore, I use a semantic search layer over my own documents — a vector-indexed collection of Markdown files covering project contexts, reference material, process docs, and notes.
The rule: search first, don't load full files. Instead of loading an entire 500-line context document, I ask "how do I handle X in this system?" and get the relevant 3-line answer.
Collections:
- Config — skills, contexts, reference docs (~385 documents)
- Business — project contexts, process docs (~25 documents)
- Notes — ideas, research, journal (~10 documents)
The interface is natural language queries that resolve to specific document fragments. Context window stays lean; large knowledge base remains accessible on demand.
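For a rough picture of what sits underneath, imagine a fragment store along these lines: each Markdown file is split into fragments, each fragment gets an embedding, the client ranks fragments by similarity, and only the top matches are loaded. Table and column names here are assumptions; the real index may be built differently.

```sql
-- Sketch of a fragment store for the semantic layer (schema is assumed).
-- Similarity ranking over the embeddings happens in the client; SQL only
-- fetches the few fragments that won.
CREATE TABLE IF NOT EXISTS knowledge_fragments (
  id         INTEGER PRIMARY KEY,
  collection TEXT NOT NULL,   -- 'config', 'business', or 'notes'
  source     TEXT NOT NULL,   -- path to the Markdown file
  heading    TEXT,            -- nearest heading, useful for citing the source
  body       TEXT NOT NULL,   -- the few-line fragment itself
  embedding  BLOB NOT NULL    -- vector for the fragment
);

-- After ranking, only the top fragments enter the context window:
SELECT source, heading, body
FROM knowledge_fragments
WHERE id IN (?, ?, ?);        -- top-k ids from the similarity pass
```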
Layer 3: Database-Backed Session Memory
The deepest memory layer: a SQLite database that persists everything across sessions, weeks, and months.
Every time I finish a significant chunk of work, a session summary is written:
```sql
INSERT INTO sessions (
  project, summary, focus, open_questions,
  next_steps, files_touched, date
) VALUES (?, ?, ?, ?, ?, ?, date('now'));
```
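The columns above imply a sessions table roughly like this. A sketch, inferred from the insert rather than copied from the real schema:

```sql
-- Assumed shape of the sessions table (inferred, not the actual DDL).
CREATE TABLE IF NOT EXISTS sessions (
  id             INTEGER PRIMARY KEY,
  project        TEXT NOT NULL,
  summary        TEXT NOT NULL,
  focus          TEXT,
  open_questions TEXT,
  next_steps     TEXT,
  files_touched  TEXT,
  date           TEXT NOT NULL
);
```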
When I start work on a project, the first thing that happens:
```sql
SELECT summary, focus, open_questions, next_steps
FROM sessions
WHERE project = ?
ORDER BY id DESC LIMIT 1;
```
Within ten seconds of starting, I have full context on exactly where I left off, what was unresolved, and what comes next.
The database covers 20 tables, including deals, contacts, meetings, tasks, KPIs, goals, habits, finances, content, ideas, daily logs, sessions, and travel. Single source of truth for everything.
The database-first rule: whenever anything happens — task completed, deal progressed, meeting occurred, revenue received — the database gets updated first. Not a Notion page. Not a Markdown file. The database.
This sounds tedious. It's not, because the AI mediates all database access in natural language. "Log this meeting" becomes an INSERT. "Show pipeline" becomes a SELECT. "Mark that lead as not interested" becomes a PATCH to the CRM API. The underlying queries happen transparently.
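Concretely, the translations look something like this. Table and column names here are illustrative stand-ins, not the exact schema:

```sql
-- "Show pipeline": open deals, biggest first (illustrative names).
SELECT name, stage, value
FROM deals
WHERE stage NOT IN ('won', 'lost')
ORDER BY value DESC;

-- "Log this meeting": one row per meeting, outcomes captured alongside notes.
INSERT INTO meetings (contact, notes, outcomes, next_steps, date)
VALUES (?, ?, ?, ?, date('now'));
```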
The Coding Workflow: Plan-First Discipline
This is where most AI-assisted coding goes wrong. People describe what they want, get code, notice problems, iterate. It feels fast. It's actually the slow path — you spend enormous time debugging output built on wrong assumptions.
I follow a strict three-phase workflow:
Phase 1: Research
Before touching anything, the codebase gets explored comprehensively — every relevant file, every pattern that could be affected, every existing implementation that relates to the task.
Specific language matters: "Go through everything thoroughly. Understand all the intricacies of how X is implemented before we discuss any changes."
The output goes into research.md. Not chat. A file. Because:
- It survives context window compression
- It can be referenced in later phases
- It forces structured thinking over stream-of-consciousness exploration
research.md contains: what exists, how it works, what could break, what patterns are already established, what decisions were made and why.
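A skeleton of what comes out of this phase. The task and findings below are invented for illustration; the headings are the part that matters:

```markdown
# research: report caching

## What exists
- Report queries run directly against the primary connection in /lib/db.

## How it works
- Each API route builds its own query; there is no shared caching layer yet.

## What could break
- Tenant isolation: any cache must key on the tenant from the auth middleware.

## Established patterns
- Auth and tenant context come from /lib/auth/project-auth.ts.

## Decisions already made, and why
- Reads are meant to go through the read replica (see the /lib/db config pattern).
```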
Phase 2: Plan + Annotation Cycle
Research informs plan.md — a detailed implementation plan with file paths, code snippets showing the approach (not the full implementation), trade-offs considered, and a granular task checklist.
I read the plan in my editor and annotate it inline:
```markdown
<!-- MALTE: Don't use the generic auth middleware here —
we have a custom one at /lib/auth/project-auth.ts that
handles tenant context. -->

<!-- MALTE: DB connection needs the read replica, not
the primary — see config pattern in /lib/db -->
```
Send it back. Get a revised plan. Annotate again if needed. This cycle repeats 1-4 times. By the end, every significant architectural decision has been explicitly approved. The AI isn't guessing — it has domain constraints baked in.
Why this works: the plan is a shared artifact. It's the interface between my domain knowledge and the AI's implementation capability. Annotations are how domain knowledge flows in. Revisions confirm it was understood.
Phase 3: Implementation
Once the plan is approved, one command:
"Implement it all. Mark each item completed in plan.md as you go. Do not stop. Continuously run typecheck after each file."
Then I step back. The AI works through the checklist, ticking items, running type checks, fixing errors as they appear. I supervise loosely — making item-level decisions (skip this, modify that) without driving every keystroke.
The plan.md becomes a live progress document.
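Mid-implementation it reads something like this, picking up the annotation examples from above:

```markdown
## Tasks
- [x] Route report reads through the read replica (config pattern in /lib/db)
- [x] Pull tenant context from /lib/auth/project-auth.ts, not the generic middleware
- [ ] Add the per-tenant cache in front of the report queries
- [ ] Typecheck and fix any errors introduced
```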
Dual-Agent Architecture: Architect + Executor
For non-trivial coding, I use two AI tools with distinct roles.
Claude Code is my architect and copilot. This is the tool I talk to. It has my full CLAUDE.md context, understands the business domain, reviews code critically, and produces task specs. It's the brain I trust with what to build and how.
OpenCode is my coding executor. Once a plan is approved and a task spec is written, I hand it off and let it run. It writes the actual files, runs tests, fixes errors — in the background while I move on.
The interface between them: a detailed task spec saved to a file before execution starts. The spec contains:
- What to build in precise terms
- Constraints and patterns to follow
- Files that definitely need to change
- Acceptance criteria
- Explicitly what not to do
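A minimal version of such a spec, with the specifics invented for illustration:

```markdown
# Task: per-tenant report caching

## Build
- Cache report query results per tenant with a short TTL.

## Constraints and patterns
- Tenant context comes from /lib/auth/project-auth.ts, not the generic middleware.
- Read queries go to the read replica (config pattern in /lib/db).

## Files expected to change
- The report API routes and a new cache helper under /lib.

## Acceptance criteria
- Typecheck passes, existing report tests stay green, repeated requests skip the DB.

## Do not
- Touch the primary connection config or modify the generic auth middleware.
```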
This separation matters for a reason that isn't obvious: planning and executing require different cognitive modes. When the same session is both designing architecture AND writing code, you get sloppy output because execution pressure corrupts the planning. Separating them forces each to be done properly.
Practically: Claude Code in the terminal for all strategic work. OpenCode spinning in the project directory for implementation. Review the diff when it's done.
Skills: Reusable Prompt Programs
The third piece of the architecture: a library of skills — specialized prompt templates for recurring task types.
A skill is a Markdown file encoding the best approach for a specific category of work: writing outreach emails, creating proposals, doing SEO audits, running retrospectives, estimating API costs, generating database migrations.
Instead of starting from a blank prompt, I invoke a skill by name. The system loads the relevant context, asks for inputs, and produces output that follows established patterns.
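A skill file stays short. Here is a sketch of one from the client-work category, with the specifics invented:

```markdown
# Skill: proposal

## Inputs to collect
- Client name, problem statement, scope boundaries, budget range.

## Approach
- Pull the latest context for this client from the database before writing anything.
- Lead with the client's problem in their own words; present scope in three tiers.

## Output format
- One-page Markdown proposal: Problem, Approach, Scope, Timeline, Investment.

## Never
- Quote a price before scope boundaries are confirmed.
```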
The library covers:
- Dev: deploy, typecheck, migrate, release, PR creation
- CRO: page optimization, signup flows, onboarding, paywalls
- Marketing: copywriting, launch strategy, pricing, psychology
- SEO: audit, programmatic SEO, schema markup
- Client work: proposals, contracts, invoices, scoping, onboarding
Skills solve the consistency problem. The same task done ten times by ten different prompts produces ten different quality levels. Skills encode the prompt that produces the best output, every time.
The AI OS: How It All Connects
A real morning session looks like this:
Start of day: type standup → the AI queries the database (pipeline, KPIs, tasks, last sessions), checks what outreach needs to run, returns a concise briefing. I state my three priorities and they get logged as tasks.
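Behind that one word sits a small bundle of queries along these lines (again with illustrative names):

```sql
-- "standup": open tasks due soon, current KPIs, and where things left off.
SELECT title, due_date
FROM tasks
WHERE status = 'open'
ORDER BY due_date
LIMIT 10;

SELECT name, current_value, target
FROM kpis;

SELECT project, next_steps
FROM sessions
ORDER BY id DESC
LIMIT 3;
```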
Starting a coding task: describe the feature → Claude explores the codebase and writes research.md → produces plan.md → I annotate in my editor → plan gets revised and approved → hand off to OpenCode with task spec → it runs in background → I review the diff, ask Claude to review it critically → merge.
Capturing context: any decision or insight → "idea: [description]" → logged to database with timestamp. Any meeting → "log meeting: [notes]" → logged with outcomes and next steps. Any revenue → "log revenue: [amount]" → goals table updated. End of day → "eod" → session summary written, tasks updated.
The AI is the operating system. The database is the filesystem. Natural language is the shell.
Quick Reference: The Stack
| Layer | Tool | Purpose |
|-------|------|---------|
| Global context | ~/.claude/CLAUDE.md | Always-loaded rules and preferences |
| Project context | ./CLAUDE.md | Per-project commands and patterns |
| Knowledge base | Vector-indexed markdown | Semantic search over large doc sets |
| Session memory | SQLite database | Cross-session continuity |
| Planning | Claude Code | Architecture, review, task specs |
| Execution | OpenCode | Code writing, tests, iteration |
| Reusable patterns | Skills library | Consistent quality for recurring tasks |
| Business OS | SQLite + AI interface | Deals, tasks, habits, KPIs unified |
What This Actually Took
Honest accounting: this didn't emerge fully formed. The CLAUDE.md evolved over months of noticing what I kept re-explaining. The database schema went through three iterations. The plan-first discipline required real habit change — the instinct to just ask for code immediately is strong.
But the compounding effect is real. Every refinement to the context files improves every future session. Every session summary makes the next project start faster. Every new skill permanently raises the quality floor for that category of work.
The question isn't "does this take time to set up?" It does. The question is whether the compounding return justifies it.
For anyone doing serious, sustained work with AI: yes.