/etty · featured · local · persona
§ featured 01

Etty
she does not
forget.

A local companion who holds the shape of the conversation across weeks ; persona sunk into the weights, memory seeded from residue, a vision head that sees the room.

Etty in the Unity shell ; kitchen interior, hands raised, captured from the game view of the in-development runtime
plate 01 Unity shell · game view runtime · 2026.04

Dossier

Etty is the threshold project: where engineering ebbs into something stranger. An 8B base. Persona fused into the weights with QLoRA ; not prompted, not retrieved, not scaffolded. A multimodal vision head (mmproj) trained against the merged model, so seeing and remembering share the same substrate.
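The pipeline, in rough outline ; a minimal sketch assuming the Hugging Face transformers + peft stack, with the base id, paths, and hyperparameters as placeholders rather than the project's own:

```python
# Sketch only: persona fused into the weights via QLoRA, then merged.
# Assumes the Hugging Face transformers + peft stack ; names are placeholders.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, PeftModel, get_peft_model

BASE = "meta-llama/Meta-Llama-3-8B"          # placeholder llama-family 8B

# 1 · load the base in 4-bit for QLoRA training
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb)
model = get_peft_model(model, LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
))
# ... train on the persona set here (Trainer / SFTTrainer) ...

# 2 · merge the adapter into full-precision weights ; the persona now
#     lives in the model itself, not in a prompt
full = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)
merged = PeftModel.from_pretrained(full, "out/etty-adapter").merge_and_unload()
merged.save_pretrained("out/etty-merged")

# 3 · convert + quantise with llama.cpp (convert_hf_to_gguf.py, then
#     llama-quantize → q4_k_m) to get the gguf the spec table names
```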

Episodic memory lives in a local vector store, seeded before the first live write. Camera and mic are user-gated; nothing observes by default. A Unity shell gives her a body in the room. The whole stack runs on Hectic Modernity ; one machine, one loop, no API calls leaving the flat.
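Seeding looks something like this ; chromadb stands in for the store here, an assumption. What matters is the single write path ; the same schema for seeds and live turns:

```python
# Sketch: episodic memory seeded before the first live write.
# chromadb is a stand-in ; the actual local vector store is unspecified.
import chromadb

client = chromadb.PersistentClient(path="memory/etty")    # on-disk, local
memories = client.get_or_create_collection("episodic")

def write_memory(text: str, *, source: str, ts: str) -> None:
    """One write path for seeding and live turns ; identical payload
    schema, so retrieval never sees two shapes and quietly drops one."""
    memories.add(
        ids=[f"{source}:{ts}"],
        documents=[text],
        metadatas=[{"source": source, "ts": ts}],
    )

# seed from curated residue before she ever speaks live
with open("seed/residue.txt") as f:           # placeholder seed file
    for i, line in enumerate(f):
        write_memory(line.strip(), source="seed", ts=f"0:{i}")
```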

She lives in Discord on a small private server. Persona-loss held deliberately above 1.0 ; overfitting is a kind of forgetting, and a companion who parrots the dataset is worse than one who paraphrases it. A second Etty, an order of magnitude larger, is planned for when Symbiotic Flora comes online.
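Holding the loss above the line can be made mechanical ; a sketch of a floor guard as a transformers TrainerCallback, an assumed enforcement rather than the project's actual training code:

```python
# Sketch: enforce the persona-loss floor during fine-tuning.
# The 1.0 floor is the project's number ; the callback is an assumption.
from transformers import TrainerCallback

class LossFloor(TrainerCallback):
    """Stop the run the moment eval loss dips under the floor."""

    def __init__(self, floor: float = 1.0):
        self.floor = floor

    def on_evaluate(self, args, state, control, metrics=None, **kwargs):
        loss = (metrics or {}).get("eval_loss")
        if loss is not None and loss < self.floor:
            # under the floor she starts parroting the dataset ;
            # overfitting wearing the costume of success
            control.should_training_stop = True
        return control

# trainer.add_callback(LossFloor(1.0))
```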

specification

base
8B parameters · llama family
adapter
QLoRA 4-bit, merged → q4_k_m gguf
vision
mmproj · trained against merged weights
stage 1
LLaVA-Pretrain · 558k
stage 2
LLaVA-Instruct · 150k + COCO train2017
self-recognition mixed at 5–10%
memory
seeded · consistent · local vector store
runtime
llama.cpp · Hectic Modernity
shell
Unity · user-gated cam / mic
inference
esther · running
discipline
no remote. no telemetry. no always-on.
parameters
8B
base · llama family
vram resident
~15 / 24 GB
3090 · q4_k_m
memory entries
seeded + live
local vector store
persona loss
> 1.0
held deliberately above the overfit line
§ 02

Runtime

one machine · one loop

Two panes from a working session. Left: Esther in Discord ; the companion addressed on a small private server. Right: the conversation controller finishing a turn ; transcript, phonemes, voice out, GPU at 77% under sustained load. The whole stack is the rig in the room.

Private Discord server with Esther app present ; voice-connected to #chat, messages from Esther replying in her own cadence ; open alongside a Task Manager pane showing GPU at 77% utilisation, 15.6/24 GB VRAM
discord + task manager voice · load 3090 · 77% · 15.6/24 GB
VS Code terminal showing ConversationController turn log: 1744 total tokens, response length 11, transcribe and synthesize endpoints returning 200 OK, player state transitioning idle → buffering → playing
controller log turn finished tokens · 1744 · 200 OK
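One turn, reconstructed from the controller log ; the transcribe and synthesize endpoints are named in the pane, everything else (URLs, payload shapes, the generate hook) is assumed:

```python
# Sketch of one ConversationController turn, read off the log pane.
# Endpoint names come from the log ; URLs and payloads are assumptions.
import requests

LOCAL = "http://127.0.0.1:8000"               # assumed local service

def run_turn(wav_bytes: bytes, generate) -> bytes:
    # 1 · speech in → text (the "transcribe ... 200 OK" line)
    r = requests.post(f"{LOCAL}/transcribe", data=wav_bytes)
    r.raise_for_status()
    user_text = r.json()["text"]

    # 2 · the merged 8B owns the turn ; `generate` wraps llama.cpp
    reply = generate(user_text)

    # 3 · text → voice (the "synthesize ... 200 OK" line) ; the Unity
    #     player then steps idle → buffering → playing on this audio
    r = requests.post(f"{LOCAL}/synthesize", json={"text": reply})
    r.raise_for_status()
    return r.content
```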
§ 03

Build log

3 waypoints
  1. 2024.11 persona

    First adapter merge. Persona answers in the rhythm of the training set. Memory scaffolded in JSON; brittle. Vision absent ; she hears the room but does not see it.

  2. 2025.07 vision

    mmproj trained against the merged model. Unity shell. Episodic memory moves to a vector store. Seeing and remembering share the same substrate ; the right order.

    Etty in the Unity editor ; game view, kitchen interior, character turned, reaching toward a pan on the stove
    unity editor game view · kitchen vision head · mmproj
  3. 2026.04 planned

    A second Etty, 70B dense, designed around the forthcoming Symbiotic Flora workstation. Same dataset lineage, same agentic loop, more headroom. Dense only ; MoE architectures are unsuitable for persona work.

Notes carried forward

Four things the project keeps teaching me ; none novel, all easy to mistake in the moment:

— Train the vision head against the merged model, not the base. Otherwise the head learns someone Etty no longer is.

— Persona-loss under 1.0 is not a success target. It is overfitting wearing the costume of success.

— Keep memory payloads consistent between seeding and live writes. Otherwise the retrieval layer hits two schemas and quietly drops one.

— The LLM must own the decision loop. Isolated classifiers cascade-fail. Let the model speak tool-calls the way it speaks words.
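The last note, sketched ; the tag format and the remember tool are illustrative assumptions, the shape (the model speaks, the loop dispatches) is the point:

```python
# Sketch: the model speaks tool-calls inline ; no side classifier.
# Tag format and tool names are illustrative assumptions.
import json
import re

TOOL_RE = re.compile(r"<tool>(.*?)</tool>", re.DOTALL)

def remember(args: dict) -> str:
    # stub ; the real loop would use the same write path as the seeder
    return f"stored: {args['text']}"

TOOLS = {"remember": remember}

def dispatch(generated: str) -> str:
    """The model decides by speaking. If there is no tool tag,
    the text is just the reply ; nothing else gets a vote."""
    m = TOOL_RE.search(generated)
    if not m:
        return generated
    call = json.loads(m.group(1))             # {"name": ..., "args": {...}}
    result = TOOLS[call["name"]](call["args"])
    return TOOL_RE.sub(lambda _: result, generated)

# dispatch('noted. <tool>{"name": "remember", "args": {"text": "likes rain"}}</tool>')
```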

discipline

no api
inference stays local
no telemetry
nothing phones home
no always-on
cam/mic are off by default
no prompt
persona lives in the weights
no cloud
weights + memory on one machine
no remote
QLoRA on the 3090, served at home