30 interviews · 5 themes · every claim quotes a real participant

The Agent extracts every meaningful statement, clusters them by sentence embedding in the sandbox, and proposes 5–7 themes for you to confirm. Each theme card lists frequency, distribution across participants, and 3–5 verbatim quotes. Save the framework — your follow-up study auto-codes against the same taxonomy.

A peek at what you get

Page 1 of 4 · Theme map
Page 2 of 4 · Top theme card
Page 3 of 4 · Quote anchor wall
Page 4 of 4 · Codebook appendix

30 interviews · 5 themes surface

Each statement gets embedded, clustered inside the sandbox, and the Agent proposes themes for you to confirm. Each point is a participant statement; its color is the cluster it landed in.

T1: Onboarding feels overwhelming (28)
T2: Wants more keyboard control (19)
T3: Hidden value · "Aha" moment late (24)
T4: Anxiety about pricing transparency (11)
T5: Trust requires audit trail (14)
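To make the "embed, then cluster" step concrete, here is a minimal sketch. It assumes each statement has already been turned into a vector; the 3-dimensional vectors below are invented for illustration, and the greedy threshold rule is a stand-in, not the clustering method the Agent actually uses.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def greedy_cluster(vectors, threshold=0.8):
    """Assign each vector to the first cluster whose seed is similar enough,
    otherwise start a new cluster. Returns one cluster index per vector."""
    seeds, labels = [], []
    for v in vectors:
        for idx, seed in enumerate(seeds):
            if cosine(v, seed) >= threshold:
                labels.append(idx)
                break
        else:
            seeds.append(v)
            labels.append(len(seeds) - 1)
    return labels

# Toy vectors (hypothetical); a real embedding model returns many more dimensions.
statements = [
    ("P-007", [0.9, 0.1, 0.0]),
    ("P-hypothetical", [0.8, 0.2, 0.1]),
    ("P-022", [0.0, 0.1, 0.9]),
]
labels = greedy_cluster([vec for _, vec in statements])
for (pid, _), label in zip(statements, labels):
    print(pid, "-> cluster", label)
```

The first two vectors point in nearly the same direction and share a cluster; the third starts its own. Each cluster index corresponds to one color on the map.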

Every theme · anchored to verbatim quotes

These are not "AI summary themes". Each theme card cites 3–5 verbatim quotes from different participants, so the lead researcher can audit every theme against what people actually said.

T3
P-014 · founder
I didn't realize I could just paste a folder and have it written about until day five.
T1
P-007 · designer
First login I just stared. There were nine panels open and no one telling me where to start.
T4
P-022 · CFO
I need a per-task cost number before I can let anyone on my team use this with our data.

Codebook saved to Drive · next quarter reads the file

Themes, definitions, and coding rules are written to a codebook.json in your Drive's research folder. The Q2 study starts by reading that JSON file, then codes the new batch against the same themes and produces a delta report.

/research/onboarding/codebook.json
2.4 KB · 5 themes · 96 coded statements
in Drive
{
  "themes": [
    { "id": "T1", "label": "Onboarding overwhelm", "n": 28 },
    { "id": "T2", "label": "Wants keyboard control", "n": 19 },
    { "id": "T3", "label": "Aha late", "n": 24 },
    { "id": "T4", "label": "Pricing anxiety", "n": 11 },
    { "id": "T5", "label": "Audit-trail trust", "n": 14 }
  ]
}
onboarding-themes-may-2026.pdf
1.8 MB
codebook.qda.xml
REFI-QDA spec
statements.csv
96 rows · with theme assignment
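If you want to sanity-check the saved codebook yourself, a short script is enough. The sketch below parses the sample JSON shown above and checks that the per-theme counts add up to the 96 coded statements. It assumes each statement is assigned to exactly one theme, which the sample numbers are consistent with but the page does not state.

```python
import json

# The sample codebook from above, inlined so the snippet runs on its own.
CODEBOOK_JSON = """
{
  "themes": [
    { "id": "T1", "label": "Onboarding overwhelm", "n": 28 },
    { "id": "T2", "label": "Wants keyboard control", "n": 19 },
    { "id": "T3", "label": "Aha late", "n": 24 },
    { "id": "T4", "label": "Pricing anxiety", "n": 11 },
    { "id": "T5", "label": "Audit-trail trust", "n": 14 }
  ]
}
"""

codebook = json.loads(CODEBOOK_JSON)
total = sum(theme["n"] for theme in codebook["themes"])

# Share of coded statements per theme, rounded for display.
shares = {theme["id"]: round(theme["n"] / total, 3) for theme in codebook["themes"]}

print(total)
print(shares)
```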

How it works

Step 01

Drop the transcripts

Plain text, Markdown, DOCX, or PDF files all work. If you have audio, run /tools/meeting-minutes first to transcribe it; that is the cheaper, faster path. The Agent expects each transcript to identify the participant in the file name or a header line.
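One way to prepare files so the participant is easy to find is sketched below. The `P-014` style ID matches the quote cards on this page, but the exact file-name and header conventions the Agent accepts are not documented here, so the pattern and the `participant_id` helper are assumptions.

```python
import re
from typing import Optional

# Assumed ID shape: "P-" followed by three digits, e.g. P-014.
PARTICIPANT_RE = re.compile(r"(?<![A-Za-z0-9])P-(\d{3})(?!\d)")

def participant_id(file_name: str, text: str) -> Optional[str]:
    """Look for a participant ID in the file name first, then in the first
    non-empty line of the transcript. Returns None if neither has one."""
    match = PARTICIPANT_RE.search(file_name)
    if match is None and text.strip():
        header_line = text.lstrip().splitlines()[0]
        match = PARTICIPANT_RE.search(header_line)
    return f"P-{match.group(1)}" if match else None

print(participant_id("interview_P-014.txt", ""))
print(participant_id("notes.md", "Participant: P-007 (designer)\nFirst login I just stared."))
print(participant_id("notes.md", "No ID in this header"))
```

Running a check like this over a folder before upload flags transcripts that would otherwise arrive without an identifiable participant.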

Step 02

Pick a coding mode · sign off on themes

Let the themes emerge from the data (inductive), code against your existing framework (deductive), or compare to last quarter's framework (longitudinal). For inductive, the Agent shows proposed themes in chat before rendering — you rename, merge, or split. Nothing is finalized without your sign-off.

Step 03

Pick up the PDF + framework

The PDF lands in your Drive: theme cards, a 2D map showing how themes relate, a quote-anchor wall (3–5 verbatim excerpts per theme), and an appendix listing every coded statement. The Agent also writes a `codebook.json` to the same Drive folder — next quarter's round reads it and codes against the same themes.

Why Vecbase for this

Every theme cites participant quotes · no "the model found 5 themes"

Cheap qualitative tools spit out theme names without grounding. The Agent's theme cards each carry 3–5 verbatim excerpts from different participants — you can read why the theme exists, not just trust a label. The codebook appendix lists every coded statement so the lead researcher can audit every assignment.
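The grounding rule can be written down as a simple check, sketched here under stated assumptions: `check_theme_card` is a hypothetical helper, and the transcripts are stand-in strings built around the quotes. The three sample quotes come from this page, where they belong to different themes; they are grouped here only to exercise the check.

```python
def check_theme_card(quotes, transcripts, min_quotes=3, max_quotes=5):
    """Return a list of problems for one theme card.
    quotes: list of (participant_id, quote_text) pairs.
    transcripts: dict mapping participant_id to full transcript text."""
    problems = []
    if not (min_quotes <= len(quotes) <= max_quotes):
        problems.append(f"expected {min_quotes}-{max_quotes} quotes, got {len(quotes)}")
    participants = [pid for pid, _ in quotes]
    if len(set(participants)) != len(participants):
        problems.append("quotes repeat a participant")
    for pid, quote in quotes:
        # Verbatim means an exact substring match against the source transcript.
        if quote not in transcripts.get(pid, ""):
            problems.append(f"quote not found verbatim in transcript {pid}")
    return problems

quotes = [
    ("P-014", "I didn't realize I could just paste a folder and have it written about until day five."),
    ("P-007", "First login I just stared. There were nine panels open and no one telling me where to start."),
    ("P-022", "I need a per-task cost number before I can let anyone on my team use this with our data."),
]
# Stand-in transcripts: each one simply wraps the participant's quote.
transcripts = {pid: f"[start of transcript] {text} [end of transcript]" for pid, text in quotes}

ok = check_theme_card(quotes, transcripts)
too_few = check_theme_card(quotes[:2], transcripts)
print(ok)
print(too_few)
```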

Embeddings run in the sandbox · your data does not become training data

Sentence-embedding clustering happens inside your workspace-scoped sandbox with an open embedding model. Transcripts never leave the sandbox boundary; only the named outputs (PDF, framework JSON) land in your Drive. We do not train on customer interview content.

Themes you reject get logged · the Agent doesn't sneak them back in

When you reject a proposed theme in chat ("merge these two", "this label is wrong"), the decision is logged in the framework JSON. If a future round tries to reintroduce a similar cluster, the Agent surfaces the prior rejection as context — you can confirm the new round changes things or apply the same rejection consistently.
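As a rough picture of how a logged rejection could resurface, consider the sketch below. The `Rejection` record, its field names, the sample log entry, and the word-overlap similarity are all hypothetical; the page does not describe the framework JSON's rejection format or how the Agent measures "similar".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rejection:
    label: str      # label of the theme that was rejected (hypothetical field)
    decision: str   # what the researcher decided (hypothetical field)

def label_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase word sets of two labels."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0

def prior_rejections(candidate_label, log, threshold=0.5):
    """Return logged rejections whose label resembles the new candidate."""
    return [r for r in log if label_overlap(candidate_label, r.label) >= threshold]

# Invented log entry for illustration.
log = [Rejection(label="Pricing confusion", decision="merged into T4")]

hits = prior_rejections("Confusion about pricing", log)
misses = prior_rejections("Wants keyboard control", log)
print(hits)
print(misses)
```

A match does not block the new theme; it only gives you the earlier decision as context before you confirm or reject again.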

The framework gets better every round — not from scratch every time

Most qualitative tools start at zero every project. Here the codebook persists in your workspace. Quarter 2 picks up Quarter 1's themes, adds what's new, and retires what's gone, and the Agent shows you the diff. Over a year you have a research framework that captures how the way users talk about your product has evolved, not four disconnected studies.
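A quarter-over-quarter diff of two codebooks might look like the sketch below. The Quarter 1 data is the sample codebook from this page; the Quarter 2 data, including theme T6, is invented for illustration, and `codebook_delta` is a hypothetical helper, not the Agent's delta report format.

```python
def codebook_delta(previous, current):
    """Compare two codebooks by theme id: which themes were added,
    which were retired, and how counts changed for themes in both."""
    prev = {t["id"]: t for t in previous["themes"]}
    curr = {t["id"]: t for t in current["themes"]}
    shared = sorted(prev.keys() & curr.keys())
    return {
        "added": sorted(curr.keys() - prev.keys()),
        "retired": sorted(prev.keys() - curr.keys()),
        "count_change": {tid: curr[tid]["n"] - prev[tid]["n"] for tid in shared},
    }

q1 = {"themes": [
    {"id": "T1", "label": "Onboarding overwhelm", "n": 28},
    {"id": "T2", "label": "Wants keyboard control", "n": 19},
    {"id": "T3", "label": "Aha late", "n": 24},
    {"id": "T4", "label": "Pricing anxiety", "n": 11},
    {"id": "T5", "label": "Audit-trail trust", "n": 14},
]}
# Hypothetical Quarter 2 codebook: T2 retired, T6 new, T1 mentioned less often.
q2 = {"themes": [
    {"id": "T1", "label": "Onboarding overwhelm", "n": 20},
    {"id": "T3", "label": "Aha late", "n": 24},
    {"id": "T4", "label": "Pricing anxiety", "n": 11},
    {"id": "T5", "label": "Audit-trail trust", "n": 14},
    {"id": "T6", "label": "Hypothetical new theme", "n": 9},
]}

delta = codebook_delta(q1, q2)
print(delta)
```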

Frequently asked

Tell it in chat. Common edits: "merge themes A and B — they're the same finding from two angles", "split theme C — there are two distinct sub-themes hiding here", "rename theme D to 'X' — that's how this audience talks about it". The Agent re-renders the PDF with your edits and logs the decision in the framework. The Agent's job is to surface candidate structure — yours is to decide what the data is actually saying.

Get yours in under 90 seconds

Sign in and hand your transcripts to the Agent; the finished file lands in your Drive.