Status: draft (M3 candidate; benched alongside SPEC-007 in M2.5)
Owner: OpenQuackKit/Polish/ (extends TextPolishEngine from SPEC-007)
Last updated: 2026-04-29
Augment the LLM polish step with active-app context so the polished transcript reads correctly for the app it is about to be pasted into. The same spoken sentence should produce a Slack-shaped DM in Slack, prose in Pages, and code/comment style in VS Code. Domain terms should also resolve correctly given the context (e.g. when Whisper mishears “in-context” as “income tax”, the polish step corrects it if the foreground app is VS Code).
This is the M3 row from docs/ROADMAP.md:
Active-app context: feed the foreground app + focused input field’s surrounding text into Whisper’s prompt bias and the polish/agent prompt, so domain terms resolve correctly and the agent has the same context the user does.
Picking the SPEC-007 default model without testing in-context behaviour risks shipping a model that polishes well in isolation but ignores context when given it. We bench in-context cases now so the model recommendation considers all three dimensions (see SPEC-007 §Quality gates). Implementation lands in M3 once SPEC-007 has shipped its base polish UI.
Extend PolishContext from SPEC-007:
public struct PolishContext: Sendable {
    public let language: String?
    public let foregroundApp: AppContext?   // nil if user disabled in Settings
    public let timestamp: Date
}

public struct AppContext: Sendable {
    public let bundleID: String             // e.g. "com.tinyspeck.slackmacgap"
    public let displayName: String          // "Slack"
    public let category: AppCategory        // coarse bucket; see below
}

public enum AppCategory: String, Sendable {
    case chat       // Slack, Discord, iMessage, Teams
    case email      // Mail, Outlook, Spark
    case code       // VS Code, Xcode, JetBrains, Cursor
    case docs       // Pages, Word, Google Docs (Safari/Chrome)
    case terminal   // Terminal, iTerm, Warp
    case browser    // generic
    case other
}
The category is what the prompt actually consumes; the bundle ID is kept for telemetry and future per-app tuning.
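As a rough illustration of how a bundle ID might be reduced to the coarse category, here is a minimal sketch. The lookup table and the `categorize(bundleID:)` helper are hypothetical and not part of SPEC-007 or this spec. Only the Slack bundle ID comes from the text above; the others are illustrative examples.

```swift
// Mirrors the AppCategory enum defined in this spec.
enum AppCategory: String {
    case chat, email, code, docs, terminal, browser, other
}

// Illustrative table only; a real list would need to be curated and tested.
let knownCategories: [String: AppCategory] = [
    "com.tinyspeck.slackmacgap": .chat,
    "com.apple.mail": .email,
    "com.microsoft.VSCode": .code,
    "com.apple.dt.Xcode": .code,
    "com.apple.iWork.Pages": .docs,
    "com.apple.Terminal": .terminal,
    "com.googlecode.iterm2": .terminal,
    "com.apple.Safari": .browser,
]

/// Hypothetical helper: unknown bundle IDs fall back to `.other`.
func categorize(bundleID: String) -> AppCategory {
    return knownCategories[bundleID] ?? .other
}
```

Keeping the table separate from the enum would let per-app tuning be added later without touching the prompt, which only ever sees the category.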
Prepend a single context line to the user message:
[Context: writing in {category} ({displayName})]
{raw_transcript}
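The template above could be filled in along these lines. This is a sketch under assumptions: `buildUserMessage` is a hypothetical name, and omitting the context line when `foregroundApp` is nil is one plausible reading of the "disabled in Settings" case, not something this spec states.

```swift
enum AppCategory: String {
    case chat, email, code, docs, terminal, browser, other
}

struct AppContext {
    let bundleID: String
    let displayName: String
    let category: AppCategory
}

/// Hypothetical helper: prepends the context line when an app context exists.
/// Assumption: with no app context, the raw transcript is sent unchanged.
func buildUserMessage(rawTranscript: String, app: AppContext?) -> String {
    guard let app = app else { return rawTranscript }
    let contextLine = "[Context: writing in \(app.category.rawValue) (\(app.displayName))]"
    return contextLine + "\n" + rawTranscript
}
```

Called with a Slack context, this yields the context line followed by the transcript on the next line; called with `nil`, it returns the transcript as-is.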
Per-category nudges baked into the system prompt:
chat: keep it short, casual, no full bullets unless the user clearly listed several items.
email: formal sentences, no bullets unless requested, end with proper punctuation.
code: preserve identifiers, don’t add prose; if input is clearly a code comment, format as one line.
docs: paragraph form, prefer prose over bullets.
terminal: single-line command-shaped output if input is a command; otherwise prose. Do not invent flags.
browser / other: fall back to base SPEC-007 behaviour.

These nudges are appended to the SPEC-007 system prompt; they do not replace it.
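One way to read "appended to the SPEC-007 system prompt" is that every nudge is added statically and the model selects the relevant one from the context line. The sketch below assumes that reading. `systemPrompt(base:)` and `categoryNudges` are hypothetical names, and the base prompt is passed in because its text lives in SPEC-007.

```swift
import Foundation

// One entry per category that has a nudge. browser and other are omitted
// because they fall back to the base SPEC-007 behaviour.
let categoryNudges: [(category: String, nudge: String)] = [
    ("chat", "keep it short, casual, no full bullets unless the user clearly listed several items."),
    ("email", "formal sentences, no bullets unless requested, end with proper punctuation."),
    ("code", "preserve identifiers, don't add prose; if input is clearly a code comment, format as one line."),
    ("docs", "paragraph form, prefer prose over bullets."),
    ("terminal", "single-line command-shaped output if input is a command; otherwise prose. Do not invent flags."),
]

/// Hypothetical helper: returns the base prompt followed by a blank line
/// and one "category: nudge" line per entry, in the order listed above.
func systemPrompt(base: String) -> String {
    let nudgeLines = categoryNudges.map { "\($0.category): \($0.nudge)" }
    return ([base, ""] + nudgeLines).joined(separator: "\n")
}
```

An alternative design would append only the nudge for the active category, which keeps the prompt shorter at the cost of a different system prompt per app.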
In bench/polish_corpus/cases.jsonl, in-context cases use the
in_context category and group N raws × M contexts:
{"id": "ctx_001_chat", "category": "in_context", "language": "en",
"raw": "ok so I'm thinking we drop the model and pick one of the smaller ones",
"app_context": "chat",
"references": ["thinking we drop the current model and pick a smaller one"],
"must_contain": [], "must_not_contain": []}
{"id": "ctx_001_email", "category": "in_context", "language": "en",
"raw": "ok so I'm thinking we drop the model and pick one of the smaller ones",
"app_context": "email",
"references": ["I'm thinking we should drop the current model and pick a smaller one."],
"must_contain": [], "must_not_contain": ["ok so"]}
{"id": "ctx_001_code", "category": "in_context", "language": "en",
"raw": "ok so I'm thinking we drop the model and pick one of the smaller ones",
"app_context": "code",
"references": ["// drop the current model and pick a smaller one",
"Drop the current model and pick a smaller one."],
"must_contain": [], "must_not_contain": ["ok so"]}
Initial set: 10 raws × 3 contexts = 30 paired cases. The judge prompt
sees the app_context slot and scores whether the output fits.
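Alongside the judge score, the `must_contain` and `must_not_contain` fields lend themselves to a cheap mechanical check. The sketch below decodes one case line and applies them as case-insensitive substring tests. That matching rule, the `PolishCase` type, and `passesHardChecks` are assumptions made for illustration; the actual bench harness may define these differently.

```swift
import Foundation

// Shape of one line in cases.jsonl, as shown in the examples above.
struct PolishCase: Decodable {
    let id: String
    let category: String
    let language: String
    let raw: String
    let appContext: String?   // assumed optional for cases outside in_context
    let references: [String]
    let mustContain: [String]
    let mustNotContain: [String]

    enum CodingKeys: String, CodingKey {
        case id, category, language, raw, references
        case appContext = "app_context"
        case mustContain = "must_contain"
        case mustNotContain = "must_not_contain"
    }
}

/// Hypothetical check: every required phrase is present and no forbidden
/// phrase appears, compared case-insensitively as plain substrings.
func passesHardChecks(_ output: String, for testCase: PolishCase) -> Bool {
    let haystack = output.lowercased()
    let hasAllRequired = testCase.mustContain.allSatisfy { haystack.contains($0.lowercased()) }
    let hasForbidden = testCase.mustNotContain.contains { haystack.contains($0.lowercased()) }
    return hasAllRequired && !hasForbidden
}

// The ctx_001_email case from above, written on a single line as JSONL requires.
let emailLine = """
{"id": "ctx_001_email", "category": "in_context", "language": "en", "raw": "ok so I'm thinking we drop the model and pick one of the smaller ones", "app_context": "email", "references": ["I'm thinking we should drop the current model and pick a smaller one."], "must_contain": [], "must_not_contain": ["ok so"]}
"""
let emailCase = try JSONDecoder().decode(PolishCase.self, from: Data(emailLine.utf8))
```

A check like this only catches leftover filler such as "ok so"; whether the output actually fits the app context is still left to the judge prompt.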
engine = off is a no-op.
email, not browser. Phase 1 says no (that needs AX/tab-URL access). Revisit after shipping.
context.py: gathered foreground app via osascript; the Swift port should use NSWorkspace.shared.frontmostApplication instead.
docs/ROADMAP.md M3 row 3.