openquack

Roadmap

Atomic tasks β€” every item cites a SPEC and maps to a PR. Agent contributors should claim a task by opening a draft PR; see AGENTS.md.

Reconciled against shipped code on 2026-05-29 (current release v2.0.0-alpha.16). Statuses below reflect what is actually on main, not what a spec proposed. The companion spec index tracks the same implementation state per spec.

Status


Adoption focus (current priority)

The product has a working foundation. The current cycle is about removing first-launch friction, fixing visible quality issues, and improving the surfaces new users land on β€” not building more features. Pick from this band first. Ordered by impact.

Β  Task Spec Notes
🟑 Code signing + notarisation β€” kill the first-launch Gatekeeper β€œright-click β†’ Open” dance SPEC-025 Biggest install-success unlock, still not delivered. Signing capability exists in scripts/wrap_app.sh (Developer-ID branch + hardened runtime + TSA timestamp), but no notarisation is wired (notarytool/stapler absent everywhere) and no shipped DMG is Developer-ID-signed β€” README/INSTALL still tell users to right-clickβ†’Open. Blocked on a paid Apple Developer ID ($99/yr account decision). Once that’s cleared: add notarytool submit --wait + stapler staple to make_dmg.sh/release CI, then strip the Gatekeeper caveats.
πŸ”΅ README demo gif β€” strong first-impression artifact SPEC-027 S; cheap solo win. The landing-page-polish half is already done (og/twitter cards, banner in docs/index.md). The genuinely-missing deliverable is a looping ≀2 MB demo GIF at the top of README + landing page β€” no *.gif/*.mp4 exists in-repo yet.
🟑 Mandarin auto-detect fix β€” visible quality bug (#17, open since 2026-05-09) SPEC-021 M. No fix on main yet. PR #20 (draft, CI green) banks only the categorical failure-mode metrics baseline + zh corpus β€” the user-facing fix (PR-2, model-layer language handling) is the actual deliverable. Land #20 as the baseline, then ship the fix. Pairs with #62 below.
🟑 Sparkle auto-update β€” existing users stay current without reinstalling SPEC-026 Far past spec-only: dependency + Updates pane + framework bundling all shipped (#40/#50/#49). But auto-install is a deliberate no-op β€” SUPublicEDKey omitted, appcast.xml ships zero items. To finish: generate EdDSA keys, commit the public key, sign + publish appcast items (PR-C). Gated behind SPEC-025 (notarised signed builds) for a trustworthy update path.
πŸ”΅ Voice reply to live sessions β€” modifier-key + notification voice-reply action SPEC-031a S; spec merged (#54), zero implementation on main. Injection mechanism (--resume -p vs daemon socket) resolved in impl PR.
βšͺ Re-transcribe a past dictation from History β€” recovery without re-speaking β€” (#62) New 2026-05-29 feature request. Re-run stored audio through transcription/polish from the History view, with optional explicit language + model override. Directly mitigates #17’s wrong-language case; reuses local history (SPEC-014). Needs a spec; good retention item.
βšͺ Submission tracking for awesome-mac / awesome-llm / awesome-swift lists β€” quiet but durable inbound
βšͺ Quarterly bench refresh β€” durable artifact + monthly relaunch hook β€” leverages existing 5Γ—2Γ—177 bench

Pending docs PRs (renumber required)

Two spec-doc PRs are clean and CI-green but cite already-assigned SPEC numbers and must be renumbered to the next free monotonic IDs before merge. They cross-reference each other, so renumber as a coordinated pair; do not rename the branches (renaming an open PR head ref can close it).

PR Becomes What Action
#41 SPEC-033 Privacy-respecting Pages visitor analytics (GoatCounter) Renumber 031β†’033; fix body β€œAdds SPEC-028”→033; its SPEC-032 cross-ref β†’ SPEC-034. Then merge.
#42 SPEC-034 Active-install count via Sparkle appcast hits (Cloudflare Worker) Renumber 032β†’034; fix body β€œAdds SPEC-029”→034; its two SPEC-031 cross-refs β†’ SPEC-033. Then merge. Impl gated on SPEC-026.

Feature backlog (deferred until adoption signal improves)

These have SPECs and are ready to claim, but we’re holding new feature scope until install + retention pipelines are strong enough to justify the surface area. Pick from here only when Adoption focus is empty.

Β  Task Spec Notes
🟑 Custom dictionary auto-learn (PR-A as #32; PR-B/C deferred) SPEC-022 M. PR #32 (post-paste AXObserver + correction store) is CI-green and merge-clean but on hold β€” it sits in this deferred band and Adoption focus isn’t empty. No auto-learn code is on main. (Distinct from the shipped static-dictionary seed, PR #59, made invisible again in alpha.16.)
πŸ”΅ Agent session protocol + PassthroughAgent + conversation panel SPEC-006 M
πŸ”΅ ClaudeCodeAgent β€” long-lived subprocess, streaming events SPEC-006 M
πŸ”΅ Approval prompt UX (overlay morph + buttons) SPEC-006 S
πŸ”΅ Settings β€” Privacy + Agent panes SPEC-006 S; lands with agent impl
πŸ”΅ TextPolishEngine protocol + OllamaPolishEngine (HTTP) SPEC-007 S
πŸ”΅ MLXLMPolishEngine (in-process via mlx-swift-lm) SPEC-007 M
πŸ”΅ Settings β†’ Polish pane (engine picker, model picker) SPEC-007 S
πŸ”΅ Bench polish WER delta + latency on openquack-bench SPEC-007 S
πŸ”΅ Domain-term accuracy bench (e.g. β€œClaude Code” not β€œcloud code”) SPEC-007 S
πŸ”΅ β€œSend-confidence” bench: % of utterances clean enough to ship as-is SPEC-007 S
πŸ”΅ Engine prompt-token cache β€” offline-path parity for customWords SPEC-032 S; perf-fix groundwork for re-shipping the dictionary seed default
πŸ”΅ Per-app tone profiles SPEC-024 M; needs SPEC-007 first
πŸ”΅ App localisation (i18n) β€” runtime UI strings (READMEs already translated) SPEC-019 M
πŸ”΅ ANE-cache-only model footprint (investigation) SPEC-029 S
πŸ”΅ ANE cache footprint: volunteer measurement campaign SPEC-030 S
βšͺ OllamaAgent (local HTTP) SPEC-006 ext S
βšͺ MLXLMAgent (in-process via mlx-swift-lm) SPEC-006 ext M
βšͺ Active-app context: feed foreground app + focused field text into Whisper prompt bias and polish/agent prompt β€” M
βšͺ Investigate streaming for medium-length (15–30s) audio: bench WER vs. wall-time at lower targetChunkSeconds SPEC-012 ext S
βšͺ Live partial transcripts in pill/popover while speaking β€” M
βšͺ System-audio capture (meeting mode) β€” ScreenCaptureKit
βšͺ Action confirmation UI for high-risk agent calls β€” privacy gate
βšͺ Per-agent transcript history pane (opt-in, local-only) β€” β€”
βšͺ Linux / Windows ports β€” post-2.0

Done

Β  Task Spec Notes
🟒 Agent kickoff: voice-launched claude --bg session via separate hotkey β€” daemon-managed, notification-driven SPEC-031 shipped in v2.0.0-alpha.14 (#53); post-ship fixes #57/#58
🟒 Update flow: install-aware banner + silent one-click brew upgrade (no Terminal, quit-first relaunch) SPEC-011 shipped via c193f0e; SemVer-precedence fix #61
🟒 Copy button on the last-transcript card SPEC-020 merged in #18
🟒 Send-feedback menu item β€” one click from status item to GitHub issue chooser SPEC-018 merged in #5
🟒 Dictation distribution + personal performance stats (Longest dictation, avg realtime Γ—, length histogram) SPEC-028 merged in #45
🟒 fn / Globe key as a bindable hotkey (bare fn or fn+key) β€” closes #23 SPEC-003a infra #28 (PR-A) + UI wiring a8371b8 (PR-B)
🟒 Launch at login (SMAppService toggle in Settings β†’ General) β€” closes #29 SPEC-023 merged in #33 (reconcile) + #39 (UI) β€” issue #29 still open, needs closing
🟒 Recording overlay follows the cursor across monitors β€” closes #25 SPEC-004 merged in #27
🟒 Usage stats pane: words dictated, time saved, audio processed β€” local-only SPEC-013 merged in c91da06
🟒 Local audio + transcript history β€” local-only, retention cap SPEC-014 merged in c91da06
🟒 Stream transcription for long audio (>~30s) β€” chunk while recording SPEC-012 perf; user never sees partials
🟒 App shell β€” SwiftPM target, menu bar, About panel SPEC-010 β€”
🟒 Audio capture β€” AVAudioEngine β†’ 16 kHz mono WAV SPEC-001 β€”
🟒 Global hotkey (βŒƒβ‡§Space toggle, KeyboardShortcuts pkg) SPEC-003 β€”
🟒 Record β†’ WhisperKit medium (en) β†’ transcript in popover + clipboard SPEC-002 β€”
🟒 Floating recording-state pill (top-centre, click-through) SPEC-004 β€”
🟒 CGEvent ⌘V auto-paste at cursor (Accessibility prompt + clipboard fallback) SPEC-005 β€”
🟒 Onboarding flow (Welcome β†’ Mic β†’ Paste β†’ Hotkey β†’ Done) β€” β€”
🟒 Settings scene MVP (General / Models / Shortcut / About) β€” β€”
🟒 Smart text post-processing (capitalise, punct, fillers) β€” β€”
🟒 Live level meter + push-to-talk SPEC-001 ext β€”
🟒 VAD auto-stop + sounds + custom dictionary β€” β€”
🟒 App icon (procedural cream-gradient duck) β€” β€”
🟒 DMG + Homebrew cask + README polish β€” β€”
🟒 WhisperKit engine SPEC-002 primary; Apple Silicon Metal
🟒 Lightning engine (Python subprocess) SPEC-002 bench-only baseline
🟒 Metrics: WER / CER / RTF / RSS / cold-start SPEC-002 OpenQuackKit/Metrics/
🟒 Corpus: 177 clips (TTS / multilingual / LibriSpeech / noise-aug) β€” bench/corpus/
🟒 Bench rerun on enriched corpus β†’ BENCHMARKS.md β€” M4/16GB matrix
🟒 openquack-cli (single-file transcribe) SPEC-002 β€”
🟒 SPM scaffolding (Kit + bench + CLI) β€” Package.swift, three targets
🟒 Vision + roadmap + AGENTS.md + spec scaffold β€” β€”

Becoming autonomous

North-star reflection, not a committed milestone. No item below is claimed, dated, or gated on the Adoption focus band. It records where the repo would have to change to let agents run the contribution loop on their own β€” so the gaps are written down, not rediscovered.

What exists β€” the contract. Specs-as-law (AGENTS.md), four bench harnesses (openquack-bench / -stream-bench / -polish-bench / -bias-bench), issue + PR templates, the GTM trackers and playbooks, and persistent agent memory. An agent that reads these knows what good work looks like and how it is judged.

The meta-gap β€” no machinery. The contract is human-readable, not machine-executable. .claude/ holds only worktrees/ β€” no committed agents/, commands/, workflows/, or settings. AGENTS.md is a manual a human (or a prompted agent) follows by hand; nothing runs the backlog β†’ claim β†’ branch β†’ implement β†’ draft-PR loop. CI smoke-tests the tiny model on the short corpus only, so quality regressions pass unseen.

Three unlocks.

Β  Unlock Spec State
πŸ”΅ Committed .claude/ agent harness β€” named roles, saved workflows, trigger + permissions model with human gates on irreversible steps SPEC-037 spec proposal
πŸ”΅ CI eval-gate β€” fetchable corpus, medium-model WER/RTF thresholds on every quality-path PR SPEC-038 spec proposal
🟑 Signed, tag-triggered release pipeline β€” the cut a release-bot can only PREP today SPEC-025 partial
βšͺ Privacy-preserving field-feedback loop β€” local self-diagnosis β†’ consented reports β†’ opt-in aggregates SPEC-036 / 039 / 040 building on shipped diagnostics

Autonomy ceilings, per domain.

Realistic target. Agents run the loop; humans gate the irreversible and outward steps β€” merge, release, public post.


How to claim a task

  1. Pick from Adoption focus first. Move to Feature backlog only if Adoption is empty.
  2. Open an issue using the Agent Task template; mark yourself as owner.
  3. Read the cited SPEC.
  4. Open a draft PR within ~24h naming the task in the title.
  5. Follow AGENTS.md for PR shape and required tests.