First Run Wizard

The Setup Wizard

When you open FiavaionDictate for the first time, the Setup Wizard runs automatically. It walks you through five short steps — the whole thing takes about two minutes if you skip AI setup, or five to ten minutes if you want to configure a provider.

None of the AI steps are mandatory. Dictation works without any AI configured. You can click “Skip” on steps 2, 3, and 4 and revisit them any time from Settings → AI.

Step 1 — System Check

The wizard starts by making sure everything is in place.

What you’ll see: A checklist of three items, each showing a green tick or a red warning:

Server reachable — confirms the Python server is running and responding
Browser compatible — confirms your browser supports the Web Speech API
Microphone available — confirms your browser can access your microphone

If the server check fails: The Python server isn’t running. Close the wizard, make sure start.bat (Windows) or start.command (Mac/Linux) is running, and reload the page.

If the browser check fails: Your browser does not expose the Web Speech API. This is the case in Firefox and in some older browsers. Open FiavaionDictate in Chrome or Edge.

If the microphone check fails: Click Allow when your browser asks for microphone permission. If you already dismissed that prompt, click the mic icon in the address bar and change it to Allow, then reload.

Once all three items show green ticks, click Next.

Step 2 — Local AI Setup

This step helps you set up Ollama, a free tool that runs AI models on your own computer. When you use Ollama, the text you send for AI correction never leaves your machine.

What you’ll see: A list of three recommended models with a one-click install command for each.

Model	Download Size	Speed	Best For
Gemma 3 4B	~3 GB	Fast	Quick corrections, everyday writing
Llama 3.2 3B	~2 GB	Fastest	Older or lower-spec machines
Mistral 7B	~5 GB	Medium	Long documents, complex writing

To install Ollama:

Click the link to ollama.com/download and run the installer
Come back to the wizard, which will detect Ollama automatically
Click the Pull button next to your chosen model (or copy the command and run it in a terminal)
Wait for the download to finish; the progress bar updates in real time

Not sure which model to pick? Start with Gemma 3 4B. It’s a good balance of speed and quality for most dictation correction tasks. You can add more models later.

Want to skip this for now? Click Skip — you can always set up Ollama later from Settings → AI → Local AI.

Step 3 — Cloud AI Options

If you’d prefer to use a cloud-based AI service, this step shows you the options. Cloud AI is generally more capable than a small local model, but your text is sent to the provider’s servers when you request a correction.

What you’ll see: A comparison of the three supported providers.

Provider	Cost	Privacy	Quality
Google Gemini	Free tier available	Sent to Google	Excellent
Anthropic Claude	Paid	Sent to Anthropic	Excellent
OpenAI GPT-4	Paid	Sent to OpenAI	Excellent

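The Fiavaion recommendation: Start with Google Gemini. Its API key is free to create, with no credit card and no subscription. The free tier is rate-limited per model: this guide was written against an allowance of 250 requests per day on Flash and a larger allowance on Flash-Lite. Google revises these limits from time to time, so check the current figures in AI Studio. For regular dictation use, the free tier is normally more than enough.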

Click any provider card to jump to setup instructions. Or click Skip to configure AI later.

Step 4 — API Key Setup

If you chose a cloud provider in Step 3 (or already have a key), this step walks you through entering it.

For Google Gemini:

Go to aistudio.google.com
Sign in with your Google account
Click Get API Key in the left sidebar
Click Create API Key and copy it
Paste it into the Gemini key field in the wizard

For Anthropic Claude:

Go to console.anthropic.com
Sign in and go to API Keys
Click Create Key, copy it, paste it into the wizard

For OpenAI:

Go to platform.openai.com
Go to API Keys, click Create new secret key, copy and paste it

After pasting a key, click the Test button. The wizard sends a tiny test request to confirm the key is valid and the connection works.

“Your key is locked in your browser — we never see it.” When you paste a key, it is immediately encrypted with AES-GCM and stored only in your browser’s storage. It is never sent to Fiavaion. When you use AI correction, the key travels from your browser to the AI provider’s servers; if you run the local Python server, it passes through that server on your own machine on the way.

Step 5 — Ready to Dictate

You’ve made it. This screen shows a quick summary of what’s configured and a pointer to your first voice command.

Try this right now: Click Open Dictation to close the wizard and open the main interface. Then:

Click the microphone button (or press Space with the text area focused)
Say: “hello world period”
You should see: Hello world.

What just happened?

“hello world” was transcribed as you spoke
While you were speaking, the text appeared in light grey — that’s “interim” text (the engine’s live guess, not yet committed)
When you paused, the text turned white — that’s the final committed result
“period” was recognised as a voice command and converted to a . instead of the word “period”

That’s the core of FiavaionDictate. Everything else — AI correction, formatting commands, macros, accessibility profiles — builds on top of this.

Step 1 — System Check (technical)

The system check makes three async calls when the wizard opens:

Server reachability: fetch('/api/health') with a 3-second timeout. If the response is not 200 OK, the check fails. In GitHub Pages mode this check is skipped (always passes).
Web Speech API availability: typeof window.SpeechRecognition !== 'undefined' || typeof window.webkitSpeechRecognition !== 'undefined'. The webkit-prefixed version covers Chrome/Edge. In Firefox both are undefined, so the expression evaluates to false.
Microphone availability: navigator.mediaDevices.getUserMedia({ audio: true }) — this triggers the browser permission prompt if not already granted. The check resolves to pass/fail based on whether the permission is granted. The audio stream is immediately stopped after the check; no audio is recorded or retained.
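
The three checks above could be sketched roughly as follows. This is an illustration, not the wizard's actual source: the function names and the injected `fetchFn`, `win`, and `mediaDevices` parameters are assumptions, chosen so the logic can be exercised outside a browser. Only `/api/health`, the 3-second timeout, the two `SpeechRecognition` globals, and the `getUserMedia` call come from the description above.

```javascript
// Hypothetical helper names; only the endpoint, timeout, and globals come from the docs.

// Resolves true only for an HTTP 200 received within timeoutMs.
async function checkServer(fetchFn, timeoutMs = 3000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchFn('/api/health', { signal: controller.signal });
    return response.status === 200;
  } catch {
    return false; // network error or timeout
  } finally {
    clearTimeout(timer);
  }
}

// True when either the standard or the webkit-prefixed constructor exists.
function hasSpeechRecognition(win) {
  return typeof win.SpeechRecognition !== 'undefined' ||
         typeof win.webkitSpeechRecognition !== 'undefined';
}

// Requests the microphone, then releases it straight away so nothing is recorded.
async function checkMicrophone(mediaDevices) {
  try {
    const stream = await mediaDevices.getUserMedia({ audio: true });
    stream.getTracks().forEach((track) => track.stop());
    return true;
  } catch {
    return false; // permission denied or no input device
  }
}
```

In a browser these would be called as `checkServer(fetch)`, `hasSpeechRecognition(window)`, and `checkMicrophone(navigator.mediaDevices)`.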

The wizard will not proceed to Step 2 if any of the three checks fails. There is no way to bypass this in the UI: the Next button stays disabled. If you need to skip the system check (e.g., when running a headless test), call localStorage.setItem('wizard_skip_system_check', '1') before opening the wizard.
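
The gating rule can be expressed in a few lines. The sketch below assumes a results object with `server`, `browser`, and `microphone` booleans and a storage object with the `localStorage` interface; both shapes and the function name are hypothetical. Only the `wizard_skip_system_check` key comes from the text.

```javascript
// Hypothetical sketch of the Next-button gate; names and shapes are assumptions.
function canLeaveSystemCheck(results, storage) {
  // Documented escape hatch for headless runs.
  if (storage.getItem('wizard_skip_system_check') === '1') return true;
  // Otherwise all three checks must have passed.
  return Boolean(results.server && results.browser && results.microphone);
}
```

In the browser, `storage` would be `window.localStorage`. A headless test would set the flag before the wizard script loads.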

Step 2 — Local AI Setup (technical)

On entering Step 2, the wizard polls http://localhost:11434/api/tags with a 2-second timeout to detect a running Ollama instance. If Ollama responds, the installed models list is displayed. If not, the “Install Ollama” prompt is shown.

The model list is re-polled every 5 seconds while the step is open, so newly pulled models appear without a manual refresh.
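
A minimal sketch of the detection and re-polling described above. It assumes the `/api/tags` response has the shape `{ "models": [{ "name": "..." }] }`, as documented for Ollama's list-models endpoint. The function names and the injected `fetchFn` are hypothetical.

```javascript
// Pulls model names out of an /api/tags response body; tolerates a missing list.
function listModelNames(tagsResponse) {
  const models = Array.isArray(tagsResponse?.models) ? tagsResponse.models : [];
  return models.map((model) => model.name);
}

// One detection attempt with a 2-second timeout.
async function detectOllama(fetchFn, timeoutMs = 2000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchFn('http://localhost:11434/api/tags', {
      signal: controller.signal,
    });
    if (!response.ok) return { running: false, models: [] };
    return { running: true, models: listModelNames(await response.json()) };
  } catch {
    return { running: false, models: [] }; // not installed, not running, or blocked
  } finally {
    clearTimeout(timer);
  }
}

// Re-polls every 5 seconds while the step is open; returns a function that stops polling.
function pollOllama(fetchFn, onUpdate, intervalMs = 5000) {
  const tick = () => detectOllama(fetchFn).then(onUpdate);
  tick();
  const id = setInterval(tick, intervalMs);
  return () => clearInterval(id);
}
```

The stop function returned by `pollOllama` would be called when the user leaves Step 2, so the wizard does not keep hitting localhost in the background.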

The Pull button in the wizard calls the Ollama API endpoint POST http://localhost:11434/api/pull with {"name": "<model>", "stream": true}. Ollama streams the download progress back as newline-delimited JSON objects, one status object per line. The progress bar reflects the completed / total byte counts from the stream.
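
Assuming the pull stream arrives as one JSON object per line (the format Ollama's API documentation describes, with `status`, `total`, and `completed` fields), the progress fraction could be computed along these lines. The function names are hypothetical, and this is a sketch rather than the wizard's own code.

```javascript
// Turns one line of the pull stream into a progress update, or null if unusable.
function parsePullLine(line) {
  const trimmed = line.trim();
  if (!trimmed) return null;
  let message;
  try {
    message = JSON.parse(trimmed);
  } catch {
    return null; // ignore anything that is not valid JSON
  }
  if (message === null || typeof message !== 'object') return null;
  const hasBytes = Number.isFinite(message.completed) && message.total > 0;
  return {
    status: message.status ?? '',
    fraction: hasBytes ? message.completed / message.total : null,
  };
}

// Reads a fetch Response body chunk by chunk, coping with lines split across chunks.
async function streamPullProgress(response, onProgress) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep the trailing partial line for the next chunk
    for (const line of lines) {
      const update = parsePullLine(line);
      if (update) onProgress(update);
    }
  }
  const last = parsePullLine(buffer);
  if (last) onProgress(last);
}
```

Status lines without byte counts (for example, a manifest step) yield `fraction: null`, so a progress bar can show the status text without moving.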

To skip Ollama detection entirely on a corporate network where localhost:11434 is blocked, set localStorage.setItem('wizard_skip_ollama', '1').

Step 3 — Cloud AI Options (technical)

This step is purely informational — it renders a static comparison table with links. No API calls are made. The “recommended” badge on Gemini is hardcoded.

The privacy model for cloud AI: when a correction is requested, the FiavaionDictate frontend calls /api/ai/proxy on the local server, which forwards the request to the provider’s API. The local server takes the API key from the incoming request header and attaches it to the outgoing request; it never logs request bodies and returns the provider’s response verbatim. In GitHub Pages mode, requests go directly from the browser to the provider. That requires the provider to accept cross-origin (CORS) requests: Gemini does, Claude and OpenAI do not, so cloud AI for those two providers is available only with the local server.
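
The routing rule in that paragraph amounts to a small decision table. The sketch below is illustrative only: the mode strings (`'local-server'`, `'github-pages'`), the provider identifiers, and the function name are assumptions. Only the `/api/ai/proxy` path and the Gemini-only direct route come from the text.

```javascript
// Hypothetical routing helper; mode and provider strings are assumptions.
const BROWSER_DIRECT_PROVIDERS = new Set(['gemini']); // per the CORS note above

function resolveRoute(mode, provider) {
  if (mode === 'local-server') {
    return { via: 'proxy', url: '/api/ai/proxy' };
  }
  if (BROWSER_DIRECT_PROVIDERS.has(provider)) {
    return { via: 'direct', url: null }; // caller supplies the provider's own endpoint
  }
  return { via: 'unavailable', url: null };
}
```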

Step 4 — API Key Setup (technical)

Encryption implementation:

Keys are encrypted using the Web Crypto API:

Key derivation: PBKDF2 with SHA-256, 100,000 iterations, salt from crypto.getRandomValues(new Uint8Array(16))
Encryption: AES-GCM with a 256-bit key, IV from crypto.getRandomValues(new Uint8Array(12))
Storage: localStorage key fd_apikey_<provider> holds a JSON object { salt, iv, ciphertext } (all base64-encoded)

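The PBKDF2 input material is a combination of navigator.userAgent and window.location.origin, which ties the ciphertext to the browser’s user-agent string and the page origin. A browser that reports a different user agent, or a copy of the app served from a different origin, cannot decrypt the stored key. Neither input is secret, so this guards against casual inspection of localStorage rather than against code that runs in the same browser on the same origin.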

Test request: The test button sends a minimal request to the provider (a single-token completion for Gemini/OpenAI, or a models list call for Claude) to verify the key is valid without consuming meaningful quota.
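
For illustration, a validation request per provider might be assembled as below. The endpoints and header names reflect the providers' public REST APIs as generally documented, but treat them as assumptions and confirm them against each provider's current reference. The function name, the provider identifiers, and the caller-supplied `model` argument are hypothetical, and the sketch only builds the request without sending it.

```javascript
// Sketch: builds (but does not send) a minimal key-validation request.
// The model id is supplied by the caller; no specific model is implied.
function buildTestRequest(provider, apiKey, model) {
  switch (provider) {
    case 'gemini':
      return {
        method: 'POST',
        url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
        headers: { 'Content-Type': 'application/json', 'x-goog-api-key': apiKey },
        body: JSON.stringify({
          contents: [{ parts: [{ text: 'ping' }] }],
          generationConfig: { maxOutputTokens: 1 }, // single-token completion
        }),
      };
    case 'openai':
      return {
        method: 'POST',
        url: 'https://api.openai.com/v1/chat/completions',
        headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${apiKey}` },
        body: JSON.stringify({
          model,
          messages: [{ role: 'user', content: 'ping' }],
          max_tokens: 1, // single-token completion
        }),
      };
    case 'claude':
      return {
        method: 'GET', // models list call: validates the key without generating text
        url: 'https://api.anthropic.com/v1/models',
        headers: { 'x-api-key': apiKey, 'anthropic-version': '2023-06-01' },
        body: undefined,
      };
    default:
      throw new Error(`Unknown provider: ${provider}`);
  }
}
```

A 2xx response to any of these indicates the key was accepted. A 401 or 403 would typically mean the key is invalid or lacks access.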

To reconfigure keys after the wizard: Settings → AI → API Keys. The same encryption applies. Keys can be cleared individually per provider or all at once.

Step 5 — Ready screen (technical)

The ready screen reads the current configuration from localStorage and displays a summary. When the wizard closes, it calls localStorage.setItem('wizard_completed', '1'). On subsequent loads this flag is checked first; if it is present, the wizard is skipped entirely.

To force the wizard to run again: Open browser DevTools → Application → Local Storage → http://localhost:8080 → delete the key wizard_completed. Reload the page.
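
The flag logic is small enough to show in full. The helpers below are a hypothetical sketch (the names are not taken from the codebase) that take any object with the `localStorage` interface. Only the `wizard_completed` key comes from the text.

```javascript
// Hypothetical helpers around the documented wizard_completed flag.
function shouldRunWizard(storage) {
  // getItem returns null when the key is absent.
  return storage.getItem('wizard_completed') === null;
}

function markWizardCompleted(storage) {
  storage.setItem('wizard_completed', '1');
}

function resetWizard(storage) {
  storage.removeItem('wizard_completed');
}
```

Running `localStorage.removeItem('wizard_completed')` in the DevTools console, followed by a reload, has the same effect as deleting the key by hand in the Application panel.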

Interim vs final transcription colours:

The Web Speech API delivers both kinds of result through the same onresult event; each entry in event.results carries an isFinal flag:

isFinal: false marks an interim result, displayed in color: rgba(255,255,255,0.45) (dim grey)
isFinal: true marks a final committed result, displayed in color: #ffffff (white)

Interim results are speculative and may change as the engine receives more audio context. Final results are committed and run through the voice command parser. If a final result contains a recognised command phrase (e.g., “period”), the command is executed and the phrase is removed from the output.
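
As a rough illustration of that flow, the sketch below separates interim from final text and then applies a one-entry command table. The function names, the command table, and the capitalisation of the first letter are assumptions made to reproduce the "hello world period" example. The real parser's vocabulary and formatting rules are not described on this page.

```javascript
// Splits a SpeechRecognition result list into committed and speculative text.
function splitTranscript(results, startIndex = 0) {
  let finalText = '';
  let interimText = '';
  for (let i = startIndex; i < results.length; i++) {
    const text = results[i][0].transcript; // best alternative for this result
    if (results[i].isFinal) finalText += text;
    else interimText += text;
  }
  return { finalText, interimText };
}

// Hypothetical, deliberately tiny command table.
const COMMANDS = new Map([['period', '.']]);

// Replaces command words with their symbol and removes them from the output.
function applyCommands(finalText) {
  const words = finalText.trim().split(/\s+/).filter(Boolean);
  let out = '';
  for (const word of words) {
    const symbol = COMMANDS.get(word.toLowerCase());
    if (symbol !== undefined) out += symbol; // attach to the previous word
    else out += (out ? ' ' : '') + word;
  }
  return out.charAt(0).toUpperCase() + out.slice(1);
}
```

In a handler this would look like `recognition.onresult = (event) => { const { finalText, interimText } = splitTranscript(event.results, event.resultIndex); ... }`, with `interimText` rendered in the dim colour and `applyCommands(finalText)` appended in white.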

Re-running the Wizard

You can restart the wizard at any time:

Go to Settings (gear icon, bottom-right)
Click Setup Wizard in the left navigation
Click Restart Wizard

This resets your wizard progress flag but does not clear your existing API keys or AI configuration.

Next Steps

Configure AI in detail → AI Provider Setup
Start dictating → Voice Commands Reference
Personalise your experience → Accessibility profiles in Settings → Profiles