How do I download Enhanced Speech Recognition on Windows 11?

Open Settings > Time & language > Speech, find the Speech recognition section, and select Download next to Enhanced Speech Recognition (or download the speech pack for your language under Settings > Time & language > Language & region). You need an internet connection to fetch the model, which is a few hundred megabytes depending on language. After the download completes, restart your PC so voice typing (Windows + H) can use the new recognition resources.

Why won't voice typing start even though my microphone works?

The most common cause is that the Enhanced Speech Recognition language resource is not installed. When the model is missing, the voice typing panel opens and the microphone is detected, but transcription never begins. Go to Settings > Time & language > Speech, download the recognition pack for your display language, and restart the PC. Also confirm the correct language is active with Windows + Spacebar, and that no OEM utility is blocking the Win+H shortcut.

How accurate is Windows 11 Enhanced Speech Recognition?

With the Enhanced Speech Recognition model installed and a clear microphone, Windows 11 voice typing reaches roughly 85-90% accuracy for conversational English, improving further as auto-punctuation and context kick in. Accuracy drops for technical vocabulary and proper nouns because there is no user-editable dictionary, and it can also drop for accented speech. Tools with custom vocabulary support, such as Weesper Neon Flow, let you bias the model toward domain-specific terms for higher first-pass accuracy in medical, legal, and engineering work.

Windows 11 Enhanced Speech Recognition: How to Enable It (2026)

Q: What is Enhanced Speech Recognition in Windows 11?

Enhanced Speech Recognition is the downloadable speech recognition language resource that powers Windows 11 voice typing. It is an optional component installed per language from Settings > Time & language > Speech. Without it, the voice typing microphone panel can open, but dictation either fails to start or falls back to minimal recognition. With it installed, recognition for your display language is more accurate. It is available to every Windows 11 PC and is separate from Fluid Dictation, which requires Copilot+ hardware.

Q: Is Enhanced Speech Recognition the same as Fluid Dictation?

No. Enhanced Speech Recognition is the downloadable recognition model available on any Windows 11 PC, and it improves how accurately your speech is transcribed. Fluid Dictation is a separate, newer feature that automatically rewrites grammar, punctuation, and filler words using on-device small language models — and it is exclusive to Copilot+ PCs with a qualifying 40+ TOPS NPU. You can use Enhanced Speech Recognition without a Copilot+ PC; you cannot use Fluid Dictation without one.

Q: Does Enhanced Speech Recognition make voice typing work offline?

Not entirely. The downloaded recognition resources live on your device, but standard Windows 11 voice typing (Win+H) still routes speech through Microsoft's online Azure speech services and requires an internet connection. Enhanced Speech Recognition improves accuracy and is required for dictation to function, but it does not turn Win+H into a fully offline tool. For genuinely offline transcription with no cloud dependency, you need a local-only application such as Weesper Neon Flow.

Enhanced Speech Recognition is the optional, downloadable speech model that makes Windows 11 voice typing more accurate. You enable it in Settings > Time & language > Speech, where you select Download to install the recognition resource for your language. It is available on every Windows 11 PC, is required for dictation to start, and is distinct from Fluid Dictation, which needs Copilot+ hardware.

Introduction

If you have pressed Windows + H, watched the microphone panel appear, and then found that nothing was being transcribed, the missing piece is almost always Enhanced Speech Recognition for Windows 11. This optional download is the recognition model that powers accurate voice typing — and many users never realise they need to install it.

This guide explains what Enhanced Speech Recognition is, how to enable and download it step by step, how much accuracy it adds, and where it still falls short. We will also clarify the common confusion between Enhanced Speech Recognition (every PC) and Fluid Dictation (Copilot+ PCs only), and show when an offline alternative makes more sense.

What is Enhanced Speech Recognition in Windows 11?

Enhanced Speech Recognition is the downloadable language resource that Windows 11 uses to convert your speech into text during voice typing. It is an optional component you install per language, and without it dictation will not start even when your microphone is working.

In plain terms, it is the speech recognition model behind the Win+H toolbar. Microsoft ships Windows 11 with minimal speech components, then lets you download the fuller recognition resource for whichever display language you use. Once installed, voice typing transcribes more reliably and supports the auto-punctuation and voice commands you expect.

Key facts about Enhanced Speech Recognition:

It is an optional download, not enabled by default on every install
It is installed per language (English, French, German, and so on)
It is available to all Windows 11 PCs — no special hardware required
It is required for voice typing to actually transcribe speech
It is separate from Fluid Dictation, the Copilot+ rewrite feature

Enhanced Speech Recognition vs Voice Typing: what’s the difference?

Voice typing is the feature (the Win+H toolbar). Enhanced Speech Recognition is the model that voice typing depends on. Think of voice typing as the engine and Enhanced Speech Recognition as the fuel: the engine turns over, but it cannot run without fuel.

This distinction matters because Windows surfaces them in different places. The toolbar lives wherever you type; the model lives in Settings > Time & language > Speech.

How do I download and enable Enhanced Speech Recognition?

Open Settings > Time & language > Speech, then select Download next to Enhanced Speech Recognition (or download the speech pack for your language). You need an internet connection, and you should restart the PC once the download finishes.

Here is the full process, step by step:

1. Open Settings (Windows + I)
2. Go to Time & language > Speech
3. Under the Speech recognition section, locate Enhanced Speech Recognition
4. Select Download; Windows fetches the recognition resource for your active display language
5. Wait for the download to complete (a few hundred megabytes, depending on language and connection speed)
6. Restart your PC so voice typing picks up the new model
7. Press Windows + H in any text field to start dictating

If you do not see the model for the language you want, add that language first under Settings > Time & language > Language & region > Add a language, then return to the Speech page and download its recognition resource.
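The first two steps above can be shortened with a small script. The following is a minimal sketch, not an official method: it assumes the `ms-settings:speech` and `ms-settings:regionlanguage` URIs open the Speech and Language & region pages on your build, and the names `SETTINGS_PAGES`, `settings_uri`, and `open_settings` are hypothetical helpers invented for this example. It only opens the page; you still select Download yourself.

```python
import os
import sys

# Assumption: these ms-settings URIs map to the Settings pages named in this guide.
SETTINGS_PAGES = {
    "speech": "ms-settings:speech",            # Time & language > Speech
    "language": "ms-settings:regionlanguage",  # Time & language > Language & region
}


def settings_uri(page: str) -> str:
    """Return the ms-settings URI for a known page name."""
    try:
        return SETTINGS_PAGES[page]
    except KeyError:
        raise ValueError(f"unknown settings page: {page!r}") from None


def open_settings(page: str) -> bool:
    """Open the Settings page on Windows; return False on other platforms."""
    uri = settings_uri(page)
    if sys.platform != "win32":
        return False
    os.startfile(uri)  # hands the URI to the Windows shell (Windows only)
    return True


if __name__ == "__main__":
    if not open_settings("speech"):
        print("Not running on Windows; would open", settings_uri("speech"))
```

Run it on the Windows 11 PC itself. If the language you want is missing, calling `open_settings("language")` first takes you to the page where you add it.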

What if the download fails or dictation still won’t start?

A failed download or stalled dictation usually traces back to one of three causes: a missing or incomplete language resource, an active language mismatch, or an OEM shortcut conflict. Address them in that order.

1. Missing language resource: re-open Settings > Time & language > Speech and confirm the download finished, then restart
2. Active language mismatch: switch to the installed language with Windows + Spacebar before pressing Win+H
3. Shortcut conflict: disable vendor utilities (HP, Dell, Lenovo, ASUS) that may capture the H key or require the Fn modifier
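The order of checks above can be written down as a tiny decision function. This is only an illustrative sketch: `DictationState` and `next_fix` are hypothetical names, and the three flags are answers you supply by inspecting the PC yourself, not values read from Windows.

```python
from dataclasses import dataclass


@dataclass
class DictationState:
    resource_installed: bool        # recognition resource finished downloading
    active_language_matches: bool   # active input language is the installed one
    shortcut_reaches_windows: bool  # Win+H is not captured by a vendor utility


def next_fix(state: DictationState) -> str:
    """Return the first fix to try, in the order listed in this guide."""
    if not state.resource_installed:
        return "Re-open Settings > Time & language > Speech, finish the download, restart"
    if not state.active_language_matches:
        return "Switch to the installed language with Windows + Spacebar"
    if not state.shortcut_reaches_windows:
        return "Disable the vendor utility that captures the H key or requires Fn"
    return "None of the three listed causes applies"


if __name__ == "__main__":
    # Example: the model is installed, but the wrong input language is active.
    print(next_fix(DictationState(True, False, True)))
```

Checking in this order means a missing model is ruled out before you spend time on keyboard utilities.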

For a deeper walkthrough of the toolbar itself — settings, voice commands, and language switching — see our complete Windows 11 dictation toolbar guide.

How much accuracy does Enhanced Speech Recognition add?

With the Enhanced Speech Recognition model installed and a clear microphone, Windows 11 voice typing reaches roughly 85-90% accuracy for conversational English. Without it, dictation either fails to start or relies on minimal recognition that misreads far more words.

The accuracy gain comes from the fuller acoustic and language model that the download provides. Combined with auto-punctuation — which you enable from the toolbar’s gear icon — the result is usable for emails, notes, drafts, and casual writing.

Aspect	Without Enhanced model	With Enhanced Speech Recognition
Dictation starts	Often fails	Yes
Conversational accuracy	Poor / minimal	~85-90%
Auto-punctuation	Limited	Full support
Voice commands	Unreliable	Reliable
Technical vocabulary	Weak	Still weak (no custom dictionary)

Accuracy still drops sharply for proper nouns, brand names, medical terms, legal citations, and programming identifiers, because Windows 11 voice typing has no user-editable dictionary. To understand the factors that drive recognition quality across systems, read our analysis of voice dictation accuracy and speech recognition.
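If you want to check an accuracy figure on your own dictation, one common convention is word accuracy = 1 - word error rate (WER). The sketch below assumes that convention; this guide does not state how the 85-90% range was measured, and the function names are hypothetical. It compares what you meant to say with what was transcribed, word by word.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the transcript
                curr[j - 1] + 1,     # extra word in the transcript
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """1 - WER; can fall below zero if the transcript adds many extra words."""
    return 1.0 - word_error_rate(reference, hypothesis)


if __name__ == "__main__":
    spoken = "please send the quarterly report to the legal team today"
    typed = "please send the quarterly report to the legal teem today"
    print(f"{word_accuracy(spoken, typed):.0%}")  # one wrong word out of ten
```

In the example, one misrecognised word in a ten-word sentence gives 90% word accuracy, which shows how quickly a few wrong proper nouns or technical terms pull the figure down.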

Is Enhanced Speech Recognition the same as Fluid Dictation?

No, and conflating the two is a common mistake. Enhanced Speech Recognition runs on any Windows 11 PC and improves transcription accuracy. Fluid Dictation runs only on Copilot+ PCs and rewrites grammar, punctuation, and filler words after transcription.

Feature	Enhanced Speech Recognition	Fluid Dictation
Hardware required	Any Windows 11 PC	Copilot+ PC (40+ TOPS NPU)
What it does	Improves recognition accuracy	Rewrites grammar & filler words
Where to get it	Settings > Speech > Download	Ships automatically on Copilot+
Processing	Recognition resource on device; Win+H still uses Azure online	On-device small language models
Availability	Every user	Copilot+ owners only

If your PC is a standard (non-Copilot+) machine, Enhanced Speech Recognition is the best native accuracy you can get — Fluid Dictation simply is not available to you, regardless of settings.

Does Enhanced Speech Recognition work offline?

Not fully. The downloaded recognition resources live on your device, but standard Windows 11 voice typing (Win+H) still routes audio through Microsoft’s online Azure speech services and requires an active internet connection. Enhanced Speech Recognition improves accuracy and is required for dictation to function — but it does not make Win+H a private, offline tool.

This is an important privacy nuance. Even with the model downloaded locally, your dictated audio can still leave your device for cloud processing. For professionals handling confidential material — doctors, lawyers, journalists, consultants — that is a hard limitation.

When you need genuinely offline dictation

For fully on-device transcription with no cloud round-trip, you need a local-only application rather than the native toolbar. This is precisely the gap Weesper Neon Flow fills: it processes speech entirely on your device using local Whisper-class models, so audio never leaves your computer.

Capability	Windows 11 Voice Typing	Weesper Neon Flow
Price	Free	5 EUR / month
Recognition model	Enhanced Speech Recognition (download)	Local Whisper-class model
Processing	Online (Azure) for Win+H	100% on-device
Internet required	Yes	No
Custom vocabulary	None	Yes (custom prompts)
AI rewrite on any PC	No (Copilot+ only)	Yes
Works on macOS	No	Yes (Metal-accelerated)
Privacy	Audio sent to Microsoft	Audio stays local

For the full technical comparison of local versus cloud transcription (latency, accuracy, and energy use), see our breakdown of on-device versus cloud transcription. The short version: a Whisper-class model on consumer hardware can now approach cloud-level accuracy while keeping your audio on the device.

When should you use Enhanced Speech Recognition vs an alternative?

Use Enhanced Speech Recognition when you want free, native voice typing on Windows 11 for everyday, non-sensitive writing. Choose an offline alternative when privacy, custom vocabulary, cross-platform support, or sustained professional use matters more than zero cost.

Enhanced Speech Recognition is the right choice if you:

Dictate casual emails, notes, and search queries
Have a reliable internet connection
Do not handle confidential or regulated content
Mostly use common, everyday vocabulary

A dedicated tool like Weesper Neon Flow is the better fit if you:

Need transcription that never sends audio to the cloud
Work in a specialist domain with technical terminology
Switch between Windows and macOS
Want AI-quality rewriting without buying a Copilot+ PC

If you already followed our Windows 11 voice dictation setup guide and found the native experience limiting, the offline route is the logical next step. If you are still deciding which built-in option suits you, our guide to choosing between Windows voice typing and Voice Access covers both tools side by side.

Try Weesper Neon Flow free for 15 days — fully on-device, no cloud account, works on Windows and macOS today.

Conclusion: get the model, then decide if it’s enough

Enhanced Speech Recognition is the download that turns Windows 11 voice typing from “won’t start” into “good enough for everyday dictation.” Install it from Settings > Time & language > Speech, restart, enable auto-punctuation, and you will reach roughly 85-90% accuracy on conversational English at no cost.

But know its boundaries: it does not provide custom vocabulary, it does not make Win+H offline, and it does not unlock Fluid Dictation on standard hardware. If you dictate for hours, handle sensitive material, or need domain-specific accuracy, the native model alone will not get you there.

Ready to compare? Download Weesper Neon Flow and run it side-by-side with Windows voice typing on your next dictation task. The free trial works on macOS and Windows, processes everything on-device, and requires no cloud account.