ComfyUI Qwen3 ASR
Visit ProjectA high-performance ComfyUI integration for the **Qwen3-ASR** model family. This extension provides state-of-the-art speech-to-text transcription, language identification, and precise word-level timestamps using the novel Qwen3 Forced Aligner.

Features
- High Accuracy: Supports Qwen3-ASR 0.6B and 1.7B models.
- Multilingual: Supports 52 languages and dialects with automatic language detection.
- Word-Level Timestamps: Optional integration with Qwen3-ForcedAligner-0.6B.
- Flexible Precision: Support for
bf16,fp16, andfp32to balance VRAM and speed. - Automatic Resampling: Internally handles audio resampling to 16kHz for optimal model performance.
- FlashAttention 2: Integrated support for FlashAttention 2 to significantly reduce VRAM usage and accelerate inference.
Preview

Installation
Manual Installation
Navigate to your ComfyUI
custom_nodesdirectory:cd ComfyUI/custom_nodesClone this repository:
git clone https://github.com/kaushiknishchay/ComfyUI-Qwen3-ASRInstall the dependencies using your ComfyUI Python executable:
# For portable versions, use the full path to your python.exe python.exe -m pip install -r ComfyUI-Qwen3-ASR/requirements.txt(Recommended) Install FlashAttention 2 for maximum performance:
# For FlashAttention 2 (requires compatible NVIDIA GPU) python.exe -m pip install -U flash-attn --no-build-isolationOR
Install via ComfyUI Manager
- Search
ComfyUI-Qwen3-ASRbyKaushik
Model Setup
Models must be placed in the models/diffusion_models/Qwen3-ASR/ directory. Each model should be in its own subfolder containing the full weights and configuration.
Recommended Directory Structure:
ComfyUI/models/diffusion_models/Qwen3-ASR/
├── Qwen3-ASR-1.7B/
│ ├── config.json
│ ├── model.safetensors
│ └── ...
├── Qwen3-ASR-0.6B/
│ └── ...
└── Qwen3-ForcedAligner-0.6B/
└── ...Downloading Models
You can use the huggingface-cli to download the models directly into the correct folders:
huggingface-cli download Qwen/Qwen3-ASR-1.7B --local-dir models/diffusion_models/Qwen3-ASR/Qwen3-ASR-1.7B
huggingface-cli download Qwen/Qwen3-ForcedAligner-0.6B --local-dir models/diffusion_models/Qwen3-ASR/Qwen3-ForcedAligner-0.6BTroubleshooting
- Python 3.13 Issues: If you encounter an
UnboundLocalErrorrelated tolazy_loader, ensure you have updated the package:python.exe -m pip install -U lazy-loader - VRAM Usage: The 1.7B model requires approximately 4-6GB of VRAM in
bf16mode. If you run out of memory, try the 0.6B model or usecpumode.
License
This project is licensed under the MIT License. The Qwen3 models are subject to the Qwen License Agreement.