VieNeu-TTS 0.3B | Learning Gallery

pnnbao-ump/VieNeu-TTS-0.3B is a compact Vietnamese text-to-speech model designed for offline/on-device use, with an emphasis on low-latency synthesis and short-reference voice cloning. The project positions this 0.3B checkpoint as a from-scratch retrain (not a shrink of the earlier 0.5B release), and calls out Vietnamese pronunciation stability and Vietnamese↔English code-switching as goals.

The other practical hook is deployment flexibility: in addition to the PyTorch weights, the author provides GGUF Q4/Q8 (4-bit and 8-bit quantized) variants, which shrink the model’s memory footprint and can make CPU or mobile inference practical on supported hardware. That’s a nice pattern for TTS projects, where “demo works on my GPU” isn’t the hard part. The hard part is getting something that runs reliably on everyday hardware while still sounding good.

If you want to try it quickly, start with the repo’s Gradio demo to validate your audio pipeline and reference-clip quality, then benchmark the GGUF versions on your target device. Also note the license: CC BY-NC 4.0 does not permit commercial use, so the model is a fit for research and personal projects rather than commercial deployments.
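For the benchmarking step, one common yardstick is the real-time factor (RTF): wall-clock synthesis time divided by the duration of the audio produced, where values below 1.0 mean faster than real time. The sketch below is a minimal, model-agnostic harness and does not use the VieNeu-TTS API. The names real_time_factor and fake_synthesize are hypothetical, and the 16 kHz sample rate is a placeholder assumption, so check the model card for the real output rate and swap in the actual synthesis call for each variant you compare.

```python
import time
from statistics import median
from typing import Callable, Sequence


def real_time_factor(
    synthesize: Callable[[str], Sequence[float]],
    texts: Sequence[str],
    sample_rate: int,
    warmup: int = 1,
) -> float:
    """Return the median of (synthesis time / audio duration) over `texts`."""
    # Warm-up calls absorb one-off costs (model load, caches) and are not timed.
    for text in texts[:warmup]:
        synthesize(text)

    ratios = []
    for text in texts:
        start = time.perf_counter()
        samples = synthesize(text)
        elapsed = time.perf_counter() - start

        audio_seconds = len(samples) / sample_rate
        if audio_seconds <= 0:
            raise ValueError(f"synthesizer returned no audio for {text!r}")
        ratios.append(elapsed / audio_seconds)

    return median(ratios)


def fake_synthesize(text: str) -> Sequence[float]:
    # Hypothetical stand-in for a real TTS call: returns 800 samples of
    # silence per input character (0.05 s per character at 16 kHz).
    return [0.0] * (len(text) * 800)


if __name__ == "__main__":
    prompts = ["Xin chào", "Hello, tôi là trợ lý ảo"]
    # sample_rate=16_000 is an assumed placeholder, not the model's real rate.
    rtf = real_time_factor(fake_synthesize, prompts, sample_rate=16_000)
    print(f"median RTF: {rtf:.4f}")
```

Running the same prompt set through the PyTorch, Q8, and Q4 variants on the target device gives directly comparable numbers. Using the median rather than the mean keeps a single slow run from skewing the result.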

Quick stats from the listing feed: pipeline: text-to-speech · 4 likes · 17 downloads.

Source listing: https://huggingface.co/models?sort=modified