VoiceBox

📝 Summary

Meta's text-guided multilingual universal speech generation model.

📅 2026-06-10 📁 AI Audio Tools
🆓 Free: free to use via the demo website; no subscription plans mentioned.
📝 About This Tool

VoiceBox is a speech generative model from Meta AI built on non-autoregressive flow matching. It is trained at scale on text-guided speech infilling: predicting a masked stretch of audio from the surrounding audio and a text transcript. Through in-context learning, the same model handles zero-shot TTS, cross-lingual style transfer, transient noise removal, content editing, and diverse sample generation across six languages without task-specific training. Because generation is non-autoregressive, it produces speech up to 20x faster than autoregressive models.
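
To make the "non-autoregressive flow matching" idea concrete, here is a minimal sketch of how such a sampler works in principle. It is not Meta's code: `toy_vector_field` is a hypothetical stand-in for the learned network, and the array shape (100 frames by 80 feature bins) is an assumption chosen for illustration. The sketch starts from Gaussian noise and takes a fixed number of Euler steps along a vector field, updating every frame at once instead of generating frames one after another.

```python
import numpy as np

def toy_vector_field(x, t, target):
    """Hypothetical stand-in for a learned network.

    Returns the velocity of a straight-line path that reaches `target` at t=1.
    A real model would predict this velocity from text and audio context.
    """
    return (target - x) / (1.0 - t)

def sample(target, num_steps=8, seed=0):
    """Euler-integrate dx/dt = v(x, t) from t=0 (noise) to t=1 (output)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(target.shape)  # start from Gaussian noise
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        # All frames are updated in parallel; there is no loop over frames.
        x = x + dt * toy_vector_field(x, t, target)
    return x

# Assumed toy "speech features": 100 frames x 80 bins, all equal to 0.5.
target = np.full((100, 80), 0.5)
out = sample(target)
```

The number of network evaluations here is `num_steps`, regardless of how many frames are generated. An autoregressive model would instead need one pass per output step, which is the intuition behind the speed difference.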

⚡ Key Features

Zero-shot text-to-speech synthesis

Cross-lingual style transfer

Transient noise removal

Content editing

Diverse speech sample generation

Multilingual support (6 languages)

Up to 20x faster than autoregressive models

✨ Why Choose It

Non-autoregressive flow matching enables faster generation

In-context learning without task-specific training

Handles both past and future audio context
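
As a rough illustration of infilling with both past and future context, the sketch below masks a span in the middle of a feature sequence and fills it using the frames on either side. Everything here is an assumption for illustration: `infill` and `toy_generate` are hypothetical names, and the generator is plain linear interpolation standing in for a learned model. It also assumes a single contiguous masked span that does not touch either end of the sequence.

```python
import numpy as np

def infill(frames, mask, generate):
    """Replace masked frames with generated ones; keep context frames unchanged."""
    context = np.where(mask[:, None], 0.0, frames)  # hide the masked span
    generated = generate(context, mask)
    return np.where(mask[:, None], generated, frames)

def toy_generate(context, mask):
    """Hypothetical generator: interpolate between the frames around the gap."""
    idx = np.flatnonzero(mask)
    start, end = idx[0] - 1, idx[-1] + 1  # last past frame, first future frame
    out = context.copy()
    for i in idx:
        w = (i - start) / (end - start)
        out[i] = (1 - w) * context[start] + w * context[end]
    return out

# Toy signal: 10 frames x 4 dims, where frame i holds the value i (a linear ramp).
frames = np.arange(10, dtype=float)[:, None] * np.ones((1, 4))
mask = np.zeros(10, dtype=bool)
mask[3:7] = True  # frames 3..6 are regenerated from frames 2 and 7
result = infill(frames, mask, toy_generate)
```

A left-to-right model could only condition on frame 2 and earlier when filling the gap. Using frame 7 as well is what lets an edited or denoised segment join smoothly on both sides.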

👥 Who Is It For

AI researchers

Speech technology developers

Content creators

Accessibility tool builders

❓ FAQ

Q: What languages does VoiceBox support?

A: English, French, German, Spanish, Polish, and Portuguese.

Q: Can VoiceBox remove background noise?

A: Yes, for transient noise. It regenerates the noise-corrupted segment of speech, which removes short noises such as doorbells or barking.

Q: Is VoiceBox open source?

A: No. The research paper and audio demos are available, but Meta has not publicly released the model or its code.
