Microsoft Debuts First In-House AI Models: MAI-Voice-1 For Ultra-Fast Speech And MAI-1-Preview For Instruction-Following Tasks
In Brief
Microsoft launched its first in-house AI models, MAI-Voice-1 for rapid speech generation and MAI-1-preview for instruction-following tasks.
Microsoft AI, a division dedicated to developing and integrating AI technologies across Microsoft, announced the release of MAI-Voice-1, its first high-fidelity and expressive speech generation model. The model is currently available in Copilot Daily and Podcasts, as well as in the new Copilot Labs experience, allowing users to explore expressive speech and storytelling capabilities.
MAI-Voice-1 delivers natural audio in both single- and multi-speaker scenarios and is designed for speed, generating a full minute of speech in under one second on a single GPU, making it one of the most efficient speech generation systems currently available. The model enables applications such as interactive “choose your own adventure” stories or personalized guided meditations, showcasing the potential of voice as a primary interface for AI companions.
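The quoted figure translates into a real-time factor, the ratio of audio produced to wall-clock time spent, of at least 60x on a single GPU. A quick back-of-envelope check, using only the numbers stated above:

```python
# Back-of-envelope: one minute of speech generated in under one second
# implies a real-time factor (audio seconds per wall-clock second) of 60+.
audio_seconds = 60.0       # one minute of generated speech
wall_clock_seconds = 1.0   # upper bound quoted in the announcement
rtf = audio_seconds / wall_clock_seconds
print(rtf)  # 60.0
```

By comparison, a system that merely keeps pace with playback has a real-time factor of 1.0, which is why this figure underpins the "one of the most efficient" claim.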
Microsoft AI Launches Public Testing Of MAI-1-Preview, Its First Fully Trained Foundation Model
In addition, Microsoft AI has initiated public testing of MAI-1-preview on LMArena, a widely used platform for community model evaluation. This marks the division’s first fully trained foundation model and provides an early look at capabilities that will be integrated into Copilot. MAI-1-preview is an in-house mixture-of-experts model, pre- and post-trained on approximately 15,000 NVIDIA H100 GPUs, designed to handle instruction-following and deliver helpful responses for everyday tasks.
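In a mixture-of-experts architecture, a learned gate routes each token to a small subset of "expert" subnetworks, so only a fraction of the model's parameters are active per token. Microsoft has not published MAI-1-preview's internals, so the following is only a minimal, generic sketch of top-k expert routing with made-up dimensions, not a description of the actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, experts, gate_w, top_k=2):
    """Route each token to its top-k experts, mixing outputs by gate weight.

    Illustrative only: real MoE layers sit inside a transformer and use
    learned gates; here experts are plain weight matrices.
    """
    logits = x @ gate_w                            # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]  # indices of top-k experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, top[t]]
        weights = np.exp(sel - sel.max())
        weights /= weights.sum()                   # softmax over selected experts
        for w, e in zip(weights, top[t]):
            out[t] += w * (x[t] @ experts[e])      # weighted expert outputs
    return out

d, n_experts, tokens = 8, 4, 3                     # toy sizes, not MAI-1's
experts = rng.normal(size=(n_experts, d, d))
gate_w = rng.normal(size=(d, n_experts))
x = rng.normal(size=(tokens, d))
y = moe_forward(x, experts, gate_w)
print(y.shape)  # (3, 8)
```

The appeal of this design for a model trained at H100 scale is that capacity (total experts) can grow without a proportional increase in per-token compute, since each token only touches `top_k` experts.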
The model will be gradually introduced for selected text-based use cases in Copilot, allowing Microsoft AI to gather feedback and refine performance. The team combines in-house models, partner contributions, and open-source innovations to optimize results across millions of interactions daily. MAI-1-preview is also available to trusted testers, and applications for API access are open so the team can collect insights on its strengths and areas for improvement.
Looking ahead, Microsoft AI plans to advance the model further while orchestrating a suite of specialized models tailored to different user intents and scenarios. The division aims to continue developing leading AI solutions and making them accessible to users worldwide.