ACE-Step 1.5 XL, a music generation AI that runs locally and can generate songs with Japanese vocals, has been released

The music generation AI 'ACE-Step 1.5 XL' was released as an open model on April 2, 2026. ACE-Step 1.5 XL is an enhanced version of 'ACE-Step 1.5', which was released in February 2026. It runs on a local PC and can generate music, including songs with Japanese vocals.
ACE-Step 1.5: Pushing the Boundaries of Open-Source Music Generation
https://ace-step.github.io/ace-step-v1.5.github.io/
ACE-Step-1.5-xl is out now.
We scaled the DiT decoder to 4B. And it shows better audio quality, better prompt following, and better musicality. It still fast -- 8 steps with turbo distillation.
What didn't change:
- Same generation API, same LoRA training code, same everything… pic.twitter.com/P0YUFseEQ3
ACE Music (@acemusicAI), April 8, 2026
ACE-Step 1.5 XL is a 4B-parameter DiT model that generates songs with vocals from natural-language instructions alone. It supports Japanese vocals, and the ACE-Step 1.5 XL demo page includes songs sung in Japanese.
Japanese vocal track generated by the music generation AI 'ACE-Step 1.5 XL' - YouTube
ACE-Step 1.5 XL was developed as an open model, and three versions are available for download at the following links: the base model 'acestep-v15-xl-base', the fine-tuned model 'acestep-v15-xl-sft', and the distilled model 'acestep-v15-xl-turbo'.
ACE-Step/acestep-v15-xl-base · Hugging Face
https://huggingface.co/ACE-Step/acestep-v15-xl-base
ACE-Step/acestep-v15-xl-sft · Hugging Face
https://huggingface.co/ACE-Step/acestep-v15-xl-sft
ACE-Step/acestep-v15-xl-turbo · Hugging Face
https://huggingface.co/ACE-Step/acestep-v15-xl-turbo
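As a rough illustration of fetching one of the three repositories listed above, the following sketch uses `snapshot_download` from the `huggingface_hub` package. The helper names `repo_id_for` and `download_variant` and the short keys `base`, `sft`, and `turbo` are hypothetical conveniences for this example, not part of ACE-Step. Whether ACE-Step's own tooling expects the files in a particular directory is not covered here, so treat the download location as an assumption.

```python
from typing import Optional

# Repository IDs taken from the Hugging Face links above.
# The short keys ("base", "sft", "turbo") are labels invented for this sketch.
REPO_IDS = {
    "base": "ACE-Step/acestep-v15-xl-base",
    "sft": "ACE-Step/acestep-v15-xl-sft",
    "turbo": "ACE-Step/acestep-v15-xl-turbo",
}


def repo_id_for(variant: str) -> str:
    """Map a short variant key to its Hugging Face repository ID."""
    try:
        return REPO_IDS[variant]
    except KeyError:
        known = ", ".join(sorted(REPO_IDS))
        raise ValueError(f"unknown variant {variant!r}; expected one of: {known}") from None


def download_variant(variant: str, local_dir: Optional[str] = None) -> str:
    """Download every file in the chosen repository and return the local path.

    Requires `pip install huggingface_hub` and network access.
    """
    # Imported lazily so the mapping helper above works without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_for(variant), local_dir=local_dir)


if __name__ == "__main__":
    # Only prints the repository IDs; call download_variant("turbo") to fetch weights.
    for key in REPO_IDS:
        print(key, "->", repo_id_for(key))
```

Calling `download_variant("turbo")` would store the files in the default Hugging Face cache unless `local_dir` is given.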
The main differences between the three models are as follows: acestep-v15-xl-base offers the highest diversity, while acestep-v15-xl-sft offers improved quality at the cost of some diversity. acestep-v15-xl-turbo enables fast, high-quality generation, but offers less diversity and is less suited to fine-tuning than the base model. Running the models requires at least 12 GB of VRAM, and 20 GB or more is recommended.
| Model | acestep-v15-xl-base | acestep-v15-xl-sft | acestep-v15-xl-turbo |
|---|---|---|---|
| Number of steps | 50 | 50 | 8 |
| Quality | Medium | High | Very High |
| Diversity | High | Medium | Medium |
| Fine-tuning suitability | Easy | Medium | Medium |
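For readers scripting around these models, the comparison table and the VRAM figures above can be captured as plain data. This is a minimal sketch: the `Variant` class and the `vram_tier` and `fewest_steps` helpers are hypothetical names for this example, and the values are copied from the article rather than read from any ACE-Step API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    steps: int        # number of generation steps
    quality: str
    diversity: str
    fine_tuning: str  # fine-tuning suitability


# Values transcribed from the comparison table.
VARIANTS = [
    Variant("acestep-v15-xl-base", 50, "Medium", "High", "Easy"),
    Variant("acestep-v15-xl-sft", 50, "High", "Medium", "Medium"),
    Variant("acestep-v15-xl-turbo", 8, "Very High", "Medium", "Medium"),
]

# VRAM figures quoted in the article, in GB.
MIN_VRAM_GB = 12
RECOMMENDED_VRAM_GB = 20


def vram_tier(vram_gb: float) -> str:
    """Classify a GPU's VRAM against the quoted minimum and recommended sizes."""
    if vram_gb < MIN_VRAM_GB:
        return "below minimum"
    if vram_gb < RECOMMENDED_VRAM_GB:
        return "minimum"
    return "recommended"


def fewest_steps() -> Variant:
    """Return the variant that needs the fewest generation steps."""
    return min(VARIANTS, key=lambda v: v.steps)


if __name__ == "__main__":
    for gb in (8, 16, 24):
        print(f"{gb} GB VRAM: {vram_tier(gb)}")
    print("Fewest steps:", fewest_steps().name)
```

The step count is only a proxy for speed. Actual generation time depends on hardware and settings that the article does not specify.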
ACE-Step 1.5 XL can reuse the same workflow as ACE-Step 1.5. A Japanese-language tutorial titled 'ACE-Step 1.5 Ultimate Guide (Must Read)' is also available.
ACE-Step 1.5 Ultimate Guide (Must Read) · GitHub
https://github.com/ace-step/ACE-Step-1.5/blob/main/docs/ja/Tutorial.md
