Ace-Step 1.5 is live — MIT License, 100% Free

The Free, Open-Source AI Music Generator That Beats Paid Tools

Generate full songs with vocals, instruments & lyrics in ~20 seconds. No subscription. No limits. Runs locally on your GPU. The Suno killer is here.

~20s per song
🎵 Vocals + Instruments
🔓 MIT License
💰 $0/month
🌐 AceJAM web UI
💸 Calculate Your Savings ⭐ View on GitHub

Song Generation Cost Calculator

See exactly how much you save by switching from paid AI music platforms to Ace-Step. Adjust your monthly generation volume below.

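The calculator's arithmetic is straightforward; here is a minimal Python sketch using the top-tier plan prices from the comparison table below (electricity, Ace-Step's only recurring cost, is ignored):

```python
PLAN_COSTS = {  # monthly prices from the comparison table (top tiers)
    "Ace-Step 1.5": 0.0,
    "Suno v4": 24.0,
    "Udio": 30.0,
}

def monthly_savings(plan: str) -> float:
    """Dollars saved per month by switching the given plan to Ace-Step."""
    return PLAN_COSTS[plan] - PLAN_COSTS["Ace-Step 1.5"]

print(monthly_savings("Udio"))  # 30.0
```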

Ace-Step 1.5 vs Suno v4 vs Udio vs Google MusicFX

A complete head-to-head breakdown of every major AI music generation platform as of April 2026.

| Feature | Ace-Step 1.5 | Suno v4 | Udio | Google MusicFX |
| --- | --- | --- | --- | --- |
| Monthly Cost | $0 forever | $8–$24/mo | $10–$30/mo | Free (limited) |
| License | MIT (full commercial) | Suno ToS, restricted | Udio ToS, restricted | Google ToS |
| Runs Locally / Offline | ✓ Full offline | ✗ Cloud only | ✗ Cloud only | ✗ Cloud only |
| Open Source | ✓ GitHub | ✗ Closed | ✗ Closed | ✗ Closed |
| Full Vocal Lyrics | ✓ | ✓ | ✓ | ✗ Instrumental only |
| Generation Speed | ~20s (local GPU) | 30–90s (server) | 45–120s (server) | ~30s |
| Monthly Song Limit | Unlimited | 500–10,000 | 600–8,400 | Limited |
| Privacy (no data upload) | ✓ 100% local | ✗ Uploads to Suno | ✗ Uploads to Udio | ✗ Uploads to Google |
| Custom Model Fine-tuning | ✓ LoRA support | ✗ | ✗ | ✗ |
| API Access | ✓ Self-hosted REST | Paid tiers only | Limited beta | Google AI Studio |
| Web UI Available | ✓ AceJAM app | ✓ suno.com | ✓ udio.com | ✓ musicfx.withgoogle.com |
| Genre Diversity | Wide (20+ genres) | Wide | Wide | Moderate |
| Hardware Required | GPU 8GB+ VRAM | None (cloud) | None (cloud) | None (cloud) |
| Current Version | 1.5 (Apr 2026) | v4 (2025) | v2 (2025) | DJ Remix (2025) |
| Best For | Devs, power users, commercial production | Casual creators, no setup | Sound design focus | Quick instrumental jams |

What is Ace-Step?

The open-source model that's changing AI music generation — how it works and why it matters.

Ace-Step is a state-of-the-art AI music generation model built on a diffusion transformer architecture. Unlike earlier text-to-music models that generated only short clips or instrumentals, Ace-Step generates complete songs — vocals, backing track, lyrics, and all — in a single forward pass.

The model was trained on a diverse corpus of licensed music covering 20+ genres, with a focus on vocal coherence, lyric-melody alignment, and multi-instrument realism. Version 1.5, released April 2026, significantly improved high-frequency detail in vocals and reduced artifact rates in sustained notes.

What makes Ace-Step genuinely different is its MIT license. You own your output. You can build products on top of it. You can fine-tune it. There are no royalty obligations and no platform terms that restrict commercial use.

Key Technical Capabilities

  • 🎤 Full vocal generation with intelligible lyrics from a text prompt
  • 🎸 Multi-instrument layering (drums, bass, guitar, synths, piano)
  • ⚡ ~20-second generation time on an RTX 4090 (full song)
  • 🎛 LoRA fine-tuning support — train on your own voice/style
  • 🌐 Built-in REST API for integration into your apps
  • 🔧 Runs on 8GB VRAM minimum (consumer hardware)
  • 📦 Docker container available for easy deployment

Ace-Step Timeline

1
Early 2025
Ace-Step v1.0 released — Initial open-source release with basic vocal + instrumental generation. Community excitement about MIT license.
2
Q3 2025
AceJAM companion app launches — Web UI wrapper for local Ace-Step instances. No CLI required.
3
Q4 2025
LoRA fine-tuning support — Users can train custom vocal styles and genre presets on consumer GPUs.
4
Apr 2026
Ace-Step 1.5 released — Improved vocal quality, faster generation (~20s), better lyric-melody alignment, reduced artifacts.
5
Coming Soon
Ace-Step 2.0 (rumored) — Multitrack stems output, real-time streaming generation.

GPU Requirements for Ace-Step 1.5

Ace-Step runs locally on NVIDIA GPUs. Here's exactly what you need at each tier.

Minimum
8 GB
RTX 3060 / RTX 3070
Generation time: ~60–90s per song. Occasional VRAM pressure. Works but slower. Also: RTX 4060 (8GB), RX 6800 (16GB via ROCm).
Recommended
10–12 GB
RTX 3080 / RTX 4070
Generation time: ~30–45s per song. Smooth for solo production use. RTX 4070 Ti Super (16GB) gives more headroom for longer songs.
Great
16–20 GB
RTX 3090 / RTX 4080
Generation time: ~25–35s. Handles long songs (5+ min) easily. Good for running AceJAM as a local server for a small team.
Ideal
24 GB
RTX 4090 / A6000
Generation time: ~20s (full song). Batch generation, LoRA training, highest quality. Best choice for commercial production or hosting a server.
⚠️ AMD GPU note: ROCm support is in beta; NVIDIA GPUs are strongly recommended for best compatibility. macOS M-series (Apple Silicon) support via MPS is experimental in v1.5: generation works but is ~3–5x slower than on a comparable NVIDIA GPU.
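The tiers above map cleanly to code. A quick sketch of the table's VRAM thresholds and the rough per-song times they imply:

```python
def gpu_tier(vram_gb: float) -> str:
    """Map GPU VRAM to the Ace-Step 1.5 hardware tiers described above."""
    if vram_gb >= 24:
        return "Ideal (~20s/song)"
    if vram_gb >= 16:
        return "Great (~25-35s/song)"
    if vram_gb >= 10:
        return "Recommended (~30-45s/song)"
    if vram_gb >= 8:
        return "Minimum (~60-90s/song)"
    return "Below minimum -- expect CPU-like speeds"

print(gpu_tier(12))  # Recommended (~30-45s/song)
```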

AceJAM — Web UI for Ace-Step

AceJAM is the official web interface for Ace-Step. No command line needed once it's running.

Quick Setup (5 minutes)

  1. Install dependencies: Python 3.10+, CUDA 12.x, git, and NVIDIA driver 525+.
  2. Clone Ace-Step: git clone https://github.com/ace-step/ace-step, then pip install -r requirements.txt.
  3. Download model weights: run python download_weights.py, which fetches ~8GB of model files from HuggingFace automatically.
  4. Start the local server: python server.py --port 7860 exposes the REST API on localhost.
  5. Open AceJAM: navigate to acejam.app and enter your local server URL, or self-host AceJAM from its GitHub repo.
  6. Generate your first song: type a style prompt + lyrics, hit Generate, and get your full song in ~20 seconds.
generate.py — example API call

import requests

# Ace-Step local server API
url = "http://localhost:7860/generate"
payload = {
    "prompt": "upbeat indie pop, female vocal, acoustic guitar, summer vibes",
    "lyrics": "[verse]\nSunshine on my windowsill\nCoffee steaming on the sill\n[chorus]\nI'm alive, I'm alive",
    "duration": 180,  # seconds
    "seed": 42,       # for reproducibility
}

r = requests.post(url, json=payload)
result = r.json()

# Returns base64 WAV audio
audio_b64 = result["audio"]
print(f"Generated in {result['time_ms']}ms")
AceJAM Features
  • 🎨 Visual prompt builder with genre presets
  • 🎼 Lyric editor with section markers (verse/chorus/bridge)
  • 🔄 Real-time generation preview (streaming)
  • 📁 Song history & export (WAV, MP3, FLAC)
  • 🎛 LoRA preset selector for custom styles
  • 🌐 Optional public tunnel via ngrok for team sharing

Frequently Asked Questions

Everything you need to know about Ace-Step 1.5 and AceJAM.

What is Ace-Step AI Music Generator?
Ace-Step is an open-source AI music generation model released under the MIT license. It can generate full songs complete with vocals, instruments, and lyrics in approximately 20 seconds on a modern GPU. Version 1.5 was released in April 2026 with improved vocal quality, faster generation speed, and better lyric-melody alignment.
Is Ace-Step really free — including for commercial use?
Yes. Ace-Step is MIT-licensed and completely free to use, modify, and distribute — including commercially. You run it locally on your own GPU, so there are no subscription fees, no per-song charges, and no usage limits. Your only cost is electricity. The MIT license grants you full rights to the generated output for commercial purposes.
How does Ace-Step 1.5 compare to Suno v4?
Ace-Step 1.5 runs locally with no usage limits, full MIT commercial rights, and zero monthly cost. Suno v4 costs $8–$24/month, caps you at 500–10,000 songs/month, and retains certain platform rights. Suno has a polished web UI and zero hardware requirements — so it wins on accessibility. Ace-Step wins on cost, control, privacy, and flexibility. For hobbyists who just want to make songs quickly, Suno is easier. For developers, studios, or power users generating at volume, Ace-Step is the clear choice.
What GPU do I need to run Ace-Step 1.5?
Minimum: NVIDIA GPU with 8GB VRAM (RTX 3060, RTX 4060). Expect ~60–90s per song. Recommended: RTX 3080 or RTX 4070 (10–12GB VRAM) for ~30–45s per song. Best: RTX 4090 (24GB VRAM) for the full ~20s generation speed and batch processing. AMD GPUs work via ROCm but with experimental support. Apple M-series works via MPS backend but is slower.
What is AceJAM and do I need it?
AceJAM is the official companion web application for Ace-Step. It provides a browser-based UI to interact with your locally-running Ace-Step server — no command-line required once set up. You don't technically need it (the REST API works fine with curl or Python), but AceJAM makes the experience much more user-friendly with its visual prompt builder, lyric editor, song history, and export tools.
What genres does Ace-Step 1.5 support?
Ace-Step 1.5 supports 20+ genres including pop, rock, hip-hop, EDM, country, jazz, classical, R&B, folk, metal, lo-fi, bossa nova, reggae, and electronic. You specify genre, mood, tempo, instrumentation, vocal style, and lyrics via text prompt. The more specific your prompt, the better the output. Version 1.5 particularly improved multi-instrument realism and vocal consonant clarity.
Can I fine-tune Ace-Step on my own voice or music style?
Yes! Ace-Step 1.5 supports LoRA fine-tuning, which means you can train lightweight adapter weights on your own vocal recordings or a specific musical style. Training a LoRA takes about 2–4 hours on an RTX 4090 with a reasonable dataset. Once trained, your LoRA preset can be loaded in AceJAM to generate music consistently in your custom style.
Does Ace-Step work on CPU (no GPU)?
Technically yes, but practically no for real-time use. On a modern CPU (e.g., Ryzen 9 or Core i9), generation of a 3-minute song takes 20–40 minutes. GPU is strongly recommended. If you have no suitable GPU, using a cloud GPU rental (RunPod, Vast.ai) for a few cents per hour is a cost-effective alternative while still keeping costs far below Suno/Udio subscriptions.
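The cloud-rental math above is easy to sanity-check. In this sketch, the hourly rate is an illustrative assumption, not a quoted price from any provider:

```python
# Rough cloud-GPU cost per song when renting instead of buying hardware.
hourly_rate = 0.40          # $/hr for a rented RTX 4090 (assumed rate)
seconds_per_song = 20       # Ace-Step 1.5 full-song time on a 4090
songs_per_hour = 3600 / seconds_per_song   # 180 songs/hr
cost_per_song = hourly_rate / songs_per_hour
print(f"${cost_per_song:.4f} per song")    # $0.0022 per song
```

Even at triple that hourly rate, the per-song cost stays well under a cent, which is why renting a GPU for bursts of generation undercuts a flat monthly subscription.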
