
---
title: ACE-Step 1.5 — Local Network Setup Guide
description: How to run ACE-Step AI music generator on the local network via Gradio UI or REST API, including .env configuration and startup notes.
type: guide
domain: development
tags:
  - ace-step
  - ai
  - music-generation
  - gradio
  - gpu
  - cuda
---

# ACE-Step 1.5 — Local Network Setup

ACE-Step is an open-source AI music generation model. This guide covers running it on the workstation and serving the Gradio web UI to the local network.

## Location

`/mnt/NV2/Development/ACE-Step-1.5/`

Cloned from GitHub. Uses `uv` for dependency management — the `.venv` is created automatically on first run.

## Quick Start (Gradio UI)

```shell
cd /mnt/NV2/Development/ACE-Step-1.5
./start_gradio_ui.sh
```

The UI is then accessible from any device on the network at http://10.10.0.41:7860 (substitute the workstation's current IP).
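If the workstation's address isn't known offhand, a quick way to build the URL (assumes a Linux host where `hostname -I` is available):

```shell
# Print the Gradio URL for the first LAN address this host reports.
# `hostname -I` lists all assigned addresses, space-separated.
ip=$(hostname -I | awk '{print $1}')
echo "http://$ip:7860"
```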

## .env Configuration

The `.env` file in the project root persists settings across git updates. Current config:

```shell
SERVER_NAME=0.0.0.0
PORT=7860
LANGUAGE=en
```

### Key Settings

| Variable | Default | Description |
| --- | --- | --- |
| `SERVER_NAME` | `127.0.0.1` | Set to `0.0.0.0` for LAN access |
| `PORT` | `7860` | Gradio UI port |
| `LANGUAGE` | `en` | UI language (`en`, `zh`, `he`, `ja`). Must be set — an empty value causes an unbound variable error with the launcher's `set -u` |
| `ACESTEP_CONFIG_PATH` | `acestep-v15-turbo` | DiT model variant |
| `ACESTEP_LM_MODEL_PATH` | `acestep-5Hz-lm-0.6B` | Language model for lyrics/prompts |
| `ACESTEP_INIT_LLM` | `auto` | `auto` / `true` / `false` — `auto` detects based on VRAM |
| `CHECK_UPDATE` | `true` | Set to `false` to skip the interactive update prompt (useful for background/automated starts) |

See `.env.example` for the full list.
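Putting the table together, a plausible `.env` for unattended LAN serving might look like this (a sketch using only the defaults and values listed above; adjust to taste):

```shell
# Example .env for headless LAN operation
SERVER_NAME=0.0.0.0
PORT=7860
# LANGUAGE must be non-empty: the launcher runs with set -u
LANGUAGE=en
# Skip the interactive update prompt for unattended starts
CHECK_UPDATE=false
```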

## REST API Server (Alternative)

For programmatic access instead of the web UI:

```shell
cd /mnt/NV2/Development/ACE-Step-1.5
./start_api_server.sh
```

Default: http://127.0.0.1:8001. To serve on the LAN, edit line 12 of `start_api_server.sh`:

```shell
HOST="0.0.0.0"
```
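If editing by hand is a chore, the swap can be scripted. This is a hypothetical helper, not part of the project; it assumes the line still reads `HOST="127.0.0.1"` as shipped, and it demonstrates on a scratch copy so it is safe to run anywhere:

```shell
# Rewrite the HOST line to bind on all interfaces.
# Demonstrated on a scratch copy; point SCRIPT at the real
# start_api_server.sh to apply it for real.
SCRIPT=start_api_server.demo.sh
printf 'HOST="127.0.0.1"\n' > "$SCRIPT"            # stand-in for line 12
sed -i 's/^HOST="127\.0\.0\.1"/HOST="0.0.0.0"/' "$SCRIPT"
grep HOST "$SCRIPT"    # HOST="0.0.0.0"
```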

API docs are available at `http://<ip>:8001/docs`.

## Hardware Profile (Workstation)

- **GPU:** NVIDIA RTX 4080 SUPER (16 GB VRAM)
- **Tier:** 16 GB class — auto-enables CPU offload, INT8 quantization, LLM
- **Max batch (with LM):** 4
- **Max batch (without LM):** 8
- **Max duration (with LM):** 480 s (8 min)
- **Max duration (without LM):** 600 s (10 min)

## Startup Behavior

1. Loads `.env` configuration
2. Checks for git updates (interactive prompt — set `CHECK_UPDATE=false` to skip)
3. Creates `.venv` via `uv sync` if missing (slow on first run)
4. Runs legacy NVIDIA torch compatibility check
5. Loads DiT model → quantizes to INT8 → loads LM → allocates KV cache
6. Launches Gradio with a queue for multi-user support

Full startup takes ~30-40 seconds after the first run.
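For fully unattended startup, the launch can be wrapped in a service. A hypothetical systemd unit — the unit name and layout are assumptions, not part of the project; the paths and `CHECK_UPDATE` setting come from this guide:

```ini
# /etc/systemd/system/ace-step.service (hypothetical)
[Unit]
Description=ACE-Step Gradio UI
After=network-online.target

[Service]
WorkingDirectory=/mnt/NV2/Development/ACE-Step-1.5
Environment=CHECK_UPDATE=false
ExecStart=/mnt/NV2/Development/ACE-Step-1.5/start_gradio_ui.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
```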

## Gotchas

- **`LANGUAGE` must be set in `.env`:** The system `$LANGUAGE` locale variable can be empty, causing the launcher to crash with `unbound variable` due to `set -u`. Always include `LANGUAGE=en` in `.env`.
- **Update prompt blocks background execution:** If running headlessly or from a script, set `CHECK_UPDATE=false` to avoid the interactive Y/N prompt.
- **Model downloads:** First run downloads ~4-5 GB of model weights from HuggingFace. Subsequent runs use cached checkpoints in `./checkpoints/`.
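The first gotcha is easy to reproduce in isolation. A minimal sketch of the failure mode, in plain shell with no ACE-Step required:

```shell
# Under `set -u`, expanding an unset variable aborts the shell.
# The subshell below fails the same way the launcher does when
# LANGUAGE is missing from both the environment and .env.
unset LANGUAGE
if ( set -u; : "${LANGUAGE}" ) 2>/dev/null; then
  echo "LANGUAGE is set"
else
  echo "crashed: LANGUAGE unbound under set -u"
fi
```

Setting `LANGUAGE=en` (in `.env` or the environment) makes the expansion succeed and the first branch print instead.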