Guide to Prompting Minimax Music for AI Music Generation

Master style prompts, lyrics formatting, and structural tags for Minimax Music

Last Updated: Apr 20, 2026

Models Tested: Minimax Music v2

Minimax Music v2 is a large-scale AI music generation model built on a Mixture-of-Experts (MoE) architecture with 230 billion parameters. It produces fully mixed and mastered songs from two simple inputs: a style prompt describing the musical direction and lyrics containing the words and song structure. It powers Ambience AI's audio generation pipeline, enabling creators to produce professional-quality tracks with realistic vocals, rich instrumentals, and polished production.

The model generates 44.1 kHz stereo audio, up to five minutes per generation. It supports over 100 genres, synchronizes the sung vocals with your lyrics, and outputs a fully mixed and mastered track. Minimax Music v2 also supports voice cloning and instrumental reference audio for more control over the final result.

This guide covers everything you need to know about prompting Minimax Music v2 effectively, from writing style prompts and formatting lyrics with structural tags to generating instrumentals, using reference audio, and troubleshooting common issues.

Understanding Minimax Music

Minimax Music v2 is a commercial-grade music generation model developed by MiniMax. It separates musical direction from lyrical content through a two-input system designed for lyric-driven composition: a style prompt describes the sound, and a lyrics field carries the words and the song structure.

The Two-Input System

Style Prompt (10 to 300 characters)

The style prompt defines the musical direction: genre, mood, instruments, tempo, vocal style, and production qualities. It functions as a creative brief that shapes the overall sound. Example: "Indie folk, melancholic, introspective, longing, acoustic guitar, soft vocals"

Lyrics (10 to 3,000 characters)

The lyrics field provides the vocal content along with structural tags that organize the song into sections. Minimax Music v2 supports 14 structural tags including [Verse], [Chorus], [Bridge], and more. For instrumental tracks, enable the is_instrumental flag or enter [Instrumental] as the lyrics, as described under Generating Instrumental Music.
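The two character limits above are easy to check before submitting anything. The following is a minimal sketch, not part of any official SDK: the function name check_inputs is made up for illustration, and it assumes the service counts characters the same way Python's len() does.

```python
# Limits are the ones stated in this guide. Counting with len() is an
# assumption; the service may count characters differently.
STYLE_LIMITS = (10, 300)
LYRICS_LIMITS = (10, 3000)


def check_inputs(style_prompt: str, lyrics: str) -> list[str]:
    """Return human-readable problems; an empty list means both fields fit."""
    problems = []
    for name, text, (low, high) in (
        ("style prompt", style_prompt, STYLE_LIMITS),
        ("lyrics", lyrics, LYRICS_LIMITS),
    ):
        if not low <= len(text) <= high:
            problems.append(
                f"{name} has {len(text)} characters, expected {low} to {high}"
            )
    return problems


if __name__ == "__main__":
    print(check_inputs("Indie folk, melancholic, acoustic guitar",
                       "[Verse]\nWalking through the city lights"))
    print(check_inputs("Pop", "[Instrumental]"))
```

Returning a list of problems rather than raising on the first one lets you report both fields at once.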

Key Capabilities

230 billion parameter MoE architecture for high-quality, fully mixed and mastered output
44.1 kHz stereo audio with output up to 5 minutes per generation
100+ genres including pop, rock, jazz, classical, electronic, hip hop, R&B, and more
Precise lyrics-to-vocal synchronization with realistic singing voices
14 structural tags for detailed song arrangement control
Voice cloning and instrumental reference audio support
Multiple output formats: MP3, WAV, and PCM

Crafting Style Prompts

The style prompt is your primary tool for shaping the sound of your generation. It must be between 10 and 300 characters. Think of it as a concise creative brief that tells the model what kind of track to produce.

Style Prompt Anatomy

1. Genre and Subgenre

Lead with the genre to establish the musical foundation. Be specific with subgenres when possible. Examples: "dreampop", "melodic techno", "neo-soul", "post-rock"

2. Mood and Atmosphere

Add emotional descriptors that guide the feel. Examples: "melancholic, introspective", "euphoric, anthemic", "dark, brooding", "warm, nostalgic"

3. Key Instruments

Specify instruments you want featured. Examples: "acoustic guitar, piano, soft strings", "808 bass, hi-hats, synth pads"

4. Vocal Direction and Tempo

Guide the vocal style and speed. Examples: "breathy female vocals, 90 BPM", "raspy male vocal, uptempo", "soulful harmonies, slow ballad"

Example: Well-Crafted Style Prompts

"Indie folk, melancholic, introspective, acoustic guitar, soft female vocals, gentle piano, 95 BPM"

Genre + mood + instruments + vocal direction + tempo (97 characters)

"Cinematic orchestral, epic, sweeping strings, brass fanfare, timpani, heroic, slow build"

Genre + mood + instruments + dynamic direction (88 characters)

"Lo-fi hip hop, chill, jazzy chords, vinyl crackle, mellow Rhodes piano, 80 BPM"

Genre + mood + instruments + tempo (78 characters)
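The anatomy above (genre, then mood, instruments, vocal direction, and tempo) can be assembled mechanically. This is an illustrative sketch only: build_style_prompt and its parameter names are hypothetical, and the comma-separated layout simply mirrors the examples in this guide.

```python
def build_style_prompt(genre, moods=(), instruments=(), vocal="", tempo=""):
    """Join the parts in the order genre, mood, instruments, vocal, tempo."""
    parts = [genre, *moods, *instruments, vocal, tempo]
    prompt = ", ".join(p.strip() for p in parts if p and p.strip())
    # 10 to 300 characters is the style prompt limit stated in this guide.
    if not 10 <= len(prompt) <= 300:
        raise ValueError(f"style prompt is {len(prompt)} characters, expected 10 to 300")
    return prompt


prompt = build_style_prompt(
    "Lo-fi hip hop",
    moods=["chill"],
    instruments=["jazzy chords", "vinyl crackle", "mellow Rhodes piano"],
    tempo="80 BPM",
)
print(prompt)
print(len(prompt), "characters")
```

Keeping the parts separate makes it easy to swap one descriptor between generations while leaving the rest untouched.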

Tips for Better Style Prompts

Be specific. "Dark melodic techno, pulsing bassline, atmospheric pads" works better than just "electronic music."
Stay within the 300-character limit. Focus on the most important descriptors rather than listing everything.
Avoid contradictory descriptors. Combining "aggressive" with "gentle" or "ambient" with "thrash metal" will produce inconsistent results.
Include vocal direction when generating songs. Describing the vocal quality (breathy, powerful, raspy) helps the model match your vision.
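The advice about contradictory descriptors can be turned into a rough pre-flight check. The sketch below is an assumption-heavy illustration: the pair list contains only the two examples named above, the function name find_conflicts is hypothetical, and plain substring matching is a crude stand-in for real judgment.

```python
# Only the two pairs mentioned in this guide; extend the list with your own.
CONFLICTING_PAIRS = [("aggressive", "gentle"), ("ambient", "thrash metal")]


def find_conflicts(style_prompt, pairs=CONFLICTING_PAIRS):
    """Return every pair whose two descriptors both appear in the prompt."""
    text = style_prompt.lower()
    return [(a, b) for a, b in pairs if a in text and b in text]


print(find_conflicts("Thrash metal, ambient pads, fast"))
print(find_conflicts("Indie folk, gentle piano"))
```

A non-empty result is a prompt to re-read the style prompt, not proof that the combination will fail.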

Generating Instrumental Music

To generate a purely instrumental track, use [Instrumental] as your lyrics. This tells the model to focus entirely on the musical arrangement without generating any vocals.

Instrumental Setup

Style Prompt: "Cinematic orchestral, epic, slow tempo, strings, brass, emotional, film score"

Lyrics: [Instrumental]
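In code, the instrumental setup amounts to substituting the [Instrumental] marker when no lyrics are supplied. The sketch below only builds a dictionary; the field names "prompt" and "lyrics" and the function build_request are hypothetical and should be replaced with whatever your client or API actually expects.

```python
INSTRUMENTAL_MARKER = "[Instrumental]"


def build_request(style_prompt, lyrics=None):
    """Pair a style prompt with lyrics, falling back to the instrumental marker."""
    # "prompt" and "lyrics" are placeholder field names, not a documented schema.
    return {
        "prompt": style_prompt,
        "lyrics": lyrics if lyrics else INSTRUMENTAL_MARKER,
    }


request = build_request(
    "Cinematic orchestral, epic, slow tempo, strings, brass, emotional, film score"
)
print(request)
```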

Instrumental Tags in Lyrics

Even when writing songs with vocals, you can include instrumental sections using structural tags. The [Inst] and [Solo] tags create vocal-free passages within your song. You can also add parenthetical instrument directions to guide what plays during these sections.

Example: Mixed Vocal and Instrumental

[Verse]
Walking through the city lights
Every corner tells a story tonight

[Inst]
(guitar solo, building intensity)

[Chorus]
We belong to the night
Under neon skies so bright

Genre Fusion Tips

Minimax Music v2 handles genre blending well when you guide it clearly in the style prompt. Structure your prompt with a primary genre and secondary influences to get coherent results.

Primary + Influence

"Jazz fusion, electronic elements, smooth, late night, saxophone, synth pads"

Era + Modern Production

"70s funk, modern production, groovy, bass guitar, clavinet, punchy drums"

Writing Lyrics with Structural Tags

The lyrics field accepts 10 to 3,000 characters and supports 14 structural tags that organize your song into distinct sections. These tags tell Minimax Music v2 how to arrange the track and where to place vocals, instrumental breaks, and transitions.

All 14 Structural Tags

[Intro]: Opening section
[Verse]: Story sections
[Pre Chorus]: Builds to chorus
[Chorus]: Main hook
[Post Chorus]: After the hook
[Bridge]: Contrasting part
[Interlude]: Musical break
[Transition]: Section connector
[Build Up]: Rising tension
[Break]: Sparse moment
[Hook]: Catchy phrase
[Inst]: Instrumental
[Solo]: Instrument solo
[Outro]: Closing section
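Because only these 14 tags are supported, it can help to scan lyrics for anything else in square brackets before generating. The following is a small sketch under stated assumptions: unsupported_tags is a hypothetical helper, matching is case-sensitive on the assumption that exact capitalization matters, and the standalone [Instrumental] marker is treated as a special case.

```python
import re

SUPPORTED_TAGS = {
    "Intro", "Verse", "Pre Chorus", "Chorus", "Post Chorus", "Bridge",
    "Interlude", "Transition", "Build Up", "Break", "Hook", "Inst",
    "Solo", "Outro",
}


def unsupported_tags(lyrics):
    """List bracketed tags that are not among the 14 supported names."""
    # Lyrics consisting only of the instrumental marker are a separate case.
    if lyrics.strip() == "[Instrumental]":
        return []
    found = re.findall(r"\[([^\[\]]*)\]", lyrics)
    return [f"[{name}]" for name in found if name not in SUPPORTED_TAGS]


sample = "[Verse]\nWalking through the city lights\n\n[verse]\nAgain\n\n[Guitar Solo]\n(solo)"
print(unsupported_tags(sample))
```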

Example: Structured Lyrics

[Intro]
(soft piano, ambient atmosphere)

[Verse]
Morning light through the window pane
Every whisper calls your name
I've been searching for a sign
Something real, something mine

[Pre Chorus]
Can you feel it in the air tonight

[Chorus]
We are infinite, we are the stars
Burning bright through all these scars
Nothing in this world can pull us apart

[Interlude]
(strings swell, gentle build)

[Bridge]
When the darkness tries to find us
We will be the light behind us

[Chorus]
We are infinite, we are the stars
Burning bright through all these scars
Nothing in this world can pull us apart

[Outro]
(fade out, piano and strings)
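Lyrics laid out like the example above can be generated from a simple list of sections, which makes reordering or repeating a chorus less error-prone. This is a minimal sketch: render_lyrics is a hypothetical helper, and the layout (tag on its own line, blank line between sections) is copied from the examples in this guide rather than from a formal specification.

```python
def render_lyrics(sections):
    """Turn (tag, lines) pairs into a lyrics string with one block per section."""
    blocks = ["\n".join([f"[{tag}]", *lines]) for tag, lines in sections]
    lyrics = "\n\n".join(blocks)
    # 10 to 3,000 characters is the lyrics limit stated in this guide.
    if not 10 <= len(lyrics) <= 3000:
        raise ValueError(f"lyrics are {len(lyrics)} characters, expected 10 to 3000")
    return lyrics


chorus = ["We are infinite, we are the stars", "Burning bright through all these scars"]
song = render_lyrics([
    ("Intro", ["(soft piano, ambient atmosphere)"]),
    ("Chorus", chorus),
    ("Interlude", ["(strings swell, gentle build)"]),
    ("Chorus", chorus),  # reuse the same lines for the repeat
])
print(song)
```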

Lyrics Best Practices

Use Parenthetical Notes

Add stage directions in parentheses to guide the arrangement. For example, (whispering), (guitar solo), or (building intensity) give the model additional context.

Write for Singability

Use simple, natural phrasing. Short lines of 4 to 8 words work best. Avoid tongue twisters, complex vocabulary, or very long sentences that are difficult to sing naturally.
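The 4 to 8 word guideline is simple enough to check automatically. The sketch below is one illustrative way to do it: flag_hard_to_sing is a hypothetical name, and it assumes that lines starting with "[" are tags and lines starting with "(" are parenthetical notes, neither of which is sung.

```python
def flag_hard_to_sing(lyrics, min_words=4, max_words=8):
    """Return (line_number, word_count, text) for sung lines outside the range."""
    flagged = []
    for number, line in enumerate(lyrics.splitlines(), start=1):
        text = line.strip()
        # Skip blank lines, structural tags, and parenthetical directions.
        if not text or text.startswith("[") or text.startswith("("):
            continue
        count = len(text.split())
        if not min_words <= count <= max_words:
            flagged.append((number, count, text))
    return flagged


draft = (
    "[Verse]\n"
    "Morning light through the window pane\n"
    "Tonight\n"
    "(soft piano)\n"
    "I keep on running through the rain until the morning comes again"
)
for number, count, text in flag_hard_to_sing(draft):
    print(f"line {number}: {count} words: {text}")
```

Treat the result as a list of lines to re-read aloud, since a short line can still be the right choice for a hook.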

Match Style and Lyrics

Keep the emotional tone consistent between your style prompt and lyrics. Sad lyrics paired with "upbeat, party" in the style prompt will produce confusing results.

Stay Within Character Limits

Lyrics must be between 10 and 3,000 characters. For songs up to 5 minutes, you have plenty of room. Use structural tags and parenthetical notes to fill out the arrangement without needing excessive lyrics.

Minimax Music Prompt Examples

Here are complete prompt templates for common music generation use cases. Each includes both a style prompt and lyrics you can adapt for your projects.

Pop Song with Vocals

Style Prompt:

"Pop, catchy, upbeat, female vocal, synth, bright production, 120 BPM"

Lyrics:

[Intro]
(synth arpeggios, building energy)

[Verse]
Lights are flashing all around
Feel the rhythm, feel the sound
Tonight we're never coming down
This city's ours to own

[Pre Chorus]
Can you feel it rising

[Chorus]
Dance with me under the neon glow
Let the music take control
We are everything we'll ever know
Let the night unfold

[Inst]
(synth breakdown, pulsing bass)

[Chorus]
Dance with me under the neon glow
Let the music take control

[Outro]
(fade out, echoing vocals)

Cinematic Instrumental

Style Prompt:

"Cinematic orchestral, epic, sweeping strings, brass fanfare, timpani, heroic, slow build"

Lyrics:

[Instrumental]

Lo-Fi Chill Beat

Style Prompt:

"Lo-fi hip hop, chill, mellow, vinyl crackle, jazzy piano, soft drums, warm, 80 BPM"

Lyrics:

[Instrumental]

Rock Track with Lyrics

Style Prompt:

"Rock, electric guitar, powerful drums, male vocal, energetic, raw, 130 BPM"

Lyrics:

[Intro]
(distorted guitar riff, crashing drums)

[Verse]
Standing on the edge of the unknown
Fire in my veins, I'm not alone
Every road I take leads me back home

[Chorus]
We rise, we fall, we carry on
Through the storm we're standing strong
This is where we all belong

[Solo]
(electric guitar solo, soaring)

[Bridge]
The ground may shake beneath our feet
But we will never taste defeat

[Chorus]
We rise, we fall, we carry on
Through the storm we're standing strong

[Outro]
(drums fade, final guitar chord rings out)

Try these prompts in our AI audio generator to hear the results.

Editing and Refining Your Music

Getting the perfect track often takes iteration. Minimax Music v2 offers several approaches for refining your generations and pushing results closer to your creative vision.

Generate Variations

Run the same style prompt and lyrics multiple times. Each generation produces a different arrangement, melody, and vocal interpretation. Generate three to five variations and pick the best one.

Iterate on Style Prompts

Tweak the style prompt between generations. If the track is too slow, add "uptempo" or increase the BPM. If the vocals are too prominent, emphasize instruments. Small changes in the style prompt can produce meaningfully different results.
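When iterating on tempo, it is convenient to change only the BPM and leave the rest of the style prompt intact. The helper below is a hypothetical sketch: set_bpm assumes the tempo is written as a number followed by "BPM", as in the examples in this guide.

```python
import re

# Matches a tempo written like "120 BPM" or "95bpm".
BPM_PATTERN = re.compile(r"\b\d{2,3}\s*BPM\b", re.IGNORECASE)


def set_bpm(style_prompt, bpm):
    """Replace the first tempo in the prompt, or append one if none is present."""
    updated, replaced = BPM_PATTERN.subn(f"{bpm} BPM", style_prompt, count=1)
    return updated if replaced else f"{style_prompt}, {bpm} BPM"


print(set_bpm("Pop, catchy, upbeat, female vocal, 120 BPM", 128))
print(set_bpm("Rock, electric guitar, raw", 130))
```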

Voice Cloning Reference

Upload a reference audio clip to guide the vocal timbre and style. This lets you maintain a consistent vocal character across multiple generations. The model adapts the singing voice to match the reference while following your lyrics.

Instrumental Reference Audio

Provide a reference track to guide the instrumental arrangement, production style, and overall sonic texture. The model uses this as a template for the backing track while generating new music that follows your style prompt and lyrics.

Refinement Workflow

1. Generate an initial track with your style prompt and lyrics.
2. Listen through and identify what works and what needs adjustment.
3. Tweak the style prompt to shift the genre, mood, or instrumentation.
4. Revise your lyrics to improve flow or add structural variety.
5. Use reference audio if you want to match a specific vocal character or instrumental style.

Iteration is the key to getting professional results from Minimax Music v2.

Troubleshooting Common Issues

Here are solutions to the most common issues creators encounter when generating music with Minimax Music v2.

Model Singing Structural Tags

Problem: The model vocalizes the tag names (e.g., singing "verse" or "chorus" out loud)

Solution: Make sure tags use the exact supported format with proper capitalization: [Verse], [Chorus], etc. Avoid custom or unsupported tag names. Place each tag on its own line directly above the lines it introduces, and separate sections with a blank line, as in the examples in this guide.
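Capitalization and hyphenation slips in tag names can be repaired before submitting. The following is a minimal sketch, assuming that only the 14 tag spellings listed in this guide are valid: normalize_tags is a hypothetical helper that rewrites near-misses such as [verse] or [PRE-CHORUS] and leaves unknown tags alone so you can review them by hand.

```python
import re

CANONICAL_TAGS = [
    "Intro", "Verse", "Pre Chorus", "Chorus", "Post Chorus", "Bridge",
    "Interlude", "Transition", "Build Up", "Break", "Hook", "Inst",
    "Solo", "Outro",
]
LOOKUP = {name.lower(): name for name in CANONICAL_TAGS}


def normalize_tags(lyrics):
    """Rewrite recognizable tag variants to their canonical spelling."""
    def fix(match):
        # Treat hyphens, underscores, and repeated spaces as a single space.
        key = re.sub(r"[-_\s]+", " ", match.group(1)).strip().lower()
        return f"[{LOOKUP[key]}]" if key in LOOKUP else match.group(0)

    return re.sub(r"\[([^\[\]]*)\]", fix, lyrics)


print(normalize_tags("[verse]\nFirst line here now\n\n[PRE-CHORUS]\nCan you feel it rising"))
```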

Style Prompt Not Being Followed

Problem: The generated track doesn't match the style you described

Solution: Be more specific in your style prompt. Use concrete genre names, specific instruments, and clear mood descriptors. Avoid vague terms like "good" or "nice." If you need a particular vocal style, describe it explicitly (e.g., "breathy female vocal" instead of just "female vocal").

Choppy or Unnatural Vocals

Problem: The vocals sound robotic, choppy, or poorly synchronized

Solution: Simplify your lyrics. Use shorter lines with natural phrasing (4 to 8 words per line). Avoid complex vocabulary, tongue twisters, or lines that are too long. Add [Interlude] or [Inst] breaks to give the vocals breathing room between sections.

Choosing the Right Output Format

Question: Which output format should you select?

Answer: Use MP3 (up to 256 kbps) for web sharing and general listening. Use WAV for lossless quality when you plan to do further editing or mixing. Use PCM for raw, uncompressed audio in professional production workflows.

Start Creating Music with Minimax Music

Minimax Music v2 brings professional-quality AI music generation to every creator. With the right combination of style prompts, structural tags, and lyrics, you can create everything from cinematic scores to pop anthems with realistic vocals. The key is to start simple and iterate.

Begin with the templates in this guide, experiment with different genre and mood combinations, and use reference audio to dial in the exact sound you want. The more you create, the better your intuition for prompting will become. Try it now with our AI audio generator.

Looking for more music techniques? Check out our ACE-Step music prompting guide for an alternative approach to AI music generation. You can also explore our Flux image prompting guide, WAN video prompting guide, Kling video prompting guide, or browse our complete suite of creative tools.
