Current music generators capture local textures but often fail to model long-range structure, leading to off-beat outputs, weak section transitions, and limited editing capability. We present MusicWeaver, a music generation model conditioned on a beat-aligned structural plan. This plan serves as an editable intermediate representation between the input prompt and the generated music, preserving global form and enabling professional, localized edits. MusicWeaver consists of a planner, which translates prompts into a structural plan encoding musical form and compositional cues, and a diffusion-based generator, which synthesizes music under the plan’s guidance. To assess generation and editing quality, we introduce two metrics: the Structure Coherence Score (SCS) for evaluating long-range form and timing, and the Edit Fidelity Score (EFS) for measuring the accuracy of realizing plan edits. Experiments demonstrate that MusicWeaver achieves state-of-the-art fidelity and controllability, producing music that more closely resembles human-composed works.
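To make the role of the plan as an editable intermediate concrete, the following is a minimal, hypothetical sketch of the prompt-to-plan-to-audio workflow. All names here (Plan, BarPlan, StructuralPlanner-style planner and renderer objects, their methods) are illustrative assumptions and not the paper's actual API; the per-bar fields mirror the attributes described in the architecture below.

```python
from dataclasses import dataclass, field

@dataclass
class BarPlan:
    section: str   # e.g. "verse", "chorus" (hypothetical label set)
    energy: float  # normalized 0..1
    groove: str    # rhythmic pattern label
    harmony: str   # e.g. a chord symbol for the bar

@dataclass
class Plan:
    tempo_bpm: float
    meter: tuple                                # e.g. (4, 4)
    bars: list = field(default_factory=list)    # one BarPlan per bar

def generate(prompt: str, planner, renderer):
    """Two-stage generation: the plan is an editable intermediate."""
    plan = planner.predict(prompt)        # prompt -> beat-aligned structural plan
    audio = renderer.synthesize(plan)     # plan -> waveform (diffusion + vocoder)
    return plan, audio

def edit_and_regenerate(plan, renderer, bar_index: int, new_energy: float):
    """A localized edit: change one bar's energy, then re-render under the edited plan."""
    plan.bars[bar_index].energy = new_energy
    return renderer.synthesize(plan)
```

Because edits touch only the plan, a single bar can be changed and the music re-rendered without rewriting the prompt, which is the property the EFS metric is intended to measure.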
Architecture of MusicWeaver. In the structural planner, multimodal encoders (audio, text, video) produce embeddings fused by confidence-weighted attention. A Beat-Grid Generator establishes meter, bar count, and tempo, while a Transformer Decoder with prediction heads infers per-bar Section, Energy, Groove, and Harmony to form the structural plan. In the music renderer, a Control Encoder upsamples the plan into frame-aligned controls that condition the diffusion-based renderer, which iteratively denoises to a mel spectrogram; a vocoder then synthesizes the final waveform.
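As a rough illustration of the Control Encoder step, the sketch below upsamples per-bar plan entries into frame-aligned control features for the diffusion renderer, reusing the hypothetical Plan/BarPlan structures from the earlier sketch. The frame rate, section vocabulary, and control-vector layout are assumptions for illustration; the actual encoder is a learned module rather than the simple repetition shown here.

```python
import numpy as np

# Hypothetical section vocabulary; the paper does not specify the label set.
SECTIONS = {"intro": 0, "verse": 1, "chorus": 2, "bridge": 3, "outro": 4}

def upsample_plan(plan, frames_per_second: float = 86.0):
    """Repeat each bar's features over the frames that bar spans, given meter and tempo."""
    beats_per_bar = plan.meter[0]
    seconds_per_bar = beats_per_bar * 60.0 / plan.tempo_bpm
    frames_per_bar = int(round(seconds_per_bar * frames_per_second))

    controls = []
    for bar in plan.bars:
        # One control vector per bar: [section id, energy]. A real Control Encoder
        # would also embed groove and harmony and pass everything through learned layers.
        vec = np.array([SECTIONS.get(bar.section, 0), bar.energy], dtype=np.float32)
        controls.append(np.tile(vec, (frames_per_bar, 1)))
    return np.concatenate(controls, axis=0)  # shape: (total_frames, control_dim)
```

Aligning controls to frames in this way is what lets the diffusion renderer condition on the plan at the same temporal resolution as the mel spectrogram it denoises.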