How to Make Your Own Karaoke Tracks: Removing Vocals From Any Song in 2026

Last spring my friend Hana turned 35 and asked me, politely but with the energy of someone who had already bought a karaoke machine, whether I could make sure "Mr. Brightside" was on it. I checked. Of course it was. The Killers' biggest song is in every karaoke catalog in the Northern Hemisphere. Then she added: "...and 'Sticks and Stones' by Jamie T." I checked again. Nothing. Not on KaraFun, not on Smule, not on Sunfly, not even in the sketchier corners of YouTube where you can usually find a half-decent fan-made instrumental.

This is the problem this guide solves: what do you do when the song you actually want to sing is not in any karaoke catalog? For most of karaoke's history the answer was "pick a different song." That answer changed about three years ago, and most people haven't caught up yet.

Why catalogs miss songs

There are three reasons a song you love won't be in any commercial karaoke library.

The first is licensing. Karaoke companies clear songs the same way streaming services do, and the math doesn't favor B-sides, indie tracks, or anything from a label that has changed hands twice. If the rights are tangled, the song doesn't get re-recorded as an instrumental, and that's that.

The second is recency. Catalogs lag new releases by anywhere from three months to never. Songs that defined a cultural moment can sit unlicensed for a year, by which point the moment has moved on.

The third is geography. Most catalogs skew heavily toward English-language pop, with a long tail of country, classic rock, and showtunes. If you want to sing Hikaru Utada, Stromae, or Jamiroquai deep cuts, you're going to fight for it.

For decades the workaround was bad. Now it isn't.

The old methods (briefly, so you can skip them)

Before about 2019, removing vocals from a finished song mix was a trick, not a technology. The most common method, center-channel cancellation, exploited the fact that lead vocals are usually mixed dead center while instruments are panned across the stereo field. You'd invert the polarity of one channel and sum the two to mono, and any sound that sat equally in both channels would cancel out.

It worked, kind of. Because the kick drum and bass are usually mixed dead center too, it also gutted the kick, ate the bass, and left the song sounding like it was being played through a phone in another room. Some old karaoke websites still distribute tracks made this way. They are not good.
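
If you're curious why the trick was so blunt, here is a minimal sketch in plain Python. It is a toy, not a tool: the function name cancel_center and the four-sample "song" are made up for illustration, and real audio would be tens of thousands of samples per second read from a file.

```python
def cancel_center(left, right):
    """Return a mono signal with center-panned content cancelled.

    left, right: equal-length lists of float samples in [-1.0, 1.0].
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    # Inverting the right channel and summing is the same as left - right.
    # Halve the result so it stays within [-1.0, 1.0].
    return [(l - r) / 2.0 for l, r in zip(left, right)]


# Toy mix (hypothetical values): a vocal panned dead center, a guitar
# panned hard left, and a bass that is also centered.
vocal = [0.5, -0.5, 0.25, 0.0]
guitar = [0.25, 0.25, -0.25, 0.5]
bass = [0.125, 0.125, -0.125, -0.125]

left = [v + g + b for v, g, b in zip(vocal, guitar, bass)]
right = [v + b for v, b in zip(vocal, bass)]

karaoke = cancel_center(left, right)
print(karaoke)
```

What comes out is the guitar at half volume and nothing else. The vocal is gone, but so is the bass, because the subtraction can't tell one centered sound from another.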

The other old approach was buying "karaoke version" recordings — note-for-note re-records by session musicians. The legitimate ones from companies like Sunfly were fine but expensive. The illegitimate ones, sold for a dollar each on dodgy CD-ROMs at flea markets in 2003, were not.

Skip both methods. The technology has moved.

What changed: AI source separation

In 2019, a research team at Deezer released Spleeter, an open-source machine-learning model that could separate a finished song mix back into its component stems — vocals, drums, bass, and "other" — using a neural network trained on thousands of paired examples. The output was startlingly clean compared to anything the old cancellation tricks had ever produced.

Spleeter was the watershed. The models that came after it (Demucs from Meta's research lab, the MDX-Net family, the various entrants in the Music Demixing Challenge) kept improving. By 2023, the gap between a properly separated AI instrumental and the original studio multitrack was small enough that most listeners couldn't tell. By 2025 it was small enough that I, sitting at a karaoke machine, couldn't tell.

This is the engine under every modern karaoke-maker tool, free or paid. Knowing that is useful, because it means the differences between competing tools are mostly about packaging — speed, format support, interface — rather than the underlying audio quality, which has more or less converged.

Three ways to actually do this

There are three roads. Pick based on how much friction you can tolerate.

1. Run the models locally

The maximalist option. Install Python, set up a virtual environment, install Demucs, download model weights, and run a command like demucs --two-stems vocals song.mp3.
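
If you'd rather script that step than retype it, a thin Python wrapper is enough. Treat this as a sketch under assumptions: it presumes Demucs is already installed in the same Python environment, and the helper names build_demucs_command and make_instrumental are mine, not part of Demucs. The --two-stems and -o options are Demucs command-line flags.

```python
import subprocess
import sys


def build_demucs_command(song, out_dir="separated"):
    """Build the argument list for a two-stem Demucs run (hypothetical helper)."""
    return [
        sys.executable, "-m", "demucs",  # run Demucs with the current interpreter
        "--two-stems", "vocals",         # one vocal stem, one everything-else stem
        "-o", str(out_dir),              # folder to write the results into
        str(song),
    ]


def make_instrumental(song, out_dir="separated"):
    """Run Demucs on one file; raises CalledProcessError if Demucs fails."""
    subprocess.run(build_demucs_command(song, out_dir), check=True)


# Building the command does not run anything, so it is safe to inspect first.
command = build_demucs_command("song.mp3")
print(" ".join(command[1:]))
```

Calling make_instrumental("song.mp3") runs the separation. Demucs normally writes its results to a subfolder of the output directory named after the model and the track, with the instrumental saved as no_vocals.wav; check your own install if the layout differs.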

Pros: free, top-quality output, you keep all your stems, no upload limits, and your audio never leaves your machine.

Cons: it requires you to be the kind of person who is comfortable typing commands into a terminal at midnight. Demucs also benefits enormously from a halfway-decent GPU; on a CPU-only laptop, a four-minute song takes around ten minutes to process.

If that paragraph made your eyes glaze over, skip to option three.

2. Desktop apps with built-in models

The middle road. Tools like Ultimate Vocal Remover (UVR), RipX, and a handful of others wrap the same underlying models in a regular desktop application — drag an audio file in, click a button, get stems out. UVR in particular is free, open-source, and excellent.

Pros: no terminal, full quality, runs locally, your files don't leave your computer.

Cons: still a download-and-install process, still benefits from a GPU, and the UI is the kind of UI that was designed by audio engineers rather than people who design UIs. Functional but not friendly.

3. Browser-based, one-click services

The road I'm on now. You upload a file, the service runs the same kind of model on its servers, and you download the result a minute or two later. It's the path of least resistance for anyone who doesn't want their hobby to become a sysadmin job.

For this, I currently use stemsplit's karaoke maker. I landed on it after trying half a dozen alternatives because it does exactly the thing I want a tool like this to do — drop in an MP3, get an instrumental out — without the things I don't want, like account walls in front of the export button, watermarked previews, or a monthly subscription nudge that turns a five-minute task into a fifteen-minute negotiation. The output quality is on par with what I get from Demucs running locally, which is the only thing I actually care about. Your mileage may vary; the field moves fast and the tool I'm recommending today might not be the best one in eighteen months. Try it, try a couple of others, keep what works.

A note on privacy

The honest tradeoff with any web-based separator is that your audio gets uploaded to someone else's server for processing. If that bothers you — and there are reasonable reasons it might — go to option two. UVR running on your laptop never sends a byte anywhere.

Tips for getting the best result

A few things that matter regardless of which tool you pick.

Start with the highest-quality source you can. A 320 kbps MP3 separates better than a 128 kbps YouTube rip. A FLAC separates better still. The model can only remove what the model can hear; if the source is fuzzy, the instrumental will be fuzzy. Buying the song on Bandcamp and feeding the FLAC to the separator is, no joke, often worth the seven dollars.

Some songs separate better than others. Music with a clear lead vocal sitting in front of distinct instrumentation — most pop, rock, R&B, hip-hop — comes out cleanly. Music where vocals are buried, doubled, drenched in reverb, or layered with backing harmonies is harder. A lot of indie rock, most choral work, anything by My Bloody Valentine: expect artifacts. You'll know within ten seconds of listening to the result.

Save the stems, not just the instrumental. Most tools give you the option to download all four separated layers (vocals, drums, bass, and "other") instead of just the karaoke mix. Take them. Once you have stems, you can build a custom backing track in any audio editor: mute the drums for an unplugged feel, pull the bass down, blend a little of the original vocal back in as a guide, shift the key. (Keeping the backing harmonies while removing only the lead is a different job, since a four-stem split puts every voice in the same vocals stem.) It's free flexibility you'll be glad to have later.
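
Mixing stems back together is just addition with a volume knob on each layer. Here's a rough sketch in plain Python, assuming each stem has already been decoded into a list of float samples of equal length. The function name mix_stems and the two-sample stems are invented for illustration; an audio editor does the same thing with faders.

```python
def mix_stems(stems, gains):
    """Sum stems sample by sample, scaling each one by its gain.

    stems: dict of stem name -> list of float samples in [-1.0, 1.0]
    gains: dict of stem name -> linear gain (0.0 mutes, 1.0 leaves as-is);
           stems missing from `gains` default to 1.0.
    """
    lengths = {len(samples) for samples in stems.values()}
    if len(lengths) != 1:
        raise ValueError("all stems must have the same length")
    mix = [0.0] * lengths.pop()
    for name, samples in stems.items():
        gain = gains.get(name, 1.0)
        for i, sample in enumerate(samples):
            mix[i] += gain * sample
    # Hard-limit so the summed signal cannot exceed full scale.
    return [max(-1.0, min(1.0, s)) for s in mix]


# Hypothetical two-sample stems, just to show the arithmetic.
stems = {
    "vocals": [0.5, 0.5],
    "drums": [0.25, -0.25],
    "bass": [0.125, 0.125],
    "other": [0.25, 0.0],
}

# Mute the vocal and halve the drums; bass and "other" stay untouched.
backing = mix_stems(stems, {"vocals": 0.0, "drums": 0.5})
print(backing)
```

Set the vocal gain to something small instead of zero and you get a faint guide vocal to sing along with.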

Listen on the speakers you'll perform on. An instrumental that sounds clean in headphones can sound thin through a karaoke speaker. If you're building a track for a specific room, A/B it on whatever rig you'll actually use. (More on choosing that rig in my home karaoke setup guide.)

Adding the lyrics layer

An instrumental is not yet a karaoke track. A karaoke track is an instrumental plus on-screen lyrics, ideally synced to the music.

You have two options. The lazy one is to pull up the song's lyrics on your phone in another window and read along; this works fine for living-room singalongs and is what I do nine nights out of ten. The proper one is to load your instrumental, along with an .lrc lyrics file (a plain-text format with timestamps next to each line), into a karaoke player app that supports custom tracks.
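
To make the .lrc format concrete, here is a small Python sketch that reads the common [mm:ss.xx] timestamp style and works out which line should be on screen at a given moment. The function names parse_lrc and line_at and the placeholder lyrics are my own; real files sometimes carry extras (offsets, word-level timing) that this deliberately ignores.

```python
import re

# A timestamp tag such as [01:23.45]; the fractional part is optional.
TIMESTAMP = re.compile(r"\[(\d+):(\d{2})(?:[.:](\d{1,3}))?\]")


def parse_lrc(text):
    """Return a list of (seconds, lyric) tuples sorted by time."""
    cues = []
    for line in text.splitlines():
        tags = list(TIMESTAMP.finditer(line))
        if not tags:
            continue  # metadata such as [ti:...] or a blank line
        # Assumes all timestamps sit at the start of the line.
        lyric = line[tags[-1].end():].strip()
        for tag in tags:
            minutes, seconds, fraction = tag.groups()
            t = float(int(minutes) * 60 + int(seconds))
            if fraction:
                t += int(fraction) / 10 ** len(fraction)
            cues.append((t, lyric))
    return sorted(cues)


def line_at(cues, position):
    """Return the lyric that should be showing at `position` seconds."""
    current = ""
    for t, lyric in cues:
        if t > position:
            break
        current = lyric
    return current


sample = """[ti:Example Song]
[00:12.50]First line of the verse
[00:17.00]Second line of the verse
[01:05.25]First line of the chorus
"""

cues = parse_lrc(sample)
print(line_at(cues, 15.0))  # First line of the verse
```

Syncing a file by hand is the same idea in reverse: play the song, note the time each line starts, and write it in front of the line.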

Lyric sites like Megalobiz host community-made .lrc files for most popular songs; for obscure tracks you can sync them yourself in fifteen minutes with any free .lrc editor. For more on which apps actually accept custom tracks — and which ones gatekeep that feature behind a paywall — see my karaoke apps comparison.

The bigger picture

We are living through a small, weird, wonderful moment in karaoke history. For the first time, the song you want to sing and the song the karaoke machine has are no longer two separate questions. If you can find a clean copy of a recording, you can have a karaoke version of it by the time you've finished pouring a drink.

That changes what's possible at a karaoke night. You can build a custom playlist for a friend's wedding. You can rescue the moment when somebody shouts a request for a song nobody knew the room needed. You can finally tackle that obscure deep cut you've been workshopping in the shower for years.

If you're putting together a setup at home, the next thing to read is the home karaoke setup guide — it covers the rest of the chain: the microphone, the speakers, the mixer, the screen, and how to glue them all together without spending audiophile money. And if you're stuck on what to actually sing once you've got the rig running, start with this list of forgiving first-timer songs.

Hana, by the way, got her Jamie T. The party went long.

✦ ✦ ✦