Ultimate RVC 💙
The type of source to retrieve a song from.
Select a song from the list of cached songs.
Select a model to use for voice conversion.
Whether to split the input voice track into smaller segments before converting it. This can improve output quality for longer voice tracks.
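One common way to do this kind of segmentation is to split on stretches of silence. The sketch below uses pydub as a stand-in; the app's actual segmentation strategy and thresholds are not specified here, and the file name is hypothetical:

```python
from pydub import AudioSegment
from pydub.silence import split_on_silence

# Load the input vocal track (file name is illustrative).
track = AudioSegment.from_file("vocals.wav")

# Split on stretches of silence; the thresholds are illustrative defaults.
segments = split_on_silence(
    track,
    min_silence_len=500,   # milliseconds of silence that trigger a split
    silence_thresh=-40,    # dBFS below which audio counts as silence
    keep_silence=100,      # milliseconds of padding kept around each segment
)
# Each segment would be converted on its own and the results concatenated.
```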
Whether to apply autotune to the converted voice.
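Autotune in this context typically means snapping the extracted pitch contour to the nearest semitone. A minimal sketch of that idea, assuming F0 values in Hz with unvoiced frames marked as 0:

```python
import numpy as np

def autotune_f0(f0: np.ndarray) -> np.ndarray:
    """Snap voiced F0 values (Hz) to the nearest equal-tempered semitone."""
    snapped = f0.copy()
    voiced = f0 > 0                               # 0 Hz marks unvoiced frames
    midi = 69 + 12 * np.log2(f0[voiced] / 440.0)  # Hz -> MIDI note number
    snapped[voiced] = 440.0 * 2.0 ** ((np.round(midi) - 69) / 12)  # back to Hz
    return snapped
```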
Whether to clean the converted voice using noise reduction algorithms.
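A sketch of such a clean-up pass using the noisereduce package (an assumption; the app may use a different denoiser, and the file names are hypothetical):

```python
import librosa
import noisereduce as nr
import soundfile as sf

audio, sr = librosa.load("converted_vocals.wav", sr=None)  # keep original rate
cleaned = nr.reduce_noise(y=audio, sr=sr)                  # spectral-gating denoise
sf.write("converted_vocals_clean.wav", cleaned, sr)
```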
The model to use for generating speaker embeddings.
Select a custom embedder model from the dropdown.
The sample rate of the mixed output track.
The audio format of the mixed output track.
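For reference, resampling and re-encoding a mixed track can look like the following, assuming librosa and soundfile; the 44.1 kHz target, FLAC format, and file names are illustrative:

```python
import librosa
import soundfile as sf

audio, sr = librosa.load("mixed_cover.wav", sr=None)
resampled = librosa.resample(audio, orig_sr=sr, target_sr=44100)
sf.write("mixed_cover.flac", resampled, 44100)  # format inferred from extension
```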
Show intermediate audio tracks produced during song cover generation.
The type of source to retrieve a song from.
Select a song from the list of cached songs.
Local directory where intermediate audio files are stored and loaded from. When a new song is retrieved, its directory is chosen by default.
The model to use for audio separation.
Select a model to use for voice conversion.
Local directory where intermediate audio files are stored and loaded from. When a new song is retrieved, its directory is chosen by default.
Whether to split the input voice track into smaller segments before converting it. This can improve output quality for longer voice tracks.
Whether to apply autotune to the converted voice.
Whether to clean the converted voice using noise reduction algorithms.
The model to use for generating speaker embeddings.
Select a custom embedder model from the dropdown.
Local directory where intermediate audio files are stored and loaded from. When a new song is retrieved, its directory is chosen by default.
The sample rate of the mixed output track.
The audio format of the mixed output track.
The type of source to generate speech from.
Select a voice to use for text-to-speech conversion.
Select a model to use for voice conversion.
Whether to split the input voice track into smaller segments before converting it. This can improve output quality for longer voice tracks.
Whether to apply autotune to the converted voice.
Whether to clean the converted voice using noise reduction algorithms.
The model to use for generating speaker embeddings.
Select a custom embedder model from the dropdown.
The sample rate of the mixed output track.
The audio format of the mixed output track.
Show intermediate audio tracks produced during speech generation.
The type of source to generate speech from.
Select a voice to use for text-to-speech conversion.
Select a model to use for voice conversion.
Whether to split the input voice track into smaller segments before converting it. This can improve output quality for longer voice tracks.
Whether to apply autotune to the converted voice.
Whether to clean the converted voice using noise reduction algorithms.
The model to use for generating speaker embeddings.
Select a custom embedder model from the dropdown.
The sample rate of the mixed output track.
The audio format of the mixed output track.
- Filter voice models by selecting one or more tags and/or providing a search query.
- Select a row in the table to autofill the name and URL for the given voice model in the form fields below.
Public models table:

| Name | Description | Tags | Credit | Added | URL |
| --- | --- | --- | --- | --- | --- |
| Compa - Hyperdimension Neptunia | Yuigahama Yui from Yahari Ore no Seishun Love Comedy wa Machigatteiru (250 Epochs) | Anime, Other Language, Real person | dacoolkid44 & hijack & Maki Ligon | 2023-07-31 | https://huggingface.co/zeerowiibu/WiibuRVCCollection/resolve/main/Compa%20(Choujigen%20Game%20Neptunia)%20(JPN)%20(RVC%20v2)%20(150%20Epochs).zip |
Select the pretrained model you want to download.
Select the sample rate for the pretrained model.
- Find the .pth file for a locally trained RVC model (e.g. in your local weights folder) and optionally also a corresponding .index file (e.g. in your logs/[name] folder)
- Upload the files directly or save them to a folder, then compress that folder and upload the resulting .zip file
- Enter a unique name for the uploaded model
- Click 'Upload'
- Find the config.json file and pytorch_model.bin file for a custom embedder model stored locally.
- Upload the files directly or save them to a folder, then compress that folder and upload the resulting .zip file (see the sketch after this list)
- Enter a unique name for the uploaded embedder model
- Click 'Upload'
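Both upload flows above accept a .zip of the model folder. A minimal way to produce one with the standard library; the folder path and archive name are hypothetical:

```python
import shutil

# Compress the folder holding config.json and pytorch_model.bin (or a .pth
# and .index pair) into my_model.zip for upload; paths are illustrative.
shutil.make_archive("my_model", "zip", root_dir="path/to/model_folder")
```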
Select the type of dataset to preprocess.
The path to an existing dataset. Either select a path to a previously created dataset or provide a path to an external dataset.
Name of the model to preprocess the given dataset for. Either select an existing model from the dropdown or provide the name of a new model.
Target sample rate for the audio files in the provided dataset.
Whether to remove low-frequency sounds from the audio files in the provided dataset by applying a high-pass Butterworth filter.
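A sketch of such a filter with SciPy; the 48 Hz cutoff and 5th-order design are illustrative defaults, not necessarily the app's exact parameters:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def highpass(audio: np.ndarray, sr: int, cutoff: float = 48.0, order: int = 5) -> np.ndarray:
    """Attenuate rumble below `cutoff` Hz with a Butterworth high-pass filter."""
    sos = butter(order, cutoff, btype="highpass", fs=sr, output="sos")
    return sosfiltfilt(sos, audio)
```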
Whether to clean the audio files in the provided dataset using noise reduction algorithms.
The method to use for splitting the audio files in the provided dataset. Use the 'Skip' method to skip splitting if the audio files are already split, the 'Simple' method if excessive silence has already been removed from the audio files, and the 'Automatic' method for automatic silence detection and splitting around it.
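To make the 'Simple' and 'Automatic' options concrete, here is a hedged sketch of both, using fixed-length slicing for the former and librosa's silence detection for the latter; the slice length, threshold, and file name are illustrative:

```python
import librosa

audio, sr = librosa.load("dataset_sample.wav", sr=None, mono=True)

# 'Simple'-style splitting: fixed-length slices (3 s here).
step = 3 * sr
simple_chunks = [audio[i : i + step] for i in range(0, len(audio), step)]

# 'Automatic'-style splitting: detect non-silent regions and slice around them.
intervals = librosa.effects.split(audio, top_db=40)
auto_chunks = [audio[start:end] for start, end in intervals]
```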
Name of the model with an associated preprocessed dataset to extract training features from. When a new dataset is preprocessed, its associated model is selected by default.
The method to use for extracting pitch features.
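RVC-based tools commonly offer pitch extractors such as RMVPE or CREPE; as a rough stand-in, librosa's pYIN illustrates what per-frame F0 extraction produces (file name and bounds are illustrative):

```python
import librosa

audio, sr = librosa.load("dataset_sample.wav", sr=16000)
f0, voiced_flag, voiced_prob = librosa.pyin(
    audio,
    fmin=librosa.note_to_hz("C2"),  # ~65 Hz lower bound for typical vocals
    fmax=librosa.note_to_hz("C7"),  # ~2093 Hz upper bound
    sr=sr,
)
# f0 holds one pitch estimate per frame (NaN where the frame is unvoiced).
```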
The model to use for generating speaker embeddings.
Select a custom embedder model from the dropdown.
The type of hardware acceleration to use. 'Automatic' will automatically select the first available GPU and fall back to CPU if no GPUs are available.
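The 'Automatic' option corresponds to the usual PyTorch device-selection pattern:

```python
import torch

def pick_device() -> torch.device:
    """First available GPU, with CPU as the fallback."""
    return torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
```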
Name of the model to train. When training features are extracted for a new model, its name is selected by default.
Whether to detect overtraining to prevent the voice model from learning the training data too well and losing the ability to generalize to new data.
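A minimal sketch of overtraining detection as early stopping; the loss values and patience window are made up, and the app's actual criterion may be more involved:

```python
validation_losses = [0.90, 0.71, 0.64, 0.60, 0.61, 0.62, 0.63]  # made-up values
best, patience, stale = float("inf"), 3, 0
for epoch, loss in enumerate(validation_losses, start=1):
    if loss < best:
        best, stale = loss, 0  # validation improved; reset the counter
    else:
        stale += 1             # no improvement this epoch
        if stale >= patience:
            print(f"No improvement for {patience} epochs; stopping at epoch {epoch}.")
            break
```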
The vocoder to use for audio synthesis during training. HiFi-GAN provides basic audio fidelity, while RefineGAN provides the highest audio fidelity.
The method to use for generating an index file for the trained voice model. 'KMeans' is particularly useful for large datasets.
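A sketch of why the KMeans option helps with large datasets, assuming 768-dimensional features and a flat faiss index; the dimensions, sizes, and file name are assumptions, not the app's exact values:

```python
import faiss
import numpy as np
from sklearn.cluster import MiniBatchKMeans

feats = np.random.rand(20_000, 768).astype(np.float32)  # placeholder features

# KMeans first compresses the feature set to its cluster centroids, so the
# index stays small and fast to search even when the dataset is large.
kmeans = MiniBatchKMeans(n_clusters=2_000, batch_size=4_096).fit(feats)
centroids = kmeans.cluster_centers_.astype(np.float32)

index = faiss.IndexFlatIP(centroids.shape[1])  # exact inner-product index
index.add(centroids)
faiss.write_index(index, "voice_model.index")
```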
The type of pretrained model to finetune the voice model on. 'None' will train the voice model from scratch, while 'Default' will use a pretrained model tailored to the specific voice model architecture. 'Custom' will use a custom pretrained model that you provide.
Select a custom pretrained model to finetune from the dropdown.
Whether to save a unique checkpoint at each save interval. If not enabled, only the latest checkpoint will be saved at each interval.
Whether to save unique voice model weights at each save interval. If not enabled, only the best voice model weights will be saved.
Whether to delete any existing training data associated with the voice model before training commences. Enable this setting only if you are training a new voice model from scratch or restarting training.
Whether to automatically upload the trained voice model so that it can be used for generation tasks within the Ultimate RVC app.
The type of hardware acceleration to use. 'Automatic' will automatically select the first available GPU and fall back to CPU if no GPUs are available.
Whether to preload all training data into GPU memory. This can improve training speed but requires a lot of VRAM.
Whether to reduce VRAM usage at the cost of slower training speed by enabling activation checkpointing. This is useful for GPUs with limited memory (e.g., <6GB VRAM) or when training with a batch size larger than what your GPU can normally accommodate.
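Both of these options map onto standard PyTorch patterns, sketched here with placeholder tensors and a toy module:

```python
import torch
from torch.utils.checkpoint import checkpoint

device = "cuda" if torch.cuda.is_available() else "cpu"

# Preloading: move the whole (hypothetical) feature set to device memory once,
# so each training step skips a host-to-device copy, at the cost of VRAM.
features = torch.randn(10_000, 768).to(device)

# Activation checkpointing: activations inside the wrapped block are
# recomputed during backward instead of stored, lowering peak VRAM usage.
block = torch.nn.Sequential(
    torch.nn.Linear(768, 768), torch.nn.ReLU(), torch.nn.Linear(768, 768)
).to(device)
x = torch.randn(8, 768, device=device, requires_grad=True)
out = checkpoint(block, x, use_reentrant=False)
out.sum().backward()
```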
The name of a configuration to load UI settings from.