Utau Problems (Original Voicebanks not working)

LadyPunk1313

Momo's Minion
A few years ago I made my first UTAU, Rave Ryde. Despite not having any friends in the fandom (Vocaloid and UTAU alike), I followed various advice from videos, forums, etc. Long story short, he sucked ass and I decided to re-record him. After fully re-recording him in romaji, I decided to download some USTs and plug and play to see what he sounded like in a full song. The overall result was not bad. If you want to listen, here are his high and low demos: https://soundcloud.com/ladypunk1313/aaaaaaaaaaa-rave-ryde-act-2-low-demo https://soundcloud.com/ladypunk1313/aaaaaaaaaaa-rave-ryde-act-2-high-demo
After a while of unsuccessful attempts to get him to sing a cover, I was dissatisfied with the quality of his voicebank and scrapped him once again. About a year or two later I got my first mic and decided to re-record him yet again with a real mic instead of the computer's mic. Things went better, and instead of using romaji I decided to record him in kanji, since most of the USTs were in kanji instead of romaji. I was finally satisfied with him, but something strange happened: Rave would only sing the AAAAAA UST (which is the one in the demos posted above). Other attempts to get him to work (creating a UST, using another UST, etc.) resulted in a glitchy, choppy mess that I can't describe any better than that, so just hear it for yourself: https://soundcloud.com/ladypunk1313/how-does-one-utau-part-2

The same thing happened when I recorded my second UTAU, Rave's little sister, Kandie Kane, which you can also hear here: https://soundcloud.com/ladypunk1313/how-does-one-utau

I honestly have no idea what is wrong with them and am terribly frustrated at this point. If anyone knows, or even suspects they might know, what's wrong, I'd really appreciate any and all advice/theories.
Thank you
 

수연 <Suyeon>

Your friendly neighborhood koreaboo trash
Supporter
Defender of Defoko
LadyPunk1313 said:
Um what is otoing? There wasn't anything like that in what I saw in the tutorials.
dilleniidae said:
Did you oto the voicebank? You might want to oto/re-oto it.

Otos are what every utau needs to sing correctly and to be able to work with romaji and hiragana.
You access it via: Tools -> Voice Bank Settings (or simply Ctrl + G)

Otoing involves the following parameters/settings:

oto_screenshot_1.png


Alias: An alternative name for a sample, so that it can be played from either romaji or hiragana lyrics. It's simply best to have both formats (banks made in Japan often only have Japanese aliases [like Defoko's bank above], so you'll have to make the romaji aliases yourself).

Offset: First blue area. Where the sound starts. It's best practice to leave some silence before the sound starts to emulate the natural pause that occurs in VCV voicebanks. It also helps prevent consonant drops. 20 to 60 milliseconds of silence is recommended.

Preutterance: The red line. This goes where the consonant (k, g, p, b, h, f, s, t, ts, v, n, m, r, ch, sh, y, w) ends and the vowel begins. The exceptions to this are consonant glides (ky, gy, py, by, hy, ny, my, ry). In the case of consonant glides, y is not to be treated as a consonant (same goes for kw, gw, pw, etc. if the bank includes Korean sounds).

Consonant: Pink area. This goes over the consonant/consonant glide and some of the vowel. You generally want it to cover about 1/4 of the sound, with the rest left white. This tells the program what not to stretch, so you won't get sounds drawn out in unnatural ways (like really long s sounds that come across as a speech impediment).

Overlap: Green line. This parameter is the hardest to explain. Anything before it is inaudible and anything after it fades in. It is mostly used for VCV and CVVC banks, but if your vowel samples (a, i, u, e, o, n) are long enough (2 - 4 seconds), it can also be used to make smooth transitions between vowels. (I would keep one set of vowels without overlap and one set with overlap, to prevent timing issues with vowels that come after a rest.) You generally do not use it on consonant-vowel samples; the only exceptions are long, drawn-out consonants (sh, s, f, and sometimes n or m if you held them out long while recording). Even for long consonants, it should not be more than 1/2 the consonant length. It never goes after the preutterance (red line). You will often see banks like Defoko and Teto putting it after the preutterance, and even after the consonant, but that is generally not good practice and can make your bank show errors and sound robotic or full of static. Defoko/uta herself is an example of this poor otoing practice (and poor otoing in general), which exacerbates her already artificial vocals.

Cut off: Second blue area. This goes just before the sound fades out/ends.
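
If you'd rather sanity-check these values outside the editor, here is a rough Python sketch. It assumes the settings are saved in the voicebank's oto.ini with one line per sample, laid out as filename=alias,offset,consonant,cutoff,preutterance,overlap, with consonant, preutterance and overlap measured in milliseconds from the offset. The names OtoEntry, parse_oto_line and check_entry are made up for this example and are not part of UTAU. It only checks two of the rules of thumb above (overlap not after the preutterance, preutterance inside the pink consonant area), so treat it as a starting point, not a full validator.

```python
from dataclasses import dataclass


@dataclass
class OtoEntry:
    """One oto line (hypothetical helper type, not part of UTAU)."""
    filename: str
    alias: str
    offset: float
    consonant: float
    cutoff: float
    preutterance: float
    overlap: float


def parse_oto_line(line: str) -> OtoEntry:
    """Parse 'file.wav=alias,offset,consonant,cutoff,preutterance,overlap'."""
    filename, _, params = line.strip().partition("=")
    fields = params.split(",")
    if len(fields) != 6:
        raise ValueError(f"expected 6 comma-separated fields, got {len(fields)}")
    alias = fields[0]
    # Treat blank numeric fields as 0 (an assumption made for this sketch).
    numbers = [float(f) if f.strip() else 0.0 for f in fields[1:]]
    return OtoEntry(filename, alias, *numbers)


def check_entry(entry: OtoEntry) -> list:
    """Return warnings based on the rules of thumb described in this post."""
    warnings = []
    if entry.overlap > entry.preutterance:
        warnings.append(f"{entry.filename}: overlap is after the preutterance")
    if entry.preutterance > entry.consonant:
        warnings.append(f"{entry.filename}: preutterance is outside the consonant area")
    return warnings


if __name__ == "__main__":
    # Made-up example values, not taken from any real voicebank.
    for raw in ("ka.wav=か,50,120,-400,90,30", "sa.wav=さ,50,120,-400,90,150"):
        entry = parse_oto_line(raw)
        print(entry.alias, check_entry(entry) or "looks OK")
```

To run it over a whole bank you would read oto.ini line by line and pass each line to parse_oto_line. Japanese banks are commonly saved as Shift-JIS, so you may need to open the file with that encoding.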
 

Kai_Kimura

Teto's Territory
USTs are almost never in kanji; they're in hiragana, so maybe that's the problem. As for voice quality, don't record in stereo. UTAU hates stereo and makes your voice sound like you recorded on a noisy train. One more thing: the mic doesn't matter too much. I use a High School Musical mic for PS2 that has a USB cable. I bought it for Rock Band, lol, and it hasn't failed me yet.
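
Both of those points can be checked with a few lines of Python, if that helps. This is only a sketch using the standard wave module: is_mono and is_kana are names invented for this example, and the kana check simply tests whether every character falls in the Unicode hiragana/katakana blocks.

```python
import wave


def is_mono(path: str) -> bool:
    """Return True if the WAV file at `path` has exactly one channel."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels() == 1


def is_kana(lyric: str) -> bool:
    """Return True if every character is hiragana or katakana (U+3040 to U+30FF)."""
    return bool(lyric) and all("\u3040" <= ch <= "\u30ff" for ch in lyric)


if __name__ == "__main__":
    # Lyrics as they might appear in a UST: kana passes, kanji and romaji do not.
    for lyric in ("か", "歌", "ka"):
        print(lyric, is_kana(lyric))
```

You could call is_mono on every .wav in the voicebank folder and re-export any stereo samples as mono in your audio editor.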
 
