What does a Japanese Vocaloid's singing sound like to native Japanese speakers?

m170

Ritsu's Renegades
Defender of Defoko
I don't understand Japanese, so I've always wondered.
Does a Japanese Vocaloid's singing actually sound like accurate Japanese? Or does it kinda sound distorted, like how a lot of English voice banks do to English speakers like me?
And even though I put this under Vocaloid, does the same go for a decent-quality UTAU like Yamine Renri or Namine Ritsu, etc.?
If not, is it possible that a UTAU like Namine Ritsu can pronounce Japanese better than a Vocaloid like Gumi, or vice versa?
 

Arissa

Ritsu's Renegades
Defender of Defoko
Tbh I think something like this can be detected even by a non-native Japanese speaker. Listen to how the syllables/words are being sung/spoken, and whether they sound intelligible when a Vocaloid sings them. UTAU and Vocaloid produce different sounds, and some UTAU are more realistic than others and even sound more realistic/human than Vocaloids, so I'm sure many can produce clear vocals that make it easier for listeners to understand/make out whatever it is they're saying.

Like with English Vocaloids, I'd suspect that a Japanese Vocaloid's, well, Japanese, would be decipherable, but have that slurred sound (hard to describe; the sound Vocaloid outputs in general) with decent/no tuning done. Not everything will sound entirely clear, which also helps make sense of the fact that just about all Vocaloid songs have subtitles. With really spectacular tuning, however, a Vocaloid can sound more human-like and may therefore be more understandable.
 

수연 <Suyeon>

Your friendly neighborhood koreaboo trash
Supporter
Defender of Defoko
I imagine that a Japanese speaker finds Vocaloids uncanny, like an English speaker would with libraries aimed at their language: the pronunciation is often either too perfect and precise or just a little "off" from how an actual human vocalist would approach a word/phrase. Sometimes the "off" sound is due to a dialect/accent that one's ear may not be used to; other times, it's the tuning or lack thereof. With older (V1-V3) Vocaloids, the "off" sound is also due to missing phonetic data and/or lack of editing: no schwa, no tap, every consonant being pronounced rather than using glottal endings.

The main difference that makes UTAU sound a little more human is that it - usually - captures variations in recordings better. With Vocaloid, you don't hear much difference between a normal library and its power variation because the consonants are consistent, whereas with UTAU, you can have a library that has normal recordings as well as aspirated recordings. To give a Vocaloid example: UNI - being a Korean library - has ㄱ (g/k), ㄲ (kk), ㅋ (k). The difference between all three is that g/k is practically unvoiced (a non-aspirated g like 'good'), kk is a non-aspirated k like in "skip", and k is aspirated like in "kite." Only English libraries have this level of differences in consonants. The closest thing Japanese libraries have to a nonstandard pronunciation is Sachiko, and that's due to specialization.
 

chunter

Ruko's Ruffians
Defender of Defoko
As in any other language, it depends on how much work you put into making the crossfades and syllables sound realistic. I otherwise have not gotten a straight answer to this question, mostly because I don't know how to ask it correctly.
 

Nohkara

Pronouns: He/him
Supporter
Defender of Defoko
I'm only half Japanese, but I can understand most Japanese Vocaloid songs without lyrics, as long as there's no weird "tuning" of consonants where it shouldn't be: say, an extra "n" syllable everywhere (*cough* Kyaami's tuning tuto), an "s" made extra long for no good reason, or a stop on "k/t/p" where there shouldn't be one.

In Japanese there's a difference between "kite/きて" and "kitte/きって": the little "tsu" makes a noticeable small pause between "ki" and "te", unlike in "kite", which is just continuous. Many overseas users who don't know any Japanese often put that little "tsu" in places where it should NOT be, and it sounds WEIRD and UNNATURAL to me. Sometimes people do this for "an accent", but for me it's just... annoying. (IDK what I should compare this to, but I guess it's as annoying as someone who cannot say "r" in English.)

And then, oh boy, there are some Vocaloids that were recorded(?) in such a funky way that they're quite difficult to understand most of the time (without hardcore tuning/mixing): Arsloid, Yumemi Nemu and Gachapoid Ryuto. I have no idea why, but Nemu's voice sounds like her VP was far away(?) from the microphone or tried too hard to "accent" her voicebank. Arsloid, duh, I'm still disappointed in him; too many quality issues. Gachapoid, eh, do I need to explain this? His voice is extra froggy and not human-like at all.

I do understand Sachiko most of the time, but because she's genre-specific, her consonants are recorded more strongly than in a normal Vocaloid. In fast songs her k/t/p sounds very unnatural because they're pronounced so... hard. If she had a "normal k/t/p" sound variation, I bet she would be a much easier Vocaloid to use and could sing more genres.

Then there are a few Vocaloids with a Japanese VB and a non-native VP: Yohioloid, SeeU and Luo Tianyi (coming this year).

I'm being super honest here, but Yohioloid sounds the most "native-like" of all 3. Maybe he has a mild Swedish accent, but it's relatively SMALL; I can hear that his VP definitely KNOWS Japanese and has worked on it.

I do not hate SeeU, but her Japanese sounds like her Korean; I cannot spot a difference (please don't take offense, I really cannot hear a difference). Like... I have heard Korean UTAU users pronounce Japanese with a better accent than SeeU's Japanese. SeeU's Japanese sounds like "just started Japanese" level to my ears.

Luo Tianyi Japanese doesn't have many demos yet, but from what I have heard, her Japanese is not bad for a Chinese speaker (Chinese people struggle a lot with Japanese r, u, ts, ch and sometimes b/g/d too, from what I have heard from Chinese UTAUs). She still has a strong-ish "Chinese melody" in her voice that makes me need to focus more to understand her, but her pronunciation is better than most Chinese UTAU with Japanese VBs, in my opinion (this does not mean that I dislike Chinese UTAU; I'm sure most users are doing their best, but it's sad that only a pinch of them are actually very understandable without lyrics).
 

수연 <Suyeon>

Your friendly neighborhood koreaboo trash
Supporter
Defender of Defoko

You could probably compare SeeU's Japanese to Miku's English where they had to learn as they went. I honestly wouldn't be surprised if Kim Dahee's level of Japanese was/is beginner level. Korean groups generally get prepped for international promotions by being taught Japanese and English (best example is probably BoA - iirc, Japanese speakers couldn't tell that she wasn't native and her English album wasn't really accented), but GLAM more than likely didn't receive that kind of investment. They were axed with only 3 or 4 promotional singles to their name, so... more of a one off group than anything.
 

WinterdrivE

Ritsu's Renegades
Defender of Defoko
I'm convinced they didn't actually record any new samples for her JP VB, but just recycled the nearest KR sample. (eg, か=카) SeeU's Japanese accent is literally straight up Korean.
 

Nohkara

Pronouns: He/him
Supporter
Defender of Defoko
No wonder then, wow, that's super lazy. At least for Yohioloid and Luo, they made separate Japanese recordings, which I'm glad about (Chinese sounds are so different from Japanese that they had to make new recordings anyway. Like... Chinese cannot have an [ e ] or [ n ] sound alone, no ki/gi, no y-glides etc., while Korean technically has all the basic sounds needed for Japanese plus an extra amount of y-glide sounds, so...)
 

WinterdrivE

Ritsu's Renegades
Defender of Defoko
I don't know that they definitely did that, but it certainly sounds like it. Or at best they just had Dahee read off a Japanese reclist written in Korean. Like, I can hear SeeU pronouncing ん as 응.
 

수연 <Suyeon>

Your friendly neighborhood koreaboo trash
Supporter
Defender of Defoko
Shortcuts wouldn't surprise me - honestly speaking - given how ambitious their goals were at the time... Bilinguals - especially the first time through - always sound awkward af or are done lazily to meet deadlines (Yohioloid may be the 1 exception). That's not unique to SeeU.
- Luka English V2 had a lot of missing data (at the time, it was more like a gimmick and made for Japanese users only). V4 had more data, but it was poorly programmed, even for power vocal standards (power vocaloids always have some degree of choppiness, but Luka V4 took the cake)
- Macne Nana... the less said the better. I could use a Japanese library and not tell the difference. That's how accented she is.
- Miku Eng: it's gotten better with V4, but still awkward due to accent issues (r/l/w).
- Dare I mention the bad experiment that was Sonika - and that was a native vocal. Sounds like they tried to experiment with "How cheaply can we make a vocaloid, using a singer that doesn't have access to a studio?" She's not a bilingual, but I do put her in the "ambitious for the time" category. Non-studio vocals have gotten better over time.
 

pl2vio

Momo's Minion
I'm not a native Japanese speaker, but I have been studying Japanese formally for almost 4 years now, and in my opinion it really depends on how the song is tuned. For most songs that are sung at a normal speed and tuned nicely, I can pick up on the words they are singing without having to look at the lyrics. Of course, there are also songs with great tuning that are still hard to make out: the vocals blend in so much that I can't tell what they're saying except for a few clearly spoken words.
 

ElectricSunset1217

Momo's Minion
I love Miku's English as a native speaker myself (both versions for different reasons- v4 was my first but boy was i happy when my parents helped me buy v3).

I grew to appreciate it more as I familiarized myself with more advanced Japanese pronunciation (like the way the mouth is held and stuff), which let me detect elements of Japanese pronunciation in her accent, and as I saw Japanese people try to speak English. And I realize Saki Fujita must have worked really hard to learn for the project. She had to learn a lot of sounds that were foreign to her and how to properly pronounce them. Miku exaggerates some sounds and pronounces them awkwardly because Saki was unfamiliar with them, so Miku is too. She was doing the best she could despite not really knowing English. (I also just think her accent is cute tbh lol, and I'm just being a know-it-all)

I think Macne Nana's point was to capture the voicer's natural accent; as a result, however, phoneme editing with her is...certainly an experience. Miku does have some trouble pronouncing Rs but not quite to Nana's extent and it sounds more like an accent thing and less like a speech impediment...

But yeah, as an L2 Japanese speaker, SeeU Japanese just sounds entirely phonetically Korean, those Rs especially. I dunno if a single JP-EN voicebank reaches "phonetically Japanese" territory other than maybe Nana.

Sonika's a darn shame, because her voice is really sweet and pleasant but her quality is so bad. I would totally get her if she wasn't potentially a total nightmare to mix.

I've tried Luka V2 English and yeah, she's really choppy, isn't she? Her voice is sooo pleasant though. I wish she was easier to use. If I were to choose between the two I'd get V4 to save some pain; in her case I like her but I dislike the Straight/Soft situation (Miku V3 and V4 English are a generation apart but they make a MUCH better XSY pair).

As for my thoughts on how Japanese Vocaloids sound, yeah, they have pretty odd accents for the most part. But hey, that's part of their charm. (There are some pretty naturally-accented Japanese Vocaloids out there though.)

I am aware of the difference between single and double consonants, but we Western producers generally lower the velocity of consonants for extra impact, like the singer is almost "drawing back" to pronounce the next syllable, like their tongue is a bow and the syllable is an arrow...or something. (I think I've seen Japanese producers do the long consonant thing as well. I wonder if they're mimicking Western producers?) I suppose I could stop doing it but I won't always know how to add more impact to a syllable otherwise.

But something I love about Japanese Vocaloids is how little phoneme editing you have to do aside from velocity edits, silent vowels and stuff to "decorate" the performance (e.g. end breaths and the aforementioned "drawing back", and in Crypton V4X's case EVEC as well if wanted).

Also, since then, there has been Xin Hua Japanese, and I think her accent is more noticeable than Tianyi's. I've heard WIL is voiced by a Filipino as well. I dunno how strong his accent is, but I think I hear a bit of Filipino in his voice.
 
