See here for Reclist downloads, Base OTOs, Tutorials, Tools, & Demos
Introduction
Salem-Style English CVVC, or Salem CVVC / S-CVVC for short, is not a single reclist, but rather a method and style guide for English DIY vocal synths. It's built for use in UTAU and OpenUTAU, but could reasonably be used in other software.
It's really nothing revolutionary; just a careful construction of voicebank techniques with thoughtful application of English phonology. I noticed a lot of room for improvement in existing English reclists, so sometime in 2017 I decided to put my linguistics background and organization skills to the test and attempt to create a better one.
I wanted to develop a reclist, which later evolved into a broader method, that could create English voicebanks of consistently good quality without an excess of unnecessary or redundant samples. After a lot of experimental voicebanks and consulting with other vocal synth users, I arrived at a few core design principles:
- Flexibility — ensuring samples included are ones that users can get the most mileage out of.
- Efficiency — reclists are organized to minimize tediousness and redundancy.
- Frequency — prioritizing included phones and samples based on how likely they are to be used.
- Naturalness — sampling from phrases whenever possible, organizing samples by their transition point, and accounting for relevant allophones.
- Consistency — creating a standardized core reclist and a style guide for any additional samples included.
- Ease-of-Use — taking measures to make S-CVVC voicebanks straightforward to create and use.
- Time & Effort — keeping the size of voicebanks relatively manageable in terms of both string count and oto length.
Demonstrations
Overview
The core reclist consists of three sections: CVVC strings, vowel strings, and consonant strings. Different section types have different prefixes so that they won't get jumbled together in the voicebank files, but these can be removed. All strings have a max length of 8 syllables, and are compatible with OREMO and my guideBGMs.
There are two prewritten lists at time of writing: the LITE list, which contains the smallest number of samples I felt still achieved good quality results, and the FULL list, which contains a number of additional samples for more natural pronunciation (optimized for American English but still compatible with other dialects, further discussed here).
For comparison with the tables below, a Japanese VCV is usually around 150 strings and 950 oto lines. The core LITE list is about the same size as an ARPAsing reclist (about 220~250 strings), but with more distinct samples and much less redundancy than the default list. Both lists are smaller than VCCV (1066 strings and 3429 oto lines), though the FULL list with the CVC add-on has a larger oto.
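One rough way to read these figures is as oto lines per recorded string. The sketch below only restates the numbers quoted above. The "density" metric and the names `REFERENCE_LISTS` and `oto_lines_per_string` are illustrative assumptions, not part of the S-CVVC method, and the VCV figures are approximate.

```python
# Figures quoted in the paragraph above; the Japanese VCV numbers are approximate.
REFERENCE_LISTS = {
    "Japanese VCV": {"strings": 150, "oto_lines": 950},
    "VCCV": {"strings": 1066, "oto_lines": 3429},
}


def oto_lines_per_string(strings: int, oto_lines: int) -> float:
    """Average number of oto entries that each recorded string yields."""
    return oto_lines / strings


if __name__ == "__main__":
    for name, stats in REFERENCE_LISTS.items():
        density = oto_lines_per_string(stats["strings"], stats["oto_lines"])
        print(f"{name}: {density:.1f} oto lines per string")
```

The same function can be applied to the string counts and oto lengths in the tables below to compare the LITE and FULL lists on the same footing.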
Time estimates do not include breaks or equipment setup/takedown.
I recommend taking breaks and/or splitting recording over multiple days for the longer lists.
LITE List Stats
List | String Count | Est. Time to Record | OTO Length | Max Pitches |
---|---|---|---|---|