Mako Nagone is said to be perfect for beginners, but it's a Kana-only voicebank. The Macne UTAU otos are said to be "lazy". Also, the sample names aren't standard and they're aliased only for Kana. Try to use overseas voicebanks with good otos or use a Kana conversion plugin on your USTs. Overseas voicebanks usually handle both Kana and Romaji. Try to find a voicebank that does not stress certain consonants.
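To give a rough idea of what a conversion plugin does to the lyrics, here's a minimal sketch that converts Romaji lyrics to Kana for a Kana-only voicebank. This is just an illustration, not how any particular plugin works. The table is deliberately tiny, and `ROMAJI_TO_KANA` and `romaji_to_kana` are names I made up. It assumes one syllable per note and VCV-style lyrics like "a ka", where only the last part gets converted.

```python
# Partial, illustrative table; a real plugin would cover every syllable.
ROMAJI_TO_KANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "sa": "さ", "shi": "し", "si": "し", "su": "す", "se": "せ", "so": "そ",
    "ta": "た", "chi": "ち", "ti": "ち", "tsu": "つ", "tu": "つ",
    "te": "て", "to": "と",
    "n": "ん",
}


def romaji_to_kana(lyric: str) -> str:
    """Convert a single note's lyric, keeping any VCV prefix such as 'a '."""
    # Split off everything before the last space ("a ka" -> "a", " ", "ka").
    head, sep, tail = lyric.rpartition(" ")
    # Unknown lyrics (rests, breaths, syllables missing from the table) pass through.
    return head + sep + ROMAJI_TO_KANA.get(tail.lower(), tail)


if __name__ == "__main__":
    for lyric in ["ka", "a ka", "- shi", "R"]:
        print(lyric, "->", romaji_to_kana(lyric))
```

Anything the table doesn't know is left alone, so rests and special aliases survive the conversion untouched.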
You might want to familiarize yourself with pitchbends and flags, too. I recommend zteer's Advanced Pitch Editor for editing pitchbends. UTAU has a built-in pitch preset manager called "A LA CARTE". While it's not as capable as the pitch plugin, it can be a good starting point. The flag list for moresampler is available on Kanru Hua's website. If you're willing to spend the extra time for better intonation, use envelope manipulation. zteer has an Advanced Envelope Editor available for download.
If you want good results really quickly, download presamp, moresampler and a VCV voicebank. Yowa Shion Normal VCV renders smoothly under presamp+moresampler (aside from a few glitches), but you can use Teto, Mako or any other VCV you want. Just don't forget to set presamp as both the wavtool and the resampler in UTAU, and then use predit to configure presamp to use moresampler as both wavtool and resampler. predit is a plugin which is bundled with presamp by default. presamp basically turns UTAU into VOCALOID, provided you're using it with a VCV/CVVC voicebank, and it can do Kana to Romaji conversion without you having to touch the UST.
Make sure you've set your decimal separator to a point before attempting to use presamp or any resampler.
This was created with presamp and moresampler and zteer's envelope and pitch plugins.
Also, USTing. Some USTs need to be edited in order to render good results. If you use presamp, you have to set preutters and overlaps to zero on every note. Also, clear the STPs and remove any global or local flags that come with the UST. There is a chance that the envelopes need resetting as well, as envelopes are used for much more than realistic-sounding consonants.
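If you'd rather not click through every note, the cleanup above can be sketched as a small script. Treat this as a rough sketch, not a tested tool: it assumes the usual UST text layout, with note sections like `[#0000]` and keys named `PreUtterance`, `VoiceOverlap`, `StartPoint` and `Flags`, and a `[#SETTING]` section that holds the global `Flags`. `clean_ust` is a name I made up. It doesn't touch envelopes.

```python
import re

# Assumed UST layout: notes are sections whose names are all digits, e.g. [#0000].
NOTE_HEADER = re.compile(r"^\[#\d+\]$")
ZERO_KEYS = ("PreUtterance", "VoiceOverlap")  # forced to 0 on every note
DROP_KEYS = ("StartPoint", "Flags")           # STP and local flags are removed


def clean_ust(text: str) -> str:
    """Return UST text with preutter/overlap zeroed and STP/flags cleared."""
    out = []
    in_note = False
    in_setting = False
    missing = set()  # zero keys not yet written for the current note

    for line in text.splitlines():
        if line.startswith("[#"):
            # Before leaving a note, add any zero keys it did not have.
            if in_note:
                out.extend(f"{key}=0" for key in sorted(missing))
            in_note = bool(NOTE_HEADER.match(line))
            in_setting = line == "[#SETTING]"
            missing = set(ZERO_KEYS) if in_note else set()
            out.append(line)
            continue

        key = line.split("=", 1)[0]
        if in_note and key in DROP_KEYS:
            continue  # clear STP and local flags
        if in_note and key in ZERO_KEYS:
            out.append(f"{key}=0")
            missing.discard(key)
            continue
        if in_setting and key == "Flags":
            continue  # remove global flags
        out.append(line)

    if in_note:
        out.extend(f"{key}=0" for key in sorted(missing))
    return "\n".join(out) + "\n"
```

The function works on text only, so reading and writing the file is up to you. USTs are commonly saved as Shift-JIS, so open the file with that encoding if the lyrics come out garbled, and keep a backup of the original UST.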