Resamplers have to do pitch shifting, formant correction, and time stretching.
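To see why those three jobs come as a package, here's a rough sketch of what happens if you shift pitch by just playing a sample faster or slower. The function names are made up for illustration and don't come from any resampler. The length changes along with the pitch, so something has to stretch it back, and the formants move by the same ratio, which is where formant correction comes in.

```python
def semitones_to_ratio(semitones):
    """Frequency ratio for a shift of `semitones` in equal temperament."""
    return 2.0 ** (semitones / 12.0)


def naive_shift_duration(duration_sec, semitones):
    """Length of a sample after shifting pitch by plain speed change.

    Playing the audio `ratio` times faster raises the pitch by `ratio`
    but also makes it `ratio` times shorter.
    """
    return duration_sec / semitones_to_ratio(semitones)


if __name__ == "__main__":
    # Up one octave (+12 semitones): pitch doubles, length halves.
    print(semitones_to_ratio(12))          # 2.0
    print(naive_shift_duration(1.0, 12))   # 0.5
```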
Things like Melodyne and Autotune have had lots of money and time put into them by large teams, but the use case is also totally different: they work with real singing in the first place rather than a jumble of samples.
Even the default resampler can do pitch shifting pretty well; it's when you start stretching and doing vibrato that everything sort of falls apart a little bit. If I'm not mistaken it's based on TANDEM-STRAIGHT, so it's not like it was 100% made by one guy.
Each resampler uses different methods. Fresamp separates samples into frames/segments. EFB-GT/GW uses WORLD magic.
Resamplers like BKH01 and Moresampler more or less (re)synthesize the voice samples. BKH01 is a bit dirtier about it and also redoes everything on the fly each time, while Moresampler saves analysis data for each sample and uses that instead of the actual wav file.
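The difference between those two approaches is basically caching. Here's a minimal sketch of the idea, with a made-up `analyze` step and a made-up cache layout. This is not Moresampler's actual file format or code, just an illustration of "analyze once, reuse the saved data" versus redoing it on every render.

```python
import hashlib
import json
import os
import tempfile


def analyze(samples):
    """Placeholder for an expensive analysis step (hypothetical).

    A real resampler would extract things like pitch and spectral
    envelope; here we only compute a couple of cheap statistics.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    mean = sum(samples) / len(samples) if samples else 0.0
    return {"peak": peak, "mean": mean, "length": len(samples)}


def cached_analysis(samples, cache_dir):
    """Return (analysis data, cache_hit), reusing a saved copy if present."""
    key = hashlib.sha1(json.dumps(samples).encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".json")
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f), True       # cache hit: no re-analysis
    data = analyze(samples)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return data, False                       # cache miss: analyzed and saved


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache:
        wav = [0.0, 0.5, -0.25, 0.1]
        print(cached_analysis(wav, cache)[1])  # False: analyzed and saved
        print(cached_analysis(wav, cache)[1])  # True: loaded from cache
```

The tradeoff is the usual one: the first render pays for the analysis, later renders are faster, and the cache has to be thrown away if the source sample changes (the hash key here takes care of that in the sketch).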
VS4u is literally just VocalShifter, so it's kinda like pre-autotune.
The default resampler actually sounds worse than the previous version in some cases. In all honesty it's much fairer to look at other resamplers for comparison.
Fresamp with the F2L2 flags will sound good, and fresamp14 has a version that can use NVIDIA cards to run really fast. EFB-GW is good for some males. BKH01 is good if you really want to mess with a voice and do something unnatural. Moresampler tries to be as close to the original source as possible but is still under heavy development, so it can have the occasional glitch. It also has a lot of custom features for transforming a voice that are extremely useful.
Fresamp11/14 (with flags, as its default settings aren't desirable), EFB-GT/GW, and Moresampler are three resamplers that almost never sound metallic, and if they do it's usually from sample quality issues. Resampler, Tips, tn_fnds, M4, and w4u are very often metallic sounding but work for a few voices.
A lot of the time the failure of the resampler can be due to the quality of your samples: the more noise, the smaller the useful range. Also, you want to use the large, CONSISTENT part of your samples, so the area you want to stretch/loop should be as uniform as possible. UTAU can sound fantastic if you have the right combination of configuration and such. No matter how long you set the vowel segment to be stretched, it will not compensate for noise in your samples; you will need to just live with that.
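If you want a rough way to find that consistent part instead of eyeballing the waveform, something like the sketch below can help. It's an assumption-heavy illustration, not how any resampler or UTAU itself picks the region: it only looks at how steady the loudness is from frame to frame, the function names and parameters are made up, and your ears should still get the final say.

```python
import math


def frame_rms(samples, frame_len):
    """RMS level of each non-overlapping frame of `frame_len` samples."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    return levels


def most_uniform_region(samples, frame_len, region_frames):
    """Start/end sample indices of the steadiest run of frames.

    "Steadiest" here means lowest variance of frame RMS, which is only a
    rough stand-in for what you would judge as consistent by ear.
    """
    levels = frame_rms(samples, frame_len)
    if len(levels) < region_frames:
        raise ValueError("sample is shorter than the requested region")
    best_start, best_var = 0, float("inf")
    for i in range(len(levels) - region_frames + 1):
        window = levels[i:i + region_frames]
        mean = sum(window) / region_frames
        var = sum((x - mean) ** 2 for x in window) / region_frames
        if var < best_var:
            best_start, best_var = i, var
    return best_start * frame_len, (best_start + region_frames) * frame_len


if __name__ == "__main__":
    # Fade in, hold steady, fade out: the steady middle should win.
    fade_in = [i / 40 for i in range(40)]
    hold = [1.0] * 40
    fade_out = [1 - i / 40 for i in range(40)]
    print(most_uniform_region(fade_in + hold + fade_out, 10, 4))  # (40, 80)
```

Note that this does nothing about noise. A noisy but steady region will still score well here, which matches the point above: picking a good region helps, but it can't fix a bad recording.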