Don't forget the price difference. Don't forget the other things that'll eat into the available power, like the effects. If I remember correctly, the Reface CS uses an updated version of the engine, so the quality could have gone up while the number of voices stayed the same. It's also compact and battery powered, which limits how much processing power they can put in there.
I don't know much about the AN engine but it has ties with Yamaha's physical modelling work:
In addition to the virtual acoustic modeling used in the tone generator itself, physical circuitry modeling was applied in some elements of the VL1 effects as well. The VL1 was followed by the VP1 with physical modeling of string instruments, the AN1x with modeling of analog synthesis, and the EX5 and EX7 which combined algorithms from both the VL1 and AN1x with a range of new physical modeling concepts. The technology was constantly evolving and being refined at K’s Lab. The term “VCM” was first used to describe the modeling technology used in the “Add-on Effects” series of audio effects developed for a digital mixing console that was released in 2004. From that point onward K’s Lab has been devoted to developing physical modeling technology that effectively emulates a variety of audio effects, including the types of analog outboard processors used in recording studios and guitar processors, and implementing that technology in a wide range of products. - The Development of Virtual Circuitry Modeling Audio Effects, Yamaha
The JD-Xi can achieve 128 voices because it uses an approach that starts with samples and then does all sorts of things with them. Technologically it is probably closer to the Reface YC and CP than to the CS.
Look at the Blofeld or Ultranova, both can have a touch more than 20 voices... in a best case scenario. Here's part of how the Ultra does it:
According to Novation, “The wavetables in the Supernova series are all calculated. The wavetables in the UltraNova, even the standard analogue waves are wavetable oscillators. This change in oscillator generation was first used on the A-Station and K-Station and subsequently in the KS series, X-Station and Xio.” This allows the UltraNova to have some advanced tricks when it comes to the oscillator section, which will be covered in detail below. - MATRIXSYNTH Review and Overview of the Novation UltraNova
I don't know what the Blofeld does, but I wouldn't be surprised if it's the same kind of stuff.
KingKORG, not a cheap thing (originally). 24 voices. Same as the RADIAS. However, much more capable oscillators, much better filters, and the overall quality has gone up.
I could go on. Comparing voice counts is meaningless because it's missing way too much of the picture. It does not tell you the quality of the voices, it does not tell you the complexity of the voices, it does not tell you if those voices are reliable, it does not tell you what other things the synth may be doing that also use up resources, it does not tell you if the engine responds smoothly to changes, etc.
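To put the resource argument in rough terms, here's a toy calculation. Every number in it is made up ("cost units", not real DSP figures for any synth mentioned here), and `max_voices` is just a hypothetical helper. The only point is that the same chip gives wildly different voice counts depending on what each voice costs and how much the effects take first.

```python
def max_voices(total_budget, effects_cost, cost_per_voice):
    """Whole voices that fit once the effects have taken their share.

    All arguments are in arbitrary, made-up cost units.
    """
    remaining = total_budget - effects_cost
    if remaining <= 0:
        return 0
    return int(remaining // cost_per_voice)


# Hypothetical budget: same "chip", same effects load, two kinds of voice.
BUDGET = 1000
EFFECTS = 200

cheap = max_voices(BUDGET, EFFECTS, cost_per_voice=6)    # light, sample-style voice
heavy = max_voices(BUDGET, EFFECTS, cost_per_voice=100)  # heavy, modelled voice

print("cheap voices:", cheap)
print("heavy voices:", heavy)
```

Same budget, very different polyphony, and neither number says anything about how the voices sound.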
With digital you can go anywhere from a million aliasy pseudo saws to a hyper real mono analog emulation that brings an i7 to its knees. Where a given synth sits on that range is a big guess because the exact tech details aren't known, "VA" being short for "Vaaaaaaaaaague".
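As a rough illustration of the two ends of that range, here is a sketch of a naive saw next to a brute-force band-limited one. This is not how any of the synths above generate their oscillators (that isn't public), and the 48 kHz sample rate is my assumption. It only shows why a "saw" can cost almost nothing or quite a lot per sample.

```python
import math

SAMPLE_RATE = 48_000  # assumed sample rate in Hz


def naive_saw(freq, n_samples, sample_rate=SAMPLE_RATE):
    """Cheapest saw: a phase ramp mapped to [-1, 1). Aliases at high pitches."""
    out = []
    phase = 0.0
    step = freq / sample_rate
    for _ in range(n_samples):
        out.append(2.0 * phase - 1.0)
        phase = (phase + step) % 1.0
    return out


def harmonic_count(freq, sample_rate=SAMPLE_RATE):
    """Number of harmonics of freq that fit at or below Nyquist."""
    return int((sample_rate / 2) // freq)


def bandlimited_saw(freq, n_samples, sample_rate=SAMPLE_RATE):
    """Additive saw: sums every harmonic at or below Nyquist.

    Nothing above Nyquist is generated, so no aliasing, but the work per
    sample grows with the number of harmonics.
    """
    n_harm = harmonic_count(freq, sample_rate)
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        s = sum(math.sin(2.0 * math.pi * k * freq * t) / k
                for k in range(1, n_harm + 1))
        out.append(-(2.0 / math.pi) * s)
    return out


# A 100 Hz note needs one multiply-add per sample the naive way,
# and harmonic_count(100) sine evaluations per sample the additive way.
print("harmonics summed per sample at 100 Hz:", harmonic_count(100))
```

Real engines use smarter tricks than summing sines (wavetables, as in the UltraNova quote above, are one of them), but the trade-off between cheap-and-aliasy and clean-and-expensive is the same idea.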
BTW, multitimbrality has no place on the CS, it's just not what it's for.