From my point of view, there are three main factors that produced this "loudness war":
First of all, as we all know, there was the hype around the idea that louder sells better. It is true to some extent: you'll most likely prefer a version that is 0.5 to 1.5 dB louder, because it sounds more exciting and your average Joe doesn't register such a small difference as a change in volume. This used to work to some degree, but once you can't get any more "loud" onto the record, the whole "loudness sells" idea is just bollocks.
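To put numbers on how small that 0.5 to 1.5 dB step really is, here is a quick sketch that converts decibels to a linear amplitude ratio using the usual 20*log10 convention. The function name is just something I made up for illustration.

```python
def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


# Show how much the sample amplitude grows for the small boosts mentioned above.
for db in (0.5, 1.0, 1.5):
    ratio = db_to_amplitude_ratio(db)
    print(f"{db:+.1f} dB -> amplitude x{ratio:.3f}")
```

So the "louder" version only has samples roughly 6% to 19% larger, which is easy to mistake for "better" rather than "turned up".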
Second, there was no standard for what RMS levels should be. If one existed, there would be no problem, but since it's a jungle out there, anyone can try whatever they want, even if it's just dumb and sounds crappy.
Third, some artists like it that way. For example, in the American metal scene (e.g. metalcore, deathcore, whatevercore they come up with next) the whole overcompressed/overlimited sound is a characteristic of the genre. They WANT the kick drums to sound like pillows being hit instead of drums. The limiter is seen more as an effect (in the sense that a chorus is an effect) used to shape the sound. This is also true for the IDM/breakcore genres (because those artists mix and "pre-master" themselves). Sadly, this got taken up as a general-purpose effect, but you can't use every effect on every genre. Just try to imagine every track of a symphonic piece with a distorted, flanged, echoed, reverbed, trance-gated sound (someone might invent a new genre this way, but in my opinion it's just sheer stupidity).
The bad part in all of this is that nowadays the standard RMS level seems to be "as loud as you can, brickwalled waveform preferably". Some music genres have an excellent dynamic range (I recently listened to a CD of some Japanese taiko drummers), but some genres are defined by having no dynamics whatsoever (I think they're trying to make their music resemble their faces: rugged and expressionless). What really bugs me is that genres that used to have dynamics now chase this "fresh, modern sound" and focus less and less on creativity.
To make things even worse, the consumers have simply forgotten how to use the volume knob. They expect extremely loud music, and if it doesn't sound as loud as the other things they listen to, they will find it either a) unprofessional ("everybody's making it loud, why hasn't it changed in the last 20 years?"), b) too soft (as if it were too hard to turn the volume up), or a combination of both.
The only viable solution I see (and it has existed for quite some time) is ReplayGain. Since the majority of listeners use mp3, they could simply use the RG gain tag and most of the level problems would be solved. It's not that hard for the most popular players to support RG (some of them already do) and to calculate the RG value automatically before playing the damn thing (unless a tag is already present). It's also easy to add to modern players via a firmware update (at least for some players), so people won't have to buy new stuff.
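For anyone wondering what a player actually has to do with that tag, here is a minimal sketch of the playback side, not how any particular player implements it. I'm assuming the tag values look like the commonly used REPLAYGAIN_TRACK_GAIN ("-7.50 dB") and REPLAYGAIN_TRACK_PEAK ("0.98") fields, and that samples are floats in the range -1.0 to 1.0. The function names are hypothetical.

```python
import re


def parse_gain_db(tag_value):
    """Parse a gain string such as '-7.50 dB' into a float (assumed tag format)."""
    match = re.match(r"\s*([+-]?\d+(?:\.\d+)?)\s*dB\s*$", tag_value, re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised gain tag: {tag_value!r}")
    return float(match.group(1))


def playback_scale(gain_db, peak=None):
    """Turn a gain in dB into a linear multiplier, backing off if it would clip."""
    scale = 10.0 ** (gain_db / 20.0)
    # If the stored peak times the scale would exceed full scale, reduce the gain
    # so that the loudest sample lands exactly on 1.0.
    if peak is not None and peak > 0.0 and scale * peak > 1.0:
        scale = 1.0 / peak
    return scale


def apply_gain(samples, scale):
    """Multiply every sample by the same factor; the dynamics stay untouched."""
    return [s * scale for s in samples]


# Hypothetical tag values for a loud, heavily limited track.
gain_db = parse_gain_db("-7.50 dB")
scale = playback_scale(gain_db, peak=0.98)
quieter = apply_gain([0.98, -0.5, 0.25], scale)
```

The point is that the file itself is never changed: the player just turns the "volume knob" for you by one fixed amount per track.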
In the meantime, AES/EBU could try to "force" a standard RMS level of -14 dBFS (chosen because this is the VU meter setting), with exceptions for other levels where the dynamic requirements of the song call for them, but with a maximum RMS reading of -14 dBFS (or let's say -12). Most VST plugins should use the K-System (or have an option for it) when it comes to metering levels. In the end we would have the same result: some extreme genres would still have a brick of a song, only x dB softer, but artists would have the option to be dynamic without worrying about the volume. We would still have some ****tards (as I read in a post in this thread) who pay to see "all those lights on", but their number would be negligible.
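As a rough illustration of what such a ceiling would mean in practice, here is a sketch that measures the RMS level of a track in dBFS and works out how much it would have to be turned down to sit at or below -14 dBFS. I'm assuming float samples in the range -1.0 to 1.0 and plain mathematical RMS over the whole track (so a full-scale square wave reads 0 dBFS and a full-scale sine about -3 dBFS). Real meters use windowing and other conventions, and the function names are made up.

```python
import math


def rms_dbfs(samples):
    """Whole-track RMS level in dB relative to full scale (1.0)."""
    if not samples:
        raise ValueError("no samples given")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)


def attenuation_for_ceiling(samples, ceiling_dbfs=-14.0):
    """Gain in dB (zero or negative) needed to bring the RMS level under the ceiling.

    Tracks already below the ceiling are left alone, so dynamic material
    is never boosted, only over-loud material is pulled down.
    """
    return min(0.0, ceiling_dbfs - rms_dbfs(samples))


# A "brick": full-scale square wave, the loudest signal possible.
brick = [1.0, -1.0] * 500
# A quiet passage well below the ceiling.
quiet = [0.01, -0.01] * 500

brick_gain = attenuation_for_ceiling(brick)
quiet_gain = attenuation_for_ceiling(quiet)
```

Under these assumptions the brick gets pulled down by 14 dB while the quiet passage is not touched, which is exactly the "brick song, but only x dB softer" outcome.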
PS: it's actually amusing that there is so much hype about the "analog" warmth, but people still try to get everything as loud as possible, destroying that warmth to some degree.