Disabling the cache probably won't help as much on newer systems (the focus has been on getting the memory system to go faster and faster as CPUs become insanely fast), but it's still quite an effective way to get a major drop in performance.
Most chips made in the last decade or so have included an on-chip (L1) cache that runs at the full speed of the processor. Basically, this means that the cache memory is as fast as the processor, and the processor doesn't really need to wait on it. Most chips (since the Pentium Pro, essentially) also include a high-speed L2 cache, which runs at some fraction of the clock speed (generally 1/2 on older chips, and essentially full speed, the same as the L1, on newer chips). The L2 cache is slower than the L1, but it's also a lot bigger. Generally, the L1 is only big enough to hold the part of the program that is being immediately executed, while the L2 is big enough to hold most of the program and data in use.
Now, the important thing is that this architecture means that a CPU hardly ever needs to use main memory at all. It can run full speed most of the time. However, as soon as it needs to fetch data from main memory, the entire CPU has to wait while the instructions and data get carried over from the very slow main memory.
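To put rough numbers on that, here's a minimal sketch of the usual textbook "average memory access time" model for a two-level cache. The function name, hit rates, and cycle counts are all made up for illustration; they are not measurements from any real chip.

```python
def average_access_cycles(l1_hit_rate, l2_hit_rate, l1_cycles, l2_cycles, mem_cycles):
    """Average CPU cycles per memory access for a two-level cache.

    Every access pays the L1 cost; L1 misses also pay the L2 cost;
    L2 misses also pay the main-memory cost.
    """
    l1_miss_rate = 1.0 - l1_hit_rate
    l2_miss_rate = 1.0 - l2_hit_rate
    return l1_cycles + l1_miss_rate * (l2_cycles + l2_miss_rate * mem_cycles)


# Assumed, illustrative values: 95% L1 hits, 90% L2 hits,
# 1-cycle L1, 5-cycle L2, 50-cycle main memory.
cached = average_access_cycles(0.95, 0.90, 1, 5, 50)

# Caches disabled: modeled as 0% hit rates and no cache lookup cost,
# so every access goes all the way out to main memory.
uncached = average_access_cycles(0.0, 0.0, 0, 0, 50)

print(f"with caches:    {cached:.2f} cycles per access")
print(f"without caches: {uncached:.2f} cycles per access")
```

With these assumed numbers the cached case averages about 1.5 cycles per access and the uncached case pays the full 50 every time, which is the whole trick behind turning the cache off.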
All of this stuff gets passed over the front side bus, which ran at a mere 66 MHz for a long period of time, possibly including that Pentium 450. Even 100 MHz and 133 MHz FSBs aren't that big of an improvement. On top of that, accessing main memory is a lot less efficient in terms of clock cycles than accessing the local caches, so it's even slower than the MHz discrepancy would suggest.
The important thing to note here is that the 486 and the 386 (for which many of these games were designed) had a 33 MHz FSB. These games were often designed for 33 MHz processors (the 486DX2/66 ran at 66 MHz internally, but still on a 33 MHz bus). By turning off your cache, you're forcing your high-end Pentium III 450 or whatever to run only as fast as your FSB will allow, which is about as fast as a 386 or 486 can run. Hey, progress.
Another aspect of this is that the 386 basically executes an instruction, then goes back and fetches the next one. The 486, however, keeps fetching instructions and passing them on to other parts of the chip for execution, like on an assembly line. So while the 386 requires multiple cycles per instruction, the 486 can effectively execute one instruction per clock, and the Pentium has multiple pipelines running in parallel, so it can execute multiple instructions per clock. Modern chips like the Athlon can execute several instructions per clock cycle.
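Here's a back-of-the-envelope sketch of that arithmetic. It assumes (purely for illustration) that with the cache off every fetch costs a fixed number of bus cycles; the figure of 4 bus cycles per fetch and both function names are my own placeholders, not values from any datasheet.

```python
def stall_cycles_per_fetch(cpu_mhz, bus_mhz, bus_cycles_per_fetch):
    """CPU clock cycles that pass while one uncached fetch crosses the bus."""
    return (cpu_mhz / bus_mhz) * bus_cycles_per_fetch


def uncached_fetches_per_second(bus_mhz, bus_cycles_per_fetch):
    """Upper bound on fetches per second when every fetch goes over the bus."""
    return bus_mhz * 1e6 / bus_cycles_per_fetch


CPU_MHZ = 450              # the 450 MHz chip from the discussion
BUS_CYCLES_PER_FETCH = 4   # assumed cost of one memory access, in bus cycles

for bus_mhz in (33, 66, 100):
    stall = stall_cycles_per_fetch(CPU_MHZ, bus_mhz, BUS_CYCLES_PER_FETCH)
    rate = uncached_fetches_per_second(bus_mhz, BUS_CYCLES_PER_FETCH)
    print(f"{bus_mhz:>3} MHz bus: ~{stall:.0f} CPU cycles per fetch, "
          f"~{rate / 1e6:.1f} million fetches/s")
```

Note that the CPU clock drops out of the fetch rate entirely: under this model the bus speed alone sets the ceiling, which is why a 450 MHz chip with no cache lands in roughly the same ballpark as a much older machine.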
The upshot is that when you turn off the cache and force the processor to eat instructions dribbled through a straw, it effectively executes like a 386, one at a time.
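As a rough illustration of the pipelining point, this sketch compares throughput at one hypothetical common clock so that only the instructions-per-clock figure differs. The IPC values are assumptions chosen to match the "multiple cycles per instruction", "one per clock", and "multiple per clock" descriptions, not measured numbers, and `mips` is just a helper name I made up.

```python
def mips(clock_mhz, instructions_per_clock):
    """Millions of instructions per second for a given clock and IPC."""
    return clock_mhz * instructions_per_clock


CLOCK_MHZ = 33  # hypothetical common clock, to isolate the pipeline effect

# Assumed IPC figures, for illustration only.
designs = {
    "386-style (one at a time, ~4 cycles each)": 0.25,
    "486-style (pipelined, ~1 per clock)": 1.0,
    "Pentium-style (two pipelines, up to 2 per clock)": 2.0,
}

for name, ipc in designs.items():
    print(f"{name}: ~{mips(CLOCK_MHZ, ipc):.2f} MIPS at {CLOCK_MHZ} MHz")
```

Starving the pipeline by turning off the cache pushes a newer chip back toward the first row, no matter how many pipelines it has.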
Now, this probably won't keep working indefinitely, so some day we'll either need MoSlo after all, or we'll be running all our games in emulation, which should slow things down enough.
For now, a game will still run faster than it was designed to, although not insanely fast. But as FSB speeds keep going up (I believe the latest Opterons go up to 800 MHz or 1.6 GHz, and the latest P4s already have an 800 MHz bus), the CPU will be able to run faster and faster even with the cache turned off.