One problem with your comparison is that you're only looking at CPU clock speed. An emulator emulates much more than a CPU. In the case of the NES, the PPU takes a lot more resources to emulate than the CPU, because it's more specialized hardware that runs at a higher clock than the CPU (three times the CPU clock on NTSC systems). N64 emulators, on the other hand, offload a lot of the video emulation to a 3D accelerator.
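To give a rough sense of the PPU workload, here is a minimal sketch of a frame loop, assuming NTSC timing (3 PPU dots per CPU cycle, 341 dots per scanline, 262 scanlines per frame). The `cpu_step` and `ppu_step` callbacks are hypothetical placeholders for whatever a real emulator does per cycle, and details such as the dot skipped on odd frames are ignored.

```python
PPU_DOTS_PER_CPU_CYCLE = 3   # NTSC NES clock ratio
DOTS_PER_SCANLINE = 341
SCANLINES_PER_FRAME = 262


def run_frame(cpu_step, ppu_step):
    """Run one video frame, stepping the PPU three dots per CPU cycle.

    cpu_step and ppu_step are hypothetical callbacks standing in for the
    real per-cycle work. Returns (cpu_cycles, ppu_dots) actually executed.
    """
    dots_per_frame = DOTS_PER_SCANLINE * SCANLINES_PER_FRAME
    cpu_cycles = 0
    dots = 0
    while dots < dots_per_frame:
        cpu_step()
        cpu_cycles += 1
        # The PPU has to be advanced three times for every CPU cycle.
        for _ in range(PPU_DOTS_PER_CPU_CYCLE):
            ppu_step()
            dots += 1
    return cpu_cycles, dots


if __name__ == "__main__":
    print(run_frame(lambda: None, lambda: None))
```

Even with empty callbacks, the loop calls `ppu_step` roughly 89,000 times per frame against roughly 30,000 calls to `cpu_step`, and a real `ppu_step` has to fetch tiles, evaluate sprites, and produce a pixel.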
Another problem is that it often takes a lot more cycles to emulate something more accurately. A lot of NES games require much more finely tuned timing accuracy than N64 games do. With that in mind, the best NES emulators are much more compatible than the best N64 emulators. N64 emulators employ major tricks that reduce emulation requirements significantly, at the cost of being unable to emulate some things correctly, or some games at all. An N64 emulator that emulates everything at a low level is much, much slower than the ones you've used, and even then it won't have nearly the timing accuracy that the best NES emulators do (or even most of the lesser ones).
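The accuracy-versus-speed trade-off can be illustrated with a toy model. This is only a sketch, not how any particular emulator works: a scanline is a list of pixels, each pixel simply shows the current value of one register, and `writes` (a hypothetical input) says at which pixel position the CPU changes that register.

```python
def render_line(writes, width, per_pixel_sync):
    """Render one toy scanline whose pixels mirror a single register.

    writes maps a pixel position to the value the CPU writes at that moment.
    Returns (pixels, syncs), where syncs counts how often the renderer had
    to synchronize with the CPU.
    """
    register = 0
    syncs = 0
    if per_pixel_sync:
        # Accurate: check for CPU writes before drawing every pixel.
        line = []
        for x in range(width):
            if x in writes:
                register = writes[x]
            syncs += 1
            line.append(register)
    else:
        # Fast: run the CPU for the whole line first, then draw it in one go.
        for x in sorted(writes):
            register = writes[x]
        syncs += 1
        line = [register] * width
    return line, syncs


if __name__ == "__main__":
    print(render_line({4: 1}, 8, per_pixel_sync=True))
    print(render_line({4: 1}, 8, per_pixel_sync=False))
```

The per-pixel version shows the mid-line change correctly but synchronizes eight times for an eight-pixel line; the batched version synchronizes once and draws the whole line with the final register value, which is wrong for any game that relies on mid-scanline writes.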
Finally, most emulator authors are only going to emphasize performance if typical hardware requires it. Instead of optimizing their code, they'd rather spend their time adding features and finding and fixing bugs, which is very time consuming in an emulator. Optimizing code is not only time consuming, it also takes a special talent that I don't think every programmer has. Even then, it can make it much harder to ensure the program stays correct, especially while adding new things. Some optimizations, especially those popular in older emulators, are also inherently unportable, which prevents the program from being easy to compile for other platforms.