Hi Visor4,
Multiple CPU cores could be used, but this involves deciding on several aspects:
- How much, if any at all, of the RDP code should really be done on the CPU? You could multi-thread it, but by the time someone contributed that to the sources (in a variant like what maister has started), much of the work could already be processed on the video hardware, leaving only whatever remains for the CPU.
- Should multi-threading run the complex phases of the RDP emulator in parallel with each other? Should it run the RDP in parallel with the RSP? Should it run the RSP and/or RDP in parallel with the CPU? (Note that the last could natively solve some timing-specific RSP problems in certain titles.)
- Should the multi-threading be managed within plugin callback loops or within the main CPU core loop (e.g., the emulator's exe process)? This is especially relevant unless the previous question is answered with its first option.
- Would the chosen multi-threading design actually schedule work better than the CPU natively manages the simpler single-threaded execution model, if performance were good enough that only a single CPU core would be needed for full speed?
- How much benefit does multi-threaded pixel-accurate RDP emulation yield when weighed against the benefits of other optimizations remaining to be committed for the plugin in a single-threaded environment?
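To make the first option in item 2 a bit more concrete, here is a rough sketch of one way a software renderer could split a single primitive's scanlines across worker threads. This is not code from the plugin: `framebuffer`, `span_job`, `render_band`, `fill_rect_threaded`, the 320x240 size and the 2-thread count are all made up for illustration, and it assumes POSIX threads are available. It also assumes the bands never touch the same pixels, which holds for one flat fill but would not hold across overlapping primitives that depend on blending or Z-buffer order.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for this illustration. */
#define FB_WIDTH    320
#define FB_HEIGHT   240
#define NUM_WORKERS 2

static uint16_t framebuffer[FB_HEIGHT][FB_WIDTH];

/* One horizontal band of scanlines handed to one worker thread. */
struct span_job {
    int y_start;    /* first scanline in the band (inclusive) */
    int y_end;      /* last scanline in the band (exclusive) */
    uint16_t color; /* flat fill color, standing in for real pixel work */
};

/* Worker: fills its own band only, so no locking is needed. */
static void *render_band(void *arg)
{
    const struct span_job *job = arg;

    for (int y = job->y_start; y < job->y_end; y++)
        for (int x = 0; x < FB_WIDTH; x++)
            framebuffer[y][x] = job->color;
    return NULL;
}

/*
 * Splits the frame into NUM_WORKERS bands and fills them in parallel.
 * Returns 0 on success, or -1 if a thread could not be created or joined
 * (in which case some bands may be left unfilled).
 */
static int fill_rect_threaded(uint16_t color)
{
    pthread_t threads[NUM_WORKERS];
    struct span_job jobs[NUM_WORKERS];
    const int band = (FB_HEIGHT + NUM_WORKERS - 1) / NUM_WORKERS;
    int started = 0;
    int status = 0;

    for (int i = 0; i < NUM_WORKERS; i++) {
        jobs[i].y_start = i * band;
        jobs[i].y_end = (i + 1) * band > FB_HEIGHT ? FB_HEIGHT : (i + 1) * band;
        jobs[i].color = color;
        if (pthread_create(&threads[i], NULL, render_band, &jobs[i]) != 0) {
            status = -1;
            break;
        }
        started++;
    }

    /* Wait for every worker that actually started before returning. */
    for (int i = 0; i < started; i++)
        if (pthread_join(threads[i], NULL) != 0)
            status = -1;

    return status;
}
```

Creating and joining threads for every primitive like this would likely cost more than it saves; a real design would presumably keep a persistent pool of workers, which is exactly the kind of scheduling overhead item 4 asks about.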
Items 2 and 5 are more interesting to me. Item 1 I can overlook because, within this same plugin project, I am not playing with hardware-accelerating the RDP like maister is. I prefer keeping it optimized while staying a software renderer, at least within this single plugin. Item 3 can be overlooked in a plugin-free model, such as RetroArch, where the RCP emulation is done within the same process image as the CPU core (which makes multi-threading easier to play with). Item 4 can be ignored for now because, at present, this plugin does not run at full speed on a single CPU core on the majority of x64 processors.
But item 5 is really what I was trying to get back to with this plugin: it should be optimized to perform well even for people who either can't or do not want to use more than 1 CPU core or have it multi-threaded. Ideally, any realistic possibility of the plugin running at full speed without multi-threading should come to fruition. My current computer has only 2 CPU cores and is plagued by CPU overheating when too much is utilized, so I really do not want to have the plugin running on more than 1 CPU core in this computer's case even though I probably could. That's just for my rather arbitrary scenario, personally.
That being said, all my interests in fulfilling feature requests like scan-lines, GLSL accelerations, disabling the DirectDraw renderer code to compare performance against the new not-yet-released OpenGL blitter code, etc. have sort of overwhelmed me into a mixed state of busyness and procrastination. The plugin should still get another update, but I'll have to throw some of those feature requests out of my head temporarily to be able to concentrate on returning to it.