
Thoughts about Dolphin Performance (For Dolphin Developers)

Cdti

New member
Hello there.

Well, all in all Dolphin is quite impressive. But aside from the more important
compatibility problem, it has some performance flaws. I'm talking about the Video
Plugin's design.

I've tested some aspects of Dolphin's performance and found that the Video Plugin could use some major tweaking.

Let's take Resident Evil Zero as an example. I'm using the latest Dolphin beta (1.0.3), since the in-game speed is no different from previous versions.



Take a close look at the statistics.

So we have 27k triangles in this scene, a typical count for a Riva TNT2-generation video card, running at 5 FPS on a GeForce 6600 GT.

You might say "Of course, this is emulation, after all!". But take a look at this:

- 5584 DrawIndexedPrimitive Calls
- 5585 SetVertexShader Calls

This means that you are submitting about 11k hellishly heavy calls each frame and killing off the CPU completely. I also suspect that Dolphin performs a lot of buffer
locking operations, which are also _very_ expensive.

Think about it: 25k DrawIndexedPrimitive calls per second = a 100% busy 1 GHz CPU.
You're making 5k calls per frame. But you also need a lot of CPU time for emulation, plus
game logic, plus background applications. You can't spend all your CPU
on submitting draw calls.
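
The budget above can be worked through numerically. This is only a rough sketch: it assumes the 25k figure is meant per second and that the call budget scales linearly with clock speed (neither is stated in the post), and `max_fps` is a name made up for illustration.

```python
# Rough draw-call budget. Counts come from the post; the "per second" reading
# of the 25k rule and the linear scaling with clock speed are assumptions.

CALLS_PER_SEC_PER_GHZ = 25_000  # assumed: 25k DIP calls/s saturate a 1 GHz CPU
CPU_GHZ = 3.0                   # the 3 GHz Pentium 4 from the post
DIP_PER_FRAME = 5584            # DrawIndexedPrimitive calls per frame


def max_fps(calls_per_frame, cpu_share=1.0):
    """Upper bound on FPS if `cpu_share` of the CPU only submits draw calls."""
    calls_per_second = CALLS_PER_SEC_PER_GHZ * CPU_GHZ * cpu_share
    return calls_per_second / calls_per_frame


if __name__ == "__main__":
    print(f"whole CPU, 5584 calls/frame: {max_fps(DIP_PER_FRAME):.1f} FPS")
    print(f"half CPU, 5584 calls/frame:  {max_fps(DIP_PER_FRAME, 0.5):.1f} FPS")
    print(f"half CPU, 600 calls/frame:   {max_fps(600, 0.5):.1f} FPS")
```

Under these assumptions the DIP calls alone cap the frame rate at roughly 13 FPS with the whole CPU and roughly 7 FPS with half of it, which is in the same range as the observed 5 FPS. The SetVertexShader calls are not counted here and would lower the bound further.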

It is no secret to me that the GameCube architecture is very different, and
it is possible that GC draw calls are not as expensive as their PC counterparts.

But you have to do something about this. The maximum count per frame should be no higher than 500-600 for DIP (DrawIndexedPrimitive) calls and 100-200 for SVS (SetVertexShader) calls. Until then Dolphin
will remain "A Pregnant Snail".

So it seems to me that this is the main bottleneck. (Oh, come on, we all know that
Resident Evil's game logic is not too complex: no physics, quite simple AI, tiny scenery and so on.)

The specs are: 3 GHz Pentium 4, 1 GB RAM, GeForce 6600 GT 128 MB.
 

vlado

Sexy's back!
Sorry, I'm not a coder, but I'm wondering: if this were fixed, would the game speed go up considerably? :D
 

Cdti

New member
Yes, the speed boost may be dramatic.

With Resident Evil 0 and the specs mentioned above, it may reach a playable FPS.

But don't forget that this is a flaw in the plugin architecture, not a bug. :)
So it's hard to fix, but it should be done in order to get a playable emu on
current PCs.
 

vlado

Sexy's back!
Wew, that's really nice :D I hope it's exactly like you say :) I really suck at C++ etc. I'm just a web designer :)
 

Doomulation

Anyone can say that less work for the CPU means speedup. I suggest we wait and see. Though from a programmer's view, I would say it is a sensible idea.
 

Cdti

New member
Doomulation said:
Anyone can say that less work for the CPU means speedup.

Well, of course, but this is not just "less work for the CPU". Right now, the CPU is literally jammed with graphics. If you take the time to test Direct3D 9, you'll soon discover that even without any game logic (bare graphics), 5k DIP calls per frame will drag your FPS below 10 even on a high-spec PC.

Secondly, in the above-mentioned example (and in Dolphin-emulated games in general) the GPU remains idle while the CPU is working like crazy. So it really does not matter if you have a GeForce 6600, four GeForce 7800s in Quad SLI or a crappy GeForce 2.

"I'd also have a request for you: could you perform this test on PCSX2 too?"
Ok, I'll check it out later.
 

Doomulation

Cdti said:
Well, of course, but this is not just "less work for the CPU". Right now, the CPU is literally jammed with graphics. If you take the time to test Direct3D 9, you'll soon discover that even without any game logic (bare graphics), 5k DIP calls per frame will drag your FPS below 10 even on a high-spec PC.
The fewer calls you make, the less work the CPU has to do to process all those calls, basically.
I'm not into graphics programming, but anything with that many calls that pass and allocate memory means a big slowdown. I cannot say how big, since I haven't worked on 3D graphics before. But you are right, I presume.

Cdti said:
Secondly, in the above-mentioned example (and in Dolphin-emulated games in general) the GPU remains idle while the CPU is working like crazy. So it really does not matter if you have a GeForce 6600, four GeForce 7800s in Quad SLI or a crappy GeForce 2.
The GPU hardly matters in emulation, as we know.
 

Cdti

New member
Doomulation said:
The fewer calls you make, the less work the CPU has to do to process all those calls, basically.
I'm not into graphics programming, but anything with that many calls that pass and allocate memory means a big slowdown. I cannot say how big, since I haven't worked on 3D graphics before. But you are right, I presume.

You're right, of course. But transferring data from the CPU to the GPU on current-generation PC architecture is very expensive, and decreasing the call count from 5 thousand to 5 hundred will bring a dramatic speed increase.

Secondly, in the above-mentioned example (and in Dolphin-emulated games in general) the GPU remains idle while the CPU is working like crazy. So it really does not matter if you have a GeForce 6600, four GeForce 7800s in Quad SLI or a crappy GeForce 2.
Well, it actually does matter if you are using techniques that move load off the CPU and onto the GPU instead.
 

Doomulation

Yes, but such things are not common. The GPU processes graphics, and to my knowledge, not many people have written shaders to offload work from the CPU.
Otherwise the GPU is quite... idle. Often, at least... I think.
 

Cdti

New member
I guess I'll code a proxy DLL to intercept Dolphin's Direct3D calls and see how it runs without the graphics.
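
The interception idea can be illustrated in a few lines. A real proxy would be a native replacement DLL sitting between the emulator and Direct3D; the Python stand-in below only shows the principle of counting calls and optionally dropping them. `CallInterceptor` and `FakeDevice` are hypothetical names, not part of Dolphin or Direct3D.

```python
from collections import Counter


class FakeDevice:
    """Stand-in for the real graphics device (hypothetical)."""

    def __init__(self):
        self.draws = 0

    def draw_indexed_primitive(self, triangle_count):
        self.draws += 1
        return triangle_count


class CallInterceptor:
    """Counts every method call on `target`; forwards it only if `forward`."""

    def __init__(self, target, forward=True):
        self._target = target
        self._forward = forward
        self.counts = Counter()

    def __getattr__(self, name):
        # Only reached for names not defined on the interceptor itself.
        real = getattr(self._target, name)

        def wrapper(*args, **kwargs):
            self.counts[name] += 1
            if self._forward:
                return real(*args, **kwargs)
            return None  # "no graphics" mode: swallow the call

        return wrapper


if __name__ == "__main__":
    device = FakeDevice()
    proxy = CallInterceptor(device, forward=False)
    for _ in range(5584):
        proxy.draw_indexed_primitive(5)
    print(proxy.counts["draw_indexed_primitive"], "calls seen,",
          device.draws, "reached the device")
```

Comparing the frame rate with forwarding on and off would show how much of the frame time goes to the graphics calls themselves rather than to the emulation.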
 

Lightning

Emulator Developer
It is known that the graphics in Dolphin are flawed. However, I believe a large part of it is due to the GameCube design and passing many chunks of data to be rendered.

As for Dolphin being recoded to fix such issues: not likely, considering the current development speed.

Emulators take time to first make functional, and only then to start optimizing. Yes, they have a dynarec, but the CPU is not the only thing to be optimized. You have the video, the sound, file access, and everything else. Consider that by emulating you are processing more instructions than the original hardware, and that the original CPU ran at almost 500 MHz.

Is it possible to emulate games faster? Yes. When will such an emulator be out? Unknown, considering that all the current emulators are either crawling in development speed or dead. It has turned into a waiting game for an emulator that solves the various issues the current ones have.

Consider Dolphin, Dolwin, Ninphin, WhineCube, GCEmu, GCube. They have all failed for the moment. Going by emulator-zone's list, there are 7 PlayStation emulators and only 1 rated as being good. I fully expect that we will see an emulator at some point that works properly; I just do not know when.
 

Cdti

New member
You're right in most cases. However, the biggest problem in emulation is the guesswork, i.e. the compatibility. It's hard to understand a hardware architecture's design without explanations from its authors.

Lightning said:
I believe a large part of it is due to the GameCube design and passing many chunks of data to be rendered.

I guess you're right on this, though this does not mean that emulators should act exactly the same way. I do see several workarounds for this problem. They are far from perfect, but they can improve efficiency quite well.

Lightning said:
They have all failed for the moment. Going by emulator-zone's list, there are 7 PlayStation emulators and only 1 rated as being good.

Yes. Since, as I said, it's hit-and-miss work, emulators are indeed very hard to develop. But it's more about their compatibility than their performance. Plus, the PlayStation architecture is a lot more complicated than the GameCube's, since the latter was intended to ease development (N64, anyone?) and indeed seems a lot more reasonable than the PS.
 

Lightning

Emulator Developer
As I said, it is a waiting game. There are obvious ways not to do things, and the current emulators make that obvious. So, how long before an emulator comes out that does it properly? Only time will tell whether a current emulator is corrected or a new one comes out to take the trophy.
 

Cdti

New member
Maybe you're right. But as I see it, Dolphin is good for its age, and it would be worth it if someone redesigned its graphics facility. Dolphin is quite compatible, and its performance could be boosted right now, without waiting for the bright future. :)
 

MooglyTwo

New member
Although I am not a Dolphin developer, I can make an educated guess as to what the video plugin is doing. Cdti, consider that the reason for the large number of SVS and DIP calls is likely to be the fact that the video plugin is immediately performing any rendering as requests come down the pipe. Since the plugin is probably not stateful enough to know what requests have happened previously, it simply sets the appropriate vertex shader parameters anew at every tri or tri-strip draw request.
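
The "not stateful enough" point can be sketched as a small state cache that remembers the last shader and skips redundant sets. This is a guess at the idea, not Dolphin's actual code; `StateCachingDevice` and `CountingDevice` are hypothetical names, and the device methods only mimic the Direct3D calls by name.

```python
_UNSET = object()  # sentinel: no shader has been set yet


class CountingDevice:
    """Stand-in device that only counts the calls it receives (hypothetical)."""

    def __init__(self):
        self.svs_calls = 0
        self.dip_calls = 0

    def set_vertex_shader(self, shader):
        self.svs_calls += 1

    def draw_indexed_primitive(self, batch):
        self.dip_calls += 1


class StateCachingDevice:
    """Forwards set_vertex_shader only when the shader actually changes."""

    def __init__(self, device):
        self._device = device
        self._current_shader = _UNSET
        self.skipped = 0

    def set_vertex_shader(self, shader):
        if shader == self._current_shader:
            self.skipped += 1  # redundant: same shader is already active
            return
        self._device.set_vertex_shader(shader)
        self._current_shader = shader

    def draw_indexed_primitive(self, batch):
        self._device.draw_indexed_primitive(batch)


if __name__ == "__main__":
    backend = CountingDevice()
    device = StateCachingDevice(backend)
    for shader in ["A", "A", "B", "B", "B", "A"]:
        device.set_vertex_shader(shader)        # naive: one set per draw
        device.draw_indexed_primitive(batch=None)
    print(backend.svs_calls, "shader sets for", backend.dip_calls, "draws")
```

A cache like this only removes back-to-back duplicates; it does not reduce the number of draw calls, and it helps little if consecutive draws alternate between shaders.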

What would be worth considering is delaying all polygon writes by one frame and maintaining a set of buckets for the current frame, with each bucket corresponding to a given shader state. When a draw request comes in, the draw request is added to the appropriate bucket and/or a new bucket is created if the shader parameters have not been used before during the emulation. At the end of the frame, each bucket has its shader parameters set, then all of its polygons are drawn in one go. Buckets would be kept across frames (but emptied each frame once drawing is done) so as to avoid multiple memory allocations per frame. There would likely be significant CPU overhead unless you decided to get really clever with shader programming to do the sorting on the GPU, but it would probably be less than what's being used now by blindly rendering all polys immediately.
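
The bucket scheme described above might look roughly like this. It is a minimal sketch under stated assumptions: shader states are hashable values, triangles are opaque items, and `BucketRenderer` and `CountingDevice` are hypothetical names rather than anything from Dolphin.

```python
class CountingDevice:
    """Stand-in device that only counts what it is asked to do (hypothetical)."""

    def __init__(self):
        self.svs_calls = 0
        self.dip_calls = 0
        self.triangles_drawn = 0

    def set_vertex_shader(self, shader_state):
        self.svs_calls += 1

    def draw_indexed_primitive(self, triangles):
        self.dip_calls += 1
        self.triangles_drawn += len(triangles)


class BucketRenderer:
    """Queues draw requests per shader state and flushes them once per frame."""

    def __init__(self, device):
        self._device = device
        self._buckets = {}  # shader state -> triangles queued this frame

    def submit(self, shader_state, triangles):
        # Create the bucket on first use, then keep it across frames.
        self._buckets.setdefault(shader_state, []).extend(triangles)

    def end_frame(self):
        for shader_state, triangles in self._buckets.items():
            if not triangles:
                continue  # bucket exists from an earlier frame but is empty
            self._device.set_vertex_shader(shader_state)
            self._device.draw_indexed_primitive(triangles)  # one draw per bucket
            triangles.clear()  # empty the bucket but keep its allocation


if __name__ == "__main__":
    device = CountingDevice()
    renderer = BucketRenderer(device)
    for i in range(5584):                    # one request per small strip
        renderer.submit(f"shader{i % 100}", [i])
    renderer.end_frame()
    print(device.dip_calls, "draws and", device.svs_calls, "shader sets")
```

One caveat with this design: grouping by state changes the order in which polygons are drawn, so geometry whose result depends on draw order (alpha-blended surfaces, for example) would need to be kept out of the buckets or flushed in its original order.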

Just my two cents...
 
