
Support for special framebuffer stuff in plug-ins?

squall_leonhart

The Great Gunblade Wielder
Doomulation said:
I doubt that. The cache is much too small and there is no way to tell what is going to be stored there.


I would doubt that, too. Can you see the fast transfer rates (in the doc)? Yes, those are speeds per second. If an emulator can't do framebuffer reads/writes all the time, then that means it must surpass the speed at which it can read and write per second. Which, again, means it could consume 1 GB+ of RAM to buffer it. And whatever you say, referencing and modifying such a big block of memory is nowhere near cheap. That is my theory.


Maybe gpu pete could help with making a more efficient framebuffer?
His OpenGL plugins have several framebuffer settings.
Err.. minimal (low-quality framebuffer), normal (best setting, only loads what's needed), and high (shows everything, can cause artifacts).


OK. If the cache can't store FB info, then maybe it can store the entry points to the framebuffers, meaning the CPU can pull them faster?
 

Doomulation

?????????????????????????
All transfers to and from the GPU card are slow...
I'm not getting your point there... store entrypoints in the framebuffer?
 

squall_leonhart

The Great Gunblade Wielder
OK, in that case... can the framebuffer be prerendered and cached to system RAM or VRAM? That would make emulation slightly quicker.

How exactly does the N64 load the framebuffer? It would only load certain parts and not all of it, otherwise it would be overloaded too, wouldn't it?

I'm not really sure, I'm just trying to think of a way that framebuffer emulation can be made faster.

I just thought of one.

Instead of copying the framebuffer to RDRAM, can't the plugin be redesigned to handle all framebuffers via the GPU? This would be possible especially on Pixel Shader 2+ cards, as they have programmable GPUs.

(It's very possible. They are able to do physics and audio on a GPU, so why not framebuffer emulation?)
 

Rice

Emulator Developer
The N64 has 4 MB of system RAM (RDRAM), and the frame buffer lives in that same 4 MB. In other words, the GPU and CPU share the same RAM. Programs running on the CPU can access the frame buffer directly. No need to copy, so it is fast.

IBM PCs in the CGA, EGA and VGA era shared memory between the CPU and the video card in this way. Of course this configuration has its own problems, so later 3D video cards got their own onboard video RAM, and the frame buffer stopped being directly accessible to the CPU.

Because the way the N64 CPU/GPU handles the frame buffer is different from the way current CPUs/GPUs handle it, there isn't a fast way to completely emulate N64 frame buffer effects.
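
The shared-memory idea above can be sketched roughly as follows. This is only an illustration, not how any emulator or plugin actually does it: the 320x240 16-bit mode, the `FB_OFFSET` address, and the names `gpu_plot` and `cpu_read_pixel` are all made up for the example.

```python
# Hypothetical sketch: one flat block of RDRAM shared by "CPU" and "GPU" code.
RDRAM_SIZE = 4 * 1024 * 1024                   # 4 MB, as stated in the post
WIDTH, HEIGHT, BYTES_PER_PIXEL = 320, 240, 2   # assumed 16-bit 320x240 mode
FB_OFFSET = 0x100000                           # assumed address, illustration only

rdram = bytearray(RDRAM_SIZE)

# The framebuffer is just a window into the same memory, not a separate copy.
framebuffer = memoryview(rdram)[FB_OFFSET:FB_OFFSET + WIDTH * HEIGHT * BYTES_PER_PIXEL]


def gpu_plot(x, y, color16):
    """'GPU' side: write one 16-bit pixel through the framebuffer window."""
    i = (y * WIDTH + x) * BYTES_PER_PIXEL
    framebuffer[i:i + 2] = color16.to_bytes(2, "big")


def cpu_read_pixel(x, y):
    """'CPU' side: read the same pixel straight out of RDRAM, no transfer step."""
    addr = FB_OFFSET + (y * WIDTH + x) * BYTES_PER_PIXEL
    return int.from_bytes(rdram[addr:addr + 2], "big")


gpu_plot(10, 20, 0xF801)
print(hex(cpu_read_pixel(10, 20)))
```

On a PC the "window" does not exist: the rendered image sits in video RAM, so a read like `cpu_read_pixel` first needs a copy across the bus, which is the slow part being discussed here.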
 

Doomulation

?????????????????????????
squall_leonhart said:
Instead of copying the framebuffer to RDRAM, can't the plugin be redesigned to handle all framebuffers via the GPU? This would be possible especially on Pixel Shader 2+ cards, as they have programmable GPUs.

(It's very possible. They are able to do physics and audio on a GPU, so why not framebuffer emulation?)
Yes, an interesting theory. Let the graphics card do the work so the contents need not be transferred. Don't know if it's possible, but it would certainly speed things up. But it would also raise the requirements.
 

MooglyTwo

New member
Rice - actually, with the Xenon's XGPU, you can't directly access the framebuffer. You can get a window into the XGPU memory space, but you can only use it for reading / writing texture data and other data that doesn't directly touch any of the four main FBs or final FB. You could have the XGPU render the FB contents into a texture within an area of memory that you've marked as write-combined (both the XGPU and CPU can't blindly access the full memory space, you have to mark shared memory as such), or you can use the CPU to write a texture to the aforementioned area of memory, and then render that texture in screenspace to the FB, but you cannot explicitly read bytes out of the FB. So in any case, there's no free lunch, and you wind up with about the same performance hit as you would with DMA setup time on the PS3. Furthermore, you can command the RSX on the PS3 to pull texture data from main RAM, but you have a slight performance hit when doing so.

What I'd be more curious about, as Doom / squall mentioned, is emulating the RDP completely on a consumer video card's GPU. You'd need a bleeding-edge card, but it might be possible, and in fact it might be marginally more realistic in terms of the way the RDP works. You pass the RDP packets themselves to the GPU, which in turn processes the packet and renders the triangle (or performs the necessary function). There would be an additional command set implemented to write or read back only the emulated FB space (rather than the full framebuffer), and you could simply read back the FB at the start of the frame, run the emulation until the end of the frame (with any FB reads and writes happening with the cached FB contents), then at the end of the frame you write the emulated FB back to your program running on the GPU. Slow, but not quite as slow.
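
A minimal sketch of that read-back / cache / write-back cycle, assuming a made-up `FakeGpu` class standing in for the GPU-side RDP program (it is not a real API, and the access list format is invented for the example):

```python
class FakeGpu:
    """Hypothetical stand-in for a GPU-side RDP emulator; not a real API."""

    def __init__(self, size):
        self.vram_fb = bytearray(size)   # emulated FB as it lives on the card
        self.transfers = 0               # counts slow bus crossings

    def read_back(self):
        self.transfers += 1
        return bytearray(self.vram_fb)   # copy: card -> system RAM

    def upload(self, data):
        self.transfers += 1
        self.vram_fb[:] = data           # copy: system RAM -> card


def run_frame(gpu, cpu_accesses):
    """Read back once, serve every CPU access from the cache, upload once."""
    cached_fb = gpu.read_back()          # start of frame
    results = []
    for op, addr, value in cpu_accesses:
        if op == "read":
            results.append(cached_fb[addr])
        else:                            # "write"
            cached_fb[addr] = value
    gpu.upload(cached_fb)                # end of frame
    return results


gpu = FakeGpu(size=16)
reads = run_frame(gpu, [("write", 3, 0x7F), ("read", 3, None), ("read", 4, None)])
print(reads, gpu.transfers)
```

However many FB reads and writes the emulated CPU does during the frame, the bus is only crossed twice. The price is that anything the GPU draws mid-frame is not visible to the CPU until the next read-back.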
 

Doomulation

?????????????????????????
Something else... would it be possible to share the load? The CPU can emulate half the RDP and the GPU the rest. The CPU could only send half the framebuffer to the GPU, then, or something like that. I don't know if such a thing is possible, or feasible?
 

squall_leonhart

The Great Gunblade Wielder
Well, the way it is now, the CPU does most of the work. With most games these days, the CPU and GPU both do a lot of work.

The only time PJ64 makes hard use of the GPU is when AA is on at high levels.
(I have 12xS AA on in a profile with 16xAF for PJ64 and there's no slowdown. Strangely, 8xS (the old method, not RGM) causes a slowdown, but 12xS looks sweeter than 8xS and goes faster :p Weird.)

Anyway, as I was saying, the GPU has direct access to VRAM like the N64 core does to its RDRAM, so maybe handling the framebuffer on the GPU is a better idea?

The N64 core is, after all, an MPU (media processing unit, basically both a CPU and GPU).

Doomulation, I think that it would require a special DX9 plugin to do it anyway, so the plugin wouldn't work on lesser cards anyhow.
 

gandalf

Member ready to help
Offtopic: how did you enable 12xS? Did you use RivaTuner? (Is it possible to make profiles with RivaTuner?! :p)


EDIT: I found out how to make profiles with RT :)
 

Doomulation

?????????????????????????
squall_leonhart said:
Anyway, as I was saying, the GPU has direct access to VRAM like the N64 core does to its RDRAM, so maybe handling the framebuffer on the GPU is a better idea?

The N64 core is, after all, an MPU (media processing unit, basically both a CPU and GPU).

Doomulation, I think that it would require a special DX9 plugin to do it anyway, so the plugin wouldn't work on lesser cards anyhow.
But the point was, maybe it would be too slow for the card? If it was, then the hybrid method I suggested might offload work from the GPU.
And yes, I know it would require an up-to-date card, but why spoil the fun for those who have those cards? The old method can be used for old cards, basically, and the new one for new, better cards.

...But this is all just a wish :p
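
The "old method for old cards, new method for new cards" idea could look something like this. The function name, the backend labels, and the Pixel Shader 2.0 cutoff are assumptions for illustration only (the cutoff is taken from the "Pixel shader 2+" suggestion earlier in the thread), not anything an existing plugin does.

```python
# Assumption: Pixel Shader 2.0 is the minimum for the GPU-side path.
MIN_GPU_PATH_VERSION = (2, 0)


def pick_framebuffer_backend(pixel_shader_version):
    """Return which (hypothetical) framebuffer path to use for a given card.

    pixel_shader_version is a (major, minor) tuple, e.g. (1, 4) or (3, 0).
    """
    if pixel_shader_version >= MIN_GPU_PATH_VERSION:
        return "gpu"            # new method: keep the framebuffer on the card
    return "rdram_copy"         # old method: copy the framebuffer to RDRAM


for version in [(1, 4), (2, 0), (3, 0)]:
    print(version, pick_framebuffer_backend(version))
```

Whether a card that merely supports the shader version is also fast enough is a separate question, so a real plugin would probably still want a manual override.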
 

squall_leonhart

The Great Gunblade Wielder
Well.. anything from the 5600 up would be capable of this, and remember, the GPU is more efficient at handling framebuffers (as the framebuffer is a video-orientated function).
 

Doomulation

?????????????????????????
Perhaps. We shall see.
And remember that the GeForce FX line was pathetic.
The GPU is fast at rendering graphics, but how good is it at Pixel Shaders? How fast? Can you dynamically create Pixel Shader instructions? Can it fit within the Pixel Shader instruction limit?
 

gandalf

Member ready to help
Well, DirectX 10 will be more friendly with shaders, but of course, nobody has a DX10 card yet :p
Anyway, the cards get more shader units every year. A 7300 GT has 8 pixel pipes, for example, and it's based on the 7600 GT core, so it's fast and cheap. It could handle shaders just fine.

The cards are very programmable today; we can even do some physics on them.
 

Doomulation

?????????????????????????
Indeed, we shall see. Perhaps it isn't expensive at all, and perhaps it is. I would guess most of the expense with frame buffers comes from transferring them back and forth. The GPU is weaker in raw power than the CPU, but it has much, much faster access to its own RAM.
 

squall_leonhart

The Great Gunblade Wielder
The CPU isn't designed for handling graphics though; it is designed for math interpretation and calculation.

The GPU, on the other hand, is designed to handle graphics and framebuffers.

The 5600 Ultra and up weren't that bad. Sure, not the fastest at Pixel Shader 2.0, but still capable of it, and not the slowest either.
(I've run my 5900XT in Far Cry at ultra high settings at 1280x1024 at about 25-30 fps. Had to tweak my aperture settings to do it...)

I also run Halo via the ATI shader path (using DxTweaker) and there's no slowdown... (very bad miscalculation on Bungie's part).

Dx9b (PS2.0a and 2.0b) can handle up to 1536 Pixel shader instructions... and 65,555 vertex instructions. which should be enough... if not SM3.0 can handle unlimited Vertex instructions (in theory) and 65,555 pixel shader instructions.

SM4.0 (specs yet to be set) supports unlimited unified shader instructions.
 

Poobah

New member
squall_leonhart said:
Dx9b (PS2.0a and 2.0b) can handle up to 1536 Pixel shader instructions... and 65,555 vertex instructions. which should be enough... if not SM3.0 can handle unlimited Vertex instructions (in theory) and 65,555 pixel shader instructions.
I hope you mean 65535 and not 65555, because that would be a rather unintelligent waste of a bit.
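
A quick check of the "waste of a bit" remark. The variable names are just for the example:

```python
claimed = 65555            # the number from the quoted post
limit_16bit = 2**16 - 1    # largest value that fits in 16 bits

print(limit_16bit, hex(limit_16bit))   # the 16-bit maximum, decimal and hex
print(limit_16bit.bit_length())        # bits needed for the 16-bit maximum
print(claimed.bit_length())            # bits needed for 65555
```

65535 is 0xFFFF and fits in exactly 16 bits, while 65555 is just over that and would need a 17th bit.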
 

squall_leonhart

The Great Gunblade Wielder
Yeah :p I just put 65555 coz I couldn't find the exact number at the time...

All the sites that used to compare SM2 to SM3 have vanished :|
 

Doomulation

?????????????????????????
You should have written 0xFFFF :p We should bug someone to test it out. Maybe someone could bring it up in the PJ64 private dev board and see if Jabo is willing ;)
 
