Rice to OpenGL 2.1 (and potentially ES)

Narann · Jul 14, 2014

Hi guys! As you may know, I'm rewriting parts of mupen64plus-video-rice to make it more OpenGL 2.1 compliant (which means rewriting almost everything related to graphics...). The idea is to throw away the old deprecated OpenGL functions (and the fixed pipeline) and end up with code that can easily be ported to OpenGL ES 2 (used in a modern way, OpenGL 2.1 is supposed to be easy to port to OpenGL ES 2).

I'm very new to the N64 HLE scene, so I'm learning as I go. Massive thanks to Gonetz's blog and his answers, which really helped me get started. You have no idea how valuable guru comments are when you don't know where to start.

Before making the big jump to massive shader use (aka the OpenGL 2.1 refactor), my two big bugfixes were fixing fog (fog in Rice!) and fixing the fractional part of the insertMatrix command (which fixes the leaning trees in SSB64). Unfortunately, fog (and a lot of other things) is broken now that the Color Combiner has been rewritten to avoid deprecated OpenGL functions (full GLSL). I surprised myself by actually reading MESS's code and other plugins' code to see whether I understood it or not. I must admit my programming knowledge has jumped since I started working on Rice (and I still have a lot to learn).
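For readers wondering what the "fractional part" is: as far as I know, N64 matrices are stored as signed 16.16 fixed point, with the 16 integer halves first and the 16 fractional halves after them. A minimal sketch of the conversion, assuming that layout (the names `fixedToFloat` and `loadFixedMatrix` are made up for illustration and are not Rice functions):

```cpp
#include <cstdint>

// Combine the signed 16-bit integer half and the unsigned 16-bit fractional
// half of one fixed-point matrix element into a float.
// Assumption: value = (integer half * 65536 + fractional half) / 65536.
float fixedToFloat(int16_t intPart, uint16_t fracPart)
{
    // Multiply instead of shifting so negative integer halves stay well defined.
    const int32_t raw = static_cast<int32_t>(intPart) * 65536
                      + static_cast<int32_t>(fracPart);
    return static_cast<float>(raw) / 65536.0f;
}

// Convert a whole 4x4 matrix: 16 integer halves followed by 16 fractional halves.
void loadFixedMatrix(const int16_t intParts[16], const uint16_t fracParts[16],
                     float out[16])
{
    for (int i = 0; i < 16; ++i)
        out[i] = fixedToFloat(intParts[i], fracParts[i]);
}
```

With this layout, ignoring the fractional half turns 1.5 into 1.0 and -0.5 into -1.0, which is the kind of rounding error that can visibly skew geometry.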

Anyway, I chose to create this thread after a lot of investigation into the relation between ALPHA_CVG_SEL and CVG_X_ALPHA. Digging around the internet I found this thread (it seems to be from the good old days when everything still had to be done and so many people were discussing) and I have the impression it is the way to go: create a thread so others can follow your work, and maybe get some help. So now I think I've (kind of) understood.

What I hope (we will see) is that people here can help me, just by giving me the right word I missed or should look for. When you are digging into a problem, often a simple "Did you check <magic word here>?" is enough to make you look in the right place. :inlove:

Don't expect massive activity (this is a spare-time project), but I will try to update this thread each time I get something new.

What took my time recently (and I should definitely move on to something else, as there is a lot of work): the relation between ALPHA_CVG_SEL and CVG_X_ALPHA:

CVG_X_ALPHA alone: the surface is clamped (coverage) if alpha == 0. In other words: do a nice alpha blend, but clamp the places where alpha is not needed. This creates a transparency artefact when two transparent objects are displayed. 1080SB overuses this, and you can see the artefact here on the trees you are going under (on the left). It could be considered a "strict cutoff".

Having CVG_X_ALPHA avoids this:

and gives something like this:

While it does the job, it is not perfect. This is why we often see this:

CVG_X_ALPHA and ALPHA_CVG_SEL: the surface is clamped if alpha > 0.5. (0.5 is an arbitrary value, even if I'm sure there is a proper way to find it. I guess it's because coverage is anti-aliased, and I guess there is no point in setting CVG_X_ALPHA and ALPHA_CVG_SEL without AA_EN, but I need to investigate.)

Anyway, this works: the Mario Kart kart sprites (CVG_X_ALPHA and ALPHA_CVG_SEL) are nicely clamped. The MK64 items and the 1080SB border trees (CVG_X_ALPHA alone) are not, but they still pass an "honest" Z depth test.

If you look at the 1080SB video carefully at the given time (the effect is visible at 6min20s), you will see the transparency is not harsh but a little rounded (hard to see in the video, I agree, which is why I'm asking for real N64 screenshots. :whistling: ). Actually, using > 0.1 instead of == 0 gives a better visual result at higher resolutions, as == 0 shows square pixels on the transparency borders (I guess it's not a problem at native resolution); > 0.1 loses a little of the border (most of the time it has no visual impact on the alpha blending, so...).
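A minimal sketch of one possible reading of the thresholds discussed above. The posts are loose about the direction of the test, so this assumes that a fragment is dropped when its alpha is at or below a cutoff chosen from the two flags. `CoverageMode`, `alphaCutoff` and `keepFragment` are hypothetical names, not Rice code, and the case where neither flag is set is not covered by the post:

```cpp
// Hypothetical helper types, not taken from Rice.
struct CoverageMode {
    bool cvgXAlpha;    // CVG_X_ALPHA
    bool alphaCvgSel;  // ALPHA_CVG_SEL
};

// Pick the alpha cutoff for the current mode.
// Assumed reading: a fragment is dropped when alpha <= cutoff.
// A negative cutoff means no alpha-based cutoff applies.
float alphaCutoff(const CoverageMode& mode, float softCutoff = 0.0f)
{
    if (mode.cvgXAlpha && mode.alphaCvgSel)
        return 0.5f;        // the arbitrary value mentioned in the post
    if (mode.cvgXAlpha)
        return softCutoff;  // 0.0 for the "== 0" test, 0.1 for the "> 0.1" variant
    return -1.0f;           // not discussed in the post: keep everything
}

bool keepFragment(const CoverageMode& mode, float alpha, float softCutoff = 0.0f)
{
    return alpha > alphaCutoff(mode, softCutoff);
}
```

In a GLSL fragment shader the same decision would typically become a `discard` guarded by the cutoff, with the cutoff passed in as a uniform.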

If I missed something (I'm sure AA_EN has a word to say here), don't hesitate to tell me.

What I still have to do:

Rewrite the Fill shaders (they currently use deprecated OpenGL functions and don't work anymore).
Rewrite the Copy shader.
Some triangles with points outside the camera now disappear; investigate.
Integrate fog computation in the Blender (aka the last step of the Color Combiner).
Clean up the classes. The way classes are abstracted is outdated (built around a fixed OpenGL/DirectX pipeline) and confusing (the biggest proof is that they are used inconsistently all over the code). OGLCombiner had five inheritance levels before I started cleaning; now it has three... Same for RSP_Parser, which is supposed to call the RenderBase class, which is supposed to call the Render class, which is supposed to call the OGL/DirectXRender class... I don't know what the point of RenderBase is and, once again, the Render class is designed to work in a fixed (and deprecated) pipeline way.
FrameBuffer emulation has always worked poorly in Rice; investigate.

darkjack · Jul 17, 2014

Fixing fog and trees in sb64 along with porting to a newer opengl sounds fantastic and I wish you the best of luck! Is there any reason why you would want to use an even newer opengl version or is 2.x all you need?

Narann · Jul 17, 2014

Thanks for the kind words darkjack.

About "limiting" myself to OpenGL 2.1, there are multiple reasons:

One of Richard's main ideas was to throw away the deprecated OpenGL stuff and have a code base that could be shared between OpenGL and OpenGL ES. As OpenGL 2.1 is supposed to be easily portable to OpenGL ES 2, it was the obvious choice.
The Rice code is very old and uses the fixed graphics pipeline paradigm, so jumping from the current state to OpenGL 3+ would be difficult. You would end up with a plugin that stops working the moment you use advanced modern OpenGL functions. So OpenGL 2.1 is the smooth path. Maybe later we could discuss pushing things further but seriously, there is so much work to do that you can forget that for now.
My current laptop only supports OGL 2.1, so...

But once I have an OpenGL 2.1 compliant plugin with all the old stuff removed (it will take time), Rice should be easy to port to OGLES2.

Narann · Jul 19, 2014

I'm progressing slowly. My day-to-day job is to avoid breaking things as much as possible. Every time I fix something, something else breaks.

I've opened an imgur album (imgur provides embeddable code, but I guess the emutalk forum needs a plugin to properly read the code and avoid injection).

imgur album here

What's not in the album:

The Doom intro shows weird color effects on some faces (the ones supposed to be lit).
I have the impression there are some performance issues, but I'm not sure whether they are related to GLSL or not.
I have a hard time setting Z depth properly. Reading the glide64mk2 code, I have the impression it is not trivial.

Narann · Jul 23, 2014

Hi all!

I'm slowly moving forward with the migration from the fixed pipeline to shaders.

One of the last bugs is that some triangles are missing. You can find an example here:

I already hit this bug when I tried to get the OpenGL ES2 code working quickly.

After A LOT of investigation, it seems to be a Mesa bug related to an inconsistency in the OpenGL spec (read Paul Berry's answer):

https://bugs.freedesktop.org/show_bug.cgi?id=64668

TL;DR (if I understood correctly): in the fixed pipeline, you need to specify the near plane and the far plane, as all triangles are computed at the same time. In the shader pipeline, the near and far planes "can be" defined by the original position of the vertex (or manually using gl_ClipVertex). By default, Mesa uses gl_Position as gl_ClipVertex. In our case (N64) the projected vertex position is already computed, so the original position is never stored on the GPU.

Something like that...

Doing:

export LIBGL_ALWAYS_SOFTWARE=1

solves the problem (but it's damn slow, of course), so it's not related to the code but to Mesa.

As I was not sure it was related to the ticket above, I opened a new one:

https://bugs.freedesktop.org/show_bug.cgi?id=81649

I will try to find a workaround (maybe something related to gl_ClipVertex).

Anyway, I've spent too much time on this (and it was frustrating), so I will leave it as is for now and come back once I have a solution.

But I have one last question for the N64 gurus:

In modern GPU code, you often store the local vertex position and use a vertex shader to do something like: gl_Position = modelViewMatrix * gl_Vertex

The N64 doc says that when a vertex is sent to the RDP, its original local position is never stored; the stored position is the vertex position multiplied by the projection matrix multiplied by the model matrix stack... (aka: the vertex is projected at the moment it is sent to the RDP).

Understanding this, what can happen is: you put the two matrix stacks in a certain state and push a vertex (vertex1), then you modify the matrix stacks and push another vertex (vertex2), then you modify the matrix stacks again and push the last vertex (vertex3). If these three vertices are supposed to form a triangle, you can't use a single matrix in the vertex shader, like: gl_Position = modelViewMatrix * gl_Vertex. So for now you need to project the vertex on the CPU side.
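The scenario above can be sketched as follows: each vertex is transformed with whatever matrix is current at the moment it is loaded, and only the projected result is kept. This is an illustration under assumptions, not Rice code; `VertexCache`, `transform` and `translation` are hypothetical names, and a single combined matrix stands in for the two N64 matrix stacks:

```cpp
#include <array>
#include <vector>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;  // row-major, column vectors

// Multiply a 4x4 matrix by a column vector.
Vec4 transform(const Mat4& m, const Vec4& v)
{
    Vec4 out{};  // zero-initialized accumulator
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            out[r] += m[r][c] * v[c];
    return out;
}

// Build a translation matrix (the identity when x = y = z = 0).
Mat4 translation(float x, float y, float z)
{
    return {{ {{1.0f, 0.0f, 0.0f, x}},
              {{0.0f, 1.0f, 0.0f, y}},
              {{0.0f, 0.0f, 1.0f, z}},
              {{0.0f, 0.0f, 0.0f, 1.0f}} }};
}

// Hypothetical vertex cache: projects at load time and forgets the local position.
class VertexCache {
public:
    void setMatrix(const Mat4& m) { current_ = m; }
    void loadVertex(const Vec4& local)
    {
        projected_.push_back(transform(current_, local));
    }
    const std::vector<Vec4>& projected() const { return projected_; }

private:
    Mat4 current_ = translation(0.0f, 0.0f, 0.0f);
    std::vector<Vec4> projected_;
};
```

Because the three vertices of one triangle may have been loaded under three different matrices, the vertex shader in such a design would simply pass the already projected position through instead of multiplying by a uniform matrix.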

Can anyone tell me whether this statement is true or false?

Big thanks in advance!

Now about the Rice code. Here is the rough plan:

Now that I have finished the Color Combiner, remove once and for all everything related to the fixed pipeline (ColorCombiner4, FragmentColorCombiner, deprecated commands, etc.). On this part, thanks to those of you who brought the OpenGL ES 2 code into the Rice branch; it truly helped me, and some parts were just "remove the old code, take the OpenGL ES2 code".
I had to dig in different places and I think I have a good picture of how the "Parser", the "RenderBase" (both are not classes but C functions) and the global "RDP struct" work. I have thought about refactoring some places. Example: the global RDP struct stores both values from the N64 RDP registers and values for the renderer (e.g. a dword color value converted to an array of floats). I don't know if it's needed or not, but the Rice code is an inconsistent mess and I don't know where to start. For those who have already dug into the Rice code: which place most needs a refactoring? I need your advice about what the biggest priority is.
The g_vtxProjected5, g_texRectTVtx, g_vtxBuffer, g_oglVtxColors, etc. (all global arrays of floats) are a problem. They make the code hard to read.
I would also like to decrease the number of inheritance levels. I'm not a big fan of inheritance in general, because you often inherit just to add stuff on top of what has already been written, and doing so year after year leads to unmaintainable code. I would like to limit inheritance to two levels (abstract logic class + implementation class).
Buy a Raspberry Pi to write the OpenGL ES2 compilation code.
The Texture and Framebuffer code is massive, but I'm sure some of the small color corrections done on the CPU could be done on the GPU side using a framebuffer. I will not look at it soon, but I'm sure this part is the biggest performance bottleneck of Rice.

A lot of work to do. Code cleanup first, improvements after.

Once again, you can see some screenshots with explanations here.

darkjack · Jul 24, 2014

Glad to see you are still making progress! That's too bad that that bug was with mesa. Maybe I can help test in the future since I'm using the closed Nvidia driver.

Narann · Jul 24, 2014

That would be nice!

Are you comfortable with compiling? mupen64plus is very easy to compile, and I could help you with it.

If so, my repo is here: git clone it, compile it, and use a command line like this one and it should work:

Code:

./mupen64plus --corelib /home/login/mupen64plus-local-bin/libmupen64plus.so.2 --gfx mupen64plus-video-rice.so --plugindir /home/login/mupen64plus-local-bin/ '/home/login/n64/Super Smash Bros..z64'

I suggest you back up your .config/mupen64plus/mupen64plus.cfg, because my plugin also updates this file in a "not so clean" way for now (I don't handle config file updates yet).

darkjack · Jul 24, 2014

I've got your code cloned (also figured I'd get the project from git as well) and I think I've set it all up right. I just linked all the plugins I built into a folder called plugins. I had some trouble with using plugindir, but using full paths for each plugin seemed fine.
Here are some screenshots I took
https://imgur.com/aIPJw37
https://imgur.com/s5CzP0E

A bit of system information
Nvidia 340.24
3.15.5-2-ARCH

Anything you want me to test just throw it up here

Narann · Jul 24, 2014

Thanks a lot darkjack!

These results look like mine, so that's great!

You didn't mention it, so I ask: did you see any missing faces on triangles near the camera?

weinerschnitzel · Jul 26, 2014

I would work on refactoring RenderBase before the DL Parser.

Narann · Jul 26, 2014

Hi weinerschnitzel, I agree, a RenderBase refactor would be great. Actually, I'm starting to sketch how the classes work and how they could work even better. I wonder if the "RenderBase" C functions could be merged into the Render class.

For now we have:

DL_Parser functions (parse the dword commands) send raw results to -> RenderBase functions (convert the raw values to "renderable stuff", e.g. uint8 colors to float) -> which call the Render abstract class.

RenderBase works with a lot of global "used anytime, everywhere" variables.

But organizing RenderBase would improve general code consistency.
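As an illustration of the "raw value to renderable stuff" step, here is a minimal sketch of unpacking a color dword into floats. It assumes the color is packed as 0xRRGGBBAA with 8 bits per channel; `unpackColor` is a hypothetical name, not a function from Rice:

```cpp
#include <array>
#include <cstdint>

// Unpack a 32-bit color into four floats in the [0, 1] range.
// Assumption: red is in the top byte and alpha in the bottom byte (0xRRGGBBAA).
std::array<float, 4> unpackColor(uint32_t rgba)
{
    return {{
        static_cast<float>((rgba >> 24) & 0xFF) / 255.0f,  // red
        static_cast<float>((rgba >> 16) & 0xFF) / 255.0f,  // green
        static_cast<float>((rgba >> 8) & 0xFF) / 255.0f,   // blue
        static_cast<float>(rgba & 0xFF) / 255.0f           // alpha
    }};
}
```

Returning the converted value instead of writing it into a shared global array is one way to keep the parser side and the renderer side separate.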

death--droid · Jul 27, 2014

Got to agree with weinerschnitzel, RenderBase needs one hell of a refactor... I keep trying to find something to distract me whenever I have a look at it. The DL Parser isn't actually too bad.

Narann · Jul 27, 2014

OK, so as you guys have already read the RenderBase "function set", here is my opinion; feedback welcome:

RenderBase was a bunch of functions "related to rendering". Some were just "set this in the gRDP struct", some were more "utilities" (the inline ViewPortTranslatef_x, which transforms a viewport pixel position to a float equivalent) and some were highly important (InitVertex(), PrepareTriangle(), ProcessVertexData(), etc.). Notice this last one are ca
Trying to understand the original author's intention, I think the gRDP and gRSP structs were supposed to be encapsulated in RenderBase (InitRenderBase() initializes all of gRDP and gRSP).

I suggest:

Separate the RDP getters/setters and the RSP getters/setters into two distinct classes. These classes will encapsulate gRDP and gRSP. In the future, both structs could disappear, replaced by .
Then have a RenderBase class that gathers all the rest.
Then, anything in RenderBase which is not called exclusively by the parser (e.g. ViewPortTranslatef_x) is put in a more appropriate class (Render in this last case).
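A minimal sketch of what the first step of this suggestion could look like: a small class that owns a piece of RDP state and exposes it only through a setter and a getter. Everything here is hypothetical (the class name `RdpState`, the choice of fog color as the example field, the dirty flag); it is not how Rice is written:

```cpp
#include <cstdint>

// Hypothetical wrapper around a piece of RDP state that would otherwise
// live in a global struct. None of these names exist in Rice.
class RdpState {
public:
    // Called by the parser side when the matching command is decoded.
    void setFogColor(uint32_t rgba)
    {
        fogColor_ = rgba;
        dirty_ = true;
    }

    // Called by the renderer side when it needs the value.
    uint32_t fogColor() const { return fogColor_; }

    // Lets the renderer find out whether anything changed since the last
    // draw, then clears the flag.
    bool consumeDirty()
    {
        const bool wasDirty = dirty_;
        dirty_ = false;
        return wasDirty;
    }

private:
    uint32_t fogColor_ = 0;
    bool dirty_ = false;
};
```

The point of the design is that the parser and the renderer no longer touch the same global directly, so it becomes possible to see who writes a value and who reads it.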

Narann · Jul 28, 2014

lol. Yesterday I started reading the RenderBase code to see how I would refactor it and realized there is a ton of (from my investigation) useless code.

I've removed PostProcessSpecularColor (which does nothing) and PostProcessDiffuseColor. Both functions work in a completely outdated way. Let me explain: these functions take a vertex and modify its color depending on the Color Combiner parameters. It took me a while to understand the point, as my ColorCombiner only uses the vertex color from before the PostProcess modifications. Then I realized this is just a bunch of old code from when there was no other way to do it (OpenGL 1.x)... Each time I dig into such code, I'm very impressed by the massive amount of work that was done to make this possible because there was no other way, compared to how much easier things are now. Anyway, I removed all this PostProcess stuff and realized a lot of code could be thrown away with it.

This opens some doors to what could be removed safely. My next big removal should be the "GeneralCombiner". I have investigated a little, and how it works is totally crazy and genius (even if totally useless now). But if this code is removed, other directly related code could follow!

Note that I use the regression test tools available in mupen64plus-core each time I make a commit and for now, except for some alpha erosion (I know where this problem comes from; sometimes it gives better results than before, sometimes not), I still get the same results as before, so none of the code removal massively breaks things.

So no RenderBase refactor yet; I'm still removing old deprecated code.

PS: The current way Rice deals with specular "seems broken". It gives a specular highlight, but as if the specular light were at the camera position, and it's very shiny. Reading through the doc, it's hard to tell how the specular is actually computed; more investigation/debugging is needed.

death--droid · Jul 29, 2014

Damnnnnn I had been staring at that PostProcessSpecularColor code for so long and never realised it's pretty much useless >.< And yeah, got to agree about the things Rice had to jump through to get things working on older systems :\ and how far APIs have come since then.
Thanks for finding all this no-longer-used code.

Clements · Jul 29, 2014

I believe some of the code was to help people with Geforce 2 and Geforce 4 MX cards back in the day, as Rice tried to make his plugin as compatible with low-end hardware as possible. Glad to hear that it is still being worked on and updated to the modern age without worrying much about legacy stuff.

Narann · Jul 29, 2014

@death--droid: look at PostProcessDiffuseColor and at which vertex color is actually used by the GPU, and you will realize the PostProcessDiffuseColor "path" has no impact on the vertex color. So you remove PostProcessDiffuseColor. Removing it, you realize it calls a lot of functions, so you remove those functions, and if you do it well, you should also realize IColor.h is totally useless, as it is only "really" used in a GetColor() function that uses the demuxed combiner modes to return the color the combiner will use, and GetColor is used by one of the functions called in PostProcessDiffuseColor...

After this, you have another door open to totally remove the GeneralCombiner (and its impressive CombinerTable which, if you follow how it's used, makes you understand how brilliant the author was).

Yeah... I often feel like a spelunker digging into old code no one wants to touch anymore and finding some treasures (even if the end goal is to remove them...).
@Clements: I must admit I feel a little inglorious when I run "git rm" on all these massive files that are no longer useful. All these impressive hacks and tricky code. I read them, try to understand them and think: Damn, so brilliant! And the file is full of stuff like this? So many hours of work! My general programming knowledge has grown since I started doing this. About legacy: mupen64plus-video-rice was dying, so it needed this refresh. I don't want to frustrate people, and I think OpenGL 2.1 is a good compromise (my 32-bit 2008 Intel 965GM chipset is OpenGL 2.1, so I have a tiny computer myself).

PS: I just found why Top Gear Rally had bad text:

I thought it was a UV problem because Rice has a lot of trouble with UVs (so I dug into UVs and aspect ratio, and realized Top Gear Rally actually creates a 1280x480 FrameBuffer...), but it was not that; it's my Color Combiner... For an unknown reason (bad demux? It seems unlikely...) the tex0 color (the one with the letter pixels) is not used in the GLSL shader program (generated from the combiner mode dwords) responsible for drawing text. And I have no idea where to look. Some heavy debugging will be needed (or a simple if(GAMEISTGR) hack, but I hate that).

death--droid · Jul 30, 2014

Awesome thanks for the heads up for that!!

BTW if you're keen on writing a new color combiner I would take a good look at pixel shaders; it should simplify the whole process, which is something gonetz is also doing with his latest plugin. The DirectX version of Rice Video also makes good use of them, so I would check that out as well.

darkjack · Aug 1, 2014

Sorry that this is a bit late, but yes, I didn't notice any missing triangles near the camera, so all is good in Nvidia land. Anything you are looking to see on other hardware?

Narann · Aug 1, 2014

The Top Gear Rally bug drives me crazy! The SetCombiner dwords are properly demuxed (I've compared with the old way and with the Glide64 code), but Top Gear Rally seems to send Texel1 in the place where Texel0 is supposed to be. I guess there is a bug in the way my ColorCombiner generates the GLSL code.
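To make "generating the GLSL code" concrete, here is a minimal sketch of turning one set of demuxed combiner inputs into a GLSL expression. The N64 combiner evaluates (A - B) * C + D; everything else below is an assumption for illustration: the enum is far smaller than the real input set, and `CombinerInput`, `glslName` and `combinerExpr` are hypothetical names, not the actual ColorCombiner code:

```cpp
#include <string>

// Hypothetical, heavily reduced set of combiner inputs.
enum class CombinerInput { Texel0, Texel1, Shade, Primitive, Zero, One };

// Map an input to the name of a GLSL variable or constant.
std::string glslName(CombinerInput in)
{
    switch (in) {
    case CombinerInput::Texel0:    return "texel0";
    case CombinerInput::Texel1:    return "texel1";
    case CombinerInput::Shade:     return "shade";
    case CombinerInput::Primitive: return "primColor";
    case CombinerInput::Zero:      return "vec4(0.0)";
    case CombinerInput::One:       return "vec4(1.0)";
    }
    return "vec4(0.0)";  // not reached for valid inputs
}

// Build the GLSL expression for one combiner cycle: (A - B) * C + D.
std::string combinerExpr(CombinerInput a, CombinerInput b,
                         CombinerInput c, CombinerInput d)
{
    return "(" + glslName(a) + " - " + glslName(b) + ") * "
         + glslName(c) + " + " + glslName(d);
}
```

Dumping the generated expression for the combiner mode that draws the text might show at a glance whether texel1 ends up where texel0 was expected.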

death--droid: I will definitely look at it and see how you generate your pixel shader code! I really hope I will find why mine causes this problem (on this specific game!).

darkjack: Thanks! About TGR, I'm almost sure you will have the problem too, but since you ask: could you test Top Gear Rally (and maybe other Top Gears) to see if the text bug still appears? Maybe you can save me and say: It's OK!

If everything seems OK in my ColorCombiner, I will have to add a hack in the SetCombiner demux part (Rice actually had a lot of hacks right after demuxing the Combiner dwords).

For any N64 guru: reading the doc, I realized N64 vertices can have one and only one texture coordinate per vertex. In Rice there are two. I really wonder if the second is still needed, as the ColorCombiner never uses it.
