Announcement: Cycle-accurate N64 development underway.

OP
MarathonMan

Emulator Developer
So in other words, you don't really know how CEN64 would handle the N64's native 640x480 high-res mode?

CEN64 already handles games that make use of 640x480. It's just a matter of what to do for games that are NOT using the "high-res" mode... for those, right now, I more or less stretch the image to fit the window (as the console would do).
 

Guru64

New member
Didn't mean to get anyone's hopes up, but it's probably going to be a while before I get around to actually doing this.
 
OP
MarathonMan

Emulator Developer
Worked on PIF emulation today. I think there's either a bug or a timing issue somewhere, because some demos will stop animating if I don't play with the PIF (hack). Without the hack, some demos, like the fractal demo, end up with blocked threads that go unanswered... which causes animation to stop after a short while. Hopefully I can narrow down what exactly is causing this issue and stop using the hack.

Also reporting VI/s in the window title now to get a feel for things. I currently only have two machines to test on (and another tomorrow):

2.4GHz Core2 Mobile: ~20 VI/s depending on the ROM.
4.5GHz Sandy: ~35-50 VI/s depending on the ROM.

Once I add cycle delays in, these figures should get a speed boost. This is particularly true for FPU-heavy ROMs like LaC's fire demo... it runs a full 5-10 VI/s slower than the other ROMs that I've tested.
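For anyone wanting to reproduce a VI/s readout like this, here is a minimal Python sketch of one way to measure it: count vertical interrupts and divide by elapsed wall-clock time. The class and function names are hypothetical and not taken from the CEN64 source, and the 60 VI/s full-speed target assumes an NTSC console.

```python
import time


class ViRateCounter:
    """Counts vertical interrupts and reports VI/s over a sampling window.

    Hypothetical helper for illustration only; not CEN64's actual code.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock              # injectable so the counter can be tested
        self._window_start = clock()
        self._count = 0

    def on_vertical_interrupt(self):
        """Call once per emulated vertical interrupt."""
        self._count += 1

    def sample(self):
        """Return VI/s since the previous sample and start a new window."""
        now = self._clock()
        elapsed = now - self._window_start
        rate = self._count / elapsed if elapsed > 0 else 0.0
        self._window_start = now
        self._count = 0
        return rate


def percent_of_full_speed(vi_per_second, target=60.0):
    """Express a VI/s figure relative to a target rate (60 assumes NTSC)."""
    return 100.0 * vi_per_second / target
```

Under that NTSC assumption, `percent_of_full_speed(50)` puts a 50 VI/s run at roughly 83% of full speed.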

Scalability tests:
1x CEN64: 50 VI/s
2x CEN64: ~49 VI/s
3x CEN64: ~47 VI/s
4x CEN64: ~44 VI/s
...
8x CEN64: ~26 VI/s

It falls off a cliff after the 4th instance because I only have 4 cores (with 2 threads each). I've never seen my computer get this hot... I found a new stress test. :devil:
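A quick way to read the scalability figures is to multiply them out. The Python sketch below uses only the numbers posted above (treating the "~" values as exact) and computes the total VI/s across all instances, plus each instance's speed relative to the single-instance run. The function names are made up for illustration.

```python
# Figures copied from the scalability test above: instances -> VI/s per instance.
per_instance = {1: 50, 2: 49, 3: 47, 4: 44, 8: 26}


def aggregate(results):
    """Total VI/s summed over all running instances, per run."""
    return {n: n * rate for n, rate in results.items()}


def efficiency(results):
    """Per-instance rate as a fraction of the single-instance rate."""
    base = results[1]
    return {n: rate / base for n, rate in results.items()}


totals = aggregate(per_instance)
ratios = efficiency(per_instance)
for n in per_instance:
    print(f"{n}x: {totals[n]} VI/s total, {ratios[n]:.0%} of single-instance speed")
```

Seen this way, total throughput keeps climbing (50, 98, 141, 176, then 208 VI/s at 8 instances) even though each individual instance drops to about half speed once the instances outnumber the physical cores.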
 

beannaich

New member
CEN64 already handles games that make use of 640x480. It's just a matter of what to do for games that are NOT using the "high-res" mode... for those, right now, I more or less stretch the image to fit the window (as the console would do).

This is the best way to handle resolution changes. The other way of handling this (which is uglier in my opinion) is to resize the window and re-initialize your GL objects to whatever the resolution is. I think making the window/GL objects match the largest resolution possible and then letting your graphics card scale up the screen is the better option. There may be some purists who will be upset that the pixels won't be right after bilinear filtering is applied, so you may want to include a nearest-neighbor option. This is still great work, and I have no doubt that the finished product will be awesome :)
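To make the nearest-neighbor suggestion concrete, here is a minimal Python sketch of what that scaling does on the CPU: every destination pixel copies the single closest source pixel, so nothing gets blended. The function name and the flat row-major pixel list are assumptions for illustration. In a GL renderer the same choice is normally made by setting the texture filter to `GL_NEAREST` instead of `GL_LINEAR`, rather than by looping over pixels.

```python
def scale_nearest(src, src_w, src_h, dst_w, dst_h):
    """Upscale a row-major pixel list using nearest-neighbor sampling.

    Each destination pixel maps back to exactly one source pixel,
    so no new (blended) colors are introduced.
    """
    dst = []
    for y in range(dst_h):
        sy = y * src_h // dst_h          # source row for this output row
        for x in range(dst_w):
            sx = x * src_w // dst_w      # source column for this output column
            dst.append(src[sy * src_w + sx])
    return dst


# A 2x2 image doubled to 4x4: every source pixel becomes a 2x2 block.
small = [1, 2,
         3, 4]
big = scale_nearest(small, 2, 2, 4, 4)
```

Scaling 320x240 to 640x480 is the same exact 2x case, which is why nearest-neighbor keeps pixel edges crisp there.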

Side note: I believe I saw an error in your implementation of NOR: it was being implemented as XNOR, and none of the datasheets I have mention that behavior. I would have changed it but I'm not sure whether or not I'm right, so I figured I'd just bring it up instead :-D
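For reference, a minimal Python sketch of the difference between the two operations, assuming 64-bit registers. The function names are hypothetical and this is not the CEN64 code. MIPS NOR is defined as the complement of `rs OR rt`, whereas XNOR would be the complement of `rs XOR rt`.

```python
MASK64 = (1 << 64) - 1  # keep results within a 64-bit register


def nor64(rs, rt):
    """MIPS NOR: bitwise complement of (rs OR rt)."""
    return ~(rs | rt) & MASK64


def xnor64(rs, rt):
    """XNOR: bitwise complement of (rs XOR rt). Not what NOR should do."""
    return ~(rs ^ rt) & MASK64


# The two agree when both inputs are zero...
assert nor64(0, 0) == xnor64(0, 0) == MASK64
# ...but diverge wherever both inputs have the same bit set.
assert nor64(MASK64, MASK64) == 0
assert xnor64(MASK64, MASK64) == MASK64
```

The common `nor rd, rs, $zero` idiom (used as a bitwise NOT) gives the same result under both definitions, which may be why a bug like this can go unnoticed for a while.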
 
OP
MarathonMan

Emulator Developer
Whoops... that's a bug. Fixed, thanks! :D
 

sanni

New member
Under Ubuntu x64 12.10 and gcc 4.7.2 I had to add -lm to the LIBS in the makefile or it wouldn't compile.
 

Nintendo Maniac

New member
This is the best way to handle resolution changes. The other way of handling this (which is uglier in my opinion) is to resize the window and re-initialize your GL objects to whatever the resolution is. I think making the window/GL objects match the largest resolution possible and then letting your graphics card scale up the screen is the better option. There may be some purists who will be upset that the pixels won't be right after bilinear filtering is applied, so you may want to include a nearest-neighbor option.
If we could force high-res mode then you could just display the 640x480 image in fullscreen natively. ;)

Anyway, a nearest-neighbor option would definitely be useful for any CRT display. I can tell you from playing AM2R on XP that 320x240 with bilinear scaling to 640x480 in fullscreen on a CRT ain't very pretty.

Also, being able to use the monitor or the GPU's upscaling would indeed be useful since such things can be improved in time. However, similar to bsnes/higan it may be wise to allow external filters and upscalers.
 
OP
MarathonMan

Emulator Developer
Found the bug in the PIF!

Fractal zoomer now boots and animates the greeting screen without the hack.
Gained an extra ~0.5 VI/s or so (?) after removing said hack.
In oman's pong, the computer plays pong by itself because it thinks there's no controller in P2.

Awesome. :)

EDIT: Found another couple optimizations in the core resulting in another couple VI/s boost. Some ROMs are now exhibiting as high as 55 VI/s+ and usually never fall below 45 VI/s on my machine.

EDIT 2: Even more success! Today has been great... fixed a bug in the memory system that was preventing LaC's fire demo from properly rendering (the flames now reach the top of the screen). In addition, the demo also runs slightly faster. And, if that wasn't all, Namco Museum appears to be *very* close to booting!
 

Zuzma

New member
That's really impressive! If I wasn't tied to windows I think I'd be testing it out myself. Or maybe it'll work inside virtual box if I put some effort into actually installing linux. :p
 
OP
MarathonMan

Emulator Developer
Once I got it to play Namco, I was going to look into building it without any SSE-x requirement. So hold that thought... just be ready to test!
 

Zuzma

New member
Oh I'll be ready for sure... er assuming I can build the thing. The last time I used linux was some old build of redhat in the late 90's. Will it work on fedora linux? I'm guessing that's sorta the bleeding edge desktop friendly version of it.
 
OP
MarathonMan

Emulator Developer
It'll work on anything that has gcc-4.7.2... so Ubuntu, Fedora, Debian (Wheezy), ... take your pick.
 

Zuzma

New member
DoT DOT! haha thanks a lot MM! I'm going to have just as much fun setting this thing up as I am using it. :)
 

sanni

New member
If you just wanna try it real quick here is a short tutorial:

- download the Xubuntu 64 Bit 12.10 live cd from here: http://xubuntu.org/getxubuntu/
- once it's done booting, open a terminal by right-clicking on the desktop and choosing open terminal/cmd here
- type the following lines into the terminal, pressing [RETURN] after each one:
sudo apt-get update
sudo apt-get install build-essential
sudo apt-get install git
sudo apt-get install libglfw-dev
git clone https://github.com/tj90241/cen64.git
- now change to the newly created cen64 directory and open a terminal in there too and type:
git submodule init
git submodule update
make debug
- now download and copy the pifdata.bin into the cen64 directory, just google for "n64 bios mess"
- download lac's firedemo, make sure it's in z64 file format and copy it to cen64
- write into the terminal
./cen64 pifdata.bin fire.z64

I got around 7 VI/s with a 4.3GHz Sandy i5 that way. Surely installing Linux instead of running a live CD will improve the performance greatly.
 
OP
MarathonMan

Emulator Developer
Couple things to add:
If you type just 'make' (without debug), your performance will be like... >5x better.

I only merge the plugins into the root repo every so often. To always get the latest version of a plugin, cd into that plugin's directory, check out master, pull, and then rebuild from the root:
Code:
cd vr4300
git checkout master
git pull
cd ..

make
 

Hacktarux

Emulator Developer
Moderator
Hi,

Just quickly reviewed your code. I did not spend enough time reading it, so maybe I missed something. Where are you handling the number of cycles required by each component? As I see it, every read and write operation is done instantly. For example, I had a hard time when I tried to implement some kind of PIF timing in my cycle-accurate emulator, as communication with this chip is particularly slow. The other tricky parts were mostly handling correct pipeline stalls in all cases, data and instruction caches (not really complex in themselves, but they commonly add complexity when debugging something) and CP0 hazards (I'm really not sure it's that important to emulate them). I will try to understand this evening how you are implementing these parts :)

Anyway, I'm glad I'm not the only one interested in progressing in that direction. I hope I will have more time soon to plug in the N64 again, along with everything required to run tests on it, and make progress again on my little project.
 
OP
MarathonMan

Emulator Developer
Hi Hacktarux,

Right now, I'm more focused on modeling flow and control... not so much on the timings. That being said, I do have the means for creating the delays in place already (see the beginning of CycleVR4300, for example). Essentially, each device is responsible for managing its own delays, and for broadcasting information about those delays to other components as needed. The former is done within each plugin itself (perhaps the VR4300 is stalling on full write buffers), and the latter is communicated through the bus component (an RDRAM access occurred; now how long should the requesting device wait until it actually "receives" that data?). In some ways, I've basically separated emulation and simulation, and use the simulated statistics to drive the emulation.
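As a rough illustration of that split, here is a minimal Python sketch in which the bus reports how long an access takes and the requesting device counts down its own stall. Everything in it is hypothetical: the class names, the 3-cycle latency, and the structure are made up for the example and are not taken from CycleVR4300 or any CEN64 plugin.

```python
class Bus:
    """Returns data plus the delay the requester should observe.

    The latency value is an arbitrary placeholder, not a measured timing.
    """

    def __init__(self, rdram_read_latency=3):
        self.rdram_read_latency = rdram_read_latency
        self.memory = {}

    def read(self, address):
        return self.memory.get(address, 0), self.rdram_read_latency


class Cpu:
    """A device that manages its own stall counter, one call per cycle."""

    def __init__(self, bus):
        self.bus = bus
        self.stall_cycles = 0
        self.pending = None      # value in flight, not yet "received"
        self.loaded = []         # values that have actually arrived

    def request_read(self, address):
        value, delay = self.bus.read(address)
        self.pending = value
        self.stall_cycles = delay

    def cycle(self):
        """Advance one cycle; return True if the device could do work."""
        if self.stall_cycles > 0:
            self.stall_cycles -= 1
            if self.stall_cycles == 0 and self.pending is not None:
                self.loaded.append(self.pending)   # data arrives now
                self.pending = None
            return False
        return True
```

With the placeholder latency, a read requested before cycle 1 leaves the device stalled for three cycles and free again on the fourth.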

I'm waiting on the timings due to the amount of complexity involved in just modeling the pipeline itself. Properly modeling the flushing of the pipeline on an external interrupt, for example, gave me a royal headache... and I didn't even have to deal with timings at the time! I'm also hoping that adding in the delays will give me a performance boost... after all, the ultimate goal of my project is to have a simulator that can play games in realtime and debunk the "LLE is too slow" myth once and for all! :)

Glad to see you're still interested in the project as well... I was following your project closely before I started mine -- I got a lot of my inspiration from you! PS: I know you said you were waiting to release the source until it's ready -- if you would be willing to share it, I would be more than willing to respect any wishes that you have to keep it closed. It would be great to have another reference, as you seem to have the PIF and timings hammered down better than I currently do.

Here is another test run, this time in a Virtual Machine. I used the free VMWare Player with Ubuntu 64bit 12.10 as guest and Windows 7 64bit as host system.


Cool! I think that one jagged line that you see there is due to another bug that I'm currently tracking down... I believe this is the same bug that is preventing commercial ROMs from rendering properly and causing input/PIF issues in some areas.
 

Hacktarux

Emulator Developer
Moderator
I would still advise actually emulating cycles on a cycle accurate emulator ;-)

It should be done as early as possible, otherwise you will end up with something very similar to existing emulators, except that you have split CPU emulation into several stages. BTW, the real benefit (accuracy as seen from the game's code) of emulating each stage is having correct cycle counts and cache behavior. It is also important to take into consideration all possible cases where the pipeline stalls. I remember having a hard time figuring out how to emulate data bypass between stages, especially with the 2 COP1 register formats: this is one of the cases that involve pipeline stalls, if I remember right. Anyway, I can tell you to be very careful with cache and timing; it really has an effect on the machine as seen from the software, and some ROMs do really weird things (I'd say they were very lucky that all N64s were behaving exactly the same, it hides a lot of bugs :) ).

I am still not convinced I can really run it at full speed. Things like cache and write buffers add many things that should be checked for nearly each instruction, and timing between all components takes a lot of time. The problem is not really adding delays. It is letting each component do its job at the same time and synchronizing everything at the right number of cycles. I have also tried to emulate the behaviour of the VI interface as correctly as I could; it also takes a good amount of CPU time (I am not talking about CRT emulation, which is virtually free as it is done on the GPU). Taking all of that into consideration, I still think it will be extremely slow with the RDP. Then again, I am not doing this project with speed as a requirement at this stage.
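To show where that synchronization cost comes from, here is a minimal Python sketch of the simplest lock-step scheme: one master tick loop that visits every component on every tick and steps the ones that are due. The component names and clock periods are placeholders chosen for the example, not real N64 timings, and this is not how either emulator in this thread is necessarily structured.

```python
class Component:
    """A device that advances one of its own cycles every `period` master ticks."""

    def __init__(self, name, period):
        self.name = name
        self.period = period
        self.cycles = 0

    def step(self):
        self.cycles += 1     # a real device would do its per-cycle work here


def run(components, master_ticks):
    """Advance all components in lock step against a shared master clock."""
    for tick in range(master_ticks):
        for component in components:     # every component is visited every tick
            if tick % component.period == 0:
                component.step()


# Placeholder periods: the "cpu" runs 3 cycles for every 2 "rcp" cycles.
cpu = Component("cpu", period=2)
rcp = Component("rcp", period=3)
run([cpu, rcp], master_ticks=6)
```

The inner loop runs for every component on every tick whether or not that component has anything to do, which is the kind of per-cycle bookkeeping that adds up quickly.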

I won't share source code right now, as a lot of values are really experimental and I feel that if I spread it, it would prevent other people from doing their own tests on the real N64. I think it is quite important now that different people do their own experimentation, and later we can share and compare to hopefully fully understand how the hardware behaves exactly. That said, if you have any questions that you feel I can answer, feel free to ask. I will do my best to answer.
 
