Announcement: Cycle-accurate N64 development underway.

MarathonMan · Jul 18, 2013

angrylion said:
The RDP renders into RDRAM. The VIF never writes to RDRAM; it reads pixels from RDRAM, processes them, and outputs one color component per VI cycle. In my plugin, the VIF's part of the job is handled by rdp_update(), which reads pixels from the emulated RDRAM, processes them, and writes them into an off-screen DirectDraw surface that represents the VI coordinate space or, if you will, the TV image coordinate space. There is thus no conceptual difference between the real hardware and my plugin, and your claim is unsubstantiated. rdp_update() correctly implements the various filters that the VI applies, but it might not update certain VI registers properly, mainly because Zilmar's plugin spec doesn't support the required level of synchronization between the RDP and the CPU.

I never said the VIF writes into RDRAM:

"... instead of copying the frame to RDRAM and allowing the VIF to display the image"

The DMAs aren't emulated, so that's a bit of an oversimplification.

I didn't know it was writing to an offscreen surface, though.

Fanatic 64 · Jul 19, 2013

Kreationz said:
Daedalus is already on both PC (x86) and PSP (MIPS). Java, OS X, Linux and Android (ARM, x86, MIPS) ports are also planned. That will be 5 platforms plus a sub-platform (Java), on 3 different processors, for 6 ports in total. We already have 2, and 3 different coders are working on three of the others.

And this is relevant to this thread because...?

angrylion · Jul 19, 2013

MarathonMan said:
The DMAs aren't emulated, so that's a bit of an oversimplification.

Yes, I just copy individual pixels instead of emulating DMAs, but I think there should be no difference for a test program unless you also implement DMA timings and synchronization.
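
For readers following along, here is a minimal sketch of the per-pixel copy described above: read a 16-bit frame out of emulated RDRAM and write it into a 32-bit off-screen buffer. It is not angrylion's rdp_update(); the function names, the big-endian byte-array view of RDRAM, and the XRGB8888 output format are all assumptions made for illustration, and none of the VI filters are applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand a 5-bit channel to 8 bits by replicating its top bits. */
static uint8_t expand5(uint32_t c)
{
  return (uint8_t)((c << 3) | (c >> 2));
}

/*
 * Hypothetical helper: copy a width x height RGBA5551 frame that starts at
 * byte offset `origin` in emulated RDRAM into an XRGB8888 surface.
 * Assumption: `rdram` is a plain big-endian byte array (many emulators store
 * it word-swapped instead). `pitch` is the surface row length in pixels.
 */
static void vi_copy_frame(const uint8_t *rdram, size_t origin,
                          unsigned width, unsigned height,
                          uint32_t *surface, size_t pitch)
{
  assert(pitch >= width);

  for (unsigned y = 0; y < height; y++) {
    for (unsigned x = 0; x < width; x++) {
      const uint8_t *p = rdram + origin + 2 * ((size_t)y * width + x);
      uint32_t px = ((uint32_t)p[0] << 8) | p[1];

      /* RGBA5551: RRRRR GGGGG BBBBB A, alpha in the lowest bit. */
      uint32_t r = expand5((px >> 11) & 0x1F);
      uint32_t g = expand5((px >> 6) & 0x1F);
      uint32_t b = expand5((px >> 1) & 0x1F);

      surface[(size_t)y * pitch + x] = (r << 16) | (g << 8) | b;
    }
  }
}
```

As the post says, a test program should see no difference between this and a DMA-driven copy unless DMA timings and synchronization are modeled as well.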

Danny · Jul 20, 2013

Keep up the great work on this, it's a fantastic project.

ericitaquera · Jul 26, 2013

Thank you very much for the initiative!

talker · Jul 27, 2013

How's things going?

MarathonMan · Jul 27, 2013

talker said:
How's things going?

Things have been at a standstill for the last few weeks. I've been busier than usual and haven't found the time to work on CEN64 recently.

MarathonMan · Jul 31, 2013

MarathonMan said:
Things have been at a standstill for the last few weeks. I've been busier than usual and haven't found the time to work on CEN64 recently.

I'm now able to simulate homebrew ROMs at full speed on fast systems.

Namco Arcade is holding >= ~50VI/s.

The best part is that everything appears to run "fast" right now because cycle delays are not implemented yet. Once they are, CEN64 will have even more headroom than it does presently, enabling it to run on lower-spec machines.

Remote · Jul 31, 2013

But will it blend?

MarathonMan · Aug 2, 2013

Remote said:
But will it blend?

IDK

The slow pace of this project has been frustrating me somewhat lately. So, I've decided to change things up a little.

I'm going to work on porting/writing plugins with cycle accuracy and high performance in mind, but for HLE emulators. There will be no immediate accuracy benefits for some time, but it will let me test my plugins without the frustration of not knowing which component of my emulator is broken.

EDIT: Looking into testing with Mupen64Plus ATM, since it seems the most cross-platform friendly.

Fanatic 64 · Aug 2, 2013

You could later (when the plugins are actually accurate) make a Zilmar Spec port; as far as I know, it's pretty easy.

:afro:

MarathonMan · Aug 2, 2013

Fanatic 64 said:
You could later (when the plugins are actually accurate) make a Zilmar Spec port; as far as I know, it's pretty easy.

:afro:

AFAIK, Mupen64Plus and Zilmar-spec are more or less the same.

EDIT: This was a fantastic idea. Beating down bugs like nobody's business.

beannaich · Aug 2, 2013

MarathonMan said:
This was a fantastic idea. Beating down bugs like nobody's business.

Always a good idea (whenever possible) to test your code against a working emulator, whether by following along with an execution log or by writing plug-ins. That's how I've always made my emulators.

MarathonMan · Aug 2, 2013

beannaich said:
Always a good idea (whenever possible) to test your code against a working emulator, whether by following along with an execution log or by writing plug-ins. That's how I've always made my emulators.

I'm guilty of preoptimization. I was doing unit testing (or at least trying), but was being rather careless about it. There are all these really latent bugs everywhere that the execution logs weren't revealing. Example:

Code:

static const ShuffleKey VectorOperandsArray[16] = { 
  /* -- */ {0x0,0x1,0x2,0x3,0x4,0x5,0x6,0x7,0x8,0x9,0xA,0xB,0xC,0xD,0xE,0xF},
  /* -- */ {0x0,0x1,0x2,0x3,0x4,0x5,0x6,0x7,0x8,0x9,0xA,0xB,0xC,0xD,0xE,0xF},
  /* 0q */ {0x0,0x1,0x0,0x1,0x4,0x5,0x4,0x5,0x8,0x9,0x8,0x9,0xC,0xD,0xC,0xD},
  /* 1q */ {0x2,0x3,0x2,0x3,0x6,0x7,0x6,0x7,0xA,0xB,0xA,0xB,0xE,0xF,0xF,0xF},
  /* 0h */ {0x0,0x1,0x0,0x1,0x0,0x1,0x0,0x1,0x8,0x9,0x8,0x9,0x8,0x9,0x8,0x9},
  /* 1h */ {0x2,0x3,0x2,0x3,0x2,0x3,0x2,0x3,0xA,0xB,0xA,0xB,0xA,0xB,0xA,0xB},
  /* 2h */ {0x4,0x5,0x4,0x5,0x4,0x5,0x4,0x5,0xC,0xD,0xC,0xD,0xC,0xD,0xC,0xD},
  /* 3h */ {0x6,0x7,0x6,0x7,0x6,0x7,0x6,0x7,0xE,0xF,0xE,0xF,0xE,0xF,0xE,0xF},
  /* 0w */ {0x0,0x1,0x0,0x1,0x0,0x1,0x0,0x1,0x0,0x1,0x0,0x1,0x0,0x1,0x0,0x1},
  /* 1w */ {0x2,0x3,0x2,0x3,0x2,0x3,0x2,0x3,0x2,0x3,0x2,0x3,0x2,0x3,0x2,0x3},
  /* 2w */ {0x4,0x5,0x4,0x5,0x4,0x5,0x4,0x5,0x4,0x5,0x4,0x5,0x4,0x5,0x4,0x5},
  /* 3w */ {0x6,0x7,0x6,0x7,0x6,0x7,0x6,0x7,0x6,0x7,0x6,0x7,0x6,0x7,0x6,0x7},
  /* 4w */ {0x8,0x9,0x8,0x9,0x8,0x9,0x8,0x9,0x8,0x9,0x8,0x9,0x8,0x9,0x8,0x9},
  /* 5w */ {0xA,0xB,0xA,0xB,0xA,0xB,0xA,0xB,0xA,0xB,0xA,0xB,0xA,0xB,0xA,0xB},
  /* 6w */ {0xC,0xD,0xC,0xD,0xC,0xD,0xC,0xD,0xC,0xD,0xC,0xD,0xC,0xD,0xC,0xD},
  /* 7w */ {0xE,0xF,0xE,0xF,0xE,0xF,0xE,0xF,0xE,0xF,0xE,0xF,0xE,0xF,0xE,0xF}
};

The 4th row (1q) has 0xF,0xF,0xF at the end, when it should be 0xF,0xE,0xF. The only way to reveal this bug was to use specific RSP instructions with a specific element specifier in the VT operand. Spotted it right away when it tried to render graphics and audio, though!
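
One way to catch this kind of typo without running a game is to generate each row from the element-specifier rule and compare it against the hand-written table. The sketch below is illustrative only and is not CEN64 code: source_element(), build_key() and first_mismatch() are hypothetical names, and the byte layout (element n occupying bytes 2n and 2n+1) is assumed from the table as posted.

```c
#include <assert.h>
#include <stdint.h>

/* Which source element (0..7) feeds `lane` under element specifier `e`. */
static unsigned source_element(unsigned e, unsigned lane)
{
  assert(e < 16 && lane < 8);

  if (e >= 8) return e & 7;                   /* 0w..7w: broadcast one element */
  if (e >= 4) return (lane & ~3u) | (e & 3);  /* 0h..3h: one element per half  */
  if (e >= 2) return (lane & ~1u) | (e & 1);  /* 0q, 1q: one element per pair  */
  return lane;                                /* plain vector operand          */
}

/* Build the 16-byte shuffle key for specifier `e`. */
static void build_key(unsigned e, uint8_t key[16])
{
  for (unsigned lane = 0; lane < 8; lane++) {
    unsigned src = source_element(e, lane);
    key[2 * lane]     = (uint8_t)(2 * src);
    key[2 * lane + 1] = (uint8_t)(2 * src + 1);
  }
}

/* Index of the first byte where `row` differs from the generated key, or -1. */
static int first_mismatch(unsigned e, const uint8_t row[16])
{
  uint8_t key[16];
  build_key(e, key);

  for (int i = 0; i < 16; i++)
    if (row[i] != key[i])
      return i;
  return -1;
}
```

Calling first_mismatch(3, row) on the 1q row as posted returns 14, the position of the stray 0xF. Looping it over all 16 specifiers gives a cheap unit test for the whole table.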

MarathonMan · Aug 3, 2013

Before SSE optimizations:

Code:

xxxxx@yyyyyy:~/Projects/mupen64plus  
$ du -b test/mupen64plus-rsp-cxd4.so 
81944	test/mupen64plus-rsp-cxd4.so

After (~halfway done) SSE optimizations:

Code:

$ du -b mupen64plus-rsp-cxd4.so 
74232	mupen64plus-rsp-cxd4.so

ROMs that push the RSP are much smoother on my machine. I've been testing with Conker's BFD and running with an LLE RDP plugin. Animations look ever so slightly smoother; seems like the bottleneck is more on the RDP than it is the RSP. But still good gains so far.

beannaich · Aug 3, 2013

MarathonMan said:
I'm guilty of preoptimization. I was doing unit testing (or at least trying), but was being rather careless about it. There are all these really latent bugs everywhere that the execution logs weren't revealing.

I'm also guilty of this, and it almost always leads to bugs.

I hate bugs that you find eventually and wonder how anything was working at all! I once found a silly mistake with the BIT opcode for my 65816 emulator, and fixing it didn't change as much as you'd think.

MarathonMan · Aug 3, 2013

This is much more of an improvement than I would have imagined. I'm currently using a "scalar->SSE->scalar" layer that incurs an additional cost on each RSP vector instruction call. I removed it and benchmarked instruction times:

Did 100,000,000 iterations of VMADM:

Code:

$ time ./main-nosse

real	0m1.562s
user	0m1.560s
sys	0m0.000s

$ time ./main-sse

real	0m0.256s
user	0m0.252s
sys	0m0.000s

Same goes for VMADH, but even more so:

Code:

$ time ./main-nosse

real	0m1.554s
user	0m1.552s
sys	0m0.000s


$ time ./main-sse

real	0m0.166s
user	0m0.164s
sys	0m0.000s

In the plugin itself, the differences aren't nearly as great right now because of the cost of the layer, but it'll be exciting to see how much faster games are once I remove the layer.
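
To make the "scalar->SSE->scalar" layer concrete, here is a rough sketch of what such a wrapper can look like. It is not the plugin's code: ScalarVec, mul_lo_scalar() and mul_lo_layered() are made-up names, and a plain 16-bit low multiply stands in for the real VMADM/VMADH logic. The point is only that every call pays for two loads and a store around a single SSE2 instruction, which is the per-instruction overhead mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical scalar register layout: eight 16-bit lanes. */
typedef struct { uint16_t lane[8]; } ScalarVec;

/* Plain scalar path: low 16 bits of each lane product. */
static void mul_lo_scalar(const ScalarVec *a, const ScalarVec *b, ScalarVec *out)
{
  for (int i = 0; i < 8; i++)
    out->lane[i] = (uint16_t)((uint32_t)a->lane[i] * (uint32_t)b->lane[i]);
}

/* Layered path: scalar registers -> SSE2 -> scalar registers on every call. */
static void mul_lo_layered(const ScalarVec *a, const ScalarVec *b, ScalarVec *out)
{
  assert(sizeof(ScalarVec) == 16);
#if defined(__SSE2__)
  __m128i va = _mm_loadu_si128((const __m128i *)a->lane);  /* scalar -> SSE */
  __m128i vb = _mm_loadu_si128((const __m128i *)b->lane);
  __m128i vr = _mm_mullo_epi16(va, vb);                    /* the real work */
  _mm_storeu_si128((__m128i *)out->lane, vr);              /* SSE -> scalar */
#else
  mul_lo_scalar(a, b, out);  /* fallback so the sketch builds without SSE2 */
#endif
}

/* Returns 1 when both vectors hold identical lanes. */
static int lanes_equal(const ScalarVec *x, const ScalarVec *y)
{
  return memcmp(x->lane, y->lane, sizeof x->lane) == 0;
}
```

Removing the layer would mean keeping the register file in a form the SSE code can use directly, so the loads and stores no longer bracket every vector instruction.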

MarathonMan · Aug 5, 2013

Enjoy!

Only partially vectorized, but massive speedups in PJ64.

Fanatic 64 · Aug 5, 2013

Is this FatCat's RSP with SSE or your own RSP ported to Zilmar Spec?

grivy · Aug 5, 2013

I don't know whether this information is of any use to you at this time, but the rotating mask before the start screen in Majora's Mask (E) (M4) [!] is not showing. This is on Project64 v2.1 with the provided dll.
