Gameboy Advance

OP
Flerovium

Member
[MENTION=110864]Shonumi[/MENTION] I tried implementing sprites this way, but right now Kirby looks rather mangled :D

rifub9sq.png


Currently Kirby Nightmare in Dreamland is the only playable game. Most games crash with an unrealistic address in r15 (0xFFFFFFFC for many of them). I think it has something to do with SWI/interrupt handling or mode switching, but I can't pin down exactly why. I should write some tests to find out :D Also, Kirby only runs stably when I HLE CpuSet, CpuFastSet and LZ77UncompVRAM; otherwise it crashes randomly. I think that has to do with interrupts firing while a software interrupt is being processed. I also know that some SWIs, at least LZ77UncompVRAM, abuse r14 for arbitrary data.
 

Shonumi

EmuTalk Member
I remember having 0xFFFFFFFC in R15 quite frequently. The cause for me was probably different from yours, but generally it seems to come from some value being 0x0, then having 0x4 subtracted from it, and then ending up in R15 later. The problem could be anywhere. I don't even know when exactly I fixed stuff like that.

Hardware IRQs can occur even while SWIs are being processed. If you don't handle nesting IRQs within SWIs, that could be a problem for an LLE approach. Using HLE though, it's possible to avoid that issue entirely.
 
OP
Flerovium

Member
[MENTION=110864]Shonumi[/MENTION] Yes, I already suspected that the issue comes from the subs pc, lr, #4 instruction in the BIOS IRQ handler. I'll have to look into how to handle that nesting problem correctly (I don't know exactly how it works; maybe GBATEK has something about it).
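
For reference, the arithmetic behind the 0xFFFFFFFC value can be sketched in a few lines of C. This is only an illustration, assuming the banked link register holds 0 when an exception return such as subs pc, lr, #4 executes; the function name is made up for the example.

```c
#include <stdint.h>

/* Hypothetical helper: models the address produced by an exception
 * return such as `subs pc, lr, #4`. Registers are 32 bits wide, so
 * the subtraction wraps around modulo 2^32. */
static uint32_t return_address(uint32_t lr)
{
    return lr - 4u;
}
```

Calling return_address(0) gives 0xFFFFFFFC, so a handy debugging aid might be to log whenever the link register of the current mode is 0 at the moment such a return executes.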
 
OP
Flerovium

Member
Me and my project aren't dead! :p How is NDS emulation progressing, Shonumi?

After taking a break in which I did almost no programming (besides exercises for university) I'm back at work. And no, this time I didn't rewrite everything from scratch (even though I almost did out of frustration) :D
Today I finally found and fixed a bug that was (I think) the cause of lots of crashes. As we say in Germany, "the devil is in the details". It sat in the method that initiates an IRQ. In my ARMv4T interpreter core I have a variable called "flush_pipe" which, if set, causes the pipeline to be flushed after the current instruction has executed. This is used mainly by branch instructions. However, my FireIRQ method also set flush_pipe in order to flush the pipeline. The problem, obvious in hindsight, is that the pipeline then only gets flushed after the next instruction has executed. What I had to do instead was initiate the flush manually (set pipe_state to zero). Finding this was quite tricky because the code looked logically correct, and you had to keep in mind exactly what "flush_pipe" was for. With this fixed, plus some other mostly small stuff I did during my programming break, I can now run games like Pokemon Mystery Dungeon Team Red (the first GBA cartridge I ever owned!) via LLE, but with graphical glitches:

pamda6h7.png

(I would have cropped the image if I weren't currently working on my Raspberry Pi 2, where that is quite painful.)
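
To make the flush bug easier to picture, here is a toy model of it in C. It is not the actual NanoboyAdvance code: only the names flush_pipe and pipe_state come from the description above, everything else (the Cpu struct, cpu_step, the two fire_irq variants) is invented for the sketch, and mode switching as well as saving LR/SPSR are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define NOTHING_EXECUTED 0xFFFFFFFFu
#define IRQ_VECTOR       0x00000018u  /* ARM IRQ exception vector */

/* Toy 3-stage pipeline: two prefetched opcodes, tracked by address. */
typedef struct {
    uint32_t pc;          /* next address to fetch */
    uint32_t pipe[2];     /* addresses of the prefetched opcodes */
    int      pipe_state;  /* number of filled pipe slots (0..2) */
    bool     flush_pipe;  /* deferred flush request, e.g. set by branches */
} Cpu;

/* Runs one step. Returns the address of the executed instruction, or
 * NOTHING_EXECUTED while the pipeline is still being refilled. */
static uint32_t cpu_step(Cpu *cpu)
{
    if (cpu->pipe_state < 2) {               /* refill: fetch only */
        cpu->pipe[cpu->pipe_state++] = cpu->pc;
        cpu->pc += 4;
        return NOTHING_EXECUTED;
    }
    uint32_t executed = cpu->pipe[0];        /* oldest opcode runs now */
    if (cpu->flush_pipe) {                   /* flush only AFTER executing */
        cpu->pipe_state = 0;
        cpu->flush_pipe = false;
    } else {                                 /* shift the pipe, fetch next */
        cpu->pipe[0] = cpu->pipe[1];
        cpu->pipe[1] = cpu->pc;
        cpu->pc += 4;
    }
    return executed;
}

/* Buggy variant: the stale prefetched opcode still executes once. */
static void fire_irq_deferred(Cpu *cpu)
{
    cpu->pc = IRQ_VECTOR;
    cpu->flush_pipe = true;
}

/* Fixed variant: empty the pipeline right away. */
static void fire_irq_immediate(Cpu *cpu)
{
    cpu->pc = IRQ_VECTOR;
    cpu->pipe_state = 0;
}
```

In this model, the step after fire_irq_deferred still executes the opcode that was prefetched from the interrupted code, whereas after fire_irq_immediate the first instruction to execute is the one at 0x18.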
 

Shonumi

EmuTalk Member
Hey Flerovium, nice to see you're still active. Looks like you made a lot of progress. It's always exciting to get things working, and getting rid of bugs so you can play games is always great! I'm glad to hear NanoboyAdvance can run one of your first GBA games. Keep it up, and don't be shy about posting more screenshots ;)

About NDS emulation, I work on it now and then. Currently, I am trying to get small demos to run (using libnds + devkitARM). The ARM9 CPU actually isn't significantly different from the ARM7, except that the ARMv5 architecture has minor changes to some instructions, and it has a 5-stage pipeline (the ARM7 has 3 stages). It has Fetch, Decode, and essentially 3 stages of Execution (ALU, Memory Access, Register Write). Previously, I tried emulating each stage individually. It worked and there was nothing wrong with it, but the setup I made was overly tedious and would have proven unbearably slow in an interpreter. Instead, I've chosen to keep Execution as a single function within the emulator, but I'm going to have to keep track of any cycles that are supposed to be simultaneous with other stages. This should make things simpler than before and maintain correct timing.

Currently, I can run the TinyFB demo but nothing else. All the demos I make with libnds make use of the coprocessor in the NDS (called the protection unit), which I do not emulate just yet. I don't even emulate the ARM coprocessor instructions, so my emulator exits due to unknown instructions :p
 

BigKong

New member
How hard do you think DS emulation is compared to PS1/N64/Saturn emulation, or even something like PS2/GC/Dreamcast/PSP?
 

Shonumi

EmuTalk Member
[QUOTE=BigKong]How hard do you think DS emulation is compared to PS1/N64/Saturn emulation, or even something like PS2/GC/Dreamcast/PSP?[/QUOTE]

Well, to accurately answer that question, I'd have to know a lot about those systems. I've only studied parts of the GameCube; for the rest, I have only limited knowledge of their internals. Still, I'll take a guess. I'm only speaking personally, from my own experience, about what it would be like if I were to take on each.

PS1: Complete guess, but I'd wager that if I can tackle the DS, I'll have a very solid grip on how to approach Sony's first PlayStation. The CPU is MIPS, and it should be well documented. I can't speak about the GPU or anything else. 3D stuff isn't my specialty right now, but I'm sure I could do this, given time.

N64: The N64 is a hot mess in terms of documentation, whereas the DS is mostly understood. Whether I'm taking a good LLE path or an advanced HLE path to the N64, I'm going to have to say the DS is easier to get to grips with.

Saturn: Again, a hot mess when it comes to documentation. And keeping those CPUs in sync seems like a nightmare. The DS is simpler (has just two, with one running at 1/2 the speed of the main), and saner, and has better games to make testing more fun :p

PS2: Harder. More "moving parts" or components to emulate. Would take a loooot of time to get anywhere or see any results.

GC: Harder. Same reasons as the PS2.

Dreamcast: Honestly have no idea. I'd assume harder, simply because the DS is something I'm familiar with. Plus, I'd have to buy a bunch of games and hardware to do testing (never owned a DC).

PSP: Moderately more difficult, probably because of the graphical components (again, I'm not too good with 3D things just yet, I'm learning though). An HLE approach to the PSP makes some aspects easy to do, and MIPS is a nice architecture in my opinion.

Yeah, so I imagine just about everything is more challenging than the DS. You have to understand though, I've been working with Nintendo's portables for years. To me, the DS really is an evolution of the GBA (from its CPU to its LCD to various other facets). I know it well, so to go to another system or platform like the advanced ones mentioned above, it would require learning a lot about how they operate. If I had programmed a PS1 emulator, perhaps my outlook on how difficult PS2 or PSP emulation is would be different (I'm sure there are some hardware designs that get transferred or transform each generation).

I first emulated the GB so I could emulate the GBC. I emulated the GBC so I could emulate the GBA. I emulated the GBA so I could emulate the DS. Each step prepared me for the next. I chose this path because I knew it would gradually help me understand how to emulate a DS properly, instead of jumping in all at once. The DS used to scare me; I thought it'd be a long shot before I could even try to get it running. But now I am confident that I can do it. So now, I do not think the DS is exceedingly hard to program an emulator for (it just needs a little time and love). But I know little about other systems, so they seem like HUGE tasks to me.
 

BigKong

New member
How do you rank the NES and SNES? The NES appears simple at first glance but has a multitude of mappers, many of which still aren't emulated by any emulator, while the SNES seems similar to the GBA.
 
OP
Flerovium

Member
I (think I) fixed another small IRQ issue. Now Super Dodge Ball also seems to run without problems (I didn't peek too far into the game, but it gets further than the title screen). Pokemon Mystery Dungeon Team Red seems to crash when entering the first dungeon... I will have to find out why this happens. I also implemented screenshot functionality, using libpng to dump the screen contents. Using GIMP or any other software to crop the image on the Raspberry Pi 2 would be a pain, I think, and would also take up disk space unnecessarily.

Here are some screenshots (order is random):
scr00.png

scr01.png

scr02.png

scr03.png

scr04.png

scr05.png

scr06.png
 

Shonumi

EmuTalk Member
[QUOTE=BigKong]How do you rank the NES and SNES? The NES appears simple at first glance but has a multitude of mappers, many of which still aren't emulated by any emulator, while the SNES seems similar to the GBA.[/QUOTE]

The NES is definitely harder than something like the GB or GBC if you're going for a decent amount of accuracy in either one. The GB isn't sensitive to a lot of timing issues (unlike the NES) and, like you pointed out, the GB has far fewer mappers and cart layouts/formats to deal with. On the other hand, there are still various little things that are utterly undocumented about the GB. It's not enough to affect the compatibility of most games, but it's very annoying. Not to mention that MBCs like the MBC7 are poorly documented. The NES, on the other hand, has been very extensively covered; its quirks are mostly known factors at this point. I still think the NES is harder, simply because it's a pain trying to keep the timings correct. It's not harder than the GBA or DS though, in my opinion :p

I'd put the SNES at or above the DS. I mean, if you look at what byuu did just to get bsnes/higan to full commercial compatibility, and all of the undocumented stuff he had to deal with, the SNES is a bit of a nightmare in my opinion. Though in terms of graphical capabilities it's most similar to the GBA (some aspects look kinda like copy+paste work, e.g. the special effects), the GBA has a lot less to worry about. There are no coprocessors like the SA-1 or Super FX, memory access is pretty straightforward, and the GBA's LCD vs the SNES' PPU just looks and feels more orderly to me. Keep in mind, there are still some odds and ends and obscure behavior we're discovering about the SNES too. I dunno, I just get the impression that I'd have an easier time trying my hand at DS emulation than SNES emulation, again, because the DS is pretty familiar territory.

[MENTION=110839]Flerovium[/MENTION] - Those screenshots look amazing! I guess OBJ rendering could use some work, but everything else looks great :D
 
OP
Flerovium

Member
[MENTION=110864]Shonumi[/MENTION] Yes, OBJ rendering is something I haven't worked on for a very long time. But I'm too lazy to fix it, because I have the feeling that there are bigger bugs I should concentrate on first. In my latest commit I tried to implement a sort of decode cache for ARM and THUMB instructions, storing the instruction type for each possible instruction. At first I did this only for THUMB, because I thought that most code is written in THUMB and storing the instruction type for each ARM instruction from 0 to 0xFFFFFFFF would take waaaay too much memory lol. Then I realized that quite a lot of bits in the ARM instruction format can be treated as "don't care" when determining solely the type of instruction. First of all there are bits 28-31, which store the execution condition. Then there are bits 12-19, which aren't relevant for instruction type detection either (at this point I had to change my bx detection code because it insisted on bits 12...19 all being set). Given this, I developed a fairly simple lossy "compression" algorithm for ARM instructions that keeps only bits 20-27 and 0-11, leaving me with only 0x100000 entries in RAM for ARM. The algorithm is implemented in this simple macro:
Code:
#define arm_pack_instr(i) (((i) & 0xFFF) | (((i) & 0x0FF00000) >> 8))
The caches are built at runtime in the constructor of the CPU. At this point you might ask yourself "How big is the impact of this optimization?" and I must admit that I currently don't really know. I tested it on my slow Raspberry Pi 2 (my laptop charger is currently broken, so no x86 for me), which isn't powerful enough to run my emulator at full speed. In a side-by-side comparison I noticed no difference. This could be due to various reasons. Possibly my drawing code is the main bottleneck (especially on the Raspberry Pi) because I have no GPU acceleration, or maybe the ARM architecture is less sensitive to branches than x86 due to its shorter pipeline. I'd love to test it on an x86 computer, maybe I'll find someone to do this job for me :D (the optimization can be turned off by commenting out "#define ARM7_FASTHAX" in nanoboyadvance/src/core/arm7.h)

All the changes I made to accomplish this can be viewed on GitHub: https://github.com/fredericmeyer/nanoboyadvance/commit/de6a215082c468576e5fbd8040b4f96a2c3555d3
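
For anyone who wants to play with the idea without reading the commit, a stripped-down version of such a decode cache could look roughly like this in C. It is a sketch and not the code from the commit: the function names, the enum and the tiny three-way classifier are invented for the example, and a real core would distinguish far more instruction types.

```c
#include <stdint.h>

/* Instruction classes for this sketch only. */
enum { ARM_OTHER = 0, ARM_BX, ARM_BRANCH, ARM_SWI };

#define ARM_CACHE_SIZE 0x100000u  /* 2^20 keys: bits 27-20 and 11-0 */

static uint8_t arm_cache[ARM_CACHE_SIZE];

/* Drops the condition (bits 31-28) and bits 19-12, keeping 20 bits. */
static uint32_t arm_pack(uint32_t instr)
{
    return (instr & 0xFFFu) | ((instr & 0x0FF00000u) >> 8);
}

/* Inverse of arm_pack, with all dropped bits left at zero. */
static uint32_t arm_unpack(uint32_t key)
{
    return (key & 0xFFFu) | ((key & 0xFF000u) << 8);
}

/* Slow classifier. It may only test bits that survive packing,
 * otherwise the cached answer would be wrong for some opcodes. */
static uint8_t arm_classify(uint32_t instr)
{
    if ((instr & 0x0FF00FF0u) == 0x01200F10u) return ARM_BX;
    if ((instr & 0x0E000000u) == 0x0A000000u) return ARM_BRANCH;
    if ((instr & 0x0F000000u) == 0x0F000000u) return ARM_SWI;
    return ARM_OTHER;
}

/* Fills the table once, e.g. when the CPU object is created. */
static void arm_cache_build(void)
{
    for (uint32_t key = 0; key < ARM_CACHE_SIZE; key++)
        arm_cache[key] = arm_classify(arm_unpack(key));
}

/* Fast path used while interpreting: one mask/shift and one load. */
static uint8_t arm_lookup(uint32_t instr)
{
    return arm_cache[arm_pack(instr)];
}
```

With one byte per entry the table takes 1 MiB. After arm_cache_build(), looking up 0xE12FFF1E (bx lr) goes through key 0x12F1E and yields ARM_BX.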
 
OP
Flerovium

Member
3507.png


[MENTION=110864]Shonumi[/MENTION] This is actually a dream come true!
To get past the title screen I will need to actually emulate FLASH (not just fake its existence by writing a valid chip identifier to 0x0E000000).
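
As a starting point for that, the chip identification part of a FLASH chip could be modelled along these lines. This is a rough sketch based on the usual command sequence (0xAA to 0x0E005555, 0x55 to 0x0E002AAA, then the command byte to 0x0E005555); the struct, the function names and the ID values used in the usage note are placeholders for illustration, and erase/program commands as well as the actual storage are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical minimal FLASH model: only ID mode is handled. */
typedef struct {
    int     cmd_phase;        /* 0: idle, 1: saw 0xAA, 2: saw 0x55 */
    bool    id_mode;          /* true while chip identification is active */
    uint8_t manufacturer_id;  /* returned at offset 0 in ID mode */
    uint8_t device_id;        /* returned at offset 1 in ID mode */
} Flash;

static void flash_write(Flash *f, uint32_t addr, uint8_t value)
{
    uint32_t offset = addr & 0xFFFFu;

    switch (f->cmd_phase) {
    case 0:
        if (offset == 0x5555u && value == 0xAA) f->cmd_phase = 1;
        break;
    case 1:
        f->cmd_phase = (offset == 0x2AAAu && value == 0x55) ? 2 : 0;
        break;
    default: /* phase 2: the actual command byte */
        if (offset == 0x5555u) {
            if (value == 0x90) f->id_mode = true;   /* enter ID mode */
            if (value == 0xF0) f->id_mode = false;  /* leave ID mode */
        }
        f->cmd_phase = 0;
        break;
    }
}

static uint8_t flash_read(const Flash *f, uint32_t addr)
{
    uint32_t offset = addr & 0xFFFFu;

    if (f->id_mode && offset == 0) return f->manufacturer_id;
    if (f->id_mode && offset == 1) return f->device_id;
    return 0xFF;  /* storage is omitted here; erased FLASH reads 0xFF */
}
```

A game would then see the IDs only between the "enter" and "leave" sequences, e.g. with a Flash initialised as { 0, false, 0x62, 0x13 } (example values, to be checked against GBATEK for the chip you want to mimic).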
 

Shonumi

EmuTalk Member
Great work! I know I was pretty excited when I got Pokemon Ruby/Sapphire to boot in my emulator. Be careful not to spend all day playing games though. That's what happened to me once Pokemon started working :D
 

Exophase

Emulator Developer
[QUOTE=BigKong]How hard do you think DS emulation is compared to PS1/N64/Saturn emulation, or even something like PS2/GC/Dreamcast/PSP?[/QUOTE]

I feel like I'm familiar enough with emulation of some of these systems to answer about those as well, if you don't mind.

PS1: Easier than DS. There's only one CPU and it's significantly simpler and slower. PS1's GTE is simpler in scope and less autonomous than DS's geometry engine. PS1's GPU is also more straightforward and much less featureful than DS's. And the DS has an entirely separate set of 2D GPUs, and wireless functionality (although I haven't emulated it). PS1 has a somewhat more advanced audio processor, some coprocessors for XA and MDEC decoding, and a CD-ROM interface that DS lacks (and is more complex than its cart interface) but these things don't amount to enough to tip the scales.

N64: This is difficult to answer because N64 emulation has been "done" in a lot of ways. The usual very HLE-heavy but fairly inaccurate (and not even totally compatible) approach, combined with a hardware GPU renderer, is probably easier than DS emulation, especially if you don't try tackling the more obscure/custom ucodes. But a fully LLE and reasonably accurate N64 emulator with a software renderer is approaching a similar amount of work. The dedicated audio hardware is very limited and there's no 2D engine, but the RSP has a significant number of instructions and the RDP has various features that DS lacks (primarily bilinearish filtering and mipmapping).

Saturn: In a lot of ways comparable. Like the DS, the Saturn has two CPUs, a 3D processor, and a 2D processor. The 2D is similar if maybe a bit more featured, while the 3D is significantly less featured but tricky to actually emulate correctly. There's no geometry coprocessor but there is a DSP that not a lot of games use. There's way more work involved in the audio part; if you want accurate LLE you have to emulate its DSP which is hardly documented, and there's a 68k, and a mixer. And the usual CD-ROM, including an SH1 CD-ROM controller, but I doubt anyone is going to bother LLEing that any time soon. Plus there's a 4-bit "system" CPU for some reason. So overall there are more individual parts but they get arguably more complex on DS, and you can get a lot of emulation done without doing much on some of them.

Also regarding your other question, I think GBA is easier to emulate than SNES because it has much simpler audio (no audio CPU, no real DSP), fewer pointless graphics features and overall more straightforward implementation, and less strange memory mapping. The games are also less timing sensitive. GBA does have some (actually useful) graphics features that are more work to emulate though, like sprite scaling/rotation, more comprehensive color blending and better windows.
 
OP
Flerovium

Member
Since I recently started working on NDS emulation, I thought I would create some tests. I started with a very basic FIFO test in C which should basically count up from 0. The value to be printed is generated by the ARM7 core and sent through the FIFO to the ARM9 core, which then outputs it to the screen. Very basic stuff :)

You can get the code from my GitHub: https://github.com/fredericmeyer/NoDS/tree/master/tests/nds/fifo-basic
Expect more tests to come with time!

I cannot test this on my own emulator yet though since I have no working ARM9 core, only partial ARM/NDS7 emulation XD
I want to get the NDS7 part running before I tackle NDS9.
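
On the emulator side, the queue such a test exercises can be modelled as a plain ring buffer. The following C sketch is not taken from NoDS; it assumes one 16-word FIFO per direction, ignores the control register and IRQs, and all names are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define FIFO_DEPTH 16  /* the NDS IPC FIFO holds 16 words per direction */

typedef struct {
    uint32_t words[FIFO_DEPTH];
    int      read_pos;   /* index of the oldest word */
    int      write_pos;  /* index where the next word is stored */
    int      count;      /* number of words currently queued */
} IpcFifo;

/* Sender side (e.g. the ARM7 core). Returns false if the FIFO is full;
 * this model simply rejects the word in that case. */
static bool fifo_send(IpcFifo *f, uint32_t word)
{
    if (f->count == FIFO_DEPTH)
        return false;
    f->words[f->write_pos] = word;
    f->write_pos = (f->write_pos + 1) % FIFO_DEPTH;
    f->count++;
    return true;
}

/* Receiver side (e.g. the ARM9 core). Returns false if nothing is queued. */
static bool fifo_recv(IpcFifo *f, uint32_t *out)
{
    if (f->count == 0)
        return false;
    *out = f->words[f->read_pos];
    f->read_pos = (f->read_pos + 1) % FIFO_DEPTH;
    f->count--;
    return true;
}
```

A counting test like the one above then boils down to the sender calling fifo_send with 0, 1, 2, ... and the receiver checking that fifo_recv hands the values back in the same order.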
 

Shonumi

EmuTalk Member
Most games use the NDS9 for all the heavy lifting from what I understand, e.g. a majority of the game's code will be ARM code running on that CPU (THUMB code is not used as much as it was on the GBA, generally speaking). The NDS7 is generally used for Audio, WiFi(?), and other auxiliary stuff. That's why I'm focusing on the ARM9 and figuring out how to get the NDS7 running alongside it later. I'm trying to get a very basic demo made with DevKitPro and libnds. So far, so good, but even the most basic demo compiled with libnds seems to want to use a lot of features: Immediate DMAs, the NDS CP15 co-processor, some SWIs, even interrupt handling. All this just to draw a red screen :p

I have not worked on my NDS branch in a bit. Trying to get the IRQs done right, but I was recently busy getting 1.0 of GBE+ ready for release. I just finished upgrading the whole project to SDL 2.0, so I'm going to have to rewrite parts of the NDS core to use the new version. I plan on using OpenGL for 3D rendering eventually, and SDL 2.0 makes it easy to create an OpenGL 3.3 or 4.0+ context. I'd like to have access to some of the newer OpenGL features and take a more "modern" approach to 3D graphics. Hopefully SDL 2.0 will support Vulkan sooner or later, and by that time there will be some literature or tutorials on how to properly utilize Vulkan.
 
OP
Flerovium

Member
I wanted to get basic NDS7 emulation first since I don't have to mess with any caches, coprocessors or ARMv5 instructions there, heh. I've got support for basic SPI and IPC now; you can see this in this video I did for some reason:

But I'll very soon be at the point where I have to start NDS9 emulation. Also, working only on the NDS7 is very frustrating since you don't get to see anything besides your debug output :p
 

Shonumi

EmuTalk Member
Looking good :)

Even if you can't see the results, it's important to work on anything you can. Even if something is a small component, eventually it will be necessary for the entire emulator to function. Better to write your code now than later. Working on the little things actually gives me encouragement. I know building an NDS emulator is a long, long process, but if I take some steps every day, I will make it to my goal.

Not much happening on my end for NDS emulation. I'm trying to get basic IRQ handling finished. It's more or less just like the GBA IRQ handler (NDS9 is a bit different regarding the user IRQ handling address). On the NDS9, it requires some form of CP15 emulation (you need to know the DTCM base address, which is determined by the co-processor) but it's simple really. I think I'm really, really close to getting a libnds binary to run. The next hurdle is to get the IntrWait SWI working via HLE (probably just going to copy+paste the GBA core's code for this).
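
To illustrate the DTCM dependency, here is a small C sketch of how the location of the user IRQ handler pointer might be computed on the NDS9. It assumes the layout described in GBATEK (the BIOS reads the handler address from DTCM base + 0x3FFC, with the base taken from bits 12-31 of the CP15 DTCM region register); the function names and the register value in the usage note are made up for the example.

```c
#include <stdint.h>

/* Assumption: bits 31-12 of the CP15 DTCM region register hold the
 * DTCM base address, the low bits hold the size field. */
static uint32_t dtcm_base(uint32_t cp15_dtcm_reg)
{
    return cp15_dtcm_reg & 0xFFFFF000u;
}

/* Address the BIOS IRQ handler reads the user handler pointer from
 * (assumed to be DTCM base + 0x3FFC on the NDS9). */
static uint32_t nds9_irq_handler_ptr_addr(uint32_t cp15_dtcm_reg)
{
    return dtcm_base(cp15_dtcm_reg) + 0x3FFCu;
}
```

For an arbitrary example register value of 0x0080000A, this gives a pointer location of 0x00803FFC. An HLE IRQ entry would read the 32-bit handler address stored there and jump to it.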
 
OP
Flerovium

Member
Interesting. Do you emulate the icache or dcache yet? Or do you happen to know how important it is to emulate them?
 

Shonumi

EmuTalk Member
No, none of the caches is emulated just yet. As far as I know, the caches just affect timing when fetching certain parts of memory, and right now there really is no timing code in my emulator. So, in order to at least draw things on screen, I don't think I'll have to worry about them right away. I'm making tests via DevKitPro, so I'm in control of whether or not the ROMs need exact timing.

It seems like I will need to emulate the NDS7 running alongside the NDS9 to get even a basic demo to work. I checked everything, and apparently even my homebrew test requires both CPUs to communicate on some level. I got to the point where the NDS9 sets up the CP15, does a few DMAs, then calls the IntrWait SWI, but that's still not enough. Program execution hangs. I hacked up DeSmuME and effectively disabled the NDS7, and the results are similar; it got stuck on the IntrWait SWI as well. So the only thing left is to emulate the NDS7. I don't think it'll be that hard (since it'll be the same code from my GBA core), it's just going to take a bit to make sure they can both run side-by-side.
 
