
Shonumi

EmuTalk Member
Exophase said:
What you're doing is making it so that the interrupt handler never gets called for non-vblank interrupts that happen during the wait. It's not enough that those interrupts are fired, they need to actually call the interrupt handler (if the interrupt's enabled anyway). Otherwise they won't happen until it's too late.

For example, say you have an hblank IRQ interrupt handler that does something to the scanlines like scrolls them for a wavy background effect. The normal game loop runs then calls VBlankIntrWait while the screen is still in the middle of being rendered. If you sit in a tight loop until vblank hits, all of those hblanks where something should have happened are going to be missed. They may be serviced along with the vblank interrupt but by then it's too late.

If you look at 0x34C in the GBA BIOS you'll see that after coming out of a CPU halt it checks to see if the right interrupt was triggered, and if not it goes back to waiting. Nowhere does it disable other interrupts. If your emulator doesn't do this it's not doing things correctly and there are some games that will definitely break - maybe not a ton, but they're out there.

Okay, now I see what you're saying :D You're saying that in HALT mode, ANY interrupt where IE and IF bits match up will exit this state and service those interrupts. VBlankIntrWait and IntrWait just guarantee that the ARM/THUMB code will continue running AFTER the specified interrupt happens. That makes sense when you put things that way. I was under the impression that HALT mode worked as I described it, ignoring interrupts that aren't what the SWI is looking for (which sounds more useful, but then again I don't code GBA games). Perhaps GBATEK could have better documented this.

I still don't think proper implementation should be that hard via HLE. All I have to do is run my tight loop until IE and IF bits match up. The key, then, is to make sure the PC doesn't advance to the next instruction if the triggered interrupt isn't what the SWI is looking for. In the case that it isn't, handle_interrupt() can service it, then go back to the PC containing the SWI, and keep running the tight loop again and wait for the next interrupt. When the specified interrupt happens, advance the PC to the next instruction. handle_interrupt() will service the specified interrupt, complete said interrupt, then dump the PC to one instruction after the SWI, and processing continues as normal. Would require an extra if.. else.. or two, but not much. Only downside would be that the same SWI instruction would have to be fetched, decoded, and executed (from the emulated CPU's point of view) multiple times to account for the forced pseudo-infinite loop, but it shouldn't affect timing in my emulator (especially since SWIs aren't clocked at all, just like DMAs). This would be good, obscure behavior for a test ROM!
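A minimal sketch of that idea, just to make the control flow concrete. The struct, field names, and step_intr_wait() are made up for illustration and are not taken from any emulator; acknowledging IF inside the function stands in for what the game's own handler would normally do.

```cpp
#include <cstdint>

// Hypothetical, stripped-down state for illustration only.
struct IntrWaitState {
    uint16_t ie = 0;         // REG_IE: enabled interrupts
    uint16_t if_ = 0;        // REG_IF: pending interrupts
    uint16_t wait_mask = 0;  // interrupt(s) the SWI is waiting for
    uint32_t pc = 0;         // address of the SWI instruction itself
};

constexpr uint16_t IRQ_VBLANK = 1 << 0;
constexpr uint16_t IRQ_HBLANK = 1 << 1;

// Run once per pass of the HLE wait loop.
// Returns true when the wait is over and the PC has moved past the SWI.
// swi_size is 2 in THUMB state and 4 in ARM state.
bool step_intr_wait(IntrWaitState& s, uint32_t swi_size) {
    const uint16_t fired = s.ie & s.if_;
    if (fired == 0) {
        return false;  // still halted, nothing to service
    }

    // Every enabled, pending interrupt gets serviced, wanted or not.
    // (A real emulator would run the game's IRQ handler here; this
    // sketch only acknowledges the flags.)
    s.if_ = static_cast<uint16_t>(s.if_ & ~fired);

    if (fired & s.wait_mask) {
        s.pc += swi_size;  // the awaited interrupt arrived: leave the SWI
        return true;
    }
    return false;  // PC stays on the SWI, so it simply executes again
}
```

With IE = VBlank | HBlank and wait_mask = VBlank, an HBlank gets serviced while the PC stays on the SWI, and only a VBlank moves the PC forward.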

Exophase said:
That's a good work ethic. Just one commit a day can get a lot more done and keep you from losing momentum and stagnating. I wish I could always maintain at least that much.

Thanks! I probably won't be able to maintain this pace though, eventually. I mean, one day I want to start playing emulators again instead of making them. It's sort of a "trap" I've heard other authors talk about ;) Kinda miss those days when I'd just play Advance Wars on VBA for hours on end. But then again, I don't regret knowing more about the programs and consoles I love.

Exophase said:
There was an earlier question in this thread if anyone decoded Thumb to ARM, right?

I never did this for GBA, but DraStic did it from the start so at least I can attest that it's a workable approach.

BUT, you have to be aware that some Thumb instructions can't be done in ARM directly. bl is internally split into two distinct instructions, which you have to emulate separately because some games use the sub-instructions independently (eg Golden Sun), and neither of these can be expressed as ARM instructions. In DraStic I got around this by re-encoding them in some of the unused instruction space (you probably want to make sure you're in Thumb mode if these instructions are encountered, otherwise actual invalid ARM instructions will look like them.. but I think those hang the GBA anyway...)

Also, the ldr[pc, ...] and add rd, pc, imm instructions are relative to a PC that has the second-least significant bit cleared, so you have to communicate that somehow to the ARM core (or some other way of handling it properly) rather than just using the PC as-is.
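To illustrate the PC alignment point, here is a small sketch of how the base address for those two Thumb instructions could be computed. The function names are hypothetical; the only assumptions are that Thumb code reads PC as the instruction's address plus 4, and that the 8-bit immediate is scaled by 4.

```cpp
#include <cstdint>

// Base used by Thumb "ldr rd, [pc, #imm]" and "add rd, pc, #imm":
// the PC value as seen by the instruction (its address + 4) with bit 1 cleared.
constexpr uint32_t thumb_pc_rel_base(uint32_t instr_addr) {
    return (instr_addr + 4) & ~uint32_t{2};
}

// Effective address for a PC-relative Thumb load; imm8 is the raw
// 8-bit field from the opcode, scaled by 4.
constexpr uint32_t thumb_ldr_pc_address(uint32_t instr_addr, uint8_t imm8) {
    return thumb_pc_rel_base(instr_addr) + (uint32_t{imm8} << 2);
}
```

An instruction at a halfword-aligned address such as 0x08000002 sees PC as 0x08000006, but the load is computed from 0x08000004. If Thumb is converted to ARM, this adjusted base is what has to be passed along instead of the raw PC.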

Maybe it's not the best approach, but I dunno, it made sense when I came from the GB/GBC to the GBA. I emulate each instruction based on its "grouping" that appears in ARM's official documentation. That is to say, THUMB.1, THUMB.2, ARM.1, ARM.2. I don't have a generic mov() or ldr() function that can be used for both ARM and THUMB opcodes. ARM and THUMB instructions have their own implementations for each grouping. Even something like THUMB.6 to THUMB.9 each have their own separate functions, so that handles any idiosyncrasies. I decode BL as two instructions, but the same function handles THUMB.19 and determines if the 1st or 2nd part is to be executed. Like I said, I dunno, it just seemed the most straight-forward way to go about it at the time, so I went with this method. Interested in knowing how others handled it.
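For anyone following along, a rough sketch of what "one function, two halves" can look like for THUMB.19. The ThumbRegs struct and thumb_bl_half() are hypothetical names, not the actual code from any emulator, and pc here is assumed to hold the address of the current instruction plus 4.

```cpp
#include <cstdint>

// Hypothetical register view: pc = address of the executing instruction + 4.
struct ThumbRegs {
    uint32_t pc;
    uint32_t lr;
};

// Executes one half of a Thumb BL. Bit 11 of the opcode selects the half,
// so each half works on its own even if a game runs them separately.
void thumb_bl_half(ThumbRegs& r, uint16_t opcode) {
    const uint32_t offset11 = opcode & 0x7FF;
    const bool second_half = (opcode & 0x0800) != 0;

    if (!second_half) {
        // First half: LR = PC + (sign-extended upper offset << 12)
        uint32_t hi = offset11 << 12;
        if (offset11 & 0x400) {
            hi |= 0xFF800000;  // sign-extend from bit 22
        }
        r.lr = r.pc + hi;
    } else {
        // Second half: jump to LR + (lower offset << 1), and leave the
        // return address (with bit 0 set for Thumb) in LR.
        const uint32_t next_instr = r.pc - 2;
        r.pc = r.lr + (offset11 << 1);
        r.lr = next_instr | 1;
    }
}
```

After the second half, pc holds the branch target; refilling the pipeline is left to the caller.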
 

Exophase

Emulator Developer
Okay, now I see what you're saying :D You're saying that in HALT mode, ANY interrupt where IE and IF bits match up will exit this state and service those interrupts. VBlankIntrWait and IntrWait just guarantee that the ARM/THUMB code will continue running AFTER the specified interrupt happens. That makes sense when you put things that way. I was under the impression that HALT mode worked as I described it, ignoring interrupts that aren't what the SWI is looking for (which sounds more useful, but then again I don't code GBA games). Perhaps GBATEK could have better documented this.

I still don't think proper implementation should be that hard via HLE. All I have to do is run my tight loop until IE and IF bits match up. The key, then, is to make sure the PC doesn't advance to the next instruction if the triggered interrupt isn't what the SWI is looking for. In the case that it isn't, handle_interrupt() can service it, then go back to the PC containing the SWI, and keep running the tight loop again and wait for the next interrupt. When the specified interrupt happens, advance the PC to the next instruction. handle_interrupt() will service the specified interrupt, complete said interrupt, then dump the PC to one instruction after the SWI, and processing continues as normal. Would require an extra if.. else.. or two, but not much. Only downside would be that the same SWI instruction would have to be fetched, decoded, and executed (from the emulated CPU's point of view) multiple times to account for the forced pseudo-infinite loop, but it shouldn't affect timing in my emulator (especially since SWIs aren't clocked at all, just like DMAs). This would be good, obscure behavior for a test ROM!

Hm.. I hadn't thought of doing it that way. I guess it strikes me oddly since it's pretty different from how the real version works, but I can't think of a reason why this would cause a problem. That is, unless you were trying to approximate timing of the BIOS routine, and even then you may be able to work around it by fiddling with cycle counters.

Looks like that stuff I said about needing special tricks for HLEing this function was probably off the mark.

Maybe it's not the best approach, but I dunno, it made sense when I came from the GB/GBC to the GBA. I emulate each instruction based on its "grouping" that appears in ARM's official documentation. That is to say, THUMB.1, THUMB.2, ARM.1, ARM.2. I don't have a generic mov() or ldr() function that can be used for both ARM and THUMB opcodes. ARM and THUMB instructions have their own implementations for each grouping. Even something like THUMB.6 to THUMB.9 each have their own separate functions, so that handles any idiosyncrasies. I decode BL as two instructions, but the same function handles THUMB.19 and determines if the 1st or 2nd part is to be executed. Like I said, I dunno, it just seemed the most straight-forward way to go about it at the time, so I went with this method. Interested in knowing how others handled it.

Yeah, I think converting Thumb to ARM only really makes sense if you find yourself decoding instructions in several places. In my case, for the CPU interpreter and recompiler (and maybe if you're doing something like profiling). Otherwise, there isn't much saved in converting Thumb to ARM vs emulating Thumb instructions directly, especially when you also have to take care to handle cases where Thumb doesn't actually map to ARM.

That said, there is one other thing this opens up, and that's optimizing Thumb code. If you're converting blocks of code to ARM (probably only applicable for a recompiler) you can do some optimizations like move/constant/shift propagation, which helps undo some of the inefficiencies of the Thumb instruction set. I haven't actually done this, and it's not that compelling for DS emulation since most games don't use Thumb much, but I'd still like to try it some day. If you're going full bore on optimizations for GBA emulation (and GBA games do usually use a lot of Thumb code) this may be something to think about.
 

Shonumi

EmuTalk Member
@Exophase - I actually think I've found a good commercial game that can be used as a test case for what we discussed about the VBlankIntrWait() SWI. The intro logos for Yggdra Union have always been a head-scratcher for me. My emulator displays only the top half of the "Licensed By Nintendo", "505 Games", and "Sting" logos. I thought this had to do with something related to HBlank. It seems that the game sits in a loop with VBlankIntrWait(), but it also has HBlank interrupts enabled. I gather it's supposed to do something midscreen (switching tiles around?). Could be a false lead, but Riviera (another Sting game published by Atlus) has lots of VBlankIntrWait() calls with other interrupts enabled. Although Riviera plays fine, I think firing HBlank interrupts too late may be the cause of some graphical issues (mostly scene transitions where the BG slides left or right a few scanlines at a time). I'll check it out and let you know how it goes.
 

Shonumi

EmuTalk Member
Pardon the double post, but I've come back with the results of using HLE for VBlankIntrWait(). As I suspected, Yggdra Union had many graphical issues because interrupts are supposed to be serviced even while VBlankIntrWait() is called, mainly HBlanks. The game didn't explode in my emulator, but there were scads of graphics simply missing, not there at all (mostly text, which is important in an RPG...) Riviera also uses it in places. The problem I had was that Riviera will service an HBlank on scanline 159, and during that time a VBlank IRQ is generated, but my emulator would service the VBlank IRQ without telling my HLE SWI code that a VBlank had occurred, so the SWI would get stuck in an endless loop. I've fixed that, just have to clean up the code now (bit of a mess atm). Thanks for the lead Exophase :D
 

Exophase

Emulator Developer
Glad to hear that worked out. Would have been interested to see if Sword of Mana had any issues as well. Also, good that you had a chance to test your fix.
 

Exophase

Emulator Developer
no$ just posted this on ngemu. Thought it would be good to pass along:

http://ngemu.com/threads/gba-open-bus.170809/

Summary: it looks like open bus reads return whatever was last read by the processor (probably residual from having just been driven), which takes into account the prefetch read performed during instruction execution. If a Thumb op is executed from a 32-bit address area it could mean that the top 16-bits contain the value of a recent load instead of a future instruction.
 

Shonumi

EmuTalk Member
Interesting. Obviously since he just released this info, I'm going off of what he previously wrote in GBATEK regarding the open bus. I'll have a look at the new info later. This is one of those things that deserves a dedicated test suite ;) I think I'm finally going to start working on a bunch of GBA homebrew tests aimed at hardware accuracy. Just gotta stop being lazy about it.

About Sword of Mana, it runs the same before and after my new HLE implementation of VBlankIntrWait(). No sound and the sepia intro is glitchy graphically. This is probably because 1) there are still a lot of things that are incorrect elsewhere, 2) DMA FIFO interrupts are not implemented at all, and 3) DMA sound channel support is still rough. I know #2 is required to get much sound from the Metroid games. SoM doesn't explode, but I'm not accurate enough in areas where it counts to play this game.
 

Shonumi

EmuTalk Member
Glad to see you working on your emulator again :) Your progress so far looks nice.

As for timings, well, you don't really need accurate timings to play a majority of games. Only a handful are very sensitive to timing issues. Unlike a system such as the NES (which is notorious for timing issues as it relates to emulation), the GBA prompted a lot of game developers to stop counting individual cycles. This was probably due to the fact that the GBA was incredibly powerful for its time, so you generally have enough CPU cycles to not worry about tight timing conditions. Just as well, the GBA made programming in C viable, taking developers away from assembly and the notion of instruction cycles. You can have terribly inaccurate timings in your emulator; most games will not care, so long as you do things like interrupts when they expect it, especially VBlank.

I was going for cycle accuracy as best as I could, since it's just easier to do things that way, for me at least. I basically emulate the GBA one cycle at a time. I dunno if that has helped me avoid any issues or not. But some of it probably isn't right, I have to go back and look. I know some instructions don't have timings at all. Things like DMAs and all the BIOS functions don't have timings either in my emulator, but that hasn't been a problem so far.
 

Shonumi

EmuTalk Member
The purpose of those set and get functions is to be explicit in the code. To me, it's much clearer what I'm trying to do when I see get_reg() or set_reg() called rather than just accessing an array. Those parts of the code are self-documenting as some would say, so I plan to keep it for now. It's not too slow since the functions are inlined by the compiler and the function itself is just a jump-table. For the CPU interpreter, I actually don't mind how fast it is though because I'm going to write a dynamic recompiler one of these days :D

The real speed-killer in GBE+ is LCD rendering. I render everything per-pixel, rather than per-scanline, and that takes a lot of CPU power, especially for certain LCD operations. LCD performance is definitely something I'll have to optimize in the future though.

Oh, and before I forget, you should check out these demos for testing in your emulator -> http://www.coranac.com/projects/#tonc (see example binaries (v1.4.2) if you don't want to build them yourself).
 

Shonumi

EmuTalk Member
I'm letting the compiler handle inlining functions (the -O3 flag in g++). I'm pretty sure the compiler will do what I want, so I just use the optimization flags. I hear a lot of people say it's bad form to try and dictate things like that in C++ to the compiler when the compiler often has a better idea of how to build the executable than the programmer :p

Anyway, good luck trying to boot the BIOS. Just so you know, the BIOS does make use of several "advanced" LCD effects (scaling sprites, transparency), so don't worry if the graphical output doesn't look right at first.
 

Shonumi

EmuTalk Member
@Flerovium - Yes, those ARM instructions don't use mem_check_16() or mem_check_32(), even though they should be there. I never got around to it, since most games do not exhibit problems without checking for alignment. I'm sure there are a number of cases where it matters, but for now, it's TODO. I want to actually improve the mem_check functions, but I'd also like to come up with some homebrew tests to make sure there are no errors. So for now, it's something to work on in the future :p

Glad my code helped you. It's been a year since I started working on ARM emulation, and that was my first time doing anything on this scale. Hopefully I'll go back over my code and make improvements before I move onto DS emulation (which uses ARMv5 and Coprocessors). Keep up the good work. Great to see that you got something to display. That's always the most exciting part for me. Good luck :D
 

Shonumi

EmuTalk Member
It probably has something to do with how you calculate the Carry Flag when doing rotates. I see that you made some commits to Nanoboy Advance. Did you manage to fix the problem?
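In case it helps, here is a small sketch of the carry rules for ROR as I understand them on the ARM7TDMI. The function and struct names are made up for illustration. Assumptions: a register-specified amount of 0 leaves the carry alone, a non-zero multiple of 32 leaves the value alone but sets carry to bit 31, and an immediate amount of 0 encodes RRX.

```cpp
#include <cstdint>

struct ShiftResult {
    uint32_t value;
    bool carry;
};

// ROR where the amount comes from the low byte of a register.
ShiftResult ror_by_register(uint32_t value, uint8_t amount, bool carry_in) {
    if (amount == 0) {
        return {value, carry_in};  // nothing happens, carry is preserved
    }
    const uint32_t n = amount & 31;
    if (n == 0) {
        return {value, (value >> 31) != 0};  // ROR by 32, 64, ...: value unchanged
    }
    const uint32_t result = (value >> n) | (value << (32 - n));
    return {result, (result >> 31) != 0};  // carry = last bit rotated out
}

// ROR with a 5-bit immediate amount; 0 means RRX (rotate right through carry).
ShiftResult ror_by_immediate(uint32_t value, uint8_t imm5, bool carry_in) {
    if (imm5 == 0) {
        const uint32_t result = (value >> 1) | (uint32_t{carry_in} << 31);
        return {result, (value & 1) != 0};
    }
    return ror_by_register(value, imm5, carry_in);
}
```

Keep in mind the resulting carry only reaches the CPSR for logical data-processing ops with the S bit set.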
 

Shonumi

EmuTalk Member
@Flerovium - If I remember correctly, it's actually pretty common for games to read from the BIOS like that. I saw that behavior a lot while I was debugging various games. Those reads are supposed to return values from the open bus. It could have been a low-quality anti-piracy/anti-emulation measure. The values are pretty consistent when reading from the BIOS, so it's easy to implement. But it's not a very advanced trick, and most games can even be fooled by passing incorrect values (some games check to make sure it's not zero). Other games do this check, but they boot just fine without any proper emulation of BIOS reads.

One other possibility is that it's just wasting time. I don't know why a game would do that specifically, but a dummy check could wait for a precise number of cycles. Either way, it's odd code. I wonder what the original C source code looked like :)
 

Shonumi

EmuTalk Member
Be very careful about LDM! behavior (LDM with writeback). It's not the same for the GBA and DS. There are differences between the ARMv4 and ARMv5 architectures. The ARM CPU in the GBA is ARMv4 (technically ARMv4T since it supports THUMB mode). For the DS, the ARM9 is ARMv5 (ARMv5TE), while the ARM7 is still ARMv4T like the GBA's CPU. On ARMv4, if the base register is also in the register list, writeback is disabled for LDM!. On ARMv5, writeback is enabled if the base register is the only register in the list, or is not the last register in the list.

It's important to know these differences if you're going to make a DS emulator like I plan to ;) See GBATEK about the details -> http://problemkaputt.de/gbatek.htm#armopcodesmemoryblockdatatransferldmstm
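The writeback rule above can be boiled down to one predicate. This is only a sketch based on my reading of GBATEK; ArmArch and ldm_writeback_applies() are hypothetical names, and it only covers LDM with the writeback bit set.

```cpp
#include <cstdint>

enum class ArmArch { V4, V5 };

// Returns true if an LDM with the writeback bit set should actually
// update the base register. rlist is the 16-bit register list from
// the opcode, base is the base register number (0-15).
bool ldm_writeback_applies(ArmArch arch, uint16_t rlist, unsigned base) {
    const uint32_t list = rlist;
    const bool base_in_list = ((list >> base) & 1) != 0;
    if (!base_in_list) {
        return true;  // the normal case: writeback always happens
    }
    if (arch == ArmArch::V4) {
        return false;  // ARMv4: the loaded value wins, no writeback
    }
    // ARMv5: writeback if the base is the only register in the list,
    // or if it is not the last (highest-numbered) register in the list.
    const bool base_is_only = list == (uint32_t{1} << base);
    const bool base_is_last = (list >> (base + 1)) == 0;
    return base_is_only || !base_is_last;
}
```

For example, ldm r0!, {r0, r1} skips writeback on ARMv4 but performs it on ARMv5, since r0 is not the last register in the list.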
 

Shonumi

EmuTalk Member
Deadbody's CPUTest doesn't store results on screen. If it doesn't see a problem, it won't print anything other than "done with tests". The ROM is very basic, so I'd use ARMWrestler if you want something more exact.

Also, good work getting to the "ATLUS" screen in Super Dodge Ball! It's a small step, but it's an important one. If I recall, you need to implement DMA 3 (immediate transfer) and the VBlankIntrWait SWI. If your emulated CPU is accurate enough, that should get you to the title screen.
 

Shonumi

EmuTalk Member
Well, when I got Super Dodge Ball booting, I failed a lot of tests in armwrestler. I don't think you need to be perfect (although being perfect certainly helps :p)

Try to debug things step by step. Take an emulator like VBA-M, no$gba, or mGBA and run it against yours. Try to see if you run code that those emulators don't (this means your code is messing up somewhere). Another handy trick is to set a breakpoint for yourself, then compare the registers between your emulator and the others. If anything looks different, try to go through everything in reverse until you can find out where your emulator deviates from the others. Hope that makes sense. Good luck :D
 

Shonumi

EmuTalk Member
@Flerovium - No, Super Dodge Ball does not make use of memory mirrors as far as I know. The only games that really do are the NES Classics. The code for Super Dodge Ball is pretty standard; it does not do a lot of tricks.
 
