Game Boy

CodeSlinger

New member
Yeah, that's all that register does. Dunno why it's called a divider though. One of its purposes is to supply a seed for a random number generator. As I found out the hard way with Tetris, if you don't implement it then you always get given the same block, which makes the game a bit easy.

Edit:

One of the parts of the Game Boy system I don't fully understand is the LCD Status register. I honestly don't think this is what's causing the low compatibility problems I'm having, but I still want to get it right.

This is how I understand it. I'll take each bit of the register in turn:

BIT 6: LYC=LY Coincidence Interrupt: This bit gets set when LYC == LY. When this bit is set it also sets bit 1 in the Interrupt Request register (0xFF0F), which signals that an LCD interrupt is required. When does bit 6 get reset?

BIT 5: OAM Interrupt. This is set whenever the mode is set to 2. This also requests an interrupt. When does it get reset?

BIT 4: V-Blank Interrupt. This is set during the entire V-Blank period. The thing I don't understand is that there is a separate interrupt for V-Blank, so why would the CPU service the V-Blank interrupt and then service the LCD interrupt for V-Blank? They're the same thing.

BIT 3: H-Blank Interrupt. This is set for the duration of H-Blank. The thing is, there isn't an H-Blank interrupt. Or does it mean request an LCD interrupt and make sure bit 3 is set?

BIT 2: This is set when LY == LYC, the same as bit 6. This doesn't request an interrupt. What's the point of having bit 6 when bit 2 is the same?

BIT 1-0: Set to the current mode of the LCD. I don't know how to determine this.


One other thing. As bits 3-6 are all interrupts, can more than one be set at the same time?
Thanks for any help.
 

Exophase

Emulator Developer
CodeSlinger said:
Yeah, that's all that register does. Dunno why it's called a divider though. One of its purposes is to supply a seed for a random number generator. As I found out the hard way with Tetris, if you don't implement it then you always get given the same block, which makes the game a bit easy.

It's called a divider because the clock is divided off of the main (4MHz) clock, by 256 to be exact. The timer is divided off the main clock too, but by a configurable amount.
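To make the "divided by 256" part concrete, here is a minimal sketch of a divider that is fed elapsed clock cycles. The `Divider` struct and its `step` method are made-up names for illustration, not taken from any emulator discussed in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only.
// DIV (0xFF04) ticks once per 256 cycles of the main clock
// (4194304 Hz / 256 = 16384 Hz).
struct Divider {
    uint32_t cycles = 0;  // clock cycles not yet converted into DIV ticks
    uint8_t  div    = 0;  // value a game would read back from 0xFF04

    void step(uint32_t elapsed) {
        cycles += elapsed;
        // Add every whole tick that has accumulated; uint8_t wraps 0xFF -> 0x00.
        div = static_cast<uint8_t>(div + cycles / 256);
        cycles %= 256;    // keep the leftover cycles for the next call
    }
};
```

Keeping the remainder rather than zeroing the counter means no cycles are lost when an instruction straddles a tick boundary.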
CodeSlinger said:
One of the parts of the Game Boy system I don't fully understand is the LCD Status register. I honestly don't think this is what's causing the low compatibility problems I'm having, but I still want to get it right.

This is how I understand it. I'll take each bit of the register in turn:

BIT 6: LYC=LY Coincidence Interrupt: This bit gets set when LYC == LY. When this bit is set it also sets bit 1 in the Interrupt Request register (0xFF0F), which signals that an LCD interrupt is required. When does bit 6 get reset?

BIT 5: OAM Interrupt. This is set whenever the mode is set to 2. This also requests an interrupt. When does it get reset?

BIT 4: V-Blank Interrupt. This is set during the entire V-Blank period. The thing I don't understand is that there is a separate interrupt for V-Blank, so why would the CPU service the V-Blank interrupt and then service the LCD interrupt for V-Blank? They're the same thing.

BIT 3: H-Blank Interrupt. This is set for the duration of H-Blank. The thing is, there isn't an H-Blank interrupt. Or does it mean request an LCD interrupt and make sure bit 3 is set?

Bits 3-6 are all interrupt enable bits. When one of them is on, the LCDC IRQ (the one requested through bit 1 in IF and serviced at 0x48) goes off whenever the corresponding event happens. One thing that was really unclear to me until I looked at the source for another GB emulator: the vblank IRQ (bit 0, serviced at 0x40) always goes off when vblank starts at LY == 144. If bit 4 is on, the LCDC IRQ goes off too.

The way all exceptions work is that the IF bit that was raised is cleared as soon as the exception is taken. If for whatever reason the exception can't be taken, that is, if that interrupt is disabled (either globally through IME, which the di instruction clears, or individually in the IE register), then the bit will stay 1 until it can be, or until the user clears it manually. The CPU can tell which exception was raised by virtue of which exception handler was called. In the case of LCDC the handler has to look at the video mode bits, because all of those events go to the same handler.
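A rough sketch of that dispatch logic, for anyone following along. The `Cpu` struct and `serviceInterrupts` are illustrative names only, and pushing the return address is left out to keep it short.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative state, not from any particular emulator.
struct Cpu {
    bool     ime   = false;  // master enable: cleared by di, set by ei/reti
    uint8_t  ie    = 0;      // 0xFFFF, per-interrupt enable bits
    uint8_t  iflag = 0;      // 0xFF0F (IF), pending request bits
    uint16_t pc    = 0;
};

// Returns true if an interrupt handler was entered.
// Bit 0 = vblank (0x40), bit 1 = LCDC (0x48), bit 2 = timer (0x50),
// bit 3 = serial (0x58), bit 4 = joypad (0x60).
bool serviceInterrupts(Cpu& cpu) {
    if (!cpu.ime) return false;  // requests simply stay pending in IF
    uint8_t pending = static_cast<uint8_t>(cpu.iflag & cpu.ie & 0x1F);
    for (int bit = 0; bit < 5; ++bit) {
        if (pending & (1 << bit)) {
            // The IF bit is cleared when the handler is entered, not when it returns.
            cpu.iflag = static_cast<uint8_t>(cpu.iflag & ~(1u << bit));
            cpu.ime = false;  // no further interrupts until re-enabled
            // (pushing the old PC onto the stack is omitted in this sketch)
            cpu.pc = static_cast<uint16_t>(0x40 + bit * 8);
            return true;
        }
    }
    return false;
}
```

Since the LCDC handler at 0x48 is shared by all four STAT events, the game's handler would read the mode bits (or LY) itself to work out which one fired.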

CodeSlinger said:
BIT 2: This is set when LY == LYC, the same as bit 6. This doesn't request an interrupt. What's the point of having bit 6 when bit 2 is the same?

This isn't an interrupt enable bit, it's an actual status bit that's only on for the scanlines where LY == LYC.

CodeSlinger said:
BIT 1-0: Set to the current mode of the LCD. I don't know how to determine this.

The mode bits say what the video chip is currently doing. For 144 scanlines it's in active display and the documentation floating around lists it like this:

For ~204 cycles it's in mode 0, or hblank, and you can freely access OAM and VRAM.
Then for ~80 cycles it's in mode 2, where it is processing the sprites in preparation to draw the scanline. During this time you can't access OAM.
Then for ~172 cycles it's in mode 3, where it's drawing the scanline. During this time you can't access OAM or VRAM.

This makes up 456 cycles per scanline. The numbers given are only approximations, possibly because they can't be measured exactly with the CPU (not fine enough precision). These are in 4MHz units, by the way.

In reality, what I hear is that these periods vary a lot depending on how many sprites are active on the scanline. I read VBA's source on this and couldn't really make sense of why it was doing things the way it was, but it looked like every sprite drawn added 8 or 12 cycles depending on whether it's an even or odd sprite out of the ones drawn. Having the window active can apparently add another 4 cycles. These cycles come out of mode 0 for the next scanline (so the two together should always add up to the same amount). I'd love to see some real documentation on all of this.

The cribsheet gives times in microseconds. It says 48.64 for mode 0 with no sprites, which at 4MHz (4194304Hz, to be exact) confirms 204 cycles. It says 18.72 for 10 sprites, which is a bizarre 78.5 cycles, or 125.5 fewer. Following what I gave above, the odd sprites happen half of the time, so there should be (5 * 8) + (5 * 12) cycles taken away, or 100. Where the other 25.5 would be going is anyone's guess.
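The conversion behind those numbers is just microseconds times the clock rate. A tiny sketch, where `microsToCycles` and `kClockHz` are made-up names:

```cpp
#include <cassert>
#include <cmath>

// Main clock rate quoted in the post above.
constexpr double kClockHz = 4194304.0;

// Convert a duration in microseconds to main clock cycles.
double microsToCycles(double us) {
    return us * kClockHz / 1e6;
}
```

With this, 48.64 us rounds to 204 cycles and 18.72 us comes out at roughly 78.5, so the ten-sprite case is about 125.5 cycles shorter. The 8-or-12 estimate accounts for 100 of those, which leaves the unexplained 25.5.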

I looked at Gambatte as well, which is supposed to be the most accurate GB emulator there is. Unfortunately, the video part of the source code is next to impossible to decipher without a lot of studying (or maybe even with). It's done in a very complex/cryptic OOP event-based style that isn't really documented at all, and it uses a lot of nested ternaries and other coding styles I find difficult to glean anything from. It seems that it adds between 6 and 11 cycles per sprite, 6 for the window, and something else for the current SCX value (how far off alignment it is from the tile map).

I personally wouldn't worry about getting the mode intervals right at all, but it's important to set them for at least some times because some games will poll, waiting for them to change.
 

Cyberman

Moderator
Sorry about that Exophase, in some strange, twisted event when I clicked quote I ended up editing your post. I restored it (hence it says Edited by Cyberman down there). I suppose I could do more editing, but I am kind of fizzled out after explaining hardware sprite systems, so I am just going to stop right here, having mostly fixed the post, and do no more. That's what happens when you get an 11 year old asking you questions and playing with stuff whilst posting.

Cyb
 

Exophase

Emulator Developer
Cyberman said:
Sorry about that Exophase, in some strange, twisted event when I clicked quote I ended up editing your post. I restored it (hence it says Edited by Cyberman down there). I suppose I could do more editing, but I am kind of fizzled out after explaining hardware sprite systems, so I am just going to stop right here, having mostly fixed the post, and do no more. That's what happens when you get an 11 year old asking you questions and playing with stuff whilst posting.

Cyb

No problem, I've accidentally done that before on forums I was admin of too :B Fixed it up a little more.

I think GB's sprites are kinda interesting and unconventional. I don't know any other platform that has a variable hblank depending on number of sprites visible. I had a theory that PC-Engine might because we're getting really weird VDC timing flaws in various games (no emulator knows what's going on really), but that didn't pan out in the tests I got ran on hardware.
 

CodeSlinger

New member
Thanks for the detailed reply, Exophase.

So during V-Blank it is possible for both interrupts to be executed? The V-Blank interrupt and the LCD interrupt with bit 4 set? Of course this is assuming IME is enabled and so is the corresponding IF bit.

I take it the LCD interrupt only occurs when one of bits 3-6 goes from 0 to 1. And when the interrupt has finished being serviced the bit gets reset from 1 to 0?

I had a look at the modes. Am I right in thinking it starts with mode 2, then mode 3, then mode 0, repeating until vblank, when it sets mode 1? Then at the end of vblank it goes back to mode 2?

As for the LYC = LY stuff: status bit 2 is set for the entire duration of LYC = LY, whereas the interrupt bit is set as soon as LYC = LY and is reset after the interrupt has been serviced. So once the interrupt has been serviced, bit 6 gets reset even if LYC still equals LY?
 

Exophase

Emulator Developer
CodeSlinger said:
Thanks for the detailed reply, Exophase.

So during V-Blank it is possible for both interrupts to be executed? The V-Blank interrupt and the LCD interrupt with bit 4 set? Of course this is assuming IME is enabled and so is the corresponding IF bit.

Probably. I haven't read anything to suggest otherwise.

CodeSlinger said:
I take it the LCD interrupt only occurs when one of bits 3-6 goes from 0 to 1. And when the interrupt has finished being serviced the bit gets reset from 1 to 0?

The LCD interrupt occurs when all four of these are true:

- The particular event happens (probably only at a transition point, like the hblank IRQ event would trigger when the mode transitions from 3 to 0)
- The bit in STAT corresponding to that event is set to 1
- The LCD IRQ is enabled in IE
- IME is set

The bits in STAT are bits that allow for an interrupt to be generated on those conditions, not bits that cause the interrupt by themselves. Setting them high probably won't cause an interrupt to happen, even if it's in the middle of the appropriate period. They won't get set back low unless the code explicitly does so.
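Put together, the first two conditions might look something like the sketch below, with the IE and IME checks left to the normal interrupt dispatch. The `Lcd` struct and `setMode` are illustrative names, and the edge-triggered behaviour is the "probably" from the list above rather than confirmed hardware behaviour.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative register state; bit layout follows the posts above.
struct Lcd {
    uint8_t stat  = 0;  // 0xFF41: bits 3-6 enables, bit 2 coincidence, bits 0-1 mode
    uint8_t iflag = 0;  // 0xFF0F (IF)
};

// Call whenever the video mode changes.
// Mode 0 (hblank) -> enable bit 3, mode 1 (vblank) -> bit 4, mode 2 (OAM) -> bit 5.
void setMode(Lcd& lcd, uint8_t newMode) {
    assert(newMode <= 3);
    uint8_t oldMode = lcd.stat & 0x03;
    lcd.stat = static_cast<uint8_t>((lcd.stat & ~0x03) | newMode);
    if (newMode == oldMode || newMode == 3) return;  // no transition, or mode 3 has no enable bit
    uint8_t enableBit = static_cast<uint8_t>(1u << (3 + newMode));
    if (lcd.stat & enableBit) {
        lcd.iflag |= 0x02;  // request the LCDC IRQ; the enable bit itself is left alone
    }
}
```

Note that nothing here ever clears bits 3-6. Only a write to STAT by the game does that.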

CodeSlinger said:
I had a look at the modes. Am I right in thinking it starts with mode 2, then mode 3, then mode 0, repeating until vblank, when it sets mode 1? Then at the end of vblank it goes back to mode 2?

From the way the diagram is laid out it looks like it starts with mode 0 (hblank), then transitions to 2 (OAM access), then 3 (hrefresh), for active display scanlines. For vblank scanlines (144 to 153) the mode is 1.

CodeSlinger said:
As for the LYC = LY stuff: status bit 2 is set for the entire duration of LYC = LY, whereas the interrupt bit is set as soon as LYC = LY and is reset after the interrupt has been serviced. So once the interrupt has been serviced, bit 6 gets reset even if LYC still equals LY?

If by interrupt bit you mean bit 1 in IF then yes, and keep in mind that "serviced" just means that the interrupt handler has been jumped to, not that it has completed. I haven't seen anything in the documentation to suggest anything other than status bit 2 being high whenever LY == LYC, so for the entire scanline unless the game resets LY or changes LYC.
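A sketch of how the two LYC bits could be kept apart in an emulator: bit 2 tracks the comparison continuously, while bit 6 is only consulted when the comparison first becomes true. `LcdRegs` and `updateCoincidence` are hypothetical names, and requesting the IRQ only on the rising edge is an assumption in line with the "transition point" idea above.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative register state.
struct LcdRegs {
    uint8_t ly    = 0;  // 0xFF44, current scanline
    uint8_t lyc   = 0;  // 0xFF45, compare value
    uint8_t stat  = 0;  // 0xFF41
    uint8_t iflag = 0;  // 0xFF0F (IF)
};

// Call after LY or LYC changes.
void updateCoincidence(LcdRegs& r) {
    bool wasSet = (r.stat & 0x04) != 0;
    if (r.ly == r.lyc) {
        r.stat |= 0x04;                       // status bit: high for as long as LY == LYC
        if (!wasSet && (r.stat & 0x40)) {
            r.iflag |= 0x02;                  // enable bit 6 is set, so request the LCDC IRQ once
        }
    } else {
        r.stat = static_cast<uint8_t>(r.stat & ~0x04);
    }
}
```

Bit 6 is never modified here, which matches the point that the enable bits only change when the game writes them.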
 

CodeSlinger

New member
Thanks for the reply!

One thing I don't understand about the STAT register: say bit 4 is set, which is an interrupt request, and IF bit 1 is also set (and IME is enabled), so it jumps to the address of the STAT interrupt routine. However, if it doesn't reset bit 4 after servicing the STAT interrupt, then surely the next cycle will service this interrupt again and again until the mode of the LCD changes and bit 4 is reset? I'd have thought that once the STAT interrupt is serviced, bit 4 gets reset straight away so it isn't serviced again?

Also, the timer register. I know how the timer register works: it counts up at a certain frequency, and when it wraps round to 0 an interrupt is requested and the timer register is set to the value of the timer modulo.

It says in the documentation the first frequency is 4096 Hz. Does this mean that the timer register gets incremented 4096 times a second, so there are 16 timer interrupts a second (not counting when the timer is disabled or running at a different frequency)? The reason it would be 16 is that 4096 divided by 256 is 16 (256 being the number of increments the register needs before it wraps round). Or does the 4096 frequency mean there are 4096 timer interrupts a second, and the timer register has to be incremented at a rate that will achieve this? The documentation says it is the second scenario (4096 interrupts a second), but logically I'd say that's incorrect and it would be the first scenario (16 interrupts a second).

Thanks again for your help.
 

Exophase

Emulator Developer
CodeSlinger said:
Thanks for the reply!

One thing I don't understand about the STAT register: say bit 4 is set, which is an interrupt request, and IF bit 1 is also set (and IME is enabled), so it jumps to the address of the STAT interrupt routine. However, if it doesn't reset bit 4 after servicing the STAT interrupt, then surely the next cycle will service this interrupt again and again until the mode of the LCD changes and bit 4 is reset? I'd have thought that once the STAT interrupt is serviced, bit 4 gets reset straight away so it isn't serviced again?

Bits 3-6 in STAT are not IRQ active lines, they're IRQ enable bits. The bits mean "allow this event to cause an interrupt when it happens", they do not reflect that the event has happened. If these bits are not set then the mode transitions they represent will never cause the LCDC IRQ bit to go high in IF. If they are set, then the bit in IF will go high, but if it's serviced (because the corresponding bit in IE is set, and IME is set) then that bit in IF will immediately go low again, preventing the IRQ from retriggering again.

CodeSlinger said:
It says in the documentation the first frequency is 4096 Hz. Does this mean that the timer register gets incremented 4096 times a second, so there are 16 timer interrupts a second (not counting when the timer is disabled or running at a different frequency)? The reason it would be 16 is that 4096 divided by 256 is 16 (256 being the number of increments the register needs before it wraps round).

It would be at least 16, 16 only if TMA is set to 0.

CodeSlinger said:
Or does the 4096 frequency mean there are 4096 timer interrupts a second, and the timer register has to be incremented at a rate that will achieve this? The documentation says it is the second scenario (4096 interrupts a second), but logically I'd say that's incorrect and it would be the first scenario (16 interrupts a second).

What documentation are you reading? If it's using the 4KHz divider then that means it increments the TIMA register at a rate of 4096 times per second.
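The "at least 16" above falls out of a one-line formula: after each overflow TIMA restarts from TMA, so it only needs 256 - TMA increments to overflow again. A small sketch, with `timerInterruptsPerSecond` being a made-up helper name:

```cpp
#include <cassert>
#include <cstdint>

// Steady-state timer interrupt rate for a given TIMA increment rate and TMA reload value.
double timerInterruptsPerSecond(double tickHz, uint8_t tma) {
    return tickHz / (256 - tma);
}
```

At 4096 Hz this gives 16 interrupts per second for TMA = 0, 256 for TMA = 0xF0, and 4096 only in the extreme case of TMA = 0xFF, which may be where the "4096 interrupts a second" reading comes from.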
 

CodeSlinger

New member
Thanks again for the reply :)

I think I understand the LCD status now. I just implemented it like you said and the compatibility improved. Basically bits 3-6 of the STAT register are set by the programmer, not the emulator. They represent the circumstances under which the LCD IRQ should happen. So if the programmer is interested in HBLANK they set bit 3, and for the duration of mode 0 (HBLANK) the emulator checks bit 3; if it is set, it requests an LCD interrupt. However, the interrupt will only get serviced if IF bit 1 is set along with IME enabled. Does that sound about right?

It seems to be the timers I'm having most of my problems with. This is what I've got so far:

Code:
void Emulator::DoTimers( int cycles )
{
	BYTE timerAtts = m_Rom[0xFF07];

	

	m_DividerVariable += cycles ;

	// get the timer frequency

	int timerVal = GetTimerFrequency(timerAtts) ;

	int incrementlimit= 0 ;


	//This is the part question 2 refers to
	// the value incrementlimit gets set to is clock speed / frequency
  	switch(timerVal)
	{
		case 0: incrementlimit= 1024 ; break ; // 4194304 Hz clock / 4096 Hz
		case 1: incrementlimit= 16; break ;    // 4194304 / 262144
		case 2: incrementlimit= 64 ;break ;    // 4194304 / 65536
		case 3: incrementlimit= 256 ;break ;   // 4194304 / 16384
		default: assert(false); break ; // weird timer val
	}

	if(IsTimerEnabled(timerAtts))
	{
		m_TimerVariable += cycles ;

		// time to increment the timer
		if (m_TimerVariable >= incrementlimit)
		{
			m_TimerVariable = 0 ;
			bool overflow = false ;
			if (m_Rom[0xFF05] == 0xFF)
			{
				overflow = true ;
			}
			m_Rom[0xFF05]++ ;

			if (overflow)
			{
				
				m_Rom[0xFF05] = m_Rom[0xFF06] ;

				// request the interupt
				RequestInterupt(2) ;
			}	
		}
	}
	// timer not enabled. Reset timer
	else
	{
		//This is the part question 1 refers to
		m_Rom[0xFF05] = 0 ;
		m_TimerVariable = 0 ;
	}

	// do divider register
	if (m_DividerVariable >= 256)
	{
		m_DividerVariable = 0;	
		m_Rom[0xFF04]++ ;
	}
}

The parts of the above code I'm unsure about:

1. If the timer is disabled, do I reset the current timer count (TIMA) to 0 along with the timer tracker (m_TimerVariable)?

2. When the clock frequency changes, the emulator reacts immediately. Say the current frequency is 4KHz, so the timer counter (TIMA) increments every 1024 clock cycles, and the cycle count is at 500, so it's about halfway to incrementing TIMA. If the frequency then changes to 65KHz, TIMA should increment every 64 cycles. As the count is at 500 and the limit is now 64, it immediately increments TIMA (requesting a timer interrupt and restarting from the value in TMA if that overflows) and starts counting again. Is this correct, or should the new timer frequency only take effect after the next TIMA overflow?

3. What happens to the timer while the timer interrupt is being serviced? Is it still counting, or does it stop until the handler has finished? Is it possible for the timer to interrupt the current timer handler if IME and IF are both enabled?
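Separately from the three questions, one detail in the code above is worth a look: it zeroes m_TimerVariable on every increment, which throws away any cycles past the limit. A minimal sketch of carrying the remainder instead, using a hypothetical `Timer` struct (the `period` argument plays the role of incrementlimit):

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only; names are not from the code above.
struct Timer {
    uint32_t acc  = 0;     // cycles accumulated toward the next TIMA increment
    uint8_t  tima = 0;     // 0xFF05
    uint8_t  tma  = 0;     // 0xFF06
    bool     irq  = false; // set when a timer interrupt should be requested

    void step(uint32_t cycles, uint32_t period) {
        assert(period > 0);
        acc += cycles;
        while (acc >= period) {   // may tick more than once for a long instruction
            acc -= period;        // keep the leftover cycles
            if (tima == 0xFF) {
                tima = tma;       // overflow: reload from TMA
                irq = true;
            } else {
                ++tima;
            }
        }
    }
};
```

This does not settle what the hardware does on a frequency change or while disabled. It only avoids slow drift from discarded cycles.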

Thanks again for your help
 

CodeSlinger

New member
Well, after a month of trying to figure out why Bubble Ghost wasn't working I finally cracked it.

It turns out my function for the opcode "Set bit n in the memory address pointed to by HL" was doing two reads. So it ended up setting the bit in the value at the address read from (HL), rather than in the value at address HL itself. DOH!

Compatibility is about 60% now. My main concern at the moment is why Mario 2 freezes at the start of the first level. I'm pretty sure it's to do with the LCD status and the modes, because if I mess about with it I can get Mario 2 working, but I know it is the wrong way of handling the STAT register. Still, the end is in sight!
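For reference, the single-read version of that opcode might look like the sketch below. `Memory` and `setBitAtHL` are stand-in names, and the flat 64K array ignores banking and I/O side effects.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Flat 64K memory, purely for illustration.
struct Memory {
    std::array<uint8_t, 0x10000> bytes{};
    uint8_t read(uint16_t addr) const { return bytes[addr]; }
    void write(uint16_t addr, uint8_t value) { bytes[addr] = value; }
};

// SET b,(HL): read the byte at HL once, set the bit, write it back to HL.
void setBitAtHL(Memory& mem, uint16_t hl, int bit) {
    assert(bit >= 0 && bit < 8);
    uint8_t value = mem.read(hl);  // exactly one read
    // The bug described above amounts to treating this value as another
    // address and reading again, so the wrong byte gets modified.
    mem.write(hl, static_cast<uint8_t>(value | (1u << bit)));
}
```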
 

Exophase

Emulator Developer
CodeSlinger said:
I'm pretty sure it's to do with the LCD status and the modes, because if I mess about with it I can get Mario 2 working, but I know it is the wrong way of handling the STAT register. Still, the end is in sight!

You're on the right track. Super Mario Land 2 will break if an LCD IRQ triggers while the LCD is disabled, because said IRQ will cause a bank switch without switching back.
 

CodeSlinger

New member
Yeah, it was the LCD status. It turns out that the mode cycles through 2, 3, 0 on each visible scanline and is 1 during vblank. Not 0, 2, 3. Mario 2 now plays up until about the 3rd level.

My compatibility has suddenly shot up, probably near the 80% mark now, which I'm really happy with. However, I'm currently stamping over memory, which is causing the emulator to bomb out, so I'll have to fix that first before getting an accurate compatibility rating.

I heard sound emulation for the Master System is easier than for the Game Boy, so I might get the Game Boy emulator running at about 95% without sound, then start the Master System, and once I've learnt sound for that I'll come back to the Game Boy.
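Using the approximate cycle counts Exophase gave earlier (80 for mode 2, 172 for mode 3, the rest of the 456 for mode 0), that ordering can be sketched as a lookup from scanline position to mode. `lcdMode` is a made-up name, and the fixed boundaries ignore the per-sprite variation discussed above.

```cpp
#include <cassert>
#include <cstdint>

// Returns the STAT mode (0-3) for a position within the frame.
// ly: scanline 0-153; cycleInLine: 0-455 in 4MHz clock cycles.
uint8_t lcdMode(int ly, int cycleInLine) {
    assert(ly >= 0 && ly <= 153);
    assert(cycleInLine >= 0 && cycleInLine < 456);
    if (ly >= 144) return 1;                // vblank lines 144-153
    if (cycleInLine < 80) return 2;         // OAM search comes first
    if (cycleInLine < 80 + 172) return 3;   // then drawing the scanline
    return 0;                               // hblank for the remainder
}
```

Going by Exophase's point about Super Mario Land 2, an emulator would also want to skip any LCD IRQ requests while the LCD is switched off. That is not shown here.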
 

givemeachance

New member
Hi everyone...

A long time ago I finished my chip8 emulator (BTW, thanks to those who helped), and now I'm almost ready to start coding a GB emulator.

I'm looking for a list of opcodes (showing their actual values etc). Does anyone know where I can find one?

Thanks.


Never mind! Found them :bouncy: ...at least wish me good luck :sombrero:
 

CodeSlinger

New member
Good luck with your Game Boy emulator. I loved coding mine and learnt a hell of a lot in the process. It is much more difficult than the chip8, but there is a lot of content in this thread to help understand it.

I'm personally finding the Master System easier than the Game Boy, except the Master System uses a full Z80, making it more difficult to find bugs :)
 

givemeachance

New member
Thanks!

It's indeed harder than I thought, but I will make it (I hope :fireman: ).

The only bad thing is that I can't find any source code to study.
I downloaded the GnUboy source and, hell, the code design is really bad and it's
impossible to learn by just reading the code :borg: .

Do you guys know where I can find source code I can actually learn from?
 

bcrew1375

New member
Ah, the nostalgia. Looks like it's been about a year and three months since I last posted in this thread. I miss the days when we were all coding our emulators together. :( Anyway, if you look way back on page 40, I posted the source to my emulator. It is not very fast, and it doesn't have great compatibility, but I tried to make the code as readable as possible. Hopefully, you can learn most of what you need from it.
 

givemeachance

New member
Thanks! Your source is really clean!
Even though I'm more familiar with C++, I've never seen such a well organized C project.

A few suggestions on how to improve your emu:

1) In sdlfunc.c you can use register-type variables. It might improve the loop a little (or not at all).

2) In cpu.c use an array of function pointers when executing opcodes (this will give a big speed boost), and also don't run all tasks so often. Use a timer.

Good job, and thanks!
 

bcrew1375

New member
Yeah, that source is pretty outdated. I fixed a lot of bugs after that, but I did about squat for efficiency. I messed a lot of my code up a while back and I'm going through it trying to find what I've broken.
 

Exophase

Emulator Developer
givemeachance said:
1) In sdlfunc.c you can use register-type variables. It might improve the loop a little (or not at all).

In UpdateScreen? For i and j? I doubt they wouldn't be allocated to registers for x86.

givemeachance said:
2) In cpu.c use an array of function pointers when executing opcodes (this will give a big speed boost),

This is a good case of conventional advice regarding interpretive emulation that's not correct, or at least, should not be correct. Calling arrays of function pointers should actually be slower than using a big switch - granted, he's calling functions inside the switch, but if he makes them inline then it'll make this moot. Which is probably what he should do if he wants to see an immediate improvement.

Calling a function is a lot heavier than going to a switch case. The compiler will likely turn the latter into a jump table (not a call table), with the only possible negative deterrent being an additional cmp for range checking, although some compilers are smart enough to omit this. With a function, you have a register save convention barrier to hop through, which can not be omitted because the compiler won't be smart enough to deduce the class of functions you're calling. You have to go through the function prologue/epilogue which can cost a lot depending on the arch (even just the additional return can cost more). You have to either pass the parameters individually, through a struct (or this pointer, in C++), or in globals - no matter what it's going to mean accessing memory, rather than keeping things in registers. Not necessarily a huge deal for x86, but again, for other popular archs like ARM you're making extra work. If you use parameters then boiler plate to shuffle them into place has to be done for every opcode executed. Because the compiler knows nothing about the function it's calling it has to respect the caller save conventions full stop, which weakens the register allocation pool overall.

With a switch, the most common variables like PC, flags, accumulator, can be kept in registers (here the "register" keyword MAY actually help) and don't have to be shuffled around.

If the switch is slower then that probably means the compiler is doing something very bad and you should investigate.

If using GCC then the best thing you can do is probably:

- Array of label pointers
- Variable goto for this
- Do this at the end of every instruction handler, instead of a loop, to avoid a jump

This is what people tend to do with the better ASM interpreters.
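A toy illustration of those three points, using GCC's "labels as values" extension (also accepted by Clang, not standard C++). The three-opcode bytecode here is invented for the example and has nothing to do with real Game Boy opcodes; `runToy` and `DISPATCH` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy bytecode: 0 = increment accumulator, 1 = decrement, 2 = halt.
// Assumes every byte is 0-2 and the program ends with a halt;
// a real interpreter would have a full 256-entry table.
int runToy(const std::vector<uint8_t>& code) {
    // Array of label pointers, indexed by opcode.
    static void* const handlers[] = { &&op_inc, &&op_dec, &&op_halt };
    int acc = 0;
    std::size_t pc = 0;

// Variable goto, repeated at the end of every handler instead of looping back.
#define DISPATCH() goto *handlers[code[pc++]]

    DISPATCH();
op_inc:
    ++acc;
    DISPATCH();
op_dec:
    --acc;
    DISPATCH();
op_halt:
    return acc;

#undef DISPATCH
}
```

Because `acc` and `pc` are plain locals in one function, the compiler is free to keep them in registers across handlers, which is the advantage over a table of function pointers described above.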

Some major things that you can do to improve speed: mainly you check for a TON of stuff every instruction and you can reduce this down to almost nothing:

- First, you can determine statically how many cycles will occur until the next IRQ for the most part - sometimes this number could suddenly get changed but there are ways to handle that. So you shouldn't have to constantly check for IRQs.
- You don't need to check halted or stopped every instruction, once it gets halted you can back out into a loop that deals with it specifically instead of running instructions. That's three if's avoided.
- You can make logging a compile time variable, so you don't have to check it.
- Although this is more complex, it's possible to factor out divide and timer counters and instead have a global timestamp that's updated, and then calculate these values when they're read (but this might complicate your memory reading code)
- At least, you can fold the check for zero into one counter that's the smallest of them.
- No reason to kill lower bits of STAT all the time, you can do that when it's written to.
- For that matter, no reason to check video state every instruction, when the lines are only changing once every so many cycles. You can factor this OUTSIDE of the loop.

If you did all these things you could probably increase speed by a good order of magnitude.
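As one example from that list, the "global timestamp" idea for the divider could be sketched like this: nothing is counted per instruction, and DIV is computed only when the game reads it. `Clock`, `readDiv` and `writeDiv` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative lazy divider: no per-instruction counter updates.
struct Clock {
    uint64_t now        = 0;  // total clock cycles executed so far
    uint64_t divResetAt = 0;  // timestamp of the last write to 0xFF04

    // Called from the memory read handler for 0xFF04.
    uint8_t readDiv() const {
        return static_cast<uint8_t>((now - divResetAt) / 256);
    }

    // Any write to 0xFF04 resets DIV to zero.
    void writeDiv() { divResetAt = now; }
};
```

The main loop then only has to add the instruction's cycle count to `now`. The cost moves into the memory read path, which is the complication mentioned in the list.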

givemeachance said:
and also don't run all tasks so often. Use a timer.

What do you mean exactly?
 

givemeachance

New member
Wow... awesome post, thanks for all the information!

Exophase said:
In UpdateScreen? For i and j? I doubt they wouldn't be allocated to registers for x86.

Maybe.
The best solution is to create an empty image and write the pixels at runtime.
Then you can just render the final image without having to check the color of each pixel.
I think it's much faster...

Exophase said:
This is a good case of conventional advice regarding interpretive emulation that's not correct, or at least, should not be correct. Calling arrays of function pointers should actually be slower than using a big switch - granted, he's calling functions inside the switch, but if he makes them inline then it'll make this moot. Which is probably what he should do if he wants to see an immediate improvement.

Hmm... I'm not the best coder of course, but how can this be faster:

Code:
switch(opcode)
{
	case 0 ... 255 
}

which translates to :

if (opcode == 0) else if ( ==1 ) else if ( ==255)

than this:
Code:
//stored at 0x00
forceinlined void OP_NOP(TVirtualMachine* vm)
{
	vm->pc++;
}

then executing the opcode directly :
Code:
	if( (currentOP>=0x00)&&(currentOP<= max) )CPU_FUNCS[currentOP](vm); // handlers take the vm pointer

I think it's much faster...

Exophase said:
What do you mean exactly?

In games we usually delay the update loop for about ".1" ms, and it improves
performance a lot.
I think it might work well with a GB emu.
 
