Porting Mupen64Plus to Android

Ari64

New member
First, I added 8MB more to .bss:
Code:
	.bss
	.align	12
	.type	extra_memory, %object
	.size	extra_memory, 16777216
extra_memory:
	.space	16777216+8388608+64+16+16+8+8+8+8+256+8+8+128+128+128+16+8+132+4+256+512+4194304
rdram_memory = extra_memory + 16777216
	.align	12
	.type	rdram_memory, %object
	.size	rdram_memory, 8388608
dynarec_local = rdram_memory + 8388608
	.align	4
	.type	dynarec_local, %object
	.size	dynarec_local, 64
This shouldn't even compile. The symbol is rdram, not rdram_memory. If the linker isn't failing due to undefined symbols then you must have separate symbols for both rdram and rdram_memory.

And the redundant .aligns are useless.


Then I pointed rdram to this area in assem_arm.h:
Code:
// This is defined in linkage_arm.s, but gcc -O3 likes this better
//#define rdram ((unsigned int *)0x80000000)
extern char rdram_memory[8388608];
#define rdram ((unsigned int *)rdram_memory)

rdram is used in many more places than assem_arm.*. This cannot possibly work.

BTW, the reason I repeated that definition was to work around a bug in gcc's optimizer. It was making some invalid assumptions about where rdram was located, and removing null pointer checks. I'm assuming you won't run into this bug with rdram in bss.

Next, I mmapped it in new_dynarec_init():
Code:
  u_char *tmp=(u_char*)rdram;
  if (mmap (tmp, 8388608,
            PROT_READ | PROT_WRITE | PROT_EXEC,
            MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS,
            -1, 0) <= 0) {__android_log_print( ANDROID_LOG_ERROR, "new_dynarec", "Error: mmap() of rdram failed!" );}

There is absolutely no reason to make this region executable. Get rid of this.
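
A side note on the error check in the snippet above: mmap() reports failure by returning MAP_FAILED, which is (void *)-1, so comparing the result against 0 does not reliably detect it. Below is a minimal sketch of a plain read/write anonymous mapping with the usual check. The helper names map_rw_region and unmap_region are made up for illustration, and the sketch deliberately leaves out MAP_FIXED and PROT_EXEC.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical helper: map `size` bytes of zero-filled read/write memory.
 * Returns NULL on failure so callers can use an ordinary null check. */
static void *map_rw_region(size_t size)
{
    assert(size > 0);
    void *p = mmap(NULL, size,
                   PROT_READ | PROT_WRITE,        /* data only, no PROT_EXEC */
                   MAP_PRIVATE | MAP_ANONYMOUS,
                   -1, 0);
    if (p == MAP_FAILED) {                        /* not NULL, not "<= 0" */
        perror("mmap");
        return NULL;
    }
    return p;
}

/* Hypothetical helper: release a region obtained from map_rw_region(). */
static int unmap_region(void *p, size_t size)
{
    return munmap(p, size);
}
```

This only applies to memory that was obtained from mmap() in the first place. It is not a recipe for remapping something that already lives in .bss.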

And munmapped it in new_dynarec_cleanup():
Unmapping part of bss is just asking for trouble. Seriously, wtf?

And finally, I set using_tlb to 1 in new_dynarec_init():
Code:
  // TLB
  //using_tlb=0;
  using_tlb=1;
Okay.

Memory info after the changes:
Code:
003ba000 g    DO .bss	01000000 extra_memory
013ba000 g    DO .bss	00800000 rdram_memory

The allocation is fine, but should be named rdram.
 
paulscode

New member
This shouldn't even compile. The symbol is rdram, not rdram_memory. If the linker isn't failing due to undefined symbols then you must have separate symbols for both rdram and rdram_memory.
rdram is used in many more places than assem_arm.*. This cannot possibly work.
The allocation is fine, but should be named rdram.
My programming instructor once informed the class that there are 3 ways to solve any problem: the right way, the wrong way, and Paul's way. In this case, "Paul's way" works like this: rdram_memory is the block of memory (a char array in C), and rdram (the symbol used elsewhere) is a pointer to it. I do realize that arrays can generally be treated as pointers, but there are some things that can be done with pointers which cannot be done with arrays (thus the redundancy here). I was just making sure I covered all my bases, since I don't know how the rest of the program uses rdram. If nothing requires a pointer, I'll just call the area "rdram", extern it as a char array, and be good to go. Also, I forgot to mention that I added the following line to memory.h, which is giving the rest of the core access to the pointer:
Code:
#ifdef __arm__
#include "../r4300/new_dynarec/assem_arm.h"
#endif

And the redundant .aligns are useless.
Sorry, I wasn't sure if that was the proper syntax (never used assembly before). So once I say ".align 12", it stays that way until I say ".align 4".

BTW, the reason I repeated that definition was to work around a bug in gcc's optimizer. It was making some invalid assumptions about where rdram was located, and removing null pointer checks. I'm assuming you won't run into this bug with rdram in bss.
Cool, so I can just call the char array "rdram" and be done with it. I'll give that a try later.

There is absolutely no reason to make this region executable. Get rid of this.
Unmapping part of bss is just asking for trouble. Seriously, wtf?
Not understanding how these areas are being used by the rest of the program, my main concern was leaving something out that might be needed, and thus the redundancy. I'll get rid of these. As for the Unmapping, sorry for insulting you by using the wrong protocol. This is my first experience with assembly and the term ".bss" -- I just copy-pasted that from the BASE_ADDR code without a lot of consideration. The app crashes long before it ever reaches that line (that function would only be called when shutting down). Unmap was also called on BASE_ADDR as well, so I'll cut out both of these unmaps.
 

Ari64

New member
My programming instructor once informed the class that there are 3 ways to solve any problem: the right way, the wrong way, and Paul's way.

LOL... That quote is worthy of The Daily WTF.

Also, I forgot to mention that I added the following line to memory.h, which is giving the rest of the core access to the pointer:

Well I suppose that might work. There's the right way, the wrong way, and the way that might or might not work, but that nobody understands (aka Paul's way).

Sorry, I wasn't sure if that was the proper syntax (never used assembly before). So once I say ".align 12", it stays that way until I say ".align 4".

The .align 12 applies to the .space that follows it. There are no more allocations after that, so ".align 4" is useless.

Cool, so I can just call the char array "rdram" and be done with it. I'll give that a try later.

It would have been fine if you had simply left the existing definition in memory/memory.h:
Code:
extern unsigned int rdram[0x800000/4];
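
To make the difference concrete, here is a small sketch comparing a real array declaration with the cast-through-a-macro approach. The names ram_array, ram_bytes, and RAM_ALIAS are stand-ins invented for this example, not symbols from the Mupen64Plus source. Only the 8MB size is taken from the discussion.

```c
#include <stddef.h>

/* Stand-in for the real declaration: an array of words, 8 MB in total
 * when unsigned int is 4 bytes wide. */
static unsigned int ram_array[0x800000 / 4];

/* Stand-in for the workaround: a byte array plus a macro that casts it. */
static char ram_bytes[0x800000];
#define RAM_ALIAS ((unsigned int *)ram_bytes)

/* sizeof on a true array yields the size of the whole object. */
static size_t array_size(void)
{
    return sizeof(ram_array);
}

/* sizeof on the macro only sees a pointer, so the array size is lost.
 * Reading through the cast also depends on ram_bytes happening to be
 * suitably aligned for unsigned int, which the char type does not promise. */
static size_t alias_size(void)
{
    return sizeof(RAM_ALIAS);
}
```

Code that only ever indexes the memory will not notice the difference, but anything that relies on the declared type or size will.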


Not understanding how these areas are being used by the rest of the program, my main concern was leaving something out that might be needed, and thus the redundancy. I'll get rid of these. As for the Unmapping, sorry for insulting you by using the wrong protocol.
It wasn't leaving something out, but adding code that was never there before. You have a bss segment like so:
Code:
840ac000-873b5000 rwxp 00000000 00:00 0
and then you unmap an 8MB chunk from the middle, leaving something like this:
Code:
840ac000-85466000 rwxp 00000000 00:00 0
85c66000-873b5000 rwxp 00000000 00:00 0
Seriously, wtf?
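
The effect is easy to reproduce on a small scale. The sketch below maps three pages, unmaps the middle one, and then uses msync() as a probe, since it fails with ENOMEM when the range is not mapped. The function names are hypothetical, and the probe is a Linux/POSIX trick chosen for this illustration, not something the emulator does.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the page starting at `addr` is mapped, 0 if it is a hole. */
static int page_is_mapped(void *addr, size_t page)
{
    if (msync(addr, page, MS_ASYNC) == 0)
        return 1;
    return errno == ENOMEM ? 0 : 1;
}

/* Maps three pages, punches out the middle one, and records which pages
 * are still mapped in mapped[0..2]. Returns 0 on success, -1 on failure. */
static int punch_hole_demo(int mapped[3])
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *base = mmap(NULL, 3 * page, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* One mapping becomes two, with an inaccessible gap between them. */
    if (munmap(base + page, page) != 0)
        return -1;

    for (int i = 0; i < 3; i++)
        mapped[i] = page_is_mapped(base + i * page, page);

    /* Clean up the two pages that are still mapped. */
    munmap(base, page);
    munmap(base + 2 * page, page);
    return 0;
}
```

Any global variable that happened to live in the middle page would now fault on access, which is the situation described above.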

This is my first experience with assembly and the term ".bss"
When you declare an uninitialized global variable in C, it gets allocated in bss. This really isn't such a difficult concept.
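
As a quick illustration, assuming a typical GCC/ELF toolchain (the variable and function names here are invented for the example):

```c
#include <stddef.h>

/* No initializer: static storage, zero-filled before main() runs.
 * Toolchains normally place this in .bss, so it takes no space in the file. */
unsigned char scratch_buffer[1 << 20];

/* Has a non-zero initializer, so it normally lands in .data instead. */
int initialized_counter = 42;

/* Counts non-zero bytes; a fresh .bss array should contain none. */
size_t count_nonzero(const unsigned char *p, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (p[i] != 0)
            hits++;
    return hits;
}
```

The ".bss" directive in linkage_arm.s reserves space in that same section, just from the assembler side.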

I just copy-pasted that from the BASE_ADDR code without a lot of consideration. The app crashes long before it ever reaches that line (that function would only be called when shutting down).
That's why I said it was asking for trouble, not that that trouble had found you yet.

Unmap was also called on BASE_ADDR as well, so I'll cut out both of these unmaps.
Good idea.

The dynarec was designed to unmap the code generation buffer when it is finished, but the rest of the emulator is definitely not expecting to have its global variables disappear.
 
paulscode

New member
The way that might or might not work, but that nobody understands (aka Paul's way).
Exactly. I took several programming classes from that instructor, and definitely got a few "wtf's" on my homework and tests. As long as you have a sense of humor... I've definitely annoyed a few experienced folks before with my unconventional programming style.

There are no more allocations after that, so ".align 4" is useless.
Gotcha. Will having it all align 12 not mess anything up in your code, or do I need to have more than one allocation line?

It would have been fine if you had simply left the existing definition in memory/memory.h:
Code:
extern unsigned int rdram[0x800000/4];
I think this was one of the minor changes between 1.5 and 1.99.4. I don't have the code in front of me at the moment (typing from my phone), but as I recall this line was changed to some strange "extern ALIGN..." syntax I'm not familiar with, and it wouldn't compile with this line there (thus the include of assem_arm.h). I suppose the other option would be not to allocate the memory in linkage_arm.s and instead let the code that is "ifdef'ed" out right now in memory.c do the allocating instead.

When you declare an uninitialized global variable in C, it gets allocated in bss. This really isn't such a difficult concept.
Got it. I am definitely still a novice programmer. I'll get there eventually...
 

Ari64

New member
Exactly. I took several programming classes from that instructor, and definitely got a few "wtf's" on my homework and tests. As long as you have a sense of humor... I've definitely annoyed a few experienced folks before with my unconventional programming style.

The right way: Remove the declaration, and add a new one under .bss

The wrong way: There aren't too many ways to get this wrong, aside from syntax errors or declaring the wrong size. Or possibly forgetting to remove both of the original declarations.

Paul's way: Make a new symbol with a different name, declare it the wrong type, add a #define for the original name with a cast to the correct type, mess with the headers so this applies to the rest of the code, then throw in a mmap and munmap.

I'm still trying to figure out how you came up with that.

BTW, what was it that prevented mapping the ram at 0x80000000? Was there already something else at that location?


Gotcha. Will having it all align 12 not mess anything up in your code, or do I need to have more than one allocation line?

.align 12 is equivalent to __attribute__((aligned(4096))). The start address is divisible by 2^12.

Do you not understand the concept of alignment?
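
A minimal sketch of what that means in C, assuming GCC or Clang for the attribute syntax (page_aligned_block and is_aligned are names made up for this example):

```c
#include <stdint.h>

/* Same request as ".align 12" on ARM: start address divisible by 2^12. */
static unsigned char page_aligned_block[8192] __attribute__((aligned(4096)));

/* Returns 1 if `p` is a multiple of `alignment` (which must be a power of two). */
static int is_aligned(const void *p, uintptr_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Because 4096 is a multiple of 16, an address that passes is_aligned(p, 4096) also passes is_aligned(p, 16). A stricter alignment never breaks a weaker one.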

I think this was one of the minor changes between 1.5 and 1.99.4. I don't have the code in front of me at the moment (typing from my phone), but as I recall this line was changed to some strange "extern ALIGN..." syntax I'm not familiar with, and it wouldn't compile with this line there (thus the include of assem_arm.h).

It's just a macro to specify the alignment...

Code:
#define ALIGN(BYTES,DATA) DATA __attribute__((aligned(BYTES)));
...

extern ALIGN(16, unsigned int rdram[0x800000/4]);

I suppose the other option would be not to allocate the memory in linkage_arm.s and instead let the code that is "ifdef'ed" out right now in memory.c do the allocating instead.

That is an option.
 
paulscode

New member
Paul's way: Make a new symbol with a different name, declare it the wrong type, add a #define for the original name with a cast to the correct type, mess with the headers so this applies to the rest of the code, then throw in a mmap and munmap

Hey, at least I didn't get into the field of medicine...

The right way: "Nurse, scalpel"
The wrong way: "Oh, sh..!!"
Paul's way: "Nurse, sledgehammer"
 

Remote

Active member
Moderator
Not to mention NASA!

The right way: The moon!
The wrong way: Uranus!
Paul's way: Canada!
 
paulscode

New member
Ok, so now that I've fixed that mess, let me do a quick recap:
1) I have the 16MB allocated in .bss (called extra_memory), and it is .align 12
2) I have BASE_ADDR pointing to extra_memory
3) I have BASE_ADDR mmapped to make it executable
4) I've removed the munmap of BASE_ADDR.
5) I'm letting memory.c take care of allocating 8MB for rdram
6) I've set using_tlb to 1

:blush: The "extra baggage" I've cleaned up:
1) I'm no longer allocating 8MB for rdram in .bss section of linkage_arm.s
2) I've removed the two redundant .align's from linkage_arm.s
3) I've returned the rdram definition back to memory.h
4) I've opened up the rdram allocation in memory.c
5) I've removed the include assem_arm.h from memory.h
6) I've removed the mmap of rdram from new_dynarec_init()
7) I've removed the munmap of rdram from new_dynarec_cleanup()

BTW, what was it that prevented mapping the ram at 0x80000000? Was there already something else at that location?
Yes, unfortunately, something called libicudata.so. That is the first thing that was causing the app to crash.
Code:
80000000-804fb000 r-xp 00000000 b3:35 910        /system/lib/libicudata.so

.align 12 is equivalent to __attribute__((aligned(4096))). The start address is divisible by 2^12. Do you not understand the concept of alignment?
I do now (thanks for the quick explanation). So basically, since my 16MB is align 12 (and therefore its start address is divisible by 2^12), and since 2^12 and 16MB are each divisible by 2^4, your globals are obviously still align 4 (i.e. their start address is divisible by 2^4). So no worries.

Not to mention NASA!

The right way: The moon!
The wrong way: Uranus!
Paul's way: Canada!
Hey, at least the GPL has my back: "without even an implied warranty of merchantability or fitness for a particular purpose"

ANYWAY... now that I have all that cleared up, I'm ready to debug and try to figure out why the app is crashing. I still haven't gotten gdb to work (debugging for Android works through an unusual "gdb server/client" infrastructure - you can't simply run gdb on the phone). I'm asking around on various other forums to try and come up with a better workaround for the compiler bug I mentioned before.

In the meantime I can use logcat to print messages and compare pointers that way. I don't have a lot of time for programming this evening, but I did run a quick test to compare a couple of pointers:
Code:
12-14 18:02:28.943: VERBOSE/new_dynarec(7198): dyna_linker: 0x84096e3c
12-14 18:02:28.943: VERBOSE/new_dynarec(7198): dyna_linker_ds: 0x84096fc0
12-14 18:02:28.943: VERBOSE/new_dynarec(7198): BASE_ADDR: 0x843ba000
However, as I mentioned, I am not yet clear on exactly which things need to have relative offsets within 32MB. You mentioned the "text segment" before. Is this the ".text" which I get from objdump -h? If so, then it seems like they are close enough (assuming I'm interpreting the data correctly):
Code:
Idx Name          Size      VMA       LMA       File off  Algn
  6 .text         00081fb0  00018a78  00018a78  00018a78  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 14 .bss          03308b24  000ac000  000ac000  000aa260  2**12
                  ALLOC
 

Ari64

New member
Ok, so now that I've fixed that mess, let me do a quick recap:
1) I have the 16MB allocated in .bss (called extra_memory), and it is .align 12
2) I have BASE_ADDR pointing to extra_memory
3) I have BASE_ADDR mmapped to make it executable
4) I've removed the munmap of BASE_ADDR.
5) I'm letting memory.c take care of allocating 8MB for rdram
6) I've set using_tlb to 1

:blush: The "extra baggage" I've cleaned up:
1) I'm no longer allocating 8MB for rdram in .bss section of linkage_arm.s
2) I've removed the two redundant .align's from linkage_arm.s
3) I've returned the rdram definition back to memory.h
4) I've opened up the rdram allocation in memory.c
5) I've removed the include assem_arm.h from memory.h
6) I've removed the mmap of rdram from new_dynarec_init()
7) I've removed the munmap of rdram from new_dynarec_cleanup()

That covers everything I can think of. I assume you have defined ARMv5_ONLY if your cpu is not ARMv7-A, and I assume you did a clean rebuild after removing -mthumb.

If it's crashing, the next step is to get a backtrace, register values, and disassembly using gdb.

Code:
80000000-804fb000 r-xp 00000000 b3:35 910        /system/lib/libicudata.so

That's unfortunate. It would work better if you could map ram there since it would avoid having to do pointer arithmetic.

In the meantime I can use logcat to print messages and compare pointers that way. I don't have a lot of time for programming this evening, but I did run a quick test to compare a couple of pointers:
Code:
12-14 18:02:28.943: VERBOSE/new_dynarec(7198): dyna_linker: 0x84096e3c
12-14 18:02:28.943: VERBOSE/new_dynarec(7198): dyna_linker_ds: 0x84096fc0
12-14 18:02:28.943: VERBOSE/new_dynarec(7198): BASE_ADDR: 0x843ba000
However, as I mentioned, I am not yet clear on exactly which things need to have relative offsets within 32MB.

What needs to be within 32MB is everything in linkage_arm.s, fpu.c, tlb.c, cop0.c, and BASE_ADDR through BASE_ADDR+16777216. The addresses you posted of dyna_linker and BASE_ADDR are less than 4MB apart, so this should be fine.
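
The 32MB figure comes from the ARM-mode B/BL instruction, which holds a signed 24-bit word offset measured from the branch address plus 8. As a rough sketch, a reachability check could look like this (branch_in_range and the two macros are hypothetical names, not part of the dynarec):

```c
#include <stdint.h>

/* Range of an ARM-mode B/BL: signed 24-bit word offset, i.e. a byte
 * offset from -2^25 to 2^25 - 4, measured from the branch address + 8. */
#define ARM_BRANCH_MIN (-33554432LL)
#define ARM_BRANCH_MAX (33554428LL)

/* Returns 1 if a branch located at `from` can reach `to` directly. */
static int branch_in_range(uint32_t from, uint32_t to)
{
    int64_t offset = (int64_t)to - ((int64_t)from + 8);
    return (offset & 3) == 0 &&
           offset >= ARM_BRANCH_MIN && offset <= ARM_BRANCH_MAX;
}
```

With the logged values, branch_in_range(0x84096e3c, 0x843ba000) holds, and so does a branch back to dyna_linker from the last word of the 16MB buffer at 0x853b9ffc.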

You mentioned the "text segment" before. Is this the ".text" which I get from objdump -h?

Yeah, the .text segment is where all the compiled C and asm code goes.
 
paulscode

New member
I assume you have defined ARMv5_ONLY if your cpu is not ARMv7-A, and I assume you did a clean rebuild after removing -mthumb.
Yep, -mthumb is gone now, and my phone is the Motorola Droid X, which sports the 1.0 GHz TI OMAP3630-1000, ARM Cortex A8, which is built on the ARMv7-A architecture.

If it's crashing, the next step is to get a backtrace, register values, and disassembly using gdb.
Sounds like a plan. I'll post an update when I figure out anything new.

Thanks for all the great help!
 
paulscode

New member
HA! I had an epiphany that maybe I could name the file linkage_arm.S, and gcc would be able to compile it, but the plug-in wouldn't be smart enough to realize it was actually a .s file and call the output linkage_arm.o like it should. And wouldn't you know it - that worked! I am seriously jumping in my chair right now ... ok, now back to my homework..
 
paulscode

New member
I thought I would post a progress update.

I've been a bit busy with school and the holidays, but I have done a little more work on the project. I've discovered that Android's gdb server/client infrastructure does not work very well for multi-threaded apps (and since SDL is multi-threaded...). Basically, it assumes a single thread (not sure how it picks which one, exactly), and then it simply blows past breakpoints, etc. if they are reached from another thread. Additionally, when the thread does stop at a breakpoint, the other threads continue to run (which leads to synchronization issues and crashes).

After some research, it seems that Google has fixed this problem in Android 2.3 and the latest NDK r5. Unfortunately, 2.3 has not yet been released for the Droid X, so in the meantime I'm having to debug the "old fashioned" way with log output and infinite loops (which is a real pain in the butt).
 

Rotkaeqpchen

New member
I hope that the emu will be released for pre-2.3 versions, too. Unfortunately, my Milestone 2 will take 2 years to be updated to 2.3 ^^
 
paulscode

New member
Android 2.3 would only be needed to avoid the bugs in the debugging process (that just sounds redundant to me...). Since the end-user is not going to be debugging the code, this won't limit anyone else to 2.3 devices. It should run on Android 1.6 and higher (at least that's the only limitation in Pelya's SDL port). However, to be honest, I rather doubt that it will run fast enough to be "playable" on anything but the newest Android devices. I'll also try and release two versions of the core (one for ARM v5 and one for v7-a), which should cover just about every phone.
 

Cpasjuste

New member
Hi !

I'm also playing with the sources. First thing to say: I'm on a Samsung Galaxy S (i9000). I'm not working exactly like you, though. I decided to first get it running without the JNI wrapper, i.e. I'm building the "mupen core" as an executable (using dummy plugins, of course). So if I understand correctly, I should not have to play with memory management, but just use the "Xlinker" ld flag?

Well, for me it segfaults in the "new_dyna_start()" function, which seems to be the starting point of the asm code. Here is some information; any hint would be appreciated :)


- mupen64plus memory maps before the gdb "continue" function : http://pastebin.com/P2wtSE1a
- mupen64plus memory maps at the segfault (new_dyna_start()) : http://pastebin.com/xu6C9EVM ( the "80000000-80800000" region is the rdram memory map ? )
- mupen64plus objdump : http://pastebin.com/mJWhP6Vv

Of course, I'm building it in ARM mode, for the armv7-a architecture. Finding the problem here is probably a little beyond my knowledge, but maybe Ari64 or paulscode will have some hints :)
 
paulscode

New member
I haven't had time to look at your memory mapping that closely yet, Cpasjuste, but for reference, which version of Mupen64Plus are you working with? If you are not starting with Ari64's Pandora port (based on v. 1.5), there are a few other changes that need to be made in the code besides adding the new_dynarec sources. If you are working with 1.99.4, I can provide a detailed list of all the changes I made to the sources after studying the changes Ari64 made to the official version 1.5 (although my project isn't running at the moment either, but maybe we could compare notes).
 

Cpasjuste

New member
Hi paulscode,

I'm working with the stock Ari64 version (1.5) for now, and yes, it would be great to work together. That said, I haven't done anything great so far; it "runs" almost out of the box. The problem now is that I probably don't have the knowledge to go further: it seems to segfault at the first asm call, and gdb does not output anything useful (for me at least). I also tried it on Android 2.3 with the same result.

That's sad, because with the Android 2.3 SDK release it would be very easy to port the gles2 video plugin :) Maybe we can meet on IRC; otherwise my mail is cpasjuste AT gmail POINT com

See you !
 
paulscode

New member
Cpasjuste, looking at the top of your memory mapping:
Code:
00008000-0010b000 r-xp 00000000 b3:02 279        /data/misc/droid64/mupen64plus
0010b000-00111000 rw-p 00102000 b3:02 279        /data/misc/droid64/mupen64plus
00111000-01c3c000 rw-p 00000000 00:00 0          [heap]
07000000-08000000 rwxp 00000000 00:00 0
40000000-40008000 r--s 00000000 00:0a 595        /dev/__properties__ (deleted)
40008000-40009000 r--p 00000000 00:00 0
40009000-54ea1000 rw-p 00000000 00:00 0
80000000-80800000 rw-p 00000000 00:00 0
(...lots of shared libraries...)

And comparing that to the memory mapping that Ari64 posted earlier:
Code:
00008000-00021000 r--p 00000000 08:01 26198025   /media/sda1/devel/mupen64plus (ld.so symbol table)
07000000-08000000 rwxp 07000000 00:00 0                                        (dynarec target)
08000000-080ea000 r-xp 00020000 08:01 26198025   /media/sda1/devel/mupen64plus (.init, .text)
080f1000-080f6000 rw-p 00109000 08:01 26198025   /media/sda1/devel/mupen64plus (.data)
080f6000-09ed0000 rw-p 080f6000 00:00 0          [heap]                        (.bss)
40000000-4001d000 r-xp 00000000 b3:02 17317      /lib/ld-2.9.so
(...lots more shared libraries...)
450c8000-458c7000 rw-p 450c8000 00:00 0
458c7000-458d8000 r--p 00000000 08:01 24535139   /media/sda1/devel/fonts/font.ttf
458d8000-4592c000 rw-p 458d8000 00:00 0
80000000-80800000 rw-p 80000000 00:00 0                                        (rdram)

The main difference I notice is that in Ari64's memory mapping, .init and .text are located immediately after dynarec target (and he mentioned that the .text section and the dynarec target must be within 32MB of each other). Notice there is nothing actually at 0x08000000 in your mapping. It seems to me like the linker options "-Xlinker --section-start -Xlinker .init=0x08000000" were ignored (or maybe a typo, such as too many zeros or something?). I'm new to the whole concept of memory mappings, myself, but perhaps this could be caused by something else from another program being located at that memory location?? (If so, maybe try changing 0x07000000 and 0x08000000 to some other locations?) Obviously, I could be wrong about this - before your program crashes, could you print out the value of &dyna_linker, to check whether or not it is within 32MB of the dynarec target 0x07000000-0x08000000? My guess is that it won't be.
 

Ari64

New member
Paul's analysis of Cpasjuste's memory map is correct. The text segment is too far from the dynarec target. In hindsight, I should have put an assert in the linker to catch this. It truncates the offsets, and crashes.
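
To show what "truncates the offsets" means in practice, here is a hedged sketch of an ARM branch encoder with and without a range check. It is not the dynarec's actual emitter; the function names are invented, and only the B encoding (0xEA000000 plus a 24-bit word offset relative to the branch address + 8) is standard ARM.

```c
#include <assert.h>
#include <stdint.h>

/* Encodes "B target" for a branch at `addr` WITHOUT a range check:
 * the word offset is simply masked down to 24 bits. */
static uint32_t encode_branch_unchecked(uint32_t addr, uint32_t target)
{
    int32_t offset = (int32_t)(target - (addr + 8));
    return 0xEA000000u | (((uint32_t)offset >> 2) & 0x00FFFFFFu);
}

/* Same encoding, but refuses offsets that do not fit in the instruction. */
static uint32_t encode_branch_checked(uint32_t addr, uint32_t target)
{
    int32_t offset = (int32_t)(target - (addr + 8));
    assert(offset >= -33554432 && offset <= 33554428);
    return encode_branch_unchecked(addr, target);
}

/* Recovers the address the CPU would actually jump to. */
static uint32_t decode_branch_target(uint32_t addr, uint32_t insn)
{
    uint32_t imm24 = insn & 0x00FFFFFFu;
    int32_t words = (imm24 & 0x00800000u) ? (int32_t)imm24 - 0x01000000
                                          : (int32_t)imm24;
    return addr + 8 + (uint32_t)(words * 4);
}
```

With a branch at 0x07000000 aimed at code near 0x00010000 (addresses picked to resemble the map above), the unchecked encoder produces an instruction that decodes to a different address entirely, while the checked one would stop at the assert.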

If you moved the ram from 0x80000000 (it doesn't look like Cpasjuste did) then check that the constant propagation is inserting the correct addresses when it precalculates a memory address. I'm guessing that is why Paul's build doesn't work.

Notaz also found a bug in verify_dirty() and get_bounds() where the branch offset is cast to an unsigned type. If the value is negative, which it will be if the dynarec target is in .bss, then this won't work. So you might need to check that.
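
The following sketch illustrates the general class of bug, not the actual code in verify_dirty() or get_bounds(), which is not reproduced here. Both function names are hypothetical. The point is that the 24-bit offset field of an ARM branch must be sign-extended before use.

```c
#include <stdint.h>

/* Treats the 24-bit field as unsigned: wrong for backward branches. */
static uint32_t branch_target_unsigned(uint32_t addr, uint32_t insn)
{
    uint32_t offset = (insn & 0x00FFFFFFu) << 2;   /* can never be negative */
    return addr + 8 + offset;
}

/* Sign-extends the 24-bit field first, so backward branches decode correctly. */
static uint32_t branch_target_signed(uint32_t addr, uint32_t insn)
{
    uint32_t imm24 = insn & 0x00FFFFFFu;
    int32_t words = (imm24 & 0x00800000u) ? (int32_t)imm24 - 0x01000000
                                          : (int32_t)imm24;
    return addr + 8 + (uint32_t)(words * 4);
}
```

For forward branches the two agree, which is why a bug like this can go unnoticed until the generated code sits at a higher address than the functions it calls.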
 