Porting Mupen64Plus to Android

paulscode

New member
Having been out of the emulation scene for a few years, I recently decided to play some of my old N64 ROMs. I went with Mupen64Plus 1.99.3 for the emulator, and I was so impressed with the program that I had to dig around in the source code. I really like the modular nature of this library - it is perfectly structured to facilitate porting to new systems.

I've decided to take a crack at porting it to Android, using the NDK (so I can use native C/C++ directly rather than having to translate the entire project into Java). I realize that current devices are probably not at a level to run N64 emulation smoothly yet, but by the time I finish the port, who knows?

I'll be using pelya's Android SDL port, which doesn't require the target device to be rooted (allowing for easy distribution through the Android Market and avoiding any licensing/warranty concerns that can come with rooting). My initial challenge is translating the Makefiles used by the Mupen64Plus source into NDK-speak (I'm rather new to the whole Makefile concept, so it will be a bit of an initial learning curve). I'll post my progress on this project here. Comments and suggestions are welcome. Also, if anyone else is already working on an Android port of Mupen64Plus, I'd be happy to collaborate.
 

Surkow

Member
You might be interested in looking into the Pandora port. The Pandora port has a new ARM dynarec but is sadly enough based on Mupen64Plus 1.5.
 

paulscode

New member
I had a chance to look at the source code for that Pandora port - having an ARM dynarec will definitely save me a lot of work, thanks for the tip! It should be pretty straightforward to transfer the relevant source to the latest version of Mupen64Plus (it is conveniently packaged together in a folder named "new_dynarec" under the r4300 folder of mupen64plus-core).

Having had a chance to sit down and study the code for the core and various required plug-ins in depth (and being able to utilize that ARM dynarec), my sense is that the biggest challenge is going to be the video plug-in (not impossible, just a rather tedious job of porting an extensive amount of code in video-rice over to "OpenGL ES"-speak). Developing an input plug-in to utilize keyboards, Bluetooth joysticks, and touch-screens will likely take some effort as well (I'll be borrowing code for this from one or more other open-source emulators that have already been ported to Android, and interfacing it with the "input-sdl" plug-in).
 

Ari64

New member
There are a few other changes besides what's in the new_dynarec directory. There's some new init code, a patch to the savestate handler to restore the instruction pointer, a patch to make sure the TLB state gets updated as necessary, and a few hooks to make sure the cache gets flushed when memory is modified. There are also some linker flags added to the makefile to relocate the text segment on ARM, which is necessary to accommodate the +/-32MB limit of branch instructions in the ARM instruction set.
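
To make the +/-32MB figure concrete: an ARM (A32) B or BL instruction stores a signed 24-bit word offset, measured from the address of the branch plus 8. The following is only a minimal sketch of the range check that follows from that encoding; the function and macro names are made up for illustration and are not taken from the Pandora port.

```c
#include <stdint.h>
#include <stdbool.h>

/* A32 B/BL hold a signed 24-bit word offset, so the reachable byte
   displacement (relative to branch address + 8) is -32MB .. +32MB-4. */
#define ARM_BRANCH_MIN (-(int64_t)0x2000000)
#define ARM_BRANCH_MAX ((int64_t)0x1FFFFFC)

/* Hypothetical helper: true if a B/BL located at `from` can reach `target`
   directly. Both addresses are assumed to be 4-byte aligned. */
static bool arm_branch_in_range(uint32_t from, uint32_t target)
{
    /* The CPU adds the offset to (address of the branch + 8). */
    int64_t offset = (int64_t)target - ((int64_t)from + 8);
    return offset >= ARM_BRANCH_MIN && offset <= ARM_BRANCH_MAX;
}
```

For example, a branch at 0x08000000 can reach 0x09000000 (16MB away), while a branch at 0x00008000 cannot reach 0x10000000 directly. That is presumably why the text segment has to be moved: code on one side has to stay within this window of the code it calls on the other.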
 

paulscode

New member
Thanks for the overview of changes - I was actually just reading through your thread on GP32X.com (when I was there the first time, I just downloaded the source code and didn't take the time to read about your project). I will definitely have to pick your brain - it looks like you've done a remarkable amount of work in much the same direction I'll be going.
 

paulscode

New member
Yes, someone else pointed me to that project today as well. I have been digging around in the source code - the author doesn't seem to have made a whole lot of visible progress, but it is definitely helpful in that he has gotten the basic project set up, and the core and necessary plug-ins to compile - which is a whole lot better than starting from scratch. I'll see if I can collaborate with him to try and incorporate key parts of your code which his project seems to be missing (in particular the ARM dynarec and OpenGL ES components).
 

paulscode

New member
In preparation for porting the ARM dynarec from 1.5 to 1.99.4, I've been going through the core source code to identify all the changes that have been made in the Pandora port from the official 1.5 release. Thorough documentation of the changes will help me port it not only to 1.99.4, but also to future Mupen64Plus versions as they come out.

The following is a list of files in the core source code that were changed (Ari64, let me know if you see anything obvious I missed):

Code:
Makefile
pre.mk
>main
    savestates.c
>memory
    dma.c
    memory.c
>r4300
    >new_dynarec  **
    >x86
        gr4300.c
    bc.c
    cop0.c
    interupt.c
    r4300.c
    regimm.c
    tlb.c

And a breakdown of all the actual code changes (other than the makefile, which I will tackle separately since the Android NDK uses its own special type of makefiles):
Code:
>main --> savestates.c
- New line 38, inserted after original line 37
    #include "../r4300/new_dynarec/new_dynarec.h"
- New lines 184-188, replaces original line 183
    #ifdef NEW_DYNAREC
    gzwrite(f, &pcaddr, 4);
    #else
    gzwrite(f, &PC->addr, 4);
    #endif
- New lines 304-313, replaces original lines 299-303
    #ifdef NEW_DYNAREC
        gzread(f, &pcaddr, 4);
        pending_exception = 1;
        invalidate_all_pages();
    #else
        int i;
        gzread(f, &queuelength, 4);
        for (i = 0; i < 0x100000; i++)
            invalid_code[i] = 1;
        jump_to(queuelength);
    #endif

>memory --> dma.c
- New line 34, inserted after original line 33
    #include "../r4300/new_dynarec/new_dynarec.h"
- New lines 207-211, inserted after original line 205
    if(!blocks[rdram_address1>>12])
    {
        invalid_code[rdram_address1>>12] = 1;
    }
    else
- New lines 216-218, inserted after original line 209
    #ifdef NEW_DYNAREC
        invalidate_block(rdram_address1>>12);
    #endif
- New lines 222-226, inserted after original line 212
    if(!blocks[rdram_address2>>12])
    {
        invalid_code[rdram_address2>>12] = 1;
    }
    else

>memory --> memory.c
- New line 29, inserted after original line 28
    #include <sys/mman.h>
- New lines 61-63, replaces original line 60
    #if defined(__x86_64__) || defined(NO_ASM) || (!defined(__i386__)&&!defined(__arm__))
    unsigned int rdram[0x800000/4]  __attribute__((aligned(16)));
    #endif
- New lines 73-75, replaces original line 70
    #if defined(NO_ASM) || !defined(__arm__)
    unsigned int address = 0;
    #endif
- New lines 84-89, replaces original lines 79-82
    #if defined(NO_ASM) || !defined(__arm__)
    unsigned int word;
    unsigned char byte;
    unsigned short hword;
    unsigned long long int dword;
    #endif
- New lines 153-162, replaces original line 146
    if((int)rdram!=0x80000000) {
        for (i=0; i<(0x800000/4); i++) rdram[i]=0;
    }
    else {
        munmap ((void*)0x80000000, 0x800000);
        if(mmap ((void*)0x80000000, 0x800000,
            PROT_READ | PROT_WRITE,
            MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS,
            -1, 0) <= 0) {printf("mmap(0x80000000) failed\n");}
    }


>r4300 >x86 --> gr4300.c
- New lines 671-674, replaces original lines 671-673
    cmp_reg32_imm8(reg, 5);
    jbe_rj(18);
    
    sub_reg32_imm32(reg, 2); // 6
- New line 999, replaces original line 998
    genupdate_count(dst->addr+4);
- New line 1040, replaces original line 1039
    genupdate_count(dst->addr+4);


>r4300 --> bc.c
- New lines 140-143, replaces original lines 140-141
    else {
        PC+=2;
        update_count();
    }
- New lines 165-168, replaces original lines 163-164
    else {
        PC+=2;
        update_count();
    }
- New lines 202-205, replaces original lines 202-203
    else {
        PC+=2;
        update_count();
    }
- New lines 227-230, replaces original lines 221-222
    else {
        PC+=2;
        update_count();
    }


>r4300 --> cop0.c
- New lines 34-35, inserted after original line 33
    case 9:    // Count
    update_count();
- New line 77, from original line 75, commented out
- New line 121, moved from original line 121 to after original line 118
    update_count();


>r4300 --> interupt.c
- New line 44, inserted after original line 43
- New lines 610-618, replaces original line 609
    #ifdef NEW_DYNAREC
        EPC = pcaddr;
        pcaddr = 0x80000180;
        Status |= 2;
        Cause &= 0x7FFFFFFF;
        pending_exception=1;
    #else
        exception_general();
    #endif


>r4300 --> r4300.c
- New line 31, inserted after original line 30
    #include "new_dynarec/new_dynarec.h"
- New lines 41-64, replaces original lines 40-54
    int llbit, rompause;
    #if defined(NO_ASM) || !defined(__arm__)
    int stop;
    long long int reg[32], hi, lo;
    unsigned int reg_cop0[32];
    #endif
    long long int local_rs, local_rt;
    int local_rs32, local_rt32;
    unsigned int jump_target;
    #if defined(NO_ASM) || !defined(__arm__)
    float *reg_cop1_simple[32];
    double *reg_cop1_double[32];
    int FCR0, FCR31;
    #endif
    int reg_cop1_fgr_32[32];
    long long int reg_cop1_fgr_64[32];
    tlb tlb_e[32];
    unsigned int delay_slot, skip_jump = 0, dyna_interp = 0, last_addr;
    unsigned long long int debug_count = 0;
    unsigned int CIC_Chip;
    #if defined(NO_ASM) || !defined(__arm__)
    unsigned int next_interupt;
    precomp_instr *PC;
    #endif
- New lines 1528-1535, replaces original lines 1519-1524
    #if !defined(NEW_DYNAREC)
        if (PC->addr < last_addr)
        {
            printf("PC->addr < last_addr\n");
        }
        Count = Count + (PC->addr - last_addr)/2;
        last_addr = PC->addr;
    #endif
- New lines 1840-1854, replaces original lines 1829-1837
    #if !defined(NO_ASM) && (defined(__i386__) || defined(__x86_64__) || defined(__arm__))
    else if (dynacore == 1)
    {
        dynacore = 1;
        printf ("R4300 Core mode: Dynamic Recompiler\n");
        init_blocks();
        code = (void *)(actual->code+(actual->block[0x40/4].local_addr));
    #ifdef NEW_DYNAREC
        new_dynarec_init();
        new_dyna_start();
        new_dynarec_cleanup();
    #else
        dyna_start(code);
        PC++;
    #endif


>r4300 --> regimm.c
- New lines 141-144, replaces original lines 141-142
    else {
        PC+=2;
        update_count();
    }
- New lines 165-168, replaces original lines 163-164
    else {
        PC+=2;
        update_count();
    }
- New lines 201-204, replaces original lines 197-198
    else {
        PC+=2;
        update_count();
    }
- New lines 225-228, replaces original lines 219-220
    else {
        PC+=2;
        update_count();
    }


>r4300 --> tlb.c
- New lines 253-312, inserted after original line 234
//#ifdef NEW_DYNAREC
extern unsigned int memory_map[1048576];
extern unsigned int using_tlb;
#include <assert.h>
void TLBWI_new(void)
{
  unsigned int i;
  /* Remove old entries */
  unsigned int old_start_even=tlb_e[Index&0x3F].start_even;
  unsigned int old_end_even=tlb_e[Index&0x3F].end_even;
  unsigned int old_start_odd=tlb_e[Index&0x3F].start_odd;
  unsigned int old_end_odd=tlb_e[Index&0x3F].end_odd;
  for (i=old_start_even>>12; i<=old_end_even>>12; i++)
  {
    if(i<0x80000||i>0xBFFFF)
    {
      invalidate_block(i);
      memory_map[i]=-1;
    }
  }
  for (i=old_start_odd>>12; i<=old_end_odd>>12; i++)
  {
    if(i<0x80000||i>0xBFFFF)
    {
      invalidate_block(i);
      memory_map[i]=-1;
    }
  }
  TLBWI();
  //printf("TLBWI: index=%d\n",Index);
  //printf("TLBWI: start_even=%x end_even=%x phys_even=%x v=%d d=%d\n",tlb_e[Index&0x3F].start_even,tlb_e[Index&0x3F].end_even,tlb_e[Index&0x3F].phys_even,tlb_e[Index&0x3F].v_even,tlb_e[Index&0x3F].d_even);
  //printf("TLBWI: start_odd=%x end_odd=%x phys_odd=%x v=%d d=%d\n",tlb_e[Index&0x3F].start_odd,tlb_e[Index&0x3F].end_odd,tlb_e[Index&0x3F].phys_odd,tlb_e[Index&0x3F].v_odd,tlb_e[Index&0x3F].d_odd);
  /* Combine tlb_LUT_r, tlb_LUT_w, and invalid_code into a single table
     for fast look up. */
  for (i=tlb_e[Index&0x3F].start_even>>12; i<=tlb_e[Index&0x3F].end_even>>12; i++)
  {
    //printf("%x: r:%8x w:%8x\n",i,tlb_LUT_r[i],tlb_LUT_w[i]);
    if(i<0x80000||i>0xBFFFF)
    {
      if(tlb_LUT_r[i]) {
        memory_map[i]=((tlb_LUT_r[i]&0xFFFFF000)-(i<<12)+(unsigned int)rdram-0x80000000)>>2;
        // FIXME: should make sure the physical page is invalid too
        if(!tlb_LUT_w[i]||!invalid_code[i]) {
          memory_map[i]|=0x40000000; // Write protect
        }else{
          assert(tlb_LUT_r[i]==tlb_LUT_w[i]);
        }
        if(!using_tlb) printf("Enabled TLB\n");
        // Tell the dynamic recompiler to generate tlb lookup code
        using_tlb=1;
      }
      else memory_map[i]=-1;
    }
    //printf("memory_map[%x]: %8x (+%8x)\n",i,memory_map[i],memory_map[i]<<2);
  }
  for (i=tlb_e[Index&0x3F].start_odd>>12; i<=tlb_e[Index&0x3F].end_odd>>12; i++)
  {
    //printf("%x: r:%8x w:%8x\n",i,tlb_LUT_r[i],tlb_LUT_w[i]);
    if(i<0x80000||i>0xBFFFF)
    {
      if(tlb_LUT_r[i]) {
        memory_map[i]=((tlb_LUT_r[i]&0xFFFFF000)-(i<<12)+(unsigned int)rdram-0x80000000)>>2;
        // FIXME: should make sure the physical page is invalid too
        if(!tlb_LUT_w[i]||!invalid_code[i]) {
          memory_map[i]|=0x40000000; // Write protect
        }else{
          assert(tlb_LUT_r[i]==tlb_LUT_w[i]);
        }
        if(!using_tlb) printf("Enabled TLB\n");
        // Tell the dynamic recompiler to generate tlb lookup code
        using_tlb=1;
      }
      else memory_map[i]=-1;
    }
    //printf("memory_map[%x]: %8x (+%8x)\n",i,memory_map[i],memory_map[i]<<2);
  }
}

- New lines 500-572, inserted after original line 421
//#ifdef NEW_DYNAREC
void TLBWR_new(void)
{
  unsigned int i;
  Random = (Count/2 % (32 - Wired)) + Wired;
  /* Remove old entries */
  unsigned int old_start_even=tlb_e[Random&0x3F].start_even;
  unsigned int old_end_even=tlb_e[Random&0x3F].end_even;
  unsigned int old_start_odd=tlb_e[Random&0x3F].start_odd;
  unsigned int old_end_odd=tlb_e[Random&0x3F].end_odd;
  for (i=old_start_even>>12; i<=old_end_even>>12; i++)
  {
    if(i<0x80000||i>0xBFFFF)
    {
      invalidate_block(i);
      memory_map[i]=-1;
    }
  }
  for (i=old_start_odd>>12; i<=old_end_odd>>12; i++)
  {
    if(i<0x80000||i>0xBFFFF)
    {
      invalidate_block(i);
      memory_map[i]=-1;
    }
  }
  TLBWR();
  /* Combine tlb_LUT_r, tlb_LUT_w, and invalid_code into a single table
     for fast look up. */
  for (i=tlb_e[Random&0x3F].start_even>>12; i<=tlb_e[Random&0x3F].end_even>>12; i++)
  {
    //printf("%x: r:%8x w:%8x\n",i,tlb_LUT_r[i],tlb_LUT_w[i]);
    if(i<0x80000||i>0xBFFFF)
    {
      if(tlb_LUT_r[i]) {
        memory_map[i]=((tlb_LUT_r[i]&0xFFFFF000)-(i<<12)+(unsigned int)rdram-0x80000000)>>2;
        // FIXME: should make sure the physical page is invalid too
        if(!tlb_LUT_w[i]||!invalid_code[i]) {
          memory_map[i]|=0x40000000; // Write protect
        }else{
          assert(tlb_LUT_r[i]==tlb_LUT_w[i]);
        }
        if(!using_tlb) printf("Enabled TLB\n");
        // Tell the dynamic recompiler to generate tlb lookup code
        using_tlb=1;
      }
      else memory_map[i]=-1;
    }
    //printf("memory_map[%x]: %8x (+%8x)\n",i,memory_map[i],memory_map[i]<<2);
  }
  for (i=tlb_e[Random&0x3F].start_odd>>12; i<=tlb_e[Random&0x3F].end_odd>>12; i++)
  {
    //printf("%x: r:%8x w:%8x\n",i,tlb_LUT_r[i],tlb_LUT_w[i]);
    if(i<0x80000||i>0xBFFFF)
    {
      if(tlb_LUT_r[i]) {
        memory_map[i]=((tlb_LUT_r[i]&0xFFFFF000)-(i<<12)+(unsigned int)rdram-0x80000000)>>2;
        // FIXME: should make sure the physical page is invalid too
        if(!tlb_LUT_w[i]||!invalid_code[i]) {
          memory_map[i]|=0x40000000; // Write protect
        }else{
          assert(tlb_LUT_r[i]==tlb_LUT_w[i]);
        }
        if(!using_tlb) printf("Enabled TLB\n");
        // Tell the dynamic recompiler to generate tlb lookup code
        using_tlb=1;
      }
      else memory_map[i]=-1;
    }
    //printf("memory_map[%x]: %8x (+%8x)\n",i,memory_map[i],memory_map[i]<<2);
  }
}
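
The memory_map arithmetic in TLBWI_new/TLBWR_new is dense, so here is a minimal sketch of what an entry encodes, reworked with explicit 32-bit types. The helper and macro names are hypothetical, `rdram_base` stands in for `(unsigned int)rdram`, and the decode step is simply the inverse of the formula in the listing, not something confirmed against the dynarec's generated code.

```c
#include <stdint.h>
#include <stdbool.h>

#define MEMMAP_INVALID       0xFFFFFFFFu  /* the -1 stored for unmapped pages */
#define MEMMAP_WRITE_PROTECT 0x40000000u  /* bit 30, as set in the listing    */

/* Build an entry the way the listing does. `page` is the guest virtual
   address >> 12, `lut_r` is tlb_LUT_r[page], and `rdram_base` is the
   (assumed 4-byte aligned) host address of rdram. */
static uint32_t make_map_entry(uint32_t page, uint32_t lut_r, uint32_t rdram_base)
{
    return ((lut_r & 0xFFFFF000u) - (page << 12) + rdram_base - 0x80000000u) >> 2;
}

/* True if the entry is mapped and carries the write-protect flag. */
static bool map_entry_is_write_protected(uint32_t entry)
{
    return entry != MEMMAP_INVALID && (entry & MEMMAP_WRITE_PROTECT) != 0;
}

/* Invert the formula: shifting left by 2 restores the byte displacement
   and pushes the flag bit out of the 32-bit value, so the result is the
   same whether or not the page is write protected. */
static uint32_t map_entry_to_host(uint32_t entry, uint32_t vaddr)
{
    return vaddr + (entry << 2);
}
```

With made-up values (page 0x00400 mapped to 0x80100000, rdram at host address 0x20000000), the entry is 0x07F40000, and guest address 0x00400123 decodes to 0x20100123, i.e. rdram plus 0x100123. The point of the single table is that one lookup gives both the translation and the "needs a slow path" information.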
 

Ari64

New member
The following is a list of files in the core source code that were changed (Ari64, let me know if you see anything obvious I missed):
I think that's everything. The stuff in gr4300.c, bc.c, regimm.c and cop0.c was to fix problems in the original CPU core, which was very inconsistent about how it counted cycles. You can either merge this or not; it just made it easier for me to debug things.

The only change in cop0.c which is strictly necessary is commenting out the call to gen_interupt(), which will crash if it is called with the wrong CPU core active. Maybe I should have ifdefed that.
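
For anyone following the cycle-counting part: the rule visible in the r4300.c excerpt from the listing is that Count advances by half the number of bytes executed since it was last brought up to date, i.e. 2 per 4-byte instruction. Below is a minimal sketch of that arithmetic, assuming the excerpt is the counting routine that update_count() runs; the struct and function names are stand-ins, not the core's real globals.

```c
#include <stdint.h>

/* Stand-ins for the core's state: the COP0 Count register and last_addr. */
typedef struct {
    uint32_t count;      /* COP0 Count                                  */
    uint32_t last_addr;  /* guest address where Count was last updated  */
} cycle_state;

/* Same arithmetic as the r4300.c excerpt: half a count per byte executed,
   then remember where counting stopped. `pc_addr` plays the role of
   PC->addr. */
static void update_count_sketch(cycle_state *s, uint32_t pc_addr)
{
    s->count += (pc_addr - s->last_addr) / 2;
    s->last_addr = pc_addr;
}
```

So after `PC+=2` (two instructions, 8 bytes) an update adds 4 to Count, and calling it again at the same address adds nothing. The bc.c and regimm.c patches call update_count() right after `PC+=2`, so those paths get counted the same way.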
 

paulscode

New member
Thanks for the info. I assume the call to gen_interupt() is safe for both x86 and x86_64, just not for ARM. Rather than commenting it out, I'll change the line to:
Code:
#ifndef NEW_DYNAREC
    if (next_interupt <= Count) gen_interupt();
#endif

My first step before porting all this to 1.99.4 will be to compare the n64oid source code to the official 1.99.1 source code to see what the author had to change to make it compile in the NDK. Once I can get 1.99.4 to compile, I'll compare 1.5 and your Pandora port to 1.99.4 to determine how different the versions are in the sections of code listed above that need to be changed.
 

Ari64

New member
Thanks for the info. I assume the call to gen_interupt() is safe for both x86 and x86_64, just not for ARM.

No, it changes the instruction pointer. Each CPU core (interpreter, dynarec) has a different way of doing this. This is why there is special-case code in the savestate handler, interrupt handler, and anywhere else that changes the flow of execution.
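
As a rough illustration of that point, here is a sketch that mirrors the savestate-load pattern from the listing: the new dynarec is redirected by setting pcaddr and raising pending_exception, while the old cores go through jump_to(). The enum, the helper, and the stubbed jump_to() are all hypothetical stand-ins; the real core selects the path at compile time with NEW_DYNAREC rather than with a runtime switch.

```c
#include <stdint.h>

/* Hypothetical core selector, for illustration only. */
typedef enum { CORE_OLD, CORE_NEW_DYNAREC } core_kind;

/* Stand-ins for the globals named in the listing. */
static uint32_t pcaddr;             /* new dynarec: next guest address to run */
static int      pending_exception;  /* new dynarec: asks it to re-dispatch    */
static uint32_t last_jump_target;   /* records what the jump_to() stub saw    */

/* Stub standing in for the old cores' jump_to(). */
static void jump_to(uint32_t addr)
{
    last_jump_target = addr;
}

/* Hypothetical helper: send execution to `addr` in whichever way the
   active core expects. */
static void redirect_execution(core_kind core, uint32_t addr)
{
    if (core == CORE_NEW_DYNAREC) {
        pcaddr = addr;
        pending_exception = 1;
    } else {
        jump_to(addr);
    }
}
```

Calling `redirect_execution(CORE_NEW_DYNAREC, 0x80000180)` only touches pcaddr and pending_exception, and calling it with CORE_OLD only goes through jump_to(). Mixing the two up is the kind of mismatch that crashes, which is why each such spot needs its own special case.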
 

paulscode

New member
It looks like the only changes to the 1.99.1 core source code made in the n64oid code are to remove all references to osd (plus pointlessly replacing the word "Mupen64Plus" everywhere with "n64oid", but whatever). I've been looking at osd to figure out what it is for. It seems to be related to screenshot capability and displaying messages. There is a bit of OpenGL code in it (which I imagine was the reason n64oid removed it). Anyone know if osd is an essential component for the core? If so, I may need to port it to OpenGL ES, depending on what functions it uses.
 

Richard42

Emulator Developer
It looks like the only changes to the 1.99.1 core source code made in the n64oid code are to remove all references to osd (plus pointlessly replacing the word "Mupen64Plus" everywhere with "n64oid", but whatever). I've been looking at osd to figure out what it is for. It seems to be related to screenshot capability and displaying messages. There is a bit of OpenGL code in it (which I imagine was the reason n64oid removed it). Anyone know if osd is an essential component for the core? If so, I may need to port it to OpenGL ES, depending on what functions it uses.

The OSD is the on-screen display. It renders messages like "Mupen64Plus Started", "Volume: XX%", "Paused", "Fast Forward", etc. It is not essential to the operation of the core but it is nice to have. The screenshot capability is also tied to the OSD, because you want to take a screenshot without the OSD written over it, saying that you took a screenshot.
 

paulscode

New member
Thanks, I'll probably remove it for now until I get the core compiled and running on the Android and I'm ready to move on to the graphics plug-in (I think I'll want to study this part of the code a little more closely to understand how the osd and the graphics plug-in interact with each other).
 

Ari64

New member
The maemo port also removed the OSD and screenshot code. The reason was that it relies on OpenGL, and only OpenGL ES is available on many handheld Linux distributions.
 

paulscode

New member
I figured that was the reason it was cut from n64oid. I'll probably take a crack at porting it to OpenGL ES at some point.

Actually, I had thought Pandora ran OpenGL ES as well, but I guess not? (I don't actually own one - I just gathered that from the discussions in your thread on GP32X, although the thread went off topic a few times, so I probably misinterpreted).
 

Ari64

New member
The Pandora only has hardware acceleration for OpenGL ES. It's possible to install mesa and compile OpenGL code, but it's not very useful since you only get software rendering.
 

paulscode

New member
I see. So the OSD and screenshot code is disabled in the Pandora port as well then, or has someone ported it to OpenGL ES yet?
 

paulscode

New member
My next step is to get an SDL port that will fit my needs. I had originally thought Pelya's SDL port for Android was the obvious pick. It is well established and debugged, and it doesn't require the end user's device to be rooted or image-flashed, allowing derived applications to be easily distributed through the Android Market.

However, upon closer examination of the source code and documentation, this port has a couple of serious limitations. Firstly, it is set up to be configured and built from the command line through the use of scripts, which makes it very difficult to incorporate into an IDE like Eclipse (and without an IDE, debugging a large project would be a real pain in the butt). The second limitation is that the project is set up so that it doesn't ship the application itself with the SDL app. Instead, the end user downloads the application separately on the first run of the SDL app. The problem with this approach (besides being needlessly complicated for the end user) is that it prevents the developer from distributing automatic updates to the application itself through the Android Market (only automatic updates to the SDL app would be possible). So, for example, if I were to fix a bug or release an update when a new version of Mupen64Plus comes out, the end user would have to uninstall their current version and then reinstall the new one (and download the application again on the first run), rather than simply being notified that there was an update ready to be installed.

I am talking with the developers who use Pelya's SDL port to see if these two limitations can be easily overcome (I'm more worried about the second one; I can probably figure out how to cram the thing into Eclipse myself through trial and error). If the whole first-run downloading issue turns out to be required for some reason I can't imagine at the moment, I'll have to start looking for a more suitable SDL port to use (any suggestions are appreciated), or create one myself (probably by studying Pelya's code and taking out just the pieces I need to make the Mupen64Plus core and plug-ins run).
 