ds emulation

synch

New member
shizzy: yes, for a few days, but as I had it cached (and a copy on my hd), I didn't notice. Anyway, having a local copy doesn't hurt that much (for reading it while the internet connection is down or so).
 
OP

|\/|-a-\/

Uli Hecht
hi, i've two questions:

1. AND{cond}{S}, what is the S good for?

2. i can't find the "main cpu loop" - you know, fetching opcodes, decoding, executing - in the source of desmume released by yopyop... i think, synch can help me ^^
 

synch

New member
1. Afaik, if {S}, the opcode modifies flags, but my knowledge of arm cpus is vague.
2. It's in armcpu.cpp, I think. It's quite clear where :p
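
To make the {S} answer concrete, here is a minimal sketch in C of how an interpreter might treat AND versus ANDS. The `Cpu` struct and `op_and` are hypothetical names made up for illustration; the only hard facts used are that N and Z sit in bits 31 and 30 of the CPSR. The carry update from the shifter and the special case where the destination is r15 are left out to keep it short.

```c
#include <stdint.h>

/* N and Z live in bits 31 and 30 of the CPSR. */
#define PSR_N 0x80000000u
#define PSR_Z 0x40000000u

/* Hypothetical minimal CPU state for this sketch. */
typedef struct {
    uint32_t r[16];
    uint32_t cpsr;
} Cpu;

/* AND{S} rd, rn, operand2: the result is always written,
   but the flags change only when the S bit is set. */
static void op_and(Cpu *cpu, int rd, int rn, uint32_t operand2, int s_bit)
{
    uint32_t result = cpu->r[rn] & operand2;
    cpu->r[rd] = result;

    if (s_bit) {
        cpu->cpsr &= ~(PSR_N | PSR_Z);
        if (result & 0x80000000u) cpu->cpsr |= PSR_N;
        if (result == 0)          cpu->cpsr |= PSR_Z;
    }
}
```

So a plain AND computes the same value but leaves the flags from an earlier instruction intact, which is what lets a later conditional instruction still test them.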
 
OP

|\/|-a-\/

Uli Hecht
sure but i can't find it in armcpu.cpp .........

i realised that i don't understand what the cp15 is and how it's used...
 

ShizZy

Emulator Developer
I have an emu framework set up, a very basic mmu, and a very basic cpu layout. :) That was about a week ago, haven't worked on it since. We'll see how it goes I guess..
 

ShizZy

Emulator Developer
Ho humm.. how do you guys handle instruction conditions? My quick dirty lame attempt is to call this function at the beginning of every op, then execute if it returns true:

Code:
    // Checks if Instruction is Conditionally True
    static inline int CheckInstructionCondition(u32* _psr, u8 _condition)
    {
        switch(_condition & 0xf)
        {
        case ARM9_COND_EQ: return ((*_psr & PSR_Z) != 0); // Z set
        case ARM9_COND_NE: return ((*_psr & PSR_Z) == 0); // Z clear
        case ARM9_COND_CS: return ((*_psr & PSR_C) != 0); // C set
        case ARM9_COND_CC: return ((*_psr & PSR_C) == 0); // C clear
        case ARM9_COND_MI: return ((*_psr & PSR_N) != 0); // N set
        case ARM9_COND_PL: return ((*_psr & PSR_N) == 0); // N clear
        case ARM9_COND_VS: return ((*_psr & PSR_V) != 0); // V set
        case ARM9_COND_VC: return ((*_psr & PSR_V) == 0); // V clear
        case ARM9_COND_HI: return (((*_psr & PSR_C) != 0) && ((*_psr & PSR_Z) == 0)); // C set and Z clear
        case ARM9_COND_LS: return (((*_psr & PSR_C) == 0) || ((*_psr & PSR_Z) != 0)); // C clear or Z set
        case ARM9_COND_GE: return ((((*_psr & PSR_N) == 0) && ((*_psr & PSR_V) == 0)) || 
                                    (((*_psr & PSR_N) != 0) && ((*_psr & PSR_V) != 0))); // N equals V
        case ARM9_COND_LT: return ((((*_psr & PSR_N) == 0) && ((*_psr & PSR_V) != 0)) || 
                                    (((*_psr & PSR_N) != 0) && ((*_psr & PSR_V) == 0))); // N does not equal V
        case ARM9_COND_GT: return (((*_psr & PSR_Z) == 0) && ((((*_psr & PSR_N) == 0) && ((*_psr & PSR_V) == 0)) || 
                                    (((*_psr & PSR_N) != 0) && ((*_psr & PSR_V) != 0)))); // Z clear and N equals V
        case ARM9_COND_LE: return (((*_psr & PSR_Z) != 0) || ((((*_psr & PSR_N) == 0) && ((*_psr & PSR_V) != 0)) || 
                                    (((*_psr & PSR_N) != 0) && ((*_psr & PSR_V) == 0)))); // Z set or N does not equal V
        case ARM9_COND_NV: Log(100, ".ARM9Interpreter: Unimplemented NV Opcode Condition!\n"); // Special; falls through to AL
        case ARM9_COND_AL: return (1); // Always (unconditional)
        }

        return 0;
    }

Doesn't seem very effective.
 

synch

New member
|\/|-a-\/ said:
it's in ARM_CPU.h of course, because it's inline

No, just the instruction decode (INSTRUCTION_INDEX macro) and the test condition (TEST_COND macro), but they're macros to make the opcode execution readable :p

ShizZy said:
how do you guys handle instruction conditions?

Desmume seems to check all the possible combinations. You can check the TEST_COND macro in arm_cpu.h and check how it's used in arm_cpu.cpp. I haven't even checked if this could be improved, as I'm mainly focused on the 3D core and compatibility fixes, rather than speed, atm.

BTW: why am I member of the month? Is that a random thingie?
 

Exophase

Emulator Developer
ShizZy said:
Ho humm.. how do you guys handle instruction conditions? My quick dirty lame attempt is to call this function at the beginning of every op, then execute if it returns true: (snipped)

For an interpretive emulator you have to check the condition code for every instruction. It might be worthwhile to only perform the check if the condition is not AL (this check is slightly cheaper than entering the switch); VBA does this. Now, as to how those checks are performed..
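
A rough sketch of the AL shortcut, under some assumptions: `should_execute` and `check_condition` are hypothetical names, and the stand-in condition test only models EQ and NE to stay short. The condition field being the top four bits of an ARM opcode, with 0xE meaning AL, is the only architectural fact relied on.

```c
#include <stdint.h>

#define COND_AL 0xEu          /* "always", bits 31..28 of the opcode */
#define PSR_Z   0x40000000u

/* Stand-in for a full condition test; only EQ (0) and NE (1) are
   handled here to keep the sketch short. */
static int check_condition(uint32_t cpsr, uint32_t cond)
{
    switch (cond) {
    case 0x0: return (cpsr & PSR_Z) != 0;  /* EQ */
    case 0x1: return (cpsr & PSR_Z) == 0;  /* NE */
    default:  return 0;                    /* not modeled in this sketch */
    }
}

/* Returns nonzero if the instruction should execute. */
static int should_execute(uint32_t cpsr, uint32_t opcode)
{
    uint32_t cond = opcode >> 28;
    if (cond == COND_AL)       /* common case: skip the switch entirely */
        return 1;
    return check_condition(cpsr, cond);
}
```

Since most ARM instructions in typical code are unconditional, the single compare handles the bulk of them before the switch is ever reached.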

You're generally better off keeping the computational result flags (N, Z, C, and V) cached in separate variables, and extracting these from the CPSR register after msr instructions/mode changing returns, etc. (in addition to when you first enter the interpretive loop). Likewise, you'd be collapsing them to CPSR upon mrs instructions and so forth. These flags are modified and accessed significantly more in isolation than in CPSR representation.
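
As an illustration of the extract/collapse idea, a minimal sketch could look like this. The function names `extract_flags` and `collapse_flags` are made up for the example; the bit positions (N, Z, C, V in bits 31 down to 28 of the CPSR) are architectural.

```c
#include <stdint.h>

/* Flags cached as 0/1 values; the variable names are hypothetical. */
static uint32_t n_flag, z_flag, c_flag, v_flag;

/* Split the flag bits out of a CPSR value (after MSR, mode returns,
   or on entering the interpreter loop). */
static void extract_flags(uint32_t cpsr)
{
    n_flag = (cpsr >> 31) & 1;
    z_flag = (cpsr >> 30) & 1;
    c_flag = (cpsr >> 29) & 1;
    v_flag = (cpsr >> 28) & 1;
}

/* Fold the cached flags back into a CPSR value (for MRS and similar),
   leaving the lower 28 bits of the given value untouched. */
static uint32_t collapse_flags(uint32_t cpsr)
{
    cpsr &= 0x0FFFFFFFu;
    return cpsr | (n_flag << 31) | (z_flag << 30)
                | (c_flag << 29) | (v_flag << 28);
}
```

Between those two points the opcode handlers only touch the four small variables, so no masking or shifting is needed on the hot path.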

For the actual condition checking there isn't anything that can be done beyond what's obvious (GBATek's logical combinations). However, since the values are strictly boolean, you're best off using purely binary/arithmetic operations (as opposed to logical ones). Furthermore, you can use ^ (binary XOR) instead of !=, which may or may not be faster (depends on platform; on x86, != seems to be about the same or better).
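
For what it's worth, the compound conditions written with binary operators only might look like the sketch below. It assumes each flag is held as exactly 0 or 1 (for instance in cached variables), and the `cond_*` helper names are hypothetical.

```c
#include <stdint.h>

/* Every argument must be exactly 0 or 1, otherwise the bitwise
   forms below no longer match the logical definitions. */
static uint32_t cond_hi(uint32_t c, uint32_t z) { return c & (z ^ 1u); }        /* C set and Z clear  */
static uint32_t cond_ls(uint32_t c, uint32_t z) { return (c ^ 1u) | z; }        /* C clear or Z set   */
static uint32_t cond_ge(uint32_t n, uint32_t v) { return (n ^ v) ^ 1u; }        /* N equals V         */
static uint32_t cond_lt(uint32_t n, uint32_t v) { return n ^ v; }               /* N differs from V   */
static uint32_t cond_gt(uint32_t z, uint32_t n, uint32_t v)
{
    return (z | (n ^ v)) ^ 1u;                                                  /* Z clear and N == V */
}
static uint32_t cond_le(uint32_t z, uint32_t n, uint32_t v)
{
    return z | (n ^ v);                                                         /* Z set or N != V    */
}
```

None of these contain a short-circuit operator, so the compiler is free to emit straight-line code without branches.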

Now, you didn't ask for these in particular, but you might find them helpful.. let me know how you think these compare with what you're already using:

Code:
#define calculate_c_flag_sub(dest, src_a, src_b)                              \
  c_flag = ((unsigned)src_b <= (unsigned)src_a)                               \

#define calculate_v_flag_sub(dest, src_a, src_b)                              \
  v_flag = ((signed)src_b > (signed)src_a) != ((signed)dest < 0)              \

#define calculate_c_flag_add(dest, src_a, src_b)                              \
  c_flag = ((unsigned)dest < (unsigned)src_a)                                 \

#define calculate_v_flag_add(dest, src_a, src_b)                              \
  v_flag = ((signed)dest < (signed)src_a) != ((signed)src_b < 0)              \

These should perform significantly better than versions I've typically seen used that perform binary logic on the sign bits and what have you. These perform especially well on MIPS (it is possible to update all 4 flags in only 6 instructions). Again, ^ can be used instead of != where applicable.
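
One way to convince yourself the comparisons are right is to check them against wider arithmetic. The sketch below restates the four macros with fixed-width types and parenthesized arguments (an assumption on my part: the originals use plain `signed`/`unsigned`, taken here to be 32 bits) and compares them with a 64-bit reference. The helper names `add_flags_agree`, `sub_flags_agree`, and `in_s32_range` are made up for the example.

```c
#include <stdint.h>

static uint32_t c_flag, v_flag;

/* Same comparisons as the macros above, with fixed-width casts. */
#define calculate_c_flag_sub(dest, src_a, src_b) \
  c_flag = ((uint32_t)(src_b) <= (uint32_t)(src_a))
#define calculate_v_flag_sub(dest, src_a, src_b) \
  v_flag = ((int32_t)(src_b) > (int32_t)(src_a)) != ((int32_t)(dest) < 0)
#define calculate_c_flag_add(dest, src_a, src_b) \
  c_flag = ((uint32_t)(dest) < (uint32_t)(src_a))
#define calculate_v_flag_add(dest, src_a, src_b) \
  v_flag = ((int32_t)(dest) < (int32_t)(src_a)) != ((int32_t)(src_b) < 0)

static int in_s32_range(int64_t x) { return x >= INT32_MIN && x <= INT32_MAX; }

/* Returns 1 if the add macros agree with 64-bit reference arithmetic. */
static int add_flags_agree(uint32_t a, uint32_t b)
{
    uint32_t dest = a + b;
    int64_t wide = (int64_t)(int32_t)a + (int32_t)b;
    calculate_c_flag_add(dest, a, b);
    calculate_v_flag_add(dest, a, b);
    return c_flag == (uint32_t)(((uint64_t)a + b) >> 32)
        && v_flag == (uint32_t)!in_s32_range(wide);
}

/* Returns 1 if the sub macros agree; ARM's C after a subtract means "no borrow". */
static int sub_flags_agree(uint32_t a, uint32_t b)
{
    uint32_t dest = a - b;
    int64_t wide = (int64_t)(int32_t)a - (int32_t)b;
    calculate_c_flag_sub(dest, a, b);
    calculate_v_flag_sub(dest, a, b);
    return c_flag == (uint32_t)(a >= b)
        && v_flag == (uint32_t)!in_s32_range(wide);
}
```

Running both helpers over boundary values such as 0, 1, 0x7FFFFFFF, 0x80000000, and 0xFFFFFFFF covers the cases where carry and overflow flip. Note that none of this models ADC/SBC, where the incoming carry also takes part.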

I have a few other (pretty typical..) optimization techniques for ARM interpreters, nothing very amazing. Let me know if you're ever interested in doing a recompiler...
 
OP

|\/|-a-\/

Uli Hecht
hey shizzy, can i have your source, i'm very interested in a nds emu source which is in a very early state!

hey exophase, why are programmers like you always mixing c++ code and c #define macros, which are imo very obsolete because you can use inline functions? the person who taught me some c++ stuff said that i should not use those #defines :)


ps.: i think it's a good idea to make this ds emulation thread sticky like the others ^^
 

Exophase

Emulator Developer
Okay, first of all, the code I pulled this from is 100% C (C99, which does include inline), don't know why you assumed I was using C++. Second, unlike what some people tried to jam down my throat, inlines are NOT as general as macros. Don't use macros if you don't want to, chances are you'll only be using them for things in which inlines ARE better. I use macros because they can generate anything you want. I build functions out of macros. I access the same set of variables across macros. I use the C preprocessor's name pasting, IE:

Code:
#define arm_access_memory(access_type, direction, adjust_op, mem_type,        \
 offset_type)                                                                 \
{                                                                             \
  arm_data_trans_##offset_type(adjust_op, direction);                         \
  arm_access_memory_##access_type(mem_type);                                  \
}                                                                             \

Personally, macros have never given me a hard time in terms of the popular caveats (not type safe, don't require actual arguments, multiple evaluation). I'm a low level programmer, I see through things like this easily. The only headaches they've given me are that in C they're not truly multi-line and they're much harder to debug (or sometimes, even get to compile).
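
To show what "building functions out of macros" with name pasting can look like in isolation, here is a small self-contained sketch. It is not taken from the emulator above: `build_alu_op` and the generated `alu_*` functions are hypothetical names chosen for the example.

```c
#include <stdint.h>

/* Generates one ALU helper per operation: `name` is pasted into the
   function name with ##, and `op` is spliced in as the operator. */
#define build_alu_op(name, op)                                                \
  static uint32_t alu_##name(uint32_t a, uint32_t b)                          \
  {                                                                           \
    return a op b;                                                            \
  }

build_alu_op(add, +)   /* defines alu_add */
build_alu_op(sub, -)   /* defines alu_sub */
build_alu_op(eor, ^)   /* defines alu_eor */
build_alu_op(orr, |)   /* defines alu_orr */
```

An inline function can't do this, because neither the function name nor the operator can be passed to it as an argument.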

Summary: macros are not obsolete if you use C or C++ because inline functions can't accomplish the same thing. And anyone who tells you that what I do is "undefined" by the standard or some garbage like that is full of it.

Here's another example to illustrate the usefulness of macros:

Code:
// These must be statically declared arrays (ie, global or on the stack,
// not dynamically allocated on the heap)

#define file_read_array(filename_tag, array)                                  \
  file_read(filename_tag, array, sizeof(array))

#define file_write_array(filename_tag, array)                                 \
  file_write(filename_tag, array, sizeof(array))

Even though this is restricted (and in such a way that if you use it incorrectly it'll mess up your program rather than give you an error) it is something that as far as I'm aware you simply cannot do with inline functions at all. This pretty much characterizes macros in general: they're more powerful and can save you a lot of typing compared to inlines (where you have to at least pass variables around, in C by address (there's no pass by reference in C) and you're restricted by type) but are more dangerous, so should only be used when the person really knows what they're doing. So a lot of teachers will tell their C/C++ students to use inlines because said students DON'T know what they're doing, or because the teachers learned it that way themselves and don't know any better.
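
The reason a function can't replace the sizeof trick is that an array argument decays to a pointer at the call. A minimal sketch of the difference follows; `array_bytes` and `bytes_seen_by_function` are hypothetical names for illustration.

```c
#include <stddef.h>

/* Expands at the call site, where `array` still has its array type. */
#define array_bytes(array) (sizeof(array))

/* Inside a function the parameter is only a pointer, so sizeof
   reports the pointer size, not the size of the caller's array. */
static size_t bytes_seen_by_function(const unsigned char *array)
{
    return sizeof(array);
}
```

Given `unsigned char save_ram[4096];`, the macro yields 4096 while the function yields the size of a pointer. The same decay is why the macro silently misbehaves if it is handed a heap pointer instead of a real array.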

This is what tends to happen in general with programming, how "safe" something is determines whether or not it should be used, regardless of its power. I think that should be left to the programmer's discretion, and it shouldn't simply be assumed that problems like the common ones with macros will catch EVERY programmer (I've heard statements like this before. I've used macros very extensively and have never been caught in any of the typical traps except for a precedence mistake, after which I learned my lesson pretty completely).

Macros also have the benefit of being easy to expand at compile time, if you want to see what the actual generated code looks like. This can be useful for finding some hidden performance killers you missed.
 
OP

|\/|-a-\/

Uli Hecht
well, i didn't say i was told it is "undefined" by the standard... the "multiline problem" mainly annoys me... but do what you want. don't take my comment personally, i asked you because i saw those #defines so often and now i finally wanted to know why ^^
 
OP

|\/|-a-\/

Uli Hecht
how do you emu programmers manage the two cpus? i don't know where to start...

(still want an answer ^^)
 

ShizZy

Emulator Developer
@Exophase, very nice explanation, thank you. Here are the functions I scratched together for calculating the flags:
Code:
    // Computes the Negative and Zero bits of a Program status register
    static inline void ComputeNegativeZero(u32* _psr, u32 _operandA)
    {
        // Check if Negative
        if(_operandA & 0x80000000)
            ARM9_SET_BIT(*_psr, PSR_N);                // Negative
        else
            ARM9_RESET_BIT(*_psr, PSR_N);            // Positive

        // Check if Zero
        if(_operandA == 0)
            ARM9_SET_BIT(*_psr, PSR_Z);                // Zero
        else
            ARM9_RESET_BIT(*_psr, PSR_Z);            // No Zero
    }

    // Computes the Carry bit of a Program status register - Addition Opcode
    static inline void ComputeCarryAddition(u32* _psr, u32 _operandA, u32 _operandB)
    {
        // Check for Carry
        if((0xffffffff - _operandA) < _operandB)
            ARM9_SET_BIT(*_psr, PSR_C);                // Carry
        else
            ARM9_RESET_BIT(*_psr, PSR_C);            // No Carry
    }

    // Computes the Carry bit of a Program status register - Subtraction Opcode
    static inline void ComputeCarrySubtraction(u32* _psr, u32 _operandA, u32 _operandB)
    {
        // Check for Borrow (ARM sets C when there is no borrow, i.e. A >= B unsigned)
        if(_operandA >= _operandB)
            ARM9_SET_BIT(*_psr, PSR_C);                // No Borrow
        else
            ARM9_RESET_BIT(*_psr, PSR_C);            // Borrow
    }

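    // Computes the Carry bit of a Program status register - Shift left Opcode
    static inline void ComputeCarryShiftleft(u32* _psr, u32 _operandA, u32 _operandB)
    {
        u32 shift;

        // A shift amount of 0 leaves the Carry unchanged
        if(_operandB == 0)
            return;

        // Amounts above 32 shift every bit out, so the Carry is cleared;
        // this also keeps the C shift below within its defined range
        shift = (_operandB > 32) ? 0 : (_operandA << (_operandB - 1));

        // Check for Carry (the last bit shifted out)
        if(shift & 0x80000000)
            ARM9_SET_BIT(*_psr, PSR_C);                // High
        else
            ARM9_RESET_BIT(*_psr, PSR_C);            // Low
    }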
...And so on and so forth. A bit messy, I know, but should work (I think).

@|\/|-a-\/, trust me, you don't want my source :p When it runs something, maybe, but right now it's just a very boring framework. I don't even have any instructions coded, haven't had much time to work on it.
 
OP

|\/|-a-\/

Uli Hecht
@shizzy: sure it's a boring framework, but (as mentioned) i do not really know where to start... that's the reason why i'm interested in the first lines of your emulator, otherwise i could take yopyop's desmume source...
 
