Manipulating strings in C

Cyberman

Moderator
First, he is using C, not C++, so it's NOT possible to do that in C.

C++ == OOP
C == non OOP

The new and delete operators were introduced in C++ and are for C++ only.

As for allocating memory, you should always CLEAR allocated memory (otherwise you could have a mess on your hands). Use calloc instead of malloc. It is used almost the same way, except that it takes an element count and an element size, calloc(count, size), instead of a single byte count.

Don't forget to free your memory after allocating it (the biggest error of most green programmers ;) ).

Other things

Use delete for a single object and delete[] for an array in your C++ code. Some compilers happen to tolerate a plain delete on an array, but ANSI C++ requires the [] for anything allocated with new[]; mixing the two forms is undefined behavior.

For an array of objects the syntax is
Code:
K = new char[144];
delete[] K;
and for a single object
Code:
K = new char;
delete K;

Of course then there are object factories... that's a very different horse, but the syntax is somewhat the same. An object factory has a generic base type and creates the correct object needed.

Container *Box;

Box = new Container("SoapBox");

This creates a soap box. It doesn't seem special until you realize that the SoapBox type is a descendant of type Container. All such descendant types can be assigned to a Container pointer, and the base class functions are all accessible.

Cyb
 

Slougi

New member
See, that is why I want to learn C first :) Distinguishing between C and C++ can be pretty important. Also all that OO stuff still confuses the heck outta me.
 

Doomulation

Meh. I did figure out why they use malloc in emulator code :saint:
Forgive me, I'm only working with C++, not C. :doh:
 

euphoria

Emutalk Member
Cyberman said:
Use calloc instead of malloc. You can use it identically as malloc.
Using calloc is justified when speed is not an issue since setting the memory block to some value is not always wanted.
So the difference between calloc and malloc is that calloc zeroes the memblock so {a=calloc(1,1);} == {a=malloc(1); memset(a,0,1);}
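
To make that equivalence concrete, here is a minimal sketch in C. The function names zeroed_with_calloc and zeroed_with_malloc are made up for this illustration; only calloc, malloc and memset come from the standard library.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate n bytes that calloc has already set to zero. */
unsigned char *zeroed_with_calloc(size_t n)
{
    return calloc(n, 1);
}

/* Allocate n bytes with malloc, then clear them by hand. */
unsigned char *zeroed_with_malloc(size_t n)
{
    unsigned char *p = malloc(n);
    if (p != NULL) {
        memset(p, 0, n);
    }
    return p;
}
```

Both functions hand back a block of n zero bytes (or NULL on failure), and both blocks must be released with free().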

What would be the best way to achieve the following:
I want to use memory addresses 0x0000-0x0FFF, but only actually use 0x0000-0x00FF and 0x0900-0x0FFF, and free the memory from 0x0100-0x08FF.
Could I make a pointer to 0x100 and somehow poke the size of the allocated memory so that it frees only up to 0x8FF?
Or could I allocate 0x100 bytes and force the second allocation to land 0x900 bytes from the start of the first memory block?

Hope that this was explained clearly enough :plain:
 

Cyberman

Moderator
One should use calloc whenever a structure is allocated. Yes, there is overhead, but it is justifiable because all your pointers will magically become NULL (on any mainstream platform, where a null pointer is all zero bits). Most people don't initialize their pointers properly, and this leads to lots of errors; the simplest fix is to use calloc().
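
A small sketch of that idea, assuming a hypothetical struct node invented purely for illustration (it is not from any real emulator):

```c
#include <stdlib.h>

/* Hypothetical structure, used only for illustration. */
struct node {
    struct node *next;
    char *name;
    int value;
};

/* Returns a node whose members all start out as zero / NULL,
   assuming a platform where null pointers are all zero bits.
   Returns NULL if the allocation itself fails. */
struct node *new_node(void)
{
    return calloc(1, sizeof(struct node));
}
```

With malloc in place of calloc, next and name would hold whatever garbage happened to be in that memory.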

Ironically, this is what VC++ and BCB do when they allocate class objects. For arrays such as
Code:
Temp= new char[1024*1024*4];
It doesn't. For large arrays of just data malloc is better :)

Are you addressing an absolute location?
Or are you speaking of emulator memory?

If absolute, you will get exception errors out your ears; it's not gonna happen unless you set your thread to kernel-level privileges.
If these locations are absolute and already have predefined data in them, then this works just fine.
Code:
typedef char bfr[0x100];
typedef bfr *bfr_ptr;
bfr_ptr Buffer1, Buffer2;

Buffer1 = (bfr_ptr)0x000;
Buffer2 = (bfr_ptr)0x900;
You can 'resize' allocated memory but you cannot specify the locations malloc returns. Since this would be EXISTING data you should just create pointers to the absolute locations, like I just showed.

Cyb
 

euphoria

Emutalk Member
It's so annoying when you try to be as specific as possible and still forget some vital information...

But I'm talking about allocating several blocks of memory that are always the same distance apart from each other, no matter where the first one starts.
Is this what you mean by "emulator memory"? That way I can use fixed addresses without needing to translate them. And yes, I'm playing around with emulating a system.

I guess it would be better to translate the fixed addresses to the allocated blocks, but speed could be improved by structuring the allocated memory so that there is no need to translate them, just access them by [start_of_allocated_memory + fixed_address];
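
One way to get the [start_of_allocated_memory + fixed_address] access is to allocate the whole 0x0000-0x0FFF range as a single block and simply leave the middle unused. The sketch below assumes that approach; EMU_MEM_SIZE, emu_alloc and emu_ptr are names invented for this example.

```c
#include <stdlib.h>

#define EMU_MEM_SIZE 0x1000u  /* covers fixed addresses 0x0000-0x0FFF */

/* Allocate the whole emulated range as one zeroed block. */
unsigned char *emu_alloc(void)
{
    return calloc(EMU_MEM_SIZE, 1);
}

/* Turn a fixed emulated address into a host pointer.
   Returns NULL if base is NULL or the address is out of range. */
unsigned char *emu_ptr(unsigned char *base, unsigned int fixed_address)
{
    if (base == NULL || fixed_address >= EMU_MEM_SIZE) {
        return NULL;
    }
    return base + fixed_address;
}
```

The unused gap 0x0100-0x08FF costs 0x800 bytes (2K) of host memory, which is usually a cheap price for not having to translate addresses.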
 

Cyberman

Moderator
Hmmmm

It really depends on the system you are emulating.

For example, if you are emulating, say, a PSX, it makes no sense to create such a large memory (it can address 4 gigs).

Although direct access is fast in 'perceived' speed, what matters is what the memory you wish to allocate is used for.

If your memory is < 1M and is all data, just allocating the raw memory and using it is fine... anything more and you will have problems.
Let's take as example the PSX
Memory stuff:
0x0000_0000-0x001f_ffff  Kernel and User Memory (2 Meg)
0x1f00_0000-0x1f00_ffff  Parallel Port (64K)
0x1f80_0000-0x1f80_03ff  Scratch Pad (1024 bytes)
0x1f80_1000-0x1f80_2fff  Hardware Registers (8K)
0x8000_0000-0x801f_ffff  Kernel and User Memory Mirror (2 Meg), cached
0xa000_0000-0xa01f_ffff  Kernel and User Memory Mirror (2 Meg), uncached
0xbfc0_0000-0xbfc7_ffff  BIOS (512K)

Now what is it I am getting AT?

It's quite simple... you cannot emulate a PlayStation with direct memory mapping. What do you do then?

Notice the different behavior depending on location (i.e. cached vs. uncached access) and the hardware registers. One can't forget the BIOS either.

For straight C one can do this

Code:
unsigned char *PSX_mem;
unsigned char * BIOS_mem;

PSX_mem = malloc(2 * 1024 * 1024); // get your 2 megs!
BIOS_mem = malloc(512 * 1024); // allocate 512K of bios ROM

For accessing memory you need to note the behavior. I'm not sure about the caching aspects, but...
Code:
unsigned char Read_byte(unsigned long int Address)
{
  unsigned char Return;

  Return = 0; // default value for unmapped or invalid reads
  switch(Address >> 28)
  {
    case 0x0:
    case 0x8:
    case 0xA:
      // access PSX_mem (the mask folds the mirrors onto the same 2 megs)
      Return = PSX_mem[Address & 0x1FFFFF];
      break;
    case 0x1:
      // access the hardware as a byte
      break;
    case 0xB:
      // access Bios memory (0xBFC00000-0xBFC7FFFF)
      if ((Address & 0xFFF80000) == 0xBFC00000)
      {
        Return = BIOS_mem[Address & 0x7FFFF]; // offset within the 512K ROM
      }
      else
      {
        // ACCESS violation
      }
      break;
    default:
      // memory fault do something appropriate
      break;
  }
  return Return;
}
unsigned short Read_word(unsigned long int Address)
{
  // left as a stub in this post
  return 0;
}
unsigned long int Read_long(unsigned long int Address)
{
  // left as a stub in this post
  return 0;
}

This is a start... really you can use a HASH table (strictly speaking a plain lookup table indexed by the top 16 address bits) to improve the efficiency of access to the real memory significantly.

The HASH table contains a list of pointers, each of which refers to a location in memory.

For example, break the address space into 64K segments (most of which are unpopulated).
Code:
unsigned char *HASH_TABLE[0x10000]; // one entry per 64K segment of the 4 gig space

void Fill_Hash(void)
{
  int Index;

  // clear hash table
  for(Index = 0; Index < 0x10000; Index++)
  {
    HASH_TABLE[Index] = NULL;
  }
  // compute PSX ram offsets (2 megs = 0x20 segments of 64K)
  for(Index = 0; Index < 0x20; Index++)
  {
    HASH_TABLE[Index] = PSX_mem + (Index << 16);
    HASH_TABLE[Index+0x8000] = PSX_mem + (Index<<16);
    HASH_TABLE[Index+0xA000] = PSX_mem + (Index<<16);
  }
  // compute Bios offsets (512K = 8 segments of 64K)
  for (Index = 0; Index < 8; Index++)
  {
    HASH_TABLE[Index+0xBFC0] = BIOS_mem + (Index<<16);
  }
}

You can do this for each segment used, except the hardware and scratch pad spots. The parallel port might require some additional work too. However you form an address, since you know NULL indicates an invalid memory location, you can look up addresses very quickly for both BIOS and PSX memory. Additionally, you can add some checks for the special cases (hardware and scratch pad registers).
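
As a rough sketch of what the lookup side could look like: the top 16 bits of the address pick a table entry and the low 16 bits index into that 64K segment. This is self-contained rather than tied to the code above, so segment_table, map_segments and read_byte_fast are names invented for the example, and the fault flag is just one possible way to report an unmapped address.

```c
#include <stddef.h>

#define SEGMENT_BITS  16
#define SEGMENT_COUNT 0x10000

/* One pointer per 64K segment; NULL marks an unmapped segment. */
static unsigned char *segment_table[SEGMENT_COUNT];

/* Map `count` consecutive segments, starting at segment `first`,
   onto consecutive 64K pieces of the host buffer `host`. */
void map_segments(unsigned int first, unsigned int count, unsigned char *host)
{
    unsigned int i;
    for (i = 0; i < count && first + i < SEGMENT_COUNT; i++) {
        segment_table[first + i] = host + ((size_t)i << SEGMENT_BITS);
    }
}

/* Read one byte. On an unmapped address, sets *fault to 1 and
   returns 0; otherwise sets *fault to 0. */
unsigned char read_byte_fast(unsigned long address, int *fault)
{
    unsigned char *segment = segment_table[(address >> SEGMENT_BITS) & 0xFFFF];
    if (segment == NULL) {
        *fault = 1;
        return 0;
    }
    *fault = 0;
    return segment[address & 0xFFFF];
}
```

Mapping the same host buffer at segments 0x0000, 0x8000 and 0xA000 gives the mirrors for free, since all three table entries point at the same bytes.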


Just some thoughts.. gah I typed too much ;)

Cyb
 

Slougi

New member
Cyberman said:
[entire previous post quoted]

Ok now, why did you have to turn my small little programming thread into complete techno gibberish? :|

j/k ;)


Weeelllll, that code raised a few ???'s in my head ;)

1) What does switch do? switch(Address >> 28) Also what do the << and >> thingies do?
2) Why the extra default: break statement?
3) Why did I understand what you were doing even though I have no clue about coding? :|
 

icepir8

Moderator
1) switch executes the code under whichever case value equals the value being tested; here that value is Address >> 28, the top four bits of a 32-bit address.

2) default: is executed when none of the cases match.

3) Lucky?!?!?!?
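
For anyone who wants to poke at switch on its own, here is a tiny sketch. The function region_name and the labels it returns are made up for illustration; they only mirror the cases used in Read_byte.

```c
/* Name the region selected by the top four bits of a 32-bit address.
   The labels are illustrative only. */
const char *region_name(unsigned long address)
{
    switch ((address >> 28) & 0xF) {
    case 0x0:
    case 0x8:
    case 0xA:
        /* stacked case labels share the code below */
        return "ram";
    case 0x1:
        return "hardware";
    case 0xB:
        return "bios";
    default:
        /* reached when no case label matches */
        return "unmapped";
    }
}
```

So region_name(0x80001234) picks the 0x8 case and falls into the shared "ram" branch, while an address such as 0x40000000 matches nothing and lands in default.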
 

Slougi

New member
icepir8 said:
Oops, forgot them.
<< shifts a value left. It's like multiplying by 2 to the # power.
>> shifts a value right. It's like dividing by 2 to the # power.
Ah thanks!
 

Cyberman

Moderator
Why did you quote the whole thing? ;)

The following operators are used for bit manipulation.
<< Left shift
>> Right shift
| OR
^ XOR
& AND
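
A quick sketch of each operator in C, wrapped in one-line helpers so the results are easy to check. The helper names are invented for this example.

```c
/* Each helper applies exactly one operator to unsigned values. */
unsigned int shift_left(unsigned int v, unsigned int n)  { return v << n; } /* v times 2^n */
unsigned int shift_right(unsigned int v, unsigned int n) { return v >> n; } /* v divided by 2^n */
unsigned int bit_or(unsigned int a, unsigned int b)      { return a | b; }  /* bits set in either */
unsigned int bit_xor(unsigned int a, unsigned int b)     { return a ^ b; }  /* bits set in exactly one */
unsigned int bit_and(unsigned int a, unsigned int b)     { return a & b; }  /* bits set in both */
```

With a = 0xC (binary 1100) and b = 0xA (binary 1010): OR gives 0xE (1110), XOR gives 0x6 (0110) and AND gives 0x8 (1000). Likewise 3 << 4 is 48 and 48 >> 4 is 3.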

Have fun ;)
Cyb
 

Slougi

New member
How does XOR work again?
 

Teamz

J'aime tes seins
here's a small table that shows how XOR works

v1  v2  result
1   1   false
1   0   true
0   1   true
0   0   false

OR would also return true if both values were 1 (true), but XOR only returns true when exactly one of the 2 values is 1

I had to edit cuz the board seems to remove "unnecessary spaces"

v1 = value 1, v2 = value 2
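
The same table can be written as a couple of small C functions. logical_xor and logical_or are names made up for this sketch; C itself only has the bitwise ^ operator, so the logical version is built from comparisons.

```c
/* Logical XOR of two truth values: 1 when exactly one input is non-zero. */
int logical_xor(int v1, int v2)
{
    return (v1 != 0) != (v2 != 0);
}

/* Logical OR for comparison: 1 when at least one input is non-zero. */
int logical_or(int v1, int v2)
{
    return (v1 != 0) || (v2 != 0);
}
```

The only row where the two differ is v1 = 1, v2 = 1: OR says true, XOR says false.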
 

Slougi

New member
Ah thanks :)
 

euphoria

Emutalk Member
Cyberman said:

Just some thoughts.. gah I typed too much ;)
Cyb

Well, thanks for your thoughts, I really appreciate it, but still no answer to my question:
"Could i make pointer to 0x100 and somehow poke the size of allocated mem so that it frees only up to 0x8ff?
or could i allocate 0x100 bytes and force the second allocate to align in 0x900 bytes from the first memory block? "

Got a few more.
Which is the more effective way to write memory?
pointer[address] = value
or
memcpy(pointer+address, &value, sizeof(value)); ?
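
For reference, a sketch of the two forms side by side, assuming the memory is a plain byte buffer. The names write_byte, write_u32 and read_u32 are invented here. For a single byte, indexing is all that is needed; memcpy becomes interesting when the value is wider than one byte, because it works at any offset regardless of alignment. No speed claim is made here: compilers commonly turn a small fixed-size memcpy into an ordinary store, but that is worth measuring rather than assuming.

```c
#include <stdint.h>
#include <string.h>

/* Store one byte: plain indexing. */
void write_byte(unsigned char *mem, unsigned int address, unsigned char value)
{
    mem[address] = value;
}

/* Store a 32-bit value at any byte offset, aligned or not.
   The bytes land in the host's own byte order. */
void write_u32(unsigned char *mem, unsigned int address, uint32_t value)
{
    memcpy(mem + address, &value, sizeof value);
}

/* Read back a 32-bit value stored with write_u32. */
uint32_t read_u32(const unsigned char *mem, unsigned int address)
{
    uint32_t value;
    memcpy(&value, mem + address, sizeof value);
    return value;
}
```

If the emulated machine's byte order differs from the host's, the bytes would need to be assembled one at a time instead.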


And what are the + and - of declaring big structures as global variables vs. passing pointers to the variables to functions?
Code:
BIG_STRUCT_T big;

void do_work(void)
{
// do something to big.value
}
Code:
void do2(BIG_STRUCT_T *s)
{
// do something to s->value
}
void do_work(void)
{
BIG_STRUCT_T big;
do2(&big);
}

and sorry Slougi for spamming your thread :)
 

tooie

New member
euphoria said:
Well thanks for your thoughts, i really apreciate it, but still no answer to my question:
"Could i make pointer to 0x100 and somehow poke the size of allocated mem so that it frees only up to 0x8ff?
or could i allocate 0x100 bytes and force the second allocate to align in 0x900 bytes from the first memory block? "


Memory is allocated in blocks... some blocks may be bigger than what you asked for so that the size is a nice round number, so it is not possible to free part of a block. The allocator identifies which block to free from the pointer value it returned... so if you modified that pointer, it would have no idea which block it belonged to.

With virtual memory, you can reserve a massive amount of linear address space without allocating any physical memory... then you can choose where in that address space you want to allocate. The catch is that you can only allocate in 4K blocks: if you allocate 1 byte in the middle of a block, the whole 4K block is allocated, and if you allocate 8 bytes that span two blocks, you allocate 8K. Virtual memory gives more control... but I am not sure what you are trying to achieve?
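
A rough sketch of the reserve-then-commit idea, using POSIX mmap and mprotect as a stand-in (on Windows the equivalent calls would be different, and this is not necessarily the API meant above). The names reserve_region, commit_range and release_region are invented for the example, and the page size is queried at run time rather than assumed to be 4K.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve `size` bytes of address space with no access rights.
   Returns NULL on failure. */
unsigned char *reserve_region(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (unsigned char *)p;
}

/* Make every page touching [offset, offset + length) readable and
   writable. Whole pages are affected, never parts of one.
   Returns 0 on success, -1 on failure. */
int commit_range(unsigned char *base, size_t offset, size_t length)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t start = offset - (offset % page);  /* round down to a page boundary */
    size_t end = offset + length;
    return mprotect(base + start, end - start, PROT_READ | PROT_WRITE);
}

/* Give the whole reserved region back. */
void release_region(unsigned char *base, size_t size)
{
    munmap(base, size);
}
```

Touching an address inside the reserved region that has not been committed faults, which is roughly the "unused hole" behavior asked about, only at page granularity.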

euphoria said:
Got a few more.
which is more effective way to write memory?pointer[address] = value
or
memcpy(pointer+address, &value, sizeof(value)); ?

And what are the + and - of declaring big structures as global variables vs. passing pointers to the variables to functions?

It is just nicer code not having global variables, not to mention that at times the compiler does better with local ones, but not always. If you had a global structure, or you created one instance of it and then referenced it, the memory usage would be the same.

If it was a class, the thing I like about creating it later and referencing it is that the constructor is executed after main (WinMain) starts; if it is a global variable, then it is constructed before main.

If you have a small program then a couple of global variables is not too bad, but it does become messy if you have a lot, and a lot harder to work with.
 

smcd

Active member
Doomulation said:
Heh. You're right. It has nothing to do with pointers.
char* is a pointer. That's what I tried to point out :)

Code:
char* test = "hello";
// or
char* test = new char[20];
test = "hello";

That's pointers.

Code:
char* test = new char[20]; // pointer
test[5] = 'g'; // not pointer
test = "hello"; // pointer

Just thought I'd point out that this causes a memory "leak"
you allocate memory (which is pointed to by test),
then make test point to the string "hello" ...
since you no longer have the address of the "new char[20]" it cannot be deleted,
thus a memory leak... :p

If you want to make the newly allocated array contain "hello" you should use something like strcpy.
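
In plain C (malloc instead of new), the strcpy route could look like the sketch below. make_hello is a name invented for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a freshly allocated 20-byte buffer holding "hello",
   or NULL if allocation fails. The caller must free() it. */
char *make_hello(void)
{
    char *test = malloc(20);
    if (test != NULL) {
        strcpy(test, "hello");  /* copies 6 bytes, including the '\0' */
    }
    return test;
}
```

Because the characters are copied into the buffer, test still points at the allocated block, so it can be modified and later freed.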
 

Doomulation

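Nawww.... the compiler takes care of the string literal: it lives in static storage for the whole program, so there is nothing to destruct. (The new char[20] block is a different matter; nothing frees that for you once you lose the pointer.) The real trouble starts if you somehow cut off the string.
Like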

Code:
char* p = "test";
p[2] = '\0'; // This will cause a memory error (it writes to a string literal)

Because it's basically the same as returning a char pointer.

Code:
char* some_func()
{
  return "test";
}

You'll get no memory leaks when exiting the program afaik. But it is still something I'd never do...
 
