cufunha said:
Do you know the names of good books on Win32 assembly covering modern processor assembly, Windows XP, common assembly algorithms, MMX, etc.?
There are tons of books on assembly language for modern processors.
Let's try some manufacturers' sites first:

AMD Athlon XP (bleah bleah)
MMX + 3DNow! extensions
AMD Opteron (bleah bleah)
Various information regarding abusage
Intel P4
Stuff: skip down to MANUAL and pick whatever you feel best suits your needs.
If you are new to programming, hmmm, then maybe you should start with something less prone to locking up your computer. MMX instructions can be FATAL to your program's floating-point code. You cannot freely mix floating-point and MMX instructions, for example. The MMX registers are aliased onto the x87 floating-point register stack, so at any given moment you perform either floating-point or MMX instructions, not both. You do not need to disable interrupts during an MMX sequence: the operating system saves and restores the FPU/MMX state on a context switch. What you must do after the sequence is execute EMMS, which marks the registers as empty again so that any FP instructions that follow behave correctly. I know, fun stuff, but that's life. You can't expect other code to check the MMX execution state for you.
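To make the "switch MMX off before any floating point runs" rule concrete, here is a minimal sketch in C. It uses the compiler's MMX intrinsics from `<mmintrin.h>` purely to keep the example short (hand-written assembly would follow the same pattern), and it falls back to plain scalar code where MMX is not available. The names `add_saturated_i16`, `mean_of_sum` and `HAVE_MMX` are made up for illustration; `_mm_adds_pi16` and `_mm_empty` (which emits EMMS) are the real intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: MMX intrinsics are usable when the compiler says so
   (GCC/Clang define __MMX__, 32-bit MSVC defines _M_IX86). */
#if defined(__MMX__) || defined(_M_IX86)
#include <mmintrin.h>   /* __m64, _mm_adds_pi16, _mm_empty */
#define HAVE_MMX 1
#else
#define HAVE_MMX 0
#endif

/* Hypothetical helper: dst[i] = saturate(a[i] + b[i]) on 16-bit samples. */
void add_saturated_i16(int16_t *dst, const int16_t *a,
                       const int16_t *b, size_t n)
{
    size_t i = 0;
    assert(dst != NULL && a != NULL && b != NULL);

#if HAVE_MMX
    /* Four 16-bit lanes per MMX register. */
    for (; i + 4 <= n; i += 4) {
        __m64 va, vb, vr;
        memcpy(&va, a + i, sizeof va);      /* unaligned-safe load  */
        memcpy(&vb, b + i, sizeof vb);
        vr = _mm_adds_pi16(va, vb);         /* PADDSW: saturating add */
        memcpy(dst + i, &vr, sizeof vr);    /* unaligned-safe store */
    }
    /* EMMS: mark the aliased x87 registers empty again, so that
       floating-point code running after this function is safe. */
    _mm_empty();
#endif

    /* Scalar tail (and the whole job when MMX is unavailable). */
    for (; i < n; ++i) {
        int32_t s = (int32_t)a[i] + (int32_t)b[i];
        if (s > INT16_MAX) s = INT16_MAX;
        if (s < INT16_MIN) s = INT16_MIN;
        dst[i] = (int16_t)s;
    }
}

/* Hypothetical caller: MMX work first, floating point afterwards. */
double mean_of_sum(const int16_t *a, const int16_t *b,
                   int16_t *tmp, size_t n)
{
    double total = 0.0;
    size_t i;

    add_saturated_i16(tmp, a, b, n);   /* MMX state is cleared inside */
    for (i = 0; i < n; ++i)
        total += tmp[i];               /* floating-point accumulate */
    return n ? total / (double)n : 0.0;
}
```

The point of the layout is that EMMS sits at the end of the routine that touched the MMX registers, so no caller has to remember it. Calling it once after the loop, rather than once per iteration, also keeps its cost out of the hot path.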
Algorithm-wise, it depends on what you are doing. You might be better off looking at books on digital signal processing for audio and video instead. An algorithm is a method of performing an action and is inherently not processor specific. Implementing an algorithm optimally, however, can be highly processor dependent.
Also note that you cannot really use the processor intrinsics MS supplies with their compiler. Some brain-dead programmer wrote them for their compiler and, to be blunt, the result is TERRIBLE. There are ZERO, none, NYET, zilch optimizations in using the processor intrinsics. Essentially you may as well NOT use MMX, because the implementation MS has is worse than not using processor intrinsics to perform embarrassingly parallel code. As one person put it, "It's utterly pathetic". I also believe you will have to implement your assembly separately from your C code instead of inline. It has been said that MS removed inline pass-through assembly from their compilers after VC++6, though from what I can tell that only holds for the 64-bit compiler, and 32-bit builds still accept __asm blocks. I guess they decided their MMX intrinsics optimizations were good enough (sigh). Where inline assembly is gone, you cannot put your assembly within a function and have the compiler handle the function construction and the parameter substitution for your MMX/SSE code. You have to handle all the frame pointer manipulation etc. yourself. I guess that's yet another reason to use Borland's compiler.
Cyb
PS: I edited your post's subject to better fit your question; you might get more responses that way!