Raymond Chen has written a multi-part series on the Intel Itanium processor architecture. It explains the architecture from the perspective of software development and performance optimization. To quote Raymond:
The Itanium may not have been much of a commercial success, but it is interesting as a processor architecture because it is different from anything else commonly seen today. It’s like learning a foreign language: It gives you an insight into how others view the world.
The next two weeks will be devoted to an introduction to the Itanium processor architecture, as employed by Win32.
The series is highly recommended reading (as is his entire blog, The Old New Thing):
- Part 1 – Warming up
- Part 2 – Instruction encoding, templates, and stops
- Part 3 – The Windows calling convention, how parameters are passed
- Part 3b – How does spilling actually work?
- Part 4 – The Windows calling convention, leaf functions
- Part 5 – The GP register, calling functions, and function pointers
- Part 6 – Calculating conditionals
- Part 7 – Speculative loads
- Part 8 – Advanced loads
- Part 9 – Counted loops and loop pipelining
- Part 10 – Register rotation