instruction sets cheatsheet.txt 9.8 KB

x86
--------------------------------------------------------------------------------
- a family of inst. sets (not a single one), extended with back. compat.
- mainly for desktops
- arch.: register-memory
- CISC, 3000+ instructions total (~1000 mnemonics)
- variable instruction size (1 to 15 bytes, usually 2-3)
- little endian
- partially "open"
- 16, 32, 64 bit versions
- supports (usually as extension) float, SIMD (MMX, SSE, ...)
- modes:
  - real: 20b segmented addr. (~1 MiB RAM), no mem. protection
  - unreal: real mode with segment limits raised to 4 GiB (unofficial trick)
  - protected: 16 MB (1 GB) of physical (virtual) RAM (80286), protected mem.
  - long: 64 bit mode
  - ...
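The 20 bit segmented addressing of real mode can be sketched as follows. This is a minimal illustration, not emulator code; the function name real_mode_address is made up for this example, and the wrap to 20 bits assumes the behavior of the original 8086 (later CPUs with the A20 line enabled do not wrap).

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Combine a 16 bit segment and a 16 bit offset into a 20 bit address."""
    # The segment is shifted left by 4 bits (multiplied by 16) and the offset
    # is added; the sum is then wrapped to 20 bits (assumption: 8086 behavior).
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(real_mode_address(0xB800, 0x0010)))  # 0xb8010
print(hex(real_mode_address(0xFFFF, 0x0010)))  # 0x0 (wrapped around)
```

Many different segment:offset pairs map to the same physical address, which is one reason the scheme is considered awkward.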
- bloat, implementations use speculation, reordering, prediction, microcode,
  pipelines etc.
- registers:
    general purpose registers:
      64b |      RAX      |      RBX      |      RCX      |      RDX      |
      32b |   |    EAX    |   |    EBX    |   |    ECX    |   |    EDX    |
      16b |   |     | AX  |   |     | BX  |   |     | CX  |   |     | DX  |
      8b  |   |     |AH AL|   |     |BH BL|   |     |CH CL|   |     |DH DL|
    other registers:
      (E/R)FLAGS <- various flags set by operations
        CF    PF     AF   ZF   SF   TF   IF   DF   OF       ...
        carry parity aux. zero sign trap int. dir. overflow
      (E/R)SP    <- stack pointer
      (E/R)BP    <- stack base pointer
      (E/R)IP    <- instruction pointer
      (E/R)SI    <- source pointer
      (E/R)DI    <- destination pointer
      CS         <- code pointer   \
      DS         <- data pointer    | segment
      SS         <- stack           | registers
      ES,FS,GS   <- extra pointer  /
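The nesting of the general purpose registers (AL and AH inside AX, AX inside EAX, EAX inside RAX) can be illustrated with bit masks. This is only a sketch of the layout; the helper name sub_registers is hypothetical.

```python
def sub_registers(rax: int) -> dict:
    """Return the narrower views of a 64 bit RAX value."""
    eax = rax & 0xFFFFFFFF  # low 32 bits
    ax = rax & 0xFFFF       # low 16 bits
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # high byte of AX
        "AL": ax & 0xFF,         # low byte of AX
    }


for name, value in sub_registers(0x1122334455667788).items():
    print(name, hex(value))
# EAX 0x55667788
# AX 0x7788
# AH 0x77
# AL 0x88
```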
- instruction format:
  - 0 to 4 prefix bytes modifying the instruction
  - 1 to 3 bytes opcode identifying the instruction
  - 0 to 1 bytes describing the operands (memory/registers), the "ModR/M" byte
  - 0 to 1 bytes of "scaled index byte" (SIB), used for base + index * scale
  - 0, 1, 2 or 4 memory displacement bytes, specify the address offset
  - 0, 1, 2 or 4 immediate bytes, specify a constant value
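As a small illustration of the format above, the byte that describes the operands (commonly called ModR/M) splits into three bit fields. The sketch below decodes the two byte sequence 01 D8 and assumes 32 bit operand size and the register-to-register form only; split_modrm and REG32 are names invented for this example.

```python
# 32 bit register names in x86 encoding order.
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def split_modrm(byte: int) -> tuple:
    """Split the operand byte into its (mod, reg, rm) fields."""
    mod = (byte >> 6) & 0b11   # addressing mode, 0b11 means register operand
    reg = (byte >> 3) & 0b111  # register operand
    rm = byte & 0b111          # register or memory operand
    return mod, reg, rm


code = bytes([0x01, 0xD8])  # opcode 0x01 is ADD r/m32, r32
mod, reg, rm = split_modrm(code[1])
assert mod == 0b11  # this sketch handles only register-to-register forms
print(f"ADD {REG32[rm]}, {REG32[reg]}")  # ADD EAX, EBX
```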
- basic instructions:
    ADD                 add
    ADC                 add with carry
    CALL                call procedure (pushes EIP and jumps)
    DEC                 decrement
    DIV                 unsigned divide
    IDIV                signed divide
    IMUL                signed multiply
    INC                 increment
    JNE, JNZ, JZ, ...   jump if condition (not equal, not zero, zero, ...)
    JMP                 unconditional jump
    MOV                 move (copy data)
    MUL                 unsigned multiply
    NEG                 negation (two's complement)
    NOP                 no operation
    POP                 pop from stack
    PUSH                push onto stack
    ROL                 rotate left
    SHR                 shift right
ARM (advanced RISC machines)
--------------------------------------------------------------------------------
- family of instruction sets (ARMv1, ARMv2, ARMv3, ...)
- mainly for embedded, simple, low energy consumption and heat
- arch.: load-store
- "proprietary"
- fixed instr. length (32b), BUT there is also a Thumb subset that encodes
  instrs. as 16b (smaller code but fewer instructions), and Thumb2 (variable
  instr. size)
- little endian, can be switched to big
- RISC, 232 instructions (~50 mnemonics)
- 32b, 64b
- mostly 1 CPI
- modes:
  - user: unprivileged (can't do certain things)
  - supervisor: privileged
  - undefined: after undefined inst.
  - abort: after memory access violation
  - ...
- doesn't have divide instruction! (older versions; SDIV/UDIV were added later)
- implementations don't use microcode, are often simple without caches etc.
- instruction format:
    | operand  |dst|src||opc||0|co |
    |    2     |reg|reg||ode||0| nd|
    --------........--------........
    (bit 0 on the left, bit 31 on the right)
  Almost all instructions can have a condition.
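The 32 bit data-processing format drawn above can be taken apart with shifts and masks. The sketch below assumes the classic 32 bit ARM encoding (condition in bits 31-28, opcode in bits 24-21, source register in 19-16, destination register in 15-12, operand 2 in 11-0); the function and field names are made up for this example.

```python
def decode_data_processing(word: int) -> dict:
    """Split a 32 bit ARM data-processing instruction into its fields."""
    return {
        "cond": (word >> 28) & 0xF,       # condition code, 0xE means "always"
        "immediate": (word >> 25) & 0x1,  # 1 if operand 2 is an immediate
        "opcode": (word >> 21) & 0xF,     # operation, 0x4 is ADD
        "set_flags": (word >> 20) & 0x1,  # 1 if the flags get updated
        "src_reg": (word >> 16) & 0xF,    # first operand register
        "dst_reg": (word >> 12) & 0xF,    # destination register
        "operand2": word & 0xFFF,         # second operand (register or imm.)
    }


fields = decode_data_processing(0xE0810002)  # encodes ADD R0, R1, R2
print(fields)
```

For this word the condition is 0xE (always), the opcode is 4 (ADD), the source register is R1, the destination is R0 and operand 2 selects R2.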
- registers:
  - all 32 bit
  - general purpose: R0 - R12
  - stack pointer: R13
  - link register: R14 (function return address)
  - program counter: R15
  - flags: CPSR (CPU mode, thumb, endian, zero, carry, ...)
- basic instructions:
    ADC                 add with carry
    ADD                 add
    AND                 and operation
    B, BNE, BEQ, ...    branch if (always, not equal, equal, ...)
    CMP                 compare
    LDR                 load memory to register
    MOV                 move register/constant to register
    MUL                 multiply
    STR                 store register to memory
    SWI                 software interrupt
RISC-V
--------------------------------------------------------------------------------
PowerPC
--------------------------------------------------------------------------------
AVR
--------------------------------------------------------------------------------
- by Atmel, for embedded
- arch.: load-store
- RISC, ~120 instructions
- 8 bit
- Harvard architecture (separate instruction and data memory)
- instruction format:
  - 16 bit (but some which have long addresses are 32 bit)
  - the format differs between instructions (opcode is in different places,
    of different size etc.)
- doesn't have divide instruction!
- registers:
  - most 8 bit
  - general purpose: R0 - R31
  - addressing: X (R27,R26), Y (R29,R28), Z (R31,R30)
  - program counter: PC (16 or 22 bit)
  - stack pointer: SP (8 or 16 bit)
  - flags (status): SREG (carry, zero, negative, overflow, sign, half-carry,
    bit copy, interrupt)
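How the arithmetic SREG flags come out of an 8 bit addition can be sketched like this. It models only the flags affected by ADD (carry, zero, negative, overflow, sign, half-carry) under the usual two's complement rules; avr_add_flags is a hypothetical helper, not part of any AVR toolchain.

```python
def avr_add_flags(a: int, b: int) -> dict:
    """Add two 8 bit values and compute the resulting status flags."""
    total = a + b
    result = total & 0xFF
    negative = result >> 7  # bit 7 of the result
    # Signed overflow: both operands differ in sign from the result.
    overflow = int(((a ^ result) & (b ^ result) & 0x80) != 0)
    return {
        "result": result,
        "C": int(total > 0xFF),                      # carry out of bit 7
        "Z": int(result == 0),                       # result is zero
        "N": negative,
        "V": overflow,
        "S": negative ^ overflow,                    # sign = N xor V
        "H": int((a & 0x0F) + (b & 0x0F) > 0x0F),    # carry out of bit 3
    }


print(avr_add_flags(0x7F, 0x01))
# {'result': 128, 'C': 0, 'Z': 0, 'N': 1, 'V': 1, 'S': 0, 'H': 1}
```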
- basic instructions:
    ADC     add with carry
    ADD     add without carry
    AND     logical and
    BRBC    branch if SREG bit is cleared, jumps if specified SREG bit is 0
    BRGE    branch if >= (signed), branches if sign flag is 0
    BRSH    branch if >= (unsigned), branches if carry flag is 0
    CP      compare, only sets flags
    INC     increment
    JMP     jump (long, to any address)
    LDI     load immediate 8 bit value to register
    LDS     load direct from data space, loads 8 bits from memory
    LPM     load program memory, loads 8 bits from program memory
    MOV     move register to register
    MUL     multiply unsigned (16 bit result)
    MULS    multiply signed (16 bit result)
    MULSU   multiply signed with unsigned (signed 16 bit result)
    NEG     two's complement negation
    NOP     no operation
    SBRC    skip if register bit is 0, conditionally skips next inst.
    SUB     subtract
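Since there is no divide instruction, division has to be done in software using only shifts, compares and subtractions. A minimal sketch of unsigned 8 bit shift-and-subtract division follows; it shows the general technique, not the routine any particular compiler emits, and udiv8 is a name chosen for this example.

```python
def udiv8(dividend: int, divisor: int) -> tuple:
    """Divide two unsigned 8 bit values, return (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for bit in range(7, -1, -1):
        # Shift the next dividend bit (highest first) into the remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        # If the divisor fits, subtract it and set the quotient bit.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


print(udiv8(200, 7))  # (28, 4)
```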
Java bytecode
--------------------------------------------------------------------------------
- stack/register architecture
- 202 opcodes
- in each function there is a stack (arguments, computation, return value) and
  local variable array (same as registers)
- variable instruction size: 1B opcode and 1 to N operands
- has objects
- basic instructions:
    arraylength   pushes length of the array referenced on top of stack
    breakpoint    breakpoint for debuggers
    f2i           converts float on top of stack to int
    goto          jump
    goto_w        longer jump
    iadd          pushes result of addition of 2 ints on stack top
    iand          performs bitwise and on 2 ints on stack top
    iconst_m1     loads -1 on top of stack
    idiv          divides two integers on top of stack
    ifeq          if top of stack is 0, branch to given address
    iload_0       loads int local variable #0 on top of stack
    new           create new object of class of given ID
    newarray      create new array of given length
    nop           no operation
    pop           pop top of stack
    putfield      set given field of given object
    return        return void from function
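The stack plus local variable array model can be illustrated with a toy interpreter for a few of the instructions listed above. This is a heavily simplified sketch (mnemonics as strings, no operands, no objects, no branches), not how a real JVM works; run and to_int32 are names invented here.

```python
def to_int32(value: int) -> int:
    """Wrap a Python integer to a signed 32 bit int, like a Java int."""
    return (value + 2**31) % 2**32 - 2**31


def run(program: list, local_vars: list) -> list:
    """Execute a list of mnemonics and return the final operand stack."""
    stack = []
    for op in program:
        if op == "iconst_m1":
            stack.append(-1)
        elif op == "iload_0":
            stack.append(local_vars[0])
        elif op == "iadd":
            b, a = stack.pop(), stack.pop()
            stack.append(to_int32(a + b))
        elif op == "iand":
            b, a = stack.pop(), stack.pop()
            stack.append(to_int32(a & b))
        elif op == "pop":
            stack.pop()
        elif op == "nop":
            pass
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return stack


# Computes local variable #0 plus -1.
print(run(["iload_0", "iconst_m1", "iadd"], [10]))  # [9]
```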
Python bytecode
--------------------------------------------------------------------------------
- not official, just an implementation detail, and differs between Py versions
- 2 bytes per instruction (1B opcode, 1B argument)
- evaluation stack contains abstract objects just like Python (numbers, lists,
  objects, ...)
- basic instructions:
    BINARY_ADD         adds 2 top stack items and pushes result
    BINARY_MULTIPLY    multiplies 2 top stack items and pushes result
    CALL_FUNCTION      passes N args from stack top and calls func below them
    EXTENDED_ARG       for arguments bigger than 1 byte
    GET_LEN            pushes len() of top of the stack
    JUMP_FORWARD       unconditionally jumps forward by given offset
    LIST_APPEND        appends stack top to a list deeper in the stack
    LOAD_CONST         loads constant on top of stack
    NOP                no operation (placeholder for optimizer)
    POP_JUMP_IF_TRUE   pops and jumps to given address if the value was true
    POP_TOP            pop top of the stack
    RETURN_VALUE       returns value to the caller
    ROT_TWO            swaps two top items in the stack
    UNARY_NOT          logically negates (not) top of the stack
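The bytecode of a function can be inspected with the standard dis module. Because the instruction set differs between CPython versions (for example, newer versions use a generic BINARY_OP where older ones used BINARY_ADD), no expected output is shown; add_then_scale is just a sample function for this sketch.

```python
import dis


def add_then_scale(a, b):
    return (a + b) * 2


# Collect the opcode names of the compiled function, in order.
names = [ins.opname for ins in dis.get_instructions(add_then_scale)]
for name in names:
    print(name)
```

Calling dis.dis(add_then_scale) instead prints a full listing with offsets and arguments.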
LLVM
--------------------------------------------------------------------------------
- intermediate representation for compilers
- RISC-like
- strongly typed
- kind of bloat, many "features"
- abstracts things like calling conventions and modules (but programs
  compiled to this from languages may not be 100% target-independent because
  of things like sizeof())
- registers:
  - infinitely many tmp. registers (%0, %1, ...)
- basic instructions:
    add      add integer numbers of identical types
    br       branch within function (either conditional or not)
    alloca   allocates stack memory (and auto deallocates later)
    call     call a function
    fadd     add float numbers
    icmp     compares integers, returns a boolean (i1) result
    mul      multiply two numbers of same type (gives same type)
    ret      return from function
    switch   switch (like in C)