Memory Protection Keys for Userspace (PKU aka PKEYs) is a CPU feature
which will be found on future Intel CPUs.

Memory Protection Keys provides a mechanism for enforcing page-based
protections, but without requiring modification of the page tables
when an application changes protection domains.  It works by
dedicating 4 previously ignored bits in each page table entry to a
"protection key", giving 16 possible keys.

There is also a new user-accessible register (PKRU) with two separate
bits (Access Disable and Write Disable) for each key.  Being a CPU
register, PKRU is inherently thread-local, potentially giving each
thread a different set of protections from every other thread.

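The relationship between the key stored in a page table entry and the
two per-key bits in PKRU can be sketched in plain C.  This is only an
illustration of the bit arithmetic, not kernel code: the helper names
(pte_pkey, pkru_rights, pkey_allows) are hypothetical, and the sketch
assumes the key sits in PTE bits 62:59 and that key N owns PKRU bits
2N (Access Disable) and 2N+1 (Write Disable).

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-key rights bits; the values match PKEY_DISABLE_ACCESS and
 * PKEY_DISABLE_WRITE in the Linux UAPI headers. */
#define PKRU_AD 0x1u	/* Access Disable: no data reads or writes */
#define PKRU_WD 0x2u	/* Write Disable: reads allowed, writes denied */

/* Assumption: the 4-bit protection key occupies PTE bits 62:59. */
static unsigned pte_pkey(uint64_t pte)
{
	return (unsigned)((pte >> 59) & 0xF);
}

/* Return the two rights bits that PKRU holds for one key (0..15). */
static unsigned pkru_rights(uint32_t pkru, unsigned pkey)
{
	return (pkru >> (2 * pkey)) & 0x3u;
}

/* Would the protection key alone permit this user-mode data access?
 * The ordinary page-table permissions are checked separately. */
static bool pkey_allows(uint64_t pte, uint32_t pkru, bool is_write)
{
	unsigned rights = pkru_rights(pkru, pte_pkey(pte));

	if (rights & PKRU_AD)
		return false;
	if (is_write && (rights & PKRU_WD))
		return false;
	return true;
}
```

For example, with a PKRU value of 0x8 (Write Disable set for key 1), a
page tagged with key 1 can still be read but not written, while pages
tagged with any other key are unaffected.
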
There are two new instructions (RDPKRU/WRPKRU) for reading and writing
to the new register.  The feature is only available in 64-bit mode,
even though there is theoretically space in the PAE PTEs.  These
permissions are enforced on data access only and have no effect on
instruction fetches.

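A minimal sketch of how userspace might wrap the two instructions is
shown here.  It is an illustration under stated assumptions, not the
selftest's code: the function names (pku_enabled, rdpkru, wrpkru,
pkru_with_rights, pkey_set_sketch) are hypothetical, the instructions
are emitted as raw opcode bytes so that older assemblers accept them,
and the CPUID check assumes the OSPKE flag is reported in
CPUID.(EAX=7,ECX=0):ECX bit 4.  On other architectures the wrappers
compile to stubs.

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <cpuid.h>

/* Nonzero when the OS has enabled protection keys (OSPKE).  RDPKRU
 * and WRPKRU fault with an invalid-opcode exception otherwise. */
static int pku_enabled(void)
{
	unsigned eax, ebx, ecx, edx;

	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 0;
	return (ecx >> 4) & 1;
}

/* RDPKRU (0f 01 ee): requires ECX = 0, returns PKRU in EAX. */
static uint32_t rdpkru(void)
{
	uint32_t eax, edx;

	__asm__ volatile(".byte 0x0f,0x01,0xee"
			 : "=a"(eax), "=d"(edx) : "c"(0));
	(void)edx;
	return eax;
}

/* WRPKRU (0f 01 ef): requires ECX = EDX = 0, loads PKRU from EAX. */
static void wrpkru(uint32_t pkru)
{
	__asm__ volatile(".byte 0x0f,0x01,0xef"
			 : : "a"(pkru), "c"(0), "d"(0) : "memory");
}
#else
static int pku_enabled(void) { return 0; }
static uint32_t rdpkru(void) { return 0; }
static void wrpkru(uint32_t pkru) { (void)pkru; }
#endif

/* Pure helper: replace the two rights bits of one key in a PKRU value. */
static uint32_t pkru_with_rights(uint32_t pkru, int pkey, uint32_t rights)
{
	int shift = 2 * pkey;

	pkru &= ~(0x3u << shift);
	pkru |= (rights & 0x3u) << shift;
	return pkru;
}

/* Read-modify-write of PKRU for one key; -1 if pkeys are unusable. */
static int pkey_set_sketch(int pkey, uint32_t rights)
{
	if (pkey < 0 || pkey > 15 || !pku_enabled())
		return -1;
	wrpkru(pkru_with_rights(rdpkru(), pkey, rights));
	return 0;
}
```

Keeping the bit manipulation in a separate pure function makes it easy
to check without PKU hardware; only the final read-modify-write touches
the register.
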
=========================== Syscalls ===========================

There are 3 system calls which directly interact with pkeys:

	int pkey_alloc(unsigned long flags, unsigned long init_access_rights);
	int pkey_free(int pkey);
	int pkey_mprotect(unsigned long start, size_t len,
			  unsigned long prot, int pkey);

Before a pkey can be used, it must first be allocated with
pkey_alloc().  An application executes the WRPKRU instruction
directly in order to change access permissions to memory covered
with a key.  In this example WRPKRU is wrapped by a C function
called pkey_set().

	int real_prot = PROT_READ|PROT_WRITE;
	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
	ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
	ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);
	... application runs here

Now, if the application needs to update the data at 'ptr', it can
gain access, do the update, then remove its write access:

	pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
	*ptr = foo; // assign something
	pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again

When the application frees the memory, it should also free the pkey,
since the key is no longer in use:

	munmap(ptr, PAGE_SIZE);
	pkey_free(pkey);

(Note: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
 An example implementation can be found in
 tools/testing/selftests/x86/protection_keys.c)

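The sequence above can be put together into one function that degrades
gracefully on systems without pkeys.  This is a hedged sketch rather
than the selftest's implementation: pkey_demo and pkey_set_sketch are
hypothetical names, the three system calls are invoked through
syscall(2) on the assumption that the installed headers define
SYS_pkey_alloc, and the fallback definition of PKEY_DISABLE_WRITE
assumes the UAPI value 0x2.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef PKEY_DISABLE_WRITE
#define PKEY_DISABLE_WRITE 0x2	/* assumed value from the UAPI headers */
#endif

#if defined(__x86_64__) && defined(SYS_pkey_alloc)
/* Hypothetical stand-in for pkey_set(): rewrite the two PKRU bits of
 * one key with RDPKRU (0f 01 ee) followed by WRPKRU (0f 01 ef). */
static void pkey_set_sketch(int pkey, uint32_t rights)
{
	uint32_t eax, edx;

	__asm__ volatile(".byte 0x0f,0x01,0xee"
			 : "=a"(eax), "=d"(edx) : "c"(0));
	(void)edx;
	eax = (eax & ~(0x3u << (2 * pkey))) | ((rights & 0x3u) << (2 * pkey));
	__asm__ volatile(".byte 0x0f,0x01,0xef"
			 : : "a"(eax), "c"(0), "d"(0) : "memory");
}

/* Returns 0 on success, 1 if no pkey could be allocated, -1 on error. */
static int pkey_demo(void)
{
	long page = sysconf(_SC_PAGESIZE);
	int ret = -1;
	int pkey = (int)syscall(SYS_pkey_alloc, 0UL,
				(unsigned long)PKEY_DISABLE_WRITE);

	if (pkey < 0)
		return 1;	/* no hardware or kernel support, or no free key */

	volatile int *ptr = mmap(NULL, (size_t)page, PROT_NONE,
				 MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (ptr != MAP_FAILED) {
		if (syscall(SYS_pkey_mprotect, ptr, (size_t)page,
			    (unsigned long)(PROT_READ | PROT_WRITE), pkey) == 0) {
			pkey_set_sketch(pkey, 0);		   /* allow writes */
			*ptr = 42;
			pkey_set_sketch(pkey, PKEY_DISABLE_WRITE); /* deny writes */
			ret = (*ptr == 42) ? 0 : -1;		   /* reads still work */
		}
		munmap((void *)ptr, (size_t)page);
	}
	syscall(SYS_pkey_free, pkey);
	return ret;
}
#else
/* Other architectures or old headers: report "pkeys unavailable". */
static int pkey_demo(void) { return 1; }
#endif
```

A successful pkey_alloc() implies that the kernel has enabled the
feature, which is why the sketch executes RDPKRU and WRPKRU only after
that call succeeds.  Recent C libraries may also provide their own
pkey_alloc(), pkey_mprotect(), pkey_set() and pkey_free() wrappers; the
raw system calls are used here only to keep the example independent of
them.
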
=========================== Behavior ===========================

The kernel attempts to make protection keys consistent with the
behavior of a plain mprotect().  For instance if you do this:

	mprotect(ptr, size, PROT_NONE);
	something(ptr);

you can expect the same effects with protection keys when doing this:

	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE | PKEY_DISABLE_ACCESS);
	pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
	something(ptr);

That should be true whether something() is a direct access to 'ptr'
like:

	*ptr = foo;

or when the kernel does the access on the application's behalf like
with a read():

	read(fd, ptr, 1);

The kernel will send a SIGSEGV in both cases, but si_code will be set
to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
the plain mprotect() permissions are violated.

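An application can tell the two causes apart in a SIGSEGV handler
installed with SA_SIGINFO.  The sketch below covers only the direct
access case and demonstrates it with a plain PROT_NONE mapping, so it
runs without PKU hardware; all function names are hypothetical, and the
fallback definition of SEGV_PKERR assumes the UAPI value 4.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef SEGV_PKERR
#define SEGV_PKERR 4	/* assumed value from the UAPI headers */
#endif

static sigjmp_buf fault_jmp;
static volatile sig_atomic_t fault_code;

/* Record why the fault happened, then jump back past the bad access. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	fault_code = info->si_code;
	siglongjmp(fault_jmp, 1);
}

/* Write one byte to 'p'; return 0 if it worked, else the si_code. */
static int probe_write(volatile char *p)
{
	struct sigaction sa = {0}, old;

	sa.sa_sigaction = on_segv;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_SIGINFO;
	if (sigaction(SIGSEGV, &sa, &old) != 0)
		return -1;

	fault_code = 0;
	if (sigsetjmp(fault_jmp, 1) == 0)
		*p = 1;
	sigaction(SIGSEGV, &old, NULL);	/* restore the previous handler */
	return fault_code;
}

/* Map si_code to the two causes described in the text. */
static const char *segv_reason(int si_code)
{
	if (si_code == SEGV_PKERR)
		return "protection key violation";
	if (si_code == SEGV_ACCERR)
		return "page permission violation";
	return "other";
}

/* Fault on a PROT_NONE page and return the reported si_code. */
static int demo_prot_none_fault(void)
{
	long page = sysconf(_SC_PAGESIZE);
	char *p = mmap(NULL, (size_t)page, PROT_NONE,
		       MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	int code;

	if (p == MAP_FAILED)
		return -1;
	code = probe_write(p);
	munmap(p, (size_t)page);
	return code;
}
```

On a PKU-capable system, calling probe_write() on a page whose key has
Write Disable set in PKRU would be expected to return SEGV_PKERR
instead of SEGV_ACCERR.
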