=====================
DRM Memory Management
=====================

Modern Linux systems require large amounts of graphics memory to store
frame buffers, textures, vertices and other graphics-related data. Given
the very dynamic nature of much of that data, managing graphics memory
efficiently is crucial for the graphics stack and plays a central
role in the DRM infrastructure.

The DRM core includes two memory managers, namely the Translation Table
Manager (TTM) and the Graphics Execution Manager (GEM). TTM was the first
DRM memory manager to be developed and tried to be a one-size-fits-all
solution. It provides a single userspace API to accommodate the needs of
all hardware, supporting both Unified Memory Architecture (UMA) devices
and devices with dedicated video RAM (i.e. most discrete video cards).
This resulted in a large, complex piece of code that turned out to be
hard to use for driver development.

GEM started as an Intel-sponsored project in reaction to TTM's
complexity. Its design philosophy is completely different: instead of
providing a solution to every graphics memory-related problem, GEM
identified common code between drivers and created a support library to
share it. GEM has simpler initialization and execution requirements than
TTM, but has no video RAM management capabilities and is thus limited to
UMA devices.

The Translation Table Manager (TTM)
===================================

TTM design background and information belongs here.

TTM initialization
------------------

**Warning**

This section is outdated.

Drivers wishing to support TTM must pass a filled :c:type:`ttm_bo_driver
<ttm_bo_driver>` structure to ttm_bo_device_init, together with an
initialized global reference to the memory manager. The ttm_bo_driver
structure contains several fields with function pointers for
initializing the TTM, allocating and freeing memory, waiting for command
completion and fence synchronization, and memory migration.

The :c:type:`struct drm_global_reference <drm_global_reference>` is made
up of several fields:

.. code-block:: c

    struct drm_global_reference {
            enum ttm_global_types global_type;
            size_t size;
            void *object;
            int (*init) (struct drm_global_reference *);
            void (*release) (struct drm_global_reference *);
    };

There should be one global reference structure for your memory manager
as a whole, and there will be others for each object created by the
memory manager at runtime. Your global TTM should have a type of
TTM_GLOBAL_TTM_MEM. The size field for the global object should be
sizeof(struct ttm_mem_global), and the init and release hooks should
point at your driver-specific init and release routines, which probably
eventually call ttm_mem_global_init and ttm_mem_global_release,
respectively.

Once your global TTM accounting structure is set up and initialized by
calling ttm_global_item_ref() on it, you need to create a buffer
object TTM to provide a pool for buffer object allocation by clients and
the kernel itself. The type of this object should be
TTM_GLOBAL_TTM_BO, and its size should be sizeof(struct
ttm_bo_global). Again, driver-specific init and release functions may
be provided, likely eventually calling ttm_bo_global_init() and
ttm_bo_global_release(), respectively. Also, like the previous
object, ttm_global_item_ref() is used to create an initial reference
count for the TTM, which will call your initialization function.

See the radeon_ttm.c file for an example of usage.

.. kernel-doc:: drivers/gpu/drm/drm_global.c
   :export:

The Graphics Execution Manager (GEM)
====================================

The GEM design approach has resulted in a memory manager that doesn't
provide full coverage of all (or even all common) use cases in its
userspace or kernel API. GEM exposes a set of standard memory-related
operations to userspace and a set of helper functions to drivers, and
lets drivers implement hardware-specific operations with their own
private API.

The GEM userspace API is described in the `GEM - the Graphics Execution
Manager <http://lwn.net/Articles/283798/>`__ article on LWN. While
slightly outdated, the document provides a good overview of the GEM API
principles. Buffer allocation and read and write operations, described
as part of the common GEM API, are currently implemented using
driver-specific ioctls.

GEM is data-agnostic. It manages abstract buffer objects without knowing
what individual buffers contain. APIs that require knowledge of buffer
contents or purpose, such as buffer allocation or synchronization
primitives, are thus outside of the scope of GEM and must be implemented
using driver-specific ioctls.

On a fundamental level, GEM involves several operations:

-  Memory allocation and freeing
-  Command execution
-  Aperture management at command execution time

Buffer object allocation is relatively straightforward and largely
provided by Linux's shmem layer, which provides memory to back each
object.

Device-specific operations, such as command execution, pinning, buffer
read & write, mapping, and domain ownership transfers are left to
driver-specific ioctls.

GEM Initialization
------------------

Drivers that use GEM must set the DRIVER_GEM bit in the
:c:type:`struct drm_driver <drm_driver>` driver_features
field. The DRM core will then automatically initialize the GEM core
before calling the load operation. Behind the scenes, this will create a
DRM Memory Manager object which provides an address space pool for
object allocation.

In a KMS configuration, drivers need to allocate and initialize a
command ring buffer following core GEM initialization if required by the
hardware. UMA devices usually have what is called a "stolen" memory
region, which provides space for the initial framebuffer and large,
contiguous memory regions required by the device. This space is
typically not managed by GEM, and must be initialized separately into
its own DRM MM object.

GEM Objects Creation
--------------------

GEM splits creation of GEM objects and allocation of the memory that
backs them into two distinct operations.

GEM objects are represented by an instance of :c:type:`struct
drm_gem_object <drm_gem_object>`. Drivers usually need to
extend GEM objects with private information and thus create a
driver-specific GEM object structure type that embeds an instance of
:c:type:`struct drm_gem_object <drm_gem_object>`.

To create a GEM object, a driver allocates memory for an instance of its
specific GEM object type and initializes the embedded
:c:type:`struct drm_gem_object <drm_gem_object>` with a call
to :c:func:`drm_gem_object_init()`. The function takes a pointer
to the DRM device, a pointer to the GEM object and the buffer object
size in bytes.

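The embedding pattern described above can be illustrated outside the kernel.
The following is only a userspace sketch, not driver code:
``struct gem_object_model`` stands in for struct drm_gem_object,
``gem_object_model_init()`` plays the role of
:c:func:`drm_gem_object_init()` without the DRM device argument, and
``struct foo_gem_object`` and its ``tiling_mode`` field are hypothetical
names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct drm_gem_object; only what this sketch needs. */
struct gem_object_model {
	size_t size;		/* buffer object size in bytes */
	int initialized;	/* set once the base object is initialized */
};

/* Hypothetical driver-specific object embedding the base object. */
struct foo_gem_object {
	struct gem_object_model base;
	unsigned int tiling_mode;	/* example of driver-private state */
};

/* Same idea as the kernel's container_of(): recover the outer structure. */
#define to_foo_gem_object(ptr) \
	((struct foo_gem_object *)((char *)(ptr) - \
				   offsetof(struct foo_gem_object, base)))

/* Plays the role of drm_gem_object_init() in this model. */
static int gem_object_model_init(struct gem_object_model *obj, size_t size)
{
	assert(obj != NULL);
	if (size == 0)
		return -1;
	obj->size = size;
	obj->initialized = 1;
	return 0;
}

/* Allocate the driver object, then initialize the embedded base object. */
static struct foo_gem_object *foo_gem_create(size_t size,
					     unsigned int tiling_mode)
{
	struct foo_gem_object *fobj = calloc(1, sizeof(*fobj));

	if (!fobj)
		return NULL;
	if (gem_object_model_init(&fobj->base, size) != 0) {
		free(fobj);
		return NULL;
	}
	fobj->tiling_mode = tiling_mode;
	return fobj;
}
```

Code that only receives a pointer to the base object, as GEM core callbacks
do, can get back to the driver structure with ``to_foo_gem_object()``.
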
GEM uses shmem to allocate anonymous pageable memory.
:c:func:`drm_gem_object_init()` will create an shmfs file of the
requested size and store it into the :c:type:`struct
drm_gem_object <drm_gem_object>` filp field. The memory is
used as either main storage for the object when the graphics hardware
uses system memory directly or as a backing store otherwise.

Drivers are responsible for allocating the actual physical pages by
calling :c:func:`shmem_read_mapping_page_gfp()` for each page.
Note that they can decide to allocate pages when initializing the GEM
object, or to delay allocation until the memory is needed (for instance
when a page fault occurs as a result of a userspace memory access or
when the driver needs to start a DMA transfer involving the memory).

Anonymous pageable memory allocation is not always desired, for instance
when the hardware requires physically contiguous system memory as is
often the case in embedded devices. Drivers can create GEM objects with
no shmfs backing (called private GEM objects) by initializing them with
a call to :c:func:`drm_gem_private_object_init()` instead of
:c:func:`drm_gem_object_init()`. Storage for private GEM objects
must be managed by drivers.

GEM Objects Lifetime
--------------------

All GEM objects are reference-counted by the GEM core. References can be
acquired and released by calling :c:func:`drm_gem_object_get()` and
:c:func:`drm_gem_object_put()` respectively. The caller must hold the
:c:type:`struct drm_device <drm_device>` struct_mutex lock when calling
:c:func:`drm_gem_object_put()`. As a convenience, GEM provides the
:c:func:`drm_gem_object_put_unlocked()` function, which can be called
without holding the lock.

When the last reference to a GEM object is released the GEM core calls
the :c:type:`struct drm_driver <drm_driver>` gem_free_object_unlocked
operation. That operation is mandatory for GEM-enabled drivers and must
free the GEM object and all associated resources.

``void (*gem_free_object_unlocked)(struct drm_gem_object *obj);``

Drivers are responsible for freeing all GEM object resources. This
includes the resources created by the GEM core, which need to be released
with :c:func:`drm_gem_object_release()`.

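The reference-counting rules above can be modelled in a few lines of plain C.
This is a single-threaded userspace sketch under stated assumptions: the
``model_*`` names are hypothetical, a plain counter replaces the kernel's
reference counter, and no locking is shown. ``model_free_object()`` stands in
for the driver's free operation, which in a real driver would also call
:c:func:`drm_gem_object_release()`.

```c
#include <assert.h>
#include <stdlib.h>

struct gem_object_model;

/* Stand-in for the driver's free operation. */
typedef void (*model_free_fn)(struct gem_object_model *obj);

struct gem_object_model {
	unsigned int refcount;
	model_free_fn free_object;
};

/* Counts how many objects the free hook has destroyed. */
static int objects_freed;

static void model_free_object(struct gem_object_model *obj)
{
	/* A real driver would release GEM core resources here as well. */
	objects_freed++;
	free(obj);
}

static struct gem_object_model *model_create(void)
{
	struct gem_object_model *obj = calloc(1, sizeof(*obj));

	if (!obj)
		return NULL;
	obj->refcount = 1;	/* initial reference, owned by the creator */
	obj->free_object = model_free_object;
	return obj;
}

/* Acquire a reference. */
static void model_get(struct gem_object_model *obj)
{
	assert(obj->refcount > 0);
	obj->refcount++;
}

/* Release a reference; the free hook runs when the last one goes away. */
static void model_put(struct gem_object_model *obj)
{
	assert(obj->refcount > 0);
	if (--obj->refcount == 0)
		obj->free_object(obj);
}
```

The object is destroyed only by the final ``model_put()``, so every holder of
a pointer must own a reference for as long as it uses the object.
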
GEM Objects Naming
------------------

Communication between userspace and the kernel refers to GEM objects
using local handles, global names or, more recently, file descriptors.
All of those are 32-bit integer values; the usual Linux kernel limits
apply to the file descriptors.

GEM handles are local to a DRM file. Applications get a handle to a GEM
object through a driver-specific ioctl, and can use that handle to refer
to the GEM object in other standard or driver-specific ioctls. Closing a
DRM file handle frees all its GEM handles and dereferences the
associated GEM objects.

To create a handle for a GEM object drivers call
:c:func:`drm_gem_handle_create()`. The function takes a pointer
to the DRM file and the GEM object and returns a locally unique handle.
When the handle is no longer needed drivers delete it with a call to
:c:func:`drm_gem_handle_delete()`. Finally the GEM object
associated with a handle can be retrieved by a call to
:c:func:`drm_gem_object_lookup()`.

Handles don't take ownership of GEM objects; they only take a reference
to the object that will be dropped when the handle is destroyed. To
avoid leaking GEM objects, drivers must make sure they drop the
reference(s) they own (such as the initial reference taken at object
creation time) as appropriate, without any special consideration for the
handle. For example, in the particular case of combined GEM object and
handle creation in the implementation of the dumb_create operation,
drivers must drop the initial reference to the GEM object before
returning the handle.

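The interaction between handles and references in a dumb_create-style path
can be sketched as follows. This is a userspace model, not kernel code: the
``model_*`` names, the fixed-size table and the choice to reserve handle 0 as
invalid are assumptions of the sketch. ``model_handle_create()`` and
``model_handle_delete()`` play the roles of
:c:func:`drm_gem_handle_create()` and :c:func:`drm_gem_handle_delete()`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_MAX_HANDLES 8

struct gem_object_model {
	unsigned int refcount;
};

/* Stand-in for a DRM file: a table mapping handles to objects. */
struct file_model {
	/* Slot 0 stays unused so that handle 0 is never valid here. */
	struct gem_object_model *objects[MODEL_MAX_HANDLES + 1];
};

/* Counts destroyed objects so the effect of the last put is visible. */
static int objects_freed;

static void model_put(struct gem_object_model *obj)
{
	assert(obj->refcount > 0);
	if (--obj->refcount == 0) {
		objects_freed++;
		free(obj);
	}
}

/* The new handle takes its own reference to the object. */
static int model_handle_create(struct file_model *file,
			       struct gem_object_model *obj, uint32_t *handle)
{
	for (uint32_t h = 1; h <= MODEL_MAX_HANDLES; h++) {
		if (!file->objects[h]) {
			file->objects[h] = obj;
			obj->refcount++;
			*handle = h;
			return 0;
		}
	}
	return -1;	/* table full */
}

/* Deleting the handle drops the reference the handle was holding. */
static int model_handle_delete(struct file_model *file, uint32_t handle)
{
	struct gem_object_model *obj;

	if (handle == 0 || handle > MODEL_MAX_HANDLES || !file->objects[handle])
		return -1;
	obj = file->objects[handle];
	file->objects[handle] = NULL;
	model_put(obj);
	return 0;
}

/* Combined object and handle creation, as in a dumb_create operation. */
static int model_dumb_create(struct file_model *file, uint32_t *handle)
{
	struct gem_object_model *obj = calloc(1, sizeof(*obj));
	int ret;

	if (!obj)
		return -1;
	obj->refcount = 1;	/* initial reference from creation */
	ret = model_handle_create(file, obj, handle);
	/*
	 * Drop the initial reference in both cases: on success the handle
	 * keeps the object alive, on failure this frees the object.
	 */
	model_put(obj);
	return ret;
}
```

After ``model_dumb_create()`` returns successfully, the handle holds the only
reference, so deleting the handle is what finally frees the object.
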
GEM names are similar in purpose to handles but are not local to DRM
files. They can be passed between processes to reference a GEM object
globally. Names can't be used directly to refer to objects in the DRM
API; applications must convert handles to names and names to handles
using the DRM_IOCTL_GEM_FLINK and DRM_IOCTL_GEM_OPEN ioctls
respectively. The conversion is handled by the DRM core without any
driver-specific support.

GEM also supports buffer sharing with dma-buf file descriptors through
PRIME. GEM-based drivers must use the provided helper functions to
implement the exporting and importing correctly. See `PRIME Buffer
Sharing`_. Since sharing file descriptors is inherently more secure than
the easily guessable and global GEM names, it is the preferred buffer
sharing mechanism. Sharing buffers through GEM names is only supported
for legacy userspace. Furthermore, PRIME also allows cross-device buffer
sharing since it is based on dma-bufs.

GEM Objects Mapping
-------------------

Because mapping operations are fairly heavyweight, GEM favours
read/write-like access to buffers, implemented through driver-specific
ioctls, over mapping buffers to userspace. However, when random access
to the buffer is needed (to perform software rendering for instance),
direct access to the object can be more efficient.

The mmap system call can't be used directly to map GEM objects, as they
don't have their own file handle. Two alternative methods currently
co-exist to map GEM objects to userspace. The first method uses a
driver-specific ioctl to perform the mapping operation, calling
:c:func:`do_mmap()` under the hood. This is often considered
dubious, seems to be discouraged for new GEM-enabled drivers, and will
thus not be described here.

The second method uses the mmap system call on the DRM file handle.

``void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);``

DRM identifies the GEM object to be mapped by a fake offset
passed through the mmap offset argument. Prior to being mapped, a GEM
object must thus be associated with a fake offset. To do so, drivers
must call :c:func:`drm_gem_create_mmap_offset()` on the object.

Once allocated, the fake offset value must be passed to the application
in a driver-specific way and can then be used as the mmap offset
argument.

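The fake offset scheme can be pictured as a small allocator that hands out
page-aligned ranges in an otherwise unused offset space and later resolves an
offset back to its object. The following userspace sketch makes several
assumptions that are not taken from the DRM core: the ``model_*`` names, the
4 KiB page size, the starting page number and the fixed-size table are all
hypothetical. ``model_create_mmap_offset()`` plays the role of
:c:func:`drm_gem_create_mmap_offset()`, and ``model_lookup()`` models what the
mmap handler does with the offset argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT	12		/* assumes 4 KiB pages */
#define MODEL_OFFSET_START	0x100000ULL	/* arbitrary first fake page */
#define MODEL_MAX_OBJECTS	4

struct gem_object_model {
	size_t size;		/* object size in bytes */
	uint64_t fake_offset;	/* 0 until an mmap offset has been created */
};

struct offset_manager_model {
	struct gem_object_model *objects[MODEL_MAX_OBJECTS];
	size_t count;
	uint64_t next_page;	/* next free fake page number */
};

static void model_manager_init(struct offset_manager_model *mgr)
{
	mgr->count = 0;
	mgr->next_page = MODEL_OFFSET_START;
}

/* Reserve a page-aligned fake range large enough to cover the object. */
static int model_create_mmap_offset(struct offset_manager_model *mgr,
				    struct gem_object_model *obj)
{
	uint64_t pages = ((uint64_t)obj->size + (1ULL << MODEL_PAGE_SHIFT) - 1)
			 >> MODEL_PAGE_SHIFT;

	if (obj->fake_offset)
		return 0;	/* an offset was already created */
	if (pages == 0 || mgr->count == MODEL_MAX_OBJECTS)
		return -1;
	obj->fake_offset = mgr->next_page << MODEL_PAGE_SHIFT;
	mgr->next_page += pages;
	mgr->objects[mgr->count++] = obj;
	return 0;
}

/* Resolve an mmap offset to the object whose fake range starts there. */
static struct gem_object_model *
model_lookup(const struct offset_manager_model *mgr, uint64_t offset)
{
	for (size_t i = 0; i < mgr->count; i++) {
		if (mgr->objects[i]->fake_offset == offset)
			return mgr->objects[i];
	}
	return NULL;
}
```

The offsets carry no meaning beyond identifying an object, which is why the
value has to be handed to the application before it can call mmap.
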
The GEM core provides a helper method :c:func:`drm_gem_mmap()` to
handle object mapping. The method can be set directly as the mmap file
operation handler. It will look up the GEM object based on the offset
value and set the VMA operations to the :c:type:`struct drm_driver
<drm_driver>` gem_vm_ops field. Note that
:c:func:`drm_gem_mmap()` doesn't map memory to userspace, but
relies on the driver-provided fault handler to map pages individually.

To use :c:func:`drm_gem_mmap()`, drivers must fill the
:c:type:`struct drm_driver <drm_driver>` gem_vm_ops field
with a pointer to VM operations.

The VM operations are described by a :c:type:`struct vm_operations_struct
<vm_operations_struct>` made up of several fields, the more interesting
ones being:

.. code-block:: c

    struct vm_operations_struct {
            void (*open)(struct vm_area_struct * area);
            void (*close)(struct vm_area_struct * area);
            int (*fault)(struct vm_fault *vmf);
    };

The open and close operations must update the GEM object reference
count. Drivers can use the :c:func:`drm_gem_vm_open()` and
:c:func:`drm_gem_vm_close()` helper functions directly as open
and close handlers.

The fault operation handler is responsible for mapping individual pages
to userspace when a page fault occurs. Depending on the memory
allocation scheme, drivers can allocate pages at fault time, or can
decide to allocate memory for the GEM object at the time the object is
created.

Drivers that want to map the GEM object upfront instead of handling page
faults can implement their own mmap file operation handler.

For platforms without an MMU the GEM core provides a helper method
:c:func:`drm_gem_cma_get_unmapped_area`. The mmap() routines will call
this to get a proposed address for the mapping.

To use :c:func:`drm_gem_cma_get_unmapped_area`, drivers must fill the
:c:type:`struct file_operations <file_operations>` get_unmapped_area
field with a pointer to :c:func:`drm_gem_cma_get_unmapped_area`.

More detailed information about get_unmapped_area can be found in
Documentation/nommu-mmap.txt

Memory Coherency
----------------

When mapped to the device or used in a command buffer, backing pages for
an object are flushed to memory and marked write combined so as to be
coherent with the GPU. Likewise, if the CPU accesses an object after the
GPU has finished rendering to the object, then the object must be made
coherent with the CPU's view of memory, usually involving GPU cache
flushing of various kinds. This core CPU<->GPU coherency management is
provided by a device-specific ioctl, which evaluates an object's current
domain and performs any necessary flushing or synchronization to put the
object into the desired coherency domain (note that the object may be
busy, i.e. an active render target; in that case, setting the domain
blocks the client and waits for rendering to complete before performing
any necessary flushing operations).

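The domain transition described above can be reduced to a tiny state machine.
The sketch below is a toy userspace model and does not reflect any particular
driver's ioctl: the ``model_*`` names, the two-domain enum and the counters
are hypothetical, and the counters merely record where a real driver would
wait for rendering or flush caches.

```c
#include <assert.h>

enum model_domain {
	MODEL_DOMAIN_CPU,
	MODEL_DOMAIN_GPU,
};

struct gem_object_model {
	enum model_domain domain;	/* current coherency domain */
	int busy;			/* non-zero while the GPU renders to it */
	int waits;			/* times a caller had to wait */
	int cpu_flushes;		/* CPU cache flushes performed */
	int gpu_flushes;		/* GPU cache flushes performed */
};

/* Move the object into the requested domain, flushing only if needed. */
static void model_set_domain(struct gem_object_model *obj,
			     enum model_domain target)
{
	/* A busy object blocks the caller until rendering has completed. */
	if (obj->busy) {
		obj->waits++;
		obj->busy = 0;
	}

	if (obj->domain == target)
		return;		/* already coherent, nothing to flush */

	if (target == MODEL_DOMAIN_GPU)
		obj->cpu_flushes++;	/* make CPU writes visible to the GPU */
	else
		obj->gpu_flushes++;	/* make GPU writes visible to the CPU */

	obj->domain = target;
	assert(obj->domain == target);
}
```

The point of tracking the current domain is that repeated requests for the
same domain cost nothing, while a change of domain pays for exactly one flush.
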
Command Execution
-----------------

Perhaps the most important GEM function for GPU devices is providing a
command execution interface to clients. Client programs construct
command buffers containing references to previously allocated memory
objects, and then submit them to GEM. At that point, GEM takes care to
bind all the objects into the GTT, execute the buffer, and provide
necessary synchronization between clients accessing the same buffers.
This often involves evicting some objects from the GTT and re-binding
others (a fairly expensive operation), and providing relocation support
which hides fixed GTT offsets from clients. Clients must take care not
to submit command buffers that reference more objects than can fit in
the GTT; otherwise, GEM will reject them and no rendering will occur.
Similarly, if several objects in the buffer require fence registers to
be allocated for correct rendering (e.g. 2D blits on pre-965 chips),
care must be taken not to require more fence registers than are
available to the client. Such resource management should be abstracted
from the client in libdrm.

GEM Function Reference
----------------------

.. kernel-doc:: include/drm/drm_gem.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_gem.c
   :export:

GEM CMA Helper Functions Reference
----------------------------------

.. kernel-doc:: drivers/gpu/drm/drm_gem_cma_helper.c
   :doc: cma helpers

.. kernel-doc:: include/drm/drm_gem_cma_helper.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_gem_cma_helper.c
   :export:

VMA Offset Manager
==================

.. kernel-doc:: drivers/gpu/drm/drm_vma_manager.c
   :doc: vma offset manager

.. kernel-doc:: include/drm/drm_vma_manager.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_vma_manager.c
   :export:

PRIME Buffer Sharing
====================

PRIME is the cross-device buffer sharing framework in DRM, originally
created for the OPTIMUS range of multi-GPU platforms. To userspace, PRIME
buffers are dma-buf based file descriptors.

Overview and Driver Interface
-----------------------------

Similar to GEM global names, PRIME file descriptors are also used to
share buffer objects across processes. They offer additional security:
as file descriptors must be explicitly sent over UNIX domain sockets to
be shared between applications, they can't be guessed like the globally
unique GEM names.

Drivers that support the PRIME API must set the DRIVER_PRIME bit in the
:c:type:`struct drm_driver <drm_driver>`
driver_features field, and implement the prime_handle_to_fd and
prime_fd_to_handle operations.

``int (*prime_handle_to_fd)(struct drm_device *dev, struct drm_file *file_priv, uint32_t handle, uint32_t flags, int *prime_fd);``

``int (*prime_fd_to_handle)(struct drm_device *dev, struct drm_file *file_priv, int prime_fd, uint32_t *handle);``

Those two operations convert a handle to a PRIME file descriptor and
vice versa. Drivers must use the kernel dma-buf buffer sharing framework
to manage the PRIME file descriptors. Similar to the mode setting API,
PRIME is agnostic to the underlying buffer object manager, as long as
handles are 32-bit unsigned integers.

While non-GEM drivers must implement the operations themselves, GEM
drivers must use the :c:func:`drm_gem_prime_handle_to_fd()` and
:c:func:`drm_gem_prime_fd_to_handle()` helper functions. Those
helpers rely on the driver gem_prime_export and gem_prime_import
operations to create a dma-buf instance from a GEM object (dma-buf
exporter role) and to create a GEM object from a dma-buf instance
(dma-buf importer role).

``struct dma_buf *(*gem_prime_export)(struct drm_device *dev, struct drm_gem_object *obj, int flags);``

``struct drm_gem_object *(*gem_prime_import)(struct drm_device *dev, struct dma_buf *dma_buf);``

These two operations are mandatory for GEM drivers that support PRIME.

PRIME Helper Functions
----------------------

.. kernel-doc:: drivers/gpu/drm/drm_prime.c
   :doc: PRIME Helpers

PRIME Function References
-------------------------

.. kernel-doc:: include/drm/drm_prime.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_prime.c
   :export:

DRM MM Range Allocator
======================

Overview
--------

.. kernel-doc:: drivers/gpu/drm/drm_mm.c
   :doc: Overview

LRU Scan/Eviction Support
-------------------------

.. kernel-doc:: drivers/gpu/drm/drm_mm.c
   :doc: lru scan roster

DRM MM Range Allocator Function References
------------------------------------------

.. kernel-doc:: include/drm/drm_mm.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_mm.c
   :export:

DRM Cache Handling
==================

.. kernel-doc:: drivers/gpu/drm/drm_cache.c
   :export:

DRM Sync Objects
================

.. kernel-doc:: drivers/gpu/drm/drm_syncobj.c
   :doc: Overview

.. kernel-doc:: include/drm/drm_syncobj.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_syncobj.c
   :export: