- Segmentation Offloads in the Linux Networking Stack
- Introduction
- ============
- This document describes a set of techniques in the Linux networking stack
- to take advantage of segmentation offload capabilities of various NICs.
- The following technologies are described:
- * TCP Segmentation Offload - TSO
- * UDP Fragmentation Offload - UFO
- * IPIP, SIT, GRE, and UDP Tunnel Offloads
- * Generic Segmentation Offload - GSO
- * Generic Receive Offload - GRO
- * Partial Generic Segmentation Offload - GSO_PARTIAL
- TCP Segmentation Offload
- ========================
- TCP segmentation allows a device to segment a single frame into multiple
- frames with a data payload size specified in skb_shinfo()->gso_size.
- When TCP segmentation is requested, the bit for either SKB_GSO_TCPV4 or
- SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
- skb_shinfo()->gso_size should be set to a non-zero value.
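The two conditions above, and the number of frames a request produces, can be sketched in user-space C. This is a minimal illustration, not kernel code: struct ex_gso_info, the EX_GSO_* bit values, and both helper functions are hypothetical stand-ins for the real skb_shinfo() fields and SKB_GSO_* flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's gso_type bits. The values are
 * illustrative only and do not match the kernel's definitions. */
enum {
	EX_GSO_TCPV4 = 1 << 0,
	EX_GSO_TCPV6 = 1 << 1,
};

/* Simplified stand-in for the GSO fields reached via skb_shinfo(). */
struct ex_gso_info {
	unsigned int gso_size;	/* payload bytes per segment */
	unsigned int gso_type;	/* bitmask of EX_GSO_* */
};

/* Returns 1 if this is a well-formed TCP segmentation request:
 * a TCP type bit is set and gso_size is non-zero. */
static int ex_tso_requested(const struct ex_gso_info *info)
{
	return info->gso_size != 0 &&
	       (info->gso_type & (EX_GSO_TCPV4 | EX_GSO_TCPV6)) != 0;
}

/* Number of frames needed to carry payload_len bytes of TCP payload,
 * rounding up so a short final segment is counted. */
static size_t ex_tso_segment_count(const struct ex_gso_info *info,
				   size_t payload_len)
{
	assert(ex_tso_requested(info));
	return (payload_len + info->gso_size - 1) / info->gso_size;
}
```

In this sketch, a payload that is an exact multiple of gso_size yields full-sized segments only; any remainder adds one shorter final segment.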
- TCP segmentation depends on support for partial checksum offload. For
- this reason, TSO is normally disabled if the Tx checksum offload for a
- given device is disabled.
- In order to support TCP segmentation offload it is necessary to populate
- the network and transport header offsets of the skbuff so that the device
- drivers will be able to determine the offsets of the IP or IPv6 header and
- the TCP header. In addition, as CHECKSUM_PARTIAL is required, csum_start
- should also point to the TCP header of the packet.
- For IPv4 segmentation we support one of two types in terms of the IP ID.
- The default behavior is to increment the IP ID with every segment. If the
- GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
- ID and all segments will use the same IP ID. If a device has
- NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO
- and we will either increment the IP ID for all frames, or leave it at a
- static value based on driver preference.
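The two IP ID behaviors can be illustrated with a small user-space sketch. The enum and function names are hypothetical and exist only for this example; the real selection is made through SKB_GSO_TCP_FIXEDID and, for devices with NETIF_F_TSO_MANGLEID, by driver preference.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical policy names for this sketch. */
enum ex_ipid_mode {
	EX_IPID_INCREMENT,	/* default: ID increments with every segment */
	EX_IPID_FIXED,		/* as with SKB_GSO_TCP_FIXEDID: same ID on all */
};

/* IPv4 ID carried by segment number 'index' (0-based) of one TSO frame
 * whose first segment uses first_id. */
static uint16_t ex_segment_ip_id(uint16_t first_id, unsigned int index,
				 enum ex_ipid_mode mode)
{
	if (mode == EX_IPID_FIXED)
		return first_id;
	/* The ID is a 16-bit field, so the sum wraps modulo 65536. */
	return (uint16_t)(first_id + index);
}
```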
- UDP Fragmentation Offload
- =========================
- UDP fragmentation offload allows a device to fragment an oversized UDP
- datagram into multiple IPv4 fragments. Many of the requirements for UDP
- fragmentation offload are the same as TSO. However, the IPv4 ID should not
- increment across fragments, as they all belong to a single IPv4 datagram.
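As a rough illustration of what fragmentation produces, the sketch below splits one IPv4 payload (UDP header plus data) into fragment descriptors that all share the same ID. It relies on two standard IPv4 rules: the fragment offset is expressed in 8-byte units, and every fragment except the last has the "more fragments" flag set. struct ex_frag and ex_fragment() are hypothetical names, and no real device or kernel interface is modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one IPv4 fragment. */
struct ex_frag {
	uint16_t id;		/* IPv4 ID, identical for every fragment */
	uint16_t offset_units;	/* fragment offset in 8-byte units */
	uint16_t len;		/* IP payload bytes carried */
	int more_fragments;	/* MF flag: set on all but the last */
};

/* Split total_len bytes of IP payload into fragments of at most
 * frag_payload bytes, which must be a non-zero multiple of 8.
 * Returns the number of fragments written, or 0 if out[] is too small. */
static size_t ex_fragment(uint16_t id, size_t total_len, size_t frag_payload,
			  struct ex_frag *out, size_t max_out)
{
	size_t n = 0, off = 0;

	assert(frag_payload != 0 && frag_payload % 8 == 0);
	while (off < total_len) {
		size_t len = total_len - off;

		if (len > frag_payload)
			len = frag_payload;
		if (n == max_out)
			return 0;
		out[n].id = id;		/* never incremented */
		out[n].offset_units = (uint16_t)(off / 8);
		out[n].len = (uint16_t)len;
		out[n].more_fragments = (off + len < total_len);
		off += len;
		n++;
	}
	return n;
}
```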
- IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
- ========================================================
- In addition to the offloads described above it is possible for a frame to
- contain additional headers such as an outer tunnel. In order to account
- for such instances an additional set of segmentation offload types was
- introduced, including SKB_GSO_IPIP, SKB_GSO_SIT, SKB_GSO_GRE, and
- SKB_GSO_UDP_TUNNEL. These extra segmentation types are used to identify
- cases where there is more than one set of headers. For example, in the
- case of IPIP and SIT we should have the network and transport headers moved
- from the standard list of headers to "inner" header offsets.
- Currently only two levels of headers are supported. The convention is to
- refer to the tunnel headers as the outer headers, while the encapsulated
- data is normally referred to as the inner headers. Below is the list of
- calls to access the given headers:
- IPIP/SIT Tunnel:
-             Outer                  Inner
- MAC         skb_mac_header
- Network     skb_network_header     skb_inner_network_header
- Transport   skb_transport_header
- UDP/GRE Tunnel:
-             Outer                  Inner
- MAC         skb_mac_header         skb_inner_mac_header
- Network     skb_network_header     skb_inner_network_header
- Transport   skb_transport_header   skb_inner_transport_header
- In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and
- SKB_GSO_UDP_TUNNEL_CSUM. These two additional tunnel types indicate that
- the outer header also requests a non-zero checksum.
- Finally there is SKB_GSO_REMCSUM which indicates that a given tunnel header
- has requested a remote checksum offload. In this case the inner headers
- will be left with a partial checksum and only the outer header checksum
- will be computed.
- Generic Segmentation Offload
- ============================
- Generic segmentation offload is a pure software offload that is meant to
- deal with cases where device drivers cannot perform the offloads described
- above. What occurs in GSO is that a given skbuff will have its data broken
- out over multiple skbuffs that have been resized to match the MSS provided
- via skb_shinfo()->gso_size.
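The splitting step can be sketched in user-space C as follows. This is a toy model under stated assumptions: struct ex_frame stands in for an skbuff and carries only a TCP sequence number and payload, whereas real GSO also replicates and fixes up the full protocol headers. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_MAX_SEG_PAYLOAD 1500

/* Toy stand-in for one output skbuff. */
struct ex_frame {
	uint32_t seq;		/* TCP sequence number of the first byte */
	size_t len;		/* payload bytes in this frame */
	unsigned char payload[EX_MAX_SEG_PAYLOAD];
};

/* Break len bytes of data into frames of at most mss bytes each.
 * Returns the number of frames written, or 0 if out[] is too small. */
static size_t ex_gso_segment(uint32_t seq, const unsigned char *data,
			     size_t len, size_t mss,
			     struct ex_frame *out, size_t max_out)
{
	size_t n = 0, off = 0;

	assert(mss != 0 && mss <= EX_MAX_SEG_PAYLOAD);
	while (off < len) {
		size_t chunk = len - off < mss ? len - off : mss;

		if (n == max_out)
			return 0;
		/* Each frame starts where the previous one ended. */
		out[n].seq = seq + (uint32_t)off;
		out[n].len = chunk;
		memcpy(out[n].payload, data + off, chunk);
		off += chunk;
		n++;
	}
	return n;
}
```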
- Before enabling any hardware segmentation offload a corresponding software
- offload is required in GSO. Otherwise a frame that is re-routed between
- devices could end up on a device that is unable to transmit it.
- Generic Receive Offload
- =======================
- Generic receive offload is the complement to GSO. Ideally any frame
- assembled by GRO should be segmented to create an identical sequence of
- frames using GSO, and any sequence of frames segmented by GSO should be
- able to be reassembled back to the original by GRO. The only exception to
- this is the IPv4 ID in the case that the DF bit is set for a given IP
- header. If the value of the IPv4 ID is not sequentially incrementing, it
- will be altered so that it does increment sequentially when a frame
- assembled via GRO is segmented via GSO.
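The complementary relationship can be illustrated with a deliberately simplified coalescing check: back-to-back TCP segments whose sequence numbers are contiguous are merged into one larger descriptor, which GSO could later split again. This sketch assumes the segments already belong to the same flow and ignores the other criteria a real GRO implementation applies (matching headers, flags, IP ID handling). The names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for a received TCP segment of one flow. */
struct ex_seg {
	uint32_t seq;	/* sequence number of the first payload byte */
	size_t len;	/* payload length in bytes */
};

/* Coalesce the leading run of in-order, contiguous segments of in[]
 * into *merged. Returns how many input segments were consumed. */
static size_t ex_gro_coalesce(const struct ex_seg *in, size_t n,
			      struct ex_seg *merged)
{
	size_t i;

	assert(n > 0);
	*merged = in[0];
	for (i = 1; i < n; i++) {
		/* Stop at the first gap or out-of-order segment. */
		if (in[i].seq != merged->seq + (uint32_t)merged->len)
			break;
		merged->len += in[i].len;
	}
	return i;
}
```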
- Partial Generic Segmentation Offload
- ====================================
- Partial generic segmentation offload is a hybrid between TSO and GSO. What
- it effectively does is take advantage of certain traits of TCP and tunnels
- so that instead of having to rewrite the packet headers for each segment
- only the inner-most transport header and possibly the outer-most network
- header need to be updated. This allows devices that do not support tunnel
- offloads or tunnel offloads with checksum to still make use of segmentation.
- With the partial offload, all headers excluding the inner transport header
- are updated such that they will contain the correct values for the case
- where the header is simply duplicated. The one exception to this
- is the outer IPv4 ID field. It is up to the device drivers to guarantee
- that the IPv4 ID field is incremented in the case that a given header does
- not have the DF bit set.
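A rough sketch of the per-segment work left to the device under the partial offload: the headers are duplicated as a template, the inner TCP sequence number advances by one MSS per segment, and the outer IPv4 ID is incremented only when DF is not set. The struct and function are hypothetical and model only these fields, not any real driver interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of the fields that may differ between segments. */
struct ex_partial_hdrs {
	uint16_t outer_ip_id;	/* outer IPv4 ID */
	int outer_df;		/* non-zero if the outer header has DF set */
	uint32_t inner_tcp_seq;	/* inner-most transport header field */
};

/* Headers for segment number 'index' (0-based), derived from the
 * template that describes the first segment. */
static struct ex_partial_hdrs
ex_partial_segment(const struct ex_partial_hdrs *tmpl, unsigned int index,
		   uint32_t mss)
{
	struct ex_partial_hdrs h = *tmpl;	/* plain duplication */

	/* The inner transport header is always updated. */
	h.inner_tcp_seq = tmpl->inner_tcp_seq + index * mss;
	/* The outer IPv4 ID is incremented only when DF is not set. */
	if (!tmpl->outer_df)
		h.outer_ip_id = (uint16_t)(tmpl->outer_ip_id + index);
	return h;
}
```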