|  | .. _net_pkt_interface: | 
|  |  | 
|  | Packet Management | 
|  | ################# | 
|  |  | 
|  | .. contents:: | 
|  | :local: | 
|  | :depth: 2 | 
|  |  | 
|  | Overview | 
|  | ******** | 
|  |  | 
|  | Network packets are the main data the networking stack manipulates. | 
|  | Such data is represented through the net_pkt structure which provides | 
|  | a means to hold the packet, write and read it, as well as necessary | 
|  | metadata for the core to hold important information. Such an object is | 
|  | called net_pkt in this document. | 
|  |  | 
|  | The data structure and the whole API around it are defined in | 
|  | :zephyr_file:`include/net/net_pkt.h`. | 
|  |  | 
|  | Architectural notes | 
|  | =================== | 
|  |  | 
|  | There are two network packet flows within the stack: **TX** for the | 
|  | transmission path, and **RX** for the reception path. In both paths, | 
|  | each net_pkt is written and read from the beginning to the end, or | 
|  | more specifically from the headers to the payload. | 
|  |  | 
|  |  | 
|  | Memory management | 
|  | ***************** | 
|  |  | 
|  | Allocation | 
|  | ========== | 
|  |  | 
|  | All net_pkt objects come from a pre-defined pool of struct net_pkt. | 
|  | Such a pool is defined via: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | NET_PKT_SLAB_DEFINE(name, count) | 
|  |  | 
|  | Note, however, that one will rarely have to use it, as the core already | 
|  | provides two pools, one for the TX path and one for the RX path. | 
|  |  | 
|  | Allocating a raw net_pkt can be done through: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | pkt = net_pkt_alloc(timeout); | 
|  |  | 
|  | However, by its nature, a raw net_pkt is useless without a buffer and | 
|  | needs various metadata to become relevant as well. At a minimum, it | 
|  | needs to know the network interface it is meant to be sent | 
|  | through or through which it was received. As this is a very common | 
|  | operation, a helper exists: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | pkt = net_pkt_alloc_on_iface(iface, timeout); | 
|  |  | 
|  | A more complete allocator exists, where both the net_pkt and its buffer | 
|  | can be allocated at once: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | pkt = net_pkt_alloc_with_buffer(iface, size, family, proto, timeout); | 
|  |  | 
|  | See below how the buffer is allocated. | 
|  |  | 
|  |  | 
|  | Buffer allocation | 
|  | ================= | 
|  |  | 
|  | The net_pkt object does not define its own buffer, but instead uses an | 
|  | existing object for this: :c:type:`struct net_buf`. (See | 
|  | :ref:`net_buf_interface` for more information). However, it mostly | 
|  | hides the usage of such a buffer because net_pkt brings network | 
|  | awareness to buffer allocation and, as we will see later, its | 
|  | operation too. | 
|  |  | 
|  | To allocate a buffer, a net_pkt needs to have at least its network | 
|  | interface set. This is sufficient even if the family of the packet is | 
|  | unknown at the time of buffer allocation. One can then call: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | net_pkt_alloc_buffer(pkt, size, proto, timeout); | 
|  |  | 
|  | Where proto could be 0 if unknown (there is no IPPROTO_UNSPEC). | 
|  |  | 
|  | As seen previously, the net_pkt and its buffer can be allocated at | 
|  | once via :c:func:`net_pkt_alloc_with_buffer`. It is actually the most | 
|  | widely used allocator. | 
|  |  | 
|  | The network interface, the family, and the protocol of the packet are | 
|  | used by the buffer allocation to determine if the requested size can | 
|  | be allocated. Indeed, the allocator uses the network interface to | 
|  | know the MTU, and then the family and protocol to compute the header | 
|  | space (if these two are specified). If the whole fits within the MTU, | 
|  | the allocated space will be the requested size plus the header space, | 
|  | if any. If it does not fit within the MTU, the requested | 
|  | size is shrunk so that the header space and the new size together | 
|  | fit within the MTU. | 
|  |  | 
|  | For instance, on an Ethernet network interface, with an MTU of 1500 | 
|  | bytes: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | pkt = net_pkt_alloc_with_buffer(iface, 800, AF_INET, IPPROTO_UDP, K_FOREVER); | 
|  |  | 
|  | will successfully allocate 800 + 20 + 8 bytes of buffer for the new | 
|  | net_pkt, whereas: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | pkt = net_pkt_alloc_with_buffer(iface, 1600, AF_INET, IPPROTO_UDP, K_FOREVER); | 
|  |  | 
|  | will successfully allocate 1500 bytes, of which 20 + 8 bytes (IPv4 + | 
|  | UDP headers) are reserved for headers and not available for the payload. | 
|  |  | 
|  | On the receiving side, when the family and protocol are not known: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | pkt = net_pkt_rx_alloc_with_buffer(iface, 800, AF_UNSPEC, 0, K_FOREVER); | 
|  |  | 
|  | will allocate 800 bytes and no extra header space, whereas: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | pkt = net_pkt_rx_alloc_with_buffer(iface, 1600, AF_UNSPEC, 0, K_FOREVER); | 
|  |  | 
|  | will allocate 1514 bytes, the MTU + Ethernet header space. | 
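
The four cases above follow one rule: the requested size plus the header space, capped at a maximum length. The sketch below is a simplified, standalone model of that rule, not the Zephyr implementation. The names ``model_alloc_len`` and ``model_payload_len`` are hypothetical, and the maximum length is passed in by the caller (1500 for the IPv4 examples, 1514 for the AF_UNSPEC examples on Ethernet).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified model of the sizing rule described in the text (an
 * illustration only, not Zephyr code): the buffer holds the requested
 * payload plus the header space, capped at max_len.
 */
size_t model_alloc_len(size_t requested, size_t hdr_len, size_t max_len)
{
    size_t wanted = requested + hdr_len;

    return wanted < max_len ? wanted : max_len;
}

/* Payload space that remains once the header space is set aside. */
size_t model_payload_len(size_t requested, size_t hdr_len, size_t max_len)
{
    size_t total = model_alloc_len(requested, hdr_len, max_len);

    assert(total >= hdr_len); /* the model assumes headers fit in max_len */
    return total - hdr_len;
}
```

With 28 bytes of IPv4 + UDP header space and a limit of 1500, a request of 800 gives 828 bytes and a request of 1600 gives 1500 bytes, of which 1472 remain for the payload. With no header space and a limit of 1514, the requests of 800 and 1600 give 800 and 1514 bytes respectively.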
|  |  | 
|  | One can increase the amount of buffer space allocated by calling | 
|  | :c:func:`net_pkt_alloc_buffer`, as it will take into account the | 
|  | existing buffer. It will also account for the header space if | 
|  | net_pkt's family is a valid one, as well as the proto parameter. In | 
|  | that case, the newly allocated buffer space will be appended to the | 
|  | existing one, and not inserted at the front. Note, however, that such a | 
|  | use case is rather limited. Usually, one should know from the start how | 
|  | much space to request. | 
|  |  | 
|  |  | 
|  | Deallocation | 
|  | ============ | 
|  |  | 
|  | Each net_pkt is reference counted. At allocation, the reference count | 
|  | is set to 1. The reference count can be incremented with | 
|  | :c:func:`net_pkt_ref()` or decremented with | 
|  | :c:func:`net_pkt_unref()`. When the count drops to zero, the buffer is | 
|  | also un-referenced and the net_pkt is automatically placed back into | 
|  | the free net_pkt_slabs. | 
|  |  | 
|  | If net_pkt's buffer is needed even after net_pkt deallocation, one | 
|  | will need to reference the whole chain of net_buf once more before | 
|  | calling the last net_pkt_unref. See :ref:`net_buf_interface` for more | 
|  | information. | 
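
The life cycle described above can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions: the ``toy_`` names and the two-slot pool are hypothetical, there is no buffer chain, and nothing here is the Zephyr implementation. It shows the count starting at 1 and the object returning to the pool only when the last reference is dropped.

```c
#include <stddef.h>

#define TOY_POOL_COUNT 2 /* hypothetical pool size, for illustration */

struct toy_pkt {
    int ref; /* reference count; 0 means the slot is free */
};

static struct toy_pkt toy_pool[TOY_POOL_COUNT];

/* Takes a free slot from the pool; the reference count starts at 1. */
struct toy_pkt *toy_alloc(void)
{
    for (size_t i = 0; i < TOY_POOL_COUNT; i++) {
        if (toy_pool[i].ref == 0) {
            toy_pool[i].ref = 1;
            return &toy_pool[i];
        }
    }
    return NULL; /* pool exhausted */
}

/* Adds one reference, e.g. when a second owner keeps the packet. */
struct toy_pkt *toy_ref(struct toy_pkt *pkt)
{
    pkt->ref++;
    return pkt;
}

/* Drops one reference; at zero the slot is free for reuse. */
void toy_unref(struct toy_pkt *pkt)
{
    if (pkt->ref > 0) {
        pkt->ref--;
    }
}
```

In this model, a packet that was referenced once more after allocation needs two unref calls before its slot can be handed out again.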
|  |  | 
|  |  | 
|  | Operations | 
|  | ********** | 
|  |  | 
|  | There are two ways to access the net_pkt buffer, explained in the | 
|  | following sections: basic read/write access and data access, the | 
|  | latter being the preferred way. | 
|  |  | 
|  | Read and Write access | 
|  | ===================== | 
|  |  | 
|  | As said earlier, though net_pkt uses net_buf for its buffer, it | 
|  | provides its own API to access it. Indeed, a network packet might be | 
|  | scattered over a chain of net_buf objects, and the functions provided | 
|  | by net_buf are limited in such a case. Instead, net_pkt provides | 
|  | functions which hide all the complexity of potential non-contiguous | 
|  | access. | 
|  |  | 
|  | Data movement into the buffer is made through a cursor maintained | 
|  | within each net_pkt. All read/write operations affect this | 
|  | cursor. Note as well that the read and write functions are strict about | 
|  | their length parameter: if they cannot read or write the given length, | 
|  | they fail. The length is not interpreted as an upper limit; it is | 
|  | the exact amount of data that must be read or written. | 
|  |  | 
|  | As there are two paths, TX and RX, there are two access modes: write | 
|  | and overwrite.  This might sound a bit unusual, but is in fact simple | 
|  | and provides flexibility. | 
|  |  | 
|  | In write mode, whatever is written in the buffer affects the length of | 
|  | actual data present in the buffer. Buffer length should not be | 
|  | confused with the buffer size, which is a limit that no mode can exceed. | 
|  | In overwrite mode, whatever is written must land on valid data, | 
|  | and will not affect the buffer length. By default, a newly allocated | 
|  | net_pkt is in write mode, and its cursor points to the beginning of | 
|  | its buffer. | 
|  |  | 
|  | Let's see now, step by step, the functions and how they behave | 
|  | depending on the mode. | 
|  |  | 
|  | When freshly allocated with a buffer of 500 bytes, a net_pkt has 0 | 
|  | length, which means no valid data is in its buffer. One could verify | 
|  | this by: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | len = net_pkt_get_len(pkt); | 
|  |  | 
|  | Now, let's write 8 bytes: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | net_pkt_write(pkt, data, 8); | 
|  |  | 
|  | The buffer length is now 8 bytes. | 
|  | There are also helpers to write a single byte, or a big-endian uint16_t or uint32_t: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | net_pkt_write_u8(pkt, foo); | 
|  | net_pkt_write_be16(pkt, ba); | 
|  | net_pkt_write_be32(pkt, bar); | 
|  |  | 
|  | Logically, net_pkt's length is now 15. But if we try to read at this | 
|  | point, it will fail because there is no valid data to read at the | 
|  | current cursor position. It is possible, while in write mode, | 
|  | to read what has already been written by resetting the cursor of the | 
|  | net_pkt. For instance: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | net_pkt_cursor_init(pkt); | 
|  | net_pkt_read(pkt, data, 15); | 
|  |  | 
|  | This will reset the cursor of the pkt to the beginning of the buffer | 
|  | and then let you read the 15 bytes actually present. The cursor then | 
|  | points again at the end of the valid data. | 
|  |  | 
|  | To set a large area with the same byte, a memset function is provided: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | net_pkt_memset(pkt, 0, 5); | 
|  |  | 
|  | Our net_pkt now has a length of 20 bytes. | 
|  |  | 
|  | Switching between modes can be achieved via the | 
|  | :c:func:`net_pkt_set_overwrite` function. It is possible to switch | 
|  | mode back and forth at any time. Below, the net_pkt is set to overwrite | 
|  | mode and its cursor is reset: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | net_pkt_set_overwrite(pkt, true); | 
|  | net_pkt_cursor_init(pkt); | 
|  |  | 
|  | Now the same functions can be used, but they will be limited to the | 
|  | existing data in the buffer, i.e. 20 bytes. | 
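
To make the interplay of cursor, length, size, and mode concrete, here is a minimal standalone model of the walk-through. It is a sketch under stated assumptions and not the net_pkt implementation: the ``toy_`` names are hypothetical, the buffer is a single flat array rather than a chain of net_buf, and only writes at the end of valid data are modelled in write mode.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SIZE 500 /* buffer size: a hard limit in both modes */

struct toy_pkt {
    uint8_t buf[TOY_SIZE];
    size_t len;     /* amount of valid data (the "buffer length") */
    size_t cursor;  /* current read/write position */
    bool overwrite; /* false: write mode, true: overwrite mode */
};

void toy_cursor_init(struct toy_pkt *p)
{
    p->cursor = 0;
}

/* Claims exactly n bytes at the cursor for writing, or returns NULL. */
static uint8_t *toy_claim(struct toy_pkt *p, size_t n)
{
    /* Overwrite mode is bounded by valid data, write mode by the size. */
    size_t limit = p->overwrite ? p->len : TOY_SIZE;
    uint8_t *area = &p->buf[p->cursor];

    if (n > limit - p->cursor) {
        return NULL; /* strict: all n bytes or nothing */
    }
    p->cursor += n;
    if (p->cursor > p->len) {
        p->len = p->cursor; /* only reachable in write mode */
    }
    return area;
}

int toy_write(struct toy_pkt *p, const void *data, size_t n)
{
    uint8_t *area = toy_claim(p, n);

    if (area == NULL) {
        return -1;
    }
    memcpy(area, data, n);
    return 0;
}

int toy_memset(struct toy_pkt *p, int byte, size_t n)
{
    uint8_t *area = toy_claim(p, n);

    if (area == NULL) {
        return -1;
    }
    memset(area, byte, n);
    return 0;
}

int toy_read(struct toy_pkt *p, void *out, size_t n)
{
    if (n > p->len - p->cursor) {
        return -1; /* not enough valid data after the cursor */
    }
    memcpy(out, &p->buf[p->cursor], n);
    p->cursor += n;
    return 0;
}
```

Replaying the steps against this model: writes of 8, 1, 2, and 4 bytes bring the length to 15, a read at that point fails, a read of 15 bytes after ``toy_cursor_init`` succeeds, and a memset of 5 bytes brings the length to 20. Once ``overwrite`` is set and the cursor reset, a request for 21 bytes fails while a request for 20 bytes succeeds and leaves the length unchanged.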
|  |  | 
|  | If it is necessary to know how much space is available in the net_pkt, | 
|  | call: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | net_pkt_available_buffer(pkt); | 
|  |  | 
|  | Or, if headers space needs to be accounted for, call: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | net_pkt_available_payload_buffer(pkt, proto); | 
|  |  | 
|  | If you want to place the cursor at a known position use the function | 
|  | :c:func:`net_pkt_skip`.  For example, to go after the IP header, use: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | net_pkt_cursor_init(pkt); | 
|  | net_pkt_skip(pkt, net_pkt_ip_header_len(pkt)); | 
|  |  | 
|  |  | 
|  | Data access | 
|  | =========== | 
|  |  | 
|  | Though the API shown previously is rather simple, it always involves | 
|  | copying things to and from the net_pkt buffer. On many occasions, it | 
|  | is more relevant to access the information stored in the buffer | 
|  | contiguously, especially with network packets which embed headers. | 
|  |  | 
|  | These headers are, most of the time, a known fixed set of bytes. It is | 
|  | then more natural to have a structure representing a certain type of | 
|  | header. In addition to this, if it is known that the header lies | 
|  | in a contiguous area of the buffer, it will be much more efficient to | 
|  | cast the current position in the buffer to the type of header. Whether | 
|  | reading or writing the fields of such a header, accessing it | 
|  | directly avoids an intermediate copy and thus saves memory. | 
|  |  | 
|  | net_pkt comes with a dedicated API for this, built on top of the | 
|  | previously described API. It is able to handle both contiguous and | 
|  | non-contiguous access transparently. | 
|  |  | 
|  | There are two macros used to define a data access descriptor: | 
|  | :c:macro:`NET_PKT_DATA_ACCESS_DEFINE` when it is not possible to | 
|  | tell if the data will be in a contiguous area, and | 
|  | :c:macro:`NET_PKT_DATA_ACCESS_CONTIGUOUS_DEFINE` when | 
|  | it is guaranteed the data is in a contiguous area. | 
|  |  | 
|  | Let's take the example of IP and UDP. Both IPv4 and IPv6 headers are | 
|  | always found at the beginning of the packet and are small enough to | 
|  | fit in a net_buf of 128 bytes (for instance, though 64 bytes could be | 
|  | chosen). | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | NET_PKT_DATA_ACCESS_CONTIGUOUS_DEFINE(ipv4_access, struct net_ipv4_hdr); | 
|  | struct net_ipv4_hdr *ipv4_hdr; | 
|  |  | 
|  | ipv4_hdr = (struct net_ipv4_hdr *)net_pkt_get_data(pkt, &ipv4_access); | 
|  |  | 
|  | It would be the same for struct net_ipv6_hdr. A UDP header, however, | 
|  | is likely not to be in a contiguous area, with IPv6 for instance, so: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | NET_PKT_DATA_ACCESS_DEFINE(udp_access, struct net_udp_hdr); | 
|  | struct net_udp_hdr *udp_hdr; | 
|  |  | 
|  | udp_hdr = (struct net_udp_hdr *)net_pkt_get_data(pkt, &udp_access); | 
|  |  | 
|  | At this point, the cursor of the net_pkt points at the beginning of | 
|  | the requested data. On the RX path, these headers will be read but not | 
|  | modified, so to proceed further the cursor needs to advance past the | 
|  | data. There is a function dedicated to this: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | net_pkt_acknowledge_data(pkt, &ipv4_access); | 
|  |  | 
|  | On the TX path, however, the header fields have been modified. In such | 
|  | a case: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | net_pkt_set_data(pkt, &ipv4_access); | 
|  |  | 
|  | If the data are in a contiguous area, it will advance the cursor | 
|  | accordingly. If not, it will write the data and the cursor will be | 
|  | updated. Note that :c:func:`net_pkt_set_data` could be used in the RX | 
|  | path as well, but it is slightly faster to use | 
|  | :c:func:`net_pkt_acknowledge_data`, as this one does not care about | 
|  | contiguity at all: it just advances the cursor via | 
|  | :c:func:`net_pkt_skip` directly. | 
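
The contiguous and non-contiguous behaviours described above can be sketched with a small standalone model. Everything here is an assumption made for illustration: the ``toy_`` names, the 8-byte fragments, and the rule used by ``toy_set_data`` follow the description in this section and are not taken from the Zephyr sources. When the requested bytes sit inside one fragment, the caller gets a pointer straight into it. When they straddle two fragments, they are copied into the local storage carried by the access descriptor and written back later.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_FRAG_SIZE 8 /* hypothetical fragment size */

struct toy_frag {
    struct toy_frag *next;
    uint8_t data[TOY_FRAG_SIZE];
};

struct toy_pkt {
    struct toy_frag *frag; /* fragment under the cursor */
    size_t pos;            /* cursor offset inside that fragment */
};

struct toy_access {
    void *data;  /* set by toy_get_data() */
    void *local; /* caller-provided copy area for the scattered case */
    size_t size;
};

/* Copies n bytes across the chain, advancing the cursor. */
static int toy_copy(struct toy_pkt *p, uint8_t *flat, size_t n, int to_pkt)
{
    while (n > 0) {
        if (p->frag == NULL) {
            return -1;
        }
        if (p->pos == TOY_FRAG_SIZE) {
            p->frag = p->frag->next;
            p->pos = 0;
            continue;
        }
        size_t chunk = TOY_FRAG_SIZE - p->pos;
        if (chunk > n) {
            chunk = n;
        }
        if (to_pkt) {
            memcpy(&p->frag->data[p->pos], flat, chunk);
        } else {
            memcpy(flat, &p->frag->data[p->pos], chunk);
        }
        p->pos += chunk;
        flat += chunk;
        n -= chunk;
    }
    return 0;
}

/* Returns a view of the data at the cursor; the cursor does not move. */
void *toy_get_data(struct toy_pkt *p, struct toy_access *a)
{
    a->data = NULL;
    if (p->frag != NULL && TOY_FRAG_SIZE - p->pos >= a->size) {
        a->data = &p->frag->data[p->pos]; /* contiguous: no copy */
    } else if (a->local != NULL) {
        struct toy_pkt backup = *p;
        int err = toy_copy(p, a->local, a->size, 0);

        *p = backup; /* keep the cursor at the start of the data */
        if (err == 0) {
            a->data = a->local;
        }
    }
    return a->data;
}

/* Commits the (possibly modified) data and moves the cursor past it. */
int toy_set_data(struct toy_pkt *p, struct toy_access *a)
{
    if (a->data == NULL) {
        return -1;
    }
    if (a->data != a->local) {
        p->pos += a->size; /* modified in place: just advance */
        return 0;
    }
    return toy_copy(p, a->local, a->size, 1); /* write the copy back */
}
```

In this model, a 4-byte header at offset 0 of a fragment is edited in place, while the same header at offset 6 is edited in the local copy and lands in the last two bytes of one fragment and the first two of the next when ``toy_set_data`` is called.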
|  |  | 
|  |  | 
|  | API Reference | 
|  | ************* | 
|  |  | 
|  | .. doxygengroup:: net_pkt | 
|  | :project: Zephyr |