| .. _net_pkt_interface: |
| |
| Packet Management |
| ################# |
| |
| .. contents:: |
| :local: |
| :depth: 2 |
| |
| Overview |
| ******** |
| |
| Network packets are the main data the networking stack manipulates. |
| Such data is represented through the net_pkt structure which provides |
| a means to hold the packet, write and read it, as well as necessary |
| metadata for the core to hold important information. Such an object is |
| called net_pkt in this document. |
| |
| The data structure and the whole API around it are defined in |
| :zephyr_file:`include/net/net_pkt.h`. |
| |
| Architectural notes |
| =================== |
| |
| There are two network packets flows within the stack, **TX** for the |
| transmission path, and **RX** for the reception one. In both paths, |
| each net_pkt is written and read from the beginning to the end, or |
| more specifically from the headers to the payload. |
| |
| |
| Memory management |
| ***************** |
| |
| Allocation |
| ========== |
| |
| All net_pkt objects come from a pre-defined pool of struct net_pkt. |
| Such pool is defined via |
| |
| .. code-block:: c |
| |
| NET_PKT_SLAB_DEFINE(name, count) |
| |
| Note, however, one will rarely have to use it, as the core provides |
| already two pools, one for the TX path and one for the RX path. |
| |
| Allocating a raw net_pkt can be done through: |
| |
| .. code-block:: c |
| |
| pkt = net_pkt_alloc(timeout); |
| |
| However, by its nature, a raw net_pkt is useless without a buffer and |
| needs various metadata information to become relevant as well. It |
| requires at least to get the network interface it is meant to be sent |
| through or through which it was received. As this is a very common |
| operation, a helper exist: |
| |
| .. code-block:: c |
| |
| pkt = net_pkt_alloc_on_iface(iface, timeout); |
| |
| A more complete allocator exists, where both the net_pkt and its buffer |
| can be allocated at once: |
| |
| .. code-block:: c |
| |
| pkt = net_pkt_alloc_with_buffer(iface, size, family, proto, timeout); |
| |
| See below how the buffer is allocated. |
| |
| |
| Buffer allocation |
| ================= |
| |
| The net_pkt object does not define its own buffer, but instead uses an |
| existing object for this: :c:struct:`net_buf`. (See |
| :ref:`net_buf_interface` for more information). However, it mostly |
| hides the usage of such a buffer because net_pkt brings network |
| awareness to buffer allocation and, as we will see later, its |
| operation too. |
| |
| To allocate a buffer, a net_pkt needs to have at least its network |
| interface set. This works if the family of the packet is unknown at |
| the time of buffer allocation. Then one could do: |
| |
| .. code-block:: c |
| |
| net_pkt_alloc_buffer(pkt, size, proto, timeout); |
| |
| Where proto could be 0 if unknown (there is no IPPROTO_UNSPEC). |
| |
| As seen previously, the net_pkt and its buffer can be allocated at |
| once via :c:func:`net_pkt_alloc_with_buffer`. It is actually the most |
| widely used allocator. |
| |
| The network interface, the family, and the protocol of the packet are |
| used by the buffer allocation to determine if the requested size can |
| be allocated. Indeed, the allocator will use the network interface to |
| know the MTU and then the family and protocol for the headers space |
| (if only these 2 are specified). If the whole fits within the MTU, |
| the allocated space will be of the requested size plus, eventually, |
| the headers space. If there is insufficient MTU space, the requested |
| size will be shrunk so the possible headers space and new size will |
| fit within the MTU. |
| |
| For instance, on an Ethernet network interface, with an MTU of 1500 |
| bytes: |
| |
| .. code-block:: c |
| |
| pkt = net_pkt_alloc_with_buffer(iface, 800, AF_INET4, IPPROTO_UDP, K_FOREVER); |
| |
| will successfully allocate 800 + 20 + 8 bytes of buffer for the new |
| net_pkt where: |
| |
| .. code-block:: c |
| |
| pkt = net_pkt_alloc_with_buffer(iface, 1600, AF_INET4, IPPROTO_UDP, K_FOREVER); |
| |
| will successfully allocate 1500 bytes, and where 20 + 8 bytes (IPv4 + |
| UDP headers) will not be used for the payload. |
| |
| On the receiving side, when the family and protocol are not known: |
| |
| .. code-block:: c |
| |
| pkt = net_pkt_rx_alloc_with_buffer(iface, 800, AF_UNSPEC, 0, K_FOREVER); |
| |
| will allocate 800 bytes and no extra header space. |
| But a: |
| |
| .. code-block:: c |
| |
| pkt = net_pkt_rx_alloc_with_buffer(iface, 1600, AF_UNSPEC, 0, K_FOREVER); |
| |
| will allocate 1514 bytes, the MTU + Ethernet header space. |
| |
| One can increase the amount of buffer space allocated by calling |
| :c:func:`net_pkt_alloc_buffer`, as it will take into account the |
| existing buffer. It will also account for the header space if |
| net_pkt's family is a valid one, as well as the proto parameter. In |
| that case, the newly allocated buffer space will be appended to the |
| existing one, and not inserted in the front. Note however such a use |
| case is rather limited. Usually, one should know from the start how |
| much size should be requested. |
| |
| |
| Deallocation |
| ============ |
| |
| Each net_pkt is reference counted. At allocation, the reference is set |
| to 1. The reference count can be incremented with |
| :c:func:`net_pkt_ref()` or decremented with |
| :c:func:`net_pkt_unref()`. When the count drops to zero the buffer is |
| also un-referenced and net_pkt is automatically placed back into the |
| free net_pkt_slabs |
| |
| If net_pkt's buffer is needed even after net_pkt deallocation, one |
| will need to reference once more all the chain of net_buf before |
| calling last net_pkt_unref. See :ref:`net_buf_interface` for more |
| information. |
| |
| |
| Operations |
| ********** |
| |
| There are two ways to access the net_pkt buffer, explained in the |
| following sections: basic read/write access and data access, the |
| latter being the preferred way. |
| |
| Read and Write access |
| ===================== |
| |
| As said earlier, though net_pkt uses net_buf for its buffer, it |
| provides its own API to access it. Indeed, a network packet might be |
| scattered over a chain of net_buf objects, the functions provided by |
| net_buf are then limited for such case. Instead, net_pkt provides |
| functions which hide all the complexity of potential non-contiguous |
| access. |
| |
| Data movement into the buffer is made through a cursor maintained |
| within each net_pkt. All read/write operations affect this |
| cursor. Note as well that read or write functions are strict on their |
| length parameters: if it cannot r/w the given length it will |
| fail. Length is not interpreted as an upper limit, it is instead the |
| exact amount of data that must be read or written. |
| |
| As there are two paths, TX and RX, there are two access modes: write |
| and overwrite. This might sound a bit unusual, but is in fact simple |
| and provides flexibility. |
| |
| In write mode, whatever is written in the buffer affects the length of |
| actual data present in the buffer. Buffer length should not be |
| confused with the buffer size which is a limit any mode cannot pass. |
| In overwrite mode then, whatever is written must happen on valid data, |
| and will not affect the buffer length. By default, a newly allocated |
| net_pkt is on write mode, and its cursor points to the beginning of |
| its buffer. |
| |
| Let's see now, step by step, the functions and how they behave |
| depending on the mode. |
| |
| When freshly allocated with a buffer of 500 bytes, a net_pkt has 0 |
| length, which means no valid data is in its buffer. One could verify |
| this by: |
| |
| .. code-block:: c |
| |
| len = net_pkt_get_len(pkt); |
| |
| Now, let's write 8 bytes: |
| |
| .. code-block:: c |
| |
| net_pkt_write(pkt, data, 8); |
| |
| The buffer length is now 8 bytes. |
| There are various helpers to write a byte, or big endian uint16_t, uint32_t. |
| |
| .. code-block:: c |
| |
| net_pkt_write_u8(pkt, &foo); |
| net_pkt_write_be16(pkt, &ba); |
| net_pkt_write_be32(pkt, &bar); |
| |
| Logically, net_pkt's length is now 15. But if we try to read at this |
| point, it will fail because there is nothing to read at the cursor |
| where we are at in the net_pkt. It is possible, while in write mode, |
| to read what has been already written by resetting the cursor of the |
| net_pkt. For instance: |
| |
| .. code-block:: c |
| |
| net_pkt_cursor_init(pkt); |
| net_pkt_read(pkt, data, 15); |
| |
| This will reset the cursor of the pkt to the beginning of the buffer |
| and then let you read the actual 15 bytes present. The cursor is then |
| again pointing at the end of the buffer. |
| |
| To set a large area with the same byte, a memset function is provided: |
| |
| .. code-block:: c |
| |
| net_pkt_memset(pkt, 0, 5); |
| |
| Our net_pkt has now a length of 20 bytes. |
| |
| Switching between modes can be achieved via |
| :c:func:`net_pkt_set_overwrite` function. It is possible to switch |
| mode back and forth at any time. The net_pkt will be set to overwrite |
| and its cursor reset: |
| |
| .. code-block:: c |
| |
| net_pkt_set_overwrite(pkt, true); |
| net_pkt_cursor_init(pkt); |
| |
| Now the same operators can be used, but it will be limited to the |
| existing data in the buffer, i.e. 20 bytes. |
| |
| If it is necessary to know how much space is available in the net_pkt |
| call: |
| |
| .. code-block:: c |
| |
| net_pkt_available_buffer(pkt); |
| |
| Or, if headers space needs to be accounted for, call: |
| |
| .. code-block:: c |
| |
| net_pkt_available_payload_buffer(pkt, proto); |
| |
| If you want to place the cursor at a known position use the function |
| :c:func:`net_pkt_skip`. For example, to go after the IP header, use: |
| |
| .. code-block:: c |
| |
| net_pkt_cursor_init(pkt); |
| net_pkt_skip(pkt, net_pkt_ip_header_len(pkt)); |
| |
| |
| Data access |
| =========== |
| |
| Though the API shown previously is rather simple, it involves always |
| copying things to and from the net_pkt buffer. In many occasions, it |
| is more relevant to access the information stored in the buffer |
| contiguously, especially with network packets which embed headers. |
| |
| These headers are, most of the time, a known fixed set of bytes. It is |
| then more natural to have a structure representing a certain type of |
| header. In addition to this, if it is known the header size appears |
| in a contiguous area of the buffer, it will be way more efficient to |
| cast the actual position in the buffer to the type of header. Either |
| for reading or writing the fields of such header, accessing it |
| directly will save memory. |
| |
| Net pkt comes with a dedicated API for this, built on top of the |
| previously described API. It is able to handle both contiguous and |
| non-contiguous access transparently. |
| |
| There are two macros used to define a data access descriptor: |
| :c:macro:`NET_PKT_DATA_ACCESS_DEFINE` when it is not possible to |
| tell if the data will be in a contiguous area, and |
| :c:macro:`NET_PKT_DATA_ACCESS_CONTIGUOUS_DEFINE` when |
| it is guaranteed the data is in a contiguous area. |
| |
| Let's take the example of IP and UDP. Both IPv4 and IPv6 headers are |
| always found at the beginning of the packet and are small enough to |
| fit in a net_buf of 128 bytes (for instance, though 64 bytes could be |
| chosen). |
| |
| .. code-block:: c |
| |
| NET_PKT_DATA_ACCESS_CONTIGUOUS_DEFINE(ipv4_access, struct net_ipv4_hdr); |
| struct net_ipv4_hdr *ipv4_hdr; |
| |
| ipv4_hdr = (struct net_ipv4_hdr *)net_pkt_get_data(pkt, &ipv4_acess); |
| |
| It would be the same for struct net_ipv4_hdr. For a UDP header it |
| is likely not to be in a contiguous area in IPv6 |
| for instance so: |
| |
| .. code-block:: c |
| |
| NET_PKT_DATA_ACCESS_DEFINE(udp_access, struct net_udp_hdr); |
| struct net_udp_hdr *udp_hdr; |
| |
| udp_hdr = (struct net_udp_hdr *)net_pkt_get_data(pkt, &udp_access); |
| |
| At this point, the cursor of the net_pkt points at the beginning of |
| the requested data. On the RX path, these headers will be read but not |
| modified so to proceed further the cursor needs to advance past the |
| data. There is a function dedicated for this: |
| |
| .. code-block:: c |
| |
| net_pkt_acknowledge_data(pkt, &ipv4_access); |
| |
| On the TX path, however, the header fields have been modified. In such |
| a case: |
| |
| .. code-block:: c |
| |
| net_pkt_set_data(pkt, &ipv4_access); |
| |
| If the data are in a contiguous area, it will advance the cursor |
| relevantly. If not, it will write the data and the cursor will be |
| updated. Note that :c:func:`net_pkt_set_data` could be used in the RX |
| path as well, but it is slightly faster to use |
| :c:func:`net_pkt_acknowledge_data` as this one does not care about |
| contiguity at all, it just advances the cursor via |
| :c:func:`net_pkt_skip` directly. |
| |
| |
| API Reference |
| ************* |
| |
| .. doxygengroup:: net_pkt |
| :project: Zephyr |