# LwIP changes for Matter
LwIP is one of the network layers used in the Matter platform. Its IPv6 support
is reasonably good, but it is lacking in several areas that we should address
for Matter. The recommendations here are listed roughly from most to least
important.
## Route Information Options (RIO)
The specification requires devices to store the routes carried in Route
Information Options (RIO) sent in router advertisements. This functionality is
not currently present in upstream LwIP. The patch to add it is relatively small,
but we may need to upstream it in order to require its use in Matter. Platforms
would need to incorporate this into their own middleware.
### Recommendation:
- Write a RIO patch and upstream it to LwIP
- Ensure the patch is RFC compliant (especially regarding route expiry)
- UPDATE: A patch is available at https://savannah.nongnu.org/patch/?10114
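
To make the parsing and expiry requirements concrete, here is a minimal sketch of decoding a single RIO as laid out in RFC 4191. It is not the patch linked above and does not use lwIP types: `struct rio_route` and `rio_parse` are hypothetical names chosen for illustration, and a real implementation would feed the result into the routing table and an expiry timer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RIO_OPTION_TYPE 24u              /* ND option type from RFC 4191 */
#define RIO_LIFETIME_INFINITE 0xFFFFFFFFu

/* Hypothetical result record; not an lwIP type. */
struct rio_route {
    uint8_t  prefix[16];   /* bits beyond prefix_len are zeroed */
    uint8_t  prefix_len;   /* 0..128 */
    int8_t   preference;   /* +1 high, 0 medium, -1 low */
    uint32_t lifetime_s;   /* 0 withdraws the route */
};

/* Parses one option starting at its Type byte.
 * Returns 0 on success, -1 if the option must be ignored. */
int rio_parse(const uint8_t *opt, size_t avail, struct rio_route *out)
{
    assert(opt != NULL && out != NULL);
    if (avail < 8 || opt[0] != RIO_OPTION_TYPE) {
        return -1;
    }
    size_t units = opt[1];               /* option length in 8-octet units */
    uint8_t prefix_len = opt[2];
    if (units < 1 || units > 3 || units * 8 > avail || prefix_len > 128) {
        return -1;
    }
    /* The option must be long enough to carry the advertised prefix. */
    if ((prefix_len > 64 && units < 3) || (prefix_len > 0 && units < 2)) {
        return -1;
    }
    uint8_t prf = (uint8_t)((opt[3] >> 3) & 0x03u);
    if (prf == 0x02u) {                  /* reserved preference: ignore */
        return -1;
    }

    memset(out, 0, sizeof(*out));
    memcpy(out->prefix, opt + 8, (units - 1) * 8);
    /* Receivers ignore prefix bits past prefix_len, so clear them. */
    for (size_t i = 0; i < 16; i++) {
        size_t bit = i * 8;
        if (bit >= prefix_len) {
            out->prefix[i] = 0;
        } else if (prefix_len - bit < 8) {
            out->prefix[i] &= (uint8_t)(0xFFu << (8 - (prefix_len - bit)));
        }
    }
    out->prefix_len = prefix_len;
    out->preference = (prf == 0x01u) ? 1 : (prf == 0x03u) ? -1 : 0;
    out->lifetime_s = ((uint32_t)opt[4] << 24) | ((uint32_t)opt[5] << 16) |
                      ((uint32_t)opt[6] << 8) | (uint32_t)opt[7];
    return 0;
}
```

A caller would treat `lifetime_s == 0` as a request to remove the route and `RIO_LIFETIME_INFINITE` as "never expires"; any other value should start or refresh an expiry timer for the prefix.
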
## Address Scopes
Link-local addresses are less common on IPv4, which normally relies on NAT at
the router to do address translation. Matter mandates the use of IPv6
link-local addresses for communication with nodes on the same network (Wi-Fi or
Thread). When there is more than one netif in the system (e.g. loopback, softAP,
STA), a link-local address needs more information to determine which link it is
local to. This is normally added as the link-local scope and can be seen on
addresses such as `FE80::xxxx:xxxx:xxxx:xxxx%<scope>`, where `<scope>`
identifies the netif (something like `%wlan0` or `%eno1`).
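
As an illustration of the `%<scope>` notation, the sketch below splits a textual address into its address and zone parts and checks whether a binary address falls in `fe80::/10`. The function names are hypothetical and the code is independent of lwIP; it only shows the information an API would need to carry alongside a link-local address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True when the 16-byte address lies in fe80::/10. */
bool ipv6_is_link_local(const uint8_t addr[16])
{
    return addr[0] == 0xFE && (addr[1] & 0xC0) == 0x80;
}

/* Splits "addr%zone" into two NUL-terminated strings.
 * Returns 1 if a zone was present, 0 if there was none,
 * and -1 if the input is malformed or does not fit the buffers. */
int split_scoped_addr(const char *text,
                      char *addr, size_t addr_cap,
                      char *zone, size_t zone_cap)
{
    assert(text != NULL && addr != NULL && zone != NULL);
    const char *pct = strchr(text, '%');
    size_t addr_len = pct ? (size_t)(pct - text) : strlen(text);
    size_t zone_len = pct ? strlen(pct + 1) : 0;

    if (addr_len == 0 || addr_len >= addr_cap || zone_len >= zone_cap) {
        return -1;
    }
    if (pct != NULL && zone_len == 0) {
        return -1;                       /* "fe80::1%" has an empty zone */
    }
    memcpy(addr, text, addr_len);
    addr[addr_len] = '\0';
    memcpy(zone, pct ? pct + 1 : "", zone_len);
    zone[zone_len] = '\0';
    return pct ? 1 : 0;
}
```

For `"fe80::1%wlan0"` this yields the address `fe80::1` and the zone `wlan0`; a link-local address that arrives without a zone is the ambiguous case on a multi-netif system.
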
Without this indicator, a link-local address can only be resolved if there is
exactly one netif. LwIP will also allow a direct address match to the netif
source address, but this does not scale well and is very racy. LwIP also
supports output to a specific netif, but this is not exposed at the socket
layer.
Upstream LwIP has support for IPv6 address scopes, but only as an option.
However, the code to support this is not present in the CHIP LwIP codebase.
Other platform versions assume this option is not present (e.g. M5 has an
assertion on IP address sizes that disallows the use of a scope tag).
### Recommendation:
- Ensure Matter SDK code works with scopes on our various platforms, OR, as an
alternative, bring netif sendto up through the API / sockets layers
- Audit Matter code to ensure LL addresses are properly scoped to their netif
in all areas (DNS returned addresses especially)
## Duplicate address detection
The DAD in LwIP is implemented correctly as it stands, but some routers
implement IPv6 multicast incorrectly and send packets back to the sender. This
triggers the LwIP DAD because it doesn't check the source. This can be fixed in
the Wi-Fi layer as a filter, but it's easy enough to add the fix into the LwIP
layer. This would help implementers so they don't all have to debug the same
issues.
### Recommendation:
- Create an LwIP patch to check NS/NA packets for source and discard if they
originate from the same device. Upstream and offer patch to vendors.
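
A minimal sketch of such a check follows. It assumes "source" means the link-layer source address of the received frame, since a DAD Neighbor Solicitation is sent from the unspecified IPv6 address and so cannot be matched on its IP source. The function name and constants are hypothetical, not lwIP identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ICMP6_TYPE_NEIGHBOR_SOLICIT 135u
#define ICMP6_TYPE_NEIGHBOR_ADVERT  136u
#define MAC_LEN 6u

/* Returns true when an NS/NA frame carries our own hardware address as
 * its link-layer source, i.e. it is our own packet reflected back. */
bool nd6_is_reflected(const uint8_t *frame_src_mac,
                      const uint8_t *own_mac,
                      uint8_t icmp6_type)
{
    assert(frame_src_mac != NULL && own_mac != NULL);
    if (icmp6_type != ICMP6_TYPE_NEIGHBOR_SOLICIT &&
        icmp6_type != ICMP6_TYPE_NEIGHBOR_ADVERT) {
        return false;                    /* only filter ND messages */
    }
    return memcmp(frame_src_mac, own_mac, MAC_LEN) == 0;
}
```

The input path would call this before DAD processing and drop the packet when it returns true, leaving genuine conflicts from other devices untouched.
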
## Timers, including TCP
lwIP uses on-demand timers for IGMP and MLD (see
https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-guides/lwip.html#esp-lwip-custom-modifications
for changes Espressif made to lwIP to help power usage on ESP32 and better
support IPv6), and also has several uncorrelated always-on timers for TCP. These
timers have caused power issues on some products.
### Recommendation:
- Make sure to take in Espressif's improvements to timers (it is unclear
whether they have been upstreamed)
- Look into supporting aligned TCP timers to aggregate multiple timers within
a single wake
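
One way to picture timer alignment is to round every deadline up to a shared slot boundary so that nearby timers expire in the same wake. The sketch below is an assumption about how this could be done, not how lwIP or ESP-IDF implement it; `align_deadline_ms`, `count_wakes` and the slot size are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Rounds a deadline up to the next multiple of slot_ms. */
uint64_t align_deadline_ms(uint64_t deadline_ms, uint64_t slot_ms)
{
    assert(slot_ms > 0);
    uint64_t rem = deadline_ms % slot_ms;
    return rem == 0 ? deadline_ms : deadline_ms + (slot_ms - rem);
}

/* Counts the distinct wake-ups needed once all deadlines are aligned. */
size_t count_wakes(const uint64_t *deadlines_ms, size_t n, uint64_t slot_ms)
{
    size_t wakes = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t aligned = align_deadline_ms(deadlines_ms[i], slot_ms);
        bool seen = false;
        for (size_t j = 0; j < i; j++) {
            if (align_deadline_ms(deadlines_ms[j], slot_ms) == aligned) {
                seen = true;
                break;
            }
        }
        if (!seen) {
            wakes++;
        }
    }
    return wakes;
}
```

Rounding up delays each timer by less than one slot, so the slot size has to be chosen against the tolerance of the timers being merged; timers that cannot tolerate the delay would need to stay unaligned.
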
## pbuf management
Pool-based management has been a source of problems on several products, but
does have advantages over the purely heap-based allocation of pbufs done in
ESP32 and many common lwIP stacks.
Overall, having the ability to instrument all pbuf allocations by usage (e.g.
driver TX, driver RX, manual PacketBuffer allocation, internal TCP stack pbufs,
etc.) would let us move towards a pool approach by allowing us to:

- Understand the overall memory usage of lwIP packet buffers over time, helping
debug issues related to out-of-pbuf conditions or overly long queuing.
- Keep track of how long incoming and outgoing packets dwell, so that we can
start dropping at ingress when running out of memory.
- Size heaps and pools based on usage patterns.

### Recommendation:
- Upstream a portable version of pbuf alloc/free accounting, allowing
registration of instrumentation handlers.
- Add support to account for high watermark of pbuf memory used and concurrent
pbuf allocations.
- Add more pbuf allocation types to allow finer-grained recording of reason
for a pbuf alloc
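
The accounting described above could look roughly like the following sketch. Everything here is hypothetical: the `acct_*` names, the reason codes and the handler signature are illustrative and are not part of lwIP. The sketch is also not thread-safe; a real port would protect the counters with the stack's own locking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allocation reasons, mirroring the usages listed above. */
enum acct_reason {
    ACCT_DRIVER_TX, ACCT_DRIVER_RX, ACCT_MANUAL, ACCT_TCP_INTERNAL,
    ACCT_REASON_COUNT
};
enum acct_event { ACCT_EVENT_ALLOC, ACCT_EVENT_FREE };

struct acct_stats {
    size_t bytes_in_use;
    size_t bytes_high_water;   /* peak of bytes_in_use */
    size_t bufs_in_use;
    size_t bufs_high_water;    /* peak concurrent allocations */
    size_t bufs_by_reason[ACCT_REASON_COUNT];
};

typedef void (*acct_handler_fn)(enum acct_event event, enum acct_reason reason,
                                size_t bytes, void *ctx);

static struct acct_stats g_stats;
static acct_handler_fn g_handler;
static void *g_handler_ctx;

void acct_register_handler(acct_handler_fn fn, void *ctx)
{
    g_handler = fn;
    g_handler_ctx = ctx;
}

void acct_reset(void) { memset(&g_stats, 0, sizeof(g_stats)); }

const struct acct_stats *acct_get(void) { return &g_stats; }

/* Call from the allocation path after a pbuf has been obtained. */
void acct_on_alloc(enum acct_reason reason, size_t bytes)
{
    assert(reason < ACCT_REASON_COUNT);
    g_stats.bytes_in_use += bytes;
    g_stats.bufs_in_use++;
    g_stats.bufs_by_reason[reason]++;
    if (g_stats.bytes_in_use > g_stats.bytes_high_water) {
        g_stats.bytes_high_water = g_stats.bytes_in_use;
    }
    if (g_stats.bufs_in_use > g_stats.bufs_high_water) {
        g_stats.bufs_high_water = g_stats.bufs_in_use;
    }
    if (g_handler != NULL) {
        g_handler(ACCT_EVENT_ALLOC, reason, bytes, g_handler_ctx);
    }
}

/* Call from the free path with the same reason and size used at alloc. */
void acct_on_free(enum acct_reason reason, size_t bytes)
{
    assert(reason < ACCT_REASON_COUNT);
    assert(g_stats.bytes_in_use >= bytes && g_stats.bufs_by_reason[reason] > 0);
    g_stats.bytes_in_use -= bytes;
    g_stats.bufs_in_use--;
    g_stats.bufs_by_reason[reason]--;
    if (g_handler != NULL) {
        g_handler(ACCT_EVENT_FREE, reason, bytes, g_handler_ctx);
    }
}
```

The registered handler is where a product could attach logging or dwell-time tracking without the accounting core needing to know about it.
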
## IPv6 Ping
Although ping is not required for Matter, it is very helpful for debugging
networking issues. Having a reliable ping would be beneficial for a lot of
developers.
LwIP will automatically respond to pings, but has no built-in way to send them.
The current ping implementation is a contrib app that only works for IPv4.
Extending the app is challenging for two reasons: 1) the IPv6 checksum needs
access to the pbuf for calculation, which the app doesn't have, and 2) IPv6 has
a lot more ICMP traffic for SLAAC that the app would have to be updated to
disregard. Instead, it might be better to build this into the ICMP layer itself.
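
To show why the checksum is awkward to compute outside the IP layer, the sketch below builds an ICMPv6 Echo Request and checksums it over the IPv6 pseudo-header, which includes the source and destination addresses. It works on plain byte buffers rather than pbufs, and the function names are hypothetical; in a real stack the source address is selected by the IP layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ICMP6_TYPE_ECHO_REQUEST 128u
#define IP6_NEXT_HEADER_ICMP6   58u

/* Adds big-endian 16-bit words of data to acc, zero-padding an odd tail. */
static uint32_t sum16(const uint8_t *data, size_t len, uint32_t acc)
{
    size_t i;
    for (i = 0; i + 1 < len; i += 2) {
        acc += ((uint32_t)data[i] << 8) | data[i + 1];
    }
    if (i < len) {
        acc += (uint32_t)data[i] << 8;
    }
    return acc;
}

/* One's-complement checksum over the pseudo-header plus the message.
 * Returns 0 when run over a message whose checksum field is valid. */
uint16_t icmp6_checksum(const uint8_t src[16], const uint8_t dst[16],
                        const uint8_t *msg, size_t msg_len)
{
    assert(msg_len <= 0xFFFFu);
    uint8_t tail[8] = {
        0, 0, (uint8_t)(msg_len >> 8), (uint8_t)msg_len, /* upper-layer length */
        0, 0, 0, IP6_NEXT_HEADER_ICMP6                   /* zero pad, next hdr */
    };
    uint32_t acc = sum16(src, 16, 0);
    acc = sum16(dst, 16, acc);
    acc = sum16(tail, sizeof(tail), acc);
    acc = sum16(msg, msg_len, acc);
    while (acc >> 16) {
        acc = (acc & 0xFFFFu) + (acc >> 16);
    }
    return (uint16_t)~acc;
}

/* Writes an Echo Request into buf; returns its length, or 0 if it won't fit. */
size_t icmp6_build_echo_request(uint8_t *buf, size_t cap,
                                const uint8_t src[16], const uint8_t dst[16],
                                uint16_t id, uint16_t seq,
                                const uint8_t *payload, size_t payload_len)
{
    size_t total = 8 + payload_len;
    if (buf == NULL || total > cap || total > 0xFFFFu) {
        return 0;
    }
    buf[0] = ICMP6_TYPE_ECHO_REQUEST;
    buf[1] = 0;                          /* code */
    buf[2] = 0;                          /* checksum placeholder */
    buf[3] = 0;
    buf[4] = (uint8_t)(id >> 8);
    buf[5] = (uint8_t)id;
    buf[6] = (uint8_t)(seq >> 8);
    buf[7] = (uint8_t)seq;
    if (payload_len > 0) {
        memcpy(buf + 8, payload, payload_len);
    }
    uint16_t csum = icmp6_checksum(src, dst, buf, total);
    buf[2] = (uint8_t)(csum >> 8);
    buf[3] = (uint8_t)csum;
    return total;
}
```

A receiver can validate a reply by running `icmp6_checksum` over the received message with its checksum field in place and checking for a result of zero.
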
### Recommendation:
- Add an ASYNC send_icmp6_ping function and add a hook to check ping
responses. Upstream the patch if possible, OR write an external ICMPv6 ping
utility
## DNS
LwIP's DNS handling isn't great and breaks down when the router supports both
IPv4 and IPv6. There is a single list of DNS servers, and DHCP, SLAAC and DHCPv6
all update it without locks. Basically, whichever wrote to the list last gets to
set it. Although there is handling for IP type (requesting `A` or `AAAA`
records), there isn't handling to specify an IPv6 or IPv4 server specifically,
which can be challenging since not all servers serve all record types.
The design of the Weave connectivity manager moves the DNS selection to the
upper layers by stopping LwIP from directly changing the DNS list and hooking
into the DNS selection. This means the DNS selection policy isn't hard-coded
into the LwIP layer. This seems like a good model for CHIP going forward.
Additionally, we should ensure that CHIP uses non-blocking DNS APIs.
### Recommendation:
- Bug fix for DHCPv6 to avoid it setting bad addresses.
  - Note: fixed in
    https://git.savannah.nongnu.org/cgit/lwip.git/commit/?id=941300c21c45a4dbf1c074b29a9ca3c88c9f6553,
    but not yet released as part of an official release.
- Create a patch to add hooks to the SetDns and GetDns functions so that the
  logic for selecting the DNS server can be moved into the manager layer
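
The hook idea can be sketched as below: every write to the server list passes through an optional policy callback, and reads can be overridden by the manager. All names (`stack_set_dns`, `dns_hooks_install`, the `dns_source` values and the example policy) are hypothetical and do not correspond to existing lwIP functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DNS_SLOTS 2u

/* Who is proposing the server; used only by the policy hook. */
enum dns_source { DNS_FROM_DHCPV4, DNS_FROM_SLAAC, DNS_FROM_DHCPV6, DNS_FROM_MANAGER };

struct dns_server {
    bool    is_v6;
    uint8_t addr[16];      /* IPv4 uses the first 4 bytes */
};

/* Return false to veto the write. */
typedef bool (*dns_set_hook_fn)(size_t slot, const struct dns_server *proposed,
                                enum dns_source source, void *ctx);
/* Return a server to override the stored one, or NULL to fall through. */
typedef const struct dns_server *(*dns_get_hook_fn)(size_t slot, void *ctx);

static struct dns_server g_servers[DNS_SLOTS];
static bool g_valid[DNS_SLOTS];
static dns_set_hook_fn g_set_hook;
static dns_get_hook_fn g_get_hook;
static void *g_hook_ctx;

void dns_hooks_install(dns_set_hook_fn set_hook, dns_get_hook_fn get_hook, void *ctx)
{
    g_set_hook = set_hook;
    g_get_hook = get_hook;
    g_hook_ctx = ctx;
}

/* Stores the server unless the policy hook rejects it. */
bool stack_set_dns(size_t slot, const struct dns_server *server, enum dns_source source)
{
    assert(server != NULL);
    if (slot >= DNS_SLOTS) {
        return false;
    }
    if (g_set_hook != NULL && !g_set_hook(slot, server, source, g_hook_ctx)) {
        return false;
    }
    g_servers[slot] = *server;
    g_valid[slot] = true;
    return true;
}

const struct dns_server *stack_get_dns(size_t slot)
{
    if (slot >= DNS_SLOTS) {
        return NULL;
    }
    if (g_get_hook != NULL) {
        const struct dns_server *chosen = g_get_hook(slot, g_hook_ctx);
        if (chosen != NULL) {
            return chosen;
        }
    }
    return g_valid[slot] ? &g_servers[slot] : NULL;
}

/* Example policy: only the connectivity manager may change the list. */
bool policy_manager_only(size_t slot, const struct dns_server *proposed,
                         enum dns_source source, void *ctx)
{
    (void)slot; (void)proposed; (void)ctx;
    return source == DNS_FROM_MANAGER;
}
```

With no hook installed the behaviour matches the "last writer wins" situation described earlier; installing `policy_manager_only` makes DHCP, SLAAC and DHCPv6 updates advisory, leaving the final choice to the manager layer.
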