IPv6, fragmentation and MTU
In IPv4, routers can (and do) fragment packets when a packet must be forwarded over a link whose MTU is too small to carry it. Assuming the packet doesn’t have the DF (don't fragment) bit set, the IPv4 router will split it into two or more fragments and forward them downstream for reassembly by the end host.
In the IPv6 world, routers and other intermediate nodes do not fragment packets. A router that receives a packet too large to forward simply drops it. This was likely designed into IPv6 at least partly because of the security problems fragmentation caused: fragments could hide protocol-level header information and cause reassembly surprises, processing or reassembling them could DoS a host or router, and a slew of other issues made fragmentation an IPv4 four-letter word.
In IPv6, although the packet is dropped, the router or node that drops it sends an ICMPv6 “Packet Too Big” message back to the source host, reporting the MTU of the link that could not carry the packet (the minimum IPv6 MTU is defined as 1280 bytes). The source host must then perform any fragmentation itself and retransmit the ensuing flow at the smaller MTU. The end node performs packet reassembly. This behaviour makes it clear that arbitrarily dropping ICMP, a common practice among clueless IPv4 system administrators of yesteryear, is now a deprecated practice 😉
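To make the source-side fragmentation concrete, here is a minimal sketch of how a sender might split a payload once it has learned the path MTU. The function name `fragment_sizes` is hypothetical, and the sketch assumes no extension headers other than the Fragment header; a real stack does considerably more (and TCP would normally just shrink its segment size instead).

```python
IPV6_HEADER = 40      # fixed IPv6 header size, in bytes
FRAGMENT_HEADER = 8   # IPv6 Fragment extension header size, in bytes
MIN_IPV6_MTU = 1280   # minimum MTU every IPv6 link must support


def fragment_sizes(payload_len, path_mtu):
    """Return the number of payload bytes carried by each packet.

    payload_len is the fragmentable part of the original packet.
    Assumption: no other extension headers precede the Fragment header.
    """
    if path_mtu < MIN_IPV6_MTU:
        # This sketch simply rejects values below the IPv6 minimum.
        raise ValueError("path MTU below the IPv6 minimum of 1280 bytes")
    if payload_len <= 0:
        raise ValueError("payload length must be positive")

    # The packet already fits: send it whole, without a Fragment header.
    if payload_len + IPV6_HEADER <= path_mtu:
        return [payload_len]

    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (path_mtu - IPV6_HEADER - FRAGMENT_HEADER) // 8 * 8

    sizes = []
    remaining = payload_len
    while remaining > per_fragment:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)
    return sizes


print(fragment_sizes(3000, 1280))  # [1232, 1232, 536]
```

At the 1280-byte minimum, each fragment carries at most 1232 bytes of payload, because 48 bytes go to the IPv6 and Fragment headers.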
Ensuring ICMPv6 message delivery is particularly important today, as a significant proportion of IPv6-capable networks use a tunneling mechanism of some sort, be it Hurricane Electric’s excellent tunnel broker service, 6RD, or static IPv6-over-IPv4 tunnels, so a 1500-byte MTU cannot be assumed. Path MTU discovery (PMTUD) is critical.
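The arithmetic behind the smaller tunnel MTU is simple enough to sketch. This assumes plain IPv6-in-IPv4 encapsulation (IP protocol 41, as used by 6in4 and 6RD) with an IPv4 header that carries no options; the helper name `tunnel_mtu` is made up for illustration, and other encapsulations add different overhead.

```python
ETHERNET_MTU = 1500  # typical MTU of the underlying IPv4 path, in bytes
IPV4_HEADER = 20     # IPv4 header without options, in bytes


def tunnel_mtu(outer_mtu=ETHERNET_MTU, overhead=IPV4_HEADER):
    """Return the largest IPv6 packet that fits inside one outer packet."""
    return outer_mtu - overhead


print(tunnel_mtu())      # 1480: IPv6-in-IPv4 over a 1500-byte path
print(tunnel_mtu(1492))  # 1472: the same tunnel over a 1492-byte path
```

Any IPv6 packet larger than this has to trigger a “Packet Too Big” at the tunnel entry point, which is exactly the message that must not be filtered.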
This is where tracepath6 becomes particularly interesting. My distro (RedHat et al.) seems to include it in the iputils package.
Here is an example of non-tunnelled, native dual-stack tracepath from a node to www.ripe.net:
tracepath6 www.ripe.net -n
 1?: [LOCALHOST] pmtu 1500
 1: 2001:db1::3 1.785ms
 2: 2001:db1::2 asymm 1 1.944ms
 3: 2001:db1:2:16::9 asymm 2 2.937ms
 4: ::ffff:188.8.131.52 asymm 11 16.193ms
 5: ::ffff:184.108.40.206 asymm 11 16. 60ms
 6: ::ffff:220.127.116.11 asymm 11 16. 69ms
 7: ::ffff:18.104.22.168 asymm 11 16.279ms
 8: 2001:550:3::8a asymm 9 15.860ms
 9: 2001:5a0:600:500::a asymm 8 16.746ms
10: 2001:5a0:f00:400::1 19.212ms
11: 2001:5a0:2000:400::19 asymm 9 90.538ms
12: 2001:5a0:2000:500::1 asymm 8 86.440ms
13: 2001:5a0:2000:500::a asymm 9 103.741ms
14: 2001:5a0:200:100::5 asymm 11 103.324ms
15: 2001:5a0:200:200::16 asymm 9 97.851ms
16: no reply
17: 2001:67c:2e8:27::1 asymm 13 98.410ms !A
    Resume: pmtu 1500
Apologies for the ::ffff:154.0.0.xx IPv4-mapped addresses; AS174 (Cogent…) seems to return them in traceroutes… Yuck! The asymm warning, which normally indicates a possibly asymmetrical path, is apparently caused by my local network’s use of HSRP. It can also appear in certain tracepaths when an intermediate router is a Juniper, which I am told reduces the TTL by one before transmitting ICMP; tracepath(6) uses the TTL (the hop limit, in IPv6) to judge path symmetry. I don’t have a Juniper to test this <:-)
Anyway, here is an example of a tracepath through an IPv6-over-IPv4 tunnel:
tracepath6 -n www.ripe.net
 1?: [LOCALHOST] pmtu 1500
 1: 2001:db1:ffff:fffd::1 5.301ms
 1: 2001:db1:ffff:fffd::1 5.135ms
 2: 2001:db1:ffff:fffd::1 5.144ms pmtu 1480
 2: 2001:db1:ffff:ffff::1 7.608ms
 2: 2001:db1:ffff:ffff::1 8.031ms
 3: 2001:db1:2:16::9 8.369ms
 4: ::ffff:22.214.171.124 20.434ms asymm 13
 5: ::ffff:126.96.36.199 20.127ms asymm 13
 6: ::ffff:188.8.131.52 23.160ms asymm 13
 7: ::ffff:184.108.40.206 23.119ms asymm 13
 8: 2001:550:3::8a 29.943ms asymm 11
 9: 2001:5a0:600:500::a 25.208ms asymm 10
10: 2001:5a0:f00:400::1 25.141ms asymm 12
11: 2001:5a0:2000:400::19 95.158ms
12: 2001:5a0:2000:500::1 95.819ms asymm 10
13: 2001:5a0:2000:500::a 109.372ms asymm 11
14: 2001:5a0:200:100::5 111.653ms asymm 12
15: 2001:5a0:200:200::16 107.308ms asymm 11
16: 2001:7f8:1::a500:3333:2 181.523ms asymm 15
17: 2001:67c:2e8:27::1 105.202ms !A
    Resume: pmtu 1480
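The interesting part of that output is the point where the reported pmtu falls from 1500 to 1480. If you want to spot such drops mechanically, something like the following rough sketch would do. It only assumes that tracepath prints `pmtu N` tokens as in the output above; the function names are my own, and the sample string is an abbreviated excerpt of that output.

```python
import re


def pmtu_values(output):
    """Return every 'pmtu N' value in the order tracepath printed it."""
    return [int(v) for v in re.findall(r"\bpmtu (\d+)", output)]


def pmtu_drops(output):
    """Return (before, after) pairs where the reported path MTU decreased."""
    values = pmtu_values(output)
    return [(a, b) for a, b in zip(values, values[1:]) if b < a]


# Abbreviated excerpt of the tunnelled tracepath6 output.
sample = (
    "1?: [LOCALHOST] pmtu 1500 "
    "1: 2001:db1:ffff:fffd::1 5.301ms "
    "2: 2001:db1:ffff:fffd::1 5.144ms pmtu 1480 "
    "17: 2001:67c:2e8:27::1 105.202ms !A "
    "Resume: pmtu 1480"
)

print(pmtu_values(sample))  # [1500, 1480, 1480]
print(pmtu_drops(sample))   # [(1500, 1480)]
```

An empty list from `pmtu_drops` on a path you know to be tunnelled would be a hint that “Packet Too Big” messages are not making it back to you.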
tracepath6 is a useful tool, especially in today’s Internet, where IPv6 connectivity does not necessarily come with the usual 1500-byte MTU of the past and many hosts use tunneling transition mechanisms to reach IPv6 content. It helps identify paths where ICMPv6 is broken, and working ICMPv6 is essential for reliable IPv6 communication. Until native dual-stack or even IPv6-only nodes are the norm, PMTUD and ICMPv6 will be critically important. Here is hoping sysadmins will be wise enough to follow best practices and allow ICMPv6 messages, especially “packet too big” and “destination unreachable” messages.