
Re: Status of Operational issues with Tiny Fragments in IPv6



On May 26, 2006, at 1:21 AM, Elwyn Davies wrote:
I would suggest that rather than making this into a separate draft we can look at improving the text in the security overview (if necessary) given that we are probably going to have to do *another* round on this one.

That seems rational to me.

A quick comment:

- no overlapping fragments and
- non-last fragments to be close to the guaranteed minimum MTU.

I think the latter point wants to read "non-last fragments close to the PATH MTU, and therefore at least as large as the largest fragment size less than or equal in size to the minimum MTU."
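
To make the two rules concrete, here is a minimal sketch of a receiver-side check. It is an illustration only, not text from any draft. It assumes the IPv6 minimum MTU of 1280 bytes, a 40-byte IPv6 header, an 8-byte Fragment header, and no other extension headers in the unfragmentable part. The names `min_nonlast_payload` and `check_fragments` are hypothetical, and the strict threshold is stricter than "close to" strictly requires.

```python
IPV6_MIN_MTU = 1280     # guaranteed minimum link MTU for IPv6
IPV6_HEADER = 40        # fixed IPv6 header, bytes
FRAGMENT_HEADER = 8     # IPv6 Fragment extension header, bytes


def min_nonlast_payload(min_mtu=IPV6_MIN_MTU):
    """Largest fragment payload that fits in a minimum-MTU packet.

    Non-last fragment payloads must be a multiple of 8 bytes.
    Assumes no extension headers other than the Fragment header.
    """
    return (min_mtu - IPV6_HEADER - FRAGMENT_HEADER) // 8 * 8


def check_fragments(fragments, min_payload=None):
    """Return a list of problems found in one datagram's fragments.

    fragments: iterable of (offset_bytes, payload_len, more_fragments).
    Flags overlapping fragments and undersized non-last fragments.
    """
    if min_payload is None:
        min_payload = min_nonlast_payload()
    problems = []
    prev_end = 0
    for offset, length, more in sorted(fragments):
        if offset < prev_end:
            problems.append(f"overlap at offset {offset}")
        if more and length < min_payload:
            problems.append(
                f"tiny non-last fragment at offset {offset} ({length} bytes)"
            )
        prev_end = max(prev_end, offset + length)
    return problems


if __name__ == "__main__":
    print(min_nonlast_payload())
    print(check_fragments([(0, 1232, True), (1232, 500, False)]))
    print(check_fragments([(0, 64, True), (64, 500, False)]))
```

A real implementation would presumably want some slack below the threshold, since the sender only knows an estimate of the path MTU.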

There is a discussion of fragmentation and reassembly in RFC 1812 section 4.2.2.7 that may be useful to reference, or at least learn lessons from. In part, this results from the behavior of NIC cards in the late 1980s that couldn't reliably receive datagrams back to back for very long due to chip or memory issues, and in part it is due to brain-dead behaviors in early end-station OSs. One issue was that many systems sent packets at the minimum MTU size (then 576 bytes, derived from the memory structure of the Fuzzball and BBN Routers) rather than at the path MTU, which meant that there were greater opportunities for per-datagram errors because there were more datagrams. There were also various strategies on how to fragment - should the smallest fragment be the first, the last, or should the fragmenting system try to make them all approximately the same size? RFC 1812 makes some specific recommendations:

4.2.2.7 Fragmentation: RFC 791 Section 3.2

   Fragmentation, as described in [INTERNET:1], MUST be supported by a
   router.

   When a router fragments an IP datagram, it SHOULD minimize the number
   of fragments.  When a router fragments an IP datagram, it SHOULD send
   the fragments in order.  A fragmentation method that may generate one
   IP fragment that is significantly smaller than the other MAY cause
   the first IP fragment to be the smaller one.

   DISCUSSION
      There are several fragmentation techniques in common use in the
      Internet.  One involves splitting the IP datagram into IP
      fragments with the first being MTU sized, and the others being
      approximately the same size, smaller than the MTU.  The reason for
      this is twofold.  The first IP fragment in the sequence will be
      the effective MTU of the current path between the hosts, and the
      following IP fragments are sized to minimize the further
      fragmentation of the IP datagram.  Another technique is to split
      the IP datagram into MTU sized IP fragments, with the last
      fragment being the only one smaller, as described in [INTERNET:1].

      A common trick used by some implementations of TCP/IP is to
      fragment an IP datagram into IP fragments that are no larger than
      576 bytes when the IP datagram is to travel through a router.
      This is intended to allow the resulting IP fragments to pass the
      rest of the path without further fragmentation.  This would,
      though, create more of a load on the destination host, since it
      would have a larger number of IP fragments to reassemble into one
      IP datagram.  It would also not be efficient on networks where the
      MTU only changes once and stays much larger than 576 bytes.
      Examples include LAN networks such as an IEEE 802.5 network with a
      MTU of 2048 or an Ethernet network with an MTU of 1500).

      One other fragmentation technique discussed was splitting the IP
      datagram into approximately equal sized IP fragments, with the
      size less than or equal to the next hop network's MTU.  This is
      intended to minimize the number of fragments that would result
      from additional fragmentation further down the path, and assure
      equal delay for each fragment.

      Routers SHOULD generate the least possible number of IP fragments.

      Work with slow machines leads us to believe that if it is
      necessary to fragment messages, sending the small IP fragment
      first maximizes the chance of a host with a slow interface of
      receiving all the fragments.

I think the NIC card issue (where should the smallest fragment be?) is historical and not to be worried about. But the matter of generating (by whatever algorithm) the least number of fragments that can represent a transport-level message is, I think, important.
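
As a rough illustration of why the fragment count matters, the sketch below compares the strategies from the RFC 1812 discussion. It is a simplified model under stated assumptions: IPv4 with a 20-byte header and no options, non-last fragment payloads that are multiples of 8 bytes, and hypothetical function names chosen only for this example.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options


def max_fragment_payload(mtu, header=IPV4_HEADER):
    """Largest payload a non-last fragment can carry (multiple of 8)."""
    return (mtu - header) // 8 * 8


def fragment_mtu_sized(payload_len, mtu, header=IPV4_HEADER):
    """MTU-sized fragments, with only the last one smaller."""
    step = max_fragment_payload(mtu, header)
    sizes = []
    remaining = payload_len
    while remaining > step:
        sizes.append(step)
        remaining -= step
    sizes.append(remaining)
    return sizes


def fragment_equal_sized(payload_len, mtu, header=IPV4_HEADER):
    """Approximately equal fragments, using the same minimal count."""
    step = max_fragment_payload(mtu, header)
    count = -(-payload_len // step)   # ceiling division
    per = -(-payload_len // count)    # even share, rounded up
    per = -(-per // 8) * 8            # round up to a multiple of 8
    return [per] * (count - 1) + [payload_len - per * (count - 1)]


if __name__ == "__main__":
    payload = 4000  # transport-level message size, bytes (example value)
    print(fragment_mtu_sized(payload, 1500))        # fragment at the path MTU
    print(fragment_equal_sized(payload, 1500))      # equal-sized variant
    print(len(fragment_mtu_sized(payload, 576)))    # the "576-byte trick"
```

With these example numbers, fragmenting at a 1500-byte MTU yields three fragments under either strategy, while capping fragments at 576 bytes yields eight, so the receiver has more pieces to collect and more chances to lose one.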