RE: [RRG] LISP-NERD reachability and MTU detection
Fragmentation is indeed harmful especially if it is left
unchecked in the presence of sustained high data rates
(RFC4963). Based on analysis over many years, complete
avoidance of all fragmentation in all operational
scenarios is not possible if applications are to see
a reasonable path MTU. However, fragmentation must be
detected, managed and especially *dampened* in order to
overcome the issues. Having examined this for some time,
my opinion is that effective mitigation of Tunnel Near-
End (TNE) -> Tunnel Far End (TFE) fragmentation and
reassembly begins with protocol and ends with full
deployment of links and TFEs that support larger MTUs.
As many have suggested over the years, IMHO it is better to
have the TNE perform fragmentation on encapsulated packets
rather than leave it up to chance occurrence in the network.
(Iljitsch expressed an idea of having the TNE split 1500 byte
packets up into two 750 byte packets instead of 1480 + 20,
and I admit there is a certain attraction to that concept.)
Unlike Iljitsch, however, I believe there needs to be a
"magic number" above which no TNE->TFE fragmentation is
permitted - call it the "TNE->TFE fragmentation threshold".
Note that this is specifically *not* saying that there needs
to be an upper bound on the tunnel MTU - quite the opposite.
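
To make the 750 + 750 versus 1480 + 20 comparison concrete, here is a
minimal sketch of both splitting strategies with a fragmentation
threshold applied first. It assumes IPv4 outer fragmentation (so every
fragment except the last carries a multiple of 8 bytes) and ignores the
encapsulation headers themselves. The function names and the
`max_frag_payload` parameter are hypothetical, chosen for illustration
only.

```python
import math

FRAG_UNIT = 8  # IPv4 fragment offsets are expressed in 8-byte units


def _usable(max_frag_payload):
    """Round the per-fragment capacity down to a multiple of 8 bytes."""
    usable = max_frag_payload - (max_frag_payload % FRAG_UNIT)
    if usable < FRAG_UNIT:
        raise ValueError("per-fragment capacity must be at least 8 bytes")
    return usable


def split_greedy(payload_len, max_frag_payload):
    """Fill each fragment completely; the last one takes the remainder."""
    usable = _usable(max_frag_payload)
    sizes = []
    remaining = payload_len
    while remaining > usable:
        sizes.append(usable)
        remaining -= usable
    sizes.append(remaining)
    return sizes


def split_balanced(payload_len, max_frag_payload):
    """Use the same fragment count as greedy, but with near-equal sizes."""
    usable = _usable(max_frag_payload)
    count = math.ceil(payload_len / usable)
    per_frag = math.ceil(payload_len / count)
    # Round up to the 8-byte unit; this never exceeds `usable`.
    per_frag = -(-per_frag // FRAG_UNIT) * FRAG_UNIT
    return [per_frag] * (count - 1) + [payload_len - per_frag * (count - 1)]


def tne_fragment(payload_len, max_frag_payload, threshold=1500):
    """Return fragment sizes, or None when the packet exceeds the threshold.

    None means the TNE does not fragment this packet at all; what happens
    to it instead is a separate policy decision.
    """
    if payload_len <= max_frag_payload:
        return [payload_len]
    if payload_len > threshold:
        return None
    return split_balanced(payload_len, max_frag_payload)


print(split_greedy(1500, 1480))
print(split_balanced(1500, 1480))
```

Because of the 8-byte alignment rule, the balanced split of a 1500 byte
packet comes out as 752 + 748 rather than exactly 750 + 750.
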
Key considerations are: 1) 1500 bytes has become the
"magic number" expected by applications, 2) 1280 bytes is
the "magic number" specified for IPv6, and 3) fragmentation
at the TFE MUST be kept to a minimum in order to avoid
reassembly misassociations at the TFE. Of these, IMHO 3) is
the dominating consideration followed distantly by 1). ( 2)
is the hard lower bound for IPv6, and we can't change that.)
In particular, I want to see a requirement that TNEs MUST NOT
configure a fragmentation threshold larger than 1500 bytes
for the packets they admit into the tunnel. This would be
large enough to satisfy 1) and 2), yet small enough not to
diminish any integrity checks performed by the TFE during
decapsulation. Even so, in some deployments (i.e., paths that
include links with ~1500 byte or smaller MTUs) a fragmentation
threshold of 1500 bytes will result in fragmentation which MUST
be mitigated and preferably dampened. Therefore, I believe a
TNE<->TFE protocol such as that specified in sprite-mtu is
required for the near term, but its use can eventually
diminish and fade as network equipment is transitioned over
the longer term.
Specific transitions I would like to see include:
1) Require that all TFEs configure an EMTU_R that is no
smaller than 2KB and at least as large as the smallest
EMTU_R of all underlying links over which the TFE is
configured. (IMHO 2KB is a good number because it
allows for a 1500 byte fragmentation threshold at the
TNE yet allows room for additional encapsulations
on the path.)
2) Require that all links transition to adopting IEEE
802.3as Ethernet Frame Size expansion, or better yet
Gigabit Ethernet Jumboframes.
3) Require that all original sources that send packets
of 1501 bytes or larger with DF=1 also implement
This leaves open the question of how the ITR handles
1501+ byte packets with DF=1. The two viable choices I see
are: a) have the TNE probe the TFE to determine a "Packet
Too Big (PTB) threshold" above which it sends a PTB back to
the original source, or b) simply have the TNE admit the
packet into the tunnel with no PTB returned, and let the
original source deal with any MTU-related loss. IMHO, TNEs
are advised to do a) until such time that widespread use of
RFC4821 can be confirmed, after which they may do b).
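
The choice between a) and b) can be sketched as a small decision
function. This is only an illustration under stated assumptions: the
function name, the `ptb_threshold` value (taken here as already learned
by probing the TFE) and the `rfc4821_confirmed` flag are hypothetical,
and the sketch covers only packets of 1501 bytes or larger with DF=1.

```python
def handle_large_df_packet(packet_len, ptb_threshold, rfc4821_confirmed=False):
    """Decide what the TNE does with a 1501+ byte packet that has DF=1.

    Returns a tuple (action, mtu_to_report):
      ("send_ptb", ptb_threshold)  -> option a): drop and return a PTB
      ("admit", None)              -> admit into the tunnel, no PTB
    """
    if packet_len < 1501:
        raise ValueError("this sketch only covers packets of 1501+ bytes")

    if rfc4821_confirmed:
        # Option b): the original source is trusted to detect and recover
        # from MTU-related loss on its own.
        return ("admit", None)

    # Option a): compare against the probed PTB threshold.
    if packet_len > ptb_threshold:
        return ("send_ptb", ptb_threshold)
    return ("admit", None)


print(handle_large_df_packet(4000, ptb_threshold=2000))
print(handle_large_df_packet(1800, ptb_threshold=2000))
print(handle_large_df_packet(4000, ptb_threshold=2000, rfc4821_confirmed=True))
```
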
In both the 1500-and-smaller and 1501-and-larger cases,
a simple UDP echo service such as specified in sprite-mtu
would support the short-term requirements until the
wish-list of network equipment transitions is complete.
(Its use would extend also into the long term, but in a
> -----Original Message-----
> From: Dino Farinacci [mailto:email@example.com]
> Sent: Sunday, December 16, 2007 11:35 AM
> To: Tony Li
> Cc: Iljitsch van Beijnum; Routing Research Group list
> Subject: Re: [RRG] LISP-NERD reachability and MTU detection
> > On Dec 16, 2007, at 10:23 AM, Dino Farinacci wrote:
> >> So what's wrong with fragmentation?
> > Please see http://tinyurl.com/23nc5x
> Not that harmful if it's a remote corner case.
> to unsubscribe send a message to firstname.lastname@example.org with the
> word 'unsubscribe' in a single line as the message text body.
> archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg