
Re: [RRG] MTU, jumboframes, ITR & ETR placement, ITR function in hosts



Hi Iljitsch,

In http://psg.com/lists/rrg/2007/msg00614.html I wrote that my
understanding of your message 611 was that too many hosts fail to
respond properly to RFC 1191 "Packet Too Big" (PTB, formally
"Datagram Too Big") messages for Path MTU Discovery to work with
Fred's Sprite proposal or my IPTM proposal.  When I asked whether
this is what you meant, or whether you meant the problem was due to
filters dropping the PTB messages, you replied that you meant the
latter (msg 628).

I think this latter problem should not prevent IPTM, or an ITR-ETR
implementation of Sprite from working well.

> What you see is that if you are behind a link with a < 1500-byte
> MTU (but not directly attached to that link because then the TCP
> MSS option avoids trouble) 

At least for TCP packets.  Maybe not for other protocols which
choose their own packet sizes and foolishly assume a 1500 byte MTU.

> that you can't reach certain
> destinations on the internet: sessions establish but then stall.
> In most cases, you, the client, will request something and the
> request is small but the reply is large. So this triggers PMTUD
> on the part of the remote website etc but they never adjust their
> packet sizes so nothing happens. If this happens with one site
> you could go after them and hopefully they'll fix it, but that
> doesn't work when it's tens or hundreds of thousands. The only 
> option you have as a customer behind a PPPoE link or some such is
> for your CPE to adjust the TCP MSS so others never send you large
> packets in the first place. And because everyone does that, the
> people responsible for the problem continue their brokenness.

I think it is not acceptable for such host adjustments to be
required in order to make an ITR-ETR scheme robust against PMTUD
black holes.

Here is a diagram depicting my understanding of what you describe:

Server                                                    Client

 SH-->--R1--->---R2--->---R3--->---R4--->---R5-->--R6-->--DH

Link 1       2        3        4        5       6      7

 Hosting co.     |   ISPA |        DFZ      | ISPB | DSL
 network         |        |                 |      | link

Let's say the reduced MTU is in link 7.  The filter which drops the
ICMP PTB messages generated by the DH (Destination Host) is located in:

Case 1:  ISPB (R5 and/or R6).

Case 2:  ISPA (R2 and/or R3).

Case 3:  Hosting company (R1).

In Case 1, the client is going to have PMTUD troubles with all hosts
outside ISPB that try to send it packets 1500 bytes long, as many
will.  I assume Case 1 will not persist, because any ISP with
customers using MTU-challenged DSL links would be badly mistaken to
drop the ICMP PTB messages their customers' hosts generate.

In Case 3, the hosting company is shooting itself in the foot - but
as you describe, those who suffer are generally unable to figure out
what is wrong, and are too few and too widely scattered to provide
sufficient feedback for the hosting company to change their ways.

Case 2 is much the same, but I would think it daft for an ISP to
busy itself filtering out ICMP messages of any kind for one of its
presumably technically sophisticated customers.

Now I will consider these cases with an ITR-ETR scheme and either
Sprite or IPTM.  In both Sprite and IPTM, the ITR uses special probe
packets (not traffic packets) to determine the PMTU to the ETR.  The
IPTM ITR could receive PTB messages which result from probe packets
which are too large for some router en route to the ETR, but the ITR
does not depend at all on PTB messages to determine the PMTU to the
ETR.  The ITR and the ETR have their own explicit protocol, and the
test is which probe packet lengths the ETR acknowledges.

Once the ITR has reliably determined the PMTU to the ETR (or at
least stopped probing for higher values, being satisfied with the
relatively high value it has already proven to work) it can
communicate this to the Sending Host (SH) via an RFC 1191 PTB
message, assuming the SH sends a packet larger than this.  That
message contains a value such that the SH will send packets of a
maximum size, which when encapsulated, will just fit in the reliably
determined PMTU limit to the ETR.
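
As a sketch of the arithmetic and of the message itself: the value in
the PTB is the proven PMTU to the ETR minus the encapsulation
overhead.  The 20 byte outer IPv4 header is standard, but the 8 byte
tunnel header below is purely an assumed figure, since the overhead
depends on the ITR-ETR scheme.  The function names are hypothetical.
The message layout (ICMP type 3, code 4, Next-Hop MTU in the low 16
bits of the second word, followed by the original IP header plus 8
bytes) is that of RFC 1191.

```python
import struct

OUTER_IPV4_HEADER = 20   # outer IPv4 header without options
TUNNEL_HEADER = 8        # ASSUMPTION: scheme-specific header size, illustrative only


def mtu_to_advertise(pmtu_to_etr: int,
                     overhead: int = OUTER_IPV4_HEADER + TUNNEL_HEADER) -> int:
    """Largest inner packet which, once encapsulated, fits the PMTU to the ETR."""
    return pmtu_to_etr - overhead


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ptb(original_packet: bytes, next_hop_mtu: int) -> bytes:
    """Build the ICMP part of an RFC 1191 'Datagram Too Big' message."""
    ihl = (original_packet[0] & 0x0F) * 4      # inner IP header length in bytes
    quoted = original_packet[:ihl + 8]         # IP header + first 8 data bytes
    header = struct.pack("!BBHHH", 3, 4, 0, 0, next_hop_mtu)
    checksum = internet_checksum(header + quoted)
    return struct.pack("!BBHHH", 3, 4, checksum, 0, next_hop_mtu) + quoted


if __name__ == "__main__":
    limit = mtu_to_advertise(1480)
    fake_packet = bytes([0x45]) + bytes(1499)  # stand-in for a 1500 byte packet
    if len(fake_packet) > limit:
        print(limit, len(build_ptb(fake_packet, limit)))
```

The sketch only builds the ICMP portion.  The ITR would still wrap it
in an IP header addressed to the SH, and that packet is exactly what a
filter between the SH and the ITR would drop.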

Please see my recent message on how IPTM would behave and querying
Fred on how Sprite would behave:

  http://psg.com/lists/rrg/2007/msg00635.html


Anyway, back to Cases 1, 2 and 3.  The ITR could be located
anywhere from the Sending Host (SH) itself to a router in the DFZ.

With all such ITR placements, Case 1 will generate persistent
trouble for customers, as it does today - so we would expect that
very few Case 1 filtering arrangements would persist.

For Cases 2 and 3, what matters is whether the ICMP filter is between
the Sending Host and the ITR.  An ITR function in the host (Ivip
ITFH) fixes that problem, since no filter can then sit between them!

In either Case 2 or 3, as long as the filtering is between the SH
and the ITR, there is going to be persistent trouble.  This is
because the ubiquitous use of the ITR-ETR scheme will cause far more
PMTU limitations than currently exist, and (assuming the ITR-ETR
scheme includes something like IPTM or Sprite) the filtering will
clobber the only way the ITR can tell the SH to send shorter packets
to cope with these restrictions.
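
The placement argument can be written as a small check against the
path in my diagram.  This is only a toy model: it assumes the ITR's
PTB retraces the forward path back to the SH, and that a filter does
harm only when it sits strictly between the SH and the ITR.  The
function name ptb_reaches_sender is made up for illustration.

```python
from typing import Iterable

# Nodes from the diagram, in order from Sending Host to Destination Host.
PATH = ["SH", "R1", "R2", "R3", "R4", "R5", "R6", "DH"]


def ptb_reaches_sender(itr: str, filters: Iterable[str]) -> bool:
    """True if a PTB sent by the ITR gets back to the SH unfiltered.

    ASSUMPTION: only filters strictly between the SH and the ITR matter.
    """
    blocked = set(filters)
    between = PATH[1:PATH.index(itr)]   # routers between SH and the ITR
    return not any(node in blocked for node in between)


if __name__ == "__main__":
    # Case 3 (filter in R1) with the ITR in the DFZ at R4: PTB is lost.
    print(ptb_reaches_sender("R4", ["R1"]))
    # Case 2 (filter in R2/R3) with the ITR in the DFZ at R4: PTB is lost.
    print(ptb_reaches_sender("R4", ["R2", "R3"]))
    # ITR function in the sending host: nothing lies between SH and ITR.
    print(ptb_reaches_sender("SH", ["R1", "R2", "R3"]))
```

Case 1 filters (R5, R6) never block the ITR's PTB in this model.  The
trouble they cause is with PTBs coming back from the client end, which
is a separate matter from ITR placement.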

Perhaps, in the early stages of ITR-ETR deployment, when only a
handful of Destination Hosts use the ITR-ETR-mapped addresses, there
may be such a low level of trouble that the errant party (the
hosting company or ISPA which is filtering out the ICMP packets)
doesn't get enough pressure to change their ways.

This would have the effect of making the ITR-ETR-mapped address
space suck - which we want to avoid like the plague.

Once the ITR-ETR scheme is widely used (I see no other option for
the Net), I expect that the problems would become so persistent
that the errant party would ensure their filters allow the PTB
messages to go back to the SH.

  - Robin

