
Review of draft-ietf-radext-tcp-transport (Part I)



Section 1

 

      Transport of fragmented UDP packets

      appears to be a poorly tested code path on network devices.  Some

      devices appear to be incapable of transporting fragmented UDP

      packets, making it difficult to deploy RADIUS in a network where

      those devices are deployed.

 

[BA] In particular, filtering routers and firewalls often drop UDP fragments, since otherwise they would need to reassemble them in order to apply the filter rules.

However, this is not a “transport” issue, so much as a forwarding/filtering issue.  Instead of “Transport” you might say “handling”.

 

 

      * Connectionless transport.  Neither clients nor servers receive
      positive statements that a "connection" is down.  This information
      has to be deduced instead from the absence of a reply to a
      request.

 

[BA] The same thing is also true of TCP transport, unless you’re willing to wait for a Reset or connection timeout.  That’s why the Watchdog timer is needed.

 

   As RADIUS is widely deployed, and has been widely deployed for well
   over a decade, these issues have been minor in some use-cases, and
   problematic in others..  New systems may be interested in choosing a
   different set of trade-offs than those outlined in [RFC2865] Section
   2.4.  New systems may also be interested in choosing a more reliable
   transport for use-cases such as inter-server proxying.  For those
   systems, we define RADIUS over TCP

 

[BA] Note double periods in first sentence, and no period in the last sentence.  As I read the document, it is really only suggesting a different set of tradeoffs for inter-server proxying.  So you might say “For use-cases such as inter-server proxying, [RTLS] suggests an alternative transport and security model -- RADIUS over TLS.  This document describes the transport implications of running RADIUS over TLS/TCP.”

 

1.1.  Applicability of Reliable Transport

 
 
   The intent of this document is to address transport issues related to
   RADIUS over TLS [RTLS].  The use of "bare" TCP transport (i.e.
   without TLS) is NOT RECOMMENDED, as there has been little
   implementational or operational experience with it.  Additionally,
   [RFC2865] Section 2.4 contains a list of reasons why UDP was
   originally chosen as the transport protocol for RADIUS.  UDP SHOULD
   be used as transport protocol in all cases where the rationale given
   in [RFC2865] Section 2.4 applies.
 
   Deployment experience with RADIUS over TLS indicates that it is most
   useful for inter-server communication, such as inter-domain
   communication between proxies.  These situations benefit from the
   confidentiality and ciphersuite negotiation that can be provided by
   TLS. Since TLS is already widely available within the operating
   systems used by proxies, implementation barriers are low.
 
   RADIUS over TCP has a similar set of use cases.  Use of TCP as a
   transport between a NAS and RADIUS server is a poor fit, since as
   noted in [RFC3539], there is likely to be insufficient traffic for
   the congestion window to remain above the minimum value on a long-
   term basis.  The result is an increase in packets due to ACKs as
   compared to UDP, without a corresponding set of benefits.
 
   In server-server communications the traffic levels in both directions
   are typically high enough to support a larger congestion window as
   well as ACK piggy-backing.  Through use of an application-layer
   watchdog as described in [RFC3539], it is possible to address the
   objections to reliable transport described in [RFC2865] Section 2.4.
   However, in these scenarios "bare" TCP does not provide for
   confidentiality or enable negotiation of stronger ciphersuites than
   are available in RADIUS.
 
   As a result of these considerations, use of RADIUS over TCP SHOULD be
   restricted to situations where RADIUS over TLS is employed.  RADIUS
   over "bare" TCP is NOT RECOMMENDED.
 
   There are still a number of benefits to using a reliable transport.
   For example, when RADIUS is used to carry EAP conversions [RFC3579],
   the EAP exchanges may involve 5 round trips at the RADIUS application
   layer.  We may assume a probability P of packet loss in each
   direction (with P having a value of 1% or less).  Any one
   authentication attempt will then have at least one lost packet, with
   a probability of approximately (10 * P).
 
   These lost packets require the supplicant and/or the NAS to re-
   transmit packets at the application layer.  The difficulty with this
   approach is that retransmission implementations have historically
   been poor.  Some implementations retransmit packets, others do not,
   and others send new packets rather then performing retransmission.
   Some implementations are incapable of detecting EAP retransmissions,
   and will instead treat the retransmitted packet as an error.
 
   These retransmissions have a high likelihood of causing the entire
   authentication session to fail.  For a system with a million logins a
   day, and having a packet loss probability of P=0.01%, we expect that
   0.1% of connections will experience a lost packet.  That is, 1,000
   user sessions each day will experience authentication failure.
 
   In addition, transport of fragmented UDP packets is a poorly tested
   code path on network devices.  Some devices appear to be incapable of
   transporting fragmented UDP packets, meaning that the packet loss
   rate for fragmented packets approaches 100 percent.  The net effect
   can be to prevent the deployment of authentication methods such as
   EAP-TLS that require large RADIUS packets.
 
   Using a reliable transport method such as TCP means that RADIUS
   implementations can remove all application-layer retransmissions, and
   instead rely on the Operating System (OS) kernel's well-tested TCP
   transport to ensure reliable delivery.  In addition, most TCP
   implementations discover Path MTU better than RADIUS application
   implementations, resulting in significantly fewer fragmented packets.
   Modern TCP implementations also implement anti-spoofing provisions,
   which is more difficult to do in UDP applications.
 
   Transporting RADIUS over TCP means that the RADIUS applications can
   leverage these additional protections offered by TCP.
 
   However, there are also some drawbacks to using TCP.  RADIUS over TCP
   has some drawbacks, as noted in [RFC2865] Section 2.4[RFC3539]
   Section 2 discusses further issues with using TCP as a transport for
   Authentication, Authorization, and/or Accounting (AAA) protocols such
   as RADIUS.
 
   Specifically, as noted in [RFC3539] Section 2.1, for systems
   originating low numbers of RADIUS request packets, inter-packet
   spacing is often larger than the packet RTT.  In those situations,
   RADIUS over TCP SHOULD NOT be used.
 
   In general, RADIUS clients generating small amounts of RADIUS traffic
   SHOULD NOT use TCP.  This suggestion will usually apply to most
   NASes, and to most clients that originate CoA-Request and Disconnect-
   Request packets.
 
   RADIUS over TCP is most applicable to RADIUS proxies that exchange a
   large volume of packets with RADIUS clients and servers (10's to
   1000's of packets per second).  In those situations, RADIUS over TCP
   may be a good fit, and may result in increased network stability and
   performance.

 

[BA]  Suggested rewrite:

 

Section 1.1

 

   The intent of this document is to address transport issues related to

   RADIUS over TLS [RTLS] in inter-server communications scenarios,

   such as inter-domain communication between proxies.  These

   situations benefit from the confidentiality and ciphersuite

   negotiation that can be provided by TLS. Since TLS is already

   widely available within the operating systems used by proxies,

   implementation barriers are low. 

 

   In scenarios where RADIUS proxies exchange a large volume of

   packets (10+ packets per second), it is likely that there will be sufficient

   traffic to enable the congestion window to be widened beyond

   the minimum value on a long-term basis, enabling ACK piggy-backing.

   Through use of an application-layer watchdog as described in [RFC3539],

   it is possible to address the objections to reliable transport

   described in [RFC2865] Section 2.4 without substantial

   watchdog traffic, since regular traffic is expected in both

   directions. 

 

   In addition, use of RADIUS over TLS/TCP has been found to

   improve operational performance when used with multi-round

   trip authentication mechanisms such as EAP over RADIUS [RFC3579].

   In such exchanges, it is typical for EAP fragmentation to

   increase the number of round-trips required.  For example, where

   EAP-TLS authentication [RFC5216] is attempted and both the EAP

   peer and server utilize certificate chains of 8KB, as many as

   15 round-trips can be required if RADIUS packets are restricted

   to 1500 octets in size.  Fragmentation of RADIUS over UDP packets

   is generally inadvisable due to lack of fragmentation support

   within intermediate devices such as filtering routers, firewalls

   and NATs.  However, since RADIUS over UDP implementations typically do

   not support MTU discovery, fragmentation can occur even when the

   maximum RADIUS over UDP packet size is restricted to 1500 octets.

 

   These problems disappear if a 4096-octet application-layer payload

   can be used alongside RADIUS over TLS/TCP.  Since most TCP

   implementations support MTU discovery, the TCP MSS is automatically

   adjusted to account for the MTU, and the larger congestion

   window supported by TCP may allow multiple TCP segments to

   be sent within a single window.  As a result, RADIUS/EAP

   traffic required for an EAP-TLS authentication with 8KB

   certificate chains may be reduced to 7 round-trips or less,

   resulting in substantially reduced authentication times.

 

   In addition, experience indicates that EAP sessions transported

   over RTLS are less likely to abort unsuccessfully.  Historically,

   RADIUS over UDP implementations have exhibited poor retransmission

   behavior.  Some implementations retransmit packets, others do not,

   and others send new packets rather than performing retransmission.

   Some implementations are incapable of detecting EAP retransmissions,

   and will instead treat the retransmitted packet as an error. 

   As a result, within RADIUS over UDP implementations, retransmissions

   have a high likelihood of causing an EAP authentication session

   to fail.  For a system with a million logins a day running EAP-TLS

   mutual authentication with 15 round-trips, and having a packet loss

   probability of P=0.01%, we expect that 0.3% of connections will experience

   at least one lost packet.  That is, 3,000 user sessions each day

   will experience authentication failure.  This is an unacceptable

   failure rate for a mass-market network service.

 

   Using a reliable transport method such as TCP means that RADIUS

   implementations can remove all application-layer retransmissions, and

   instead rely on the Operating System (OS) kernel's well-tested TCP

   transport to ensure reliable delivery.  In addition, most TCP

   implementations discover Path MTU better than RADIUS application

   implementations, resulting in significantly fewer fragmented packets.

   Modern TCP implementations also implement anti-spoofing provisions,

   which is more difficult to do in UDP applications.

 

   In contrast, use of TLS/TCP as a transport between a NAS and a

   RADIUS server is a poor fit.  As noted in [RFC3539] Section 2.1,

   for systems originating low numbers of RADIUS request packets,

   inter-packet spacing is often larger than the packet RTT, and

   as a result, the congestion window will typically not remain

   above the minimum value on a long-term basis. The

   result is an increase in packets due to ACKs as compared to UDP,

   without a corresponding set of benefits.  In addition, the lack

   of substantial traffic implies the need for additional watchdog

   traffic to confirm reachability. 

 

   As a result, the objections to reliable transport indicated in

   [RFC2865] Section 2.4 continue to apply to NAS-RADIUS server

   communications and UDP SHOULD continue to be used as the transport

   protocol in this scenario.  In addition, implementations of

   "RADIUS Dynamic Authorization Extensions" [RFC5176]

   SHOULD continue to utilize UDP transport, since the volume of

   dynamic authorization traffic is usually expected to be small. 

 

   Since "bare" TCP does not provide for confidentiality or enable

   negotiation of credible ciphersuites, its use is not appropriate for

   inter-server communications where strong security is required.  As a result,

   the use of "bare" TCP transport (i.e. without TLS) is NOT RECOMMENDED

   for use in any situation, and there has been little or no

   operational experience with it.
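
 

Note on the failure-rate figures: the 0.3% and 3,000-sessions numbers in the suggested text can be checked with a short calculation. The sketch below is illustrative only and rests on assumptions that are not stated in the draft: each round trip is counted as two packets (one in each direction), losses are independent, and any lost packet is counted as a failed session. The function names are hypothetical.

```python
def loss_probability(round_trips: int, p: float) -> float:
    """Probability that at least one packet is lost during an exchange.

    Assumption: each round trip carries two packets (request and reply),
    and each packet is lost independently with probability p.
    """
    packets = 2 * round_trips
    return 1.0 - (1.0 - p) ** packets


def failed_sessions_per_day(logins: int, round_trips: int, p: float) -> float:
    """Expected number of sessions per day that see at least one lost packet.

    Assumption: every such session is counted as an authentication failure,
    which is the worst case described in the text.
    """
    return logins * loss_probability(round_trips, p)


if __name__ == "__main__":
    p = 0.0001  # P = 0.01% per-packet loss, as in the text
    logins = 1_000_000

    for trips in (5, 15):
        prob = loss_probability(trips, p)
        failures = failed_sessions_per_day(logins, trips, p)
        print(f"{trips} round trips: {prob:.4%} of sessions, "
              f"about {failures:.0f} failures per day")
```

For small P the exact expression is very close to the linear approximation (2 * round trips * P) used in the text, which is why 15 round trips gives roughly 0.3% and 5 round trips gives roughly 0.1%.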