
Re: comments on current-practices-05



Hi Pekka, thanks for the thorough read. My replies are embedded below.

On Jul 17, 2006, at 4:31 AM, Pekka Savola wrote:

Hi,

I read current-practices-05 on the plane. Some parts seem very assertive ("This is the practice") as I doubt the practice is necessarily that commonplace especially for smaller ISPs (the ones OPSEC is probably targeting). I think the document may still need a bit more work, but I'd suspect we could be done with it after a revision or two.

The abstract specifically states that this is a survey of large ISPs, basically the ones considered tier1. I was more assertive about the practices which most ISPs had in common than about the ones where the implementation of a practice varied widely. Note that my personal opinion is that smaller ISPs *should* follow the practices that larger ISPs follow, to avoid a lot of the risks and 'propagation of bad behavior'. But I certainly do understand the business/operational time tradeoff. Remember that the survey was undertaken more to see what capabilities are necessary and missing than to see what ISPs aren't doing that maybe they should (which would mostly apply to the smaller ones).


Specific comments below..

Higher level
------------

==> section 2.2 should *really* be merged with 2.3. It's almost identical except the terms. That should save 4 pages and reduce repetition and text
duplication.

I had made the decision to leave separate but can merge if that is really necessary.


==> particularly some text in 2.2/2.3 seems to imply that some SPs have _really_ strict security policies. I wonder if this is applicable outside (some) tier1/tier2 networks. This seems a LOT of work..

The ones surveyed do and this is a survey of large ISPs.....


==> some sections have the "confidentiality violations", "off-line crypto attacks", etc. subsections while some do not. Some are also 2.x.y.y subsections where others are 2.x.y. Should this be consistent? Are these sections even necessary? At least in some cases it seems as if the typical attacks don't fit well under that kind of classification?

The subsections that are not numbered 2.x.y.z are an HTML formatting error, so thanks for noticing that. I had a lot of comments earlier in this work asking to include these subsections, so I'd hate to take them out. Can you point to some specific examples where the typical attack doesn't fit well?


substantial
-----------

However, if appropriate monitoring mechanisms are in place,
   these attacks can be as easily detected and mitigated as with any
   other attack source.

==> [these attacks] refers to insider attacks and unintentional events. It's a bit of stretch to claim the equal amount of detection for the former.
Maybe add "typically" or some such here...

Not sure I agree. Let's say someone mistypes a routing configuration. When a routing propagation error is detected, you go through the same method to resolve the issue, so what's the difference if it's sourced erroneously from your box vs someone else's? Isn't detection and mitigation similar? Once you figure out where the error is coming from, you have to determine whether the box was compromised or whether it was an honest mistake, but that should be traceable from logs and audit trails (if enough info is available).

Can you give me an example of where there is a big difference in effort? Be happy to change text if I am missing a point here.....


For privileged (i.e. enable) access, a second
   authentication step needs to be completed.

==> this seems vendor-specific. Not all vendors have a separation of ro/rw access levels (for most purposes in any case), but are rather role- based.
(Also in sect 2.1.3 though text there is more generic.)

For the equipment used by the ISPs I surveyed, this is the case. Perhaps all equipment should support this functionality?!? Remember that this doc is supposed to support the capabilities documents and give a justification of why specific capabilities should be supported.


2.1.3.  Security Services

   o  Access Control - Not applicable

==> hmm. I thought this was discussed at the start of the section?

Access Control in this context is meant as logical access control, basically filtering. I'll add a note where I introduce that in the section 1.5 document layout.


2.2.1
For off-path active
   attacks, the attack is generally limited to message insertion or
   modification.

==> how could an off-path attacker perform message modification ?
(The same in section 2.3.1 and actually many others as well, including
2.6.1-- maybe I have a different notion of "off-path"..)

Let's say an attacker causes traffic to be rerouted to him, and then he's basically a MITM, although technically 'off-path' since he is not in the correct data path. Sure, it's a grey area of on-path vs off-path, but I think this applies. Most attacks are a combination of attacks, given how people try to classify them.

2.2.2
  Static username/passwords are expired after a specified
   period of time, usually 30 days.

==> is this true? If I'd have to bet, in most cases passwords never change, at least not more often than in the timescale of O(year)... Maybe "usually"
is exaggeration..

Yes, definitely true in the case of the ISPs I surveyed.....and automated as well in some cases.....remember this is a survey of large ISPs....


When an
   individual leaves the company, his/her AAA account is immediately
deleted and the TACACS/RADIUS shared secret is reset for all devices.

==> is this really done? Seems like a lot of bother if a random NOC guy
would change jobs..

Yes, this is done for basically all ISPs I surveyed.......


The community strings are carefully chosen to
   be difficult to crack and there are procedures in place to change
   these community strings between 30-90 days.

==> does this happen..? Wow.. I guess I should consider a career in ISP security business -- there seems to be a lot of work to do there :-)

Remember that a lot of this is automated (through years of work) by these larger ISPs.......and yes, it does actually happen......

 With thousands of devices to manage, some ISPs have created automated
mechanisms to authenticate to devices. Kerberos is used to automate
   the authentication process.  An individual would first log in to a
   Kerberized UNIX server using SSH and generate a Kerberos 'ticket'.
This 'ticket' is generally set to have a lifespan of 10 hours and is
   used to automatically authenticate the individual to the
   infrastructure devices.

==> Umm. I think there is only about one vendor supporting Kerberos. Maybe the "is used to.." should be changed to "can be used, where applicable, .."? It is not clear to me why a multi-vendor ISP would bother to implement
Kerberos for partial protection though...

You would have to ask the ISPs in question, but it seemed like an elegant solution when logging in to several devices at once when troubleshooting issues. I agree that I should have a note here to say that few vendors support Kerberos and that this can be used where applicable.


  The ISPs which do not have performance issues with their equipment
   follow BCP38 [RFC2827] and BCP84 [RFC3704] guidelines for ingress
   filtering.

==> the doc should probably say something explicitly about filters or lack thereof between service providers. (May also imply minor wording change in section 2.4.3 DOA) Note that if you don't have any filters, your IX peers could steal transit from you by static routing. This is why at least some
smaller parties have implemented ingress/egress filters which block
unexpected source/destination addresses..

I will add a sentence or two to highlight this issue.
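
As an illustration of the inter-provider filtering point above, here is a minimal sketch in Python. It assumes a hypothetical list `OUR_PREFIXES` (documentation ranges used as placeholders) and a hypothetical helper `permit_from_peer`; neither comes from the draft. The idea is only that a packet arriving from an IX peer is expected to be destined to the provider's own or customer address space, so anything else would be the peer using the provider as transit.

```python
import ipaddress

# Hypothetical prefixes we originate or carry for customers
# (RFC 5737 documentation ranges used as placeholders).
OUR_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def permit_from_peer(dst, our_prefixes=OUR_PREFIXES):
    """Accept a packet arriving from a peer only if it is destined to us or our customers."""
    addr = ipaddress.ip_address(dst)
    return any(addr in net for net in our_prefixes)
```

A real deployment would express this as an interface ACL on the peering port rather than in software; the sketch only shows the decision being made.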

For layer 2 devices, MAC address filtering and authentication is not
used. This is due to the problems it can cause when troubleshooting
   networking issues.  Port security becomes unmanageable at a large
   scale where 1000s of switches are deployed.

==> maybe rephrased "is not used" with "is not typically used in large-scale
deployments".

Agreed....and will change....

One such example is at edge boxes
where you have up to 1000 T1's connecting into a router with an OC-12 uplink. Some deployed devices experience a large performance impact
   with filtering which is unacceptable for passing customer traffic
   through.

==> so why wouldn't uRPF be applied at the box at the other end of this OC-12 uplink then? The security perimeter would be drawn in a different
place but this should work fine...

Something to ask the ISPs who were surveyed. Anyone on the list reading this care to comment? All large ISPs I talked to did not use MAC address filtering or authentication.


2.5.4.  Replay Attacks

   For a replay attack to be successful, the routing control plane
traffic would need to first be captured either on-path or diverted to
   an attacker to later be replayed to the intended recipient.

==> most routing protocols have some way or another to protect from replay (I'm assuming this means replaying without modification, because otherwise this would be just insertion), so in the general case I doubt this is very
useful..

But it can be done in some instances right? Or are you advocating just getting rid of this
paragraph?


[[ similar comment applies to 2.6.4. -- I don't see how you could apply a replay attack here in practical sense -- TFTP config download at most but
because config isn't signed, modification is more lucrative...]]

Again, are you advocating I should just get rid of this paragraph or reword?



Note that validating
   whether a legitimate peer has the authority to send the contents of
   the routing update is a difficult problem that needs yet to be
   resolved.

==> this statement appears slightly misplaced (or redundant) given that the
next paragraph discusses this very issue..

Redundancy in my mind is good....and I prefer to think of it as a lead-in :) You feel strongly
that it should not be there?

Consistency between these policies
   varies greatly although there is a trend to start depending on AS-
   PATH filters because they are much more manageable than the large
   numbers of prefix filters that would need to be maintained.

==> the important distinction here is probably towards an end-site vs peer or another big ISP (even if a customer). There is no argument that you should prefix filter your end-site customers. Others are up to debate..

==> the document should mention maximum-prefix-limiters, typically applied with peers or others where prefix lists cannot be applied. This helps in
avoiding unintentional leaks, misconfig, etc.

Agreed. Can you offer any specific wording? I.e. a sentence or two that would be appropriate?
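
To make the maximum-prefix idea concrete, here is a minimal sketch in Python. The class `PeerSession` and its behavior (clear the learned prefixes and mark the session down once the limit is exceeded) are assumptions for illustration, not any vendor's implementation or wording from the draft.

```python
class PeerSession:
    """Tracks prefixes learned from one peer and enforces a maximum-prefix limit."""

    def __init__(self, max_prefixes):
        self.max_prefixes = max_prefixes
        self.prefixes = set()
        self.established = True

    def receive(self, prefix):
        """Record an announced prefix; shut the session down once the limit is exceeded."""
        if not self.established:
            return False
        self.prefixes.add(prefix)
        if len(self.prefixes) > self.max_prefixes:
            # Limit exceeded: drop everything learned and tear the session down.
            self.prefixes.clear()
            self.established = False
            return False
        return True
```

The limit does not say which prefixes are legitimate; it only bounds the damage when a peer leaks far more routes than expected.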

2.5.9.  Additional Considerations

==> it'd seem to be that a part of this information overlaps with 2.5.7 and
should actually go there (e.g., related to route filters).

It could possibly be in both places. I do state in the filtering section that most filtering concerns have been discussed in other sections.

  In all configuration files, most passwords are stored in an
   obfuscated format.

==> it's not clear whether you refer to crypted format (i.e., vulnerable to off-line dictionary attacks) or omitted or otherwise mangled completely.

Will clarify by changing wording...


   o  Data Integrity - All systems use either a CRC-check or MD5
      authentication to ensure data integrity.

==> you might also mention the generic config handling methods here (ala
rancid) that you described earlier.

OK


2.7.2.  Security Practices

   Logging is mostly performed on an exception auditing basis when it
   comes to filtering (i.e. traffic which is NOT allowed is logged).
   This is to assure that the logging servers are not overwhelmed with
   data which would render most logs unusable.  Typically the data
logged will contain the source and destination IP addresses and layer
   4 port numbers as well as a timestamp. [...]

==> this seems to presume that 'logging' only refers to ACL logging, not
sending syslog about router's various other actions (adjacency events,
tracebacks, etc.). But you clarify this later on in the section. Should
the text be adjusted from the start?

I don't know. Do you think it's necessary? I don't, but will defer to wg consensus.


The timestamp is derived from NTP which is generally configured as a
   flat hierarchy at stratum1 and stratum2 to have less configuration
   and less maintenance.  Each router is configured with one stratum1
   peer both locally and remotely.

==> was the choice of word 'peer' intentional? Did you mean 'server'? Peers (routers here) are able to update their peers' clock, which you might not want to allow for stratum 1 NTP clocks. It's also not clear what you
meant with "one stratum peer _both locally and remotely_"..

I'll reword for clarification....


 This provides a backup mechanism to see
   what is going on in the network in the event that a device may
   'forget' to do syslog if the CPU is busy.

==> you may also want to mention here that as syslog is an unreliable
protocol, when routers boot or lose adjacencies, not all messages will get delivered. Some vendors may implement syslog buffering (e.g., buffer the messages until you have a route to the syslog destination) but this is not standard. Hence, operators have to live with the fact that syslogs can be
incomplete, and often may need to take a look at the local syslogs at
devices to see what has happened. (Unfortunately, with many vendors, local
syslog buffer is very short..)

Agreed...


Routing filters are used to control the flow of routing information. In IPv6 networks, some providers are liberal in accepting /48s due to
   the still unresolved multihoming issues.

==> you should also add the other side of the coin: "Others filter at
allocation boundaries (i.e., typically at /32)."

OK....
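
The two IPv6 policies mentioned above (liberal acceptance of /48s versus filtering at the /32 allocation boundary) differ only in the longest prefix length accepted. A minimal sketch in Python, where `accept_v6_route` and its `max_len` parameter are hypothetical names chosen for illustration:

```python
import ipaddress

def accept_v6_route(prefix, max_len):
    """Accept an IPv6 route only if it is no more specific than max_len."""
    net = ipaddress.ip_network(prefix)
    return net.version == 6 and net.prefixlen <= max_len
```

With `max_len=48` a multihomed site's /48 is accepted; with `max_len=32` the same announcement is filtered and only the covering allocation is accepted.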


2.9.2.  Black-Hole Triggered Routing

==> it might be worth pointing out here or somewhere that blackholing
techniques may actually fulfill the goal of the attacker. If the attacker wanted to shut down www.ebay.com (or whatever), blackholing the traffic would do exactly that. On the other hand, blackholing might decrease the _collateral_ damage caused by an overly large attack aimed at something
other than a critical service.

OK.....I can add that as an added consideration for implementing this technique


   uRPF is not used on interfaces that are likely to have routing
   asymmetry, meaning multiple routes to the source of a packet.
   Usually for ISPs, uRPF is placed at the customer edge of a network.

==> as described in rfc 3704 and draft-savola-bcp84-urpf-experiences,
asymmetry does not preclude applying feasible paths strict uRPF so maybe a
few more words or a toning down would be needed here.

OK.
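
To illustrate the difference between strict and feasible-path uRPF discussed above, here is a minimal sketch in Python. The `ROUTES` table, interface names, and helper functions are hypothetical; the sketch assumes that for each prefix the router knows both its best-path interface and every interface on which some route to that prefix was learned.

```python
import ipaddress

# Hypothetical routing state: for each prefix, the best-path interface and
# every interface on which some route to the prefix was learned.
ROUTES = {
    ipaddress.ip_network("192.0.2.0/24"): {"best": "if0", "feasible": {"if0", "if1"}},
}

def _lookup(src):
    """Longest-prefix match for the source address, or None if no route covers it."""
    addr = ipaddress.ip_address(src)
    matches = [net for net in ROUTES if addr in net]
    return ROUTES[max(matches, key=lambda n: n.prefixlen)] if matches else None

def strict_urpf(src, in_if):
    """Pass only if the packet arrived on the best-path interface toward its source."""
    entry = _lookup(src)
    return entry is not None and entry["best"] == in_if

def feasible_urpf(src, in_if):
    """Pass if the packet arrived on any interface with a route toward its source."""
    entry = _lookup(src)
    return entry is not None and in_if in entry["feasible"]
```

With asymmetric routing, a packet from 192.0.2.1 arriving on `if1` fails the strict check but passes the feasible-path check, which is why asymmetry alone does not rule out uRPF.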




editorial
---------

==> the MUST, SHOULD, etc. language terminology can be removed as it isn't used in this document.

Any of the specific attacks discussed further
   in this document will elaborate on attacks which are sourced by an
   "outsider" and are deliberate attacks.

==> "Any of the .." ?  "... attacks ... elaborate on attacks ..."?
A bit of rewording would help here.

Why?  Too many uses of the word 'attacks'?  Hmmmm......


itself, arguably the largest. oldest and most well understood area of

==> s/./,/

   between 3-10 minutes.  Individual users are authentication to get
   basic access.  For privileged (i.e. enable) access, a second

==> s/ion/ed/ ?

   right systems.  SNMP RW is not used and disabled by configuration.

==> s/disabled/is disabled/

machines and usually have limited access. Note that Telent is NEVER

==> s/Telent/Telnet/

  causing unwelcome ICPM redirects, creating unwelcome IP options or

==> s/ICPM/ICMP/

2.7.1.3.  Replay Attacks

For a replay attack to be successful, the logging data would need to first be captured either on-path or diverted to an attacker and later
   replayed to the recipient. [is reply handled by syslog protocol?]

==> s/reply/replay/, maybe also remove the bracketed comment or integrate it
to the text..

   to a network device.  A good guideline for IPv6 filtering is in the
draft work in progress on Best Current Practices for Filtering ICMPv6
   Messages in Firewalls [I-D.ietf-v6ops-icmpv6-filtering-bcp].

==> this I-D was downgraded to informational and the name is now ..- recs

Informative References

==> there appear to be some refs which are not cited in the main body of the
doc.

Again, thanks....

- merike