Re: comments on current-practices-05
Hi Pekka, thanks for the thorough read. My replies are embedded.
On Jul 17, 2006, at 4:31 AM, Pekka Savola wrote:
Hi,
I read current-practices-05 on the plane. Some parts seem very
assertive ("This is the practice") as I doubt the practice is
necessarily that commonplace especially for smaller ISPs (the ones
OPSEC is probably targeting). I think the document may still need
a bit more work, but I'd suspect we could be done with it after a
revision or two.
The abstract specifically states that this is a survey of large
ISPs, basically the ones considered tier 1. I was more assertive
about the practices that most ISPs had in common than about the ones
where the implementation of a practice varied widely. Note that my
personal opinion is that smaller ISPs *should* follow the practices
that larger ISPs follow to avoid a lot of the risks and 'propagation
of bad behavior'. But I certainly do understand the business/
operational time tradeoff. Remember that the survey was undertaken
more to see which capabilities are necessary and missing than to see
what ISPs aren't doing that maybe they should (which would mostly
apply to the smaller ones).
Specific comments below..
Higher level
------------
==> section 2.2 should *really* be merged with 2.3. It's almost
identical except the terms. That should save 4 pages and reduce
repetition and text duplication.
I had decided to leave them separate, but I can merge them if that is
really necessary.
==> particularly some text in 2.2/2.3 seems to imply that some SPs
have _really_ strict security policies. I wonder if this is
applicable outside (some) tier1/tier2 networks. This seems a LOT
of work..
The ones surveyed do, and this is a survey of large ISPs.
==> some sections have the "confidentiality violations", "off-line
crypto attacks", etc. subsections while some do not. Some are also
2.x.y.y subsections where others are 2.x.y. Should this be
consistent? Are these sections even necessary? At least in some
cases it seems as if the typical attacks don't fit well under that
kind of classification?
That is an HTML formatting error where the subsections are not
2.x.y.z, so thanks for noticing. I had a lot of comments earlier in
this work asking to include these sections, so I'd hate to take them
out. Can you point to some specific examples where the typical attack
doesn't fit well?
substantial
-----------
However, if appropriate monitoring mechanisms are in place,
these attacks can be as easily detected and mitigated as with any
other attack source.
==> [these attacks] refers to insider attacks and unintentional
events. It's a bit of a stretch to claim an equal amount of
detection for the former.
Maybe add "typically" or some such here...
Not sure I agree. Let's say someone mistypes a routing
configuration. When a routing propagation error is detected, you go
through the same method to resolve the issue, so what's the
difference if it's sourced erroneously from your box vs someone
else's? Isn't detection and mitigation similar? Once you figure out
where the error is coming from, you have to determine whether the box
was compromised or whether it was an honest mistake, but that should
be traceable from logs and audit trails (if enough info is available).
Can you give me an example of where there is a big difference in
effort? I'd be happy to change the text if I am missing a point here.
For privileged (i.e. enable) access, a second
authentication step needs to be completed.
==> this seems vendor-specific. Not all vendors have a separation
of ro/rw access levels (for most purposes in any case), but are
rather role-based.
(Also in sect 2.1.3 though text there is more generic.)
For the equipment used by the ISPs I surveyed, this is the case.
Perhaps all equipment should support this functionality? Remember
that this doc is supposed to support the capabilities documents and
give a justification for why specific capabilities should be
supported.
2.1.3. Security Services
o Access Control - Not applicable
==> hmm. I thought this was discussed at the start of the section?
Access Control in this context is meant in the logical sense,
basically filtering. I'll add a note when I introduce that in section
1.5 (document layout).
2.2.1
For off-path active
attacks, the attack is generally limited to message insertion or
modification.
==> how could an off-path attacker perform message modification ?
(The same in section 2.3.1 and actually many others as well, including
2.6.1-- maybe I have a different notion of "off-path"..)
Let's say an attacker causes traffic to be rerouted to him. He is
then basically a man in the middle, although technically 'off-path'
since he is not in the correct data path. Sure, it's a grey area of
on-path vs off-path, but I think this applies. Most attacks are a
combination of the attack types that people try to classify them
into.
2.2.2
Static username/passwords are expired after a specified
period of time, usually 30 days.
==> is this true? If I'd have to bet, in most cases passwords
never change, at least not more often than in the timescale of
O(year)... Maybe "usually" is an exaggeration..
Yes, definitely true in the case of the ISPs I surveyed, and
automated as well in some cases. Remember this is a survey of
large ISPs.
When an
individual leaves the company, his/her AAA account is immediately
deleted and the TACACS/RADIUS shared secret is reset for all
devices.
==> is this really done? Seems like a lot of bother if a random
NOC guy would change jobs..
Yes, this is done by basically all the ISPs I surveyed.
The community strings are carefully chosen to
be difficult to crack and there are procedures in place to change
these community strings between 30-90 days.
==> does this happen..? Wow.. I guess I should consider a career
in ISP security business -- there seems to be a lot of work to do
there :-)
Remember that a lot of this is automated (through years of work) by
these larger ISPs. And yes, it does actually happen.
With thousands of devices to manage, some ISPs have created automated
mechanisms to authenticate to devices. Kerberos is used to automate
the authentication process. An individual would first log in to a
Kerberized UNIX server using SSH and generate a Kerberos 'ticket'.
This 'ticket' is generally set to have a lifespan of 10 hours and is
used to automatically authenticate the individual to the
infrastructure devices.
==> Umm. I think there is only about one vendor supporting
Kerberos. Maybe the "is used to.." should be changed to "can be used,
where applicable, .."? It is not clear to me why a multi-vendor ISP
would bother to implement Kerberos for partial protection though...
You would have to ask the ISPs in question, but it seemed like an
elegant solution for logging in to several devices at once when
troubleshooting issues. I agree that I should have a note here to
say that few vendors support Kerberos and that this can be used where
applicable.
The ISPs which do not have performance issues with their equipment
follow BCP38 [RFC2827] and BCP84 [RFC3704] guidelines for ingress
filtering.
==> the doc should probably say something explicitly about filters
or lack thereof between service providers. (May also imply minor
wording change in section 2.4.3 DOA) Note that if you don't have any
filters, your IX peers could steal transit from you by static
routing. This is why at least some smaller parties have implemented
ingress/egress filters which block unexpected source/destination
addresses..
I will add a sentence or two to highlight this issue.
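For list readers who want the filtering idea above in concrete form, here is a minimal Python sketch of a BCP38-style source-address check on one customer-facing interface. It is only an illustration under stated assumptions: the prefixes are documentation ranges, and the names `CUSTOMER_PREFIXES` and `permit_source` are hypothetical, not taken from the draft or from any vendor implementation.

```python
import ipaddress

# Hypothetical prefixes assigned to one customer-facing interface
# (documentation address ranges, chosen only for illustration).
CUSTOMER_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8:100::/48"),
]


def permit_source(src, allowed=CUSTOMER_PREFIXES):
    """Return True if the source address falls inside an allowed prefix."""
    addr = ipaddress.ip_address(src)
    # Compare only against prefixes of the same IP version.
    return any(addr.version == net.version and addr in net for net in allowed)
```

A packet arriving with a source outside the assigned prefixes, for example `permit_source("198.51.100.1")`, would be rejected. The same kind of check on a peering interface is what stops a peer from sending traffic with unexpected addresses.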
For layer 2 devices, MAC address filtering and authentication is not
used. This is due to the problems it can cause when troubleshooting
networking issues. Port security becomes unmanageable at a large
scale where 1000s of switches are deployed.
==> maybe rephrase "is not used" as "is not typically used in
large-scale deployments".
Agreed, and I will change it.
One such example is at edge boxes where you have up to 1000 T1's
connecting into a router with an OC-12 uplink. Some deployed devices
experience a large performance impact with filtering which is
unacceptable for passing customer traffic through.
==> so why wouldn't uRPF be applied at the box at the other end of
this OC-12 uplink then? The security perimeter would be drawn in a
different place but this should work fine...
That is something to ask the ISPs who were surveyed. Would anyone on
the list reading this care to comment? None of the large ISPs I
talked to used MAC address filtering or authentication.
2.5.4. Replay Attacks
For a replay attack to be successful, the routing control plane
traffic would need to first be captured either on-path or diverted to
an attacker to later be replayed to the intended recipient.
==> most routing protocols have some way or another to protect from
replay (I'm assuming this means replaying without modification,
because otherwise this would be just insertion), so in the general
case I doubt this is very useful..
But it can be done in some instances, right? Or are you advocating
just getting rid of this paragraph?
[[ similar comment applies to 2.6.4. -- I don't see how you could
apply a replay attack here in practical sense -- TFTP config download
at most but because config isn't signed, modification is more
lucrative...]]
Again, are you advocating I should just get rid of this paragraph or
reword?
Note that validating
whether a legitimate peer has the authority to send the contents of
the routing update is a difficult problem that needs yet to be
resolved.
==> this statement appears slightly misplaced (or redundant) given
that the next paragraph discusses this very issue..
Redundancy in my mind is good, and I prefer to think of it as a
lead-in :) Do you feel strongly that it should not be there?
Consistency between these policies varies greatly although there is a
trend to start depending on AS-PATH filters because they are much more
manageable than the large numbers of prefix filters that would need to
be maintained.
==> the important distinction here is probably towards an end-site
vs peer or another big ISP (even if a customer). There is no argument
that you should prefix filter your end-site customers. Others are up
for debate..
==> the document should mention maximum-prefix-limiters, typically
applied with peers or others where prefix lists cannot be applied.
This helps in avoiding unintentional leaks, misconfig, etc.
Agreed. Can you offer any specific wording? I.e. a sentence or two
that would be appropriate?
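To make the maximum-prefix-limiter idea concrete for the list, here is a rough Python sketch of the behaviour Pekka describes: count the prefixes received from a neighbor, warn near the limit, and take the session down once the limit is exceeded. The class name `PrefixLimiter`, the `warn_ratio` parameter, and the returned state strings are all assumptions made for illustration. Real implementations differ in thresholds and restart behaviour.

```python
class PrefixLimiter:
    """Track prefixes received from one neighbor against a configured limit."""

    def __init__(self, max_prefixes, warn_ratio=0.8):
        self.max_prefixes = max_prefixes
        self.warn_threshold = max_prefixes * warn_ratio
        self.prefixes = set()
        self.shutdown = False

    def announce(self, prefix):
        """Record an announcement and return 'ok', 'warn' or 'shutdown'."""
        if self.shutdown:
            return "shutdown"  # session is already down; ignore the update
        self.prefixes.add(prefix)  # re-announcing a known prefix is harmless
        if len(self.prefixes) > self.max_prefixes:
            # Limit exceeded: model tearing the session down and flushing routes.
            self.shutdown = True
            self.prefixes.clear()
            return "shutdown"
        if len(self.prefixes) >= self.warn_threshold:
            return "warn"
        return "ok"

    def withdraw(self, prefix):
        """Remove a prefix that the neighbor has withdrawn."""
        self.prefixes.discard(prefix)
```

With a limit of 5 and the default ratio, the fourth distinct prefix would return `"warn"` and the sixth would return `"shutdown"`, which is the kind of protection against accidental full-table leaks the comment is asking the document to mention.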
2.5.9. Additional Considerations
==> it would seem that part of this information overlaps with 2.5.7
and should actually go there (e.g., related to route filters).
It could possibly be in both places. I do state in the filtering
section that most filtering concerns have been discussed in other
sections.
In all configuration files, most passwords are stored in an
obfuscated format.
==> it's not clear whether you refer to crypted format (i.e.,
vulnerable to off-line dictionary attacks) or omitted or otherwise
mangled completely.
Will clarify by changing wording...
o Data Integrity - All systems use either a CRC-check or MD5
authentication to ensure data integrity.
==> you might also mention the generic config handling methods here
(a la rancid) that you described earlier.
OK
2.7.2. Security Practices
Logging is mostly performed on an exception auditing basis when it
comes to filtering (i.e. traffic which is NOT allowed is logged).
This is to assure that the logging servers are not overwhelmed with
data which would render most logs unusable. Typically the data
logged will contain the source and destination IP addresses and layer
4 port numbers as well as a timestamp. [...]
==> this seems to presume that 'logging' only refers to ACL logging,
not sending syslog about router's various other actions (adjacency
events, tracebacks, etc.). But you clarify this later on in the
section. Should the text be adjusted from the start?
I don't know. Do you think it's necessary? I don't, but I will defer
to WG consensus.
The timestamp is derived from NTP which is generally configured as a
flat hierarchy at stratum1 and stratum2 to have less configuration
and less maintenance. Each router is configured with one stratum1
peer both locally and remotely.
==> was the choice of word 'peer' intentional? Did you mean
'server'? Peers (routers here) are able to update their peers'
clock, which you might not want to allow for stratum 1 NTP clocks.
It's also not clear what you meant with "one stratum1 peer _both
locally and remotely_"..
I'll reword for clarification....
This provides a backup mechanism to see
what is going on in the network in the event that a device may
'forget' to do syslog if the CPU is busy.
==> you may also want to mention here that as syslog is an unreliable
protocol, when routers boot or lose adjacencies, not all messages
will get delivered. Some vendors may implement syslog buffering
(e.g., buffer the messages until you have a route to the syslog
destination) but this is not standard. Hence, operators have to live
with the fact that syslogs can be incomplete, and often may need to
take a look at the local syslogs at devices to see what has happened.
(Unfortunately, with many vendors, local syslog buffer is very
short..)
Agreed...
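As an illustration of the buffering behaviour mentioned above, here is a small Python sketch: messages are held in a bounded local buffer while the log destination is unreachable and flushed once it becomes reachable. The class `BufferedLogSender`, its `capacity` parameter, and the `send` callback are hypothetical names for this sketch. It does not model any particular vendor's implementation or the syslog wire format.

```python
from collections import deque


class BufferedLogSender:
    """Hold log messages locally until the log destination is reachable."""

    def __init__(self, send, capacity=100):
        self._send = send                       # callable that delivers one message
        self._buffer = deque(maxlen=capacity)   # short, bounded local buffer
        self.reachable = False
        self.dropped = 0                        # messages lost to buffer overflow

    def log(self, message):
        if self.reachable:
            self._send(message)
            return
        if len(self._buffer) == self._buffer.maxlen:
            self.dropped += 1                   # the oldest entry is about to be evicted
        self._buffer.append(message)

    def set_reachable(self, reachable):
        """Update reachability; flush buffered messages in order when it comes up."""
        self.reachable = reachable
        if reachable:
            while self._buffer:
                self._send(self._buffer.popleft())
```

The `dropped` counter shows why a very short local buffer is a problem: with `capacity=2`, a third message logged while the route is down silently pushes out the first one, so the central log ends up incomplete.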
Routing filters are used to control the flow of routing information.
In IPv6 networks, some providers are liberal in accepting /48s due to
the still unresolved multihoming issues.
==> you should also add the other side of the coin: "Others filter at
allocation boundaries (i.e., typically at /32)."
OK....
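The two IPv6 filtering stances above differ only in the longest prefix length accepted, which a short Python sketch can show. The function name `accept_route` and the `max_len` parameter are assumptions for illustration, and the prefixes used are from the IPv6 documentation range.

```python
import ipaddress


def accept_route(prefix, max_len):
    """Accept an IPv6 route only if its prefix is no longer than max_len bits."""
    net = ipaddress.ip_network(prefix)
    return net.version == 6 and net.prefixlen <= max_len


# Liberal policy: accept announcements down to /48.
LIBERAL_MAX_LEN = 48
# Strict policy: filter at the allocation boundary, typically /32.
STRICT_MAX_LEN = 32
```

Under this sketch, `accept_route("2001:db8:1::/48", LIBERAL_MAX_LEN)` passes while the same prefix checked against `STRICT_MAX_LEN` is filtered.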
2.9.2. Black-Hole Triggered Routing
==> it might be worth pointing out here or somewhere that blackholing
techniques may actually fulfill the goal of the attacker. If the
attacker wanted to shut down www.ebay.com (or whatever), blackholing
the traffic would do exactly that. On the other hand, blackholing
might decrease the _collateral_ damage caused by an overly large
attack aimed at something other than a critical service.
OK. I can add that as an added consideration for implementing this
technique.
uRPF is not used on interfaces that are likely to have routing
asymmetry, meaning multiple routes to the source of a packet.
Usually for ISPs, uRPF is placed at the customer edge of a network.
==> as described in RFC 3704 and draft-savola-bcp84-urpf-experiences,
asymmetry does not preclude applying feasible paths strict uRPF so
maybe a few more words or a toning down would be needed here.
OK.
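To show why asymmetry rules out one check but not the other, here is a simplified Python sketch contrasting strict uRPF with a feasible-path variant. It assumes a toy routing table that records, per prefix, the best-path interface and the set of interfaces with any feasible route. The table `ROUTES`, the interface names, and the function names are hypothetical, and real forwarding-plane behaviour is more involved.

```python
import ipaddress

# Hypothetical routing state: prefix -> (best-path interface,
# set of all interfaces that have a feasible route to the prefix).
ROUTES = {
    ipaddress.ip_network("203.0.113.0/24"): ("if0", {"if0", "if1"}),
    ipaddress.ip_network("198.51.100.0/24"): ("if1", {"if1"}),
}


def _lookup(src):
    """Longest-prefix match of the source address against ROUTES."""
    addr = ipaddress.ip_address(src)
    matches = [n for n in ROUTES if n.version == addr.version and addr in n]
    if not matches:
        return None
    return ROUTES[max(matches, key=lambda n: n.prefixlen)]


def strict_urpf(src, in_iface):
    """Accept only if the packet arrived on the best path back to its source."""
    route = _lookup(src)
    return route is not None and route[0] == in_iface


def feasible_urpf(src, in_iface):
    """Accept if the packet arrived on any interface with a feasible route."""
    route = _lookup(src)
    return route is not None and in_iface in route[1]
```

A packet from 203.0.113.5 arriving on `if1` fails the strict check because the best path is via `if0`, but passes the feasible-path check, which is the asymmetric case the comment refers to.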
editorial
---------
==> the MUST, SHOULD, etc. language terminology can be removed as
it isn't used in this document.
Any of the specific attacks discussed further
in this document will elaborate on attacks which are sourced by an
"outsider" and are deliberate attacks.
==> "Any of the .." ? "... attacks ... elaborate on attacks ..."?
A bit of rewording would help here.
Why? Too many uses of the word 'attacks'? Hmmmm......
itself, arguably the largest. oldest and most well understood area of
==> s/./,/
between 3-10 minutes. Individual users are authentication to get
basic access. For privileged (i.e. enable) access, a second
==> s/ion/ed/ ?
right systems. SNMP RW is not used and disabled by configuration.
==> s/disabled/is disabled/
machines and usually have limited access. Note that Telent is NEVER
==> s/Telent/Telnet/
causing unwelcome ICPM redirects, creating unwelcome IP options or
==> s/ICPM/ICMP/
2.7.1.3. Replay Attacks
For a replay attack to be successful, the logging data would need to
first be captured either on-path or diverted to an attacker and later
replayed to the recipient. [is reply handled by syslog protocol?]
==> s/reply/replay/, maybe also remove the bracketed comment or
integrate it to the text..
to a network device. A good guideline for IPv6 filtering is in the
draft work in progress on Best Current Practices for Filtering ICMPv6
Messages in Firewalls [I-D.ietf-v6ops-icmpv6-filtering-bcp].
==> this I-D was downgraded to informational and the name is now
..-recs
Informative References
==> there appear to be some refs which are not cited in the main
body of the doc.
Again, thanks....
- merike