
Open issues in Re: shim6-proto-07 review



there are some open issues from Iljitsch's review...


On 03/01/2007, at 12:38, Iljitsch van Beijnum wrote:

the original locators become invalid at the same time and depending on the time that is required to update the DNS and for those updates
   to propagate.

Why is the DNS relevant here?

because at least one valid locator must be published in the DNS in order for a shim host to be reachable, since the DNS is used as a rendezvous mechanism. OTOH, this is not specific to the shim6 protocol, but generic for any host... do you prefer we remove the DNS consideration from here?

I don't understand what you mean here... The way I see it, the stuff that's in the DNS doesn't change when there are outages, only when links are added to or removed from the site. As such, the DNS timing is irrelevant except in the corner case where all existing links are replaced by all new ones or all the links that don't change go down.


The text on the current draft states:

   When a host is renumbered, the effect is that one or more locators
   become invalid, and zero or more locators are added to the host's
   network interface.  This means that the set of locators that is used
   in the shim will change, which the shim can handle as long as not all
   the original locators become invalid at the same time and depending
   on the time that is required to update the DNS and for those updates
   to propagate.


i think this is accurate: the time required for changes in the DNS to propagate is a relevant factor when all addresses in a site are renumbered, and i don't think there is a problem leaving this here.

OTOH, i don't have a strong preference to keep this here.

Please suggest text if you think otherwise


This makes me uncomfortable. How do you know that an address has become terminally invalid, rather than accidentally unusable?

this was discussed during the last meeting in san diego and there was consensus that if a prefix is removed from the site, existing shim contexts that are using a ulid containing the prefix must be terminated when the prefix is removed. however, what i read from your comment is that you don't have a problem with this idea but with how to implement it, right?

Indeed. The autoconfig protocols don't have a mechanism to see the difference between a router going down accidentally and administrative removal of prefixes, unless you want to add significant amounts of guesswork to the mix. And that's something I don't think is a good idea, especially considering that in SITE multihoming the chances of clashing ULIDs after renumbering are near-zero.

i may agree with you here but there was a strong consensus in the San Diego meeting to prevent the usage of prefixes that are no longer available at the multihomed site, even as ULIDs.

So as long as i don't see strong consensus in the other direction, i will have to keep this as it is.

again, if you can suggest text for addressing this issue...


what about when the address becomes invalid?

In many Unix stacks you can keep using an address even if it's removed from all interfaces. This is a good thing, because it allows sessions to survive short transitions such as moving out of radio range for a few moments. With shim6 there is the potential that the address is gone for a long time but it remains in use because the failure is repaired using the shim.

I think the best thing to do here is keep using the address for a limited amount of time, like 1 - 24 hours.


same as before

many people argued that they do not want to use a ULID that is no longer assigned to the site

i think that this consensus still holds,

if you want, i can change the wording, so that the ULID stops being used not when the prefix is invalid, but when the prefix is no longer assigned to the site, would that be better?



   Bidirectional Communication (FBD).  FBD uses a Keepalive message
   which is sent when a host has received packets from its peer but has
   not yet sent any packets from its ULP to the peer.

No, this works per address (per locator even, not per ULID, IIRC), not per ULP.

the meaning here is that FBD generates packets when the ULPs are silent. It doesn't mean that it generates packets per ULP, just that when all ULPs are silent, FBD generates packets. It may be rephrased as follows:

   FBD uses a Keepalive message
   which is sent when a host has received packets from its peer but has
   not yet sent any packets from any of its ULPs to the peer.

would this be better?

No, because I don't see what the ULP has to do with anything. FBD is between address pairs.


could you suggest text to describe more precisely what FBD does?

So what happens when some other header follows the shim header? Could this be used for attacks?

this is for shim6 control messages and so far we haven't defined means to perform piggybacking of other protocols in shim6 control messages, so it is not possible for other protocols to follow the shim header. If this is defined in future extensions, they will need to update this part of the spec

Well, wouldn't it be better then to forbid having additional headers after the shim header?

i don't think we need to take such a measure unless we identify what would be the attack that this could open

I mean, piggybacking seems a nice to have feature in the future, so imho we need a strong reason to close that door, don't you agree?


About the different messages: they are very similar. If I were to implement all of this, I would rather work with one basic structure for all of the messages, even if the _meaning_ of some fields is different as long as their structure is always the same. I think this can easily be done here, by including fields that nearly all messages need (simply leave it zero when a particular message doesn't need a field) and use options for things that a particular message needs that aren't accommodated in the unified structure.

i can clearly see the cost of this approach: additional overhead for unused fields and a potential source of confusion when fields change their name... what is the advantage of such an approach?

When you write an implementation, you need to extract all the values from a header. Since these are all different sizes without natural alignment this involves some complexity. When there are different message headers, you need to do this multiple times. If there is only one message header, you only have to implement the value extraction code once.

The field names aren't an issue at all. In code it works something like this:

#include <stdio.h>

/* One shared extraction routine for the unified header layout;
   each caller gives the three fields its own names. */
void getheaders(int *field1, char *field2, unsigned int *field3);

void lookatmessage1(void)
{
  int contexttag;
  char evilbit;
  unsigned int remotecookie;

  getheaders(&contexttag, &evilbit, &remotecookie);
  if (evilbit)
    printf("You are breaking Evil Corp.'s patent, prepare to be sued!\n");
}

void lookatmessage2(void)
{
  int contexttag;
  char ossbit;
  unsigned int localcookie;

  /* Same extraction call, different meaning for fields 2 and 3. */
  getheaders(&contexttag, &ossbit, &localcookie);
  if (!ossbit)
    printf("You are breaking the GPL, prepare to be sued!\n");
}

Since the formats are so similar already there are only 1 or 2 cases where an unused field would be included so I don't think the overhead is a concern.


I am not sure what to do with this.... maybe we can have some feedback from the people working on implementations...

update request: why is this a request?

because this is a two way protocol, update and request... (i do agree that it is not the best name in this case... suggestion?)

Just "update"?

Hm, is it smart to defer verification here?

depending what verification we are talking about....

this is explained in detail in section 7.2.  Locator Verification

there are two types of verification: HBA/CGA verification is one type, and the other type is the one against flooding attacks, which is achieved using probe packets (of the path exploration protocol)

The spec in section 7.2 states that:

   Thus the HBA/CGA
   verification SHOULD be performed by the host before the host
   acknowledges the new locator, by sending an Update Acknowledgement
   message, or an R2 message.

   Before a host can use a locator (different than the ULID) as the
   destination locator it MUST perform the HBA/CGA verification if this
   was not performed before upon the reception of the locator set.  In
   addition, it MUST verify that the ULID is indeed present at that
   locator.  This verification is performed by doing a return-
   routability test as part of the Probe sub-protocol [9].

makes sense?

Is it useful to be able to defer locator verification? Simply doing it immediately and either acknowledging that everything is ok or sending back an error is much more robust than saying nothing and failing later.

In its current form the spec allows verification to be deferred until it is really necessary, but it doesn't require deferring it. Besides, it recommends that the verification that does not require an additional packet exchange is performed before sending the ACK.

I think this is enough, since we allow implementations to decide whether to wait until it is needed to perform the reachability verification, which will be needed in any case before using the locator pair in order to verify that it is actually reachable.

So i suggest keeping it in its current form



When the traffic generated by the ULPs results in a bidirectional flow of packets between the peers, no extra packets need to be inserted.

is this ok?

No, it's still confusing. Why not leave this up to the reachability draft?

I think it would be useful to have a hint of how this works in this document... could you suggest some text that you don't find confusing?


Ugh, this is certainly enough to make a grown man cry... Why all of this alignment silliness? BGP works pretty well without it.

i think this is general in IPv6 protocols which are optimized for handling 64 bits units.... but maybe someone with more expertise can answer this one...

Believe me, implementations aren't going to be any simpler because of all the padding and length calculations, and the amounts of data involved are so small that having faster copying because of the alignment isn't an issue.



i would like some feedback from others on this one...

I would prefer not to change the whole packet format at this stage... really


[...]

More in general, most error conditions are handled by silently dropping packets, however, which is a very bad idea because that way, there is no difference between an error and lost messages.

[...]

sent this issue in a separate email

IIRC this mail doesn't go into the issue of silently dropping messages, so I'll restate my objection to that here.


ok, will try to address this issue along with the error message issue

About the locator option: how many locators are allowed?

there is no upper limit other than the ones that fit inside a packet with the other required options... As i mentioned above, roughly this seemed enough not to become a limitation

Some limit might be helpful, though.


why?

Regarding section 7.9: shouldn't there be checks to make sure that seemingly duplicate packets contain the same information as the earlier packets they are supposedly the duplicate of?

I am not following you here...

are we talking about an initiator that receives a duplicate R1 back or a receiver that receives a duplicate I1?

If a receiver receives a duplicate I1 it doesn't do anything special, it just replies with an R1. i don't see any problem here

So the issue whether it's a duplicate is irrelevant because the reply is the same either way?


yes


If an initiator receives a duplicated R1 (this may happen because the initiator has retried with multiple I1s), it will process the first one received and send an I2. When the second one is received, there will be no shim6 context in I1-SENT state, so it will be discarded. I do not see the point in verifying whether the other information in the packet is ok or not... what would be the point in doing this verification?

If you have special case handling for duplicates it's important to be sure that you're actually dealing with duplicates. For instance, an attack could be sending out fake shim6 packets so that when the real ones come in later, those are rejected because they are presumed duplicates.


agree, but in this case the R1 contains the INIT nonce, which prevents such an attack. So what is verified is not that the second R1 is a replay of the first one, but that any R1 received is a proper reply to the initial I1 sent. I think this provides the required protection
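
To make the argument concrete, here is a minimal sketch of that R1 handling. It is not taken from the draft: the state names, the struct layout, the 32-bit nonce width and the function accept_r1() are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified context states (names are illustrative). */
enum shim_state { STATE_IDLE, STATE_I1_SENT, STATE_I2_SENT, STATE_ESTABLISHED };

struct shim_context {
    enum shim_state state;
    uint32_t init_nonce;   /* nonce placed in our I1 (width assumed) */
};

/* Returns true if the R1 should be processed and answered with an I2. */
bool accept_r1(struct shim_context *ctx, uint32_t r1_init_nonce)
{
    if (ctx == NULL || ctx->state != STATE_I1_SENT)
        return false;              /* duplicate or unsolicited R1: discard */
    if (r1_init_nonce != ctx->init_nonce)
        return false;              /* not a reply to our I1: discard, state unchanged */
    ctx->state = STATE_I2_SENT;    /* the caller would now send the I2 */
    return true;
}
```

In this sketch a forged R1 with the wrong nonce leaves the context in I1-SENT, so the genuine R1 arriving later is still accepted; only an R1 that arrives after the state has moved on is treated as a duplicate.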

What if validators don't match? Eventually this shouldn't be a problem but I expect some initial trouble here because you're doing hashes over a fairly large number of values, a small mistake somewhere means the hash doesn't work, some feedback in the form of an error message would be good.

you mean for the first packet or for the following (duplicated) packets? for the first packet, this is a specific case of the general problem of whether it is ok to silently discard wrong packets (i have started a new thread for this).
For the second case, i don't see any point in doing this...

Initial = when implementations first appear.

It's important that people know what's going on in order to debug problems, not only when doing interoperability testing, but also "in the wild" later on.


ok, this is part of the issue of silently discarding packets when there is an error, which is discussed in another thread.


And why verify whether the source address is in Ls(peer)? The security mechanisms do all the checking we need.

could you point me to the text you are referring to? i mean there is no such verification in section 7.15, which was the section the previous comment was referring to...

Page 59:

   If such a context exists in ESTABLISHED state, the host verifies that
   the locator of the Initiator is included in Ls(peer) (This check is
   unnecessary if there is no ULID-pair option in the I1 message).

Page 63:

   o  If the state is I1-SENT, then the host verifies if the source
      locator is included in Ls(peer) or, it is included in the Locator
      List contained in the I2 message and the HBA/CGA verification
      for this specific locator is successful


Page 69:

   o  If the state is I1-SENT, then the host verifies if the source
      locator is included in Ls(peer) or, it is included in the Locator
      List contained in the I2 message and the HBA/CGA verification
      for this specific locator is successful


   o  If the state is ESTABLISHED, I2-SENT, or I2BIS-SENT, then the host
      verifies if the source locator is included in Ls(peer) or, it is
      included in the Locator List contained in the I2 message and
      the HBA/CGA verification for this specific locator is successful

Page 74:

   Since context tags can be reused, the host MUST verify that the IPv6
   source address field is part of Ls(peer) and that the IPv6

Page 76:

   Since context tags can be reused, the host MUST verify that the IPv6
   source address field is part of Ls(peer) and that the IPv6
   destination address field is part of Ls(local).

Page 81:

   o  Other control messages (Update, Keepalive, Probe): Deliver to the
      context with CT(local) equal to the Receiver Context Tag included
      in the packet.  Verify that the IPv6 source address field is part
      of Ls(peer) and that the IPv6 destination address field is part of
      Ls(local).  If not, send a R1bis message.



The security checks that you mention, verify that the locators contained in the locator set are bound to the ULID.

These checks that you point out below, check if the particular locator that is being used for sending the packets belongs to the locator set that has previously been bound to the ULID.

So, this is needed in order to avoid accepting packets coming from any source address that is not included in the verified locator set.
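
A rough sketch of this second kind of check, assuming the locator sets are kept as plain arrays of 128-bit addresses. The type struct locator and the functions locator_in_set() and packet_matches_context() are illustrative names, not anything defined in the draft.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative representation of a locator: a 128-bit IPv6 address. */
struct locator { unsigned char addr[16]; };

/* True if loc appears in the given locator set. */
bool locator_in_set(const struct locator *set, size_t n,
                    const struct locator *loc)
{
    for (size_t i = 0; i < n; i++)
        if (memcmp(set[i].addr, loc->addr, sizeof loc->addr) == 0)
            return true;
    return false;
}

/* Accept a packet for a context only if its source address is in the
   already verified Ls(peer) and its destination is in Ls(local). */
bool packet_matches_context(const struct locator *ls_peer, size_t n_peer,
                            const struct locator *ls_local, size_t n_local,
                            const struct locator *src,
                            const struct locator *dst)
{
    return locator_in_set(ls_peer, n_peer, src) &&
           locator_in_set(ls_local, n_local, dst);
}
```

The HBA/CGA check answers "is this locator set bound to the ULID?"; the membership test here answers the separate question "did this packet come from one of those locators?".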



   NO_R1_HOLDDOWN_TIME = 1 min

   ICMP_HOLDDOWN_TIME = 10 min

This seems rather short, basically a shim host talking to a non-shim host would retry setting up the shim every minute or every 10 minutes even though there is good reason to assume this won't be successful. Something like several hours seems more appropriate. (And only when packets are actively exchanged.)

well, NO_R1_HOLDDOWN_TIME is about how long we should wait until we retry when no R1 has been received. Note that this may be due to network outages that can be solved in a short time. So, we have a good reason to assume this will be successful: someone fixed the network path :-)

in the other case, i agree, what about:

   NO_R1_HOLDDOWN_TIME = 5 min

   ICMP_HOLDDOWN_TIME = 60 min
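
As a sketch only, this is how such hold-down timers might gate a retry, using the two values just proposed. The enum, the struct and may_retry_shim() are hypothetical, and the sketch assumes ICMP_HOLDDOWN_TIME applies after an ICMP error came back in response to an I1.

```c
#include <stdbool.h>
#include <time.h>

#define NO_R1_HOLDDOWN_TIME (5 * 60)    /* seconds, value proposed above */
#define ICMP_HOLDDOWN_TIME  (60 * 60)   /* seconds, value proposed above */

/* Why the last context establishment attempt failed (illustrative). */
enum fail_reason { FAIL_NONE, FAIL_NO_R1, FAIL_ICMP };

struct peer_holddown {
    enum fail_reason reason;
    time_t failed_at;        /* when the failure was recorded */
};

/* True if a new I1 may be sent to this peer at time now. */
bool may_retry_shim(const struct peer_holddown *h, time_t now)
{
    long wait;

    switch (h->reason) {
    case FAIL_NO_R1: wait = NO_R1_HOLDDOWN_TIME; break;
    case FAIL_ICMP:  wait = ICMP_HOLDDOWN_TIME;  break;
    default:         return true;   /* no recorded failure */
    }
    return difftime(now, h->failed_at) >= (double)wait;
}
```

Longer values, as asked for in the review, would only change the two constants.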

The case where a firewall silently filters the packets is much more likely. I still find what you list above rather short.


can you suggest values for the above parameters?


The Locator List Option Format only specifies two verification methods at this time: CGA or HBA. What about the case where a locator can be verified using either CGA or HBA? Maybe it makes more sense to have each method be a bit so they can be present or absent independently.

if this turns out to be useful we could define a verification code that would be CGA or HBA but i don't see why this would be useful... do you?

I guess you can argue that if HBA verification is possible, there is never a need to do CGA verification.

Still, I think having bits for this rather than a single code is good for two reasons:

1. In case someone wants to implement either HBA or CGA but not both for IPR reasons, then it becomes useful to announce both HBA and CGA capability

2. So that when in the future a new verification method is introduced that obviously isn't universally implemented but is better than CGA/HBA in some way, so that people may want to take advantage of the new method when supported but fall back on HBA/CGA if not
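
A minimal sketch of the bits idea. The bit assignments, the placeholder VERIFY_NEW and the preference order in pick_method() are assumptions for illustration; the draft currently defines single verification codes, not a bitmask.

```c
#include <stddef.h>

/* Hypothetical bit assignments, one bit per verification method. */
enum {
    VERIFY_HBA = 1u << 0,
    VERIFY_CGA = 1u << 1,
    VERIFY_NEW = 1u << 2   /* placeholder for some future method */
};

/* Pick the most preferred method that the sender offers and the
   receiver implements; returns 0 if there is no common method. */
unsigned pick_method(unsigned offered, unsigned supported)
{
    /* Assumed preference: a newer method first, then HBA, then CGA. */
    static const unsigned preference[] = { VERIFY_NEW, VERIFY_HBA, VERIFY_CGA };
    unsigned common = offered & supported;

    for (size_t i = 0; i < sizeof preference / sizeof preference[0]; i++)
        if (common & preference[i])
            return preference[i];
    return 0;
}
```

With this encoding, a sender that announces VERIFY_HBA | VERIFY_CGA can be verified by a CGA-only receiver, and an implementation that does not know VERIFY_NEW simply never selects it.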


so the open issue here would be to define a code 3 of verification that would be HBA or CGA?

is that the proposal?