From iesg-secretary@ietf.org  Thu Aug  9 08:33:41 2012
Return-Path: <iesg-secretary@ietf.org>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 3A7DB21F874F; Thu,  9 Aug 2012 08:33:41 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.527
X-Spam-Level: 
X-Spam-Status: No, score=-102.527 tagged_above=-999 required=5 tests=[AWL=0.072, BAYES_00=-2.599, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id tNRBHGdisZxS; Thu,  9 Aug 2012 08:33:40 -0700 (PDT)
Received: from ietfa.amsl.com (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 0ACA321F866E; Thu,  9 Aug 2012 08:33:40 -0700 (PDT)
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: The IESG <iesg-secretary@ietf.org>
To: IETF-Announce <ietf-announce@ietf.org>
X-Test-IDTracker: no
X-IETF-IDTracker: 4.33
Message-ID: <20120809153340.13276.51388.idtracker@ietfa.amsl.com>
Date: Thu, 09 Aug 2012 08:33:40 -0700
Cc: armd@ietf.org
Subject: [armd] Last Call: <draft-ietf-armd-problem-statement-03.txt> (Problem	Statement for ARMD) to Informational RFC
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
Reply-To: ietf@ietf.org
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 09 Aug 2012 15:33:41 -0000

The IESG has received a request from the Address Resolution for Massive
numbers of hosts in the Data center WG (armd) to consider the following
document:
- 'Problem Statement for ARMD'
  <draft-ietf-armd-problem-statement-03.txt> as Informational RFC

The IESG plans to make a decision in the next few weeks, and solicits
final comments on this action. Please send substantive comments to the
ietf@ietf.org mailing lists by 2012-08-23. Exceptionally, comments may be
sent to iesg@ietf.org instead. In either case, please retain the
beginning of the Subject line to allow automated sorting.

Abstract


   This document examines address resolution issues related to the
   massive scaling of data centers.  Our initial scope is relatively
   narrow.  Specifically, it focuses on address resolution (ARP and ND)
   within the data center.


The file can be obtained via
http://datatracker.ietf.org/doc/draft-ietf-armd-problem-statement/

IESG discussion can be tracked via
http://datatracker.ietf.org/doc/draft-ietf-armd-problem-statement/ballot/


No IPR declarations have been submitted directly on this I-D.


From jmh@joelhalpern.com  Thu Aug  9 19:28:55 2012
Return-Path: <jmh@joelhalpern.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 81CB721F8646; Thu,  9 Aug 2012 19:28:55 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.265
X-Spam-Level: 
X-Spam-Status: No, score=-102.265 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, IP_NOT_FRIENDLY=0.334, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id qpZ8ls+y1mzc; Thu,  9 Aug 2012 19:28:55 -0700 (PDT)
Received: from morbo.mail.tigertech.net (morbo.mail.tigertech.net [67.131.251.54]) by ietfa.amsl.com (Postfix) with ESMTP id 054BE21F8644; Thu,  9 Aug 2012 19:28:55 -0700 (PDT)
Received: from mailc2.tigertech.net (mailc2.tigertech.net [208.80.4.156]) by morbo.tigertech.net (Postfix) with ESMTP id D79AF557F2F; Thu,  9 Aug 2012 19:28:54 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1]) by mailc2.tigertech.net (Postfix) with ESMTP id 072541619C1; Thu,  9 Aug 2012 19:28:54 -0700 (PDT)
X-Virus-Scanned: Debian amavisd-new at c2.tigertech.net
Received: from [192.168.1.2] (c-71-204-207-35.hsd1.de.comcast.net [71.204.207.35]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mailc2.tigertech.net (Postfix) with ESMTPSA id 159421619C0; Thu,  9 Aug 2012 19:28:52 -0700 (PDT)
Message-ID: <502471DB.80303@joelhalpern.com>
Date: Thu, 09 Aug 2012 22:28:43 -0400
From: "Joel M. Halpern" <jmh@joelhalpern.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20120713 Thunderbird/14.0
MIME-Version: 1.0
To: "A. Jean Mahoney" <mahoney@nostrum.com>
References: <50243C05.3080006@nostrum.com>
In-Reply-To: <50243C05.3080006@nostrum.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: gen-art@ietf.org, "armd@ietf.org" <armd@ietf.org>
Subject: [armd] Gen-art] review: draft-ietf-armd-problem-statement-03
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Aug 2012 02:28:55 -0000

I am the assigned Gen-ART reviewer for this draft. For background on
Gen-ART, please see the FAQ at

<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq> .

Please resolve these comments along with any other Last Call comments
you may receive.

Document: draft-ietf-armd-problem-statement-03
     Problem Statement for ARMD
Reviewer: Joel M. Halpern
Review Date: 9-Aug-2012
IETF LC End Date: 23-Aug-2012
IESG Telechat date: N/A

Summary: This document is almost ready for publication as an 
Informational RFC

Major issues:
     The use of the term "switch" seems confusing.  I had first assumed 
that it meant an Ethernet switch (which might have a bit of L3 smarts, 
or might not; I was trying not to be picky).  But then, in section 6.3, 
it refers to "core switches ... are the data center gateways to external 
networks", which means that those are routers.

Moderate Issue:
    The document seems to be interestingly selective in what modern 
technologies it chooses to mention.  Mostly it seems to be describing 
problems with data center networks using technology more than 5 years 
old.  Since that is the widely deployed practice, that is defensible. 
But then the document chooses to mention new work such as OpenFlow, 
without mentioning the work IEEE has done on broadcast and multicast 
containment for data centers.  It seems to me that we need to be 
consistent, either describing only the widely deployed technology, or 
including a fair mention of already defined and productized solutions 
that are not yet widely deployed.

     On a related note, the document assumes that multicast NDs are 
delivered to all nodes, while in practice I believe existing techniques 
to filter such multicast messages closer to the source are widely 
deployed.  (Section 5.)

Minor issues:
     I presume that section 6.4.2 which describes needing to enable all 
VLANs on all aggregation ports is a description of current practice, 
since it is not a requirement of current technologies, either via VLAN 
management or orchestration?

     Section 6.4.4 seems very odd.  The title is "overlays".  Are there 
widely deployed overlays?  If so, it would be good to name the 
technologies being referred to here.  If this is intended to refer to 
the overlay proposal in IETF and IEEE, I think that the characterization 
is somewhat misleading, and probably is best simply removed.

     Is the fifth paragraph of section 7.1 on ARP processing and 
buffering in the absence of ARP cache entries accurate?  I may well be 
out of date, but it used to be the case that most routers dropped the 
packets, and some would buffer 1 packet deep at most.  This description 
indicates a rather more elaborate behavior.

     Given that this document says it is a general document about 
scaling issues for data centers, I am surprised that the security 
considerations section does not touch on the increased complexity of 
segregating subscriber traffic (customer A can not talk to customer B) 
when there are very large numbers of customers, and the interaction of 
this with L2 scope.

Nits/editorial comments:

From narten@us.ibm.com  Mon Aug 27 14:25:23 2012
Return-Path: <narten@us.ibm.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 7585221F8508 for <armd@ietfa.amsl.com>; Mon, 27 Aug 2012 14:25:23 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -110.599
X-Spam-Level: 
X-Spam-Status: No, score=-110.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id NwxuZchnZ04C for <armd@ietfa.amsl.com>; Mon, 27 Aug 2012 14:25:22 -0700 (PDT)
Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.152]) by ietfa.amsl.com (Postfix) with ESMTP id 1CE2B21F84F2 for <armd@ietf.org>; Mon, 27 Aug 2012 14:25:22 -0700 (PDT)
Received: from /spool/local by e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for <armd@ietf.org> from <narten@us.ibm.com>; Mon, 27 Aug 2012 15:25:21 -0600
Received: from d03dlp02.boulder.ibm.com (9.17.202.178) by e34.co.us.ibm.com (192.168.1.134) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted;  Mon, 27 Aug 2012 15:25:19 -0600
Received: from d03relay02.boulder.ibm.com (d03relay02.boulder.ibm.com [9.17.195.227]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id 640283E40040; Mon, 27 Aug 2012 15:25:18 -0600 (MDT)
Received: from d03av05.boulder.ibm.com (d03av05.boulder.ibm.com [9.17.195.85]) by d03relay02.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q7RLOsnd167520; Mon, 27 Aug 2012 15:24:57 -0600
Received: from d03av05.boulder.ibm.com (loopback [127.0.0.1]) by d03av05.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q7RLOrfP001214; Mon, 27 Aug 2012 15:24:53 -0600
Received: from cichlid.raleigh.ibm.com ([9.80.24.175]) by d03av05.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id q7RLOqG3001151 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 27 Aug 2012 15:24:52 -0600
Received: from cichlid.raleigh.ibm.com (localhost.localdomain [127.0.0.1]) by cichlid.raleigh.ibm.com (8.14.5/8.12.5) with ESMTP id q7RLOnx7015943; Mon, 27 Aug 2012 17:24:49 -0400
Message-Id: <201208272124.q7RLOnx7015943@cichlid.raleigh.ibm.com>
To: "Bhatia, Manav (Manav)" <manav.bhatia@alcatel-lucent.com>
In-reply-to: <7C362EEF9C7896468B36C9B79200D8350D063A0AF5@INBANSXCHMBSA1.in.alcatel-lucent.com>
References: <7C362EEF9C7896468B36C9B79200D8350D063A0AF5@INBANSXCHMBSA1.in.alcatel-lucent.com>
Comments: In-reply-to "Bhatia, Manav (Manav)" <manav.bhatia@alcatel-lucent.com> message dated "Wed, 15 Aug 2012 18:09:38 +0530."
Date: Mon, 27 Aug 2012 17:24:48 -0400
From: Thomas Narten <narten@us.ibm.com>
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 12082721-1780-0000-0000-000008C7172E
X-IBM-ISS-SpamDetectors: 
X-IBM-ISS-DetailInfo: BY=3.00000293; HX=3.00000196; KW=3.00000007; PH=3.00000001; SC=3.00000007; SDB=6.00169038; UDB=6.00038311; UTC=2012-08-27 21:25:20
Cc: "rtg-dir@ietf.org" <rtg-dir@ietf.org>, armd@ietf.org, "draft-ietf-armd-problem-statement.all@tools.ietf.org" <draft-ietf-armd-problem-statement.all@tools.ietf.org>, "rtg-ads@tools.ietf.org" <rtg-ads@tools.ietf.org>
Subject: Re: [armd] RtgDir review: draft-ietf-armd-problem-statement-03
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 27 Aug 2012 21:25:23 -0000

Hi Manav.

Thanks for the review comments.

"Bhatia, Manav (Manav)" <manav.bhatia@alcatel-lucent.com> writes:

> Summary: I have some concerns about this document that I think
>  should be resolved before publication.

> Major Issues:

> 1. In Sec 5 why is there a "may" in the following statement?

> "From an L2 perspective, sending to a multicast vs. broadcast
>  address *may* result in the packet being delivered to all nodes,
>  but most (if not all) nodes will filter out the (unwanted) query
>  via filters installed in the NIC -- hosts will never see such
>  packets. "

This is poorly worded. How about I replace  the paragraph with the
following:

	Broadly speaking, from the perspective of address resolution,
        IPv6's Neighbor Discovery (ND) behaves much like ARP, with a
        few notable differences. First, ARP uses broadcast, whereas ND
        uses multicast. Specifically, when querying for a target IP
        address, ND maps the target address into an IPv6 Solicited
        Node multicast address. Using multicast rather than broadcast
        has the benefit that the multicast frames do not necessarily
        need to be sent to all parts of the network, i.e., only to
        segments where listeners for the Solicited Node multicast
        address reside. In the case where multicast frames are
        delivered to all parts of the network, sending to a multicast
        address still has the advantage that most (if not all) nodes
        will filter out the (unwanted) multicast query via filters
        installed in the NIC rather than burdening host software with
        the need to process such packets. Thus, whereas all nodes must
        process every ARP query, ND queries are processed only by the
        nodes for which they are intended. In cases where multicast
        filtering can't effectively be implemented in the NIC (e.g.,
        as on hypervisors supporting virtualization), filtering would
        need to be done in software (e.g., in the hypervisor's
        vSwitch).
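
As an illustration of the mapping described above, here is a minimal
Python sketch that derives the Solicited-Node multicast address for a
target and the Ethernet multicast MAC that a NIC filter would match.
The function names are made up for this example; the address formats
(ff02::1:ffXX:XXXX built from the low 24 bits of the target, and 33:33
followed by the low 32 bits of the group) are the ones defined in
RFC 4291 and RFC 2464.

```python
import ipaddress


def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """Map a unicast IPv6 target to its Solicited-Node multicast group."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Map an IPv6 multicast group to its Ethernet destination MAC."""
    low32 = int(group) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)


group = solicited_node_multicast("2001:db8::1:800:200e:8c6c")
mac = multicast_mac(group)
```

Two targets land on the same group (and the same NIC filter entry) only
when their low 24 bits match, which is why most nodes can discard a
given query without involving host software.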

> "may" seems to indicate that there are scenarios when a multicast
>  from an L2 perspective will not be delivered to all nodes.

Correct.

> I am unable to envisage a scenario when this can happen? All BUM
>  (broadcast, unlearnt unicast and multicast) traffic in vanilla L2
>  and VPLS (Virtual Private Lan Service) is delivered to *all*
>  nodes. There are exceptions in H-VPLS or if MMRP is enabled but I
>  suspect if the authors had this in their mind when they wrote the
>  above text.

Hopefully the proposed text answers the above questions.

> 2. Sec 7.1 begins with the following text:

> "One pain point with large L2 broadcast domains is that the routers
>  connected to the L2 domain need to process "a lot of" ARP traffic."

> I am not sure if this is correct with how an L2 broadcast domain has
>  been defined in Sec 2. I would wager that a bigger pain point for a
>  large L2 broadcast domain would be handling unknown unicast traffic
>  that needs to get flooded, instead of dealing with the "ARP"
>  traffic. I am aware of very very large L2 broadcast domains that
>  have no ARP/ND scaling problems. Would it then make more sense to
>  replace the L2 broadcast domain with an ARP/ND domain? If Yes, then
>  ARP/ND domain too needs to be defined in Sec 2.

The issue (as has been discussed in ARMD) is specifically the ARP
processing load (and not unknown unicast traffic). In typical
implementations, ARP processing is done by a service processor with
limited capacity. The cited problem is that the amount of ARP traffic
places a significant load on that processor.

This is explained in the next paragraph. How about I add the following
sentence to the 2nd paragraph:

     In some deployments, limitations on the rate of ARP processing
     have been cited as being a problem.

Does that work?     

> 3. Sec 7.1 seems to suggest that Gratuitous ARPs pre-populate ARP
>  caches on the neighboring devices. Without an explicit description
>  of what a neighboring device is, I would presume that this also
>  includes edge/core routers. In that case this statement is not
>  entirely correct as I am aware of routers that will by default not
>  pre-populate their ARP caches on receiving Gratuitous ARPs.

Right. The spec says "don't do this". But I believe it was asserted
that some implementations do this. That said, I'm not aware of any
such implementations. I would be willing to remove this sentence in
the absence of known implementations of this.

> 4. Sec 7.2 must also discuss the scaling impact of how the neighbor
>  cache is maintained in IPv6 - especially the impact of moving the
>  neighbor state from REACHABLE to STALE. Once the "IPv6 ARP" gets
>  resolved the neighbor entry moves from the REACHABLE to STALE after
>  around 30secs. The neighbor entry remains in this state till a
>  packet needs to be forwarded to this neighbor. The first time a
>  node sends a packet to a neighbor whose entry is STALE, the sender
>  changes the state to DELAY and sets a timer to expire in around 5
>  seconds. Most routers initiate moving the state from STALE to DELAY
>  by punting a copy of the data packet to CPU so that the sender can
>  reinitiate the Neighbor discovery process. This patently can be
>  quite CPU and buffer intensive if the neighbor cache size is huge.

This could be. But the WG did not report such specific details in
terms of actual problems reported from deployments.

Care to say more about what these "most implementations" are and how
common they are? And are they the *only* way to implement this
feature, or have other vendors chosen different implementations
without this limitation?

That said, I could add the following to the document:

	Routers implementing NUD (for neighboring destinations) will
	need to process neighbor cache state changes such as
	transitioning entries from REACHABLE to STALE. How this
	capability is implemented may impact the scalability of ND
	on a router. For example, one possible implementation is to
	have the forwarding operation detect when an ND entry is
	referenced that needs to transition from REACHABLE to STALE,
	by signalling an event that would need to be processed by the
	software processor. Such an implementation could increase the
	load on the service processor much in the same way that a high
	rate of ARP requests has led to problems on some routers.
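
The state handling under discussion can be sketched as follows. This is
a simplified model and not any vendor's implementation: the class and
method names are hypothetical, only the REACHABLE, STALE, DELAY and
PROBE states are modeled, and the 30 s and 5 s values are the RFC 4861
defaults for the base reachable time and DELAY_FIRST_PROBE_TIME. The
return value of on_send() stands in for the event that a slow-path
processor would have to handle.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    REACHABLE = "REACHABLE"
    STALE = "STALE"
    DELAY = "DELAY"
    PROBE = "PROBE"


REACHABLE_TIME = 30.0          # seconds, RFC 4861 default base value
DELAY_FIRST_PROBE_TIME = 5.0   # seconds, RFC 4861 default


@dataclass
class NeighborEntry:
    state: State
    since: float  # time at which the entry entered its current state

    def tick(self, now: float) -> None:
        """Apply the timer-driven transitions."""
        age = now - self.since
        if self.state is State.REACHABLE and age >= REACHABLE_TIME:
            self.state, self.since = State.STALE, now
        elif self.state is State.DELAY and age >= DELAY_FIRST_PROBE_TIME:
            self.state, self.since = State.PROBE, now

    def on_send(self, now: float) -> bool:
        """Forward a packet to this neighbor.

        Returns True when the reference triggers a state change that
        (in this sketch) must be signalled to the slow path.
        """
        self.tick(now)
        if self.state is State.STALE:
            self.state, self.since = State.DELAY, now
            return True
        return False


entry = NeighborEntry(State.REACHABLE, since=0.0)
fresh = entry.on_send(10.0)    # still REACHABLE, no event
expired = entry.on_send(31.0)  # aged out, now DELAY, event raised
```

With a large neighbor cache, the number of entries returning True per
second is the quantity that would load the service processor.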


> Minor Issues:

> 1. Sec 2 - Terminology should define Address Resolution as this
>  seems to be the core issue that the draft is discussing.

> Address Resolution: Address resolution is the process through which
>  a node determines the link-layer address of a neighbor given only
>  its IP address.  In IPv6, address resolution is performed as part
>  of Neighbor Discovery [RFC4861], Section 7.2.

How about:

	  <t hangText="Address Resolution:">
	    the process of determining the link-layer address
	    corresponding to a given IP address. In IPv4, address
	    resolution is performed by ARP <xref target="RFC0826"></xref>; in
	    IPv6, it is provided by Neighbor Discovery
	    (ND) <xref target="RFC4861"></xref>.


> 2. In Sec 7.1 you mention that routers need to drop all transit
>  traffic when there is no response received for an ARP/ND
>  request. You should mention that in addition to this, routers also
>  need to send an ICMP host unreachable error packet back to the
>  sender. ICMP error packets are generated in the control card
>  CPU. So, if the CPU has to generate a high number of such ICMP
>  errors then this can load the CPU. The whole process can be quite
>  CPU as well as buffer intensive. The CPU/buffer overload is usually
>  mitigated by rate limiting the number of ICMP errors generated.
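
One common way to bound that load is a token bucket in front of the
ICMP error generator. The sketch below is illustrative only: the review
does not say which rate-limiting scheme routers use, and the class name
and parameters are assumptions made for this example.

```python
class TokenBucket:
    """Allow at most `rate` events per second, with bursts up to `burst`."""

    def __init__(self, rate: float, burst: int) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill according to the time elapsed since the last call.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # generate the ICMP error
        return False      # suppress it; the packet is dropped silently


limiter = TokenBucket(rate=1.0, burst=2)
decisions = [limiter.allow(0.0), limiter.allow(0.0),
             limiter.allow(0.0), limiter.allow(1.0)]
```

Here the first two errors are sent, the third is suppressed, and one
more is allowed a second later once a token has been refilled.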

Added:

   "and may send an ICMP destination unreachable message as well."

> 3. In Sec 7.1 you mention that the entire ARP/ND process can be
>  quite CPU intensive since transit data traffic needs to be queued
>  while the address resolution is underway. You could mention that
>  this is mitigated by offloading the queuing part to the line card
>  CPUs so that the CPU on the control card is not inundated with such
>  packets. This obviously would only work on distributed systems that
>  have separate CPUs on the line cards and the main card.

There are many things one could say about ARP implementations. But
that is not the purpose of this document. It is really about outlining
the problems... So I think the above is getting too detailed.

> 4. Sec 7.1 should mention that this could be used as a DoS attack
>  wherein the attacker sends a high volume of packets for which ARPs
>  need to be resolved. This could result in genuine packets that need
>  to resolve ARPs getting dropped as there is only a finite rate at
>  which packets are sent to CPU for ARP resolution. Again this is
>  both CPU and buffer intensive.

Again, I don't think this document needs to cover all aspects of ND.

> 5. Sec 7.2 discusses issues with address resolution mechanism in
>  IPv6. I think its useful for this draft to discuss the fact that
>  unlike IPv4, IPv6 has subnets that are /64. This number is quite
>  large and will perhaps cover trillions of IP addresses, most of
>  which would be unassigned. Thus simplistic IPv6 ND implementations
>  can be vulnerable to attacks which inundates the CPU with huge
>  requests to perform address resolution for a large number of IPv6
>  addresses, most of which are unassigned. As a result of this
>  genuine IPv6 devices will not be able to join the network. You
>  might want to refer to RFC 6583 for more details.

Ditto.
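
For scale, the arithmetic behind the /64 concern can be worked out
directly. The prefix below is a documentation prefix, and the cache
size and probe rate are invented numbers chosen only to show the shape
of the calculation.

```python
import ipaddress

subnet = ipaddress.ip_network("2001:db8:0:1::/64")  # documentation prefix
total = subnet.num_addresses  # 2**64, roughly 1.8 x 10**19


def seconds_to_fill(cache_entries: int, probes_per_second: int) -> float:
    """Time for unanswered resolutions to occupy every cache slot,
    assuming each probe targets a distinct unassigned address and no
    entry expires in the meantime."""
    return cache_entries / probes_per_second


# Hypothetical router: 100,000 neighbor entries, 10,000 probes/s.
fill_time = seconds_to_fill(100_000, 10_000)
```

Under those assumptions the cache is exhausted in about ten seconds,
while the attacker has touched a vanishingly small fraction of the
subnet.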

> 6. The last paragraph of Sec 7.3 says the following:

> "Finally, IPv6 and IPv4 are often run simultaneously and in parallel
>  on the same network, i.e., in dual-stack mode.  In such
>  environments, the IPv4 and IPv6 issues enumerated above compound
>  each other."

> While I understand the sentiment behind the above statement, I fail
>  to see how this is related to the MAC problem being described in
>  Sec 7.3. The MAC scaling is a function of the total number of
>  unique MACs that the system has to learn and is orthogonal to the
>  presence of IPv4 or IPv6. I read this statement to mean that
>  something extra happens in the dual stack mode which exacerbates
>  the MAC problem even further. This I believe is patently not the
>  case.

That paragraph was intended to cover all of 7.1 and 7.2, and not be in
7.3. I'll move it.

> 7. Sec 11 - Security Considerations should at the very least give
>  pointers to references on issues related to ARP security
>  vulnerabilities. I don't see IPv6 ND mentioned at all. Since ND
>  relies on ICMPv6 and does not run directly over layer 2, there
>  could possibly be security concerns specific to ND in the data
>  center environments that don't apply to ARP. This document ought to
>  discuss those so that ARMD (or some other WG) can look at solutions
>  addressing those concerns.

Actually, I disagree somewhat. This document doesn't need to get into
all the security issues of ARP and/or ND. For one thing, they did not
come up as "problems" in ARMD. :-) I will put in pointers to the ND
security considerations section. How about I add the following
sentence:

    Security considerations for Neighbor Discovery are discussed in
    <xref target="RFC4861"></xref> and <xref target="RFC6583"></xref>.

> 8. Should it be mentioned in the document somewhere (sec 11?) that
>  data center administrators can configure ACLs to filter packets
>  addressed to unallocated IPv6 addresses? Folks can consider the
>  valid IPv6 address ranges and filter out packets that use the
>  unallocated addresses. Doing this will avoid unnecessary ARP
>  resolution for invalid IPv6 addresses. The list of the IPv6
>  addresses that are legitimate and should be permitted is small and
>  maintainable because of IPv6's address
>  hierarchy. http://www.iana.org/assignments/ipv6-unicast-address-assignments/ipv6-unicast-address-assignments.xml
>  gives a list of large address blocks that have been allocated by
>  IANA.

IMO no. This goes beyond the scope of this document.
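
For readers unfamiliar with the kind of filter being suggested, a
minimal sketch of the check is shown below. The allow-list entries are
placeholder documentation prefixes, not real allocations, and the
function name is hypothetical; an actual ACL would be expressed in the
router's own configuration language.

```python
import ipaddress

# Hypothetical allow-list; a deployment would list its assigned prefixes.
ALLOWED = [
    ipaddress.IPv6Network("2001:db8:a::/48"),
    ipaddress.IPv6Network("2001:db8:b::/48"),
]


def permitted(dst: str) -> bool:
    """Return True if dst falls inside one of the allowed prefixes."""
    addr = ipaddress.IPv6Address(dst)
    return any(addr in net for net in ALLOWED)


inside = permitted("2001:db8:a::1")
outside = permitted("2001:db8:c::1")
```

Packets failing the check would be dropped before any address
resolution is attempted for them.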

> Cheers, Manav

Thanks again for your detailed review!!

Thomas


From narten@us.ibm.com  Mon Aug 27 14:32:01 2012
Return-Path: <narten@us.ibm.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 836CE21E8039 for <armd@ietfa.amsl.com>; Mon, 27 Aug 2012 14:32:01 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -110.599
X-Spam-Level: 
X-Spam-Status: No, score=-110.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id IxzD3+zIMI+d for <armd@ietfa.amsl.com>; Mon, 27 Aug 2012 14:32:00 -0700 (PDT)
Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.149]) by ietfa.amsl.com (Postfix) with ESMTP id 8D50621E803D for <armd@ietf.org>; Mon, 27 Aug 2012 14:32:00 -0700 (PDT)
Received: from /spool/local by e31.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for <armd@ietf.org> from <narten@us.ibm.com>; Mon, 27 Aug 2012 15:32:00 -0600
Received: from d03dlp02.boulder.ibm.com (9.17.202.178) by e31.co.us.ibm.com (192.168.1.131) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted;  Mon, 27 Aug 2012 15:31:56 -0600
Received: from d03relay01.boulder.ibm.com (d03relay01.boulder.ibm.com [9.17.195.226]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id A0B533E4003D for <armd@ietf.org>; Mon, 27 Aug 2012 15:31:55 -0600 (MDT)
Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167]) by d03relay01.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q7RLU3IX123790 for <armd@ietf.org>; Mon, 27 Aug 2012 15:30:05 -0600
Received: from d03av01.boulder.ibm.com (loopback [127.0.0.1]) by d03av01.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q7RLTvms028663 for <armd@ietf.org>; Mon, 27 Aug 2012 15:29:57 -0600
Received: from cichlid.raleigh.ibm.com ([9.80.24.175]) by d03av01.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id q7RLTuRN028625 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for <armd@ietf.org>; Mon, 27 Aug 2012 15:29:56 -0600
Received: from cichlid.raleigh.ibm.com (localhost.localdomain [127.0.0.1]) by cichlid.raleigh.ibm.com (8.14.5/8.12.5) with ESMTP id q7RLTtCK015974 for <armd@ietf.org>; Mon, 27 Aug 2012 17:29:55 -0400
Message-Id: <201208272129.q7RLTtCK015974@cichlid.raleigh.ibm.com>
To: armd@ietf.org
Date: Mon, 27 Aug 2012 17:29:55 -0400
From: Thomas Narten <narten@us.ibm.com>
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 12082721-7282-0000-0000-00000C5B73A0
Subject: [armd] Gratuitous ARP pre-populating ARP caches.
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 27 Aug 2012 21:32:01 -0000

Hi. The RtgDir review of this document raised an issue, to which I
responded as follows. Anyone care to comment on this point?

Thomas Narten <narten@us.ibm.com> writes:

> > 3. Sec 7.1 seems to suggest that Gratuitous ARPs pre-populate ARP
> >  caches on the neighboring devices. Without an explicit description
> >  of what a neighboring device is, I would presume that this also
> >  includes edge/core routers. In that case this statement is not
> >  entirely correct as I am aware of routers that will by default not
> >  pre-populate their ARP caches on receiving Gratuitous ARPs.

> Right. The spec says "don't do this". But I believe it was asserted
> that some implementations do this. That said, I'm not aware of any
> such implementations. I would be willing to remove this sentence in
> the absence of known implementations of this.

Thomas


From warren@kumari.net  Mon Aug 27 17:54:36 2012
Return-Path: <warren@kumari.net>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 9969D21F8489 for <armd@ietfa.amsl.com>; Mon, 27 Aug 2012 17:54:36 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -106.521
X-Spam-Level: 
X-Spam-Status: No, score=-106.521 tagged_above=-999 required=5 tests=[AWL=0.078, BAYES_00=-2.599, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id EOkxkHVe3k-U for <armd@ietfa.amsl.com>; Mon, 27 Aug 2012 17:54:36 -0700 (PDT)
Received: from vimes.kumari.net (vimes.kumari.net [198.186.192.250]) by ietfa.amsl.com (Postfix) with ESMTP id 1E71521F842F for <armd@ietf.org>; Mon, 27 Aug 2012 17:54:35 -0700 (PDT)
Received: from [10.242.21.221] (m2a5f36d0.tmodns.net [208.54.95.42]) by vimes.kumari.net (Postfix) with ESMTPSA id 1F6261B40674; Mon, 27 Aug 2012 20:54:35 -0400 (EDT)
Mime-Version: 1.0 (Apple Message framework v1278)
Content-Type: text/plain; charset=windows-1252
From: Warren Kumari <warren@kumari.net>
In-Reply-To: <201208272129.q7RLTtCK015974@cichlid.raleigh.ibm.com>
Date: Mon, 27 Aug 2012 20:54:32 -0400
Content-Transfer-Encoding: quoted-printable
Message-Id: <358551A0-56F1-43DF-AD46-E98B4FF9E7BE@kumari.net>
References: <201208272129.q7RLTtCK015974@cichlid.raleigh.ibm.com>
To: Thomas Narten <narten@us.ibm.com>
X-Mailer: Apple Mail (2.1278)
Cc: armd@ietf.org
Subject: Re: [armd] Gratuitous ARP pre-populating ARP caches.
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 28 Aug 2012 00:54:36 -0000

On Aug 27, 2012, at 5:29 PM, Thomas Narten wrote:

> Hi. The RtgDir review of this document raised an issue, to which I
> responded as follows. Anyone care to comment on this point?

Cisco IOS used to prepopulate its ARP cache from Gratuitous ARPs, but I
do not think that this was the default behavior.

I used to rely on the gratuitous ARP behavior in a network back in
~1994. We had some weird home-grown network widgets that simply didn't
do ARP, and so we had some other device on the same LAN that would send
(spoofed) gratuitous ARPs on their behalf every minute or two.

A number of load balancers (and similar devices) do failover using
Gratuitous ARPs -- when the primary goes down the backup sends gARPs for
all of the VIPs. Netscaler used to send these at GigE line rate -- if
there were a large number of VIPs the routers would often not be able to
keep up, and hilarity would ensue...

W

>
> Thomas Narten <narten@us.ibm.com> writes:
>
>>> 3. Sec 7.1 seems to suggest that Gratuitous ARPs pre-populate ARP
>>> caches on the neighboring devices. Without an explicit description
>>> of what a neighboring device is, I would presume that this also
>>> includes edge/core routers. In that case this statement is not
>>> entirely correct as I am aware of routers that will by default not
>>> pre-populate their ARP caches on receiving Gratuitous ARPs.
>
>> Right. The spec says "don't do this". But I believe it was asserted
>> that some implementations do this. That said, I'm not aware of any
>> such implementations. I would be willing to remove this sentence in
>> the absence of known implementations of this.
>
> Thomas
>
> _______________________________________________
> armd mailing list
> armd@ietf.org
> https://www.ietf.org/mailman/listinfo/armd
>

--
After you'd known Christine for any length of time, you found yourself
fighting a desire to look into her ear to see if you could spot daylight
coming the other way.

    -- (Terry Pratchett, Maskerade)


From manav.bhatia@alcatel-lucent.com  Tue Aug 28 01:25:17 2012
Return-Path: <manav.bhatia@alcatel-lucent.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2D13F11E80DE; Tue, 28 Aug 2012 01:25:17 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -9.349
X-Spam-Level: 
X-Spam-Status: No, score=-9.349 tagged_above=-999 required=5 tests=[AWL=1.250,  BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id NBIb85gHjatM; Tue, 28 Aug 2012 01:25:15 -0700 (PDT)
Received: from ihemail2.lucent.com (ihemail2.lucent.com [135.245.0.35]) by ietfa.amsl.com (Postfix) with ESMTP id C0D4D11E80A3; Tue, 28 Aug 2012 01:25:15 -0700 (PDT)
Received: from inbansmailrelay1.in.alcatel-lucent.com (h135-250-11-31.lucent.com [135.250.11.31]) by ihemail2.lucent.com (8.13.8/IER-o) with ESMTP id q7S8PBwW019574 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Tue, 28 Aug 2012 03:25:14 -0500 (CDT)
Received: from INBANSXCHHUB02.in.alcatel-lucent.com (inbansxchhub02.in.alcatel-lucent.com [135.250.12.35]) by inbansmailrelay1.in.alcatel-lucent.com (8.14.3/8.14.3/GMO) with ESMTP id q7S8P9Gl002910 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NOT); Tue, 28 Aug 2012 13:55:10 +0530
Received: from INBANSXCHMBSA1.in.alcatel-lucent.com ([135.250.12.38]) by INBANSXCHHUB02.in.alcatel-lucent.com ([135.250.12.35]) with mapi; Tue, 28 Aug 2012 13:55:08 +0530
From: "Bhatia, Manav (Manav)" <manav.bhatia@alcatel-lucent.com>
To: Thomas Narten <narten@us.ibm.com>
Date: Tue, 28 Aug 2012 13:50:59 +0530
Thread-Topic: RtgDir review: draft-ietf-armd-problem-statement-03
Thread-Index: Ac2Emn+J32bcE+5qQJSlycTwjNR6KgAOZYow
Message-ID: <7C362EEF9C7896468B36C9B79200D8350D06450BB6@INBANSXCHMBSA1.in.alcatel-lucent.com>
References: <7C362EEF9C7896468B36C9B79200D8350D063A0AF5@INBANSXCHMBSA1.in.alcatel-lucent.com> <201208272124.q7RLOnx7015943@cichlid.raleigh.ibm.com>
In-Reply-To: <201208272124.q7RLOnx7015943@cichlid.raleigh.ibm.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.57 on 135.245.2.35
Cc: "rtg-dir@ietf.org" <rtg-dir@ietf.org>, "armd@ietf.org" <armd@ietf.org>, "draft-ietf-armd-problem-statement.all@tools.ietf.org" <draft-ietf-armd-problem-statement.all@tools.ietf.org>, "rtg-ads@tools.ietf.org" <rtg-ads@tools.ietf.org>
Subject: Re: [armd] RtgDir review: draft-ietf-armd-problem-statement-03
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 28 Aug 2012 08:25:17 -0000

Hi Thomas,

[clipped]

> This is poorly worded. How about I replace the paragraph with the
> following:
>
> 	Broadly speaking, from the perspective of address resolution,
>         IPv6's Neighbor Discovery (ND) behaves much like ARP, with a
>         few notable differences. First, ARP uses broadcast, whereas ND
>         uses multicast. Specifically, when querying for a target IP
>         address, ND maps the target address into an IPv6 Solicited
>         Node multicast address. Using multicast rather than broadcast
>         has the benefit that the multicast frames do not necessarily
>         need to be sent to all parts of the network, i.e., only to
>         segments where listeners for the Solicited Node multicast
>         address reside. In the case where multicast frames are
>         delivered to all parts of the network, sending to a multicast
>         still has the advantage that most (if not all) nodes will
>         filter out the (unwanted) multicast query via filters
>         installed in the NIC rather than burdening host software with
>         the need to process such packets. Thus, whereas all nodes must
>         process every ARP query, ND queries are processed only by the
>         nodes to which they are intended. In cases where multicast
>         filtering can't effectively be implemented in the NIC (e.g.,
>         as on hypervisors supporting virtualization), filtering would
>         need to be done in software (e.g., in the hypervisor's
>         vSwitch).

>
> > "may" seems to indicate that there are scenarios when a multicast
> >  from an L2 perspective will not be delivered to all nodes.
>
> Correct.
>
> > I am unable to envisage a scenario when this can happen? All BUM
> >  (broadcast, unlearnt unicast and multicast) traffic in vanilla L2
> >  and VPLS (Virtual Private Lan Service) is delivered to *all*
> >  nodes. There are exceptions in H-VPLS or if MMRP is enabled but I
> >  suspect if the authors had this in their mind when they wrote the
> >  above text.
>
> Hopefully the proposed text answers the above questions.

Thanks, the proposed text is much better.

However, the draft still says "multicast frames do not necessarily need
to be sent to all parts of the network". I could be missing something,
but there still seems to be some disconnect, because in the context of
L2, multicast frames will be sent to all parts of the network.

>
> > 2. Sec 7.1 begins with the following text:
>
> > "One pain point with large L2 broadcast domains is that the routers
> >  connected to the L2 domain need to process "a lot of" ARP traffic."
>=20
> > I am not sure if this is correct with how an L2 broadcast domain has
> >  been defined in Sec 2. I would wager that a bigger pain point for a
> >  large L2 broadcast domain would be handling unknown unicast traffic
> >  that needs to get flooded, instead of dealing with the "ARP"
> >  traffic. I am aware of very very large L2 broadcast domains that
> >  have no ARP/ND scaling problems. Would it then make more sense to
> >  replace the L2 broadcast domain with an ARP/ND domain? If Yes, then
> >  ARP/ND domain too needs to be defined in Sec 2.
>
> The issue (as has been discussed in ARMD) is specifically the ARP
> processing load (and not unknown unicast traffic). In typical
> implementations, ARP processing is done by a service processor with
> limited capacity. The cited problem is that the amount of ARP traffic
> places a significant load on that processor.
>
> This is explained in the next paragraph. How about I add the following
> sentence to the 2nd paragraph:
>
>      In some deployments, limitations on the rate of ARP processing
>      have been cited as being a problem.
>
> Does that work?

Yes it does as long as you remove the original line that I had quoted.

>
> > 3. Sec 7.1 seems to suggest that Gratuitous ARPs pre-populate ARP
> >  caches on the neighboring devices. Without an explicit description
> >  of what a neighboring device is, I would presume that this also
> >  includes edge/core routers. In that case this statement is not
> >  entirely correct as I am aware of routers that will by default not
> >  pre-populate their ARP caches on receiving Gratuitous ARPs.
>
> Right. The spec says "don't do this". But I believe it was asserted
> that some implementations do this. That said, I'm not aware of any
> such implementations. I would be willing to remove this sentence in
> the absence of known implementations of this.

This clearly is not the default behavior for several core/edge router
implementations that I am aware of. So at best there could be a subset
of routers that do this. In which case you need to fix the text that
claims that *all* routers pre-populate ARP caches upon receiving
Gratuitous ARPs.

>
> > 4. Sec 7.2 must also discuss the scaling impact of how the neighbor
> >  cache is maintained in IPv6 - especially the impact of moving the
> >  neighbor state from REACHABLE to STALE. Once the "IPv6 ARP" gets
> >  resolved the neighbor entry moves from the REACHABLE to STALE after
> >  around 30secs. The neighbor entry remains in this state till a
> >  packet needs to be forwarded to this neighbor. The first time a
> >  node sends a packet to a neighbor whose entry is STALE, the sender
> >  changes the state to DELAY and sets a timer to expire in around 5
> >  seconds. Most routers initiate moving the state from STALE to DELAY
> >  by punting a copy of the data packet to CPU so that the sender can
> >  reinitiate the Neighbor discovery process. This patently can be
> >  quite CPU and buffer intensive if the neighbor cache size is huge.
>
> This could be. But the WG did not report such specific details in
> terms of actual problems reported from deployments.
>
> Care to say more about what these "most implementations" are and how
> common they are? And are they the *only* way to implement this
> feature, or have other vendors chosen different implementations
> without this limitation?
>
> That said, I could add the following to the document:
>
> 	Routers implementing NUD (for neighboring destinations) will
> 	need to process neighbor cache state changes such as
> 	transitioning entries from REACHABLE to STALE. How this
> 	capability is implemented may impact the scalability of ND
> 	on a router. For example, one possible implementation is to
> 	have the forwarding operation detect when an ND entry is
> 	referenced that needs to transition from REACHABLE to STALE,
> 	by signalling an event that would need to be processed by the
> 	software processor. Such an implementation could increase the
> 	load on the service processor much in the same way that a high
> 	rate of ARP requests has led to problems on some routers.

Looks good.
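
For reference, the NUD timing described in comment 4 above can be
sketched roughly as follows. This is only an illustration using the
RFC 4861 state names and default constants (REACHABLE_TIME of 30
seconds, which is randomized in practice, and DELAY_FIRST_PROBE_TIME of
5 seconds). The NeighborEntry class and its method names are
hypothetical, and how real routers split this work between the
forwarding path and the service processor is implementation-specific.

```python
from enum import Enum

# RFC 4861 defaults, in seconds (ReachableTime is randomized in practice).
REACHABLE_TIME = 30.0
DELAY_FIRST_PROBE_TIME = 5.0


class State(Enum):
    REACHABLE = "REACHABLE"
    STALE = "STALE"
    DELAY = "DELAY"
    PROBE = "PROBE"


class NeighborEntry:
    """Hypothetical neighbor cache entry; starts freshly resolved."""

    def __init__(self, now: float) -> None:
        self.state = State.REACHABLE
        self.since = now

    def _set(self, state: State, now: float) -> None:
        self.state = state
        self.since = now

    def tick(self, now: float) -> None:
        """Apply timer-driven transitions."""
        if self.state is State.REACHABLE and now - self.since >= REACHABLE_TIME:
            self._set(State.STALE, now)
        elif self.state is State.DELAY and now - self.since >= DELAY_FIRST_PROBE_TIME:
            self._set(State.PROBE, now)

    def on_send(self, now: float) -> bool:
        """Called when a packet is forwarded to this neighbor.

        Returns True when the send moves the entry from STALE to DELAY,
        i.e. the case that may need slow-path (CPU) attention.
        """
        self.tick(now)
        if self.state is State.STALE:
            self._set(State.DELAY, now)
            return True
        return False
```

With a large neighbor cache and traffic to many neighbors, the number of
on_send() calls that return True per second is the quantity that would
load the service processor in the implementation style described above.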

[clipped]

>
>
> > 2. In Sec 7.1 you mention that routers need to drop all transit
> >  traffic when there is no response received for an ARP/ND
> >  request. You should mention that in addition to this, routers also
> >  need to send an ICMP host unreachable error packet back to the
> >  sender. ICMP error packets are generated in the control card
> >  CPU. So, if the CPU has to generate a high number of such ICMP
> >  errors then this can load the CPU. The whole process can be quite
> >  CPU as well as buffer intensive. The CPU/buffer overload is usually
> >  mitigated by rate limiting the number of ICMP errors generated.
>
> Added:
>
>    "and may send an ICMP destination unreachable message as well."

Why a "may"? An implementation is violating a standard if it isn't.

>
> > 3. In Sec 7.1 you mention that the entire ARP/ND process can be
> >  quite CPU intensive since transit data traffic needs to be queued
> >  while the address resolution is underway. You could mention that
> >  this is mitigated by offloading the queuing part to the line card
> >  CPUs so that the CPU on the control card is not inundated with such
> >  packets. This obviously would only work on distributed systems that
> >  have separate CPUs on the line cards and the main card.
>
> There are many things one could say about ARP implementations. But
> that is not the purpose of this document. It is really about outlining
> the problems... So I think the above is getting too detailed.
>
> > 4. Sec 7.1 should mention that this could be used as a DoS attack
> >  wherein the attacker sends a high volume of packets for which ARPs
> >  need to be resolved. This could result in genuine packets that need
> >  to resolve ARPs getting dropped as there is only a finite rate at
> >  which packets are sent to CPU for ARP resolution. Again this is
> >  both CPU and buffer intensive.
>
> Again, I don't think this document needs to cover all aspects of ND.
>
> > 5. Sec 7.2 discusses issues with address resolution mechanism in
> >  IPv6. I think its useful for this draft to discuss the fact that
> >  unlike IPv4, IPv6 has subnets that are /64. This number is quite
> >  large and will perhaps cover trillions of IP addresses, most of
> >  which would be unassigned. Thus simplistic IPv6 ND implementations
> >  can be vulnerable to attacks which inundates the CPU with huge
> >  requests to perform address resolution for a large number of IPv6
> >  addresses, most of which are unassigned. As a result of this
> >  genuine IPv6 devices will not be able to join the network. You
> >  might want to refer to RFC 6583 for more details.
>
> Ditto.

I am fine with your resolution to comments 3 and 4. However, I believe
that 5 ought to be discussed. This document is about ARP/ND issues that
folks are either seeing or will see in large data centers. Given this, I
don't see why this should not even be discussed in this draft. I think
it's quite reasonable to address the above-mentioned aspect of IPv6 ND,
and one way of getting attention to the issue is by discussing it here
in this draft.

>
> > 7. Sec 11 - Security Considerations should at the very least give
> >  pointers to references on issues related to ARP security
> >  vulnerabilities. I don't see IPv6 ND mentioned at all. Since ND
> >  relies on ICMPv6 and does not run directly over layer 2, there
> >  could possibly be security concerns specific to ND in the data
> >  center environments that don't apply to ARP. This document ought to
> >  discuss those so that ARMD (or some other WG) can look at solutions
> >  addressing those concerns.
>
> Actually, I disagree somewhat. This document doesn't need to get into
> all the security issues of ARP and/or ND. For one thing, they did not
> come up as "problems" in ARMD. :-) I will put in pointers to the ND
> security considerations section. How about I add the following
> sentence:
>
>     Security considerations for Neighbor Discovery are discussed in
>     <xref target="RFC4861"></xref> and <xref target="RFC6583"></xref>.

This should be good. I assume that this then means that there are no
additional security concerns with ARP/ND in data centers.

Can you also remove the first line from the Security Considerations?
It's redundant and has already been said earlier.

>
> > 8. Should it be mentioned in the document somewhere (sec 11?) that
> >  data center administrators can configure ACLs to filter packets
> >  addressed to unallocated IPv6 addresses? Folks can consider the
> >  valid IPv6 address ranges and filter out packets that use the
> >  unallocated addresses. Doing this will avoid unnecessary ARP
> >  resolution for invalid IPv6 addresses. The list of the IPv6
> >  addresses that are legitimate and should be permitted is small and
> >  maintainable because of IPv6's address hierarchy.
> >  http://www.iana.org/assignments/ipv6-unicast-address-assignments/ipv6-unicast-address-assignments.xml
> >  gives a list of large address blocks that have been allocated by
> >  IANA.
>
> IMO no. This goes beyond the scope of this document.

While I don't see any harm in mentioning this, I leave it to you/WG to
decide whether to include this or not.

I just noticed that Sec 8 - Summary, is redundant. Shouldn't that entire
text be moved to either the Abstract or the Introduction?

Cheers, Manav


From narten@us.ibm.com  Wed Aug 29 07:58:54 2012
Return-Path: <narten@us.ibm.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 4AD8D21F8624 for <armd@ietfa.amsl.com>; Wed, 29 Aug 2012 07:58:54 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -110.599
X-Spam-Level: 
X-Spam-Status: No, score=-110.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id triUcZAyP6ac for <armd@ietfa.amsl.com>; Wed, 29 Aug 2012 07:58:53 -0700 (PDT)
Received: from e9.ny.us.ibm.com (e9.ny.us.ibm.com [32.97.182.139]) by ietfa.amsl.com (Postfix) with ESMTP id B1A5121F8647 for <armd@ietf.org>; Wed, 29 Aug 2012 07:58:52 -0700 (PDT)
Received: from /spool/local by e9.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for <armd@ietf.org> from <narten@us.ibm.com>; Wed, 29 Aug 2012 10:58:51 -0400
Received: from d01dlp02.pok.ibm.com (9.56.250.167) by e9.ny.us.ibm.com (192.168.1.109) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted;  Wed, 29 Aug 2012 10:58:49 -0400
Received: from d01relay03.pok.ibm.com (d01relay03.pok.ibm.com [9.56.227.235]) by d01dlp02.pok.ibm.com (Postfix) with ESMTP id 3BACE6E8041; Wed, 29 Aug 2012 10:58:48 -0400 (EDT)
Received: from d01av01.pok.ibm.com (d01av01.pok.ibm.com [9.56.224.215]) by d01relay03.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q7TEwlCk118912; Wed, 29 Aug 2012 10:58:47 -0400
Received: from d01av01.pok.ibm.com (loopback [127.0.0.1]) by d01av01.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q7TEwlZN010479; Wed, 29 Aug 2012 10:58:47 -0400
Received: from cichlid.raleigh.ibm.com ([9.80.31.201]) by d01av01.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id q7TEwk25010315 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 29 Aug 2012 10:58:47 -0400
Received: from cichlid.raleigh.ibm.com (localhost.localdomain [127.0.0.1]) by cichlid.raleigh.ibm.com (8.14.5/8.12.5) with ESMTP id q7TEwgxI011886; Wed, 29 Aug 2012 10:58:42 -0400
Message-Id: <201208291458.q7TEwgxI011886@cichlid.raleigh.ibm.com>
To: "Bhatia, Manav (Manav)" <manav.bhatia@alcatel-lucent.com>
In-reply-to: <7C362EEF9C7896468B36C9B79200D8350D06450BB6@INBANSXCHMBSA1.in.alcatel-lucent.com>
References: <7C362EEF9C7896468B36C9B79200D8350D063A0AF5@INBANSXCHMBSA1.in.alcatel-lucent.com> <201208272124.q7RLOnx7015943@cichlid.raleigh.ibm.com> <7C362EEF9C7896468B36C9B79200D8350D06450BB6@INBANSXCHMBSA1.in.alcatel-lucent.com>
Comments: In-reply-to "Bhatia, Manav (Manav)" <manav.bhatia@alcatel-lucent.com> message dated "Tue, 28 Aug 2012 13:50:59 +0530."
Date: Wed, 29 Aug 2012 10:58:42 -0400
From: Thomas Narten <narten@us.ibm.com>
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 12082914-7182-0000-0000-0000026F8A71
Cc: "rtg-dir@ietf.org" <rtg-dir@ietf.org>, "armd@ietf.org" <armd@ietf.org>, "draft-ietf-armd-problem-statement.all@tools.ietf.org" <draft-ietf-armd-problem-statement.all@tools.ietf.org>, "rtg-ads@tools.ietf.org" <rtg-ads@tools.ietf.org>
Subject: Re: [armd] RtgDir review: draft-ietf-armd-problem-statement-03
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 29 Aug 2012 14:58:54 -0000

"Bhatia, Manav (Manav)" <manav.bhatia@alcatel-lucent.com> writes:
> Thanks, the proposed text is much better.

> However, the draft still says "multicast frames do not necessarily
>  need to be sent to all parts of the network". I could be missing
>  something but there still seems to be some disconnect because in
>  the context of L2, multicast frames will be sent to all parts of
>  the network.

L2 IGMP snooping may be taking place, which can then result in multicast
traffic not being forwarded everywhere in the L2 broadcast domain...
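
For reference, the Solicited-Node mapping mentioned in the proposed text
can be sketched as below: the low 24 bits of the target address are
appended to the ff02::1:ff00:0/104 prefix, and the resulting group maps
to an Ethernet address of 33:33 followed by the low 32 bits of the
group. The function names are made up for illustration, and the example
address uses the 2001:db8::/32 documentation prefix. Whether a given
switch actually restricts delivery of such frames (via snooping) is
deployment-specific.

```python
import ipaddress


def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """Map a unicast IPv6 address to its Solicited-Node multicast address."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    prefix = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(prefix | low24)


def multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Ethernet destination for an IPv6 multicast group (33:33 + low 32 bits)."""
    low32 = int(group) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)


group = solicited_node_multicast("2001:db8::aabb:ccdd:eeff")
print(group, multicast_mac(group))
```

Only nodes whose own addresses end in the same 24 bits listen on that
group, which is why a NIC-level multicast filter can discard most ND
queries without involving host software.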

> > 
> > > 2. Sec 7.1 begins with the following text:
> > 
> > > "One pain point with large L2 broadcast domains is that the routers
> > >  connected to the L2 domain need to process "a lot of" ARP traffic."
> > 
> > > I am not sure if this is correct with how an L2 broadcast domain has
> > >  been defined in Sec 2. I would wager that a bigger pain point for a
> > >  large L2 broadcast domain would be handling unknown unicast traffic
> > >  that needs to get flooded, instead of dealing with the "ARP"
> > >  traffic. I am aware of very very large L2 broadcast domains that
> > >  have no ARP/ND scaling problems. Would it then make more sense to
> > >  replace the L2 broadcast domain with an ARP/ND domain? If Yes, then
> > >  ARP/ND domain too needs to be defined in Sec 2.
> > 
> > The issue (as has been discussed in ARMD) is specifically the ARP
> > processing load (and not unknown unicast traffic). In typical
> > implementations, ARP processing is done by a service processor with
> > limited capacity. The cited problem is that the amount of ARP traffic
> > places a significant load on that processor.
> > 
> > This is explained in the next paragraph. How about I add the following
> > sentence to the 2nd paragraph:
> > 
> >      In some deployments, limitations on the rate of ARP processing
> >      have been cited as being a problem.
> > 
> > Does that work?

> Yes it does as long as you remove the original line that I had
>  quoted.

Removing that line IMO removes something essential. It is the case
that some routers (i.e., devices at the edge of an L2 boundary) do
not have sufficient resources to process "a lot of ARP traffic". "A
lot" is in quotes because we don't have an exact figure for what that
is. This is one of the key points to come out of the ARMD effort.

What exactly do you object to in that sentence?

> > 
> > > 3. Sec 7.1 seems to suggest that Gratuitous ARPs pre-populate ARP
> > >  caches on the neighboring devices. Without an explicit description
> > >  of what a neighboring device is, I would presume that this also
> > >  includes edge/core routers. In that case this statement is not
> > >  entirely correct as I am aware of routers that will by default not
> > >  pre-populate their ARP caches on receiving Gratuitous ARPs.
> > 
> > Right. The spec says "don't do this". But I believe it was asserted
> > that some implementations do this. That said, I'm not aware of any
> > such implementations. I would be willing to remove this sentence in
> > the absence of known implementations of this.

To clarify, the current text says "Some routers can be configured to
broadcast periodic gratuitous ARPs."

This statement is true, and presumably you are not objecting to
that, right?

Note also that Warren Kumari
(http://www.ietf.org/mail-archive/web/armd/current/msg00489.html)
reports that Cisco IOS at one point could be configured to pre-populate
ARP caches via received gratuitous ARPs.

> This clearly is not the default behavior for several core/edge
>  router implementations that I am aware of. So at best there could
>  be a subset of routers that do this.

Which I believe is consistent with the current text saying "some
routers".

> In which case you need to fix
>  the text that claims that *all* routers pre-populate ARP caches
>  upon receiving Gratuitous ARPs.

How about I change the sentence:

    Gratuitous ARPs, broadcast to all nodes in the L2 broadcast
    domain, can also pre-populate ARP caches on neighboring devices,
    further reducing ARP traffic.

to:

    Gratuitous ARPs, broadcast to all nodes in the L2 broadcast
    domain, may in some cases also pre-populate ARP caches on
    neighboring devices, further reducing ARP traffic. But it is not
    believed that pre-population of ARP entries is supported by most
    implementations, as the ARP specification <xref
    target="RFC0826"></xref> recommends only that pre-existing ARP
    entries be updated upon receipt of ARP messages; it does not call
    for the creation of new entries when none already exist.
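
The RFC 826 receive logic that the proposed text refers to can be
sketched as follows. This is a simplified illustration only (a single
protocol, no hardware/protocol type checks, no reply generation);
ArpPacket and receive_arp are hypothetical names, and the addresses are
documentation values.

```python
from dataclasses import dataclass


@dataclass
class ArpPacket:
    sender_ip: str
    sender_mac: str
    target_ip: str


def receive_arp(cache: dict, my_ip: str, pkt: ArpPacket) -> None:
    """Simplified RFC 826 packet-reception merge step."""
    merged = False
    # An existing entry for the sender is always refreshed.
    if pkt.sender_ip in cache:
        cache[pkt.sender_ip] = pkt.sender_mac
        merged = True
    # A new entry is created only when this node is the target.
    if pkt.target_ip == my_ip and not merged:
        cache[pkt.sender_ip] = pkt.sender_mac


# A gratuitous ARP carries the sender's own address as the target.
garp = ArpPacket("192.0.2.10", "02:00:00:00:00:0a", "192.0.2.10")

router_cache = {}
receive_arp(router_cache, "192.0.2.1", garp)
print(router_cache)  # no entry created: the router is not the target

router_cache["192.0.2.10"] = "02:00:00:00:00:01"
receive_arp(router_cache, "192.0.2.1", garp)
print(router_cache)  # the pre-existing entry is updated
```

An implementation that pre-populates its cache from gratuitous ARPs
would, in effect, drop the "this node is the target" condition.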

> > > 2. In Sec 7.1 you mention that routers need to drop all transit
> > >  traffic when there is no response received for an ARP/ND
> > >  request. You should mention that in addition to this, routers also
> > >  need to send an ICMP host unreachable error packet back to the
> > >  sender. ICMP error packets are generated in the control card
> > >  CPU. So, if the CPU has to generate a high number of such ICMP
> > >  errors then this can load the CPU. The whole process can be quite
> > >  CPU as well as buffer intensive. The CPU/buffer overload is usually
> > >  mitigated by rate limiting the number of ICMP errors generated.
> > 
> > Added:
> > 
> >    "and may send an ICMP destination unreachable message as well."

> Why a "may"? An implementation is violating a standard if it isn't.

They might not if rate limiting says otherwise. I.e., there are times
when not sending an ICMP message is not a violation of the spec.
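
As an illustration of why "may" fits, here is a minimal sketch of
rate-limited ICMP error generation. The thread does not say how any
particular router implements its limiter; a token bucket is assumed
here purely as an example, and the class, function, and parameter names
are hypothetical.

```python
class TokenBucket:
    """Assumed limiter: 'rate' tokens per second, at most 'burst' stored."""

    def __init__(self, rate: float, burst: int) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def on_resolution_failure(limiter: TokenBucket, now: float) -> str:
    """The queued packet is always dropped; the ICMP error is conditional."""
    return "drop+icmp" if limiter.allow(now) else "drop"


limiter = TokenBucket(rate=1.0, burst=2)
for t in (0.0, 0.0, 0.0, 1.0):
    print(t, on_resolution_failure(limiter, t))
```

In this sketch the third failure at t=0 is dropped silently because the
bucket is empty, which is the case the "may" wording allows for.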

> > > 3. In Sec 7.1 you mention that the entire ARP/ND process can be
> > >  quite CPU intensive since transit data traffic needs to be queued
> > >  while the address resolution is underway. You could mention that
> > >  this is mitigated by offloading the queuing part to the line card
> > >  CPUs so that the CPU on the control card is not inundated with such
> > >  packets. This obviously would only work on distributed systems that
> > >  have separate CPUs on the line cards and the main card.
> > 
> > There are many things one could say about ARP implementations. But
> > that is not the purpose of this document. It is really about outlining
> > the problems... So I think the above is getting too detailed.
> > 
> > > 4. Sec 7.1 should mention that this could be used as a DoS attack
> > >  wherein the attacker sends a high volume of packets for which ARPs
> > >  need to be resolved. This could result in genuine packets that need
> > >  to resolve ARPs getting dropped as there is only a finite rate at
> > >  which packets are sent to CPU for ARP resolution. Again this is
> > >  both CPU and buffer intensive.
> > 
> > Again, I don't think this document needs to cover all aspects of ND.
> > 
> > > 5. Sec 7.2 discusses issues with address resolution mechanism in
> > >  IPv6. I think its useful for this draft to discuss the fact that
> > >  unlike IPv4, IPv6 has subnets that are /64. This number is quite
> > >  large and will perhaps cover trillions of IP addresses, most of
> > >  which would be unassigned. Thus simplistic IPv6 ND implementations
> > >  can be vulnerable to attacks which inundates the CPU with huge
> > >  requests to perform address resolution for a large number of IPv6
> > >  addresses, most of which are unassigned. As a result of this
> > >  genuine IPv6 devices will not be able to join the network. You
> > >  might want to refer to RFC 6583 for more details.
> > 
> > Ditto.

> I am fine with your resolution to the comments 3 and 4. However, I
>  believe that 5 ought to be discussed. This document is about ARP/ND
>  issues that folks are either seeing or will see in large data
>  centers.

To clarify: "are seeing". We can speculate at length about what problems
will be seen in the future. :-)

> Given this, I don't see why this should not even be discussed in
>  this draft. I think its quite reasonable to address the above
>  mentioned aspect of IPv6 ND and one of way getting attention to
>  issue is by discussing this here in this draft.

The issue you raise above is fully documented in RFC 6583, which I
have added to the references (per my previous note).
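
For scale, the size of a /64 can be worked out directly; "trillions"
in comment 5 understates it considerably. The neighbor cache capacity
below is a made-up figure used only to show the ratio, not a number
from the thread or from any particular router.

```python
SUBNET_PREFIX_LEN = 64
addresses_in_subnet = 2 ** (128 - SUBNET_PREFIX_LEN)

# Hypothetical neighbor cache size, for comparison only.
CACHE_CAPACITY = 100_000

print(f"{addresses_in_subnet:,} addresses in a /64")
print(f"{addresses_in_subnet // CACHE_CAPACITY:,} addresses per cache slot")
```

Since nearly every address in the subnet is unassigned, almost any
destination an attacker picks will trigger a resolution attempt that
cannot succeed.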

> > > 7. Sec 11 - Security Considerations should at the very least give
> > >  pointers to references on issues related to ARP security
> > >  vulnerabilities. I don't see IPv6 ND mentioned at all. Since ND
> > >  relies on ICMPv6 and does not run directly over layer 2, there
> > >  could possibly be security concerns specific to ND in the data
> > >  center environments that don't apply to ARP. This document ought to
> > >  discuss those so that ARMD (or some other WG) can look at solutions
> > >  addressing those concerns.
> > 
> > Actually, I disagree somewhat. This document doesn't need to get into
> > all the security issues of ARP and/or ND. For one thing, they did not
> > come up as "problems" in ARMD. :-) I will put in pointers to the ND
> > security considerations section. How about I add the following
> > sentence:
> > 
> >     Security considerations for Neighbor Discovery are discussed in
> >     <xref target="RFC4861"></xref> and <xref target="RFC6583"></xref>.

> This should be good. I assume that this then means that there are no
>  additional security concerns with ARPs/ND in data centers.

I don't recall any coming up in the WG.

> Can you also remove the first line from the Security Consideration?
>  Its redundant and has already been said earlier.

OK.

> > > 8. Should it be mentioned in the document somewhere (sec 11?) that
> > >  data center administrators can configure ACLs to filter packets
> > >  addressed to unallocated IPv6 addresses? Folks can consider the
> > >  valid IPv6 address ranges and filter out packets that use the
> > >  unallocated addresses. Doing this will avoid unnecessary ARP
> > >  resolution for invalid IPv6 addresses. The list of the IPv6
> > >  addresses that are legitimate and should be permitted is small and
> > >  maintainable because of IPv6's address hierarchy.
> > >  http://www.iana.org/assignments/ipv6-unicast-address-assignments/ipv6-unicast-address-assignments.xml
> > >  gives a list of large address blocks that have been allocated by
> > >  IANA.
> > 
> > IMO no. This goes beyond the scope of this document.

> While I don't see any harm in mentioning this, I leave it on you/WG
>  to decide if you want to include this or not.

> I just noticed that Sec 8 - Summary, is redundant. Shouldn't that
>  entire text be moved to either the Abstract or the Introduction?

It's the last section of the document. The document needs a summary or
something (summary seems more accurate than conclusions, IMO).

Thomas


From narten@us.ibm.com  Wed Aug 29 09:01:57 2012
Return-Path: <narten@us.ibm.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C10C321F8668 for <armd@ietfa.amsl.com>; Wed, 29 Aug 2012 09:01:57 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -110.599
X-Spam-Level: 
X-Spam-Status: No, score=-110.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id gTwYqqKi0lkV for <armd@ietfa.amsl.com>; Wed, 29 Aug 2012 09:01:56 -0700 (PDT)
Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.149]) by ietfa.amsl.com (Postfix) with ESMTP id 8FD8C21F868A for <armd@ietf.org>; Wed, 29 Aug 2012 09:01:56 -0700 (PDT)
Received: from /spool/local by e31.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for <armd@ietf.org> from <narten@us.ibm.com>; Wed, 29 Aug 2012 10:01:44 -0600
Received: from d03dlp01.boulder.ibm.com (9.17.202.177) by e31.co.us.ibm.com (192.168.1.131) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted;  Wed, 29 Aug 2012 10:01:40 -0600
Received: from d03relay05.boulder.ibm.com (d03relay05.boulder.ibm.com [9.17.195.107]) by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id ADCDA1FF0055; Wed, 29 Aug 2012 10:01:33 -0600 (MDT)
Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by d03relay05.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q7TG0v1d078682; Wed, 29 Aug 2012 10:01:01 -0600
Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q7TG0rhA032277; Wed, 29 Aug 2012 10:00:53 -0600
Received: from cichlid.raleigh.ibm.com ([9.80.31.201]) by d03av04.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id q7TG0nO6031741 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 29 Aug 2012 10:00:51 -0600
Received: from cichlid.raleigh.ibm.com (localhost.localdomain [127.0.0.1]) by cichlid.raleigh.ibm.com (8.14.5/8.12.5) with ESMTP id q7TFxgmJ012331; Wed, 29 Aug 2012 11:59:43 -0400
Message-Id: <201208291559.q7TFxgmJ012331@cichlid.raleigh.ibm.com>
To: "Joel M. Halpern" <jmh@joelhalpern.com>
In-reply-to: <502471DB.80303@joelhalpern.com>
References: <50243C05.3080006@nostrum.com> <502471DB.80303@joelhalpern.com>
Comments: In-reply-to "Joel M. Halpern" <jmh@joelhalpern.com> message dated "Thu, 09 Aug 2012 22:28:43 -0400."
Date: Wed, 29 Aug 2012 11:59:41 -0400
From: Thomas Narten <narten@us.ibm.com>
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 12082916-7282-0000-0000-00000C6AFA4E
Cc: gen-art@ietf.org, "A. Jean Mahoney" <mahoney@nostrum.com>, "armd@ietf.org" <armd@ietf.org>
Subject: Re: [armd] Gen-art] review: draft-ietf-armd-problem-statement-03
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 29 Aug 2012 16:01:58 -0000

Hi Joel.

Thanks for the review comments. (And sorry for taking so long to respond!)

"Joel M. Halpern" <jmh@joelhalpern.com> writes:

> Major issues:
>      The use of the term "switch" seems confusing.  I had first assumed 
> that it meant an Ethernet switch (which might have a bit of L3 smarts,
> or might not.  I was trying not to be picky.)  But then, in section 6.3 
> it refers to "core switches ... are the data center gateways to external 
> networks" which means that those are routers.

The switch vs. router terminology is tricky.

6.3 says:

   Core switches connect multiple aggregation switches and are the data
   center gateway(s) to external networks or interconnect to different
   sets of racks within one data center.

How about I change that to:

   Core switches connect multiple aggregation switches and interface
   with data center gateway(s) to external networks or interconnect to
   different sets of racks within one data center.

I know that is just side stepping this a bit, but Section 6.4 has more
text about the L2/L3 boundaries in various deployments. This document
is walking a bit of tightrope by trying to be general and not too
specific. If we get too specific, folk start screaming "that's not the
way my data center looks".

> Moderate Issue:
>     The document seems to be interestingly selective in what modern 
> technologies it chooses to mention.  Mostly it seems to be describing 
> problems with data center networks using technology more than 5 years 
> old.  Since that is the widely deployed practice, that is
>     defensible.

I think this has to do with how the WG was chartered.

> But then the document chooses to mention new work such as OpenFlow, 
> without mentioning the work IEEE has done on broadcast and multicast
> containment for data centers.  It seems to me that we need to be 
> consistent, either describing only the widely deployed technology, or 
> including a fair mention of already defined and productized solutions 
> that are not yet widely deployed.

I'd be fine with taking out the references to OpenFlow. I don't think
it adds much to the document.

>      On a related note, the document assumes that multicast NDs are 
> delivered to all nodes, while in practice I believe existing techniques 
> to filter such multicast messages closer to the source are widely 
> deployed.  (Section 5.)

This paragraph has been significantly revised. The current proposed
text is:

	Broadly speaking, from the perspective of address resolution,
        IPv6's Neighbor Discovery (ND) behaves much like ARP, with a
        few notable differences. First, ARP uses broadcast, whereas ND
        uses multicast. Specifically, when querying for a target IP
        address, ND maps the target address into an IPv6 Solicited
        Node multicast address. Using multicast rather than broadcast
        has the benefit that the multicast frames do not necessarily
        need to be sent to all parts of the network, i.e., only to
        segments where listeners for the Solicited Node multicast
        address reside. In the case where multicast frames are
        delivered to all parts of the network, sending to a multicast
        address still has the advantage that most (if not all) nodes
        will filter out the (unwanted) multicast query via filters
        installed in the NIC rather than burdening host software with
        the need to process such packets. Thus, whereas all nodes must
        process every ARP query, ND queries are processed only by the
        nodes for which they are intended. In cases where multicast
        filtering can't effectively be implemented in the NIC (e.g.,
        as on hypervisors supporting virtualization), filtering would
        need to be done in software (e.g., in the hypervisor's
        vSwitch).

Is that better?	
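
To make the mapping concrete, here is a minimal sketch of how a target
address turns into a Solicited-Node multicast address and the Ethernet
multicast MAC that a NIC filter would match on. The function names are
mine, chosen for illustration; the prefix ff02::1:ff00:0/104 and the
33:33 MAC prefix are the standard ones, and the sample address is from
the documentation range.

```python
import ipaddress

def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """Return the Solicited-Node multicast address for an IPv6 target."""
    addr = ipaddress.IPv6Address(target)
    low24 = int(addr) & 0xFFFFFF                      # low-order 24 bits of the target
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

def ethernet_multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Map an IPv6 multicast address to its Ethernet MAC (33:33 + low 32 bits)."""
    low32 = int(group) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)

group = solicited_node_multicast("2001:db8::1:800:200e:8c6c")
print(group)                          # ff02::1:ff0e:8c6c
print(ethernet_multicast_mac(group))  # 33:33:ff:0e:8c:6c
```

Only nodes whose own addresses share those low 24 bits listen on that
group, which is why a NIC filter can discard most ND queries without
involving host software.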

> Minor issues:
>      I presume that section 6.4.2 which describes needing to enable all 
> VLANs on all aggregation ports is a description of current practice, 
> since it is not a requirement of current technologies, either via VLAN 
> management or orchestration?

Yes.

>      Section 6.4.4 seems very odd.  The title is "overlays".  Are there 
> widely deployed overlays?

I keep hearing yes, but proprietary, so little can be said about them.

> If so, it would be good to name the 
> technologies being referred to here.  If this is intended to refer to 
> the overlay proposal in IETF and IEEE, I think that the characterization 
> is somewhat misleading, and probably is best simply removed.

Hmm, I didn't actually write this text. It originally came from
draft-karir-armd-datacenter-reference-arch, which was merged into the
problem statement document by the WG.

I agree this section is kind of fuzzy.

I'm on the fence about what to do. Are there other opinions?

>      Is the fifth paragraph of section 7.1 on ARP processing and
> buffering in the absence of ARP cache entries accurate?  I may well be 
> out of date, but it used to be the case that most routers dropped the 
> packets, and some would buffer 1 packet deep at most.  This description 
> indicates a rather more elaborate behavior.

RFC 1122 says:

         2.3.2.2  ARP Packet Queue

            The link layer SHOULD save (rather than discard) at least
            one (the latest) packet of each set of packets destined to
            the same unresolved IP address, and transmit the saved
            packet when the address has been resolved.

RFC 1812 says:

3.3.2 Address Resolution Protocol - ARP

   Routers that implement ARP MUST be compliant and SHOULD be
   unconditionally compliant with the requirements in [INTRO:2].

   The link layer MUST NOT report a Destination Unreachable error to IP
   solely because there is no ARP cache entry for a destination; it
   SHOULD queue up to a small number of datagrams breifly while
   performing the ARP request/reply sequence, and reply that the
   destination is unreachable to one of the queued datagrams only when
   this proves fruitless.
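
As a rough illustration of what those two requirements amount to, here
is a minimal sketch of a per-destination holding queue that keeps only
the newest packet(s) while resolution is pending. The class name and
the depth parameter are hypothetical, not taken from either RFC or
from any router implementation.

```python
from collections import deque

class PendingQueue:
    """Hold packets awaiting address resolution, keeping only the newest ones."""

    def __init__(self, depth: int = 1):
        self.depth = depth      # hypothetical knob; RFC 1122 asks for at least one
        self.pending = {}       # unresolved next-hop IP -> deque of saved packets

    def enqueue(self, next_hop: str, packet: bytes) -> None:
        q = self.pending.setdefault(next_hop, deque(maxlen=self.depth))
        q.append(packet)        # a full deque(maxlen=...) discards its oldest entry

    def resolved(self, next_hop: str) -> list:
        """Return (and forget) the packets saved for a now-resolved address."""
        return list(self.pending.pop(next_hop, ()))

q = PendingQueue(depth=1)
q.enqueue("192.0.2.1", b"first")
q.enqueue("192.0.2.1", b"latest")
print(q.resolved("192.0.2.1"))  # [b'latest']
```

With depth=1 this is the "one (the latest) packet" behavior; an
implementation that queues nothing at all corresponds to skipping
enqueue() entirely.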


>      Given that this document says it is a general document about 
> scaling issues for data centers, I am surprised that the security 
> considerations section does not touch on the increased complexity of 
> segregating subscriber traffic (customer A can not talk to customer B) 
> when there are very large numbers of customers, and the interaction of
> this with L2 scope.

The ARMD WG struggled a bit about scope, and all it was chartered to
do was a problem statement related to address resolution.

Looking at the title of the document "Problem Statement for ARMD", I'd
argue that's not helpful for an RFC given that ARMD will close and
there is no followup WG planned. How about I change the title to
something like:

    Address Resolution Problems in Large Data Center Networks

I don't want to add other issues like traffic segregation to the
document at this point. Among other things, the WG really doesn't
have the energy for this... The intro is pretty clear (IMO) about the
limited scope of the document.

Thomas


From jmh@joelhalpern.com  Wed Aug 29 09:07:56 2012
Return-Path: <jmh@joelhalpern.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 432AE21F852E; Wed, 29 Aug 2012 09:07:56 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.349
X-Spam-Level: 
X-Spam-Status: No, score=-102.349 tagged_above=-999 required=5 tests=[AWL=-0.084, BAYES_00=-2.599, IP_NOT_FRIENDLY=0.334, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id xTV7S2f8AiBc; Wed, 29 Aug 2012 09:07:55 -0700 (PDT)
Received: from morbo.mail.tigertech.net (morbo.mail.tigertech.net [67.131.251.54]) by ietfa.amsl.com (Postfix) with ESMTP id 7427021F853A; Wed, 29 Aug 2012 09:07:55 -0700 (PDT)
Received: from mailc2.tigertech.net (mailc2.tigertech.net [208.80.4.156]) by morbo.tigertech.net (Postfix) with ESMTP id 1B5C8A6E0A; Wed, 29 Aug 2012 09:07:54 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1]) by mailc2.tigertech.net (Postfix) with ESMTP id 71D261BD3F68; Wed, 29 Aug 2012 09:07:53 -0700 (PDT)
X-Virus-Scanned: Debian amavisd-new at c2.tigertech.net
Received: from [10.10.10.104] (pool-71-161-50-84.clppva.btas.verizon.net [71.161.50.84]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mailc2.tigertech.net (Postfix) with ESMTPSA id 253D71BD4505; Wed, 29 Aug 2012 09:07:52 -0700 (PDT)
Message-ID: <503E3E54.70506@joelhalpern.com>
Date: Wed, 29 Aug 2012 12:07:48 -0400
From: "Joel M. Halpern" <jmh@joelhalpern.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20120824 Thunderbird/15.0
MIME-Version: 1.0
To: Thomas Narten <narten@us.ibm.com>
References: <50243C05.3080006@nostrum.com> <502471DB.80303@joelhalpern.com> <201208291559.q7TFxgmJ012331@cichlid.raleigh.ibm.com>
In-Reply-To: <201208291559.q7TFxgmJ012331@cichlid.raleigh.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: gen-art@ietf.org, "A. Jean Mahoney" <mahoney@nostrum.com>, "armd@ietf.org" <armd@ietf.org>
Subject: Re: [armd] Gen-art] review: draft-ietf-armd-problem-statement-03
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 29 Aug 2012 16:07:56 -0000

All of the proposed resolutions look very good.  Thank you.

With regard to routers and ARP caches, my concern is that from what I 
saw over the years, common practice did not seem to match the SHOULD from
the RFCs.  I am a little remote from most implementations at the moment 
(the ones I can check easily are a tiny fraction of the market), so I 
was suggesting that be double-checked.

Yours,
Joel


From narten@us.ibm.com  Wed Aug 29 12:55:05 2012
Return-Path: <narten@us.ibm.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 3A64C11E80F1 for <armd@ietfa.amsl.com>; Wed, 29 Aug 2012 12:55:05 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -110.599
X-Spam-Level: 
X-Spam-Status: No, score=-110.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Zr8xYxQm6kKU for <armd@ietfa.amsl.com>; Wed, 29 Aug 2012 12:55:04 -0700 (PDT)
Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.153]) by ietfa.amsl.com (Postfix) with ESMTP id 9068E11E80EE for <armd@ietf.org>; Wed, 29 Aug 2012 12:55:04 -0700 (PDT)
Received: from /spool/local by e35.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for <armd@ietf.org> from <narten@us.ibm.com>; Wed, 29 Aug 2012 13:55:03 -0600
Received: from d03dlp03.boulder.ibm.com (9.17.202.179) by e35.co.us.ibm.com (192.168.1.135) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted;  Wed, 29 Aug 2012 13:54:22 -0600
Received: from d03relay03.boulder.ibm.com (d03relay03.boulder.ibm.com [9.17.195.228]) by d03dlp03.boulder.ibm.com (Postfix) with ESMTP id B43F319D8043; Wed, 29 Aug 2012 13:54:21 -0600 (MDT)
Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by d03relay03.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q7TJsKXr115518; Wed, 29 Aug 2012 13:54:20 -0600
Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q7TJsJmo007163; Wed, 29 Aug 2012 13:54:19 -0600
Received: from cichlid.raleigh.ibm.com ([9.80.31.201]) by d03av04.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id q7TJsI1e006958 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 29 Aug 2012 13:54:18 -0600
Received: from cichlid.raleigh.ibm.com (localhost.localdomain [127.0.0.1]) by cichlid.raleigh.ibm.com (8.14.5/8.12.5) with ESMTP id q7TJsGqx015175; Wed, 29 Aug 2012 15:54:17 -0400
Message-Id: <201208291954.q7TJsGqx015175@cichlid.raleigh.ibm.com>
To: "Joel M. Halpern" <jmh@joelhalpern.com>
In-reply-to: <503E3E54.70506@joelhalpern.com>
References: <50243C05.3080006@nostrum.com> <502471DB.80303@joelhalpern.com> <201208291559.q7TFxgmJ012331@cichlid.raleigh.ibm.com> <503E3E54.70506@joelhalpern.com>
Comments: In-reply-to "Joel M. Halpern" <jmh@joelhalpern.com> message dated "Wed, 29 Aug 2012 12:07:48 -0400."
Date: Wed, 29 Aug 2012 15:54:16 -0400
From: Thomas Narten <narten@us.ibm.com>
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 12082919-6148-0000-0000-0000090DBF5F
Cc: gen-art@ietf.org, "A. Jean Mahoney" <mahoney@nostrum.com>, "armd@ietf.org" <armd@ietf.org>
Subject: Re: [armd] Gen-art] review: draft-ietf-armd-problem-statement-03
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 29 Aug 2012 19:55:05 -0000

Hi Joel.

> All of the proposed resolutions look very good.  Thank you.

Great!

> With regard to routers and ARP caches, my concern is that from what I 
> saw over the years, common practice did not seem to match the SHOULD from
> the RFCs.  I am a little remote from most implementations at the moment 
> (the ones I can check easily are a tiny fraction of the market), so I 
> was suggesting that be double-checked.

I hear you, and I suspect that there is a wide variability in what
routers implement. And the easy implementation (especially for the
fast path) is not to queue anything at all, which would still be
compliant since queuing is only a SHOULD...

I've tweaked the text a bit more after looking at the actual
requirements (e.g., the spec doesn't say you have to send an ICMP
unreachable on an ARP miss, it only says that if you do, you shouldn't
do it just because there is no ARP entry).

In any case, the point of this paragraph is really just to explain the
steps, to show they can be "cpu intensive". I think the WG asserted
pretty strongly that for some implementations/deployments, the
implementation cost is a problem (i.e., CPU intensive to the point of
being problematical).

   Finally, another area concerns the overhead of processing IP packets
   for which no ARP entry exists.  Existing standards specify that one
   (or more) IP packets for which no ARP entry exists should be queued
   pending successful completion of the address resolution process
   [RFC1122] [RFC1812].  Once an ARP query has been resolved, any queued
   packets can be forwarded on.  Again, the processing of such packets
   is handled in the "slow path", effectively limiting the rate at which
   a router can process ARP "cache misses" and is viewed as a problem in
   some deployments today.  Additionally, if no response is received,
   the router may send the ARP/ND query multiple times.  If no response
   is received after a number of ARP/ND requests, the router needs to
   drop any queued data packets, and may send an ICMP destination
   unreachable message as well [RFC0792].  This entire process can be
   CPU intensive.

Is that any better?
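
For what it's worth, the steps in that paragraph can be sketched
roughly as below. This is only an illustration of the control flow,
not how any particular router does it: the class, the MAX_QUERIES
limit of 3, and the event log are all assumptions of mine, and the
queue is left unbounded for brevity.

```python
from dataclasses import dataclass, field

MAX_QUERIES = 3  # assumed retry limit; not taken from the RFC text quoted above

@dataclass
class Resolution:
    queries_sent: int = 0
    queued: list = field(default_factory=list)

class SlowPath:
    """Toy model of slow-path handling for packets with no ARP/ND entry."""

    def __init__(self):
        self.cache = {}      # ip -> link-layer address
        self.inflight = {}   # ip -> Resolution in progress
        self.log = []        # record of actions, standing in for real I/O

    def forward(self, ip, packet):
        if ip in self.cache:
            self.log.append(("forward", ip, packet))
            return
        res = self.inflight.get(ip)
        if res is None:                      # first miss: start resolution
            res = self.inflight[ip] = Resolution()
            self._query(ip, res)
        res.queued.append(packet)            # hold the packet pending resolution

    def _query(self, ip, res):
        res.queries_sent += 1
        self.log.append(("query", ip))

    def on_reply(self, ip, mac):
        self.cache[ip] = mac
        res = self.inflight.pop(ip, None)
        for packet in (res.queued if res else []):
            self.log.append(("forward", ip, packet))

    def on_timeout(self, ip):
        res = self.inflight.get(ip)
        if res is None:
            return
        if res.queries_sent < MAX_QUERIES:   # retry the ARP/ND query
            self._query(ip, res)
            return
        del self.inflight[ip]                # give up: drop queued packets
        self.log.append(("drop", ip, len(res.queued)))
        self.log.append(("icmp-unreachable", ip))  # optional ("may send")
```

Every branch here runs per miss on the control CPU, which is the
overhead the paragraph is trying to describe.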

Thomas


From jmh@joelhalpern.com  Wed Aug 29 13:59:15 2012
Return-Path: <jmh@joelhalpern.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 0720611E80D7; Wed, 29 Aug 2012 13:59:15 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.348
X-Spam-Level: 
X-Spam-Status: No, score=-102.348 tagged_above=-999 required=5 tests=[AWL=-0.083, BAYES_00=-2.599, IP_NOT_FRIENDLY=0.334, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id pq+ZQN77TwAS; Wed, 29 Aug 2012 13:59:14 -0700 (PDT)
Received: from morbo.mail.tigertech.net (morbo.mail.tigertech.net [67.131.251.54]) by ietfa.amsl.com (Postfix) with ESMTP id 7518011E80EA; Wed, 29 Aug 2012 13:59:14 -0700 (PDT)
Received: from mailb2.tigertech.net (mailb2.tigertech.net [208.80.4.154]) by morbo.tigertech.net (Postfix) with ESMTP id 2F74CA5FCD; Wed, 29 Aug 2012 13:59:14 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1]) by mailb2.tigertech.net (Postfix) with ESMTP id 9A6B41C9F23; Wed, 29 Aug 2012 13:59:12 -0700 (PDT)
X-Virus-Scanned: Debian amavisd-new at b2.tigertech.net
Received: from [10.10.10.104] (pool-71-161-50-84.clppva.btas.verizon.net [71.161.50.84]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mailb2.tigertech.net (Postfix) with ESMTPSA id 9F52E1C9E9D; Wed, 29 Aug 2012 13:59:11 -0700 (PDT)
Message-ID: <503E829B.1000404@joelhalpern.com>
Date: Wed, 29 Aug 2012 16:59:07 -0400
From: "Joel M. Halpern" <jmh@joelhalpern.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20120824 Thunderbird/15.0
MIME-Version: 1.0
To: Thomas Narten <narten@us.ibm.com>
References: <50243C05.3080006@nostrum.com> <502471DB.80303@joelhalpern.com> <201208291559.q7TFxgmJ012331@cichlid.raleigh.ibm.com> <503E3E54.70506@joelhalpern.com> <201208291954.q7TJsGqx015175@cichlid.raleigh.ibm.com>
In-Reply-To: <201208291954.q7TJsGqx015175@cichlid.raleigh.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: gen-art@ietf.org, "A. Jean Mahoney" <mahoney@nostrum.com>, "armd@ietf.org" <armd@ietf.org>
Subject: Re: [armd] Gen-art] review: draft-ietf-armd-problem-statement-03
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 29 Aug 2012 20:59:15 -0000

Yes, that is better.
(While I disagree with the WG on some of the drivers here, I think the 
text is sufficiently accurate and clear at this point that I should be 
considered to be in the rough.)

Thank you for the time and attention,
Joel

On 8/29/2012 3:54 PM, Thomas Narten wrote:
> Hi Joel.
>
>> All of the proposed resolutions look very good.  Thank you.
>
> Great!
>
>> With regard to routers and ARP caches, my concern is that from what I
>> saw of the years, common practice did not seem to match the SHOULD from
>> the RFCs.  I am a little remote from most implementations at the moment
>> (the ones I can check easily are a tiny fraction of the market), so I
>> was suggesting that be double-checked.
>
> I hear you, and I suspect that there is a wide variability in what
> routers implement. And the easy implementation (especially for the
> fast path) is not to queue anything at all, which would still be
> compliant since queuing is only a SHOULD...
>
> I've tweaked the text a bit more after looking at the actual
> requirements (e.g., the spec doesn't say you have to send an ICMP
> unreachable on an ARP miss, it only says that if you do, you shouldn't
> do it just because there is no ARP entry).
>
> In any case, the point of this paragraph is really just to explain the
> steps, to show they can be "cpu intensive". I think the WG asserted
> pretty strongly that for some implementations/deployments, the
> implementation cost is a problem (i.e., CPU intensive to the point of
> being problematical).
>
>     Finally, another area concerns the overhead of processing IP packets
>     for which no ARP entry exists.  Existing standards specify that one
>     (or more) IP packets for which no ARP entry exists should be queued
>     pending successful completion of the address resolution process
>     [RFC1122] [RFC1812].  Once an ARP query has been resolved, any queued
>     packets can be forwarded on.  Again, the processing of such packets
>     is handled in the "slow path", effectively limiting the rate at which
>     a router can process ARP "cache misses" and is viewed as a problem in
>     some deployments today.  Additionally, if no response is received,
>     the router may send the ARP/ND query multiple times.  If no response
>     is received after a number of ARP/ND requests, the router needs to
>     drop any queued data packets, and may send an ICMP destination
>     unreachable message as well [RFC0792].  This entire process can be
>     CPU intensive.
>
> Is that any better?
>
> Thomas
>
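
The steps in the quoted paragraph (queue on a miss, retry the query,
then drop and optionally signal) can be sketched as a small simulation.
This is only an illustration under stated assumptions, not any router's
actual slow-path code: the class names, the queue depth of one packet,
and the limit of three requests are all hypothetical choices.

```python
from collections import deque

MAX_QUEUED = 1    # assumption: keep at least one packet per unresolved address
MAX_REQUESTS = 3  # assumption: ARP requests sent before giving up


class PendingResolution:
    """Slow-path state for one next-hop address that has no ARP entry yet."""

    def __init__(self):
        self.queue = deque(maxlen=MAX_QUEUED)  # oldest packet is discarded when full
        self.requests_sent = 0


class ArpMissHandler:
    def __init__(self):
        self.cache = {}    # ip -> mac
        self.pending = {}  # ip -> PendingResolution
        self.log = []      # actions taken, recorded for inspection

    def forward(self, dst_ip, packet):
        if dst_ip in self.cache:
            self.log.append(("forward", dst_ip, packet))
            return
        # Cache miss: start resolution once, then queue the packet.
        state = self.pending.get(dst_ip)
        if state is None:
            state = self.pending[dst_ip] = PendingResolution()
            self._send_request(dst_ip, state)
        state.queue.append(packet)

    def _send_request(self, dst_ip, state):
        state.requests_sent += 1
        self.log.append(("arp-request", dst_ip, state.requests_sent))

    def on_reply(self, ip, mac):
        # Resolution succeeded: install the entry and release queued packets.
        self.cache[ip] = mac
        state = self.pending.pop(ip, None)
        if state is not None:
            for packet in state.queue:
                self.log.append(("forward", ip, packet))

    def on_timeout(self, ip):
        state = self.pending.get(ip)
        if state is None:
            return
        if state.requests_sent < MAX_REQUESTS:
            self._send_request(ip, state)
            return
        # Resolution failed: drop queued packets; an ICMP error MAY follow.
        del self.pending[ip]
        for packet in state.queue:
            self.log.append(("drop", ip, packet))
        self.log.append(("icmp-unreachable-may-be-sent", ip))
```

Every branch here runs per destination and per packet in software,
which is where the CPU cost discussed above comes from.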

From narten@us.ibm.com  Wed Aug 29 17:42:31 2012
Return-Path: <narten@us.ibm.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 75BD311E80FF for <armd@ietfa.amsl.com>; Wed, 29 Aug 2012 17:42:31 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -108.487
X-Spam-Level: 
X-Spam-Status: No, score=-108.487 tagged_above=-999 required=5 tests=[AWL=-1.787, BAYES_00=-2.599, FB_CIALIS_LEO3=3.899, RCVD_IN_DNSWL_HI=-8, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id tdjzTdvu4cvA for <armd@ietfa.amsl.com>; Wed, 29 Aug 2012 17:42:30 -0700 (PDT)
Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.144]) by ietfa.amsl.com (Postfix) with ESMTP id 497B411E8102 for <armd@ietf.org>; Wed, 29 Aug 2012 17:42:29 -0700 (PDT)
Received: from /spool/local by e4.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for <armd@ietf.org> from <narten@us.ibm.com>; Wed, 29 Aug 2012 20:42:29 -0400
Received: from d01dlp03.pok.ibm.com (9.56.250.168) by e4.ny.us.ibm.com (192.168.1.104) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted;  Wed, 29 Aug 2012 20:42:27 -0400
Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236]) by d01dlp03.pok.ibm.com (Postfix) with ESMTP id 75318C9003E; Wed, 29 Aug 2012 20:42:26 -0400 (EDT)
Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q7U0gPIv192410; Wed, 29 Aug 2012 20:42:26 -0400
Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1]) by d03av03.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q7U0gPlW008083; Wed, 29 Aug 2012 18:42:25 -0600
Received: from cichlid.raleigh.ibm.com ([9.80.31.201]) by d03av03.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id q7U0gN7I008052 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 29 Aug 2012 18:42:24 -0600
Received: from cichlid.raleigh.ibm.com (localhost.localdomain [127.0.0.1]) by cichlid.raleigh.ibm.com (8.14.5/8.12.5) with ESMTP id q7U0gNJJ018727; Wed, 29 Aug 2012 20:42:23 -0400
Message-Id: <201208300042.q7U0gNJJ018727@cichlid.raleigh.ibm.com>
To: "Ralph Droms" <rdroms.ietf@gmail.com>
In-reply-to: <20120829182602.22800.41833.idtracker@ietfa.amsl.com>
References: <20120829182602.22800.41833.idtracker@ietfa.amsl.com>
Comments: In-reply-to "Ralph Droms" <rdroms.ietf@gmail.com> message dated "Wed, 29 Aug 2012 11:26:02 -0700."
Date: Wed, 29 Aug 2012 20:42:23 -0400
From: Thomas Narten <narten@us.ibm.com>
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 12083000-3534-0000-0000-00000C04956D
Cc: The IESG <iesg@ietf.org>, armd@ietf.org
Subject: Re: [armd] Ralph Droms' No Objection on draft-ietf-armd-problem-statement-03: (with COMMENT)
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 30 Aug 2012 00:42:31 -0000

Hi Ralph.

"Ralph Droms" <rdroms.ietf@gmail.com> writes:

> Ralph Droms has entered the following ballot position for
> draft-ietf-armd-problem-statement-03: No Objection

> When responding, please keep the subject line intact and reply to all
> email addresses included in the To and CC lines. (Feel free to cut this
> introductory paragraph, however.)


> Please refer to http://www.ietf.org/iesg/statement/discuss-criteria.html
> for more information about IESG DISCUSS and COMMENT positions.


> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------

> 1. In section 7.1, does a high volume of ARP traffic have more impact
> on routers than on hosts or VMs?  If so, why?

I think the answer is in some cases yes.

At one level, the amount of ARP traffic a router receives is the same
as a host. But (to cut to the chase) there are a number of reasons why
the problem can be worse for routers:

1) router architectures in practice can result in hosts being able to
handle a higher rate of ARP requests. One can argue that routers
should just fix their implementations, but that doesn't change the
fact that in some deployments/implementations there are issues.

2) Routers sometimes have way more networks hanging off of them than
hosts do. E.g., a router might have 100 interfaces (to 100 different
networks - each generating ARP traffic the router would need to
process), whereas hosts would be on only one network and hence see a
lot less traffic. Hence, a router might see 100x more ARP traffic than
one host.

3) Routers are the targets of a lot of communication. So a lot of ARP
traffic is aimed at them. (Forwarding data traffic is fast/easy and
done by the ASIC, ARP processing is slow, done in the software
processor). I'm guessing a bit here, but I suspect that if you looked
at a typical network, the average rate of ARP queries directed at
nodes is likely higher for routers than hosts.

One other detail (that the document doesn't get into) is that more
recent implementations of Windows borrowed NUD from IPv6 and
retrofitted it into IPv4. Thus, they generate unicast ARP queries
frequently to revalidate entries associated with neighbors just like
IPv6 does.  This has noticeably increased the ARP traffic routers have
to process (on networks with more recent versions of Windows).

> 2. In section 7.1, does the total volume of ARP traffic ever become
> great enough to have a measurable impact on available traffic
> capacity?

What I'm told is that the CPU on routers can saturate or come close to
saturating, meaning that they become unable to process all the ARP
traffic and other essential routing functions as well. At this point,
you start having major problems (e.g., the router isn't responding to
other stuff it is supposed to in a timely manner).

> 3. Does this sentence from section 7.2 imply that IPv6 stacks that
> exhibit the described behavior are compliant with RFC 4861?

>    Consequently, some
>    implementations will send out "probe" ND queries to validate in-use
>    ND entries as frequently as every 35 seconds [RFC4861].

The above is the correct behavior as called for in RFC 4861. While the
time may seem short, it's intended to ensure that recovery takes place
(should the router you are using go down) before TCP connections time
out.
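
To show where the 35 second figure comes from, here is a rough sketch
using the protocol constants from RFC 4861 (REACHABLE_TIME of 30
seconds plus DELAY_FIRST_PROBE_TIME of 5 seconds). The function names
and the neighbor count are illustrative assumptions, and the sketch
assumes every entry stays in use and never gets an upper-layer
reachability confirmation.

```python
# Protocol constants as defined in RFC 4861, section 10.
REACHABLE_TIME_S = 30.0         # base time an entry stays REACHABLE
DELAY_FIRST_PROBE_TIME_S = 5.0  # time spent in DELAY before the first probe
MIN_RANDOM_FACTOR = 0.5
MAX_RANDOM_FACTOR = 1.5


def probe_interval_s(random_factor=1.0):
    """Seconds from a reachability confirmation to the next unicast probe."""
    if not MIN_RANDOM_FACTOR <= random_factor <= MAX_RANDOM_FACTOR:
        raise ValueError("random_factor outside the RFC 4861 range")
    return REACHABLE_TIME_S * random_factor + DELAY_FIRST_PROBE_TIME_S


def probes_per_second(active_neighbors, random_factor=1.0):
    """Aggregate probe rate seen across a number of in-use neighbor entries."""
    return active_neighbors / probe_interval_s(random_factor)
```

With the base values, probe_interval_s() gives 35.0, and a
(hypothetical) population of 3500 active neighbor entries would
account for about 100 probes per second in aggregate.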

> 4. I suggest dropping the sentence about the impact of VMs in section
> 7.3.  Any growth in the datacenter that increases the number of
> addresses used in an L2 domain, whether it be the physical span of the
> L2 domain or the use of VMs, will have the impact described in section
> 7.3.  The impact of growth will also have an impact on the scenarios
> in section 7.1 and 7.2.  The specific impact of VMs is also mentioned
> earlier in the document.

But it is well documented that virtualization (using VMs)
exacerbates the problem. So I think saying so here is useful
(even if redundant).

> 5. Are the three problems described in sections 7.1-3 really the only
> address resolution problems in large datacenters?

Well, they are the ones I know of and that the WG called out... Do you
think there are others?

> How do the three problems interact with each other (as mentioned at
> the end of section 7.3), when the ARP and ND problems seem to be
> related to CPU usage and the MAC table issue seems to be a memory
> problem.

The problem is just that the more processing that has to be done, the
fewer cycles there are to go around. And in some deployments there
aren't quite enough cycles, so anything that adds to the load is
potentially problematical...

> 6. It was a little surprising to me that section 5 describes multicast
> ND for address resolution, but section 7.2 only cites the unicast use
> of ND for NUD as a problem.

The problems with ND and ARP are not so much about the
bandwidth/network usage per se. It's really more about routers needing
to process such packets. That's where things start breaking down (in
some deployments). There aren't enough cycles in the router's service
processor to do the work... So whether the packets received are
multicast vs. unicast isn't the issue (for received packets).

That section wasn't really trying to focus on multicast
vs. unicast. Maybe that didn't come out as clearly as it could. I.e.,
the first paragraph really should say that in terms of processing of
ND traffic on a router, many of the same costs/issues are equivalent
to the case of handling an ARP packet.

How about I change the first paragraph as follows:

old:

   Though IPv6's Neighbor Discovery behaves much like ARP there are
   several notable differences which result in a different set of
   potential issues.  From an L2 perspective there is the simple
   difference between sending to a multicast versus broadcast address
   which results in ND queries only being processed by the nodes for
   which they are intended.

new:

   Though IPv6's Neighbor Discovery behaves much like ARP there are
   several notable differences which result in a different set of
   potential issues.  From an L2 perspective, an important difference
   is that ND address resolution requests are sent via multicast,
   which results in ND queries only being processed by the nodes for
   which they are intended. This reduces the total number of ND
   packets that an implementation will receive compared with
   broadcast ARPs.
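
As an illustration of why multicast ND queries reach only a small set
of nodes, the sketch below derives the solicited-node multicast group
for a target address (RFC 4291) and the Ethernet multicast MAC it maps
to (RFC 2464). The function names and the example address are
assumptions made for this sketch.

```python
import ipaddress


def solicited_node_multicast(target):
    """Return ff02::1:ffXX:XXXX built from the low 24 bits of the target."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def ethernet_multicast_mac(group):
    """Map an IPv6 multicast address to 33:33 plus its low 32 bits."""
    low32 = int(ipaddress.IPv6Address(group)) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)


group = solicited_node_multicast("2001:db8::1234:5678")
print(group)                               # ff02::1:ff34:5678
print(ethernet_multicast_mac(str(group)))  # 33:33:ff:34:56:78
```

Only nodes whose addresses share those low 24 bits listen on that
group, so a NIC or stack can discard other solicitations without
processing them, unlike a broadcast ARP request.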

Thomas


From manav.bhatia@alcatel-lucent.com  Thu Aug 30 08:36:30 2012
Return-Path: <manav.bhatia@alcatel-lucent.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id BA81321F858F; Thu, 30 Aug 2012 08:36:30 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -9.422
X-Spam-Level: 
X-Spam-Status: No, score=-9.422 tagged_above=-999 required=5 tests=[AWL=1.177,  BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Alx9RafWVpcP; Thu, 30 Aug 2012 08:36:29 -0700 (PDT)
Received: from ihemail1.lucent.com (ihemail1.lucent.com [135.245.0.33]) by ietfa.amsl.com (Postfix) with ESMTP id C972421F857E; Thu, 30 Aug 2012 08:36:29 -0700 (PDT)
Received: from inbansmailrelay1.in.alcatel-lucent.com (h135-250-11-31.lucent.com [135.250.11.31]) by ihemail1.lucent.com (8.13.8/IER-o) with ESMTP id q7UFaL78021970 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 30 Aug 2012 10:36:24 -0500 (CDT)
Received: from INBANSXCHHUB02.in.alcatel-lucent.com (inbansxchhub02.in.alcatel-lucent.com [135.250.12.35]) by inbansmailrelay1.in.alcatel-lucent.com (8.14.3/8.14.3/GMO) with ESMTP id q7UFaJux009818 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NOT); Thu, 30 Aug 2012 21:06:20 +0530
Received: from INBANSXCHMBSA1.in.alcatel-lucent.com ([135.250.12.38]) by INBANSXCHHUB02.in.alcatel-lucent.com ([135.250.12.35]) with mapi; Thu, 30 Aug 2012 21:06:19 +0530
From: "Bhatia, Manav (Manav)" <manav.bhatia@alcatel-lucent.com>
To: Thomas Narten <narten@us.ibm.com>
Date: Thu, 30 Aug 2012 21:06:25 +0530
Thread-Topic: RtgDir review: draft-ietf-armd-problem-statement-03
Thread-Index: Ac2F9tEdiNnE460+QqiYmVD1vas4CwAAbyhQ
Message-ID: <7C362EEF9C7896468B36C9B79200D8350D064CAF3A@INBANSXCHMBSA1.in.alcatel-lucent.com>
References: <7C362EEF9C7896468B36C9B79200D8350D063A0AF5@INBANSXCHMBSA1.in.alcatel-lucent.com> <201208272124.q7RLOnx7015943@cichlid.raleigh.ibm.com> <7C362EEF9C7896468B36C9B79200D8350D06450BB6@INBANSXCHMBSA1.in.alcatel-lucent.com> <201208291458.q7TEwgxI011886@cichlid.raleigh.ibm.com>
In-Reply-To: <201208291458.q7TEwgxI011886@cichlid.raleigh.ibm.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.57 on 135.245.2.33
Cc: "rtg-dir@ietf.org" <rtg-dir@ietf.org>, "armd@ietf.org" <armd@ietf.org>, "draft-ietf-armd-problem-statement.all@tools.ietf.org" <draft-ietf-armd-problem-statement.all@tools.ietf.org>, "rtg-ads@tools.ietf.org" <rtg-ads@tools.ietf.org>
Subject: Re: [armd] RtgDir review: draft-ietf-armd-problem-statement-03
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 30 Aug 2012 15:36:30 -0000

Hi Thomas,

>
> > However, the draft still says "multicast frames do not necessarily
> > need to be sent to all parts of the network". I could be missing
> > something but there still seems to be some disconnect because in the
> > context of L2, multicast frames will be sent to all parts of the
> > network.
>
> L2 IGMP snooping may be taking place, which can then result
> in multicast traffic not being forwarded everywhere in the L2
> broadcast domain...

Yes, that's correct. So you're suggesting that IGMP snooping will help
reduce "ARP" traffic in the case of IPv6. Do hosts send out MLD reports
for the link local addresses that they expect to receive the Neighbor
Discovery messages on? If they don't, then snooping will be of no help
in reducing the IPv6 ND traffic that the draft is discussing in that
section.

[clipped]

>
> > Yes it does as long as you remove the original line that I had
> > quoted.
>
> Removing that line IMO removes something essential. It is the
> case that some routers (i.e., devices at the edge of an L2
> boundary) do not have sufficient resources to process "a lot
> of ARP traffic". "a lot" is in quotes because we don't have
> an exact figure for what that is. This is one of the key
> points to come out of the ARMD effort.
>
> What exactly do you object to in that sentence?

My concern with the opening sentence of Sec 7.1 is that it generalizes
large L2 domains and makes a sweeping statement that seems to suggest
that all routers in large L2 domains need to process "a lot of" ARP
traffic. This is patently incorrect. As I have said earlier, this issue
is specific to L2 domains that see a lot of ARP/ND traffic and is not
true in general for all large L2 domains.

[clipped]

> > >
> > > Right. The spec says "don't do this". But I believe it was asserted
> > > that some implementations do this. That said, I'm not aware of any
> > > such implementations. I would be willing to remove this sentence in
> > > the absence of known implementations of this.
>
> To clarify, the current text says "Some routers can be
> configured to broadcast periodic gratuitous ARPs."
>
> This statement is true, and presumably you are not objecting
> to that. right?

Yes, I am not.

[clipped]

>
> How about I change the sentence:
>
>     Gratuitous ARPs, broadcast to all nodes in the L2 broadcast
>     domain, can also pre-populate ARP caches on neighboring devices,
>     further reducing ARP traffic.
>
> to:
>
>     Gratuitous ARPs, broadcast to all nodes in the L2 broadcast
>     domain, may in some cases also pre-populate ARP caches on
>     neighboring devices, further reducing ARP traffic. But it is not
>     believed that pre-population of ARP entries is supported by most
>     implementations, as the ARP specification <xref
>     target="RFC0826"></xref> recommends only that pre-existing ARP
>     entries be updated upon receipt of ARP messages; it does not call
>     for the creation of new entries when none already exist.

Sounds good.
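
For reference, the packet-reception rule in RFC 826 that this wording
relies on can be sketched as below. It is a simplified illustration
that assumes a single hardware and protocol type and a plain dict for
the cache; the function and parameter names are mine, not the RFC's.

```python
def receive_arp(cache, my_ip, sender_ip, sender_mac, target_ip):
    """Apply the RFC 826 reception rules to an ARP cache (a dict).

    Returns True if the cache holds an entry for the sender afterwards.
    """
    merged = False
    if sender_ip in cache:
        # An existing entry is always refreshed with the new information.
        cache[sender_ip] = sender_mac
        merged = True
    if target_ip == my_ip and not merged:
        # A new entry is created only when this node is the target.
        cache[sender_ip] = sender_mac
    return sender_ip in cache


# A gratuitous ARP carries the announcer's own address as the target,
# so a bystander with no prior entry learns nothing from it.
bystander_cache = {}
receive_arp(bystander_cache, "10.0.0.1", "10.0.0.7", "02:00:00:00:00:07", "10.0.0.7")
print(bystander_cache)  # {}
```

An implementation that does pre-populate on gratuitous ARPs would add
an entry in the bystander case as well, which is the behavior the
revised text describes as not believed to be common.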

>
> > > > 2. In Sec 7.1 you mention that routers need to drop all transit
> > > > traffic when there is no response received for an ARP/ND request.
> > > > You should mention that in addition to this, routers also need to
> > > > send an ICMP host unreachable error packet back to the sender.
> > > > ICMP error packets are generated in the control card CPU. So, if
> > > > the CPU has to generate a high number of such ICMP errors then
> > > > this can load the CPU. The whole process can be quite CPU as well
> > > > as buffer intensive. The CPU/buffer overload is usually mitigated
> > > > by rate limiting the number of ICMP errors generated.
> > >
> > > Added:
> > >
> > >    "and may send an ICMP destination unreachable message as well."
>
> > Why a "may"? An implementation is violating a standard if it isn't.
>
> They might not if rate limiting says otherwise. I.e., there
> are times when an ICMP won't be sent that is not in violation
> of the spec.

Aah, ok.
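
To make the "may" concrete, here is a minimal sketch of ICMP error
rate limiting with a token bucket. A token bucket is just one common
way to do this and is an assumption of the sketch; the draft does not
specify a mechanism, and the rate and burst values are hypothetical.

```python
class TokenBucket:
    """Token bucket limiting how many ICMP errors may be generated."""

    def __init__(self, rate_per_s, burst):
        self.rate = rate_per_s
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def on_resolution_failure(bucket, now):
    """Queued packets are always dropped; the ICMP error is sent only if allowed."""
    return "drop+icmp" if bucket.allow(now) else "drop"


bucket = TokenBucket(rate_per_s=1, burst=2)
print([on_resolution_failure(bucket, 0.0) for _ in range(3)])
# ['drop+icmp', 'drop+icmp', 'drop']
```

The third failure is dropped silently, which is the case where no ICMP
error is sent and the implementation is still within the spec.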

[clipped]

> > >
> > > > 5. Sec 7.2 discusses issues with address resolution mechanism in
> > > > IPv6. I think it's useful for this draft to discuss the fact that
> > > > unlike IPv4, IPv6 has subnets that are /64. This number is quite
> > > > large and will perhaps cover trillions of IP addresses, most of
> > > > which would be unassigned. Thus simplistic IPv6 ND implementations
> > > > can be vulnerable to attacks which inundate the CPU with huge
> > > > requests to perform address resolution for a large number of IPv6
> > > > addresses, most of which are unassigned. As a result of this
> > > > genuine IPv6 devices will not be able to join the network. You
> > > > might want to refer to RFC 6583 for more details.
> > >
> > > Ditto.
>
> > I am fine with your resolution to the comments 3 and 4. However, I
> > believe that 5 ought to be discussed. This document is about ARP/ND
> > issues that folks are either seeing or will see in large data
> > centers.
>
> To clarify: "are seeing". We can speculate at length for what
> problems will be seen in the future. :-)

If the scope is only the former, then I am ok with you not considering
point 5.

>
> > Given this, I don't see why this should not even be discussed in this
> > draft. I think it's quite reasonable to address the above mentioned
> > aspect of IPv6 ND and one way of getting attention to the issue is by
> > discussing this here in this draft.
>
> The issue you raise above is fully documented in RFC 6583,
> which I have added to the references (per my previous note).

Which is indeed very helpful.
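
For a sense of scale on the /64 point, a quick back-of-the-envelope
sketch follows. The helper names and the probe rate are hypothetical;
the only fixed input is the 64-bit interface identifier space.

```python
def addresses_in_prefix(prefix_len, address_bits=128):
    """Number of addresses covered by a prefix of the given length."""
    return 2 ** (address_bits - prefix_len)


def days_to_probe_all(prefix_len, probes_per_second):
    """Days needed to touch every address once at a steady probe rate."""
    return addresses_in_prefix(prefix_len) / probes_per_second / 86400


print(addresses_in_prefix(64))  # 18446744073709551616
```

That is roughly 1.8e19 addresses, so even at an assumed million probes
per second a sweep would never finish in practice, and a scanner can
keep triggering resolution attempts for unassigned addresses
indefinitely.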

[clipped]

> > >
> > >     Security considerations for Neighbor Discovery are discussed in
> > >     <xref target="RFC4861"></xref> and <xref target="RFC6583"></xref>.
>
> > This should be good. I assume that this then means that there are no
> > additional security concerns with ARPs/ND in data centers.
>
> I don't recall any coming up in the WG.

Ok.

[clipped]

>
> > I just noticed that Sec 8 - Summary, is redundant. Shouldn't that
> > entire text be moved to either the Abstract or the Introduction?
>
> It's the last section of the document. The document needs a
> summary or something (summary seems more accurate than
> conclusions, IMO).

I will not argue on this! :-)

Cheers, Manav

From manav.bhatia@alcatel-lucent.com  Thu Aug 30 17:52:47 2012
Return-Path: <manav.bhatia@alcatel-lucent.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 685CF21F84A5; Thu, 30 Aug 2012 17:52:47 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -7.488
X-Spam-Level: 
X-Spam-Status: No, score=-7.488 tagged_above=-999 required=5 tests=[AWL=-0.889, BAYES_00=-2.599, RCVD_IN_DNSWL_MED=-4]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 9PlHbGcycIFJ; Thu, 30 Aug 2012 17:52:46 -0700 (PDT)
Received: from ihemail4.lucent.com (ihemail4.lucent.com [135.245.0.39]) by ietfa.amsl.com (Postfix) with ESMTP id D28E721F848F; Thu, 30 Aug 2012 17:52:46 -0700 (PDT)
Received: from inbansmailrelay2.in.alcatel-lucent.com (h135-250-11-33.lucent.com [135.250.11.33]) by ihemail4.lucent.com (8.13.8/IER-o) with ESMTP id q7V0qeog018953 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 30 Aug 2012 19:52:43 -0500 (CDT)
Received: from INBANSXCHHUB02.in.alcatel-lucent.com (inbansxchhub02.in.alcatel-lucent.com [135.250.12.35]) by inbansmailrelay2.in.alcatel-lucent.com (8.14.3/8.14.3/GMO) with ESMTP id q7V0qaRR008071 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NOT); Fri, 31 Aug 2012 06:22:38 +0530
Received: from INBANSXCHMBSA1.in.alcatel-lucent.com ([135.250.12.38]) by INBANSXCHHUB02.in.alcatel-lucent.com ([135.250.12.35]) with mapi; Fri, 31 Aug 2012 06:22:36 +0530
From: "Bhatia, Manav (Manav)" <manav.bhatia@alcatel-lucent.com>
To: Thomas Narten <narten@us.ibm.com>
Date: Fri, 31 Aug 2012 06:22:50 +0530
Thread-Topic: RtgDir review: draft-ietf-armd-problem-statement-03
Thread-Index: Ac2F9tEdiNnE460+QqiYmVD1vas4CwAAbyhQAEZVgVA=
Message-ID: <7C362EEF9C7896468B36C9B79200D8350D064CAF6E@INBANSXCHMBSA1.in.alcatel-lucent.com>
References: <7C362EEF9C7896468B36C9B79200D8350D063A0AF5@INBANSXCHMBSA1.in.alcatel-lucent.com> <201208272124.q7RLOnx7015943@cichlid.raleigh.ibm.com> <7C362EEF9C7896468B36C9B79200D8350D06450BB6@INBANSXCHMBSA1.in.alcatel-lucent.com> <201208291458.q7TEwgxI011886@cichlid.raleigh.ibm.com> <7C362EEF9C7896468B36C9B79200D8350D064CAF3A@INBANSXCHMBSA1.in.alcatel-lucent.com>
In-Reply-To: <7C362EEF9C7896468B36C9B79200D8350D064CAF3A@INBANSXCHMBSA1.in.alcatel-lucent.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.57 on 135.245.2.39
Cc: "rtg-dir@ietf.org" <rtg-dir@ietf.org>, "rtg-ads@tools.ietf.org" <rtg-ads@tools.ietf.org>, "draft-ietf-armd-problem-statement.all@tools.ietf.org" <draft-ietf-armd-problem-statement.all@tools.ietf.org>, "armd@ietf.org" <armd@ietf.org>
Subject: Re: [armd] RtgDir review: draft-ietf-armd-problem-statement-03
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 31 Aug 2012 00:52:47 -0000

> >
> > What exactly do you object to in that sentence?
>
> My concern with the opening sentence of Sec 7.1 is that it
> generalizes large L2 domains and makes a sweeping statement
> that seems to suggest that all routers in large L2 domains
> need to process "a lot of" ARP traffic. This is patently
> incorrect. As I have said earlier, this issue is specific to
> L2 domains that see a lot of ARP/ND traffic and is not true
> in general for all large L2 domains.

Some more clarification.

There are large L2 domains that might see a lot of ARP/ND traffic but
will NOT process it. Such domains will treat all such traffic as
regular bcast/mcast traffic and will flood it appropriately. The draft
seems to suggest that large L2 domains have an issue because they need
to process all such traffic, which could lead the reader to believe
that all such traffic is punted to the CPU where it's processed.
However, in reality, most large L2 domains are oblivious to whether the
bcast traffic is ARP or something else. Handling this traffic is not an
issue at all. It's dealing with the unlearnt traffic that's an issue in
such domains, as the source MACs need to be learnt. If the L2 table is
hash based then dealing with collisions, etc. is another problem. If
it's CAM based then the size is often a limitation on the number of
MACs that can be learnt.
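
A rough sketch of the point about learning, assuming a fixed-capacity
table that loosely models a CAM (the class, its capacity, and the
single-letter MAC strings are illustrative, not any vendor's design):

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class MacTable:
    """Bounded MAC learning table, loosely modelling a CAM of fixed size."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}        # source MAC -> port
        self.learn_failures = 0  # sources that could not be stored

    def learn(self, src_mac, port):
        if src_mac in self.entries or len(self.entries) < self.capacity:
            self.entries[src_mac] = port
        else:
            self.learn_failures += 1  # table full: address stays unlearnt

    def handle_frame(self, src_mac, dst_mac, in_port):
        """Return the output port, or "flood" for broadcast/unknown destinations."""
        self.learn(src_mac, in_port)
        if dst_mac == BROADCAST:
            return "flood"  # no inspection of whether the payload is ARP
        return self.entries.get(dst_mac, "flood")


table = MacTable(capacity=2)
table.handle_frame("a", BROADCAST, 1)  # flooded, source "a" learnt
table.handle_frame("b", "a", 2)        # forwarded to port 1
table.handle_frame("c", "a", 3)        # forwarded, but "c" cannot be learnt
print(table.learn_failures)            # 1
```

The broadcast itself costs the switch nothing beyond flooding; the
pressure is on the table, since frames to the unlearnt "c" will keep
being flooded.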

Cheers, Manav