
From timothyhartrick@gmail.com  Fri Apr 29 14:25:26 2011
Return-Path: <timothyhartrick@gmail.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 26741E068F for <tcpm@ietfa.amsl.com>; Fri, 29 Apr 2011 14:25:26 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.299
X-Spam-Level: 
X-Spam-Status: No, score=-3.299 tagged_above=-999 required=5 tests=[AWL=0.300,  BAYES_00=-2.599, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ENddj3-GlqvH for <tcpm@ietfa.amsl.com>; Fri, 29 Apr 2011 14:25:25 -0700 (PDT)
Received: from mail-iy0-f172.google.com (mail-iy0-f172.google.com [209.85.210.172]) by ietfa.amsl.com (Postfix) with ESMTP id 64D7CE0669 for <tcpm@ietf.org>; Fri, 29 Apr 2011 14:25:25 -0700 (PDT)
Received: by iyn15 with SMTP id 15so4315776iyn.31 for <tcpm@ietf.org>; Fri, 29 Apr 2011 14:25:24 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:subject:from:reply-to:to:cc:in-reply-to :references:content-type:organization:date:message-id:mime-version :x-mailer:content-transfer-encoding; bh=69sFzYidyB1Y4p3ANHZZG9FXTjZmhH4yUTN6NR+iZhg=; b=SqRE91miCBXzvVV78UtCnVhDE89v+7yeeD5vxscL2edtE+v3zdSNbZB79+hupJEUTc On7yLfZYZc9RkRrRCunDb1a+y0riYGV6oXzYeLC8iPH0SiFW6rjBkc7qDzZFu5HMbqZA vMYbrgPMxq5EGGXng2ihBxDmsNqR9UZ7v+08E=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=subject:from:reply-to:to:cc:in-reply-to:references:content-type :organization:date:message-id:mime-version:x-mailer :content-transfer-encoding; b=pYtke++4QczQBuhhVt6kYLz+h/Kf2v/X3mGJw+947q679iH/a3Uj3QADeFTmQXsnB4 SVHvQKJJmUMuJdmZnJxfBIdvetOFoLgMkCKMpQ06kcZGwhQbyF8T4elRKilzoLUaA5t6 KzzwAuE9BOxaIPXiHL0W1xSFu1ixRfvEJNNUA=
Received: by 10.42.244.71 with SMTP id lp7mr76499icb.177.1304112324820; Fri, 29 Apr 2011 14:25:24 -0700 (PDT)
Received: from [192.168.29.48] ([67.41.221.225]) by mx.google.com with ESMTPS id 4sm1263827ibc.49.2011.04.29.14.25.21 (version=SSLv3 cipher=OTHER); Fri, 29 Apr 2011 14:25:22 -0700 (PDT)
From: Tim Hartrick <timothyhartrick@gmail.com>
To: Caitlin Bestler <cait@asomi.com>
In-Reply-To: <A8F44774-DE7E-4A1D-8BEC-1826B800E3FE@asomi.com>
References: <201104181447.p3IElwLI025952@cichlid.raleigh.ibm.com> <BANLkTinpVay6AE0hJ-iXwNXWQWaqtgMVWw@mail.gmail.com> <CBB1DA5A-DE55-4CB2-8FED-7D0CAE451CAE@mac.com> <4DB9B4FC.7020507@isi.edu> <1304021611.3998.11.camel@feller> <A8F44774-DE7E-4A1D-8BEC-1826B800E3FE@asomi.com>
Content-Type: text/plain; charset="UTF-8"
Organization: Tim Hartrick
Date: Fri, 29 Apr 2011 15:25:13 -0600
Message-ID: <1304112313.2355.2.camel@feller>
Mime-Version: 1.0
X-Mailer: Evolution 2.30.3 (2.30.3-1.fc13) 
Content-Transfer-Encoding: 7bit
X-Mailman-Approved-At: Sun, 01 May 2011 10:08:39 -0700
Cc: Thomas Narten <narten@us.ibm.com>, tcpm@ietf.org, Matt Mathis <mattmathis@google.com>, Joe Touch <touch@isi.edu>
Subject: Re: [tcpm] pmtu discovery (RFC 4821)
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
Reply-To: timothyhartrick@gmail.com
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 29 Apr 2011 21:25:26 -0000

Caitlin,

On Fri, 2011-04-29 at 09:39 -0700, Caitlin Bestler wrote:
> On Apr 28, 2011, at 1:13 PM, Tim Hartrick wrote:
> 

> So clearly a larger MTU will always benefit hosts, even if the benefit has been
> properly limited to only restoring the context of a specific TCP message rather
> than requiring the restoration of the context for the full TCP connection.
> 
> The real tradeoff is one of optimizing for the hosts versus the forwarding
> elements. Larger frame sizes make efficient memory allocation in the
> network more problematic. But nobody forces switches or routers to
> support larger MTUs. I can't think of any reason not to take advantage
> when working at L3 and above not to just take what L2 is willing to provide.
> 

If determining what L2 will provide end-to-end were free, this would
obviously be true, but the previous poster noted that it is less than
free.  Hence, again, the reference to diminishing returns.
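
As a rough illustration of the diminishing returns mentioned above, the
sketch below counts how many full-sized packets a fixed transfer needs at
several MTUs.  The 40-byte header size, the 1 GB transfer, and the MTU
values are assumptions chosen for the example, not figures from this
thread, and packet count is only a stand-in for per-packet host cost.

```python
# Illustrative only: assumes 40 bytes of IPv4 + TCP headers and no options.
HEADER_BYTES = 40


def packets_needed(payload_bytes: int, mtu: int) -> int:
    """Number of packets needed to carry payload_bytes at the given MTU."""
    mss = mtu - HEADER_BYTES
    return -(-payload_bytes // mss)  # ceiling division


if __name__ == "__main__":
    payload = 1_000_000_000  # arbitrary example transfer size
    previous = None
    for mtu in (1500, 4500, 9000, 18000, 36000):
        count = packets_needed(payload, mtu)
        note = "" if previous is None else f" ({previous - count} fewer than the previous step)"
        print(f"MTU {mtu:>5}: {count:>7} packets{note}")
        previous = count
```

Each step up in MTU removes fewer packets than the step before it, so any
fixed cost of discovering the larger MTU is weighed against a shrinking
gain.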



tim


From Michael.Scharf@alcatel-lucent.com  Thu May  5 09:02:26 2011
Return-Path: <Michael.Scharf@alcatel-lucent.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 9A26DE0593; Thu,  5 May 2011 09:02:26 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -6.057
X-Spam-Level: 
X-Spam-Status: No, score=-6.057 tagged_above=-999 required=5 tests=[AWL=0.192,  BAYES_00=-2.599, HELO_EQ_DE=0.35, RCVD_IN_DNSWL_MED=-4]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id LWxU0+hMFiaX; Thu,  5 May 2011 09:02:25 -0700 (PDT)
Received: from mailrelay2.alcatel.de (mailrelay2.alcatel.de [194.113.59.96]) by ietfa.amsl.com (Postfix) with ESMTP id 07137E06D8; Thu,  5 May 2011 09:02:21 -0700 (PDT)
Received: from SLFSNX.rcs.alcatel-research.de (slfsn1.rcs.de.alcatel-lucent.com [149.204.60.98]) by mailrelay2.alcatel.de (8.14.3/8.14.3/ICT) with ESMTP id p45G2JF0010005; Thu, 5 May 2011 18:02:19 +0200
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Thu, 5 May 2011 18:00:42 +0200
Message-ID: <133D9897FB9C5E4E9DF2779DC91E947C05679CAB@SLFSNX.rcs.alcatel-research.de>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: Requesting publication from TCPM of draft-ietf-tcpm-persist-04
Thread-Index: AcwLPZXq1wuMFP1TSRaP0PvOY/Z3DQ==
From: "SCHARF, Michael" <Michael.Scharf@alcatel-lucent.com>
To: <iesg-secretary@ietf.org>, <tsv-ads@ietf.org>
X-Scanned-By: MIMEDefang 2.64 on 149.204.45.73
Cc: tcpm@ietf.org
Subject: [tcpm] Requesting publication from TCPM of draft-ietf-tcpm-persist-04
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 05 May 2011 16:02:26 -0000

Hello,

Attached is the PROTO statement for the TCPM working group document
draft-ietf-tcpm-persist-04. We would like to request its publication as
Informational. Michael Scharf (TCPM co-chair) is the document shepherd.

Thanks

Michael



draft-ietf-tcpm-persist-04


  (1.a) Who is the Document Shepherd for this document? Has the
        Document Shepherd personally reviewed this version of the
        document and, in particular, does he or she believe this
        version is ready for forwarding to the IESG for publication?


Michael Scharf (michael.scharf@alcatel-lucent.com) is the document
shepherd. He has personally reviewed this version and believes it is
ready for forwarding to the IESG for publication.



  (1.b) Has the document had adequate review both from key WG members
        and from key non-WG members? Does the Document Shepherd have
        any concerns about the depth or breadth of the reviews that
        have been performed?


The document is the outcome of lengthy discussions that have spanned
years. The WG chairs (Wes Eddy and David Borman) had one-on-one
discussions with the authors before it was adopted as a WG document,
as well as discussions on the mailing list after it was adopted. The
working group last call generated late feedback from key WG members,
which has been addressed in the document. Since this is a short,
informational document, the reviews are sufficient.



  (1.c) Does the Document Shepherd have concerns that the document
        needs more review from a particular or broader perspective,
        e.g., security, operational complexity, someone familiar with
        AAA, internationalization or XML?


No concerns.



  (1.d) Does the Document Shepherd have any specific concerns or
        issues with this document that the Responsible Area Director
        and/or the IESG should be aware of? For example, perhaps he
        or she is uncomfortable with certain parts of the document, or
        has concerns whether there really is a need for it. In any
        event, if the WG has discussed those issues and has indicated
        that it still wishes to advance the document, detail those
        concerns here. Has an IPR disclosure related to this document
        been filed? If so, please include a reference to the
        disclosure and summarize the WG discussion and conclusion on
        this issue.


No concerns.



  (1.e) How solid is the WG consensus behind this document? Does it
        represent the strong concurrence of a few individuals, with
        others being silent, or does the WG as a whole understand and
        agree with it?


There was a significant amount of discussion on the scope of this
document, and the WG consensus was to come up with a draft that
clarifies the intentions of RFC 1122. There is a reasonable level of
support for the document. During the working group last call, some
late objections were raised on the mailing list, and the requested
changes have been fully incorporated in the document.



  (1.f) Has anyone threatened an appeal or otherwise indicated extreme
        discontent? If so, please summarise the areas of conflict in
        separate email messages to the Responsible Area Director. (It
        should be in a separate email because this questionnaire is
        entered into the ID Tracker.)


No.



  (1.g) Has the Document Shepherd personally verified that the
        document satisfies all ID nits? (See the Internet-Drafts
        Checklist and http://tools.ietf.org/tools/idnits/).
        Boilerplate checks are not enough; this check needs to be
        thorough. Has the document met all formal review criteria it
        needs to, such as the MIB Doctor, media type and URI type
        reviews?


There are no significant formatting issues.



  (1.h) Has the document split its references into normative and
        informative? Are there normative references to documents that
        are not ready for advancement or are otherwise in an unclear
        state? If such normative references exist, what is the
        strategy for their completion? Are there normative references
        that are downward references, as described in [RFC3967]? If
        so, list these downward references to support the Area
        Director in the Last Call procedure for them [RFC3967].


The references are properly split.



  (1.i) Has the Document Shepherd verified that the document IANA
        consideration section exists and is consistent with the body
        of the document? If the document specifies protocol
        extensions, are reservations requested in appropriate IANA
        registries? Are the IANA registries clearly identified? If
        the document creates a new registry, does it define the
        proposed initial contents of the registry and an allocation
        procedure for future registrations? Does it suggest a
        reasonable name for the new registry? See [RFC5226]. If the
        document describes an Expert Review process has Shepherd
        conferred with the Responsible Area Director so that the IESG
        can appoint the needed Expert during the IESG Evaluation?


The IANA considerations are present and specify no actions for IANA.



  (1.j) Has the Document Shepherd verified that sections of the
        document that are written in a formal language, such as XML
        code, BNF rules, MIB definitions, etc., validate correctly in
        an automated checker?


Not applicable.



  (1.k) The IESG approval announcement includes a Document
        Announcement Write-Up. Please provide such a Document
        Announcement Write-Up? Recent examples can be found in the
        "Action" announcements for approved documents. The approval
        announcement contains the following sections:

     Technical Summary
        Relevant content can frequently be found in the abstract
        and/or introduction of the document. If not, this may be
        an indication that there are deficiencies in the abstract
        or introduction.


From the abstract:

   This document clarifies the Zero Window Probes (ZWP) described in
   Requirements for Internet Hosts [RFC1122].  In particular, it
   clarifies the actions that can be taken on connections which are
   experiencing the ZWP condition.



     Working Group Summary
        Was there anything in WG process that is worth noting? For
        example, was there controversy about particular points or
        were there decisions where the consensus was particularly
        rough?


The biggest challenge with this document was getting the authors to
limit its scope and simplify what needed to be in the document to make
it acceptable as a WG document. The entire impetus for this document
was that there were TCP implementations out there that treated
connections in persist state as special, and had explicit code to keep
them from being terminated in situations where the implementation
would normally abort connections in other TCP states, which led to a
vulnerability to resource starvation. Earlier versions of the
document contained a full API, i.e., a mechanism for applications to
set timers to abort TCP connections in persist state, but the WG
wanted the document only to identify and clarify the issue of how a
TCP should treat connections in the persist state: the same as any
other connection. The API material has therefore been removed from
the document.

Early on one of the largest issues was getting the authors to
understand the distinction between the TCP protocol itself terminating
a connection in persist state, and some other part of the operating
system implementation terminating a TCP connection. If the OS is
resource starved it can abort any TCP connections it wants to,
including connections in persist state, and that does not make it the
TCP protocol that is deciding to abort the connection.

In its current form, the document reflects those changes: its scope is
limited to identifying what has been misinterpreted and clarifying
that a TCP connection in persist state is treated the same as any
other TCP connection.
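
To make the "same as any other connection" point concrete, here is a
minimal sketch of an OS-level reclaim policy for a resource-starved
host. It is not taken from the draft or from any real stack: the Conn
record, the idle_seconds field and pick_connections_to_abort() are
hypothetical names invented for this illustration. The only point is
that the connection state is never consulted when choosing what to
abort.

```python
from dataclasses import dataclass


@dataclass
class Conn:
    name: str
    state: str           # e.g. "ESTABLISHED" or "PERSIST" (informal label)
    idle_seconds: float  # time since the connection last made progress


def pick_connections_to_abort(conns, how_many):
    """Choose victims for an OS under memory pressure (hypothetical policy).

    The choice depends only on idle time. The 'state' field is ignored,
    so connections in persist state get neither protection nor a penalty.
    """
    by_idle = sorted(conns, key=lambda c: c.idle_seconds, reverse=True)
    return by_idle[:how_many]


if __name__ == "__main__":
    table = [
        Conn("a", "ESTABLISHED", 5.0),
        Conn("b", "PERSIST", 120.0),
        Conn("c", "ESTABLISHED", 300.0),
    ]
    for victim in pick_connections_to_abort(table, 2):
        print(victim.name, victim.state, victim.idle_seconds)
```

A real implementation would weigh other factors as well; the sketch
only shows that no special case for persist state is needed.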

And while this document doesn't explicitly state it, the reason that
Section 4.2.2.17 of Requirements for Internet Hosts [RFC1122] exists
is that at the time there were TCP implementations that were treating
connections in persist state the same as connections that weren't
receiving any return traffic and timing them out.



     Document Quality
        Are there existing implementations of the protocol? Have a
        significant number of vendors indicated their plan to
        implement the specification? Are there any reviewers that
        merit special mention as having done a thorough review,
        e.g., one that resulted in important changes or a
        conclusion that the document had no substantive issues? If
        there was a MIB Doctor, Media Type or other expert review,
        what was its course (briefly)? In the case of a Media Type
        review, on what date was the request posted?


This is a short informational document that does not specify a
protocol. It is published to clarify the intentions of RFC 1122.


From Internet-Drafts@ietf.org  Fri May  6 11:15:06 2011
Return-Path: <Internet-Drafts@ietf.org>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 56DE6E076C; Fri,  6 May 2011 11:15:06 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.571
X-Spam-Level: 
X-Spam-Status: No, score=-102.571 tagged_above=-999 required=5 tests=[AWL=0.028, BAYES_00=-2.599, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id TmF0Ao9mO0uS; Fri,  6 May 2011 11:15:05 -0700 (PDT)
Received: from ietfa.amsl.com (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id E702AE0782; Fri,  6 May 2011 11:15:03 -0700 (PDT)
MIME-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
From: Internet-Drafts@ietf.org
To: i-d-announce@ietf.org
X-Test-IDTracker: no
X-IETF-IDTracker: 3.53
Message-ID: <20110506181503.17839.10589.idtracker@ietfa.amsl.com>
Date: Fri, 06 May 2011 11:15:03 -0700
Cc: tcpm@ietf.org
Subject: [tcpm] I-D ACTION:draft-ietf-tcpm-rfc3782-bis-02.txt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 06 May 2011 18:15:06 -0000

--NextPart

A new Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the TCP Maintenance and Minor Extensions Working Group of the IETF.

    Title         : The NewReno Modification to TCP's Fast Recovery Algorithm
    Author(s)     : T. Henderson, et al
    Filename      : draft-ietf-tcpm-rfc3782-bis-02.txt
    Pages         : 16
    Date          : 2011-04-20
    
   RFC 5681 documents the following four intertwined TCP
   congestion control algorithms: slow start, congestion avoidance, fast
   retransmit, and fast recovery.  RFC 5681 explicitly allows
   certain modifications of these algorithms, including modifications
   that use the TCP Selective Acknowledgement (SACK) option (RFC 2883),
   and modifications that respond to "partial acknowledgments" (ACKs
   which cover new data, but not all the data outstanding when loss was
   detected) in the absence of SACK.  This document describes a specific
   algorithm for responding to partial acknowledgments, referred to as
   NewReno.  This response to partial acknowledgments was first proposed
   by Janey Hoe.  This document obsoletes RFC 3782.
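
   The definition of a partial acknowledgment given above can be
   sketched as a small classifier. This is a simplified illustration,
   not the procedure specified in the draft: recover_point is a
   hypothetical name for the first sequence number beyond the data that
   was outstanding when loss was detected, and sequence-number
   wraparound is ignored.

```python
def classify_ack(ack: int, snd_una: int, recover_point: int) -> str:
    """Classify an ACK that arrives during loss recovery.

    ack           -- cumulative acknowledgment number of the incoming ACK
    snd_una       -- oldest unacknowledged sequence number before this ACK
    recover_point -- hypothetical: first sequence number past the data
                     that was outstanding when loss was detected
    Wraparound of the 32-bit sequence space is ignored for simplicity.
    """
    if ack <= snd_una:
        return "no new data"   # duplicate or stale ACK
    if ack < recover_point:
        return "partial"       # covers new data, but not all outstanding data
    return "full"              # covers everything outstanding at loss time


if __name__ == "__main__":
    for ack in (1000, 3000, 5000):
        print(ack, classify_ack(ack, snd_una=1000, recover_point=5000))
```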


A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-tcpm-rfc3782-bis-02.txt

Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/

Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Message/External-body; name="draft-ietf-tcpm-rfc3782-bis-02.txt";
	site="ftp.ietf.org"; access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID: <2011-05-06111033.I-D@ietf.org>


--NextPart--

From rs@netapp.com  Mon May  9 01:19:25 2011
Return-Path: <rs@netapp.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id F331DE07B1 for <tcpm@ietfa.amsl.com>; Mon,  9 May 2011 01:19:24 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -10.548
X-Spam-Level: 
X-Spam-Status: No, score=-10.548 tagged_above=-999 required=5 tests=[AWL=0.051, BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id sHbr46T74-sR for <tcpm@ietfa.amsl.com>; Mon,  9 May 2011 01:19:24 -0700 (PDT)
Received: from mx4.netapp.com (mx4.netapp.com [217.70.210.8]) by ietfa.amsl.com (Postfix) with ESMTP id 282EEE079B for <tcpm@ietf.org>; Mon,  9 May 2011 01:19:23 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="4.64,339,1301900400"; d="scan'208";a="249426240"
Received: from smtp3.europe.netapp.com ([10.64.2.67]) by mx4-out.netapp.com with ESMTP; 09 May 2011 01:16:07 -0700
Received: from amsrsexc1-prd.hq.netapp.com (emeaexchrs.hq.netapp.com [10.64.251.107]) by smtp3.europe.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id p498DUw5007064; Mon, 9 May 2011 01:16:07 -0700 (PDT)
Received: from LDCMVEXC1-PRD.hq.netapp.com ([10.65.251.108]) by amsrsexc1-prd.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.3959);  Mon, 9 May 2011 10:15:46 +0200
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Mon, 9 May 2011 09:15:16 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0E387844@LDCMVEXC1-PRD.hq.netapp.com>
In-Reply-To: <20110506181503.17839.10589.idtracker@ietfa.amsl.com>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: [tcpm] I-D ACTION:draft-ietf-tcpm-rfc3782-bis-02.txt
Thread-Index: AcwMGzDl62cad8imSw69oEsATavTJACA/WdQ
References: <20110506181503.17839.10589.idtracker@ietfa.amsl.com>
From: "Scheffenegger, Richard" <rs@netapp.com>
To: <thomas.r.henderson@boeing.com>
X-OriginalArrivalTime: 09 May 2011 08:15:46.0644 (UTC) FILETIME=[4C766940:01CC0E21]
Cc: tcpm@ietf.org
Subject: Re: [tcpm] I-D ACTION:draft-ietf-tcpm-rfc3782-bis-02.txt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 09 May 2011 08:19:25 -0000

Tom,

Maybe a stupid question, but section 4.2 reads as if it overlaps with
the Eifel algorithms (RFC 3522 / RFC 4015), which are encumbered by IPR:
some partially open (but non-GPL) and commercial implementations appear
to be forbidden from using Eifel. [For completeness, this section was
already in RFC 3782.]

https://datatracker.ietf.org/ipr/171/

How does RFC3782bis circumvent this patent? (Without the special
wording of the Ericsson IPR disclosure, a standard reciprocity IPR
statement wouldn't pose any problem for a standards-track document,
IMHO.)

Unfortunately, the IPR disclosure doesn't state the exact patents owned
by Ericsson, but Eifel is generic enough that any heuristic that makes
use of timestamps for loss recovery in some form is likely to infringe
their claimed patents...

Thanks,



Richard Scheffenegger



> -----Original Message-----
> From: Internet-Drafts@ietf.org [mailto:Internet-Drafts@ietf.org]
> Sent: Freitag, 06. Mai 2011 20:15
> To: i-d-announce@ietf.org
> Cc: tcpm@ietf.org
> Subject: [tcpm] I-D ACTION:draft-ietf-tcpm-rfc3782-bis-02.txt
>
> A new Internet-Draft is available from the on-line Internet-Drafts
> directories.
> This draft is a work item of the TCP Maintenance and Minor Extensions
> Working Group of the IETF.
>
>     Title         : The NewReno Modification to TCP's Fast Recovery
>                     Algorithm
>     Author(s)     : T. Henderson, et al
>     Filename      : draft-ietf-tcpm-rfc3782-bis-02.txt
>     Pages         : 16
>     Date          : 2011-04-20
>
>    RFC 5681 documents the following four intertwined TCP
>    congestion control algorithms: slow start, congestion avoidance, fast
>    retransmit, and fast recovery.  RFC 5681 explicitly allows
>    certain modifications of these algorithms, including modifications
>    that use the TCP Selective Acknowledgement (SACK) option (RFC 2883),
>    and modifications that respond to "partial acknowledgments" (ACKs
>    which cover new data, but not all the data outstanding when loss was
>    detected) in the absence of SACK.  This document describes a specific
>    algorithm for responding to partial acknowledgments, referred to as
>    NewReno.  This response to partial acknowledgments was first proposed
>    by Janey Hoe.  This document obsoletes RFC 3782.
>
>
> A URL for this Internet-Draft is:
> http://www.ietf.org/internet-drafts/draft-ietf-tcpm-rfc3782-bis-02.txt
>=20
> Internet-Drafts are also available by anonymous FTP at:
> ftp://ftp.ietf.org/internet-drafts/
>=20
> Below is the data which will enable a MIME compliant mail reader
> implementation to automatically retrieve the ASCII version of the
> Internet-Draft.

From william.allen.simpson@gmail.com  Fri May 13 07:27:14 2011
Return-Path: <william.allen.simpson@gmail.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C816BE06A6 for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 07:27:14 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.599
X-Spam-Level: 
X-Spam-Status: No, score=-3.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CzghHB8ptJNs for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 07:27:14 -0700 (PDT)
Received: from mail-iy0-f172.google.com (mail-iy0-f172.google.com [209.85.210.172]) by ietfa.amsl.com (Postfix) with ESMTP id 427F0E069B for <tcpm@ietf.org>; Fri, 13 May 2011 07:27:14 -0700 (PDT)
Received: by iyn15 with SMTP id 15so2748127iyn.31 for <tcpm@ietf.org>; Fri, 13 May 2011 07:27:13 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:message-id:date:from:user-agent:mime-version:to :subject:content-type:content-transfer-encoding; bh=f+C77hnKIqogaB4pZI5eOkg7v7BzGv8LeoAzBZLrWX0=; b=r+F8S4h4yqSATK4itaXUPm36iJwiODJYCMutlC5bpkf7WTbiJQ1b0scxmgkri0NeZs H2ZEKua9X/EL0uOtckqwCluVr5pnf3CtoypGWNlNaGinRbJqh08khMga/r4p4wLsA4cU U/SMubi1e1h0sGjHMewNX63HVnqdfzsoavaRI=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:subject :content-type:content-transfer-encoding; b=jkOGe74UVq401NjWPoifgIAHLzZTBf/z2Bx6EF/4GHrCbaWFkc7nr/3b8ZSmDYMPhU foJWoQXIr7vBE6mxgy6HVoYhwrkWhnwoUjFLAM0MWQlVf2rR/YJB9oQre7AFCxSZGI5q Y4WzR8SYKsj0zsQvpkaR0mpJ1Samx3i89nrZc=
Received: by 10.231.207.71 with SMTP id fx7mr1084214ibb.168.1305296833787; Fri, 13 May 2011 07:27:13 -0700 (PDT)
Received: from Wastrel-3.local (c-68-40-194-239.hsd1.mi.comcast.net [68.40.194.239]) by mx.google.com with ESMTPS id f28sm979603ibh.67.2011.05.13.07.27.12 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 13 May 2011 07:27:12 -0700 (PDT)
Message-ID: <4DCD3FBF.4040908@gmail.com>
Date: Fri, 13 May 2011 10:27:11 -0400
From: William Allen Simpson <william.allen.simpson@gmail.com>
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10
MIME-Version: 1.0
To: TCP Modifications WG <tcpm@ietf.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 13 May 2011 14:27:14 -0000

This arose on the NANOG list, and I've not been able to figure out
how it happens.  It does happen consistently, and seems relevant to
this list.

Linux 2.6.18 on the server side sometimes sends:

# sh capture debug-in
8 packets captured
   1: 21:49:13.461554 client.32929 > server.3306: S
4107544000:4107544000(0) win 65535 <mss 1380,nop,wscale 3,sackOK,timestamp
2065216038 0>
   2: 21:49:13.462073 server.3306 > client.32929: S
2601320299:2601320299(0) ack 4107544001 win 5792 <mss 1460,sackOK,timestamp
2581054349 2065216038,nop,wscale 7>
   3: 21:49:13.462210 server.3306 > client.32929: P
2601320300:2601320363(63) ack 4107544001 win 46 <nop,nop,timestamp
2581054349 2065216038>
   4: 21:49:13.519061 client.32929 > server.3306: . ack 2601320300
win 8208 <nop,nop,timestamp 2065216096 2581054349>

That is, the server sends data before receiving ACK(SYN) to complete the
three-way handshake.  Cisco firewalls drop segment 3, as a violation of
the "tcp-3whs-failed" rule.

   5: 21:49:14.135384 client.32929 > server.3306: P
4107544001:4107544003(2) ack 2601320300 win 8208 <nop,nop,timestamp
2065216712 2581054349>
   6: 21:49:14.135521 server.3306 > client.32929: . ack 4107544003
win 46 <nop,nop,timestamp 2581055023 2065216712>
   7: 21:49:16.461981 server.3306 > client.32929: P
2601320300:2601320363(63) ack 4107544003 win 46 <nop,nop,timestamp
2581057349 2065216712>
   8: 21:49:16.618147 client.32929 > server.3306: . ack 2601320363
win 8208 <nop,nop,timestamp 2065219195 2581057349>

The good news is that it doesn't appear to be fatal, but it stalls the
connection for 2+ seconds (about 3 seconds in this capture) until the
retransmit in segment 7.  That is egregiously contrary to the presumed
intent of speeding up the connection.
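
For reference, the timing can be read straight off the capture.  The
sketch below just subtracts the timestamps printed above; the constant
names are mine, and the times are those seen at the capture point, which
is not necessarily the server itself.

```python
from datetime import datetime


def seconds_between(earlier: str, later: str) -> float:
    """Elapsed seconds between two HH:MM:SS.ffffff capture timestamps."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.total_seconds()


# Timestamps copied from the capture above.
SYN_ACK = "21:49:13.462073"        # segment 2: server SYN-ACK
FIRST_DATA = "21:49:13.462210"     # segment 3: server data (dropped)
HANDSHAKE_ACK = "21:49:13.519061"  # segment 4: client ACK of the SYN-ACK
RETRANSMIT = "21:49:16.461981"     # segment 7: server retransmits the data

if __name__ == "__main__":
    print("data after SYN-ACK:     ", seconds_between(SYN_ACK, FIRST_DATA), "s")
    print("data before client ACK: ", seconds_between(FIRST_DATA, HANDSHAKE_ACK), "s")
    print("stall until retransmit: ", seconds_between(FIRST_DATA, RETRANSMIT), "s")
```

The first two figures show that segment 3 left well before the client's
ACK could have arrived; the third is the stall seen by the application.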

Does anybody know how this is triggered?  I cannot find any documentation
about this in the Linux stack.  I've tried comparing the sysctls to current
values, but found nothing obvious.

The server window and wscale seem a bit flaky, too....
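
As a sanity check on the window values, the sketch below applies the
window scale rule from RFC 1323: the window field in a SYN or SYN-ACK is
never scaled, while later segments are shifted left by the sender's
wscale.  The function name is mine; the numbers are copied from the
capture above, and this is only one possible reading of them.

```python
def effective_window(raw_window: int, wscale: int, is_syn: bool) -> int:
    """Receive window in bytes advertised by one segment.

    Per RFC 1323, the window field of a SYN or SYN-ACK is never scaled;
    on all other segments it is shifted left by the sender's wscale.
    """
    return raw_window if is_syn else raw_window << wscale


if __name__ == "__main__":
    # Server side: wscale 7 (segments 2 and 3).
    print("server SYN-ACK:", effective_window(5792, 7, is_syn=True))
    print("server data:   ", effective_window(46, 7, is_syn=False))
    # Client side: wscale 3 (segments 1 and 4).
    print("client SYN:    ", effective_window(65535, 3, is_syn=True))
    print("client ACK:    ", effective_window(8208, 3, is_syn=False))
```

Read this way, the server's "win 46" is close to the 5792 bytes it
advertised in the SYN-ACK rather than a sudden collapse.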

From touch@isi.edu  Fri May 13 08:58:16 2011
Return-Path: <touch@isi.edu>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C691AE06DB for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 08:58:16 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.599
X-Spam-Level: 
X-Spam-Status: No, score=-102.599 tagged_above=-999 required=5 tests=[AWL=0.000, BAYES_00=-2.599, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id EFS8W6F12nzn for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 08:58:16 -0700 (PDT)
Received: from nitro.isi.edu (nitro.isi.edu [128.9.208.207]) by ietfa.amsl.com (Postfix) with ESMTP id E7DE5E0685 for <tcpm@ietf.org>; Fri, 13 May 2011 08:58:15 -0700 (PDT)
Received: from [192.168.1.93] (pool-71-105-81-169.lsanca.dsl-w.verizon.net [71.105.81.169]) (authenticated bits=0) by nitro.isi.edu (8.13.8/8.13.8) with ESMTP id p4DFvoOl002550 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT); Fri, 13 May 2011 08:58:00 -0700 (PDT)
Message-ID: <4DCD54FE.2040304@isi.edu>
Date: Fri, 13 May 2011 08:57:50 -0700
From: Joe Touch <touch@isi.edu>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10
MIME-Version: 1.0
To: William Allen Simpson <william.allen.simpson@gmail.com>
References: <4DCD3FBF.4040908@gmail.com>
In-Reply-To: <4DCD3FBF.4040908@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-MailScanner-ID: p4DFvoOl002550
X-ISI-4-69-MailScanner: Found to be clean
X-MailScanner-From: touch@isi.edu
Cc: TCP Modifications WG <tcpm@ietf.org>
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 13 May 2011 15:58:16 -0000

Hi, all,

On 5/13/2011 7:27 AM, William Allen Simpson wrote:
> This arose on the NANOG list, and I've not been able to figure out
> how it happens. It does happen consistently, and seems relevant to
> this list.
>
> Linux 2.6.18 on the server side sometimes sends:
>
> # sh capture debug-in
> 8 packets captured
> 1: 21:49:13.461554 client.32929 > server.3306: S
> 4107544000:4107544000(0) win 65535 <mss 1380,nop,wscale 3,sackOK,timestamp
> 2065216038 0>
> 2: 21:49:13.462073 server.3306 > client.32929: S
> 2601320299:2601320299(0) ack 4107544001 win 5792 <mss 1460,sackOK,timestamp
> 2581054349 2065216038,nop,wscale 7>
> 3: 21:49:13.462210 server.3306 > client.32929: P
> 2601320300:2601320363(63) ack 4107544001 win 46 <nop,nop,timestamp
> 2581054349 2065216038>
> 4: 21:49:13.519061 client.32929 > server.3306: . ack 2601320300
> win 8208 <nop,nop,timestamp 2065216096 2581054349>
>
> That is, the server sends data before receiving ACK(SYN) to complete the
> three-way handshake.

That seems like the server is the party violating TCP. SEND calls in the 
SYN-RECEIVED state are supposed to:

"Queue the data for transmission after entering ESTABLISHED state."

The server is jumping the gun sending the data early when it is in the 
SYN-RECEIVED state.
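
The queueing rule quoted above can be sketched as a toy model. This is
not Linux code and not a full TCP state machine; the class and method
names (Endpoint, send, on_ack_of_syn) are invented for illustration,
and only the two states relevant here are modelled.

```python
from collections import deque
from enum import Enum, auto


class State(Enum):
    SYN_RECEIVED = auto()
    ESTABLISHED = auto()


class Endpoint:
    """Toy passive opener that has already sent its SYN-ACK (hypothetical model)."""

    def __init__(self) -> None:
        self.state = State.SYN_RECEIVED
        self.pending = deque()  # data held back until ESTABLISHED
        self.wire = []          # data actually transmitted, in order

    def send(self, data: bytes) -> None:
        # SEND call: queue in SYN-RECEIVED, transmit once ESTABLISHED.
        if self.state is State.SYN_RECEIVED:
            self.pending.append(data)
        else:
            self.wire.append(data)

    def on_ack_of_syn(self) -> None:
        # The ACK of our SYN completes the handshake; flush the queue.
        self.state = State.ESTABLISHED
        while self.pending:
            self.wire.append(self.pending.popleft())


server = Endpoint()
server.send(b"greeting")        # application writes immediately after accept
sent_before_ack = list(server.wire)
server.on_ack_of_syn()
sent_after_ack = list(server.wire)
print(sent_before_ack, sent_after_ack)
```

In the capture, the server behaves as if send() transmitted immediately
instead of queueing.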

> Cisco firewalls drop segment 3, as a violation of
> the "tcp-3whs-failed" rule.

Perhaps. It's consistent with enforcing what the server should be doing, 
from RFC793 processing rules, though. I.e., it seems compliant, though 
not desired.

Joe

From shep@xplot.org  Fri May 13 10:48:58 2011
Return-Path: <shep@xplot.org>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 98228E080A for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 10:48:58 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.599
X-Spam-Level: 
X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id prBi4GCRFsrw for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 10:48:58 -0700 (PDT)
Received: from www.xplot.org (www.xplot.org [66.92.66.146]) by ietfa.amsl.com (Postfix) with ESMTP id 0F9DDE07E1 for <tcpm@ietf.org>; Fri, 13 May 2011 10:48:58 -0700 (PDT)
Received: from shep (helo=alva.home) by www.xplot.org with local-esmtp (Exim 3.36 #1 (Debian)) id 1QKwTj-0002R3-00; Fri, 13 May 2011 13:48:43 -0400
From: Tim Shepard <shep@alum.mit.edu>
To: Joe Touch <touch@isi.edu>
In-reply-to: Your message of Fri, 13 May 2011 08:57:50 -0700. <4DCD54FE.2040304@isi.edu> 
Date: Fri, 13 May 2011 13:48:43 -0400
Message-Id: <E1QKwTj-0002R3-00@www.xplot.org>
Sender: Tim Shepard <shep@xplot.org>
Cc: TCP Modifications WG <tcpm@ietf.org>
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 13 May 2011 17:48:58 -0000

I've always understood that TCP (the protocol) allows data to travel
with the SYN, and that the only important rule was that the TCP should
not deliver any data up to a naive application without having first
received an ACK of its own SYN (proving return routability, not a
stale packet, etc).

I remember watching Phil Karn's KA9Q NOS exchanging packets and seeing
it do a full SMTP exchange with remarkably few packets, and I think it
included data with the SYN in both directions.  This was all good
stuff.  (It was doing the best it could do in terms of packet and
round-trip efficiency and getting the transfer-of-responsibility for
the e-mail message right, short of solving the Two Generals' Problem.)

A non-naive application that got to see and examine the data with the
SYN and produce some data that could be included with the SYN-ACK
packet could be a good thing.  We need to be careful not to rule out
this sort of optimization.  (The author of such a non-naive
application has to understand the different nature of this data
received before an ACK of its own TCP's SYN has been received, and
take on some extra responsibility for coping with exceptional
situations.)

Round trip delays are one of the most expensive things we've got.
Avoiding them can be important to performance, and if including data
with the SYN helps, well, that's good.


			-Tim Shepard
			 shep@alum.mit.edu

From touch@isi.edu  Fri May 13 10:58:14 2011
Return-Path: <touch@isi.edu>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id CB108E082D for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 10:58:14 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -105.017
X-Spam-Level: 
X-Spam-Status: No, score=-105.017 tagged_above=-999 required=5 tests=[AWL=1.582, BAYES_00=-2.599, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id qf9tzrEYvDWB for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 10:58:14 -0700 (PDT)
Received: from boreas.isi.edu (boreas.isi.edu [128.9.160.161]) by ietfa.amsl.com (Postfix) with ESMTP id 6D197E081A for <tcpm@ietf.org>; Fri, 13 May 2011 10:58:10 -0700 (PDT)
Received: from [128.9.160.166] (abc.isi.edu [128.9.160.166]) (authenticated bits=0) by boreas.isi.edu (8.13.8/8.13.8) with ESMTP id p4DHvDM3026632 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT); Fri, 13 May 2011 10:57:13 -0700 (PDT)
Message-ID: <4DCD70F8.8090203@isi.edu>
Date: Fri, 13 May 2011 10:57:12 -0700
From: Joe Touch <touch@isi.edu>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10
MIME-Version: 1.0
To: Tim Shepard <shep@alum.mit.edu>
References: <E1QKwTj-0002R3-00@www.xplot.org>
In-Reply-To: <E1QKwTj-0002R3-00@www.xplot.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-ISI-4-43-8-MailScanner: Found to be clean
X-MailScanner-From: touch@isi.edu
Cc: TCP Modifications WG <tcpm@ietf.org>
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 13 May 2011 17:58:14 -0000

On 5/13/2011 10:48 AM, Tim Shepard wrote:
> I've always understood that TCP (the protocol) allows data to travel
> with the SYN,

The issue is whether it allows separate data packets after the SYN but 
before the SYN-ACK. There doesn't appear to be a rule against receiving 
such packets, but the description clearly rules out generating them, AFAICT.

Joe

From david.borman@windriver.com  Fri May 13 11:56:24 2011
Return-Path: <david.borman@windriver.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 63F04E086F for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 11:56:24 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -103.599
X-Spam-Level: 
X-Spam-Status: No, score=-103.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_LOW=-1, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id yHF1i8-aIYKY for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 11:56:23 -0700 (PDT)
Received: from mail.windriver.com (mail.windriver.com [147.11.1.11]) by ietfa.amsl.com (Postfix) with ESMTP id BCEFBE086E for <tcpm@ietf.org>; Fri, 13 May 2011 11:56:23 -0700 (PDT)
Received: from ALA-HCA.corp.ad.wrs.com (ala-hca [147.11.189.40]) by mail.windriver.com (8.14.3/8.14.3) with ESMTP id p4DIuJTC005863 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=FAIL); Fri, 13 May 2011 11:56:19 -0700 (PDT)
Received: from ALA-MBB.corp.ad.wrs.com ([169.254.2.184]) by ALA-HCA.corp.ad.wrs.com ([147.11.189.40]) with mapi id 14.01.0255.000; Fri, 13 May 2011 11:56:19 -0700
From: "Borman, David" <david.borman@windriver.com>
To: William Allen Simpson <william.allen.simpson@gmail.com>
Thread-Topic: [tcpm] Sending data before ACK(SYN) receipt
Thread-Index: AQHMEZ9xjuQFI02EVkCyR3K8rdPp6w==
Date: Fri, 13 May 2011 18:56:19 +0000
Message-ID: <C939387E-77CB-4EA5-B847-77ECE4F8D497@windriver.com>
References: <4DCD3FBF.4040908@gmail.com>
In-Reply-To: <4DCD3FBF.4040908@gmail.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [172.25.44.8]
Content-Type: text/plain; charset="us-ascii"
Content-ID: <A9BA09145D269248BB83206599ED01AD@corp.ad.wrs.com>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: TCP Modifications WG <tcpm@ietf.org>
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 13 May 2011 18:56:24 -0000

On May 13, 2011, at 9:27 AM, William Allen Simpson wrote:

> This arose on the NANOG list, and I've not been able to figure out
> how it happens.  It does happen consistently, and seems relevant to
> this list.
>
> Linux 2.6.18 on the server side sometimes sends:
Is this the kernel.org version, or one modified by some distro?

>
> # sh capture debug-in
> 8 packets captured
>  1: 21:49:13.461554 client.32929 > server.3306: S
> 4107544000:4107544000(0) win 65535 <mss 1380,nop,wscale 3,sackOK,timestamp
> 2065216038 0>
>  2: 21:49:13.462073 server.3306 > client.32929: S
> 2601320299:2601320299(0) ack 4107544001 win 5792 <mss 1460,sackOK,timestamp
> 2581054349 2065216038,nop,wscale 7>
>  3: 21:49:13.462210 server.3306 > client.32929: P
> 2601320300:2601320363(63) ack 4107544001 win 46 <nop,nop,timestamp
> 2581054349 2065216038>
>  4: 21:49:13.519061 client.32929 > server.3306: . ack 2601320300
> win 8208 <nop,nop,timestamp 2065216096 2581054349>
>
> That is, the server sends data before receiving ACK(SYN) to complete the
> three-way handshake.  Cisco firewalls drop segment 3, as a violation of
> the "tcp-3whs-failed" rule.
>
>  5: 21:49:14.135384 client.32929 > server.3306: P
> 4107544001:4107544003(2) ack 2601320300 win 8208 <nop,nop,timestamp
> 2065216712 2581054349>
>  6: 21:49:14.135521 server.3306 > client.32929: . ack 4107544003
> win 46 <nop,nop,timestamp 2581055023 2065216712>
>  7: 21:49:16.461981 server.3306 > client.32929: P
> 2601320300:2601320363(63) ack 4107544003 win 46 <nop,nop,timestamp
> 2581057349 2065216712>
>  8: 21:49:16.618147 client.32929 > server.3306: . ack 2601320363
> win 8208 <nop,nop,timestamp 2065219195 2581057349>
>
> The good news is that it doesn't appear to be fatal, but stalls the
> connection for 2+ seconds until retransmit.  Egregiously contrary to
> intent of speeding up the connection.
>
> Anybody know how this was triggered?  I cannot find any documentation
> about this in the Linux stack.  I've tried comparing sysctls to current
> values, nothing obvious.

Just on the face of it from the packet traces, assuming this is
intentional, it looks like the server is being opportunistic in sending
the data, assuming that by the time the data packet gets there, the
client will have already processed the SYN/ACK and sent back an ACK
completing the 3 way handshake, transitioning to ESTABLISHED state, and
thus the data packet would appear to be perfectly valid and accepted.
Thus the server eliminates one RTT in starting up the connection.

Of course the problem is, as in this case, that the server has no way of
knowing whether the data packet was dropped, and if it was, the
connection has to wait for a timeout before the data is retransmitted.

Of course, the danger here is that if the initial SYN is bogus, the server
will have already sent out data before verifying connection establishment.

This would be no different than a scenario where a TCP opportunistically
sent data beyond the advertised window, on the assumption that by the
time the data arrived at the other side, it will have processed the
previous data and advanced the window and be able to accept the
additional data.  From the client side everything looks normal, other
than that the last new data showed up really fast after the window was
advanced.  And again, if the window *doesn't* advance, the opportunistic
data will be dropped, potentially leading to a timeout on the server,
slowing down the connection instead of speeding it up.

>
> Also, the server window and wscale seem a bit flaky, too....

I don't think so.  A default wscale=7 is reasonable, and in the data
packet the server scaled the 5792 window from the SYN packet to 46
(5792/(2^7)).  The initial window of 5792 seems to be roughly 4 ethernet
packets: (1500-52)*4 = 5792
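
The arithmetic above can be checked with a few lines. This is a sketch
under stated assumptions: the window and wscale values are copied from
the trace, while the split of the 52 header bytes into 20 (IPv4) + 20
(TCP) + 12 (timestamp option with two NOPs of padding) is my reading of
the "52", not something spelled out in the message.

```python
import math

SYN_WINDOW = 5792   # "win 5792" in the server's SYN-ACK (a SYN's window is never scaled)
WSCALE = 7          # "wscale 7" offered by the server
DATA_WINDOW = 46    # "win 46" in the server's later segments

MTU = 1500
HEADER_BYTES = 20 + 20 + 12  # assumed breakdown of the "52": IPv4 + TCP + timestamps

payload_per_packet = MTU - HEADER_BYTES
packets_in_window = SYN_WINDOW // payload_per_packet

exact_scaled = SYN_WINDOW / 2 ** WSCALE    # not a whole number
rounded_up = math.ceil(exact_scaled)       # matches the advertised field
effective_window = DATA_WINDOW << WSCALE   # bytes the scaled field represents

print(payload_per_packet, packets_in_window)
print(exact_scaled, rounded_up, effective_window)
```

Note that 5792/(2^7) is 45.25 rather than 46, so the advertised field
corresponds to 5888 bytes, slightly more than the SYN-ACK's unscaled
5792. The trace is consistent with rounding up, but the thread does not
say how the stack arrived at the value.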

			-David Borman

> _______________________________________________
> tcpm mailing list
> tcpm@ietf.org
> https://www.ietf.org/mailman/listinfo/tcpm


From ananth@cisco.com  Fri May 13 12:05:55 2011
Return-Path: <ananth@cisco.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id B7A91E086F for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 12:05:55 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -9.999
X-Spam-Level: 
X-Spam-Status: No, score=-9.999 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, J_CHICKENPOX_33=0.6, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id avJwsbfuZ8Y7 for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 12:05:55 -0700 (PDT)
Received: from sj-iport-1.cisco.com (sj-iport-1.cisco.com [171.71.176.70]) by ietfa.amsl.com (Postfix) with ESMTP id 08C41E086E for <tcpm@ietf.org>; Fri, 13 May 2011 12:05:55 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=cisco.com; i=ananth@cisco.com; l=841; q=dns/txt; s=iport; t=1305313554; x=1306523154; h=mime-version:content-transfer-encoding:subject:date: message-id:in-reply-to:references:from:to:cc; bh=b2YjGq99kZIZQCfflOBXYHbI+re01Gr4OefHl1a57Jw=; b=fMvtNNokO9KD8SNGSpVhqehlTd63rHsk4sC3nw1ZMe+59OOPrrgmW5Xn yM0Jt0W7RZ8w9o90OdNYJhyN2jpn3Y19v4QK3jdvkI2vGeKxdlA8odXTe 8qWdA9tjG7XwzsCI5wPj0vxyBSQEz0gDsXVqd3ui3dTB0z54n5LAjFJer A=;
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AvsEAIN/zU2rRDoI/2dsb2JhbACmCHeIcJ5fnX+GFQSGTY1xigVV
X-IronPort-AV: E=Sophos;i="4.64,366,1301875200"; d="scan'208";a="447433546"
Received: from mtv-core-3.cisco.com ([171.68.58.8]) by sj-iport-1.cisco.com with ESMTP; 13 May 2011 19:05:54 +0000
Received: from xbh-sjc-211.amer.cisco.com (xbh-sjc-211.cisco.com [171.70.151.144]) by mtv-core-3.cisco.com (8.14.3/8.14.3) with ESMTP id p4DJ5mXZ013046; Fri, 13 May 2011 19:05:54 GMT
Received: from xmb-sjc-21c.amer.cisco.com ([171.70.151.176]) by xbh-sjc-211.amer.cisco.com with Microsoft SMTPSVC(6.0.3790.4675);  Fri, 13 May 2011 12:05:53 -0700
X-Mimeole: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Fri, 13 May 2011 12:05:51 -0700
Message-ID: <0C53DCFB700D144284A584F54711EC580CB7D0EB@xmb-sjc-21c.amer.cisco.com>
In-Reply-To: <4DCD54FE.2040304@isi.edu>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: [tcpm] Sending data before ACK(SYN) receipt
Thread-Index: AcwRhpxyNjtrIrojSfyGHPJf3pIqZgAGR1mQ
References: <4DCD3FBF.4040908@gmail.com> <4DCD54FE.2040304@isi.edu>
From: "Anantha Ramaiah (ananth)" <ananth@cisco.com>
To: "Joe Touch" <touch@isi.edu>, "William Allen Simpson" <william.allen.simpson@gmail.com>
X-OriginalArrivalTime: 13 May 2011 19:05:53.0092 (UTC) FILETIME=[C7C50840:01CC11A0]
Cc: TCP Modifications WG <tcpm@ietf.org>
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 13 May 2011 19:05:55 -0000

<snip>

>  > Cisco firewalls drop segment 3, as a violation of
> > the "tcp-3whs-failed" rule.
>
> Perhaps. It's consistent with enforcing what the server should be
> doing,
> from RFC793 processing rules, though. I.e., it seems compliant, though
> not desired.

It could be a bogus SYN, which causes the server to send a SYN+ACK to the
client, which would solicit an RST. Typically the motive of the firewall
is to prevent such things from happening, hence it would drop the data
segment, since the connection state is not ESTAB until the ACK receipt.

Let's not get into an argument about whether middleboxes are evil etc.; I'm
just outlining the common practice :-) Also, let's not get too carried
away with the terms server/client: a firewall's intention is to protect
the box that is behind it, which could be anything.
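
The practice described above can be sketched as a per-connection
tracker. This is a toy model only: it is not how any Cisco product is
implemented, the class name HandshakeTracker and its flag strings are
invented for illustration, and retransmissions, RSTs, and sequence
checks are ignored.

```python
class HandshakeTracker:
    """Toy middlebox view of one TCP connection (hypothetical model)."""

    def __init__(self) -> None:
        self.syn_seen = False
        self.synack_seen = False
        self.established = False

    def permit(self, from_initiator: bool, flags: str) -> bool:
        """Return True if the segment is forwarded, False if dropped.

        flags uses one letter per TCP flag: "S", "SA", "A", "PA".
        """
        if self.established:
            return True
        if from_initiator:
            if flags == "S":
                self.syn_seen = True
                return True
            if "A" in flags and self.synack_seen:
                # The initiator's ACK of the SYN-ACK completes the handshake.
                self.established = True
                return True
            return False
        # Responder side: only a SYN-ACK is allowed before the final ACK.
        if flags == "SA" and self.syn_seen:
            self.synack_seen = True
            return True
        return False


# Replay segments 1-4 and 7 of the capture (client is the initiator).
fw = HandshakeTracker()
verdicts = [
    fw.permit(True, "S"),     # 1: client SYN
    fw.permit(False, "SA"),   # 2: server SYN-ACK
    fw.permit(False, "PA"),   # 3: server data before ACK(SYN)
    fw.permit(True, "A"),     # 4: client ACK
    fw.permit(False, "PA"),   # 7: server retransmission
]
print(verdicts)
```

Under this model only segment 3 is dropped, which matches the behaviour
reported for the firewall in this thread.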

-Anantha


From Donald.Smith@qwest.com  Fri May 13 12:42:01 2011
Return-Path: <Donald.Smith@qwest.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A6909E0824 for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 12:42:01 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.126
X-Spam-Level: 
X-Spam-Status: No, score=-2.126 tagged_above=-999 required=5 tests=[AWL=-0.127, BAYES_00=-2.599, J_CHICKENPOX_33=0.6]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id rmJ+foUKCRFw for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 12:42:01 -0700 (PDT)
Received: from suomp64i.qwest.com (suomp64i.qwest.com [155.70.16.237]) by ietfa.amsl.com (Postfix) with ESMTP id 176B0E07EF for <tcpm@ietf.org>; Fri, 13 May 2011 12:42:01 -0700 (PDT)
Received: from lxomavmpc030.qintra.com (lxomavmpc030.qintra.com [151.117.207.30]) by suomp64i.qwest.com (8.14.4/8.14.4) with ESMTP id p4DJfrsh020484 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 13 May 2011 14:41:53 -0500 (CDT)
Received: from lxomavmpc030.qintra.com (unknown [127.0.0.1]) by IMSA (Postfix) with ESMTP id 84C7E1E004D; Fri, 13 May 2011 14:41:48 -0500 (CDT)
Received: from suomp60i.qintra.com (unknown [10.6.10.61]) by lxomavmpc030.qintra.com (Postfix) with ESMTP id 69C161E003F; Fri, 13 May 2011 14:41:48 -0500 (CDT)
Received: from qtdenexhtm21.AD.QINTRA.COM (localhost [127.0.0.1]) by suomp60i.qintra.com (8.14.4/8.14.4) with ESMTP id p4DJfOOA022321; Fri, 13 May 2011 14:41:44 -0500 (CDT)
Received: from qtdenexmbm24.AD.QINTRA.COM ([151.119.91.226]) by qtdenexhtm21.AD.QINTRA.COM ([151.119.91.230]) with mapi; Fri, 13 May 2011 13:41:42 -0600
From: "Smith, Donald" <Donald.Smith@qwest.com>
To: "'Anantha Ramaiah (ananth)'" <ananth@cisco.com>, "'Joe Touch'" <touch@isi.edu>, "'William Allen Simpson'" <william.allen.simpson@gmail.com>
Date: Fri, 13 May 2011 13:41:41 -0600
Thread-Topic: [tcpm] Sending data before ACK(SYN) receipt
Thread-Index: AcwRhpxyNjtrIrojSfyGHPJf3pIqZgAGR1mQAAE2HNA=
Message-ID: <B01905DA0C7CDC478F42870679DF0F101096DEC63E@qtdenexmbm24.AD.QINTRA.COM>
References: <4DCD3FBF.4040908@gmail.com> <4DCD54FE.2040304@isi.edu> <0C53DCFB700D144284A584F54711EC580CB7D0EB@xmb-sjc-21c.amer.cisco.com>
In-Reply-To: <0C53DCFB700D144284A584F54711EC580CB7D0EB@xmb-sjc-21c.amer.cisco.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-CFilter-Loop: Reflected
Cc: 'TCP Modifications WG' <tcpm@ietf.org>
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 13 May 2011 19:42:01 -0000

Sharing: Author's permission required.
Donald.Smith@qwest.com


> -----Original Message-----
> From: tcpm-bounces@ietf.org [mailto:tcpm-bounces@ietf.org] On Behalf Of
> Anantha Ramaiah (ananth)
> Sent: Friday, May 13, 2011 1:06 PM
> To: Joe Touch; William Allen Simpson
> Cc: TCP Modifications WG
> Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
>
> <snip>
>
> >  > Cisco firewalls drop segment 3, as a violation of
> > > the "tcp-3whs-failed" rule.
> >
> > Perhaps. It's consistent with enforcing what the server should be
> > doing,
> > from RFC793 processing rules, though. I.e., it seems compliant,
> though
> > not desired.
Agreed
"Several examples of connection initiation follow.  Although these
  examples do not show connection synchronization using data-carrying
  segments, this is perfectly legitimate, so long as the receiving TCP
  doesn't deliver the data to the user until it is clear the data is
  valid (i.e., the data must be buffered at the receiver until the
  connection reaches the ESTABLISHED state)."

Servers that exhibit this behavior could be used in a spoofed source
address SYN flood with reflection and amplification, depending on how
much data is sent in the SYN/ACK response. So I would advise against it :)


>
> It could be a bogus SYN, which causes server to send SYN+ACK to the
> client which would solicit an RST. Typically motive of the firewall is
I believe your "bogus" SYN is a SYN with a spoofed source address, correct?

> to prevent such things from happening, hence it would drop the data
> segment since the connection state is not ESTAB until the ACK receipt.
>
> lets not get into an argument about middleboxes are evil etc., but just
> outlining the common practice :-) Also lets not get too much carried
> away with the terms server/client, a firewall's intention is to protect
> the box that is behind it, it could be anything.
>
> -Anantha


From william.allen.simpson@gmail.com  Fri May 13 13:50:25 2011
Return-Path: <william.allen.simpson@gmail.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 424BEE084D for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 13:50:25 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.599
X-Spam-Level: 
X-Spam-Status: No, score=-3.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 1EKDwFdMRbjV for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 13:50:24 -0700 (PDT)
Received: from mail-iy0-f172.google.com (mail-iy0-f172.google.com [209.85.210.172]) by ietfa.amsl.com (Postfix) with ESMTP id B36B1E0789 for <tcpm@ietf.org>; Fri, 13 May 2011 13:50:24 -0700 (PDT)
Received: by iyn15 with SMTP id 15so3069921iyn.31 for <tcpm@ietf.org>; Fri, 13 May 2011 13:50:24 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:message-id:date:from:user-agent:mime-version:to :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=IRl+AWQY90g95XEbK3mMpBVcO27AkVoS/NaBXpj9fbw=; b=qZOMzgZArtrOShxFqJlM1M3oxksHeJmH+pr70xt7+jfLHKCG4QKLyyrmL9jCp6xskA nFMk5KfBMgNOsFK9J73J4ePiNMITTXl7bRJ/3jbPRNt+Wk2fX93l6urVU6O4AY3kA36v l2yhz1+XQwZvBvP3Pi9OYp6LeRaiaMiecbsa4=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type:content-transfer-encoding; b=EVK3vAtTEzyP6HyliZFYvs2Pkh0cBBaAWLTldWJzVKaFxm3bwQso/yR/gvIiFybWD1 4T8uecl1AaIliBeC5BG63v3fA+zsmQX4tGDuSgokPNHaBI7ZELJ/tsvbdHkanFdroOfj EEFOmsyyMDgE7B8jPd9fQGklzjmOIequ9PUxQ=
Received: by 10.42.147.10 with SMTP id l10mr318231icv.314.1305319824194; Fri, 13 May 2011 13:50:24 -0700 (PDT)
Received: from Wastrel-3.local (c-68-40-194-239.hsd1.mi.comcast.net [68.40.194.239]) by mx.google.com with ESMTPS id c1sm1087753ibe.0.2011.05.13.13.50.22 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 13 May 2011 13:50:23 -0700 (PDT)
Message-ID: <4DCD998D.2050708@gmail.com>
Date: Fri, 13 May 2011 16:50:21 -0400
From: William Allen Simpson <william.allen.simpson@gmail.com>
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10
MIME-Version: 1.0
To: TCP Modifications WG <tcpm@ietf.org>
References: <E1QKwTj-0002R3-00@www.xplot.org>
In-Reply-To: <E1QKwTj-0002R3-00@www.xplot.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 13 May 2011 20:50:25 -0000

On 5/13/11 1:48 PM, Tim Shepard wrote:
> I've always understood that TCP (the protocol) allows data to travel
> with the SYN, and that the only important rule was that the TCP should
> not deliver any data up to a naive application without having first
> received an ACK of its own SYN (proving return routability, not a
> stale packet, etc).
>
We're in violent agreement.  See RFC-6013.

But that's not what's going on here.  This is a packet between the server
SYNACK and client ACK(SYN), more relevant to our discussions of
draft-cheng-tcpm-fastopen and draft-simpson-tcpct-rr.


> I remember watching Phil Karn's KA9Q NOS exchanging packets and seeing
> it do a full SMTP exchange with remarkably few packets, and I think it
> included data with the SYN in both directions.  This was all good
> stuff....
>
As I'm sure you remember, I was also an avid TCP-GROUP user and
developer for KA9Q (Net and later NOS).  :-)

From william.allen.simpson@gmail.com  Fri May 13 14:08:28 2011
Return-Path: <william.allen.simpson@gmail.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id AB7BDE0886 for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 14:08:28 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.599
X-Spam-Level: 
X-Spam-Status: No, score=-3.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Y5RuoYx5kSC7 for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 14:08:28 -0700 (PDT)
Received: from mail-iw0-f172.google.com (mail-iw0-f172.google.com [209.85.214.172]) by ietfa.amsl.com (Postfix) with ESMTP id E2DB9E086E for <tcpm@ietf.org>; Fri, 13 May 2011 14:08:27 -0700 (PDT)
Received: by iwn39 with SMTP id 39so3302808iwn.31 for <tcpm@ietf.org>; Fri, 13 May 2011 14:08:27 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:message-id:date:from:user-agent:mime-version:to :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=NtmULjTHyxUUs8doCa6KKpSZO5x0/X1kASy2uxzjY3s=; b=F+atS4WCAHRuJY/xeZiYFiOjTlLu0sGIpxq4QxqjHMbEjTsnzyjLyhg5wMa69kpJIM xP4h+fwlC8CKveaYLqOTai2RkHsqSoAz3iS3Oeyx/En8To6IptgR+5g5uiRjak15GTm+ FMTfVLmPQqEIcdlb4Gadpe1Org0eKQ4GSz0fU=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type:content-transfer-encoding; b=xsPvz82AtQ86oQidLBNFRJPJ+yGctXSMTouINSUkRZlryBJ82l/EF522HsskrVcCT0 cR95xHG32Wyq00+d+mF3+9e5VYnmcJXw65rWMSTDvCDdRVzISQ7JvhdDpnxU1r7ySq+j C9dnC/s5KDYBB89DpBcKklKM0whanczNI/9eY=
Received: by 10.42.197.71 with SMTP id ej7mr2464722icb.106.1305320907282; Fri, 13 May 2011 14:08:27 -0700 (PDT)
Received: from Wastrel-3.local (c-68-40-194-239.hsd1.mi.comcast.net [68.40.194.239]) by mx.google.com with ESMTPS id jv9sm993723icb.13.2011.05.13.14.08.25 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 13 May 2011 14:08:26 -0700 (PDT)
Message-ID: <4DCD9DC8.6020503@gmail.com>
Date: Fri, 13 May 2011 17:08:24 -0400
From: William Allen Simpson <william.allen.simpson@gmail.com>
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10
MIME-Version: 1.0
To: TCP Modifications WG <tcpm@ietf.org>
References: <4DCD3FBF.4040908@gmail.com> <C939387E-77CB-4EA5-B847-77ECE4F8D497@windriver.com>
In-Reply-To: <C939387E-77CB-4EA5-B847-77ECE4F8D497@windriver.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 13 May 2011 21:08:28 -0000

On 5/13/11 2:56 PM, Borman, David wrote:
>
> On May 13, 2011, at 9:27 AM, William Allen Simpson wrote:
>
>> This arose on the NANOG list, and I've not been able to figure out
>> how it happens.  It does happen consistently, and seems relevant to
>> this list.
>>
>> Linux 2.6.18 on the server side sometimes sends:
> Is this the kernel.org version, or one modified by some distro?
>
Red Hat 2.6.18-164.9.1.el5.  Not my machine, so I'm unsure.  They've not
found any modifications in usr/src, so I'm guessing it's plain vanilla.
Anybody here from Red Hat or LKML?


>> # sh capture debug-in
>> 8 packets captured
>>   1: 21:49:13.461554 client.32929>  server.3306: S
>> 4107544000:4107544000(0) win 65535<mss 1380,nop,wscale 3,sackOK,timestamp
>> 2065216038 0>
>>   2: 21:49:13.462073 server.3306>  client.32929: S
>> 2601320299:2601320299(0) ack 4107544001 win 5792<mss 1460,sackOK,timestamp
>> 2581054349 2065216038,nop,wscale 7>
>>   3: 21:49:13.462210 server.3306>  client.32929: P
>> 2601320300:2601320363(63) ack 4107544001 win 46<nop,nop,timestamp
>> 2581054349 2065216038>
>>   4: 21:49:13.519061 client.32929>  server.3306: . ack 2601320300
>> win 8208<nop,nop,timestamp 2065216096 2581054349>
>>
>> That is, the server sends data before receiving ACK(SYN) to complete the
>> three-way handshake.  Cisco firewalls drop segment 3, as a violation of
>> the "tcp-3whs-failed" rule.
>>
>>   5: 21:49:14.135384 client.32929>  server.3306: P
>> 4107544001:4107544003(2) ack 2601320300 win 8208<nop,nop,timestamp
>> 2065216712 2581054349>
>>   6: 21:49:14.135521 server.3306>  client.32929: . ack 4107544003
>> win 46<nop,nop,timestamp 2581055023 2065216712>
>>   7: 21:49:16.461981 server.3306>  client.32929: P
>> 2601320300:2601320363(63) ack 4107544003 win 46<nop,nop,timestamp
>> 2581057349 2065216712>
>>   8: 21:49:16.618147 client.32929>  server.3306: . ack 2601320363
>> win 8208<nop,nop,timestamp 2065219195 2581057349>
>>
>> The good news is that it doesn't appear to be fatal, but stalls the
>> connection for 2+ seconds until retransmit.  Egregiously contrary to
>> intent of speeding up the connection.
>>
>> Anybody know how this was triggered?  I cannot find any documentation
>> about this in the Linux stack.  I've tried comparing sysctls to current
>> values, nothing obvious.
>
> Just on the face of it from the packet traces, assuming this is intentional, it looks like the server is being opportunistic in sending the data, assuming that by the time the data packet gets there, the client will have already processed the SYN/ACK and sent back an ACK completing the 3 way handshake, transitioning to ESTABLISHED state, and thus the data packet would appear to be perfectly valid and accepted.  Thus the server eliminates one RTT in starting up the connection.
>
Yes.  But I've never seen this in my Ubuntu to *BSD testing, so my question is
how is this triggered?  Reportedly, this machine has no problems communicating
in-house, only across the Internet.  Maybe it's always on for the distro???


> Of course the problem is, as in this case, the server has no way of knowing if the data packet was dropped, and if so, causes the connection to timeout to retransmit the data.
>
> Of course, the danger here is if the initial SYN is bogus, the server will have already sent out data before verifying connection establishment.
>
I agree there are security holes.  I'm reporting, not advocating.  But it
negates any possibility of our related drafts working correctly in the field.


> This would be no different than a scenario where a TCP opportunistically sent data beyond the advertised window, on the assumption that by the time the data arrived at the other side, it will have processed the previous data and advanced the window and be able to accept the additional data.  From the client side everything looks normal, other than, that last new data showed up really fast after the window was advanced.  And again, if the window *doesn't* advance, the opportunistic data will be dropped, potentially leading to a timeout on the server, slowing down the connection instead of speeding it up.
>
True.  As observed!


>> Also, the server window and wscale seem a bit flaky, too....
>
> I don't think so.  A default wscale=7 is reasonable, and in the data packet the server scaled the 5792 window from the SYN packet to 46 (5792/(2^7)).  The initial window of 5792 seems to be roughly 4 ethernet packets: (1500-52)*4 = 5792
>
Aha, I miscalculated.  Thanks.
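
A quick check of the window arithmetic discussed above, as a small Python
sketch.  The breakdown of the 52 bytes of per-packet overhead (20 IP + 20 TCP
+ 12 for the padded timestamp option) is an assumption; the thread only gives
the total.  Note that 5792/2^7 is 45.25, so the 46 seen in the trace matches
the value rounded up, which unscales to 5888 rather than 5792.  Whether the
kernel rounds this way is not confirmed anywhere in the thread.

```python
import math

MTU = 1500
OVERHEAD = 52          # assumed: 20 (IP) + 20 (TCP) + 12 (timestamp option, padded)
WSCALE = 7             # window scale shift from the server's SYN-ACK

payload_per_packet = MTU - OVERHEAD           # bytes of data per full-size packet
initial_window = 4 * payload_per_packet       # unscaled window in the SYN-ACK

exact_scaled = initial_window / 2 ** WSCALE   # what a plain division gives
rounded_up = math.ceil(exact_scaled)          # matches "win 46" in segment 3
effective_window = rounded_up << WSCALE       # what "win 46" means once unscaled

print(payload_per_packet, initial_window, exact_scaled, rounded_up, effective_window)
```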

From touch@isi.edu  Fri May 13 14:08:42 2011
Return-Path: <touch@isi.edu>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 513BCE0891 for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 14:08:42 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -103.228
X-Spam-Level: 
X-Spam-Status: No, score=-103.228 tagged_above=-999 required=5 tests=[AWL=-0.629, BAYES_00=-2.599, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id AMwKdzbtv3Bq for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 14:08:41 -0700 (PDT)
Received: from vapor.isi.edu (vapor.isi.edu [128.9.64.64]) by ietfa.amsl.com (Postfix) with ESMTP id D3FB9E088F for <tcpm@ietf.org>; Fri, 13 May 2011 14:08:41 -0700 (PDT)
Received: from [128.9.160.166] (abc.isi.edu [128.9.160.166]) (authenticated bits=0) by vapor.isi.edu (8.13.8/8.13.8) with ESMTP id p4DL8Ui4025750 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT); Fri, 13 May 2011 14:08:30 -0700 (PDT)
Message-ID: <4DCD9DCE.5020404@isi.edu>
Date: Fri, 13 May 2011 14:08:30 -0700
From: Joe Touch <touch@isi.edu>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10
MIME-Version: 1.0
To: William Allen Simpson <william.allen.simpson@gmail.com>
References: <E1QKwTj-0002R3-00@www.xplot.org> <4DCD998D.2050708@gmail.com>
In-Reply-To: <4DCD998D.2050708@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-ISI-4-43-8-MailScanner: Found to be clean
X-MailScanner-From: touch@isi.edu
Cc: TCP Modifications WG <tcpm@ietf.org>
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 13 May 2011 21:08:42 -0000

On 5/13/2011 1:50 PM, William Allen Simpson wrote:
> On 5/13/11 1:48 PM, Tim Shepard wrote:
>> I've always understood that TCP (the protocol) allows data to travel
>> with the SYN, and that the only important rule was that the TCP should
>> not deliver any data up to a naive application without having first
>> received an ACK of its own SYN (proving return routability, not a
>> stale packet, etc).
>>
> We're in violent agreement. See RFC-6013.
>
> But that's not what's going on here. This is a packet between the server
> SYNACK and client ACK(SYN), more relevant to our discussions of
> draft-cheng-tcpm-fastopen and draft-simpson-tcpct-rr.

As others have pointed out, if the SYNACK arrives at the client *and* is 
ACK'd, then the client would be in ESTABLISHED and the data would be 
accepted.

I.e., the server behavior expects that this happens at the client. If 
not - e.g., if the SYNACK is lost - then you get a stall.

That's normal behavior.

However, the TCP spec says that the server shouldn't have been sending 
this data yet. Until it gets the final ACK, the only data should have 
been in the SYN.

So the firewall is compliant in dropping the data packet - as would be 
the client, if it did so. The result in that case is also a stall.

I.e., this isn't about data in the SYN; it's about when each side is 
permitted to start sending data after the SYN.
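
To make the rule concrete, here is a minimal Python sketch of the check a
stateful middlebox might apply to the trace from the original report.  The
tuple layout and the function name are hypothetical, and this is only a model
of the ordering rule, not of any actual firewall implementation.

```python
# Each entry: (segment number, sender, SYN flag, ACK flag, payload length),
# transcribed from the eight-segment trace in the original report.
TRACE = [
    (1, "client", True,  False, 0),
    (2, "server", True,  True,  0),
    (3, "server", False, True,  63),
    (4, "client", False, True,  0),
    (5, "client", False, True,  2),
    (6, "server", False, True,  0),
    (7, "server", False, True,  63),
    (8, "client", False, True,  0),
]


def early_server_data(trace):
    """Return the numbers of server data segments seen before the
    client's ACK of the SYN-ACK, i.e. before the handshake completed."""
    synack_seen = False
    handshake_done = False
    early = []
    for num, sender, syn, ack, length in trace:
        if sender == "server" and syn and ack:
            synack_seen = True
        elif sender == "client" and ack and not syn and synack_seen:
            handshake_done = True
        elif sender == "server" and length > 0 and not handshake_done:
            early.append(num)
    return early


print(early_server_data(TRACE))
```

On this trace the only segment flagged is number 3; the retransmission in
segment 7 arrives after the handshake has completed and passes.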

Joe

From shep@xplot.org  Fri May 13 14:10:13 2011
Return-Path: <shep@xplot.org>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id F381FE088B for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 14:10:12 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -0.005
X-Spam-Level: 
X-Spam-Status: No, score=-0.005 tagged_above=-999 required=5 tests=[AWL=-2.594, BAYES_00=-2.599, FRT_STOCK2=3.988, J_CHICKENPOX_46=0.6, J_CHICKENPOX_61=0.6]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id iwrOEgBA4Z+5 for <tcpm@ietfa.amsl.com>; Fri, 13 May 2011 14:10:11 -0700 (PDT)
Received: from www.xplot.org (www.xplot.org [66.92.66.146]) by ietfa.amsl.com (Postfix) with ESMTP id 0BFF9E086E for <tcpm@ietf.org>; Fri, 13 May 2011 14:10:11 -0700 (PDT)
Received: from shep (helo=alva.home) by www.xplot.org with local-esmtp (Exim 3.36 #1 (Debian)) id 1QKzce-0003VE-00 for <tcpm@ietf.org>; Fri, 13 May 2011 17:10:08 -0400
From: Tim Shepard <shep@alum.mit.edu>
To: TCP Modifications WG <tcpm@ietf.org>
In-reply-to: Your message of Fri, 13 May 2011 10:57:12 -0700. <4DCD70F8.8090203@isi.edu> 
Date: Fri, 13 May 2011 17:10:08 -0400
Message-Id: <E1QKzce-0003VE-00@www.xplot.org>
Sender: Tim Shepard <shep@xplot.org>
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 13 May 2011 21:10:13 -0000

> > I've always understood that TCP (the protocol) allows data to travel
> > with the SYN,
> 
> The issue is whether it allows separate data packets after the SYN but 
> before the SYN-ACK. There doesn't appear to be a rule against receiving 
> such packets, but the description clearly rules out generating them, AFAICT.
> 
> Joe

Ah, I completely misunderstood what this is about.

Now I've looked more closely at Bill's original message and I see the
interesting packet...

That is weird and interesting.

I peeked at mysql-5.1-5.1.49/server-tools/instance-manager/listener.cc
which appears to be the relevant user-mode source code (judging from
the server port number) and I see it is doing this:

  socket(AF_INET, SOCK_STREAM, 0)

  int arg=1;
  setsockopt( , SOL_SOCKET, SO_REUSEADDR,  &arg, sizeof arg)

  listen( , )

  int flags = fcntl( , F_GETFL, 0)
  fcntl( , F_SETFL, flags | O_NONBLOCK);

  int flags= fcntl( , F_GETFD, 0);
  fcntl( , F_SETFD, flags | FD_CLOEXEC);

  select(n , &reads, 0, 0, &tv);

  accept( , 0 , 0)

and I would think that the accept call would not return until the
connection gets out of SYN_RCVD state.
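
For anyone who wants to poke at this, the call sequence above can be sketched
in Python roughly as follows.  bind() is not in the list above but is needed
for the sketch to run; the loopback address, the ephemeral port, the greeting
bytes, and the helper names are all assumptions for illustration.  This is not
the MySQL code.

```python
import select
import socket


def make_listener(host="127.0.0.1", port=0):
    """Listening socket set up like the sequence above (plus bind)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))          # assumed step; port 0 picks a free port
    s.listen(5)
    s.setblocking(False)          # same effect as fcntl(F_SETFL, O_NONBLOCK)
    # Python sockets are already non-inheritable, which covers FD_CLOEXEC.
    return s


def accept_one(listener, timeout=2.0):
    """Wait with select() until a connection is ready, then accept() it."""
    readable, _, _ = select.select([listener], [], [], timeout)
    if not readable:
        return None
    conn, _addr = listener.accept()
    conn.setblocking(True)        # blocking mode of accepted sockets varies by OS
    return conn


def demo():
    """Connect over loopback and return what the client reads."""
    listener = make_listener()
    client = socket.create_connection(listener.getsockname(), timeout=2.0)
    conn = accept_one(listener)
    conn.sendall(b"greeting")     # application data is written only after accept()
    data = client.recv(64)
    for s in (conn, client, listener):
        s.close()
    return data
```

In this arrangement the application cannot write anything until accept() has
handed it a connected socket, which is why data ahead of the client's ACK
looks like something below the application.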

Weird.

Did linux really do this?

I'm wondering is there another middlebox involved that sent an ACK
back to the server?

(Bill, did you collect that trace on the server itself?)

I just tried to reproduce this on a linux server that I have access
to that already has mysql server listening on 3306, and I do not see
this weird behavior (kernel is 2.6.32-5-amd64 from Debian).  I did
"telnet server 3306" from home to trigger the connection, and
collected this trace on the server:

16:32:17.931509 IP client.52527 > server.3306: Flags [S], seq 1071911238, win 5840, options [mss 1460,sackOK,TS val 851353709 ecr 0,nop,wscale 7], length 0
16:32:17.931539 IP server.3306 > client.52527: Flags [S.], seq 1870570388, ack 1071911239, win 5792, options [mss 1460,sackOK,TS val 4291664564 ecr 851353709,nop,wscale 7], length 0
16:32:17.969949 IP client.52527 > server.3306: Flags [.], ack 1, win 46, options [nop,nop,TS val 851353719 ecr 4291664564], length 0
16:32:17.988568 IP server.3306 > client.52527: Flags [P.], seq 1:63, ack 1, win 46, options [nop,nop,TS val 4291664578 ecr 851353719], length 62
16:32:18.023935 IP client.52527 > server.3306: Flags [.], ack 63, win 46, options [nop,nop,TS val 851353733 ecr 4291664578], length 0
16:32:28.008378 IP server.3306 > client.52527: Flags [F.], seq 63, ack 1, win 46, options [nop,nop,TS val 4291667083 ecr 851353733], length 0
16:32:28.044627 IP client.52527 > server.3306: Flags [F.], seq 1, ack 64, win 46, options [nop,nop,TS val 851356238 ecr 4291667083], length 0
16:32:28.044647 IP server.3306 > client.52527: Flags [.], ack 2, win 46, options [nop,nop,TS val 4291667092 ecr 851356238], length 0

Note in my trace the ack of the server's SYN arrives before the
server sends those first 63 bytes.

I tried about a dozen more times, and never saw that data from the
server go out before the ack from the client.

Has anyone else managed to reproduce the behavior shown in the trace
Bill posted?

			-Tim Shepard
			 shep@alum.mit.edu

From iesg-secretary@ietf.org  Mon May 16 11:54:21 2011
Return-Path: <iesg-secretary@ietf.org>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id AE9BCE079D; Mon, 16 May 2011 11:54:21 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.544
X-Spam-Level: 
X-Spam-Status: No, score=-102.544 tagged_above=-999 required=5 tests=[AWL=0.055, BAYES_00=-2.599, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id SL+mnQgHPTb8; Mon, 16 May 2011 11:54:21 -0700 (PDT)
Received: from ietfa.amsl.com (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id BAC44E079A; Mon, 16 May 2011 11:54:20 -0700 (PDT)
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: The IESG <iesg-secretary@ietf.org>
To: IETF-Announce <ietf-announce@ietf.org>
X-Test-IDTracker: no
X-IETF-IDTracker: 3.54
Message-ID: <20110516185420.4247.53975.idtracker@ietfa.amsl.com>
Date: Mon, 16 May 2011 11:54:20 -0700
Cc: tcpm chair <tcpm-chairs@tools.ietf.org>, tcpm mailing list <tcpm@ietf.org>, RFC Editor <rfc-editor@rfc-editor.org>
Subject: [tcpm] Protocol Action: 'Computing TCP's Retransmission Timer' to Proposed Standard (draft-paxson-tcpm-rfc2988bis-02.txt)
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 16 May 2011 18:54:21 -0000

The IESG has approved the following document:
- 'Computing TCP's Retransmission Timer'
  (draft-paxson-tcpm-rfc2988bis-02.txt) as a Proposed Standard

This document is the product of the TCP Maintenance and Minor Extensions
Working Group.

The IESG contact person is Wesley Eddy.

A URL of this Internet Draft is:
http://datatracker.ietf.org/doc/draft-paxson-tcpm-rfc2988bis/




Technical Summary

This document defines the standard algorithm that Transmission
Control Protocol (TCP) senders are required to use to compute and
manage their retransmission timer.  It expands on the discussion in
section 4.2.3.1 of RFC 1122 and upgrades the requirement of
supporting the algorithm from a SHOULD to a MUST.  It also
modifies the initial RTO defined in RFC 2988.
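
As an illustration of the kind of computation involved, here is a minimal
Python sketch of the timer update, assuming the constants of the published
algorithm (alpha = 1/8, beta = 1/4, K = 4, a 1-second initial value and
floor).  The class and method names are hypothetical, and the clock
granularity and the 60-second ceiling are arbitrary choices made for the
sketch.

```python
class RtoEstimator:
    """Smoothed RTT / RTT variance estimator producing an RTO in seconds."""

    ALPHA = 1 / 8      # gain for the smoothed RTT
    BETA = 1 / 4       # gain for the RTT variance
    K = 4              # variance multiplier
    MIN_RTO = 1.0      # floor applied to every computed RTO
    MAX_RTO = 60.0     # assumed ceiling for this sketch

    def __init__(self, granularity=0.1):
        self.g = granularity      # assumed clock granularity G
        self.srtt = None
        self.rttvar = None
        self.rto = 1.0            # initial RTO before any RTT sample

    def update(self, r):
        """Feed one RTT measurement r (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement initializes both estimators.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # Variance is updated first, using the previous SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.g, self.K * self.rttvar)
        self.rto = min(max(rto, self.MIN_RTO), self.MAX_RTO)
        return self.rto

    def on_timeout(self):
        """Back off the timer after a retransmission timeout."""
        self.rto = min(self.rto * 2, self.MAX_RTO)
        return self.rto
```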


Working Group Summary

Nothing exceptional occurred during the working group process for this
document.


Document Quality

Implementations of the methods described in this document have been
running on the Internet for many years.  The change in the initial
RTO parameter from 3 s to 1 s has been implemented to a lesser extent,
though no significant issues have been found that would block wider
deployment, and Appendix A of the document contains substantial
motivation for and analysis of the change.


Personnel

Wesley Eddy (wes@mti-systems.com) is the document shepherd and
the responsible AD.


RFC Editor Note

- Please add to the document header that this obsoletes RFC 2988

- Please add to the end of the abstract:
  "This document obsoletes RFC 2988."

- Please add to the end of the introduction section:
  "This document obsoletes [RFC2988]."

- Please add a Section called "Changes from RFC 2988" which
  follows the IANA considerations and contains the text:

  This document reduces the initial RTO from the previous 3 seconds
  [RFC2988] to 1 second, unless the ACK of the SYN is lost in which
  case the default RTO is reverted to 3 seconds before data transmission
  begins.

- Please add to the document header that this updates RFC 1122

- Please add to the security considerations section:
  "The security considerations in [RFC5681] are also applicable
   to this document." 


From william.allen.simpson@gmail.com  Tue May 17 07:01:00 2011
Return-Path: <william.allen.simpson@gmail.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id BB7B7E081A for <tcpm@ietfa.amsl.com>; Tue, 17 May 2011 07:01:00 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.599
X-Spam-Level: 
X-Spam-Status: No, score=-3.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 7s9lUBhISyB9 for <tcpm@ietfa.amsl.com>; Tue, 17 May 2011 07:00:58 -0700 (PDT)
Received: from mail-iy0-f172.google.com (mail-iy0-f172.google.com [209.85.210.172]) by ietfa.amsl.com (Postfix) with ESMTP id C11C6E0773 for <tcpm@ietf.org>; Tue, 17 May 2011 07:00:57 -0700 (PDT)
Received: by iyn15 with SMTP id 15so579100iyn.31 for <tcpm@ietf.org>; Tue, 17 May 2011 07:00:57 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:message-id:date:from:user-agent:mime-version:to :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=GSy0n4Sn47xzP/bJTqD0ogPwF8P58RUoea3yXWfyqYM=; b=e1a4kMVJ6bsBWjKmwZ4FHDYV4UgH7NWKRd/F9rLXknuFmZhuBA4QOC1xvqytTEW5S0 V8HUaKf2ARBfN6QTRJf3xQOM/kjksh4/ROWxsBFMVb/2lhSJWIqizTWsA4Y8UbVJYNRF XLZur6v/qdW8uKWMBL38z/rw1/avFJAJNNdkE=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type:content-transfer-encoding; b=qdQMpLz76a0gzqtWQVgxCkPbx+x1itFk8BvPJWSdFjEal1JWsY9i29US02alB5kiNX kPMpEwXQGq8qxOFpmkTLQIClPJXaKvhwAKHoToHR0diY/TSLGHtoaiD3EklDoOGKnnSo NMoOqMT/RZMnaxngvGV9e5Kyt7x01Pqt7FSms=
Received: by 10.42.218.196 with SMTP id hr4mr659405icb.121.1305640857218; Tue, 17 May 2011 07:00:57 -0700 (PDT)
Received: from Wastrel-3.local (c-68-40-194-239.hsd1.mi.comcast.net [68.40.194.239]) by mx.google.com with ESMTPS id uh10sm237512icb.6.2011.05.17.07.00.55 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 17 May 2011 07:00:56 -0700 (PDT)
Message-ID: <4DD27F96.3050403@gmail.com>
Date: Tue, 17 May 2011 10:00:54 -0400
From: William Allen Simpson <william.allen.simpson@gmail.com>
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10
MIME-Version: 1.0
To: tcpm@ietf.org
References: <E1QKzce-0003VE-00@www.xplot.org>
In-Reply-To: <E1QKzce-0003VE-00@www.xplot.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 17 May 2011 14:01:00 -0000

On 5/13/11 5:10 PM, Tim Shepard wrote:
> Has anyone else managed to reproduce the behavior shown in the trace
> Bill posted?
>
Neither Tim nor I can reproduce this on current debian or Ubuntu.

This seems to be a new and undocumented feature of Red Hat Enterprise
Linux 5 released in an update late last year.  Or at least that's the
current best guess.

Are there any Red Hat folks here?

From L.Wood@surrey.ac.uk  Tue May 17 07:08:55 2011
Return-Path: <L.Wood@surrey.ac.uk>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 9FEB2E0773 for <tcpm@ietfa.amsl.com>; Tue, 17 May 2011 07:08:55 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -6.598
X-Spam-Level: 
X-Spam-Status: No, score=-6.598 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_MED=-4, UNPARSEABLE_RELAY=0.001]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Qe8rdYfxRZO9 for <tcpm@ietfa.amsl.com>; Tue, 17 May 2011 07:08:54 -0700 (PDT)
Received: from mail1.bemta14.messagelabs.com (mail1.bemta14.messagelabs.com [193.109.254.98]) by ietfa.amsl.com (Postfix) with ESMTP id 91054E081A for <tcpm@ietf.org>; Tue, 17 May 2011 07:08:54 -0700 (PDT)
Received: from [193.109.255.147:43684] by server-6.bemta-14.messagelabs.com id 6E/F7-03960-57182DD4; Tue, 17 May 2011 14:08:53 +0000
X-VirusChecked: Checked
X-Env-Sender: L.Wood@surrey.ac.uk
X-Msg-Ref: server-7.tower-72.messagelabs.com!1305641329!33829159!1
X-StarScan-Version: 6.2.16; banners=-,-,-
X-Originating-IP: [131.227.200.43]
Received: (qmail 31124 invoked from network); 17 May 2011 14:08:49 -0000
Received: from unknown (HELO EXHT022P.surrey.ac.uk) (131.227.200.43) by server-7.tower-72.messagelabs.com with AES128-SHA encrypted SMTP; 17 May 2011 14:08:49 -0000
Received: from EXMB01CMS.surrey.ac.uk ([169.254.1.151]) by EXHT022P.surrey.ac.uk ([131.227.200.43]) with mapi; Tue, 17 May 2011 15:08:52 +0100
From: <L.Wood@surrey.ac.uk>
To: <william.allen.simpson@gmail.com>, <tcpm@ietf.org>
Date: Tue, 17 May 2011 15:08:52 +0100
Thread-Topic: [tcpm] Sending data before ACK(SYN) receipt
Thread-Index: AcwUmvGNDngDmlweS2WHDB0QjHx57QAAO5Lw
Message-ID: <FD7B10366AE3794AB1EC5DE97A93A3732151620B21@EXMB01CMS.surrey.ac.uk>
References: <E1QKzce-0003VE-00@www.xplot.org> <4DD27F96.3050403@gmail.com>
In-Reply-To: <4DD27F96.3050403@gmail.com>
Accept-Language: en-US, en-GB
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
acceptlanguage: en-US, en-GB
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 17 May 2011 14:08:55 -0000

Between the oddities in redhat's gcc variant and its kernel, is it even
worth tracking down red hat problems?

> -----Original Message-----
> From: tcpm-bounces@ietf.org [mailto:tcpm-bounces@ietf.org] On
> Behalf Of William Allen Simpson
> Sent: 17 May 2011 15:01
> To: tcpm@ietf.org
> Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
>
> On 5/13/11 5:10 PM, Tim Shepard wrote:
> > Has anyone else managed to reproduce the behavior shown in the trace
> > Bill posted?
> >
> Neither Tim nor I can reproduce this on current debian or Ubuntu.
>
> This seems to be a new and undocumented feature of Red Hat
> Enterprise Linux 5 released in an update late last year.  Or
> at least that's the current best guess.
>
> Are there any Red Hat folks here?

From cabo@tzi.org  Tue May 17 08:17:09 2011
Return-Path: <cabo@tzi.org>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A0786E0665 for <tcpm@ietfa.amsl.com>; Tue, 17 May 2011 08:17:09 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -106.249
X-Spam-Level: 
X-Spam-Status: No, score=-106.249 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, HELO_EQ_DE=0.35, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id t4b9XahDLbrv for <tcpm@ietfa.amsl.com>; Tue, 17 May 2011 08:17:08 -0700 (PDT)
Received: from informatik.uni-bremen.de (mailhost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::12]) by ietfa.amsl.com (Postfix) with ESMTP id 56F08E0658 for <tcpm@ietf.org>; Tue, 17 May 2011 08:17:07 -0700 (PDT)
X-Virus-Scanned: amavisd-new at informatik.uni-bremen.de
Received: from smtp-fb3.informatik.uni-bremen.de (smtp-fb3.informatik.uni-bremen.de [134.102.224.120]) by informatik.uni-bremen.de (8.14.3/8.14.3) with ESMTP id p4HFGuL2024463; Tue, 17 May 2011 17:16:57 +0200 (CEST)
Received: from client-0035.vpn.uni-bremen.de (client-0035.vpn.uni-bremen.de [134.102.24.35]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by smtp-fb3.informatik.uni-bremen.de (Postfix) with ESMTPSA id 8FF7D18B; Tue, 17 May 2011 17:16:56 +0200 (CEST)
Mime-Version: 1.0 (Apple Message framework v1084)
Content-Type: text/plain; charset=us-ascii
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <4DCD9DCE.5020404@isi.edu>
Date: Tue, 17 May 2011 17:16:54 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <701CEDC4-0484-4825-A364-3D90D4750FD1@tzi.org>
References: <E1QKwTj-0002R3-00@www.xplot.org> <4DCD998D.2050708@gmail.com> <4DCD9DCE.5020404@isi.edu>
To: Joe Touch <touch@ISI.EDU>
X-Mailer: Apple Mail (2.1084)
Cc: TCP Modifications WG <tcpm@ietf.org>
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 17 May 2011 15:17:09 -0000

On May 13, 2011, at 23:08, Joe Touch wrote:

> However, the TCP spec says that the server shouldn't have been sending
> this data yet. Until it gets the final ACK, the only data should have
> been in the SYN.

It is not clear to me that the text in 793 isn't a classical case of
overspecification.

The *protocol* typically does not care -- the connecting peer will have
sent the final ACK before it gets the data, so it cannot distinguish the
deviant behavior from the specified one (unless it has a special delay
built in here).

The *service* (distributed abstract machine leaky abstraction...) indeed
does not want data handed down early to be leaking out the wire before
the three-way handshake completes.  But if the service user doesn't care
about that leak?  Then this is a valid optimization.  (Modulo the
amplification aspect, but I'm not sure this is really that relevant with
63? bytes.)  So a high-quality specification should be warning about the
effects of sending early (as in SHOULD NOT unless...), but not
outright outlaw it (MUST NOT).

It is not really the business of the middlebox to second-guess the
desires of the host.
(But of course we know many middleboxes are built to do exactly that.)
(And, of course, one way for the server to maintain the RTT optimization
in the presence of pesky second-guessing middleboxes is to send the data
*twice*, once immediately after the SYN-ACK and once the
handshake-completing ACK arrives.)

What I'm trying to say here is that deviant behavior that is
indistinguishable from compliant behavior is interoperable*).  The
interesting question is never whether you are compliant, but whether you
are interoperable.  The bad guy is not the server implementing an
interoperable optimization, but the middlebox preventing the
interoperable optimization under the pretext of non-compliance.

(Sorry for bringing up these philosophical points here, which mostly are
of import when writing new protocol specifications, not so much when
maintaining widely implemented ones.)

Gruesse, Carsten

*) ... and the existence of such behavior is a diagnostic for
overspecification.


From touch@isi.edu  Tue May 17 09:22:32 2011
Return-Path: <touch@isi.edu>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 315B7E071C for <tcpm@ietfa.amsl.com>; Tue, 17 May 2011 09:22:32 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -106.599
X-Spam-Level: 
X-Spam-Status: No, score=-106.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id UO0hNpr-HCT1 for <tcpm@ietfa.amsl.com>; Tue, 17 May 2011 09:22:31 -0700 (PDT)
Received: from boreas.isi.edu (boreas.isi.edu [128.9.160.161]) by ietfa.amsl.com (Postfix) with ESMTP id 92426E0658 for <tcpm@ietf.org>; Tue, 17 May 2011 09:22:31 -0700 (PDT)
Received: from [136.242.248.185] (host-248-185.cua.edu [136.242.248.185]) (authenticated bits=0) by boreas.isi.edu (8.13.8/8.13.8) with ESMTP id p4HGLhQD013081 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT); Tue, 17 May 2011 09:21:52 -0700 (PDT)
Message-ID: <4DD2A094.8020808@isi.edu>
Date: Tue, 17 May 2011 09:21:40 -0700
From: Joe Touch <touch@isi.edu>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10
MIME-Version: 1.0
To: Carsten Bormann <cabo@tzi.org>
References: <E1QKwTj-0002R3-00@www.xplot.org> <4DCD998D.2050708@gmail.com> <4DCD9DCE.5020404@isi.edu> <701CEDC4-0484-4825-A364-3D90D4750FD1@tzi.org>
In-Reply-To: <701CEDC4-0484-4825-A364-3D90D4750FD1@tzi.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-ISI-4-43-8-MailScanner: Found to be clean
X-MailScanner-From: touch@isi.edu
Cc: TCP Modifications WG <tcpm@ietf.org>
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 17 May 2011 16:22:32 -0000

Hi, Carsten,

On 5/17/2011 8:16 AM, Carsten Bormann wrote:
> On May 13, 2011, at 23:08, Joe Touch wrote:
>
>> However, the TCP spec says that the server shouldn't have been sending this data yet. Until it gets the final ACK, the only data should have been in the SYN.
>
> It is not clear to me that the text in 793 isn't a classical case of
> overspecification.

I do think there is a goal - of limiting the amount of data held until 
the 3WHS completes.

> The *protocol* typically does not care -- the connecting peer will
> have sent the final ACK before it gets the data, so it cannot
> distinguish the deviant behavior from the specified one (unless it
> has a special delay built in here).

First, that assumes no packets are lost. If the SYN-ACK is lost, or the 
final ACK is lost, then the behavior is different (and hiccups).

> The *service* (distributed abstract machine leaky abstraction...)
> indeed does not want data handed down early to be leaking out the
> wire before the three-way handshake completes.

See above for reasons why that affects the protocol - or at least its 
implementation.

> But if the service
> user doesn't care about that leak?  Then this is a valid
> optimization.

Right, but so is having a middlebox decide to delete that advance data 
to prevent the server from being overloaded until the connection completes.

So one person's optimization is another's overload. Then that person's 
protection is the first person's problem (creating a stall).

> (Modulo the amplification aspect, but I'm not sure
> this is really that relevant with 63? bytes.)  So a high-quality
> specification should be warning about the effects of sending early
> (as in SHOULD NOT unless...), but not outright outlaw it (MUST NOT).

The current focus is to be conservative - unsurprisingly. The danger of 
being aggressive is that when you fail, it costs more.

> It is not really the business of the middlebox to second-guess the desires of the host.

Yes, unless the middlebox is working on behalf of the host (e.g., under 
control of the admin, etc.). And agreed that some midbox managers 
*think* they're working on behalf of the host even when they're not...

> What I'm trying to say here is that deviant behavior that is indistinguishable from compliant behavior is interoperable*).

Yes, but so is the packet loss created by the midbox ;-) IP drops packets...

>   The bad guy is not the server implementing an interoperable optimization,

Who's 'bad' here - as in many cases - depends on what you're optimizing.
The fact that the server being aggressive sometimes shoots itself in the
foot is, IMO, just deserts.

Joe


From wwwrun@rfc-editor.org  Tue May 17 10:58:16 2011
Return-Path: <wwwrun@rfc-editor.org>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1DBBFE0823; Tue, 17 May 2011 10:58:16 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.454
X-Spam-Level: 
X-Spam-Status: No, score=-102.454 tagged_above=-999 required=5 tests=[AWL=0.146, BAYES_00=-2.599, NO_RELAYS=-0.001, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ouZwxn2RFIgk; Tue, 17 May 2011 10:58:15 -0700 (PDT)
Received: from rfc-editor.org (rfc-editor.org [IPv6:2001:1890:1112:1::2f]) by ietfa.amsl.com (Postfix) with ESMTP id 9F8E8E0822; Tue, 17 May 2011 10:58:15 -0700 (PDT)
Received: by rfc-editor.org (Postfix, from userid 30) id 995E7E0790; Tue, 17 May 2011 10:58:15 -0700 (PDT)
To: ietf-announce@ietf.org, rfc-dist@rfc-editor.org
From: rfc-editor@rfc-editor.org
Message-Id: <20110517175815.995E7E0790@rfc-editor.org>
Date: Tue, 17 May 2011 10:58:15 -0700 (PDT)
Cc: tcpm@ietf.org, rfc-editor@rfc-editor.org
Subject: [tcpm] RFC 6247 on Moving the Undeployed TCP Extensions RFC 1072, RFC 1106, RFC 1110, RFC 1145, RFC 1146, RFC 1379, RFC 1644, and RFC 1693 to Historic Status
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 17 May 2011 17:58:16 -0000

A new Request for Comments is now available in online RFC libraries.

        
        RFC 6247

        Title:      Moving the Undeployed TCP Extensions 
                    RFC 1072, RFC 1106, RFC 1110, 
                    RFC 1145, RFC 1146, RFC 1379, 
                    RFC 1644, and RFC 1693 to 
                    Historic Status 
        Author:     L. Eggert
        Status:     Informational
        Stream:     IETF
        Date:       May 2011
        Mailbox:    lars.eggert@nokia.com
        Pages:      4
        Characters: 6414
        Obsoletes:  RFC1072, RFC1106, RFC1110, RFC1145, 
                    RFC1146, RFC1379, RFC1644, RFC1693
        Updates:    RFC4614

        I-D Tag:    draft-eggert-tcpm-historicize-02.txt

        URL:        http://www.rfc-editor.org/rfc/rfc6247.txt

This document reclassifies several TCP extensions that have never
seen widespread use to Historic status.  The affected RFCs are RFC
1072, RFC 1106, RFC 1110, RFC 1145, RFC 1146, RFC 1379, RFC 1644, and
RFC 1693.  This document is not an Internet Standards Track 
specification; it is published for informational purposes.


This document is a product of the TCP Maintenance and Minor Extensions Working Group of the IETF.


INFORMATIONAL: This memo provides information for the Internet community.
It does not specify an Internet standard of any kind. Distribution of
this memo is unlimited.

This announcement is sent to the IETF-Announce and rfc-dist lists.
To subscribe or unsubscribe, see
  http://www.ietf.org/mailman/listinfo/ietf-announce
  http://mailman.rfc-editor.org/mailman/listinfo/rfc-dist

For searching the RFC series, see http://www.rfc-editor.org/rfcsearch.html.
For downloading RFCs, see http://www.rfc-editor.org/rfc.html.

Requests for special distribution should be addressed to either the
author of the RFC in question, or to rfc-editor@rfc-editor.org.  Unless
specifically noted otherwise on the RFC itself, all RFCs are for
unlimited distribution.


The RFC Editor Team
Association Management Solutions, LLC



From william.allen.simpson@gmail.com  Wed May 18 03:50:21 2011
Return-Path: <william.allen.simpson@gmail.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1A23BE06CA for <tcpm@ietfa.amsl.com>; Wed, 18 May 2011 03:50:21 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.599
X-Spam-Level: 
X-Spam-Status: No, score=-3.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id bdoMIHM5hp0Y for <tcpm@ietfa.amsl.com>; Wed, 18 May 2011 03:50:20 -0700 (PDT)
Received: from mail-iw0-f172.google.com (mail-iw0-f172.google.com [209.85.214.172]) by ietfa.amsl.com (Postfix) with ESMTP id 38208E06C9 for <tcpm@ietf.org>; Wed, 18 May 2011 03:50:20 -0700 (PDT)
Received: by iwn39 with SMTP id 39so1516168iwn.31 for <tcpm@ietf.org>; Wed, 18 May 2011 03:50:19 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:message-id:date:from:user-agent:mime-version:to :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=9ffpWevVOH2bZ5rvCYQO9IDocPNVVzXI803XvmyB2jU=; b=C7W/xrujy2FPQr5rkYFEuVYj5DzPZo54S8/lOMFqOzuXDd6tWSxI4vie4E1aVPd+p9 SNaLleNDoz2SWC2nae8mYkhdyMNdLmknR4A9K1+Iad6McBm1rxP2aPoR+kFPvw4T51Rr na0BDm1lE5jLF7zeopb8jz7YQh45f9ruqywVc=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type:content-transfer-encoding; b=s0NGqn2BgrbBsDrJw5Myr6JLwBcDbbyn3DoSjmG0PQt80cuqrgG8daStCPZFPeeoS0 9wPWPj/O2uqVhnT69INPzwWbBqfR8cfOaNqQpgZZlwuXChmhzv/wN+lXbQVr9t599bxf o4eIazY+0eK6MP/JTR19aoAx/TCa3hGSTz2ZQ=
Received: by 10.42.137.194 with SMTP id z2mr2290177ict.249.1305715819654; Wed, 18 May 2011 03:50:19 -0700 (PDT)
Received: from Wastrel-3.local (c-68-40-194-239.hsd1.mi.comcast.net [68.40.194.239]) by mx.google.com with ESMTPS id y10sm640964iba.63.2011.05.18.03.50.17 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 18 May 2011 03:50:18 -0700 (PDT)
Message-ID: <4DD3A468.4060403@gmail.com>
Date: Wed, 18 May 2011 06:50:16 -0400
From: William Allen Simpson <william.allen.simpson@gmail.com>
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10
MIME-Version: 1.0
To: TCP Modifications WG <tcpm@ietf.org>
References: <E1QKwTj-0002R3-00@www.xplot.org> <4DCD998D.2050708@gmail.com> <4DCD9DCE.5020404@isi.edu> <701CEDC4-0484-4825-A364-3D90D4750FD1@tzi.org> <4DD2A094.8020808@isi.edu>
In-Reply-To: <4DD2A094.8020808@isi.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [tcpm] Sending data before ACK(SYN) receipt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 18 May 2011 10:50:21 -0000

On 5/17/11 12:21 PM, Joe Touch wrote:
> Hi, Carsten,
>
> On 5/17/2011 8:16 AM, Carsten Bormann wrote:
>> On May 13, 2011, at 23:08, Joe Touch wrote:
>>
>>> However, the TCP spec says that the server shouldn't have been sending this data yet. Until it gets the final ACK, the only data should have been in the SYN.
>>
>> ...
>> It is not really the business of the middlebox to second-guess the desires of the host.
>
> Yes, unless the middlebox is working on behalf of the host (e.g., under control of the admin, etc.). And agreed that some midbox managers *think* they're working on behalf of the host even when they're not...
>
In this case, the middlebox firewall is under the control of the same
operator as the host, but the middlebox doesn't have a way of turning off
its behavior.  The middlebox *programmers* think they're working on
behalf of the host even when they're not....


>> What I'm trying to say here is that deviant behavior that is indistinguishable from compliant behavior is interoperable*).
>
> Yes, but so is the packet loss created by the midbox ;-) IP drops packets...
>
True, I think Dave Borman stated that pretty well earlier.


>> The bad guy is not the server implementing an interoperable optimization,
>
> Who's 'bad' here - as in many cases - depends on what you're optimizing. The fact that the server being aggressive sometimes shoots itself in the foot is, IMO, just deserts.
>
IMnsHO, I think both are "bad" here.  We've had the Google draft advocating
this early data, and my counter-draft.  But *we* understand that a change to
the TCP protocol needs to be negotiated securely as an option.

Red Hat (apparently) "just did it" without any negotiation.  That's bad.

Cisco shipped a firewall without the ability to turn off the behavior.
That's bad, too.

Thank goodness the operator brought the problem to NANOG!  (And this is
why we specify these as Experimental drafts: TCP needs widespread
testing against more than the usual consumer middleboxen.)

I'll add the following to my draft:

Applicability                                                            +

    This specification is intended for network paths under the complete   +
    control of an operator, such as secure tunnels or intra-campus        +
    private links.  Widely deployed security firewalls block the          +
    transmission of these additional data segments, and are outside the   +
    scope of this specification.                                          +



From iesg-secretary@ietf.org  Fri May 20 06:51:13 2011
Return-Path: <iesg-secretary@ietf.org>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C2461E078D; Fri, 20 May 2011 06:51:13 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.535
X-Spam-Level: 
X-Spam-Status: No, score=-102.535 tagged_above=-999 required=5 tests=[AWL=0.064, BAYES_00=-2.599, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id XAonqOcdNP7S; Fri, 20 May 2011 06:51:13 -0700 (PDT)
Received: from ietfa.amsl.com (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 19C62E0657; Fri, 20 May 2011 06:51:13 -0700 (PDT)
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: The IESG <iesg-secretary@ietf.org>
To: IETF-Announce <ietf-announce@ietf.org>
X-Test-IDTracker: no
X-IETF-IDTracker: 3.54
Message-ID: <20110520135113.16865.55178.idtracker@ietfa.amsl.com>
Date: Fri, 20 May 2011 06:51:13 -0700
Cc: tcpm@ietf.org
Subject: [tcpm] Last Call: <draft-ietf-tcpm-persist-04.txt> (Clarification of sender	behavior in persist condition.) to Informational RFC
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
Reply-To: ietf@ietf.org
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 20 May 2011 13:51:13 -0000

The IESG has received a request from the TCP Maintenance and Minor
Extensions WG (tcpm) to consider the following document:
- 'Clarification of sender behavior in persist condition.'
  <draft-ietf-tcpm-persist-04.txt> as an Informational RFC

The IESG plans to make a decision in the next few weeks, and solicits
final comments on this action. Please send substantive comments to the
ietf@ietf.org mailing lists by 2011-06-03. Exceptionally, comments may be
sent to iesg@ietf.org instead. In either case, please retain the
beginning of the Subject line to allow automated sorting.

Abstract


This document clarifies the Zero Window Probes (ZWP) described in
Requirements for Internet Hosts [RFC1122].  In particular, it
clarifies the actions that can be taken on connections which are
experiencing the ZWP condition.



The file can be obtained via
http://datatracker.ietf.org/doc/draft-ietf-tcpm-persist/

IESG discussion can be tracked via
http://datatracker.ietf.org/doc/draft-ietf-tcpm-persist/


No IPR declarations have been submitted directly on this I-D.



From thomas.r.henderson@boeing.com  Mon May 23 09:29:27 2011
Return-Path: <thomas.r.henderson@boeing.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A7605E07C9 for <tcpm@ietfa.amsl.com>; Mon, 23 May 2011 09:29:27 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -106.599
X-Spam-Level: 
X-Spam-Status: No, score=-106.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 1nUyuScKZRPU for <tcpm@ietfa.amsl.com>; Mon, 23 May 2011 09:29:27 -0700 (PDT)
Received: from slb-smtpout-01.boeing.com (slb-smtpout-01.boeing.com [130.76.64.48]) by ietfa.amsl.com (Postfix) with ESMTP id 2C50AE077A for <tcpm@ietf.org>; Mon, 23 May 2011 09:29:27 -0700 (PDT)
Received: from stl-av-01.boeing.com (stl-av-01.boeing.com [192.76.190.6]) by slb-smtpout-01.ns.cs.boeing.com (8.14.4/8.14.4/8.14.4/SMTPOUT) with ESMTP id p4NGT6uJ012572 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL); Mon, 23 May 2011 09:29:07 -0700 (PDT)
Received: from stl-av-01.boeing.com (localhost [127.0.0.1]) by stl-av-01.boeing.com (8.14.4/8.14.4/DOWNSTREAM_RELAY) with ESMTP id p4NGT69W019300; Mon, 23 May 2011 11:29:06 -0500 (CDT)
Received: from XCH-NWHT-02.nw.nos.boeing.com (xch-nwht-02.nw.nos.boeing.com [130.247.70.248]) by stl-av-01.boeing.com (8.14.4/8.14.4/UPSTREAM_RELAY) with ESMTP id p4NGT4qw019236 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=OK); Mon, 23 May 2011 11:29:05 -0500 (CDT)
Received: from XCH-NW-10V.nw.nos.boeing.com ([130.247.25.85]) by XCH-NWHT-02.nw.nos.boeing.com ([130.247.70.248]) with mapi; Mon, 23 May 2011 09:29:04 -0700
From: "Henderson, Thomas R" <thomas.r.henderson@boeing.com>
To: "'Scheffenegger, Richard'" <rs@netapp.com>
Date: Mon, 23 May 2011 09:29:04 -0700
Thread-Topic: [tcpm] I-D ACTION:draft-ietf-tcpm-rfc3782-bis-02.txt
Thread-Index: AcwMGzDl62cad8imSw69oEsATavTJACA/WdQAtFzjbA=
Message-ID: <7CC566635CFE364D87DC5803D4712A6C4CEED71853@XCH-NW-10V.nw.nos.boeing.com>
References: <20110506181503.17839.10589.idtracker@ietfa.amsl.com> <5FDC413D5FA246468C200652D63E627A0E387844@LDCMVEXC1-PRD.hq.netapp.com>
In-Reply-To: <5FDC413D5FA246468C200652D63E627A0E387844@LDCMVEXC1-PRD.hq.netapp.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: "gurtov@hiit.fi Gurtov" <gurtov@hiit.fi>, "floyd@acm.org" <floyd@acm.org>, "tcpm@ietf.org" <tcpm@ietf.org>
Subject: Re: [tcpm] I-D ACTION:draft-ietf-tcpm-rfc3782-bis-02.txt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 23 May 2011 16:29:27 -0000

> -----Original Message-----
> From: Scheffenegger, Richard [mailto:rs@netapp.com]
> Sent: Monday, May 09, 2011 1:15 AM
> To: Henderson, Thomas R
> Cc: tcpm@ietf.org
> Subject: RE: [tcpm] I-D ACTION:draft-ietf-tcpm-rfc3782-bis-02.txt
>
> Tom,
>
> Maybe a stupid question, but section 4.2 reads like some cross over
> with the Eifel RFC3522 / RFC4015 algorithms, which are encumbered by
> IPR - some partially open (but non-GPL) and commercial implementations
> appear to be forbidden for Eifel. [For completeness, this section was
> already in 3782].
>
> https://datatracker.ietf.org/ipr/171/
>
> How does this RFC3782bis circumvent this patent? (Without the special
> wording of the Ericsson IPR, a standard reciprocity IPR wouldn't pose
> any problem, for a standards-track document IMHO).
>
> Unfortunately, the IPR doesn't state the exact patents owned by
> Ericsson - but Eifel is generic enough that any heuristic making use of
> timestamps for loss recovery of some form is likely to be in violation
> of their claimed patents...

Richard,
Sorry for the delay in replying to your question.  I discussed this
offline with the NewReno authors and we believe that the use of
timestamps in RFC 3782 is quite different than the use in RFC 3522
(Eifel).

- Tom


From mattmathis@google.com  Wed May 25 04:07:04 2011
Return-Path: <mattmathis@google.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 94668E06FD for <tcpm@ietfa.amsl.com>; Wed, 25 May 2011 04:07:04 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -105.977
X-Spam-Level: 
X-Spam-Status: No, score=-105.977 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, FM_FORGED_GMAIL=0.622, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id yK5Ji0YpuB33 for <tcpm@ietfa.amsl.com>; Wed, 25 May 2011 04:07:04 -0700 (PDT)
Received: from smtp-out.google.com (smtp-out.google.com [74.125.121.67]) by ietfa.amsl.com (Postfix) with ESMTP id BE0BDE06D0 for <tcpm@ietf.org>; Wed, 25 May 2011 04:07:03 -0700 (PDT)
Received: from wpaz21.hot.corp.google.com (wpaz21.hot.corp.google.com [172.24.198.85]) by smtp-out.google.com with ESMTP id p4PB72Pw007077 for <tcpm@ietf.org>; Wed, 25 May 2011 04:07:02 -0700
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=google.com; s=beta; t=1306321622; bh=ShBMvsRajV7C+RvqBKcVwhCxu+o=; h=MIME-Version:Date:Message-ID:Subject:From:To:Content-Type: Content-Transfer-Encoding; b=uxmEN99h68kBVrJ70eVop0QShxlAbyHnZQa23+tbK0qrCGWHOWRHxhmp/QRcyJAFq OBewm/4Ynlk/H9PZqNv1g==
Received: from eyf5 (eyf5.prod.google.com [10.208.6.5]) by wpaz21.hot.corp.google.com with ESMTP id p4PB705x008220 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NOT) for <tcpm@ietf.org>; Wed, 25 May 2011 04:07:01 -0700
Received: by eyf5 with SMTP id 5so3203975eyf.15 for <tcpm@ietf.org>; Wed, 25 May 2011 04:07:00 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=beta; h=domainkey-signature:mime-version:date:message-id:subject:from:to :content-type:content-transfer-encoding; bh=ifYZdu55a2VcbXc4/J2g/mGRjM7SnBO9YwHKG2vO2iE=; b=lTirF8xNN8l6tHnHYWjKRLx41gZDva2BSLV/Va1mlAxGlabiAOQowqSg/tpe8hIwbb +NEqC4If/SbWulA0aDAA==
DomainKey-Signature: a=rsa-sha1; c=nofws; d=google.com; s=beta; h=mime-version:date:message-id:subject:from:to:content-type :content-transfer-encoding; b=WfvnuuXRPIFjH/isut7grFvy9tr+diDlJZm22iG/rrz+54848gnTh3K5WteMmHg/BK 6Gmyanlc6YEoELADdFIA==
MIME-Version: 1.0
Received: by 10.213.33.67 with SMTP id g3mr1598626ebd.13.1306321619855; Wed, 25 May 2011 04:06:59 -0700 (PDT)
Received: by 10.213.9.75 with HTTP; Wed, 25 May 2011 04:06:59 -0700 (PDT)
Date: Wed, 25 May 2011 07:06:59 -0400
Message-ID: <BANLkTikoKcmu-kseRG0h7MFLUMOxKy=3Bw@mail.gmail.com>
From: Matt Mathis <mattmathis@google.com>
To: TCP Maintenance and Minor Extensions WG <tcpm@ietf.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-System-Of-Record: true
Subject: [tcpm] Really odd WSCALE values
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 25 May 2011 11:07:04 -0000

Measurement Lab ( http://www.measurementlab.net/ ) public data shows a
really odd distribution of receive WSCALE values (below), ranging all
the way up to 14.  The data shows the number of client IP addresses
that invoked a test to some Measurement Lab node, with the given value
as the maximum WSCALE value.  NATs and other shared client IP addresses
only count as the largest winscale seen.  -1 indicates not negotiated,
as opposed to negotiated 0.  The data is for the month of February
2011.

WSCALE values of 9 and above are absurd.  For example at WSCALE of 10,
the maximum rwin value is 64MB and it is quantized in 1kB increments,
so an Ethernet frame is only 1 or 2 clicks.  Above 10, not even one
click per MSS....
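
The arithmetic above can be sketched in a few lines of Python.  This is
only an illustration: the function name window_limits and the 1460-byte
MSS (a typical value on Ethernet) are assumptions, not something taken
from the Measurement Lab data.

```python
# Window-scale arithmetic: the TCP header carries a 16-bit window field,
# and the negotiated shift multiplies it by 2**shift.

MAX_WINDOW_FIELD = 0xFFFF   # largest value the 16-bit field can hold
ASSUMED_MSS = 1460          # assumption: typical MSS on Ethernet


def window_limits(shift: int) -> tuple[int, int]:
    """Return (quantum, max_window) in bytes for a given window-scale shift."""
    quantum = 1 << shift                    # smallest step the window can change by
    max_window = MAX_WINDOW_FIELD << shift  # largest window that can be advertised
    return quantum, max_window


if __name__ == "__main__":
    for shift in (8, 10, 14):
        quantum, max_window = window_limits(shift)
        print(f"shift={shift:2d}  quantum={quantum:6d} B  "
              f"max_window={max_window / 2**20:8.1f} MiB  "
              f"clicks per MSS={ASSUMED_MSS / quantum:.2f}")
```

At a shift of 10 the quantum is 1024 bytes and the ceiling is just under
64 MiB, so a 1460-byte segment moves the window by one or two steps; at
a shift of 14 the quantum is 16 KiB, far larger than one segment.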

Why would somebody use such large scale values?  The population is
too large for it to be errant manual tuning...

Furthermore, this probably tickles a bug in RFC 1323 (you can't avoid
retracting the receiver window, if the flow is rwin controlled and the
read buffer is smaller than the rwin quanta).  However, it is hard to
imagine this being some sort of exploit.  Besides, why would it be
such a large number of distinct client addresses?
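
One way to see the retraction problem is a small Python sketch.  It
rests on stated assumptions: a hypothetical receiver with a fixed
buffer, an application that is not reading, and a window field that is
always rounded down to the scaled quantum.  The names advertised_window
and right_edge are invented for the illustration.

```python
# Sketch of receive-window quantization with a large window-scale shift.
# Assumption: the receiver advertises floor(free / 2**shift) in the
# 16-bit field, so the peer sees the free space rounded down.


def advertised_window(free_bytes: int, shift: int) -> int:
    """Window in bytes that the peer reconstructs from the 16-bit field."""
    field = min(free_bytes >> shift, 0xFFFF)  # low-order bits are dropped
    return field << shift


def right_edge(rcv_nxt: int, free_bytes: int, shift: int) -> int:
    """Highest sequence offset the peer is allowed to send up to."""
    return rcv_nxt + advertised_window(free_bytes, shift)


if __name__ == "__main__":
    shift = 14                # quantum of 16384 bytes
    buffer_bytes = 32768      # hypothetical receive buffer: two quanta
    segment = 1460            # assumed MSS-sized segment arrives, stays unread

    before = right_edge(0, buffer_bytes, shift)
    after = right_edge(segment, buffer_bytes - segment, shift)
    print(f"right edge before: {before}, after: {after}, "
          f"moved left by {before - after} bytes")

    # A buffer smaller than one quantum cannot be advertised at all.
    print("window for 10000 free bytes:", advertised_window(10000, shift))
```

Under these assumptions a single unread segment pulls the right edge
back by most of a quantum, and a buffer smaller than one quantum rounds
down to a zero window.  Rounding up instead would avoid the retraction
only by promising more space than the buffer has.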

Anybody have any ideas about what might be going on?

  WinScaleRcvd   number_of_tests

|           14 |            1568 |
|           13 |               9 |
|           12 |             484 |
|           11 |             722 |
|           10 |            2615 |
|            9 |            1273 |
|            8 |         2042885 |
|            7 |           56212 |
|            6 |           18036 |
|            5 |           16153 |
|            4 |           46208 |
|            3 |          470349 |
|            2 |         3147658 |
|            1 |          250646 |
|            0 |          316808 |
|           -1 |         5271154 |
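
For a sense of scale, the table can be tallied with a short Python
sketch.  The counts are copied from the table above; the helper names
count_at_least and share_percent are hypothetical.

```python
# Tally of the WinScaleRcvd table: how many client addresses fall at or
# above a given window-scale value, and what share of the total that is.

WSCALE_COUNTS = {
    14: 1568, 13: 9, 12: 484, 11: 722, 10: 2615, 9: 1273,
    8: 2042885, 7: 56212, 6: 18036, 5: 16153, 4: 46208,
    3: 470349, 2: 3147658, 1: 250646, 0: 316808,
    -1: 5271154,  # -1 means window scaling was not negotiated
}

TOTAL = sum(WSCALE_COUNTS.values())


def count_at_least(threshold: int) -> int:
    """Number of client addresses whose WSCALE is >= threshold."""
    return sum(n for scale, n in WSCALE_COUNTS.items() if scale >= threshold)


def share_percent(threshold: int) -> float:
    """Share of all client addresses with WSCALE >= threshold, in percent."""
    return 100.0 * count_at_least(threshold) / TOTAL


if __name__ == "__main__":
    for threshold in (9, 10):
        print(f"WSCALE >= {threshold}: {count_at_least(threshold)} of {TOTAL} "
              f"({share_percent(threshold):.3f}%)")
```

The large values are a small fraction of the whole, roughly five
hundredths of a percent, but that is still several thousand distinct
addresses.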

Any ideas?
Thanks,
--MM--
The best way to predict the future is to create it.  - Alan Kay

From shep@xplot.org  Wed May 25 06:00:28 2011
Return-Path: <shep@xplot.org>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1C7F5130030 for <tcpm@ietfa.amsl.com>; Wed, 25 May 2011 06:00:28 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.302
X-Spam-Level: 
X-Spam-Status: No, score=-1.302 tagged_above=-999 required=5 tests=[AWL=1.297,  BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id c7jfmsw0nsAv for <tcpm@ietfa.amsl.com>; Wed, 25 May 2011 06:00:27 -0700 (PDT)
Received: from www.xplot.org (www.xplot.org [66.92.66.146]) by ietfa.amsl.com (Postfix) with ESMTP id E29A713002E for <tcpm@ietf.org>; Wed, 25 May 2011 06:00:26 -0700 (PDT)
Received: from shep (helo=alva.home) by www.xplot.org with local-esmtp (Exim 3.36 #1 (Debian)) id 1QPDhI-0001BT-00; Wed, 25 May 2011 09:00:24 -0400
From: Tim Shepard <shep@alum.mit.edu>
To: Matt Mathis <mattmathis@google.com>
In-reply-to: Your message of Wed, 25 May 2011 07:06:59 -0400. <BANLkTikoKcmu-kseRG0h7MFLUMOxKy=3Bw@mail.gmail.com> 
Date: Wed, 25 May 2011 09:00:23 -0400
Message-Id: <E1QPDhI-0001BT-00@www.xplot.org>
Sender: Tim Shepard <shep@xplot.org>
Cc: TCP Maintenance and Minor Extensions WG <tcpm@ietf.org>
Subject: Re: [tcpm] Really odd WSCALE values
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 25 May 2011 13:00:28 -0000

> Why would somebody use such large scale values?    The population is
> too large for it to be errant manual tuning...

I expect that the 0.04 percent of the population in your data that had
wscale values larger than 9 happened because of manual tuning.  It has
been well enough known that on Linux, in order to saturate a link with
a significant delay-bandwidth product, you sometimes need to turn up
{r,w}mem_{max,default} in /proc/sys/net/{ipv4,core} to larger values,
and I believe that wscale is selected so that the 16-bit window field
can fully reflect the amount of buffer space available.  (But it has
been many years since I've found I need to do that sort of tuning by
hand.)

> WSCALE values of 9 and above are absurd.  For example at WSCALE of 10,
> the maximum rwin value is 64MB and it is quantized in 1kB increments,
> so an Ethernet frame is only 1 or 2 clicks.  Above 10, not even one
> click per MSS....

I don't understand what the problem is with large wscale values.
When wscale is large, you don't have to retract window as you move
through the sequence space, as long as the window you've advertised is
less than or equal to the window you have (the difference being the
low-order bits that don't get carried in the 16-bit window field).
When the buffer space allocated to the connection is larger, this
small difference will not matter.
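
A minimal Python sketch of the two points above, under assumptions: it
guesses at the general shape of such a selection (the smallest shift
that lets the 16-bit field cover the buffer, capped at 14) and is not
the actual Linux code.  The names select_wscale and unadvertised_bytes
are hypothetical.

```python
# Sketch: pick a window-scale shift from the buffer size, then look at
# how many low-order bytes the 16-bit window field cannot express.

MAX_SHIFT = 14          # largest shift permitted for the window scale option
MAX_WINDOW_FIELD = 0xFFFF


def select_wscale(buffer_bytes: int) -> int:
    """Smallest shift for which buffer_bytes >> shift fits in 16 bits."""
    shift = 0
    while (buffer_bytes >> shift) > MAX_WINDOW_FIELD and shift < MAX_SHIFT:
        shift += 1
    return shift


def unadvertised_bytes(free_bytes: int, shift: int) -> int:
    """Bytes of free space lost to rounding down (always < 2**shift)."""
    return free_bytes - ((free_bytes >> shift) << shift)


if __name__ == "__main__":
    for buffer_bytes in (65535, 4 * 2**20, 100_000_000, 2**30):
        shift = select_wscale(buffer_bytes)
        lost = unadvertised_bytes(buffer_bytes, shift)
        print(f"buffer={buffer_bytes:>10d} B  wscale={shift:2d}  "
              f"unadvertised={lost} B")
```

With a buffer of around 100 MB the selected shift is 11 and the bytes
that cannot be advertised are a negligible fraction of the buffer, which
is the sense in which the small difference does not matter for large
buffers.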


			-Tim Shepard
			 shep@alum.mit.edu

From rs@netapp.com  Wed May 25 07:29:36 2011
Return-Path: <rs@netapp.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id CE408130043 for <tcpm@ietfa.amsl.com>; Wed, 25 May 2011 07:29:36 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -8.299
X-Spam-Level: 
X-Spam-Status: No, score=-8.299 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, MANGLED_PILL=2.3, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id XNxlg-HvB5tp for <tcpm@ietfa.amsl.com>; Wed, 25 May 2011 07:29:36 -0700 (PDT)
Received: from mx3.netapp.com (mx3.netapp.com [217.70.210.9]) by ietfa.amsl.com (Postfix) with ESMTP id E8A6A130041 for <tcpm@ietf.org>; Wed, 25 May 2011 07:29:35 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="4.65,267,1304319600"; d="scan'208";a="256919907"
Received: from smtp3.europe.netapp.com ([10.64.2.67]) by mx3-out.netapp.com with ESMTP; 25 May 2011 07:29:34 -0700
Received: from ldcrsexc1-prd.hq.netapp.com (emeaexchrs.hq.netapp.com [10.65.251.109]) by smtp3.europe.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id p4PETYwZ002984 for <tcpm@ietf.org>; Wed, 25 May 2011 07:29:34 -0700 (PDT)
Received: from LDCMVEXC1-PRD.hq.netapp.com ([10.65.251.107]) by ldcrsexc1-prd.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.3959);  Wed, 25 May 2011 15:29:34 +0100
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Wed, 25 May 2011 15:29:01 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0E89AF66@LDCMVEXC1-PRD.hq.netapp.com>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
Thread-Index: AcwZWjQkUmZAuHUdQEuEdzeFEpjlNQBjb/iQ
From: "Scheffenegger, Richard" <rs@netapp.com>
To: <tcpm@ietf.org>
X-OriginalArrivalTime: 25 May 2011 14:29:34.0673 (UTC) FILETIME=[2B380C10:01CC1AE8]
Subject: [tcpm] FW: Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 25 May 2011 14:29:36 -0000

Bob's comments were intended to go to the list also...





From: Bob Briscoe [mailto:bob.briscoe@bt.com]
Sent: Monday, 23 May 2011 16:58
To: Scheffenegger, Richard; Mirja KUEHLEWIND
Cc: draft-scheffenegger-tcpm-timestamp-negotiation@tools.ietf.org;
tcmp@ietf.org
Subject: Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01

Richard, Mirja,

1/ Because you are introducing the idea of different types of
capability negotiation (e.g. range negotiation in Appendix A), I
think it would be useful to divide the signalling protocol into two
parts:
a) Stuff that has to be common to all types.
b) Stuff for the baseline type defined normatively in this doc

a) Common to all types:
* EXO (which distinguishes all versions from RFC1323)
* the two bits after EXO, which seem to be used like a type field
(but see later)

b) Stuff for the baseline type:
* MASK
* SGN
* EXP16
* FRAC16

Alternatively, given (offlist) you've said that MASK must be present
in all future types of capability negotiation, it could be placed
under category (a). However, I'm not sure how you know it will always
be needed before future types have all been invented.

3/ In the baseline capability negotiation, the sign (SGN) bit is
wasted, because it must be zero. As you're not using a float format
identical to IEEE 754-2008, you might as well admit you are not
precisely using the IEEE format and not bother with the sign either.
You could simply say that the sign bit is hidden outside the wire
protocol, as an implicit hard-coded sign defined in the specification.

4/ Could the MASK use up only 4 bits? Is anyone ever likely to want
to mask more than 2**4=16 bits, given you already say (Section 6.4)
that MASK>8 is discouraged?

Perhaps instead it would be better to use the following format:
        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |E| R |T|       |               |R|         |                   |
       |X| E |Y| MASK  |      RES      |E|  EXP16  |      FRAC16       |
       |O| S |P|       |               |S|         |                   |
       | |   |E|       |               | |         |                   |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The 2-bit reserved field before the type would be useful if the
number of types needed is eventually >2. Leaving it reserved allows
it to be used for some other purpose orthogonal to types, if nec.
Note I've called the third field 'TYPE', whereas in draft-01 you
called it a reserved flag, then called it RNG in the appendix.

(An alternative might have been to call it a version field, but that
wouldn't be quite right. For instance, in Appendix A, although the
TCP server understands what I would call type 1, it replies with a
type 0 option. This doesn't have the same semantics as versioning.)

Are you expecting all hosts that support type 0 to also support type
1 (RNG negotiation)? If not, you need to say what a type-0-only host
would reply to a RNG request.

5/ For range negotiation (RNG, Appendix A), why not use the same
FRAC12 field for the hi and lo end of the range, and only communicate
hi and lo EXP12 fields? Surely it's not so important to specify the
ends of the ranges precisely.

This would leave more space reserved for future stuff, when range
negotiation was also needed.

However, I'm not convinced range negotiation is important. Can you
motivate it?



Bob



________________________________________________________________
Bob Briscoe,                                BT Innovate & Design


From rs@netapp.com  Wed May 25 07:55:20 2011
Return-Path: <rs@netapp.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 0119C13006B for <tcpm@ietfa.amsl.com>; Wed, 25 May 2011 07:55:20 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -8.299
X-Spam-Level: 
X-Spam-Status: No, score=-8.299 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, MANGLED_PILL=2.3, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 9+h2gsmaH9+q for <tcpm@ietfa.amsl.com>; Wed, 25 May 2011 07:55:19 -0700 (PDT)
Received: from mx4.netapp.com (mx4.netapp.com [217.70.210.8]) by ietfa.amsl.com (Postfix) with ESMTP id D1CC413006A for <tcpm@ietf.org>; Wed, 25 May 2011 07:55:18 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="4.65,267,1304319600"; d="scan'208";a="250665200"
Received: from smtp3.europe.netapp.com ([10.64.2.67]) by mx4-out.netapp.com with ESMTP; 25 May 2011 07:55:17 -0700
Received: from amsrsexc1-prd.hq.netapp.com (amsrsexc1-prd.hq.netapp.com [10.64.251.107]) by smtp3.europe.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id p4PEtH2o008646; Wed, 25 May 2011 07:55:17 -0700 (PDT)
Received: from LDCMVEXC1-PRD.hq.netapp.com ([10.65.251.107]) by amsrsexc1-prd.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.3959);  Wed, 25 May 2011 16:55:17 +0200
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Wed, 25 May 2011 15:54:43 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0E89AF9C@LDCMVEXC1-PRD.hq.netapp.com>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
Thread-Index: AcwZWjQkUmZAuHUdQEuEdzeFEpjlNQBjb/iQAAAMrMA=
From: "Scheffenegger, Richard" <rs@netapp.com>
To: <tcpm@ietf.org>, "Anantha Ramaiah (ananth)" <ananth@cisco.com>, "Andrew McGregor" <andrewmcgr@gmail.com>
X-OriginalArrivalTime: 25 May 2011 14:55:17.0281 (UTC) FILETIME=[C2AF5910:01CC1AEB]
Subject: Re: [tcpm] Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 25 May 2011 14:55:20 -0000

Hi Bob, group,


I will post an updated version (-02) of this draft shortly.

A few of these observations were addressed already, but the majority is
still pending (but should be addressed in version -02).


I would also like to get feedback from the list on this comment by Bob:

> However, I'm not convinced range negotiation is important. Can you
> motivate it?

The only sensible reason I could come up with to have range negotiation
(appendix A, really intended as example of a future enhancement) would
be for hardware-accelerated TCP NICs:

For one-way delay measurement with two different timestamp clock rates,
the sender would have to do multiplications and divisions for every ACK
received. I believe this to be a costly operation for very high speed
(1-10 Gbps) and low delay (local/metro area) environments. With both
ends of a TCP session using an identical TCP timestamp clock rate
(frequency), one-way delay variance calculation can be done by simple
additions/subtractions. From my understanding, simple integer operations
are constant in time, and quite fast. That calculation would then not
add additional jitter into the one-way delay variance signal.
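
As a rough illustration of the difference (a sketch only, not the draft's algorithm; the function names and sample tick values are hypothetical, and 32-bit timestamp wraparound is ignored): with equal clock rates the change in one-way delay between consecutive packets needs only subtractions, whereas with unequal rates every sender timestamp must first be rescaled.

```python
def delay_variation_same_rate(ts_send, ts_recv):
    """Both clocks tick at the same rate: subtractions only.

    Returns the change in (receive - send) offset between consecutive
    packets, in ticks. The unknown clock offset between hosts cancels out.
    """
    offsets = [r - s for s, r in zip(ts_send, ts_recv)]
    return [b - a for a, b in zip(offsets, offsets[1:])]

def delay_variation_mixed_rate(ts_send, send_hz, ts_recv, recv_hz):
    """Different tick rates: one multiply and one divide per timestamp."""
    scaled = [s * recv_hz // send_hz for s in ts_send]
    return delay_variation_same_rate(scaled, ts_recv)

recv = [105, 115, 127]  # hypothetical receiver timestamps at 1000 Hz
print(delay_variation_same_rate([0, 10, 20], recv))           # sender at 1000 Hz
print(delay_variation_mixed_rate([0, 1, 2], 100, recv, 1000))  # sender at 100 Hz
```

Both calls describe the same packets, so they yield the same variation; only the second pays for the rescaling.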

Overall, I don't see a strong motivation to do range negotiation today.
Shall I keep this Appendix at all, then?



Richard Scheffenegger



> -----Original Message-----
> From: Scheffenegger, Richard
> Sent: Wednesday, 25 May 2011 16:29
> To: tcpm@ietf.org Extensions
> Subject: FW: Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
>
> Bob's comments were intended to go to the list also...
>
>
>
>
>
> From: Bob Briscoe [mailto:bob.briscoe@bt.com]
> Sent: Monday, 23 May 2011 16:58
> To: Scheffenegger, Richard; Mirja KUEHLEWIND
> Cc: draft-scheffenegger-tcpm-timestamp-negotiation@tools.ietf.org;
> tcmp@ietf.org
> Subject: Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
>
> Richard, Mirja,
>
> 1/ Because you are introducing the idea of different types of
> capability negotiation (e.g. range negotiation in Appendix A), I
> think it would be useful to divide the signalling protocol into two
> parts:
> a) Stuff that has to be common to all types.
> b) Stuff for the baseline type defined normatively in this doc
>
> a) Common to all types:
> * EXO (which distinguishes all versions from RFC1323)
> * the two bits after EXO, which seem to be used like a type field
> (but see later)
>
> b) Stuff for the baseline type:
> * MASK
> * SGN
> * EXP16
> * FRAC16
>
> Alternatively, given (offlist) you've said that MASK must be present
> in all future types of capability negotiation, it could be placed
> under category (a). However, I'm not sure how you know it will always
> be needed before future types have all been invented.
>
> 3/ In the baseline capability negotiation, the sign (SGN) bit is
> wasted, because it must be zero. As you're not using a float format
> identical to IEEE 754-2008, you might as well admit you are not
> precisely using the IEEE format and not bother with the sign either.
> You could simply say that the sign bit is hidden outside the wire
> protocol, as an implicit hard-coded sign defined in the specification.
>
> 4/ Could the MASK use up only 4 bits? Is anyone ever likely to want
> to mask more than 2**4=16 bits, given you already say (Section 6.4)
> that MASK>8 is discouraged?
>
> Perhaps instead it would be better to use the following format:
>         0                   1                   2                   3
>         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>        |E| R |T|       |               |R|         |                   |
>        |X| E |Y| MASK  |      RES      |E|  EXP16  |      FRAC16       |
>        |O| S |P|       |               |S|         |                   |
>        | |   |E|       |               | |         |                   |
>        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> The 2-bit reserved field before the type would be useful if the
> number of types needed is eventually >2. Leaving it reserved allows
> it to be used for some other purpose orthogonal to types, if nec.
> Note I've called the third field 'TYPE', whereas in draft-01 you
> called it a reserved flag, then called it RNG in the appendix.
>
> (An alternative might have been to call it a version field, but that
> wouldn't be quite right. For instance, in Appendix A, although the
> TCP server understands what I would call type 1, it replies with a
> type 0 option. This doesn't have the same semantics as versioning.)
>
> Are you expecting all hosts that support type 0 to also support type
> 1 (RNG negotiation)? If not, you need to say what a type-0-only host
> would reply to a RNG request.
>
> 5/ For range negotiation (RNG, Appendix A), why not use the same
> FRAC12 field for the hi and lo end of the range, and only communicate
> hi and lo EXP12 fields? Surely it's not so important to specify the
> ends of the ranges precisely.
>
> This would leave more space reserved for future stuff, when range
> negotiation was also needed.
>
> However, I'm not convinced range negotiation is important. Can you
> motivate it?
>
>
>
> Bob
>
>
>
> ________________________________________________________________
> Bob Briscoe,                                BT Innovate & Design


From perfgeek@mac.com  Wed May 25 08:14:05 2011
Return-Path: <perfgeek@mac.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2B6CCE07D5 for <tcpm@ietfa.amsl.com>; Wed, 25 May 2011 08:14:05 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.599
X-Spam-Level: 
X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id FGyW90dDkjBY for <tcpm@ietfa.amsl.com>; Wed, 25 May 2011 08:14:04 -0700 (PDT)
Received: from asmtpout013.mac.com (asmtpout013.mac.com [17.148.16.88]) by ietfa.amsl.com (Postfix) with ESMTP id A0141E0750 for <tcpm@ietf.org>; Wed, 25 May 2011 08:14:04 -0700 (PDT)
MIME-version: 1.0
Content-transfer-encoding: 7BIT
Content-type: text/plain; CHARSET=US-ASCII; format=flowed; delsp=yes
Received: from [192.168.1.101] (76-220-56-223.lightspeed.sntcca.sbcglobal.net [76.220.56.223]) by asmtp013.mac.com (Oracle Communications Messaging Exchange Server 7u4-20.01 64bit (built Nov 21 2010)) with ESMTPSA id <0LLR008XLBN49I10@asmtp013.mac.com> for tcpm@ietf.org; Wed, 25 May 2011 08:13:54 -0700 (PDT)
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.4.6813,1.0.148,0.0.0000 definitions=2011-05-25_05:2011-05-25, 2011-05-25, 1970-01-01 signatures=0
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 ipscore=0 suspectscore=0 phishscore=0 bulkscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx engine=6.0.2-1012030000 definitions=main-1105250079
Message-id: <6F54F861-782E-49ED-8AB7-75EA22CEF4EC@mac.com>
From: rick jones <perfgeek@mac.com>
To: Tim Shepard <shep@alum.mit.edu>
In-reply-to: <E1QPDhI-0001BT-00@www.xplot.org>
Date: Wed, 25 May 2011 08:13:52 -0700
References: <E1QPDhI-0001BT-00@www.xplot.org>
X-Mailer: Apple Mail (2.936)
Cc: TCP Maintenance and Minor Extensions WG <tcpm@ietf.org>, Matt Mathis <mattmathis@google.com>
Subject: Re: [tcpm] Really odd WSCALE values
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 25 May 2011 15:14:05 -0000

On May 25, 2011, at 6:00 AM, Tim Shepard wrote:

>> Why would somebody use such large scale values?    The population is
>> too large for it to be errant manual tuning...
>
> I expect that the 0.04 percent of the population in your data that had
> wscale values larger than 9 happened because of manual tuning.  It has
> been well-enough known that sometimes on linux in order to saturate a
> link with a significant delay-bandwidth product that you need to turn
> up {r,w}mem_{max,default} in /proc/sys/net/{ipv4,core} to larger
> values, and I believe that wscale is selected so that the 16-bit
> window field can fully reflect the amount of buffer space available.
> (But it has been many years since I've found I need to do that sort of
> tuning by hand.)

Having just seen, in another mailing list, an example of someone  
setting [rw]mem_max to 50 million bytes for some of their testing, and  
not "re-tweaking" it later when their test conditions changed, ("TCP  
will deal with it and do the right thing.") I am inclined to agree  
with Tim about the likelihood of it being manual tuning.  Perhaps  
something along the lines of "Ok, well does it get any better if I go  
balls-out on the window size?"

Although, if there were a way to try to "fingerprint" the ostensibly
offending IPs, that might be goodness... individuals doing slightly
screwy things can in fact have employers shipping kit.  Did these IPs
ever actually advertise a window out to the limit enabled by those
window scaling values?

Matt originally said
> WSCALE values of 9 and above are absurd.  For example at WSCALE of 10,
> the maximum rwin value is 64MB and it is quantized in 1kB increments,
> so an Ethernet frame is only 1 or 2 clicks.  Above 10, not even one
> click per MSS....

Window updates are "supposed" to be non-trivial fractions of the
window, or at least 2*MSS anyway, no?

One other thought - one can almost always advertise a smaller window  
regardless of WSCALE (yes, modulo the granularity imposed), but once  
the WSCALE is set that fixes an upper bound one cannot exceed.  In  
that sense, one might have a school of thought that believes that  
larger values of WSCALE are better than smaller ones - for the
flexibility they provide to, say, advertise more window when one
discovers the bandwidth delay product is large.  Perhaps an
application layer version of the (in)famous Linux autotuning.
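
The upper bound and the granularity Matt quotes both follow directly from the 16-bit window field. A rough sketch, assuming a 1460-byte MSS (a typical Ethernet value, not stated in the thread) and hypothetical helper names:

```python
MSS = 1460  # assumed: 1500-byte Ethernet MTU minus 40 bytes of IP/TCP headers

def max_window(wscale):
    """Largest window expressible in a 16-bit field at this scale."""
    return 0xFFFF << wscale

def granularity(wscale):
    """Size in bytes of one step ("click") of the window field."""
    return 1 << wscale

for wscale in range(15):
    print(f"wscale={wscale:2d}  max={max_window(wscale):>13,d} bytes  "
          f"step={granularity(wscale):>6,d} bytes  "
          f"steps per MSS={MSS / granularity(wscale):.2f}")
```

At WSCALE 10 this gives a maximum of 67,107,840 bytes (just under 64 MiB) in 1024-byte steps; from WSCALE 11 upward a single step is larger than the assumed MSS.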

rick jones
Wisdom teeth are impacted, people are affected by the effects of events


From dab@weston.borman.com  Wed May 25 09:38:15 2011
Return-Path: <dab@weston.borman.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C849BE0756 for <tcpm@ietfa.amsl.com>; Wed, 25 May 2011 09:38:15 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.599
X-Spam-Level: 
X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 47s+Me8NONse for <tcpm@ietfa.amsl.com>; Wed, 25 May 2011 09:38:14 -0700 (PDT)
Received: from frantic.weston.borman.com (frantic-dmz.weston.borman.com [206.196.54.22]) by ietfa.amsl.com (Postfix) with ESMTP id 9578EE074E for <tcpm@ietf.org>; Wed, 25 May 2011 09:38:14 -0700 (PDT)
Received: from [172.25.44.10] (weston-43.weston.borman.com [206.196.45.43]) by frantic.weston.borman.com (8.12.5/8.12.5) with ESMTP id p4PGc8mW013108; Wed, 25 May 2011 11:38:09 -0500 (CDT)
Mime-Version: 1.0 (Apple Message framework v1084)
Content-Type: text/plain; charset=us-ascii
From: David Borman <dab@weston.borman.com>
In-Reply-To: <BANLkTikoKcmu-kseRG0h7MFLUMOxKy=3Bw@mail.gmail.com>
Date: Wed, 25 May 2011 11:38:08 -0500
Content-Transfer-Encoding: quoted-printable
Message-Id: <1B4EBD15-A8CC-4ACC-9EB6-88DCE2DF6B7E@weston.borman.com>
References: <BANLkTikoKcmu-kseRG0h7MFLUMOxKy=3Bw@mail.gmail.com>
To: Matt Mathis <mattmathis@google.com>
X-Mailer: Apple Mail (2.1084)
Cc: TCP Maintenance and Minor Extensions WG <tcpm@ietf.org>
Subject: Re: [tcpm] Really odd WSCALE values
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 25 May 2011 16:38:15 -0000

On May 25, 2011, at 6:06 AM, Matt Mathis wrote:

> Measurement Lab ( http://www.measurementlab.net/ ) public data shows a
> really odd distribution of receive WSCALE values (below) ranging all
> the way up to 14.   The data shows the number of client IP addresses
> that invoked a test to some Measurement Lab node, with the given value
> the maximum WSCALE value.  NATs and other shared client IP addresses
> only count as the largest winscale seen.  -1 indicates not negotiated,
> as opposed to negotiated 0.  The data is for the month of 2011
> February.
>
> WSCALE values of 9 and above are absurd.  For example at WSCALE of 10,
> the maximum rwin value is 64MB and it is quantized in 1kB increments,
> so an Ethernet frame is only 1 or 2 clicks.  Above 10, not even one
> click per MSS....

I think that just having the distribution of WSCALE values is not that
useful.  What matters is the size of the window that is being offered
for a given WSCALE value.  As long as the window exceeds the WSCALE by
a reasonable amount, things should be fine.  If it doesn't, well, the
TCP should already deal with rounding up, even for small WSCALE values.
But yes, if the actual available window is small enough that the scaled
offered window can only be 0 or 1, then that might not be the best
configuration, but things should still work. :-)

			-David Borman

>
> Why would somebody use such large scale values?    The population is
> too large for it to be errant manual tuning...
>
> Furthermore, this probably tickles a bug in 1323  (you can't avoid
> retracting the receiver window, if the flow is rwin controlled and the
> read buffer is smaller than the rwin quanta).  However, it is hard to
> imagine this being some sort of exploit.   Besides, why would it be
> such a large number of distinct client addresses?
>
> Anybody have any ideas about what might be going on?
>
>   WinScaleRcvd   number_of_tests
>
> |          14 |            1568 |
> |          13 |               9 |
> |          12 |             484 |
> |          11 |             722 |
> |          10 |            2615 |
> |           9 |            1273 |
> |           8 |         2042885 |
> |           7 |           56212 |
> |           6 |           18036 |
> |           5 |           16153 |
> |           4 |           46208 |
> |           3 |          470349 |
> |           2 |         3147658 |
> |           1 |          250646 |
> |           0 |          316808 |
> |          -1 |         5271154 |
>
> Any ideas?
> Thanks,
> --MM--
> The best way to predict the future is to create it.  - Alan Kay
> _______________________________________________
> tcpm mailing list
> tcpm@ietf.org
> https://www.ietf.org/mailman/listinfo/tcpm


From mattmathis@google.com  Thu May 26 07:36:23 2011
Return-Path: <mattmathis@google.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A3713E0744 for <tcpm@ietfa.amsl.com>; Thu, 26 May 2011 07:36:23 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -105.677
X-Spam-Level: 
X-Spam-Status: No, score=-105.677 tagged_above=-999 required=5 tests=[AWL=-0.300, BAYES_00=-2.599, FM_FORGED_GMAIL=0.622, J_CHICKENPOX_33=0.6, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id LX-SBoPAjCIS for <tcpm@ietfa.amsl.com>; Thu, 26 May 2011 07:36:23 -0700 (PDT)
Received: from smtp-out.google.com (smtp-out.google.com [74.125.121.67]) by ietfa.amsl.com (Postfix) with ESMTP id 8DACAE074F for <tcpm@ietf.org>; Thu, 26 May 2011 07:36:22 -0700 (PDT)
Received: from wpaz21.hot.corp.google.com (wpaz21.hot.corp.google.com [172.24.198.85]) by smtp-out.google.com with ESMTP id p4QEaKJk022409 for <tcpm@ietf.org>; Thu, 26 May 2011 07:36:21 -0700
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=google.com; s=beta; t=1306420581; bh=d2m7G0YLTRr9tpxX/VuZsGxHi+k=; h=MIME-Version:In-Reply-To:References:Date:Message-ID:Subject:From: To:Cc:Content-Type; b=qU6elKoOvnwmUURLLeKjWuBF6w4j7KdrYbIqwDBvap3hkciufQO5lhOHQCeyE25W7 4PZNHOjblaH0ME5VDadOQ==
Received: from eyg7 (eyg7.prod.google.com [10.208.7.7]) by wpaz21.hot.corp.google.com with ESMTP id p4QEaIJd012507 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=NOT) for <tcpm@ietf.org>; Thu, 26 May 2011 07:36:19 -0700
Received: by eyg7 with SMTP id 7so415389eyg.13 for <tcpm@ietf.org>; Thu, 26 May 2011 07:36:18 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=beta; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=HQsocyr1xWyQzQnZ4TH596/HeYuNBrvPiygrF2f+YFc=; b=B+KsAoLP15BWknqKBM68z6U6oQq0GPqd15IL2lwHWZimCzqInJEV7dmqTRVpenHDA4 SLkx/nDWVO2blj9zElnw==
DomainKey-Signature: a=rsa-sha1; c=nofws; d=google.com; s=beta; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=o469/WHv9QIcZSKNgbNveTSh/P9niagTz1V5416I9hI8pMPEQeMyoNYlJHGm4/ADpE Bu4LrIOEmP4IuPbzaBQA==
MIME-Version: 1.0
Received: by 10.213.21.2 with SMTP id h2mr2007465ebb.51.1306420578190; Thu, 26 May 2011 07:36:18 -0700 (PDT)
Received: by 10.213.9.75 with HTTP; Thu, 26 May 2011 07:36:18 -0700 (PDT)
In-Reply-To: <E1QPDhI-0001BT-00@www.xplot.org>
References: <BANLkTikoKcmu-kseRG0h7MFLUMOxKy=3Bw@mail.gmail.com> <E1QPDhI-0001BT-00@www.xplot.org>
Date: Thu, 26 May 2011 10:36:18 -0400
Message-ID: <BANLkTi=7pgSqR9xo_CjgiO8pxisX5S0UpA@mail.gmail.com>
From: Matt Mathis <mattmathis@google.com>
To: Tim Shepard <shep@alum.mit.edu>
Content-Type: text/plain; charset=ISO-8859-1
X-System-Of-Record: true
Cc: David Borman <dab@weston.borman.com>, TCP Maintenance and Minor Extensions WG <tcpm@ietf.org>
Subject: Re: [tcpm] Really odd WSCALE values
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 26 May 2011 14:36:23 -0000

You may be correct about errant manual tuning.  I will have to think about it.

On Wed, May 25, 2011 at 9:00 AM, Tim Shepard <shep@alum.mit.edu> wrote:

> I don't understand what the problem is with large wscale values.
> When wscale is large, you don't have to retract window as you move
> through the sequence space, as long as the window you've advertised is
> less than or equal to the window you have (the difference being the
> low-order bits that don't get carried in the 16-bit window field).
> When the buffer space allocated to the connection is larger, this
> small difference will not matter.

My paraphrasing of the problem was not quite correct.   Try 2:

Define wextent to be the sequence number of the right edge of the receiver window:
wextent = seg.ack+(2^wscale)*seg.win
Suppose three conditions:
1) wscale too big, in this case I will assume 1 just to make the point
2) the receiver is not making progress.  e.g. it wants to announce a
constant wextent
3) the sender is sending data in smaller increments than (2^wscale).
e.g. 1 byte at a time, and waiting for each ack
In this case you will see an alternating pattern:
seg.ack advances by 1 causing wextent to advance by 1
seg.ack advances by 1, seg.win retracts by 1, causing wextent to retract by 1
There is nothing else that the receiver can do.......
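
The alternating pattern can be reproduced with a small simulation. This is a sketch under the three conditions above, with hypothetical names and sequence numbers (`right_edges`, an intended edge of 1000, a starting ack of 900): the receiver wants to hold wextent constant, the sender delivers one byte per step, and the receiver must pick a window field that is a whole number of 2^wscale units.

```python
def right_edges(wscale, target, start_ack, steps, round_up=True):
    """Return the wextent announced after each 1-byte advance of seg.ack."""
    unit = 1 << wscale
    edges = []
    for ack in range(start_ack + 1, start_ack + 1 + steps):
        remaining = target - ack  # bytes the receiver still wants to offer
        if round_up:
            win = -(-remaining // unit)  # ceiling division
        else:
            win = remaining // unit      # floor division
        edges.append(ack + unit * win)   # wextent = seg.ack + (2^wscale)*seg.win
    return edges

print(right_edges(wscale=1, target=1000, start_ack=900, steps=6))
print(right_edges(wscale=1, target=1000, start_ack=900, steps=6, round_up=False))
```

Whichever way the receiver rounds, the announced edge oscillates around the intended value, so every second ACK moves it backwards by one byte.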

Note that these conditions are quite easy to trigger momentarily with
any pipelined bidirectional protocol (e.g. http 1.1) on a busy server,
if there are requests that are smaller than the wscale increment.

So "MUST NOT retract the receiver window" is flat out incorrect and not possible.

I went back and looked, and RFC 793 & 1122 actually say reasonable
things (SHOULDs in the right places, etc) but are just a bit vague.  I
have a vague recollection of a spec someplace that strongly forbids
retracting the window at all, but I could not find it.  Can anybody
suggest other docs that might say such a thing?

Note that retracting the window at all also causes an ambiguity in the
definition of zero window, and in whether the very last byte is in or
out of window, for example at a firewall.  (If TCP is not doing Nagle,
the last data segment can extend beyond wextent of an ACK that it
crosses).

Dave, We had a conversation about this problem in the context of
1323bis a couple of years ago, but I lost track of where it went.
Since 1323 is the normative definition for wscale, it would at least
be a reasonable place to discuss this problem...

Thanks,
--MM--

From perfgeek@mac.com  Thu May 26 08:40:31 2011
Return-Path: <perfgeek@mac.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D4E78E0677 for <tcpm@ietfa.amsl.com>; Thu, 26 May 2011 08:40:31 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.599
X-Spam-Level: 
X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id DiejQrwPs7Ax for <tcpm@ietfa.amsl.com>; Thu, 26 May 2011 08:40:30 -0700 (PDT)
Received: from asmtpout015.mac.com (asmtpout015.mac.com [17.148.16.90]) by ietfa.amsl.com (Postfix) with ESMTP id A2ECEE0655 for <tcpm@ietf.org>; Thu, 26 May 2011 08:40:30 -0700 (PDT)
MIME-version: 1.0
Content-transfer-encoding: 7BIT
Content-type: text/plain; CHARSET=US-ASCII; format=flowed; delsp=yes
Received: from [192.168.1.101] (76-220-56-223.lightspeed.sntcca.sbcglobal.net [76.220.56.223]) by asmtp015.mac.com (Oracle Communications Messaging Exchange Server 7u4-20.01 64bit (built Nov 21 2010)) with ESMTPSA id <0LLT00IYO7JGNF80@asmtp015.mac.com> for tcpm@ietf.org; Thu, 26 May 2011 08:40:29 -0700 (PDT)
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.4.6813,1.0.148,0.0.0000 definitions=2011-05-26_03:2011-05-26, 2011-05-26, 1970-01-01 signatures=0
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 ipscore=0 suspectscore=4 phishscore=0 bulkscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx engine=6.0.2-1012030000 definitions=main-1105260054
Message-id: <E90E0F55-4A16-408C-B478-056BC1BF8542@mac.com>
From: rick jones <perfgeek@mac.com>
To: Matt Mathis <mattmathis@google.com>
In-reply-to: <BANLkTi=7pgSqR9xo_CjgiO8pxisX5S0UpA@mail.gmail.com>
Date: Thu, 26 May 2011 08:40:27 -0700
References: <BANLkTikoKcmu-kseRG0h7MFLUMOxKy=3Bw@mail.gmail.com> <E1QPDhI-0001BT-00@www.xplot.org> <BANLkTi=7pgSqR9xo_CjgiO8pxisX5S0UpA@mail.gmail.com>
X-Mailer: Apple Mail (2.936)
Cc: David Borman <dab@weston.borman.com>, TCP Maintenance and Minor Extensions WG <tcpm@ietf.org>, Tim Shepard <shep@alum.mit.edu>
Subject: Re: [tcpm] Really odd WSCALE values
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 26 May 2011 15:40:31 -0000

> I went back and looked, and RFC 793 & 1122 actually say reasonable
> things (SHOULDs in the right places, etc) but  just a bit vague.  I
> have a vague recollection of a spec someplace that strongly forbids
> retracting the window at all, but I could not find it.  Anybody
> suggest other docs that might say such a thing?

I have similarly vague recollections about retracting the window being  
considered poor form.  My dim memory thinks the reason was it  
triggered some sort of bug in one of the common stacks at the time.   
What I don't recall was if the issue was in retracting the window at  
all, or if it was in retracting the window beyond what the sender had  
already sent.

A *very* cursory search using "retract tcp window bug" finds this:
http://www.tcpipprotocols.info/bsd-4-3-terminates-zero-window-connection#comments

where someone mentions a possible BSD 4.2 bug.

rick jones
Wisdom teeth are impacted, people are affected by the effects of events


From david.borman@windriver.com  Thu May 26 09:34:59 2011
Return-Path: <david.borman@windriver.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 09AA9E073C for <tcpm@ietfa.amsl.com>; Thu, 26 May 2011 09:34:59 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -103.599
X-Spam-Level: 
X-Spam-Status: No, score=-103.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_LOW=-1, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id daFAhLjJDZqV for <tcpm@ietfa.amsl.com>; Thu, 26 May 2011 09:34:58 -0700 (PDT)
Received: from mail.windriver.com (mail.windriver.com [147.11.1.11]) by ietfa.amsl.com (Postfix) with ESMTP id 2E6ABE0717 for <tcpm@ietf.org>; Thu, 26 May 2011 09:34:57 -0700 (PDT)
Received: from ALA-HCB.corp.ad.wrs.com (ala-hcb [147.11.189.41]) by mail.windriver.com (8.14.3/8.14.3) with ESMTP id p4QGYoqX029275 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=FAIL); Thu, 26 May 2011 09:34:50 -0700 (PDT)
Received: from ALA-MBB.corp.ad.wrs.com ([169.254.2.184]) by ALA-HCB.corp.ad.wrs.com ([147.11.189.41]) with mapi id 14.01.0255.000; Thu, 26 May 2011 09:34:50 -0700
From: "Borman, David" <david.borman@windriver.com>
To: TCP Maintenance and Minor Extensions WG <tcpm@ietf.org>
Thread-Topic: Start of WGLC for draft-ietf-tcpm-rfc1948bis
Thread-Index: AQHMG8LVtu3LcaoxVUOf5xEU3d1gTQ==
Date: Thu, 26 May 2011 16:34:50 +0000
Message-ID: <2BD9239D-7DCE-4AC7-838F-BA915AABC756@windriver.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [172.25.34.13]
Content-Type: text/plain; charset="us-ascii"
Content-ID: <810D7178854E3E41874D92A031F400A2@corp.ad.wrs.com>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: "tsv-ads@tools.ietf.org" <tsv-ads@tools.ietf.org>, Fernando Gont <fernando@gont.com.ar>
Subject: [tcpm] Start of WGLC for draft-ietf-tcpm-rfc1948bis
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 26 May 2011 16:34:59 -0000

The author of draft-ietf-tcpm-rfc1948bis believes that the document is
ready for Working Group Last Call.  It has been a month since there was
anything new on the mailing list with regards to draft-ietf-tcpm-rfc1948bis.

We are starting the WGLC for this document, to end on Friday, June 10.
Please send any comments on the document to the mailing list and the
author before that time.

			-David Borman & Michael Scharf, TCPM WG co-chairs


From rs@netapp.com  Fri May 27 10:12:25 2011
Return-Path: <rs@netapp.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 49725E0820 for <tcpm@ietfa.amsl.com>; Fri, 27 May 2011 10:12:25 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -7.999
X-Spam-Level: 
X-Spam-Status: No, score=-7.999 tagged_above=-999 required=5 tests=[AWL=-0.300, BAYES_00=-2.599, J_CHICKENPOX_23=0.6, MANGLED_PILL=2.3, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id fkX8i4P4MNBk for <tcpm@ietfa.amsl.com>; Fri, 27 May 2011 10:12:21 -0700 (PDT)
Received: from mx3.netapp.com (mx3.netapp.com [217.70.210.9]) by ietfa.amsl.com (Postfix) with ESMTP id 6A0CAE0824 for <tcpm@ietf.org>; Fri, 27 May 2011 10:12:20 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="4.65,281,1304319600"; d="scan'208";a="257228031"
Received: from smtp3.europe.netapp.com ([10.64.2.67]) by mx3-out.netapp.com with ESMTP; 27 May 2011 10:12:18 -0700
Received: from amsrsexc1-prd.hq.netapp.com (emeaexchrs.hq.netapp.com [10.64.251.107]) by smtp3.europe.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id p4RHCIcK015207 for <tcpm@ietf.org>; Fri, 27 May 2011 10:12:18 -0700 (PDT)
Received: from LDCMVEXC1-PRD.hq.netapp.com ([10.65.251.107]) by amsrsexc1-prd.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.3959);  Fri, 27 May 2011 19:12:18 +0200
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Fri, 27 May 2011 18:11:45 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0E99BFCE@LDCMVEXC1-PRD.hq.netapp.com>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: Tech points:  draft-scheffenegger-tcpm-timestamp-negotiation-01
Thread-Index: AcwbmXPOtk3tRWb6SliLLwtqFO3dDAA96OJA
From: "Scheffenegger, Richard" <rs@netapp.com>
To: <tcpm@ietf.org>
X-OriginalArrivalTime: 27 May 2011 17:12:18.0357 (UTC) FILETIME=[3BA77E50:01CC1C91]
Subject: [tcpm] FW: Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 27 May 2011 17:12:25 -0000

Intended to include the list!

-----Original Message-----
From: Bob Briscoe [mailto:bob.briscoe@bt.com]
Sent: Donnerstag, 26. Mai 2011 13:38
To: Scheffenegger, Richard
Cc: Mirja KUEHLEWIND;
draft-scheffenegger-tcpm-timestamp-negotiation@tools.ietf.org;
tcmp@ietf.org
Subject: RE: Tech points:
draft-scheffenegger-tcpm-timestamp-negotiation-01

Richard,

I should have said,... I intended to split discussion into three parts:
- technical
- structural / clarity
- editorial details.

This is the tech part. It reviews the published -01 draft. I've
shifted other stuff to a different thread.

inline...

At 15:23 25/05/2011, Scheffenegger, Richard wrote:

> >
> > 1/ Because you are introducing the idea of different types of
> > capability negotiation (e.g. range negotiation in Appendix A), I
> > think it would be useful to divide the signalling protocol into two
> > parts:
> > a) Stuff that has to be common to all types.
> > b) Stuff for the baseline type defined normatively in this doc
> >
> > a) Common to all types:
> > * EXO (which distinguishes all versions from RFC1323)
> > * the two bits after EXO, which seem to be used like a type field
> > (but see later)
> >
> > b) Stuff for the baseline type:
> > * MASK
> > * SGN
> > * EXP16
> > * FRAC16
> >
> > Alternatively, given (offlist) you've said that MASK must be present
> > in all future types of capability negotiation, it could be placed
> > under category (a). However, I'm not sure how you know it will always
> > be needed before future types have all been invented.
>
>The reason for mask to be present would be to fence off parts of TS.val
>that must be excluded from PAWS processing on the receiver side. That
>was also in the context of allowing a version0 receiver to reply always
>with version0, even if it can't process the timestamp value for
>anything but PAWS (excluding the masked bits, where the monotonic
>increase property can be broken by the sender). A future version may
>specify what is in that opaque part for a compatible receiver still...

My question was whether there could be a version where the sender
ensures timestamps monotonically increase, so MASK would not be
necessary.

I suspect I have not fully understood what the underlying need for
MASK is. The doc seems to hint at two:
a) when not echoing timestamps to the rules in RFC1323, they might
not monotonically increase
b) some senders add entropy to the low order bits of timestamps to
protect against malicious tweaking of TSecr values

Am I correct?

My thinking is that b) would not be necessary in trusted
environments. I'm now not sure whether a) will always be a requirement.

> > 3/ In the baseline capability negotiation, the sign (SGN) bit is
> > wasted, because it must be zero. As you're not using a float format
> > identical to IEEE 754-2008, you might as well admit you are not
> > precisely using the IEEE format and not bother with the sign either.
> > You could simply say that the sign bit is hidden outside the wire
> > protocol, as an implicit hard-coded sign defined in the specification.
>
>correct; After thinking about this a bit more (I departed from IEEE-754
>originally but modified it already as you noted), that bit would serve
>best when re-shuffled for 1 bit higher precision.

I would argue it's OK to depart from IEEE-754 in some ways but not
others:

- Making the sign bit always 0 and outside the wire protocol is OK,
because implementers are unlikely to get it wrong

- Adding 1 bit more precision to binary16 float representation is NOT
OK, because implementers will have to tweak low level arithmetic code
rather than using common libraries. Given the level of bugs that have
recently been found even in really simple things like ECN or SACK, it
is important to keep all changes really really simple.

>The only special value
>would be all-zero (this is included in the PDF, I hope you reviewed
>that text) to indicate the timestamp values are not related to
>wall-clock time.

Yes, I noticed that.

In case anyone listening on the list is confused, Richard sent me an
offlist preview of the proposed next rev, in PDF format.


> > 4/ Could the MASK use up only 4 bits? Is anyone ever likely to want
> > to mask more than 2**4=16 bits, given you already say (Section 6.4)
> > that MASK>8 is discouraged.
>
>The mask field would make up to 31 bits completely opaque to the
>receiver (thus exclude those bits also from PAWS receiver-side
>processing, lifting any constraints like the monotonic increasing
>values - PAWS would need that).
>
>The unstated reason for the full 5 bits was to cover up to 1/4th of the
>TS value, even when TCPCT (RFC6013) with the 64 / 128 bit timestamps
>are in use, or if the sender wants to have a true hash value (segment
>digest) with each segment, or when the future contents of TS.val would
>be at odds with PAWS processing...

Understood. Will you be explaining this reasoning in the next rev?

I assume 1/4 is a member of the set of made-up numbers, see
<http://search.dilbert.com/comic/Made%20Up%20Numbers> ;)


>Also, a future version could define some use of these masked out bits,
>but remain compatible with a version 0 receiver. Such a receiver would
>treat the TS.val as opaque entity again, and only the immediate echoing
>would remain as compatibility with future instances...

Again, I think this discussion shows that it would be useful to talk
in the draft about what will be needed for any version and what will
be only in some.

The WG needs to be able to decide where to draw the line between
features that might be nice (mission creep), and features that are
necessary.


> > Perhaps instead it would be better to use the following format:
> >    0                   1                   2                   3
> >    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> >    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >    |E| R |T|       |               |R|         |                   |
> >    |X| E |Y| MASK  |      RES      |E|  EXP16  |      FRAC16       |
> >    |O| S |P|       |               |S|         |                   |
> >    | |   |E|       |               | |         |                   |
> >    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> > The 2-bit reserved field before the type would be useful if the
> > number of types needed is eventually >2. Leaving it reserved allows
> > it to be used for some other purpose orthogonal to types, if nec.
> > Note I've called the third field 'TYPE', whereas in draft-01 you
> > called it a reserved flag, then called it RNG in the appendix.
>
>RNG is no longer in the draft (the PDF version I sent you via email),
>that was in version -01.

Yup, I had noticed that. But, because I cc'd my email to the list, I
tried to relate the discussion to draft-01 that everyone has seen,
not the offlist preview you sent me.
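
As an illustration of the bit layout in the quoted diagram above, the
following Python sketch unpacks a 32-bit word into the proposed fields.
It assumes bit 0 in the diagram is the most significant bit (the usual
RFC diagram convention). The names TsOptFields and unpack_fields are
made up for this example and do not come from the draft.

```python
from typing import NamedTuple


class TsOptFields(NamedTuple):
    """Fields of the proposed layout; names are illustrative only."""
    exo: int     # bit 0: distinguishes all versions from RFC1323
    res2: int    # bits 1-2: reserved
    type_: int   # bit 3: TYPE (also called kind/version in this thread)
    mask: int    # bits 4-7: MASK (4 bits in this proposal)
    res8: int    # bits 8-15: reserved
    res1: int    # bit 16: labelled RES in the diagram
    exp16: int   # bits 17-21: EXP16 (5 bits)
    frac16: int  # bits 22-31: FRAC16 (10 bits)


def unpack_fields(word: int) -> TsOptFields:
    """Split a 32-bit word, treating diagram bit 0 as the MSB."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("expected an unsigned 32-bit value")
    return TsOptFields(
        exo=(word >> 31) & 0x1,
        res2=(word >> 29) & 0x3,
        type_=(word >> 28) & 0x1,
        mask=(word >> 24) & 0xF,
        res8=(word >> 16) & 0xFF,
        res1=(word >> 15) & 0x1,
        exp16=(word >> 10) & 0x1F,
        frac16=word & 0x3FF,
    )


# Example word: EXO=1, TYPE=1, MASK=5, EXP16=17, FRAC16=0x155
example = (1 << 31) | (1 << 28) | (5 << 24) | (17 << 10) | 0x155
fields = unpack_fields(example)
```

The field widths add up to 1+2+1+4+8+1+5+10 = 32 bits, which matches
the diagram.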

>Appendix A refers to version 1, which would be
>sender (SYN, no ACK) only, and responded to with certain semantics - so
>that an identical (or at least known) timestamp clock rate can be
>arrived with only one exchange in each direction. (it'll be some time
>before tcp stacks would allow dynamic adjustment of their timestamp
>clock rates, I guess :) ).
>
> > (An alternative might have been to call it a version field, but that
> > wouldn't be quite right. For instance, in Appendix A, although the
> > TCP server understands what I would call type 1, it replies with a
> > type 0 option. This doesn't have the same semantics as versioning.)
>
>Yes, the -02 PDF has a 2-bit version field with the difference in
>semantics you point out. I really want this to be an enumeration, and
>the exact semantics (like a version 0 receiver is free to either ignore
>higher versions, or respond simply with version 0 always) are in the
>specs.
>
>I'm unhappy with "type" because that sounds like request/response - but
>implicitly, the active sender (SYN, no ACK bit set) is the requester,
>and the passive sender (receiver, SYN+ACK) is the responder. Having two
>different definitions in a single version of the timestamp negotiation,
>depending on direction(i.e. if the ACK bit is set or not), with one
>being identical to version 0 sounds like overkill - version 0 would
>already have the appropriate signaling for the response...
>
>In the PDF, I settled on version as that deviates the least as I
>understand it in the effective semantics.
>
>
>
> > Are you expecting all hosts that support type 0 to also support type
> > 1 (RNG negotiation)? If not, you need to say what a type-0-only host
> > would reply to a RNG request.
>
>Yes, that is in the preliminary -02 draft (PDF); hidden as it is, with
>the Wire protocol specs. Need to fix this in the high level
>description.

To repeat back to you, you are saying all version 0 implementations
must also support version 1? Then I don't think 'version' is the
right word for this. If you don't like 'type' either, then we need
another word. What about 'kind'?


> > 5/ For range negotiation (RNG, Appendix A), why not use the same
> > FRAC12 field for the hi and lo end of the range, and only communicate
> > hi and lo EXP12 fields? Surely it's not so important to specify the
> > ends of the ranges precisely.
>
>I really couldn't come up with a decent use case for range negotiation,
>where it would be important that both ends run at the same clock rate;
>That's why I put this whole range negotiation stuff into an appendix,
>as just an example of a future enhancement.
>
>You are completely correct, depending on what one wants to achieve with
>something like a range negotiation, more bits could be saved or
>shuffled to better uses.
>
>
>
> > This would leave more space reserved for future stuff, when range
>negotiation was also needed.
>
>A roll of the version number, and reshuffling of the signaling bits
>appropriately (i.e. a certain use may only be possible with tcp
>timestamp clocks ticking between micro- and milliseconds, instead of
>nano- to tenths of seconds) would achieve the same. Of course, if this
>signaling takes off, and people figure out clever ways to further
>utilize a timing / sideband channel in TCP, having just 4 enumerated
>versions available may be too few :). I think this discussion can be
>postponed until version 3 of the negotiation signaling is to be defined
>:)

You'll notice I suggested 2 bits of reserved space before the 1-bit
type/kind/version field. That would allow extension to 8
types/kinds/versions. But it also allows some other future use to
burn those bits if we decide it's more important than allowing space
for more types in future.

HOWEVER, every new variant faces a new deployment problem.
I strongly believe we should try to settle on a core extension to
timestamping that the WG believes includes enough features for
important stuff like one-way delay, and we draw a line above "would
be nice to have" mission creep.


> > However, I'm not convinced range negotiation is important. Can you
>motivate it?
>
>Not really (the reason why it was pushed to an appendix as an example);
>The only sensible idea I have would be a hardware-assisted TCP stack,
>which does the timing processing on the NIC; with range negotiation,
>both TCP timestamp clocks could run at the exact same frequency,
>simplifying one-way delay variance calculations (so that wirespeed @
>10G Ethernet becomes a possibility). As long as one-way delay variance
>calculations are carried out in software (main CPU), doing a division
>(or at least one multiplication) instead of only addition/subtraction
>should have very little impact really...

But then range negotiation doesn't help anyway, if both ends have a
set of discrete frequencies their hardware can use and they are
trying to find one they have in common. This requires an enumeration,
not a range. That's tough to fit into a confined space.
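
A small sketch of this point, using made-up clock rates: two hosts can
advertise overlapping ranges and still share no frequency that both can
actually generate. The function names and all rate values below are
hypothetical and only serve to illustrate the argument.

```python
def range_overlap(rates_a, rates_b):
    """Overlap of the [min, max] ranges each side would advertise, or None."""
    lo = max(min(rates_a), min(rates_b))
    hi = min(max(rates_a), max(rates_b))
    return (lo, hi) if lo <= hi else None


def common_rates(rates_a, rates_b):
    """Rates (in Hz) that both sides can really produce."""
    return sorted(set(rates_a) & set(rates_b))


# Hypothetical discrete timestamp clock rates, in Hz.
host_a = {1_000, 10_000, 1_000_000}
host_b = {2_500, 250_000}

overlap = range_overlap(host_a, host_b)   # the ranges do intersect
shared = common_rates(host_a, host_b)     # but no single rate is shared
```

Here the range check succeeds while the enumeration comes back empty,
which is the case a pure range negotiation cannot detect.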


>Again, thanks a lot for your insightful comments!

Cheers


Bob


________________________________________________________________
Bob Briscoe,                                BT Innovate & Design


From rs@netapp.com  Fri May 27 13:16:51 2011
Return-Path: <rs@netapp.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 6FD4FE0802 for <tcpm@ietfa.amsl.com>; Fri, 27 May 2011 13:16:51 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -7.899
X-Spam-Level: 
X-Spam-Status: No, score=-7.899 tagged_above=-999 required=5 tests=[AWL=-0.200, BAYES_00=-2.599, J_CHICKENPOX_23=0.6, MANGLED_PILL=2.3, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id RX2eBDNLNmyO for <tcpm@ietfa.amsl.com>; Fri, 27 May 2011 13:16:49 -0700 (PDT)
Received: from mx4.netapp.com (mx4.netapp.com [217.70.210.8]) by ietfa.amsl.com (Postfix) with ESMTP id 76618E06C4 for <tcpm@ietf.org>; Fri, 27 May 2011 13:16:48 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="4.65,282,1304319600"; d="scan'208";a="250883878"
Received: from smtp3.europe.netapp.com ([10.64.2.67]) by mx4-out.netapp.com with ESMTP; 27 May 2011 13:16:46 -0700
Received: from amsrsexc1-prd.hq.netapp.com (amsrsexc1-prd.hq.netapp.com [10.64.251.107]) by smtp3.europe.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id p4RKGkhU003680; Fri, 27 May 2011 13:16:46 -0700 (PDT)
Received: from LDCMVEXC1-PRD.hq.netapp.com ([10.65.251.107]) by amsrsexc1-prd.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.3959);  Fri, 27 May 2011 22:16:46 +0200
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Fri, 27 May 2011 21:16:11 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0E99BFF6@LDCMVEXC1-PRD.hq.netapp.com>
In-Reply-To: <201105261137.p4QBbc0D012437@bagheera.jungle.bt.co.uk>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: Tech points:  draft-scheffenegger-tcpm-timestamp-negotiation-01
Thread-Index: AcwbmXPOtk3tRWb6SliLLwtqFO3dDAA8x3FQ
References: <201105231457.p4NEvZfm017363@bagheera.jungle.bt.co.uk> <5FDC413D5FA246468C200652D63E627A0E89AF5A@LDCMVEXC1-PRD.hq.netapp.com> <201105261137.p4QBbc0D012437@bagheera.jungle.bt.co.uk>
From: "Scheffenegger, Richard" <rs@netapp.com>
To: "Bob Briscoe" <bob.briscoe@bt.com>
X-OriginalArrivalTime: 27 May 2011 20:16:46.0108 (UTC) FILETIME=[008C45C0:01CC1CAB]
Cc: tcpm@ietf.org, draft-scheffenegger-tcpm-timestamp-negotiation@tools.ietf.org
Subject: Re: [tcpm] Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 27 May 2011 20:16:51 -0000

> From: Bob Briscoe [mailto:bob.briscoe@bt.com]


> > > Alternatively, given (offlist) you've said that MASK
> > > must be present in all future types of capability
> > > negotiation, it could be placed under category (a).
> > > However, I'm not sure how you know it will always
> > > be needed before future types have all been
> > > invented.

> > The reason for mask to be present would be to fence off
> > parts of TS.val that must be excluded from PAWS
> > processing on the receiver side. That was also in the
> > context of allowing a version0 receiver to reply always
> > with version0, even if it can't process the timestamp
> > value for anything but PAWS (excluding the masked bits,
> > where the monotonic increase property can be broken by
> > the sender). A future version may specify what is in
> > that opaque part for a compatible receiver still...

> My question was whether there could be a version where the
> sender ensures timestamps monotonically increase, so MASK
> would not be necessary.
>
> I suspect I have not fully understood what the underlying
> need for MASK is. The doc seems to hint at two:
> a) when not echoing timestamps to the rules in RFC1323,
>    they might not monotonically increase
> b) some senders add entropy to the low order bits of
>    timestamps to protect against malicious tweaking of
>    TSecr values
>
> Am I correct?
>
> My thinking is that b) would not be necessary in trusted
> environments. I'm now not sure whether a) will always be
> a requirement.

Well, transparent timestamps will have to live in a potentially hostile
environment; For example, Linux 2.6.13 .. 2.6.22 used TSecr directly in
CUBIC congestion control, and malicious receivers could exploit that to
get an unfairly large share of bandwidth; The Linux community then
appears to have decided to throw out the baby with the bath water - and
currently they don't use TSecr directly, but perform a per-segment RTT
tracking based on other clues...

If b) is not necessary, the sender (receiver) is free to set MASK=0.
However, algorithms critical for TCP (such as congestion control) would
still need to check the plausibility of returned timestamps (see my
comment about CUBIC).

With a MASK that could cover almost the entire TSval field, the PAWS
test at the receiver is very simple to implement in a future-proof
version (simply shift right the TSval before performing PAWS). This may
come in handy if different kinds of entropy are added to the TSval by
a future version that the receiver doesn't understand, but where the
sender still wants immediate reflection of TSval, and the receiver still
can perform PAWS with the last remaining unmasked bit...
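
A minimal sketch of the "shift right before PAWS" idea, in Python. It
covers only the timestamp comparison (reject when SEG.TSval is older
than TS.Recent in modular arithmetic), not the other PAWS conditions of
RFC1323. The function name paws_reject and the treatment of MASK as a
plain shift count are assumptions made for this illustration, not text
from the draft.

```python
def paws_reject(seg_tsval: int, ts_recent: int, mask: int) -> bool:
    """Return True if the masked PAWS comparison sees the segment as old.

    Both values are shifted right by `mask` bits first, so the low-order
    (opaque) bits never take part in the comparison.
    """
    if not 0 <= mask < 32:
        raise ValueError("mask must leave at least one bit for the test")
    width = 32 - mask                         # bits still visible to PAWS
    a = (seg_tsval & 0xFFFFFFFF) >> mask
    b = (ts_recent & 0xFFFFFFFF) >> mask
    diff = (a - b) % (1 << width)             # modular difference
    # "a is before b" when the difference lands in the upper half
    # of the number ring (same idea as a signed comparison).
    return diff >= (1 << (width - 1))


# Noise in the low 8 bits would trip an unmasked test,
# but is ignored once those bits are masked out.
noisy_old = paws_reject(0x10000001, 0x100000FF, 0)
noisy_masked = paws_reject(0x10000001, 0x100000FF, 8)
```

With a single unmasked bit (mask=31) the comparison cannot tell a step
forward from a step backward: any change in that bit lands in the
"upper half", so both directions are rejected. Two unmasked bits
(mask=30) are the minimum for which one step forward is accepted.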

Having one bit set aside for "TS will be strictly monotonically
increasing" will not fully solve this, as you still would have to have a
mask field to demarcate where potentially added entropy begins; the
sender timestamp clock value should still be recognizable by the
receiver...

I agree that having 5 bits (0..31) appears to be overkill, but with 64
or 128 bit timestamps, the fraction of reserved bits would be smaller.

Perhaps I need to explicitly state that the non-masked bits have to be
non-decreasing (monotonic increasing with sequence numbers, strictly
monotonic increasing between windows). With that property, having a
single unmasked bit is pointless (because +1 and -1 have the same
result). Thus a MASK=31 could be interpreted as all bits are opaque to
a version0 receiver, and the highest meaningful value would be 30
(leaving the topmost 2 bits for PAWS).


Actually, I was just made aware of the fact that making TSval truly
opaque without any constraints such as for PAWS could allow a data
payload CRC, as discussed in
http://tools.ietf.org/html/draft-ietf-tcpm-anumita-tcp-stronger-checksum-00
(provided that both ends agree on the scheme, tbd in an IANA-sanctioned
future version :) ).

Does that sound reasonable?




> > After thinking about this a bit more (I departed from
> > IEEE-754 originally but modified it already as you noted),
> > that bit would serve best when re-shuffled for 1 bit higher
> > precision.
>
> I would argue it's OK to depart from IEEE-754 in some ways but not
> others:
>
> - Making the sign bit always 0 and outside the wire protocol is OK,
> because implementers are unlikely to get it wrong
>
> - Adding 1 bit more precision to binary16 float representation is NOT
> OK, because implementers will have to tweak low level arithmetic code
> rather than using common libraries. Given the level of bugs that have
> recently been found even in really simple things like ECN or SACK, it
> is important to keep all changes really really simple.

I understand that many OSes (and SoCs) cannot do floating point in the
kernel, and even where they can, binary16 is often not implemented in
hardware.

Thus, some bit-banging will be required to find the correction factor
between local and remote timestamp clocks for OWD calculation.

Actually, your argument works in my favor :)

If the representation is suspiciously close to IEEE-754, an implementer
may simply put that into a regular float16 library; however, EXP=31
would mean infinity/NaN in such a regular IEEE library, which would
probably lead to even more intricate bugs...

Having a spec that is off by 1 from something very similar may help
implementers to read the spec more closely (to discover that one can not
simply put this into a float16, multiply with 2^-13 and be done with
parsing it).
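
To illustrate the EXP=31 concern, here is a short Python sketch of what
a stock IEEE 754 binary16 decoder does with a 5-bit exponent and 10-bit
fraction. It uses the standard struct module's 'e' (half precision)
format. The helper name as_binary16 is made up for this example, and
the sketch says nothing about how the draft itself defines EXP=31; it
only shows the behaviour of a regular IEEE library.

```python
import math
import struct


def as_binary16(exp: int, frac: int) -> float:
    """Interpret (exp, frac) as a positive IEEE 754 binary16 value."""
    bits = ((exp & 0x1F) << 10) | (frac & 0x3FF)   # sign bit left at 0
    return struct.unpack(">e", struct.pack(">H", bits))[0]


largest_finite = as_binary16(30, 0x3FF)   # top of the normal range
exp31_zero_frac = as_binary16(31, 0)      # IEEE reads this as infinity
exp31_some_frac = as_binary16(31, 1)      # IEEE reads this as NaN
```

So any encoding that wants EXP=31 to carry an ordinary clock rate cannot
be handed to such a library unchanged.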

Anyway, I don't feel very strongly about how this bit finally ends up
either way :)





> > The only special value would be all-zero (this is included
> > in the PDF, I hope you reviewed that text) to indicate the
> > timestamp values are not related to wall-clock time.
>
> Yes, I noticed that.
>
> In case anyone listening on the list is confused, Richard sent me an
> offlist preview of the proposed next rev, in PDF format.


One point here - setting the indicated clock rate to 0 would not be the
same as masking everything out (in my thinking), as even a TSval not
related to wall-clock time would have the property of increasing
monotonically with advancing segments... (allowing PAWS by the receiver,
even when enhanced timing uses have to be disabled).



> > > 4/ Could the MASK use up only 4 bits? Is anyone ever likely to want
> > > to mask more than 2**4=16 bits, given you already say (Section 6.4)
> > > that MASK>8 is discouraged.
> >
> > The mask field would make up to 31 bits completely opaque to the
> > receiver (thus exclude those bits also from PAWS receiver-side
> > processing, lifting any constraints like the monotonic increasing
> > values - PAWS would need that).
> >
> > The unstated reason for the full 5 bits was to cover up to 1/4th
> > of the TS value, even when TCPCT (RFC6013) with the 64 / 128 bit
> > timestamps are in use, or if the sender wants to have a true hash
> > value (segment digest) with each segment, or when the future
> > contents of TS.val would be at odds with PAWS processing...
>
> Understood. Will you be explaining this reasoning in the next rev?

Will do!


>
> I assume 1/4 is a member of the set of made-up numbers, see
> <http://search.dilbert.com/comic/Made%20Up%20Numbers> ;)

You need to start somewhere :) 8 bits out of 32 (as already recommended
in the draft) is also 1/4 - we have consistency there.




> > Also, a future version could define some use of these masked
> > out bits, but remain compatible with a version 0 receiver.
> > Such a receiver would treat the TS.val as opaque entity again,
> > and only the immediate echoing would remain as compatibility
> > with future instances...
>
> Again, I think this discussion shows that it would be useful to talk
> in the draft about what will be needed for any version and what will
> be only in some.
>
> The WG needs to be able to decide where to draw the line between
> features that might be nice (mission creep), and features that are
> necessary.


I'll be happy to take advice in that space. Based on this discussion, I
did expand the possible use cases where a mandatory "mask" would make
sense for backwards compatibility; as discussed, the unmasked portion
would hold to the constraints imposed by rfc1323 (monotonic increase in
value between segments), while the masked (opaque) part is excluded from
this limitation.



> > > Perhaps instead it would be better to use the following format:
> > >    0                   1                   2                   3
> > >    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> > >    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> > >    |E| R |T|       |               |R|         |                   |
> > >    |X| E |Y| MASK  |      RES      |E|  EXP16  |      FRAC16       |
> > >    |O| S |P|       |               |S|         |                   |
> > >    | |   |E|       |               | |         |                   |
> > >    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> > > The 2-bit reserved field before the type would be useful if the
> > > number of types needed is eventually >2. Leaving it reserved
allows
> > > it to be used for some other purpose orthogonal to types, if nec.
> > > Note I've called the third field 'TYPE', whereas in draft-01 you
> > > called it a reserved flag, then called it RNG in the appendix.
> >
> >RNG is no longer in the draft (the PDF version I sent you via email),
> >that was in version -01.
>
> Yup, I had noticed that. But, because I cc'd my email to the list, I
> tried to relate the discussion to draft-01 that everyone has seen,
> not the offlist preview you sent me.
>
> >Appendix A refers to version 1, which would be
> >sender (SYN, no ACK) only, and responded to with certain semantics -
> >so that an identical (or at least known) timestamp clock rate can be
> >arrived at with only one exchange in each direction. (it'll be some
> >time before tcp stacks would allow dynamic adjustment of their
> >timestamp clock rates, I guess :) ).
> >
> > > (An alternative might have been to call it a version field, but
> > > that wouldn't be quite right. For instance, in Appendix A, although
> > > the TCP server understands what I would call type 1, it replies
> > > with a type 0 option. This doesn't have the same semantics as
> > > versioning.)
> >
> >Yes, the -02 PDF has a 2-bit version field with the difference in
> >semantics you point out. I really want this to be an enumeration, and
> >the exact semantics (like a version 0 receiver is free to either
> >ignore higher versions, or respond simply with version 0 always) are
> >in the specs.
> >
> >I'm unhappy with "type" because that sounds like request/response -
> >but implicitly, the active sender (SYN, no ACK bit set) is the
> >requester, and the passive sender (receiver, SYN+ACK) is the
> >responder. Having two different definitions in a single version of
> >the timestamp negotiation, depending on direction (i.e. if the ACK
> >bit is set or not), with one being identical to version 0 sounds like
> >overkill - version 0 would already have the appropriate signaling for
> >the response...
> >
> >In the PDF, I settled on version as that deviates the least as I
> >understand it in the effective semantics.


>
> To repeat back to you, you are saying all version 0 implementations
> must also support version 1? Then I don't think 'version' is the
> right word for this. If you don't like 'type' either, then we need
> another word. What about 'kind'?


Well, a receiver that understands the EXO bit, has to reply with at
least version 0 (regardless of the version sent to it).

For a received version 0, it can do additional checks (to verify that
the sender wasn't just misbehaving), while for any other version, the
receiver couldn't do local time-based calculations, only the traditional
PAWS check...

I think the EXO bit is what you had as TYPE, with my VERsion field being
specifics about the semantics...

If it aligns better with standing naming practices, perhaps "VER" should
really be a message type as a whole, with the semantics describing how a
receiver has to deal with unknown types (respond with type 0, or
(discouraged) traditional RFC1323).

Basically:

(receiver version 0)
EXO=0 -> RFC1323
EXO=1, VER=0 -> VER=0
EXO=1, VER=1..3 -> VER=0 (or RFC1323)

A future spec may stipulate

EXO=1, VER1 -> VER1 in return, or (as in Appendix A) VER1 -> VER0 etc...
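
The version 0 receiver rules listed above could be written down roughly
as follows. This is a sketch only: the function name, the use of None to
stand for "plain RFC1323 behaviour", and the fallback flag are
assumptions made for illustration.

```python
from typing import Optional


def v0_receiver_response(exo: int, ver: int = 0,
                         fallback_to_rfc1323: bool = False) -> Optional[int]:
    """Return the VER a version 0 receiver answers with.

    None stands for "no negotiation, behave as plain RFC1323".
    fallback_to_rfc1323 selects the discouraged alternative for
    unknown versions (hypothetical knob, not from the draft).
    """
    if exo == 0:
        return None              # EXO=0 -> RFC1323
    if ver == 0:
        return 0                 # EXO=1, VER=0 -> VER=0
    # EXO=1, VER=1..3 -> VER=0 (or RFC1323)
    return None if fallback_to_rfc1323 else 0
```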




> > > 5/ For range negotiation (RNG, Appendix A), why not use the same
> > > FRAC12 field for the hi and lo end of the range, and only
> > > communicate hi and lo EXP12 fields? Surely it's not so important
> > > to specify the ends of the ranges precisely.
> >
> >I really couldn't come up with a decent use case for range
> >negotiation, where it would be important that both ends run at the
> >same clock rate; That's why I put this whole range negotiation stuff
> >into an appendix, as just an example of a future enhancement.
> >
> >You are completely correct, depending on what one wants to achieve
> >with something like a range negotiation, more bits could be saved or
> >shuffled to better uses.
> >
> >
> >
> > > This would leave more space reserved for future stuff, when range
> > > negotiation was also needed.
> >
> >A roll of the version number, and reshuffling of the signaling bits
> >appropriately (i.e. a certain use may only be possible with tcp
> >timestamp clocks ticking between micro- and milliseconds, instead of
> >nano- to tenths of seconds) would achieve the same. Of course, if
> >this signaling takes off, and people figure out clever ways to further
> >utilize a timing / sideband channel in TCP, having just 4 enumerated
> >versions available may be too few :). I think this discussion can be
> >postponed until version 3 of the negotiation signaling is to be
> >defined :)
>
> You'll notice I suggested 2 bits of reserved space before the 1-bit
> type/kind/version field. That would allow extension to 8
> types/kinds/versions. But it also allows some other future use to
> burn those bits if we decide it's more important than allowing space
> for more types in future.
>
> HOWEVER, every new variant faces a new deployment problem.
> I strongly believe we should try to settle on a core extension to
> timestamping that the WG believes includes enough features for
> important stuff like one-way delay, and we draw a line above "would
> be nice to have" mission creep.

In my view, 3 enumeration points are still unused; I would
like to keep the 5-bit mask though, but one bit could indicate if MASK
is valid (and if MASK is not valid, a version 0 receiver would need to
treat the TSval as a completely opaque entity (no PAWS allowed)):

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|E|V|V|         #               |         |                     |
|X|L|E|  MASK   #      RES      |   EXP   |        FRAC         |
|O|D|R|         #               |         |                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

VLD (valid) ... must be set (MASK valid); if unset, treat TSval opaque
VER (or TYPE) ... 0 (or 1 - compatible with v0 receivers)
MASK ... # of LSB bits to exclude from PAWS (timestamp) calculations

If VLD=0, MASK gives 5 bits to burn otherwise; some possible features
that would require MASK=31 can simply set VLD=0 for the same purpose...
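
To make the proposed layout concrete, here is a minimal pack/unpack
sketch. The field widths (1+1+1+5+8+5+11 bits) are read off the ASCII
diagram above, with bit 0 of the diagram taken as the most significant
bit; the names TsCaps, pack_caps, unpack_caps and paws_excluded_bits are
made up for this example and appear in no draft.

```python
from typing import NamedTuple, Optional

# Field names and widths, most significant bit first (assumed from the
# diagram above): EXO, VLD, VER, MASK, RES, EXP, FRAC = 32 bits total.
FIELDS = (("exo", 1), ("vld", 1), ("ver", 1), ("mask", 5),
          ("res", 8), ("exp", 5), ("frac", 11))


class TsCaps(NamedTuple):
    exo: int
    vld: int
    ver: int
    mask: int
    res: int
    exp: int
    frac: int


def pack_caps(caps: TsCaps) -> int:
    """Pack the fields into one 32-bit word, first field in the top bit."""
    word = 0
    for name, width in FIELDS:
        value = getattr(caps, name)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        word = (word << width) | value
    return word


def unpack_caps(word: int) -> TsCaps:
    """Split a 32-bit word back into its fields."""
    values = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        values[name] = (word >> shift) & ((1 << width) - 1)
    return TsCaps(**values)


def paws_excluded_bits(caps: TsCaps) -> Optional[int]:
    """Low TSval bits to skip for PAWS, or None if TSval is fully opaque."""
    return caps.mask if caps.vld else None
```

For example, unpack_caps(pack_caps(c)) gives back c for any in-range c,
and a word with VLD=0 makes paws_excluded_bits return None, i.e. no PAWS
at all.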




> > > However, I'm not convinced range negotiation is important. Can you
> > > motivate it?
> >
> >Not really (the reason why it was pushed to an appendix as an
> >example); The only sensible idea I have would be a hardware-assisted
> >TCP stack, which does the timing processing on the NIC; with range
> >negotiation, both TCP timestamp clocks could run at the exact same
> >frequency, simplifying one-way delay variance calculations (so that
> >wirespeed @ 10G Ethernet becomes a possibility). As long as one-way
> >delay variance calculations are carried out in software (main CPU),
> >doing a division (or at least one multiplication) instead of only
> >addition/subtraction should have very little impact really...
>
> But then range negotiation doesn't help anyway, if both ends have a
> set of discrete frequencies their hardware can use and they are
> trying to find one they have in common. This requires an enumeration,
> not a range. That's tough to fit into a confined space.

I believe it was Anantha Ramaiah who also suggested having an
enumerated "typical" rate/frequency... With 24 bits available, 24
"common" rates could be defined in a range-negotiation type: the sender
sends the timestamp capability with the locally supported rates
(appropriate bits set), and the receiver picks the one it can also
support (or none) in the 2nd packet...
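
As a rough sketch of that enumerated variant: the sender offers a 24-bit
bitmap and the receiver answers with a single common entry. Which common
rate the receiver should prefer is not specified anywhere in this
thread; picking the lowest-numbered common bit is purely an assumption
for the example, as are the function names.

```python
from typing import Iterable, Optional

RATE_BITS = 24  # one bit per predefined "common" rate


def offer_bitmap(supported: Iterable[int]) -> int:
    """Build the sender's bitmap from indices of supported rates."""
    bitmap = 0
    for index in supported:
        if not 0 <= index < RATE_BITS:
            raise ValueError(f"rate index {index} out of range")
        bitmap |= 1 << index
    return bitmap


def pick_common_rate(offered: int, local: int) -> Optional[int]:
    """Return the index of one rate both ends support, or None.

    Assumption: the lowest-numbered common bit wins.
    """
    common = offered & local & ((1 << RATE_BITS) - 1)
    if common == 0:
        return None
    return (common & -common).bit_length() - 1
```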

Anyway, this is all possible once it becomes clear that the basic
functionality is worthwhile to have (and exactly the reason why I try
to allow a multibit version/type field).



Best regards,
   Richard


From rs@netapp.com  Fri May 27 13:18:07 2011
Return-Path: <rs@netapp.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id DADC0E07EE for <tcpm@ietfa.amsl.com>; Fri, 27 May 2011 13:18:07 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -9.299
X-Spam-Level: 
X-Spam-Status: No, score=-9.299 tagged_above=-999 required=5 tests=[AWL=1.300,  BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 6JYTfxjy2YxN for <tcpm@ietfa.amsl.com>; Fri, 27 May 2011 13:18:06 -0700 (PDT)
Received: from mx4.netapp.com (mx4.netapp.com [217.70.210.8]) by ietfa.amsl.com (Postfix) with ESMTP id B36AEE06C4 for <tcpm@ietf.org>; Fri, 27 May 2011 13:18:05 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="4.65,282,1304319600"; d="scan'208";a="250883964"
Received: from smtp3.europe.netapp.com ([10.64.2.67]) by mx4-out.netapp.com with ESMTP; 27 May 2011 13:18:04 -0700
Received: from ldcrsexc2-prd.hq.netapp.com (emeaexchrs.hq.netapp.com [10.65.251.110]) by smtp3.europe.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id p4RKI402003745 for <tcpm@ietf.org>; Fri, 27 May 2011 13:18:04 -0700 (PDT)
Received: from LDCMVEXC1-PRD.hq.netapp.com ([10.65.251.107]) by ldcrsexc2-prd.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.3959);  Fri, 27 May 2011 21:18:04 +0100
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Fri, 27 May 2011 21:17:31 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0E99BFF7@LDCMVEXC1-PRD.hq.netapp.com>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: Structure/clarity: draft-scheffenegger-tcpm-timestamp-negotiation-01
Thread-Index: Acwbn7lnRU7LLswJSzytKYtF7U80TABC1hJA
From: "Scheffenegger, Richard" <rs@netapp.com>
To: <tcpm@ietf.org>
X-OriginalArrivalTime: 27 May 2011 20:18:04.0762 (UTC) FILETIME=[2F6DEBA0:01CC1CAB]
Subject: [tcpm] FW: Structure/clarity: draft-scheffenegger-tcpm-timestamp-negotiation-01
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 27 May 2011 20:18:08 -0000

-----Original Message-----
From: Bob Briscoe [mailto:bob.briscoe@bt.com]
Sent: Donnerstag, 26. Mai 2011 14:23
To: Scheffenegger, Richard
Cc: Mirja KUEHLEWIND;
draft-scheffenegger-tcpm-timestamp-negotiation@tools.ietf.org;
tcmp@ietf.org
Subject: Structure/clarity:
draft-scheffenegger-tcpm-timestamp-negotiation-01

Richard,

I've split the thread into three:
- technical
- structural / clarity  <<<--- this posting
- editorial details

We need to explain to people on the list that I had criticised
draft-01 for not explaining the problem, so you sent me an offlist
preview of draft-02, and I replied (also offlist) with proposed text
as an introduction to the problem.


>Hi Bob,
>
>thank you very much for your extensive list of improvements the draft
>must have!
>
>May I add you as additional author, I really like the introduction
>section / layout you provided!

You're free to use it - I wanted to help this draft to be understood,
as I think it will be useful for many purposes (including one of my
own).

I'm OK just to be acknowledged. I'd rather not take on another
co-authorship at the moment, as I would then feel committed to
continued involvement, which I cannot promise given my current
commitments.


>I have written a problem statement earlier, but commented it out of the
>draft; I think that is more appropriate when starting the discussion
>about the normative wire protocol - see below.

This would have been very useful if it had not been commented out! It
gives exactly the sort of introductory information, the omission of
which made me struggle to understand the doc without reading it more
than once through.

Feel free to merge it with my suggested text.

I've added one or two thoughts inline...


><!--
>     <section title="Problem statement">
><t>
>Timestamp values are carried in each segment if negotiated for.
>However, the content of these values is to be treated as an opaque
>entity by the receiver. This document describes an enhancement to
>the timestamp negotiation, and must meet the following criteria:
>     <list>
>         <t>Indicate the (rough) timestamp clock rate used by the
> sender in a wide range. The slowest rate should be slower than 1
> sec, while the highest rate should allow unique timestamps per
> segment, even at extremely high link speeds. At the time of
> writing, this highest clock rate was assumed to be 64 byte packets
> (i.e. ACK segments) sent at a rate of 100 Gbit/s. This would set
> the highest rate at about 5 ns.</t>

I know you wrote the terminology section to give yourself a license
to say rate but mean duration. However, I would strongly advise
against redefining a word to mean something it doesn't actually mean
in normal usage. A manual on good writing style would say this is a
sign that the author hasn't spent enough care and time to find the
right word.

>         <t>Allow for timestamps that are not directly related to
> real time (i.e. segment counting, or use of the timestamp value as
> a true extension of sequence numbers).</t>
>         <t>Provide means to prevent or at least detect tampering
> with the echoed timestamp value.</t>
>         <t>Allow for future extensions that may use some of the
> timestamp value bits for other signaling purposes.</t>

Do you mean some of the timestamp bits in data packets, or in the
SYN/SYN-ACK?

Mission creep?

>         <t>Signaling must be backwards compatible with existing TCP
> stacks implementing basic <xref target="RFC1323"/> timestamps.</t>
>         <t>In-session signaling must not add sender or receiver
> complexity.</t>

I think I would just list required features. Things like backward
compatibility and simplicity are different (I would avoid listing
requirements where the converse would clearly never be a requirement
- 'motherhood' requirements).

>         <t>Support current schemes for timestamp value generation.</t>

Need to explain why this is different from backward compatible, or
merge into that bullet.

>         <t>Allow for timing information to be gathered during the
> initial handshake.</t>

I think you mean state timings explicitly to avoid a training phase
that would have to extend beyond the initial handshake.

>         <t>Possibly provide a means to disambiguate resent SYN
> segments.</t>
>     </list>
></t>
><t>
>To support these design goals, only the TSecr field in the initial
>SYN can be used directly. The response from the receiver has to be
>encoded, since no unused field is available. The most straightforward
>encoding is an XOR with a value known to the sender. Therefore, the
>receiver also uses TSecr to indicate its capabilities, but
>calculates the XOR sum with the received TSval. This allows the
>receiver to remain stateless and functionalities like syncache can
>be maintained with no change.</t>

This is solution, not problem.

However, a para like this at the start of the signalling section
would have helped me understand much better.
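
For readers new to the thread, the XOR trick in the quoted paragraph
boils down to something like the following. It is only a sketch of the
idea as described there; the function names are invented, and the
capabilities word is treated as an opaque 32-bit value.

```python
MASK32 = 0xFFFFFFFF


def receiver_encode_tsecr(received_tsval: int, capabilities: int) -> int:
    """SYN+ACK side: fold the capabilities into TSecr.

    The receiver needs no per-connection state for this, since both
    inputs are at hand when the SYN is processed.
    """
    return (received_tsval ^ capabilities) & MASK32


def sender_decode_tsecr(sent_tsval: int, received_tsecr: int) -> int:
    """SYN side: recover the capabilities, using the TSval it sent."""
    return (sent_tsval ^ received_tsecr) & MASK32
```

A receiver that knows nothing of the scheme simply echoes TSval, so the
sender decodes an all-zero capabilities word in that case.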

><t>
>Furthermore, some legacy implementations exist that violate <xref
>target="RFC1323"/> in that the TSecr field in a SYN is not cleared.
>This is mitigated by verifying that one specific bit (EXO), plus any
>reserved bits (currently 9, RES and SGN) are set accordingly. This
>reduces the chance of a receiver mistakenly negotiating the
>timestamp capabilities to less than 0.1% when passively receiving a
>TCP handshake from a misbehaving TCP sender.</t>
>         <t>

This starts to stray from problem to solution.

BTW, the risk of false positives is only 0.1% if the non-compliant
TCPs are random. We ought to know what they do, because if they set
TSecr in a SYN to a particular value, all we have to do is avoid it.
And if we don't avoid it, the risk of false positive would be 100%,
not 0.1%.

>As there exists some benefit in changing the receiver side treatment
>of which timestamp value to echo, the negotiation protocol itself
>must also provide some backwards compatibility. Therefore, even when
>a sender tries to negotiate for a higher version than is supported
>by the receiver, the receiver MUST respond with at least version 0.
>Furthermore, if selective acknowledgements are also negotiated for,
>the immediate echoing of the last received timestamp value has to be
>enabled regardless of the sender's version of the timestamp
>capabilities. Also, the receiver must ignore an indicated number of
>opaque bits, before applying heuristics defined in <xref
>target="RFC1323"/>, because the monotonic increase for each newly
>sent segment may not be applicable. This led to the presented
>distribution of the fields, with three fields (EXO, VER and MASK)
>that MUST be present regardless of version in the uppermost octet,
>while the lower three octets MAY be redefined freely with subsequent
>versions of this negotiation protocol.</t>

This is straying into solution, not problem.

><t>
>The wide range of indicated timestamp clock rates (spanning at least
>9 orders of (decimal) magnitude, or 28 binary digits), and the
>limitation to no more than 24 bits require the use of a logarithmic
>encoding. For simplicity, a format was chosen that conforms with
>the format of an IEEE-754 representation. Most stacks will at first
>not be able to dynamically adjust their timestamp clock rate, so
>that this pattern can be a static, compile time value. To use the
>indicated clock rate, for example to perform one way delay variance
>calculations, simple integer operations can be used.</t>

Again, straying into solution, not problem.

></section>
>     -->
>

Cheers


Bob


________________________________________________________________
Bob Briscoe,                                BT Innovate & Design


From rs@netapp.com  Sat May 28 05:08:34 2011
Return-Path: <rs@netapp.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id ABA6BE06B6 for <tcpm@ietfa.amsl.com>; Sat, 28 May 2011 05:08:34 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -9.559
X-Spam-Level: 
X-Spam-Status: No, score=-9.559 tagged_above=-999 required=5 tests=[AWL=1.040,  BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id iJUpkTScI-Xs for <tcpm@ietfa.amsl.com>; Sat, 28 May 2011 05:08:33 -0700 (PDT)
Received: from mx3.netapp.com (mx3.netapp.com [217.70.210.9]) by ietfa.amsl.com (Postfix) with ESMTP id 55E7EE06F6 for <tcpm@ietf.org>; Sat, 28 May 2011 05:08:33 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="4.65,285,1304319600"; d="scan'208";a="257317741"
Received: from smtp3.europe.netapp.com ([10.64.2.67]) by mx3-out.netapp.com with ESMTP; 28 May 2011 05:08:30 -0700
Received: from ldcrsexc1-prd.hq.netapp.com (emeaexchrs.hq.netapp.com [10.65.251.109]) by smtp3.europe.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id p4SC8Tkk028275; Sat, 28 May 2011 05:08:29 -0700 (PDT)
Received: from LDCMVEXC1-PRD.hq.netapp.com ([10.65.251.107]) by ldcrsexc1-prd.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.3959);  Sat, 28 May 2011 13:08:29 +0100
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Sat, 28 May 2011 13:08:29 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0E99C048@LDCMVEXC1-PRD.hq.netapp.com>
In-Reply-To: <201105261223.p4QCNM59012721@bagheera.jungle.bt.co.uk>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: Structure/clarity: draft-scheffenegger-tcpm-timestamp-negotiation-01
Thread-Index: Acwbn7lnRU7LLswJSzytKYtF7U80TABgujCg
References: <201105261223.p4QCNM59012721@bagheera.jungle.bt.co.uk>
From: "Scheffenegger, Richard" <rs@netapp.com>
To: "Bob Briscoe" <bob.briscoe@bt.com>
X-OriginalArrivalTime: 28 May 2011 12:08:29.0707 (UTC) FILETIME=[F4F1FDB0:01CC1D2F]
Cc: tcpm@ietf.org, draft-scheffenegger-tcpm-timestamp-negotiation@tools.ietf.org
Subject: Re: [tcpm] Structure/clarity: draft-scheffenegger-tcpm-timestamp-negotiation-01
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 28 May 2011 12:08:34 -0000

Hello Bob, group,

please see inline.

[text donation]
> You're free to use it - I wanted to help this draft to be understood,
> as I think it will be useful for many purposes (including one of my
> own).

Out of curiosity, is this purpose already contained in the list of
possible use cases?

If not, would you mind mentioning it in a rough sketch so that I can
add it?




> > I have written a problem statement earlier, but commented it out of
> > the draft;
:
:
>
> Feel free to merge it with my suggested text.

I included the problem statement as a separate section after your text
contribution, because it already goes into a lot of detail. I think for
someone trying to get the overall picture fast, your text really is the
best approach.

Also, I disentangled the problem / solution comments, as you mentioned
in this email.

>
> I know you wrote the terminology section to give yourself a license
> to say rate but mean duration. However, I would strongly advise
> against redefining a word to mean something it doesn't actually mean
> in normal usage. A manual on good writing style would say this is a
> sign that the author hasn't spent enough care and time to find the
> right word.

Well, rate (frequency) and duration can easily be converted.

The reason for having duration in the wire protocol instead of frequency
is to have high precision floating point values available for long
durations (low frequencies). I agree that the dynamic range of an
almost-binary16 would be enough for the relevant problem, but using the
duration of a clock tick allows even higher clock rates to be indicated
(by 3 decimal magnitudes), so that CPU cycle counters running at GHz
rates could be used directly as timestamp value clock source (provided
the CPU doesn't shift frequency gears :).

Given the remainder of the text, I feel that text talking about a "clock
rate" (frequency) flows better than a text talking about tick durations
- technically, I don't see much difference for the discussion of the
feature.

I do agree this is overkill right now, but may be an interesting aspect
for experimental hardware implementations.

Also, I added some section, because to my understanding, everyone is
using the TCP clock as timestamp clock source*. This comes naturally,
when the timestamp is only useful for RTTM; but when indicating "real"
time with higher precision, any other clock source could be used. In a
discussion with William Simpson, he indicated that the TCPCT extended
timestamps are intended for direct NTP timestamps, even though the
document is never explicit about that aspect...

[* glad to hear about counter-examples]




> >Furthermore, some legacy implementations exist that violate <xref
> >target="RFC1323"/> in that the TSecr field in a SYN is not cleared.
> >This is mitigated by verifying that one specific bit (EXO), plus any
> >reserved bits (currently 9, RES and SGN) are set accordingly. This
> >reduces the chance of a receiver mistakenly negotiating the
> >timestamp capabilities to less than 0.1% when passively receiving a
> >TCP handshake from a misbehaving TCP sender.</t>
> >         <t>
>
> This starts to stray from problem to solution.
>
> BTW, the risk of false positives is only 0.1% if the non-compliant
> TCPs are random. We ought to know what they do, because if they set
> TSecr in a SYN to a particular value, all we have to do is avoid it.
> And if we don't avoid it, the risk of false positive would be 100%,
> not 0.1%.

Actually, there are two issues:
The first is the chance that the receiver is using TSval from a
misbehaving sender - this is quite low assuming random scrambling of
bits (<0.1%), as 11 bits have to match (EXO, VER, RES). This is the more
problematic one, as the receiver (passive sender) may make decisions
based on invalid TSval data...

The second issue is that the semantics of TS get changed by the
receiver with a much higher likelihood (37.5%), but no enhanced
heuristics are allowed to be enabled on the receiver side. This change
in semantics may cause a sender to arrive at too small RTOs, increasing
spurious retransmissions, and self-inflicting lower bandwidth.

Both issues can only happen with a host not compliant with RFC1323, and
since such a host is not mischievously feeding fabricated information to
the receiver, it's almost certain that such a session would perform
worse than normal. That should be motivation to upgrade the misbehaving
sender...
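
The "<0.1%" figure for the first issue can be checked with a few lines,
under the same assumption made above that a misbehaving sender leaves
uniformly random, independent bits in TSecr. The helper name is made up
for this example; the 37.5% figure for the second issue is not derived
here.

```python
from fractions import Fraction


def random_match_probability(fixed_bits: int) -> Fraction:
    """Chance that fixed_bits random bits all happen to have the expected values."""
    return Fraction(1, 2 ** fixed_bits)


# 11 bits have to match (EXO, VER, RES): 1/2048, a little under 0.05%.
p_first_issue = random_match_probability(11)
percent = float(p_first_issue) * 100
```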




Talking about mission creep:

I would like to get feedback from the community on whether indicating
the stability/precision of the sender's timestamp clock source is seen
as a viable feature.

With an IEEE754 type floating point representation, and the implicit
lead bit in the fraction field, this is not possible (and decoding has a
conditional on the exponent). However, as SGN is really not useful, the
exponent field and fraction field could be expressed completely
explicitly.

Thus, up to 11 different representations could be chosen for the same
value:

EXP=10, FRAC= 1.0000 0000 00
EXP=11, FRAC= 0.1000 0000 00
:
:
EXP=20, FRAC= 0.0000 0000 01

Obviously, the first representation carries a lot of significant bits,
indicating that the sender has high confidence in the TSval signal it
sends; the last representation has only a single bit significance, thus
the TSval duration may not be known by a wide margin (+- 50%) to the
sender, and a receiver may want to take this difference in precision
into consideration when enabling different heuristics...

Also, a conversion of the float representation to an integer becomes an
unconditional bit shift by a known value...
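
A small sketch of that explicit representation, to show the 11
equivalent encodings and the shift-only decode. Assumptions for this
example: FRAC is an 11-bit integer with one bit before the binary point,
EXP is a 5-bit field, and the decoded integer is the value scaled by
2^10 (unit and exponent bias are left open, as above). All names are
illustrative.

```python
FRAC_BITS = 11   # 1 integer bit + 10 fractional bits, as in the listing above
EXP_MAX = 31     # assumption: a 5-bit exponent field


def to_fixed_point(exp: int, frac: int) -> int:
    """Decode to an integer scaled by 2**(FRAC_BITS - 1): one left shift."""
    return frac << exp


def significant_bits(frac: int) -> int:
    """Bits of precision the sender claims, counted from the leading 1."""
    return frac.bit_length()


def equivalent_encodings(exp: int, frac: int) -> list:
    """All (exp, frac) pairs for the same value with equal or less precision."""
    encodings = [(exp, frac)]
    while frac and frac % 2 == 0 and exp < EXP_MAX:
        exp, frac = exp + 1, frac // 2
        encodings.append((exp, frac))
    return encodings
```

Starting from EXP=10 with only the leading fraction bit set,
equivalent_encodings walks down to EXP=20 with only the last fraction
bit set; every pair decodes to the same integer while significant_bits
falls from 11 to 1.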

Best regards,
   Richard



From rs@netapp.com  Sat May 28 05:21:43 2011
Return-Path: <rs@netapp.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 7CD79E0693 for <tcpm@ietfa.amsl.com>; Sat, 28 May 2011 05:21:43 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -9.732
X-Spam-Level: 
X-Spam-Status: No, score=-9.732 tagged_above=-999 required=5 tests=[AWL=0.867,  BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id nl+SK6MPpTC0 for <tcpm@ietfa.amsl.com>; Sat, 28 May 2011 05:21:42 -0700 (PDT)
Received: from mx4.netapp.com (mx4.netapp.com [217.70.210.8]) by ietfa.amsl.com (Postfix) with ESMTP id 6D34EE0688 for <tcpm@ietf.org>; Sat, 28 May 2011 05:21:42 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="4.65,285,1304319600"; d="scan'208";a="250919888"
Received: from smtp3.europe.netapp.com ([10.64.2.67]) by mx4-out.netapp.com with ESMTP; 28 May 2011 05:21:28 -0700
Received: from ldcrsexc1-prd.hq.netapp.com (emeaexchrs.hq.netapp.com [10.65.251.109]) by smtp3.europe.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id p4SCLSoU029100 for <tcpm@ietf.org>; Sat, 28 May 2011 05:21:28 -0700 (PDT)
Received: from LDCMVEXC1-PRD.hq.netapp.com ([10.65.251.107]) by ldcrsexc1-prd.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.3959);  Sat, 28 May 2011 13:21:28 +0100
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: base64
Date: Sat, 28 May 2011 13:20:54 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0E99C04B@LDCMVEXC1-PRD.hq.netapp.com>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: New Version Notification for draft-scheffenegger-tcpm-timestamp-negotiation-02.txt
Thread-Index: AcwdMU/cO99O/VYtTXmDg/mYNbsWYAAAAZtQ
From: "Scheffenegger, Richard" <rs@netapp.com>
To: <tcpm@ietf.org>
X-OriginalArrivalTime: 28 May 2011 12:21:28.0061 (UTC) FILETIME=[C4E162D0:01CC1D31]
Subject: [tcpm] New Version Notification for draft-scheffenegger-tcpm-timestamp-negotiation-02.txt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 28 May 2011 12:21:43 -0000

http://www.ietf.org/id/draft-scheffenegger-tcpm-timestamp-negotiation-02.txt


A new version of I-D, draft-scheffenegger-tcpm-timestamp-negotiation-02.txt has been successfully submitted by Richard Scheffenegger and posted to the IETF repository.

Filename:        draft-scheffenegger-tcpm-timestamp-negotiation
Revision:        02
Title:           Additional negotiation in the TCP Timestamp Option field during the TCP handshake
Creation date:   2011-05-28
WG ID:           Individual Submission
Number of pages: 31

Abstract:
   A number of TCP enhancements in so diverse fields as congestion
   control, loss recovery or side-band signaling could be improved by
   making the values carried in the Timestamp option transparent, and
   changing the receiver side processing of timestamps in the presence
   of selective acknowledgements.

   This documents specifies a backwards compatible way of negotiating
   for Timestamp capabilities, and lists a number of benefits and
   drawbacks of this approach.



The IETF Secretariat

From rs@netapp.com  Sat May 28 05:48:21 2011
Return-Path: <rs@netapp.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 79CE6E068F for <tcpm@ietfa.amsl.com>; Sat, 28 May 2011 05:48:21 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -9.856
X-Spam-Level: 
X-Spam-Status: No, score=-9.856 tagged_above=-999 required=5 tests=[AWL=0.743,  BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id v9sD0rQce483 for <tcpm@ietfa.amsl.com>; Sat, 28 May 2011 05:48:20 -0700 (PDT)
Received: from mx4.netapp.com (mx4.netapp.com [217.70.210.8]) by ietfa.amsl.com (Postfix) with ESMTP id 26147E0662 for <tcpm@ietf.org>; Sat, 28 May 2011 05:48:19 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="4.65,285,1304319600"; d="scan'208";a="250920313"
Received: from smtp3.europe.netapp.com ([10.64.2.67]) by mx4-out.netapp.com with ESMTP; 28 May 2011 05:48:19 -0700
Received: from amsrsexc1-prd.hq.netapp.com (amsrsexc1-prd.hq.netapp.com [10.64.251.107]) by smtp3.europe.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id p4SCmJhe000606 for <tcpm@ietf.org>; Sat, 28 May 2011 05:48:19 -0700 (PDT)
Received: from LDCMVEXC1-PRD.hq.netapp.com ([10.65.251.107]) by amsrsexc1-prd.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.3959);  Sat, 28 May 2011 14:48:19 +0200
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Sat, 28 May 2011 13:47:44 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0E99C04C@LDCMVEXC1-PRD.hq.netapp.com>
In-Reply-To: <5FDC413D5FA246468C200652D63E627A0E99C04B@LDCMVEXC1-PRD.hq.netapp.com>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: feedback from group to draft-scheffenegger-tcpm-timestamp-negotiation-02.txt
Thread-Index: AcwdMU/cO99O/VYtTXmDg/mYNbsWYAAAAZtQAAAuQZA=
References: <5FDC413D5FA246468C200652D63E627A0E99C04B@LDCMVEXC1-PRD.hq.netapp.com>
From: "Scheffenegger, Richard" <rs@netapp.com>
To: "Scheffenegger, Richard" <rs@netapp.com>, <tcpm@ietf.org>
X-OriginalArrivalTime: 28 May 2011 12:48:19.0252 (UTC) FILETIME=[85399F40:01CC1D35]
Subject: Re: [tcpm] feedback from group to draft-scheffenegger-tcpm-timestamp-negotiation-02.txt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 28 May 2011 12:48:21 -0000

Hi,

this is to summarize the open (technical content) issues, as mentioned
in earlier emails:

1)
Naming of the "VER"sion field. Because a receiver implementing this
draft should always respond with at least a capabilities field in which
this field is set to zero (as described in the draft), regardless of the
version used by the sender, a semantically more correct name for this
field may be "TYPE" or message-"ID".

It really describes the content of the lower 3 octets, and is intended
as an enumeration. The semantics of how and when to use which value in
this field are explicit in the spec (and again, a value=0).

2)
5-bit mask field: It may be interesting to prevent a (version 0, see
above) receiver from processing the TSval during the data stream at all,
while bit-granularity of the mask may not be needed for too many
applications. At the same time, concern was raised that all the common
bits are already assigned to fields (even though 3 enumerations of the
"VER" field are available for future expansion).

One suggestion is to have a separate bit to indicate whether the
(smaller, 4-bit) MASK field is valid; if it is not, the receiver stops
any TSval processing (including the PAWS check). This could yield
additional (sub-)codepoints for future extensions of the negotiation.

There are still 8 unused (reserved) bits, specific to version 0;
however, some reserved bits are important as long as misbehaving RFC1323
senders may be operating, to prevent receivers from processing TSval
improperly.


3)
Explicit indication of the confidence / precision the sender has in the
timestamp clock source. One bit (SGN) was removed relative to draft -01,
as negative time durations make little sense. In -02, this bit is used
to increase the indicated precision (and to deviate more from the
IEEE 754 binary16 format, to avoid wrong implementations simply mapping
the wire protocol fields into a binary16 float).
However, such high precision is often not necessary, while indicating
confidence in the clock source would allow additional heuristics to be
enabled or disabled based on the time source precision.

Also, without an implicit lead bit, conversion of the float
representation to integer is slightly streamlined (no special processing
for subnormal numbers).


4)
Two possible extensions of this protocol have received a little thought
so far:
 a) range negotiation (to simplify OWD calculations to pure
addition/subtraction operations)
 b) complete masking of the timestamp for timing purposes and using it
to carry a data payload CRC

These extensions are obviously incompatible, and would need new
version/type values (but they can signal a version 0 receiver how it
should behave on receipt of such a capabilities field).

Should these options be discussed in any more detail than they already
are, or be removed from the draft completely?

Best regards,

Richard Scheffenegger



> -----Original Message-----
> From: Scheffenegger, Richard
> Sent: Samstag, 28. Mai 2011 14:21
> To: tcpm@ietf.org
> Subject: [tcpm] New Version Notification for
> draft-scheffenegger-tcpm-timestamp-negotiation-02.txt
>
>
> http://www.ietf.org/id/draft-scheffenegger-tcpm-timestamp-negotiation-02.txt
>
>
> A new version of I-D,
> draft-scheffenegger-tcpm-timestamp-negotiation-02.txt has been
> successfully submitted by Richard Scheffenegger and posted to the
> IETF repository.
>
> Filename:	 draft-scheffenegger-tcpm-timestamp-negotiation
> Revision:	 02
> Title:		 Additional negotiation in the TCP Timestamp Option
> field during the TCP handshake
> Creation date:	 2011-05-28
> WG ID:		 Individual Submission
> Number of pages: 31
>
> Abstract:
>    A number of TCP enhancements in so diverse fields as congestion
>    control, loss recovery or side-band signaling could be improved by
>    making the values carried in the Timestamp option transparent, and
>    changing the receiver side processing of timestamps in the presence
>    of selective acknowledgements.
>
>    This documents specifies a backwards compatible way of negotiating
>    for Timestamp capabilities, and lists a number of benefits and
>    drawbacks of this approach.
>
>
>
>
> The IETF Secretariat
> _______________________________________________
> tcpm mailing list
> tcpm@ietf.org
> https://www.ietf.org/mailman/listinfo/tcpm

From bob.briscoe@bt.com  Tue May 31 08:39:23 2011
Return-Path: <bob.briscoe@bt.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 3F0BEE0755 for <tcpm@ietfa.amsl.com>; Tue, 31 May 2011 08:39:23 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.084
X-Spam-Level: 
X-Spam-Status: No, score=-2.084 tagged_above=-999 required=5 tests=[AWL=-1.385, BAYES_00=-2.599, J_CHICKENPOX_23=0.6, MANGLED_PILL=2.3, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Yd0HUApKJanl for <tcpm@ietfa.amsl.com>; Tue, 31 May 2011 08:39:19 -0700 (PDT)
Received: from smtp1.smtp.bt.com (smtp1.smtp.bt.com [217.32.164.137]) by ietfa.amsl.com (Postfix) with ESMTP id 8B272E0721 for <tcpm@ietf.org>; Tue, 31 May 2011 08:39:18 -0700 (PDT)
Received: from i2kc08-ukbr.domain1.systemhost.net ([193.113.197.71]) by smtp1.smtp.bt.com with Microsoft SMTPSVC(6.0.3790.4675);  Tue, 31 May 2011 16:39:17 +0100
Received: from cbibipnt08.iuser.iroot.adidom.com ([147.149.100.81]) by i2kc08-ukbr.domain1.systemhost.net with Microsoft SMTPSVC(6.0.3790.4675); Tue, 31 May 2011 16:39:16 +0100
Received: From bagheera.jungle.bt.co.uk ([132.146.168.158]) by cbibipnt08.iuser.iroot.adidom.com (WebShield SMTP v4.5 MR1a P0803.399); id 1306856355368; Tue, 31 May 2011 16:39:15 +0100
Received: from MUT.jungle.bt.co.uk ([10.215.131.97]) by bagheera.jungle.bt.co.uk (8.13.5/8.12.8) with ESMTP id p4VFdDQM030375; Tue, 31 May 2011 16:39:13 +0100
Message-Id: <201105311539.p4VFdDQM030375@bagheera.jungle.bt.co.uk>
X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9
Date: Tue, 31 May 2011 16:39:17 +0100
To: "Scheffenegger, Richard" <rs@netapp.com>
From: Bob Briscoe <bob.briscoe@bt.com>
In-Reply-To: <5FDC413D5FA246468C200652D63E627A0E99BFF6@LDCMVEXC1-PRD.hq. netapp.com>
References: <201105231457.p4NEvZfm017363@bagheera.jungle.bt.co.uk> <5FDC413D5FA246468C200652D63E627A0E89AF5A@LDCMVEXC1-PRD.hq.netapp.com> <201105261137.p4QBbc0D012437@bagheera.jungle.bt.co.uk> <5FDC413D5FA246468C200652D63E627A0E99BFF6@LDCMVEXC1-PRD.hq.netapp.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
X-Scanned-By: MIMEDefang 2.56 on 132.146.168.158
X-OriginalArrivalTime: 31 May 2011 15:39:16.0972 (UTC) FILETIME=[E68ABAC0:01CC1FA8]
Cc: tcpm@ietf.org, draft-scheffenegger-tcpm-timestamp-negotiation@tools.ietf.org
Subject: Re: [tcpm] Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 31 May 2011 15:39:23 -0000

Richard,

At 21:16 27/05/2011, Scheffenegger, Richard wrote:


> > From: Bob Briscoe [mailto:bob.briscoe@bt.com]
>
>
> > > > Alternatively, given (offlist) you've said that MASK
> > > > must be present in all future types of capability
> > > > negotiation, it could be placed under category (a).
> > > > However, I'm not sure how you know it will always
> > > > be needed before future types have all been
> > > > invented.
>
> > > The reason for mask to be present would be to fence off
> > > parts of TS.val that must be excluded from PAWS
> > > processing on the receiver side. That was also in the
> > > context of allowing a version0 receiver to reply always
> > > with version0, even if it cannot process the timestamp
> > > value for anything but PAWS (excluding the masked bits,
> > > where the monotonic increase property can be broken by
> > > the sender). A future version may still specify what is in
> > > that opaque part for a compatible receiver...
>
> > My question was whether there could be a version where the
> > sender ensures timestamps monotonically increase, so MASK
> > would not be necessary.
> >
> > I suspect I have not fully understood what the underlying
> > need for MASK is. The doc seems to hint at two:
> > a) when not echoing timestamps to the rules in RFC1323,
> >    they might not monotonically increase
> > b) some senders add entropy to the low order bits of
> >    timestamps to protect against malicious tweaking of
> >    TSecr values
> >
> > Am I correct?

You haven't really answered this question. It needs answering in the 
draft as part of describing the problem.

> >
> > My thinking is that b) would not be necessary in trusted
> > environments. I'm now not sure whether a) will always be
> > a requirement.
>
>Well, transparent timestamps will have to live in a potentially hostile
>environment. For example, Linux 2.6.13 .. 2.6.22 used TSecr directly in
>CUBIC congestion control, and malicious receivers could exploit that to
>get an unfairly large share of bandwidth. The Linux community then
>appears to have decided to throw out the baby with the bath water, and
>currently they don't use TSecr directly, but perform a per-segment RTT
>tracking based on other clues...
>
>If b) is not necessary, the sender (receiver) is free to set MASK=0.

True.

HOWEVER, that still means implementers have to support MASK!=0. That 
adds quite a bit of complexity to their TCP code to support some 
tricky crypto-heuristics.  Given it seems some implementers cannot 
even get basic things like 2-bit ECN or SACK right, this seems to be 
setting the bar very high.

I am not saying security isn't important. I am saying that mandating 
security features is often worse than no security at all, because it 
forces people to have to implement complex security techniques when 
they aren't capable of doing so properly.

>However, algorithms critical for TCP (such as congestion control) would
>still need to check the plausibility of returned timestamps (see my
>comment about CUBIC).

Whether to make security mandatory is a question for the w-g if this 
draft becomes a wg item - no need for us to argue further on this.


>With a MASK that could cover almost the entire TSval field, the PAWS
>test at the receiver is very simple to implement in a future-proof
>version (simply shift the TSval right before performing PAWS). This may
>come in handy if different kinds of entropy are added to the TSval by
>a future version (that the receiver doesn't understand, but where the
>sender still wants immediate reflection of TSval, and the receiver can
>still perform PAWS with the last remaining unmasked bit)...

I agree it's clever. I was just thinking that a future version might 
prefer to have more space for something instead of the MASK.
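
To make the shift-before-PAWS idea quoted above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not text from the draft: the function name is hypothetical, MASK is read as the number of low-order TSval bits to drop, and the comparison is the usual wrap-around "not older than TS.Recent" test from RFC 1323, applied to the remaining high-order bits. Skipping the check when fewer than two bits remain follows the remark later in this thread that a single unmasked bit cannot order anything.

```python
def paws_acceptable(seg_tsval: int, ts_recent: int, mask: int) -> bool:
    """Return True if the segment passes a PAWS-style check on unmasked bits.

    Assumption: `mask` is the count of low-order bits excluded from PAWS.
    """
    if not 0 <= mask <= 31:
        raise ValueError("mask must fit in 5 bits")
    width = 32 - mask                      # bits left for the comparison
    if width < 2:
        return True                        # assumption: nothing left to order
    a = (seg_tsval & 0xFFFFFFFF) >> mask   # drop the masked (unchecked) bits
    b = (ts_recent & 0xFFFFFFFF) >> mask
    diff = (a - b) % (1 << width)          # modular difference in `width` bits
    return diff < (1 << (width - 1))       # "not older" in wrap-around terms
```

With mask=0 this reduces to the ordinary 32-bit comparison; with mask=8 a sender can vary the low 8 bits freely without tripping the check.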

My comments were in the context of this text added in the -02 PDF 
draft: "MASK: ... This field MUST be present in all future version of 
timestamp capability fields."

I'm not saying don't include MASK, just don't make it mandatory for 
all future versions. I would try to make as little as possible a 
MUST for future versions that are as yet uninvented.


>Having one bit set aside for "TS will be strictly monotonically
>increasing" will not fully solve this, as you would still need a mask
>field to demarcate where potentially added entropy begins; the sender
>timestamp clock value should still be recognizable by the receiver...
>
>I agree that having 5 bits (0..31) appears to be overkill, but with 64
>or 128 bit timestamps, the fraction of reserved bits would be smaller.
>
>Perhaps I need to explicitly state that the non-masked bits have to be
>non-decreasing (monotonically increasing with sequence numbers,
>strictly monotonically increasing between windows). With that property,
>having a single unmasked bit is pointless (because +1 and -1 have the
>same result). Thus a MASK=31 could be interpreted as all bits being
>opaque to a version0 receiver, and the highest meaningful value would
>be 30 (leaving the topmost 2 bits for PAWS).

That reminds me, I suggest you avoid using the term opaque, but 
instead say exactly what you mean (the field need not have any 
semantics that the echoing node can check or understand).

Opaque is nearly as ambiguous as the word transparent, which some 
people use to mean you can see something and others use to mean you 
cannot see it (you see through it).

>Actually, I was just made aware of the fact that making TSval truly
>opaque without any constraints such as for PAWS could allow a data
>payload CRC, as discussed in
>http://tools.ietf.org/html/draft-ietf-tcpm-anumita-tcp-stronger-checksum-00;
>provided that both ends agree on the scheme (tbd in an IANA-sanctioned
>future version :) ).
>
>Does that sound reasonable?

Well, it wouldn't make sense to give implementers a dilemma that they 
could only use a bigger checksum or timestamps, but not both.

And anyway, I don't think a CRC needs the semantics of echoing.


> > > After thinking about this a bit more (I departed from
> > > IEEE-754 originally but modified it already as you noted),
> > > that bit would serve best when re-shuffled for 1 bit higher
> > > precision.
> >
> > I would argue it's OK to depart from IEEE-754 in some ways but not
> > others:
> >
> > - Making the sign bit always 0 and outside the wire protocol is OK,
> > because implementers are unlikely to get it wrong
> >
> > - Adding 1 bit more precision to binary16 float representation is NOT
> > OK, because implementers will have to tweak low level arithmetic code
> > rather than using common libraries. Given the level of bugs that have
> > recently been found even in really simple things like ECN or SACK, it
> > is important to keep all changes really really simple.
>
>I understand that many OSes (and SoCs) cannot do floating point in the
>kernel, and even where they can, binary16 is often not implemented in
>hardware.
>
>Thus, some bit-banging will be required to find the correction factor
>between local and remote timestamp clocks for OWD calculation.
>
>Actually, your argument works in my favor :)
>
>If the representation is suspiciously close to IEEE-754, an implementer
>may simply put that into a regular float16 library; however, EXP=31
>would mean infinity/NaN in such a regular IEEE library, which would
>probably lead to even more intricate bugs...

It would be best to specifically say that EXP=31 MUST signal 
something exceptional, so implementers have to catch it before a library does.


>Having a spec that is off by 1 from something very similar may help
>implementers to read the spec more closely (to discover that one can not
>simply put this into a float16, multiply with 2^-13 and be done with
>parsing it).
>
>Anyway, I don't feel very strongly about how this bit finally ends up
>either way :)

I see your point. I would lean towards simplicity wherever possible, 
rather than making implementers have to do complex coding.


> > > The only special value would be all-zero (this is included
> > > in the PDF, I hope you reviewed that text) to indicate the
> > > timestamp values are not related to wall-clock time.
> >
> > Yes, I noticed that.
> >
> > In case anyone listening on the list is confused, Richard sent me an
> > offlist preview of the proposed next rev, in PDF format.
>
>One point here - setting the indicated clock rate to 0 would not be the
>same as masking everything out (in my thinking), as even a TSval not
>related to wall-clock time would still increase monotonically with
>advancing segments... (allowing PAWS by the receiver, even when
>enhanced timing uses have to be disabled).

Understood.


> > > > 4/ Could the MASK use up only 4 bits? Is anyone ever likely to
> > > > want to mask more than 2**4=16 bits, given you already say
> > > > (Section 6.4) that MASK>8 is discouraged.
> > >
> > > The mask field would make up to 31 bits completely opaque to the
> > > receiver (thus exclude those bits also from PAWS receiver-side
> > > processing, lifting any constraints like the monotonic increasing
> > > values - PAWS would need that).
> > >
> > > The unstated reason for the full 5 bits was to cover up to 1/4th
> > > of the TS value, even when TCPCT (RFC6013) with the 64 / 128 bit
> > > timestamps are in use, or if the sender wants to have a true hash
> > > value (segment digest) with each segment, or when the future
> > > contents of TS.val would be at odds with PAWS processing...
> >
> > Understood. Will you be explaining this reasoning in the next rev?
>
>Will do!
>
>
> >
> > I assume 1/4 is a member of the set of made-up numbers, see
> > <http://search.dilbert.com/comic/Made%20Up%20Numbers> ;)
>
>You need to start somewhere :) 8 bits out of 32 (as already recommended
>in the draft) is also 1/4 - we have consistency there.
>
>
>
>
> > > Also, a future version could define some use of these masked
> > > out bits, but remain compatible with a version 0 receiver.
> > > Such a receiver would treat the TS.val as opaque entity again,
> > > and only the immediate echoing would remain as compatibility
> > > with future instances...
> >
> > Again, I think this discussion shows that it would be useful to talk
> > in the draft about what will be needed for any version and what will
> > be only in some.
> >
> > The WG needs to be able to decide where to draw the line between
> > features that might be nice (mission creep), and features that are
> > necessary.
>
>
>I'll be happy to take advice in that space. Based on this discussion, I
>did expand the possible use cases where a mandatory "mask" would make
>sense for backwards compatibility; as discussed, the unmasked portion
>would hold to the constraints imposed by rfc1323 (monotonic increase in
>value between segments), while the masked (opaque) part is excluded from
>this limitation.
>
>
>
> > > > Perhaps instead it would be better to use the following format:
> > > >    0                   1                   2                   3
> > > >    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> > > >   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> > > >   |E| R |T|       |               |R|         |                   |
> > > >   |X| E |Y| MASK  |      RES      |E|  EXP16  |      FRAC16       |
> > > >   |O| S |P|       |               |S|         |                   |
> > > >   | |   |E|       |               | |         |                   |
> > > >   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> > > > The 2-bit reserved field before the type would be useful if the
> > > > number of types needed is eventually >2. Leaving it reserved
> > > > allows it to be used for some other purpose orthogonal to types,
> > > > if necessary. Note I've called the third field 'TYPE', whereas in
> > > > draft-01 you called it a reserved flag, then called it RNG in the
> > > > appendix.
> > >
> > >RNG is no longer in the draft (the PDF version I sent you via email),
> > >that was in version -01.
> >
> > Yup, I had noticed that. But, because I cc'd my email to the list, I
> > tried to relate the discussion to draft-01 that everyone has seen,
> > not the offlist preview you sent me.
> >
> > >Appendix A refers to version 1, which would be sender (SYN, no ACK)
> > >only, and responded to with certain semantics, so that an identical
> > >(or at least known) timestamp clock rate can be arrived at with only
> > >one exchange in each direction. (It'll be some time before TCP stacks
> > >allow dynamic adjustment of their timestamp clock rates, I guess :) ).
> > >
> > > > (An alternative might have been to call it a version field, but
> > > > that wouldn't be quite right. For instance, in Appendix A,
> > > > although the TCP server understands what I would call type 1, it
> > > > replies with a type 0 option. This doesn't have the same semantics
> > > > as versioning.)
> > >
> > >Yes, the -02 PDF has a 2-bit version field with the difference in
> > >semantics you point out. I really want this to be an enumeration, and
> > >the exact semantics (like a version 0 receiver is free to either
> > >ignore higher versions, or respond simply with version 0 always) are
> > >in the specs.
> > >
> > >I'm unhappy with "type" because that sounds like request/response,
> > >but implicitly, the active sender (SYN, no ACK bit set) is the
> > >requester, and the passive sender (receiver, SYN+ACK) is the
> > >responder. Having two different definitions in a single version of
> > >the timestamp negotiation, depending on direction (i.e. whether the
> > >ACK bit is set or not), with one being identical to version 0, sounds
> > >like overkill - version 0 would already have the appropriate
> > >signaling for the response...
> > >
> > >In the PDF, I settled on version as that deviates the least, as I
> > >understand it, in the effective semantics.
>
>
> >
> > To repeat back to you, you are saying all version 0 implementations
> > must also support version 1? Then I don't think 'version' is the
> > right word for this. If you don't like 'type' either, then we need
> > another word. What about 'kind'?
>
>
>Well, a receiver that understands the EXO bit has to reply with at
>least version 0 (regardless of the version sent to it).
>
>For a received version 0, it can do additional checks (to verify that
>the sender wasn't just misbehaving), while for any other version, the
>receiver couldn't do local time-based calculations, only the
>traditional PAWS check...
>
>I think the EXO bit is what you had as TYPE, with my VERsion field being
>specifics about the semantics...

No, I wasn't thinking that. Your EXO field is like a "Not vRFC1323" 
flag, so it's a part of a version field. Whereas your VER field 
doesn't seem to have the semantics of Version, because a v1 receiver 
replies to a v1 request with a v0 reply. This is purely a terminology 
problem, nothing serious.


>If it aligns better with standing naming practices, perhaps "VER" should
>really be a message type as a whole, and the semantics describe, how a
>receiver has to deal with unknown types (respond with type 0, or
>(discouraged) traditional RFC1323).
>
>Basically:
>
>(receiver version 0)
>EXO=0 -> RFC1323
>EXO=1, VER=0 -> VER=0
>EXO=1, VER=1..3 -> VER=0 (or RFC1323)
>
>A future spec may stipulate
>
>EXO=1, VER1 -> VER1 in return, or (as in Appendix A) VER1 -> VER0 etc...

I agree with all this about specifying what the receiver must do. My 
only concern was to choose a word that doesn't confuse people, rather 
than redefining the meaning of a word like 'version' that actually 
has different semantics in normal usage.

The term 'version' normally has the semantics that a lower version 
implementation cannot understand a higher version protocol. If you 
want to stray from that, I suggest you find a different word.
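
The receiver rules tabulated in the quoted text above can be written down as a small decision function. This is a hedged Python sketch only: the function name, the string return values and the `fall_back` flag are hypothetical, and it covers just the three "receiver version 0" cases listed in the thread.

```python
def v0_receiver_reply(exo: int, ver: int, fall_back: bool = False) -> str:
    """Reply chosen by a version-0 receiver for an incoming SYN.

    exo: the EXO bit; ver: the 2-bit VER (or TYPE) value.
    fall_back: take the discouraged option of answering an unknown
    VER with plain RFC1323 timestamps instead of VER=0.
    """
    if exo not in (0, 1) or not 0 <= ver <= 3:
        raise ValueError("EXO is 1 bit, VER is 2 bits")
    if exo == 0:
        return "RFC1323"      # sender did not ask for negotiation
    if ver == 0:
        return "VER=0"        # known version, answer in kind
    # VER=1..3 is unknown to a version-0 receiver
    return "RFC1323" if fall_back else "VER=0"
```

Note that the reply never carries a VER above 0, which is exactly why the field does not behave like a conventional version number.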




> > > > 5/ For range negotiation (RNG, Appendix A), why not use the same
> > > > FRAC12 field for the hi and lo end of the range, and only
> > > > communicate hi and lo EXP12 fields? Surely it's not so important
> > > > to specify the ends of the ranges precisely.
> > >
> > >I really couldn't come up with a decent use case for range
> > >negotiation, where it would be important that both ends run at the
> > >same clock rate. That's why I put this whole range negotiation stuff
> > >into an appendix, as just an example of a future enhancement.
> > >
> > >You are completely correct: depending on what one wants to achieve
> > >with something like a range negotiation, more bits could be saved or
> > >shuffled to better uses.
> > >
> > >
> > >
> > > > This would leave more space reserved for future stuff, when range
> > > > negotiation was also needed.
> > >
> > >A roll of the version number, and reshuffling of the signaling bits
> > >appropriately (i.e. a certain use may only be possible with TCP
> > >timestamp clocks ticking between micro- and milliseconds, instead of
> > >nano- to tenths of seconds) would achieve the same. Of course, if
> > >this signaling takes off, and people figure out clever ways to
> > >further utilize a timing / sideband channel in TCP, having just 4
> > >enumerated versions available may be too few :). I think this
> > >discussion can be postponed until version 3 of the negotiation
> > >signaling is to be defined :)
> >
> > You'll notice I suggested 2 bits of reserved space before the 1-bit
> > type/kind/version field. That would allow extension to 8
> > types/kinds/versions. But it also allows some other future use to
> > burn those bits if we decide it's more important than allowing space
> > for more types in future.
> >
> > HOWEVER, every new variant faces a new deployment problem.
> > I strongly believe we should try to settle on a core extension to
> > timestamping that the WG believes includes enough features for
> > important stuff like one-way delay, and we draw a line above "would
> > be nice to have" mission creep.
>
>From my point of view, 3 enumeration points are still unused; I would
>like to keep the 5-bit mask though, but one bit could indicate if MASK
>is valid (and if MASK is not valid, a version 0 receiver would need to
>treat the TSval as a completely opaque entity, with no PAWS allowed):

I didn't mean you to have to waste a bit now saying whether MASK is valid.

My comments were in the context of objecting to the text added in the 
-02 PDF draft: "MASK: ... This field MUST be present in all future 
version of timestamp capability fields."

All you need to do now is *not* say this sentence.

Then, if someone doesn't want MASK to be valid in a future version, 
they just need to say in the spec of v3 (say) that the bits where 
MASK was in v0 have a different usage.


>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>|E|V|V|         #               |         |                     |
>|X|L|E|  MASK   #      RES      |   EXP   |        FRAC         |
>|O|D|R|         #               |         |                     |
>+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>
>VLD (valid) ... must be set (MASK valid); if unset, treat TSval opaque
>VER (or TYPE) ... 0 (or 1 - compatible with v0 receivers)
>MASK # of LSB bits to exclude from PAWS (timestamp) calculations
>
>If VLD=0, MASK gives 5 bits to burn otherwise; some possible features
>that would require MASK=31 can simply set VLD=0 for the same purpose...
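
For reference, the layout in the diagram above can be packed and unpacked as follows. This Python sketch rests on assumptions: field widths are read off the ASCII art (EXO 1, VLD 1, VER 1, MASK 5, RES 8, EXP 5, FRAC 11 bits), bit 0 is taken as the most significant bit in the usual IETF drawing convention, and the names `TsCapability`, `pack_capability` and `unpack_capability` are hypothetical.

```python
from typing import NamedTuple


class TsCapability(NamedTuple):
    exo: int    # 1 bit
    vld: int    # 1 bit: MASK valid
    ver: int    # 1 bit in this variant of the layout
    mask: int   # 5 bits: number of LSBs excluded from PAWS
    res: int    # 8 bits reserved
    exp: int    # 5 bits
    frac: int   # 11 bits


def unpack_capability(word: int) -> TsCapability:
    """Split a 32-bit capability word into its fields (MSB first)."""
    return TsCapability(
        exo=(word >> 31) & 0x1,
        vld=(word >> 30) & 0x1,
        ver=(word >> 29) & 0x1,
        mask=(word >> 24) & 0x1F,
        res=(word >> 16) & 0xFF,
        exp=(word >> 11) & 0x1F,
        frac=word & 0x7FF,
    )


def pack_capability(c: TsCapability) -> int:
    """Inverse of unpack_capability; out-of-range bits are truncated."""
    return (
        ((c.exo & 0x1) << 31) | ((c.vld & 0x1) << 30) | ((c.ver & 0x1) << 29)
        | ((c.mask & 0x1F) << 24) | ((c.res & 0xFF) << 16)
        | ((c.exp & 0x1F) << 11) | (c.frac & 0x7FF)
    )
```

A receiver following the VLD rule would look at `vld` first and ignore `mask` entirely when it is 0.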
>
>
>
>
> > > > However, I'm not convinced range negotiation is important. Can you
> > > > motivate it?
> > >
> > >Not really (the reason why it was pushed to an appendix as an
> > >example). The only sensible idea I have would be a hardware-assisted
> > >TCP stack, which does the timing processing on the NIC; with range
> > >negotiation, both TCP timestamp clocks could run at the exact same
> > >frequency, simplifying one-way delay variance calculations (so that
> > >wirespeed @ 10G Ethernet becomes a possibility). As long as one-way
> > >delay variance calculations are carried out in software (main CPU),
> > >doing a division (or at least one multiplication) instead of only
> > >addition/subtraction should have very little impact really...
> >
> > But then range negotiation doesn't help anyway, if both ends have a
> > set of discrete frequencies their hardware can use and they are
> > trying to find one they have in common. This requires an enumeration,
> > not a range. That's tough to fit into a confined space.
>
>I believe it was Anantha Ramaiah who also suggested having an
>enumerated "typical" rate/frequency... With 24 bits available, 24
>"common" rates could be defined in a range-negotiation type (the sender
>sends a timestamp capability with the locally supported rates
>(appropriate bits set), and the receiver picks the one it can also
>support (or none) in the 2nd packet)...

Yes, but we need to work out a way to future-proof a look-up table of 
"common" rates. It is hard to imagine picosecond or femtosecond 
resolution now, but it will happen. We need something like an 
exponent (scaling factor), where we can define a list of common 
numbers, but put a scaling factor in the capability negotiation, so 
that each "common" value can be multiplied by the same scaling factor.

[Also, 24 bits seems an excessively large range of choice.]
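
A rough Python sketch of the bitmap-plus-scaling-factor idea, to show how little machinery it needs. Everything specific here is an assumption made for illustration: the table of base values is invented, the scaling factor is taken to be a non-negative power of ten, and the rule "pick the highest common index" is just one possible policy. None of it comes from the draft.

```python
from typing import Optional

# Hypothetical table of "common" base values; the thread defines none.
HYPOTHETICAL_BASE_RATES = (1, 2, 5, 10)


def pick_common_rate(offered: int, supported: int,
                     nbits: int = 24) -> Optional[int]:
    """Return the index of a rate both ends support, or None.

    offered/supported are bitmaps with bit i set when table entry i is
    supported. Assumed policy: choose the highest common index.
    """
    common = offered & supported & ((1 << nbits) - 1)
    if common == 0:
        return None
    return common.bit_length() - 1


def scaled_rate_hz(index: int, scale_exp: int) -> int:
    """Apply one shared scaling factor (10**scale_exp) to a table entry."""
    if scale_exp < 0:
        raise ValueError("this sketch only handles non-negative exponents")
    return HYPOTHETICAL_BASE_RATES[index] * 10 ** scale_exp
```

Because the exponent travels separately, the same short table could cover millisecond clocks today and much finer ones later without being redefined.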


>Anyway, this is all possible once it becomes clear that the basic
>functionality is worthwhile to have (and that is exactly the reason
>why I try to allow a multibit version/type field).

Indeed


Cheers


Bob




>Best regards,
>    Richard

________________________________________________________________
Bob Briscoe,                                BT Innovate & Design 


From touch@isi.edu  Tue May 31 10:01:36 2011
Return-Path: <touch@isi.edu>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id B6707E07E0 for <tcpm@ietfa.amsl.com>; Tue, 31 May 2011 10:01:36 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -105.057
X-Spam-Level: 
X-Spam-Status: No, score=-105.057 tagged_above=-999 required=5 tests=[AWL=1.542, BAYES_00=-2.599, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id KXoMCbPAHn2x for <tcpm@ietfa.amsl.com>; Tue, 31 May 2011 10:01:35 -0700 (PDT)
Received: from boreas.isi.edu (boreas.isi.edu [128.9.160.161]) by ietfa.amsl.com (Postfix) with ESMTP id 74384E068E for <tcpm@ietf.org>; Tue, 31 May 2011 10:01:35 -0700 (PDT)
Received: from [128.9.160.252] (pen.isi.edu [128.9.160.252]) (authenticated bits=0) by boreas.isi.edu (8.13.8/8.13.8) with ESMTP id p4VH0mYL024191 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT); Tue, 31 May 2011 10:00:49 -0700 (PDT)
Message-ID: <4DE51EC0.7070302@isi.edu>
Date: Tue, 31 May 2011 10:00:48 -0700
From: Joe Touch <touch@isi.edu>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10
MIME-Version: 1.0
To: "Scheffenegger, Richard" <rs@netapp.com>
References: <5FDC413D5FA246468C200652D63E627A0E99C04B@LDCMVEXC1-PRD.hq.netapp.com> <5FDC413D5FA246468C200652D63E627A0E99C04C@LDCMVEXC1-PRD.hq.netapp.com>
In-Reply-To: <5FDC413D5FA246468C200652D63E627A0E99C04C@LDCMVEXC1-PRD.hq.netapp.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-ISI-4-43-8-MailScanner: Found to be clean
X-MailScanner-From: touch@isi.edu
Cc: tcpm@ietf.org
Subject: Re: [tcpm] feedback from group to	draft-scheffenegger-tcpm-timestamp-negotiation-02.txt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 31 May 2011 17:01:36 -0000

On 5/28/2011 5:47 AM, Scheffenegger, Richard wrote:
>
> Hi,
>
> this is to summarize the open (technical content) issues, as mentioned
> in earlier emails:
>
> 1)
> naming of the "VER"sion field. As a receiver implementing this draft
> should always respond with at least a capabilities field where this
> field is set to zero (described in the draft), regardless of the version
> used by the sender, a semantically more correct name for this field may
> be "TYPE" or message-"ID".
>
> It really describes the content of the lower 3 octets, and is intended
> as an enumeration. The semantics how / when to use what value in this
> field is explicit in the spec (and again, a value=0)

That's still basically a version field. It describes not just the 
interpretation of the rest of the field, but also the overall protocol 
used (as per the draft).

However, other than "version 0", there are no useful versions of this, 
AFAICT - see below.

> 2)
> 5-bits mask field: It may be interesting to prevent a (version 0, see
> above) receiver from processing the TSval during the data stream at all,
> while bit-granularity of the mask may not be needed for too many
> applications. At the same time, concern was raised as all the common
> bits are already assigned to fields (even though 3 enumerations of the
> "VER" field are available as future expansions).
>
> One suggestion is to have a separate bit to indicate if MASK is valid,
> and if that (smaller, 4-bit) field is not, the receiver stops any TSval
> processing (including the PAWS check). This could yield additional (sub-)
> codepoints for future extensions of the negotiation.
>
> There are still 8 unused (reserved) bits, specific to version 0;
> however, some reserved bits are important as long as misbehaving RFC1323
> senders may be operating, to prevent receivers from processing TSval
> improperly.

For reserved fields, you need to specify:

	- whether a VERS=0 sender MUST set them
	- whether a VERS=0 receiver MUST check them
	on receipt or ignore them

"MUST be zero" is ambiguous - it could refer to the sender's actions or 
the receiver's.

I don't like the rest of the description, which may impact the overall 
utility of this option:

            If timestamp capabilities are received with version set
            to 0, but some of these bits set, the receiver MUST ignore
            the extended options field and react as if the TSecr was zero
            (compatibility mode).

There is no requirement that TCP use actual time in timestamps.
Other than "all zeroes" (as spec'd in RFC1323), there are no special 
values for timestamps.

So I'm a bit concerned about the idea of interpreting whether this 
option is in effect simply by the use of unexpected timestamp values.

Joe

From rs@netapp.com  Tue May 31 11:33:05 2011
Return-Path: <rs@netapp.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 89989E06BC for <tcpm@ietfa.amsl.com>; Tue, 31 May 2011 11:33:05 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -8.499
X-Spam-Level: 
X-Spam-Status: No, score=-8.499 tagged_above=-999 required=5 tests=[AWL=-0.800, BAYES_00=-2.599, J_CHICKENPOX_23=0.6, MANGLED_PILL=2.3, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id EFP3-GK0KGyb for <tcpm@ietfa.amsl.com>; Tue, 31 May 2011 11:33:03 -0700 (PDT)
Received: from mx3.netapp.com (mx3.netapp.com [217.70.210.9]) by ietfa.amsl.com (Postfix) with ESMTP id DF281E06B9 for <tcpm@ietf.org>; Tue, 31 May 2011 11:33:01 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="4.65,298,1304319600"; d="scan'208";a="257802313"
Received: from smtp3.europe.netapp.com ([10.64.2.67]) by mx3-out.netapp.com with ESMTP; 31 May 2011 11:32:59 -0700
Received: from ldcrsexc2-prd.hq.netapp.com (emeaexchrs.hq.netapp.com [10.65.251.110]) by smtp3.europe.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id p4VIWwD8004340; Tue, 31 May 2011 11:32:58 -0700 (PDT)
Received: from LDCMVEXC1-PRD.hq.netapp.com ([10.65.251.107]) by ldcrsexc2-prd.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.3959);  Tue, 31 May 2011 19:32:57 +0100
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Tue, 31 May 2011 19:32:14 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0E99CD3E@LDCMVEXC1-PRD.hq.netapp.com>
In-Reply-To: <201105311539.p4VFdDQM030375@bagheera.jungle.bt.co.uk>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: [tcpm] discussing MASK - Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
Thread-Index: AcwfqP2skNjCQ7ZzTrKiZUkyhIWvOAAA0HFA
References: <201105231457.p4NEvZfm017363@bagheera.jungle.bt.co.uk><5FDC413D5FA246468C200652D63E627A0E89AF5A@LDCMVEXC1-PRD.hq.netapp.com><201105261137.p4QBbc0D012437@bagheera.jungle.bt.co.uk><5FDC413D5FA246468C200652D63E627A0E99BFF6@LDCMVEXC1-PRD.hq.netapp.com> <201105311539.p4VFdDQM030375@bagheera.jungle.bt.co.uk>
From: "Scheffenegger, Richard" <rs@netapp.com>
To: "Bob Briscoe" <bob.briscoe@bt.com>
X-OriginalArrivalTime: 31 May 2011 18:32:57.0941 (UTC) FILETIME=[29EC6850:01CC1FC1]
Cc: tcpm@ietf.org, draft-scheffenegger-tcpm-timestamp-negotiation@tools.ietf.org
Subject: Re: [tcpm] discussing MASK - Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 31 May 2011 18:33:05 -0000

Hi Bob,

Just to focus on the MASK discussion (I'll address the other comments
later).

> > My question was whether there could be a version where the sender
> > ensures timestamps monotonically increase, so MASK would not be
> > necessary.

After thinking about this question some more, I believe it really boils
down to whether the protocol should provide for senders adding some
entropy to TSval or not. That entropy could be used in various
ways, as described in the Use Cases of the draft. MASK cannot be
removed if the sender maintains monotonicity but adds entropy
nevertheless...

I have argued that it makes sense to allow a sender to add entropy. Even
if that entropy could ensure that the full 32-bit values remain
monotonically increasing, a receiver would still have to know where the
clock-based timestamp value ends and the entropy starts. In fact, these
two boundaries (monotonicity and time value) may not necessarily coincide,
but keeping them identical saves another field.

The receiver may want to do a few things:

- PAWS processing (a simple monotonic increase of TSval is sufficient).
- Time processing (needs to know where the last significant bit of that
  timestamp is in TSval); time values, by definition, are monotonically
  increasing at the sender side.
- Entropy processing [short of introducing another word] - perhaps
  interesting in a future variant (to replace "version" in the field
  definition).


There are several alternative ways in which the sender may set TSval.
Each of these MUST be processed correctly by a variant 0 receiver, IMHO.

a) entire TSval is monotonically increasing, but has no direct relation
to wall-clock time

This is what a receiver currently has to assume; PAWS processing using
the full TSval is possible. Here, MASK is set to 0, and DUR is also set
to 0 (draft-02 defined DURation as the value of EXP/FRAC together).


b) entire TSval is monotonically increasing, and has a direct relation
to wall-clock time

This is how TSval should be used, but currently a receiver has no way of
knowing exactly what clock tick interval is used by the sender. Here
MASK would be 0, while DUR would be the duration of a timestamp clock
tick (probably static at compile time).


c) a part of TSval is monotonically increasing (with or without a
relationship to wall-clock time, the timestamp value boundary coinciding
with the monotonic part or not)

A receiver should at least be capable of using the monotonically
increasing part in PAWS processing, even if it does not know how to
process the TSval for other purposes. This was the reason for stipulating
that MASK is always present. A future variant may use a monotonically
increasing TSval for non-time-related purposes, but a receiver could
still make use of the monotonic part. MASK would be set to some non-zero
value smaller than 31.

d) the sender does not want the receiver to do any processing of TSval
for any heuristics (PAWS or otherwise), because no part of TSval
satisfies the monotonicity restriction imposed by PAWS.

This would be the case with MASK set to 31, meaning a complete masking of
TSval. With MASK=31, a variant 0 receiver would disable even PAWS
processing. Obviously, this setting wouldn't really make sense between
two variant 0 hosts, because it would remove standard RFC1323 features
entirely. However, this would allow (experimental) deployment of future
variants, where two hosts understanding that variant could make sense of
the TSval again. A variant 1..3 sender may also have to have the
capability to dynamically switch to a legacy mode (with only traditional
use of TSval) if it receives a variant 0 answer. Nevertheless, such a
sender may still want to add entropy in that legacy mode (case c above).
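
The four cases a) to d) can be summarised as a small decision function.
This is only a sketch under stated assumptions: the names ReceiverPlan
and plan_for are hypothetical, MASK is taken as the 5-bit value 0..31,
and DUR == 0 is read as "no relation to wall-clock time".

```python
from typing import NamedTuple, Optional


class ReceiverPlan(NamedTuple):
    case: str                  # letter of the matching case, "a".."d"
    paws_shift: Optional[int]  # low-order bits to drop before PAWS; None = PAWS off
    timing: bool               # True if TSval ticks relate to wall-clock time


def plan_for(mask: int, dur: int) -> ReceiverPlan:
    """Map the negotiated MASK and DUR values to variant 0 receiver behaviour."""
    if not 0 <= mask <= 31:
        raise ValueError("MASK is a 5-bit field")
    if mask == 31:
        # Case d: the whole TSval is masked, so not even PAWS is possible.
        return ReceiverPlan("d", None, False)
    timing = dur != 0
    if mask > 0:
        # Case c: only the unmasked high-order part is monotonic.
        return ReceiverPlan("c", mask, timing)
    # Cases a and b: full TSval is monotonic; DUR decides whether it is a clock.
    return ReceiverPlan("b" if timing else "a", 0, timing)
```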



If a sender adds some entropy (hash etc), the implementation for the
receiver doesn't change. Receiver side implementation with a fixed MASK
field, which has to be maintained with any future variant, would be easy
(no hash/crypto processing is necessary). Only a fixed number has to be
kept (0-32) at the receiver by which a TSval is right-shifted, before
PAWS processing is done (right shifting by 32 is the same as disabling
PAWS, of course).
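
A minimal sketch of the shift-then-PAWS idea, assuming 32-bit timestamps
and the usual wrap-around (serial number) comparison on the remaining
bits. The function name paws_accept is hypothetical, and treating fewer
than two unmasked bits as "PAWS disabled" is an assumption made for this
example.

```python
def paws_accept(ts_val: int, ts_recent: int, shift: int) -> bool:
    """Return True if a segment passes the PAWS check on the unmasked bits."""
    if shift < 0:
        raise ValueError("shift must be non-negative")
    width = 32 - shift
    if width < 2:
        # Fewer than two unmasked bits cannot order timestamps: PAWS is off.
        return True
    # Drop the masked (entropy) bits from both timestamps.
    a = (ts_val & 0xFFFFFFFF) >> shift
    b = (ts_recent & 0xFFFFFFFF) >> shift
    # Wrap-around comparison on `width` bits: reject only if a is "before" b.
    diff = (a - b) % (1 << width)
    return diff < (1 << (width - 1))
```

Entropy in the masked low-order bits never influences the outcome, so
the receiver needs no knowledge of how those bits were generated.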

A future variant may have the sender indicate what is contained in those
masked-out bits, if it becomes interesting for the receiver to process
them (such as a payload CRC). Also, if such processing is done at the
receiver end, the semantics of the data stream TSval may change (e.g.
TSecr may not be a 1:1 reflection of TSval at all...). But this is of no
concern for a variant 0 implementation, really...


Using fewer than 5 bits would be possible, but would require agreement
on common numbers of masked bits and some kind of lookup in the TCP
stacks (possibly wasting a few bits during the session, or not having
enough for some purposes, when only certain mask values are supported).
For example, a 3-bit field encoding 0, 2, 4, 8, 12, 16, 24, 32 was
discussed early on between Mirja and me, before we decided to go for
single-bit resolution. As mentioned above, full masking would probably
not make sense, so the mask distribution could be adjusted...
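
A small sketch of the 3-bit lookup alternative, using the eight values
mentioned above (0, 2, 4, 8, 12, 16, 24, 32). The helper names
encode_mask and decode_mask are hypothetical; the point is only to show
where bits get wasted when the needed mask width falls between two table
entries.

```python
from bisect import bisect_left

# Code point i of the 3-bit field stands for MASK_BITS[i] masked bits.
MASK_BITS = (0, 2, 4, 8, 12, 16, 24, 32)


def encode_mask(needed: int):
    """Return (code, wasted_bits) for the smallest table entry >= needed."""
    if not 0 <= needed <= 32:
        raise ValueError("needed must be within 0..32")
    code = bisect_left(MASK_BITS, needed)
    return code, MASK_BITS[code] - needed


def decode_mask(code: int) -> int:
    """Return the number of masked bits for a 3-bit code point."""
    return MASK_BITS[code & 0x7]
```

A sender needing 5 bits of entropy, for instance, would have to
advertise 8 masked bits and give up 3 bits of timestamp range, which is
the trade-off against single-bit resolution.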



The reason for having MASK in the common part is that a variant 0
receiver which sees a variant != 0 negotiated would have to fall back
to RFC1323 processing, even though it may be preferable for a variant 1,
2 or 3 sender to have at least the semantics of variant 0 (e.g. direct
reflection instead of RFC1323 receiver side reflection). Alternatively,
a variant 1..3 sender would have to use an unsecured TSval that is
monotonically increasing if a variant 0 is at the other end; nearly the
same as a complete fall-back to RFC1323 on the receiver side, though.

Thus if MASK is removed from the common fields, the specs must be very
clear about the expected receiver behavior, and TSval semantics when a
var1..3 talks to a var0 receiver.

Best regards,




Richard Scheffenegger



> -----Original Message-----
> From: Bob Briscoe [mailto:bob.briscoe@bt.com]
> Sent: Tuesday, 31 May 2011 17:39
> To: Scheffenegger, Richard
> Cc: tcpm@ietf.org; draft-scheffenegger-tcpm-timestamp-negotiation@tools.ietf.org
> Subject: Re: [tcpm] Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
>
> Richard,
>
> At 21:16 27/05/2011, Scheffenegger, Richard wrote:
>
>
> > > From: Bob Briscoe [mailto:bob.briscoe@bt.com]
> >
> >
> > > > > Alternatively, given (offlist) you've said that MASK
> > > > > must be present in all future types of capability
> > > > > negotiation, it could be placed under category (a).
> > > > > However, I'm not sure how you know it will always
> > > > > be needed before future types have all been
> > > > > invented.
> >
> > > > The reason for mask to be present would be to fence off
> > > > parts of TS.val that must be excluded from PAWS
> > > > processing on the receiver side. That was also in the
> > > > context of allowing a version0 receiver to reply always
> > > > with version0, even if it can't process the timestamp
> > > > value for anything but PAWS (excluding the masked bits,
> > > > where the monotonic increase property can be broken by
> > > > the sender). A future version may specify what is in
> > > > that opaque part for a compatible receiver still...
> >
> > > My question was whether there could be a version where the
> > > sender ensures timestamps monotonically increase, so MASK
> > > would not be necessary.
> > >
> > > I suspect I have not fully understood what the underlying
> > > need for MASK is. The doc seems to hint at two:
> > > a) when not echoing timestamps to the rules in RFC1323,
> > >    they might not monotonically increase
> > > b) some senders add entropy to the low order bits of
> > >    timestamps to protect against malicious tweaking of
> > >    TSecr values
> > >
> > > Am I correct?
>
> You haven't answered this question really - it really needs answering
> in the draft as part of describing the problem.
>
> > >
> > > My thinking is that b) would not be necessary in trusted
> > > environments. I'm now not sure whether a) will always be
> > > a requirement.
> >
> >Well, transparent timestamps will have to live in a potentially hostile
> >environment; for example, Linux 2.6.13 .. 2.6.22 used TSecr directly in
> >CUBIC congestion control, and malicious receivers could exploit that to
> >get an unfairly large share of bandwidth. The Linux community then appears
> >to have decided to throw out the baby with the bath water - and
> >currently they don't use TSecr directly, but perform a per-segment RTT
> >tracking based on other clues...
> >
> >If b) is not necessary, the sender (receiver) is free to set MASK=0.
>
> True.
>
> HOWEVER, that still means implementers have to support MASK!=0. That
> adds quite a bit of complexity to their TCP code to support some
> tricky crypto-heuristics.  Given it seems some implementers cannot
> even get basic things like 2-bit ECN or SACK right, this seems to be
> setting the bar very high.
>
> I am not saying security isn't important. I am saying that mandating
> security features is often worse than no security at all, because it
> forces people to have to implement complex security techniques when
> they aren't capable of doing so properly.
>
> >However, algorithms critical for TCP (such as congestion control) would
> >still need to check the plausibility of returned timestamps (see my
> >comment about CUBIC).
>
> Whether to make security mandatory is a question for the w-g if this
> draft becomes a wg item - no need for us to argue further on this.
>
>
> >With a MASK that could cover almost the entire TSval field, the PAWS
> >test at the receiver is very simple to implement in a future-proof
> >version (simply shift right the TSval before performing PAWS). This may
> >come in handy if different kinds of entropy are added to the TSval by
> >a future version (that the receiver doesn't understand, but where the
> >sender still wants immediate reflection of TSval, and the receiver still
> >can perform PAWS with the last remaining unmasked bit)...
>
> I agree it's clever. I was just thinking that a future version might
> prefer to have more space for something instead of the MASK.
>
> My comments were in the context of this text added in the -02 PDF
> draft: "MASK: ... This field MUST be present in all future version of
> timestamp capability fields."
>
> I'm not saying don't include MASK, just don't make it mandatory for
> all future versions. I would try to say as little as possible is a
> MUST for future versions as yet uninvented.
>
>
> >Having one bit set aside for "TS will be strict monotonic increasing"
> >will not fully solve this, as you still would have to have a mask field
> >to demark where potentially added entropy begins; the sender timestamp
> >clock value should still be recognizable by the receiver...
> >
> >I agree that having 5 bits (0..31) appears to be overkill, but with 64
> >or 128 bit timestamps, the fraction of reserved bits would be smaller.
> >
> >Perhaps I need to explicitly state that the non-masked bits have to be
> >non-decreasing (monotonic increasing with sequence numbers, strictly
> >monotonic increasing between windows). With that property, having a
> >single unmasked bit is pointless (because +1 and -1 have the same
> >result). Thus a MASK=31 could be interpreted as all bits are opaque to a
> >version0 receiver, and the highest meaningful value would be 30 (leaving
> >the topmost 2 bits for PAWS).
>
> That reminds me, I suggest you avoid using the term opaque, but
> instead say exactly what you mean (the field need not have any
> semantics that the echoing node can check or understand).
>
> Opaque is nearly as ambiguous as the word transparent, which some
> people use to mean you can see something and others mean you cannot
> see it (you see through it).
>
> >Actually, I was just made aware of the fact that making TSval truly
> >opaque without any constraints such as for PAWS could allow a data
> >payload CRC, as discussed in
> >http://tools.ietf.org/html/draft-ietf-tcpm-anumita-tcp-stronger-checksum-00;
> >provided that both ends agree on the scheme (tbd in a IANA
> >sanctioned future version :) ).
> >
> >Does that sound reasonable?
>
> Well, it wouldn't make sense to give implementers a dilemma that they
> could only use a bigger checksum or timestamps, but not both.
>
> And anyway, I don't think a CRC needs the semantics of echoing.
>
>
> > > > After thinking about this a bit more (I departed from
> > > > IEEE-754 originally but modified it already as you noted),
> > > > that bit would serve best when re-shuffled for 1 bit higher
> > > > precision.
> > >
> > > > I would argue it's OK to depart from IEEE-754 in some ways but not
> > > > others:
> > > >
> > > > - Making the sign bit always 0 and outside the wire protocol is OK,
> > > > because implementers are unlikely to get it wrong
> > > >
> > > > - Adding 1 bit more precision to binary16 float representation is NOT
> > > > OK, because implementers will have to tweak low level arithmetic code
> > > > rather than using common libraries. Given the level of bugs that have
> > > > recently been found even in really simple things like ECN or SACK, it
> > > > is important to keep all changes really really simple.
> >
> >I understand that many OS (and SOCs) can not do floating point in the
> >Kernel, and even if they can, binary16 is often not implemented in
> >hardware.
> >
> >Thus, some bit-banging will be required to find the correction factor
> >between local and remote timestamp clocks for OWD calculation.
> >
> >Actually, your argument works in my favor :)
> >
> >If the representation is suspiciously close to IEEE-754, an implementer
> >may simply put that into a regular float16 library; however, EXP=31
> >would mean infinity/NaN in such a regular IEEE library, which would
> >probably lead to even more intricate bugs...
>
> It would be best to specifically say that EXP=31 MUST signal
> something exceptional, so implementers have to catch it before a
> library does.
>
>
> >Having a spec that is off by 1 from something very similar may help
> >implementers to read the spec more closely (to discover that one can not
> >simply put this into a float16, multiply with 2^-13 and be done with
> >parsing it).
> >
> >Anyway, I'm not feeling very strong about how this bit finally ends up
> >either way :)
>
> I see your point. I would lean towards simplicity wherever possible,
> rather than making implementers have to do complex coding.
>
>
> > > > The only special value would be all-zero (this is included
> > > > in the PDF, I hope you reviewed that text) to indicate the
> > > > timestamp values are not related to wall-clock time.
> > >
> > > Yes, I noticed that.
> > >
> > > > In case anyone listening on the list is confused, Richard sent me an
> > > > offlist preview of the proposed next rev, in PDF format.
> >
> >One point here - setting the indicated clock rate to 0 would not be the
> >same as masking everything out (in my thinking), as even a TSval that is
> >not wall-clock related would still increase monotonically with
> >advancing segments... (allowing PAWS by the receiver, even when enhanced
> >timing uses have to be disabled).
>
> Understood.
>
>
> > > > > 4/ Could the MASK use up only 4 bits? Is anyone ever likely to want
> > > > > to mask more than 2**4=16 bits, given you already say (Section 6.4)
> > > > > that MASK>8 is discouraged.
> > > >
> > > > The mask field would make up to 31 bits completely opaque to the
> > > > receiver (thus exclude those bits also from PAWS receiver-side
> > > > processing, lifting any constraints like the monotonic increasing
> > > > values - PAWS would need that).
> > > >
> > > > The unstated reason for the full 5 bits was to cover up to 1/4th
> > > > of the TS value, even when TCPCT (RFC6013) with the 64 / 128 bit
> > > > timestamps are in use, or if the sender wants to have a true hash
> > > > value (segment digest) with each segment, or when the future
> > > > contents of TS.val would be at odds with PAWS processing...
> > >
> > > Understood. Will you be explaining this reasoning in the next rev?
> >
> >Will do!
> >
> >
> > >
> > > I assume 1/4 is a member of the set of made-up numbers, see
> > > <http://search.dilbert.com/comic/Made%20Up%20Numbers> ;)
> >
> >You need to start somewhere :) 8 bits out of 32 (as already recommended
> >in the draft) is also 1/4 - we have consistency there.
> >
> >
> >
> >
> > > > Also, a future version could define some use of these masked
> > > > out bits, but remain compatible with a version 0 receiver.
> > > > Such a receiver would treat the TS.val as opaque entity again,
> > > > and only the immediate echoing would remain as compatibility
> > > > with future instances...
> > >
> > > Again, I think this discussion shows that it would be useful to talk
> > > in the draft about what will be needed for any version and what will
> > > be only in some.
> > >
> > > The WG needs to be able to decide where to draw the line between
> > > features that might be nice (mission creep), and features that are
> > > necessary.
> >
> >
> >I'll be happy to take advice in that space. Based on this discussion, I
> >did expand the possible use cases where a mandatory "mask" would make
> >sense for backwards compatibility; as discussed, the unmasked portion
> >would hold to the constraints imposed by rfc1323 (monotonic increase in
> >value between segments), while the masked (opaque) part is excluded from
> >this limitation.
> >
> >
> >
> > > > > Perhaps instead it would be better to use the following format:
> > > > >     0                   1                   2                   3
> > > > >     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> > > > >    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> > > > >    |E| R |T|       |               |R|         |                   |
> > > > >    |X| E |Y| MASK  |      RES      |E|  EXP16  |      FRAC16       |
> > > > >    |O| S |P|       |               |S|         |                   |
> > > > >    | |   |E|       |               | |         |                   |
> > > > >    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> > > > > The 2-bit reserved field before the type would be useful if the
> > > > > number of types needed is eventually >2. Leaving it reserved allows
> > > > > it to be used for some other purpose orthogonal to types, if nec.
> > > > > Note I've called the third field 'TYPE', whereas in draft-01 you
> > > > > called it a reserved flag, then called it RNG in the appendix.
> > > >
> > > >RNG is no longer in the draft (the PDF version I sent you via email),
> > > >that was in version -01.
> > >
> > > Yup, I had noticed that. But, because I cc'd my email to the list, I
> > > tried to relate the discussion to draft-01 that everyone has seen,
> > > not the offlist preview you sent me.
> > >
> > > >Appendix A refers to version 1, which would be
> > > >sender (SYN, no ACK) only, and responded to with certain semantics - so
> > > >that an identical (or at least known) timestamp clock rate can be
> > > >arrived at with only one exchange in each direction. (it'll be some time
> > > >before tcp stacks would allow dynamic adjustment of their timestamp
> > > >clock rates, I guess :) ).
> > > >
> > > > > (An alternative might have been to call it a version field, but that
> > > > > wouldn't be quite right. For instance, in Appendix A, although the
> > > > > TCP server understands what I would call type 1, it replies with a
> > > > > type 0 option. This doesn't have the same semantics as versioning.)
> > > >
> > > >Yes, the -02 PDF has a 2-bit version field with the difference in
> > > >semantics you point out. I really want this to be an enumeration, and
> > > >the exact semantics (like a version 0 receiver is free to either ignore
> > > >higher versions, or respond simply with version 0 always) are in the
> > > >specs.
> > > >
> > > >I'm unhappy with "type" because that sounds like request/response - but
> > > >implicitly, the active sender (SYN, no ACK bit set) is the requester,
> > > >and the passive sender (receiver, SYN+ACK) is the responder. Having two
> > > >different definitions in a single version of the timestamp negotiation,
> > > >depending on direction (i.e. if the ACK bit is set or not), with one
> > > >being identical to version 0 sounds like overkill - version 0 would
> > > >already have the appropriate signaling for the response...
> > > >
> > > >In the PDF, I settled on version as that deviates the least as I
> > > >understand it in the effective semantics.
> >
> >
> > >
> > > To repeat back to you, you are saying all version 0 implementations
> > > must also support version 1? Then I don't think 'version' is the
> > > right word for this. If you don't like 'type' either, then we need
> > > another word. What about 'kind'?
> >
> >
> >Well, a receiver that understands the EXO bit has to reply with at
> >least version 0 (regardless of the version sent to it).
> >
> >For a received version 0, it can do additional checks (to verify if the
> >sender wasn't just misbehaving), while for any other version, the
> >receiver couldn't do local time-based calculations, only the traditional
> >PAWS check...
> >
> >I think the EXO bit is what you had as TYPE, with my VERsion field being
> >specifics about the semantics...
>
> No, I wasn't thinking that. Your EXO field is like a "Not vRFC1323"
> flag, so it's a part of a version field. Whereas your VER field
> doesn't seem to have the semantics of Version, because a v1 receiver
> replies to a v1 request with a v0 reply. This is purely a terminology
> problem, nothing serious.
>
>
> >If it aligns better with standing naming practices, perhaps "VER" should
> >really be a message type as a whole, and the semantics describe how a
> >receiver has to deal with unknown types (respond with type 0, or
> >(discouraged) traditional RFC1323).
> >
> >Basically:
> >
> >(receiver version 0)
> >EXO=0 -> RFC1323
> >EXO=1, VER=0 -> VER=0
> >EXO=1, VER=1..3 -> VER=0 (or RFC1323)
> >
> >A future spec may stipulate
> >
> >EXO=1, VER1 -> VER1 in return, or (as in Appendix A) VER1 -> VER0 etc...
>
> I agree with all this about specifying what the receiver must do. My
> only concern was to choose a word that doesn't confuse people, rather
> than redefining the meaning of a word like 'version' that actually
> has different semantics in normal usage.
>
> The term 'version' normally has the semantics than a lower version
> implementation cannot understand a higher version protocol. If you
> want to stray from that, I suggest you find a different word.
>
>
>
>
> > > > > 5/ For range negotiation (RNG, Appendix A), why not use the same
> > > > > FRAC12 field for the hi and lo end of the range, and only communicate
> > > > > hi and lo EXP12 fields? Surely it's not so important to specify the
> > > > > ends of the ranges precisely.
> > > >
> > > >I really couldn't come up with a decent use case for range negotiation,
> > > >where it would be important that both ends run at the same clock rate;
> > > >That's why I put this whole range negotiation stuff into an appendix, as
> > > >just an example of a future enhancement.
> > > >
> > > >You are completely correct, depending on what one wants to achieve with
> > > >something like a range negotiation, more bits could be saved or shuffled
> > > >to better uses.
> > > >
> > > >
> > > >
> > > > > This would leave more space reserved for future stuff, when
> range
> > > >negotiation was also needed.
> > > >
> > > >A roll of the version number, and reshuffling of the signaling
> bits
> > > >appropriately (i.e. a certain use may only be possible with tcp
> > > >timestamp clocks ticking between micro- and milliseconds, instead
> of
> > > >nano- to tenth's of seconds) would achive the same. Of course, if
> >this
> > > >signaling takes off, and people figure out clever ways to further
> > > >utilize a timing / sideband channel in TCP, having just 4
> enumerated
> > > >versions available may be too few :). I think this discussion can
> be
> > > >postponed until version 3 of the negotiation signaling is to be
> > > defined
> > > >:)
> > >
> > > You'll notice I suggested 2 bits of reserved space before the
1-bit
> > > type/kind/version field. That would allow extension to 8
> > > types/kinds/versions. But it also allows some other future use to
> > > burn those bits if we decide it's more important than allowing
> space
> > > for more types in future.
> > >
> > > HOWEVER, every new variant faces a new deployment problem.
> > > I strongly believe we should try to settle on a core extension to
> > > timestamping that the WG believes includes enough features for
> > > important stuff like one-way delay, and we draw a line above
"would
> > > be nice to have" mission creep.
> >
> > >From my point of view, 3 enumeration points are still unused; I
> would
> >like to keep the 5-bit mask though, but one bit could indicate if
MASK
> >is valid (and if MASK is not valid, a version 0 receiver would need
to
> >treat the TSval as completely opaque entity (no PAWS allowed):
>=20
> I didn't mean you to have to waste a bit now saying whether MASK is
> valid.
>=20
> My comments were in the context of objecting to the text added in the
> -02 PDF draft: "MASK: ... This field MUST be present in all future
> version of timestamp capability fields."
>=20
> All you need to do now is *not* say this sentence.
>=20
> Then, if someone doesn't want MASK to be valid in a future version,
> they just need to say in the spec of v3 (say) that the bits where
> MASK was in v0 have a different usage.
>=20
>=20
> >+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >|E|V|V|         #               |         |                     |
> >|X|L|E|  MASK   #      RES      |   EXP   |        FRAC         |
> >|O|D|R|         #               |         |                     |
> >+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >
> >VLD (valid) ... must be set (MASK valid); if unset, treat TSval opaque
> >VER (or TYPE) ... 0 (or 1 - compatible with v0 receivers)
> >MASK # of LSB bits to exclude from PAWS (timestamp) calculations
> >
> >If VLD=0, MASK gives 5 bits to burn otherwise; some possible features
> >that would require MASK=31 can simply set VLD=0 for the same purpose...
> >
> >
> >
> >
> > > > > However, I'm not convinced range negotiation is important. Can
> you
> > > >motivate it?
> > > >
> > > >Not really (the reason why it was pushed to an appendix as an
> > > example);
> > > >The only sensible idea I have would be a hardware-assisted TCP
> stack,
> > > >which does the timing processing on the NIC; with range
> negotiation,
> > > >both TCP timestamp clocks could run at the exact same frequency,
> > > >simplifying one-way delay variance calculations (so that
wirespeed
> @
> > > 10G
> > > >Ethernet becomes a possibility). As long as one-way delay
variance
> > > >calculations are carried out in software (main CPU), doing a
> division
> > > >(or at least one multiplication) instead of only
> addition/subtraction
> > > >should have very little impact really...
> > >
> > > But then range negotiation doesn't help anyway, if both ends have
a
> > > set of discrete frequencies their hardware can use and they are
> > > trying to find one they have in common. This requires an
> enumeration,
> > > not a range. That's tough to fit into a confined space.
> >
> >I believe it was Anantha Ramaiah, who also suggested to have an
> >enumerated "typical" rate/frequency... With 24 bits available, 24
> >"common" rates could be defined in a range-negotiation type (sender
> >sends timestamp capability with the locally supported rates
> (appropriate
> >bits set), and the receiver picks the one it can also support (or
> none)
> >in the 2nd packet...
>=20
> Yes, but we need to work out a way to future-proof a look-up table of
> "common" rates. It is hard to imagine picosecond or femtosecond
> resolution now, but it will happen. We need something like an
> exponent (scaling factor), where we can define a list of common
> numbers, but put a scaling factor in the capability negotiation, so
> that each "common" value can be multiplied by the same scaling factor.
>=20
> [Also, 24 bits seems an excessively large range of choice.]
>=20
>=20
> >Anyway, this is all possible, once it becomes clear that the basic
> >functionality is worthwhile to have (and exactly the reason, why I
try
> >to allow a multibit version/type field.
>=20
> Indeed
>=20
>=20
> Cheers
>=20
>=20
> Bob
>=20
>=20
>=20
>=20
> >Best regards,
> >    Richard
>=20
> ________________________________________________________________
> Bob Briscoe,                                BT Innovate & Design
>=20
> _______________________________________________
> tcpm mailing list
> tcpm@ietf.org
> https://www.ietf.org/mailman/listinfo/tcpm

From rs@netapp.com  Tue May 31 11:33:31 2011
Return-Path: <rs@netapp.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D4E08E075B for <tcpm@ietfa.amsl.com>; Tue, 31 May 2011 11:33:31 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -9.86
X-Spam-Level: 
X-Spam-Status: No, score=-9.86 tagged_above=-999 required=5 tests=[AWL=0.739,  BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id lhdmHXjquS+5 for <tcpm@ietfa.amsl.com>; Tue, 31 May 2011 11:33:31 -0700 (PDT)
Received: from mx4.netapp.com (mx4.netapp.com [217.70.210.8]) by ietfa.amsl.com (Postfix) with ESMTP id 66DACE072F for <tcpm@ietf.org>; Tue, 31 May 2011 11:33:30 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="4.65,298,1304319600"; d="scan'208";a="251171185"
Received: from smtp3.europe.netapp.com ([10.64.2.67]) by mx4-out.netapp.com with ESMTP; 31 May 2011 11:33:29 -0700
Received: from ldcrsexc2-prd.hq.netapp.com (emeaexchrs.hq.netapp.com [10.65.251.110]) by smtp3.europe.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id p4VIXTON004357; Tue, 31 May 2011 11:33:29 -0700 (PDT)
Received: from LDCMVEXC1-PRD.hq.netapp.com ([10.65.251.107]) by ldcrsexc2-prd.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.3959);  Tue, 31 May 2011 19:33:29 +0100
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Tue, 31 May 2011 19:32:46 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0E99CD3F@LDCMVEXC1-PRD.hq.netapp.com>
In-Reply-To: <201105311539.p4VFdDQM030375@bagheera.jungle.bt.co.uk>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: [tcpm] Other comments - Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
Thread-Index: AcwfqP2skNjCQ7ZzTrKiZUkyhIWvOAAFFVyQ
References: <201105231457.p4NEvZfm017363@bagheera.jungle.bt.co.uk><5FDC413D5FA246468C200652D63E627A0E89AF5A@LDCMVEXC1-PRD.hq.netapp.com><201105261137.p4QBbc0D012437@bagheera.jungle.bt.co.uk><5FDC413D5FA246468C200652D63E627A0E99BFF6@LDCMVEXC1-PRD.hq.netapp.com> <201105311539.p4VFdDQM030375@bagheera.jungle.bt.co.uk>
From: "Scheffenegger, Richard" <rs@netapp.com>
To: "Bob Briscoe" <bob.briscoe@bt.com>
X-OriginalArrivalTime: 31 May 2011 18:33:29.0332 (UTC) FILETIME=[3CA24B40:01CC1FC1]
Cc: tcpm@ietf.org, draft-scheffenegger-tcpm-timestamp-negotiation@tools.ietf.org
Subject: Re: [tcpm] Other comments - Tech points: draft-scheffenegger-tcpm-timestamp-negotiation-01
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 31 May 2011 18:33:31 -0000

Hi Bob,


> From: Bob Briscoe [mailto:bob.briscoe@bt.com]


>=20
> That reminds me, I suggest you avoid using the term opaque, but
> instead say exactly what you mean (the field need not have any
> semantics that the echoing node can check or understand).
>=20
> Opaque is nearly as ambiguous as the word transparent, which some
> people use to mean you can see something and others mean you cannot
> see it (you see through it).
>=20

Understood :) I'll adjust the wording to avoid these two terms.



> >Actually, I was just made aware of the fact, that making TSval truly
> >opaque without any constraints such as for PAWS could allow a data
> >payload CRC, as discussed in
> >http://tools.ietf.org/html/draft-ietf-tcpm-anumita-tcp-stronger-
> checksum
> >-00; provided that both ends agree on the scheme (tbd in a IANA
> >sanctioned future version :) ).
> >
> >Does that sound reasonable?
>=20
> Well, it wouldn't make sense to give implementers a dilemma that they
> could only use a bigger checksum or timestamps, but not both.
>=20
> And anyway, I don't think a CRC needs the semantics of echoing.

The TSval handling semantics could be adjusted with such a future
variant as well... Timestamps are well known, and middleboxes are less
likely to mangle them than a new TCP option - at least those middleboxes
that allow TS negotiation in the first place. All I wanted to express is
that a future variant may want to completely overrule current TSval
processing, disabling even PAWS processing. But it's probably just as
easy to make sure that such a sender is compatible with a variant0
receiver.


> >Thus, some bit-banging will be required to find the correction factor
> >between local and remote timestamp clocks for OWD calculation.
> >
> >Actually, your argument works in my favor :)
> >
> >If the representation is suspiciously close to IEEE-754, an implementer
> >may simply put that into a regular float16 library; however, EXP=31
> >would mean infinity/NaN in such a regular IEEE library, which would
> >probably lead to even more intricate bugs...
>
> It would be best to specifically say that EXP=31 MUST signal
> something exceptional, so implementers have to catch it before a
> library does.

Already in the draft (signaling, EXP field). However, if the fields
align too closely with regular IEEE-754, an implementer may stop reading
short of these details :)

>=20
> >Having a spec that is off by 1 from something very similar may help
> >implementers to read the spec more closely (to discover that one cannot
> >simply put this into a float16, multiply with 2^-13 and be done with
> >parsing it).
> >
> >Anyway, I'm not feeling very strong about how this bit finally ends up
> >either way :)
>=20
> I see your point. I would lean towards simplicity wherever possible,
> rather than making implementers have to do complex coding.


Is this in support of the "implicit precision" suggestion I posted
to the list the other day?

Making an integer from an IEEE-754 style value requires special
processing:

if (exp > 0)
  frac = frac | 0x800;
integer64 = frac << exp;

The implicit precision scheme can be implemented directly as

integer64 = frac << exp;

integer64 would then represent the tick duration in units of
maximum resolution (3 ps).
Processing would only become more complex when the host does care about
the confidence the partner host has in its clock source.
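
To make the comparison concrete, here is a minimal C sketch of both
conversions. It assumes the 5-bit EXP / 11-bit FRAC split from the field
layout discussed earlier in this thread, and that EXP=31 has already been
caught as the exceptional case. The function names are made up for
illustration and do not come from the draft.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths: EXP is 5 bits, FRAC is 11 bits. */
#define FRAC_MASK  0x7FFu  /* low 11 bits carry the fraction */
#define HIDDEN_BIT 0x800u  /* leading one implied by an IEEE-754 style encoding */

/* IEEE-754 style: a nonzero exponent implies a hidden leading one. */
static uint64_t ticks_ieee_style(uint32_t exp, uint32_t frac)
{
    assert(exp < 31);      /* EXP=31 is exceptional and must be handled earlier */
    uint64_t f = frac & FRAC_MASK;
    if (exp > 0)
        f |= HIDDEN_BIT;
    return f << exp;
}

/* Implicit precision: the fraction is shifted as-is, no special case. */
static uint64_t ticks_implicit_precision(uint32_t exp, uint32_t frac)
{
    assert(exp < 31);
    return (uint64_t)(frac & FRAC_MASK) << exp;
}
```

With exp=1 and frac=5, the first function yields 4106 (0x805 << 1) and
the second yields 10, which shows how the hidden bit changes the result.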





> >
> >Well, a receiver that understands the EXO bit, has to reply with at
> >least version 0 (regardless of the version sent to it).
> >
> >For a received version 0, it can do additional checks (to verify if
> the
> >sender wasn't just misbehaving), while for any other version, the
> >receiver couldn't to local time-based calculations, only the
> traditional
> >PAWS check...
> >
> >I think the EXO bit is what you had as TYPE, with my VERsion field
> being
> >specifics about the semantics...
>=20
> No, I wasn't thinking that. Your EXO field is like a "Not vRFC1323"
> flag, so it's a part of a version field. Whereas your VER field
> doesn't seem to have the semantics of Version, because a v1 receiver
> replies to a v1 request with a v0 reply. This is purely a terminology
> problem, nothing serious.

> I agree with all this about specifying what the receiver must do. My
> only concern was to choose a word that doesn't confuse people, rather
> than redefining the meaning of a word like 'version' that actually
> has different semantics in normal usage.
>=20
> The term 'version' normally has the semantics that a lower version
> implementation cannot understand a higher version protocol. If you
> want to stray from that, I suggest you find a different word.

Since you are a native speaker, would you recommend "TYPE" (as in
message type), "identifier", or "variant"? Or some other term not yet
mentioned?





> >I believe it was Anantha Ramaiah, who also suggested to have an
> >enumerated "typical" rate/frequency... With 24 bits available, 24
> >"common" rates could be defined in a range-negotiation type (sender
> >sends timestamp capability with the locally supported rates
> >(appropriate bits set), and the receiver picks the one it can
> >also support (or none) in the 2nd packet...
>=20
> Yes, but we need to work out a way to future-proof a look-up table of
> "common" rates. It is hard to imagine picosecond or femtosecond
> resolution now, but it will happen. We need something like an
> exponent (scaling factor), where we can define a list of common
> numbers, but put a scaling factor in the capability negotiation, so
> that each "common" value can be multiplied by the same scaling factor.
>=20
> [Also, 24 bits seems an excessively large range of choice.]


I believe the DURation as defined in the draft gives the best
granularity over a reasonably large range (10 s to 10 ps). Femtoseconds
will definitely require a roll of the variant number :), or the use of
some reserved variant0 bits.

The negotiation sketch in Appendix A would appear to be more flexible
than any bit-mapped lookup table negotiation (e.g. the sender supports
the rates 10001b, while the receiver supports only 00100b; which rate
will the sender then choose?). The "implicit precision" stuff could be
applied to a range negotiation as well, btw.
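
The bitmap problem can be shown in a few lines of C. This is only a
sketch of the bit-mapped scheme being argued against, with one bit per
"common" rate; pick_rate is a hypothetical helper, not something defined
in the draft.

```c
#include <stdint.h>

/* Each set bit marks one supported "common" rate (assumed encoding).
 * Returns the lowest bit index both sides support, or -1 if the two
 * sets do not overlap and no rate can be agreed upon. */
static int pick_rate(uint32_t sender_rates, uint32_t receiver_rates)
{
    uint32_t both = sender_rates & receiver_rates;
    for (int bit = 0; bit < 32; bit++) {
        if (both & (UINT32_C(1) << bit))
            return bit;
    }
    return -1;
}
```

For the example above, pick_rate(0x11, 0x04) (10001b against 00100b)
finds no common bit, so the negotiation has no defined outcome.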



Best regards,
   Richard


From rs@netapp.com  Tue May 31 11:57:33 2011
Return-Path: <rs@netapp.com>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 36B8DE06A8 for <tcpm@ietfa.amsl.com>; Tue, 31 May 2011 11:57:33 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -9.934
X-Spam-Level: 
X-Spam-Status: No, score=-9.934 tagged_above=-999 required=5 tests=[AWL=0.665,  BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id tBd1zmj3J7MF for <tcpm@ietfa.amsl.com>; Tue, 31 May 2011 11:57:32 -0700 (PDT)
Received: from mx4.netapp.com (mx4.netapp.com [217.70.210.8]) by ietfa.amsl.com (Postfix) with ESMTP id 515A0E07D7 for <tcpm@ietf.org>; Tue, 31 May 2011 11:57:32 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="4.65,299,1304319600"; d="scan'208";a="251172323"
Received: from smtp3.europe.netapp.com ([10.64.2.67]) by mx4-out.netapp.com with ESMTP; 31 May 2011 11:57:31 -0700
Received: from ldcrsexc2-prd.hq.netapp.com (emeaexchrs.hq.netapp.com [10.65.251.110]) by smtp3.europe.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id p4VIvV14006187; Tue, 31 May 2011 11:57:31 -0700 (PDT)
Received: from LDCMVEXC1-PRD.hq.netapp.com ([10.65.251.107]) by ldcrsexc2-prd.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.3959);  Tue, 31 May 2011 19:57:31 +0100
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Tue, 31 May 2011 19:56:47 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0E99CD4C@LDCMVEXC1-PRD.hq.netapp.com>
In-Reply-To: <4DE51EC0.7070302@isi.edu>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: [tcpm] feedback from group to	draft-scheffenegger-tcpm-timestamp-negotiation-02.txt
Thread-Index: AcwftGwAP85BRXkQQxqiS6Ahd+AQgwADSCsg
References: <5FDC413D5FA246468C200652D63E627A0E99C04B@LDCMVEXC1-PRD.hq.netapp.com> <5FDC413D5FA246468C200652D63E627A0E99C04C@LDCMVEXC1-PRD.hq.netapp.com> <4DE51EC0.7070302@isi.edu>
From: "Scheffenegger, Richard" <rs@netapp.com>
To: "Joe Touch" <touch@isi.edu>
X-OriginalArrivalTime: 31 May 2011 18:57:31.0013 (UTC) FILETIME=[97F15B50:01CC1FC4]
Cc: tcpm@ietf.org
Subject: Re: [tcpm] feedback from group to	draft-scheffenegger-tcpm-timestamp-negotiation-02.txt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 31 May 2011 18:57:33 -0000

Hello Joe,

thank you for your comment.


> That's still basically a version field. It describes not just the
> interpretation of the rest of the field, but also the overall protocol
> used (as per the draft).

Exactly; if a sender initiates a session with VER=1 but receives
VER=0 in the <SYN,ACK>, the sender should operate in a mode that is
compatible with how a VER=0 receiver processes the TSval (and echoes
it in TSecr).


> However, other than "version 0", there are no useful versions of this,
> AFAICT - see below.

Well, the range negotiation stuff would be one instance where a
different VER would become necessary (because the field definition
changes). Again, from the current point of view, providing 3 additional
codepoints for future expansion seems enough to warrant the bit-granular
MASK field.



> > 2)
> > 5-bits mask field: It may be interesting to prevent a (version 0,
see
> > above) receiver from processing the TSval during the data stream at
> all,
> > while bit-granularity of the mask may not be needed for too many
> > applications. At the same time, concern was raised as all the common
> > bits are already assigned to fields (even though 3 enumerations of
> the
> > "VER" field are available as future expansions.
> >
> > One suggestion is to have a separate bit to indicate if MASK is
> valid,
> > and that (smaller, 4-bit) field it not, the receiver stop any TSval
> > processing (including PAWS check). This could yield additional
(sub-)
> > codepoints for future extensions of the negotiation.
> >
> > There are still 8 unused (reserved) bits, specific to version 0;
> > however, some reserved bits are important as long as misbehaving
> RFC1323
> > senders may be operating, to prevent receivers from processing TSval
> > improperly.
>=20
> For reserved fields, you need to specify:
>
> 	- whether a VERS=0 sender MUST set them
> 	- whether a VERS=0 receiver MUST check them
> 	on receipt or ignore them
>
> "MUST be zero" is ambiguous - it could refer to the sender's actions or
> the receiver's.

Thank you; in fact, these bits MUST be set to zero by the sender.

>=20
> I don't like the rest of the description, which may impact the overall
> utility of this option:
>=20
>             If timestamp capabilities are received with version set
>             to 0, but some of these bits set, the receiver MUST ignore
>             the extended options field and react as if the TSecr was
> zero
>             (compatibility mode).
>=20
> There is no requirement that TCP use actual time in timestamps.
> Other than "all zeroes" (as spec'd in RFC1323), there are no special
> values for timestamps.
>=20
> So I'm a bit concerned about the idea of interpreting whether this
> option is in effect simply by the use of unexpected timestamp values.

The wording is not clear, thanks for bringing this up.

If a receiver observes some RES bits set to one (which can only happen
with misbehaving RFC1323 senders not clearing TSecr in <SYN>), it should
simply reflect the received TSval from the <SYN> in TSecr in <SYN,ACK>.
That is, it should behave like a standard RFC1323 host. During the
session, timestamp processing would follow regular RFC1323 rules.

If the receiver determines that a sender compliant with this draft is
negotiating, the <SYN,ACK> TSecr is set to (<SYN> TSval) XOR (receiver
TS capabilities).
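
As a rough C sketch of that receiver decision: the bit positions below
are assumptions read off the field layout quoted in this thread (EXO as
the most significant bit, 8 RES bits in the second octet), and treating
EXO=0 as "plain RFC1323" follows the fallback table discussed earlier.
The function name is hypothetical.

```c
#include <stdint.h>

#define EXO_BIT  UINT32_C(0x80000000)  /* assumed: EXO is the top bit */
#define RES_MASK UINT32_C(0x00FF0000)  /* assumed: 8 reserved bits, second octet */

/* Compute the TSecr to place in the <SYN,ACK>.
 * syn_tsecr carries the sender's capability field in the <SYN>. */
static uint32_t synack_tsecr(uint32_t syn_tsval, uint32_t syn_tsecr,
                             uint32_t rx_caps)
{
    /* No EXO, or RES bits set: behave like a standard RFC1323 host. */
    if (!(syn_tsecr & EXO_BIT) || (syn_tsecr & RES_MASK))
        return syn_tsval;
    /* Otherwise signal the receiver capabilities via XOR. */
    return syn_tsval ^ rx_caps;
}
```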



The idea behind the XOR is that a sender can discriminate between
different <SYN> retransmissions:

<SYN> TSval = 0x010000, TSecr=capability negotiation
 (lost)
<SYN> TSval = 0x020000, TSecr=capability negotiation
 (delayed)
<SYN> TSval = 0x030000, TSecr=capability negotiation
<SYN,ACK> TSval = <xxx>, TSecr=0x200000 XOR capability negotiation.

If the sender modifies the TS offset in that 3rd octet, but keeps only
state for the initial TSval, it can determine which <SYN> made it to the
receiver by checking only a few bits (instead of keeping track of all
the sent SYN TSvals). But sampling RTT on SYN is discouraged, so I
didn't mention this aspect in the draft.
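
A possible sender-side sketch in C, under the assumption that the
receiver's capability field is zero in the octet the sender varies (the
reserved bits), so that this octet survives the XOR unchanged. Both
helper names and the mask are illustrative, not taken from the draft.

```c
#include <stdint.h>

/* Assumed: the sender varies only this octet between <SYN> retransmissions. */
#define SYN_TAG_MASK UINT32_C(0x00FF0000)

/* Reconstruct the TSval of the <SYN> that the <SYN,ACK> answers,
 * keeping state only for the initial TSval. */
static uint32_t answered_syn_tsval(uint32_t initial_tsval, uint32_t synack_tsecr)
{
    return (initial_tsval & ~SYN_TAG_MASK) | (synack_tsecr & SYN_TAG_MASK);
}

/* Recover the receiver capabilities by undoing the XOR. */
static uint32_t receiver_caps(uint32_t initial_tsval, uint32_t synack_tsecr)
{
    return synack_tsecr ^ answered_syn_tsval(initial_tsval, synack_tsecr);
}
```

For instance, with an initial TSval of 0x010000 and a <SYN,ACK> that
answers the <SYN> sent with TSval 0x030000, the first helper returns
0x030000 and the second returns the capability bits alone.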


Thanks,

  Richard

From touch@isi.edu  Tue May 31 16:49:06 2011
Return-Path: <touch@isi.edu>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 6FB4EE0721 for <tcpm@ietfa.amsl.com>; Tue, 31 May 2011 16:49:06 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.598
X-Spam-Level: 
X-Spam-Status: No, score=-102.598 tagged_above=-999 required=5 tests=[AWL=-1.199, BAYES_00=-2.599, J_CHICKENPOX_23=0.6, J_CHICKENPOX_33=0.6, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id rUioV5hjJQao for <tcpm@ietfa.amsl.com>; Tue, 31 May 2011 16:49:05 -0700 (PDT)
Received: from vapor.isi.edu (vapor.isi.edu [128.9.64.64]) by ietfa.amsl.com (Postfix) with ESMTP id 8287EE06DD for <tcpm@ietf.org>; Tue, 31 May 2011 16:49:05 -0700 (PDT)
Received: from [128.9.160.166] (abc.isi.edu [128.9.160.166]) (authenticated bits=0) by vapor.isi.edu (8.13.8/8.13.8) with ESMTP id p4VNmYu9020757 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT); Tue, 31 May 2011 16:48:35 -0700 (PDT)
Message-ID: <4DE57E52.5030501@isi.edu>
Date: Tue, 31 May 2011 16:48:34 -0700
From: Joe Touch <touch@isi.edu>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10
MIME-Version: 1.0
To: "Scheffenegger, Richard" <rs@netapp.com>
References: <5FDC413D5FA246468C200652D63E627A0E99C04B@LDCMVEXC1-PRD.hq.netapp.com> <5FDC413D5FA246468C200652D63E627A0E99C04C@LDCMVEXC1-PRD.hq.netapp.com> <4DE51EC0.7070302@isi.edu> <5FDC413D5FA246468C200652D63E627A0E99CD4C@LDCMVEXC1-PRD.hq.netapp.com>
In-Reply-To: <5FDC413D5FA246468C200652D63E627A0E99CD4C@LDCMVEXC1-PRD.hq.netapp.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-ISI-4-43-8-MailScanner: Found to be clean
X-MailScanner-From: touch@isi.edu
Cc: tcpm@ietf.org
Subject: Re: [tcpm] feedback from group to	draft-scheffenegger-tcpm-timestamp-negotiation-02.txt
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/tcpm>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 31 May 2011 23:49:06 -0000

Hi, Richard,

On 5/31/2011 11:56 AM, Scheffenegger, Richard wrote:
...
>> I don't like the rest of the description, which may impact the overall
>> utility of this option:
>>
>>              If timestamp capabilities are received with version set
>>              to 0, but some of these bits set, the receiver MUST ignore
>>              the extended options field and react as if the TSecr was
>> zero
>>              (compatibility mode).
>>
>> There is no requirement that TCP use actual time in timestamps.
>> Other than "all zeroes" (as spec'd in RFC1323), there are no special
>> values for timestamps.
>>
>> So I'm a bit concerned about the idea of interpreting whether this
>> option is in effect simply by the use of unexpected timestamp values.
>
> The wording is not clear, thanks for bringing this up.
>
> If a receiver observes some RES bits set to one (which can only happen
> with misbehaving RFC1323 senders not clearing TSecr in <SYN>), it should
> simply reflect the received TSval from the <SYN> in TSecr in <SYN,ACK>.
> That is, behave like a standard RFC1323 host. During the session,
> timestamp processing would follow regular RFC1323 rules.

OK - that's more clear.

> If the receiver determines that a sender compliant to this draft is
> negotiating, the <SYN,ACK> TSecr is set to (<SYN> TSval) XOR (receiver
> TS capabilities).

How do you know whether it's a sender compliant to the draft 
negotiating, vs. a noncompliant RFC1323 host?

> The idea behind the XOR is, that a sender can discriminate between
> different <SYN> retransmissions:
>
> <SYN> TSval = 0x010000, TSecr=capability negotiation
>   (lost)
> <SYN> TSval = 0x020000, TSecr=capability negotiation
>   (delayed)
> <SYN> TSval = 0x030000, TSecr=capability negotiation
> <SYN,ACK> TSval = <xxx>, TSecr=0x200000 XOR capability negotiation.
>
> If the sender modifies the TS offset in that 3rd octet, but keeps only
> state for the initial TSval, it can determine which <SYN> made it to the
> receiver by checking only a few bits (instead of keeping track of all
> the sent SYN TSvals). But sampling RTT on SYN is discouraged, so I
> didn't mention this aspect in the draft.

If the only example you have for using these bits is something you 
discourage, perhaps it's worth omitting it. ;-)

Joe

