From owner-ips@ece.cmu.edu  Sun Apr  1 17:04:24 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA07268
	for <ips-archive@odin.ietf.org>; Sun, 1 Apr 2001 17:04:22 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f31JKKN09967
	for ips-outgoing; Sun, 1 Apr 2001 15:20:20 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from crufty.research.bell-labs.com (crufty.research.bell-labs.com [204.178.16.49])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f31JK7r09956
	for <ips@ece.cmu.edu>; Sun, 1 Apr 2001 15:20:07 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by crufty; Sun Apr  1 15:17:32 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Sun Apr  1 15:19:37 EDT 2001
Received: (from sandeepj@localhost)
	by aura.research.bell-labs.com (8.9.1/8.9.1) id PAA20734
	for ips@ece.cmu.edu; Sun, 1 Apr 2001 15:19:37 -0400 (EDT)
Date: Sun, 1 Apr 2001 15:19:37 -0400 (EDT)
Message-Id: <200104011919.PAA20734@aura.research.bell-labs.com>
From: sandeepj@research.bell-labs.com (Sandeep Joshi)
To: ips@ece.cmu.edu
Subject: Re: frame formats
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

To steal & reintroduce that idea, is it possible to borrow 
a bit or two from the DataLen field to parity-check the AHSLen ?   

Adding an entire BHS header digest, when all you need to verify 
is that AHS-length field, seemed overkill.   On the other hand,
Format-1's chief disadvantage was that it required some processing 
(QL) to locate either/both the length fields.   This has complicated
the choice between all these 3 formats resulting in the current vote.

Borrowing the bits for AHS parity from DataLen solves this problem.

DataLen will now be max 8M/4M but then we don't wish to have large
iSCSI PDUs in any case.  Btw, I assume the Next DataLen AHS does not
exist in the new setup?
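
A rough sketch of what borrowing one bit might look like, assuming a
one-byte AHSLen followed by a 24-bit DataLen (the 16M range implied by
the 8M/4M figures) with the top DataLen bit reused as even parity over
AHSLen.  The word layout, the bit position and the names pack_lengths
and unpack_lengths are assumptions for illustration, not taken from
the draft:

```python
# Hypothetical layout of one 32-bit word: [ AHSLen:8 | parity:1 | DataLen:23 ]
DATALEN_BITS = 24
PARITY_SHIFT = DATALEN_BITS - 1          # bit 23 carries the parity
MAX_DATALEN = (1 << PARITY_SHIFT) - 1    # 8M - 1 once one bit is borrowed


def parity8(value: int) -> int:
    """Return 1 if the low 8 bits of value hold an odd number of 1s."""
    return bin(value & 0xFF).count("1") & 1


def pack_lengths(ahs_len: int, data_len: int) -> int:
    """Pack AHSLen, its parity bit and DataLen into one 32-bit word."""
    if not 0 <= ahs_len <= 0xFF:
        raise ValueError("AHSLen must fit in one byte")
    if not 0 <= data_len <= MAX_DATALEN:
        raise ValueError("DataLen must fit in 23 bits")
    return (ahs_len << 24) | (parity8(ahs_len) << PARITY_SHIFT) | data_len


def unpack_lengths(word: int):
    """Return (ahs_len, data_len); raise ValueError on a parity mismatch."""
    ahs_len = (word >> 24) & 0xFF
    parity = (word >> PARITY_SHIFT) & 1
    data_len = word & MAX_DATALEN
    if parity != parity8(ahs_len):
        raise ValueError("AHSLen parity check failed")
    return ahs_len, data_len
```

A single parity bit only catches an odd number of flipped bits in
AHSLen plus parity; borrowing a second bit (the 4M case) would leave
room for a slightly stronger check.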

-Sandeep

> Well, perhaps I was just not quick enough.  I thought I would let this
> settle out a bit before I added my two cents.
>  
> If you all remember, some folks on this reflector gave Julian a hard time
> because you would have to use a length field that you were not sure was OK,
> if you had a digest error and wanted to jump forward to the next, etc. etc.
> etc.  I am sure you all remember this.  OK, now that Julian proposed a
> parity way to ensure that you could trust the length field, some of the
> parties have now, I think, voted for format #2.  Unless you now want to
> reconsider your vote, we should stop giving Julian a hard time about the
> length not being ensured correct in the presence of a Digest Error.
>  
> Either drop the session, or use the length to see if you can get somewhere,
> search for the next marker etc.  All the stuff you said you did not like
> before.  OK, now you have format 2, but let's not go over that old ground
> now that you have decided against the parity.
> 
> 
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
>  
>  
> Julian Satran/Haifa/IBM@IBMIL@ece.cmu.edu on 03/30/2001 08:51:50 AM
>  
> Sent by:  owner-ips@ece.cmu.edu
>  
>  
> To:   ips@ece.cmu.edu
> cc:
> Subject:  frame formats
>  
>  
>  
>  
>  
> Dear colleagues,
>  
> It looks like Format-2 is selected by popular vote.
>  
> Julo
>  
>  
>  


From owner-ips@ece.cmu.edu  Sun Apr  1 17:06:55 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA07897
	for <ips-archive@odin.ietf.org>; Sun, 1 Apr 2001 17:06:54 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f31IuKT08602
	for ips-outgoing; Sun, 1 Apr 2001 14:56:20 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from crufty.research.bell-labs.com (crufty.research.bell-labs.com [204.178.16.49])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f31Iu8r08593
	for <ips@ece.cmu.edu>; Sun, 1 Apr 2001 14:56:08 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by crufty; Sun Apr  1 14:52:17 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Sun Apr  1 14:54:22 EDT 2001
Received: (from sandeepj@localhost)
	by aura.research.bell-labs.com (8.9.1/8.9.1) id OAA20344;
	Sun, 1 Apr 2001 14:54:21 -0400 (EDT)
Date: Sun, 1 Apr 2001 14:54:21 -0400 (EDT)
Message-Id: <200104011854.OAA20344@aura.research.bell-labs.com>
From: sandeepj@research.bell-labs.com (Sandeep Joshi)
To: hufferd@us.ibm.com, ips@ece.cmu.edu
Subject: Re: frame formats
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Well, perhaps I was just not quick enough.  I thought I would let this
settle out a bit before I added my two cents.
 
If you all remember, some folks on this reflector gave Julian a hard time
because you would have to use a length field that you were not sure was OK,
if you had a digest error and wanted to jump forward to the next, etc. etc.
etc.  I am sure you all remember this.  OK, now that Julian proposed a
parity way to ensure that you could trust the length field, some of the
parties have now, I think, voted for format #2.  Unless you now want to
reconsider your vote, we should stop giving Julian a hard time about the
length not being ensured correct in the presence of a Digest Error.
 
Either drop the session, or use the length to see if you can get somewhere,
search for the next marker etc.  All the stuff you said you did not like
before.  OK, now you have format 2, but let's not go over that old ground
now that you have decided against the parity.
 


From owner-ips@ece.cmu.edu  Mon Apr  2 02:59:41 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA28179
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 02:59:40 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f31JsLf11870
	for ips-outgoing; Sun, 1 Apr 2001 15:54:21 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f31Jrxr11856
	for <ips@ece.cmu.edu>; Sun, 1 Apr 2001 15:53:59 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id VAA178746
	for <ips@ece.cmu.edu>; Sun, 1 Apr 2001 21:53:46 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id VAA152714
	for <ips@ece.cmu.edu>; Sun, 1 Apr 2001 21:52:00 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A21.006D067A ; Sun, 1 Apr 2001 21:50:50 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A21.006D047C.00@d12mta02.de.ibm.com>
Date: Sun, 1 Apr 2001 21:54:12 +0200
Subject: Re: frame formats
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Sure - the "deep processing" required was to look at a bit.
And no, we went over the lengths.  There is no need for a parity check for a
one-byte field.
The header digest plus an occasional timeout will take care of the checks.

Julo

sandeepj@research.bell-labs.com (Sandeep Joshi) on 01/04/2001 21:19:37

Please respond to sandeepj@research.bell-labs.com (Sandeep Joshi)

To:   ips@ece.cmu.edu
cc:
Subject:  Re: frame formats




To steal & reintroduce that idea, is it possible to borrow
a bit or two from the DataLen field to parity-check the AHSLen ?

Adding an entire BHS header digest, when all you need to verify
is that AHS-length field, seemed overkill.   On the other hand,
Format-1's chief disadvantage was that it required some processing
(QL) to locate either/both the length fields.   This has complicated
the choice between all these 3 formats resulting in the current vote.

Borrowing the bits for AHS parity from DataLen solves this problem.

DataLen will now be max 8M/4M but then we don't wish to have large
iSCSI PDUs in any case.  Btw, I assume the Next DataLen AHS does not
exist in the new setup?

-Sandeep

> Well, perhaps I was just not quick enough.  I thought I would let this
> settle out a bit before I added my two cents.
>
> If you all remember, some folks on this reflector gave Julian a hard time
> because you would have to use a length field that you were not sure was OK,
> if you had a digest error and wanted to jump forward to the next, etc. etc.
> etc.  I am sure you all remember this.  OK, now that Julian proposed a
> parity way to ensure that you could trust the length field, some of the
> parties have now, I think, voted for format #2.  Unless you now want to
> reconsider your vote, we should stop giving Julian a hard time about the
> length not being ensured correct in the presence of a Digest Error.
>
> Either drop the session, or use the length to see if you can get somewhere,
> search for the next marker etc.  All the stuff you said you did not like
> before.  OK, now you have format 2, but let's not go over that old ground
> now that you have decided against the parity.
>
>
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
>
>
> Julian Satran/Haifa/IBM@IBMIL@ece.cmu.edu on 03/30/2001 08:51:50 AM
>
> Sent by:  owner-ips@ece.cmu.edu
>
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  frame formats
>
>
>
>
>
> Dear colleagues,
>
> It looks like Format-2 is selected by popular vote.
>
> Julo
>
>
>





From owner-ips@ece.cmu.edu  Mon Apr  2 06:04:18 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id GAA23534
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 06:04:17 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f325Eg013130
	for ips-outgoing; Mon, 2 Apr 2001 01:14:42 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel2.hp.com (atlrel2.hp.com [156.153.255.202])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f325Dtr13106
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 01:13:55 -0400 (EDT)
Received: from core.rose.hp.com (core.rose.hp.com [15.43.208.100])
	by atlrel2.hp.com (Postfix) with ESMTP id 4C2871085
	for <ips@ece.cmu.edu>; Mon,  2 Apr 2001 01:13:54 -0400 (EDT)
Received: (from cbm@localhost) by core.rose.hp.com (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id WAA12440 for ips@ece.cmu.edu; Sun, 1 Apr 2001 22:14:54 -0700 (PDT)
Message-Id: <200104020514.WAA12440@core.rose.hp.com>
Subject: Re: iSCSI: synch and steering comments
To: ips@ece.cmu.edu
Date: Sun, 01 Apr 2001 22:14:54 PDT
Reply-To: cbm@rose.hp.com
From: "Mallikarjun C." <cbm@rose.hp.com>
X-Mailer: Elm [revision: 212.4]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian,

Thanks for the clarification.

Could you please take time to respond to the other two comments I had?
Or, do I take it that you will get back shortly?

If those comments are indeed incorrect, please help me understand why
so.

Thank you.
--
Mallikarjun 


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668	Hewlett-Packard, Roseville.
cbm@rose.hp.com
 

>I've marked it with ---
>
>Matt Wakeley <matt_wakeley@agilent.com> on 31/03/2001 10:25:25
>
>Please respond to Matt Wakeley <matt_wakeley@agilent.com>
>
>To:   IPS Reflector <ips@ece.cmu.edu>
>cc:
>Subject:  Re: iSCSI: synch and steering comments
>
>
>
>
>Julian,
>
>There were many comments in this message.  To which comment are you
>referring?
>
>-Matt
>
>julian_satran@il.ibm.com wrote:
>>
>> Mallikarjun,
>>
>> It is clearly communicated in the paragraph above it - but fine I will
>> add it here too.
>>
>> Julo
>>
>> "Mallikarjun C." <cbm@rose.hp.com> on 30/03/2001 00:54:20
>>
>> Please respond to cbm@rose.hp.com
>>
>> To:   ips@ece.cmu.edu
>> cc:
>> Subject:  Re: iSCSI: synch and steering comments
>>
>> Julian,
>>
>> Some comments.
>>
>> >Answers in text. Thanks, Julo
>> >
>> >
>> ..
>>
>> >-Suggest adding the following statement to section 1.2.8.2.
>> >
>> > All conventional, in-order data arrival notifications generated by TCP
>> > are passed through to iSCSI by the Synch and Steering layer after
>> > appropriate data placements while none of the out-of-order data
>> > placements that it performs are communicated to upper layers.
>> >
>> >+++ I have added the following to 1.2.8.2
>> >
>> >   On the incoming path the Synch and Steering layer does not change the
>> >   way TCP notifies iSCSI about in-order data arrival.  All out-of-order
>> >   data placements performed by the Synch and Steering layer are hidden
>> >   from iSCSI.
>-------------------------------------------------------------------------------
>>
>> Okay, I'd however prefer it to imply that in-order data placement is also
>> handled by Synch and Steering in the same sentence, instead of only
>> commenting on in-order notifications, and out-of-order placements.
>>
>-------------------------------------------------------------------------------
>
>> >
>> >   I have also changed the figure a bit to better convey the fact that
>> >   TCP and Synch&Steering are related (not strictly layered) +++
>>
>> That's a good idea.
>>
>> >
>> >   ++++
>> >
>> >-Section 1.2.8.2 states that a Synch and Steering layer is optional.
>> > It has to be qualified that it is optional only for those iSCSI devices
>> > which perform connection recovery on header digest errors, since that's
>> > how they cope with loss of framing. (I guess this may change in next
>> > rev?)
>> >
>> >+++ with the new format I think that we have:
>> >
>> >- one more chance if we go for format 1 or
>> >- drop the connection on header error
>> >
>> >In both cases we can leave synch and steering optional
>>
>> Well, that doesn't address the thrust of my comment.  I was implying
>> that the draft should make it clear that those implementations which
>> don't support Synch and Steering should end the connection on a header
>> digest error and/or parity error, and not go into (what Somesh called)
>> a speculative mode.
>>
>> >
>> >+++
>> >
>> >-It appears to me that at least one Synch and Steering layer must be
>> > defined/referred to as the minimal implementation in the main draft to
>> > enable interoperability, when implementations do implement Synch and
>> >Steering.
>> >
>> >+++ why ? +++
>>
>> I may be using "interoperability" in a somewhat unconventional sense
>> here.  While the draft says that Synch and Steering layer is optional,
>> I don't see that it requires implementations to always support a "no
>> synch & steering" mode, even when they support one type of Synch and
>> Steering layer.  Given that there's no mandatory Synch and Steering
>> layer either, I don't see how two iSCSI boxes that a customer buys are
>> guaranteed to interoperate.  Please comment if the draft already implies
>> what I am asking for.
>>
>> Thanks.
>> --
>> Mallikarjun
>>
>> Mallikarjun Chadalapaka
>> Networked Storage Architecture
>> Network Storage Solutions Organization
>> MS 5668   Hewlett-Packard, Roseville.
>> cbm@rose.hp.com
>
>
>
>




From owner-ips@ece.cmu.edu  Mon Apr  2 12:05:33 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA17384
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 12:05:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32Dih120090
	for ips-outgoing; Mon, 2 Apr 2001 09:44:43 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32Di4r20058
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 09:44:04 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id JAA05667; Mon, 2 Apr 2001 09:43:25 -0400 (EDT)
Message-ID: <3AC881D0.72D4C17D@cisco.com>
Date: Mon, 02 Apr 2001 08:42:40 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: "Robert D. Russell" <rdr@mars.iol.unh.edu>
CC: julian_satran@il.ibm.com, ips@ece.cmu.edu
Subject: Re: frame formats
References: <Pine.SGI.4.20.0103311129180.1968-100000@mars.iol.unh.edu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Bob-

Good point about option 2.  If we have separate BHS and AHS CRCs,
all of the lengths are checked.  I don't mind having the extra CRC, since
I really don't think we will see that many PDUs use the AHS.

It is possible to optimize reading option 1 (at least in software).
Just read the 48 bytes + 4 for the digest, and if there is an AHS,
keep the extra four as part of the next read.  So you still have
a single read for most frames, and two for those with AHS.
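
A minimal sketch of that read loop, under some loud assumptions: the
AHS length is a one-byte count of bytes at a made-up offset in the
BHS, and the digest is a plain CRC-32 (zlib.crc32), since the thread
does not pin down either.  AHSLEN_OFFSET, build_header and read_header
are hypothetical names used only for illustration:

```python
import io
import struct
import zlib

BHS_LEN = 48        # fixed basic header segment
DIGEST_LEN = 4      # header digest size
AHSLEN_OFFSET = 4   # hypothetical offset of the one-byte AHSLen in the BHS


def build_header(ahs: bytes = b"") -> bytes:
    """Build an option-1 header: BHS, optional AHS, then one digest."""
    bhs = bytearray(BHS_LEN)
    bhs[AHSLEN_OFFSET] = len(ahs)          # assumed to count bytes
    body = bytes(bhs) + ahs
    return body + struct.pack(">I", zlib.crc32(body))


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes or raise EOFError."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("short read")
    return data


def read_header(stream):
    """Return (bhs, ahs, number_of_reads) for one option-1 header."""
    first = read_exact(stream, BHS_LEN + DIGEST_LEN)
    reads = 1
    ahs_len = first[AHSLEN_OFFSET]
    header = first
    if ahs_len:
        # The whole header is BHS + AHS + digest and BHS + 4 bytes are
        # already in hand, so exactly ahs_len more bytes remain.
        header += read_exact(stream, ahs_len)
        reads = 2
    body, digest = header[:-DIGEST_LEN], header[-DIGEST_LEN:]
    if struct.unpack(">I", digest)[0] != zlib.crc32(body):
        raise ValueError("header digest error")
    return body[:BHS_LEN], body[BHS_LEN:], reads


if __name__ == "__main__":
    print(read_header(io.BytesIO(build_header()))[2])
    print(read_header(io.BytesIO(build_header(b"12345678")))[2])
```

Note that ahs_len is used before the digest has verified it, which is
exactly the concern Bob raises about option 1.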

Still, option 2 does offer the best protection.  I'm fine with
option 2, but could live with option 1.  Anything but 3.

--
Mark

"Robert D. Russell" wrote:
> 
> Mark:
> 
> There is a potentially important distinction between the 2 choices
> that is missing from your summary when you indicate that both
> choices involve a variable length.
> 
> In option 1, a single header digest after the BHS and AHS,
> you do not know when you start reading the header how long
> it will be and therefore you do not know where the digest is.
> This complicates the reading process, since it will have to either
> do the read in 2 steps (1st the BHS and then 2nd the AHS (if any)
> followed by the digest), or 1 read that interprets the data being read
> "on the fly" to extract the AHS length and extend the length of the read
> accordingly.
> 
> In option 2 the first read is always for 48 bytes of header and you
> always know where the digest is.  The second read is needed
> only if the AHS length field in the BHS is non-zero, and its length
> is determined by that AHS length field. However, when the 2nd read
> is started the length IS known and the position of the digest
> IS known -- you do NOT have the "on the fly" searching needed
> in option 1.  This may (or may not) be a simplification.
> 
> The other advantage to option 2 is that the input process never
> has to use unverified data (i.e., the AHS length field) to find
> the digest (and thus verify the data).
> 
> Bob Russell
> InterOperability Lab
> University of New Hampshire
> rdr@iol.unh.edu
> 603-862-3774
> 
> On Fri, 30 Mar 2001, Mark Bakke wrote:
> 
> >
> > Excellent.  Which header digest positioning method will we choose?
> >
> > 1. Single header digest, after BHS and AHS
> >
> > 2. Two header digests, one for BHS, one for AHS
> >
> > 3. Single header digest for BHS; AHS is added to data digest
> >
> > Option 3 will not work well with iSCSI proxies and gateways
> > that may change header information, but keep the data end-to-end.
> >
> > To me, that leaves option 1 and 2.  So, which is easier, having
> > a single header digest in a variable location, or having the
> > potential for two header digests; one in a fixed location, and
> > an optional one in a variable location?
> >
> > I don't believe that we will see AHS on most iSCSI PDUs, so is
> > it OK to have a "slow path" for these?
> >
> > --
> > Mark
> >
> > julian_satran@il.ibm.com wrote:
> > >
> > > Dear colleagues,
> > >
> > > It looks like Format-2 is selected by popular vote.
> > >
> > > Julo
> >
> > --
> > Mark A. Bakke
> > Cisco Systems
> > mbakke@cisco.com
> > 763.398.1054
> >

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Mon Apr  2 12:05:50 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA17416
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 12:05:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32EJmD22497
	for ips-outgoing; Mon, 2 Apr 2001 10:19:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32EIpr22454
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 10:18:51 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 88DE594006
	for <ips@ece.cmu.edu>; Mon,  2 Apr 2001 10:18:51 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI: frame formats 
In-Reply-To: Message from sandeepj@research.bell-labs.com (Sandeep Joshi) 
   of "Sun, 01 Apr 2001 15:19:37 EDT." <200104011919.PAA20734@aura.research.bell-labs.com> 
References: <200104011919.PAA20734@aura.research.bell-labs.com> 
Date: Mon, 02 Apr 2001 10:17:33 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010402141851.88DE594006@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Sandeep,

> DataLen will now be max 8M/4M but then we don't wish to have large
> iSCSI PDUs in any case.

This max size is getting to the point where I'm sure it'll be an
irritant.

I would like (but, in fact, a sure way to guarantee that it won't
happen is for me to like it :^) to view iSCSI PDUs as the expected
grain size at which you might have software involvement in an
otherwise hardware-driven iSCSI implementation.

For example, when a target is returning read data for multiple
outstanding reads, it might want to return a bit at a time from each,
and each `bit' should be an iSCSI PDU.  Clearly sending these bits
will be a software-level decision.  That certainly was the rationale
for allowing multiple FCP DATA PDUs per read operation, and I naively
assumed similar logic was being applied here.

The alternative is to say that the hardware will do iSCSI PDU
chunking, but if that's the case, I expect that the header is, well, a
bit bulky.

I'm also incredibly unexcited about having data length be a multiple
of 4 bytes (if that's still in the cards).  There are operations within
the SCSI command set which return arbitrary length data.  There are
perfectly nominal cases where you get less data than you requested
(e.g. inquiry & request sense).  Furthermore, the SCSI architecture
does not prohibit this, even though certain commands do, so it is not
for iSCSI to say anything about this one way or the other.

The problem with handling lengths that include padding comes when
you're trying to move the data into a buffer which is a non-multiple
length.  For example, if I ask for 22 bytes of inquiry data (with a 22
byte buffer), what can I say, at the time a PDU arrives that has 24
bytes?  It might have 21 or 22 bytes (or perhaps even, erroneously 23
or 24 bytes).  A data residual coming later will tell the software how
much was actually there, but it can't tell the hardware.  The typical
expectation of this type of transfer is that it will only overwrite
bytes of the buffer that are actually transferred, but having padded
lengths will not allow this.

The completely standard solution to carrying arbitrary data of
arbitrary length in an aligned transfer unit is to pad the transfer
unit but report the exact (shorter) length.  Another solution, used by
FC, is to carry a pad length.  In iSCSI, why bother---you've just
reintroduced the 2 bits you were trying to remove?
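
To make the "pad the unit, report the exact length" approach concrete,
here is a small sketch.  The 4-byte alignment comes from the
discussion; the function names pad_segment and place_segment and the
zero fill are assumptions for illustration only:

```python
def pad_segment(payload: bytes, align: int = 4):
    """Return (exact_len, wire_bytes), zero-padding payload to a multiple of align."""
    pad = -len(payload) % align
    return len(payload), payload + bytes(pad)


def place_segment(buffer: bytearray, offset: int, exact_len: int, wire: bytes) -> int:
    """Copy only the exact_len real bytes into buffer; the padding is dropped."""
    if exact_len > len(wire):
        raise ValueError("exact length exceeds segment on the wire")
    if offset + exact_len > len(buffer):
        raise ValueError("segment would overrun the buffer")
    buffer[offset:offset + exact_len] = wire[:exact_len]
    return exact_len
```

With 21 bytes of inquiry data and a 22-byte buffer, 24 bytes travel on
the wire but only bytes 0..20 of the buffer are written; byte 21 keeps
whatever it held before, which is the behavior a padded length alone
cannot provide.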

The data length scenario is not comparable to IP header lengths, where
what is being carried is not arbitrary data.

Certainly, for iSCSI additional header segments (AHSs) you could
arguably use this cell length technique, since we can control what
we're carrying (AHSs that need exact byte lengths will have to be
internally self-describing) but frankly, I still think it's a bad
idea.

I can't understand why we're messing around with all these tricky
`solutions' to standard problems.  We should avoid the temptation to
get cute, and wholesomely provide the same capabilities as any other
SCSI transport.  Specifically:
  o allow long PDUs
  o carry exact data lengths

Steph


From owner-ips@ece.cmu.edu  Mon Apr  2 12:41:16 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA19581
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 12:41:15 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32Eqx025107
	for ips-outgoing; Mon, 2 Apr 2001 10:52:59 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from apollo.pirus.com ([63.91.118.2])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32DKnr18458
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 09:20:50 -0400 (EDT)
Message-ID: <E132D13F58DAAB45AE5D01CA75BD3D56B4F5@OZ>
From: "Binford, Charles" <CBinford@pirus.com>
To: "iSCSI (E-mail)" <ips@ece.cmu.edu>
Subject: FW: iSCSI: Out Of Sequence due to null sequence with multipleconn
	ections.
Date: Mon, 2 Apr 2001 09:20:15 -0400 
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

I sent this Friday, but never saw it come back, so I'm resending.....

Charles Binford
Pirus Networks
316.315.0382 x222


Doug, If I'm hearing you correctly, you are saying process the task management
command immediately, but toss any commands waiting for a CmdSN hole to be
filled.  The problem with this approach is that there are many different
flavors of task management commands and not all affect all commands.  The
iSCSI layer would have to duplicate the SCSI layer effort of parsing the
task management request and the LU/tags of all commands it contemplates
aborting.  I don't like that solution.

David,  I don't think the current state of things is acceptable as you imply
with your final statement of 'I'm not sure anything needs to be changed.'

FC solves this problem by requiring the initiator FCP layer (note: FCP, not
SCSI) to send an ABTS for every exchange 'in an ambiguous state' after any
task management function.  It is ugly, but it covers the hole of the
initiator not knowing for sure which commands were properly aborted and
which were not.

I have suggested in the past (although, from the response I've gotten, I
think I have been misunderstood) that the iSCSI initiator send all task
management functions down all connections of a session.  I'm NOT saying
duplicate the task management function.  The distinction being only one
instance of the task management function should be passed up to the SCSI
layer on the target side.  

To address the issue of the task management function being delayed because
of a slow connection I suggest the following:  upon receipt of the first
task management function request on any connection, the iSCSI target shall
start a relatively short timer (I'm thinking of a range from a few 100 ms to
a couple of seconds).  As soon as the task management function is received
on all connections, cancel the timer and pass a single instance of the
request up to the SCSI layer.  If the timer expires, close the connection(s)
the task management function(s) did not arrive on and pass up a single
instance of the request to the SCSI layer.
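
The target-side logic could be sketched roughly as below.  This is
only an illustration of the steps just described: the class name
TaskMgmtGate, the caller-supplied clock values and the 0.5 second
default are all assumptions, and it tracks a single outstanding task
management request:

```python
class TaskMgmtGate:
    """Collects the copies of one task management request across a session."""

    def __init__(self, connections, timeout=0.5):
        self.connections = set(connections)
        self.timeout = timeout        # seconds; assumed value in the suggested range
        self.seen = set()
        self.deadline = None
        self.delivered = False

    def on_request(self, conn, now):
        """Record a copy arriving on conn; return True when it should go up to SCSI."""
        if self.delivered or conn not in self.connections:
            return False
        if not self.seen:
            self.deadline = now + self.timeout    # first copy starts the timer
        self.seen.add(conn)
        if self.seen == self.connections:
            self.deadline = None                  # all copies in: cancel the timer
            self.delivered = True
            return True
        return False

    def on_tick(self, now):
        """Return the connections to close if the timer expired, else an empty set."""
        if self.delivered or self.deadline is None or now < self.deadline:
            return set()
        late = self.connections - self.seen
        self.connections -= late
        self.deadline = None
        self.delivered = True         # caller now passes the single request up
        return late
```

The caller closes whatever on_tick returns and then hands exactly one
instance of the request to the SCSI layer, so SCSI never sees the
duplicates.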

Charles Binford
Pirus Networks
316.315.0382 x222


-----Original Message-----
From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
Douglas Otis
Sent: Tuesday, March 27, 2001 9:42 AM
To: julian_satran@il.ibm.com; ips@ece.cmu.edu
Subject: RE: iSCSI: Out Of Sequence due to null sequence with
multipleconnections.


Julian,

It could be that a "stuck" command in flight or at the sequencer is the
reason for the task management which is made problematic by the null
sequence.  Even if implementers are aware of this problem, there is not a
good solution until null sequence is removed.  Making a flag to indicate
"immediate" or perhaps "reject prior pending iSCSI commands" is a means of
ensuring the initiator remains in control with respect to all situations.
This ensures the initiator is aware of the state of the target and sequence
of commands can be maintained if required.  Be careful about being too disk
centric.  Treating these SN numbers as unsigned allows a simple means of
tracking.  Here is an example explanation:

   Comparisons and arithmetic on SNs in this document SHOULD use Serial
   Number Arithmetic as defined in [RFC1982] where SERIAL_BITS = 32.
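
   As a minimal sketch of what that comparison looks like with
   SERIAL_BITS = 32 (the helper names sn_add and sn_lt are made up
   for illustration; the rules are those of RFC 1982):

```python
SERIAL_BITS = 32
MOD = 1 << SERIAL_BITS
HALF = 1 << (SERIAL_BITS - 1)


def sn_add(s: int, n: int) -> int:
    """Add n to serial number s; RFC 1982 only defines 0 <= n < 2**(SERIAL_BITS-1)."""
    if not 0 <= n < HALF:
        raise ValueError("increment out of range for serial arithmetic")
    return (s + n) % MOD


def sn_lt(a: int, b: int) -> bool:
    """True if a precedes b in serial number space.

    When a and b are exactly 2**(SERIAL_BITS-1) apart the RFC leaves the
    result undefined; this sketch returns False in both directions.
    """
    return a != b and ((b - a) % MOD) < HALF
```

   With this, a CmdSN of 0xFFFFFFFF compares as less than 0, so a wrap
   does not confuse the ordering as long as the window of outstanding
   numbers stays under 2**31.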

Doug

> David,
>
> Your summary is correct. And (except for a minor point) it is all a matter
> of target implementation.
> SCSI is not a "completely layered" stack and others have gone so far as to
> do task cancellation by an action at link layer (parallel and fiber).
> There might be 2 funny side-effects though if an implementer chooses to
> cancel "holes" (commands in flight on other connections):
>
> 1) the cancelled command is another task management command (there can be
> only one active, but what if the one active gets stuck?)
> 2)(academic I admit) the cancelled command arrives after a wrap around in
> command sequencing; this is a bit harder (although not impossible) to fix
> in the implementation
>
> Implementers should be aware of those side effects.
>
> Regards,
> Julo
>
>
>
> Black_David@emc.com on 26/03/2001 18:57:48
>
> Please respond to Black_David@emc.com
>
> To:   dotis@sanlight.net
> cc:   ips@ece.cmu.edu
> Subject:  RE: iSCSI: Out Of Sequence due to null sequence with
> multiple connections.
>
>
[stuff deleted]
>
> (4) is hard.  One SCSI task management command generates one
> response.  That response can either be generated immediately
> (command arrives, is passed to SCSI, SCSI does its thing) or
> at the right point in the sequence (command arrives, is
> sequenced by iSCSI, passed to SCSI at the right point in the
> sequence, and SCSI does its thing), but NOT both.  As things
> currently stand, having a task management command apply to
> in-flight commands requires sending the task management
> command for ordered delivery - so if it's desired to have
> the task management command take immediate effect and also
> catch everything in flight, it's going to have to be sent
> twice.  I'm not enthusiastic about the idea of the task
> management command taking immediate effect but delaying the
> response until everything in flight that might be affected
> arrives, as I suspect the Initiator would like to know what
> happened sooner rather than later.
>
> (3) is "interesting".  The results of applying a SCSI task
> management command to a SCSI operation are known only to
> SCSI, and hence asking that a command stuck in the iSCSI
> sequencer be affected immediately by a task management
> command is asking that the task management command have
> the side effect of changing some of the commands it affects
> to immediate delivery so that it can immediately do its
> (SCSI) thing to them.  I wouldn't want to mandate this,
> nor would I want to prohibit it, BUT ... if the above
> discussion of in-flight commands is correct, I would
> observe that the application on the Initiator side
> can't tell the difference between commands that are in-flight
> vs. waiting for something in-flight on another connection,
> and hence is going to have to issue the task management
> command for ordered delivery if it wants to affect operations
> in either place (and issue a second copy if it wants
> immediate action).
>
> The upshot is that, aside from a longer discussion of this
> issue, I'm not sure anything needs to be changed.  Comments?
>
> Thanks,
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
>
>
>
>



From owner-ips@ece.cmu.edu  Mon Apr  2 14:38:10 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA25377
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 14:38:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32GxjA04310
	for ips-outgoing; Mon, 2 Apr 2001 12:59:45 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32GxBr04277
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 12:59:11 -0400 (EDT)
Received: from hpindlm.cup.hp.com (hpindlm.cup.hp.com [15.13.95.89])
	by palrel3.hp.com (Postfix) with ESMTP
	id 810F46FD; Mon,  2 Apr 2001 09:59:08 -0700 (PDT)
Received: from mk731913.cup.hp.com (mk731912.cup.hp.com [15.8.80.111])
	by hpindlm.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA07875;
	Mon, 2 Apr 2001 10:02:59 -0700 (PDT)
Message-Id: <5.0.2.1.2.20010330091104.023845f8@esalpha2.cup.hp.com>
X-Sender: krause@esalpha2.cup.hp.com
X-Mailer: QUALCOMM Windows Eudora Version 5.0.2
Date: Fri, 30 Mar 2001 09:15:04 -0800
To: Sandeep Joshi <sandeepj@research.bell-labs.com>
From: Michael Krause <krause@cup.hp.com>
Subject: Re: SNMP traps
Cc: ips@ece.cmu.edu
In-Reply-To: <3AC3850A.56D628D1@research.bell-labs.com>
References: <3ABEB1D0.B97B6986@research.bell-labs.com>
 <3AC1FBEB.4A8158E@cisco.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

At 01:55 PM 3/29/2001 -0500, Sandeep Joshi wrote:

>Mark,
>
>Login & Authentication failure traps came to mind since they
>would help detect DOS attacks....and prevent them by flooding
>the network with SNMP messages ;-)

Ideally one has counters defined for these and is configured to issue a 
trap upon a counter reaching a specific threshold (1 is a possible value).
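
A minimal sketch of the counter-plus-threshold idea, under stated assumptions:
the class name ThresholdCounter and the send_trap callback are hypothetical,
and no real SNMP library or MIB object is being modelled. The trap fires once,
when the counter reaches the configured threshold, rather than on every event.

```python
from typing import Callable


class ThresholdCounter:
    """Event counter that issues one trap when it reaches a threshold."""

    def __init__(self, name: str, threshold: int,
                 send_trap: Callable[[str, int], None]) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.name = name
        self.threshold = threshold
        self.send_trap = send_trap   # hypothetical notification hook
        self.count = 0

    def increment(self) -> None:
        """Record one event; notify exactly when the threshold is reached."""
        self.count += 1
        if self.count == self.threshold:
            self.send_trap(self.name, self.count)
```

A threshold of 1 gives a trap on the first failure; a larger value keeps
counting every failure but limits the notification traffic during a flood.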


>Besides what Tom suggested (abnormal session terminations
>and LU list changes), we could also add PortalUp, PortalDown,
>NetworkEntityReset, etc.  We can add the mechanisms for now
>and determine the policies later.
>
>You may also want to raise this question with the T10 folks at
>the Interim meeting, if part of this is becoming a SCSI MIB.

It would be very good if the SCSI MIB (and CIM representation) is 
standardized and made as robust as possible to simplify the storage 
management interactions with endnodes.  Ideally, one would have a hierarchy 
of attributes defined that encompassed everything along with some mechanism 
to communicate how to access vendor-specific attributes.

Mike



From owner-ips@ece.cmu.edu  Mon Apr  2 14:40:45 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA25541
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 14:40:44 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32GQqR01858
	for ips-outgoing; Mon, 2 Apr 2001 12:26:52 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from chmls20.mediaone.net (chmls20.mediaone.net [24.147.1.156])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32GQKr01824
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 12:26:20 -0400 (EDT)
Received: from breinhold ([140.186.66.73])
	by chmls20.mediaone.net (8.11.1/8.11.1) with SMTP id f32GQAa01432
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 12:26:10 -0400 (EDT)
From: "Barry Reinhold" <bbrtrebia@mediaone.net>
To: "ISCSI" <ips@ece.cmu.edu>
Subject: iSCSI:2.11.5 TSID
Date: Mon, 2 Apr 2001 12:24:25 -0400
Message-ID: <BJEIKPAFDFPFNCPPBCGPGEMGCEAA.bbrtrebia@mediaone.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,

Section 2.11.5 states:

"The TSID is an initiator identifying tag set by the target.  A 0 in
   the returned TSID indicates that either the target supports only a
   single connection or that the ISID has already been used as a leading
   ISID. In both cases, the target rejects the login."

I have a question on this, but I'm not sure if it is an editorial or
technical issue.

If the login PDU arrives with an ISID that has "already been used as a
leading ISID" I am assuming that the TSID in this PDU must be zero. (A
leading ISID is an ISID in a LOGIN request PDU that has the TSID set to zero
and hence established a new connection)

In this case how does the target know that the ISID has already been used as
a leading ISID? There is no connection context if TSID = 0 and, for the
target, the ISID is only meaningful within a session.

If my assumption about TSID = 0 is wrong, and TSID is not zero then I would
suggest the following editorial change:


"TSID is a session identifying tag established by the target. The target MUST
return a value of 0 in the TSID field if the TSID specified in the login
request identified a session for which no more connections are allowed."

[Note: The goal of the editorial comment is to make it clear that the error
condition being discussed here is a login request with a non zero TSID that
can not be satisfied]

Barry Reinhold
Principal Architect
Trebia Networks
barry.reinhold@trebia.com
603-868-5144/603-659-0885/978-929-0830 x138



From owner-ips@ece.cmu.edu  Mon Apr  2 16:46:57 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA00263
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 16:46:56 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32J1pU13680
	for ips-outgoing; Mon, 2 Apr 2001 15:01:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from chmls20.mediaone.net (chmls20.mediaone.net [24.147.1.156])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32J1Nr13644
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 15:01:36 -0400 (EDT)
Received: from breinhold ([140.186.66.73])
	by chmls20.mediaone.net (8.11.1/8.11.1) with SMTP id f32J1Fa02852;
	Mon, 2 Apr 2001 15:01:15 -0400 (EDT)
From: "Barry Reinhold" <bbrtrebia@mediaone.net>
To: "Rod Harrison" <rod.harrison@windriver.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI: frame formats 
Date: Mon, 2 Apr 2001 14:59:29 -0400
Message-ID: <BJEIKPAFDFPFNCPPBCGPAEMKCEAA.bbrtrebia@mediaone.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Importance: Normal
In-Reply-To: <NEBBKMMOEMCINPLCHKGMGEDJCGAA.rod.harrison@windriver.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Rod,
	I'm pretty sure that we decided to go with bytes at the 50th IETF meeting
for data length. No one felt that a max data size of 16 megs in an iSCSI PDU
was an issue (remember this is not max SCSI data transfer size, it is max
data bytes in an iSCSI PDU).
	I do think we need to word the padding with a bit of care, as I can
envision people doing this in at least two different ways that would not
interoperate.
	The interoperability issue on the padding question boils down to: "Where
in the TCP stream do I put the Data Digest if the number of bytes I have to
send is not a multiple of 4?"
	My expectation was that the transmitter would be null padding the data
portion of the iSCSI PDU to a word boundary, then sticking on the digest.
Thus the padding is actually in the TCP stream. For example if <d> = a data
octet, <p> = pad octet, and <dd> = data digest octet, then the TCP stream
for a 6 byte data transfer would look as follows:
<octets in the header - end modulo 4> <d> <d> <d> <d> <d> <d> <p> <p> <dd>
....
	The receiver would get the value "6" in the data length portion of the
header. After pulling out the header, the receiver would pull out 8 bytes of
"data + pad" and then get the digest.
	The other way to do this is to have a "virtual pad" such that padding is
created by the receiver when constructing the iSCSI PDU from the TCP stream.
The padding is never actually in the TCP stream itself.
	I do not think this is as helpful, but whatever we do, the spec. should
address this minor detail so we don't trip over it.
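
A minimal sketch of the first option (real pad octets in the TCP stream),
under stated assumptions: the function names are hypothetical, zlib.crc32 is
only a stand-in for whatever data digest the spec ends up choosing, and the
digest is assumed to cover the pad octets, which this thread has not settled.

```python
import struct
import zlib

WORD = 4  # the data segment is padded to a 4-byte boundary


def pad_len(data_len: int) -> int:
    """Number of pad octets needed to reach the next multiple of 4."""
    return (-data_len) % WORD


def frame_data_segment(data: bytes) -> bytes:
    """Sender side: data octets, then null pad, then a 4-octet digest."""
    padded = data + b"\x00" * pad_len(len(data))
    # Stand-in digest (assumption): CRC-32 over data plus pad.
    digest = struct.pack(">I", zlib.crc32(padded) & 0xFFFFFFFF)
    return padded + digest


def parse_data_segment(stream: bytes, data_len: int):
    """Receiver side: the header carries the exact length (e.g. 6).

    Pull data_len rounded up to a word boundary, then the digest, and
    hand back only the exact data octets plus a digest-check flag.
    """
    padded_len = data_len + pad_len(data_len)
    padded = stream[:padded_len]
    digest = stream[padded_len:padded_len + 4]
    ok = digest == struct.pack(">I", zlib.crc32(padded) & 0xFFFFFFFF)
    return padded[:data_len], ok
```

For the 6-byte transfer above, frame_data_segment produces 6 data octets,
2 pad octets, and 4 digest octets; the receiver reads 8 octets of
"data + pad" before the digest and delivers exactly 6.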


>-----Original Message-----
>From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
>Rod Harrison
>Sent: Monday, April 02, 2001 1:44 PM
>To: ips@ece.cmu.edu
>Subject: RE: iSCSI: frame formats
>
>
>Stephen,
>
>	I don't agree with your concerns about the maximum PDU
>size. I think few targets, or indeed initiators, will be
>interested in negotiating PDU sizes anywhere near this
>large, so big transfers will have to be fragmented anyway.
>
>	However, I share your concern about the padding. I don't
>really see why we are considering it. If one has a local
>alignment issue it can, and I believe should, be taken care
>of locally and not in the specification. There are several
>easy ways of handling this sort of thing; insert and remove
>the pad locally; separate header and payload buffers
>allowing each to be naturally aligned, etc.
>
>	If we have to pad then every read of non-header data will
>have to involve a rounding calculation on the length, and
>then perhaps a second read to discard the pad if the
>underlying buffer is the exact size of the data. Possibly
>the same on send, if the data buffer is the exact size the
>transmit code can't just 'go off the end,' it will have to
>send the data, and then fake up some pad and make another
>send.
>
>	Am I missing something here, why do we care about padding?
>
>	- Rod
>
>-----Original Message-----
>From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On
>Behalf Of
>Stephen Bailey
>Sent: Monday, April 02, 2001 3:18 PM
>To: ips@ece.cmu.edu
>Subject: Re: iSCSI: frame formats
>
>
>Sandeep,
>
>> DataLen will now be max 8M/4M but then we dont wish to have large
>> iSCSI PDUs in any case.
>
>This max size is getting to the point where I'm sure it'll be an
>irritant.
>
>I would like (but, in fact, a sure way to guarantee that it won't
>happen is for me to like it :^) to view iSCSI PDUs as the expected
>grain size at which you might have software involvement in an
>otherwise hardware-driven iSCSI implementation.
>
>For example, when a target is returning read data for multiple
>outstanding reads, it might want to return a bit at a time from each,
>and each `bit' should be an iSCSI PDU.  Clearly sending these bits
>will be a software-level decision.  That certainly was the rationale
>for allowing multiple FCP DATA PDUs per read operation, and I naively
>assumed similar logic was being applied here.
>
>The alternative is to say that the hardware will do iSCSI PDU
>chunking, but if that's the case, I expect that the header is, well, a
>bit bulky.
>
>I'm also incredibly unexcited about having data length be a multiple
>of 4 bytes (if that's still in the cards).  There are operations within
>the SCSI command set which return arbitrary length data.  There are
>perfectly nominal cases where you get less data than you requested
>(e.g. inquiry & request sense).  Furthermore, the SCSI architecture
>does not prohibit this, even though certain commands do, so it is not
>for iSCSI to say anything about this one way or the other.
>
>The problem with handling lengths that include padding comes when
>you're trying to move the data into a buffer which is a non-multiple
>length.  For example, if I ask for 22 bytes of inquiry data (with a 22
>byte buffer), what can I say, at the time a PDU arrives that has 24
>bytes?  It might have 21 or 22 bytes (or perhaps even, erroneously 23
>or 24 bytes).  A data residual coming later will tell the software how
>much was actually there, but it can't tell the hardware.  The typical
>expectation of this type of transfer is that it will only overwrite
>bytes of the buffer that are actually transferred, but having padded
>lengths will not allow this.
>
>The completely standard solution to carrying arbitrary data of
>arbitrary length in an aligned transfer unit is to pad the transfer
>unit but report the exact (shorter) length.  Another solution, used by
>FC, is to carry a pad length.  In iSCSI, why bother---you've just
>reintroduced the 2 bits you were trying to remove?
>
>The data length scenario is not comparable to IP header lengths, where
>what is being carried is not arbitrary data.
>
>Certainly, for iSCSI additional header segments (AHSs) you could
>arguably use this cell length technique, since we can control what
>we're carrying (AHSs that need exact byte lengths will have to be
>internally self-describing) but frankly, I still think it's a bad
>idea.
>
>I can't understand why we're messing around with all these tricky
>`solutions' to standard problems.  We should avoid the temptation to
>get cute, and wholesomely provide the same capabilities as any other
>SCSI transport.  Specifically:
>  o allow long PDUs
>  o carry exact data lengths
>
>Steph
>



From owner-ips@ece.cmu.edu  Mon Apr  2 16:47:45 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA00296
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 16:47:44 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32Iupw13197
	for ips-outgoing; Mon, 2 Apr 2001 14:56:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32Iuhr13187
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 14:56:44 -0400 (EDT)
Received: from colosus2.cup.hp.com (colosus2.cup.hp.com [15.13.128.145])
	by palrel3.hp.com (Postfix) with ESMTP id 649B72A2
	for <ips@ece.cmu.edu>; Mon,  2 Apr 2001 11:56:42 -0700 (PDT)
Received: from hp.com (IDENT:plabat@pl703521.cup.hp.com [15.13.133.216])
	by colosus2.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id LAA25852;
	Mon, 2 Apr 2001 11:56:41 -0700 (PDT)
Message-ID: <3AC8BD71.A7EF675D@hp.com>
Date: Mon, 02 Apr 2001 10:57:05 -0700
From: Pierre Labat <pierre_labat@hp.com>
Organization: Hewlett Packard ATM-SISL
X-Mailer: Mozilla 4.51 [en] (X11; I; Linux 2.2.5-15 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <200104021733.KAA01362@core.rose.hp.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

"Mallikarjun C." wrote:

> To be fair to data SACK, one could think of an upper bound
> on the unack'ed data - agreed on at the login time.  While not
> requiring acks on every PDU, it gives targets the deterministic
> maximum on the buffer size they have to keep around if they
> choose to "reliably" support data SACK.  The current answer of
> "replay buffer size/IO size", IMHO, is simply not attractive.
> Also to be fair to data SACK, I believe FCP-2 allows sequence-level
> error recovery in an I/O.
>
> However, I think that it's extremely useful to include a discussion
> in the draft of  the TCP checksum "escape" statistics and the
> device types for which this was considered an absolute requirement
> to make forward progress at these error rates (like huge tape
> backups?) - essentially the reasons that convinced Julian to define
> this mechanism in. That gives credibility and acceptance to this,
> or alternately may lead to the consensus that data SACK is not required.
> --
> Mallikarjun

About TCP checksum "escape" statistics, what I saw is:

A) Jonathan Stone,Craig Partridge, "When the CRC and TCP Checksum Disagree"
 http://www.acm.org/sigcomm/sigcomm2000/conf/abstract/9-1.htm

1 escape in 200 million

B) V. Paxson "1999 End-to-End Internet Packet Dynamics."
   http://www.aciri.org/vern/papers.html

 1 escape in 300 million

C) J. Stone et al. "Performance of Checksums and CRC's over Real Data"
  IEEE/ACM Transactions on Networking, Vol. 6, No. 5, October 1998

http://dev.acm.org/pubs/articles/journals/ton/1998-6-5/p529-stone/p529-stone.pdf

Less than 1 escape in 10e17 segments when taking into
account the link layer AAL5 CRC. (see page 540 left column on top).




Taking the worst case, that is 1 escape in 200 million.
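
A rough back-of-the-envelope sketch of what the worst-case figure could mean
in volume terms. Everything except the 1-in-200-million rate is an assumption:
the rate is taken to be per TCP segment, each segment is assumed to carry
1460 payload bytes, and the link is assumed to be a fully utilised 1 Gbit/s.

```python
SEGMENTS_PER_ESCAPE = 200_000_000   # worst case quoted above (assumed per segment)
SEGMENT_BYTES = 1460                # assumption: typical Ethernet-sized payload
LINK_BYTES_PER_SEC = 125_000_000    # assumption: 1 Gbit/s, fully utilised


def bytes_per_escape(segments: int = SEGMENTS_PER_ESCAPE,
                     segment_bytes: int = SEGMENT_BYTES) -> int:
    """Expected payload bytes transferred per undetected error."""
    return segments * segment_bytes


def seconds_per_escape(link_bytes_per_sec: int = LINK_BYTES_PER_SEC) -> float:
    """Expected time between undetected errors on a saturated link."""
    return bytes_per_escape() / link_bytes_per_sec
```

Under these assumptions the figure works out to about 292 GB of payload per
escape, or roughly 39 minutes on a saturated 1 Gbit/s link; different segment
sizes or link speeds scale the result linearly.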

Regards,

Pierre



From owner-ips@ece.cmu.edu  Mon Apr  2 16:47:53 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA00307
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 16:47:52 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32Jqn817627
	for ips-outgoing; Mon, 2 Apr 2001 15:52:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32Jqjr17619
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 15:52:45 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <HML4ZYY0>; Mon, 2 Apr 2001 15:43:38 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A070801539D@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: julian_satran@il.ibm.com, ips@ece.cmu.edu
Subject: RE: frame formats
Date: Mon, 2 Apr 2001 15:52:35 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Let me jump in here and do this officially (or
is that officiously :-)?) -- Format 2 is the
"rough consensus" of the working group.  Those
who disagree should post to the list.

The following related issues are still open:
- what pieces of the header are covered in which
	digest(s)
- if a length field covered by a digest has to
	be used to locate that digest, whether
	and how to protect that length.
We need to reach consensus on these in the next
few days.

Thanks,
--David
---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------


> -----Original Message-----
> From:	julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> Sent:	Friday, March 30, 2001 11:52 AM
> To:	ips@ece.cmu.edu
> Subject:	frame formats
> 
> 
> 
> Dear colleagues,
> 
> It looks like Format-2 is selected by popular vote.
> 
> Julo
> 


From owner-ips@ece.cmu.edu  Mon Apr  2 16:48:30 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA00366
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 16:48:25 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32IKmv10410
	for ips-outgoing; Mon, 2 Apr 2001 14:20:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from crufty.research.bell-labs.com (crufty.research.bell-labs.com [204.178.16.49])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f32IK9r10361
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 14:20:09 -0400 (EDT)
Received: from scummy.research.bell-labs.com ([135.104.2.10]) by crufty; Mon Apr  2 14:17:33 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by scummy; Mon Apr  2 14:19:37 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id OAA04373;
	Mon, 2 Apr 2001 14:19:37 -0400 (EDT)
Message-ID: <3AC8C2B9.C250161D@research.bell-labs.com>
Date: Mon, 02 Apr 2001 14:19:37 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Michael Krause <krause@cup.hp.com>
CC: ips@ece.cmu.edu
Subject: Re: SNMP traps
References: <3ABEB1D0.B97B6986@research.bell-labs.com>
	 <3AC1FBEB.4A8158E@cisco.com> <5.0.2.1.2.20010330091104.023845f8@esalpha2.cup.hp.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Michael Krause wrote:
> 
> At 01:55 PM 3/29/2001 -0500, Sandeep Joshi wrote:
> 
> >Mark,
> >
> >Login & Authentication failure traps came to mind since they
> >would help detect DOS attacks....and prevent them by flooding
> >the network with SNMP messages ;-)
> 
> Ideally one has counters defined for these and is configured to issue a
> trap upon a counter reaching a specific threshold (1 is a possible value).

agreed.

I see that an IPsec MIB is also in progress.  It's going to require
some processing to correlate IPsec trap information with iSCSI WWUIs. 

See pages 58-62 for the various traps they have defined.
http://www.ietf.org/internet-drafts/draft-ietf-ipsec-monitor-mib-04.txt


> 
> >Besides what Tom suggested (abnormal session terminations
> >and LU list changes), we could also add PortalUp, PortalDown,
> >NetworkEntityReset, etc.  We can add the mechanisms for now
> >and determine the policies later.
> >
> >You may also want to raise this question with the T10 folks at
> >the Interim meeting, if part of this is becoming a SCSI MIB.
> 
> It would be very good if the SCSI MIB (and CIM representation) is
> standardized and made as robust as possible to simplify the storage
> management interactions with endnodes.  Ideally, one would have a hierarchy
> of attributes defined that encompassed everything along with some mechanism
> to communicate how to access vendor-specific attributes.
> 
> Mike


From owner-ips@ece.cmu.edu  Mon Apr  2 17:30:45 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA01944
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 17:30:43 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32EEke22132
	for ips-outgoing; Mon, 2 Apr 2001 10:14:46 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from lub1028.lss.emc.com ([168.159.39.28])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32EDer22072
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 10:13:40 -0400 (EDT)
Received: from emc.com (IDENT:jhall@localhost.localdomain [127.0.0.1])
	by lub1028.lss.emc.com (8.9.3/8.9.3) with ESMTP id KAA12389
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 10:13:35 -0400
Message-Id: <200104021413.KAA12389@lub1028.lss.emc.com>
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
Date: Mon, 02 Apr 2001 10:13:35 -0400
From: "Jon Hall" <jhall@emc.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


I agree with Somesh.  And would go farther -- the complexity
that results from retaining enough target-side state to respond
to a SACK/SNACK request is non-trivial and needs clear justification.
Intuitively, a CRC that discovers an error in an iSCSI pdu header
(that the TCP cksum missed) seems like it should be a rare event.

What is the frequency of this event?  IMO the answer to this
question should be written into the protocol spec -- assuming
that it substantiates the benefit of SACK/SNACK.  Otherwise, the
SACK/SNACK pdu should be removed.

-Jon

julian_satran@il.ibm.com writes:
>
>Somesh,
>
>As I stated earlier - the DataSN was created to detect missing data PDUs.
>SNACK is needed to recover missing StatusSN and missing dataSN is only a
>bonus if the target wants to support it.  It is a trivial mechanism and I
>think it should stay.
>
>Julo
>
>"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
>
>Please respond to someshg@yahoo.com
>
>To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
>cc:
>Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
>Sorry to have been missing for a while. Hope you will
>appreciate my being back in action :-). It was a fairly
>clear consensus in Orlando that applications broke up
>their transfers into reasonably small chunks i.e. they
>did not have very long running transfers.
>
>Therefore the consensus was that a command level recovery
>mechanism was sufficient instead of an ack/sack for each
>data PDU.
>
>The SACK mechanism was a post Orlando invention. Without
>an ack mechanism (for every data PDU), the SACK mechanism
>just imposes additional burden on either end of the session,
>without really much benefit.
>
>The benefit of having SACK is of saving bandwidth in case
>the data part of the data PDU failed an integrity check
>(but passed TCP checksum). This is a rare enough case that
>as a percentage, the bandwidth loss from retransmitting
>all the data associated with a read or write command is
>very very small.
>
>In addition, it avoids the complexity of restarting
>something from the middle, as compared to from the beginning.
>
>To me it seems that there is significant simplicity (from
>implementation, reliability and recovery process) from
>having smaller data transfer per command.
>
>I would really like to get rid of the SACK command.
>
>Somesh
>
>> -----Original Message-----
>> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
>> julian_satran@il.ibm.com
>> Sent: Wednesday, March 28, 2001 6:57 AM
>> To: ips@ece.cmu.edu
>> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>
>>
>>
>>
>> Mallikarjun,
>>
>> Last summer I thought that recovery within a connection should be left to
>> TCP. It is simple and could be made available through IPsec (if no new
>> option of any form can be added).
>>
>> Two things killed this:
>>
>>    The requirement to have a data encapsulation that can pass through
>>    application proxies (like a storage router)
>>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
>>    header
>>
>>
>> As for the ACK - I am very much in favor of it (it is a no brainer) and
>> implementations are in fact allowed to drop even unacked data.
>>
>> I am bound by the Orlando meeting decision to drop it. Except for the
>> regular "oppose everything" crowd, the two vocal opponents were Somesh
>> Gupta and Matt Wakeley.
>>
>> David may or may not want to re-open the issue - I am not going to ask
>> for it.
>>
>> Regards,
>> Julo
>>
>> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
>>
>> Please respond to cbm@rose.hp.com
>>
>> To:   Black_David@emc.com
>> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com, someshg@yahoo.com,
>>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
>>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
>> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>
>>
>>
>>
>> David and Julian,
>>
>> I appreciate both your views, and should I say that they're
>> along predicted lines :-)
>>
>> - David's right in saying that the situation is akin to FC's.
>>   However, I would like to point out that FC is an unreliable
>>   transport, and hence is forced to pick up a lot of the transport
>>   baggage (at least in FCP-2, as I understand), in addition
>>   to being a SCSI encapsulation layer.  Unfortunately, even with
>>   TCP being the "reliable" transport, iSCSI is going along the
>>   same lines - ie. transport baggage + SCSI encapsulation.  My
>>   point is - if this is indeed a necessary evil, why don't we
>>   complete iSCSI's transport functionality by data-ACKs?
>>
>> - If data SACK is introduced mostly to make up for TCP's shortcomings,
>>   we're making its usage (and implementation) drastically less appealing
>>   since the only way error recovery algorithms can *rely* on data SACK
>>   is when replay is supported (or, "ReplaySupport=yes"  in my proposal),
>>   which is extremely expensive.  IOW, we're defining data SACK in the
>>   draft and not providing any incentives to implement and use it!
>>
>> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
>>   protocol in its definition so far (and I believe, rightly so -
>>   mandating command ordering, bi-di support, SCSI CRN support to name
>>   a few examples), the perfectly SCSI-legal R/W interactions that break
>>   in other transports *do not* have to break in iSCSI.
>>
>> - A last idea (may seem radical at this point) in regards to iSCSI
>>   being a "full transport". This provides us an opportunity to "cast
>>   off" the transport baggage in future when we truly move to a "reliable"
>>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
>>   keeping the encapsulation stuff separate from the transport stuff.
>>   (Julian, I heard from Randy that ideas similar to this were explored
>>   in your Haifa meeting.  And yes, he recalls they were given up since
>>   TCP was supposed to be reliable and granularity of recovery was deemed
>>   one I/O.)
>>
>> With that said, may I request David (with his co-chair hat on, :-))
>> to add some binding comments/observations on this discussion?
>>
>> If we decide to leave data SACKs as unattractive to implement, the draft
>> should at the least add a statement like - "Note that satisfying all
>> possible data SACK requests for a task with an unacknowledged status
>> implies implementing the I/O replay buffer on the part of targets."
>> --
>> Mallikarjun
>>
>>
>> Mallikarjun Chadalapaka
>> Networked Storage Architecture
>> Network Storage Solutions Organization
>> MS 5668   Hewlett-Packard, Roseville.
>> cbm@rose.hp.com
>>
>>
>>
>>
>> >I think Julian's basically right -- I would point
>> >out that any case of write after read that breaks
>> >over iSCSI will also break over Fibre Channel.
>> >On FC, the scenario starts with a frame CRC failure
>> >on read data at the Initiator, so applications
>> >have to cope and typically do so by enforcing
>> >ordering at the app rather than using SCSI task
>> >ordering.
>> >
>> >While SCSI has clever tools like ACA and task
>> >ordering that appear to allow dependent operations
>> >to be sent to the target concurrently, in practice
>> >they don't work and/or aren't used (funny thing,
>> >those two reinforce each other ;-) ).  Hence
>> >a minimal approach to them is in order:
>> >- Make sure the result will interoperate.
>> >- Make sure T10 doesn't ding us for leaving something
>> >    completely out.
>> >- Don't specify anything not needed for the above.
>> >
>> >My 0.02,
>> >--David
>> >
>> >> -----Original Message-----
>> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
>> >> Sent:  Tuesday, March 27, 2001 9:23 AM
>> >> To:    cbm@rose.hp.com
>> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
>> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
>> >> Black_David@emc.com
>> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>
>> >>
>> >>
>> >> Mallikarjun,
>> >>
>> >> I commiserate with you at the lack of ack for data but the Orlando
>> >> meeting stated - no.  Recall that I kept the number only as a mechanism
>> >> to detect missing packets.
>> >>
>> >> You can achieve the effect you want by keeping around data for a while
>> >> (you determine how long and then discard).
>> >>
>> >> If a SACK comes and you can recover - fine. If not you either reaccess
>> >> the media (if you know how) or reject and let the initiator retry.
>> >>
>> >> You should not worry about R/W conflicts as programs bound to have such
>> >> conflicts either:
>> >>
>> >> 1)can live with them or
>> >> 2)protect themselves through some locks and rely on
>> >> "operation-end-status" to keep results deterministic.
>> >>
>> >> Regards,
>> >> Julo
>> >>
>> >>
>> >>
>> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
>> >>
>> >> Please respond to cbm@rose.hp.com
>> >>
>> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
>> >>       Julian Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
>> >> cc:   Black_David@emc.com
>> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>
>> >>
>> >>
>> >>
>> >> Hi Error Recovery Team,
>> >>
>> >> iSCSI can discard PDUs because of digest errors and request
>> >> retransmissions using the iSCSI data SACK.  To deal with such
>> >> an eventuality, targets that want to support data SACK have
>> >> the following options:
>> >>
>> >> (A) maintain a complete "replay" buffer for the entire I/O since
>> >>   a SACK could come anytime before the status is ack'ed by the
>> >>   initiator. [ simple, but extremely expensive in memory resources]
>> >>
>> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
>> >>   This enables keeping only those I/O buffers that haven't been ack'ed
>> >>   by the initiator. IOW, become a real full transport! [ everyone
>> >>   disliked it earlier...]
>> >>
>> >> (C) re-access the medium for data retransmission requests.  Now there
>> >>   are 3 sub-cases in this to handle the changed data on the medium in a
>> >>   write-after-read scenario.  (SEE NOTE.1 at the bottom on how it is
>> >>   legal.)
>> >>      (1) On seeing any write, stall till status is ack'ed for all the
>> >>             previous reads (basically drain the pipe). [simple, but
>> >>             incurs an additional roundtrip delay for all writes].
>> >>      (2) A variation of the above, keep an eye only on the prior
>> >>             overlapping reads. [more BW efficient, but complicated to
>> >>             resolve the block dependencies in a stream of reads
>> >>             followed by writes]
>> >>      (3) Document the caveat and leave it up to the applications
>> >>             to avoid this case since this leads to data integrity
>> >>             issues.
>> >>             [pushing to apps since the transport can't get it right!]
>> >>
>> >> My first preference is (B), followed by (A), and I suggest we not go
>> >> to (C) at all with its inherent dangers.
>> >>
>> >> Doing (B) naturally completes the transport job that iSCSI has taken
>> >> on itself in view of TCP's claimed unreliable checksum.  That is the
>> >> right thing to do architecturally instead of being a "semi-transport"!
>> >>
>> >> Comments?
>> >> --
>> >> Mallikarjun
>> >>
>> >>
>> >> Mallikarjun Chadalapaka
>> >> Networked Storage Architecture
>> >> Network Storage Solutions Organization
>> >> MS 5668   Hewlett-Packard, Roseville.
>> >> cbm@rose.hp.com
>> >>
>> >>
>> >> __________________________________________________________________________
>> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly legal
>> >>         if SCSI sets the ORDERED task attribute on both the commands AND
>> >>         sets the NACA bit to one to indicate that Write shall be executed
>> >>         only if the Read did not fail (result in a Check Condition).
>> >>
>> >>         In the current case, since Read completed just fine from SCSI's
>> >>         point of view, SCSI is moving on to execute Write.  Those read
>> >>         buffers had been freed up since iSCSI received an ACK at the TCP
>> >>         level, and since iSCSI has no other way to have the data ack'ed!


From owner-ips@ece.cmu.edu  Mon Apr  2 17:43:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA02206
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 17:43:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32Hfpg07542
	for ips-outgoing; Mon, 2 Apr 2001 13:41:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mail.wrs.com (unknown-1-11.wrs.com [147.11.1.11])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32HfMr07502
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 13:41:22 -0400 (EDT)
Received: from london ([192.168.1.77])
	by mail.wrs.com (8.9.3/8.9.1) with SMTP id KAA28339
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 10:40:42 -0700 (PDT)
From: "Rod Harrison" <rod.harrison@windriver.com>
To: <ips@ece.cmu.edu>
Subject: RE: iSCSI: frame formats 
Date: Mon, 2 Apr 2001 18:43:56 +0100
Message-ID: <NEBBKMMOEMCINPLCHKGMGEDJCGAA.rod.harrison@windriver.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
In-Reply-To: <20010402141851.88DE594006@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Stephen,

	I don't agree with your concerns about the maximum PDU
size. I think few targets, or indeed initiators, will be
interested in negotiating PDU sizes anywhere near this
large, so big transfers will have to be fragmented anyway.

	However, I share your concern about the padding. I don't
really see why we are considering it. If one has a local
alignment issue it can, and I believe should, be taken care
of locally and not in the specification. There are several
easy ways of handling this sort of thing; insert and remove
the pad locally; separate header and payload buffers
allowing each to be naturally aligned, etc.

	If we have to pad then every read of non-header data will
have to involve a rounding calculation on the length, and
then perhaps a second read to discard the pad if the
underlying buffer is the exact size of the data. Possibly
the same on send, if the data buffer is the exact size the
transmit code can't just 'go off the end,' it will have to
send the data, and then fake up some pad and make another
send.
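
For concreteness, the rounding and pad-discard steps described above
can be sketched roughly as follows.  This is only an illustration: the
4-byte unit is assumed from the multiple-of-4 discussion in this
thread, and the names padded_length, pad_bytes and strip_pad are made
up for the example, not taken from the draft.

```python
PAD_UNIT = 4  # assumed alignment unit (the 4-byte multiple discussed in this thread)


def padded_length(data_len: int, unit: int = PAD_UNIT) -> int:
    """Round data_len up to the next multiple of unit."""
    if data_len < 0:
        raise ValueError("data_len must be non-negative")
    return (data_len + unit - 1) // unit * unit


def pad_bytes(data_len: int, unit: int = PAD_UNIT) -> int:
    """Number of pad bytes that follow data_len bytes of real data."""
    return padded_length(data_len, unit) - data_len


def strip_pad(wire: bytes, data_len: int, unit: int = PAD_UNIT) -> bytes:
    """Return only the real data from a padded segment (hypothetical helper).

    The receiver must know the exact data_len; the pad is read and discarded.
    """
    if len(wire) != padded_length(data_len, unit):
        raise ValueError("segment length does not match the padded length")
    return wire[:data_len]
```

With a 22-byte transfer the receiver has to account for 24 bytes on
the wire and throw 2 of them away, which is the extra step in question.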

	Am I missing something here? Why do we care about padding?

	- Rod

-----Original Message-----
From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
Stephen Bailey
Sent: Monday, April 02, 2001 3:18 PM
To: ips@ece.cmu.edu
Subject: Re: iSCSI: frame formats


Sandeep,

> DataLen will now be max 8M/4M but then we dont wish to have large
> iSCSI PDUs in any case.

This max size is getting to the point where I'm sure it'll be an
irritant.

I would like (but, in fact, a sure way to guarantee that it won't
happen is for me to like it :^) to view iSCSI PDUs as the expected
grain size at which you might have software involvement in an
otherwise hardware-driven iSCSI implementation.

For example, when a target is returning read data for multiple
outstanding reads, it might want to return a bit at a time from each,
and each `bit' should be an iSCSI PDU.  Clearly sending these bits
will be a software-level decision.  That certainly was the rationale
for allowing multiple FCP DATA PDUs per read operation, and I naively
assumed similar logic was being applied here.

The alternative is to say that the hardware will do iSCSI PDU
chunking, but if that's the case, I expect that the header is, well, a
bit bulky.

I'm also incredibly unexcited about having data length be a multiple
of 4 bytes (if that's still in the cards).  There are operations within
the SCSI command set which return arbitrary length data.  There are
perfectly nominal cases where you get less data than you requested
(e.g. inquiry & request sense).  Furthermore, the SCSI architecture
does not prohibit this, even though certain commands do, so it is not
for iSCSI to say anything about this one way or the other.

The problem with handling lengths that include padding comes when
you're trying to move the data into a buffer which is a non-multiple
length.  For example, if I ask for 22 bytes of inquiry data (with a 22
byte buffer), what can I say, at the time a PDU arrives that has 24
bytes?  It might have 21 or 22 bytes (or perhaps even, erroneously 23
or 24 bytes).  A data residual coming later will tell the software how
much was actually there, but it can't tell the hardware.  The typical
expectation of this type of transfer is that it will only overwrite
bytes of the buffer that are actually transferred, but having padded
lengths will not allow this.

The completely standard solution to carrying arbitrary data of
arbitrary length in an aligned transfer unit is to pad the transfer
unit but report the exact (shorter) length.  Another solution, used by
FC, is to carry a pad length.  In iSCSI, why bother---you've just
reintroduced the 2 bits you were trying to remove?

The data length scenario is not comparable to IP header lengths, where
what is being carried is not arbitrary data.

Certainly, for iSCSI additional header segments (AHSs) you could
arguably use this cell length technique, since we can control what
we're carrying (AHSs that need exact byte lengths will have to be
internally self-describing) but frankly, I still think it's a bad
idea.

I can't understand why we're messing around with all these tricky
`solutions' to standard problems.  We should avoid the temptation to
get cute, and wholesomely provide the same capabilities as any other
SCSI transport.  Specifically:
  o allow long PDUs
  o carry exact data lengths

Steph



From owner-ips@ece.cmu.edu  Mon Apr  2 17:49:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA02370
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 17:49:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32HMpn06027
	for ips-outgoing; Mon, 2 Apr 2001 13:22:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from crufty.research.bell-labs.com (crufty.research.bell-labs.com [204.178.16.49])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f32HM8r05968
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 13:22:08 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by crufty; Mon Apr  2 13:19:35 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Mon Apr  2 13:21:39 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id NAA00585;
	Mon, 2 Apr 2001 13:21:38 -0400 (EDT)
Message-ID: <3AC8B522.35A3332D@research.bell-labs.com>
Date: Mon, 02 Apr 2001 13:21:38 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Stephen Bailey <steph@cs.uchicago.edu>
CC: ips@ece.cmu.edu
Subject: Re: iSCSI: frame formats
References: <200104011919.PAA20734@aura.research.bell-labs.com> <20010402141851.88DE594006@sandmail.sandburst.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit


comments in text..

-Sandeep

Stephen Bailey wrote:
> 
> Sandeep,
> 
> > DataLen will now be max 8M/4M but then we dont wish to have large
> > iSCSI PDUs in any case.
> 
> This max size is getting to the point where I'm sure it'll be an
> irritant.

Your point about maxSize limitations set me thinking..

Hardware folks don't like parsing down the AHS to find the length
field so perhaps putting jumbo-lengths in AHS is unattractive.

So what is the level of HW complexity in processing the SCSI PDUs which 
already have 6/10/12/16 byte commands, and consequently put the 
length field in *variable* locations ?   Can iSCSI adopt a similar
scheme to future-proof it from maxSize limitations ?

Roughly, a few extra states get added into the FSM to process variable
length BHS based on the opcode, and you may need an extra register or
two.  But in hardware this can still get done in parallel.  Any thoughts ?

Needless to say, a SCSI-like variable BHS kills this 2-reads thingy that 
software implementors have been talking about.
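
As a rough illustration of the SCSI convention mentioned above: the
CDB length is normally implied by the top three bits of the operation
code (the group code), so a parser can decide how many bytes to expect
after looking at the first byte.  The table below reflects my
understanding of the usual group-code assignments; cdb_length is a
made-up helper name, and whether an iSCSI BHS could be keyed off its
opcode in the same way is exactly the open question, not something the
draft defines.

```python
# Group code (top 3 bits of the SCSI operation code) -> CDB length in bytes.
# Groups 3, 6 and 7 are reserved or vendor specific and have no fixed
# length here.
CDB_LEN_BY_GROUP = {0: 6, 1: 10, 2: 10, 4: 16, 5: 12}


def cdb_length(opcode: int):
    """Return the CDB length implied by the opcode, or None if not fixed."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    group = (opcode >> 5) & 0x7
    return CDB_LEN_BY_GROUP.get(group)
```

A variable BHS would need the same kind of first-byte lookup before the
position of the length field is known, which is what removes the fixed
two-read pattern for software.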


> 
> I would like (but, in fact, a sure way to guarantee that it won't
> happen is for me to like it :^) to view iSCSI PDUs as the expected
> grain size at which you might have software involvement in an
> otherwise hardware-driven iSCSI implementation.
> 
> For example, when a target is returning read data for multiple
> outstanding reads, it might want to return a bit at a time from each,
> and each `bit' should be an iSCSI PDU.  Clearly sending these bits
> will be a software-level decision.  That certainly was the rationale
> for allowing multiple FCP DATA PDUs per read operation, and I naively
> assumed similar logic was being applied here.
> 
> The alternative is to say that the hardware will do iSCSI PDU
> chunking, but if that's the case, I expect that the header is, well, a
> bit bulky.
> 
> I'm also incredibly unexcited about having data length be a multiple
> of 4 bytes (if that's still in the cards).  There are operations within
> the SCSI command set which return arbitrary length data.  There are
> perfectly nominal cases where you get less data than you requested
> (e.g. inquiry & request sense).  Furthermore, the SCSI architecture
> does not prohibit this, even though certain commands do, so it is not
> for iSCSI to say anything about this one way or the other.


If I am not mistaken, the dataLen value will be exact (not the padded
length) and it will be in units of bytes (not words - this mistake was
pointed out by someone at the IETF).


> 
> The problem with handling lengths that include padding comes when
> you're trying to move the data into a buffer which is a non-multiple
> length.  For example, if I ask for 22 bytes of inquiry data (with a 22
> byte buffer), what can I say, at the time a PDU arrives that has 24
> bytes?  It might have 21 or 22 bytes (or perhaps even, erroneously 23
> or 24 bytes).  A data residual coming later will tell the software how
> much was actually there, but it can't tell the hardware.  The typical
> expectation of this type of transfer is that it will only overwrite
> bytes of the buffer that are actually transferred, but having padded
> lengths will not allow this.
> 
> The completely standard solution to carrying arbitrary data of
> arbitrary length in an aligned transfer unit is to pad the transfer
> unit but report the exact (shorter) length.  Another solution, used by
> FC, is to carry a pad length.  In iSCSI, why bother---you've just
> reintroduced the 2 bits you were trying to remove?
> 
> The data length scenario is not comparable to IP header lengths, where
> what is being carried is not arbitrary data.
> 
> Certainly, for iSCSI additional header segments (AHSs) you could
> arguably use this cell length technique, since we can control what
> we're carrying (AHSs that need exact byte lengths will have to be
> internally self-describing) but frankly, I still think it's a bad
> idea.
> 
> I can't understand why we're messing around with all these tricky
> `solutions' to standard problems.  We should avoid the temptation to
> get cute, and wholesomely provide the same capabilities as any other
> SCSI transport.  Specifically:
>   o allow long PDUs
>   o carry exact data lengths
> 
> Steph


From owner-ips@ece.cmu.edu  Mon Apr  2 17:50:53 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA02394
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 17:50:52 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32JDpL14672
	for ips-outgoing; Mon, 2 Apr 2001 15:13:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32JClr14598
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 15:12:47 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <HML4ZWBJ>; Mon, 2 Apr 2001 15:03:42 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A070801539A@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: CBinford@pirus.com, ips@ece.cmu.edu
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multipleconn
	 ections.
Date: Mon, 2 Apr 2001 15:12:35 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

It still appears to me that the behavior that
Charles wants is available by sending the task management
function twice - once for immediate delivery and once for
ordered delivery.  Duplication on all connections would
not be necessary in the normal case because the ordered
instance would naturally come behind the unordered ones.
The timer seems to be something that ought to be taken
up under error recovery in general (and in particular,
we should consider letting TCP time out the connection
rather than adding yet another recovery timer).  IMHO,
the bottom line is that the logic to duplicate the task
management commands at the Initiator and coordinate
them at the Target costs implementation complexity,
and I have not seen a convincing statement of what
we're buying with that added complexity.

It may be the case that applications don't want to be
bothered with sending task management commands twice,
in which case we have a coordination problem of some
form that has to be addressed in iSCSI.

For further discussion,
--David

> -----Original Message-----
> From:	Binford, Charles [SMTP:CBinford@pirus.com]
> Sent:	Monday, April 02, 2001 9:20 AM
> To:	iSCSI (E-mail)
> Subject:	FW: iSCSI: Out Of Sequence due to null sequence with
> multipleconn ections.
> 
> I sent this Friday, but never saw it come back, so I'm resending.....
> 
> Charles Binford
> Pirus Networks
> 316.315.0382 x222
> 
> 
> Doug, If I'm hearing you correctly, you are saying process the task
> management command immediately, but toss any commands waiting for a CmdSN
> hole to be filled.   The problem with this approach is that there are many
> different flavors of task management commands and not all affect all
> commands.  The iSCSI layer would have to duplicate the SCSI layer effort of
> parsing the task management request and the LU/tags of all commands it
> contemplates aborting.  I don't like that solution.
> 
> David,  I don't think the current state of things is acceptable as you
> imply with your final statement of 'I'm not sure anything needs to be
> changed.'
> 
> FC solves this problem by requiring the initiator FCP layer (note: FCP,
> not SCSI) to send an ABTS for every exchange 'in an ambiguous state' after
> any task management function.  It is ugly, but it covers the hole of the
> initiator not knowing for sure which commands were properly aborted and
> which were not.
> 
> I have suggested in the past (although, from the response I've gotten, I
> think I have been misunderstood) that the iSCSI initiator send all task
> management functions down all connections of a session.  I'm NOT saying
> duplicate the task management function.  The distinction being only one
> instance of the task management function should be passed up to the SCSI
> layer on the target side.  
> 
> To address the issue of the task management function being delayed because
> of a slow connection I suggest the following:  upon receipt of the first
> task management function request on any connection, the iSCSI target shall
> start a relatively short timer (I'm thinking of a range from a few 100 ms
> to a couple of seconds).  As soon as the task management function is
> received on all connections, cancel the timer and pass a single instance of
> the request up to the SCSI layer.  If the timer expires, close the
> connection(s) the task management function(s) did not arrive on and pass up
> a single instance of the request to the SCSI layer.
> 
> Charles Binford
> Pirus Networks
> 316.315.0382 x222
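
The timer scheme proposed above can be sketched as a small piece of
target-side state, purely to make the steps concrete.  Everything named
here (TaskMgmtCoordinator, on_receive, on_timer, the 1 second default)
is hypothetical and chosen for the example; time is passed in
explicitly rather than read from a clock.

```python
class TaskMgmtCoordinator:
    """Tracks one task management request sent down every connection."""

    def __init__(self, connections, timeout_s=1.0):
        self.pending = set(connections)  # connections not yet heard from
        self.timeout_s = timeout_s       # assumed "relatively short" timer
        self.deadline = None             # set on first arrival
        self.delivered = False           # single instance passed up to SCSI?

    def on_receive(self, conn, now):
        """Record an arrival; return True when the request goes up to SCSI."""
        if self.delivered:
            return False
        self.pending.discard(conn)
        if self.deadline is None:
            self.deadline = now + self.timeout_s  # start timer on first copy
        if not self.pending:
            self.deadline = None                  # all copies in: cancel timer
            self.delivered = True
            return True
        return False

    def on_timer(self, now):
        """Return the connections to close if the timer has expired.

        A non-empty result also means the single instance is passed up now.
        """
        if self.delivered or self.deadline is None or now < self.deadline:
            return set()
        to_close = set(self.pending)
        self.pending.clear()
        self.deadline = None
        self.delivered = True
        return to_close
```

Even in this reduced form the target has to keep per-request state
across all connections of the session, which is the implementation
cost under discussion.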
> 
> 
> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> Douglas Otis
> Sent: Tuesday, March 27, 2001 9:42 AM
> To: julian_satran@il.ibm.com; ips@ece.cmu.edu
> Subject: RE: iSCSI: Out Of Sequence due to null sequence with
> multipleconnections.
> 
> 
> Julian,
> 
> It could be that a "stuck" command in flight or at the sequencer is the
> reason for the task management which is made problematic by the null
> sequence.  Even if implementers are aware of this problem, there is not a
> good solution until null sequence is removed.  Making a flag to indicate
> "immediate" or perhaps "reject prior pending iSCSI commands" is a means of
> ensuring the initiator remains in control with respect to all situations.
> This ensures the initiator is aware of the state of the target and
> sequence of commands can be maintained if required.  Be careful about being
> too disk centric.  Treating these SN numbers as unsigned allows a simple
> means of tracking.  Here is an example explanation:
> 
>    Comparisons and arithmetic on SNs in this document SHOULD use Serial
>    Number Arithmetic as defined in [RFC1982] where SERIAL_BITS = 32.
> 
> Doug
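
A minimal sketch of the RFC 1982 comparison rule quoted above, with
SERIAL_BITS = 32, may help show how the wrap-around case falls out.
The function names sn_add, sn_lt and sn_gt are illustrative only.  Note
that two values exactly 2^31 apart compare as neither less nor greater,
which RFC 1982 leaves undefined.

```python
SERIAL_BITS = 32
MOD = 1 << SERIAL_BITS          # sequence numbers live in [0, 2^32)
HALF = 1 << (SERIAL_BITS - 1)   # 2^31, the comparison window


def sn_add(s: int, n: int) -> int:
    """Serial number addition; n must be in [0, 2^31 - 1] per RFC 1982."""
    if not 0 <= n < HALF:
        raise ValueError("addend out of range for serial arithmetic")
    return (s + n) % MOD


def sn_lt(s1: int, s2: int) -> bool:
    """True if s1 precedes s2 in serial number order."""
    return (s1 < s2 and s2 - s1 < HALF) or (s1 > s2 and s1 - s2 > HALF)


def sn_gt(s1: int, s2: int) -> bool:
    """True if s1 follows s2 in serial number order."""
    return (s1 < s2 and s2 - s1 > HALF) or (s1 > s2 and s1 - s2 < HALF)
```

With this rule 0xFFFFFFFF is "less than" 0, so a command numbered just
after the wrap still sorts behind the one numbered just before it.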
> 
> > David,
> >
> > Your summary is correct. And (except for a minor point) it is all a
> > matter of target implementation.
> > SCSI is not a "completely layered" stack and others have gone so far as
> > to do task cancellation by an action at link layer (parallel and fiber).
> > There might be 2 funny side-effects though if an implementer chooses to
> > cancel "holes" (commands in flight on other connections):
> >
> > 1)the cancelled command is another task management command (there can
> > be only one active but what if the one active gets stuck?)
> > 2)(academic I admit) the cancelled command arrives after a wrap around
> > in command sequencing; this is a bit harder (although not impossible) to
> > fix in the implementation
> >
> > Implementers should be aware of those side effects.
> >
> > Regards,
> > Julo
> >
> >
> >
> > Black_David@emc.com on 26/03/2001 18:57:48
> >
> > Please respond to Black_David@emc.com
> >
> > To:   dotis@sanlight.net
> > cc:   ips@ece.cmu.edu
> > Subject:  RE: iSCSI: Out Of Sequence due to null sequence with
> > multipleconn
> >       ections.
> >
> >
> [stuff deleted]
> >
> > (4) is hard.  One SCSI task management command generates one
> > response.  That response can either be generated immediately
> > (command arrives, is passed to SCSI, SCSI does its thing) or
> > at the right point in the sequence (command arrives, is
> > sequenced by iSCSI, passed to SCSI at the right point in the
> > sequence, and SCSI does its thing), but NOT both.  As things
> > currently stand, having a task management command apply to
> > in-flight commands requires sending the task management
> > command for ordered delivery - so if it's desired to have
> > the task management command take immediate effect and also
> > catch everything in flight, it's going to have to be sent
> > twice.  I'm not enthusiastic about the idea of the task
> > management command taking immediate effect but delaying the
> > response until everything in flight that might be affected
> > arrives, as I suspect the Initiator would like to know what
> > happened sooner rather than later.
> >
> > (3) is "interesting".  The results of applying a SCSI task
> > management command to a SCSI operation are known only to
> > SCSI, and hence asking that a command stuck in the iSCSI
> > sequencer be affected immediately by a task management
> > command is asking that the task management command have
> > the side effect of changing some of the commands it affects
> > to immediate delivery so that it can immediately do its
> > (SCSI) thing to them.  I wouldn't want to mandate this,
> > nor would I want to prohibit it, BUT ... if the above
> > discussion of in-flight commands is correct, I would
> > observe that the application on the Initiator side
> > can't tell the difference between commands that are in-flight
> > vs. waiting for something in-flight on another connection,
> > and hence is going to have to issue the task management
> > command for ordered delivery if it wants to affect operations
> > in either place (and issue a second copy if it wants
> > immediate action).
> >
> > The upshot is that, aside from a longer discussion of this
> > issue, I'm not sure anything needs to be changed.  Comments?
> >
> > Thanks,
> > --David
> >
> > ---------------------------------------------------
> > David L. Black, Senior Technologist
> > EMC Corporation, 42 South St., Hopkinton, MA  01748
> > +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> > black_david@emc.com       Mobile: +1 (978) 394-7754
> > ---------------------------------------------------
> >
> >
> >
> >
> >


From owner-ips@ece.cmu.edu  Mon Apr  2 18:09:56 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA02689
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 18:09:55 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32HZTS06995
	for ips-outgoing; Mon, 2 Apr 2001 13:35:29 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32HWlr06820
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 13:32:47 -0400 (EDT)
Received: from core.rose.hp.com (core.rose.hp.com [15.43.208.100])
	by palrel3.hp.com (Postfix) with ESMTP id D93021E2
	for <ips@ece.cmu.edu>; Mon,  2 Apr 2001 10:32:46 -0700 (PDT)
Received: (from cbm@localhost) by core.rose.hp.com (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id KAA01362 for ips@ece.cmu.edu; Mon, 2 Apr 2001 10:33:47 -0700 (PDT)
Message-Id: <200104021733.KAA01362@core.rose.hp.com>
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
To: ips@ece.cmu.edu
Date: Mon, 02 Apr 2001 10:33:46 PDT
Reply-To: cbm@rose.hp.com
From: "Mallikarjun C." <cbm@rose.hp.com>
X-Mailer: Elm [revision: 212.4]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

>Sorry to have been missing for a while. Hope you will
>appreciate my being back in action :-). It was a fairly
>clear consensus in Orlando that applications broke up
>their transfers into reasonably small chunks i.e. they
>did not have very long running transfers.
>
>Therefore the consensus was that a command level recovery
>mechanism was sufficient instead of an ack/sack for each
>data PDU.
>
>The SACK mechanism was a post Orlando invention. Without
>an ack mechanism (for every data PDU), the SACK mechanism
>just imposes additional burden on either end of the session,
>without really much benefit.

To be fair to data SACK, one could think of an upper bound 
on the unack'ed data - agreed on at the login time.  While not
requiring acks on every PDU, it gives targets the deterministic
maximum on the buffer size they have to keep around if they 
choose to "reliably" support data SACK.  The current answer of 
"replay buffer size/IO size", IMHO, is simply not attractive. 
Also to be fair to data SACK, I believe FCP-2 allows sequence-level 
error recovery in an I/O.  
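
To make the bounded-buffer idea above a little more concrete, here is a
rough sketch of what a target might keep around.  The bound would come
from login negotiation; max_unacked_bytes and the class and method
names are hypothetical, not anything in the draft, and sequence number
wrap-around is ignored for brevity.

```python
from collections import OrderedDict


class BoundedReplayBuffer:
    """Sent-but-unacknowledged data PDUs, capped at a negotiated byte count."""

    def __init__(self, max_unacked_bytes: int):
        self.max_unacked_bytes = max_unacked_bytes  # hypothetical login value
        self.buffered_bytes = 0
        self._pdus = OrderedDict()  # data sequence number -> payload, send order

    def can_send(self, payload: bytes) -> bool:
        """True if sending payload keeps unacked data within the bound."""
        return self.buffered_bytes + len(payload) <= self.max_unacked_bytes

    def record_sent(self, data_sn: int, payload: bytes) -> None:
        """Retain a PDU for possible replay; refuse if the bound is reached."""
        if not self.can_send(payload):
            raise BufferError("unacked-data bound reached; wait for an ack")
        self._pdus[data_sn] = payload
        self.buffered_bytes += len(payload)

    def ack_through(self, data_sn: int) -> None:
        """Free every retained PDU numbered at or below data_sn."""
        while self._pdus:
            oldest = next(iter(self._pdus))
            if oldest > data_sn:
                break
            self.buffered_bytes -= len(self._pdus.pop(oldest))

    def replay(self, data_sn: int):
        """Return the payload for a SACK'ed PDU, or None if already freed."""
        return self._pdus.get(data_sn)
```

The point of the sketch is only that the worst-case memory is
max_unacked_bytes per task rather than the full I/O size.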

However, I think that it's extremely useful to include a discussion
in the draft of the TCP checksum "escape" statistics and the
device types for which this was considered an absolute requirement
to make forward progress at these error rates (like huge tape
backups?) - essentially the reasons that convinced Julian to define
this mechanism in. That gives credibility and acceptance to this,
or alternately may lead to the consensus that data SACK is not required.
--
Mallikarjun 


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668	Hewlett-Packard, Roseville.
cbm@rose.hp.com

>
>The benefit of having SACK is of saving bandwidth in case
>the data part of the data PDU failed an integrity check
>(but passed TCP checksum). This is a rare enough case that
>as a percentage, the bandwidth loss from retransmitting
>all the data associated with a read or write command is
>very very small.
>
>In addition, it avoids the complexity of restarting
>something from the middle, as compared to from the beginning.
>
>To me it seems that there is significant simplicity (from
>implementation, reliability and recovery process) from
>having smaller data transfer per command.
>
>I would really like to get rid of the SACK command.
>
>Somesh
>
>> -----Original Message-----
>> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
>> julian_satran@il.ibm.com
>> Sent: Wednesday, March 28, 2001 6:57 AM
>> To: ips@ece.cmu.edu
>> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>
>>
>>
>>
>> Mallikarjun,
>>
>> Last summer I thought that recovery within a connection should be left to
>> TCP. It is simple and could be made available through IPsec (if no new
>> option of any form can be added).
>>
>> Two things killed this:
>>
>>    The requirement to have a data encapsulation that can pass through
>>    application proxies (like a storage router)
>>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
>>    header
>>
>>
>> As for the ACK - I am very much in favor of it (it is a no brainer) and
>> implementations are in fact allowed to drop even unacked data.
>>
>> I am bound by the Orlando meeting decision to drop it. Except for the
>> regular "oppose everything" crowd the two vocal opponents were Somesh
>> Gupta and Matt Wakeley.
>>
>> David may want or not to re-open the issue - I am not going to ask for it.
>>
>> Regards,
>> Julo
>>
>> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
>>
>> Please respond to cbm@rose.hp.com
>>
>> To:   Black_David@emc.com
>> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com, someshg@yahoo.com,
>>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
>>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
>> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>
>>
>>
>>
>> David and Julian,
>>
>> I appreciate both your views, and should I say that they're
>> along predicted lines :-)
>>
>> - David's right in saying that the situation is akin to FC's.
>>   However, I would like to point out that FC is an unreliable
>>   transport, and hence is forced to pick up a lot of the transport
>>   baggage (at least in FCP-2, as I understand), in addition
>>   to being a SCSI encapsulation layer.  Unfortunately, even with
>>   TCP being the "reliable" transport, iSCSI is going along the
>>   same lines - ie. transport baggage + SCSI encapsulation.  My
>>   point is - if this is indeed a necessary evil, why don't we
>>   complete iSCSI's transport functionality by data-ACKs?
>>
>> - If data SACK is introduced mostly to make up for TCP's shortcomings,
>>   we're making its usage (and implementation) drastically less appealing
>>   since the only way error recovery algorithms can *rely* on data SACK
>>   is when replay is supported (or, "ReplaySupport=yes"  in my proposal),
>>   which is extremely expensive.  IOW, we're defining data SACK in the
>>   draft and not providing any incentives to implement and use it!
>>
>> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
>>   protocol in its definition so far (and I believe, rightly so - mandating
>>   command ordering, bi-di support, SCSI CRN support to name a few
>>   examples), the perfectly SCSI-legal R/W interactions that break in
>>   other transports *do not* have to break in iSCSI.
>>
>> - A last idea (may seem radical at this point) in regards to iSCSI
>>   being a "full transport". This provides us an opportunity to "cast
>>   off" the transport baggage in future when we truly move to a "reliable"
>>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
>>   keeping the encapsulation stuff separate from the transport stuff.
>>   (Julian, I heard from Randy that ideas similar to this were explored
>>   in your Haifa meeting.  And yes, he recalls they were given up since
>>   TCP was supposed to be reliable and granularity of recovery was deemed
>>   one I/O.)
>>
>> With that said, may I request David (with his co-chair hat on, :-))
>> to add some binding comments/observations on this discussion?
>>
>> If we decide to leave data SACKs as unattractive to implement, the draft
>> should in the least add a statement like - "Note that satisfying all
>> possible data SACK requests for a task with an unacknowledged status
>> implies implementing the I/O replay buffer on the part of targets."
>> --
>> Mallikarjun
>>
>>
>> Mallikarjun Chadalapaka
>> Networked Storage Architecture
>> Network Storage Solutions Organization
>> MS 5668   Hewlett-Packard, Roseville.
>> cbm@rose.hp.com
>>
>>
>>
>>
>> >I think Julian's basically right -- I would point
>> >out that any case of write after read that breaks
>> >over iSCSI will also break over Fibre Channel.
>> >On FC, the scenario starts with a frame CRC failure
>> >on read data at the Initiator, so applications
>> >have to cope and typically do so by enforcing
>> >ordering at the app rather than using SCSI task
>> >ordering.
>> >
>> >While SCSI has clever tools like ACA and task
>> >ordering that appear to allow dependent operations
>> >to be sent to the target concurrently, in practice
>> >they don't work and/or aren't used (funny thing,
>> >those two reinforce each other ;-) ).  Hence
>> >a minimal approach to them is in order:
>> >- Make sure the result will interoperate.
>> >- Make sure T10 doesn't ding us for leaving something
>> >    completely out.
>> >- Don't specify anything not needed for the above.
>> >
>> >My 0.02,
>> >--David
>> >
>> >> -----Original Message-----
>> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
>> >> Sent:  Tuesday, March 27, 2001 9:23 AM
>> >> To:    cbm@rose.hp.com
>> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
>> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
>> >> Black_David@emc.com
>> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>
>> >>
>> >>
>> >> Mallikarjun,
>> >>
>> >> I commiserate with you at the lack of ack for data but the Orlando
>> >> meeting stated - no.  Recall that I kept the number only as a
>> >> mechanism to detect missing packets.
>> >>
>> >> You can achieve the effect you want by keeping around data for a while
>> >> (you determine how long and then discard).
>> >>
>> >> If a SACK comes and you can recover - fine. If not you either reaccess
>> >> the media (if you know how) or reject and let the initiator retry.
>> >>
>> >> You should not worry about R/W conflicts as programs bound to have such
>> >> conflicts either:
>> >>
>> >> 1)can live with them or
>> >> 2)protect themselves through some locks and rely on
>> >> "operation-end-status" to keep results deterministic.
>> >>
>> >> Regards,
>> >> Julo
>> >>
>> >>
>> >>
>> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
>> >>
>> >> Please respond to cbm@rose.hp.com
>> >>
>> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu, Julian
>> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
>> >> cc:   Black_David@emc.com
>> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>
>> >>
>> >>
>> >>
>> >> Hi Error Recovery Team,
>> >>
>> >> iSCSI can discard PDUs because of digest errors and request
>> >> retransmissions using the iSCSI data SACK.  To deal with such
>> >> an eventuality, targets that want to support data SACK have
>> >> the following options:
>> >>
>> >> (A) maintain a complete "replay" buffer for the entire I/O since
>> >>   a SACK could come anytime before the status is ack'ed by the
>> >>   initiator. [ simple, but extremely expensive in memory resources]
>> >>
>> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
>> >>   This enables keeping only those I/O buffers that haven't been ack'ed
>> >>   by the initiator. IOW, become a real full transport! [ everyone
>> >>   disliked it earlier...]
>> >>
>> >> (C) re-access the medium for data retransmission requests.  Now there
>> >>   are 3 sub-cases in this to handle the changed data on the medium in a
>> >>   write-after-read scenario.  (SEE NOTE.1 at the bottom on how it is
>> >>   legal.)
>> >>      (1) On seeing any write, stall till status is ack'ed for all the
>> >>          previous reads (basically drain the pipe). [simple, but
>> >>          incurs an additional roundtrip delay for all writes].
>> >>      (2) A variation of the above, keep an eye only on the prior
>> >>          overlapping reads. [more BW efficient, but complicated to
>> >>          resolve the block dependencies in a stream of reads
>> >>          followed by writes]
>> >>      (3) Document the caveat and leave it up to the applications
>> >>          to avoid this case since this leads to data integrity
>> >>          issues. [pushing to apps since the transport can't get it
>> >>          right!]
>> >>
>> >> My first preference is (B), followed by (A), and I suggest we not go
>> >> to (C) at all with its inherent dangers.
>> >>
>> >> Doing (B) naturally completes the transport job that iSCSI has taken
>> >> on itself in view of TCP's claimed unreliable checksum.  That is the
>> >> right thing to do architecturally instead of being a "semi-transport"!
>> >>
>> >> Comments?
>> >> --
>> >> Mallikarjun
>> >>
>> >>
>> >> Mallikarjun Chadalapaka
>> >> Networked Storage Architecture
>> >> Network Storage Solutions Organization
>> >> MS 5668   Hewlett-Packard, Roseville.
>> >> cbm@rose.hp.com
>> >>
>> >>
>> >> ________________________________________________________________________
>> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly
>> >>         legal if SCSI sets the ORDERED task attribute on both the
>> >>         commands AND sets the NACA bit to one to indicate that Write
>> >>         shall be executed only if the Read did not fail (result in a
>> >>         Check Condition).
>> >>
>> >>         In the current case, since Read completed just fine from SCSI's
>> >>         point of view, SCSI is moving on to execute Write.  Those read
>> >>         buffers had been freed up since iSCSI received an ACK at the
>> >>         TCP level, and since iSCSI has no other way to have the data
>> >>         ack'ed!
>> >>
>> >>
>> >>
>> >>
>> >
>>
>>
>>
>>
>
>
>
>




From owner-ips@ece.cmu.edu  Mon Apr  2 18:15:55 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA02773
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 18:15:54 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32FJuk26984
	for ips-outgoing; Mon, 2 Apr 2001 11:19:56 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from apollo.pirus.com ([63.91.118.2])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32DKnr18458
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 09:20:50 -0400 (EDT)
Message-ID: <E132D13F58DAAB45AE5D01CA75BD3D56B4F5@OZ>
From: "Binford, Charles" <CBinford@Pirus.com>
To: "iSCSI (E-mail)" <ips@ece.cmu.edu>
Subject: FW: iSCSI: Out Of Sequence due to null sequence with multipleconn
	ections.
Date: Mon, 2 Apr 2001 09:20:15 -0400 
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

I sent this Friday, but never saw it come back, so I'm resending.....

Charles Binford
Pirus Networks
316.315.0382 x222


Doug, if I'm hearing you correctly, you are saying: process the task management
command immediately, but toss any commands waiting for a CmdSN hole to be
filled.  The problem with this approach is that there are many different
flavors of task management commands, and not all affect all commands.  The
iSCSI layer would have to duplicate the SCSI layer's effort of parsing the
task management request and the LU/tags of all commands it contemplates
aborting.  I don't like that solution.

David,  I don't think the current state of things is acceptable as you imply
with your final statement of 'I'm not sure anything needs to be changed.'

FC solves this problem by requiring the initiator FCP layer (note: FCP, not
SCSI) to send an ABTS for every exchange 'in an ambiguous state' after any
task management function.  It is ugly, but it covers the hole of the
initiator not knowing for sure which commands were properly aborted and
which were not.

I have suggested in the past (although, from the response I've gotten, I
think I have been misunderstood) that the iSCSI initiator send all task
management functions down all connections of a session.  I'm NOT saying
duplicate the task management function.  The distinction being only one
instance of the task management function should be passed up to the SCSI
layer on the target side.  

To address the issue of the task management function being delayed because
of a slow connection, I suggest the following:  upon receipt of the first
task management function request on any connection, the iSCSI target shall
start a relatively short timer (I'm thinking of a range from a few hundred
ms to a couple of seconds).  As soon as the task management function is
received on all connections, cancel the timer and pass a single instance of
the request up to the SCSI layer.  If the timer expires, close the
connection(s) the task management function(s) did not arrive on and pass up
a single instance of the request to the SCSI layer.
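
A minimal sketch of the collect-then-deliver rule described above, in
Python.  The class name, method names, and the explicit "now" argument are
all hypothetical (nothing here comes from the iSCSI draft); the caller is
assumed to supply a clock and to invoke on_timer() when the timer fires.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TmfCollector:
    """Tracks copies of one task management request sent on every connection."""
    connections: frozenset          # connection IDs that make up the session
    timeout: float                  # seconds; the proposal suggests ~0.1 s to ~2 s
    deadline: Optional[float] = None
    seen: set = field(default_factory=set)
    delivered: bool = False

    def on_request(self, conn_id, now):
        """Record a copy arriving on conn_id.

        Returns True exactly once, when the copy has been seen on all
        connections and the single instance should go up to the SCSI layer.
        """
        if self.delivered or conn_id not in self.connections:
            return False
        if self.deadline is None:
            self.deadline = now + self.timeout   # first copy starts the timer
        self.seen.add(conn_id)
        if self.seen == set(self.connections):
            self.delivered = True                # all copies in: timer is moot
            return True
        return False

    def on_timer(self, now):
        """Handle timer expiry.

        Returns the set of connections to close (those the request never
        arrived on) when the request should now be delivered, else None.
        """
        if self.delivered or self.deadline is None or now < self.deadline:
            return None
        self.delivered = True
        return set(self.connections) - self.seen
```

With three connections and a 0.5 s timer, copies arriving on connections 1
and 2 followed by expiry would close connection 3 and deliver the request
once; a late copy on connection 3 is then ignored.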

Charles Binford
Pirus Networks
316.315.0382 x222


-----Original Message-----
From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
Douglas Otis
Sent: Tuesday, March 27, 2001 9:42 AM
To: julian_satran@il.ibm.com; ips@ece.cmu.edu
Subject: RE: iSCSI: Out Of Sequence due to null sequence with
multipleconnections.


Julian,

It could be that a "stuck" command in flight or at the sequencer is the
reason for the task management which is made problematic by the null
sequence.  Even if implementers are aware of this problem, there is not a
good solution until null sequence is removed.  Making a flag to indicate
"immediate" or perhaps "reject prior pending iSCSI commands" is a means of
ensuring the initiator remains in control with respect to all situations.
This ensures the initiator is aware of the state of the target and sequence
of commands can be maintained if required.  Be careful about being too disk
centric.  Treating these SN numbers as unsigned allows a simple means of
tracking.  Here is an example explanation:

   Comparisons and arithmetic on SNs in this document SHOULD use Serial
   Number Arithmetic as defined in [RFC1982] where SERIAL_BITS = 32.
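
As an illustration of the comparison rule in [RFC1982] with
SERIAL_BITS = 32, here is a small Python sketch.  The function names
sn_add and sn_lt are made up for this example; only the arithmetic follows
the RFC.

```python
SERIAL_BITS = 32
MOD = 1 << SERIAL_BITS            # serial numbers live in [0, 2^32)
HALF = 1 << (SERIAL_BITS - 1)     # 2^31, the comparison window


def sn_add(s, n):
    """Add n to serial number s; RFC 1982 only defines 0 <= n < 2^31."""
    if not 0 <= n < HALF:
        raise ValueError("increment outside the range RFC 1982 defines")
    return (s + n) % MOD


def sn_lt(a, b):
    """True if a precedes b in serial number arithmetic.

    When the two values are exactly 2^31 apart the RFC leaves the result
    undefined; this sketch returns False in both directions for that case.
    """
    return a != b and ((b - a) % MOD) < HALF
```

For example, sn_lt(0xFFFFFFFF, 0) is true, so a number just before the
wrap still sorts ahead of one just after it.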

Doug

> David,
>
> Your summary is correct. And (except for a minor point) it is all a matter
> of target implementation.
> SCSI is not a "completely layered" stack and others have gone so far as to
> do task cancellation by an action at link layer (parallel and fiber).
> There might be 2 funny side-effects though if an implementer chooses to
> cancel "holes" (commands in flight on other connections):
>
> 1)the cancelled command is another task management command (there can be
> only one active but what if the one active gets stuck?)
> 2)(academic I admit) the cancelled command arrives after a wrap around in
> command sequencing; this is a bit harder (although not impossible) to fix
> in the implementation
>
> Implementers should be aware of those side effects.
>
> Regards,
> Julo
>
>
>
> Black_David@emc.com on 26/03/2001 18:57:48
>
> Please respond to Black_David@emc.com
>
> To:   dotis@sanlight.net
> cc:   ips@ece.cmu.edu
> Subject:  RE: iSCSI: Out Of Sequence due to null sequence with
> multipleconn
>       ections.
>
>
[stuff deleted]
>
> (4) is hard.  One SCSI task management command generates one
> response.  That response can either be generated immediately
> (command arrives, is passed to SCSI, SCSI does its thing) or
> at the right point in the sequence (command arrives, is
> sequenced by iSCSI, passed to SCSI at the right point in the
> sequence, and SCSI does its thing), but NOT both.  As things
> currently stand, having a task management command apply to
> in-flight commands requires sending the task management
> command for ordered delivery - so if it's desired to have
> the task management command take immediate effect and also
> catch everything in flight, it's going to have to be sent
> twice.  I'm not enthusiastic about the idea of the task
> management command taking immediate effect but delaying the
> response until everything in flight that might be affected
> arrives, as I suspect the Initiator would like to know what
> happened sooner rather than later.
>
> (3) is "interesting".  The results of applying a SCSI task
> management command to a SCSI operation are known only to
> SCSI, and hence asking that a command stuck in the iSCSI
> sequencer be affected immediately by a task management
> command is asking that the task management command have
> the side effect of changing some of the commands it affects
> to immediate delivery so that it can immediately do its
> (SCSI) thing to them.  I wouldn't want to mandate this,
> nor would I want to prohibit it, BUT ... if the above
> discussion of in-flight commands is correct, I would
> observe that the application on the Initiator side
> can't tell the difference between commands that are in-flight
> vs. waiting for something in-flight on another connection,
> and hence is going to have to issue the task management
> command for ordered delivery if it wants to affect operations
> in either place (and issue a second copy if it wants
> immediate action).
>
> The upshot is that, aside from a longer discussion of this
> issue, I'm not sure anything needs to be changed.  Comments?
>
> Thanks,
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
>
>
>
>



From owner-ips@ece.cmu.edu  Mon Apr  2 19:10:07 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA03486
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 19:10:06 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32LnsQ26419
	for ips-outgoing; Mon, 2 Apr 2001 17:49:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from lub1028.lss.emc.com ([168.159.39.28])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32Lmrr26357
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 17:48:53 -0400 (EDT)
Received: from emc.com (IDENT:jhall@localhost.localdomain [127.0.0.1])
	by lub1028.lss.emc.com (8.9.3/8.9.3) with ESMTP id RAA17288
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 17:48:42 -0400
Message-Id: <200104022148.RAA17288@lub1028.lss.emc.com>
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
Date: Mon, 02 Apr 2001 17:48:42 -0400
From: "Jon Hall" <jhall@emc.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


I thought (?) Stone and Partridge concluded somewhere in the
range between one packet in 16 million to one packet in 10
billion.

But what percentage of an iSCSI flow consists of iSCSI PDU
headers?  Is there a credible flow model for deciding the
probability of the one packet containing a StatSN?
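
One deliberately naive way to put numbers on the first question, sketched
in Python.  This is not a credible flow model: it assumes an undetected
error is equally likely to land on any byte of the flow, and the 48-byte
header and 8192-byte data segment are illustrative assumptions rather than
figures from this thread.  The two escape rates are the bounds mentioned
above.

```python
def header_fraction(header_len, data_len):
    """Fraction of PDU bytes that are header bytes."""
    return header_len / (header_len + data_len)


def packets_between_header_escapes(escape_rate, header_len, data_len):
    """Mean packets between undetected errors that land in header bytes,
    under the uniform-per-byte assumption stated above."""
    return 1.0 / (escape_rate * header_fraction(header_len, data_len))


HEADER_LEN = 48     # assumed header size in bytes (illustrative only)
DATA_LEN = 8192     # assumed data bytes per PDU (illustrative only)

for rate in (1 / 16e6, 1 / 10e9):
    frac = header_fraction(HEADER_LEN, DATA_LEN)
    gap = packets_between_header_escapes(rate, HEADER_LEN, DATA_LEN)
    print(f"escape rate {rate:.3g}: header share {frac:.4%}, "
          f"about {gap:.3g} packets between header escapes")
```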

-Jon

Pierre Labat writes:
>"Mallikarjun C." wrote:
>
>> To be fair to data SACK, one could think of an upper bound
>> on the unack'ed data - agreed on at the login time.  While not
>> requiring acks on every PDU, it gives targets the deterministic
>> maximum on the buffer size they have to keep around if they
>> choose to "reliably" support data SACK.  The current answer of
>> "replay buffer size/IO size", IMHO, is simply not attractive.
>> Also to be fair to data SACK, I believe FCP-2 allows sequence-level
>> error recovery in an I/O.
>>
>> However, I think that it's extremely useful to include a discussion
>> in the draft of  the TCP checksum "escape" statistics and the
>> device types for which this was considered an absolute requirement
>> to make forward progress at this error rates (like huge tape
>> backups?) - essentially the reasons that convinced Julian to define
>> this mechanism in. That gives credibility and acceptance to this,
>> or alternately may lead to the consensus that data SACK is not required.
>> --
>> Mallikarjun
>
>About TCP checksum "escape" statistics, what I saw is:
>
>A) Jonathan Stone,Craig Partridge, "When the CRC and TCP Checksum Disagree"
> http://www.acm.org/sigcomm/sigcomm2000/conf/abstract/9-1.htm
>
>1 escape in 200 million
>
>B) V. Paxson "1999 End-to-End Internet Packet Dynamics."
>   http://www.aciri.org/vern/papers.html
>
> 1 escape in 300 million
>
>C) J. Stone et. al "Performance of Checksums and CRC's over Real Data"
>  IEEE/ACM Transactions on Networking, Vol. 6, No. 5, October 1998
>
>http://dev.acm.org/pubs/articles/journals/ton/1998-6-5/p529-stone/p529-stone.pdf
>
>Less than 1 escape in 10e17 segments when taking into
>account the link layer AAL5 CRC. (see page 540 left column on top).
>
>
>
>
>Taking the worst case gives 1 in 200 million.
>
>Regards,
>
>Pierre


From owner-ips@ece.cmu.edu  Mon Apr  2 19:14:40 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA03652
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 19:14:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32LMxS24478
	for ips-outgoing; Mon, 2 Apr 2001 17:22:59 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32LMBr24431
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 17:22:11 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTQ3YZ>; Mon, 2 Apr 2001 14:22:05 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B1733B1@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: ips@ece.cmu.edu
Subject: RE: iSCSI: frame formats 
Date: Mon, 2 Apr 2001 14:21:57 -0700 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

While we're on the subject of padding: another (non-IP) transport has the
rule that padding is only allowed on the last data PDU of a command.  That
is, the initial and intermediate data transfer PDUs must not be padded.

I believe this simplifies implementations doing cut-through placement into
word-aligned buffers. 
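
For concreteness, here is a short Python sketch of the "pad is in the TCP
stream" layout discussed in this thread: the header carries the exact data
length, the sender null-pads the data to a 4-byte boundary, and the digest
follows the pad.  The function names and the 4-byte digest in the usage
note are assumptions made for the example, not taken from the draft.

```python
def pad_len(data_len, align=4):
    """Number of pad octets needed to reach the next align-byte boundary."""
    return (-data_len) % align


def frame_data_segment(data, digest):
    """Sender side: exact data, then null pad octets, then the data digest."""
    return data + b"\x00" * pad_len(len(data)) + digest


def parse_data_segment(stream, data_len, digest_len):
    """Receiver side: data_len is the exact length from the header.

    The receiver pulls data_len + pad octets off the stream, keeps only the
    first data_len of them, and then reads the digest.
    """
    padded = data_len + pad_len(data_len)
    data = stream[:data_len]
    digest = stream[padded:padded + digest_len]
    return data, digest
```

A 6-byte transfer with a hypothetical 4-byte digest occupies 6 + 2 + 4 = 12
octets on the wire, and the receiver recovers the original 6 bytes because
the header says "6", not "8".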

Charles

> -----Original Message-----
> From: Barry Reinhold [mailto:bbrtrebia@mediaone.net]
> Sent: Monday, April 02, 2001 11:59 AM
> To: Rod Harrison; ips@ece.cmu.edu
> Subject: RE: iSCSI: frame formats 
> 
> 
> Rod,
> 	I'm pretty sure that we decided to go with bytes for data length at
> the 50th IETF meeting. No one felt that a max data size of 16 megs in an
> iSCSI PDU was an issue (remember this is not max SCSI data transfer
> size, it is max data bytes in an iSCSI PDU).
> 	I do think we need to word the padding with a bit of care, as I can
> envision people doing this in at least two different ways that would not
> interoperate.
> 	The interoperability issues on the padding question boil down to:
> "Where in the TCP stream do I put the Data Digest if the number of bytes
> I have to send is not a multiple of 4?"
> 	My expectation was that the transmitter would be null padding the
> data portion of the iSCSI PDU to a word boundary, then sticking on the
> digest. Thus the padding is actually in the TCP stream. For example, if
> <d> = a data octet, <p> = pad octet, and <dd> = data digest octet, then
> the TCP stream for a 6 byte data transfer would look as follows:
> <octets in the header - end modulo 4> <d> <d> <d> <d> <d> <d> <p> <p> <dd>
> ....
> 	The receiver would get the value "6" in the data length portion of
> the header. After pulling out the header, the receiver would pull out 8
> bytes of "data + pad" and then get the digest.
> 	The other way to do this is to have a "virtual pad" such that padding
> is created by the receiver when constructing the iSCSI PDU from the TCP
> stream. The padding is never actually in the TCP stream itself.
> 	I do not think this is as helpful, but whatever we do, the spec.
> should address this minor detail so we don't trip over it.
> 
> 
> >-----Original Message-----
> >From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> >Rod Harrison
> >Sent: Monday, April 02, 2001 1:44 PM
> >To: ips@ece.cmu.edu
> >Subject: RE: iSCSI: frame formats
> >
> >
> >Stephen,
> >
> >	I don't agree with your concerns about the maximum PDU
> >size. I think few targets, or indeed initiators, will be
> >interested in negotiating PDU sizes anywhere near this
> >large, so big transfers will have to be fragmented anyway.
> >
> >	However, I share your concern about the padding. I don't
> >really see why we are considering it. If one has a local
> >alignment issue it can, and I believe should, be taken care
> >of locally and not in the specification. There are several
> >easy ways of handling this sort of thing; insert and remove
> >the pad locally; separate header and payload buffers
> >allowing each to be naturally aligned, etc.
> >
> >	If we have to pad then every read of non-header data will
> >have to involve a rounding calculation on the length, and
> >then perhaps a second read to discard the pad if the
> >underlying buffer is the exact size of the data. Possibly
> >the same on send, if the data buffer is the exact size the
> >transmit code can't just 'go off the end,' it will have to
> >send the data, and then fake up some pad and make another
> >send.
> >
> >	Am I missing something here, why do we care about padding?
> >
> >	- Rod
> >
> >-----Original Message-----
> >From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> >Stephen Bailey
> >Sent: Monday, April 02, 2001 3:18 PM
> >To: ips@ece.cmu.edu
> >Subject: Re: iSCSI: frame formats
> >
> >
> >Sandeep,
> >
> >> DataLen will now be max 8M/4M but then we dont wish to have large
> >> iSCSI PDUs in any case.
> >
> >This max size is getting to the point where I'm sure it'll be an
> >irritant.
> >
> >I would like (but, in fact, a sure way to guarantee that it won't
> >happen is for me to like it :^) to view iSCSI PDUs as the expected
> >grain size at which you might have software involvement in an
> >otherwise hardware-driven iSCSI implementation.
> >
> >For example, when a target is returning read data for multiple
> >outstanding reads, it might want to return a bit at a time from each,
> >and each `bit' should be an iSCSI PDU.  Clearly sending these bits
> >will be a software-level decision.  That certainly was the rationale
> >for allowing multiple FCP DATA PDUs per read operation, and I naively
> >assumed similar logic was being applied here.
> >
> >The alternative is to say that the hardware will do iSCSI PDU
> >chunking, but if that's the case, I expect that the header is, well, a
> >bit bulky.
> >
> >I'm also incredibly unexcited about having data length be a multiple
> >of 4 bytes (if that's still in the cards).  There are operations within
> >the SCSI command set which return arbitrary length data.  There are
> >perfectly nominal cases where you get less data than you requested
> >(e.g. inquiry & request sense).  Furthermore, the SCSI architecture
> >does not prohibit this, even though certain commands do, so it is not
> >for iSCSI to say anything about this one way or the other.
> >
> >The problem with handling lengths that include padding comes when
> >you're trying to move the data into a buffer which is a non-multiple
> >length.  For example, if I ask for 22 bytes of inquiry data (with a 22
> >byte buffer), what can I say, at the time a PDU arrives that has 24
> >bytes?  It might have 21 or 22 bytes (or perhaps even, erroneously 23
> >or 24 bytes).  A data residual coming later will tell the software how
> >much was actually there, but it can't tell the hardware.  The typical
> >expectation of this type of transfer is that it will only overwrite
> >bytes of the buffer that are actually transferred, but having padded
> >lengths will not allow this.
> >
> >The completely standard solution to carrying arbitrary data of
> >arbitrary length in an aligned transfer unit is to pad the transfer
> >unit but report the exact (shorter) length.  Another solution, used by
> >FC, is to carry a pad length.  In iSCSI, why bother---you've just
> >reintroduced the 2 bits you were trying to remove?
> >
> >The data length scenario is not comparable to IP header lengths, where
> >what is being carried is not arbitrary data.
> >
> >Certainly, for iSCSI additional header segments (AHSs) you could
> >arguably use this cell length technique, since we can control what
> >we're carrying (AHSs that need exact byte lengths will have to be
> >internally self-describing) but frankly, I still think it's a bad
> >idea.
> >
> >I can't understand why we're messing around with all these tricky
> >`solutions' to standard problems.  We should avoid the temptation to
> >get cute, and wholesomely provide the same capabilities as any other
> >SCSI transport.  Specifically:
> >  o allow long PDUs
> >  o carry exact data lengths
> >
> >Steph
> >
> 


From owner-ips@ece.cmu.edu  Mon Apr  2 19:14:48 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA03663
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 19:14:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32KU7k20504
	for ips-outgoing; Mon, 2 Apr 2001 16:30:07 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from smtp016.mail.yahoo.com (smtp016.mail.yahoo.com [216.136.174.113])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f32KSqr20400
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 16:28:53 -0400 (EDT)
Received: from sdsl-216-36-75-164.dsl.sjc.megapath.net (HELO somesh) (216.36.75.164)
  by smtp.mail.vip.sc5.yahoo.com with SMTP; 2 Apr 2001 20:28:46 -0000
X-Apparently-From: <someshg@yahoo.com>
Reply-To: <someshg@yahoo.com>
From: "Somesh Gupta" <someshg@yahoo.com>
To: <cbm@rose.hp.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Date: Mon, 2 Apr 2001 13:23:25 -0700
Message-ID: <NMEALCLOIBCHBDHLCMIJOEKECCAA.someshg@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Importance: Normal
In-Reply-To: <200104021733.KAA01362@core.rose.hp.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

To beat a dead horse ..

One has to really decide fundamentally whether

1. Commands are used to transfer very large amounts of
   data (multiple data PDUs are needed)
2. Commands are used to transfer relatively small amounts
   of data (few/about one data PDU) and multiple commands
   are then used to do long transfers

(Orlando consensus was #2)

If we assume the first model, then we really should have
a sequence # and acknowledgement of every PDU - not just
data PDUs. In this case, it is important to fill holes
in the iSCSI stream. We can have a "super-transport" as
Mallikarjun suggested between the iSCSI protocol layer
and the TCP layer that provides the various "transport"
like features we seem to want.

If we assume the second model, we assume that recovery at
the command level is sufficient. In this case it is important
to have whatever mechanisms are needed (including data seq #s)
to detect that a command will not succeed without recovery
at the command level. However, recovery is needed only
at the command level.
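
A rough back-of-the-envelope check of what command-level recovery costs in
bandwidth, sketched in Python.  It assumes independent per-segment
failures, that a failed command is resent in full until it succeeds, the
"1 in 200 million" escape figure quoted elsewhere on the list, and a
1460-byte TCP segment; the function names and command sizes are made up
for illustration.

```python
import math


def command_failure_probability(p_escape, segments):
    """Probability that at least one of a command's segments is hit."""
    return 1.0 - (1.0 - p_escape) ** segments


def retransmit_overhead(p_escape, segments):
    """Expected extra data sent, as a fraction of the useful data, when a
    failed command is retried in full until it succeeds."""
    q = command_failure_probability(p_escape, segments)
    return q / (1.0 - q)


P_ESCAPE = 1 / 200e6     # per-segment escape rate quoted on the list
SEGMENT_BYTES = 1460     # assumed TCP payload per segment

for command_bytes in (64 * 1024, 1024 * 1024, 64 * 1024 * 1024):
    segments = math.ceil(command_bytes / SEGMENT_BYTES)
    overhead = retransmit_overhead(P_ESCAPE, segments)
    print(f"{command_bytes:>9} bytes/command: overhead {overhead:.3g}")
```

Under these assumptions the overhead grows roughly linearly with the
amount of data per command, which is the trade-off between the two models
above.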

I would let the current application model decide the features
in "version 1" of the iSCSI protocol.

Somesh

> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> Mallikarjun C.
> Sent: Monday, April 02, 2001 10:34 AM
> To: ips@ece.cmu.edu
> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
> >Sorry to have been missing for a while. Hope you will
> >appreciate my being back in action :-). It was a fairly
> >clear consensus in Orlando that applications broke up
> >their transfers into reasonably small chunks i.e. they
> >did not have very long running transfers.
> >
> >Therefore the consensus was that a command level recovery
> >mechanism was sufficient instead of an ack/sack for each
> >data PDU.
> >
> >The SACK mechanism was a post Orlando invention. Without
> >an ack mechanism (for every data PDU), the SACK mechanism
> >just imposes additional burden on either end of the session,
> >without really much benefit.
>
> To be fair to data SACK, one could think of an upper bound
> on the unack'ed data - agreed on at the login time.  While not
> requiring acks on every PDU, it gives targets the deterministic
> maximum on the buffer size they have to keep around if they
> choose to "reliably" support data SACK.  The current answer of
> "replay buffer size/IO size", IMHO, is simply not attractive.
> Also to be fair to data SACK, I believe FCP-2 allows sequence-level
> error recovery in an I/O.
>
> However, I think that it's extremely useful to include a discussion
> in the draft of  the TCP checksum "escape" statistics and the
> device types for which this was considered an absolute requirement
> to make forward progress at these error rates (like huge tape
> backups?) - essentially the reasons that convinced Julian to define
> this mechanism in. That gives credibility and acceptance to this,
> or alternately may lead to the consensus that data SACK is not required.
> --
> Mallikarjun
>
>
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668	Hewlett-Packard, Roseville.
> cbm@rose.hp.com
>
> >
> >The benefit of having SACK is of saving bandwidth in case
> >the data part of the data PDU failed an integrity check
> >(but passed TCP checksum). This is a rare enough case that
> >as a percentage, the bandwidth loss from retransmitting
> >all the data associated with a read or write command is
> >very very small.
> >
> >In addition, it avoids the complexity of restarting
> >something from the middle, as compared to from the beginning.
> >
> >To me it seems that there is significant simplicity (from
> >implementation, reliability and recovery process) from
> >having smaller data transfer per command.
> >
> >I would really like to get rid of the SACK command.
> >
> >Somesh
> >
> >> -----Original Message-----
> >> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> >> julian_satran@il.ibm.com
> >> Sent: Wednesday, March 28, 2001 6:57 AM
> >> To: ips@ece.cmu.edu
> >> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>
> >>
> >>
> >>
> >> Mallikarjun,
> >>
> >> Last summer I thought that recovery within a connection should be left to
> >> TCP. It is simple and could be made available through IPsec (if no new
> >> option of any form can be added).
> >>
> >> Two things killed this:
> >>
> >>    The requirement to have a data encapsulation that can pass through
> >>    application proxies (like a storage router)
> >>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
> >>    header
> >>
> >>
> >> As for the ACK - I am very much in favor of it (it is a no brainer) and
> >> implementations are in fact allowed to drop even unacked data.
> >>
> >> I am bound by the Orlando meeting decision to drop it. Except for the
> >> regular "oppose everything" crowd, the two vocal opponents were Somesh
> >> Gupta and Matt Wakeley.
> >>
> >> David may or may not want to re-open the issue - I am not going to ask
> >> for it.
> >>
> >> Regards,
> >> Julo
> >>
> >> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
> >>
> >> Please respond to cbm@rose.hp.com
> >>
> >> To:   Black_David@emc.com
> >> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com, someshg@yahoo.com,
> >>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
> >>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
> >> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>
> >>
> >>
> >>
> >> David and Julian,
> >>
> >> I appreciate both your views, and should I say that they're
> >> along predicted lines :-)
> >>
> >> - David's right in saying that the situation is akin to FC's.
> >>   However, I would like to point out that FC is an unreliable
> >>   transport, and hence is forced to pick up a lot of the transport
> >>   baggage (at least in FCP-2, as I understand), in addition
> >>   to being a SCSI encapsulation layer.  Unfortunately, even with
> >>   TCP being the "reliable" transport, iSCSI is going along the
> >>   same lines - ie. transport baggage + SCSI encapsulation.  My
> >>   point is - if this is indeed a necessary evil, why don't we
> >>   complete iSCSI's transport functionality by data-ACKs?
> >>
> >> - If data SACK is introduced mostly to make up for TCP's shortcomings,
> >>   we're making its usage (and implementation) drastically less appealing
> >>   since the only way error recovery algorithms can *rely* on data SACK
> >>   is when replay is supported (or, "ReplaySupport=yes" in my proposal),
> >>   which is extremely expensive.  IOW, we're defining data SACK in the
> >>   draft and not providing any incentives to implement and use it!
> >>
> >> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
> >>   protocol in its definition so far (and I believe, rightly so - mandating
> >>   command ordering, bi-di support, SCSI CRN support to name a few
> >>   examples), the perfectly SCSI-legal R/W interactions that break in
> >>   other transports *do not* have to break in iSCSI.
> >>
> >> - A last idea (may seem radical at this point) in regards to iSCSI
> >>   being a "full transport". This provides us an opportunity to "cast
> >>   off" the transport baggage in future when we truly move to a "reliable"
> >>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
> >>   keeping the encapsulation stuff separate from the transport stuff.
> >>   (Julian, I heard from Randy that ideas similar to this were explored
> >>   in your Haifa meeting.  And yes, he recalls they were given up since
> >>   TCP was supposed to be reliable and granularity of recovery was deemed
> >>   one I/O.)
> >>
> >> With that said, may I request David (with his co-chair hat on, :-))
> >> to add some binding comments/observations on this discussion?
> >>
> >> If we decide to leave data SACKs as unattractive to implement, the draft
> >> should at the least add a statement like - "Note that satisfying all
> >> possible data SACK requests for a task with an unacknowledged status
> >> implies implementing the I/O replay buffer on the part of targets."
> >> --
> >> Mallikarjun
> >>
> >>
> >> Mallikarjun Chadalapaka
> >> Networked Storage Architecture
> >> Network Storage Solutions Organization
> >> MS 5668   Hewlett-Packard, Roseville.
> >> cbm@rose.hp.com
> >>
> >>
> >>
> >>
> >> >I think Julian's basically right -- I would point
> >> >out that any case of write after read that breaks
> >> >over iSCSI will also break over Fibre Channel.
> >> >On FC, the scenario starts with a frame CRC failure
> >> >on read data at the Initiator, so applications
> >> >have to cope and typically do so by enforcing
> >> >ordering at the app rather than using SCSI task
> >> >ordering.
> >> >
> >> >While SCSI has clever tools like ACA and task
> >> >ordering that appear to allow dependent operations
> >> >to be sent to the target concurrently, in practice
> >> >they don't work and/or aren't used (funny thing,
> >> >those two reinforce each other ;-) ).  Hence
> >> >a minimal approach to them is in order:
> >> >- Make sure the result will interoperate.
> >> >- Make sure T10 doesn't ding us for leaving something
> >> >    completely out.
> >> >- Don't specify anything not needed for the above.
> >> >
> >> >My 0.02,
> >> >--David
> >> >
> >> >> -----Original Message-----
> >> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> >> >> Sent:  Tuesday, March 27, 2001 9:23 AM
> >> >> To:    cbm@rose.hp.com
> >> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
> >> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
> >> >> Black_David@emc.com
> >> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >> >>
> >> >>
> >> >>
> >> >> Mallikarjun,
> >> >>
> >> >> I commiserate with you at the lack of ack for data but the Orlando
> >> >> meeting stated - no.  Recall that I kept the number only as a
> >> >> mechanism to detect missing packets.
> >> >>
> >> >> You can achieve the effect you want by keeping around data for a while
> >> >> (you determine how long and then discard).
> >> >>
> >> >> If a SACK comes and you can recover - fine. If not you either reaccess
> >> >> the media (if you know how) or reject and let the initiator retry.
> >> >>
> >> >> You should not worry about R/W conflicts as programs bound to have such
> >> >> conflicts either:
> >> >>
> >> >> 1) can live with them or
> >> >> 2) protect themselves through some locks and rely on
> >> >>    "operation-end-status" to keep results deterministic.
> >> >>
> >> >> Regards,
> >> >> Julo
> >> >>
> >> >>
> >> >>
> >> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
> >> >>
> >> >> Please respond to cbm@rose.hp.com
> >> >>
> >> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
> >> >>       Julian Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> >> >> cc:   Black_David@emc.com
> >> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Hi Error Recovery Team,
> >> >>
> >> >> iSCSI can discard PDUs because of digest errors and request
> >> >> retransmissions using the iSCSI data SACK.  To deal with such
> >> >> an eventuality, targets that want to support data SACK have
> >> >> the following options:
> >> >>
> >> >> (A) maintain a complete "replay" buffer for the entire I/O since
> >> >>   a SACK could come anytime before the status is ack'ed by the
> >> >>   initiator. [simple, but extremely expensive in memory resources]
> >> >>
> >> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
> >> >>   This enables keeping only those I/O buffers that haven't been ack'ed
> >> >>   by the initiator. IOW, become a real full transport! [everyone
> >> >>   disliked it earlier...]
> >> >>
> >> >> (C) re-access the medium for data retransmission requests.  Now there
> >> >>   are 3 sub-cases in this to handle the changed data on the medium in a
> >> >>   write-after-read scenario.  (SEE NOTE.1 at the bottom on how it is
> >> >>   legal.)
> >> >>      (1) On seeing any write, stall till status is ack'ed for all the
> >> >>          previous reads (basically drain the pipe). [simple, but incurs
> >> >>          an additional roundtrip delay for all writes]
> >> >>      (2) A variation of the above, keep an eye only on the prior
> >> >>          overlapping reads. [more BW efficient, but complicated to
> >> >>          resolve the block dependencies in a stream of reads followed
> >> >>          by writes]
> >> >>      (3) Document the caveat and leave it up to the applications
> >> >>          to avoid this case since this leads to data integrity issues.
> >> >>          [pushing to apps since the transport can't get it right!]
> >> >>
> >> >> My first preference is (B), followed by (A), and I suggest we not go
> >> >> to (C) at all with its inherent dangers.
> >> >>
> >> >> Doing (B) naturally completes the transport job that iSCSI has taken
> >> >> on itself in view of TCP's claimed unreliable checksum.  That is the
> >> >> right thing to do architecturally instead of being a "semi-transport"!
> >> >>
> >> >> Comments?
> >> >> --
> >> >> Mallikarjun
> >> >>
> >> >>
> >> >> Mallikarjun Chadalapaka
> >> >> Networked Storage Architecture
> >> >> Network Storage Solutions Organization
> >> >> MS 5668   Hewlett-Packard, Roseville.
> >> >> cbm@rose.hp.com
> >> >>
> >> >>
> >> >> ______________________________________________________________________
> >> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly
> >> >>         legal if SCSI sets the ORDERED task attribute on both the
> >> >>         commands AND sets the NACA bit to one to indicate that Write
> >> >>         shall be executed only if the Read did not fail (result in a
> >> >>         Check Condition).
> >> >>
> >> >>         In the current case, since Read completed just fine from SCSI's
> >> >>         point of view, SCSI is moving on to execute Write.  Those read
> >> >>         buffers had been freed up since iSCSI received an ACK at the
> >> >>         TCP level, and since iSCSI has no other way to have the data
> >> >>         ack'ed!
> >> >>
> >> >>
> >> >>
> >> >>
> >> >
> >>
> >>
> >>
> >>
> >
> >
> >
> >
>





From owner-ips@ece.cmu.edu  Mon Apr  2 19:15:15 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA03699
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 19:15:14 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32LKt524303
	for ips-outgoing; Mon, 2 Apr 2001 17:20:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32LJqr24224
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 17:19:52 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f32MSL090086;
	Mon, 2 Apr 2001 15:28:21 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Binford, Charles" <CBinford@pirus.com>,
        "iSCSI \(E-mail\)" <ips@ece.cmu.edu>
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multipleconnections.
Date: Mon, 2 Apr 2001 14:18:15 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJMEOLCFAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <E132D13F58DAAB45AE5D01CA75BD3D56B4F5@OZ>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Charles,

My concern was to avoid creating errors within the transport that are then
seen in the SCSI layer.  Unlike FCP, there is a connection sequencer and there
are no strict and short fabric time limits.  As this scheme can get stuck, I
wanted a driver to be able to extricate itself from the situation while
ensuring the state of the target remains deterministic.  Should commands get
rejected, that same driver at the initiator can resend them unless the
intent of the "Reject Pending PDUs" flagged command was to abort these stuck
commands.  It would be the initiator that would make that determination and
not the middle box.  I would hope this type of behavior would be used only
in the situation of an apparent target failure.  The task attributes within
the command would be used at the target.

iSCSI should only make an effort to ensure commands get there in the proper
sequence.  My recommendation is far simpler than your suggestion.  Every
command receives a sequence number.  If this command is to bypass any
commands pending being sent to the target within the middle box, these
pending commands are rejected.  This would also apply to commands still in
flight.  Unsigned arithmetic to handle the test of what is prior is not
difficult nor does it require a timer.  Filtering out duplicates and waiting
would be a greater effort.  In this manner, there are no sequence holes and
you have a simple way of purging a queue heading into the target.  The
commands work only at the target and not within the sequencing queue.
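
As a rough sketch of the "unsigned arithmetic" test, the comparison can
follow the Serial Number Arithmetic of RFC 1982 with SERIAL_BITS = 32,
as in my earlier note.  The function names and the plain list used as
the pending queue are only illustrative assumptions.

```python
SERIAL_BITS = 32
MODULUS = 1 << SERIAL_BITS
HALF = 1 << (SERIAL_BITS - 1)


def sn_less_than(a, b):
    """True if serial number a is prior to b (RFC 1982 style, 32 bits)."""
    diff = (b - a) % MODULUS
    # diff == HALF is undefined in RFC 1982; it is treated as "not prior".
    return 0 < diff < HALF


def reject_prior(pending, bypass_sn):
    """Split queued sequence numbers into (rejected, kept).

    Everything strictly prior to bypass_sn is rejected, the rest stays
    queued.  No timer and no duplicate filtering is involved.
    """
    rejected = [sn for sn in pending if sn_less_than(sn, bypass_sn)]
    kept = [sn for sn in pending if not sn_less_than(sn, bypass_sn)]
    return rejected, kept


if __name__ == "__main__":
    # The queue straddles a wrap of the 32-bit counter.
    rejected, kept = reject_prior([0xFFFFFFFE, 0xFFFFFFFF, 1, 5], bypass_sn=2)
    print("rejected:", rejected)
    print("kept:", kept)
```

The modular subtraction is what lets the same test work across a wrap
of the sequence counter.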

Doug

> I sent this Friday, but never saw it come back, so I'm resending.....
>
> Charles Binford
> Pirus Networks
> 316.315.0382 x222
>
>
> Doug, If I'm hearing you correctly, you are saying process the task
> management command immediately, but toss any commands waiting for a CmdSN
> hole to be filled.  The problem with this approach is that there are many
> different flavors of task management commands and not all affect all
> commands.  The iSCSI layer would have to duplicate the SCSI layer effort of
> parsing the task management request and the LU/tags of all commands it
> contemplates aborting.  I don't like that solution.
>
> David,  I don't think the current state of things is acceptable as you imply
> with your final statement of 'I'm not sure anything needs to be changed.'
>
> FC solves this problem by requiring the initiator FCP layer (note: FCP, not
> SCSI) to send an ABTS for every exchange 'in an ambiguous state' after any
> task management function.  It is ugly, but it covers the hole of the
> initiator not knowing for sure which commands were properly aborted and
> which were not.
>
> I have suggested in the past (although, from the response I've gotten, I
> think I have been misunderstood) that the iSCSI initiator send all task
> management functions down all connections of a session.  I'm NOT saying
> duplicate the task management function.  The distinction being only one
> instance of the task management function should be passed up to the SCSI
> layer on the target side.
>
> To address the issue of the task management function being delayed because
> of a slow connection I suggest the following:  upon receipt of the first
> task management function request on any connection, the iSCSI target shall
> start a relatively short timer (I'm thinking of a range from a few 100 ms to
> a couple of seconds).  As soon as the task management function is received
> on all connections, cancel the timer and pass a single instance of the
> request up to the SCSI layer.  If the timer expires, close the connection(s)
> the task management function(s) did not arrive on and pass up a single
> instance of the request to the SCSI layer.
>
> Charles Binford
> Pirus Networks
> 316.315.0382 x222
>
>
> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> Douglas Otis
> Sent: Tuesday, March 27, 2001 9:42 AM
> To: julian_satran@il.ibm.com; ips@ece.cmu.edu
> Subject: RE: iSCSI: Out Of Sequence due to null sequence with multipleconnections.
>
>
> Julian,
>
> It could be that a "stuck" command in flight or at the sequencer is the
> reason for the task management which is made problematic by the null
> sequence.  Even if implementers are aware of this problem, there is not a
> good solution until null sequence is removed.  Making a flag to indicate
> "immediate" or perhaps "reject prior pending iSCSI commands" is a means of
> ensuring the initiator remains in control with respect to all situations.
> This ensures the initiator is aware of the state of the target and the
> sequence of commands can be maintained if required.  Be careful about being
> too disk centric.  Treating these SN numbers as unsigned allows a simple
> means of tracking.  Here is an example explanation:
>
>    Comparisons and arithmetic on SNs in this document SHOULD use Serial
>    Number Arithmetic as defined in [RFC1982] where SERIAL_BITS = 32.
>
> Doug
>
> > David,
> >
> > Your summary is correct. And (except for a minor point) it is all a matter
> > of target implementation.
> > SCSI is not a "completely layered" stack and others have gone so far as to
> > do task cancellation by an action at link layer (parallel and fiber).
> > There might be 2 funny side-effects though if an implementer chooses to
> > cancel "holes" (commands in flight on other connections):
> >
> > 1) the cancelled command is another task management command (there can be
> >    only one active but what if the one active gets stuck?)
> > 2) (academic I admit) the cancelled command arrives after a wrap around in
> >    command sequencing; this is a bit harder (although not impossible) to
> >    fix in the implementation
> >
> > Implementers should be aware of those side effects.
> >
> > Regards,
> > Julo
> >
> >
> >
> > Black_David@emc.com on 26/03/2001 18:57:48
> >
> > Please respond to Black_David@emc.com
> >
> > To:   dotis@sanlight.net
> > cc:   ips@ece.cmu.edu
> > Subject:  RE: iSCSI: Out Of Sequence due to null sequence with
> > multipleconnections.
> >
> >
> [stuff deleted]
> >
> > (4) is hard.  One SCSI task management command generates one
> > response.  That response can either be generated immediately
> > (command arrives, is passed to SCSI, SCSI does its thing) or
> > at the right point in the sequence (command arrives, is
> > sequenced by iSCSI, passed to SCSI at the right point in the
> > sequence, and SCSI does its thing), but NOT both.  As things
> > currently stand, having a task management command apply to
> > in-flight commands requires sending the task management
> > command for ordered delivery - so if it's desired to have
> > the task management command take immediate effect and also
> > catch everything in flight, it's going to have to be sent
> > twice.  I'm not enthusiastic about the idea of the task
> > management command taking immediate effect but delaying the
> > response until everything in flight that might be affected
> > arrives, as I suspect the Initiator would like to know what
> > happened sooner rather than later.
> >
> > (3) is "interesting".  The results of applying a SCSI task
> > management command to a SCSI operation are known only to
> > SCSI, and hence asking that a command stuck in the iSCSI
> > sequencer be affected immediately by a task management
> > command is asking that the task management command have
> > the side effect of changing some of the commands it affects
> > to immediate delivery so that it can immediately do its
> > (SCSI) thing to them.  I wouldn't want to mandate this,
> > nor would I want to prohibit it, BUT ... if the above
> > discussion of in-flight commands is correct, I would
> > observe that the application on the Initiator side
> > can't tell the difference between commands that are in-flight
> > vs. waiting for something in-flight on another connection,
> > and hence is going to have to issue the task management
> > command for ordered delivery if it wants to affect operations
> > in either place (and issue a second copy if it wants
> > immediate action).
> >
> > The upshot is that, aside from a longer discussion of this
> > issue, I'm not sure anything needs to be changed.  Comments?
> >
> > Thanks,
> > --David
> >
> > ---------------------------------------------------
> > David L. Black, Senior Technologist
> > EMC Corporation, 42 South St., Hopkinton, MA  01748
> > +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> > black_david@emc.com       Mobile: +1 (978) 394-7754
> > ---------------------------------------------------
> >
> >
> >
> >
> >
>
>



From owner-ips@ece.cmu.edu  Mon Apr  2 22:15:17 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA07268
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 22:15:16 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f331Awq09566
	for ips-outgoing; Mon, 2 Apr 2001 21:10:58 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3319pr09438
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 21:09:51 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id DAA300678
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 03:09:44 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id DAA194746
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 03:07:55 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A23.00061C57 ; Tue, 3 Apr 2001 03:06:44 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A23.00061BD2.00@d12mta02.de.ibm.com>
Date: Tue, 3 Apr 2001 03:10:14 +0200
Subject: Re: frame formats
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Mark,

You might be getting better protection but the implementation is more
complex.
BTW you would have had the same level of protection with less complexity
with Format 1.

Julo

Mark Bakke <mbakke@cisco.com> on 02/04/2001 15:42:40

Please respond to Mark Bakke <mbakke@cisco.com>

To:   "Robert D. Russell" <rdr@mars.iol.unh.edu>
cc:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
Subject:  Re: frame formats




Bob-

Good point about option 2.  If we have separate BHS and AHS CRCs,
all of the lengths are checked.  I don't mind having the extra CRC, since
I really don't think we will see that many PDUs use the AHS.

It is possible to optimize reading option 1 (at least in software).
Just read the 48 bytes + 4 for the digest, and if there is an AHS,
keep the extra four as part of the next read.  So you still have
a single read for most frames, and two for those with AHS.
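
Roughly, the option 1 read could look like the sketch below.  The
48-byte BHS and 4-byte digest come from this thread; everything else is
an assumption for illustration: the helper ahs_len_of (how the AHS
length is extracted from the BHS is not specified here), the function
name, and a stream whose read() always returns the full count.

```python
import io

BHS_LEN = 48     # basic header segment size, per this thread
DIGEST_LEN = 4   # header digest size, per this thread


def read_header(stream, ahs_len_of):
    """Read BHS, optional AHS and the single trailing header digest.

    ahs_len_of is a caller-supplied, hypothetical function returning the
    AHS length in bytes for a given BHS.  Short reads are not handled.
    Returns (bhs, ahs, digest, number_of_reads).
    """
    first = stream.read(BHS_LEN + DIGEST_LEN)
    bhs, extra = first[:BHS_LEN], first[BHS_LEN:]
    ahs_len = ahs_len_of(bhs)
    if ahs_len == 0:
        # Common case: the four extra bytes are the digest itself.
        return bhs, b"", extra, 1
    # Slow path: the four extra bytes were the start of the AHS, so
    # exactly ahs_len bytes (rest of AHS plus digest) are still unread.
    tail = extra + stream.read(ahs_len)
    return bhs, tail[:ahs_len], tail[ahs_len:], 2


if __name__ == "__main__":
    # Toy layout: byte 0 of the BHS carries the AHS length (assumption).
    pdu = bytes([8]) + bytes(47) + b"AAAAAAAA" + b"DDDD"
    print(read_header(io.BytesIO(pdu), lambda bhs: bhs[0]))
```

Note that the AHS length is used before the digest has been checked,
which is exactly the weakness Bob points out for option 1.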

Still, option 2 does offer the best protection.  I'm fine with
option 2, but could live with option 1.  Anything but 3.

--
Mark

"Robert D. Russell" wrote:
>
> Mark:
>
> There is a potentially important distinction between the 2 choices
> that is missing from your summary when you indicate that both
> choices involve a variable length.
>
> In option 1, a single header digest after the BHS and AHS,
> you do not know when you start reading the header how long
> it will be and therefore you do not know where the digest is.
> This complicates the reading process, since it will have to either
> do the read in 2 steps (1st the BHS and then 2nd the AHS (if any)
> followed by the digest), or 1 read that interprets the data being read
> "on the fly" to extract the AHS length and extend the length of the read
> accordingly.
>
> In option 2 the first read is always for 48 bytes of header and you
> always know where the digest is.  The second read is needed
> only if the AHS length field in the BHS is non-zero, and its length
> is determined by that AHS length field. However, when the 2nd read
> is started the length IS known and the position of the digest
> IS known -- you do NOT have the "on the fly" searching needed
> in option 1.  This may (or may not) be a simplification.
>
> The other advantage to option 2 is that the input process never
> has to use unverified data (i.e., the AHS length field) to find
> the digest (and thus verify the data).
>
> Bob Russell
> InterOperability Lab
> University of New Hampshire
> rdr@iol.unh.edu
> 603-862-3774
>
> On Fri, 30 Mar 2001, Mark Bakke wrote:
>
> >
> > Excellent.  Which header digest positioning method will we choose?
> >
> > 1. Single header digest, after BHS and AHS
> >
> > 2. Two header digests, one for BHS, one for AHS
> >
> > 3. Single header digest for BHS; AHS is added to data digest
> >
> > Option 3 will not work well with iSCSI proxies and gateways
> > that may change header information, but keep the data end-to-end.
> >
> > To me, that leaves option 1 and 2.  So, which is easier, having
> > a single header digest in a variable location, or having the
> > potential for two header digests; one in a fixed location, and
> > an optional one in a variable location?
> >
> > I don't believe that we will see AHS on most iSCSI PDUs, so is
> > it OK to have a "slow path" for these?
> >
> > --
> > Mark
> >
> > julian_satran@il.ibm.com wrote:
> > >
> > > Dear colleagues,
> > >
> > > It looks like Format-2 is selected by popular vote.
> > >
> > > Julo
> >
> > --
> > Mark A. Bakke
> > Cisco Systems
> > mbakke@cisco.com
> > 763.398.1054
> >

--
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054





From owner-ips@ece.cmu.edu  Mon Apr  2 22:16:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA07282
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 22:16:18 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32NxuK05128
	for ips-outgoing; Mon, 2 Apr 2001 19:59:56 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mail.brocade.com (asbestos.brocade.com [63.121.140.244])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32NuQr04936
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 19:56:26 -0400 (EDT)
Received: from thor.brocade.com (thor [192.168.126.45])
	by mail.brocade.com (8.8.8+Sun/8.8.8) with ESMTP id QAA17229;
	Mon, 2 Apr 2001 16:56:11 -0700 (PDT)
Received: by thor.brocade.com with Internet Mail Service (5.5.2653.19)
	id <FXLAX367>; Mon, 2 Apr 2001 16:56:12 -0700
Message-ID: <FFD40DB4943CD411876500508BAD02797D467D@sj5-ex2.brocade.com>
From: Robert Snively <rsnively@Brocade.COM>
To: "'Jon Hall'" <jhall@emc.com>, ips@ece.cmu.edu
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Date: Mon, 2 Apr 2001 16:56:10 -0700 
X-Mailer: Internet Mail Service (5.5.2653.19)
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

The Stone and Partridge paper is mostly not applicable to an iSCSI
environment.  The principal failure mechanisms were major software
bugs in the driver stack of PC-oriented machines.  Such software
bugs would quickly be wrung out in any iSCSI host driver and in
any iSCSI target and would certainly not exist in an enterprise
environment.  The 1 in 10 billion figure is the more likely one
for such an environment.
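
To put such rates in perspective, a back-of-the-envelope sketch follows.
The transfer size and the 1460-byte payload per packet are assumptions
chosen only for illustration; the two rates are the ends of the range
Jon quotes below.

```python
def expected_escapes(total_bytes, payload_bytes, escape_rate):
    """Expected number of packets that slip past the TCP checksum."""
    packets = total_bytes / payload_bytes
    return packets * escape_rate


if __name__ == "__main__":
    one_terabyte = 10 ** 12   # assumed transfer size
    payload = 1460            # assumed TCP payload per packet, in bytes
    for label, rate in (("1 in 16 million", 1 / 16e6),
                        ("1 in 10 billion", 1 / 10e9)):
        print(label, expected_escapes(one_terabyte, payload, rate))
```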

Bob

>  -----Original Message-----
>  From: Jon Hall [mailto:jhall@emc.com]
>  Sent: Monday, April 02, 2001 2:49 PM
>  To: ips@ece.cmu.edu
>  Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>  
>  
>  
>  I thought (?) Stone and Partridge concluded somewhere in the
>  range between one packet in 16 million to one packet in 10
>  billion.
>  
>  But, what percent of an iSCSI flow is comprised of iSCSI pdu
>  headers?  Is there a credible flow model for deciding the
>  probability of the one packet containing a StatSN?
>  
>  -Jon
>  
>  Pierre Labat writes:
>  >"Mallikarjun C." wrote:
>  >
>  >> To be fair to data SACK, one could think of an upper bound
>  >> on the unack'ed data - agreed on at the login time.  While not
>  >> requiring acks on every PDU, it gives targets the deterministic
>  >> maximum on the buffer size they have to keep around if they
>  >> choose to "reliably" support data SACK.  The current answer of
>  >> "replay buffer size/IO size", IMHO, is simply not attractive.
>  >> Also to be fair to data SACK, I believe FCP-2 allows sequence-level
>  >> error recovery in an I/O.
>  >>
>  >> However, I think that it's extremely useful to include a discussion
>  >> in the draft of  the TCP checksum "escape" statistics and the
>  >> device types for which this was considered an absolute requirement
>  >> to make forward progress at these error rates (like huge tape
>  >> backups?) - essentially the reasons that convinced Julian to define
>  >> this mechanism in. That gives credibility and acceptance to this,
>  >> or alternately may lead to the consensus that data SACK is not required.
>  >> --
>  >> Mallikarjun
>  >
>  >About TCP checksum "escape" statistics, what I saw is:
>  >
>  >A) Jonathan Stone, Craig Partridge, "When the CRC and TCP Checksum Disagree"
>  >   http://www.acm.org/sigcomm/sigcomm2000/conf/abstract/9-1.htm
>  >
>  >   1 escape in 200 million
>  >
>  >B) V. Paxson "1999 End-to-End Internet Packet Dynamics."
>  >   http://www.aciri.org/vern/papers.html
>  >
>  >   1 escape in 300 million
>  >
>  >C) J. Stone et al. "Performance of Checksums and CRC's over Real Data"
>  >   IEEE/ACM Transactions on Networking, Vol. 6, No. 5, October 1998
>  >   http://dev.acm.org/pubs/articles/journals/ton/1998-6-5/p529-stone/p529-stone.pdf
>  >
>  >   Less than 1 escape in 10e17 segments when taking into
>  >   account the link layer AAL5 CRC. (see page 540 left column on top).
>  >
>  >Taking the worst, it is 1 in 200 million.
>  >
>  >Regards,
>  >
>  >Pierre


From owner-ips@ece.cmu.edu  Mon Apr  2 22:21:27 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA07342
	for <ips-archive@odin.ietf.org>; Mon, 2 Apr 2001 22:21:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f331Gw309949
	for ips-outgoing; Mon, 2 Apr 2001 21:16:58 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f331GGr09889
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 21:16:16 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id DAA321464
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 03:16:09 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id DAA69040
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 03:14:20 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A23.0006B337 ; Tue, 3 Apr 2001 03:13:10 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A23.0006B244.00@d12mta02.de.ibm.com>
Date: Tue, 3 Apr 2001 03:16:39 +0200
Subject: Re: iSCSI: frame formats
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Steph,

Very thoughtful. Then why did you not speak up for Format 1? It has no
tricks.

And BTW the length (data) as it stands now includes only the effective
bytes - padding is implicit.
Not so for AHS - but there the private length could end up being bytes with
padding implied.

Julo

Stephen Bailey <steph@cs.uchicago.edu> on 02/04/2001 16:17:33

Please respond to Stephen Bailey <steph@cs.uchicago.edu>

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI: frame formats




Sandeep,

> DataLen will now be max 8M/4M but then we dont wish to have large
> iSCSI PDUs in any case.

This max size is getting to the point where I'm sure it'll be an
irritant.

I would like (but, in fact, a sure way to guarantee that it won't
happen is for me to like it :^) to view iSCSI PDUs as the expected
grain size at which you might have software involvement in an
otherwise hardware-driven iSCSI implementation.

For example, when a target is returning read data for multiple
outstanding reads, it might want to return a bit at a time from each,
and each `bit' should be an iSCSI PDU.  Clearly sending these bits
will be a software-level decision.  That certainly was the rationale
for allowing multiple FCP DATA PDUs per read operation, and I naively
assumed similar logic was being applied here.

The alternative is to say that the hardware will do iSCSI PDU
chunking, but if that's the case, I expect that the header is, well, a
bit bulky.

I'm also incredibly unexcited about having data length be a multiple
of 4 bytes (if that's still in the cards).  There are operations within
the SCSI command set which return arbitrary length data.  There are
perfectly nominal cases where you get less data than you requested
(e.g. inquiry & request sense).  Furthermore, the SCSI architecture
does not prohibit this, even though certain commands do, so it is not
for iSCSI to say anything about this one way or the other.

The problem with handling lengths that include padding comes when
you're trying to move the data into a buffer which is a non-multiple
length.  For example, if I ask for 22 bytes of inquiry data (with a 22
byte buffer), what can I say, at the time a PDU arrives that has 24
bytes?  It might have 21 or 22 bytes (or perhaps even, erroneously, 23
or 24 bytes).  A data residual coming later will tell the software how
much was actually there, but it can't tell the hardware.  The typical
expectation of this type of transfer is that it will only overwrite
bytes of the buffer that are actually transferred, but having padded
lengths will not allow this.

The completely standard solution to carrying arbitrary data of
arbitrary length in an aligned transfer unit is to pad the transfer
unit but report the exact (shorter) length.  Another solution, used by
FC, is to carry a pad length.  In iSCSI, why bother---you've just
reintroduced the 2 bits you were trying to remove?
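
A minimal sketch of that standard approach, assuming a 4-byte alignment
unit: the sender pads the data on the wire but the header carries the exact
length, so the receiver copies only the bytes that were really transferred.
The function names and the bare (length, wire bytes) pair are hypothetical
illustrations, not the iSCSI PDU layout.

```python
ALIGN = 4  # assumed alignment unit, in bytes


def padded_len(exact_len: int) -> int:
    """Round exact_len up to the next multiple of ALIGN."""
    return (exact_len + ALIGN - 1) // ALIGN * ALIGN


def build_segment(data: bytes) -> tuple[int, bytes]:
    """Return (exact length for the header, zero-padded data for the wire)."""
    pad = padded_len(len(data)) - len(data)
    return len(data), data + bytes(pad)


def place_data(buffer: bytearray, exact_len: int, wire: bytes) -> int:
    """Copy only the bytes actually transferred; leave the rest of buffer alone."""
    n = min(exact_len, len(buffer))
    buffer[:n] = wire[:n]
    return n


if __name__ == "__main__":
    inquiry = bytes(range(1, 22))      # 21 bytes of made-up inquiry data
    exact, wire = build_segment(inquiry)
    buf = bytearray(b"\xff" * 22)      # 22-byte buffer pre-filled with a marker
    copied = place_data(buf, exact, wire)
    print(exact, len(wire), copied, hex(buf[21]))
```

With the exact length in hand, the 22nd byte of the buffer keeps its marker
value; with only the padded length (24) the receiver could not know where to
stop.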

The data length scenario is not comparable to IP header lengths, where
what is being carried is not arbitrary data.

Certainly, for iSCSI additional header segments (AHSs) you could
arguably use this cell length technique, since we can control what
we're carrying (AHSs that need exact byte lengths will have to be
internally self-describing) but frankly, I still think it's a bad
idea.

I can't understand why we're messing around with all these tricky
`solutions' to standard problems.  We should avoid the temptation to
get cute, and wholesomely provide the same capabilities as any other
SCSI transport.  Specifically:
  o allow long PDUs
  o carry exact data lengths

Steph





From owner-ips@ece.cmu.edu  Tue Apr  3 00:15:02 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA10137
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 00:15:00 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f331P2H10423
	for ips-outgoing; Mon, 2 Apr 2001 21:25:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f331OVr10400
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 21:24:31 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id DAA257488
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 03:24:24 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id DAA68134
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 03:21:21 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A23.00077463 ; Tue, 3 Apr 2001 03:21:25 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A23.00077341.00@d12mta02.de.ibm.com>
Date: Tue, 3 Apr 2001 03:24:54 +0200
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Somesh,

That will certainly result in poor performance for important applications
even with hardware implementations of iSCSI - mainly due to the large SCSI
command traffic and associated interrupts.

Julo

"Somesh Gupta" <someshg@yahoo.com> on 02/04/2001 22:23:25

Please respond to someshg@yahoo.com

To:   cbm@rose.hp.com, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"




To beat a dead horse ..

One has to really decide fundamentally whether

1. Commands are used to transfer very large amounts of
   data (multiple data PDUs are needed)
2. Commands are used to transfer relatively small amounts
   of data (few/about one data PDU) and multiple commands
   are then used to do long transfers

(Orlando consensus was #2)

If we assume the first model, then we really should have
a sequence # and acknowledgement of every PDU - not just
data PDUs. In this case, it is important to fill holes
in the iSCSI stream. We can have a "super-transport" as
Mallikarjun suggested between the iSCSI protocol layer
and the TCP layer that provides the various "transport"
like features we seem to want.

If we assume the second model, we assume that recovery at
the command level is sufficient. In this case it is important
to have whatever mechanisms are needed (including data seq #s)
to detect that a command will not succeed without recovery
at the command level. However, recovery is needed only
at the command level.
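
To put rough numbers on the two models, the sketch below compares the
expected extra data sent when the whole command is retried after any data
PDU fails its digest with the extra data sent when only the failed PDU is
resent. The per-PDU failure probability and the PDU counts are illustrative
assumptions, and failures are assumed to be independent.

```python
def overhead_command_level(p_fail: float, pdus_per_command: int) -> float:
    """Expected extra data, as a fraction of the command size, when the whole
    command is retried until every PDU arrives intact."""
    p_ok = (1.0 - p_fail) ** pdus_per_command  # probability one attempt succeeds
    return 1.0 / p_ok - 1.0                    # mean attempts (geometric) minus one


def overhead_pdu_level(p_fail: float) -> float:
    """Expected extra data fraction when only the failed PDU is resent."""
    return 1.0 / (1.0 - p_fail) - 1.0


if __name__ == "__main__":
    p = 1 / 200_000_000  # assumed per-PDU digest failure probability
    for n in (1, 64, 4096):
        print(n, overhead_command_level(p, n), overhead_pdu_level(p))
```

Under these assumptions the gap grows roughly linearly with the number of
PDUs per command, which is why the choice between the two models matters
mostly for very long transfers.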

I would let the current application model decide the features
in "version 1" of the iSCSI protocol.

Somesh

> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> Mallikarjun C.
> Sent: Monday, April 02, 2001 10:34 AM
> To: ips@ece.cmu.edu
> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
> >Sorry to have been missing for a while. Hope you will
> >appreciate my being back in action :-). It was a fairly
> >clear consensus in Orlando that applications broke up
> >their transfers into reasonably small chunks i.e. they
> >did not have very long running transfers.
> >
> >Therefore the consensus was that a command level recovery
> >mechanism was sufficient instead of an ack/sack for each
> >data PDU.
> >
> >The SACK mechanism was a post Orlando invention. Without
> >an ack mechanism (for every data PDU), the SACK mechanism
> >just imposes additional burden on either end of the session,
> >without really much benefit.
>
> To be fair to data SACK, one could think of an upper bound
> on the unack'ed data - agreed on at the login time.  While not
> requiring acks on every PDU, it gives targets the deterministic
> maximum on the buffer size they have to keep around if they
> choose to "reliably" support data SACK.  The current answer of
> "replay buffer size/IO size", IMHO, is simply not attractive.
> Also to be fair to data SACK, I believe FCP-2 allows sequence-level
> error recovery in an I/O.
>
> However, I think that it's extremely useful to include a discussion
> in the draft of  the TCP checksum "escape" statistics and the
> device types for which this was considered an absolute requirement
> to make forward progress at this error rates (like huge tape
> backups?) - essentially the reasons that convinced Julian to define
> this mechanism in. That gives credibility and acceptance to this,
> or alternately may lead to the consensus that data SACK is not required.
> --
> Mallikarjun
>
>
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668 Hewlett-Packard, Roseville.
> cbm@rose.hp.com
>
> >
> >The benefit of having SACK is of saving bandwidth in case
> >the data part of the data PDU failed an integrity check
> >(but passed TCP checksum). This is a rare enough case that
> >as a percentage, the bandwidth loss from retransmitting
> >all the data associated with a read or write command is
> >very very small.
> >
> >In addition, it avoids the complexity of restarting
> >something from the middle, as compared to from the begining.
> >
> >To me it seems that there is significant simplicity (from
> >implementation, reliability and recovery process) from
> >having smaller data transfer per command.
> >
> >I would really like to get rid of the SACK command.
> >
> >Somesh
> >
> >> -----Original Message-----
> >> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> >> julian_satran@il.ibm.com
> >> Sent: Wednesday, March 28, 2001 6:57 AM
> >> To: ips@ece.cmu.edu
> >> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>
> >>
> >>
> >>
> >> Mallikarjun,
> >>
> >> Last summer I thought that recovery within a connection should be left to
> >> TCP. It is simple and could be made available through IPsec (if no new
> >> option of any form can be added).
> >>
> >> Two things killed this:
> >>
> >>    The requirement to have a data encapsulation that can pass through
> >>    application proxies (like a storage router)
> >>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
> >>    header
> >>
> >>
> >> As for the ACK - I am very much in favor of it (it is a no brainer) and
> >> implementations are in fact allowed to drop even unacked data.
> >>
> >> I am bound by the Orlando meeting decision to drop it. Except the regular
> >> "oppose everything" crowd the two vocal opponents were Somesh Gupta and
> >> Matt Wakeley.
> >>
> >> David may want or not to re-open the issue - I am not going to ask for it.
> >>
> >> Regards,
> >> Julo
> >>
> >> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
> >>
> >> Please respond to cbm@rose.hp.com
> >>
> >> To:   Black_David@emc.com
> >> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com, someshg@yahoo.com,
> >>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
> >>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
> >> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>
> >>
> >>
> >>
> >> David and Julian,
> >>
> >> I appreciate both your views, and should I say that they're
> >> along predicted lines :-)
> >>
> >> - David's right in saying that the situation is akin to FC's.
> >>   However, I would like to point out that FC is an unreliable
> >>   transport, and hence is forced to pick up a lot of the transport
> >>   baggage (at least in FCP-2, as I understand), in addition
> >>   to being a SCSI encapsulation layer.  Unfortunately, even with
> >>   TCP being the "reliable" transport, iSCSI is going along the
> >>   same lines - ie. transport baggage + SCSI encapsulation.  My
> >>   point is - if this is indeed a necessary evil, why don't we
> >>   complete iSCSI's transport functionality by data-ACKs?
> >>
> >> - If data SACK is introduced mostly to make up for TCP's shortcomings,
> >>   we're making its usage (and implementation) drastically less appealing
> >>   since the only way error recovery algorithms can *rely* on data SACK
> >>   is when replay is supported (or, "ReplaySupport=yes"  in my proposal),
> >>   which is extremely expensive.  IOW, we're defining data SACK in the
> >>   draft and not providing any incentives to implement and use it!
> >>
> >> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
> >>   protocol in its definition so far (and I believe, rightly so - mandating
> >>   command ordering, bi-di support, SCSI CRN support to name a few examples),
> >>   the perfectly SCSI-legal R/W interactions that break in other transports
> >>   *do not* have to break in iSCSI.
> >>
> >> - A last idea (may seem radical at this point) in regards to iSCSI
> >>   being a "full transport". This provides us an opportunity to "cast
> >>   off" the transport baggage in future when we truly move to a "reliable"
> >>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
> >>   keeping the encapsulation stuff separate from the transport stuff.
> >>   (Julian, I heard from Randy that ideas similar to this were explored
> >>   in your Haifa meeting.  And yes, he recalls they were given up since
> >>   TCP was supposed to be reliable and granularity of recovery was deemed
> >>   one I/O.)
> >>
> >> With that said, may I request David (with his co-chair hat on, :-))
> >> to add some binding comments/observations on this discussion?
> >>
> >> If we decide to leave data SACKs as unattractive to implement, the draft
> >> should in the least add a statement like - "Note that satisfying all
> >> possible data SACK requests for a task with an unacknowledged status
> >> implies implementing the I/O replay buffer on the part of targets."
> >> --
> >> Mallikarjun
> >>
> >>
> >> Mallikarjun Chadalapaka
> >> Networked Storage Architecture
> >> Network Storage Solutions Organization
> >> MS 5668   Hewlett-Packard, Roseville.
> >> cbm@rose.hp.com
> >>
> >>
> >>
> >>
> >> >I think Julian's basically right -- I would point
> >> >out that any case of write after read that breaks
> >> >over iSCSI will also break over Fibre Channel.
> >> >On FC, the scenario starts with a frame CRC failure
> >> >on read data at the Initiator, so applications
> >> >have to cope and typically do so by enforcing
> >> >ordering at the app rather than using SCSI task
> >> >ordering.
> >> >
> >> >While SCSI has clever tools like ACA and task
> >> >ordering that appear to allow dependent operations
> >> >to be sent to the target concurrently, in practice
> >> >they don't work and/or aren't used (funny thing,
> >> >those two reinforce each other ;-) ).  Hence
> >> >a minimal approach to them is in order:
> >> >- Make sure the result will interoperate.
> >> >- Make sure T10 doesn't ding us for leaving something
> >> >    completely out.
> >> >- Don't specify anything not needed for the above.
> >> >
> >> >My 0.02,
> >> >--David
> >> >
> >> >> -----Original Message-----
> >> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> >> >> Sent:  Tuesday, March 27, 2001 9:23 AM
> >> >> To:    cbm@rose.hp.com
> >> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
> >> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
> >> >> Black_David@emc.com
> >> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >> >>
> >> >>
> >> >>
> >> >> Mallikarjun,
> >> >>
> >> >> I commiserate with you at the lack of ack for data but the Orlando meeting
> >> >> stated - no.  Recall that I kept the number only as a mechanism to detect
> >> >> missing packets.
> >> >>
> >> >> You can achieve the effect you want by keeping around data for a while
> >> >> (you determine how long and then discard).
> >> >>
> >> >> If a SACK comes and you can recover - fine. If not you either reaccess the
> >> >> media (if you know how) or reject
> >> >> and let the initiator retry.
> >> >>
> >> >> You should not worry about R/W conflicts as programs bound to have such
> >> >> conflicts either:
> >> >>
> >> >> 1)can live with them or
> >> >> 2)protect themselves through some locks and rely on "operation-end-status"
> >> >> to keep results deterministic.
> >> >>
> >> >> Regards,
> >> >> Julo
> >> >>
> >> >>
> >> >>
> >> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
> >> >>
> >> >> Please respond to cbm@rose.hp.com
> >> >>
> >> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu, Julian
> >> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> >> >> cc:   Black_David@emc.com
> >> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Hi Error Recovery Team,
> >> >>
> >> >> iSCSI can discard PDUs because of digest errors and request
> >> >> retransmissions using the iSCSI data SACK.  To deal with such
> >> >> an eventuality, targets that want to support data SACK have
> >> >> the following options:
> >> >>
> >> >> (A) maintain a complete "replay" buffer for the entire I/O since
> >> >>   a SACK could come anytime before the status is ack'ed by the
> >> >>   initiator. [ simple, but extremely expensive in memory resources]
> >> >>
> >> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
> >> >>   Thus enables keeping only those I/O buffers that haven't been ack'ed
> >> >>   by the initiator. IOW, become a real full transport! [ everyone disliked
> >> >>   it earlier...]
> >> >>
> >> >> (C) re-access the medium for data retransmission requests.  Now there
> >> >>   are 3 sub-cases in this to handle the changed data on the medium in a
> >> >>   write-after-read scenario.  (SEE NOTE.1 at the bottom on how it is legal.)
> >> >>      (1) On seeing any write, stall till status is ack'ed for all the
> >> >>             previous reads (basically drain the pipe). [simple, but incurs
> >> >>             an additional roundtrip delay for all writes].
> >> >>      (2) A variation of the above, keep an eye only on the prior
> >> >>             overlapping reads. [more BW efficient, but complicated to
> >> >>             resolve the block dependencies in a stream of reads followed
> >> >>             by writes]
> >> >>      (3) Document the caveat and leave it upto the applications
> >> >>             to avoid this case since this leads to data integrity issues.
> >> >>             [pushing to apps since the transport can't get it right!]
> >> >>
> >> >> My first preference is (B), followed by (A), and I suggest we not go
> >> >> to (C) at all with its inherent dangers.
> >> >>
> >> >> Doing (B) naturally completes the transport job that iSCSI has taken
> >> >> on itself in view of TCP's claimed unreliable checksum.  That is the
> >> >> right thing to do architecturally instead of being a "semi-transport"!
> >> >>
> >> >> Comments?
> >> >> --
> >> >> Mallikarjun
> >> >>
> >> >>
> >> >> Mallikarjun Chadalapaka
> >> >> Networked Storage Architecture
> >> >> Network Storage Solutions Organization
> >> >> MS 5668   Hewlett-Packard, Roseville.
> >> >> cbm@rose.hp.com
> >> >>
> >> >>
> >>
>
> >> >> __________________________________________________________________________
> >> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly legal
> >> >>         if SCSI sets the ORDERED task attribute on both the commands AND
> >> >>         sets the NACA bit to one to indicate that Write shall be executed
> >> >>         only if the Read did not fail (result in a Check Condition).
> >> >>
> >> >>         In the current case, since Read completed just fine from SCSI's
> >> >>         point of view, SCSI is moving on to execute Write.  Those read buffers
> >> >>         had been freed up since iSCSI received an ACK at the TCP level, and
> >> >>         since iSCSI has no other way to have the data ack'ed!
> >> >>
> >> >>
> >> >>
> >> >>
> >> >
> >>
> >>
> >>
> >>
> >
> >
> >_________________________________________________________
> >Do You Yahoo!?
> >Get your free @yahoo.com address at http://mail.yahoo.com
> >
> >
>








From owner-ips@ece.cmu.edu  Tue Apr  3 01:21:49 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA12317
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 01:21:48 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f330kvi08022
	for ips-outgoing; Mon, 2 Apr 2001 20:46:57 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f330kjr08005
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 20:46:45 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id CAA99080
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 02:46:38 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id CAA123254
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 02:44:49 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A23.0003FF1C ; Tue, 3 Apr 2001 02:43:39 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A23.0003FCEB.00@d12mta02.de.ibm.com>
Date: Mon, 2 Apr 2001 14:33:37 +0200
Subject: iSCSI requirements drafts
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Marjorie & team,

I hope this gets to you before the deadline.

A very good document - and up to now it behaves like good wine - it gets
better as it ages -:)

Here are some comments.

Regards,
Julo
_____________

On page 1 "will entail a minimum of new invention" I think you would like
to say "will entail a minimum of research".

On page 3 - the summary of the MUST  (and at section 4.1) you would like to
say "MUST specify how to recover" instead of "MUST provide the ability to
recover" (as the general agreed position of the group is that recovery is
mostly not mandatory to implement).

Page 5 - the summary - reference to 7.1 - iSCSI SHOULD deal with the
complications of the new SCSI security architecture - is hard to
understand.

Page 5 - the summary - reference to 8.2 - "any login or connect command"
should read "read any login or connect phase" as the names may appear at
later points in the phase due to security

Page 7 - I think you would like to take out the "completion of requirements
and specification of prerequisites for the 'full realization of iSCSI'"
(whatever that means!) and say something like "a stable and widely
accepted standard".

Page 8 - 3.2 first paragraph is a nice piece of education in
cost/performance vs. performance/cost. I guess you would like to remove
it as it adds little to the rest of the document and is controversial
(cost/performance is bounded by 0 while performance/cost is potentially
unbounded).

Page 11 - 5th paragraph - the current consensus is that connection binding
is part of the protocol but optional to implement (mandatory to specify in
the requirements lingo). The text should reflect this.

Page 11 - 8th paragraph -- I think you would like to strike the first
statement (about symmetric vs. asymmetric).  The question has been solved
by hiatus (nobody is pursuing the asymmetric approach). To my chagrin I
can't find the spare time to pursue it and it doesn't look like there are
many noble souls willing to take it up out there.

Page 13 - 5.2 3rd paragraph -- I think that you would like to change "iSCSI
shall have no impact on T10 architecture" into "iSCSI shall require no
changes to T10 architecture" (it will certainly impact T10 architecture
;:)).

Page 15 - 6.3 Data Integrity - I think that the statement "Two header CRCs
one for the ... " is not representing a fact (as "strongest integrity
check" suggests a numeric scale that is not there) nor a consensus of the
group. As such I think you may want to remove it.

Page 17 - 8.2 - See my previous comment on names in the login phase.





From owner-ips@ece.cmu.edu  Tue Apr  3 01:23:39 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA12583
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 01:23:38 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f330l0408027
	for ips-outgoing; Mon, 2 Apr 2001 20:47:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f330kkr08006
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 20:46:46 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id CAA161844
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 02:46:38 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id CAA100130
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 02:43:34 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A23.0003FF58 ; Tue, 3 Apr 2001 02:43:39 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A23.0003FCEC.00@d12mta02.de.ibm.com>
Date: Mon, 2 Apr 2001 14:49:11 +0200
Subject: Re: iSCSI: synch and steering comments
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Mallikarjun,

I am not sure about which comment. If it is about synch and steering, I
think that recovery from header digest errors
should not mandate a synch mechanism.  Some very sophisticated
implementations may want to take advantage of such a mechanism if it is
there. As this interaction may be fairly complex and implementation dependent,
we will assume that we will drop the connection in the recovery
descriptions we will provide.
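
The drop-the-connection policy can be sketched as follows. The 12-byte
header layout (opcode, AHS length, data length, digest), the use of
zlib.crc32 as the header digest, and all names are assumptions made purely
for illustration; none of them is the draft's format, and AHS handling is
left out.

```python
import struct
import zlib

# Hypothetical header: opcode, AHS length, 2 reserved bytes, data length, digest.
HEADER = struct.Struct(">BBxxII")
DIGEST_COVERAGE = 8  # the digest covers the 8 header bytes that precede it


class FramingLost(Exception):
    """Header digest mismatch: lengths are untrusted, so the connection is dropped."""


def encode_pdu(opcode: int, data: bytes) -> bytes:
    """Build one PDU: header fields, header digest, then the data."""
    fields = struct.pack(">BBxxI", opcode, 0, len(data))
    return fields + struct.pack(">I", zlib.crc32(fields)) + data


def read_pdus(stream: bytes):
    """Yield (opcode, data) pairs; raise FramingLost on the first bad header."""
    offset = 0
    while offset < len(stream):
        opcode, _ahs_len, data_len, digest = HEADER.unpack_from(stream, offset)
        if zlib.crc32(stream[offset:offset + DIGEST_COVERAGE]) != digest:
            # Without a synch mechanism the next PDU boundary cannot be found.
            raise FramingLost(f"bad header digest at offset {offset}")
        start = offset + HEADER.size
        yield opcode, stream[start:start + data_len]
        offset = start + data_len
```

A receiver built this way never tries to guess where the next header starts;
a sophisticated implementation with a synch mechanism could catch
FramingLost and resynchronize instead of closing the connection.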

This is also partly a result of choosing Format 2.

Regards,
Julo

"Mallikarjun C." <cbm@rose.hp.com> on 02/04/2001 07:14:54

Please respond to cbm@rose.hp.com

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI: synch and steering comments




Julian,

Thanks for the clarification.

Could you please take time to respond to the other two comments I had?
Or, do I take it that you will get back shortly?

If those comments are indeed incorrect, please help me understand why
so.

Thank you.
--
Mallikarjun


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668   Hewlett-Packard, Roseville.
cbm@rose.hp.com


>I've marked it with ---
>
>Matt Wakeley <matt_wakeley@agilent.com> on 31/03/2001 10:25:25
>
>Please respond to Matt Wakeley <matt_wakeley@agilent.com>
>
>To:   IPS Reflector <ips@ece.cmu.edu>
>cc:
>Subject:  Re: iSCSI: synch and steering comments
>
>
>
>
>Julian,
>
>There were many comments in this message.  To which comment are you
>referring?
>
>-Matt
>
>julian_satran@il.ibm.com wrote:
>>
>> Mallikarjun,
>>
>> It is clearly communicated in the paragraph above it - but fine I will add
>> it here too.
>>
>> Julo
>>
>> "Mallikarjun C." <cbm@rose.hp.com> on 30/03/2001 00:54:20
>>
>> Please respond to cbm@rose.hp.com
>>
>> To:   ips@ece.cmu.edu
>> cc:
>> Subject:  Re: iSCSI: synch and steering comments
>>
>> Julian,
>>
>> Some comments.
>>
>> >Answers in text. Thanks, Julo
>> >
>> >
>> ..
>>
>> >-Suggest adding the following statement to section 1.2.8.2.
>> >
>> > All conventional, in-order data arrival notifications generated by TCP
>> > are passed through to iSCSI by the Synch and Steering layer after
>> > appropriate data placements while none of the out-of-order data placements
>> > that it performs are communicated to upper layers.
>> >
>> >+++ I have added the following to 1.2.8.2
>> >
>> >   On the incoming path the Synch and Steering layer does not change the
>> >   way TCP notifies iSCSI about in-order data arrival.  All out-of-order
>> >   data placements performed by the Synch and Steering layer are hidden
>> >   from iSCSI.
>-------------------------------------------------------------------------------

>>
>> Okay, I'd however prefer it to imply that in-order data placement is also
>> handled by Synch and Steering in the same sentence, instead of only
>> commenting on in-order notifications, and out-of-order placements.
>>
>-------------------------------------------------------------------------------

>
>> >
>> >   I have also changed a bit the figure to convey better the fact that TCP
>> >   and Synch&Steering are related (not strictly layered) +++
>>
>> That's a good idea.
>>
>> >
>> >   ++++
>> >
>> >-Section 1.2.8.2 states that a Synch and Steering layer is optional.
>> > It has to be qualified that it is optional only for those iSCSI devices
>> > which perform connection recovery on header digest errors, since that's
>> > how they cope with loss of framing. (I guess this may change in next rev?)
>> >
>> >+++ with the new format I think that we have:
>> >
>> >- one more chance if we go for format 1 or
>> >- drop the connection on header error
>> >
>> >In both cases we can leave synch and steering optional
>>
>> Well, that doesn't address the thrust of my comment.  I was implying
>> that the draft should make it clear that those implementations which
>> don't support Synch and Steering should end the connection on a header
>> digest error and/or parity error, and not go into (what Somesh called)
>> a speculative mode.
>>
>> >
>> >+++
>> >
>> >-It appears to me that at least one Synch and Steering layer must be
>> > defined/referred to as the minimal implementation in the main draft to
>> > enable interoperability, when implementations do implement Synch and
>> >Steering.
>> >
>> >+++ why ? +++
>>
>> I may be using "interoperability" in a somewhat unconventional sense here.
>> While the draft says that Synch and Steering layer is optional, I don't see
>> that it requires implementations to always support a "no synch & steering"
>> mode, even when they support one type of Synch and Steering layer.  Given that
>> there's no mandatory Synch and Steering layer either, I don't see how two
>> iSCSI boxes that a customer buys are guaranteed to interoperate.  Please
>> comment if the draft already implies what I am asking for.
>>
>> Thanks.
>> --
>> Mallikarjun
>>
>> Mallikarjun Chadalapaka
>> Networked Storage Architecture
>> Network Storage Solutions Organization
>> MS 5668   Hewlett-Packard, Roseville.
>> cbm@rose.hp.com

From owner-ips@ece.cmu.edu  Tue Apr  3 01:31:06 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA13636
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 01:31:05 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f32LmtH26361
	for ips-outgoing; Mon, 2 Apr 2001 17:48:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from med.corp.rhapsodynetworks.com (64-160-62-201.rhapsodynetworks.com [64.160.62.201] (may be forged))
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f32LmQr26340
	for <ips@ece.cmu.edu>; Mon, 2 Apr 2001 17:48:26 -0400 (EDT)
Received: by med.corp.rhapsodynetworks.com with Internet Mail Service (5.5.2653.19)
	id <2FCGHZZ6>; Mon, 2 Apr 2001 14:48:19 -0700
Message-ID: <15851BD69CFCD41186B100B0D0AABE650C1B34@med.corp.rhapsodynetworks.com>
From: Venkat Rangan <venkat@rhapsodynetworks.com>
To: "'cbm@rose.hp.com'" <cbm@rose.hp.com>, ips@ece.cmu.edu
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Date: Mon, 2 Apr 2001 14:48:19 -0700 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Mallikarjun,

> Also to be fair to data SACK, I believe FCP-2 allows sequence-level 
> error recovery in an I/O.  

While this is true, in practice few implementations appear to actually
perform sequence-level error recovery. They seem to recover by passing up a
CRC error as a 'Delivery Subsystem Failure'. Another point is that with iSCSI
(unlike FC), the TCP layer has already done the stream-level recovery, because
a link-level CRC mismatch drops the frame before TCP sees it. So data SACK
covers only those cases where link-level CRC errors are overcome by TCP
recovery, the TCP checksum fails to detect an error, and the iSCSI data
digest catches it.
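
A minimal sketch of one such "escape", for illustration only. It assumes
Python, uses zlib's CRC-32 as a stand-in for the iSCSI data digest (the
thread does not fix the digest algorithm), and the payload and corruption
are made up. Swapping two 16-bit words leaves the TCP-style ones' complement
checksum unchanged, while the CRC differs:

```python
import zlib

def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum of the kind TCP uses."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF

original = b"\x12\x34\x56\x78\x9a\xbc\xde\xf0"
# Hypothetical corruption: the first two 16-bit words swapped in transit.
corrupted = b"\x56\x78\x12\x34\x9a\xbc\xde\xf0"

# The checksum is a sum of words, so reordering them goes unnoticed.
tcp_missed = internet_checksum(original) == internet_checksum(corrupted)
# The CRC stand-in for the data digest does notice the change.
digest_caught = zlib.crc32(original) != zlib.crc32(corrupted)
print(tcp_missed, digest_caught)
```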

Venkat Rangan
Rhapsody Networks Inc.
http://www.rhapsodynetworks.com


-----Original Message-----
From: Mallikarjun C. [mailto:cbm@rose.hp.com]
Sent: Monday, April 02, 2001 10:34 AM
To: ips@ece.cmu.edu
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"


>Sorry to have been missing for a while. Hope you will
>appreciate my being back in action :-). It was a fairly
>clear consensus in Orlando that applications broke up
>their transfers into reasonably small chunks i.e. they
>did not have very long running transfers.
>
>Therefore the consensus was that a command level recovery
>mechanism was sufficient instead of an ack/sack for each
>data PDU.
>
>The SACK mechanism was a post Orlando invention. Without
>an ack mechanism (for every data PDU), the SACK mechanism
>just imposes additional burden on either end of the session,
>without really much benefit.

To be fair to data SACK, one could think of an upper bound 
on the unack'ed data - agreed on at the login time.  While not
requiring acks on every PDU, it gives targets the deterministic
maximum on the buffer size they have to keep around if they 
choose to "reliably" support data SACK.  The current answer of 
"replay buffer size/IO size", IMHO, is simply not attractive. 
Also to be fair to data SACK, I believe FCP-2 allows sequence-level 
error recovery in an I/O.  
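
The bounded-buffer idea above can be sketched roughly as follows. This is
an illustration in Python under stated assumptions, not anything from the
draft: the class, its method names and the max_unacked_bytes parameter are
all hypothetical, standing in for a limit agreed on at login time.

```python
from collections import OrderedDict

class ReplayBuffer:
    """Target-side store of sent-but-unacknowledged data PDUs (hypothetical)."""

    def __init__(self, max_unacked_bytes: int):
        # Assumed login-negotiated upper bound on unacknowledged data.
        self.max_unacked_bytes = max_unacked_bytes
        self.pending = OrderedDict()  # data sequence number -> payload
        self.unacked_bytes = 0

    def can_send(self, payload: bytes) -> bool:
        return self.unacked_bytes + len(payload) <= self.max_unacked_bytes

    def record_sent(self, seq: int, payload: bytes) -> None:
        if not self.can_send(payload):
            raise BufferError("would exceed the negotiated unacked-data bound")
        self.pending[seq] = payload
        self.unacked_bytes += len(payload)

    def ack_through(self, seq: int) -> None:
        """Cumulative ack: release every PDU numbered <= seq."""
        for s in [s for s in self.pending if s <= seq]:
            self.unacked_bytes -= len(self.pending.pop(s))

    def replay(self, seq: int) -> bytes:
        """Serve a data SACK (retransmission request) from memory."""
        return self.pending[seq]

buf = ReplayBuffer(max_unacked_bytes=8192)
for seq in range(2):
    buf.record_sent(seq, bytes(4096))
stalled = not buf.can_send(bytes(4096))  # bound reached, sender must wait
resent = buf.replay(1)                   # a SACK for PDU 1 can be satisfied
buf.ack_through(1)                       # a single ack frees both PDUs
```

The point of the bound is that the target's memory cost is fixed by the
negotiated limit rather than by the size of the I/O.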

However, I think that it would be extremely useful for the draft to
include a discussion of the TCP checksum "escape" statistics and the
device types for which this was considered an absolute requirement
to make forward progress at these error rates (like huge tape
backups?) - essentially the reasons that convinced Julian to define
this mechanism. That would give the mechanism credibility and acceptance,
or alternatively may lead to the consensus that data SACK is not required.
--
Mallikarjun 


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668	Hewlett-Packard, Roseville.
cbm@rose.hp.com

>
>The benefit of having SACK is of saving bandwidth in case
>the data part of the data PDU failed an integrity check
>(but passed TCP checksum). This is a rare enough case that
>as a percentage, the bandwidth loss from retransmitting
>all the data associated with a read or write command is
>very very small.
>
>In addition, it avoids the complexity of restarting
>something from the middle, as compared to from the beginning.
>
>To me it seems that there is significant simplicity (from
>implementation, reliability and recovery process) from
>having smaller data transfer per command.
>
>I would really like to get rid of the SACK command.
>
>Somesh
>
>> -----Original Message-----
>> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
>> julian_satran@il.ibm.com
>> Sent: Wednesday, March 28, 2001 6:57 AM
>> To: ips@ece.cmu.edu
>> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>
>>
>>
>>
>> Mallikarjun,
>>
>> Last summer I thought that recovery within a connection should be left to
>> TCP. It is simple and could be made available through IPsec (if no new
>> option of any form can be added).
>>
>> Two things killed this:
>>
>>    The requirement to have a data encapsulation that can pass through
>>    application proxies (like a storage router)
>>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
>>    header
>>
>>
>> As for the ACK - I am very much in favor of it (it is a no brainer) and
>> implementations are in fact allowed to drop even unacked data.
>>
>> I am bound by the Orlando meeting decision to drop it. Except for the
>> regular "oppose everything" crowd, the two vocal opponents were Somesh
>> Gupta and Matt Wakeley.
>>
>> David may or may not want to re-open the issue - I am not going to ask
>> for it.
>>
>> Regards,
>> Julo
>>
>> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
>>
>> Please respond to cbm@rose.hp.com
>>
>> To:   Black_David@emc.com
>> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com, someshg@yahoo.com,
>>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
>>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
>> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>
>>
>>
>>
>> David and Julian,
>>
>> I appreciate both your views, and should I say that they're
>> along predicted lines :-)
>>
>> - David's right in saying that the situation is akin to FC's.
>>   However, I would like to point out that FC is an unreliable
>>   transport, and hence is forced to pick up a lot of the transport
>>   baggage (at least in FCP-2, as I understand), in addition
>>   to being a SCSI encapsulation layer.  Unfortunately, even with
>>   TCP being the "reliable" transport, iSCSI is going along the
>>   same lines - ie. transport baggage + SCSI encapsulation.  My
>>   point is - if this is indeed a necessary evil, why don't we
>>   complete iSCSI's transport functionality by data-ACKs?
>>
>> - If data SACK is introduced mostly to make up for TCP's shortcomings,
>>   we're making its usage (and implementation) drastically less appealing
>>   since the only way error recovery algorithms can *rely* on data SACK
>>   is when replay is supported (or, "ReplaySupport=yes"  in my proposal),
>>   which is extremely expensive.  IOW, we're defining data SACK in the
>>   draft and not providing any incentives to implement and use it!
>>
>> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
>>   protocol in its definition so far (and I believe, rightly so - mandating
>>   command ordering, bi-di support, SCSI CRN support to name a few examples),
>>   the perfectly SCSI-legal R/W interactions that break in other transports
>>   *do not* have to break in iSCSI.
>>
>> - A last idea (may seem radical at this point) in regards to iSCSI
>>   being a "full transport". This provides us an opportunity to "cast
>>   off" the transport baggage in future when we truly move to a "reliable"
>>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
>>   keeping the encapsulation stuff separate from the transport stuff.
>>   (Julian, I heard from Randy that ideas similar to this were explored
>>   in your Haifa meeting.  And yes, he recalls they were given up since
>>   TCP was supposed to be reliable and granularity of recovery was deemed
>>   one I/O.)
>>
>> With that said, may I request David (with his co-chair hat on, :-))
>> to add some binding comments/observations on this discussion?
>>
>> If we decide to leave data SACKs as unattractive to implement, the draft
>> should in the least add a statement like - "Note that satisfying all
>> possible data SACK requests for a task with an unacknowledged status
>> implies implementing the I/O replay buffer on the part of targets."
>> --
>> Mallikarjun
>>
>>
>> Mallikarjun Chadalapaka
>> Networked Storage Architecture
>> Network Storage Solutions Organization
>> MS 5668   Hewlett-Packard, Roseville.
>> cbm@rose.hp.com
>>
>>
>>
>>
>> >I think Julian's basically right -- I would point
>> >out that any case of write after read that breaks
>> >over iSCSI will also break over Fibre Channel.
>> >On FC, the scenario starts with a frame CRC failure
>> >on read data at the Initiator, so applications
>> >have to cope and typically do so by enforcing
>> >ordering at the app rather than using SCSI task
>> >ordering.
>> >
>> >While SCSI has clever tools like ACA and task
>> >ordering that appear to allow dependent operations
>> >to be sent to the target concurrently, in practice
>> >they don't work and/or aren't used (funny thing,
>> >those two reinforce each other ;-) ).  Hence
>> >a minimal approach to them is in order:
>> >- Make sure the result will interoperate.
>> >- Make sure T10 doesn't ding us for leaving something
>> >    completely out.
>> >- Don't specify anything not needed for the above.
>> >
>> >My 0.02,
>> >--David
>> >
>> >> -----Original Message-----
>> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
>> >> Sent:  Tuesday, March 27, 2001 9:23 AM
>> >> To:    cbm@rose.hp.com
>> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
>> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
>> >> Black_David@emc.com
>> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>
>> >>
>> >>
>> >> Mallikarjun,
>> >>
>> >> I commiserate with you at the lack of ack for data but the Orlando
>> >> meeting stated - no.  Recall that I kept the number only as a mechanism
>> >> to detect missing packets.
>> >>
>> >> You can achieve the effect you want by keeping around data for a while
>> >> (you determine how long and then discard).
>> >>
>> >> If a SACK comes and you can recover - fine. If not you either reaccess
>> >> the media (if you know how) or reject and let the initiator retry.
>> >>
>> >> You should not worry about R/W conflicts as programs bound to have such
>> >> conflicts either:
>> >>
>> >> 1) can live with them or
>> >> 2) protect themselves through some locks and rely on
>> >>    "operation-end-status" to keep results deterministic.
>> >>
>> >> Regards,
>> >> Julo
>> >>
>> >>
>> >>
>> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
>> >>
>> >> Please respond to cbm@rose.hp.com
>> >>
>> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
>> >>       Julian Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
>> >> cc:   Black_David@emc.com
>> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>
>> >>
>> >>
>> >>
>> >> Hi Error Recovery Team,
>> >>
>> >> iSCSI can discard PDUs because of digest errors and request
>> >> retransmissions using the iSCSI data SACK.  To deal with such
>> >> an eventuality, targets that want to support data SACK have
>> >> the following options:
>> >>
>> >> (A) maintain a complete "replay" buffer for the entire I/O since
>> >>   a SACK could come anytime before the status is ack'ed by the
>> >>   initiator. [ simple, but extremely expensive in memory resources]
>> >>
>> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
>> >>   This enables keeping only those I/O buffers that haven't been ack'ed
>> >>   by the initiator. IOW, become a real full transport! [ everyone
>> >>   disliked it earlier...]
>> >>
>> >> (C) re-access the medium for data retransmission requests.  Now there
>> >>   are 3 sub-cases in this to handle the changed data on the medium in
>> >>   a write-after-read scenario.  (SEE NOTE.1 at the bottom on how it is
>> >>   legal.)
>> >>      (1) On seeing any write, stall till status is ack'ed for all the
>> >>             previous reads (basically drain the pipe). [simple, but
>> >>             incurs an additional roundtrip delay for all writes].
>> >>      (2) A variation of the above, keep an eye only on the prior
>> >>             overlapping reads. [more BW efficient, but complicated to
>> >>             resolve the block dependencies in a stream of reads
>> >>             followed by writes]
>> >>      (3) Document the caveat and leave it up to the applications
>> >>             to avoid this case since this leads to data integrity
>> >>             issues.
>> >>             [pushing to apps since the transport can't get it right!]
>> >>
>> >> My first preference is (B), followed by (A), and I suggest we not go
>> >> to (C) at all with its inherent dangers.
>> >>
>> >> Doing (B) naturally completes the transport job that iSCSI has taken
>> >> on itself in view of TCP's claimed unreliable checksum.  That is the
>> >> right thing to do architecturally instead of being a "semi-transport"!
>> >>
>> >> Comments?
>> >> --
>> >> Mallikarjun
>> >>
>> >>
>> >> Mallikarjun Chadalapaka
>> >> Networked Storage Architecture
>> >> Network Storage Solutions Organization
>> >> MS 5668   Hewlett-Packard, Roseville.
>> >> cbm@rose.hp.com
>> >>
>> >>
>> >> __________________________________________________________________________
>> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly
>> >>         legal if SCSI sets the ORDERED task attribute on both the
>> >>         commands AND sets the NACA bit to one to indicate that Write
>> >>         shall be executed only if the Read did not fail (result in a
>> >>         Check Condition).
>> >>
>> >>         In the current case, since Read completed just fine from SCSI's
>> >>         point of view, SCSI is moving on to execute Write.  Those read
>> >>         buffers had been freed up since iSCSI received an ACK at the TCP
>> >>         level, and since iSCSI has no other way to have the data ack'ed!
>> >>



From owner-ips@ece.cmu.edu  Tue Apr  3 03:40:43 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA24743
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 03:40:41 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f334s4M22385
	for ips-outgoing; Tue, 3 Apr 2001 00:54:04 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f334rcr22364
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 00:53:38 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f33624090419;
	Mon, 2 Apr 2001 23:02:05 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <julian_satran@il.ibm.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Date: Mon, 2 Apr 2001 21:51:57 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJKEOPCFAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <C1256A23.00077341.00@d12mta02.de.ibm.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian,

Acknowledgement of every PDU does not require an acknowledgement exchange
per PDU.  I would expect some algorithm to be used to limit these exchanges.
I agree every PDU should have a sequence number to allow acknowledgement and,
in some cases, purging.
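
One possible shape for such an algorithm, as a rough sketch only. It
assumes Python and a simple "one cumulative ack per N in-order PDUs" rule;
the class name and the ack_every knob are hypothetical and are not taken
from the draft.

```python
class AckCoalescer:
    """Receiver side: every PDU is covered by an ack, but acks are batched."""

    def __init__(self, ack_every: int):
        self.ack_every = ack_every  # hypothetical tuning knob
        self.next_expected = 0
        self.since_last_ack = 0

    def on_pdu(self, seq: int):
        """Return a cumulative ack number to send, or None to stay quiet."""
        if seq != self.next_expected:
            # Gap in the sequence: a real receiver would ask for
            # retransmission here; this sketch just ignores the PDU.
            return None
        self.next_expected += 1
        self.since_last_ack += 1
        if self.since_last_ack >= self.ack_every:
            self.since_last_ack = 0
            return seq  # acknowledges everything up to and including seq
        return None

rx = AckCoalescer(ack_every=4)
acks = [a for a in (rx.on_pdu(s) for s in range(10)) if a is not None]
print(acks)  # ten PDUs received, two acks sent
```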

Doug

> Somesh,
>
> That will certainly result in poor performance for important applications
> even with hardware implementations of iSCSI - mainly due to the large SCSI
> command traffic and associated interrupts.
>
> Julo
>
> "Somesh Gupta" <someshg@yahoo.com> on 02/04/2001 22:23:25
>
> Please respond to someshg@yahoo.com
>
> To:   cbm@rose.hp.com, ips@ece.cmu.edu
> cc:
> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
> To beat a dead horse ..
>
> One has to really decide fundamentally whether
>
> 1. Commands are used to transfer very large amounts of
>    data (multiple data PDUs are needed)
> 2. Commands are used to transfer relatively small amounts
>    of data (few/about one data PDU) and multiple commands
>    are then used to do long transfers
>
> (Orlando consensus was #2)
>
> If we assume the first model, then we really should have
> a sequence # and acknowledgement of every PDU - not just
> data PDUs. In this case, it is important to fill holes
> in the iSCSI stream. We can have a "super-transport" as
> Mallikarjun suggested between the iSCSI protocol layer
> and the TCP layer that provides the various "transport"
> like features we seem to want.
>
> If we assume the second model, we assume that recovery at
> the command level is sufficient. In this case it is important
> to have whatever mechanisms (including data seq #s) are needed
> to detect that a command will not succeed without recovery
> at the command level. However, recovery is needed only
> at the command level.
>
> I would let the current application model decide the features
> in "version 1" of the iSCSI protocol.
>
> Somesh
>
> > -----Original Message-----
> > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > Mallikarjun C.
> > Sent: Monday, April 02, 2001 10:34 AM
> > To: ips@ece.cmu.edu
> > Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> >
> > >Sorry to have been missing for a while. Hope you will
> > >appreciate my being back in action :-). It was a fairly
> > >clear consensus in Orlando that applications broke up
> > >their transfers into reasonably small chunks i.e. they
> > >did not have very long running transfers.
> > >
> > >Therefore the consensus was that a command level recovery
> > >mechanism was sufficient instead of an ack/sack for each
> > >data PDU.
> > >
> > >The SACK mechanism was a post Orlando invention. Without
> > >an ack mechanism (for every data PDU), the SACK mechanism
> > >just imposes additional burden on either end of the session,
> > >without really much benefit.
> >
> > To be fair to data SACK, one could think of an upper bound
> > on the unack'ed data - agreed on at the login time.  While not
> > requiring acks on every PDU, it gives targets the deterministic
> > maximum on the buffer size they have to keep around if they
> > choose to "reliably" support data SACK.  The current answer of
> > "replay buffer size/IO size", IMHO, is simply not attractive.
> > Also to be fair to data SACK, I believe FCP-2 allows sequence-level
> > error recovery in an I/O.
> >
> > However, I think that it's extremely useful to include a discussion
> > in the draft of  the TCP checksum "escape" statistics and the
> > device types for which this was considered an absolute requirement
> > to make forward progress at this error rates (like huge tape
> > backups?) - essentially the reasons that convinced Julian to define
> > this mechanism in. That gives credibility and acceptance to this,
> > or alternately may lead to the consensus that data SACK is not required.
> > --
> > Mallikarjun
> >
> >
> > Mallikarjun Chadalapaka
> > Networked Storage Architecture
> > Network Storage Solutions Organization
> > MS 5668 Hewlett-Packard, Roseville.
> > cbm@rose.hp.com
> >
> > >
> > >The benefit of having SACK is of saving bandwidth in case
> > >the data part of the data PDU failed an integrity check
> > >(but passed TCP checksum). This is a rare enough case that
> > >as a percentage, the bandwidth loss from retransmitting
> > >all the data associated with a read or write command is
> > >very very small.
> > >
> > >In addition, it avoids the complexity of restarting
> > >something from the middle, as compared to from the begining.
> > >
> > >To me it seems that there is significant simplicity (from
> > >implementation, reliability and recovery process) from
> > >having smaller data transfer per command.
> > >
> > >I would really like to get rid of the SACK command.
> > >
> > >Somesh
> > >
> > >> -----Original Message-----
> > >> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On
> Behalf Of
> > >> julian_satran@il.ibm.com
> > >> Sent: Wednesday, March 28, 2001 6:57 AM
> > >> To: ips@ece.cmu.edu
> > >> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >>
> > >>
> > >>
> > >>
> > >> Mallikarjun,
> > >>
> > >> Last summer I thought that recovery within a connection should
> > be left to
> > >> TCP. It is simple and could be made available through IPsec
> (if no new
> > >> option of any form can be added).
> > >>
> > >> Two things killed this:
> > >>
> > >>    The requirement to have a data encapsulation that can pass through
> > >>    application proxies (like a storage router)
> > >>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
> > >>    header
> > >>
> > >>
> > >> As for the ACK - I am very much in favor of it (it is a no brainer)
> and
> > >> implementations are in fact allowed to drop even unacked data.
> > >>
> > >> I am bound by the Orlando meeting decision to drop it. Except
> > the regular
> > >> "oppose everything" crowd the two vocal opponents where Somesh
> > Gupta and
> > >> Matt Wakeley.
> > >>
> > >> David may want or not to re-open the issue - I am not going to
> > ask for it.
> > >>
> > >> Regards,
> > >> Julo
> > >>
> > >> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
> > >>
> > >> Please respond to cbm@rose.hp.com
> > >>
> > >> To:   Black_David@emc.com
> > >> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com,
> > someshg@yahoo.com,
> > >>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
> > >>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
> > >> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >>
> > >>
> > >>
> > >>
> > >> David and Julian,
> > >>
> > >> I appreciate both your views, and should I say that they're
> > >> along predicted lines :-)
> > >>
> > >> - David's right in saying that the situation is akin to FC's.
> > >>   However, I would like to point out that FC is an unreliable
> > >>   transport, and hence is forced to pick up a lot of the transport
> > >>   baggage (at least in FCP-2, as I understand), in addition
> > >>   to being a SCSI encapsulation layer.  Unfortunately, even with
> > >>   TCP being the "reliable" transport, iSCSI is going along the
> > >>   same lines - ie. transport baggage + SCSI encapsulation.  My
> > >>   point is - if this is indeed a necessary evil, why don't we
> > >>   complete iSCSI's transport functionality by data-ACKs?
> > >>
> > >> - If data SACK is introduced mostly to make up for TCP's
> shortcomings,
> > >>   we're making its usage (and implementation) drastically less
> > appealing
> > >>   since the only way error recovery algorithms can *rely* on
> data SACK
> > >>   is when replay is supported (or, "ReplaySupport=yes"  in my
> > proposal),
> > >>   which is extremely expensive.  IOW, we're defining data SACK in the
> > >>   draft and not providing any incentives to implement and use it!
> > >>
> > >> - I submit that since iSCSI is being hailed as the ideal SCSI
> Transport
> > >>   protocol in its definition so far (and I believe, rightly so
> > - mandating
> > >>   command ordering, bi-di support, SCSI CRN support to name a few
> > >> examples),
> > >>   the perfectly SCSI-legal R/W interactions that break in
> > other transports
> > >>   *do not* have to break in iSCSI.
> > >>
> > >> - A last idea (may seem radical at this point) in regards to iSCSI
> > >>   being a "full transport". This provides us an opportunity to "cast
> > >>   off" the transport baggage in future when we truly move to a
> > "reliable"
> > >>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
> > >>   keeping the encapsulation stuff separate from the transport stuff.
> > >>   (Julian, I heard from Randy that ideas similar to this
> were explored
> > >>   in your Haifa meeting.  And yes, he recalls they were
> given up since
> > >>   TCP was supposed to be reliable and granularity of recovery
> > was deemed
> > >>   one I/O.)
> > >>
> > >> With that said, may I request David (with his co-chair hat on, :-))
> > >> to add some binding comments/observations on this discussion?
> > >>
> > >> If we decide to leave data SACKs as unattractive to implement,
> > the draft
> > >> should in the least add a statement like - "Note that satisfying all
> > >> possible data SACK requests for a task with an unacknowledged status
> > >> implies implementing the I/O replay buffer on the part of targets."
> > >> --
> > >> Mallikarjun
> > >>
> > >>
> > >> Mallikarjun Chadalapaka
> > >> Networked Storage Architecture
> > >> Network Storage Solutions Organization
> > >> MS 5668   Hewlett-Packard, Roseville.
> > >> cbm@rose.hp.com
> > >>
> > >>
> > >>
> > >>
> > >> >I think Julian's basically right -- I would point
> > >> >out that any case of write after read that breaks
> > >> >over iSCSI will also break over Fibre Channel.
> > >> >On FC, the scenario starts with a frame CRC failure
> > >> >on read data at the Initiator, so applications
> > >> >have to cope and typically do so by enforcing
> > >> >ordering at the app rather than using SCSI task
> > >> >ordering.
> > >> >
> > >> >While SCSI has clever tools like ACA and task
> > >> >ordering that appear to allow dependent operations
> > >> >to be sent to the target concurrently, in practice
> > >> >they don't work and/or aren't used (funny thing,
> > >> >those two reinforce each other ;-) ).  Hence
> > >> >a minimal approach to them is in order:
> > >> >- Make sure the result will interoperate.
> > >> >- Make sure T10 doesn't ding us for leaving something
> > >> >    completely out.
> > >> >- Don't specify anything not needed for the above.
> > >> >
> > >> >My 0.02,
> > >> >--David
> > >> >
> > >> >> -----Original Message-----
> > >> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> > >> >> Sent:  Tuesday, March 27, 2001 9:23 AM
> > >> >> To:    cbm@rose.hp.com
> > >> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu;
> hufferd@us.ibm.com;
> > >> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
> > >> >> Black_David@emc.com
> > >> >> Subject:    Re: iSCSI ERT: data SACK/replay
> buffer/"semi-transport"
> > >> >>
> > >> >>
> > >> >>
> > >> >> Mallikarjun,
> > >> >>
> > >> >> I commiserate with you at the lack of ack for data but the Orlando
> > >> meeting
> > >> >> stated - no.  Recall that I kept the number only as a mechanism to
> > >> detect
> > >> >> missing packets.
> > >> >>
> > >> >> You can achieve the effect you want by keeping around data
> > for a while
> > >> >> (you
> > >> >> determine how long and then discard).
> > >> >>
> > >> >> If a SACK comes and you can recover - fine. If not you
> > either reaccess
> > >> the
> > >> >> media (if you know how) or reject
> > >> >> and let the initiator retry.
> > >> >>
> > >> >> You should not worry about R/W conflicts as programs bound
> > to have such
> > >> >> conflicts either:
> > >> >>
> > >> >> 1)can live with them or
> > >> >> 2)protect themselves through some locks and rely on
> > >> "operation-end-status"
> > >> >> to keep results deterministic.
> > >> >>
> > >> >> Regards,
> > >> >> Julo
> > >> >>
> > >> >>
> > >> >>
> > >> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
> > >> >>
> > >> >> Please respond to cbm@rose.hp.com
> > >> >>
> > >> >> To:   cbm@rose.hp.com, someshg@yahoo.com,
> > steph@cs.uchicago.edu, Julian
> > >> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> > >> >> cc:   Black_David@emc.com
> > >> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Hi Error Recovery Team,
> > >> >>
> > >> >> iSCSI can discard PDUs because of digest errors and request
> > >> >> retransmissions using the iSCSI data SACK.  To deal with such
> > >> >> an eventuality, targets that want to support data SACK have
> > >> >> the following options:
> > >> >>
> > >> >> (A) maintain a complete "replay" buffer for the entire I/O since
> > >> >>   a SACK could come anytime before the status is ack'ed by the
> > >> >>   initiator. [ simple, but extremely expensive in memory
> resources]
> > >> >>
> > >> >> (B) (re-introduce data-ACKs into the draft, and) implement
> > data-ACKs.
> > >> >>   Thus enables keeping only those I/O buffers that haven't
> > been ack'ed
> > >> >>   by the initiator. IOW, become a real full transport! [ everyone
> > >> disliked
> > >> >>   it earlier...]
> > >> >>
> > >> >> (C) re-access the medium for data retransmission requests.
> > Now there
> > >> >>   are 3 sub-cases in this to handle the changed data on the
> > medium in a
> > >> >>   write-after-read scenario.  (SEE NOTE.1 at the bottom on how it
> is
> > >> >> legal.)
> > >> >>      (1) On seeing any write, stall till status is ack'ed
> > for all the
> > >> >>             previous reads (basically drain the pipe).
> [simple, but
> > >> incurs
> > >> >>             an additional roundtrip delay for all writes].
> > >> >>      (2) A variation of the above, keep an eye only on the prior
> > >> >>             overlapping reads. [more BW efficient, but
> > complicated to
> > >> >>             resolve the block dependencies in a stream of
> > >> reads followed
> > >> >>             by writes]
> > >> >>         (3) Document the caveat and leave it upto the applications
> > >> >>             to avoid this case since this leads to data integrity
> > >> issues.
> > >> >>             [pushing to apps since the transport can't get
> > it right!]
> > >> >>
> > >> >> My first preference is (B), followed by (A), and I suggest we not
> go
> > >> >> to (C) at all with its inherent dangers.
> > >> >>
> > >> >> Doing (B) naturally completes the transport job that iSCSI has
> taken
> > >> >> on itself in view of TCP's claimed unreliable checksum.  That is
> the
> > >> >> right thing to do architecturally instead of being a
> > "semi-transport"!
> > >> >>
> > >> >> Comments?
> > >> >> --
> > >> >> Mallikarjun
> > >> >>
> > >> >>
> > >> >> Mallikarjun Chadalapaka
> > >> >> Networked Storage Architecture
> > >> >> Network Storage Solutions Organization
> > >> >> MS 5668   Hewlett-Packard, Roseville.
> > >> >> cbm@rose.hp.com
> > >> >>
> > >> >>
> > >> >> __________________________________________________________________________
> > >> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly legal
> > >> >>         if SCSI sets the ORDERED task attribute on both the commands AND
> > >> >>         sets the NACA bit to one to indicate that Write shall be executed
> > >> >>         only if the Read did not fail (result in a Check Condition).
> > >> >>
> > >> >>         In the current case, since Read completed just fine from SCSI's
> > >> >>         point of view, SCSI is moving on to execute Write.  Those read buffers
> > >> >>         had been freed up since iSCSI received an ACK at the TCP level, and
> > >> >>         since iSCSI has no other way to have the data ack'ed!



From owner-ips@ece.cmu.edu  Tue Apr  3 14:16:40 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA11474
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 14:16:39 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f33AKBL19284
	for ips-outgoing; Tue, 3 Apr 2001 06:20:11 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f33AJQr19257
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 06:19:26 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id MAA128186;
	Tue, 3 Apr 2001 12:19:19 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id MAA99520;
	Tue, 3 Apr 2001 12:17:28 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A23.00386A4D ; Tue, 3 Apr 2001 12:16:12 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: "Barry Reinhold" <bbrtrebia@mediaone.net>
cc: ips@ece.cmu.edu
Message-ID: <C1256A23.00386968.00@d12mta02.de.ibm.com>
Date: Tue, 3 Apr 2001 12:19:41 +0200
Subject: RE: iSCSI: frame formats
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



This detail is already fixed in 06 -:) Julo

"Barry Reinhold" <bbrtrebia@mediaone.net> on 02/04/2001 20:59:29

Please respond to "Barry Reinhold" <bbrtrebia@mediaone.net>

To:   "Rod Harrison" <rod.harrison@windriver.com>, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI: frame formats




Rod,
     I'm pretty sure that we decided to go with bytes at the 50th IETF
meeting for data length. No one felt that a max data size of 16 megs in an
iSCSI PDU was an issue (remember this is not max SCSI data transfer size,
it is max data bytes in an iSCSI PDU).
     I do think we need to word the padding with a bit of care, as I can
envision people doing this in at least two different ways that would not
interoperate.
     The interoperability issue on the padding question boils down to:
"Where in the TCP stream do I put the Data Digest if the number of bytes I
have to send is not a multiple of 4?"
     My expectation was that the transmitter would be null padding the data
portion of the iSCSI PDU to a word boundary, then sticking on the digest.
Thus the padding is actually in the TCP stream. For example if <d> = a data
octet, <p> = pad octet, and <dd> = data digest octet, then the TCP stream
for a 6 byte data transfer would look as follows:
<octets in the header - end modulo 4> <d> <d> <d> <d> <d> <d> <p> <p> <dd>
....
     The receiver would get the value "6" in the data length portion of the
header. After pulling out the header, the receiver would pull out 8 bytes of
"data + pad" and then get the digest.
     The other way to do this is to have a "virtual pad" such that padding is
created by the receiver when constructing the iSCSI PDU from the TCP stream.
The padding is never actually in the TCP stream itself.
     I do not think this is as helpful, but whatever we do, the spec. should
address this minor detail so we don't trip over it.
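
A minimal sketch of the first option described above (padding carried in
the TCP stream), written in Python for illustration only. The function
names are hypothetical, zlib's CRC-32 is just a stand-in for whatever data
digest is actually negotiated, and having the digest cover the pad octets
is an assumption, not something settled in this thread. The header is
omitted.

```python
import zlib

WORD = 4  # padding boundary in octets, per the discussion above


def pad_len(data_len: int) -> int:
    """Number of pad octets needed to reach the next 4-octet boundary."""
    return (-data_len) % WORD


def encode_data_segment(data: bytes) -> bytes:
    """Emit data, then null padding, then a 4-octet digest.

    CRC-32 is a placeholder digest; covering the pad is an assumption.
    """
    padded = data + b"\x00" * pad_len(len(data))
    digest = zlib.crc32(padded).to_bytes(4, "big")
    return padded + digest


def decode_data_segment(stream: bytes, data_len: int) -> bytes:
    """Consume data + pad + digest; return only the exact data_len octets."""
    padded_len = data_len + pad_len(data_len)
    padded = stream[:padded_len]
    digest = stream[padded_len:padded_len + 4]
    if zlib.crc32(padded).to_bytes(4, "big") != digest:
        raise ValueError("data digest mismatch")
    # The header carries the exact length, so the pad never reaches the buffer.
    return padded[:data_len]
```

For the 6-byte example, encode_data_segment(b"abcdef") produces 12 octets:
6 data, 2 pad, 4 digest. The receiver is told "6", consumes 8 octets of
"data + pad" before the digest, and hands only the exact 6 octets onward.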


>-----Original Message-----
>From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
>Rod Harrison
>Sent: Monday, April 02, 2001 1:44 PM
>To: ips@ece.cmu.edu
>Subject: RE: iSCSI: frame formats
>
>
>Stephen,
>
>    I don't agree with your concerns about the maximum PDU
>size. I think few targets, or indeed initiators, will be
>interested in negotiating PDU sizes anywhere near this
>large, so big transfers will have to be fragmented anyway.
>
>    However, I share your concern about the padding. I don't
>really see why we are considering it. If one has a local
>alignment issue it can, and I believe should, be taken care
>of locally and not in the specification. There are several
>easy ways of handling this sort of thing; insert and remove
>the pad locally; separate header and payload buffers
>allowing each to be naturally aligned, etc.
>
>    If we have to pad then every read of non-header data will
>have to involve a rounding calculation on the length, and
>then perhaps a second read to discard the pad if the
>underlying buffer is the exact size of the data. Possibly
>the same on send, if the data buffer is the exact size the
>transmit code can't just 'go off the end,' it will have to
>send the data, and then fake up some pad and make another
>send.
>
>    Am I missing something here, why do we care about padding?
>
>    - Rod
>
>-----Original Message-----
>From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On
>Behalf Of
>Stephen Bailey
>Sent: Monday, April 02, 2001 3:18 PM
>To: ips@ece.cmu.edu
>Subject: Re: iSCSI: frame formats
>
>
>Sandeep,
>
>> DataLen will now be max 8M/4M but then we don't wish to have large
>> iSCSI PDUs in any case.
>
>This max size is getting to the point where I'm sure it'll be an
>irritant.
>
>I would like (but, in fact, a sure way to guarantee that it won't
>happen is for me to like it :^) to view iSCSI PDUs as the expected
>grain size at which you might have software involvement in an
>otherwise hardware-driven iSCSI implementation.
>
>For example, when a target is returning read data for multiple
>outstanding reads, it might want to return a bit at a time from each,
>and each `bit' should be an iSCSI PDU.  Clearly sending these bits
>will be a software-level decision.  That certainly was the rationale
>for allowing multiple FCP DATA PDUs per read operation, and I naively
>assumed similar logic was being applied here.
>
>The alternative is to say that the hardware will do iSCSI PDU
>chunking, but if that's the case, I expect that the header is, well, a
>bit bulky.
>
>I'm also incredibly unexcited about having data length be a multiple
>of 4 bytes (if that's still in the cards).  There are operations within
>the SCSI command set which return arbitrary length data.  There are
>perfectly nominal cases where you get less data than you requested
>(e.g. inquiry & request sense).  Furthermore, the SCSI architecture
>does not prohibit this, even though certain commands do, so it is not
>for iSCSI to say anything about this one way or the other.
>
>The problem with handling lengths that include padding comes when
>you're trying to move the data into a buffer which is a non-multiple
>length.  For example, if I ask for 22 bytes of inquiry data (with a 22
>byte buffer), what can I say, at the time a PDU arrives that has 24
>bytes?  It might have 21 or 22 bytes (or perhaps even, erroneously 23
>or 24 bytes).  A data residual coming later will tell the software how
>much was actually there, but it can't tell the hardware.  The typical
>expectation of this type of transfer is that it will only overwrite
>bytes of the buffer that are actually transferred, but having padded
>lengths will not allow this.
>
>The completely standard solution to carrying arbitrary data of
>arbitrary length in an aligned transfer unit is to pad the transfer
>unit but report the exact (shorter) length.  Another solution, used by
>FC, is to carry a pad length.  In iSCSI, why bother---you've just
>reintroduced the 2 bits you were trying to remove?
>
>The data length scenario is not comparable to IP header lengths, where
>what is being carried is not arbitrary data.
>
>Certainly, for iSCSI additional header segments (AHSs) you could
>arguably use this cell length technique, since we can control what
>we're carrying (AHSs that need exact byte lengths will have to be
>internally self-describing) but frankly, I still think it's a bad
>idea.
>
>I can't understand why we're messing around with all these tricky
>`solutions' to standard problems.  We should avoid the temptation to
>get cute, and wholesomely provide the same capabilities as any other
>SCSI transport.  Specifically:
>  o allow long PDUs
>  o carry exact data lengths
>
>Steph
>






From owner-ips@ece.cmu.edu  Tue Apr  3 14:27:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA11920
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 14:27:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f339wBA18159
	for ips-outgoing; Tue, 3 Apr 2001 05:58:11 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f339vwr18147
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 05:57:58 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id LAA210230
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 11:57:51 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id LAA151312
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 11:56:01 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A23.00366F2E ; Tue, 3 Apr 2001 11:54:34 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A23.00366D4D.00@d12mta02.de.ibm.com>
Date: Tue, 3 Apr 2001 11:58:00 +0200
Subject: Re: iSCSI:2.11.5 TSID
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Barry,


You are right - this is editorial and it is here from a very old version.
It was already raised and I fixed it.

It reads now:

   The TSID is an initiator identifying tag set by the target.  It is valid
   only if the connection is accepted.



Code 0206 now identifies the error.

Julo

"Barry Reinhold" <bbrtrebia@mediaone.net> on 02/04/2001 18:24:25

Please respond to "Barry Reinhold" <bbrtrebia@mediaone.net>

To:   "ISCSI" <ips@ece.cmu.edu>
cc:
Subject:  iSCSI:2.11.5 TSID




Julian,

Section 2.11.5 states:

"The TSID is an initiator identifying tag set by the target.  A 0 in
   the returned TSID indicates that either the target supports only a
   single connection or that the ISID has already been used as a leading
   ISID. In both cases, the target rejects the login."

I have a question on this, but I'm not sure if it is an editorial or
technical issue.

If the login PDU arrives with an ISID that has "already been used as a
leading ISID" I am assuming that the TSID in this PDU must be zero. (A
leading ISID is an ISID in a LOGIN request PDU that has the TSID set to zero
and hence established a new connection)

In this case how does the target know that the ISID has already been used as
a leading ISID? There is no connection context if TSID = 0 and, for the
target, the ISID is only meaningful within a session.

If my assumption about TSID = 0 is wrong, and TSID is not zero then I would
suggest the following editorial change:


"TSID is a session identifying tag established by the target. The target MUST
return a value of 0 in the TSID field if the TSID specified in the login
request identified a session for which no more connections are allowed."

[Note: The goal of the editorial comment is to make it clear that the error
condition being discussed here is a login request with a non zero TSID that
cannot be satisfied]

Barry Reinhold
Principal Architect
Trebia Networks
barry.reinhold@trebia.com
603-868-5144/603-659-0885/978-929-0830 x138






From owner-ips@ece.cmu.edu  Tue Apr  3 14:29:30 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA12005
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 14:29:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f33ACBp18898
	for ips-outgoing; Tue, 3 Apr 2001 06:12:11 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f33ABmr18856
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 06:11:48 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id MAA79054
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 12:11:39 +0200
From: biran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id MAA137938
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 12:08:34 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A23.0037B705 ; Tue, 3 Apr 2001 12:08:33 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A23.0037B642.00@d12mta02.de.ibm.com>
Date: Tue, 3 Apr 2001 13:09:58 +0300
Subject: iSCSI requirements drafts
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Just two comments on the Security Considerations:

The iSCSI draft states that
"iSCSI implementations MUST provide means of
protection against active attacks (pretending as another
identity, message insertion, deletion, and modification)".
This might be reflected in a MUST statement in section 6.3
(which I would rename to "Data Integrity and Authentication ")

Also - I would rename the "CRC" occurrences in the MAY/MUST
statements at the beginning of 6.3 to "digest", as digests with
real security value may be negotiated (this is one method of
providing the above MUST).  CRC is one type of digest (one that
doesn't provide security value, just error detection).
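
To illustrate the distinction, here is a rough Python sketch (not taken
from any draft). The key, the payload strings, and the choice of
HMAC-SHA-256 are all hypothetical; the point is only that anyone who
modifies a message can recompute a CRC, while a keyed MAC cannot be forged
without the shared key.

```python
import hashlib
import hmac
import zlib


def crc_digest(payload: bytes) -> bytes:
    """Unkeyed error-detection digest (CRC-32 used as an example)."""
    return zlib.crc32(payload).to_bytes(4, "big")


def mac_digest(key: bytes, payload: bytes) -> bytes:
    """Keyed digest; HMAC-SHA-256 is an illustrative choice only."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify_crc(payload: bytes, digest: bytes) -> bool:
    return crc_digest(payload) == digest


def verify_mac(key: bytes, payload: bytes, digest: bytes) -> bool:
    return hmac.compare_digest(mac_digest(key, payload), digest)


key = b"hypothetical-session-key"   # shared by initiator and target
attacker_key = b"attacker-guess"    # the attacker does not know `key`
original = b"WRITE lba=100"
tampered = b"WRITE lba=999"

# The attacker rewrites the payload and recomputes the CRC: accepted.
crc_forgery_accepted = verify_crc(tampered, crc_digest(tampered))

# The attacker cannot produce a valid MAC without the shared key: rejected.
mac_forgery_accepted = verify_mac(
    key, tampered, mac_digest(attacker_key, tampered)
)
```

Both digests detect accidental corruption; only the keyed one also
addresses the message modification named in the MUST statement.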

  Regards,
     Ofer


Ofer Biran
Storage and Systems Technology
IBM Research Lab in Haifa
biran@il.ibm.com  972-4-8296253




From owner-ips@ece.cmu.edu  Tue Apr  3 14:32:10 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA12142
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 14:32:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f33DEJd29009
	for ips-outgoing; Tue, 3 Apr 2001 09:14:19 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f33DDtr28985
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 09:13:55 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id B6D6494006
	for <ips@ece.cmu.edu>; Tue,  3 Apr 2001 09:13:53 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
In-Reply-To: Message from Robert Snively <rsnively@Brocade.COM> 
   of "Mon, 02 Apr 2001 16:56:10 PDT." <FFD40DB4943CD411876500508BAD02797D467D@sj5-ex2.brocade.com> 
References: <FFD40DB4943CD411876500508BAD02797D467D@sj5-ex2.brocade.com> 
Date: Tue, 03 Apr 2001 09:12:35 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010403131353.B6D6494006@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> The Stone and Partridge paper is mostly not applicable to an iSCSI
> environment.  The principal failure mechanisms were major software
> bugs in the driver stack of PC-oriented machines.

I'm in complete agreement with Bob.

I haven't seen a good analysis of TCP checksum escapes which resulted
from intermediary manipulation (I haven't read the papers, but
hopefully soon), but my hunch is that it's incredibly rare.

An endpoint-precipitated TCP checksum `escape' would also escape a CRC or any
other similar integrity check.  That is why I think all this
additional integrity checking (on iSCSI headers & data), is an
incredible amount of extra work (not just in computing the CRCs, but
also in designing the SACK mechanism and recovery for digest failures)
for no real gain.  The real loss is that it's immensely slowing
time-to-market for iSCSI (both in the front end specification and the
back end implementation).

A straw-man proposal (very unpopular given where we are, I know) would
be to specify iSCSI without additional integrity checks (other than
what you can get through security mechanisms, which is probably not
visible to iSCSI anyway), and if that `fails' (I'm sure it won't), we
can put an integrity shim between iSCSI and the transport.

One example of how to do this would be Julian's TAF.  Another would be
the WARP RDMA layer.

We don't have to specify how to do this now, and the point is that
it's hard to do so, because we really don't know what problem we're
solving with it.  We're OK as long as we have a way to address it in
the future without completely chucking what already exists.

The other point to remember is that iSCSI still has to make the
ID->Proposed->Draft->Internet traversal, and anybody that thinks it's
going to do that on the first try is kidding themselves.  It's more
important to get SOMETHING out there that exposes the implementation
holes than to design a cathedral on paper.

Steph


From owner-ips@ece.cmu.edu  Tue Apr  3 16:05:40 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA15353
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 16:05:36 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f33HeR318624
	for ips-outgoing; Tue, 3 Apr 2001 13:40:27 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f33HdLr18540
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 13:39:22 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id TAA254570
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 19:39:15 +0200
From: biran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id TAA25916
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 19:36:09 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A23.0060B094 ; Tue, 3 Apr 2001 19:36:06 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A23.0060AFE0.00@d12mta02.de.ibm.com>
Date: Tue, 3 Apr 2001 20:37:29 +0300
Subject: Public key AuthMethod
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



In Minneapolis I proposed to add the public key AuthMethod based on
SPKM (public key implementation of GSS-API, RFC-2025). SPKM is really
suitable (it gives the exact definition of tokens to be exchanged in iSCSI
text messages for public key authentication including optional certificate
exchange, and MAC digest based on shared key generated by the
exchange, that might be negotiated in the iSCSI login).

However, there is a question mark about the status of RFC-2025. It is on
standards track at Proposed Standard level, but it is from 1996... I had
a correspondence with the CAT-WG chair, and here are two citations:

"I'm unaware, however, of any current plans for advancement of this document
beyond Proposed and it hasn't been actively discussed within the WG for some
time. I'm also unsure as to its number of existing implementations."

"Nonetheless, I believe that it remains well suited as a specification for an
X.509-based authentication mechanism.  I'm not aware of an alternative
specification with comparable scope currently defined within an Internet
standards-track RFC"

(BTW, if you look at the version linked from the RFC pages, the
"Status of this Memo" section states:
"This memo defines an Experimental Protocol for the Internet community..."
however the same section in the version fetched from the RFC Editor-pages
states:
"This document specifies an Internet standards track protocol for the
   Internet community..."
the CAT-WG chair confirmed that the first copy is a mistake.)

In any case, can we (/ should we) rely on an RFC whose plan for becoming a
standard is not clear at all?

Another option for the public key AuthMethod might be a reduced version
of the TLS handshake (implemented in the iSCSI text messages, not using
the TLS record layer).  This can provide authentication (with optional
certificate exchange) and a shared secret that can be used for MAC
digest according to the TLS MAC specification (but used of course as
optional iSCSI digest and not inside TLS).

I believe it's preferable to adopt an existing security standard as much
as possible rather than inventing something new for iSCSI.

I'd like to hear some opinions on these before we decide how to define
the public key AuthMethod.

Regards,
  Ofer


Ofer Biran
Storage and Systems Technology
IBM Research Lab in Haifa
biran@il.ibm.com  972-4-8296253




From owner-ips@ece.cmu.edu  Tue Apr  3 16:06:42 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA15379
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 16:06:37 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f33I8N220747
	for ips-outgoing; Tue, 3 Apr 2001 14:08:23 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from patan.sun.com (patan.Sun.COM [192.18.98.43])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f33I7Sr20676
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 14:07:28 -0400 (EDT)
Received: from engmail1.Eng.Sun.COM ([129.146.1.13])
	by patan.sun.com (8.9.3+Sun/8.9.3) with ESMTP id LAA29394;
	Tue, 3 Apr 2001 11:07:26 -0700 (PDT)
Received: from dhcp-aus08-229.central.sun.com (dhcp-aus08-229.Central.Sun.COM [129.153.128.229])
	by engmail1.Eng.Sun.COM (8.9.3+Sun/8.9.3/ENSMAIL,v2.1p1) with ESMTP id LAA11504;
	Tue, 3 Apr 2001 11:07:24 -0700 (PDT)
Received: (from shepler@localhost)
	by dhcp-aus08-229.central.sun.com (8.10.2+Sun/8.10.2) id f33J7K1100760;
	Tue, 3 Apr 2001 14:07:20 -0500 (CDT)
Date: Tue, 3 Apr 2001 14:07:20 -0500
From: Spencer Shepler <shepler@eng.sun.com>
To: biran@il.ibm.com
Cc: ips@ece.cmu.edu, mike@eisler.com
Subject: Re: Public key AuthMethod
Message-ID: <20010403140720.F100718@dhcp-aus08-229.central.sun.com>
Reply-To: shepler@eng.sun.com
References: <C1256A23.0060AFE0.00@d12mta02.de.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <C1256A23.0060AFE0.00@d12mta02.de.ibm.com>; from biran@il.ibm.com on Tue, Apr 03, 2001 at 08:37:29PM +0300
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Note that NFSv4 specifies LIPKEY (RFC2847) and SPKM-3 as mandatory to
implement.  LIPKEY/RFC2847 builds upon the SPKM RFC and defines the
SPKM-3 mechanism which meets the needs of NFSv4.  SPKM-3/LIPKEY is
being implemented and therefore the SPKM RFC may be moving forward as
part of the NFSv4 work.

Spencer

On Tue, biran@il.ibm.com wrote:
> 
> 
> In Minneapolis I proposed to add the public key AuthMethod based on
> SPKM (public key implementation of GSS-API, RFC-2025). SPKM is really
> suitable (it gives the exact definition of tokens to be exchanged in iSCSI
> text messages for public key authentication including optional certificate
> exchange, and MAC digest based on shared key generated by the
> exchange, that might be negotiated in the iSCSI login).
> 
> However, there is a question mark about the status of RFC-2025. It is on
> standards track at Proposed Standard level, but it is from 1996... I had
> a correspondence with the CAT-WG chair, and here are two citations:
> 
> "I'm unaware, however, of any current plans for advancement of this document
> beyond Proposed and it hasn't been actively discussed within the WG for some
> time. I'm also unsure as to its number of existing implementations."
> 
> "Nonetheless, I believe that it remains well suited as a specification for an
> X.509-based authentication mechanism.  I'm not aware of an alternative
> specification with comparable scope currently defined within an Internet
> standards-track RFC"
> 
> (BTW, if you look at the version linked from the RFC pages, the
> "Status of this Memo" section states:
> "This memo defines an Experimental Protocol for the Internet community..."
> however the same section in the version fetched from the RFC Editor-pages
> states:
> "This document specifies an Internet standards track protocol for the
>    Internet community..."
> the CAT-WG chair confirmed that the first copy is a mistake.)
> 
> In any case, can we (/ should we) rely on an RFC whose plan for becoming a
> standard is not clear at all?
> 
> Another option for the public key AuthMethod might be a reduced version
> of the TLS handshake (implemented in the iSCSI text messages, not using
> the TLS record layer).  This can provide authentication (with optional
> certificate exchange) and a shared secret that can be used for MAC
> digest according to the TLS MAC specification (but used of course as
> optional iSCSI digest and not inside TLS).
> 
> I believe it's preferable to adopt an existing security standard as much
> as possible than inventing something new for iSCSI.
> 
> I'd like to hear some opinions on these before we decide how to define
> the public key AuthMethod.
> 
> Regards,
>   Ofer
> 
> 
> Ofer Biran
> Storage and Systems Technology
> IBM Research Lab in Haifa
> biran@il.ibm.com  972-4-8296253
> 
> 

-- 

- Spencer -




From owner-ips@ece.cmu.edu  Tue Apr  3 17:31:49 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA17724
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 17:31:23 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f33GeSt13111
	for ips-outgoing; Tue, 3 Apr 2001 12:40:28 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from apollo.pirus.com ([63.91.118.2])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f33GY0r12718
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 12:34:00 -0400 (EDT)
Message-ID: <E132D13F58DAAB45AE5D01CA75BD3D56B509@OZ>
From: "Binford, Charles" <CBinford@Pirus.com>
To: "'Black_David@emc.com'" <Black_David@emc.com>, ips@ece.cmu.edu
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple con
	nections.
Date: Tue, 3 Apr 2001 12:33:30 -0400 
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

See comments below, marked with ++++

Charles Binford
Pirus Networks
316.315.0382 x222


-----Original Message-----
From: Black_David@emc.com [mailto:Black_David@emc.com]
Sent: Monday, April 02, 2001 2:13 PM
To: CBinford@pirus.com; ips@ece.cmu.edu
Subject: RE: iSCSI: Out Of Sequence due to null sequence with
multipleconn ections.


It still appears to me that the behavior that
Charles wants is available by sending the task management
function twice - once for immediate delivery and once for
ordered delivery.  Duplication on all connections would
not be necessary in the normal case because the ordered
instance would naturally come behind the unordered ones.
The timer seems to be something that ought to be taken
up under error recovery in general (and in particular,
we should consider letting TCP time out the connection
rather than adding yet another recovery timer).  IMHO,
the bottom line is that the logic to duplicate the task
management commands at the Initiator and coordinate
them at the Target costs implementation complexity,
and I have not seen a convincing statement of what
we're buying with that added complexity.

It may be the case that applications don't want to be
bothered with sending task management commands twice,
in which case we have a coordination problem of some
form that has to be addressed in iSCSI.

+++++++++++++++
I would state this much stronger.  Applications had better not have to know
that it is iSCSI underneath vs. FCP or parallel SCSI else I believe we
missed the objective (granted, some things such as target address space are
unavoidably different, but I believe task management functions should be the
same).  The transport needs to handle the transport issues without exposing
quirks to the SCSI or application layer.

I would be more agreeable if David was advocating the iSCSI layer was
responsible for sending task management functions twice, once with, once
without ordering.  This puts the handling of the iSCSI specific ordering
issue in the realm of iSCSI.  My caution to this approach is that it changes
the behavior of the target slightly.  The SCSI target layer will see the task
management function twice, not once.  I can't think of any scenario where
this would matter, but I get nervous when we change behavior from the norm.
++++++++++++++++++ cb


For further discussion,
--David


From owner-ips@ece.cmu.edu  Tue Apr  3 17:35:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA17834
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 17:34:51 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f33HkSg19118
	for ips-outgoing; Tue, 3 Apr 2001 13:46:28 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dirty.research.bell-labs.com (dirty.research.bell-labs.com [204.178.16.6])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f33Hk3r19083
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 13:46:03 -0400 (EDT)
Received: from scummy.research.bell-labs.com ([135.104.2.10]) by dirty; Tue Apr  3 13:45:45 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by scummy; Tue Apr  3 13:45:44 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id NAA05439;
	Tue, 3 Apr 2001 13:45:44 -0400 (EDT)
Message-ID: <3ACA0C48.D922C464@research.bell-labs.com>
Date: Tue, 03 Apr 2001 13:45:44 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Black_David@emc.com
CC: dotis@sanlight.net, ips@ece.cmu.edu
Subject: Re: iSCSI: Out Of Sequence due to null sequence with 
 multipleconnections.
References: <0F31E5C394DAD311B60C00E029101A070801532D@corpmx9.isus.emc.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


There is probably more to this than meets the eye, in which
case do please let me know where I err..

Is it not possible to use the refTaskTag in task management
command to introduce "state" at the target ?

Specifically, 
1) send task management command with immediate delivery (cmdSN=0).
2) if iSCSI target sees a non-existing refTaskTag,
   it uses that fact to create some "state" at the target.
   (NOTE: we don't know if the task had already completed ??)
3) When actual task arrives, it gets dropped since iSCSI sees 
   that for that refTaskTag, state=must abort.

But there is still the question..
1) when do we delete that target "state" ?  there is no 
   endCmdSN or refCmdSN.
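
The three steps above could be sketched roughly as follows.  This is
only an illustration under stated assumptions: the class and method
names are hypothetical (nothing here comes from the iSCSI draft), and
the time-based expiry is merely a stand-in for the open question of
when the state can be deleted.  It also does not address the caveat
that the referenced task may already have completed.

```python
import time


class AbortStateTable:
    """Hypothetical target-side record of task tags to abort on arrival."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        # ASSUMPTION: a timeout retires the state; the proposal itself
        # leaves the deletion rule open (no endCmdSN or refCmdSN).
        self._ttl = ttl_seconds
        self._clock = clock
        self._must_abort = {}  # refTaskTag -> time the state was created

    def on_task_management(self, ref_task_tag, known_tags):
        """Step 2: remember a refTaskTag that names no task at the target."""
        if ref_task_tag in known_tags:
            return "abort-existing"
        self._must_abort[ref_task_tag] = self._clock()
        return "state-created"

    def on_task_arrival(self, task_tag):
        """Step 3: drop a task whose tag was flagged earlier."""
        self._expire()
        if task_tag in self._must_abort:
            del self._must_abort[task_tag]
            return "dropped"
        return "accepted"

    def _expire(self):
        """Retire flagged tags older than the assumed time-to-live."""
        now = self._clock()
        stale = [tag for tag, created in self._must_abort.items()
                 if now - created > self._ttl]
        for tag in stale:
            del self._must_abort[tag]
```

With a timeout, a task that arrives after its flag has expired would be
accepted, which is exactly why a real deletion rule is needed.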

-sandeep

Black_David@emc.com wrote:
> 
> Let me apologize for unintentionally stepping on Doug
> in the meeting.  Due to the time squeeze, I neglected
> to ask for other issues at the end of the iSCSI discussion
> - sorry about that.
> 
> I'm going to back off and try to take a high level view of
> this and see what sort of observations emerge.  When a SCSI
> task management command pops out of an iSCSI TCP connection
> at the target, there are four places that the SCSI operations
> it affects could be:
> 
> (1) Executing in SCSI.
> (2) Queued to SCSI for execution, but not executing.
> (3) Queued in iSCSI waiting for command sequencing.
> (4) In-flight.
> 
> (2) includes the "resource limitations between the
> sequencer and the target that may lead to a stall or
> a long term delay".
> 
> (1) and (2) are the easy cases - the SCSI implementation
> must apply the task management command to executing tasks,
> and should perform the obvious "peephole optimization" to
> the commands queued for execution (i.e., if they're to be
> aborted, abort them and send the response(s) without starting
> execution).  In essence, this models a command as crossing
> the boundary from iSCSI to SCSI the moment that iSCSI is
> prepared to give it to SCSI (i.e., any queue and related
> resource limitations are on SCSI's side of the line).
> 
> (4) is hard.  One SCSI task management command generates one
> response.  That response can either be generated immediately
> (command arrives, is passed to SCSI, SCSI does its thing) or
> at the right point in the sequence (command arrives, is
> sequenced by iSCSI, passed to SCSI at the right point in the
> sequence, and SCSI does its thing), but NOT both.  As things
> currently stand, having a task management command apply to
> in-flight commands requires sending the task management
> command for ordered delivery - so if it's desired to have
> the task management command take immediate effect and also
> catch everything in flight, it's going to have to be sent
> twice.  I'm not enthusiastic about the idea of the task
> management command taking immediate effect but delaying the
> response until everything in flight that might be affected
> arrives, as I suspect the Initiator would like to know what
> happened sooner rather than later.
> 
> (3) is "interesting".  The results of applying a SCSI task
> management command to a SCSI operation are known only to
> SCSI, and hence asking that a command stuck in the iSCSI
> sequencer be affected immediately by a task management
> command is asking that the task management command have
> the side effect of changing some of the commands it affects
> to immediate delivery so that it can immediately do its
> (SCSI) thing to them.  I wouldn't want to mandate this,
> nor would I want to prohibit it, BUT ... if the above
> discussion of in-flight commands is correct, I would
> observe that the application on the Initiator side
> can't tell the difference between commands that are in-flight
> vs. waiting for something in-flight on another connection,
> and hence is going to have to issue the task management
> command for ordered delivery if it wants to affect operations
> in either place (and issue a second copy if it wants
> immediate action).
> 
> The upshot is that, aside from a longer discussion of this
> issue, I'm not sure anything needs to be changed.  Comments?
> 
> Thanks,
> --David
> 
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------


From owner-ips@ece.cmu.edu  Tue Apr  3 18:50:07 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA19423
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 18:50:06 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f33JmVo28150
	for ips-outgoing; Tue, 3 Apr 2001 15:48:31 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com (mxic1.isus.emc.com [168.159.211.82])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f33JlVr28061
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 15:47:31 -0400 (EDT)
Received: by mxic1.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <HMMDPXA3>; Tue, 3 Apr 2001 15:48:52 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080153A9@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: CBinford@pirus.com, ips@ece.cmu.edu
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple con
	 nections.
Date: Tue, 3 Apr 2001 15:47:23 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> I would state this much stronger.  Applications had better not have to know
> that it is iSCSI underneath vs. FCP or parallel SCSI else I believe we
> missed the objective (granted, some things such as target address space are
> unavoidably different, but I believe task management functions should be the
> same).  The transport needs to handle the transport issues without exposing
> quirks to the SCSI or application layer.

Unfortunately, I think we have an impossible situation.  It appears to me that
we have to pick at most two of the following three goals, as I have yet to see
any way to achieve all three for a single task management command on a
multiple connection session:

(1) The command takes effect immediately and its status/response
	is available immediately.
(2) The command affects all commands in flight, and its status/response
	is delayed until all such effects are complete.
(3) There is no significant visible departure from existing SCSI task
	management behavior.

The problem is that trying to do both (1) and (2) either requires SCSI to
"execute" the task management command twice or requires that iSCSI do
some task management (e.g., on the in-flight commands) on SCSI's behalf
(or worse, having SCSI prolong the execution of the task management
command until everything in flight in iSCSI arrives).  All of these appear
to lead to problems with (3) in one form or another - two executions
result in two SCSI status/responses that have to be merged, and iSCSI
task management will sooner or later do something different from SCSI
(e.g., I sincerely doubt that a Target in a bridge will ever get this 100%
identical to the devices that are being bridged).

The current iSCSI draft provides the choice of  [(1)] XOR [(2), (3)];
the reason for not getting (3) with (1) is the possibility of the task
management command bypassing commands that it's supposed to
affect.  Charles' original proposal is [(2), (3)] because it has to time out
a stuck connection before executing the command, and is roughly
equivalent to sending the command for ordered delivery and having
the implementation treat any queue between iSCSI and SCSI as
being on the SCSI side of the line.  Doug Otis's counter-proposal
falls into the category of iSCSI doing task management on SCSI's
behalf and provides an example of how this results in visible changes
in behavior -- for the CLEAR ACA task management command,
aborting all tasks that are queued or in flight is generally incorrect.

I would note that this issue does not arise on single connection sessions,
because sending the command for immediate delivery plus some care not
to reorder things in the iSCSI Target (i.e., consider the iSCSI to SCSI queue
to be in "SCSI" and hence subject to the task management command)
obtains all of (1) through (3).
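
As a rough summary of the trade-off described so far, the sketch below
encodes which of the three goals each delivery choice obtains.  The
function name goals_met and the string values "immediate" and
"ordered" are illustrative only, and the mapping is one reading of
this message rather than anything stated in the draft.

```python
# Goals, numbered as in the list earlier in this message.
IMMEDIATE_EFFECT = 1   # (1) takes effect and responds immediately
COVERS_IN_FLIGHT = 2   # (2) affects all commands in flight
SCSI_COMPATIBLE = 3    # (3) no visible departure from SCSI behavior


def goals_met(delivery, connections):
    """Return the set of goals obtained for one task management command.

    delivery    -- "immediate" or "ordered" (hypothetical labels)
    connections -- number of connections in the session
    """
    if delivery not in ("immediate", "ordered"):
        raise ValueError("delivery must be 'immediate' or 'ordered'")
    if connections < 1:
        raise ValueError("a session has at least one connection")
    if delivery == "ordered":
        # Wait for the command's place in the sequence: [(2), (3)].
        return {COVERS_IN_FLIGHT, SCSI_COMPATIBLE}
    if connections == 1:
        # Nothing can be bypassed on a single connection.
        return {IMMEDIATE_EFFECT, COVERS_IN_FLIGHT, SCSI_COMPATIBLE}
    # Immediate delivery may bypass commands on other connections.
    return {IMMEDIATE_EFFECT}
```

For example, goals_met("immediate", 2) yields only goal (1), while
goals_met("immediate", 1) yields all three.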

Going out on a limb, I suspect applications will generally want [(2), (3)]
-- send for ordered delivery and wait for the dust to settle because that
provides the best odds of having some weird device get into a known
state from which further progress is possible.  This allows the application
to not know whether parallel SCSI, FCP or iSCSI is underneath and
relies on other iSCSI recovery procedures to make sure that the task
management command is delivered and executed (e.g., unstick and/or
close "stuck" connections).  There will be cases in which (1) is
needed (e.g., observe tape robot doing something obviously wrong,
and get it to stop immediately), but those may involve fairly blunt
instruments (e.g., LUN RESET) and the need to clean up any collateral
damage.

Sandeep's proposal to create state in the target either fails to achieve
(1) [if the response is delayed until the state is removed] or violates SAM-2
[returns the response to the task management command before the task
management command is complete].  Having state linger after a completed
LUN or TARGET RESET is almost certainly wrong.

So, I think I'm down to sending task management functions once, usually
for ordered delivery with the application making the ordered vs. immediate
delivery choice (and sending the task management function twice if it
so chooses).  I think apps will generally choose ordered delivery, choosing
predictable behavior over immediacy concerns.  Aside from a longer
discussion of this issue, I still don't see the need for additional
mechanism(s) for task management - what have I missed in the above
discussion?

--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Tue Apr  3 20:45:48 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA21127
	for <ips-archive@odin.ietf.org>; Tue, 3 Apr 2001 20:45:47 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f33MKfV03721
	for ips-outgoing; Tue, 3 Apr 2001 18:20:41 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from df-inet1.exchange.microsoft.com (df-inet1.exchange.microsoft.com [131.107.8.8])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f33MJir03663
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 18:19:45 -0400 (EDT)
Received: from df-virus2.platinum.corp.microsoft.com ([172.30.236.33]) by df-inet1.exchange.microsoft.com with Microsoft SMTPSVC(5.0.2195.2831);
	 Tue, 3 Apr 2001 15:20:25 -0700
Received: from 172.30.236.11 by df-virus2.platinum.corp.microsoft.com (InterScan E-Mail VirusWall NT); Tue, 03 Apr 2001 14:20:42 -0700 (Pacific Daylight Time)
Received: from DF-BOWWOW.platinum.corp.microsoft.com ([172.30.236.100]) by yuri.dns.microsoft.com with Microsoft SMTPSVC(5.0.2195.2883);
	 Tue, 3 Apr 2001 15:20:42 -0700
X-MimeOLE: Produced By Microsoft Exchange V6.0.4678.0
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C0BC8C.514ED7BE"
Subject: iSCSI implementation
Date: Tue, 3 Apr 2001 15:20:41 -0700
Message-ID: <5B90AD65A9E8934987DB0C8C07626574666F58@DF-BOWWOW.platinum.corp.microsoft.com>
Thread-Topic: iSCSI implementation
Thread-Index: AcC8jCiy16U6n9vsTRCzHo+/Zeslyg==
From: "Lakshmi Ramasubramanian" <nramas@Exchange.Microsoft.com>
To: <ips@ece.cmu.edu>
X-OriginalArrivalTime: 03 Apr 2001 22:20:42.0031 (UTC) FILETIME=[51CE93F0:01C0BC8C]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.

------_=_NextPart_001_01C0BC8C.514ED7BE
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

We implemented an iSCSI initiator based on the July 2000 version of the
iSCSI spec.  This was demoed at WinHEC (Windows Hardware Engineering
Conference) last month.

We'd like to move to the newer version of the spec, and would like to
know which version of the spec people implementing iSCSI targets are
using.

thanks!
 -lakshmi

------_=_NextPart_001_01C0BC8C.514ED7BE--


From owner-ips@ece.cmu.edu  Wed Apr  4 00:09:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA25716
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 00:09:23 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f33NbZB07311
	for ips-outgoing; Tue, 3 Apr 2001 19:37:35 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from smtp010.mail.yahoo.com (smtp010.mail.yahoo.com [216.136.173.30])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f33Naor07279
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 19:36:50 -0400 (EDT)
Received: from sdsl-216-36-75-164.dsl.sjc.megapath.net (HELO somesh) (216.36.75.164)
  by smtp.mail.vip.sc5.yahoo.com with SMTP; 3 Apr 2001 23:36:47 -0000
X-Apparently-From: <someshg@yahoo.com>
Reply-To: <someshg@yahoo.com>
From: "Somesh Gupta" <someshg@yahoo.com>
To: <julian_satran@il.ibm.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Date: Tue, 3 Apr 2001 16:31:23 -0700
Message-ID: <NMEALCLOIBCHBDHLCMIJCELACCAA.someshg@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Importance: Normal
In-Reply-To: <C1256A23.00077341.00@d12mta02.de.ibm.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian,

Which part of my note were you raising a concern about?

Somesh

> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> julian_satran@il.ibm.com
> Sent: Monday, April 02, 2001 6:25 PM
> To: ips@ece.cmu.edu
> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
> Somesh,
>
> That will certainly result in poor performance for important applications
> even with hardware implementations of iSCSI - mainly due to the large SCSI
> command traffic and associated interrupts.
>
> Julo
>
> "Somesh Gupta" <someshg@yahoo.com> on 02/04/2001 22:23:25
>
> Please respond to someshg@yahoo.com
>
> To:   cbm@rose.hp.com, ips@ece.cmu.edu
> cc:
> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
> To beat a dead horse ..
>
> One has to really decide fundamentally whether
>
> 1. Commands are used to transfer very large amounts of
>    data (multiple data PDUs are needed)
> 2. Commands are used to transfer relatively small amounts
>    of data (few/about one data PDU) and multiple commands
>    are then used to do long transfers
>
> (Orlando consensus was #2)
>
> If we assume the first model, then we really should have
> a sequence # and acknowledgement of every PDU - not just
> data PDUs. In this case, it is important to fill holes
> in the iSCSI stream. We can have a "super-transport" as
> Mallikarjun suggested between the iSCSI protocol layer
> and the TCP layer that provides the various "transport"
> like features we seem to want.
>
> If we assume the second model, we assume that recovery at
> the command level is sufficient. In this case it is important
> to have whatever mechanisms (including data seq #s) are needed
> to detect that a command will not succeed without recovery
> at the command level. However, recovery is needed only
> at the command level.
>
> I would let the current application model decide the features
> in "version 1" of the iSCSI protocol.
>
> Somesh
>
> > -----Original Message-----
> > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > Mallikarjun C.
> > Sent: Monday, April 02, 2001 10:34 AM
> > To: ips@ece.cmu.edu
> > Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> >
> > >Sorry to have been missing for a while. Hope you will
> > >appreciate my being back in action :-). It was a fairly
> > >clear consensus in Orlando that applications broke up
> > >their transfers into reasonably small chunks i.e. they
> > >did not have very long running transfers.
> > >
> > >Therefore the consensus was that a command level recovery
> > >mechanism was sufficient instead of an ack/sack for each
> > >data PDU.
> > >
> > >The SACK mechanism was a post Orlando invention. Without
> > >an ack mechanism (for every data PDU), the SACK mechanism
> > >just imposes additional burden on either end of the session,
> > >without really much benefit.
> >
> > To be fair to data SACK, one could think of an upper bound
> > on the unack'ed data - agreed on at the login time.  While not
> > requiring acks on every PDU, it gives targets the deterministic
> > maximum on the buffer size they have to keep around if they
> > choose to "reliably" support data SACK.  The current answer of
> > "replay buffer size/IO size", IMHO, is simply not attractive.
> > Also to be fair to data SACK, I believe FCP-2 allows sequence-level
> > error recovery in an I/O.
> >
> > However, I think that it's extremely useful to include a discussion
> > in the draft of  the TCP checksum "escape" statistics and the
> > device types for which this was considered an absolute requirement
> > to make forward progress at these error rates (like huge tape
> > backups?) - essentially the reasons that convinced Julian to define
> > this mechanism in. That gives credibility and acceptance to this,
> > or alternately may lead to the consensus that data SACK is not required.
> > --
> > Mallikarjun
> >
> >
> > Mallikarjun Chadalapaka
> > Networked Storage Architecture
> > Network Storage Solutions Organization
> > MS 5668 Hewlett-Packard, Roseville.
> > cbm@rose.hp.com
> >
> > >
> > >The benefit of having SACK is of saving bandwidth in case
> > >the data part of the data PDU failed an integrity check
> > >(but passed TCP checksum). This is a rare enough case that
> > >as a percentage, the bandwidth loss from retransmitting
> > >all the data associated with a read or write command is
> > >very very small.
> > >
> > >In addition, it avoids the complexity of restarting
> > >something from the middle, as compared to from the beginning.
> > >
> > >To me it seems that there is significant simplicity (from
> > >implementation, reliability and recovery process) from
> > >having smaller data transfer per command.
> > >
> > >I would really like to get rid of the SACK command.
> > >
> > >Somesh
> > >
> > >> -----Original Message-----
> > >> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > >> julian_satran@il.ibm.com
> > >> Sent: Wednesday, March 28, 2001 6:57 AM
> > >> To: ips@ece.cmu.edu
> > >> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >>
> > >>
> > >>
> > >>
> > >> Mallikarjun,
> > >>
> > >> Last summer I thought that recovery within a connection should be
> > >> left to TCP. It is simple and could be made available through IPsec
> > >> (if no new option of any form can be added).
> > >>
> > >> Two things killed this:
> > >>
> > >>    - The requirement to have a data encapsulation that can pass through
> > >>      application proxies (like a storage router)
> > >>    - The "NO WAY" message we got from IESG-Security on a CRC only IPSec
> > >>      header
> > >>
> > >>
> > >> As for the ACK - I am very much in favor of it (it is a no brainer) and
> > >> implementations are in fact allowed to drop even unacked data.
> > >>
> > >> I am bound by the Orlando meeting decision to drop it. Except for the
> > >> regular "oppose everything" crowd, the two vocal opponents were Somesh
> > >> Gupta and Matt Wakeley.
> > >>
> > >> David may or may not want to re-open the issue - I am not going to
> > >> ask for it.
> > >>
> > >> Regards,
> > >> Julo
> > >>
> > >> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
> > >>
> > >> Please respond to cbm@rose.hp.com
> > >>
> > >> To:   Black_David@emc.com
> > >> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com, someshg@yahoo.com,
> > >>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
> > >>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
> > >> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >>
> > >>
> > >>
> > >>
> > >> David and Julian,
> > >>
> > >> I appreciate both your views, and should I say that they're
> > >> along predicted lines :-)
> > >>
> > >> - David's right in saying that the situation is akin to FC's.
> > >>   However, I would like to point out that FC is an unreliable
> > >>   transport, and hence is forced to pick up a lot of the transport
> > >>   baggage (at least in FCP-2, as I understand), in addition
> > >>   to being a SCSI encapsulation layer.  Unfortunately, even with
> > >>   TCP being the "reliable" transport, iSCSI is going along the
> > >>   same lines - ie. transport baggage + SCSI encapsulation.  My
> > >>   point is - if this is indeed a necessary evil, why don't we
> > >>   complete iSCSI's transport functionality by data-ACKs?
> > >>
> > >> - If data SACK is introduced mostly to make up for TCP's shortcomings,
> > >>   we're making its usage (and implementation) drastically less
> > >>   appealing, since the only way error recovery algorithms can *rely* on
> > >>   data SACK is when replay is supported (or, "ReplaySupport=yes" in my
> > >>   proposal), which is extremely expensive.  IOW, we're defining data
> > >>   SACK in the draft and not providing any incentives to implement and
> > >>   use it!
> > >>
> > >> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
> > >>   protocol in its definition so far (and I believe, rightly so -
> > >>   mandating command ordering, bi-di support, SCSI CRN support to name
> > >>   a few examples), the perfectly SCSI-legal R/W interactions that
> > >>   break in other transports *do not* have to break in iSCSI.
> > >>
> > >> - A last idea (may seem radical at this point) in regards to iSCSI
> > >>   being a "full transport". This provides us an opportunity to "cast
> > >>   off" the transport baggage in future when we truly move to a
> > >>   "reliable" transport (perhaps TCP with CRCs/SCTP ?) - if we do a
> > >>   good job of keeping the encapsulation stuff separate from the
> > >>   transport stuff.  (Julian, I heard from Randy that ideas similar to
> > >>   this were explored in your Haifa meeting.  And yes, he recalls they
> > >>   were given up since TCP was supposed to be reliable and granularity
> > >>   of recovery was deemed one I/O.)
> > >>
> > >> With that said, may I request David (with his co-chair hat on, :-))
> > >> to add some binding comments/observations on this discussion?
> > >>
> > >> If we decide to leave data SACKs as unattractive to implement, the
> > >> draft should at the least add a statement like - "Note that satisfying
> > >> all possible data SACK requests for a task with an unacknowledged
> > >> status implies implementing the I/O replay buffer on the part of
> > >> targets."
> > >> --
> > >> Mallikarjun
> > >>
> > >>
> > >> Mallikarjun Chadalapaka
> > >> Networked Storage Architecture
> > >> Network Storage Solutions Organization
> > >> MS 5668   Hewlett-Packard, Roseville.
> > >> cbm@rose.hp.com
> > >>
> > >>
> > >>
> > >>
> > >> >I think Julian's basically right -- I would point
> > >> >out that any case of write after read that breaks
> > >> >over iSCSI will also break over Fibre Channel.
> > >> >On FC, the scenario starts with a frame CRC failure
> > >> >on read data at the Initiator, so applications
> > >> >have to cope and typically do so by enforcing
> > >> >ordering at the app rather than using SCSI task
> > >> >ordering.
> > >> >
> > >> >While SCSI has clever tools like ACA and task
> > >> >ordering that appear to allow dependent operations
> > >> >to be sent to the target concurrently, in practice
> > >> >they don't work and/or aren't used (funny thing,
> > >> >those two reinforce each other ;-) ).  Hence
> > >> >a minimal approach to them is in order:
> > >> >- Make sure the result will interoperate.
> > >> >- Make sure T10 doesn't ding us for leaving something
> > >> >    completely out.
> > >> >- Don't specify anything not needed for the above.
> > >> >
> > >> >My 0.02,
> > >> >--David
> > >> >
> > >> >> -----Original Message-----
> > >> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> > >> >> Sent:  Tuesday, March 27, 2001 9:23 AM
> > >> >> To:    cbm@rose.hp.com
> > >> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
> > >> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
> > >> >> Black_David@emc.com
> > >> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >> >>
> > >> >>
> > >> >>
> > >> >> Mallikarjun,
> > >> >>
> > >> >> I commiserate with you at the lack of ack for data but the Orlando
> > >> >> meeting stated - no.  Recall that I kept the number only as a
> > >> >> mechanism to detect missing packets.
> > >> >>
> > >> >> You can achieve the effect you want by keeping around data for a
> > >> >> while (you determine how long and then discard).
> > >> >>
> > >> >> If a SACK comes and you can recover - fine. If not you either
> > >> >> reaccess the media (if you know how) or reject
> > >> >> and let the initiator retry.
> > >> >>
> > >> >> You should not worry about R/W conflicts as programs bound to have
> > >> >> such conflicts either:
> > >> >>
> > >> >> 1) can live with them or
> > >> >> 2) protect themselves through some locks and rely on
> > >> >>    "operation-end-status" to keep results deterministic.
> > >> >>
> > >> >> Regards,
> > >> >> Julo
> > >> >>
> > >> >>
> > >> >>
> > >> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
> > >> >>
> > >> >> Please respond to cbm@rose.hp.com
> > >> >>
> > >> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
> > >> >>       Julian Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> > >> >> cc:   Black_David@emc.com
> > >> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Hi Error Recovery Team,
> > >> >>
> > >> >> iSCSI can discard PDUs because of digest errors and request
> > >> >> retransmissions using the iSCSI data SACK.  To deal with such
> > >> >> an eventuality, targets that want to support data SACK have
> > >> >> the following options:
> > >> >>
> > >> >> (A) maintain a complete "replay" buffer for the entire I/O since
> > >> >>   a SACK could come anytime before the status is ack'ed by the
> > >> >>   initiator. [simple, but extremely expensive in memory resources]
> > >> >>
> > >> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
> > >> >>   This enables keeping only those I/O buffers that haven't been
> > >> >>   ack'ed by the initiator. IOW, become a real full transport!
> > >> >>   [everyone disliked it earlier...]
> > >> >>
> > >> >> (C) re-access the medium for data retransmission requests.  Now
> > >> >>   there are 3 sub-cases in this to handle the changed data on the
> > >> >>   medium in a write-after-read scenario.  (SEE NOTE.1 at the bottom
> > >> >>   on how it is legal.)
> > >> >>      (1) On seeing any write, stall till status is ack'ed for all
> > >> >>          the previous reads (basically drain the pipe).  [simple,
> > >> >>          but incurs an additional roundtrip delay for all writes]
> > >> >>      (2) A variation of the above, keep an eye only on the prior
> > >> >>          overlapping reads. [more BW efficient, but complicated to
> > >> >>          resolve the block dependencies in a stream of reads
> > >> >>          followed by writes]
> > >> >>      (3) Document the caveat and leave it up to the applications
> > >> >>          to avoid this case since this leads to data integrity
> > >> >>          issues.  [pushing to apps since the transport can't get
> > >> >>          it right!]
> > >> >>
> > >> >> My first preference is (B), followed by (A), and I suggest we not go
> > >> >> to (C) at all with its inherent dangers.
> > >> >>
> > >> >> Doing (B) naturally completes the transport job that iSCSI has taken
> > >> >> on itself in view of TCP's claimed unreliable checksum.  That is the
> > >> >> right thing to do architecturally instead of being a
> > >> >> "semi-transport"!
> > >> >>
> > >> >> Comments?
> > >> >> --
> > >> >> Mallikarjun
> > >> >>
> > >> >>
> > >> >> Mallikarjun Chadalapaka
> > >> >> Networked Storage Architecture
> > >> >> Network Storage Solutions Organization
> > >> >> MS 5668   Hewlett-Packard, Roseville.
> > >> >> cbm@rose.hp.com
> > >> >>
> > >> >>
> > >> >> __________________________________________________________________________
> > >> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly
> > >> >>         legal if SCSI sets the ORDERED task attribute on both the
> > >> >>         commands AND sets the NACA bit to one to indicate that Write
> > >> >>         shall be executed only if the Read did not fail (result in a
> > >> >>         Check Condition).
> > >> >>
> > >> >>         In the current case, since Read completed just fine from
> > >> >>         SCSI's point of view, SCSI is moving on to execute Write.
> > >> >>         Those read buffers had been freed up since iSCSI received an
> > >> >>         ACK at the TCP level, and since iSCSI has no other way to
> > >> >>         have the data ack'ed!
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >
> > >>
> > >>
> > >>
> > >>
> > >
> > >
> > >_________________________________________________________
> > >Do You Yahoo!?
> > >Get your free @yahoo.com address at http://mail.yahoo.com
> > >
> > >
> >
>
>
>
>
>





From owner-ips@ece.cmu.edu  Wed Apr  4 03:20:21 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA10636
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 03:20:20 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f340cnf09632
	for ips-outgoing; Tue, 3 Apr 2001 20:38:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f340cMr09612
	for <ips@ece.cmu.edu>; Tue, 3 Apr 2001 20:38:22 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f341kX091694;
	Tue, 3 Apr 2001 18:46:34 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>, <CBinford@pirus.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple connections.
Date: Tue, 3 Apr 2001 17:36:27 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJOEPFCFAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
In-Reply-To: <0F31E5C394DAD311B60C00E029101A07080153A9@corpmx9.isus.emc.com>
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

Sandeep missed a point about serial math: you have a window that
rotates with respect to prior commands based on the magnitude of the
difference.  There is no need to maintain any state other than the sequence
number of the flagged command; pending commands that precede it are rejected.
Obviously, before this window rotates by more than 2 billion PDUs, this prior
value will need to be retired.  This is not a difficult or high-overhead
operation with respect to rejecting prior commands.  The sequencer would not
make any decisions regarding the content of any rejected PDU.  You
should still want to purge PDUs waiting in a queue to be sent to the
target should an "immediate" command be flagged.  Your concept creates an
odd event with both sequential and non-sequential delivery of a task
management command.  You are then left with a time interval in which a
non-sequential command reception must modify behavior while waiting for a
possible counterpart.  By causing all pending PDUs to be rejected immediately,
there is no waiting for status information or any further activity to occur.
You would see reject-reject-status.  If the initiator needs these rejected
commands replayed, that becomes an option of the initiator.
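
The window can be sketched as follows.  This is only an illustration, not
text from the draft: it assumes 32-bit command sequence numbers compared in
the usual serial-number style, and the names serial_lt and split_pending are
hypothetical.  The only state kept is the sequence number of the flagged
command.

```python
SERIAL_BITS = 32                 # assumed width of the sequence number
MOD = 1 << SERIAL_BITS
HALF = 1 << (SERIAL_BITS - 1)    # about 2 billion: the usable window


def serial_lt(a, b):
    """True if a precedes b in modular (serial number) arithmetic."""
    diff = (b - a) % MOD
    return 0 < diff < HALF


def split_pending(pending, flagged_sn):
    """Split pending sequence numbers into (rejected, kept).

    Commands that precede the flagged command are rejected without
    looking at their content; everything else is kept.
    """
    rejected = [sn for sn in pending if serial_lt(sn, flagged_sn)]
    kept = [sn for sn in pending if not serial_lt(sn, flagged_sn)]
    return rejected, kept


# The comparison still works across the 32-bit wrap point.
rejected, kept = split_pending([0xFFFFFFFE, 0xFFFFFFFF, 1, 5], flagged_sn=2)
```

Because the comparison is only meaningful within half the number space, the
flagged value has to be retired before the window rotates that far.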

Doug

> > I would state this much stronger.  Applications had better not have to know
> > that it is iSCSI underneath vs. FCP or parallel SCSI else I believe we
> > missed the objective (granted, some things such as target address space are
> > unavoidably different, but I believe task management functions should be the
> > same).  The transport needs to handle the transport issues without exposing
> > quirks to the SCSI or application layer.
>
> Unfortunately, I think we have an impossible situation.  It appears to me that
> we have to pick at most two of the following three goals, as I have yet to see
> any way to achieve all three for a single task management command on a
> multiple connection session:
>
> (1) The command takes effect immediately and its status/response
> 	is available immediately.
> (2) The command affects all commands in flight, and its status/response
> 	is delayed until all such effects are complete.
> (3) There is no significant visible departure from existing SCSI task
> 	management behavior.
>
> The problem is that trying to do both (1) and (2) either requires SCSI to
> "execute" the task management command twice or requires that iSCSI do
> some task management (e.g., on the in-flight commands) on SCSI's behalf
> (or worse like having SCSI prolong the execution of the task management
> command until everything in flight in iSCSI arrives).  All of these appear
> to lead to problems with (3) in one form or another - two executions
> result in two SCSI status/responses that have to be merged, and iSCSI
> task management will sooner or later do something different from SCSI
> (e.g., I sincerely doubt that a Target in a bridge will ever get this 100%
> identical to the devices that are being bridged).
>
> The current iSCSI draft provides the choice of  [(1)] XOR [(2), (3)];
> the reason for not getting (3) with (1) is the possibility of the task
> management command bypassing commands that it's supposed to
> affect.  Charles' original proposal is [(2), (3)] because it has to time out
> a stuck connection before executing the command, and is roughly
> equivalent to sending the command for ordered delivery and having
> the implementation treat any queue between iSCSI and SCSI as
> being on the SCSI side of the line.  Doug Otis's counter-proposal
> falls into the category of iSCSI doing task management on SCSI's
> behalf and provides an example of how this results in visible changes
> in behavior -- for the CLEAR ACA task management command,
> aborting all tasks that are queued or in flight is generally incorrect.
>
> I would note that this issue does not arise on single connection sessions,
> because sending the command for immediate delivery plus some care not
> to reorder things in the iSCSI Target (i.e., consider the iSCSI to SCSI queue
> to be in "SCSI" and hence subject to the task management command)
> obtains all of (1) through (3).
>
> Going out on a limb, I suspect applications will generally want [(2), (3)]
> -- send for ordered delivery and wait for the dust to settle because that
> provides the best odds of having some weird device get into a known
> state from which further progress is possible.  This allows the application
> to not know whether parallel SCSI, FCP or iSCSI is underneath and
> relies on other iSCSI recovery procedures to make sure that the task
> management command is delivered and executed (e.g., unstick and/or
> close "stuck" connections).  There will be cases in which (1) is
> needed (e.g., observe tape robot doing something obviously wrong,
> and get it to stop immediately), but those may involve fairly blunt
> instruments (e.g., LUN RESET) and the need to clean up any collateral
> damage.
>
> Sandeep's proposal to create state in the target either fails to achieve
> (1) [if the response is delayed until the state is removed] or violates SAM2
> [returns the response to the task management command before the task
> management command is complete].  Having state linger after a completed
> LUN or TARGET RESET is almost certainly wrong.
>
> So, I think I'm down to sending task management functions once, usually
> for ordered delivery with the application making the ordered vs. immediate
> delivery choice (and sending the task management function twice if it
> so chooses).  I think apps will generally choose ordered delivery, choosing
> predictable behavior over immediacy concerns.  Aside from a longer
> discussion of this issue, I still don't see the need for additional
> mechanism(s) to task management - what have I missed in the above
> discussion?
>
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
>



From owner-ips@ece.cmu.edu  Wed Apr  4 09:55:48 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id JAA19369
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 09:55:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34BjDL12615
	for ips-outgoing; Wed, 4 Apr 2001 07:45:13 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34BiDr12591
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 07:44:14 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id NAA185862;
	Wed, 4 Apr 2001 13:44:09 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id NAA152780;
	Wed, 4 Apr 2001 13:42:17 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A24.00402215 ; Wed, 4 Apr 2001 13:40:30 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: "Martin, Nick" <Nick.Martin@compaq.com>
cc: ips@ece.cmu.edu
Message-ID: <C1256A24.0038CCD7.00@d12mta02.de.ibm.com>
Date: Wed, 4 Apr 2001 12:23:58 +0200
Subject: Re: Bit position request.
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Martin,

I'll look at it this week and if it does not have any side effect I'll put
it in.

Regards,
Julo

"Martin, Nick" <Nick.Martin@compaq.com> on 03/04/2001 22:26:48

Please respond to "Martin, Nick" <Nick.Martin@compaq.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:
Subject:  Bit position request.




Julian,

I would like to request a minor change in the PDU format of 0x45 SCSI Data
for Read and 0x41 SCSI Response.
My request is that the bits within byte 1 with the same name and function
would have the same bit position.

My suggestion would be:

0x45 F _ _ _ _ O U S
0x41 1 _ _ o u O U S

The 1 in 0x41 is still an "F" bit for Final bit, but must be set to 1 in
this PDU.
The "S" bit for SCSI status present is the one that is defined in two
different locations in draft 05.
The reason for moving the "S" bit to the end is purely esthetic.
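
A sketch of how the suggested layout could be decoded.  It assumes the
leftmost column above is the most significant bit of byte 1 and that "_"
marks bits left unassigned; the mask table and decode_flags are
illustrative names only, not taken from the draft.

```python
# Bit masks for byte 1 under the suggested layout (leftmost column = bit 7).
FLAG_MASKS = {
    0x45: {"F": 0x80, "O": 0x04, "U": 0x02, "S": 0x01},
    0x41: {"F": 0x80, "o": 0x10, "u": 0x08, "O": 0x04, "U": 0x02, "S": 0x01},
}


def decode_flags(opcode, byte1):
    """Return a dict of flag name -> bool for byte 1 of the given PDU type."""
    masks = FLAG_MASKS[opcode]
    flags = {name: bool(byte1 & mask) for name, mask in masks.items()}
    # In 0x41 the top bit is still the Final bit, but it must be set to 1.
    if opcode == 0x41 and not flags["F"]:
        raise ValueError("F bit must be 1 in a 0x41 SCSI Response")
    return flags


# Every flag name shared by both PDU types uses the same bit position.
shared = set(FLAG_MASKS[0x45]) & set(FLAG_MASKS[0x41])
same_position = all(FLAG_MASKS[0x45][n] == FLAG_MASKS[0x41][n] for n in shared)
```

With one position per name, a single mask test per flag serves both PDU
types.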

I hope not to start a huge discussion, but if you feel this requires review
on the mailing list, that is fine also.

Thanks,
Nick









From owner-ips@ece.cmu.edu  Wed Apr  4 09:56:20 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id JAA19421
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 09:56:19 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34C0ER13122
	for ips-outgoing; Wed, 4 Apr 2001 08:00:14 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from saturn.cs.uml.edu (saturn.cs.uml.edu [129.63.8.2])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34BxHr13096
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 07:59:18 -0400 (EDT)
Received: from localhost (localhost [127.0.0.1])
	by saturn.cs.uml.edu (8.11.0/8.11.2) with SMTP id f34BxGS503524
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 07:59:16 -0400 (EDT)
Date: Wed, 4 Apr 2001 07:59:16 -0400 (EDT)
From: Mike Brown <mbrown@cs.uml.edu>
To: ips@ece.cmu.edu
Subject: iSCSI linux implementation 
Message-ID: <Pine.OSF.3.96.1010404074934.506661D-100000@saturn.cs.uml.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by ece.cmu.edu id f34C0Ds13119
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
X-MIME-Autoconverted: from 8bit to quoted-printable by ece.cmu.edu id f34C0ES13122
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by ietf.org id JAA19421

Hello,

Due to recent posts on the linux-net and linux-scsi mailing lists, I felt
obligated to announce the implementation that a few other developers and I
have been working on.  We are writing a target and an initiator
driver.  I hastily threw up a web page describing our project:

http://www.cs.uml.edu/~mbrown/iSCSI/

The page also contains the current sources.  We are still in the middle of
development, and what we have written is based largely on version 3 of the
iSCSI draft, with only a few minor changes toward version 5 (op codes).
Currently our priority is to get something working and then go back and
make it conform to version 5.  Again, this is a WIP and is likely
incorrect in some parts.

Thanks.

-Michael F. Brown, UMass Lowell Computer Science

phone:  (978) 934-5354
email:  mbrown@cs.uml.edu

"I wonder if pawns just realize that they're just pawns 
 in someone's (chess) game."   -L. Fitzgerald Sjöberg



From owner-ips@ece.cmu.edu  Wed Apr  4 09:59:08 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id JAA19581
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 09:59:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34BwEl13066
	for ips-outgoing; Wed, 4 Apr 2001 07:58:14 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e21.nc.us.ibm.com (e21.nc.us.ibm.com [32.97.136.227])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34BvSr13043
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 07:57:29 -0400 (EDT)
Received: from southrelay02.raleigh.ibm.com (southrelay02.raleigh.ibm.com [9.37.3.209])
	by e21.nc.us.ibm.com (8.9.3/8.9.3) with ESMTP id HAA11728
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 07:52:07 -0500
Received: from d04nms25.raleigh.ibm.com (d04nms25.raleigh.ibm.com [9.67.228.6])
	by southrelay02.raleigh.ibm.com (8.11.1/NCO v4.95) with ESMTP id f34BvLE38434
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 07:57:21 -0400
Importance: Normal
Subject: Initiator-detected format or digest errors
To: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OF5DB616D2.4B6004EF-ON85256A24.0040EB25@raleigh.ibm.com>
From: "Thomas McSweeney" <rf42tpme@us.ibm.com>
Date: Wed, 4 Apr 2001 07:57:20 -0400
X-MIMETrack: Serialize by Router on D04NMS25/04/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/04/2001 07:57:21 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Section "2.20 Reject" talks about the target receiving a bad frame and
sending Reject.  What should an Initiator do if it receives a PDU with a
format or digest error?  Should it send Reject?  If so, we'll need to
ensure that the Initiator fields of the MIB include an object to count
Reject commands transmitted.

Tom McSweeney
iSCSI Development, Storage Systems Group, IBM
Email: rf42tpme@us.ibm.com
Phone: (USA) 919-254-5634  (tie line: 444-5634)
Fax:   (USA) 919-254-0391  (tie line: 444-0391)



From owner-ips@ece.cmu.edu  Wed Apr  4 13:39:58 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA27009
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 13:39:57 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34FWLm26565
	for ips-outgoing; Wed, 4 Apr 2001 11:32:21 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from chmls05.mediaone.net (chmls05.mediaone.net [24.147.1.143])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34FVrr26540
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 11:31:53 -0400 (EDT)
Received: from breinhold ([65.192.191.232])
	by chmls05.mediaone.net (8.11.1/8.11.1) with SMTP id f34FVlx28909;
	Wed, 4 Apr 2001 11:31:47 -0400 (EDT)
From: "Barry Reinhold" <bbrtrebia@mediaone.net>
To: "Julian_Satran@Il. Ibm. Com" <julian_satran@il.ibm.com>
Cc: "ISCSI" <ips@ece.cmu.edu>
Subject: ISCSI: Detail on counting offset for fixed markers
Date: Wed, 4 Apr 2001 11:29:59 -0400
Message-ID: <BJEIKPAFDFPFNCPPBCGPCEOICEAA.bbrtrebia@mediaone.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,
	A mundane clarification that needs to go into the specification relative to
Appendix C (Fixed Markers). It currently reads:

"The Marker indicates the offset to the next iSCSI message header.
   The Marker is eight bytes in length, and contains two 32-bit offset
   fields that indicate how many bytes to skip in the TCP stream in
   order to find the next iSCSI message header.  The marker uses two
   copies of the pointer so that a marker spanning a TCP packet boundary
   will leave at least one valid copy in one of the packets."

Since we are counting bytes, does the offset start after the 4 bytes that
make up this copy of the pointer (excludes pointer), or does it start with the
first byte of the pointer (includes pointer)?

A third option is that it starts after both copies of the pointers so that
the two values are the same....
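
To make the three readings concrete, here is a small sketch of the values a
sender would write into the two 32-bit fields under each one.  The mode
names, the function marker_values, and the example stream positions are all
made up for illustration; none of this is from Appendix C.

```python
COPY_LEN = 4                  # each copy of the pointer is 32 bits
MARKER_LEN = 2 * COPY_LEN     # the marker is eight bytes in total


def marker_values(marker_start, next_header, mode):
    """Return the (first, second) offset fields for one marker.

    marker_start and next_header are byte positions in the TCP stream.
    """
    values = []
    for i in range(2):
        copy_start = marker_start + i * COPY_LEN
        if mode == "excludes_pointer":     # count from the byte after this copy
            base = copy_start + COPY_LEN
        elif mode == "includes_pointer":   # count from the first byte of this copy
            base = copy_start
        elif mode == "after_marker":       # count from the byte after both copies
            base = marker_start + MARKER_LEN
        else:
            raise ValueError(mode)
        values.append(next_header - base)
    return tuple(values)


# Hypothetical positions: marker at byte 8192, next header at byte 8300.
for mode in ("excludes_pointer", "includes_pointer", "after_marker"):
    print(mode, marker_values(8192, 8300, mode))
```

Only the third reading yields two identical fields; under the first two the
copies differ by four.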




Barry Reinhold
Principal Architect
Trebia Networks
barry.reinhold@trebia.com
603-868-5144/603-659-0885/978-929-0830 x138



From owner-ips@ece.cmu.edu  Wed Apr  4 13:41:02 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA27044
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 13:41:00 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34FUYm26438
	for ips-outgoing; Wed, 4 Apr 2001 11:30:34 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34FTYr26336
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 11:29:35 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id RAA116462
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:29:28 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id RAA23060
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:27:36 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A24.0054D00C ; Wed, 4 Apr 2001 17:26:22 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A24.0054CDDD.00@d12mta02.de.ibm.com>
Date: Wed, 4 Apr 2001 17:02:11 +0200
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Sorry - I meant to say many commands with small data amounts.

Julo

"Somesh Gupta" <someshg@yahoo.com> on 04/04/2001 01:31:23

Please respond to someshg@yahoo.com

To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"




Julian,

Which part of my note were you raising a concern about?

Somesh

> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> julian_satran@il.ibm.com
> Sent: Monday, April 02, 2001 6:25 PM
> To: ips@ece.cmu.edu
> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
> Somesh,
>
> That will certainly result in poor performance for important applications
> even with hardware implementations of iSCSI - mainly due to the large SCSI
> command traffic and associated interrupts.
>
> Julo
>
> "Somesh Gupta" <someshg@yahoo.com> on 02/04/2001 22:23:25
>
> Please respond to someshg@yahoo.com
>
> To:   cbm@rose.hp.com, ips@ece.cmu.edu
> cc:
> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
> To beat a dead horse ..
>
> One has to really decide fundamentally whether
>
> 1. Commands are used to transfer very large amounts of
>    data (multiple data PDUs are needed)
> 2. Commands are used to transfer relatively small amounts
>    of data (few/about one data PDU) and multiple commands
>    are then used to do long transfers
>
> (Orlando consensus was #2)
>
> If we assume the first model, then we really should have
> a sequence # and acknowledgement of every PDU - not just
> data PDUs. In this case, it is important to fill holes
> in the iSCSI stream. We can have a "super-transport" as
> Mallikarjun suggested between the iSCSI protocol layer
> and the TCP layer that provides the various "transport"
> like features we seem to want.
>
> If we assume the second model, we assume that recovery at
> the command level is sufficient. In this case it is important
> to have whatever mechanisms are (including data seq #s) needed
> to detect that a command will not succeed without recovery
> at the command level. However, recovery is needed only
> at the command level.
>
> I would let the current application model decide the features
> in "version 1" of the iSCSI protocol.
>
> Somesh
>
> > -----Original Message-----
> > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > Mallikarjun C.
> > Sent: Monday, April 02, 2001 10:34 AM
> > To: ips@ece.cmu.edu
> > Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> >
> > >Sorry to have been missing for a while. Hope you will
> > >appreciate my being back in action :-). It was a fairly
> > >clear consensus in Orlando that applications broke up
> > >their transfers into reasonably small chunks i.e. they
> > >did not have very long running transfers.
> > >
> > >Therefore the consensus was that a command level recovery
> > >mechanism was sufficient instead of an ack/sack for each
> > >data PDU.
> > >
> > >The SACK mechanism was a post Orlando invention. Without
> > >an ack mechanism (for every data PDU), the SACK mechanism
> > >just imposes additional burden on either end of the session,
> > >without really much benefit.
> >
> > To be fair to data SACK, one could think of an upper bound
> > on the unack'ed data - agreed on at the login time.  While not
> > requiring acks on every PDU, it gives targets the deterministic
> > maximum on the buffer size they have to keep around if they
> > choose to "reliably" support data SACK.  The current answer of
> > "replay buffer size/IO size", IMHO, is simply not attractive.
> > Also to be fair to data SACK, I believe FCP-2 allows sequence-level
> > error recovery in an I/O.
> >
> > However, I think that it's extremely useful to include a discussion
> > in the draft of the TCP checksum "escape" statistics and the
> > device types for which this was considered an absolute requirement
> > to make forward progress at these error rates (like huge tape
> > backups?) - essentially the reasons that convinced Julian to define
> > this mechanism in. That gives credibility and acceptance to this,
> > or alternately may lead to the consensus that data SACK is not required.
> > --
> > Mallikarjun
> >
> >
> > Mallikarjun Chadalapaka
> > Networked Storage Architecture
> > Network Storage Solutions Organization
> > MS 5668 Hewlett-Packard, Roseville.
> > cbm@rose.hp.com
> >
> > >
> > >The benefit of having SACK is of saving bandwidth in case
> > >the data part of the data PDU failed an integrity check
> > >(but passed TCP checksum). This is a rare enough case that
> > >as a percentage, the bandwidth loss from retransmitting
> > >all the data associated with a read or write command is
> > >very very small.
> > >
> > >In addition, it avoids the complexity of restarting
> > >something from the middle, as compared to from the beginning.
> > >
> > >To me it seems that there is significant simplicity (from
> > >implementation, reliability and recovery process) from
> > >having smaller data transfer per command.
> > >
> > >I would really like to get rid of the SACK command.
> > >
> > >Somesh
> > >
> > >> -----Original Message-----
> > >> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > >> julian_satran@il.ibm.com
> > >> Sent: Wednesday, March 28, 2001 6:57 AM
> > >> To: ips@ece.cmu.edu
> > >> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >>
> > >>
> > >>
> > >>
> > >> Mallikarjun,
> > >>
> > >> Last summer I thought that recovery within a connection should be left to
> > >> TCP. It is simple and could be made available through IPsec (if no new
> > >> option of any form can be added).
> > >>
> > >> Two things killed this:
> > >>
> > >>    The requirement to have a data encapsulation that can pass through
> > >>    application proxies (like a storage router)
> > >>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
> > >>    header
> > >>
> > >>
> > >> As for the ACK - I am very much in favor of it (it is a no brainer) and
> > >> implementations are in fact allowed to drop even unacked data.
> > >>
> > >> I am bound by the Orlando meeting decision to drop it. Except the regular
> > >> "oppose everything" crowd the two vocal opponents were Somesh Gupta and
> > >> Matt Wakeley.
> > >>
> > >> David may want or not to re-open the issue - I am not going to ask for it.
> > >>
> > >> Regards,
> > >> Julo
> > >>
> > >> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
> > >>
> > >> Please respond to cbm@rose.hp.com
> > >>
> > >> To:   Black_David@emc.com
> > >> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com, someshg@yahoo.com,
> > >>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
> > >>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
> > >> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >>
> > >>
> > >>
> > >>
> > >> David and Julian,
> > >>
> > >> I appreciate both your views, and should I say that they're
> > >> along predicted lines :-)
> > >>
> > >> - David's right in saying that the situation is akin to FC's.
> > >>   However, I would like to point out that FC is an unreliable
> > >>   transport, and hence is forced to pick up a lot of the transport
> > >>   baggage (at least in FCP-2, as I understand), in addition
> > >>   to being a SCSI encapsulation layer.  Unfortunately, even with
> > >>   TCP being the "reliable" transport, iSCSI is going along the
> > >>   same lines - ie. transport baggage + SCSI encapsulation.  My
> > >>   point is - if this is indeed a necessary evil, why don't we
> > >>   complete iSCSI's transport functionality by data-ACKs?
> > >>
> > >> - If data SACK is introduced mostly to make up for TCP's shortcomings,
> > >>   we're making its usage (and implementation) drastically less appealing
> > >>   since the only way error recovery algorithms can *rely* on data SACK
> > >>   is when replay is supported (or, "ReplaySupport=yes"  in my proposal),
> > >>   which is extremely expensive.  IOW, we're defining data SACK in the
> > >>   draft and not providing any incentives to implement and use it!
> > >>
> > >> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
> > >>   protocol in its definition so far (and I believe, rightly so - mandating
> > >>   command ordering, bi-di support, SCSI CRN support to name a few examples),
> > >>   the perfectly SCSI-legal R/W interactions that break in other transports
> > >>   *do not* have to break in iSCSI.
> > >>
> > >> - A last idea (may seem radical at this point) in regards to iSCSI
> > >>   being a "full transport". This provides us an opportunity to "cast
> > >>   off" the transport baggage in future when we truly move to a "reliable"
> > >>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
> > >>   keeping the encapsulation stuff separate from the transport stuff.
> > >>   (Julian, I heard from Randy that ideas similar to this were explored
> > >>   in your Haifa meeting.  And yes, he recalls they were given up since
> > >>   TCP was supposed to be reliable and granularity of recovery was deemed
> > >>   one I/O.)
> > >>
> > >> With that said, may I request David (with his co-chair hat on, :-))
> > >> to add some binding comments/observations on this discussion?
> > >>
> > >> If we decide to leave data SACKs as unattractive to implement, the draft
> > >> should in the least add a statement like - "Note that satisfying all
> > >> possible data SACK requests for a task with an unacknowledged status
> > >> implies implementing the I/O replay buffer on the part of targets."
> > >> --
> > >> Mallikarjun
> > >>
> > >>
> > >> Mallikarjun Chadalapaka
> > >> Networked Storage Architecture
> > >> Network Storage Solutions Organization
> > >> MS 5668   Hewlett-Packard, Roseville.
> > >> cbm@rose.hp.com
> > >>
> > >>
> > >>
> > >>
> > >> >I think Julian's basically right -- I would point
> > >> >out that any case of write after read that breaks
> > >> >over iSCSI will also break over Fibre Channel.
> > >> >On FC, the scenario starts with a frame CRC failure
> > >> >on read data at the Initiator, so applications
> > >> >have to cope and typically do so by enforcing
> > >> >ordering at the app rather than using SCSI task
> > >> >ordering.
> > >> >
> > >> >While SCSI has clever tools like ACA and task
> > >> >ordering that appear to allow dependent operations
> > >> >to be sent to the target concurrently, in practice
> > >> >they don't work and/or aren't used (funny thing,
> > >> >those two reinforce each other ;-) ).  Hence
> > >> >a minimal approach to them is in order:
> > >> >- Make sure the result will interoperate.
> > >> >- Make sure T10 doesn't ding us for leaving something
> > >> >    completely out.
> > >> >- Don't specify anything not needed for the above.
> > >> >
> > >> >My 0.02,
> > >> >--David
> > >> >
> > >> >> -----Original Message-----
> > >> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> > >> >> Sent:  Tuesday, March 27, 2001 9:23 AM
> > >> >> To:    cbm@rose.hp.com
> > >> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
> > >> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
> > >> >> Black_David@emc.com
> > >> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >> >>
> > >> >>
> > >> >>
> > >> >> Mallikarjun,
> > >> >>
> > >> >> I commiserate with you at the lack of ack for data but the Orlando
> > >> >> meeting stated - no.  Recall that I kept the number only as a
> > >> >> mechanism to detect missing packets.
> > >> >>
> > >> >> You can achieve the effect you want by keeping around data for a while
> > >> >> (you determine how long and then discard).
> > >> >>
> > >> >> If a SACK comes and you can recover - fine. If not you either reaccess
> > >> >> the media (if you know how) or reject
> > >> >> and let the initiator retry.
> > >> >>
> > >> >> You should not worry about R/W conflicts as programs bound to have such
> > >> >> conflicts either:
> > >> >>
> > >> >> 1)can live with them or
> > >> >> 2)protect themselves through some locks and rely on "operation-end-status"
> > >> >> to keep results deterministic.
> > >> >>
> > >> >> Regards,
> > >> >> Julo
> > >> >>
> > >> >>
> > >> >>
> > >> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
> > >> >>
> > >> >> Please respond to cbm@rose.hp.com
> > >> >>
> > >> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
> > >> >>       Julian Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> > >> >> cc:   Black_David@emc.com
> > >> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Hi Error Recovery Team,
> > >> >>
> > >> >> iSCSI can discard PDUs because of digest errors and request
> > >> >> retransmissions using the iSCSI data SACK.  To deal with such
> > >> >> an eventuality, targets that want to support data SACK have
> > >> >> the following options:
> > >> >>
> > >> >> (A) maintain a complete "replay" buffer for the entire I/O since
> > >> >>   a SACK could come anytime before the status is ack'ed by the
> > >> >>   initiator. [ simple, but extremely expensive in memory resources]
> > >> >>
> > >> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
> > >> >>   This enables keeping only those I/O buffers that haven't been ack'ed
> > >> >>   by the initiator. IOW, become a real full transport! [ everyone
> > >> >>   disliked it earlier...]
> > >> >>
> > >> >> (C) re-access the medium for data retransmission requests.  Now there
> > >> >>   are 3 sub-cases in this to handle the changed data on the medium in a
> > >> >>   write-after-read scenario.  (SEE NOTE.1 at the bottom on how it is
> > >> >>   legal.)
> > >> >>      (1) On seeing any write, stall till status is ack'ed for all the
> > >> >>          previous reads (basically drain the pipe). [simple, but incurs
> > >> >>          an additional roundtrip delay for all writes].
> > >> >>      (2) A variation of the above, keep an eye only on the prior
> > >> >>          overlapping reads. [more BW efficient, but complicated to
> > >> >>          resolve the block dependencies in a stream of reads followed
> > >> >>          by writes]
> > >> >>      (3) Document the caveat and leave it up to the applications
> > >> >>          to avoid this case since this leads to data integrity issues.
> > >> >>          [pushing to apps since the transport can't get it right!]
> > >> >>
> > >> >> My first preference is (B), followed by (A), and I suggest we not go
> > >> >> to (C) at all with its inherent dangers.
> > >> >>
> > >> >> Doing (B) naturally completes the transport job that iSCSI has taken
> > >> >> on itself in view of TCP's claimed unreliable checksum.  That is the
> > >> >> right thing to do architecturally instead of being a "semi-transport"!
> > >> >>
> > >> >> Comments?
> > >> >> --
> > >> >> Mallikarjun
> > >> >>
> > >> >>
> > >> >> Mallikarjun Chadalapaka
> > >> >> Networked Storage Architecture
> > >> >> Network Storage Solutions Organization
> > >> >> MS 5668   Hewlett-Packard, Roseville.
> > >> >> cbm@rose.hp.com
> > >> >>
> > >> >>
> > >> >> __________________________________________________________________________
> > >> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly
> > >> >>         legal if SCSI sets the ORDERED task attribute on both the
> > >> >>         commands AND sets the NACA bit to one to indicate that Write
> > >> >>         shall be executed only if the Read did not fail (result in a
> > >> >>         Check Condition).
> > >> >>
> > >> >>         In the current case, since Read completed just fine from SCSI's
> > >> >>         point of view, SCSI is moving on to execute Write.  Those read
> > >> >>         buffers had been freed up since iSCSI received an ACK at the
> > >> >>         TCP level, and since iSCSI has no other way to have the data
> > >> >>         ack'ed!
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >
> > >>
> > >>
> > >>
> > >>
> > >
> > >
> > >
> > >
> >
>
>
>
>
>








From owner-ips@ece.cmu.edu  Wed Apr  4 15:25:36 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA29458
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 15:25:35 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34HUOq04529
	for ips-outgoing; Wed, 4 Apr 2001 13:30:24 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dirty.research.bell-labs.com (dirty.research.bell-labs.com [204.178.16.6])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f34HU5r04495
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 13:30:05 -0400 (EDT)
Received: from scummy.research.bell-labs.com ([135.104.2.10]) by dirty; Wed Apr  4 13:28:40 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by scummy; Wed Apr  4 13:28:40 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id NAA05705;
	Wed, 4 Apr 2001 13:28:36 -0400 (EDT)
Message-ID: <3ACB59C4.BCD5D379@research.bell-labs.com>
Date: Wed, 04 Apr 2001 13:28:36 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Douglas Otis <dotis@sanlight.net>
CC: Black_David@emc.com, ips@ece.cmu.edu
Subject: Re: iSCSI: Out Of Sequence due to null sequence with multiple con 
 nections.
References: <NEBBJGDMMLHHCIKHGBEJOEPFCFAA.dotis@sanlight.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit


Doug,

Thanks.  If you (or anyone) could correct the pseudo-code below to 
illustrate your solution, it might help achieve quicker consensus 
and avoid some discussion.

I see what I missed, in addition to Julian's point about the 
refTaskTag usage preventing ITT reuse.  But don't you still need 
the cmdSN of the original task to find out if the task_mgmt command 
is early or late?
(This assumes you are still sending the task_mgmt command with 
immediate delivery.)

**Event=task_mgmt at initiator:
    purge PDUs in queue at initiator
    send task_mgmt to target (cmdSN=0)

**Event=task_mgmt at target:
    compare refCmdSN with executing <min,max>CmdSN queue
    if (refCmdSN < minCmdSN)
        /*task_mgmt cmd is early */
        must wait & drop the orig_task PDU when it arrives 
    else if (refCmdSN > maxCmdSN)
        /*task_mgmt cmd is late, original task has completed at target*/
        return task_response (response code=Task was not in task set)
    else
        /*task is executing*/
        give task_mgmt command to SCSI layer
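
A minimal sketch of the window test above in Python, assuming CmdSN is a
32-bit counter compared with serial-number arithmetic so that the test
keeps working when CmdSN wraps.  The names serial_lt and classify_ref are
hypothetical and not taken from the draft; how each position maps to
"early", "late" or "executing" is left to the caller.

```python
MOD = 1 << 32   # assumption: CmdSN is a 32-bit counter
HALF = 1 << 31  # comparisons are only meaningful within 2**31 of each other


def serial_lt(a: int, b: int) -> bool:
    """True if a precedes b in 32-bit serial-number order."""
    diff = (b - a) % MOD
    return 0 < diff < HALF


def classify_ref(ref: int, min_sn: int, max_sn: int) -> str:
    """Place ref relative to the executing window [min_sn, max_sn]."""
    if serial_lt(ref, min_sn):
        return "behind_window"
    if serial_lt(max_sn, ref):
        return "ahead_of_window"
    return "in_window"


# A window that straddles the wrap point: 0xFFFFFFFE .. 0x00000003
print(classify_ref(0xFFFFFFF0, 0xFFFFFFFE, 3))  # behind_window
print(classify_ref(1, 0xFFFFFFFE, 3))           # in_window
print(classify_ref(9, 0xFFFFFFFE, 3))           # ahead_of_window
```

A plain "<" on the raw integers would misplace the second case, since 1 is
numerically smaller than 0xFFFFFFFE even though it follows it in sequence.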


-Sandeep 


Douglas Otis wrote:
> 
> David,
> 
> Sandeep missed a point found within serial math, you have a window that
> rotates with respect to prior commands based on the magnitude of the
> difference.  There is no need to maintain any state other than the sequence
> of the flagged command where prior pending to be sent commands are rejected.
> Obviously before this window rotates more than 2 billion PDUs, this prior
> value will need to be retired.  This is not a difficult or high overhead
> operation with respect to rejecting prior commands.  There would not be any
> decisions within the sequencer regarding content of any rejected PDU.  You
> still should want to purge PDUs waiting in a queue pending to be sent to the
> target should an "immediate" command be flagged.  Your concept creates an
> odd event with both sequential and non-sequential delivery of a task
> management command.  You are then left with a time interval where a
> non-sequential command reception must modify behavior waiting for a possible
> counter-part.  Causing all pending PDUs to be rejected immediately there is
> no waiting for status information or any further activity to occur.  You
> would see reject-reject-status.  If the initiator needs these rejected
> commands replayed, this becomes an option of the initiator.
> 
> Doug
> 
> > > I would state this much stronger.  Applications had better not have to
> > > know that it is iSCSI underneath vs. FCP or parallel SCSI else I believe
> > > we missed the objective (granted, some things such as target address
> > > space are unavoidably different, but I believe task management functions
> > > should be the same).  The transport needs to handle the transport issues
> > > without exposing quirks to the SCSI or application layer.
> >
> > Unfortunately, I think we have an impossible situation.  It appears to me
> > that we have to pick at most two of the following three goals, as I have
> > yet to see any way to achieve all three for a single task management
> > command on a multiple connection session:
> >
> > (1) The command takes effect immediately and its status/response
> >       is available immediately.
> > (2) The command affects all commands in flight, and its status/response
> >       is delayed until all such effects are complete.
> > (3) There is no significant visible departure from existing SCSI task
> >       management behavior.
> >
> > The problem is that trying to do both (1) and (2) either requires SCSI to
> > "execute" the task management command twice or requires that iSCSI do
> > some task management (e.g., on the in-flight commands) on SCSI's behalf
> > (or worse like having SCSI prolong the execution of the task management
> > command until everything in flight in iSCSI arrives).  All of these appear
> > to lead to problems with (3) in one form or another - two executions
> > result in two SCSI status/responses that have to be merged, and iSCSI
> > task management will sooner or later do something different from SCSI
> > (e.g., I sincerely doubt that a Target in a bridge will ever get this 100%
> > identical to the devices that are being bridged).
> >
> > The current iSCSI draft provides the choice of  [(1)] XOR [(2), (3)];
> > the reason for not getting (3) with (1) is the possibility of the task
> > management command bypassing commands that it's supposed to
> > affect.  Charles' original proposal is [(2), (3)] because it has to
> > time out a stuck connection before executing the command, and is roughly
> > equivalent to sending the command for ordered delivery and having
> > the implementation treat any queue between iSCSI and SCSI as
> > being on the SCSI side of the line.  Doug Otis's counter-proposal
> > falls into the category of iSCSI doing task management on SCSI's
> > behalf and provides an example of how this results in visible changes
> > in behavior -- for the CLEAR ACA task management command,
> > aborting all tasks that are queued or in flight is generally incorrect.
> >
> > I would note that this issue does not arise on single connection sessions,
> > because sending the command for immediate delivery plus some care not
> > to reorder things in the iSCSI Target (i.e., consider the iSCSI to SCSI
> > queue to be in "SCSI" and hence subject to the task management command)
> > obtains all of (1) through (3).
> >
> > Going out on a limb, I suspect applications will generally want [(2), (3)]
> > -- send for ordered delivery and wait for the dust to settle because that
> > provides the best odds of having some weird device get into a known
> > state from which further progress is possible.  This allows the
> > application to not know whether parallel SCSI, FCP or iSCSI is underneath
> > and relies on other iSCSI recovery procedures to make sure that the task
> > management command is delivered and executed (e.g., unstick and/or
> > close "stuck" connections).  There will be cases in which (1) is
> > needed (e.g., observe tape robot doing something obviously wrong,
> > and get it to stop immediately), but those may involve fairly blunt
> > instruments (e.g., LUN RESET) and the need to clean up any collateral
> > damage.
> >
> > Sandeep's proposal to create state in the target either fails to achieve
> > (1) [if the response is delayed until the state is removed] or violates
> > SAM2 [returns the response to the task management command before the task
> > management command is complete].  Having state linger after a completed
> > LUN or TARGET RESET is almost certainly wrong.
> >
> > So, I think I'm down to sending task management functions once, usually
> > for ordered delivery with the application making the ordered vs. immediate
> > delivery choice (and sending the task management function twice if it
> > so chooses).  I think apps will generally choose ordered delivery,
> > choosing predictable behavior over immediacy concerns.  Aside from a
> > longer discussion of this issue, I still don't see the need for additional
> > mechanism(s) to task management - what have I missed in the above
> > discussion?
> >
> > --David
> >
> > ---------------------------------------------------
> > David L. Black, Senior Technologist
> > EMC Corporation, 42 South St., Hopkinton, MA  01748
> > +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> > black_david@emc.com       Mobile: +1 (978) 394-7754
> > ---------------------------------------------------
> >
> >


From owner-ips@ece.cmu.edu  Wed Apr  4 15:25:47 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA29469
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 15:25:44 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34I3Oo06807
	for ips-outgoing; Wed, 4 Apr 2001 14:03:24 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34I2Yr06774
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 14:02:34 -0400 (EDT)
Received: from core.rose.hp.com (core.rose.hp.com [15.43.208.100])
	by palrel3.hp.com (Postfix) with ESMTP id 6EF4F13A
	for <ips@ece.cmu.edu>; Wed,  4 Apr 2001 11:02:25 -0700 (PDT)
Received: (from cbm@localhost) by core.rose.hp.com (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id LAA05914 for ips@ece.cmu.edu; Wed, 4 Apr 2001 11:02:13 -0700 (PDT)
Message-Id: <200104041802.LAA05914@core.rose.hp.com>
Subject: Re: iSCSI: synch and steering comments
To: ips@ece.cmu.edu
Date: Wed, 04 Apr 2001 11:02:12 PDT
Reply-To: cbm@rose.hp.com
From: "Mallikarjun C." <cbm@rose.hp.com>
X-Mailer: Elm [revision: 212.4]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian,

>Mallikarjun,
>
>I am not sure about which comment. If it is about synch and steering I
>think that recovery from header digest errors
>should not mandate a synch mechanism.  Some very sophisticated
>implementations may want to take advantage of such a mechanism if it is
>there. As this interaction may be fairly complex and implementation dependent
>we will assume that we will drop the connection in the recovery
>descriptions we will provide.

Sorry, I am not clear on what you meant (keep in mind that I am
not asking to mandate a synch mechanism) -

Are you saying that when synch and steering is not implemented in an iSCSI device:
  a) it can recover from header digest errors only by dropping the connection
     (which the ER algorithms would assume)
OR 
  b) it can recover from header digest errors by "fairly complex and
     implementation dependent" mechanisms which rely on the Length field
     anyway, and try to analyze perhaps several PDUs for achieving the 
     framing synch again?

I am assuming that it would be (a).  I am also requesting that this be made
clear in the draft.

Here's the third comment at the bottom that you missed.  Your comments
would be very helpful.  Thanks!

-It appears to me that at least one Synch and Steering layer must be
 defined/referred to as the minimal implementation in the main draft to
 enable interoperability, when implementations do implement Synch and
Steering.

+++ why ? +++

I may be using "interoperability" in a somewhat unconventional sense here.
While the draft says that Synch and Steering layer is optional, I don't see
that it requires implementations to always support a "no synch & steering"
mode, even when they support one type of Synch and Steering layer.  Given
that there's no mandatory Synch and Steering layer either, I don't see how two
iSCSI boxes that a customer buys are guaranteed to interoperate.  Please
comment if the draft already implies what I am asking for.
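
To make the concern concrete, here is a rough Python sketch, assuming each
box advertises the framing modes it supports in preference order.  The mode
names ("layer-X", "layer-Y", "none") and the function pick_common_mode are
purely hypothetical; the draft defines no such negotiation as far as I can
tell.

```python
from typing import Optional, Sequence


def pick_common_mode(a_modes: Sequence[str],
                     b_modes: Sequence[str]) -> Optional[str]:
    """Return the first mode in a's preference order that b also supports."""
    for mode in a_modes:
        if mode in b_modes:
            return mode
    return None  # no shared mode: the two boxes cannot interoperate


# Two boxes that each implement only their own layer share nothing.
print(pick_common_mode(["layer-X"], ["layer-Y"]))

# If "no synch & steering" is a required fallback, a common mode always exists.
print(pick_common_mode(["layer-X", "none"], ["layer-Y", "none"]))
```

The first call finds no common mode and the second falls back to "none",
which is the guarantee I am asking the draft to state explicitly.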
--
Mallikarjun 


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668	Hewlett-Packard, Roseville.
cbm@rose.hp.com

>
>This is also partly a result of choosing Format 2.
>
>Regards,
>Julo
>
>"Mallikarjun C." <cbm@rose.hp.com> on 02/04/2001 07:14:54
>
>Please respond to cbm@rose.hp.com
>
>To:   ips@ece.cmu.edu
>cc:
>Subject:  Re: iSCSI: synch and steering comments
>
>
>
>
>Julian,
>
>Thanks for the clarification.
>
>Could you please take time to respond to the other two comments I had?
>Or, do I take it that you will get back shortly?
>
>If those comments are indeed incorrect, please help me understand why
>so.
>
>Thank you.
>--
>Mallikarjun
>
>
>Mallikarjun Chadalapaka
>Networked Storage Architecture
>Network Storage Solutions Organization
>MS 5668   Hewlett-Packard, Roseville.
>cbm@rose.hp.com
>
>
>>I've marked it with ---
>>
>>Matt Wakeley <matt_wakeley@agilent.com> on 31/03/2001 10:25:25
>>
>>Please respond to Matt Wakeley <matt_wakeley@agilent.com>
>>
>>To:   IPS Reflector <ips@ece.cmu.edu>
>>cc:
>>Subject:  Re: iSCSI: synch and steering comments
>>
>>
>>
>>
>>Julian,
>>
>>There were many comments in this message.  To which comment are you
>>referring?
>>
>>-Matt
>>
>>julian_satran@il.ibm.com wrote:
>>>
>>> Mallikarjun,
>>>
>>> It is clearly communicated in the paragraph above it - but fine I will
>>> add it here too.
>>>
>>> Julo
>>>
>>> "Mallikarjun C." <cbm@rose.hp.com> on 30/03/2001 00:54:20
>>>
>>> Please respond to cbm@rose.hp.com
>>>
>>> To:   ips@ece.cmu.edu
>>> cc:
>>> Subject:  Re: iSCSI: synch and steering comments
>>>
>>> Julian,
>>>
>>> Some comments.
>>>
>>> >Answers in text. Thanks, Julo
>>> >
>>> >
>>> ..
>>>
>>> >-Suggest adding the following statement to section 1.2.8.2.
>>> >
>>> > All conventional, in-order data arrival notifications generated by TCP
>>> > are passed through to iSCSI by the Synch and Steering layer after
>>> > appropriate data placements while none of the out-of-order data
>>> > placements that it performs are communicated to upper layers.
>>> >
>>> >+++ I have added the following to 1.2.8.2
>>> >
>>> >   On the incoming path the Synch and Steering layer does not change the
>>> >   way TCP notifies iSCSI about in-order data arrival.  All out-of-order
>>> >   data placements performed by the Synch and Steering layer are hidden
>>> >   from iSCSI.
>>-------------------------------------------------------------------------------
>
>>>
>>> Okay, I'd however prefer it to imply that in-order data placement is also
>>> handled by Synch and Steering in the same sentence, instead of only
>>> commenting on in-order notifications, and out-of-order placements.
>>>
>>-------------------------------------------------------------------------------
>
>>
>>> >
>>> >   I have also changed the figure a bit to better convey the fact that
>>> >   TCP and Synch&Steering are related (not strictly layered) +++
>>>
>>> That's a good idea.
>>>
>>> >
>>> >   ++++
>>> >
>>> >-Section 1.2.8.2 states that a Synch and Steering layer is optional.
>>> > It has to be qualified that it is optional only for those iSCSI devices
>>> > which perform connection recovery on header digest errors, since that's
>>> > how they cope with loss of framing. (I guess this may change in next
>>> > rev?)
>>> >
>>> >+++ with the new format I think that we have:
>>> >
>>> >- one more chance if we go for format 1 or
>>> >- drop the connection on header error
>>> >
>>> >In both cases we can leave synch and steering optional
>>>
>>> Well, that doesn't address the thrust of my comment.  I was implying
>>> that the draft should make it clear that those implementations which
>>> don't support Synch and Steering should end the connection on a header
>>> digest error and/or parity error, and not go into (what Somesh called)
>>> a speculative mode.
>>>
>>> >
>>> >+++
>>> >
>>> >-It appears to me that at least one Synch and Steering layer must be
>>> > defined/referred to as the minimal implementation in the main draft to
>>> > enable interoperability, when implementations do implement Synch and
>>> >Steering.
>>> >
>>> >+++ why ? +++
>>>
>>> I may be using "interoperability" in a somewhat unconventional sense here.
>>> While the draft says that Synch and Steering layer is optional, I don't see
>>> that it requires implementations to always support a "no synch & steering"
>>> mode, even when they support one type of Synch and Steering layer.  Given
>>> that there's no mandatory Synch and Steering layer either, I don't see how
>>> two iSCSI boxes that a customer buys are guaranteed to interoperate.  Please
>>> comment if the draft already implies what I am asking for.
>>>
>>> Thanks.
>>> --
>>> Mallikarjun
>>>
>>> Mallikarjun Chadalapaka
>>> Networked Storage Architecture
>>> Network Storage Solutions Organization
>>> MS 5668   Hewlett-Packard, Roseville.
>>> cbm@rose.hp.com
>>
>>
>>
>>
>
>
>
>
>
>




From owner-ips@ece.cmu.edu  Wed Apr  4 15:26:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA29491
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 15:26:21 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34GwPI02381
	for ips-outgoing; Wed, 4 Apr 2001 12:58:25 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from iol.unh.edu (mars.iol.unh.edu [132.177.121.222])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34Ganr00969
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 12:36:49 -0400 (EDT)
Received: from iol.unh.edu (lambodar.iol.unh.edu [132.177.117.8])
	by iol.unh.edu (8.9.3/8.9.3) with ESMTP id MAA28015;
	Wed, 4 Apr 2001 12:42:05 -0400 (EDT)
Message-ID: <3ACB4EBD.E8F8E2DD@iol.unh.edu>
Date: Wed, 04 Apr 2001 12:41:34 -0400
From: "Ashish A. Palekar" <ashishp@iol.unh.edu>
Organization: FCC/IOL, UNH
X-Mailer: Mozilla 4.72 [en] (WinNT; I)
X-Accept-Language: en
MIME-Version: 1.0
To: mbrown@cs.uml.edu, achadda@iol.unh.edu, ng3@iol.unh.edu, ips@ece.cmu.edu
Subject: iSCSI implementation
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Mike:

A group at the InterOperability Lab at the University of New Hampshire
under Prof. Robert D. Russell has been looking at the performance and
implementation aspects of iSCSI as compared to Fibre Channel and other
SAN technologies. We have developed among other things an iSCSI
Initiator (again to draft-3 specs) - this interfaces to the SCSI
Mid-Level in the Linux kernel. For the Target side, we have developed a
generic SCSI Target Mid-Level, along with a bunch of front-ends (for
Fibre Channel, Adaptec's SEP and iSCSI). Actually we have been looking
for a good open source initiator implementation for iSCSI.

The SEP implementation (both initiator and target) is decently stable,
the iSCSI/Fibre Channel implementations are being worked on. I am in the
process of putting them up on the web.

All code is being released under the GPL license v2. Code is
experimental, buggy and not everything has been implemented. Code has
been tested on Intel machines only at this point. We are planning to
upgrade the code directly to v06 of the draft when it comes out. We did
not upgrade to version 05 because of frame format issues (WN).

The link for the code is:
http://www.iol.unh.edu/consortiums/fc/fc_linux.html

I have put up a copy of my thesis. It contains an overly detailed
documentation of the Target side code. However, it was the shortest path
to getting the information out there.

I do not subscribe to the iSCSI mailing list, so please respond directly
to me if you have any questions/thoughts/ideas on how to go ahead with
development work.

Thanks
--
Ashish A. Palekar
-----------------
Fibre Channel Consortium
InterOperability Lab
Rm 201, Jere A. Chase Ocean Engg. Bldg
24 Colovos Road
Durham, NH 03824-3525

Tel. No.: (603) 862 0701




From owner-ips@ece.cmu.edu  Wed Apr  4 16:02:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA00248
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 16:02:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34FUKt26378
	for ips-outgoing; Wed, 4 Apr 2001 11:30:20 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34FTYr26337
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 11:29:35 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id RAA116464
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:29:28 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id RAA23062
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:27:36 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A24.0054CFB0 ; Wed, 4 Apr 2001 17:26:21 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A24.0054CD40.00@d12mta02.de.ibm.com>
Date: Wed, 4 Apr 2001 16:52:47 +0200
Subject: Re: iSCSI: Out Of Sequence due to null sequence with
	 multipleconnections.
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



That might interfere with ITT reuse.  Julo

Sandeep Joshi <sandeepj@research.bell-labs.com> on 03/04/2001 19:45:44

Please respond to Sandeep Joshi <sandeepj@research.bell-labs.com>

To:   Black_David@emc.com
cc:   dotis@sanlight.net, ips@ece.cmu.edu
Subject:  Re: iSCSI: Out Of Sequence due to null sequence with
      multipleconnections.





There is probably more to this than meets the eye, in which
case do please let me know where I err..

Is it not possible to use the refTaskTag in task management
command to introduce "state" at the target ?

Specifically,
1) send task management command with immediate delivery(cmdSN=0).
2) if iSCSI target sees a non-existing refTaskTag,
   it uses that fact to create some "state" at the target.
   (NOTE: we don't know if the task had already completed ??)
3) When actual task arrives, it gets dropped since iSCSI sees
   that for that refTaskTag, state=must abort.

But there is still the question..
1) when do we delete that target "state" ?  there is no
   endCmdSN or refCmdSN.
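
A rough Python sketch of steps 2 and 3 above, assuming the target keeps a
set of refTaskTags that were flagged before their task arrived.  The class
PendingAborts and its method names are hypothetical and not taken from the
draft.

```python
from typing import Set


class PendingAborts:
    """Tracks refTaskTags flagged for abort before their task has arrived."""

    def __init__(self) -> None:
        self._must_abort: Set[int] = set()

    def on_task_mgmt(self, ref_task_tag: int, known_tags: Set[int]) -> str:
        """Handle a task management command delivered immediately."""
        if ref_task_tag in known_tags:
            return "pass_to_scsi"          # task is present, SCSI handles it
        self._must_abort.add(ref_task_tag)  # step 2: remember the unknown tag
        return "state_created"

    def on_task_arrival(self, task_tag: int) -> str:
        """Handle a task PDU arriving after the task management command."""
        if task_tag in self._must_abort:
            self._must_abort.discard(task_tag)  # step 3: drop and clear state
            return "dropped"
        return "accepted"

    def pending(self) -> int:
        """Number of tags still waiting for their task to show up."""
        return len(self._must_abort)
```

In this sketch the state is cleared only when the matching task arrives.  If
the task had already completed, the entry is never removed, and a later task
that reuses the same tag would be dropped by mistake.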

-sandeep

Black_David@emc.com wrote:
>
> Let me apologize for unintentionally stepping on Doug
> in the meeting.  Due to the time squeeze, I neglected
> to ask for other issues at the end of the iSCSI discussion
> - sorry about that.
>
> I'm going to back off and try to take a high level view of
> this and see what sort of observations emerge.  When a SCSI
> task management command pops out of an iSCSI TCP connection
> at the target, there are four places that the SCSI operations
> it affects could be:
>
> (1) Executing in SCSI.
> (2) Queued to SCSI for execution, but not executing.
> (3) Queued in iSCSI waiting for command sequencing.
> (4) In-flight.
>
> (2) includes the "resource limitations between the
> sequencer and the target that may lead to a stall or
> a long term delay".
>
> (1) and (2) are the easy cases - the SCSI implementation
> must apply the task management command to executing tasks,
> and should perform the obvious "peephole optimization" to
> the commands queued for execution (i.e., if they're to be
> aborted, abort them and send the response(s) without starting
> execution).  In essence, this models a command as crossing
> the boundary from iSCSI to SCSI the moment that iSCSI is
> prepared to give it to SCSI (i.e., any queue and related
> resource limitations are on SCSI's side of the line).
>
> (4) is hard.  One SCSI task management command generates one
> response.  That response can either be generated immediately
> (command arrives, is passed to SCSI, SCSI does its thing) or
> at the right point in the sequence (command arrives, is
> sequenced by iSCSI, passed to SCSI at the right point in the
> sequence, and SCSI does its thing), but NOT both.  As things
> currently stand, having a task management command apply to
> in-flight commands requires sending the task management
> command for ordered delivery - so if it's desired to have
> the task management command take immediate effect and also
> catch everything in flight, it's going to have to be sent
> twice.  I'm not enthusiastic about the idea of the task
> management command taking immediate effect but delaying the
> response until everything in flight that might be affected
> arrives, as I suspect the Initiator would like to know what
> happened sooner rather than later.
>
> (3) is "interesting".  The results of applying a SCSI task
> management command to a SCSI operation are known only to
> SCSI, and hence asking that a command stuck in the iSCSI
> sequencer be affected immediately by a task management
> command is asking that the task management command have
> the side effect of changing some of the commands it affects
> to immediate delivery so that it can immediately do its
> (SCSI) thing to them.  I wouldn't want to mandate this,
> nor would I want to prohibit it, BUT ... if the above
> discussion of in-flight commands is correct, I would
> observe that the application on the Initiator side
> can't tell the difference between commands that are in-flight
> vs. waiting for something in-flight on another connection,
> and hence is going to have to issue the task management
> command for ordered delivery if it wants to affect operations
> in either place (and issue a second copy if it wants
> immediate action).
>
> The upshot is that, aside from a longer discussion of this
> issue, I'm not sure anything needs to be changed.  Comments?
>
> Thanks,
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------





From owner-ips@ece.cmu.edu  Wed Apr  4 16:12:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA00503
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 16:12:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34HsRV06196
	for ips-outgoing; Wed, 4 Apr 2001 13:54:27 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from saturn.cs.uml.edu (saturn.cs.uml.edu [129.63.8.2])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34Hrdr06153
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 13:53:39 -0400 (EDT)
Received: from localhost (localhost [127.0.0.1])
	by saturn.cs.uml.edu (8.11.0/8.11.2) with SMTP id f34HrSS520376;
	Wed, 4 Apr 2001 13:53:28 -0400 (EDT)
Date: Wed, 4 Apr 2001 13:53:28 -0400 (EDT)
From: Mike Brown <mbrown@cs.uml.edu>
To: "Ashish A. Palekar" <ashishp@iol.unh.edu>
cc: achadda@iol.unh.edu, ng3@iol.unh.edu, ips@ece.cmu.edu
Subject: Re: iSCSI implementation
In-Reply-To: <3ACB4EBD.E8F8E2DD@iol.unh.edu>
Message-ID: <Pine.OSF.3.96.1010404134254.503389F-100000@saturn.cs.uml.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by ece.cmu.edu id f34HsMs06185
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
X-MIME-Autoconverted: from 8bit to quoted-printable by ece.cmu.edu id f34HsRW06196
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by ietf.org id QAA00503

On Wed, 4 Apr 2001, Ashish A. Palekar wrote:

>Mike:
>
>A group at the InterOperability Lab at the University of New Hampshire
>under Prof. Robert D. Russell has been looking at the performance and
>implementation aspects of iSCSI as compared to Fibre Channel and other
>SAN technologies. We have developed among other things an iSCSI
>Initiator (again to draft-3 specs) - this interfaces to the SCSI
>Mid-Level in the Linux kernel. For the Target side, we have developed a

Our initiator is implemented as a lower layer adapter driver and therefore
also interfaces with the mid-level.  

>generic SCSI Target Mid-Level, along with a bunch of front-ends (for
>Fibre Channel, Adaptec's SEP and iSCSI). Actually we have been looking
>for a good open source initiator implementation for iSCSI.

I'd be very interested in seeing your target work, ours is just getting
under way.  Our implementation hasn't been tested yet since we don't have
a target.

>The SEP implementation (both initiator and target) is decently stable,
>the iSCSI/Fibre Channel implementations are being worked on. I am in the
>process of putting them up on the web.
>
>All code is being released under the GPL license v2. Code is
>experimental, buggy and not everything has been implemented. Code has
>been tested on Intel machines only at this point. We are planning to
>upgrade the code directly to v06 of the draft when it comes out. We did
>not upgrade to version 05 because of frame format issues (WN).

Sounds just about where we are at.  The only v5 stuff I've added has been
changing the opcode values.

>The link for the code is:
>http://www.iol.unh.edu/consortiums/fc/fc_linux.html

Regards,

-Michael F. Brown, UMass Lowell Computer Science

phone:  (978) 934-5354
email:  mbrown@cs.uml.edu

"I wonder if pawns just realize that they're just pawns 
 in someone's (chess) game."   -L. Fitzgerald Sjöberg



From owner-ips@ece.cmu.edu  Wed Apr  4 16:13:13 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA00529
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 16:13:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34I9O707191
	for ips-outgoing; Wed, 4 Apr 2001 14:09:24 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34I8xr07173
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 14:08:59 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id UAA257394
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 20:08:52 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id UAA42924
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 20:07:00 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A24.0063642D ; Wed, 4 Apr 2001 20:05:37 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A24.00633D8D.00@d12mta02.de.ibm.com>
Date: Wed, 4 Apr 2001 17:39:18 +0200
Subject: Re: Initiator-detected format or digest errors
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Tom,

An initiator will pass the appropriate response to the SCSI layer and will
abort the task if it can identify one.  Further behavior of initiators and
targets is implementation dependent.

6.2 specifies this.

Julo

"Thomas McSweeney" <rf42tpme@us.ibm.com> on 04/04/2001 13:57:20

Please respond to "Thomas McSweeney" <rf42tpme@us.ibm.com>

To:   ips@ece.cmu.edu
cc:
Subject:  Initiator-detected format or digest errors




Section "2.20 Reject" talks about the target receiving a bad frame and
sending Reject.  What should an Initiator do if it receives a PDU with a
format or digest error?  Should it send Reject?  If so, we'll need to
ensure that the Initiator fields of the MIB include an object to count
Reject commands transmitted.

Tom McSweeney
iSCSI Development, Storage Systems Group, IBM
Email: rf42tpme@us.ibm.com
Phone: (USA) 919-254-5634  (tie line: 444-5634)
Fax:   (USA) 919-254-0391  (tie line: 444-0391)






From owner-ips@ece.cmu.edu  Wed Apr  4 16:13:48 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA00577
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 16:13:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34IMPK08101
	for ips-outgoing; Wed, 4 Apr 2001 14:22:25 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from smtp010.mail.yahoo.com (smtp010.mail.yahoo.com [216.136.173.30])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f34ILQr08029
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 14:21:26 -0400 (EDT)
Received: from sdsl-216-36-75-164.dsl.sjc.megapath.net (HELO somesh) (216.36.75.164)
  by smtp.mail.vip.sc5.yahoo.com with SMTP; 4 Apr 2001 18:21:19 -0000
X-Apparently-From: <someshg@yahoo.com>
Reply-To: <someshg@yahoo.com>
From: "Somesh Gupta" <someshg@yahoo.com>
To: <julian_satran@il.ibm.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Date: Wed, 4 Apr 2001 11:15:53 -0700
Message-ID: <NMEALCLOIBCHBDHLCMIJAELGCCAA.someshg@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Importance: Normal
In-Reply-To: <C1256A24.0054CD3E.00@d12mta02.de.ibm.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit



> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> julian_satran@il.ibm.com
> Sent: Wednesday, April 04, 2001 7:32 AM
> To: ips@ece.cmu.edu
> Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
> SNACK is here for two reasons - Status retry (which is cheap) and Data
> retry as a side benefit.

  Unless there is clear benefit (i.e. the event is frequent enough
  to justify recovery at this level), the entire mechanism should be
  dropped - it is neither cheap nor free. If it is relatively
  infrequent, the recovery at the command level should be a sufficient
  mechanism

> CRC errors are not that rare (although we don't have real data, the
> simulations with file systems seem to indicate that numbers could
> be as high as 0.0002%). A restart of the link is expensive (slow
> start), and even if the numbers are far lower, for many applications
> a slow start is a painful event.

  Intuitively, it seems that the combination of link level CRC, TCP
  checksum, and good hardware (ECC, parity etc) should lead to a
  much lower level of errors caught by the iSCSI CRC algorithm. We have
  to separate error detection (i.e. what if I have bad hardware or
  some vendor makes bad/buggy intermediate system) from recovery
  mechanisms (not based on hardware being bad or buggy - market forces
  will weed out the vendor) which should not be based on assumptions
  of bugs in hardware/software of specific implementations.
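
The trade-off argued in this exchange can be put in rough numbers. The
sketch below is only an illustration: it assumes the 0.0002% figure is a
per-PDU digest failure rate and that failures are independent, neither of
which the thread establishes, and the PDU counts per command are made up.
It compares the expected extra data resent when a failed command is
retried whole against resending only the failed PDU.

```python
import math
from typing import Tuple


def command_retry_probability(p_pdu: float, pdus_per_command: int) -> float:
    """Probability that at least one PDU of a command fails its digest.

    Assumes independent failures with per-PDU probability p_pdu.
    """
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(pdus_per_command * math.log1p(-p_pdu))


def extra_traffic_fraction(p_pdu: float, pdus_per_command: int) -> Tuple[float, float]:
    """Expected extra data as a fraction of the command size.

    Returns (command-level retry, per-PDU retry). First-order estimate:
    failures during the retry itself are ignored.
    """
    whole_command = command_retry_probability(p_pdu, pdus_per_command)
    single_pdu = p_pdu  # n*p failed PDUs out of n, each resent once
    return whole_command, single_pdu


if __name__ == "__main__":
    P = 0.0002 / 100  # 0.0002% as a fraction (assumed to be per PDU)
    for n in (1, 16, 256):  # hypothetical PDUs per command
        whole, single = extra_traffic_fraction(P, n)
        print(f"{n:4d} PDUs/command: whole-command {whole:.2e}, per-PDU {single:.2e}")
```

Under these assumptions both fractions are small, and the gap between
them grows with the number of PDUs per command, which is why the typical
transfer size matters to the argument.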

>
> Removing them from the spec is not a path we should take lightly.

  I would phrase it the other way. We should not keep adding things
  unless there is very clear proof that the additional feature is
  beneficial and does not have negative side effects (and there is
  some consensus on adding it)
>
> Julo
>
> "Jon Hall" <jhall@emc.com> on 02/04/2001 16:13:35
>
> Please respond to "Jon Hall" <jhall@emc.com>
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
>
> I agree with Somesh.  And would go farther -- the complexity
> that results from retaining enough target-side state to respond
> to a SACK/SNACK request is non-trivial and needs clear justification.
> Intuitively, a CRC that discovers an error in an iSCSI pdu header
> (that the TCP cksum missed) seems like it should be a rare event.
>
> What is the frequency of this event?  IMO the answer to this
> question should be written into the protocol spec -- assuming
> that it substantiates the benefit of SACK/SNACK.  Otherwise, the
> SACK/SNACK pdu should be removed.
>
> -Jon
>
> julian_satran@il.ibm.com writes:
> >
> >Somesh,
> >
> >As I stated earlier - the DataSN was created to detect missing data PDUs.
> >SNACK is needed to recover missing StatusSN and missing dataSN is only a
> >bonus if the target wants to support it.  It is a trivial mechanism and I
> >think it should stay.
> >
> >Julo
> >
> >"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
> >
> >Please respond to someshg@yahoo.com
> >
> >To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> >cc:
> >Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> >
> >
> >
> >Sorry to have been missing for a while. Hope you will
> >appreciate my being back in action :-). It was a fairly
> >clear consensus in Orlando that applications broke up
> >their transfers into reasonably small chunks i.e. they
> >did not have very long running transfers.
> >
> >Therefore the consensus was that a command level recovery
> >mechanism was sufficient instead of an ack/sack for each
> >data PDU.
> >
> >The SACK mechanism was a post Orlando invention. Without
> >an ack mechanism (for every data PDU), the SACK mechanism
> >just imposes additional burden on either end of the session,
> >without really much benefit.
> >
> >The benefit of having SACK is of saving bandwidth in case
> >the data part of the data PDU failed an integrity check
> >(but passed TCP checksum). This is a rare enough case that
> >as a percentage, the bandwidth loss from retransmitting
> >all the data associated with a read or write command is
> >very very small.
> >
> >In addition, it avoids the complexity of restarting
> >something from the middle, as compared to from the beginning.
> >
> >To me it seems that there is significant simplicity (from
> >implementation, reliability and recovery process) from
> >having smaller data transfer per command.
> >
> >I would really like to get rid of the SACK command.
> >
> >Somesh
> >
> >> -----Original Message-----
> >> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> >> julian_satran@il.ibm.com
> >> Sent: Wednesday, March 28, 2001 6:57 AM
> >> To: ips@ece.cmu.edu
> >> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>
> >>
> >>
> >>
> >> Mallikarjun,
> >>
> >> Last summer I thought that recovery within a connection should be left
> to
> >> TCP. It is simple and could be made available through IPsec (if no new
> >> option of any form can be added).
> >>
> >> Two things killed this:
> >>
> >>    The requirement to have a data encapsulation that can pass through
> >>    application proxies (like a storage router)
> >>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
> >>    header
> >>
> >>
> >> As for the ACK - I am very much in favor of it (it is a no brainer) and
> >> implementations are in fact allowed to drop even unacked data.
> >>
> >> I am bound by the Orlando meeting decision to drop it. Except the
> regular
> >> "oppose everything" crowd the two vocal opponents were Somesh
> Gupta and
> >> Matt Wakeley.
> >>
> >> David may want or not to re-open the issue - I am not going to ask for
> >it.
> >>
> >> Regards,
> >> Julo
> >>
> >> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
> >>
> >> Please respond to cbm@rose.hp.com
> >>
> >> To:   Black_David@emc.com
> >> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com,
> someshg@yahoo.com,
> >>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
> >>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
> >> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>
> >>
> >>
> >>
> >> David and Julian,
> >>
> >> I appreciate both your views, and should I say that they're
> >> along predicted lines :-)
> >>
> >> - David's right in saying that the situation is akin to FC's.
> >>   However, I would like to point out that FC is an unreliable
> >>   transport, and hence is forced to pick up a lot of the transport
> >>   baggage (at least in FCP-2, as I understand), in addition
> >>   to being a SCSI encapsulation layer.  Unfortunately, even with
> >>   TCP being the "reliable" transport, iSCSI is going along the
> >>   same lines - ie. transport baggage + SCSI encapsulation.  My
> >>   point is - if this is indeed a necessary evil, why don't we
> >>   complete iSCSI's transport functionality by data-ACKs?
> >>
> >> - If data SACK is introduced mostly to make up for TCP's shortcomings,
> >>   we're making its usage (and implementation) drastically less
> >>   appealing since the only way error recovery algorithms can *rely* on
> >>   data SACK is when replay is supported (or, "ReplaySupport=yes" in my
> >>   proposal), which is extremely expensive.  IOW, we're defining data
> >>   SACK in the draft and not providing any incentives to implement and
> >>   use it!
> >>
> >> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
> >>   protocol in its definition so far (and I believe, rightly so -
> >>   mandating command ordering, bi-di support, SCSI CRN support to name a
> >>   few examples), the perfectly SCSI-legal R/W interactions that break in
> >>   other transports *do not* have to break in iSCSI.
> >>
> >> - A last idea (may seem radical at this point) in regards to iSCSI
> >>   being a "full transport". This provides us an opportunity to "cast
> >>   off" the transport baggage in future when we truly move to a
> >>   "reliable" transport (perhaps TCP with CRCs/SCTP ?) - if we do a good
> >>   job of keeping the encapsulation stuff separate from the transport
> >>   stuff.  (Julian, I heard from Randy that ideas similar to this were
> >>   explored in your Haifa meeting.  And yes, he recalls they were given
> >>   up since TCP was supposed to be reliable and granularity of recovery
> >>   was deemed one I/O.)
> >>
> >> With that said, may I request David (with his co-chair hat on, :-))
> >> to add some binding comments/observations on this discussion?
> >>
> >> If we decide to leave data SACKs as unattractive to implement,
> the draft
> >> should in the least add a statement like - "Note that satisfying all
> >> possible data SACK requests for a task with an unacknowledged status
> >> implies implementing the I/O replay buffer on the part of targets."
> >> --
> >> Mallikarjun
> >>
> >>
> >> Mallikarjun Chadalapaka
> >> Networked Storage Architecture
> >> Network Storage Solutions Organization
> >> MS 5668   Hewlett-Packard, Roseville.
> >> cbm@rose.hp.com
> >>
> >>
> >>
> >>
> >> >I think Julian's basically right -- I would point
> >> >out that any case of write after read that breaks
> >> >over iSCSI will also break over Fibre Channel.
> >> >On FC, the scenario starts with a frame CRC failure
> >> >on read data at the Initiator, so applications
> >> >have to cope and typically do so by enforcing
> >> >ordering at the app rather than using SCSI task
> >> >ordering.
> >> >
> >> >While SCSI has clever tools like ACA and task
> >> >ordering that appear to allow dependent operations
> >> >to be sent to the target concurrently, in practice
> >> >they don't work and/or aren't used (funny thing,
> >> >those two reinforce each other ;-) ).  Hence
> >> >a minimal approach to them is in order:
> >> >- Make sure the result will interoperate.
> >> >- Make sure T10 doesn't ding us for leaving something
> >> >    completely out.
> >> >- Don't specify anything not needed for the above.
> >> >
> >> >My 0.02,
> >> >--David
> >> >
> >> >> -----Original Message-----
> >> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> >> >> Sent:  Tuesday, March 27, 2001 9:23 AM
> >> >> To:    cbm@rose.hp.com
> >> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
> >> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
> >> >> Black_David@emc.com
> >> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >> >>
> >> >>
> >> >>
> >> >> Mallikarjun,
> >> >>
> >> >> I commiserate with you at the lack of ack for data but the Orlando
> >> meeting
> >> >> stated - no.  Recall that I kept the number only as a mechanism to
> >> detect
> >> >> missing packets.
> >> >>
> >> >> You can achieve the effect you want by keeping around data for a
> while
> >> >> (you
> >> >> determine how long and then discard).
> >> >>
> >> >> If a SACK comes and you can recover - fine. If not you either
> reaccess
> >> the
> >> >> media (if you know how) or reject
> >> >> and let the initiator retry.
> >> >>
> >> >> You should not worry about R/W conflicts as programs bound to have
> >such
> >> >> conflicts either:
> >> >>
> >> >> 1)can live with them or
> >> >> 2)protect themselves through some locks and rely on
> >> "operation-end-status"
> >> >> to keep results deterministic.
> >> >>
> >> >> Regards,
> >> >> Julo
> >> >>
> >> >>
> >> >>
> >> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
> >> >>
> >> >> Please respond to cbm@rose.hp.com
> >> >>
> >> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
> >Julian
> >> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> >> >> cc:   Black_David@emc.com
> >> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Hi Error Recovery Team,
> >> >>
> >> >> iSCSI can discard PDUs because of digest errors and request
> >> >> retransmissions using the iSCSI data SACK.  To deal with such
> >> >> an eventuality, targets that want to support data SACK have
> >> >> the following options:
> >> >>
> >> >> (A) maintain a complete "replay" buffer for the entire I/O since
> >> >>   a SACK could come anytime before the status is ack'ed by the
> >> >>   initiator. [ simple, but extremely expensive in memory resources]
> >> >>
> >> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
> >> >>   This enables keeping only those I/O buffers that haven't been
> >> >>   ack'ed by the initiator. IOW, become a real full transport!
> >> >>   [ everyone disliked it earlier...]
> >> >>
> >> >> (C) re-access the medium for data retransmission requests.  Now there
> >> >>   are 3 sub-cases in this to handle the changed data on the medium in
> >> >>   a write-after-read scenario.  (SEE NOTE.1 at the bottom on how it
> >> >>   is legal.)
> >> >>      (1) On seeing any write, stall till status is ack'ed for all the
> >> >>          previous reads (basically drain the pipe). [simple, but
> >> >>          incurs an additional roundtrip delay for all writes].
> >> >>      (2) A variation of the above, keep an eye only on the prior
> >> >>          overlapping reads. [more BW efficient, but complicated to
> >> >>          resolve the block dependencies in a stream of reads
> >> >>          followed by writes]
> >> >>      (3) Document the caveat and leave it up to the applications
> >> >>          to avoid this case since this leads to data integrity
> >> >>          issues. [pushing to apps since the transport can't get
> >> >>          it right!]
> >> >>
> >> >> My first preference is (B), followed by (A), and I suggest we not go
> >> >> to (C) at all with its inherent dangers.
> >> >>
> >> >> Doing (B) naturally completes the transport job that iSCSI has taken
> >> >> on itself in view of TCP's claimed unreliable checksum.  That is the
> >> >> right thing to do architecturally instead of being a
> "semi-transport"!
> >> >>
> >> >> Comments?
> >> >> --
> >> >> Mallikarjun
> >> >>
> >> >>
> >> >> Mallikarjun Chadalapaka
> >> >> Networked Storage Architecture
> >> >> Network Storage Solutions Organization
> >> >> MS 5668   Hewlett-Packard, Roseville.
> >> >> cbm@rose.hp.com
> >> >>
> >> >>
> >>
> >> >> __________________________________________________________________________
> >> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly
> >> >>         legal if SCSI sets the ORDERED task attribute on both the
> >> >>         commands AND sets the NACA bit to one to indicate that Write
> >> >>         shall be executed only if the Read did not fail (result in a
> >> >>         Check Condition).
> >> >>
> >> >>         In the current case, since Read completed just fine from SCSI's
> >> >>         point of view, SCSI is moving on to execute Write.  Those read
> >> >>         buffers had been freed up since iSCSI received an ACK at the
> >> >>         TCP level, and since iSCSI has no other way to have the data
> >> >>         ack'ed!
>
>
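
The buffering cost behind options (A) and (B) in the quoted proposal can
be sketched as follows. This is a toy model, not anything taken from the
draft: the class and method names are hypothetical, and it assumes a
data-ACK acknowledges every PDU with a DataSN below the given value.

```python
from collections import OrderedDict
from typing import List, Optional


class ReplayBuffer:
    """Data PDUs a target keeps for one task so it can answer a data SACK."""

    def __init__(self) -> None:
        self._pdus: "OrderedDict[int, bytes]" = OrderedDict()  # DataSN -> payload

    def sent(self, data_sn: int, payload: bytes) -> None:
        """Remember a PDU that has just been transmitted."""
        self._pdus[data_sn] = payload

    def data_ack(self, up_to_sn: int) -> None:
        """Option (B): release every PDU with DataSN < up_to_sn (assumed semantics)."""
        for sn in [s for s in self._pdus if s < up_to_sn]:
            del self._pdus[sn]

    def status_acked(self) -> None:
        """Option (A): nothing is released until the status is acknowledged."""
        self._pdus.clear()

    def replay(self, first_sn: int, count: int) -> Optional[List[bytes]]:
        """Answer a SACK; None means the data is gone and must be re-read or rejected."""
        wanted = range(first_sn, first_sn + count)
        if any(sn not in self._pdus for sn in wanted):
            return None
        return [self._pdus[sn] for sn in wanted]

    def held_bytes(self) -> int:
        """Memory currently pinned for possible retransmission."""
        return sum(len(p) for p in self._pdus.values())


if __name__ == "__main__":
    buf = ReplayBuffer()
    for sn in range(4):
        buf.sent(sn, bytes(8))
    print("held before data-ACK:", buf.held_bytes())
    buf.data_ack(2)
    print("held after data-ACK of DataSN < 2:", buf.held_bytes())
```

Without data_ack the buffer grows to the size of the whole I/O and is only
emptied by status_acked, which is the memory cost attributed to option (A).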





From owner-ips@ece.cmu.edu  Wed Apr  4 16:15:36 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA00620
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 16:15:35 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34FUNO26382
	for ips-outgoing; Wed, 4 Apr 2001 11:30:23 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34FTYr26335
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 11:29:34 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id RAA116458;
	Wed, 4 Apr 2001 17:29:27 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id RAA50460;
	Wed, 4 Apr 2001 17:27:31 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A24.0054CE9E ; Wed, 4 Apr 2001 17:26:19 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: "BURBRIDGE,MATTHEW (HP-UnitedKingdom,ex2)" <matthew_burbridge@hp.com>
cc: ips@ece.cmu.edu
Message-ID: <C1256A24.0054CC53.00@d12mta02.de.ibm.com>
Date: Wed, 4 Apr 2001 16:20:58 +0200
Subject: RE: Unsolicited Data Questions
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Matthew,

The appendix is correct.  It does not make too much sense to have the first
burst both as immediate and separate PDU. It complicates implementation.
Immediate are meant for short transactions but recall that immediate data
can be large. If you decide to go for separate PDUs then you can as well
drop the immediate - the gain is marginal and checking becomes cumbersome.

For the editorial points, my answers are inline in the text below.

Thanks,
Julo



Please respond to "BURBRIDGE,MATTHEW (HP-UnitedKingdom,ex2)"
      <matthew_burbridge@hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:
Subject:  RE: Unsolicited Data Questions




Hi Julian

I echo Mallikarjun's input on this.

If you still have time, I would appreciate comments on my original email

Cheers

Matthew


> -----Original Message-----
> From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> Sent: 31 March 2001 07:03
> To: ips@ece.cmu.edu
> Subject: Re: Unsolicited Data Questions
>
>
>
>
> Mallikarjun,
>
> I'll see what I can do. BTW - the second line has also to
> specify - not in
> the same command.
>
> Regards,
> Julo
>
> "Mallikarjun C." <cbm@rose.hp.com> on 30/03/2001 19:38:18
>
> Please respond to cbm@rose.hp.com
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: Unsolicited Data Questions
>
>
>
>
> Julian,
>
> On the subject of unsolicited and immediate data: Perhaps adding
> a table like this might help understand the intent of the draft
> better. (InitialR2T = UseR2T, per earlier discussion)
>
> +-------------+---------------+----------------------------------------+
> | InitialR2T  | ImmediateData |   Result (upto FirstBurstSize)         |
> +-------------+---------------+----------------------------------------+
> |  no         |    no         | Unsolicited data in separate PDUs only |
> +-------------+---------------+----------------------------------------+
> |  no         |    yes        | Immediate & separate unsolicited data  |
> +-------------+---------------+----------------------------------------+
> |  yes        |    no         | Unsolicited data disallowed            |
> +-------------+---------------+----------------------------------------+
> |  yes        |    yes        | Immediate unsolicited data only        |
> +-------------+---------------+----------------------------------------+
>
> --
> Mallikarjun
>
>
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668   Hewlett-Packard, Roseville.
> cbm@rose.hp.com
>
> >Hi Julian
> >
> >Sorry if I'm covering old ground...
> >
> >Is it possible to use unsolicited data for the first burst and then
> request
> >any remaining data using R2T?  For example, if the target
> has a previously
> >allocated buffer available (length defined by FirstBurstSize) for
> >unsolicited data, then once the initiator has sent
> unsolicited data up to
> >and including this amount then the remaining data (if any) can be
> requested
> >using R2T once the target has the buffer space available.
> >
> >Section 1.2.5 pertains to this..  "An initiator may send
> unsolicited data
> >either as immediate (up to the negotiated  maximum PDU size -
> DataPDULength
> >- disconnect-reconnect mode page) or in a separate PDU
> sequence (up to the
> >negotiated limit - FirstBurstSize - disconnect-reconnect
> mode page). All
> >subsequent data  has to be solicited"
> >
> >However, Appendix D - 22 ImmediateData" appears to
> contradict this:   "If
> >ImmediateData is set to no and UseR2T is set to yes then the
> initiator
> MUST
> >NOT send unsolicited data and the target MUST reject them with the
> >corresponding response code."
> >
> >Also I have some comments from my colleagues:
> >
> >Page 15: Last paragraph is unclear from 2nd sentence.  The
> hyphens look
> like
> >minus signs. All a bit confusing.
> >Page 16: There are several references to initial burst.
> Should these not
> be
> >changed to first burst to be consistent with the terminology used
> >subsequently in the spec especially since this it's the
> "FirstBurstSize"
+++ fixed text +++
> >which defines how much data can be accepted.
> >page 111: The spec says that you need to set immediateData
> to yes so that
> >you can receive a first burst of immediate data, but then "only
> >immediate data are accepted in the first burst".  If you
> have immediate
> data
> >set to no and R2T set to yes you can't accept any
> unsolicited data: Are
> you
> >stating that immediate data and "normal" unsolicited data can not be
> mixed.
> >
> >1) Section 2.13.1: Is the response to a NOP-In issued by the target,
> always
> >a NOP-out?
> >
+++ yes+++
> >2) Appendix D 29: KeyValueText: default value is outside of parameter
> range.
+++ fixed +++
> >
> >Cheers
> >
> >Matthew Burbridge
> >Hewlett Packard, Bristol
> >Telnet: 312 7010
> >E-mail: matthewb@bri.hp.com
> >
> >
>
>
>
>
>
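
The InitialR2T/ImmediateData table proposed in the quoted message can be
read as two independent switches. The sketch below encodes that reading
only; the function and type names are hypothetical, and the table is a
list proposal rather than text from the draft (the reply at the top of
this message questions whether mixing immediate and separate unsolicited
data is worthwhile).

```python
from typing import NamedTuple


class UnsolicitedPolicy(NamedTuple):
    immediate_allowed: bool  # data carried in the command PDU itself
    separate_allowed: bool   # unsolicited data in separate data PDUs


def unsolicited_policy(initial_r2t: bool, immediate_data: bool) -> UnsolicitedPolicy:
    """Map the two negotiated keys to what may be sent (up to FirstBurstSize)."""
    return UnsolicitedPolicy(
        immediate_allowed=immediate_data,  # ImmediateData=yes permits immediate data
        separate_allowed=not initial_r2t,  # InitialR2T=no permits separate PDUs
    )


def describe(policy: UnsolicitedPolicy) -> str:
    """Render a policy using the wording of the table's Result column."""
    if policy.immediate_allowed and policy.separate_allowed:
        return "Immediate & separate unsolicited data"
    if policy.immediate_allowed:
        return "Immediate unsolicited data only"
    if policy.separate_allowed:
        return "Unsolicited data in separate PDUs only"
    return "Unsolicited data disallowed"


if __name__ == "__main__":
    for r2t in (False, True):
        for imm in (False, True):
            row = describe(unsolicited_policy(r2t, imm))
            print(f"InitialR2T={'yes' if r2t else 'no':3s} "
                  f"ImmediateData={'yes' if imm else 'no':3s} -> {row}")
```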





From owner-ips@ece.cmu.edu  Wed Apr  4 16:15:57 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA00647
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 16:15:56 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34FUTZ26426
	for ips-outgoing; Wed, 4 Apr 2001 11:30:29 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34FTVr26332
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 11:29:31 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id RAA299424
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:29:22 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id RAA50458
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:27:30 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A24.0054CEFA ; Wed, 4 Apr 2001 17:26:20 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A24.0054CD3E.00@d12mta02.de.ibm.com>
Date: Wed, 4 Apr 2001 16:31:30 +0200
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



SNACK is here for two reasons - Status retry (which is cheap) and Data
retry as a side benefit.
CRC errors are not that rare (although we don't have real data, the
simulations with file systems seem to indicate that numbers could be as high
as 0.0002%). A restart of the link is expensive (slow start), and even if the
numbers are far lower, for many applications a slow start is a painful event.

Removing them from the spec is not a path we should take lightly.

Julo

"Jon Hall" <jhall@emc.com> on 02/04/2001 16:13:35

Please respond to "Jon Hall" <jhall@emc.com>

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"





I agree with Somesh.  And would go farther -- the complexity
that results from retaining enough target-side state to respond
to a SACK/SNACK request is non-trivial and needs clear justification.
Intuitively, a CRC that discovers an error in an iSCSI pdu header
(that the TCP cksum missed) seems like it should be a rare event.

What is the frequency of this event?  IMO the answer to this
question should be written into the protocol spec -- assuming
that it substantiates the benefit of SACK/SNACK.  Otherwise, the
SACK/SNACK pdu should be removed.

-Jon

julian_satran@il.ibm.com writes:
>
>Somesh,
>
>As I stated earlier - the DataSN was created to detect missing data PDUs.
>SNACK is needed to recover missing StatusSN and missing dataSN is only a
>bonus if the target wants to support it.  It is a trivial mechanism and I
>think it should stay.
>
>Julo
>
>"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
>
>Please respond to someshg@yahoo.com
>
>To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
>cc:
>Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
>Sorry to have been missing for a while. Hope you will
>appreciate my being back in action :-). It was a fairly
>clear consensus in Orlando that applications broke up
>their transfers into reasonably small chunks i.e. they
>did not have very long running transfers.
>
>Therefore the consensus was that a command level recovery
>mechanism was sufficient instead of an ack/sack for each
>data PDU.
>
>The SACK mechanism was a post Orlando invention. Without
>an ack mechanism (for every data PDU), the SACK mechanism
>just imposes additional burden on either end of the session,
>without really much benefit.
>
>The benefit of having SACK is of saving bandwidth in case
>the data part of the data PDU failed an integrity check
>(but passed TCP checksum). This is a rare enough case that
>as a percentage, the bandwidth loss from retransmitting
>all the data associated with a read or write command is
>very very small.
>
>In addition, it avoids the complexity of restarting
>something from the middle, as compared to from the beginning.
>
>To me it seems that there is significant simplicity (from
>implementation, reliability and recovery process) from
>having smaller data transfer per command.
>
>I would really like to get rid of the SACK command.
>
>Somesh
>
>> -----Original Message-----
>> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
>> julian_satran@il.ibm.com
>> Sent: Wednesday, March 28, 2001 6:57 AM
>> To: ips@ece.cmu.edu
>> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>
>>
>>
>>
>> Mallikarjun,
>>
>> Last summer I thought that recovery within a connection should be left
to
>> TCP. It is simple and could be made available through IPsec (if no new
>> option of any form can be added).
>>
>> Two things killed this:
>>
>>    The requirement to have a data encapsulation that can pass through
>>    application proxies (like a storage router)
>>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
>>    header
>>
>>
>> As for the ACK - I am very much in favor of it (it is a no brainer) and
>> implementations are in fact allowed to drop even unacked data.
>>
>> I am bound by the Orlando meeting decision to drop it. Except for the
>> regular "oppose everything" crowd, the two vocal opponents were Somesh
>> Gupta and Matt Wakeley.
>>
>> David may or may not want to re-open the issue - I am not going to ask
>> for it.
>>
>> Regards,
>> Julo
>>
>> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
>>
>> Please respond to cbm@rose.hp.com
>>
>> To:   Black_David@emc.com
>> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com, someshg@yahoo.com,
>>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
>>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
>> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>
>>
>>
>>
>> David and Julian,
>>
>> I appreciate both your views, and should I say that they're
>> along predicted lines :-)
>>
>> - David's right in saying that the situation is akin to FC's.
>>   However, I would like to point out that FC is an unreliable
>>   transport, and hence is forced to pick up a lot of the transport
>>   baggage (at least in FCP-2, as I understand), in addition
>>   to being a SCSI encapsulation layer.  Unfortunately, even with
>>   TCP being the "reliable" transport, iSCSI is going along the
>>   same lines - ie. transport baggage + SCSI encapsulation.  My
>>   point is - if this is indeed a necessary evil, why don't we
>>   complete iSCSI's transport functionality by data-ACKs?
>>
>> - If data SACK is introduced mostly to make up for TCP's shortcomings,
>>   we're making its usage (and implementation) drastically less appealing
>>   since the only way error recovery algorithms can *rely* on data SACK
>>   is when replay is supported (or, "ReplaySupport=yes"  in my proposal),
>>   which is extremely expensive.  IOW, we're defining data SACK in the
>>   draft and not providing any incentives to implement and use it!
>>
>> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
>>   protocol in its definition so far (and I believe, rightly so -
>mandating
>>   command ordering, bi-di support, SCSI CRN support to name a few
>> examples),
>>   the perfectly SCSI-legal R/W interactions that break in other
>transports
>>   *do not* have to break in iSCSI.
>>
>> - A last idea (may seem radical at this point) in regards to iSCSI
>>   being a "full transport". This provides us an opportunity to "cast
>>   off" the transport baggage in future when we truly move to a
"reliable"
>>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
>>   keeping the encapsulation stuff separate from the transport stuff.
>>   (Julian, I heard from Randy that ideas similar to this were explored
>>   in your Haifa meeting.  And yes, he recalls they were given up since
>>   TCP was supposed to be reliable and granularity of recovery was deemed
>>   one I/O.)
>>
>> With that said, may I request David (with his co-chair hat on, :-))
>> to add some binding comments/observations on this discussion?
>>
>> If we decide to leave data SACKs as unattractive to implement, the draft
>> should in the least add a statement like - "Note that satisfying all
>> possible data SACK requests for a task with an unacknowledged status
>> implies implementing the I/O replay buffer on the part of targets."
>> --
>> Mallikarjun
>>
>>
>> Mallikarjun Chadalapaka
>> Networked Storage Architecture
>> Network Storage Solutions Organization
>> MS 5668   Hewlett-Packard, Roseville.
>> cbm@rose.hp.com
>>
>>
>>
>>
>> >I think Julian's basically right -- I would point
>> >out that any case of write after read that breaks
>> >over iSCSI will also break over Fibre Channel.
>> >On FC, the scenario starts with a frame CRC failure
>> >on read data at the Initiator, so applications
>> >have to cope and typically do so by enforcing
>> >ordering at the app rather than using SCSI task
>> >ordering.
>> >
>> >While SCSI has clever tools like ACA and task
>> >ordering that appear to allow dependent operations
>> >to be sent to the target concurrently, in practice
>> >they don't work and/or aren't used (funny thing,
>> >those two reinforce each other ;-) ).  Hence
>> >a minimal approach to them is in order:
>> >- Make sure the result will interoperate.
>> >- Make sure T10 doesn't ding us for leaving something
>> >    completely out.
>> >- Don't specify anything not needed for the above.
>> >
>> >My 0.02,
>> >--David
>> >
>> >> -----Original Message-----
>> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
>> >> Sent:  Tuesday, March 27, 2001 9:23 AM
>> >> To:    cbm@rose.hp.com
>> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
>> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
>> >> Black_David@emc.com
>> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>
>> >>
>> >>
>> >> Mallikarjun,
>> >>
>> >> I commiserate with you at the lack of ack for data but the Orlando
>> meeting
>> >> stated - no.  Recall that I kept the number only as a mechanism to
>> detect
>> >> missing packets.
>> >>
>> >> You can achieve the effect you want by keeping around data for a
while
>> >> (you
>> >> determine how long and then discard).
>> >>
>> >> If a SACK comes and you can recover - fine. If not you either
reaccess
>> the
>> >> media (if you know how) or reject
>> >> and let the initiator retry.
>> >>
>> >> You should not worry about R/W conflicts as programs bound to have
>such
>> >> conflicts either:
>> >>
>> >> 1)can live with them or
>> >> 2)protect themselves through some locks and rely on
>> "operation-end-status"
>> >> to keep results deterministic.
>> >>
>> >> Regards,
>> >> Julo
>> >>
>> >>
>> >>
>> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
>> >>
>> >> Please respond to cbm@rose.hp.com
>> >>
>> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
>Julian
>> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
>> >> cc:   Black_David@emc.com
>> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>
>> >>
>> >>
>> >>
>> >> Hi Error Recovery Team,
>> >>
>> >> iSCSI can discard PDUs because of digest errors and request
>> >> retransmissions using the iSCSI data SACK.  To deal with such
>> >> an eventuality, targets that want to support data SACK have
>> >> the following options:
>> >>
>> >> (A) maintain a complete "replay" buffer for the entire I/O since
>> >>   a SACK could come anytime before the status is ack'ed by the
>> >>   initiator. [ simple, but extremely expensive in memory resources]
>> >>
>> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
>> >>   Thus enables keeping only those I/O buffers that haven't been
ack'ed
>> >>   by the initiator. IOW, become a real full transport! [ everyone
>> disliked
>> >>   it earlier...]
>> >>
>> >> (C) re-access the medium for data retransmission requests.  Now there
>> >>   are 3 sub-cases in this to handle the changed data on the medium in
>a
>> >>   write-after-read scenario.  (SEE NOTE.1 at the bottom on how it is
>> >> legal.)
>> >>      (1) On seeing any write, stall till status is ack'ed for all the
>> >>             previous reads (basically drain the pipe). [simple, but
>> incurs
>> >>             an additional roundtrip delay for all writes].
>> >>      (2) A variation of the above, keep an eye only on the prior
>> >>             overlapping reads. [more BW efficient, but complicated to
>> >>             resolve the block dependencies in a stream of
>> reads followed
>> >>             by writes]
>> >>      (3) Document the caveat and leave it up to the applications
>> >>             to avoid this case since this leads to data integrity
>> issues.
>> >>             [pushing to apps since the transport can't get it right!]
>> >>
>> >> My first preference is (B), followed by (A), and I suggest we not go
>> >> to (C) at all with its inherent dangers.
>> >>
>> >> Doing (B) naturally completes the transport job that iSCSI has taken
>> >> on itself in view of TCP's claimed unreliable checksum.  That is the
>> >> right thing to do architecturally instead of being a
"semi-transport"!
>> >>
>> >> Comments?
>> >> --
>> >> Mallikarjun
>> >>
>> >>
>> >> Mallikarjun Chadalapaka
>> >> Networked Storage Architecture
>> >> Network Storage Solutions Organization
>> >> MS 5668   Hewlett-Packard, Roseville.
>> >> cbm@rose.hp.com
>> >>
>> >>
>>
>__________________________________________________________________________
>> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly
>> legal
>> >>         if SCSI sets the ORDERED task attribute on both the
>> commands AND
>> >>         sets the NACA bit to one to indicate that Write shall be
>> executed
>> >>         only if the Read did not fail (result in a Check Condition).
>> >>
>> >>         In the current case, since Read completed just fine from
>SCSI's
>> >>         point of view, SCSI is moving on to execute Write.  Those
read
>> >> buffers
>> >>         had been freed up since iSCSI received an ACK at the TCP
>level,
>> >> and
>> >>         since iSCSI has no other way to have the data ack'ed!





From owner-ips@ece.cmu.edu  Wed Apr  4 17:18:00 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA02130
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 17:17:59 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34Iuop10534
	for ips-outgoing; Wed, 4 Apr 2001 14:56:50 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34ItOr10397
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 14:55:24 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id UAA293052
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 20:55:21 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta06_cs0 [9.165.222.255])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id UAA49282
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 20:53:28 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A24.0067EF78 ; Wed, 4 Apr 2001 20:55:15 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: someshg@yahoo.com
cc: ips@ece.cmu.edu
Message-ID: <C1256A24.0067EEDB.00@d12mta05.de.ibm.com>
Date: Wed, 4 Apr 2001 20:55:48 +0200
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



What event frequencies are you looking at:

1 per 10 seconds, 1 per 10 hours, or 1 per 10 years?

Julo

"Somesh Gupta" <someshg@yahoo.com> on 04/04/2001 20:15:53

Please respond to someshg@yahoo.com

To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"






> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> julian_satran@il.ibm.com
> Sent: Wednesday, April 04, 2001 7:32 AM
> To: ips@ece.cmu.edu
> Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
> SNACK is here for two reasons - Status retry (which is cheap) and Data
> retry as a side benefit.

  Unless there is a clear benefit (i.e. the event is frequent enough
  to justify recovery at this level), the entire mechanism should be
  dropped - it is neither cheap nor free. If the event is relatively
  infrequent, recovery at the command level should be a sufficient
  mechanism.

> CRC errors are not that rare (although we don't have real data, the
> simulations with file systems seem to indicate that the numbers could
> be as high as 0.0002%). A restart of the link is expensive (slow start),
> and even if the numbers are far lower, for many applications a slow
> start is a painful event.

  Intuitively, it seems that the combination of link-level CRC, TCP
  checksum, and good hardware (ECC, parity, etc.) should lead to a
  much lower level of errors caught by the iSCSI CRC algorithm. We have
  to separate error detection (i.e. what if I have bad hardware or
  some vendor makes a bad/buggy intermediate system) from recovery
  mechanisms, which should not be based on assumptions of bugs in the
  hardware/software of specific implementations (market forces will
  weed out such a vendor).

>
> Removing them from the spec is not a path we should take lightly.

  I would phrase it the other way. We should not keep adding things
  unless there is very clear proof that the additional feature is
  beneficial and does not have negative side effects (and there is
  some consensus on adding it).
>
> Julo
>
> "Jon Hall" <jhall@emc.com> on 02/04/2001 16:13:35
>
> Please respond to "Jon Hall" <jhall@emc.com>
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
>
> I agree with Somesh.  And would go farther -- the complexity
> that results from retaining enough target-side state to respond
> to a SACK/SNACK request is non-trivial and needs clear justification.
> Intuitively, a CRC that discovers an error in an iSCSI pdu header
> (that the TCP cksum missed) seems like it should be a rare event.
>
> What is the frequency of this event?  IMO the answer to this
> question should be written into the protocol spec -- assuming
> that it substantiates the benefit of SACK/SNACK.  Otherwise, the
> SACK/SNACK pdu should be removed.
>
> -Jon
>
> julian_satran@il.ibm.com writes:
> >
> >Somesh,
> >
> >As I stated earlier - the DataSN was created to detect missing data
> >PDUs.  SNACK is needed to recover a missing StatSN; recovering a missing
> >DataSN is only a bonus if the target wants to support it.  It is a
> >trivial mechanism and I think it should stay.
> >
> >Julo
> >
> >"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
> >
> >Please respond to someshg@yahoo.com
> >
> >To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> >cc:
> >Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> >
> >
> >
> >Sorry to have been missing for a while. Hope you will
> >appreciate my being back in action :-). It was a fairly
> >clear consensus in Orlando that applications broke up
> >their transfers into reasonably small chunks i.e. they
> >did not have very long running transfers.
> >
> >Therefore the consensus was that a command level recovery
> >mechanism was sufficient instead of an ack/sack for each
> >data PDU.
> >
> >The SACK mechanism was a post Orlando invention. Without
> >an ack mechanism (for every data PDU), the SACK mechanism
> >just imposes additional burden on either end of the session,
> >without really much benefit.
> >
> >The benefit of having SACK is of saving bandwidth in case
> >the data part of the data PDU failed an integrity check
> >(but passed TCP checksum). This is a rare enough case that
> >as a percentage, the bandwidth loss from retransmitting
> >all the data associated with a read or write command is
> >very very small.
> >
> >In addition, it avoids the complexity of restarting
> >something from the middle, as compared to from the beginning.
> >
> >To me it seems that there is significant simplicity (from
> >implementation, reliability and recovery process) from
> >having smaller data transfer per command.
> >
> >I would really like to get rid of the SACK command.
> >
> >Somesh
> >
> >> -----Original Message-----
> >> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> >> julian_satran@il.ibm.com
> >> Sent: Wednesday, March 28, 2001 6:57 AM
> >> To: ips@ece.cmu.edu
> >> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>
> >>
> >>
> >>
> >> Mallikarjun,
> >>
> >> Last summer I thought that recovery within a connection should be left
> >> to TCP. It is simple and could be made available through IPsec (if no
> >> new option of any form can be added).
> >>
> >> Two things killed this:
> >>
> >>    The requirement to have a data encapsulation that can pass through
> >>    application proxies (like a storage router)
> >>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
> >>    header
> >>
> >>
> >> As for the ACK - I am very much in favor of it (it is a no brainer)
and
> >> implementations are in fact allowed to drop even unacked data.
> >>
> >> I am bound by the Orlando meeting decision to drop it. Except for the
> >> regular "oppose everything" crowd, the two vocal opponents were Somesh
> >> Gupta and Matt Wakeley.
> >>
> >> David may or may not want to re-open the issue - I am not going to ask
> >> for it.
> >>
> >> Regards,
> >> Julo
> >>
> >> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
> >>
> >> Please respond to cbm@rose.hp.com
> >>
> >> To:   Black_David@emc.com
> >> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com,
> someshg@yahoo.com,
> >>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
> >>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
> >> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>
> >>
> >>
> >>
> >> David and Julian,
> >>
> >> I appreciate both your views, and should I say that they're
> >> along predicted lines :-)
> >>
> >> - David's right in saying that the situation is akin to FC's.
> >>   However, I would like to point out that FC is an unreliable
> >>   transport, and hence is forced to pick up a lot of the transport
> >>   baggage (at least in FCP-2, as I understand), in addition
> >>   to being a SCSI encapsulation layer.  Unfortunately, even with
> >>   TCP being the "reliable" transport, iSCSI is going along the
> >>   same lines - ie. transport baggage + SCSI encapsulation.  My
> >>   point is - if this is indeed a necessary evil, why don't we
> >>   complete iSCSI's transport functionality by data-ACKs?
> >>
> >> - If data SACK is introduced mostly to make up for TCP's shortcomings,
> >>   we're making its usage (and implementation) drastically less
> appealing
> >>   since the only way error recovery algorithms can *rely* on data SACK
> >>   is when replay is supported (or, "ReplaySupport=yes"  in my
> proposal),
> >>   which is extremely expensive.  IOW, we're defining data SACK in the
> >>   draft and not providing any incentives to implement and use it!
> >>
> >> - I submit that since iSCSI is being hailed as the ideal SCSI
Transport
> >>   protocol in its definition so far (and I believe, rightly so -
> >mandating
> >>   command ordering, bi-di support, SCSI CRN support to name a few
> >> examples),
> >>   the perfectly SCSI-legal R/W interactions that break in other
> >transports
> >>   *do not* have to break in iSCSI.
> >>
> >> - A last idea (may seem radical at this point) in regards to iSCSI
> >>   being a "full transport". This provides us an opportunity to "cast
> >>   off" the transport baggage in future when we truly move to a
> "reliable"
> >>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
> >>   keeping the encapsulation stuff separate from the transport stuff.
> >>   (Julian, I heard from Randy that ideas similar to this were explored
> >>   in your Haifa meeting.  And yes, he recalls they were given up since
> >>   TCP was supposed to be reliable and granularity of recovery
> was deemed
> >>   one I/O.)
> >>
> >> With that said, may I request David (with his co-chair hat on, :-))
> >> to add some binding comments/observations on this discussion?
> >>
> >> If we decide to leave data SACKs as unattractive to implement,
> the draft
> >> should in the least add a statement like - "Note that satisfying all
> >> possible data SACK requests for a task with an unacknowledged status
> >> implies implementing the I/O replay buffer on the part of targets."
> >> --
> >> Mallikarjun
> >>
> >>
> >> Mallikarjun Chadalapaka
> >> Networked Storage Architecture
> >> Network Storage Solutions Organization
> >> MS 5668   Hewlett-Packard, Roseville.
> >> cbm@rose.hp.com
> >>
> >>
> >>
> >>
> >> >I think Julian's basically right -- I would point
> >> >out that any case of write after read that breaks
> >> >over iSCSI will also break over Fibre Channel.
> >> >On FC, the scenario starts with a frame CRC failure
> >> >on read data at the Initiator, so applications
> >> >have to cope and typically do so by enforcing
> >> >ordering at the app rather than using SCSI task
> >> >ordering.
> >> >
> >> >While SCSI has clever tools like ACA and task
> >> >ordering that appear to allow dependent operations
> >> >to be sent to the target concurrently, in practice
> >> >they don't work and/or aren't used (funny thing,
> >> >those two reinforce each other ;-) ).  Hence
> >> >a minimal approach to them is in order:
> >> >- Make sure the result will interoperate.
> >> >- Make sure T10 doesn't ding us for leaving something
> >> >    completely out.
> >> >- Don't specify anything not needed for the above.
> >> >
> >> >My 0.02,
> >> >--David
> >> >
> >> >> -----Original Message-----
> >> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> >> >> Sent:  Tuesday, March 27, 2001 9:23 AM
> >> >> To:    cbm@rose.hp.com
> >> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu;
hufferd@us.ibm.com;
> >> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
> >> >> Black_David@emc.com
> >> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >> >>
> >> >>
> >> >>
> >> >> Mallikarjun,
> >> >>
> >> >> I commiserate with you at the lack of ack for data but the Orlando
> >> meeting
> >> >> stated - no.  Recall that I kept the number only as a mechanism to
> >> detect
> >> >> missing packets.
> >> >>
> >> >> You can achieve the effect you want by keeping around data for a
> while
> >> >> (you
> >> >> determine how long and then discard).
> >> >>
> >> >> If a SACK comes and you can recover - fine. If not you either
> reaccess
> >> the
> >> >> media (if you know how) or reject
> >> >> and let the initiator retry.
> >> >>
> >> >> You should not worry about R/W conflicts as programs bound to have
> >such
> >> >> conflicts either:
> >> >>
> >> >> 1)can live with them or
> >> >> 2)protect themselves through some locks and rely on
> >> "operation-end-status"
> >> >> to keep results deterministic.
> >> >>
> >> >> Regards,
> >> >> Julo
> >> >>
> >> >>
> >> >>
> >> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
> >> >>
> >> >> Please respond to cbm@rose.hp.com
> >> >>
> >> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
> >Julian
> >> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> >> >> cc:   Black_David@emc.com
> >> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Hi Error Recovery Team,
> >> >>
> >> >> iSCSI can discard PDUs because of digest errors and request
> >> >> retransmissions using the iSCSI data SACK.  To deal with such
> >> >> an eventuality, targets that want to support data SACK have
> >> >> the following options:
> >> >>
> >> >> (A) maintain a complete "replay" buffer for the entire I/O since
> >> >>   a SACK could come anytime before the status is ack'ed by the
> >> >>   initiator. [ simple, but extremely expensive in memory resources]
> >> >>
> >> >> (B) (re-introduce data-ACKs into the draft, and) implement
> data-ACKs.
> >> >>   Thus enables keeping only those I/O buffers that haven't been
> ack'ed
> >> >>   by the initiator. IOW, become a real full transport! [ everyone
> >> disliked
> >> >>   it earlier...]
> >> >>
> >> >> (C) re-access the medium for data retransmission requests.
> Now there
> >> >>   are 3 sub-cases in this to handle the changed data on the
> medium in
> >a
> >> >>   write-after-read scenario.  (SEE NOTE.1 at the bottom on how it
is
> >> >> legal.)
> >> >>      (1) On seeing any write, stall till status is ack'ed
> for all the
> >> >>             previous reads (basically drain the pipe). [simple, but
> >> incurs
> >> >>             an additional roundtrip delay for all writes].
> >> >>      (2) A variation of the above, keep an eye only on the prior
> >> >>             overlapping reads. [more BW efficient, but
> complicated to
> >> >>             resolve the block dependencies in a stream of
> >> reads followed
> >> >>             by writes]
> >> >>      (3) Document the caveat and leave it up to the applications
> >> >>             to avoid this case since this leads to data integrity
> >> issues.
> >> >>             [pushing to apps since the transport can't get
> it right!]
> >> >>
> >> >> My first preference is (B), followed by (A), and I suggest we not
go
> >> >> to (C) at all with its inherent dangers.
> >> >>
> >> >> Doing (B) naturally completes the transport job that iSCSI has
taken
> >> >> on itself in view of TCP's claimed unreliable checksum.  That is
the
> >> >> right thing to do architecturally instead of being a
> "semi-transport"!
> >> >>
> >> >> Comments?
> >> >> --
> >> >> Mallikarjun
> >> >>
> >> >>
> >> >> Mallikarjun Chadalapaka
> >> >> Networked Storage Architecture
> >> >> Network Storage Solutions Organization
> >> >> MS 5668   Hewlett-Packard, Roseville.
> >> >> cbm@rose.hp.com
> >> >>
> >> >>
> >>
> >_________________________________________________________________
> _________
> >> >> Note.1: A Read followed by a Write (to the same blocks) is
perfectly
> >> legal
> >> >>         if SCSI sets the ORDERED task attribute on both the
> >> commands AND
> >> >>         sets the NACA bit to one to indicate that Write shall be
> >> executed
> >> >>         only if the Read did not fail (result in a Check
Condition).
> >> >>
> >> >>         In the current case, since Read completed just fine from
> >SCSI's
> >> >>         point of view, SCSI is moving on to execute Write.  Those
> read
> >> >> buffers
> >> >>         had been freed up since iSCSI received an ACK at the TCP
> >level,
> >> >> and
> >> >>         since iSCSI has no other way to have the data ack'ed!
>
>








From owner-ips@ece.cmu.edu  Wed Apr  4 18:48:16 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA03781
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 18:48:15 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34LWV721562
	for ips-outgoing; Wed, 4 Apr 2001 17:32:31 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34LVZr21492
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:31:35 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id E4A04102E; Wed,  4 Apr 2001 14:30:28 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id OAA13644;
	Wed, 4 Apr 2001 14:30:08 -0700 (PDT)
Message-ID: <3ACB9399.1C059D78@cup.hp.com>
Date: Wed, 04 Apr 2001 14:35:21 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Jon Hall <jhall@emc.com>
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <200104042055.QAA28295@lub1028.lss.emc.com>
Content-Type: multipart/mixed;
 boundary="------------BA716B8A4431081A5BF600DB"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------BA716B8A4431081A5BF600DB
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Jon Hall wrote:
> 
> But CRC errors are not really the issue.  It is the
> singular case of a TCP cksum failing to detect what a
> CRC succeeds in detecting, and this occurring to a TCP
> segment containing an iSCSI hdr with a StatSN.
> 
> Is there a reason to believe that iSCSI StatSNs will be
> lost at a higher rate than is currently documented for TCP
> cksum failure?  Or, is the problem a loss of one TCP segment
> in tens (possibly hundreds) of millions of segments.  Where
> the bad segment may contain a StatSN but probably doesn't
> because it is a data pdu.  If the latter, why does a SCSI-level
> timeout and retry (on the initiator) not suffice?  [Note,
> an initiator timeout/retry does not require a connection
> to be closed.]
> 
> I realize that I am being annoyingly repetitious, but it is
> not an idle question.  For some targets, retained rsp status
> is not cheap (and retained rsp data is not tractable at all).
> 
> IMO there appears to be no real need for SNACK.  And, more
> radically, there appears to be no need for StatSNs.

Jon,

The SACK mechanism exists because of StatSN [and holes that can arise in
StatSN]. 

Removal of StatSN allows the removal of SACK mechanism as well. The
reason SACK was introduced can be traced in the thread :
http://ips.pdl.cs.cmu.edu/mail/msg03257.html

The request originally made was to have a SACK (not a SNACK) for StatSN,
i.e. to allow individual StatSNs that were received to be acknowledged when
holes occur in the StatSN sequence.

The solution provided in 05 was the Status SACK [or SNACK] which was a
variant of the request made. i.e. the SNACK is a request to re-send the
Status PDU that was dropped, instead of selectively ack'ing the received
Status PDUs and allowing timeout recovery of the dropped Status PDU.
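
The difference between the two behaviours can be sketched roughly as
follows. This is only an illustration in Python under stated assumptions:
the function names are hypothetical and come from no draft, and StatSN
wraparound is ignored for brevity.

```python
from typing import Iterable, List, Tuple


def split_statsn(expected: int, received: Iterable[int]) -> Tuple[List[int], List[int]]:
    """Split the StatSN space into (holes, arrived) starting at `expected`.

    Hypothetical helper for illustration; serial-number wraparound is ignored.
    """
    arrived = sorted({sn for sn in received if sn >= expected})
    if not arrived:
        return [], []
    arrived_set = set(arrived)
    # Every number below the highest one seen that has not arrived is a hole.
    holes = [sn for sn in range(expected, arrived[-1]) if sn not in arrived_set]
    return holes, arrived


def snack_resend_list(expected: int, received: Iterable[int]) -> List[int]:
    """SNACK style: ask the target to re-send the Status PDUs that are missing."""
    holes, _ = split_statsn(expected, received)
    return holes


def sack_ack_list(expected: int, received: Iterable[int]) -> List[int]:
    """SACK style: acknowledge what arrived; holes are left to timeout recovery."""
    _, arrived = split_statsn(expected, received)
    return arrived


if __name__ == "__main__":
    got = [5, 6, 8, 9, 11]
    print("SNACK would request:", snack_resend_list(5, got))
    print("SACK would acknowledge:", sack_ack_list(5, got))
```

In the SNACK case the target has to keep status around until it can be
sure no re-send will be requested; in the SACK case it can release
whatever has been acknowledged and the initiator recovers the rest by
timeout.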

If the rate of TCP checksum escapes is low enough to consider such
errors relatively infrequent [with the probability of affecting a Status
PDU even lower], then both the StatSN and S[N]ACK mechanisms are overkill
and should be removed, in an attempt to keep the protocol simple and
free of unused features.
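
As a rough way to put numbers on this, the sketch below (Python) converts
a per-PDU error fraction into a mean time between events. The 0.0002%
figure is the simulation estimate quoted further down; treating it as a
per-PDU fraction, the PDU rate, and the share of PDUs that carry status
are all assumptions made up for illustration, so only the shape of the
calculation should be taken from it.

```python
def mean_seconds_between_events(error_fraction: float, pdus_per_second: float) -> float:
    """Mean time between affected PDUs, assuming independent errors."""
    if error_fraction <= 0 or pdus_per_second <= 0:
        raise ValueError("both inputs must be positive")
    return 1.0 / (error_fraction * pdus_per_second)


# 0.0002% is the figure quoted in this thread; treating it as a per-PDU
# fraction is an assumption.
ERROR_FRACTION = 0.0002 / 100
# Assumptions for illustration only, not measurements.
ASSUMED_PDUS_PER_SECOND = 10_000
ASSUMED_STATUS_SHARE = 0.05  # share of PDUs that carry a StatSN

any_pdu = mean_seconds_between_events(ERROR_FRACTION, ASSUMED_PDUS_PER_SECOND)
status_pdu = mean_seconds_between_events(
    ERROR_FRACTION * ASSUMED_STATUS_SHARE, ASSUMED_PDUS_PER_SECOND
)

if __name__ == "__main__":
    print(f"any PDU hit every {any_pdu:.0f} s on average")
    print(f"status PDU hit every {status_pdu:.0f} s on average")
```

With these made-up inputs that works out to roughly one affected PDU
every 50 seconds and one affected Status PDU every 1000 seconds. An
escape rate several orders of magnitude lower stretches the interval
accordingly, which is why the actual rate matters to this argument.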

> 
> Maybe, as Somesh said, this is a dead horse but why include
> something in the spec which suggests a need for target-side
> complexity, while not solving a clear and compelling
> requirement?

Agreed. Both StatSN and SACK are overkill and should be considered for
removal. There is no need to add complexity to targets to retain I/O
state information in anticipation of a SACK.

- Santosh Rao

> 
> -Jon
> 
> julian_satran@il.ibm.com writes:
> >
> >SNACK is here for two reasons - Status retry (which is cheap) and Data
> >retry as a side benefit.
> >CRC errors are not that rare (although we don't have real data, the
> >simulations with file systems seem to indicate that the numbers could be
> >as high as 0.0002%). A restart of the link is expensive (slow start), and
> >even if the numbers are far lower, for many applications a slow start is
> >a painful event.
> >
> >Removing them from the spec is not a path we should take lightly.
> >
> >Julo
--------------BA716B8A4431081A5BF600DB
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------BA716B8A4431081A5BF600DB--



From owner-ips@ece.cmu.edu  Wed Apr  4 18:49:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA03801
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 18:49:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34LChM20103
	for ips-outgoing; Wed, 4 Apr 2001 17:12:43 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from hotmail.com (oe11.law11.hotmail.com [64.4.16.115])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34LCBr20076
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:12:12 -0400 (EDT)
Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC;
	 Wed, 4 Apr 2001 14:12:05 -0700
X-Originating-IP: [66.31.72.237]
From: "Eddy Quicksall" <ESQuicksall@hotmail.com>
To: <ips@ece.cmu.edu>
References: <Pine.OSF.3.96.1010404074934.506661D-100000@saturn.cs.uml.edu>
Subject: Re: iSCSI linux implementation
Date: Wed, 4 Apr 2001 17:12:04 -0400
MIME-Version: 1.0
Content-Type: text/plain;	charset="iso-8859-1"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2919.6700
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Message-ID: <OE11ee4fDGCl9FopYC9000007f5@hotmail.com>
X-OriginalArrivalTime: 04 Apr 2001 21:12:05.0025 (UTC) FILETIME=[E64B1110:01C0BD4B]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
X-MIME-Autoconverted: from 8bit to quoted-printable by ece.cmu.edu id f34LChN20103
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by ietf.org id SAA03801

I have also just coded to version 3 and still have to make the changes for
version 5.

I'm testing on NT but plan to move to Linux soon.

Eddy
----- Original Message -----
From: "Mike Brown" <mbrown@cs.uml.edu>
To: <ips@ece.cmu.edu>
Sent: Wednesday, April 04, 2001 7:59 AM
Subject: iSCSI linux implementation


Hello,

Due to recent posts on the linux-net and linux-scsi mailing lists, I felt
obligated to announce the implementation that a few other developers and I
have been working on.  What we are writing is a target and an initiator
driver.  I hastily threw up a web page describing our project:

http://www.cs.uml.edu/~mbrown/iSCSI/

This also contains current sources.  We are still in the middle of
developing and what we have written is based largely on version 3 of the
iSCSI draft with only a few minor changes moving to version 5 (op codes).
Currently our priority is to get something working and then go back and
make it conform to version 5.  Again, this is a WIP and is likely
incorrect in some parts.

Thanks.

-Michael F. Brown, UMass Lowell Computer Science

phone:  (978) 934-5354
email:  mbrown@cs.uml.edu

"I wonder if pawns just realize that they're just pawns
 in someone's (chess) game."   -L. Fitzgerald Sjöberg




From owner-ips@ece.cmu.edu  Wed Apr  4 18:49:34 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA03814
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 18:49:33 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34LEeP20199
	for ips-outgoing; Wed, 4 Apr 2001 17:14:40 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34LDfr20146
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:13:42 -0400 (EDT)
Received: from colosus2.cup.hp.com (colosus2.cup.hp.com [15.13.128.145])
	by palrel1.hp.com (Postfix) with ESMTP id C78175A
	for <ips@ece.cmu.edu>; Wed,  4 Apr 2001 14:13:39 -0700 (PDT)
Received: from hp.com (IDENT:plabat@pl703521.cup.hp.com [15.13.133.216])
	by colosus2.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id OAA18718;
	Wed, 4 Apr 2001 14:13:38 -0700 (PDT)
Message-ID: <3ACB8E82.1563B241@hp.com>
Date: Wed, 04 Apr 2001 14:13:38 -0700
From: Pierre Labat <pierre_labat@hp.com>
Organization: Hewlett Packard ATM-SISL
X-Mailer: Mozilla 4.51 [en] (X11; I; Linux 2.2.5-15 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <C1256A24.0054CD3E.00@d12mta02.de.ibm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

julian_satran@il.ibm.com wrote:

> SNACK is here for two reasons - Status retry (which is cheap) and Data
> retry as a side benefit.
> CRC errors are not that rare (although we don't have real data the
> simulation with file systems seem to indicate that numbers could be as high
> a 0.0002%).

Julo,

Could you explain how you got this number?
Does it come from

    J. Stone et al., "Performance of Checksums and CRC's over Real Data",
    IEEE/ACM Transactions on Networking, Vol. 6, No. 5, October 1998

http://dev.acm.org/pubs/articles/journals/ton/1998-6-5/p529-stone/p529-stone.pdf

???
I don't see how you got this number.
What I saw was:

    Less than 1 escape in 10e17 segments when taking into account the
    link layer AAL5 CRC (see page 540, top of the left column).

Regards,

Pierre

> A restart of link - is expensive (slow start) and even if they
> are far lower for many applications a slow start is a painfull event.
>
> Removing them from the spec is not a path we should take lightly.
>
> Julo



From owner-ips@ece.cmu.edu  Wed Apr  4 18:49:56 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA03825
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 18:49:55 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34LJc120566
	for ips-outgoing; Wed, 4 Apr 2001 17:19:38 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34LIvr20527
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:18:57 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id XAA234430;
	Wed, 4 Apr 2001 23:18:50 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id XAA41476;
	Wed, 4 Apr 2001 23:16:58 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A24.0074CCA1 ; Wed, 4 Apr 2001 23:15:45 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: "Jon Hall" <jhall@emc.com>
cc: ips@ece.cmu.edu
Message-ID: <C1256A24.0074CC11.00@d12mta02.de.ibm.com>
Date: Wed, 4 Apr 2001 23:19:21 +0200
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Jon,

Inexpensive implementations are always free to do away with recovery. That
is true for targets too.
But by not specifying the mechanism for the more expensive ones we make them
non-interoperable.

Julo

"Jon Hall" <jhall@emc.com> on 04/04/2001 22:55:40

Please respond to "Jon Hall" <jhall@emc.com>

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"





But CRC errors are not really the issue.  It is the
singular case of a TCP cksum failing to detect what a
CRC succeeds in detecting, and this occurring to a TCP
segment containing an iSCSI hdr with a StatSN.

Is there a reason to believe that iSCSI StatSNs will be
lost at a higher rate than is currently documented for TCP
cksum failure?  Or is the problem a loss of one TCP segment
in tens (possibly hundreds) of millions of segments, where
the bad segment may contain a StatSN but probably doesn't
because it is a data pdu?  If the latter, why does a SCSI-level
timeout and retry (on the initiator) not suffice?  [Note,
an initiator timeout/retry does not require a connection
to be closed.]

I realize that I am being annoyingly repetitious, but it is
not an idle question.  For some targets, retained rsp status
is not cheap (and retained rsp data is not tractable at all).

IMO there appears to be no real need for SNACK.  And, more
radically, there appears to be no need for StatSNs.

Maybe, as Somesh said, this is a dead horse but why include
something in the spec which suggests a need for target-side
complexity, while not solving a clear and compelling
requirement?

-Jon

julian_satran@il.ibm.com writes:
>
>SNACK is here for two reasons - Status retry (which is cheap) and Data
>retry as a side benefit.
>CRC errors are not that rare (although we don't have real data the
>simulation with file systems seem to indicate that numbers could be as high
>a 0.0002%). A restart of link - is expensive (slow start) and even if they
>are far lower for many applications a slow start is a painfull event.
>
>Removing them from the spec is not a path we should take lightly.
>
>Julo
>
>"Jon Hall" <jhall@emc.com> on 02/04/2001 16:13:35
>
>Please respond to "Jon Hall" <jhall@emc.com>
>
>To:   ips@ece.cmu.edu
>cc:
>Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
>
>I agree with Somesh.  And would go farther -- the complexity
>that results from retaining enough target-side state to respond
>to a SACK/SNACK request is non-trivial and needs clear justification.
>Intuitively, a CRC that discovers an error in an iSCSI pdu header
>(that the TCP cksum missed) seems like it should be a rare event.
>
>What is the frequency of this event?  IMO the answer to this
>question should be written into the protocol spec -- assuming
>that it substantiates the benefit of SACK/SNACK.  Otherwise, the
>SACK/SNACK pdu should be removed.
>
>-Jon
>
>julian_satran@il.ibm.com writes:
>>
>>Somesh,
>>
>>As I stated earlier - the DataSN was created to detect missing data PDUs.
>>SNACK is needed to recover missing StatusSN and missing dataSN is only a
>>bonus if the target wants to support it.  It is a trivial mechanism and I
>>think it should stay.
>>
>>Julo
>>
>>"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
>>
>>Please respond to someshg@yahoo.com
>>
>>To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
>>cc:
>>Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>
>>
>>
>>
>>Sorry to have been missing for a while. Hope you will
>>appreciate my being back in action :-). It was a fairly
>>clear consensus in Orlando that applications broke up
>>their transfers into reasonably small chunks i.e. they
>>did not have very long running transfers.
>>
>>Therefore the consensus was that a command level recovery
>>mechanism was sufficient instead of an ack/sack for each
>>data PDU.
>>
>>The SACK mechanism was a post Orlando invention. Without
>>an ack mechanism (for every data PDU), the SACK mechanism
>>just imposes additional burden on either end of the session,
>>without really much benefit.
>>
>>The benefit of having SACK is of saving bandwidth in case
>>the data part of the data PDU failed an integrity check
>>(but passed TCP checksum). This is a rare enough case that
>>as a percentage, the bandwidth loss from retransmitting
>>all the data associated with a read or write command is
>>very very small.
>>
>>In addition, it avoids the complexity of restarting
>>something from the middle, as compared to from the begining.
>>
>>To me it seems that there is significant simplicity (from
>>implementation, reliability and recovery process) from
>>having smaller data transfer per command.
>>
>>I would really like to get rid of the SACK command.
>>
>>Somesh
>>
>>> -----Original Message-----
>>> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
>>> julian_satran@il.ibm.com
>>> Sent: Wednesday, March 28, 2001 6:57 AM
>>> To: ips@ece.cmu.edu
>>> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>
>>>
>>>
>>>
>>> Mallikarjun,
>>>
>>> Last summer I thought that recovery within a connection should be left
>>> to TCP. It is simple and could be made available through IPsec (if no
>>> new option of any form can be added).
>>>
>>> Two things killed this:
>>>
>>>    The requirement to have a data encapsulation that can pass through
>>>    application proxies (like a storage router)
>>>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
>>>    header
>>>
>>>
>>> As for the ACK - I am very much in favor of it (it is a no brainer) and
>>> implementations are in fact allowed to drop even unacked data.
>>>
>>> I am bound by the Orlando meeting decision to drop it. Except the regular
>>> "oppose everything" crowd the two vocal opponents were Somesh Gupta and
>>> Matt Wakeley.
>>>
>>> David may want or not to re-open the issue - I am not going to ask for it.
>>>
>>> Regards,
>>> Julo
>>>
>>> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
>>>
>>> Please respond to cbm@rose.hp.com
>>>
>>> To:   Black_David@emc.com
>>> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com,
someshg@yahoo.com,
>>>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
>>>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
>>> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>
>>>
>>>
>>>
>>> David and Julian,
>>>
>>> I appreciate both your views, and should I say that they're
>>> along predicted lines :-)
>>>
>>> - David's right in saying that the situation is akin to FC's.
>>>   However, I would like to point out that FC is an unreliable
>>>   transport, and hence is forced to pick up a lot of the transport
>>>   baggage (at least in FCP-2, as I understand), in addition
>>>   to being a SCSI encapsulation layer.  Unfortunately, even with
>>>   TCP being the "reliable" transport, iSCSI is going along the
>>>   same lines - ie. transport baggage + SCSI encapsulation.  My
>>>   point is - if this is indeed a necessary evil, why don't we
>>>   complete iSCSI's transport functionality by data-ACKs?
>>>
>>> - If data SACK is introduced mostly to make up for TCP's shortcomings,
>>>   we're making its usage (and implementation) drastically less appealing
>>>   since the only way error recovery algorithms can *rely* on data SACK
>>>   is when replay is supported (or, "ReplaySupport=yes" in my proposal),
>>>   which is extremely expensive.  IOW, we're defining data SACK in the
>>>   draft and not providing any incentives to implement and use it!
>>>
>>> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
>>>   protocol in its definition so far (and I believe, rightly so - mandating
>>>   command ordering, bi-di support, SCSI CRN support to name a few
>>>   examples), the perfectly SCSI-legal R/W interactions that break in
>>>   other transports *do not* have to break in iSCSI.
>>>
>>> - A last idea (may seem radical at this point) in regards to iSCSI
>>>   being a "full transport". This provides us an opportunity to "cast
>>>   off" the transport baggage in future when we truly move to a "reliable"
>>>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
>>>   keeping the encapsulation stuff separate from the transport stuff.
>>>   (Julian, I heard from Randy that ideas similar to this were explored
>>>   in your Haifa meeting.  And yes, he recalls they were given up since
>>>   TCP was supposed to be reliable and granularity of recovery was deemed
>>>   one I/O.)
>>>
>>> With that said, may I request David (with his co-chair hat on, :-))
>>> to add some binding comments/observations on this discussion?
>>>
>>> If we decide to leave data SACKs as unattractive to implement, the draft
>>> should in the least add a statement like - "Note that satisfying all
>>> possible data SACK requests for a task with an unacknowledged status
>>> implies implementing the I/O replay buffer on the part of targets."
>>> --
>>> Mallikarjun
>>>
>>>
>>> Mallikarjun Chadalapaka
>>> Networked Storage Architecture
>>> Network Storage Solutions Organization
>>> MS 5668   Hewlett-Packard, Roseville.
>>> cbm@rose.hp.com
>>>
>>>
>>>
>>>
>>> >I think Julian's basically right -- I would point
>>> >out that any case of write after read that breaks
>>> >over iSCSI will also break over Fibre Channel.
>>> >On FC, the scenario starts with a frame CRC failure
>>> >on read data at the Initiator, so applications
>>> >have to cope and typically do so by enforcing
>>> >ordering at the app rather than using SCSI task
>>> >ordering.
>>> >
>>> >While SCSI has clever tools like ACA and task
>>> >ordering that appear to allow dependent operations
>>> >to be sent to the target concurrently, in practice
>>> >they don't work and/or aren't used (funny thing,
>>> >those two reinforce each other ;-) ).  Hence
>>> >a minimal approach to them is in order:
>>> >- Make sure the result will interoperate.
>>> >- Make sure T10 doesn't ding us for leaving something
>>> >    completely out.
>>> >- Don't specify anything not needed for the above.
>>> >
>>> >My 0.02,
>>> >--David
>>> >
>>> >> -----Original Message-----
>>> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
>>> >> Sent:  Tuesday, March 27, 2001 9:23 AM
>>> >> To:    cbm@rose.hp.com
>>> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
>>> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
>>> >> Black_David@emc.com
>>> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>> >>
>>> >>
>>> >>
>>> >> Mallikarjun,
>>> >>
>>> >> I commiserate with you at the lack of ack for data but the Orlando
>>> meeting
>>> >> stated - no.  Recall that I kept the number only as a mechanism to
>>> detect
>>> >> missing packets.
>>> >>
>>> >> You can achieve the effect you want by keeping around data for a
>while
>>> >> (you
>>> >> determine how long and then discard).
>>> >>
>>> >> If a SACK comes and you can recover - fine. If not you either
>reaccess
>>> the
>>> >> media (if you know how) or reject
>>> >> and let the initiator retry.
>>> >>
>>> >> You should not worry about R/W conflicts as programs bound to have
>>such
>>> >> conflicts either:
>>> >>
>>> >> 1)can live with them or
>>> >> 2)protect themselves through some locks and rely on
>>> "operation-end-status"
>>> >> to keep results deterministic.
>>> >>
>>> >> Regards,
>>> >> Julo
>>> >>
>>> >>
>>> >>
>>> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
>>> >>
>>> >> Please respond to cbm@rose.hp.com
>>> >>
>>> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
>>Julian
>>> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
>>> >> cc:   Black_David@emc.com
>>> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> Hi Error Recovery Team,
>>> >>
>>> >> iSCSI can discard PDUs because of digest errors and request
>>> >> retransmissions using the iSCSI data SACK.  To deal with such
>>> >> an eventuality, targets that want to support data SACK have
>>> >> the following options:
>>> >>
>>> >> (A) maintain a complete "replay" buffer for the entire I/O since
>>> >>   a SACK could come anytime before the status is ack'ed by the
>>> >>   initiator. [ simple, but extremely expensive in memory resources]
>>> >>
>>> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
>>> >>   Thus enables keeping only those I/O buffers that haven't been ack'ed
>>> >>   by the initiator. IOW, become a real full transport! [ everyone
>>> >>   disliked it earlier...]
>>> >>
>>> >> (C) re-access the medium for data retransmission requests.  Now there
>>> >>   are 3 sub-cases in this to handle the changed data on the medium in a
>>> >>   write-after-read scenario.  (SEE NOTE.1 at the bottom on how it is
>>> >>   legal.)
>>> >>      (1) On seeing any write, stall till status is ack'ed for all the
>>> >>          previous reads (basically drain the pipe). [simple, but incurs
>>> >>          an additional roundtrip delay for all writes].
>>> >>      (2) A variation of the above, keep an eye only on the prior
>>> >>          overlapping reads. [more BW efficient, but complicated to
>>> >>          resolve the block dependencies in a stream of reads followed
>>> >>          by writes]
>>> >>      (3) Document the caveat and leave it upto the applications
>>> >>          to avoid this case since this leads to data integrity issues.
>>> >>          [pushing to apps since the transport can't get it right!]
>>> >>
>>> >> My first preference is (B), followed by (A), and I suggest we not go
>>> >> to (C) at all with its inherent dangers.
>>> >>
>>> >> Doing (B) naturally completes the transport job that iSCSI has taken
>>> >> on itself in view of TCP's claimed unreliable checksum.  That is the
>>> >> right thing to do architecturally instead of being a
>"semi-transport"!
>>> >>
>>> >> Comments?
>>> >> --
>>> >> Mallikarjun
>>> >>
>>> >>
>>> >> Mallikarjun Chadalapaka
>>> >> Networked Storage Architecture
>>> >> Network Storage Solutions Organization
>>> >> MS 5668   Hewlett-Packard, Roseville.
>>> >> cbm@rose.hp.com
>>> >>
>>> >>
>>> >>
>>> >> __________________________________________________________________________
>>> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly legal
>>> >>         if SCSI sets the ORDERED task attribute on both the commands AND
>>> >>         sets the NACA bit to one to indicate that Write shall be executed
>>> >>         only if the Read did not fail (result in a Check Condition).
>>> >>
>>> >>         In the current case, since Read completed just fine from SCSI's
>>> >>         point of view, SCSI is moving on to execute Write.  Those read
>>> >>         buffers had been freed up since iSCSI received an ACK at the TCP
>>> >>         level, and since iSCSI has no other way to have the data ack'ed!





From owner-ips@ece.cmu.edu  Wed Apr  4 18:50:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA03842
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 18:50:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34KGSa16205
	for ips-outgoing; Wed, 4 Apr 2001 16:16:28 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34KGHr16168
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 16:16:18 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id WAA160082
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 22:16:07 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id WAA67126
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 22:14:14 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A24.006F0DBB ; Wed, 4 Apr 2001 22:13:00 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A24.006F0C38.00@d12mta02.de.ibm.com>
Date: Wed, 4 Apr 2001 21:38:52 +0200
Subject: Re: ISCSI: Detail on counting offset for fixed markers
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk





Barry,

In 06 it will read:

   The Marker indicates the offset to the next iSCSI PDU header.  The
   Marker is eight bytes in length, and contains two 32-bit offset fields
   that indicate how many bytes to skip in the TCP stream in order to find
   the next iSCSI PDU header.  The offset is counted from the marker end to
   the beginning of the next header.  The marker uses two copies of the
   pointer so that a marker spanning a TCP packet boundary should leave at
   least one valid copy in one of the packets.
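
   The counting rule above can be sketched in a few lines of Python. This
   is only an illustration, not text from the draft: the function names
   are made up, and big-endian (network) byte order for the two 32-bit
   fields is an assumption, since the paragraph does not state it. The
   sketch also assumes that both copies carry the same value, which
   follows if the offset is counted from the marker end.

```python
import struct

MARKER_LEN = 8  # two 32-bit offset fields, as described in the text


def parse_marker(marker: bytes) -> tuple:
    """Split an 8-byte marker into its two 32-bit offset copies.

    Big-endian byte order is an assumption made for this sketch.
    """
    if len(marker) != MARKER_LEN:
        raise ValueError("marker must be exactly 8 bytes")
    return struct.unpack(">II", marker)


def next_header_position(marker_start: int, marker: bytes) -> int:
    """Return the stream position of the next iSCSI PDU header.

    The offset is counted from the end of the marker to the beginning
    of the next header, so the result is marker end plus offset.
    """
    first, second = parse_marker(marker)
    if first != second:
        # Hypothetical policy: treat disagreeing copies as an error.
        raise ValueError("marker copies disagree")
    return marker_start + MARKER_LEN + first
```

   For a marker that starts at stream position 2048 and carries the value
   100 in both fields, the next header would begin at 2048 + 8 + 100.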


   Thanks,
   Julo

"Barry Reinhold" <bbrtrebia@mediaone.net> on 04/04/2001 17:29:59

Please respond to "Barry Reinhold" <bbrtrebia@mediaone.net>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   "ISCSI" <ips@ece.cmu.edu>
Subject:  ISCSI: Detail on counting offset for fixed markers




Julian,
     A mundane clarification that needs to go into the specification
relative to appendix C (Fixed Markers). It currently reads:

"The Marker indicates the offset to the next iSCSI message header.
   The Marker is eight bytes in length, and contains two 32-bit offset
   fields that indicate how many bytes to skip in the TCP stream in
   order to find the next iSCSI message header.  The marker uses two
   copies of the pointer so that a marker spanning a TCP packet boundary
   will leave at least one valid copy in one of the packets."

Since we are counting bytes, does the offset start after the 4 bytes that
make up this copy of the pointer (excludes pointer) or does it start with
the first byte of the pointer (includes pointer)?

A third option is that it starts after both copies of the pointers so that
the two values are the same....




Barry Reinhold
Principal Architect
Trebia Networks
barry.reinhold@trebia.com
603-868-5144/603-659-0885/978-929-0830 x138








From owner-ips@ece.cmu.edu  Wed Apr  4 18:50:14 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA03843
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 18:50:12 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34LNUK20847
	for ips-outgoing; Wed, 4 Apr 2001 17:23:30 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34LMdr20810
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:22:39 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id XAA70648;
	Wed, 4 Apr 2001 23:22:32 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta06_cs0 [9.165.222.255])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id XAA162326;
	Wed, 4 Apr 2001 23:20:40 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A24.0075691F ; Wed, 4 Apr 2001 23:22:26 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: someshg@yahoo.com
cc: someshg@yahoo.com, ips@ece.cmu.edu
Message-ID: <C1256A24.007567CE.00@d12mta05.de.ibm.com>
Date: Wed, 4 Apr 2001 23:22:56 +0200
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Somesh,

Can you give us a reference for those rates?  Where do they come from?

Regards,
Julo

"Somesh Gupta" <someshg@yahoo.com> on 04/04/2001 23:02:06

Please respond to someshg@yahoo.com

To:   Julian Satran/Haifa/IBM@IBMIL, someshg@yahoo.com
cc:   ips@ece.cmu.edu
Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"




Assuming that the packet corruption escape rate is 1 in 10 billion,
we have (roughly, assuming 1 KByte per packet) 1 escaped packet every
10 trillion bytes of data transfer. It seems to me that having to
retransfer 1 MByte, because recovery happens at the command level
rather than at a more granular level, does not pose much of an
additional burden (1 MByte out of 10 trillion bytes). Also, assuming
each I/O is 1 MByte in size, you would have to do recovery for
1 in every 10 million transactions.

I don't know how realistic the 1 in 10 billion packet corruption
escape rate is but I am using the number from past discussions.
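
The arithmetic above can be checked with a short Python sketch. The
inputs are the figures quoted in the message (1 escape per 10 billion
packets, about 1 KByte per packet, 1 MByte per I/O); taking 1 KByte as
1000 bytes and 1 MByte as 1,000,000 bytes is an assumption made here to
match the round numbers, and the function name is illustrative only.

```python
def command_level_recovery_cost(packets_per_escape: int,
                                packet_bytes: int,
                                io_bytes: int) -> dict:
    """Estimate how often command-level recovery is needed and what it costs.

    packets_per_escape: packets sent per undetected corruption (assumed input)
    packet_bytes:       payload bytes per packet (assumed input)
    io_bytes:           bytes moved by one command (assumed input)
    """
    bytes_per_escape = packets_per_escape * packet_bytes
    return {
        # Data transferred, on average, between two escaped packets.
        "bytes_per_escape": bytes_per_escape,
        # Number of commands issued, on average, per command needing recovery.
        "ios_per_recovery": bytes_per_escape // io_bytes,
        # Fraction of extra traffic if the whole command is retransmitted.
        "retransmit_fraction": io_bytes / bytes_per_escape,
    }


result = command_level_recovery_cost(
    packets_per_escape=10_000_000_000,  # 1 in 10 billion
    packet_bytes=1_000,                 # roughly 1 KByte
    io_bytes=1_000_000,                 # 1 MByte per I/O
)
print(result)
```

With these inputs the sketch gives 10 trillion bytes between escapes and
one recovery per 10 million commands, matching the estimate above.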

Somesh

> -----Original Message-----
> From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> Sent: Wednesday, April 04, 2001 11:56 AM
> To: someshg@yahoo.com
> Cc: ips@ece.cmu.edu
> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
> What are the numbers you are looking at:
>
> 1 per 10 sec, 1/10h or 1 /10y?
>
> Julo
>
> "Somesh Gupta" <someshg@yahoo.com> on 04/04/2001 20:15:53
>
> Please respond to someshg@yahoo.com
>
> To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> cc:
> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
>
>
> > -----Original Message-----
> > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > julian_satran@il.ibm.com
> > Sent: Wednesday, April 04, 2001 7:32 AM
> > To: ips@ece.cmu.edu
> > Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> >
> >
> >
> > SNACK is here for two reasons - Status retry (which is cheap) and Data
> > retry as a side benefit.
>
>   Unless there is clear benefit (i.e. the event is frequent enough
>   to justify recovery at this level), the entire mechanism should be
>   dropped - it is neither cheap nor free. If it is relatively
>   infrequent, the recovery at the command level should be a sufficient
>   mechanism
>
> > CRC errors are not that rare (although we don't have real data the
> > simulation with file systems seem to indicate that numbers could
> > be as high a 0.0002%). A restart of link - is expensive (slow start)
> > and even if they are far lower for many applications a slow start is
> > a painfull event.
>
>   Intuitively, it seems that the combination of link level CRC, TCP
>   checksum, and good hardware (ECC, parity etc) should lead to a
>   much lower level of errors caught by the iSCSI CRC algorithm. We have
>   to separate error detection (i.e. what if I have bad hardware or
>   some vendor makes bad/buggy intermediate system) from recovery
>   mechanisms (not based on hardware being bad or buggy - market forces
>   will weed out the vendor) which should not be based on assumptions
>   of bugs in hardware/software of specific implementations.
>
> >
> > Removing them from the spec is not a path we should take lightly.
>
>   I would phrase it the other way. We should not keep adding things
>   unless there is very clear proof that the additional feature is
>   beneficial and does not have negative side effects (and there is
>   some consensus on adding it)
> >
> > Julo
> >
> > "Jon Hall" <jhall@emc.com> on 02/04/2001 16:13:35
> >
> > Please respond to "Jon Hall" <jhall@emc.com>
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> >
> >
> >
> >
> > I agree with Somesh.  And would go farther -- the complexity
> > that results from retaining enough target-side state to respond
> > to a SACK/SNACK request is non-trivial and needs clear justification.
> > Intuitively, a CRC that discovers an error in an iSCSI pdu header
> > (that the TCP cksum missed) seems like it should be a rare event.
> >
> > What is the frequency of this event?  IMO the answer to this
> > question should be written into the protocol spec -- assuming
> > that it substantiates the benefit of SACK/SNACK.  Otherwise, the
> > SACK/SNACK pdu should be removed.
> >
> > -Jon
> >
> > julian_satran@il.ibm.com writes:
> > >
> > >Somesh,
> > >
> > >As I stated earlier - the DataSN was created to detect missing data
> > >PDUs.  SNACK is needed to recover missing StatusSN and missing dataSN
> > >is only a bonus if the target wants to support it.  It is a trivial
> > >mechanism and I think it should stay.
> > >
> > >Julo
> > >
> > >"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
> > >
> > >Please respond to someshg@yahoo.com
> > >
> > >To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> > >cc:
> > >Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >
> > >
> > >
> > >
> > >Sorry to have been missing for a while. Hope you will
> > >appreciate my being back in action :-). It was a fairly
> > >clear consensus in Orlando that applications broke up
> > >their transfers into reasonably small chunks i.e. they
> > >did not have very long running transfers.
> > >
> > >Therefore the consensus was that a command level recovery
> > >mechanism was sufficient instead of an ack/sack for each
> > >data PDU.
> > >
> > >The SACK mechanism was a post Orlando invention. Without
> > >an ack mechanism (for every data PDU), the SACK mechanism
> > >just imposes additional burden on either end of the session,
> > >without really much benefit.
> > >
> > >The benefit of having SACK is of saving bandwidth in case
> > >the data part of the data PDU failed an integrity check
> > >(but passed TCP checksum). This is a rare enough case that
> > >as a percentage, the bandwidth loss from retransmitting
> > >all the data associated with a read or write command is
> > >very very small.
> > >
> > >In addition, it avoids the complexity of restarting
> > >something from the middle, as compared to from the beginning.
> > >
> > >To me it seems that there is significant simplicity (from
> > >implementation, reliability and recovery process) from
> > >having smaller data transfer per command.
> > >
> > >I would really like to get rid of the SACK command.
> > >
> > >Somesh
> > >
> > >> -----Original Message-----
> > >> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On
> Behalf Of
> > >> julian_satran@il.ibm.com
> > >> Sent: Wednesday, March 28, 2001 6:57 AM
> > >> To: ips@ece.cmu.edu
> > >> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >>
> > >>
> > >>
> > >>
> > >> Mallikarjun,
> > >>
> > >> Last summer I thought that recovery within a connection should be left to
> > >> TCP. It is simple and could be made available through IPsec (if no new
> > >> option of any form can be added).
> > >>
> > >> Two things killed this:
> > >>
> > >>    The requirement to have a data encapsulation that can pass through
> > >>    application proxies (like a storage router)
> > >>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
> > >>    header
> > >>
> > >>
> > >> As for the ACK - I am very much in favor of it (it is a no brainer) and
> > >> implementations are in fact allowed to drop even unacked data.
> > >>
> > >> I am bound by the Orlando meeting decision to drop it. Except for the
> > >> regular "oppose everything" crowd, the two vocal opponents were Somesh
> > >> Gupta and Matt Wakeley.
> > >>
> > >> David may or may not want to re-open the issue - I am not going to ask
> > >> for it.
> > >>
> > >> Regards,
> > >> Julo
> > >>
> > >> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
> > >>
> > >> Please respond to cbm@rose.hp.com
> > >>
> > >> To:   Black_David@emc.com
> > >> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com,
> > someshg@yahoo.com,
> > >>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
> > >>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
> > >> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >>
> > >>
> > >>
> > >>
> > >> David and Julian,
> > >>
> > >> I appreciate both your views, and should I say that they're
> > >> along predicted lines :-)
> > >>
> > >> - David's right in saying that the situation is akin to FC's.
> > >>   However, I would like to point out that FC is an unreliable
> > >>   transport, and hence is forced to pick up a lot of the transport
> > >>   baggage (at least in FCP-2, as I understand), in addition
> > >>   to being a SCSI encapsulation layer.  Unfortunately, even with
> > >>   TCP being the "reliable" transport, iSCSI is going along the
> > >>   same lines - ie. transport baggage + SCSI encapsulation.  My
> > >>   point is - if this is indeed a necessary evil, why don't we
> > >>   complete iSCSI's transport functionality by data-ACKs?
> > >>
> > >> - If data SACK is introduced mostly to make up for TCP's shortcomings,
> > >>   we're making its usage (and implementation) drastically less appealing
> > >>   since the only way error recovery algorithms can *rely* on data SACK
> > >>   is when replay is supported (or, "ReplaySupport=yes" in my proposal),
> > >>   which is extremely expensive.  IOW, we're defining data SACK in the
> > >>   draft and not providing any incentives to implement and use it!
> > >>
> > >> - I submit that since iSCSI is being hailed as the ideal SCSI
> Transport
> > >>   protocol in its definition so far (and I believe, rightly so -
> > >mandating
> > >>   command ordering, bi-di support, SCSI CRN support to name a few
> > >> examples),
> > >>   the perfectly SCSI-legal R/W interactions that break in other
> > >transports
> > >>   *do not* have to break in iSCSI.
> > >>
> > >> - A last idea (may seem radical at this point) in regards to iSCSI
> > >>   being a "full transport". This provides us an opportunity to "cast
> > >>   off" the transport baggage in future when we truly move to a
> > "reliable"
> > >>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
> > >>   keeping the encapsulation stuff separate from the transport stuff.
> > >>   (Julian, I heard from Randy that ideas similar to this
> were explored
> > >>   in your Haifa meeting.  And yes, he recalls they were
> given up since
> > >>   TCP was supposed to be reliable and granularity of recovery
> > was deemed
> > >>   one I/O.)
> > >>
> > >> With that said, may I request David (with his co-chair hat on, :-))
> > >> to add some binding comments/observations on this discussion?
> > >>
> > >> If we decide to leave data SACKs as unattractive to implement,
> > the draft
> > >> should in the least add a statement like - "Note that satisfying all
> > >> possible data SACK requests for a task with an unacknowledged status
> > >> implies implementing the I/O replay buffer on the part of targets."
> > >> --
> > >> Mallikarjun
> > >>
> > >>
> > >> Mallikarjun Chadalapaka
> > >> Networked Storage Architecture
> > >> Network Storage Solutions Organization
> > >> MS 5668   Hewlett-Packard, Roseville.
> > >> cbm@rose.hp.com
> > >>
> > >>
> > >>
> > >>
> > >> >I think Julian's basically right -- I would point
> > >> >out that any case of write after read that breaks
> > >> >over iSCSI will also break over Fibre Channel.
> > >> >On FC, the scenario starts with a frame CRC failure
> > >> >on read data at the Initiator, so applications
> > >> >have to cope and typically do so by enforcing
> > >> >ordering at the app rather than using SCSI task
> > >> >ordering.
> > >> >
> > >> >While SCSI has clever tools like ACA and task
> > >> >ordering that appear to allow dependent operations
> > >> >to be sent to the target concurrently, in practice
> > >> >they don't work and/or aren't used (funny thing,
> > >> >those two reinforce each other ;-) ).  Hence
> > >> >a minimal approach to them is in order:
> > >> >- Make sure the result will interoperate.
> > >> >- Make sure T10 doesn't ding us for leaving something
> > >> >    completely out.
> > >> >- Don't specify anything not needed for the above.
> > >> >
> > >> >My 0.02,
> > >> >--David
> > >> >
> > >> >> -----Original Message-----
> > >> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> > >> >> Sent:  Tuesday, March 27, 2001 9:23 AM
> > >> >> To:    cbm@rose.hp.com
> > >> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu;
> hufferd@us.ibm.com;
> > >> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
> > >> >> Black_David@emc.com
> > >> >> Subject:    Re: iSCSI ERT: data SACK/replay
> buffer/"semi-transport"
> > >> >>
> > >> >>
> > >> >>
> > >> >> Mallikarjun,
> > >> >>
> > >> >> I commiserate with you at the lack of ack for data but the
Orlando
> > >> meeting
> > >> >> stated - no.  Recall that I kept the number only as a mechanism
to
> > >> detect
> > >> >> missing packets.
> > >> >>
> > >> >> You can achieve the effect you want by keeping around data for a
> > while
> > >> >> (you
> > >> >> determine how long and then discard).
> > >> >>
> > >> >> If a SACK comes and you can recover - fine. If not you either
> > reaccess
> > >> the
> > >> >> media (if you know how) or reject
> > >> >> and let the initiator retry.
> > >> >>
> > >> >> You should not worry about R/W conflicts as programs bound to
have
> > >such
> > >> >> conflicts either:
> > >> >>
> > >> >> 1)can live with them or
> > >> >> 2)protect themselves through some locks and rely on
> > >> "operation-end-status"
> > >> >> to keep results deterministic.
> > >> >>
> > >> >> Regards,
> > >> >> Julo
> > >> >>
> > >> >>
> > >> >>
> > >> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
> > >> >>
> > >> >> Please respond to cbm@rose.hp.com
> > >> >>
> > >> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
> > >Julian
> > >> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> > >> >> cc:   Black_David@emc.com
> > >> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Hi Error Recovery Team,
> > >> >>
> > >> >> iSCSI can discard PDUs because of digest errors and request
> > >> >> retransmissions using the iSCSI data SACK.  To deal with such
> > >> >> an eventuality, targets that want to support data SACK have
> > >> >> the following options:
> > >> >>
> > >> >> (A) maintain a complete "replay" buffer for the entire I/O since
> > >> >>   a SACK could come anytime before the status is ack'ed by the
> > >> >>   initiator. [simple, but extremely expensive in memory resources]
> > >> >>
> > >> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
> > >> >>   This enables keeping only those I/O buffers that haven't been ack'ed
> > >> >>   by the initiator. IOW, become a real full transport! [everyone
> > >> >>   disliked it earlier...]
> > >> >>
> > >> >> (C) re-access the medium for data retransmission requests.  Now there
> > >> >>   are 3 sub-cases in this to handle the changed data on the medium in
> > >> >>   a write-after-read scenario.  (SEE NOTE.1 at the bottom on how it is
> > >> >>   legal.)
> > >> >>      (1) On seeing any write, stall till status is ack'ed for all the
> > >> >>          previous reads (basically drain the pipe). [simple, but
> > >> >>          incurs an additional roundtrip delay for all writes]
> > >> >>      (2) A variation of the above, keep an eye only on the prior
> > >> >>          overlapping reads. [more BW efficient, but complicated to
> > >> >>          resolve the block dependencies in a stream of reads followed
> > >> >>          by writes]
> > >> >>      (3) Document the caveat and leave it up to the applications
> > >> >>          to avoid this case since this leads to data integrity
> > >> >>          issues. [pushing to apps since the transport can't get
> > >> >>          it right!]
> > >> >>
> > >> >> My first preference is (B), followed by (A), and I suggest we not
> go
> > >> >> to (C) at all with its inherent dangers.
> > >> >>
> > >> >> Doing (B) naturally completes the transport job that iSCSI has
> taken
> > >> >> on itself in view of TCP's claimed unreliable checksum.  That is
> the
> > >> >> right thing to do architecturally instead of being a
> > "semi-transport"!
> > >> >>
> > >> >> Comments?
> > >> >> --
> > >> >> Mallikarjun
> > >> >>
> > >> >>
> > >> >> Mallikarjun Chadalapaka
> > >> >> Networked Storage Architecture
> > >> >> Network Storage Solutions Organization
> > >> >> MS 5668   Hewlett-Packard, Roseville.
> > >> >> cbm@rose.hp.com
> > >> >>
> > >> >>
> > >>
> > >_________________________________________________________________
> > _________
> > >> >> Note.1: A Read followed by a Write (to the same blocks) is
> perfectly
> > >> legal
> > >> >>         if SCSI sets the ORDERED task attribute on both the
> > >> commands AND
> > >> >>         sets the NACA bit to one to indicate that Write shall be
> > >> executed
> > >> >>         only if the Read did not fail (result in a Check
> Condition).
> > >> >>
> > >> >>         In the current case, since Read completed just fine from
> > >SCSI's
> > >> >>         point of view, SCSI is moving on to execute Write.  Those
> > read
> > >> >> buffers
> > >> >>         had been freed up since iSCSI received an ACK at the TCP
> > >level,
> > >> >> and
> > >> >>         since iSCSI has no other way to have the data ack'ed!
> >
> >
>
>
>
>
>








From owner-ips@ece.cmu.edu  Wed Apr  4 18:51:17 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA03886
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 18:51:15 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34LmY822688
	for ips-outgoing; Wed, 4 Apr 2001 17:48:34 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34LmQr22679
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:48:26 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id 511C695F; Wed,  4 Apr 2001 14:48:25 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id OAA15077;
	Wed, 4 Apr 2001 14:48:20 -0700 (PDT)
Message-ID: <3ACB97DC.ECE79128@cup.hp.com>
Date: Wed, 04 Apr 2001 14:53:32 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
Cc: Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <C1256A24.0074CC11.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------6845C52C1000FD5855FACA51"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------6845C52C1000FD5855FACA51
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


julian_satran@il.ibm.com wrote:
> 
> Jon,
> 
> Inexpensive implementations are always free to do away with recovery. That
> is true for targets too.

That's not the interpretation one gets from reading the spec and prior
discussions on this list. Per the spec, support for Status SACK is
mandatory while support for data SACK is optional. 

IOW, targets MUST retain state information to satisfy a potential
status SACK request.
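
To illustrate what "retaining state" for a status SACK could amount to, here is a
minimal sketch, not taken from the spec. It assumes the target keeps each sent
status keyed by its status sequence number until the initiator acknowledges it by
reporting the next status number it expects. The class name, the method names and
the exp_stat_sn parameter are hypothetical, and sequence-number wraparound is
deliberately ignored.

```python
from collections import OrderedDict
from typing import Optional


class StatusReplayBuffer:
    """Hypothetical target-side store of status responses awaiting acknowledgement."""

    def __init__(self) -> None:
        # Maps status sequence number -> status payload, in send order.
        self._pending: "OrderedDict[int, bytes]" = OrderedDict()

    def record(self, stat_sn: int, status: bytes) -> None:
        """Remember a status response at the time it is sent."""
        self._pending[stat_sn] = status

    def acknowledge(self, exp_stat_sn: int) -> None:
        """Free every status numbered below the initiator's next expected number.

        Assumption: the initiator implicitly acks all earlier statuses this way.
        Wraparound of the sequence number is not handled in this sketch.
        """
        for sn in [sn for sn in self._pending if sn < exp_stat_sn]:
            del self._pending[sn]

    def replay(self, stat_sn: int) -> Optional[bytes]:
        """Return the retained status for a status SACK, or None if already freed."""
        return self._pending.get(stat_sn)

    def __len__(self) -> int:
        return len(self._pending)
```

The retained state here is one small record per outstanding status, which is the
cheap case; a data SACK would need the same bookkeeping for the data itself.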

- Santosh


> But by not specifying the mechanism for the more expensive ones, we make them
> non-interoperable.
--------------6845C52C1000FD5855FACA51
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------6845C52C1000FD5855FACA51--



From owner-ips@ece.cmu.edu  Wed Apr  4 18:51:55 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA03905
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 18:51:50 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34L8Wg19780
	for ips-outgoing; Wed, 4 Apr 2001 17:08:32 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from smtp013.mail.yahoo.com (smtp013.mail.yahoo.com [216.136.173.57])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f34L7cr19733
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:07:38 -0400 (EDT)
Received: from sdsl-216-36-75-164.dsl.sjc.megapath.net (HELO somesh) (216.36.75.164)
  by smtp.mail.vip.sc5.yahoo.com with SMTP; 4 Apr 2001 21:07:31 -0000
X-Apparently-From: <someshg@yahoo.com>
Reply-To: <someshg@yahoo.com>
From: "Somesh Gupta" <someshg@yahoo.com>
To: <julian_satran@il.ibm.com>, <someshg@yahoo.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Date: Wed, 4 Apr 2001 14:02:06 -0700
Message-ID: <NMEALCLOIBCHBDHLCMIJIELKCCAA.someshg@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Importance: Normal
In-Reply-To: <C1256A24.0067A747.00@d12mta02.de.ibm.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Assuming that the packet corruption escape rate is 1 in 10 billion,
we have (roughly assuming 1 KByte per packet) 1 escaped packet every
10 trillion bytes of data transfer. It seems to me that if I
had to retransmit 1 MByte because recovery is done at the
command level rather than at a more granular level, that does
not pose much of an additional burden (1 MB out of 10 trillion
bytes). Also, assuming each I/O is 1 MByte in size, you would
have to do recovery for only 1 in every 10 million transactions.

I don't know how realistic the 1 in 10 billion packet corruption
escape rate is but I am using the number from past discussions.
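
The arithmetic above can be checked with a short sketch. The three constants are
only the assumptions stated in this message (1 escape per 10 billion packets,
about 1 KByte per packet, 1 MByte per I/O); they are not measurements, and the
function names are made up for illustration.

```python
# Back-of-the-envelope check of the escape-rate arithmetic.
# All inputs are assumptions from the discussion, not measured values.

PACKETS_PER_ESCAPE = 10_000_000_000  # 1 corrupted packet escapes per 10 billion
PACKET_SIZE_BYTES = 1_000            # roughly 1 KByte per packet
IO_SIZE_BYTES = 1_000_000            # 1 MByte transferred per command (I/O)


def bytes_per_escape(packets_per_escape: int, packet_size: int) -> int:
    """Average number of bytes transferred between two escaped corruptions."""
    return packets_per_escape * packet_size


def ios_per_escape(bytes_between_escapes: int, io_size: int) -> int:
    """Average number of commands completed between two escaped corruptions."""
    return bytes_between_escapes // io_size


def retransmit_overhead(io_size: int, bytes_between_escapes: int) -> float:
    """Fraction of extra traffic if one whole I/O is resent per escape."""
    return io_size / bytes_between_escapes


if __name__ == "__main__":
    between = bytes_per_escape(PACKETS_PER_ESCAPE, PACKET_SIZE_BYTES)
    print("bytes between escapes:", between)
    print("I/Os between escapes: ", ios_per_escape(between, IO_SIZE_BYTES))
    print("overhead fraction:    ", retransmit_overhead(IO_SIZE_BYTES, between))
```

Under these assumptions, command-level recovery costs about one part in ten
million of the total traffic; a different escape rate scales the result linearly.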

Somesh

> -----Original Message-----
> From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> Sent: Wednesday, April 04, 2001 11:56 AM
> To: someshg@yahoo.com
> Cc: ips@ece.cmu.edu
> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
> What are the numbers you are looking at:
>
> 1 per 10 sec, 1/10h or 1 /10y?
>
> Julo
>
> "Somesh Gupta" <someshg@yahoo.com> on 04/04/2001 20:15:53
>
> Please respond to someshg@yahoo.com
>
> To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> cc:
> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
>
>
> > -----Original Message-----
> > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > julian_satran@il.ibm.com
> > Sent: Wednesday, April 04, 2001 7:32 AM
> > To: ips@ece.cmu.edu
> > Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> >
> >
> >
> > SNACK is here for two reasons - Status retry (which is cheap) and Data
> > retry as a side benefit.
>
>   Unless there is clear benefit (i.e. the event is frequent enough
>   to justify recovery at this level), the entire mechanism should be
>   dropped - it is neither cheap nor free. If it is relatively
>   infrequent, the recovery at the command level should be a sufficient
>   mechanism
>
> > CRC errors are not that rare (although we don't have real data, the
> > simulations with file systems seem to indicate that numbers could
> > be as high as 0.0002%). A restart of the link is expensive (slow start),
> > and even if they are far lower, for many applications a slow start is a
> > painful event.
>
>   Intuitively, it seems that the combination of link level CRC, TCP
>   checksum, and good hardware (ECC, parity etc) should lead to a
>   much lower level of errors caught by the iSCSI CRC algorithm. We have
>   to separate error detection (i.e. what if I have bad hardware or
>   some vendor makes a bad/buggy intermediate system) from recovery
>   mechanisms (not based on hardware being bad or buggy - market forces
>   will weed out the vendor), which should not be based on assumptions
>   of bugs in hardware/software of specific implementations.
>
> >
> > Removing them from the spec is not a path we should take lightly.
>
>   I would phrase it the other way. We should not keep adding things
>   unless there is very clear proof that the additional feature is
>   beneficial and does not have negative side effects (and there is
>   some consensus on adding it)
> >
> > Julo
> >
> > "Jon Hall" <jhall@emc.com> on 02/04/2001 16:13:35
> >
> > Please respond to "Jon Hall" <jhall@emc.com>
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> >
> >
> >
> >
> > I agree with Somesh.  And would go farther -- the complexity
> > that results from retaining enough target-side state to respond
> > to a SACK/SNACK request is non-trivial and needs clear justification.
> > Intuitively, a CRC that discovers an error in an iSCSI pdu header
> > (that the TCP cksum missed) seems like it should be a rare event.
> >
> > What is the frequency of this event?  IMO the answer to this
> > question should be written into the protocol spec -- assuming
> > that it substantiates the benefit of SACK/SNACK.  Otherwise, the
> > SACK/SNACK pdu should be removed.
> >
> > -Jon
> >
> > julian_satran@il.ibm.com writes:
> > >
> > >Somesh,
> > >
> > >As I stated earlier - the DataSN was created to detect missing data PDUs.
> > >SNACK is needed to recover missing StatusSN and missing dataSN is only a
> > >bonus if the target wants to support it.  It is a trivial mechanism and I
> > >think it should stay.
> > >
> > >Julo
> > >
> > >"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
> > >
> > >Please respond to someshg@yahoo.com
> > >
> > >To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> > >cc:
> > >Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >
> > >
> > >
> > >
> > >Sorry to have been missing for a while. Hope you will
> > >appreciate my being back in action :-). It was a fairly
> > >clear consensus in Orlando that applications broke up
> > >their transfers into reasonably small chunks i.e. they
> > >did not have very long running transfers.
> > >
> > >Therefore the consensus was that a command level recovery
> > >mechanism was sufficient instead of an ack/sack for each
> > >data PDU.
> > >
> > >The SACK mechanism was a post Orlando invention. Without
> > >an ack mechanism (for every data PDU), the SACK mechanism
> > >just imposes additional burden on either end of the session,
> > >without really much benefit.
> > >
> > >The benefit of having SACK is of saving bandwidth in case
> > >the data part of the data PDU failed an integrity check
> > >(but passed TCP checksum). This is a rare enough case that
> > >as a percentage, the bandwidth loss from retransmitting
> > >all the data associated with a read or write command is
> > >very very small.
> > >
> > >In addition, it avoids the complexity of restarting
> > >something from the middle, as compared to from the beginning.
> > >
> > >To me it seems that there is significant simplicity (from
> > >implementation, reliability and recovery process) from
> > >having smaller data transfer per command.
> > >
> > >I would really like to get rid of the SACK command.
> > >
> > >Somesh
> > >
> > >> -----Original Message-----
> > >> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On
> Behalf Of
> > >> julian_satran@il.ibm.com
> > >> Sent: Wednesday, March 28, 2001 6:57 AM
> > >> To: ips@ece.cmu.edu
> > >> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >>
> > >>
> > >>
> > >>
> > >> Mallikarjun,
> > >>
> > >> Last summer I thought that recovery within a connection should be left to
> > >> TCP. It is simple and could be made available through IPsec (if no new
> > >> option of any form can be added).
> > >>
> > >> Two things killed this:
> > >>
> > >>    The requirement to have a data encapsulation that can pass through
> > >>    application proxies (like a storage router)
> > >>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
> > >>    header
> > >>
> > >>
> > >> As for the ACK - I am very much in favor of it (it is a no brainer) and
> > >> implementations are in fact allowed to drop even unacked data.
> > >>
> > >> I am bound by the Orlando meeting decision to drop it. Except for the
> > >> regular "oppose everything" crowd, the two vocal opponents were Somesh
> > >> Gupta and Matt Wakeley.
> > >>
> > >> David may or may not want to re-open the issue - I am not going to ask
> > >> for it.
> > >>
> > >> Regards,
> > >> Julo
> > >>
> > >> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
> > >>
> > >> Please respond to cbm@rose.hp.com
> > >>
> > >> To:   Black_David@emc.com
> > >> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com,
> > someshg@yahoo.com,
> > >>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
> > >>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
> > >> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >>
> > >>
> > >>
> > >>
> > >> David and Julian,
> > >>
> > >> I appreciate both your views, and should I say that they're
> > >> along predicted lines :-)
> > >>
> > >> - David's right in saying that the situation is akin to FC's.
> > >>   However, I would like to point out that FC is an unreliable
> > >>   transport, and hence is forced to pick up a lot of the transport
> > >>   baggage (at least in FCP-2, as I understand), in addition
> > >>   to being a SCSI encapsulation layer.  Unfortunately, even with
> > >>   TCP being the "reliable" transport, iSCSI is going along the
> > >>   same lines - ie. transport baggage + SCSI encapsulation.  My
> > >>   point is - if this is indeed a necessary evil, why don't we
> > >>   complete iSCSI's transport functionality by data-ACKs?
> > >>
> > >> - If data SACK is introduced mostly to make up for TCP's shortcomings,
> > >>   we're making its usage (and implementation) drastically less appealing
> > >>   since the only way error recovery algorithms can *rely* on data SACK
> > >>   is when replay is supported (or, "ReplaySupport=yes" in my proposal),
> > >>   which is extremely expensive.  IOW, we're defining data SACK in the
> > >>   draft and not providing any incentives to implement and use it!
> > >>
> > >> - I submit that since iSCSI is being hailed as the ideal SCSI
> Transport
> > >>   protocol in its definition so far (and I believe, rightly so -
> > >mandating
> > >>   command ordering, bi-di support, SCSI CRN support to name a few
> > >> examples),
> > >>   the perfectly SCSI-legal R/W interactions that break in other
> > >transports
> > >>   *do not* have to break in iSCSI.
> > >>
> > >> - A last idea (may seem radical at this point) in regards to iSCSI
> > >>   being a "full transport". This provides us an opportunity to "cast
> > >>   off" the transport baggage in future when we truly move to a
> > "reliable"
> > >>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
> > >>   keeping the encapsulation stuff separate from the transport stuff.
> > >>   (Julian, I heard from Randy that ideas similar to this
> were explored
> > >>   in your Haifa meeting.  And yes, he recalls they were
> given up since
> > >>   TCP was supposed to be reliable and granularity of recovery
> > was deemed
> > >>   one I/O.)
> > >>
> > >> With that said, may I request David (with his co-chair hat on, :-))
> > >> to add some binding comments/observations on this discussion?
> > >>
> > >> If we decide to leave data SACKs as unattractive to implement,
> > the draft
> > >> should in the least add a statement like - "Note that satisfying all
> > >> possible data SACK requests for a task with an unacknowledged status
> > >> implies implementing the I/O replay buffer on the part of targets."
> > >> --
> > >> Mallikarjun
> > >>
> > >>
> > >> Mallikarjun Chadalapaka
> > >> Networked Storage Architecture
> > >> Network Storage Solutions Organization
> > >> MS 5668   Hewlett-Packard, Roseville.
> > >> cbm@rose.hp.com
> > >>
> > >>
> > >>
> > >>
> > >> >I think Julian's basically right -- I would point
> > >> >out that any case of write after read that breaks
> > >> >over iSCSI will also break over Fibre Channel.
> > >> >On FC, the scenario starts with a frame CRC failure
> > >> >on read data at the Initiator, so applications
> > >> >have to cope and typically do so by enforcing
> > >> >ordering at the app rather than using SCSI task
> > >> >ordering.
> > >> >
> > >> >While SCSI has clever tools like ACA and task
> > >> >ordering that appear to allow dependent operations
> > >> >to be sent to the target concurrently, in practice
> > >> >they don't work and/or aren't used (funny thing,
> > >> >those two reinforce each other ;-) ).  Hence
> > >> >a minimal approach to them is in order:
> > >> >- Make sure the result will interoperate.
> > >> >- Make sure T10 doesn't ding us for leaving something
> > >> >    completely out.
> > >> >- Don't specify anything not needed for the above.
> > >> >
> > >> >My 0.02,
> > >> >--David
> > >> >
> > >> >> -----Original Message-----
> > >> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> > >> >> Sent:  Tuesday, March 27, 2001 9:23 AM
> > >> >> To:    cbm@rose.hp.com
> > >> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu;
> hufferd@us.ibm.com;
> > >> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
> > >> >> Black_David@emc.com
> > >> >> Subject:    Re: iSCSI ERT: data SACK/replay
> buffer/"semi-transport"
> > >> >>
> > >> >>
> > >> >>
> > >> >> Mallikarjun,
> > >> >>
> > >> >> I commiserate with you at the lack of ack for data but the Orlando
> > >> meeting
> > >> >> stated - no.  Recall that I kept the number only as a mechanism to
> > >> detect
> > >> >> missing packets.
> > >> >>
> > >> >> You can achieve the effect you want by keeping around data for a
> > while
> > >> >> (you
> > >> >> determine how long and then discard).
> > >> >>
> > >> >> If a SACK comes and you can recover - fine. If not you either
> > reaccess
> > >> the
> > >> >> media (if you know how) or reject
> > >> >> and let the initiator retry.
> > >> >>
> > >> >> You should not worry about R/W conflicts as programs bound to have
> > >such
> > >> >> conflicts either:
> > >> >>
> > >> >> 1)can live with them or
> > >> >> 2)protect themselves through some locks and rely on
> > >> "operation-end-status"
> > >> >> to keep results deterministic.
> > >> >>
> > >> >> Regards,
> > >> >> Julo
> > >> >>
> > >> >>
> > >> >>
> > >> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
> > >> >>
> > >> >> Please respond to cbm@rose.hp.com
> > >> >>
> > >> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
> > >Julian
> > >> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> > >> >> cc:   Black_David@emc.com
> > >> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Hi Error Recovery Team,
> > >> >>
> > >> >> iSCSI can discard PDUs because of digest errors and request
> > >> >> retransmissions using the iSCSI data SACK.  To deal with such
> > >> >> an eventuality, targets that want to support data SACK have
> > >> >> the following options:
> > >> >>
> > >> >> (A) maintain a complete "replay" buffer for the entire I/O since
> > >> >>   a SACK could come anytime before the status is ack'ed by the
> > >> >>   initiator. [simple, but extremely expensive in memory resources]
> > >> >>
> > >> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
> > >> >>   This enables keeping only those I/O buffers that haven't been ack'ed
> > >> >>   by the initiator. IOW, become a real full transport! [everyone
> > >> >>   disliked it earlier...]
> > >> >>
> > >> >> (C) re-access the medium for data retransmission requests.  Now there
> > >> >>   are 3 sub-cases in this to handle the changed data on the medium in
> > >> >>   a write-after-read scenario.  (SEE NOTE.1 at the bottom on how it is
> > >> >>   legal.)
> > >> >>      (1) On seeing any write, stall till status is ack'ed for all the
> > >> >>          previous reads (basically drain the pipe). [simple, but
> > >> >>          incurs an additional roundtrip delay for all writes]
> > >> >>      (2) A variation of the above, keep an eye only on the prior
> > >> >>          overlapping reads. [more BW efficient, but complicated to
> > >> >>          resolve the block dependencies in a stream of reads followed
> > >> >>          by writes]
> > >> >>      (3) Document the caveat and leave it up to the applications
> > >> >>          to avoid this case since this leads to data integrity
> > >> >>          issues. [pushing to apps since the transport can't get
> > >> >>          it right!]
> > >> >>
> > >> >> My first preference is (B), followed by (A), and I suggest we not
> go
> > >> >> to (C) at all with its inherent dangers.
> > >> >>
> > >> >> Doing (B) naturally completes the transport job that iSCSI has
> taken
> > >> >> on itself in view of TCP's claimed unreliable checksum.  That is
> the
> > >> >> right thing to do architecturally instead of being a
> > "semi-transport"!
> > >> >>
> > >> >> Comments?
> > >> >> --
> > >> >> Mallikarjun
> > >> >>
> > >> >>
> > >> >> Mallikarjun Chadalapaka
> > >> >> Networked Storage Architecture
> > >> >> Network Storage Solutions Organization
> > >> >> MS 5668   Hewlett-Packard, Roseville.
> > >> >> cbm@rose.hp.com
> > >> >>
> > >> >>
> > >>
> > >_________________________________________________________________
> > _________
> > >> >> Note.1: A Read followed by a Write (to the same blocks) is
> perfectly
> > >> legal
> > >> >>         if SCSI sets the ORDERED task attribute on both the
> > >> commands AND
> > >> >>         sets the NACA bit to one to indicate that Write shall be
> > >> executed
> > >> >>         only if the Read did not fail (result in a Check
> Condition).
> > >> >>
> > >> >>         In the current case, since Read completed just fine from
> > >SCSI's
> > >> >>         point of view, SCSI is moving on to execute Write.  Those
> > read
> > >> >> buffers
> > >> >>         had been freed up since iSCSI received an ACK at the TCP
> > >level,
> > >> >> and
> > >> >>         since iSCSI has no other way to have the data ack'ed!
> >
> >
>
>
>
>
>





From owner-ips@ece.cmu.edu  Wed Apr  4 19:59:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA04770
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 19:59:10 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34LpYQ22924
	for ips-outgoing; Wed, 4 Apr 2001 17:51:34 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel1.hp.com (atlrel1.hp.com [156.153.255.210])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34LpPr22893
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:51:25 -0400 (EDT)
Received: from core.rose.hp.com (core.rose.hp.com [15.43.208.100])
	by atlrel1.hp.com (Postfix) with ESMTP id C8993311
	for <ips@ece.cmu.edu>; Wed,  4 Apr 2001 17:51:11 -0400 (EDT)
Received: (from cbm@localhost) by core.rose.hp.com (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id OAA11424 for ips@ece.cmu.edu; Wed, 4 Apr 2001 14:52:12 -0700 (PDT)
Message-Id: <200104042152.OAA11424@core.rose.hp.com>
Subject: Re: iSCSI: synch and steering comments
To: ips@ece.cmu.edu
Date: Wed, 04 Apr 2001 14:52:11 PDT
Reply-To: cbm@rose.hp.com
From: "Mallikarjun C." <cbm@rose.hp.com>
X-Mailer: Elm [revision: 212.4]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian,

>Mallikarjun,
>
>The draft has now one single mechanism and this is optional.
>Two implementations that have different synch and steering will not be
>able to use them.
>
>Alternatively this group may want to mandate one for interoperability.
>
>My take is that we can wait (until iSCSI-02 -:))

How about mandating "no synch and steering" mode for all implementations
in iSCSI-01?  Doesn't it ensure interoperability?

Since you're choosing not to comment in spite of my continued restatements
of the header-digest-error-with-no-sync question, I am assuming you agree 
with option (a) below and hope to see it stated in rev06.
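
A minimal sketch of why option (a) is the natural fallback, assuming a
toy framing that is NOT the iSCSI header layout: a 4-byte length followed
by a 4-byte digest over that length, with zlib.crc32 standing in for
whatever digest the draft ends up with. The names FramingLost, make_pdu
and read_pdus are hypothetical. Once the header digest fails, the length
cannot be trusted, so without a synch mechanism the receiver has no way
to find the next PDU boundary and can only give up on the connection.

```python
import struct
import zlib


class FramingLost(Exception):
    """Header digest failed, so the length field cannot be trusted."""


# Hypothetical toy header: 4-byte data length + 4-byte CRC over the length.
HEADER = struct.Struct(">II")


def make_pdu(data: bytes) -> bytes:
    """Frame one PDU as length + digest-of-length + data."""
    length = struct.pack(">I", len(data))
    return length + struct.pack(">I", zlib.crc32(length)) + data


def read_pdus(stream: bytes) -> list:
    """Walk the byte stream PDU by PDU, with no synch and steering layer."""
    pdus = []
    offset = 0
    while offset < len(stream):
        length, digest = HEADER.unpack_from(stream, offset)
        if zlib.crc32(stream[offset:offset + 4]) != digest:
            # The next PDU boundary is unknown: treat as a dropped connection.
            raise FramingLost(f"header digest error at offset {offset}")
        offset += HEADER.size
        pdus.append(stream[offset:offset + length])
        offset += length
    return pdus
```

Option (b) would replace the raise with a heuristic scan for a plausible
header, which is exactly the implementation-dependent part.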

Thank you.
--
Mallikarjun 


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668	Hewlett-Packard, Roseville.
cbm@rose.hp.com


>
>"Mallikarjun C." <cbm@rose.hp.com> on 04/04/2001 20:02:12
>
>Please respond to cbm@rose.hp.com
>
>To:   ips@ece.cmu.edu
>cc:
>Subject:  Re: iSCSI: synch and steering comments
>
>
>
>
>Julian,
>
>>Mallikarjun,
>>
>>I am not sure about which comment. If it is about synch and steering I
>>think that recovery from header digest errors
>>should not mandate a synch mechanism.  Some very sophisticated
>>implementations may want to take advantage of such a mechanism if it is
>>there. As this interaction may be fairly complex and implementation
>>dependent, we will assume in the recovery descriptions we provide that
>>the connection is dropped.
>
>Sorry, I am not clear on what you meant (keep in mind that I am
>not asking to mandate a synch mechanism) -
>
>Are you saying that when synch and steering is not implemented in an iSCSI
>device:
>  a) it can recover from header digest errors only by dropping the
>connection
>     (which the ER algorithms would assume)
>OR
>  b) it can recover from header digest errors by "fairly complex and
>     implementation dependent" mechanisms which rely on the Length field
>     anyway, and try to analyze perhaps several PDUs for achieving the
>     framing synch again?
>
>I am assuming that it would be (a).  I am also requesting that this be made
>clear in the draft.
>
>Here's the third comment at the bottom that you missed.  Your comments
>would be very helpful.  Thanks!
>
>-It appears to me that at least one Synch and Steering layer must be
> defined/referred to as the minimal implementation in the main draft to
> enable interoperability, when implementations do implement Synch and
>Steering.
>
>+++ why ? +++
>
>I may be using "interoperability" in a somewhat unconventional sense here.
>While the draft says that Synch and Steering layer is optional, I don't see
>that it requires implementations to always support a "no synch & steering"
>mode, even when they support one type of Synch and Steering layer.  Given
>that there's no mandatory Synch and Steering layer either, I don't see how
>two
>iSCSI boxes that a customer buys are guaranteed to interoperate.  Please
>comment if the draft already implies what I am asking for.
>--
>Mallikarjun
>
>
>Mallikarjun Chadalapaka
>Networked Storage Architecture
>Network Storage Solutions Organization
>MS 5668   Hewlett-Packard, Roseville.
>cbm@rose.hp.com
>
>>
>>This is also partly a result of choosing Format 2.
>>
>>Regards,
>>Julo
>>


From owner-ips@ece.cmu.edu  Wed Apr  4 20:00:47 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA04811
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 20:00:41 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34LwW723357
	for ips-outgoing; Wed, 4 Apr 2001 17:58:32 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from smtp017.mail.yahoo.com (smtp017.mail.yahoo.com [216.136.174.114])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f34LvZr23306
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 17:57:36 -0400 (EDT)
Received: from sdsl-216-36-75-164.dsl.sjc.megapath.net (HELO somesh) (216.36.75.164)
  by smtp.mail.vip.sc5.yahoo.com with SMTP; 4 Apr 2001 21:57:31 -0000
X-Apparently-From: <someshg@yahoo.com>
Reply-To: <someshg@yahoo.com>
From: "Somesh Gupta" <someshg@yahoo.com>
To: <julian_satran@il.ibm.com>, <someshg@yahoo.com>
Cc: <someshg@yahoo.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Date: Wed, 4 Apr 2001 14:52:05 -0700
Message-ID: <NMEALCLOIBCHBDHLCMIJIELMCCAA.someshg@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Importance: Normal
In-Reply-To: <C1256A24.007567CE.00@d12mta05.de.ibm.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,

There was an earlier discussion on this very thread, with
postings from Pierre and Bob. If you list all the postings on the
thread, please pick the ones from Pierre and Bob, and please
read the caveat in the last sentence of my posting.

To beat a dead horse again and again ...

The requirements for detecting errors are more stringent,
even though Steph makes very valid points in his last
message.

The requirements for recovery are different, as it is better
to let market forces and maintenance crews take care of the
bad middle boxes.
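
The rough estimate quoted further down in this thread can be checked
with a few lines of arithmetic. This is only a sketch: every input is
one of the assumed figures from that estimate (1 escaped packet in 10
billion, about 1K bytes per packet, 1 MByte per I/O), not a measurement,
and the function names are made up for illustration.

```python
# All inputs are the rough figures quoted in this thread, not measurements.
ESCAPE_ONE_IN_PACKETS = 10_000_000_000  # 1 corrupted packet escapes per 10 billion
PACKET_BYTES = 1_000                    # roughly 1K byte per packet
IO_BYTES = 1_000_000                    # each I/O assumed to be 1 MByte


def bytes_between_escapes() -> int:
    """Bytes transferred, on average, per escaped corrupted packet."""
    return ESCAPE_ONE_IN_PACKETS * PACKET_BYTES


def ios_between_escapes() -> int:
    """I/Os completed, on average, per I/O that needs command-level recovery."""
    return bytes_between_escapes() // IO_BYTES


def retransmit_overhead() -> float:
    """Extra bytes sent, as a fraction, if a whole I/O is resent per escape."""
    return IO_BYTES / bytes_between_escapes()


print(bytes_between_escapes())  # 10 trillion bytes
print(ios_between_escapes())    # 10 million I/Os
print(retransmit_overhead())    # about one part in 10 million
```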

Somesh

> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> julian_satran@il.ibm.com
> Sent: Wednesday, April 04, 2001 2:23 PM
> To: someshg@yahoo.com
> Cc: someshg@yahoo.com; ips@ece.cmu.edu
> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
> Somesh,
>
> Can you give us a reference for those rates?  Where do they come from?
>
> Regards,
> Julo
>
> "Somesh Gupta" <someshg@yahoo.com> on 04/04/2001 23:02:06
>
> Please respond to someshg@yahoo.com
>
> To:   Julian Satran/Haifa/IBM@IBMIL, someshg@yahoo.com
> cc:   ips@ece.cmu.edu
> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
> Assuming that the packet corruption escape rate is 1 in 10 billion,
> we have (roughly assuming 1K byte per packet), 1 escaped packet every
> 10 Trillion Bytes of data transfer. Seems to me that if I
> had to transfer 1 MBytes for having to recover at the
> command level rather than at a more granular level, that does
> not pose much of an additional burden (1 MB out of 10 Trillion
> bytes). Also assuming each i/o is 1 MByte in size, you would
> have to do recovery for every 1 in 10 million transactions.
>
> I don't know how realistic the 1 in 10 billion packet corruption
> escape rate is but I am using the number from past discussions.
>
> Somesh
>
> > -----Original Message-----
> > From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> > Sent: Wednesday, April 04, 2001 11:56 AM
> > To: someshg@yahoo.com
> > Cc: ips@ece.cmu.edu
> > Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> >
> >
> >
> > What are the numbers you are looking at:
> >
> > 1 per 10 sec, 1/10h or 1 /10y?
> >
> > Julo
> >
> > "Somesh Gupta" <someshg@yahoo.com> on 04/04/2001 20:15:53
> >
> > Please respond to someshg@yahoo.com
> >
> > To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> > cc:
> > Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> >
> >
> >
> >
> >
> > > -----Original Message-----
> > > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > > julian_satran@il.ibm.com
> > > Sent: Wednesday, April 04, 2001 7:32 AM
> > > To: ips@ece.cmu.edu
> > > Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >
> > >
> > >
> > >
> > > SNACK is here for two reasons - Status retry (which is cheap) and Data
> > > retry as a side benefit.
> >
> >   Unless there is clear benefit (i.e. the event is frequent enough
> >   to justify recovery at this level), the entire mechanism should be
> >   dropped - it is neither cheap nor free. If it is relatively
> >   infrequent, the recovery at the command level should be a sufficient
> >   mechanism
> >
> > > CRC errors are not that rare (although we don't have real data, the
> > > simulations with file systems seem to indicate that numbers could
> > > be as high as 0.0002%). A restart of a link is expensive (slow start),
> > > and even if the error rates are far lower, for many applications a
> > > slow start is a painful event.
> >
> >   Intuitively, it seems that the combination of link level CRC, TCP
> >   checksum, and good hardware (ECC, parity etc) should lead to a
> >   much lower level of errors caught by the iSCSI CRC algorithm. We have
> >   to separate error detection (i.e. what if I have bad hardware or
> >   some vendor makes bad/buggy intermediate system) from recovery
> >   mechanisms (not based on hardware being bad or buggy - market forces
> >   will weed out the vendor) which should not be based on assumptions
> >   of bugs in hardware/software of specific implementations.
> >
> > >
> > > Removing them from the spec is not a path we should take lightly.
> >
> >   I would phrase it the other way. We should not keep adding things
> >   unless there is very clear proof that the additional feature is
> >   beneficial and does not have negative side effects (and there is
> >   some consensus on adding it)
> > >
> > > Julo
> > >
> > > "Jon Hall" <jhall@emc.com> on 02/04/2001 16:13:35
> > >
> > > Please respond to "Jon Hall" <jhall@emc.com>
> > >
> > > To:   ips@ece.cmu.edu
> > > cc:
> > > Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > >
> > >
> > >
> > >
> > >
> > > I agree with Somesh.  And would go farther -- the complexity
> > > that results from retaining enough target-side state to respond
> > > to a SACK/SNACK request is non-trivial and needs clear justification.
> > > Intuitively, a CRC that discovers an error in an iSCSI pdu header
> > > (that the TCP cksum missed) seems like it should be a rare event.
> > >
> > > What is the frequency of this event?  IMO the answer to this
> > > question should be written into the protocol spec -- assuming
> > > that it substantiates the benefit of SACK/SNACK.  Otherwise, the
> > > SACK/SNACK pdu should be removed.
> > >
> > > -Jon
> > >
> > > julian_satran@il.ibm.com writes:
> > > >
> > > >Somesh,
> > > >
> > > >As I stated earlier - the DataSN was created to detect missing data
> > PDUs.
> > > >SNACK is needed to recover missing StatusSN and missing dataSN
> > is only a
> > > >bonus if the target wants to support it.  It is a trivial mechanism
> and
> > I
> > > >think it should stay.
> > > >
> > > >Julo
> > > >
> > > >"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
> > > >
> > > >Please respond to someshg@yahoo.com
> > > >
> > > >To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> > > >cc:
> > > >Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > > >
> > > >
> > > >
> > > >
> > > >Sorry to have been missing for a while. Hope you will
> > > >appreciate my being back in action :-). It was a fairly
> > > >clear consensus in Orlando that applications broke up
> > > >their transfers into reasonably small chunks i.e. they
> > > >did not have very long running transfers.
> > > >
> > > >Therefore the consensus was that a command level recovery
> > > >mechanism was sufficient instead of an ack/sack for each
> > > >data PDU.
> > > >
> > > >The SACK mechanism was a post Orlando invention. Without
> > > >an ack mechanism (for every data PDU), the SACK mechanism
> > > >just imposes additional burden on either end of the session,
> > > >without really much benefit.
> > > >
> > > >The benefit of having SACK is of saving bandwidth in case
> > > >the data part of the data PDU failed an integrity check
> > > >(but passed TCP checksum). This is a rare enough case that
> > > >as a percentage, the bandwidth loss from retransmitting
> > > >all the data associated with a read or write command is
> > > >very very small.
> > > >
> > > >In addition, it avoids the complexity of restarting
> > > > >something from the middle, as compared to from the beginning.
> > > >
> > > >To me it seems that there is significant simplicity (from
> > > >implementation, reliability and recovery process) from
> > > >having smaller data transfer per command.
> > > >
> > > >I would really like to get rid of the SACK command.
> > > >
> > > >Somesh
> > > >
> > > >> -----Original Message-----
> > > >> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On
> > Behalf Of
> > > >> julian_satran@il.ibm.com
> > > >> Sent: Wednesday, March 28, 2001 6:57 AM
> > > >> To: ips@ece.cmu.edu
> > > >> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> Mallikarjun,
> > > >>
> > > >> Last summer I thought that recovery within a connection
> > should be left
> > > to
> > > >> TCP. It is simple and could be made available through IPsec
> > (if no new
> > > >> option of any form can be added).
> > > >>
> > > >> Two things killed this:
> > > >>
> > > >>    The requirement to have a data encapsulation that can pass
> through
> > > >>    application proxies (like a storage router)
> > > >>    The "NO WAY" message we got from IESG-Security on a CRC only
> IPSec
> > > >>    header
> > > >>
> > > >>
> > > >> As for the ACK - I am very much in favor of it (it is a no brainer)
> > and
> > > >> implementations are in fact allowed to drop even unacked data.
> > > >>
> > > >> I am bound by the Orlando meeting decision to drop it. Except for the
> > > >> regular "oppose everything" crowd, the two vocal opponents were
> > > >> Somesh Gupta and Matt Wakeley.
> > > >>
> > > >> David may want or not to re-open the issue - I am not going
> > to ask for
> > > >it.
> > > >>
> > > >> Regards,
> > > >> Julo
> > > >>
> > > >> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
> > > >>
> > > >> Please respond to cbm@rose.hp.com
> > > >>
> > > >> To:   Black_David@emc.com
> > > >> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com,
> > > someshg@yahoo.com,
> > > >>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
> > > >>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
> > > >> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> David and Julian,
> > > >>
> > > >> I appreciate both your views, and should I say that they're
> > > >> along predicted lines :-)
> > > >>
> > > >> - David's right in saying that the situation is akin to FC's.
> > > >>   However, I would like to point out that FC is an unreliable
> > > >>   transport, and hence is forced to pick up a lot of the transport
> > > >>   baggage (at least in FCP-2, as I understand), in addition
> > > >>   to being a SCSI encapsulation layer.  Unfortunately, even with
> > > >>   TCP being the "reliable" transport, iSCSI is going along the
> > > >>   same lines - ie. transport baggage + SCSI encapsulation.  My
> > > >>   point is - if this is indeed a necessary evil, why don't we
> > > >>   complete iSCSI's transport functionality by data-ACKs?
> > > >>
> > > >> - If data SACK is introduced mostly to make up for TCP's
> > shortcomings,
> > > >>   we're making its usage (and implementation) drastically less
> > > appealing
> > > >>   since the only way error recovery algorithms can *rely* on
> > data SACK
> > > >>   is when replay is supported (or, "ReplaySupport=yes"  in my
> > > proposal),
> > > >>   which is extremely expensive.  IOW, we're defining data SACK in
> the
> > > >>   draft and not providing any incentives to implement and use it!
> > > >>
> > > >> - I submit that since iSCSI is being hailed as the ideal SCSI
> > Transport
> > > >>   protocol in its definition so far (and I believe, rightly so -
> > > >mandating
> > > >>   command ordering, bi-di support, SCSI CRN support to name a few
> > > >> examples),
> > > >>   the perfectly SCSI-legal R/W interactions that break in other
> > > >transports
> > > >>   *do not* have to break in iSCSI.
> > > >>
> > > >> - A last idea (may seem radical at this point) in regards to iSCSI
> > > >>   being a "full transport". This provides us an
> opportunity to "cast
> > > >>   off" the transport baggage in future when we truly move to a
> > > "reliable"
> > > >>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
> > > >>   keeping the encapsulation stuff separate from the
> transport stuff.
> > > >>   (Julian, I heard from Randy that ideas similar to this
> > were explored
> > > >>   in your Haifa meeting.  And yes, he recalls they were
> > given up since
> > > >>   TCP was supposed to be reliable and granularity of recovery
> > > was deemed
> > > >>   one I/O.)
> > > >>
> > > >> With that said, may I request David (with his co-chair hat on, :-))
> > > >> to add some binding comments/observations on this discussion?
> > > >>
> > > >> If we decide to leave data SACKs as unattractive to implement,
> > > the draft
> > > >> should in the least add a statement like - "Note that
> satisfying all
> > > >> possible data SACK requests for a task with an
> unacknowledged status
> > > >> implies implementing the I/O replay buffer on the part of targets."
> > > >> --
> > > >> Mallikarjun
> > > >>
> > > >>
> > > >> Mallikarjun Chadalapaka
> > > >> Networked Storage Architecture
> > > >> Network Storage Solutions Organization
> > > >> MS 5668   Hewlett-Packard, Roseville.
> > > >> cbm@rose.hp.com
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> >I think Julian's basically right -- I would point
> > > >> >out that any case of write after read that breaks
> > > >> >over iSCSI will also break over Fibre Channel.
> > > >> >On FC, the scenario starts with a frame CRC failure
> > > >> >on read data at the Initiator, so applications
> > > >> >have to cope and typically do so by enforcing
> > > >> >ordering at the app rather than using SCSI task
> > > >> >ordering.
> > > >> >
> > > >> >While SCSI has clever tools like ACA and task
> > > >> >ordering that appear to allow dependent operations
> > > >> >to be sent to the target concurrently, in practice
> > > >> >they don't work and/or aren't used (funny thing,
> > > >> >those two reinforce each other ;-) ).  Hence
> > > >> >a minimal approach to them is in order:
> > > >> >- Make sure the result will interoperate.
> > > >> >- Make sure T10 doesn't ding us for leaving something
> > > >> >    completely out.
> > > >> >- Don't specify anything not needed for the above.
> > > >> >
> > > >> >My 0.02,
> > > >> >--David
> > > >> >
> > > >> >> -----Original Message-----
> > > >> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> > > >> >> Sent:  Tuesday, March 27, 2001 9:23 AM
> > > >> >> To:    cbm@rose.hp.com
> > > >> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu;
> > hufferd@us.ibm.com;
> > > >> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
> > > >> >> Black_David@emc.com
> > > >> >> Subject:    Re: iSCSI ERT: data SACK/replay
> > buffer/"semi-transport"
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> Mallikarjun,
> > > >> >>
> > > >> >> I commiserate with you at the lack of ack for data but the
> Orlando
> > > >> meeting
> > > >> >> stated - no.  Recall that I kept the number only as a mechanism
> to
> > > >> detect
> > > >> >> missing packets.
> > > >> >>
> > > >> >> You can achieve the effect you want by keeping around data for a
> > > while
> > > >> >> (you
> > > >> >> determine how long and then discard).
> > > >> >>
> > > >> >> If a SACK comes and you can recover - fine. If not you either
> > > reaccess
> > > >> the
> > > >> >> media (if you know how) or reject
> > > >> >> and let the initiator retry.
> > > >> >>
> > > >> >> You should not worry about R/W conflicts as programs bound to
> have
> > > >such
> > > >> >> conflicts either:
> > > >> >>
> > > >> >> 1)can live with them or
> > > >> >> 2)protect themselves through some locks and rely on
> > > >> "operation-end-status"
> > > >> >> to keep results deterministic.
> > > >> >>
> > > >> >> Regards,
> > > >> >> Julo
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
> > > >> >>
> > > >> >> Please respond to cbm@rose.hp.com
> > > >> >>
> > > >> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
> > > >Julian
> > > >> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> > > >> >> cc:   Black_David@emc.com
> > > >> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> Hi Error Recovery Team,
> > > >> >>
> > > >> >> iSCSI can discard PDUs because of digest errors and request
> > > >> >> retransmissions using the iSCSI data SACK.  To deal with such
> > > >> >> an eventuality, targets that want to support data SACK have
> > > >> >> the following options:
> > > >> >>
> > > >> >> (A) maintain a complete "replay" buffer for the entire I/O since
> > > >> >>   a SACK could come anytime before the status is ack'ed by the
> > > >> >>   initiator. [ simple, but extremely expensive in memory
> > resources]
> > > >> >>
> > > >> >> (B) (re-introduce data-ACKs into the draft, and) implement
> > > data-ACKs.
> > > >> >>   Thus enables keeping only those I/O buffers that haven't been
> > > ack'ed
> > > >> >>   by the initiator. IOW, become a real full transport!
> [ everyone
> > > >> disliked
> > > >> >>   it earlier...]
> > > >> >>
> > > >> >> (C) re-access the medium for data retransmission requests.
> > > Now there
> > > >> >>   are 3 sub-cases in this to handle the changed data on the
> > > medium in
> > > >a
> > > >> >>   write-after-read scenario.  (SEE NOTE.1 at the bottom
> on how it
> > is
> > > >> >> legal.)
> > > >> >>      (1) On seeing any write, stall till status is ack'ed
> > > for all the
> > > >> >>             previous reads (basically drain the pipe).
> > [simple, but
> > > >> incurs
> > > >> >>             an additional roundtrip delay for all writes].
> > > >> >>      (2) A variation of the above, keep an eye only on the prior
> > > >> >>             overlapping reads. [more BW efficient, but
> > > complicated to
> > > >> >>             resolve the block dependencies in a stream of
> > > >> reads followed
> > > >> >>             by writes]
> > > >> >>         (3) Document the caveat and leave it upto the
> applications
> > > >> >>             to avoid this case since this leads to data
> integrity
> > > >> issues.
> > > >> >>             [pushing to apps since the transport can't get
> > > it right!]
> > > >> >>
> > > >> >> My first preference is (B), followed by (A), and I
> suggest we not
> > go
> > > >> >> to (C) at all with its inherent dangers.
> > > >> >>
> > > >> >> Doing (B) naturally completes the transport job that iSCSI has
> > taken
> > > >> >> on itself in view of TCP's claimed unreliable checksum.  That is
> > the
> > > >> >> right thing to do architecturally instead of being a
> > > "semi-transport"!
> > > >> >>
> > > >> >> Comments?
> > > >> >> --
> > > >> >> Mallikarjun
> > > >> >>
> > > >> >>
> > > >> >> Mallikarjun Chadalapaka
> > > >> >> Networked Storage Architecture
> > > >> >> Network Storage Solutions Organization
> > > >> >> MS 5668   Hewlett-Packard, Roseville.
> > > >> >> cbm@rose.hp.com
> > > >> >>
> > > >> >>
> > > >>
> > > >_________________________________________________________________
> > > _________
> > > >> >> Note.1: A Read followed by a Write (to the same blocks) is
> > perfectly
> > > >> legal
> > > >> >>         if SCSI sets the ORDERED task attribute on both the
> > > >> commands AND
> > > >> >>         sets the NACA bit to one to indicate that Write shall be
> > > >> executed
> > > >> >>         only if the Read did not fail (result in a Check
> > Condition).
> > > >> >>
> > > >> >>         In the current case, since Read completed just fine from
> > > >SCSI's
> > > >> >>         point of view, SCSI is moving on to execute
> Write.  Those
> > > read
> > > >> >> buffers
> > > >> >>         had been freed up since iSCSI received an ACK at the TCP
> > > >level,
> > > >> >> and
> > > >> >>         since iSCSI has no other way to have the data ack'ed!
> > >
> > >
> >
> >
> >
> >
> >
>
>
>
>
>





From owner-ips@ece.cmu.edu  Wed Apr  4 21:17:14 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05491
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 21:17:13 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34Kv0c18981
	for ips-outgoing; Wed, 4 Apr 2001 16:57:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from lub1028.lss.emc.com ([168.159.39.28])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34Ktkr18861
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 16:55:51 -0400 (EDT)
Received: from emc.com (IDENT:jhall@localhost.localdomain [127.0.0.1])
	by lub1028.lss.emc.com (8.9.3/8.9.3) with ESMTP id QAA28295
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 16:55:40 -0400
Message-Id: <200104042055.QAA28295@lub1028.lss.emc.com>
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
Date: Wed, 04 Apr 2001 16:55:40 -0400
From: "Jon Hall" <jhall@emc.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


But CRC errors are not really the issue.  It is the
singular case of a TCP cksum failing to detect what a
CRC succeeds in detecting, and this occurring to a TCP
segment containing an iSCSI hdr with a StatSN.

Is there a reason to believe that iSCSI StatSNs will be
lost at a higher rate than is currently documented for TCP
cksum failure?  Or is the problem a loss of one TCP segment
in tens (possibly hundreds) of millions of segments, where
the bad segment may contain a StatSN but probably doesn't
because it is a data pdu?  If the latter, why does a SCSI-level
timeout and retry (on the initiator) not suffice?  [Note,
an initiator timeout/retry does not require a connection
to be closed.]
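
To put a rough number on "probably doesn't", here is a small sketch. It
assumes, purely for illustration, that each command sends a fixed number
of data segments followed by one status-bearing segment, and that an
escaped error is equally likely to hit any segment. The function name
and the 1000-segments-per-status figure are hypothetical; only the "one
in tens of millions" rate comes from the question above.

```python
def segments_between_lost_status(escape_one_in: int,
                                 data_segments_per_status: int) -> int:
    """Expected segments sent per escaped error that hits a StatSN.

    escape_one_in: one segment in this many passes the TCP cksum while
        corrupted (assumed figure, e.g. tens of millions).
    data_segments_per_status: data segments sent for each status-bearing
        segment (hypothetical workload parameter).
    """
    # Only one segment in (data_segments_per_status + 1) carries a StatSN.
    return escape_one_in * (data_segments_per_status + 1)


# Assumed: 1 escape in 10 million segments, 1000 data segments per status.
print(segments_between_lost_status(10_000_000, 1000))
# Assumed: status-only traffic, every segment carries a StatSN.
print(segments_between_lost_status(10_000_000, 0))
```

Under these assumptions a lost StatSN is roughly a thousand times rarer
than an escaped error in general, which is the point of the question.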

I realize that I am being annoyingly repetitious, but it is
not an idle question.  For some targets, retained rsp status
is not cheap (and retained rsp data is not tractable at all).

IMO there appears to be no real need for SNACK.  And, more
radically, there appears to be no need for StatSNs.

Maybe, as Somesh said, this is a dead horse but why include
something in the spec which suggests a need for target-side
complexity, while not solving a clear and compelling
requirement?

-Jon

julian_satran@il.ibm.com writes:
>
>SNACK is here for two reasons - Status retry (which is cheap) and Data
>retry as a side benefit.
>CRC errors are not that rare (although we don't have real data, the
>simulations with file systems seem to indicate that numbers could be as
>high as 0.0002%). A restart of a link is expensive (slow start), and even
>if the error rates are far lower, for many applications a slow start is
>a painful event.
>
>Removing them from the spec is not a path we should take lightly.
>
>Julo
>
>"Jon Hall" <jhall@emc.com> on 02/04/2001 16:13:35
>
>Please respond to "Jon Hall" <jhall@emc.com>
>
>To:   ips@ece.cmu.edu
>cc:
>Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>
>
>
>I agree with Somesh.  And would go farther -- the complexity
>that results from retaining enough target-side state to respond
>to a SACK/SNACK request is non-trivial and needs clear justification.
>Intuitively, a CRC that discovers an error in an iSCSI pdu header
>(that the TCP cksum missed) seems like it should be a rare event.
>
>What is the frequency of this event?  IMO the answer to this
>question should be written into the protocol spec -- assuming
>that it substantiates the benefit of SACK/SNACK.  Otherwise, the
>SACK/SNACK pdu should be removed.
>
>-Jon
>
>julian_satran@il.ibm.com writes:
>>
>>Somesh,
>>
>>As I stated earlier - the DataSN was created to detect missing data PDUs.
>>SNACK is needed to recover missing StatusSN and missing dataSN is only a
>>bonus if the target wants to support it.  It is a trivial mechanism and I
>>think it should stay.
>>
>>Julo
>>
>>"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
>>
>>Please respond to someshg@yahoo.com
>>
>>To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
>>cc:
>>Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>
>>
>>
>>
>>Sorry to have been missing for a while. Hope you will
>>appreciate my being back in action :-). It was a fairly
>>clear consensus in Orlando that applications broke up
>>their transfers into reasonably small chunks i.e. they
>>did not have very long running transfers.
>>
>>Therefore the consensus was that a command level recovery
>>mechanism was sufficient instead of an ack/sack for each
>>data PDU.
>>
>>The SACK mechanism was a post Orlando invention. Without
>>an ack mechanism (for every data PDU), the SACK mechanism
>>just imposes additional burden on either end of the session,
>>without really much benefit.
>>
>>The benefit of having SACK is of saving bandwidth in case
>>the data part of the data PDU failed an integrity check
>>(but passed TCP checksum). This is a rare enough case that
>>as a percentage, the bandwidth loss from retransmitting
>>all the data associated with a read or write command is
>>very very small.
>>
>>In addition, it avoids the complexity of restarting
>>something from the middle, as compared to from the beginning.
>>
>>To me it seems that there is significant simplicity (from
>>implementation, reliability and recovery process) from
>>having smaller data transfer per command.
>>
>>I would really like to get rid of the SACK command.
>>
>>Somesh
>>
>>> -----Original Message-----
>>> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
>>> julian_satran@il.ibm.com
>>> Sent: Wednesday, March 28, 2001 6:57 AM
>>> To: ips@ece.cmu.edu
>>> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>
>>>
>>>
>>>
>>> Mallikarjun,
>>>
>>> Last summer I thought that recovery within a connection should be left
>>> to TCP. It is simple and could be made available through IPsec (if no new
>>> option of any form can be added).
>>>
>>> Two things killed this:
>>>
>>>    The requirement to have a data encapsulation that can pass through
>>>    application proxies (like a storage router)
>>>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
>>>    header
>>>
>>>
>>> As for the ACK - I am very much in favor of it (it is a no brainer) and
>>> implementations are in fact allowed to drop even unacked data.
>>>
>>> I am bound by the Orlando meeting decision to drop it. Except for the
>>> regular "oppose everything" crowd, the two vocal opponents were Somesh
>>> Gupta and Matt Wakeley.
>>>
>>> David may or may not want to re-open the issue - I am not going to ask
>>> for it.
>>>
>>> Regards,
>>> Julo
>>>
>>> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
>>>
>>> Please respond to cbm@rose.hp.com
>>>
>>> To:   Black_David@emc.com
>>> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com, someshg@yahoo.com,
>>>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
>>>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
>>> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>
>>>
>>>
>>>
>>> David and Julian,
>>>
>>> I appreciate both your views, and should I say that they're
>>> along predicted lines :-)
>>>
>>> - David's right in saying that the situation is akin to FC's.
>>>   However, I would like to point out that FC is an unreliable
>>>   transport, and hence is forced to pick up a lot of the transport
>>>   baggage (at least in FCP-2, as I understand), in addition
>>>   to being a SCSI encapsulation layer.  Unfortunately, even with
>>>   TCP being the "reliable" transport, iSCSI is going along the
>>>   same lines - ie. transport baggage + SCSI encapsulation.  My
>>>   point is - if this is indeed a necessary evil, why don't we
>>>   complete iSCSI's transport functionality by data-ACKs?
>>>
>>> - If data SACK is introduced mostly to make up for TCP's shortcomings,
>>>   we're making its usage (and implementation) drastically less appealing
>>>   since the only way error recovery algorithms can *rely* on data SACK
>>>   is when replay is supported (or, "ReplaySupport=yes"  in my proposal),
>>>   which is extremely expensive.  IOW, we're defining data SACK in the
>>>   draft and not providing any incentives to implement and use it!
>>>
>>> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
>>>   protocol in its definition so far (and I believe, rightly so -
>>>   mandating command ordering, bi-di support, SCSI CRN support to name a
>>>   few examples), the perfectly SCSI-legal R/W interactions that break in
>>>   other transports *do not* have to break in iSCSI.
>>>
>>> - A last idea (may seem radical at this point) in regards to iSCSI
>>>   being a "full transport". This provides us an opportunity to "cast
>>>   off" the transport baggage in future when we truly move to a
>>>   "reliable" transport (perhaps TCP with CRCs/SCTP ?) - if we do a good
>>>   job of keeping the encapsulation stuff separate from the transport
>>>   stuff.
>>>   (Julian, I heard from Randy that ideas similar to this were explored
>>>   in your Haifa meeting.  And yes, he recalls they were given up since
>>>   TCP was supposed to be reliable and granularity of recovery was deemed
>>>   one I/O.)
>>>
>>> With that said, may I request David (with his co-chair hat on, :-))
>>> to add some binding comments/observations on this discussion?
>>>
>>> If we decide to leave data SACKs as unattractive to implement, the draft
>>> should in the least add a statement like - "Note that satisfying all
>>> possible data SACK requests for a task with an unacknowledged status
>>> implies implementing the I/O replay buffer on the part of targets."
>>> --
>>> Mallikarjun
>>>
>>>
>>> Mallikarjun Chadalapaka
>>> Networked Storage Architecture
>>> Network Storage Solutions Organization
>>> MS 5668   Hewlett-Packard, Roseville.
>>> cbm@rose.hp.com
>>>
>>>
>>>
>>>
>>> >I think Julian's basically right -- I would point
>>> >out that any case of write after read that breaks
>>> >over iSCSI will also break over Fibre Channel.
>>> >On FC, the scenario starts with a frame CRC failure
>>> >on read data at the Initiator, so applications
>>> >have to cope and typically do so by enforcing
>>> >ordering at the app rather than using SCSI task
>>> >ordering.
>>> >
>>> >While SCSI has clever tools like ACA and task
>>> >ordering that appear to allow dependent operations
>>> >to be sent to the target concurrently, in practice
>>> >they don't work and/or aren't used (funny thing,
>>> >those two reinforce each other ;-) ).  Hence
>>> >a minimal approach to them is in order:
>>> >- Make sure the result will interoperate.
>>> >- Make sure T10 doesn't ding us for leaving something
>>> >    completely out.
>>> >- Don't specify anything not needed for the above.
>>> >
>>> >My 0.02,
>>> >--David
>>> >
>>> >> -----Original Message-----
>>> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
>>> >> Sent:  Tuesday, March 27, 2001 9:23 AM
>>> >> To:    cbm@rose.hp.com
>>> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
>>> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
>>> >> Black_David@emc.com
>>> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>> >>
>>> >>
>>> >>
>>> >> Mallikarjun,
>>> >>
>>> >> I commiserate with you at the lack of ack for data but the Orlando
>>> >> meeting stated - no.  Recall that I kept the number only as a
>>> >> mechanism to detect missing packets.
>>> >>
>>> >> You can achieve the effect you want by keeping around data for a
>>> >> while (you determine how long and then discard).
>>> >>
>>> >> If a SACK comes and you can recover - fine. If not you either
>>> >> reaccess the media (if you know how) or reject
>>> >> and let the initiator retry.
>>> >>
>>> >> You should not worry about R/W conflicts as programs bound to have
>>> >> such conflicts either:
>>> >>
>>> >> 1)can live with them or
>>> >> 2)protect themselves through some locks and rely on
>>> >> "operation-end-status" to keep results deterministic.
>>> >>
>>> >> Regards,
>>> >> Julo
>>> >>
>>> >>
>>> >>
>>> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
>>> >>
>>> >> Please respond to cbm@rose.hp.com
>>> >>
>>> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
>>Julian
>>> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
>>> >> cc:   Black_David@emc.com
>>> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> Hi Error Recovery Team,
>>> >>
>>> >> iSCSI can discard PDUs because of digest errors and request
>>> >> retransmissions using the iSCSI data SACK.  To deal with such
>>> >> an eventuality, targets that want to support data SACK have
>>> >> the following options:
>>> >>
>>> >> (A) maintain a complete "replay" buffer for the entire I/O since
>>> >>   a SACK could come anytime before the status is ack'ed by the
>>> >>   initiator. [ simple, but extremely expensive in memory resources]
>>> >>
>>> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
>>> >>   This enables keeping only those I/O buffers that haven't been
>>> >>   ack'ed by the initiator. IOW, become a real full transport!
>>> >>   [ everyone disliked it earlier...]
>>> >>
>>> >> (C) re-access the medium for data retransmission requests.  Now there
>>> >>   are 3 sub-cases in this to handle the changed data on the medium in
>>> >>   a write-after-read scenario.  (SEE NOTE.1 at the bottom on how it is
>>> >>   legal.)
>>> >>      (1) On seeing any write, stall till status is ack'ed for all the
>>> >>          previous reads (basically drain the pipe). [simple, but
>>> >>          incurs an additional roundtrip delay for all writes].
>>> >>      (2) A variation of the above, keep an eye only on the prior
>>> >>          overlapping reads. [more BW efficient, but complicated to
>>> >>          resolve the block dependencies in a stream of reads
>>> >>          followed by writes]
>>> >>      (3) Document the caveat and leave it up to the applications
>>> >>          to avoid this case since this leads to data integrity
>>> >>          issues. [pushing to apps since the transport can't get it
>>> >>          right!]
>>> >>
>>> >> My first preference is (B), followed by (A), and I suggest we not go
>>> >> to (C) at all with its inherent dangers.
>>> >>
>>> >> Doing (B) naturally completes the transport job that iSCSI has taken
>>> >> on itself in view of TCP's claimed unreliable checksum.  That is the
>>> >> right thing to do architecturally instead of being a
>>> >> "semi-transport"!
>>> >>
>>> >> Comments?
>>> >> --
>>> >> Mallikarjun
>>> >>
>>> >>
>>> >> Mallikarjun Chadalapaka
>>> >> Networked Storage Architecture
>>> >> Network Storage Solutions Organization
>>> >> MS 5668   Hewlett-Packard, Roseville.
>>> >> cbm@rose.hp.com
>>> >>
>>> >>
>>> >> __________________________________________________________________
>>> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly
>>> >>         legal if SCSI sets the ORDERED task attribute on both the
>>> >>         commands AND sets the NACA bit to one to indicate that Write
>>> >>         shall be executed only if the Read did not fail (result in a
>>> >>         Check Condition).
>>> >>
>>> >>         In the current case, since Read completed just fine from
>>> >>         SCSI's point of view, SCSI is moving on to execute Write.
>>> >>         Those read buffers had been freed up since iSCSI received an
>>> >>         ACK at the TCP level, and since iSCSI has no other way to
>>> >>         have the data ack'ed!
>
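
The memory trade-off between options (A) and (B) in the Error Recovery
Team message quoted above can be illustrated with a small sketch. This is
only an illustration under stated assumptions, not anything from the draft:
the class name, the method names, and the "acked below N" form of a data-ACK
are all hypothetical, and sequence-number wraparound is ignored.

```python
class ReplayBuffer:
    """Hypothetical per-task buffer of data PDUs a target has already sent."""

    def __init__(self, use_data_acks):
        # use_data_acks=False models option (A): keep everything until
        # the status is acknowledged.
        # use_data_acks=True models option (B): trim on each data-ACK.
        self.use_data_acks = use_data_acks
        self.pdus = {}  # DataSN -> payload bytes

    def sent(self, data_sn, payload):
        # Remember every data PDU as it goes out.
        self.pdus[data_sn] = payload

    def on_data_ack(self, acked_below):
        # Assumed ACK form: all DataSNs below acked_below were received.
        # Under option (A) no such ACK exists, so nothing can be released.
        if self.use_data_acks:
            for sn in [sn for sn in self.pdus if sn < acked_below]:
                del self.pdus[sn]

    def on_status_acked(self):
        # Once the status is acknowledged, no SACK can arrive for this task.
        self.pdus.clear()

    def on_data_sack(self, data_sn):
        # Return the payload to retransmit, or None if it was released.
        return self.pdus.get(data_sn)

    def held_bytes(self):
        return sum(len(p) for p in self.pdus.values())


option_a = ReplayBuffer(use_data_acks=False)
option_b = ReplayBuffer(use_data_acks=True)
for buf in (option_a, option_b):
    for sn in range(4):
        buf.sent(sn, bytes(8192))
    buf.on_data_ack(acked_below=3)

print(option_a.held_bytes(), option_b.held_bytes())
```

With four 8 KB data PDUs sent and the first three acknowledged, the
option (A) buffer still holds all four while the option (B) buffer holds
only the unacknowledged one.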


From owner-ips@ece.cmu.edu  Wed Apr  4 21:17:42 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05503
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 21:17:41 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34N4fs27843
	for ips-outgoing; Wed, 4 Apr 2001 19:04:41 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34N4Vr27832
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 19:04:31 -0400 (EDT)
Received: from colosus2.cup.hp.com (colosus2.cup.hp.com [15.13.128.145])
	by palrel3.hp.com (Postfix) with ESMTP id 2AB433A1
	for <ips@ece.cmu.edu>; Wed,  4 Apr 2001 16:04:30 -0700 (PDT)
Received: from hp.com (IDENT:plabat@pl703521.cup.hp.com [15.13.133.216])
	by colosus2.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id QAA19667;
	Wed, 4 Apr 2001 16:04:29 -0700 (PDT)
Message-ID: <3ACBA915.2F2D882B@hp.com>
Date: Wed, 04 Apr 2001 16:07:01 -0700
From: Pierre Labat <pierre_labat@hp.com>
Organization: Hewlett Packard ATM-SISL
X-Mailer: Mozilla 4.51 [en] (X11; I; Linux 2.2.5-15 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <C1256A24.00755916.00@d12mta02.de.ibm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

julian_satran@il.ibm.com wrote:

> Yes - and I think that the effects can be observed even with random errors
> because of the weakness of the checksum.

Julian,

This paper shows that the TCP checksum is weaker in the case of non-uniform
data.

But we are interested in the overall performance of link layer CRC
+ TCP checksum.

And if I look at the conclusion of this paper,
"VII. Observations and recommendations", there is (page 540, upper left):
"The ATM CRC will fail to detect a splice approximately at a rate of 1 in 2^32.
 Therefore the chance of the TCP checksum being called upon to detect a splice
 is much less than 1 in 10^-8 * 2^-32 or less than one chance in 10^17"

Hence I get an overall escape rate of 1 in 10^17, or
in other words 0.000000000000001%

How do you get this number of 0.0002% ?
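
The gap between the two figures can be checked with a few lines of exact
arithmetic. This is a rough sketch, assuming the quoted expression is meant
as 10^-8 times 2^-32 and that the bound is one chance in 10^17; the variable
names are illustrative only, and no claim is made here about what the 10^-8
factor denotes in the paper.

```python
from fractions import Fraction

# Assumed reading of the quoted passage: a factor of 10^-8 multiplied by
# the 1-in-2^32 chance that the link-layer CRC misses a splice.
first_factor = Fraction(1, 10**8)
crc_miss = Fraction(1, 2**32)
reaches_tcp_checksum = first_factor * crc_miss

# The "less than one chance in 10^17" bound, and the same bound in percent.
bound = Fraction(1, 10**17)
bound_percent = bound * 100

# The 0.0002% figure from the earlier message, expressed as a plain fraction.
claimed = Fraction(2, 10**4) / 100

print(float(reaches_tcp_checksum))  # product of the two factors
print(float(bound_percent))         # the bound as a percentage
print(claimed / bound)              # how many times larger 0.0002% is
```

Under these assumptions the product is indeed below 10^-17, the bound
corresponds to 10^-15 percent, and 0.0002% is larger than the bound by a
factor of 2 x 10^11.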

Regards,

Pierre

>
>
> Julo
>
> Pierre Labat <pierre_labat@hp.com> on 04/04/2001 23:13:38
>
> Please respond to Pierre Labat <pierre_labat@hp.com>
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
> julian_satran@il.ibm.com wrote:
>
> > SNACK is here for two reasons - Status retry (which is cheap) and Data
> > retry as a side benefit.
> > CRC errors are not that rare (although we don't have real data, the
> > simulations with file systems seem to indicate that numbers could be as
> > high as 0.0002%).
>
> Julo,
>
> Could you explain how you get this number.
> Does it come from
>     J. Stone et al., "Performance of Checksums and CRC's over Real Data",
>     IEEE/ACM Transactions on Networking, Vol. 6, No. 5, October 1998
>
> http://dev.acm.org/pubs/articles/journals/ton/1998-6-5/p529-stone/p529-stone.pdf
>
> ???
> I don't see how you got this number.
> What I saw was:
>     Less than 1 escape in 10^17 segments when taking into account the
>     link layer AAL5 CRC (see page 540, left column, on top).
>
> Regards,
>
> Pierre
>
> > A restart of the link is expensive (slow start), and even if the
> > numbers are far lower, for many applications a slow start is a painful
> > event.
> >
> > Removing them from the spec is not a path we should take lightly.
> >
> > Julo



From owner-ips@ece.cmu.edu  Wed Apr  4 21:19:18 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05531
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 21:19:17 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f34N3hk27801
	for ips-outgoing; Wed, 4 Apr 2001 19:03:43 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d06lmsgate-2.uk.ibm.com (d06lmsgate-2.uk.ibm.com [195.212.29.2])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f34N2Jr27738
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 19:02:19 -0400 (EDT)
Received: from d06relay01.portsmouth.uk.ibm.com (d06relay01.portsmouth.uk.ibm.com [9.166.84.147])
	by d06lmsgate-2.uk.ibm.com (1.0.0) with ESMTP id XAA88718
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 23:46:55 +0100
From: julian_satran@il.ibm.com
Received: from d06mta05.portsmouth.uk.ibm.com (d06mta05_cs0 [9.180.35.3])
	by d06relay01.portsmouth.uk.ibm.com (8.8.8m3/NCO v4.95) with SMTP id AAA98912
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 00:02:07 +0100
Received: by d06mta05.portsmouth.uk.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id 80256A24.007E8642 ; Thu, 5 Apr 2001 00:01:59 +0100
X-Lotus-FromDomain: IBMIL@IBMDE@IBMGB
To: cbm@rose.hp.com
cc: ips@ece.cmu.edu
Message-ID: <80256A24.00742311.00@d06mta05.portsmouth.uk.ibm.com>
Date: Wed, 4 Apr 2001 20:52:44 +0200
Subject: Re: iSCSI: synch and steering comments
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Mallikarjun,

The draft now has one single mechanism and this is optional.
Two implementations that have different synch and steering will not be
able to use them.

Alternatively this group may want to mandate one for interoperability.

My take is that we can wait (until iSCSI-02 -:))

Julo

"Mallikarjun C." <cbm@rose.hp.com> on 04/04/2001 20:02:12

Please respond to cbm@rose.hp.com

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI: synch and steering comments




Julian,

>Mallikarjun,
>
>I am not sure about which comment. If it is about synch and steering I
>think that recovery from header digest errors
>should not mandate a synch mechanism.  Some very sophisticated
>implementations may want to take advantage of such a mechanism if it is
>there. As this interaction may be fairly complex and implementation dependent
>we will assume that we will drop the connection in the recovery
>descriptions we will provide.

Sorry, I am not clear on what you meant (keep in mind that I am
not asking to mandate a synch mechanism) -

Are you saying that when synch and steering is not implemented in an iSCSI
device:
  a) it can recover from header digest errors only by dropping the
     connection (which the ER algorithms would assume)
OR
  b) it can recover from header digest errors by "fairly complex and
     implementation dependent" mechanisms which rely on the Length field
     anyway, and try to analyze perhaps several PDUs for achieving the
     framing synch again?

I am assuming that it would be (a).  I am also requesting that this be made
clear in the draft.

Here's the third comment at the bottom that you missed.  Your comments
would be very helpful.  Thanks!

-It appears to me that at least one Synch and Steering layer must be
 defined/referred to as the minimal implementation in the main draft to
 enable interoperability, when implementations do implement Synch and
 Steering.

+++ why ? +++

I may be using "interoperability" in a somewhat unconventional sense here.
While the draft says that Synch and Steering layer is optional, I don't see
that it requires implementations to always support a "no synch & steering"
mode, even when they support one type of Synch and Steering layer.  Given
that there's no mandatory Synch and Steering layer either, I don't see how
two iSCSI boxes that a customer buys are guaranteed to interoperate.  Please
comment if the draft already implies what I am asking for.
--
Mallikarjun


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668   Hewlett-Packard, Roseville.
cbm@rose.hp.com

>
>This is also partly a result of choosing Format 2.
>
>Regards,
>Julo
>
>"Mallikarjun C." <cbm@rose.hp.com> on 02/04/2001 07:14:54
>
>Please respond to cbm@rose.hp.com
>
>To:   ips@ece.cmu.edu
>cc:
>Subject:  Re: iSCSI: synch and steering comments
>
>
>
>
>Julian,
>
>Thanks for the clarification.
>
>Could you please take time to respond to the other two comments I had?
>Or, do I take it that you will get back shortly?
>
>If those comments are indeed incorrect, please help me understand why
>so.
>
>Thank you.
>--
>Mallikarjun
>
>
>Mallikarjun Chadalapaka
>Networked Storage Architecture
>Network Storage Solutions Organization
>MS 5668   Hewlett-Packard, Roseville.
>cbm@rose.hp.com
>
>
>>I've marked it with ---
>>
>>Matt Wakeley <matt_wakeley@agilent.com> on 31/03/2001 10:25:25
>>
>>Please respond to Matt Wakeley <matt_wakeley@agilent.com>
>>
>>To:   IPS Reflector <ips@ece.cmu.edu>
>>cc:
>>Subject:  Re: iSCSI: synch and steering comments
>>
>>
>>
>>
>>Julian,
>>
>>There were many comments in this message.  To which comment are you
>>referring?
>>
>>-Matt
>>
>>julian_satran@il.ibm.com wrote:
>>>
>>> Mallikarjun,
>>>
>>> It is clearly communicated in the paragraph above it - but fine, I will
>>> add it here too.
>>>
>>> Julo
>>>
>>> "Mallikarjun C." <cbm@rose.hp.com> on 30/03/2001 00:54:20
>>>
>>> Please respond to cbm@rose.hp.com
>>>
>>> To:   ips@ece.cmu.edu
>>> cc:
>>> Subject:  Re: iSCSI: synch and steering comments
>>>
>>> Julian,
>>>
>>> Some comments.
>>>
>>> >Answers in text. Thanks, Julo
>>> >
>>> >
>>> ..
>>>
>>> >-Suggest adding the following statement to section 1.2.8.2.
>>> >
>>> > All conventional, in-order data arrival notifications generated by
>>> > TCP are passed through to iSCSI by the Synch and Steering layer after
>>> > appropriate data placements while none of the out-of-order data
>>> > placements that it performs are communicated to upper layers.
>>> >
>>> >+++ I have added the following to 1.2.8.2
>>> >
>>> >   On the incoming path the Synch and Steering layer does not change
>>> >   the way TCP notifies iSCSI about in-order data arrival.  All
>>> >   out-of-order data placements performed by the Synch and Steering
>>> >   layer are hidden from iSCSI.
>>-------------------------------------------------------------------------------

>
>>>
>>> Okay, I'd however prefer it to imply that in-order data placement is
>>> also handled by Synch and Steering in the same sentence, instead of only
>>> commenting on in-order notifications, and out-of-order placements.
>>>
>>-------------------------------------------------------------------------------

>
>>
>>> >
>>> >   I have also changed the figure a bit to better convey the fact that
>>> >   TCP and Synch&Steering are related (not strictly layered) +++
>>>
>>> That's a good idea.
>>>
>>> >
>>> >   ++++
>>> >
>>> >-Section 1.2.8.2 states that a Synch and Steering layer is optional.
>>> > It has to be qualified that it is optional only for those iSCSI
>>> > devices which perform connection recovery on header digest errors,
>>> > since that's how they cope with loss of framing. (I guess this may
>>> > change in next rev?)
>>> >
>>> >+++ with the new format I think that we have:
>>> >
>>> >- one more chance if we go for format 1 or
>>> >- drop the connection on header error
>>> >
>>> >In both cases we can leave synch and steering optional
>>>
>>> Well, that doesn't address the thrust of my comment.  I was implying
>>> that the draft should make it clear that those implementations which
>>> don't support Synch and Steering should end the connection on a header
>>> digest error and/or parity error, and not go into (what Somesh called)
>>> a speculative mode.
>>>
>>> >
>>> >+++
>>> >
>>> >-It appears to me that at least one Synch and Steering layer must be
>>> > defined/referred to as the minimal implementation in the main draft
>>> > to enable interoperability, when implementations do implement Synch
>>> > and Steering.
>>> >
>>> >+++ why ? +++
>>>
>>> I may be using "interoperability" in a somewhat unconventional sense
>>> here.  While the draft says that Synch and Steering layer is optional,
>>> I don't see that it requires implementations to always support a "no
>>> synch & steering" mode, even when they support one type of Synch and
>>> Steering layer.  Given that there's no mandatory Synch and Steering
>>> layer either, I don't see how two iSCSI boxes that a customer buys are
>>> guaranteed to interoperate.  Please comment if the draft already
>>> implies what I am asking for.
>>>
>>> Thanks.
>>> --
>>> Mallikarjun
>>>
>>> Mallikarjun Chadalapaka
>>> Networked Storage Architecture
>>> Network Storage Solutions Organization
>>> MS 5668   Hewlett-Packard, Roseville.
>>> cbm@rose.hp.com
>>
>>
>>
>>
>
>
>
>
>
>







From owner-ips@ece.cmu.edu  Wed Apr  4 22:10:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA07026
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 22:10:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3508aO01618
	for ips-outgoing; Wed, 4 Apr 2001 20:08:36 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3507sr01589
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 20:07:54 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id CAA189858;
	Thu, 5 Apr 2001 02:07:47 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta06_cs0 [9.165.222.255])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id CAA41320;
	Thu, 5 Apr 2001 02:04:36 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A25.0000B0D5 ; Thu, 5 Apr 2001 02:07:32 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Santosh Rao <santoshr@cup.hp.com>
cc: Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
Message-ID: <C1256A25.00004D90.00@d12mta05.de.ibm.com>
Date: Thu, 5 Apr 2001 02:03:12 +0200
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

I can't find the place where this is stated. SNACK as a PDU type is
mandated. But it can be rejected outright.

1.2.2.2 shows explicitly that SACK can be rejected. We can add a protocol
specific parameter in the target Logical Unit Control Page (non-settable) by
which the target will indicate support for SNACK.

Julo

Santosh Rao <santoshr@cup.hp.com> on 04/04/2001 23:53:32

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"





julian_satran@il.ibm.com wrote:
>
> Jon,
>
> Inexpensive implementations are always free to do away with recovery. That
> is true for targets too.

That's not the interpretation one gets from reading the spec and prior
discussions on this list. Per the spec, support for Status SACK is
mandatory while support for data SACK is optional.

IOW, targets MUST retain state information to satisfy a potential
status SACK request.

- Santosh


> But by not specifying the mechanism for the more expensive one we make them
> non-interoperable.




From owner-ips@ece.cmu.edu  Wed Apr  4 22:58:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA08415
	for <ips-archive@odin.ietf.org>; Wed, 4 Apr 2001 22:58:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3511b204636
	for ips-outgoing; Wed, 4 Apr 2001 21:01:37 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f35114r04576
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 21:01:04 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id AC007573; Wed,  4 Apr 2001 18:01:03 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id SAA00870;
	Wed, 4 Apr 2001 18:00:57 -0700 (PDT)
Message-ID: <3ACBC502.5265BF34@cup.hp.com>
Date: Wed, 04 Apr 2001 18:06:10 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
Cc: Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <C1256A25.00004D90.00@d12mta05.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------9F095CEECBD02A09905E7075"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------9F095CEECBD02A09905E7075
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Julian,


"6.7.1.1 Recovery Within-connection 
    
   At the initiator, the following cases lend themselves to
within-connection recovery: (1)Lost iSCSI numbered Response recognized
by either receiving it with a data digest error or receiving a Response
PDU with a higher StatSN than expected. The initiator MAY request the
missing responses through SACK, in which case the target MUST reissue
them. "


To me, the following statement :
"The initiator MAY request the missing responses through SACK, in which
case the target MUST reissue them."

implies that the target MUST be able to re-issue a SCSI Response PDU,
when a Status SACK is received, which implies the target MUST maintain
the I/O state information around [until StatSN is acknowledged.]

Section 1.2.2.2 of my version of iSCSI rev 05 (taken from
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-05.txt) reads
as follows :

"1.2.2.2 Response/Status Numbering and Acknowledging 
    
   Responses in transit from the target to the initiator are numbered.  
   The StatSN (Status Sequence Number) is used for this purpose. StatSN 
   is a counter maintained per connection.  ExpStatSN is used by the 
   initiator to acknowledge status. 
    
   Status numbering starts after Login. During login, there is always 
   only one outstanding command per connection and status numbering is 
   not needed. 
   The login response includes an initial value for status numbering. 
    

  
Satran, J.       Standards-Track, Expire October 2001               12 

                                iSCSI                   March 1, 2001 
 
 
   To enable command recovery the target MAY maintain enough state 
   information to enable data and status recovery after a connection 
   failure. 
   A target can discard all the state information maintained for 
   recovery after the status delivery is acknowledged through ExpStatSN. 
   A large difference between StatSN and ExpStatSN may indicate a failed 
   connection. 
    
   Initiators and Targets MUST support the response-numbering scheme. "


Can you please clarify where in the above section is there an example
that Status SACK can be rejected by targets ? Further, the Reject PDU
(Section 2.20.1) only allows a reject reason of Data SACK reject. There
is no reason code of "Status SACK Reject".

All this to me implies that Status SACK support is mandatory and targets
MUST maintain I/O state information until the StatSN is acknowledged.
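
Read this way, the target-side bookkeeping would look roughly like the
sketch below. It is only an illustration of the interpretation argued
above, not text or an algorithm from the draft: the class and method names
are hypothetical, the (first StatSN, count) form of the SACK request is an
assumption, and sequence-number wraparound is ignored.

```python
class StatusRetention:
    """Hypothetical per-connection state a target keeps for status recovery."""

    def __init__(self, initial_stat_sn):
        self.next_stat_sn = initial_stat_sn
        self.unacked = {}  # StatSN -> response PDU kept for possible reissue

    def send_response(self, response):
        # Number the response and keep it until it is acknowledged.
        stat_sn = self.next_stat_sn
        self.unacked[stat_sn] = response
        self.next_stat_sn += 1
        return stat_sn

    def on_exp_stat_sn(self, exp_stat_sn):
        # ExpStatSN acknowledges every response numbered below it, so the
        # state held for those responses can be discarded.
        for sn in [sn for sn in self.unacked if sn < exp_stat_sn]:
            del self.unacked[sn]

    def on_status_sack(self, first_sn, count):
        # Reissue the requested responses if they are all still held.
        # Returning None marks the case with no defined reject reason.
        wanted = range(first_sn, first_sn + count)
        if all(sn in self.unacked for sn in wanted):
            return [self.unacked[sn] for sn in wanted]
        return None


conn = StatusRetention(initial_stat_sn=1)
for name in ("resp-A", "resp-B", "resp-C"):
    conn.send_response(name)

conn.on_exp_stat_sn(2)                 # initiator has seen StatSN 1
reissued = conn.on_status_sack(2, 2)   # asks again for StatSN 2 and 3
released = conn.on_status_sack(1, 1)   # StatSN 1 was already discarded
print(reissued, released)
```

The point of the sketch is that every response stays in memory until
ExpStatSN moves past it, which is the state-retention cost in question.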

Can you please clarify if this is not the case, and if so, what is the
intent of the above text ?

- Santosh


julian_satran@il.ibm.com wrote:
> 
> Santosh,
> 
> I can't find the place where this is stated. SNACK as a PDU type is
> mandated. But it can be rejected outright.
> 
> 1.2.2.2 shows explicitly that SACK can be rejected. We can add a protocol
> specific parameter in the target Logical Unit Control Page (non-settable) by
> which the target will indicate support for SNACK.
> 
> Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 04/04/2001 23:53:32
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   Julian Satran/Haifa/IBM@IBMIL
> cc:   Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> 
> julian_satran@il.ibm.com wrote:
> >
> > Jon,
> >
> > Inexpensive implementations are always free to do away with recovery. That
> > is true for targets too.
> 
> That's not the interpretation one gets from reading the spec and prior
> discussions on this list. Per the spec, support for Status SACK is
> mandatory while support for data SACK is optional.
> 
> IOW, targets MUST retain state information to satisfy a potential
> status SACK request.
> 
> - Santosh
> 
> > But not specifying the mechanism for the more expensive one we make them
> > non-interoperable.



From owner-ips@ece.cmu.edu  Thu Apr  5 00:39:23 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA09753
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 00:39:17 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3529di08043
	for ips-outgoing; Wed, 4 Apr 2001 22:09:39 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3528lr08003
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 22:08:47 -0400 (EDT)
Received: from core.rose.hp.com (core.rose.hp.com [15.43.208.100])
	by palrel1.hp.com (Postfix) with ESMTP id 2E4F445
	for <ips@ece.cmu.edu>; Wed,  4 Apr 2001 19:08:46 -0700 (PDT)
Received: (from cbm@localhost) by core.rose.hp.com (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id TAA17086 for ips@ece.cmu.edu; Wed, 4 Apr 2001 19:09:47 -0700 (PDT)
Message-Id: <200104050209.TAA17086@core.rose.hp.com>
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
To: ips@ece.cmu.edu
Date: Wed, 04 Apr 2001 19:09:46 PDT
Reply-To: cbm@rose.hp.com
From: "Mallikarjun C." <cbm@rose.hp.com>
X-Mailer: Elm [revision: 212.4]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

>
>Santosh,
>
>I can't find the place where this is stated. SNACK as a PDU type is
>mandated. But it can be rejected outright.

Sorry, you agreed that status SACK is mandatory in ERT forum last
week in response to my comments.  Has there been a change in your opinion?

Attached is the email in a long email thread (issue 3) where you agreed 
to make this explicit in rev06.
--
Mallikarjun 


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668	Hewlett-Packard, Roseville.
cbm@rose.hp.com


>1.2.2.2 shows explicitly that SACK can be rejected. We can add a protocol
>specific parameter in the target Logical Unit Control Page (non-settable) by
>which the target will indicate support for SNACK.
>
>Julo

Santosh Rao <santoshr@cup.hp.com> on 04/04/2001 23:53:32

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"





julian_satran@il.ibm.com wrote:
>
> Jon,
>
> Inexpensive implementations are always free to do away with recovery. That
> is true for targets too.

That's not the interpretation one gets from reading the spec and prior
discussions on this list. Per the spec, support for Status SACK is
mandatory while support for data SACK is optional.

IOW, targets MUST retain state information to satisfy a potential
status SACK request.

- Santosh


------------------------------------------------------------------------------

From julian_satran@il.ibm.com Tue Mar 27 05:16:54 PST 2001
Received: from mailhub.rose.hp.com (mailhub.rose.hp.com [15.96.64.24]) by core.rose.hp.com with ESMTP (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id FAA26277 for <cbm@core.rose.hp.com>; Tue, 27 Mar 2001 05:16:52 -0800 (PST)
From: julian_satran@il.ibm.com
Received: from atlrel2.hp.com (atlrel2.hp.com [15.10.184.10]) by mailhub.rose.hp.com with ESMTP (8.7.1/8.7.3 SMKit7.02) id FAA10600 for <cbm@rose.hp.com>; Tue, 27 Mar 2001 05:15:51 -0800 (PST)
Received: from d06lmsgate.uk.ibm.COM (d06lmsgate.uk.ibm.com [195.212.29.1])
	by atlrel2.hp.com (Postfix) with ESMTP id AFC11120
	for <cbm@rose.hp.com>; Tue, 27 Mar 2001 08:15:49 -0500 (EST)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d06lmsgate.uk.ibm.COM (1.0.0) with ESMTP id NAA50078;
	Tue, 27 Mar 2001 13:55:50 +0100
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id PAA174358;
	Tue, 27 Mar 2001 15:09:58 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A1C.00483A46 ; Tue, 27 Mar 2001 15:08:55 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: cbm@rose.hp.com
Cc: someshg@yahoo.com, steph@cs.uchicago.edu, hufferd@us.ibm.com,
        cbm@rose.hp.com, ldalleore@snapserver.com,
        Venkat Rangan <venkat@rhapsodynetworks.com>, Black_David@emc.com
Message-ID: <C1256A1C.0048399C.00@d12mta02.de.ibm.com>
Date: Tue, 27 Mar 2001 15:12:01 +0200
Subject: Re: iSCSI ERT: error recovery comments
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Status: RO



Comments in text.  Thanks Julo

"Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 01:41:48

Please respond to cbm@rose.hp.com

To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu, Julian
      Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
cc:   Black_David@emc.com
Subject:  iSCSI ERT: error recovery comments




Hi Julian and the Team,

Here are some comments on error recovery issues.  I hope these will
be addressed soon.  Thanks.

1. The draft should clearly state that if a target doesn't support
  retry (replay in my previous memo's terminology), it must not silently
  accept a command with retry bit and re-do the I/O.

2. Consequent to the above -
     - Clarification required on section 6.7.1, page 83, last para.
          Please confirm and clarify in the draft: If the target sends
          a response with an iSCSI error response of "SACK-rejected" that
          implicitly terminates the task - no retries are allowed. If the
          target sends a Reject PDU with "Data SACK Reject" code, the task
          stays open and the initiator may try to recover using SACK/retry.
+++ I will clarify
it will read:
   An iSCSI target MAY reject a data-SNACK and terminate the command with
   an iSCSI command response of SNACK rejected. In this case, the task is
   terminated and no future action is expected at target and initiator.

   Alternatively, an iSCSI target MAY reject a data-SNACK with a reject
   response of data SNACK rejected. In this case the task is still open and
   may be recovered using the retry.

+++
        - On a data digest error on a data PDU without the F-bit, the draft
          states that the target must wait for a data PDU with the F-bit
          (per section 6.2), then a command termination is signalled with
          a Reject PDU!  I like the formulation in 2.4.2 better.  I strongly
          recommend that similarly, the target shall send a SCSI Response
          with a iSCSI response of "delivery subsystem failure".  In
          general, I suggest that anytime a target terminates a task
          internally, it must generate a SCSI Response PDU with an
          appropriate response code.
+++ It reads now:

   When a target receives an iSCSI PDU with a header digest error or a
   payload digest error in an iSCSI PDU, it MUST answer with a Reject iSCSI
   PDU with a Reason-code of Header-Digest-error or Data-Digest-Error and
   discard the offending PDU.  If the error is a Data-Digest-Error in a
   Data-PDU, the target MUST either request retransmission with a R2T or
   answer with a command response PDU with a response-code of
   delivery-subsystem-failure and abort the task. If the target is
   answering with an error in the command response PDU, it must wait until
   it has received all the data (signaled by a Data PDU with the final
   bit set for all outstanding R2Ts) before sending the command response
   PDU.
++++
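
The rule in the revised paragraph above can be read as a small decision
function. The following is only a sketch under stated assumptions: the
type and function names are invented for illustration, and the
can_rerequest flag stands in for whatever policy a target uses to choose
between the two permitted answers to a data digest error in a Data PDU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; none of these identifiers come from the draft. */
enum digest_error { HEADER_DIGEST_ERROR, DATA_DIGEST_ERROR };

/* What the target does in addition to sending the Reject PDU. */
enum follow_up {
    FOLLOW_UP_NONE,        /* Reject PDU only, offending PDU discarded */
    FOLLOW_UP_R2T,         /* request retransmission with an R2T */
    FOLLOW_UP_ABORT_TASK   /* command response with
                              delivery-subsystem-failure, sent once the
                              final Data PDU has arrived */
};

enum follow_up after_digest_reject(enum digest_error err,
                                   bool is_data_pdu,
                                   bool can_rerequest)
{
    assert(err == HEADER_DIGEST_ERROR || err == DATA_DIGEST_ERROR);

    /* Only a data digest error in a Data PDU needs a second step. */
    if (err != DATA_DIGEST_ERROR || !is_data_pdu)
        return FOLLOW_UP_NONE;

    /* Assumed policy switch: retransmit if possible, else abort. */
    return can_rerequest ? FOLLOW_UP_R2T : FOLLOW_UP_ABORT_TASK;
}
```

In every case the Reject PDU is sent first; the return value only names
the extra step, if any.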

3. While the following is implied in different sections, it is not
  obvious.  Please clarify the following in the draft - "Status SACK
  support is mandatory, whereas data SACK support is not."

+++ will do in 2.16.1 +++

4. The general policy of retry should be that all ordered commands
  shall support retry bit, since the loss of an ordered command
  creates a hole in target scoreboarding and stalls the target
  pipeline.  Retry hopefully can plug the hole quickly to avoid this.

5. As a fallout of the above comment, Retry bit must be supported
  for Text Commands.

+++ I have added the X-bit.  The reason I did not do so earlier was that I
could not foresee a case in which the command is not idempotent - it can
always be resent - but I guess it is cleaner with the X +++


6. Section 2.20, page 71 on Reject must specify if a retry of the operation
  is allowed for each Reject PDU reason code.  Lack of specification could
  lead to interoperability issues down the road with "retry wars" raging
  between heterogeneous implementations (ex., target rejects the retry bit,
  initiator retries the "retry" bit,....).
+++ the part now reads:

   The reject Reason is coded as follows:

      1 - Format Error
      2 - Header Digest Error
      3 - Data (payload) Digest Error
      4 - Data-SNACK Reject
      5 - Command Retry Reject
      15 - Full Feature Phase Command before login

      Some of the reject reasons terminate or prevent the creation of a
      task at the target and no retry is possible in those cases. Format
      error for a command, Command Retry Reject and Full Feature Phase
      Command before login are in this category.
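
To make the retry rule concrete, here is a minimal sketch of how an
initiator might encode it. The numeric codes are the ones listed above;
the identifier names are invented for this example. Treating every reason
outside the "no retry" category as retryable, and treating a format error
on a non-command PDU as retryable, are assumptions of the sketch rather
than statements from the draft.

```c
#include <assert.h>
#include <stdbool.h>

/* Reason codes as listed above; the names are hypothetical. */
enum reject_reason {
    REJECT_FORMAT_ERROR        = 1,
    REJECT_HEADER_DIGEST_ERROR = 2,
    REJECT_DATA_DIGEST_ERROR   = 3,
    REJECT_DATA_SNACK          = 4,
    REJECT_COMMAND_RETRY       = 5,
    REJECT_FFP_BEFORE_LOGIN    = 15
};

/* Returns true if the initiator may still retry after this Reject. */
bool reject_permits_retry(enum reject_reason reason,
                          bool rejected_pdu_is_command)
{
    switch (reason) {
    case REJECT_FORMAT_ERROR:
        /* A format error for a command prevents task creation. */
        return !rejected_pdu_is_command;
    case REJECT_COMMAND_RETRY:
    case REJECT_FFP_BEFORE_LOGIN:
        return false;
    case REJECT_HEADER_DIGEST_ERROR:
    case REJECT_DATA_DIGEST_ERROR:
    case REJECT_DATA_SNACK:
        return true;   /* assumption: the task stays open */
    }
    assert(!"unknown reject reason");
    return false;
}
```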


7. NOP-OUT does not require CmdSNs.  Why make it an ordered command
  and run the risk of a digest error on it leading to a hole in
  command ordering?

+++ the reason I wanted it ordered is to check the whole command path - but
you may try to convince me that it is not a good idea +++
--
Mallikarjun


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668   Hewlett-Packard, Roseville.
cbm@rose.hp.com





From owner-ips@ece.cmu.edu  Thu Apr  5 00:41:27 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA09786
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 00:41:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f352ZdE09455
	for ips-outgoing; Wed, 4 Apr 2001 22:35:39 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f352ZRr09422
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 22:35:27 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id EAA253404
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 04:35:19 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id EAA101306
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 04:32:09 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A25.000DF0BE ; Thu, 5 Apr 2001 04:32:15 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A25.000DEF8C.00@d12mta02.de.ibm.com>
Date: Thu, 5 Apr 2001 04:35:49 +0200
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



I will clarify. Julo

Santosh Rao <santoshr@cup.hp.com> on 05/04/2001 03:06:10

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"




Julian,


"6.7.1.1 Recovery Within-connection

   At the initiator, the following cases lend themselves to
within-connection recovery: (1)Lost iSCSI numbered Response recognized
by either receiving it with a data digest error or receiving a Response
PDU with a higher StatSN than expected. The initiator MAY request the
missing responses through SACK, in which case the target MUST reissue
them. "


To me, the following statement :
"The initiator MAY request the missing responses through SACK, in which
case the target MUST reissue them."

implies that the target MUST be able to re-issue a SCSI Response PDU,
when a Status SACK is received, which implies the target MUST maintain
the I/O state information around [until StatSN is acknowledged.]

Section 1.2.2.2 of my version of iSCSI rev 05 (taken from
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-05.txt) reads
as follows :

"1.2.2.2 Response/Status Numbering and Acknowledging

   Responses in transit from the target to the initiator are numbered.
   The StatSN (Status Sequence Number) is used for this purpose. StatSN
   is a counter maintained per connection.  ExpStatSN is used by the
   initiator to acknowledge status.

   Status numbering starts after Login. During login, there is always
   only one outstanding command per connection and status numbering is
   not needed.
   The login response includes an initial value for status numbering.





   To enable command recovery the target MAY maintain enough state
   information to enable data and status recovery after a connection
   failure.
   A target can discard all the state information maintained for
   recovery after the status delivery is acknowledged through ExpStatSN.
   A large difference between StatSN and ExpStatSN may indicate a failed
   connection.

   Initiators and Targets MUST support the response-numbering scheme. "
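
The bookkeeping the quoted section describes can be sketched as follows.
This is an illustration under stated assumptions, not text from the
draft: it assumes 32-bit StatSN counters compared with serial-number
arithmetic, tracks only the range of unacknowledged StatSNs (not the
response contents), and uses made-up names throughout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RETAINED_MAX 64u   /* arbitrary bound for the sketch */

struct status_log {
    uint32_t first_stat_sn;   /* oldest StatSN still retained */
    uint32_t next_stat_sn;    /* StatSN the next response will carry */
};

/* Serial-number "a is before b": b is ahead by less than 2^31. */
static bool sn_before(uint32_t a, uint32_t b)
{
    uint32_t ahead = (uint32_t)(b - a);
    return ahead != 0 && ahead < UINT32_C(0x80000000);
}

/* Record that a response was sent; returns the StatSN it carried. */
uint32_t status_sent(struct status_log *log)
{
    assert((uint32_t)(log->next_stat_sn - log->first_stat_sn) < RETAINED_MAX);
    return log->next_stat_sn++;
}

/* ExpStatSN acknowledges everything before it; returns entries freed. */
uint32_t status_acked(struct status_log *log, uint32_t exp_stat_sn)
{
    if (!sn_before(log->first_stat_sn, exp_stat_sn))
        return 0;                       /* stale or duplicate ack */
    if (sn_before(log->next_stat_sn, exp_stat_sn))
        return 0;                       /* acks a status never sent */
    uint32_t freed = (uint32_t)(exp_stat_sn - log->first_stat_sn);
    log->first_stat_sn = exp_stat_sn;
    return freed;
}

/* Could a status SACK for stat_sn still be satisfied? */
bool status_replayable(const struct status_log *log, uint32_t stat_sn)
{
    return !sn_before(stat_sn, log->first_stat_sn) &&
           sn_before(stat_sn, log->next_stat_sn);
}
```

Under this reading, the gap between next_stat_sn and first_stat_sn is the
"difference between StatSN and ExpStatSN" that the section says may
indicate a failed connection when it grows large.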


Can you please clarify where in the above section is there an example
that Status SACK can be rejected by targets ? Further, the Reject PDU
(Section 2.20.1) only allows a reject reason of Data SACK reject. There
is no reason code of "Status SACK Reject".

All this to me implies that Status SACK support is mandatory and targets
MUST maintain I/O state information until the StatSN is acknowledged.

Can you please clarify if this is not the case, and if so, what is the
intent of the above text ?

- Santosh


julian_satran@il.ibm.com wrote:
>
> Santosh,
>
> I can't find the place where this is stated. SNACK as a PDU type is
> mandated. But it can be rejected outright.
>
> 1.2.2.2 shows explicitly that SACK can be rejected. We can add a protocol
> specific parameter in the target Logical Unit Control Page (non-settable)
> by which the target will indicate support for SNACK.
>
> Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 04/04/2001 23:53:32
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   Julian Satran/Haifa/IBM@IBMIL
> cc:   Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
> julian_satran@il.ibm.com wrote:
> >
> > Jon,
> >
> > Inexpensive implementations are always free to do away with recovery.
> > That is true for targets too.
>
> That's not the interpretation one gets from reading the spec and prior
> discussions on this list. Per the spec, support for Status SACK is
> mandatory while support for data SACK is optional.
>
> IOW, targets MUST retain state information to satisfy a potential
> status SACK request.
>
> - Santosh
>
> > But not specifying the mechanism for the more expensive one we make
> > them non-interoperable.





From owner-ips@ece.cmu.edu  Thu Apr  5 03:29:48 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA24657
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 03:29:47 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3511em04642
	for ips-outgoing; Wed, 4 Apr 2001 21:01:40 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3510Zr04552
	for <ips@ece.cmu.edu>; Wed, 4 Apr 2001 21:00:35 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3528b093189;
	Wed, 4 Apr 2001 19:08:37 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <sandeepj@research.bell-labs.com>
Cc: <Black_David@emc.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple connections.
Date: Wed, 4 Apr 2001 17:58:30 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJCEPPCFAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <3ACB59C4.BCD5D379@research.bell-labs.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Sandeep,

My comment was concerned over what could be happening at the sequencer and I
view that as different from the target.  I view the sequencer as an aspect
of iSCSI separate from the SCSI target.  At the iSCSI sequencer, you would
have a few comparisons being made to see if a CmdSN can be issued to the
target which will be looking for the next PDU in the sequence.  For this
mechanism to work, I would recommend that all PDUs include a serial number.
In the case of a non-sequentially treated PDU, the code would look something
like this using unsigned integers.

/* unsigned arithmetic: the subtraction wraps modulo 2^SERIAL_BITS */
if ((pending_CmdSN - ns_reference_CmdSN) > (1u << (SERIAL_BITS - 1)))
	{
	reject_pdu(pending_CmdSN, SEQUENCER_INVALIDATION);
	}

An ability to force the sequencer up to this non-sequential PDU while
rejecting any pending or stuck PDUs seems like a reasonable solution.  As the
sequencer must already make this comparison, no additional work is being
done.  Nor is any understanding of the PDU content required.
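
For anyone who wants to try the comparison, here is a self-contained
sketch. It assumes 32-bit serial numbers; the function names and the idea
of counting rejections over an array are invented for illustration and
are not part of any draft.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SERIAL_BITS 32

/* True when cmd_sn lies in the half of the number space behind ref_sn. */
bool cmd_sn_precedes(uint32_t cmd_sn, uint32_t ref_sn)
{
    uint32_t distance = (uint32_t)(cmd_sn - ref_sn);  /* wraps mod 2^32 */
    return distance > (UINT32_C(1) << (SERIAL_BITS - 1));
}

/* Count the pending PDUs that a non-sequential PDU at ns_ref would
 * cause the sequencer to reject. PDU content is never examined. */
size_t count_rejected(const uint32_t *pending, size_t n, uint32_t ns_ref)
{
    size_t rejected = 0;
    for (size_t i = 0; i < n; i++) {
        if (cmd_sn_precedes(pending[i], ns_ref))
            rejected++;
    }
    return rejected;
}
```

Because the subtraction wraps, a pending CmdSN of 0xFFFFFFF0 is still
treated as preceding a reference of 0x10, which is the rotating window
described above.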

You cannot simply drop the connection and expect to retain the sequential
nature of the interface without introducing errors reflected within the SCSI
layer or uncertainty about completion status.  The ability to keep
the connection running has benefits in this area.  I would expect there to
be some disagreement, but keeping the retry methods simple would also have
benefits.  The sequencer, introduced as a result of multiple connections,
should provide a means of controlling the PDUs pending within it in these
situations.  One reaction would be to eliminate multiple
connections and get rid of this clumsy sequencer.

The same considerations apply to handling digest errors.  Handling
errors is hard; introducing additional state uncertainty as a means of
handling an error only seems to make a difficult situation worse.

Doug

> Doug,
>
> thanks.  If you (or anyone) could correct the pseudo-code below to
> illustrate your solution, it might help achieve quicker consensus
> and avoid some discussion.
>
> I see what I missed, in addition to Julian's point about the
> refTaskTag usage preventing ITT reuse.  But don't you still need
> the cmdSN of the original task to find out if task_mgmt command
> is early or late?
>  (a..assuming you are still sending the task_mgmt
> command with immediate delivery)
>
> **Event=task_mgmt at initiator:
>     purge PDUs in queue at initiator
>     send task_mgmt to target (cmdSN=0)
>
> **Event=task_mgmt at target:
>     compare refCmdSN with executing <min,max>CmdSN queue
>     if (refCmdSN < minCmdSN)
>         /*task_mgmt cmd is early */
>         must wait & drop the orig_task PDU when it arrives
>     else if (refCmdSN > maxCmdSN)
>         /*task_mgmt cmd is late, original task has completed at target*/
>         return task_response (response code=Task was not in task set)
>     else
>         /*task is executing*/
>         give task_mgmt command to SCSI layer
>
>
> -Sandeep
>
>
> Douglas Otis wrote:
> >
> > David,
> >
> > Sandeep missed a point found within serial math, you have a window that
> > rotates with respect to prior commands based on the magnitude of the
> > difference.  There is no need to maintain any state other than the
> > sequence of the flagged command where prior pending to be sent commands
> > are rejected.  Obviously before this window rotates more than 2 billion
> > PDUs, this prior value will need to be retired.  This is not a difficult
> > or high overhead operation with respect to rejecting prior commands.
> > There would not be any decisions within the sequencer regarding content
> > of any rejected PDU.  You still should want to purge PDUs waiting in a
> > queue pending to be sent to the target should an "immediate" command be
> > flagged.  Your concept creates an odd event with both sequential and
> > non-sequential delivery of a task management command.  You are then left
> > with a time interval where a non-sequential command reception must modify
> > behavior waiting for a possible counter-part.  Causing all pending PDUs
> > to be rejected immediately there is no waiting for status information or
> > any further activity to occur.  You would see reject-reject-status.  If
> > the initiator needs these rejected commands replayed, this becomes an
> > option of the initiator.
> >
> > Doug
> >
> > > > I would state this much stronger.  Applications had better not
> > > > have to know that it is iSCSI underneath vs. FCP or parallel SCSI
> > > > else I believe we missed the objective (granted, some things such
> > > > as target address space are unavoidably different, but I believe
> > > > task management functions should be the same).  The transport
> > > > needs to handle the transport issues without exposing quirks to
> > > > the SCSI or application layer.
> > >
> > > Unfortunately, I think we have an impossible situation.  It appears
> > > to me that we have to pick at most two of the following three goals,
> > > as I have yet to see any way to achieve all three for a single task
> > > management command on a multiple connection session:
> > >
> > > (1) The command takes effect immediately and its status/response
> > >       is available immediately.
> > > (2) The command affects all commands in flight, and its
> > >       status/response is delayed until all such effects are complete.
> > > (3) There is no significant visible departure from existing SCSI task
> > >       management behavior.
> > >
> > > The problem is that trying to do both (1) and (2) either requires
> > > SCSI to "execute" the task management command twice or requires that
> > > iSCSI do some task management (e.g., on the in-flight commands) on
> > > SCSI's behalf (or worse like having SCSI prolong the execution of the
> > > task management command until everything in flight in iSCSI arrives).
> > > All of these appear to lead to problems with (3) in one form or
> > > another - two executions result in two SCSI status/responses that
> > > have to be merged, and iSCSI task management will sooner or later do
> > > something different from SCSI (e.g., I sincerely doubt that a Target
> > > in a bridge will ever get this 100% identical to the devices that are
> > > being bridged).
> > >
> > > The current iSCSI draft provides the choice of  [(1)] XOR [(2), (3)];
> > > the reason for not getting (3) with (1) is the possibility of the task
> > > management command bypassing commands that it's supposed to
> > > affect.  Charles' original proposal is [(2), (3)] because it has to
> > > time out a stuck connection before executing the command, and is roughly
> > > equivalent to sending the command for ordered delivery and having
> > > the implementation treat any queue between iSCSI and SCSI as
> > > being on the SCSI side of the line.  Doug Otis's counter-proposal
> > > falls into the category of iSCSI doing task management on SCSI's
> > > behalf and provides an example of how this results in visible changes
> > > in behavior -- for the CLEAR ACA task management command, aborting
> > > all tasks that are queued or in flight is generally incorrect.
> > >
> > > I would note that this issue does not arise on single connection
> > > sessions, because sending the command for immediate delivery plus
> > > some care not to reorder things in the iSCSI Target (i.e., consider
> > > the iSCSI to SCSI queue to be in "SCSI" and hence subject to the task
> > > management command) obtains all of (1) through (3).
> > >
> > > Going out on a limb, I suspect applications will generally want
> > > [(2), (3)] -- send for ordered delivery and wait for the dust to
> > > settle because that provides the best odds of having some weird
> > > device get into a known state from which further progress is
> > > possible.  This allows the application to not know whether parallel
> > > SCSI, FCP or iSCSI is underneath and
> > > relies on other iSCSI recovery procedures to make sure that the task
> > > management command is delivered and executed (e.g., unstick and/or
> > > close "stuck" connections).  There will be cases in which (1) is
> > > needed (e.g., observe tape robot doing something obviously wrong,
> > > and get it to stop immediately), but those may involve fairly blunt
> > > instruments (e.g., LUN RESET) and the need to clean up any collateral
> > > damage.
> > >
> > > Sandeep's proposal to create state in the target either fails to
> > > achieve (1) [if the response is delayed until the state is removed]
> > > or violates SAM2 [returns the response to the task management command
> > > before the task management command is complete].  Having state linger
> > > after a completed LUN or TARGET RESET is almost certainly wrong.
> > >
> > > So, I think I'm down to sending task management functions once,
> > > usually for ordered delivery with the application making the ordered
> > > vs. immediate delivery choice (and sending the task management
> > > function twice if it so chooses).  I think apps will generally choose
> > > ordered delivery, choosing predictable behavior over immediacy
> > > concerns.  Aside from a longer
> > > discussion of this issue, I still don't see the need for additional
> > > mechanism(s) to task management - what have I missed in the above
> > > discussion?
> > >
> > > --David
> > >
> > > ---------------------------------------------------
> > > David L. Black, Senior Technologist
> > > EMC Corporation, 42 South St., Hopkinton, MA  01748
> > > +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> > > black_david@emc.com       Mobile: +1 (978) 394-7754
> > > ---------------------------------------------------
> > >
> > >
>



From owner-ips@ece.cmu.edu  Thu Apr  5 15:10:23 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA11451
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 15:10:22 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f35Du1h23376
	for ips-outgoing; Thu, 5 Apr 2001 09:56:01 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from lub1028.lss.emc.com ([168.159.39.28])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f35DtHr23281
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 09:55:18 -0400 (EDT)
Received: from emc.com (IDENT:jhall@localhost.localdomain [127.0.0.1])
	by lub1028.lss.emc.com (8.9.3/8.9.3) with ESMTP id JAA01158
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 09:55:11 -0400
Message-Id: <200104051355.JAA01158@lub1028.lss.emc.com>
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
Date: Thu, 05 Apr 2001 09:55:11 -0400
From: "Jon Hall" <jhall@emc.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Julian,

I don't understand.  Are you saying that an "expensive" target
will implement specific error recovery mechanisms for very rare
events?  Or are you saying that this case is not a rare event?

If the former, there is a problem of completeness (e.g., should
there be recovery procedures for when the sun goes nova :-).
If the latter, this would be very interesting and useful to
know about...

-Jon

julian_satran@il.ibm.com writes:
>
>Jon,
>
>Inexpensive implementations are always free to do away with recovery. That
>is true for targets too.
>But not specifying the mechanism for the more expensive one we make them
>non-interoperable.
>
>Julo
>
>"Jon Hall" <jhall@emc.com> on 04/04/2001 22:55:40
>
>Please respond to "Jon Hall" <jhall@emc.com>
>
>To:   ips@ece.cmu.edu
>cc:
>Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>But CRC errors are not really the issue.  It is the
>singular case of a TCP cksum failing to detect what a
>CRC succeeds in detecting, and this occurring to a TCP
>segment containing an iSCSI hdr with a StatSN.
>
>Is there a reason to believe that iSCSI StatSNs will be
>lost at a higher rate than is currently documented for TCP
>cksum failure?  Or, is the problem a loss of one TCP segment
>in tens (possibly hundreds) of millions of segments.  Where
>the bad segment may contain a StatSN but probably doesn't
>because it is a data pdu.  If the latter, why does a SCSI-level
>timeout and retry (on the initiator) not suffice?  [Note,
>an initiator timeout/retry does not require a connection
>to be closed.]
>
>I realize that I am being annoyingly repetitious, but it is
>not an idle question.  For some targets, retained rsp status
>is not cheap (and retained rsp data is not tractable at all).
>
>IMO there appears to be no real need for SNACK.  And, more
>radically, there appears to be no need for StatSNs.
>
>Maybe, as Somesh said, this is a dead horse but why include
>something in the spec which suggests a need for target-side
>complexity, while not solving a clear and compelling
>requirement?
>
>-Jon
>
>julian_satran@il.ibm.com writes:
>>
>>SNACK is here for two reasons - Status retry (which is cheap) and Data
>>retry as a side benefit.
>>CRC errors are not that rare (although we don't have real data, the
>>simulations with file systems seem to indicate that numbers could be as
>>high as 0.0002%). A restart of the link is expensive (slow start), and even
>>if the error rates are far lower, for many applications a slow start is a
>>painful event.
>>
>>Removing them from the spec is not a path we should take lightly.
>>
>>Julo
>>
>>"Jon Hall" <jhall@emc.com> on 02/04/2001 16:13:35
>>
>>Please respond to "Jon Hall" <jhall@emc.com>
>>
>>To:   ips@ece.cmu.edu
>>cc:
>>Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>
>>
>>
>>
>>
>>I agree with Somesh.  And would go farther -- the complexity
>>that results from retaining enough target-side state to respond
>>to a SACK/SNACK request is non-trivial and needs clear justification.
>>Intuitively, a CRC that discovers an error in an iSCSI pdu header
>>(that the TCP cksum missed) seems like it should be a rare event.
>>
>>What is the frequency of this event?  IMO the answer to this
>>question should be written into the protocol spec -- assuming
>>that it substantiates the benefit of SACK/SNACK.  Otherwise, the
>>SACK/SNACK pdu should be removed.
>>
>>-Jon
>>
>>julian_satran@il.ibm.com writes:
>>>
>>>Somesh,
>>>
>>>As I stated earlier - the DataSN was created to detect missing data PDUs.
>>>SNACK is needed to recover missing StatusSN and missing dataSN is only a
>>>bonus if the target wants to support it.  It is a trivial mechanism and I
>>>think it should stay.
>>>
>>>Julo
>>>
>>>"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
>>>
>>>Please respond to someshg@yahoo.com
>>>
>>>To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
>>>cc:
>>>Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>
>>>
>>>
>>>
>>>Sorry to have been missing for a while. Hope you will
>>>appreciate my being back in action :-). It was a fairly
>>>clear consensus in Orlando that applications broke up
>>>their transfers into reasonably small chunks i.e. they
>>>did not have very long running transfers.
>>>
>>>Therefore the consensus was that a command level recovery
>>>mechanism was sufficient instead of an ack/sack for each
>>>data PDU.
>>>
>>>The SACK mechanism was a post Orlando invention. Without
>>>an ack mechanism (for every data PDU), the SACK mechanism
>>>just imposes additional burden on either end of the session,
>>>without really much benefit.
>>>
>>>The benefit of having SACK is of saving bandwidth in case
>>>the data part of the data PDU failed an integrity check
>>>(but passed TCP checksum). This is a rare enough case that
>>>as a percentage, the bandwidth loss from retransmitting
>>>all the data associated with a read or write command is
>>>very very small.
>>>
>>>In addition, it avoids the complexity of restarting
>>>something from the middle, as compared to from the begining.
>>>
>>>To me it seems that there is significant simplicity (from
>>>implementation, reliability and recovery process) from
>>>having smaller data transfer per command.
>>>
>>>I would really like to get rid of the SACK command.
>>>
>>>Somesh
>>>
>>>> -----Original Message-----
>>>> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
>>>> julian_satran@il.ibm.com
>>>> Sent: Wednesday, March 28, 2001 6:57 AM
>>>> To: ips@ece.cmu.edu
>>>> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>>
>>>>
>>>>
>>>>
>>>> Mallikarjun,
>>>>
>>>> Last summer I thought that recovery within a connection should be left
>>to
>>>> TCP. It is simple and could be made available through IPsec (if no new
>>>> option of any form can be added).
>>>>
>>>> Two things killed this:
>>>>
>>>>    The requirement to have a data encapsulation that can pass through
>>>>    application proxies (like a storage router)
>>>>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
>>>>    header
>>>>
>>>>
>>>> As for the ACK - I am very much in favor of it (it is a no brainer) and
>>>> implementations are in fact allowed to drop even unacked data.
>>>>
>>>> I am bound by the Orlando meeting decision to drop it. Except for the
>>>> regular "oppose everything" crowd, the two vocal opponents were Somesh
>>>> Gupta and Matt Wakeley.
>>>>
>>>> David may or may not want to re-open the issue - I am not going to ask
>>>> for it.
>>>>
>>>> Regards,
>>>> Julo
>>>>
>>>> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
>>>>
>>>> Please respond to cbm@rose.hp.com
>>>>
>>>> To:   Black_David@emc.com
>>>> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com,
>someshg@yahoo.com,
>>>>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
>>>>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
>>>> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>>
>>>>
>>>>
>>>>
>>>> David and Julian,
>>>>
>>>> I appreciate both your views, and should I say that they're
>>>> along predicted lines :-)
>>>>
>>>> - David's right in saying that the situation is akin to FC's.
>>>>   However, I would like to point out that FC is an unreliable
>>>>   transport, and hence is forced to pick up a lot of the transport
>>>>   baggage (at least in FCP-2, as I understand), in addition
>>>>   to being a SCSI encapsulation layer.  Unfortunately, even with
>>>>   TCP being the "reliable" transport, iSCSI is going along the
>>>>   same lines - ie. transport baggage + SCSI encapsulation.  My
>>>>   point is - if this is indeed a necessary evil, why don't we
>>>>   complete iSCSI's transport functionality by data-ACKs?
>>>>
>>>> - If data SACK is introduced mostly to make up for TCP's shortcomings,
>>>>   we're making its usage (and implementation) drastically less
>>>>   appealing since the only way error recovery algorithms can *rely* on
>>>>   data SACK is when replay is supported (or, "ReplaySupport=yes" in my
>>>>   proposal), which is extremely expensive.  IOW, we're defining data
>>>>   SACK in the draft and not providing any incentives to implement and
>>>>   use it!
>>>>
>>>> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
>>>>   protocol in its definition so far (and I believe, rightly so -
>>>>   mandating command ordering, bi-di support, SCSI CRN support to name a
>>>>   few examples), the perfectly SCSI-legal R/W interactions that break in
>>>>   other transports *do not* have to break in iSCSI.
>>>>
>>>> - A last idea (may seem radical at this point) in regards to iSCSI
>>>>   being a "full transport". This provides us an opportunity to "cast
>>>>   off" the transport baggage in future when we truly move to a
>>>>   "reliable" transport (perhaps TCP with CRCs/SCTP ?) - if we do a good
>>>>   job of keeping the encapsulation stuff separate from the transport
>>>>   stuff.  (Julian, I heard from Randy that ideas similar to this were
>>>>   explored in your Haifa meeting.  And yes, he recalls they were given
>>>>   up since TCP was supposed to be reliable and granularity of recovery
>>>>   was deemed one I/O.)
>>>>
>>>> With that said, may I request David (with his co-chair hat on, :-))
>>>> to add some binding comments/observations on this discussion?
>>>>
>>>> If we decide to leave data SACKs as unattractive to implement, the
>draft
>>>> should in the least add a statement like - "Note that satisfying all
>>>> possible data SACK requests for a task with an unacknowledged status
>>>> implies implementing the I/O replay buffer on the part of targets."
>>>> --
>>>> Mallikarjun
>>>>
>>>>
>>>> Mallikarjun Chadalapaka
>>>> Networked Storage Architecture
>>>> Network Storage Solutions Organization
>>>> MS 5668   Hewlett-Packard, Roseville.
>>>> cbm@rose.hp.com
>>>>
>>>>
>>>>
>>>>
>>>> >I think Julian's basically right -- I would point
>>>> >out that any case of write after read that breaks
>>>> >over iSCSI will also break over Fibre Channel.
>>>> >On FC, the scenario starts with a frame CRC failure
>>>> >on read data at the Initiator, so applications
>>>> >have to cope and typically do so by enforcing
>>>> >ordering at the app rather than using SCSI task
>>>> >ordering.
>>>> >
>>>> >While SCSI has clever tools like ACA and task
>>>> >ordering that appear to allow dependent operations
>>>> >to be sent to the target concurrently, in practice
>>>> >they don't work and/or aren't used (funny thing,
>>>> >those two reinforce each other ;-) ).  Hence
>>>> >a minimal approach to them is in order:
>>>> >- Make sure the result will interoperate.
>>>> >- Make sure T10 doesn't ding us for leaving something
>>>> >    completely out.
>>>> >- Don't specify anything not needed for the above.
>>>> >
>>>> >My 0.02,
>>>> >--David
>>>> >
>>>> >> -----Original Message-----
>>>> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
>>>> >> Sent:  Tuesday, March 27, 2001 9:23 AM
>>>> >> To:    cbm@rose.hp.com
>>>> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
>>>> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
>>>> >> Black_David@emc.com
>>>> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>> >>
>>>> >>
>>>> >>
>>>> >> Mallikarjun,
>>>> >>
>>>> >> I commiserate with you at the lack of ack for data but the Orlando
>>>> >> meeting stated - no.  Recall that I kept the number only as a
>>>> >> mechanism to detect missing packets.
>>>> >>
>>>> >> You can achieve the effect you want by keeping around data for a
>>>> >> while (you determine how long and then discard).
>>>> >>
>>>> >> If a SACK comes and you can recover - fine. If not you either
>>>> >> reaccess the media (if you know how) or reject and let the initiator
>>>> >> retry.
>>>> >>
>>>> >> You should not worry about R/W conflicts as programs bound to have
>>>> >> such conflicts either:
>>>> >>
>>>> >> 1) can live with them or
>>>> >> 2) protect themselves through some locks and rely on
>>>> >>    "operation-end-status" to keep results deterministic.
>>>> >>
>>>> >> Regards,
>>>> >> Julo
>>>> >>
>>>> >>
>>>> >>
>>>> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
>>>> >>
>>>> >> Please respond to cbm@rose.hp.com
>>>> >>
>>>> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
>>>Julian
>>>> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
>>>> >> cc:   Black_David@emc.com
>>>> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> Hi Error Recovery Team,
>>>> >>
>>>> >> iSCSI can discard PDUs because of digest errors and request
>>>> >> retransmissions using the iSCSI data SACK.  To deal with such
>>>> >> an eventuality, targets that want to support data SACK have
>>>> >> the following options:
>>>> >>
>>>> >> (A) maintain a complete "replay" buffer for the entire I/O since
>>>> >>   a SACK could come anytime before the status is ack'ed by the
>>>> >>   initiator. [ simple, but extremely expensive in memory resources]
>>>> >>
>>>> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
>>>> >>   This enables keeping only those I/O buffers that haven't been
>>>> >>   ack'ed by the initiator. IOW, become a real full transport!
>>>> >>   [ everyone disliked it earlier...]
>>>> >>
>>>> >> (C) re-access the medium for data retransmission requests.  Now there
>>>> >>   are 3 sub-cases in this to handle the changed data on the medium in
>>>> >>   a write-after-read scenario.  (SEE NOTE.1 at the bottom on how it
>>>> >>   is legal.)
>>>> >>      (1) On seeing any write, stall till status is ack'ed for all the
>>>> >>          previous reads (basically drain the pipe). [simple, but
>>>> >>          incurs an additional roundtrip delay for all writes].
>>>> >>      (2) A variation of the above, keep an eye only on the prior
>>>> >>          overlapping reads. [more BW efficient, but complicated to
>>>> >>          resolve the block dependencies in a stream of reads
>>>> >>          followed by writes]
>>>> >>      (3) Document the caveat and leave it up to the applications
>>>> >>          to avoid this case since this leads to data integrity
>>>> >>          issues. [pushing to apps since the transport can't get it
>>>> >>          right!]
>>>> >>
>>>> >> My first preference is (B), followed by (A), and I suggest we not go
>>>> >> to (C) at all with its inherent dangers.
>>>> >>
>>>> >> Doing (B) naturally completes the transport job that iSCSI has taken
>>>> >> on itself in view of TCP's claimed unreliable checksum.  That is the
>>>> >> right thing to do architecturally instead of being a "semi-transport"!
>>>> >>
>>>> >> Comments?
>>>> >> --
>>>> >> Mallikarjun
>>>> >>
>>>> >>
>>>> >> Mallikarjun Chadalapaka
>>>> >> Networked Storage Architecture
>>>> >> Network Storage Solutions Organization
>>>> >> MS 5668   Hewlett-Packard, Roseville.
>>>> >> cbm@rose.hp.com
>>>> >>
>>>> >>
>>>> >> ______________________________________________________________
>>>> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly
>>>> >>         legal if SCSI sets the ORDERED task attribute on both the
>>>> >>         commands AND sets the NACA bit to one to indicate that Write
>>>> >>         shall be executed only if the Read did not fail (result in a
>>>> >>         Check Condition).
>>>> >>
>>>> >>         In the current case, since Read completed just fine from
>>>> >>         SCSI's point of view, SCSI is moving on to execute Write.
>>>> >>         Those read buffers had been freed up since iSCSI received an
>>>> >>         ACK at the TCP level, and since iSCSI has no other way to have
>>>> >>         the data ack'ed!
>>
>


From owner-ips@ece.cmu.edu  Thu Apr  5 15:13:47 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA11533
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 15:13:45 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f35HL6t06838
	for ips-outgoing; Thu, 5 Apr 2001 13:21:06 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f35HJ3r06689
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 13:19:07 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id TAA96980
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 19:17:20 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id TAA19968
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 19:15:22 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A25.005DF6A0 ; Thu, 5 Apr 2001 19:06:19 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A25.0054AF2C.00@d12mta02.de.ibm.com>
Date: Thu, 5 Apr 2001 16:41:47 +0200
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Correct. As for whether it happens only when the Sun goes nova: I think it
is far more frequent than that, and for critical applications (business or
life) people may well pay to avoid small glitches several times a day.

Julo
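
To put a rough number on "how frequent", here is a back-of-the-envelope
sketch in Python. Only the 0.0002% figure comes from this thread; reading it
as a per-PDU digest failure rate, the 8 KiB PDU size, and the one-terabyte
transfer are all assumptions made for illustration, not measurements.

```python
# Back-of-the-envelope estimate of digest (CRC) failures during a bulk copy.
# Assumptions (not from the thread): the rate applies per PDU, every data
# PDU carries 8 KiB, and the transfer is one terabyte.
ERROR_RATE = 0.0002 / 100      # 0.0002% expressed as a fraction
PDU_BYTES = 8 * 1024           # hypothetical data PDU payload size
TRANSFER_BYTES = 10**12        # one terabyte


def pdu_count(transfer_bytes, pdu_bytes):
    """Number of PDUs needed to carry the transfer (ceiling division)."""
    return -(-transfer_bytes // pdu_bytes)


def expected_digest_errors(transfer_bytes, pdu_bytes, error_rate):
    """Expected number of PDUs discarded because a digest check failed."""
    return pdu_count(transfer_bytes, pdu_bytes) * error_rate


def p_at_least_one(transfer_bytes, pdu_bytes, error_rate):
    """Probability that at least one PDU in the transfer fails its digest."""
    return 1.0 - (1.0 - error_rate) ** pdu_count(transfer_bytes, pdu_bytes)


if __name__ == "__main__":
    print(expected_digest_errors(TRANSFER_BYTES, PDU_BYTES, ERROR_RATE))
    print(p_at_least_one(TRANSFER_BYTES, PDU_BYTES, ERROR_RATE))
```

Under these assumptions a terabyte copy would be expected to hit a few
hundred digest failures, so the answer to "is it rare?" depends almost
entirely on which rate one believes.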

"Jon Hall" <jhall@emc.com> on 05/04/2001 15:55:11

Please respond to "Jon Hall" <jhall@emc.com>

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"





Julian,

I don't understand.  Are you saying that an "expensive" target
will implement specific error recovery mechanisms for very rare
events?  Or are you saying that this case is not a rare event?

If the former, there is a problem of completeness (e.g., should
there be recovery procedures for when the sun goes nova :-).
If the latter, this would be very interesting and useful to
know about...

-Jon

julian_satran@il.ibm.com writes:
>
>Jon,
>
>Inexpensive implementations are always free to do away with recovery. That
>is true for targets too.
>But by not specifying the mechanism for the more expensive ones we make them
>non-interoperable.
>
>Julo
>
>"Jon Hall" <jhall@emc.com> on 04/04/2001 22:55:40
>
>Please respond to "Jon Hall" <jhall@emc.com>
>
>To:   ips@ece.cmu.edu
>cc:
>Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
>But CRC errors are not really the issue.  It is the
>singular case of a TCP cksum failing to detect what a
>CRC succeeds in detecting, and this occurring to a TCP
>segment containing an iSCSI hdr with a StatSN.
>
>Is there a reason to believe that iSCSI StatSNs will be
>lost at a higher rate than is currently documented for TCP
>cksum failure?  Or, is the problem a loss of one TCP segment
>in tens (possibly hundreds) of millions of segments.  Where
>the bad segment may contain a StatSN but probably doesn't
>because it is a data pdu.  If the latter, why does a SCSI-level
>timeout and retry (on the initiator) not suffice?  [Note,
>an initiator timeout/retry does not require a connection
>to be closed.]
>
>I realize that I am being annoyingly repetitious, but it is
>not an idle question.  For some targets, retained rsp status
>is not cheap (and retained rsp data is not tractable at all).
>
>IMO there appears to be no real need for SNACK.  And, more
>radically, there appears to be no need for StatSNs.
>
>Maybe, as Somesh said, this is a dead horse but why include
>something in the spec which suggests a need for target-side
>complexity, while not solving a clear and compelling
>requirement?
>
>-Jon
>
>julian_satran@il.ibm.com writes:
>>
>>SNACK is here for two reasons - Status retry (which is cheap) and Data
>>retry as a side benefit.
>>CRC errors are not that rare (although we don't have real data, the
>>simulations with file systems seem to indicate that numbers could be as
>>high as 0.0002%). A restart of the link is expensive (slow start), and even
>>if the error rates are far lower, for many applications a slow start is a
>>painful event.
>>
>>Removing them from the spec is not a path we should take lightly.
>>
>>Julo
>>
>>"Jon Hall" <jhall@emc.com> on 02/04/2001 16:13:35
>>
>>Please respond to "Jon Hall" <jhall@emc.com>
>>
>>To:   ips@ece.cmu.edu
>>cc:
>>Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>
>>
>>
>>
>>
>>I agree with Somesh.  And would go farther -- the complexity
>>that results from retaining enough target-side state to respond
>>to a SACK/SNACK request is non-trivial and needs clear justification.
>>Intuitively, a CRC that discovers an error in an iSCSI pdu header
>>(that the TCP cksum missed) seems like it should be a rare event.
>>
>>What is the frequency of this event?  IMO the answer to this
>>question should be written into the protocol spec -- assuming
>>that it substantiates the benefit of SACK/SNACK.  Otherwise, the
>>SACK/SNACK pdu should be removed.
>>
>>-Jon
>>
>>julian_satran@il.ibm.com writes:
>>>
>>>Somesh,
>>>
>>>As I stated earlier - the DataSN was created to detect missing data PDUs.
>>>SNACK is needed to recover missing StatusSN and missing dataSN is only a
>>>bonus if the target wants to support it.  It is a trivial mechanism and I
>>>think it should stay.
>>>
>>>Julo
>>>
>>>"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
>>>
>>>Please respond to someshg@yahoo.com
>>>
>>>To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
>>>cc:
>>>Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>
>>>
>>>
>>>
>>>Sorry to have been missing for a while. Hope you will
>>>appreciate my being back in action :-). It was a fairly
>>>clear consensus in Orlando that applications broke up
>>>their transfers into reasonably small chunks i.e. they
>>>did not have very long running transfers.
>>>
>>>Therefore the consensus was that a command level recovery
>>>mechanism was sufficient instead of an ack/sack for each
>>>data PDU.
>>>
>>>The SACK mechanism was a post Orlando invention. Without
>>>an ack mechanism (for every data PDU), the SACK mechanism
>>>just imposes additional burden on either end of the session,
>>>without really much benefit.
>>>
>>>The benefit of having SACK is of saving bandwidth in case
>>>the data part of the data PDU failed an integrity check
>>>(but passed TCP checksum). This is a rare enough case that
>>>as a percentage, the bandwidth loss from retransmitting
>>>all the data associated with a read or write command is
>>>very very small.
>>>
>>>In addition, it avoids the complexity of restarting
>>>something from the middle, as compared to from the begining.
>>>
>>>To me it seems that there is significant simplicity (from
>>>implementation, reliability and recovery process) from
>>>having smaller data transfer per command.
>>>
>>>I would really like to get rid of the SACK command.
>>>
>>>Somesh
>>>
>>>> -----Original Message-----
>>>> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
>>>> julian_satran@il.ibm.com
>>>> Sent: Wednesday, March 28, 2001 6:57 AM
>>>> To: ips@ece.cmu.edu
>>>> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>>
>>>>
>>>>
>>>>
>>>> Mallikarjun,
>>>>
>>>> Last summer I thought that recovery within a connection should be left
>>>> to TCP. It is simple and could be made available through IPsec (if no
>>>> new option of any form can be added).
>>>>
>>>> Two things killed this:
>>>>
>>>>    The requirement to have a data encapsulation that can pass through
>>>>    application proxies (like a storage router)
>>>>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
>>>>    header
>>>>
>>>>
>>>> As for the ACK - I am very much in favor of it (it is a no brainer) and
>>>> implementations are in fact allowed to drop even unacked data.
>>>>
>>>> I am bound by the Orlando meeting decision to drop it. Except for the
>>>> regular "oppose everything" crowd, the two vocal opponents were Somesh
>>>> Gupta and Matt Wakeley.
>>>>
>>>> David may or may not want to re-open the issue - I am not going to ask
>>>> for it.
>>>>
>>>> Regards,
>>>> Julo
>>>>
>>>> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
>>>>
>>>> Please respond to cbm@rose.hp.com
>>>>
>>>> To:   Black_David@emc.com
>>>> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com,
>someshg@yahoo.com,
>>>>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
>>>>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
>>>> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>>
>>>>
>>>>
>>>>
>>>> David and Julian,
>>>>
>>>> I appreciate both your views, and should I say that they're
>>>> along predicted lines :-)
>>>>
>>>> - David's right in saying that the situation is akin to FC's.
>>>>   However, I would like to point out that FC is an unreliable
>>>>   transport, and hence is forced to pick up a lot of the transport
>>>>   baggage (at least in FCP-2, as I understand), in addition
>>>>   to being a SCSI encapsulation layer.  Unfortunately, even with
>>>>   TCP being the "reliable" transport, iSCSI is going along the
>>>>   same lines - ie. transport baggage + SCSI encapsulation.  My
>>>>   point is - if this is indeed a necessary evil, why don't we
>>>>   complete iSCSI's transport functionality by data-ACKs?
>>>>
>>>> - If data SACK is introduced mostly to make up for TCP's shortcomings,
>>>>   we're making its usage (and implementation) drastically less
>>>>   appealing since the only way error recovery algorithms can *rely* on
>>>>   data SACK is when replay is supported (or, "ReplaySupport=yes" in my
>>>>   proposal), which is extremely expensive.  IOW, we're defining data
>>>>   SACK in the draft and not providing any incentives to implement and
>>>>   use it!
>>>>
>>>> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
>>>>   protocol in its definition so far (and I believe, rightly so -
>>>>   mandating command ordering, bi-di support, SCSI CRN support to name a
>>>>   few examples), the perfectly SCSI-legal R/W interactions that break in
>>>>   other transports *do not* have to break in iSCSI.
>>>>
>>>> - A last idea (may seem radical at this point) in regards to iSCSI
>>>>   being a "full transport". This provides us an opportunity to "cast
>>>>   off" the transport baggage in future when we truly move to a
>>>>   "reliable" transport (perhaps TCP with CRCs/SCTP ?) - if we do a good
>>>>   job of keeping the encapsulation stuff separate from the transport
>>>>   stuff.  (Julian, I heard from Randy that ideas similar to this were
>>>>   explored in your Haifa meeting.  And yes, he recalls they were given
>>>>   up since TCP was supposed to be reliable and granularity of recovery
>>>>   was deemed one I/O.)
>>>>
>>>> With that said, may I request David (with his co-chair hat on, :-))
>>>> to add some binding comments/observations on this discussion?
>>>>
>>>> If we decide to leave data SACKs as unattractive to implement, the
>>>> draft should in the least add a statement like - "Note that satisfying
>>>> all possible data SACK requests for a task with an unacknowledged status
>>>> implies implementing the I/O replay buffer on the part of targets."
>>>> --
>>>> Mallikarjun
>>>>
>>>>
>>>> Mallikarjun Chadalapaka
>>>> Networked Storage Architecture
>>>> Network Storage Solutions Organization
>>>> MS 5668   Hewlett-Packard, Roseville.
>>>> cbm@rose.hp.com
>>>>
>>>>
>>>>
>>>>
>>>> >I think Julian's basically right -- I would point
>>>> >out that any case of write after read that breaks
>>>> >over iSCSI will also break over Fibre Channel.
>>>> >On FC, the scenario starts with a frame CRC failure
>>>> >on read data at the Initiator, so applications
>>>> >have to cope and typically do so by enforcing
>>>> >ordering at the app rather than using SCSI task
>>>> >ordering.
>>>> >
>>>> >While SCSI has clever tools like ACA and task
>>>> >ordering that appear to allow dependent operations
>>>> >to be sent to the target concurrently, in practice
>>>> >they don't work and/or aren't used (funny thing,
>>>> >those two reinforce each other ;-) ).  Hence
>>>> >a minimal approach to them is in order:
>>>> >- Make sure the result will interoperate.
>>>> >- Make sure T10 doesn't ding us for leaving something
>>>> >    completely out.
>>>> >- Don't specify anything not needed for the above.
>>>> >
>>>> >My 0.02,
>>>> >--David
>>>> >
>>>> >> -----Original Message-----
>>>> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
>>>> >> Sent:  Tuesday, March 27, 2001 9:23 AM
>>>> >> To:    cbm@rose.hp.com
>>>> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
>>>> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
>>>> >> Black_David@emc.com
>>>> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>> >>
>>>> >>
>>>> >>
>>>> >> Mallikarjun,
>>>> >>
>>>> >> I commiserate with you at the lack of ack for data but the Orlando
>>>> >> meeting stated - no.  Recall that I kept the number only as a
>>>> >> mechanism to detect missing packets.
>>>> >>
>>>> >> You can achieve the effect you want by keeping around data for a
>>>> >> while (you determine how long and then discard).
>>>> >>
>>>> >> If a SACK comes and you can recover - fine. If not you either
>>>> >> reaccess the media (if you know how) or reject and let the initiator
>>>> >> retry.
>>>> >>
>>>> >> You should not worry about R/W conflicts as programs bound to have
>>>> >> such conflicts either:
>>>> >>
>>>> >> 1) can live with them or
>>>> >> 2) protect themselves through some locks and rely on
>>>> >>    "operation-end-status" to keep results deterministic.
>>>> >>
>>>> >> Regards,
>>>> >> Julo
>>>> >>
>>>> >>
>>>> >>
>>>> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
>>>> >>
>>>> >> Please respond to cbm@rose.hp.com
>>>> >>
>>>> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
>>>Julian
>>>> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
>>>> >> cc:   Black_David@emc.com
>>>> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> Hi Error Recovery Team,
>>>> >>
>>>> >> iSCSI can discard PDUs because of digest errors and request
>>>> >> retransmissions using the iSCSI data SACK.  To deal with such
>>>> >> an eventuality, targets that want to support data SACK have
>>>> >> the following options:
>>>> >>
>>>> >> (A) maintain a complete "replay" buffer for the entire I/O since
>>>> >>   a SACK could come anytime before the status is ack'ed by the
>>>> >>   initiator. [ simple, but extremely expensive in memory resources]
>>>> >>
>>>> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
>>>> >>   This enables keeping only those I/O buffers that haven't been
>>>> >>   ack'ed by the initiator. IOW, become a real full transport!
>>>> >>   [ everyone disliked it earlier...]
>>>> >>
>>>> >> (C) re-access the medium for data retransmission requests.  Now there
>>>> >>   are 3 sub-cases in this to handle the changed data on the medium in
>>>> >>   a write-after-read scenario.  (SEE NOTE.1 at the bottom on how it
>>>> >>   is legal.)
>>>> >>      (1) On seeing any write, stall till status is ack'ed for all the
>>>> >>          previous reads (basically drain the pipe). [simple, but
>>>> >>          incurs an additional roundtrip delay for all writes].
>>>> >>      (2) A variation of the above, keep an eye only on the prior
>>>> >>          overlapping reads. [more BW efficient, but complicated to
>>>> >>          resolve the block dependencies in a stream of reads
>>>> >>          followed by writes]
>>>> >>      (3) Document the caveat and leave it up to the applications
>>>> >>          to avoid this case since this leads to data integrity
>>>> >>          issues. [pushing to apps since the transport can't get it
>>>> >>          right!]
>>>> >>
>>>> >> My first preference is (B), followed by (A), and I suggest we not go
>>>> >> to (C) at all with its inherent dangers.
>>>> >>
>>>> >> Doing (B) naturally completes the transport job that iSCSI has taken
>>>> >> on itself in view of TCP's claimed unreliable checksum.  That is the
>>>> >> right thing to do architecturally instead of being a "semi-transport"!
>>>> >>
>>>> >> Comments?
>>>> >> --
>>>> >> Mallikarjun
>>>> >>
>>>> >>
>>>> >> Mallikarjun Chadalapaka
>>>> >> Networked Storage Architecture
>>>> >> Network Storage Solutions Organization
>>>> >> MS 5668   Hewlett-Packard, Roseville.
>>>> >> cbm@rose.hp.com
>>>> >>
>>>> >>
>>>> >> ______________________________________________________________
>>>> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly
>>>> >>         legal if SCSI sets the ORDERED task attribute on both the
>>>> >>         commands AND sets the NACA bit to one to indicate that Write
>>>> >>         shall be executed only if the Read did not fail (result in a
>>>> >>         Check Condition).
>>>> >>
>>>> >>         In the current case, since Read completed just fine from
>>>> >>         SCSI's point of view, SCSI is moving on to execute Write.
>>>> >>         Those read buffers had been freed up since iSCSI received an
>>>> >>         ACK at the TCP level, and since iSCSI has no other way to have
>>>> >>         the data ack'ed!
>>
>
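
The memory trade-off between options (A) and (B) in the quoted proposal can
be sketched as below. This is a minimal Python illustration, not anything
from the draft: the class name, its methods, and the idea of a cumulative
"acked through DataSN n" acknowledgement are all hypothetical. With data-ACKs
disabled the target must hold every sent PDU until status is acknowledged
(option A); with data-ACKs enabled it may free acknowledged PDUs early
(option B).

```python
class ReplayBuffer:
    """Target-side store of already-sent read-data PDUs for one task.

    Hypothetical sketch: use_data_acks=False models option (A),
    use_data_acks=True models option (B).
    """

    def __init__(self, use_data_acks):
        self.use_data_acks = use_data_acks
        self.pdus = {}  # DataSN -> payload bytes still retained

    def sent(self, data_sn, payload):
        """Record a data PDU that has just been sent to the initiator."""
        self.pdus[data_sn] = payload

    def data_ack(self, acked_through):
        """Option (B) only: free every PDU with DataSN <= acked_through."""
        if not self.use_data_acks:
            return  # option (A) has no data-ACK, so nothing can be freed
        for sn in [s for s in self.pdus if s <= acked_through]:
            del self.pdus[sn]

    def status_acked(self):
        """Status acknowledged: no further SACK can arrive for this task."""
        self.pdus.clear()

    def sack(self, data_sn):
        """Return the payload to retransmit, or None if it must be rejected."""
        return self.pdus.get(data_sn)

    def held_bytes(self):
        """Total payload bytes currently pinned in the buffer."""
        return sum(len(p) for p in self.pdus.values())
```

For example, after sending four 512-byte PDUs and receiving an
acknowledgement through DataSN 2, an option (A) buffer still holds all four
payloads while an option (B) buffer holds only the last one. Option (C),
re-reading the medium, has no buffer at all and is not modelled here.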





From owner-ips@ece.cmu.edu  Thu Apr  5 17:00:00 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA14215
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 16:59:54 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f35J7EW14151
	for ips-outgoing; Thu, 5 Apr 2001 15:07:14 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f35J6Nr14064
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 15:06:23 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id EBAF581B; Thu,  5 Apr 2001 12:06:20 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id MAA29502;
	Thu, 5 Apr 2001 12:06:14 -0700 (PDT)
Message-ID: <3ACCC361.37235861@cup.hp.com>
Date: Thu, 05 Apr 2001 12:11:29 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <C1256A25.00513DFA.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------58C65CADECB90FAE6490E98A"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------58C65CADECB90FAE6490E98A
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

julian_satran@il.ibm.com wrote:

> The reason we made SACK and status recovery
> practically a MUST is that without them we are bound to have only session
> drop as an alternative.
> If the target does not keep any information after it has sent out status it
> can't even retry a command.  And if it can retry a command it should be
> able to do SACK.
> 
> But perhaps there is a place in the market for the kind of devices Somesh
> is suggesting that do all recovery at SCSI level (and that can't copy a
> terabyte of data without a session drop).

Julian,

I fail to see why deploying recovery schemes other than SACK/SNACK will
cause session drop to occur. Initiators can choose to use the command
"retry" as a recovery or just error the command to the SCSI ULP and let
it retry. This does not imply a session [or connection] drop. (?)

- Santosh


> 
> If that is true (which I doubt) we can make SNACK support optional.
> 
> Julo
> 
> "Mallikarjun C." <cbm@rose.hp.com> on 05/04/2001 04:09:46
> 
> Please respond to cbm@rose.hp.com
> 
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> 
> >
> >Santosh,
> >
> >I can't find the place where this is stated. SNACK as a PDU type is
> >mandated. But it can be rejected outright.
> 
> Sorry, you agreed that status SACK is mandatory in ERT forum last
> week in response to my comments.  Has there been a change in your opinion?
> 
> Attached is the email in a long email thread (issue 3) where you agreed
> to make this explicit in rev06.
> --
> Mallikarjun
> 
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668   Hewlett-Packard, Roseville.
> cbm@rose.hp.com
> 
> >1.2.2.2 show explicitely that SACK can be rejected. We can add a protocol
> >specific parameter in the target Logical Unit Control Page (non-setable)
> by
> >which the target will indicate support for SNACK.
> >
> >Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 04/04/2001 23:53:32
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   Julian Satran/Haifa/IBM@IBMIL
> cc:   Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> 
> julian_satran@il.ibm.com wrote:
> >
> > Jon,
> >
> > Inexpensive implementation are always free to do away with recovery. That
> > si true for targets too.
> 
> That's not the interpretation one gets from reading the spec and prior
> discussions on this list. Per the spec, support for Status SACK is
> mandatory while support for data SACK is optional.
> 
> IOW, targets MUST retains state information to satisfy a potential
> status SACK request.
> 
> - Santosh
--------------58C65CADECB90FAE6490E98A
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------58C65CADECB90FAE6490E98A--



From owner-ips@ece.cmu.edu  Thu Apr  5 17:04:50 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA14404
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 17:04:44 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f35J2De13800
	for ips-outgoing; Thu, 5 Apr 2001 15:02:13 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f35J19r13716
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 15:01:09 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id 8416E39C; Thu,  5 Apr 2001 12:01:07 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id MAA29179;
	Thu, 5 Apr 2001 12:01:03 -0700 (PDT)
Message-ID: <3ACCC22A.1403A211@cup.hp.com>
Date: Thu, 05 Apr 2001 12:06:18 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <C1256A25.00513DFA.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------6015B21294DF0996B6B53150"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------6015B21294DF0996B6B53150
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Julian,

The existing StatSN SNACK mechanism will NOT work if it is made
optional. The original request that was made in the thread
http://ips.pdl.cs.cmu.edu/mail/msg03257.html was to allow a SACK
mechanism that would allow individual Status PDUs to be acknowledged
[not SNACK which requests re-send of missing Status PDU, and thereby,
requires the target to retain state information until StatSN is
acknowledged].

If StatSN SNACK is made optional, a target that does not support SNACK
will result in holes never being filled in StatSN sequence, and thereby,
initiators being unable to acknowledge status PDUs received after the
hole. This can cause targets to hold onto stale I/O state information
for very long periods [or forever].

With the current StatSN SNACK scheme, a target can NEVER discard its old
I/O state information, since if it does so, it cannot satisfy SNACK
requests. If SNACK requests are not satisfied, holes remain in the
StatSN sequence at the initiator and it cannot acknowledge Status PDUs
received thereafter.

If we must retain the StatSN mechanism in iSCSI, then, the SACK
mechanism [as opposed to a SNACK], wherein, the initiator ack's
individual status PDUs received when a hole occurs should be the
preferred scheme. This allows both sides to continue the handshake of
resource release even in the presence of holes, without imposing
requirements on targets to retain I/O state information.

The holes created in StatSN are implicitly filled by the initiator based
on the result of its "retry" of the failed command. Alternatively, the
StatSN hole is considered to be filled if the initiator chooses not to
retry the command [ex: on ULP timeout].
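
To make the difference concrete, here is a minimal Python sketch of the
initiator-side bookkeeping. All names (StatusTracker, on_status, and so
on) are hypothetical and not taken from the draft, and StatSN
wrap-around is ignored for brevity. It shows how a single hole pins the
cumulative acknowledgement, which Status PDUs a SACK could still
acknowledge individually, and how the hole is considered filled once the
initiator has retried or given up on the command.

```python
class StatusTracker:
    """Hypothetical initiator-side record of received StatSN values."""

    def __init__(self, first_statsn=1):
        self.exp_statsn = first_statsn  # lowest StatSN not yet accounted for
        self.beyond_hole = set()        # StatSNs received past a hole

    def on_status(self, statsn):
        """Record a Status PDU; advance past any run that is now contiguous."""
        if statsn < self.exp_statsn:
            return  # duplicate of something already accounted for
        self.beyond_hole.add(statsn)
        while self.exp_statsn in self.beyond_hole:
            self.beyond_hole.remove(self.exp_statsn)
            self.exp_statsn += 1

    def cumulative_ack(self):
        """All StatSNs below this value are acknowledged."""
        return self.exp_statsn

    def selective_acks(self):
        """StatSNs past the hole that a SACK could acknowledge one by one."""
        return sorted(self.beyond_hole)

    def hole_filled(self, statsn):
        """The initiator retried the command or gave up (e.g. ULP timeout)."""
        self.on_status(statsn)


tracker = StatusTracker()
for sn in (1, 2, 4, 5):                 # the Status PDU with StatSN 3 is lost
    tracker.on_status(sn)
stuck_at = tracker.cumulative_ack()     # pinned at the hole
sackable = tracker.selective_acks()     # received past the hole
tracker.hole_filled(3)
after_fill = tracker.cumulative_ack()   # moves past everything received
```

With a cumulative-only acknowledgement the target learns nothing about
StatSN 4 and 5 until the hole is dealt with; with a SACK it could release
their state immediately.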

- Santosh


julian_satran@il.ibm.com wrote:


> 
> Mallikarjun,
> 
> You are right. Too much travel and jet lag.  The reason we made SACK and
> status recovery
> practically a MUST is that without them we are bound to have only session
> drop as an alternative.
> If the target does not keep any information after it has sent out status it
> can't even retry a command.  And if it can retry a command it should be
> able to do SACK.
> 
> But perhaps there is a place in the market for the kind of devices Somesh
> is suggesting that do all recovery at SCSI level (and that can't copy a
> terabyte of data without a session drop).
> 
> If that is true (which I doubt) we can make SNACK support optional.
> 
> Julo
> 
> "Mallikarjun C." <cbm@rose.hp.com> on 05/04/2001 04:09:46
> 
> Please respond to cbm@rose.hp.com
> 
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> 
> >
> >Santosh,
> >
> >I can't find the place where this is stated. SNACK as a PDU type is
> >mandated. But it can be rejected outright.
> 
> Sorry, you agreed that status SACK is mandatory in ERT forum last
> week in response to my comments.  Has there been a change in your opinion?
> 
> Attached is the email in a long email thread (issue 3) where you agreed
> to make this explicit in rev06.
> --
> Mallikarjun
> 
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668   Hewlett-Packard, Roseville.
> cbm@rose.hp.com
> 
> >1.2.2.2 show explicitely that SACK can be rejected. We can add a protocol
> >specific parameter in the target Logical Unit Control Page (non-setable)
> by
> >which the target will indicate support for SNACK.
> >
> >Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 04/04/2001 23:53:32
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   Julian Satran/Haifa/IBM@IBMIL
> cc:   Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> 
> julian_satran@il.ibm.com wrote:
> >
> > Jon,
> >
> > Inexpensive implementation are always free to do away with recovery. That
> > si true for targets too.
> 
> That's not the interpretation one gets from reading the spec and prior
> discussions on this list. Per the spec, support for Status SACK is
> mandatory while support for data SACK is optional.
> 
> IOW, targets MUST retains state information to satisfy a potential
> status SACK request.
> 
> - Santosh
> 
> ------------------------------------------------------------------------------
> 
> >From julian_satran@il.ibm.com Tue Mar 27 05:16:54 PST 2001
> Received: from mailhub.rose.hp.com (mailhub.rose.hp.com [15.96.64.24]) by
> core.rose.hp.com with ESMTP (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id
> FAA26277 for <cbm@core.rose.hp.com>; Tue, 27 Mar 2001 05:16:52 -0800 (PST)
> From: julian_satran@il.ibm.com
> Received: from atlrel2.hp.com (atlrel2.hp.com [15.10.184.10]) by
> mailhub.rose.hp.com with ESMTP (8.7.1/8.7.3 SMKit7.02) id FAA10600 for
> <cbm@rose.hp.com>; Tue, 27 Mar 2001 05:15:51 -0800 (PST)
> Received: from d06lmsgate.uk.ibm.COM (d06lmsgate.uk.ibm.com [195.212.29.1])
>      by atlrel2.hp.com (Postfix) with ESMTP id AFC11120
>      for <cbm@rose.hp.com>; Tue, 27 Mar 2001 08:15:49 -0500 (EST)
> Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
>      by d06lmsgate.uk.ibm.COM (1.0.0) with ESMTP id NAA50078;
>      Tue, 27 Mar 2001 13:55:50 +0100
> Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
>      by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id PAA174358;
>      Tue, 27 Mar 2001 15:09:58 +0200
> Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))
> id C1256A1C.00483A46 ; Tue, 27 Mar 2001 15:08:55 +0200
> X-Lotus-FromDomain: IBMIL@IBMDE
> To: cbm@rose.hp.com
> Cc: someshg@yahoo.com, steph@cs.uchicago.edu, hufferd@us.ibm.com,
>         cbm@rose.hp.com, ldalleore@snapserver.com,
>         Venkat Rangan <venkat@rhapsodynetworks.com>, Black_David@emc.com
> Message-ID: <C1256A1C.0048399C.00@d12mta02.de.ibm.com>
> Date: Tue, 27 Mar 2001 15:12:01 +0200
> Subject: Re: iSCSI ERT: error recovery comments
> Mime-Version: 1.0
> Content-type: text/plain; charset=us-ascii
> Content-Disposition: inline
> Status: RO
> 
> Comments in text.  Thanks Julo
> 
> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 01:41:48
> 
> Please respond to cbm@rose.hp.com
> 
> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu, Julian
>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> cc:   Black_David@emc.com
> Subject:  iSCSI ERT: error recovery comments
> 
> Hi Julian and the Team,
> 
> Here are some comments on error recovery issues.  I hope these will
> be addressed soon.  Thanks.
> 
> 1. The draft should clearly state that if a target doesn't support
>   retry (replay in my previous memo's terminology), it must not silently
>   accept a command with retry bit and re-do the I/O.
> 
> 2. Consequent to the above -
>      - Clarification required on section 6.7.1, page 83, last para.
>           Please confirm and clarify in the draft: If the target sends
>           a response with an iSCSI error response of "SACK-rejected" that
>           implicitly terminates the task - no retries are allowed. If the
>           target sends a Reject PDU with "Data SACK Reject" code, the task
>           stays open and the initiator may try to recover using SACK/retry.
> +++ I will clarify
> it will read:
>    An iSCSI target MAY reject a data-SNACK and terminate the command with
>    an iSCSI command response of SNACK rejected. In this case, the task is
>    terminated and no future action is expected at target and initiator.
> 
>    Alternatively, an iSCSI target MAY reject a data-SNACK with a reject
>    response of data SNACK rejected. In this case the task is still open and
>    may be recovered using the retry.
> 
> +++
>         - On a data digest error on a data PDU without the F-bit, the draft
>           states that the target must wait for a data PDU with the F-bit
>           (per section 6.2), then a command termination is signalled with
>           a Reject PDU!  I like the formulation in 2.4.2 better.  I
> strongly
>           recommend that similarly, the target shall send a SCSI Response
>           with a iSCSI response of "delivery subsystem failure".  In
> general,
>           I suggest that anytime a target terminates a task internally, it
>           must generate a SCSI Response PDU with an appropriate response
> code.
> +++ It reads now:
> 
>    When a target receives an iSCSI PDU with a header digest error or a
>    payload digest error in an iSCSI PDU, it MUST answer with a Reject iSCSI
>    PDU with a Reason-code of Header-Digest-error or Data-Digest-Error and
>    discard the offending PDU.  If the error is a Data-Digest-Error in a
>    Data-PDU, the target MUST either request retransmission with a R2T or
>    answer with a command response PDU with a response-code of
>    delivery-subsystem-failure and abort the task. If the target is
>    answering with an error in the command response PDU it must wait for the
>    target to receive all the data (signaled by a Data PDU with the final
>    bit Set for all outstanding R2Ts) the command response PDU.
> ++++
> 
> 3. While the following is implied in different sections, it is not
>   obvious.  Please clarify the following in the draft - "Status SACK
>   support is mandatory, whereas data SACK support is not."
> 
> +++ will do in 2.16.1 +++
> 
> 4. The general policy of retry should be that all ordered commands
>   shall support retry bit, since the loss of an ordered command
>   creates a hole in target scoreboarding and stalls the target
>   pipeline.  Retry hopefully can plug the hole quickly to avoid this.
> 
> 5. As a fallout of the above comment, Retry bit must be supported
>   for Text Commands.
> 
> +++ I have added the X-bit.  The reason I did no earlier was that I could
> not foresee
> a case in which the command is not idempotent - I can allways be resent -
> but I guess it is cleaner with the X +++
> 
> 6. Section 2.20, page 71 on Reject must specify if a retry of the operation
>   is allowed for each Reject PDU reason code.  Lack of specification could
>   lead to interoperability issues down the road with "retry wars" raging
>   between heterogeneous implementations (ex., target rejects the retry bit,
>   initiator retries the "retry" bit,....).
> +++ the part now reads:
> 
>    The reject Reason is coded as follows:
> 
>       1 - Format Error
>       2 - Header Digest Error
>       3 - Data (payload) Digest Error
>       4 - Data-SNACK Reject
>       5 - Command Retry Reject
>       15 - Full Feature Phase Command before login
> 
>       Some of the reject reasons terminate or prevent the creation of a
>       task at the target and no retry is possible in those cases. Format
>       error for a command, Command Retry Reject and Full Feature Phase
>       Command before login are in this category.
> 
> 7. NOP-OUT does not require CmdSNs.  Why make it an ordered command
>   and run the risk of a digest error on it leading to a hole in
>   command ordering?
> 
> +++ the reason I wanted it ordered is to check the whole command path - but
> you may try to convince me that it is not a good idea +++
> --
> Mallikarjun
> 
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668   Hewlett-Packard, Roseville.
> cbm@rose.hp.com
--------------6015B21294DF0996B6B53150
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------6015B21294DF0996B6B53150--



From owner-ips@ece.cmu.edu  Thu Apr  5 17:45:13 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA15293
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 17:45:12 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f35GhAe04430
	for ips-outgoing; Thu, 5 Apr 2001 12:43:10 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f35GgMr04357
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 12:42:22 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id SAA129924
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 18:42:10 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id SAA30158
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 18:38:57 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A25.005B743A ; Thu, 5 Apr 2001 18:38:55 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A25.00513DFA.00@d12mta02.de.ibm.com>
Date: Thu, 5 Apr 2001 11:27:47 +0200
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Mallikarjun,

You are right. Too much travel and jet lag.  The reason we made SACK and
status recovery practically a MUST is that without them we are bound to
have only session drop as an alternative.  If the target does not keep any
information after it has sent out status it can't even retry a command.
And if it can retry a command it should be able to do SACK.

But perhaps there is a place in the market for the kind of devices Somesh
is suggesting that do all recovery at SCSI level (and that can't copy a
terabyte of data without a session drop).

If that is true (which I doubt) we can make SNACK support optional.

Julo

"Mallikarjun C." <cbm@rose.hp.com> on 05/04/2001 04:09:46

Please respond to cbm@rose.hp.com

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"




>
>Santosh,
>
>I can't find the place where this is stated. SNACK as a PDU type is
>mandated. But it can be rejected outright.

Sorry, you agreed that status SACK is mandatory in ERT forum last
week in response to my comments.  Has there been a change in your opinion?

Attached is the email in a long email thread (issue 3) where you agreed
to make this explicit in rev06.
--
Mallikarjun


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668   Hewlett-Packard, Roseville.
cbm@rose.hp.com


>1.2.2.2 show explicitely that SACK can be rejected. We can add a protocol
>specific parameter in the target Logical Unit Control Page (non-setable)
by
>which the target will indicate support for SNACK.
>
>Julo

Santosh Rao <santoshr@cup.hp.com> on 04/04/2001 23:53:32

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"





julian_satran@il.ibm.com wrote:
>
> Jon,
>
> Inexpensive implementation are always free to do away with recovery. That
> si true for targets too.

That's not the interpretation one gets from reading the spec and prior
discussions on this list. Per the spec, support for Status SACK is
mandatory while support for data SACK is optional.

IOW, targets MUST retain state information to satisfy a potential
status SACK request.

- Santosh


------------------------------------------------------------------------------


From julian_satran@il.ibm.com Tue Mar 27 05:16:54 PST 2001
Received: from mailhub.rose.hp.com (mailhub.rose.hp.com [15.96.64.24]) by core.rose.hp.com with ESMTP (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id FAA26277 for <cbm@core.rose.hp.com>; Tue, 27 Mar 2001 05:16:52 -0800 (PST)
From: julian_satran@il.ibm.com
Received: from atlrel2.hp.com (atlrel2.hp.com [15.10.184.10]) by mailhub.rose.hp.com with ESMTP (8.7.1/8.7.3 SMKit7.02) id FAA10600 for <cbm@rose.hp.com>; Tue, 27 Mar 2001 05:15:51 -0800 (PST)
Received: from d06lmsgate.uk.ibm.COM (d06lmsgate.uk.ibm.com [195.212.29.1])
     by atlrel2.hp.com (Postfix) with ESMTP id AFC11120
     for <cbm@rose.hp.com>; Tue, 27 Mar 2001 08:15:49 -0500 (EST)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
     by d06lmsgate.uk.ibm.COM (1.0.0) with ESMTP id NAA50078;
     Tue, 27 Mar 2001 13:55:50 +0100
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
     by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id PAA174358;
     Tue, 27 Mar 2001 15:09:58 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999)) id C1256A1C.00483A46 ; Tue, 27 Mar 2001 15:08:55 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: cbm@rose.hp.com
Cc: someshg@yahoo.com, steph@cs.uchicago.edu, hufferd@us.ibm.com,
        cbm@rose.hp.com, ldalleore@snapserver.com,
        Venkat Rangan <venkat@rhapsodynetworks.com>, Black_David@emc.com
Message-ID: <C1256A1C.0048399C.00@d12mta02.de.ibm.com>
Date: Tue, 27 Mar 2001 15:12:01 +0200
Subject: Re: iSCSI ERT: error recovery comments
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Status: RO



Comments in text.  Thanks Julo

"Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 01:41:48

Please respond to cbm@rose.hp.com

To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu, Julian
      Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
cc:   Black_David@emc.com
Subject:  iSCSI ERT: error recovery comments




Hi Julian and the Team,

Here are some comments on error recovery issues.  I hope these will
be addressed soon.  Thanks.

1. The draft should clearly state that if a target doesn't support
  retry (replay in my previous memo's terminology), it must not silently
  accept a command with retry bit and re-do the I/O.

2. Consequent to the above -
     - Clarification required on section 6.7.1, page 83, last para.
          Please confirm and clarify in the draft: If the target sends
          a response with an iSCSI error response of "SACK-rejected" that
          implicitly terminates the task - no retries are allowed. If the
          target sends a Reject PDU with "Data SACK Reject" code, the task
          stays open and the initiator may try to recover using SACK/retry.
+++ I will clarify
it will read:
   An iSCSI target MAY reject a data-SNACK and terminate the command with
   an iSCSI command response of SNACK rejected. In this case, the task is
   terminated and no future action is expected at target and initiator.

   Alternatively, an iSCSI target MAY reject a data-SNACK with a reject
   response of data SNACK rejected. In this case the task is still open and
   may be recovered using the retry.

+++
        - On a data digest error on a data PDU without the F-bit, the draft
          states that the target must wait for a data PDU with the F-bit
          (per section 6.2), then a command termination is signalled with
          a Reject PDU!  I like the formulation in 2.4.2 better.  I strongly
          recommend that similarly, the target shall send a SCSI Response
          with an iSCSI response of "delivery subsystem failure".  In
          general, I suggest that anytime a target terminates a task
          internally, it must generate a SCSI Response PDU with an
          appropriate response code.
+++ It reads now:

   When a target receives an iSCSI PDU with a header digest error or a
   payload digest error in an iSCSI PDU, it MUST answer with a Reject iSCSI
   PDU with a Reason-code of Header-Digest-error or Data-Digest-Error and
   discard the offending PDU.  If the error is a Data-Digest-Error in a
   Data-PDU, the target MUST either request retransmission with a R2T or
   answer with a command response PDU with a response-code of
   delivery-subsystem-failure and abort the task. If the target is
   answering with an error in the command response PDU it must wait to
   receive all the data (signaled by a Data PDU with the final bit set for
   all outstanding R2Ts) before sending the command response PDU.
++++
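
The digest-error rule quoted above can be sketched as a small decision
function. This is only an illustration in Python: the function name, the
action strings and the use_r2t policy flag are hypothetical, and the
ordering "wait for the remaining data, then send the command response"
is one reading of the last sentence of the quoted text, not confirmed
wording from the draft.

```python
def on_digest_error(error, in_data_pdu=False, use_r2t=True):
    """Return the target's actions, in order, for a PDU that failed a digest.

    error        -- "header" or "data" (which digest failed)
    in_data_pdu  -- True if the offending PDU is a Data PDU
    use_r2t      -- hypothetical policy knob: recover with R2T, or abort
    """
    if error not in ("header", "data"):
        raise ValueError("error must be 'header' or 'data'")
    reason = "Header-Digest-Error" if error == "header" else "Data-Digest-Error"
    # Every digest error is answered with a Reject and the PDU is dropped.
    actions = ["send Reject PDU (" + reason + ")", "discard offending PDU"]
    if error == "data" and in_data_pdu:
        if use_r2t:
            actions.append("send R2T to request retransmission")
        else:
            # Assumed ordering: collect the outstanding data first.
            actions += [
                "wait for final Data PDU of every outstanding R2T",
                "send command response (delivery-subsystem-failure)",
                "abort task",
            ]
    return actions
```

Calling on_digest_error("data", in_data_pdu=True, use_r2t=False) yields
the Reject and discard steps followed by the wait, the error response and
the abort.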

3. While the following is implied in different sections, it is not
  obvious.  Please clarify the following in the draft - "Status SACK
  support is mandatory, whereas data SACK support is not."

+++ will do in 2.16.1 +++

4. The general policy of retry should be that all ordered commands
  shall support retry bit, since the loss of an ordered command
  creates a hole in target scoreboarding and stalls the target
  pipeline.  Retry hopefully can plug the hole quickly to avoid this.

5. As a fallout of the above comment, Retry bit must be supported
  for Text Commands.

+++ I have added the X-bit.  The reason I did not do so earlier was that I
could not foresee a case in which the command is not idempotent - it can
always be resent - but I guess it is cleaner with the X +++


6. Section 2.20, page 71 on Reject must specify if a retry of the operation
  is allowed for each Reject PDU reason code.  Lack of specification could
  lead to interoperability issues down the road with "retry wars" raging
  between heterogeneous implementations (ex., target rejects the retry bit,
  initiator retries the "retry" bit,....).
+++ the part now reads:

   The reject Reason is coded as follows:

      1 - Format Error
      2 - Header Digest Error
      3 - Data (payload) Digest Error
      4 - Data-SNACK Reject
      5 - Command Retry Reject
      15 - Full Feature Phase Command before login

      Some of the reject reasons terminate or prevent the creation of a
      task at the target and no retry is possible in those cases. Format
      error for a command, Command Retry Reject and Full Feature Phase
      Command before login are in this category.
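
As an illustration only, the reason codes above and the "no retry"
category could be captured as follows in Python. The identifier names
and the helper rules_out_retry are hypothetical; the numeric values and
the three members of the category come from the text above. The text
limits the Format Error case to commands, which this sketch does not
model, and it makes no claim about whether the remaining codes are
retryable.

```python
from enum import IntEnum


class RejectReason(IntEnum):
    """Reject reason codes as listed in the proposed draft text."""
    FORMAT_ERROR = 1
    HEADER_DIGEST_ERROR = 2
    DATA_DIGEST_ERROR = 3
    DATA_SNACK_REJECT = 4
    COMMAND_RETRY_REJECT = 5
    FULL_FEATURE_BEFORE_LOGIN = 15


# Reasons that terminate or prevent creation of a task at the target.
# Assumption: every Format Error is treated as a command format error.
NO_RETRY = frozenset({
    RejectReason.FORMAT_ERROR,
    RejectReason.COMMAND_RETRY_REJECT,
    RejectReason.FULL_FEATURE_BEFORE_LOGIN,
})


def rules_out_retry(code):
    """True if the reason code falls in the 'no retry is possible' category."""
    return RejectReason(code) in NO_RETRY
```

An initiator that consults such a table before resending avoids the
"retry wars" scenario, since a Command Retry Reject is never answered
with another retry.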


7. NOP-OUT does not require CmdSNs.  Why make it an ordered command
  and run the risk of a digest error on it leading to a hole in
  command ordering?

+++ the reason I wanted it ordered is to check the whole command path - but
you may try to convince me that it is not a good idea +++
--
Mallikarjun


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668   Hewlett-Packard, Roseville.
cbm@rose.hp.com








From owner-ips@ece.cmu.edu  Thu Apr  5 17:54:47 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA15434
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 17:54:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f35HH5C06593
	for ips-outgoing; Thu, 5 Apr 2001 13:17:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f35HGPr06529
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 13:16:25 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id NAA29457; Thu, 5 Apr 2001 13:16:13 -0400 (EDT)
Message-ID: <3ACCA828.82194501@cisco.com>
Date: Thu, 05 Apr 2001 12:15:20 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: Jon Hall <jhall@emc.com>
CC: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <200104051355.JAA01158@lub1028.lss.emc.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Jon-

Part of the reason that the session recovery mechanism was added
was to support "non-idempotent" devices, such as tape drives,
media changers, printers (although there are better ways to get
to printers over a network than iSCSI), and so on.  Disks can
usually deal with error recovery at the SCSI level, since it's
not so bad to (carefully) re-try reads and writes.  Multipath
drivers, such as PowerPath, deal with these over multiple interfaces
as well.  However, stream commands do not include the concept
of a block offset, so sending a write twice will write two blocks
on the tape, rather than just writing the same location twice as
in a disk.  SCSI tape drivers do not attempt this sort of recovery;
it has to either be handled by the application (most applications
just abort the backup, restore, or whatever), or by the transport,
which is what the use of StatSN was created to provide.
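
A toy Python model of the difference, for illustration only (the Disk
and Tape classes are hypothetical and ignore everything about real SCSI
devices except where a write lands):

```python
class Disk:
    """Block device: every write names the block it targets."""

    def __init__(self, nblocks):
        self.blocks = [b""] * nblocks

    def write(self, lba, data):
        self.blocks[lba] = data      # same location, same end state


class Tape:
    """Stream device: a write lands wherever the medium is positioned."""

    def __init__(self):
        self.records = []

    def write(self, data):
        self.records.append(data)    # each write adds a new record


disk, tape = Disk(8), Tape()
for _ in range(2):                   # original command plus a blind retry
    disk.write(3, b"payload")
    tape.write(b"payload")
disk_copies = disk.blocks.count(b"payload")
tape_copies = len(tape.records)
```

After the blind retry the disk still holds one copy of the payload,
while the tape holds two, which is why the retry cannot simply be left
to the SCSI level for stream devices.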

DataSN and S(N)ACK were added later to provide a finer-grained
recovery when a single command is transferring a large amount of
data, but I agree that that part is overkill.  Just re-sending the
whole command with all of its data is good enough.  Nevertheless,
we still need to keep and acknowledge status using StatSN on at
least the non-disk devices.

So it's not just a matter of cheap/expensive devices, it's also
a matter of device type.

Regards,

Mark

Jon Hall wrote:
> 
> Julian,
> 
> I don't understand.  Are you saying that an "expensive" target
> will implement specific error recovery mechanisms for very rare
> events?  Or are you saying that this case is not a rare event?
> 
> If the former, there is a problem of completeness (e.g., should
> there be recovery procedures for when the sun goes nova :-).
> If the latter, this would be very interesting and useful to
> know about...
> 
> -Jon
> 
> julian_satran@il.ibm.com writes:
> >
> >Jon,
> >
> >Inexpensive implementation are always free to do away with recovery. That
> >si true for targets too.
> >But not specifying the mechanism for the more expensive one we make them
> >non-interoperable.
> >
> >Julo
> >
> >"Jon Hall" <jhall@emc.com> on 04/04/2001 22:55:40
> >
> >Please respond to "Jon Hall" <jhall@emc.com>
> >
> >To:   ips@ece.cmu.edu
> >cc:
> >Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> >
> >But CRC errors are not really the issue.  It is the
> >singular case of a TCP cksum failing to detect what a
> >CRC succeeds in detecting, and this occurring to a TCP
> >segment containing an iSCSI hdr with a StatSN.
> >
> >Is there a reason to believe that iSCSI StatSNs will be
> >lost at a higher rate than is currently documented for TCP
> >cksum failure?  Or, is the problem a loss of one TCP segment
> >in tens (possibly hundreds) of millions of segments.  Where
> >the bad segment may contain a StatSN but probably doesn't
> >because it is a data pdu.  If the latter, why does a SCSI-level
> >timeout and retry (on the initiator) not suffice?  [Note,
> >an initiator timeout/retry does not require a connection
> >to be closed.]
> >
> >I realize that I am being annoyingly repetitious, but it is
> >not an idle question.  For some targets, retained rsp status
> >is not cheap (and retained rsp data is not tractable at all).
> >
> >IMO there appears to be no real need for SNACK.  And, more
> >radically, there appears to be no need for StatSNs.
> >
> >Maybe, as Somesh said, this is a dead horse but why include
> >something in the spec which suggests a need for target-side
> >complexity, while not solving a clear and compelling
> >requirement?
> >
> >-Jon
> >
> >julian_satran@il.ibm.com writes:
> >>
> >>SNACK is here for two reasons - Status retry (which is cheap) and Data
> >>retry as a side benefit.
> >>CRC errors are not that rare (although we don't have real data, the
> >>simulations with file systems seem to indicate that numbers could be as high
> >>as 0.0002%). A restart of the link is expensive (slow start), and even if the
> >>numbers are far lower, for many applications a slow start is a painful event.
> >>
> >>Removing them from the spec is not a path we should take lightly.
> >>
> >>Julo
> >>
> >>"Jon Hall" <jhall@emc.com> on 02/04/2001 16:13:35
> >>
> >>Please respond to "Jon Hall" <jhall@emc.com>
> >>
> >>To:   ips@ece.cmu.edu
> >>cc:
> >>Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>
> >>
> >>
> >>
> >>
> >>I agree with Somesh.  And would go farther -- the complexity
> >>that results from retaining enough target-side state to respond
> >>to a SACK/SNACK request is non-trivial and needs clear justification.
> >>Intuitively, a CRC that discovers an error in an iSCSI pdu header
> >>(that the TCP cksum missed) seems like it should be a rare event.
> >>
> >>What is the frequency of this event?  IMO the answer to this
> >>question should be written into the protocol spec -- assuming
> >>that it substantiates the benefit of SACK/SNACK.  Otherwise, the
> >>SACK/SNACK pdu should be removed.
> >>
> >>-Jon
> >>
> >>julian_satran@il.ibm.com writes:
> >>>
> >>>Somesh,
> >>>
> >>>As I stated earlier - the DataSN was created to detect missing data PDUs.
> >>>SNACK is needed to recover missing StatusSN and missing dataSN is only a
> >>>bonus if the target wants to support it.  It is a trivial mechanism and I
> >>>think it should stay.
> >>>
> >>>Julo
> >>>
> >>>"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
> >>>
> >>>Please respond to someshg@yahoo.com
> >>>
> >>>To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> >>>cc:
> >>>Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>>
> >>>
> >>>
> >>>
> >>>Sorry to have been missing for a while. Hope you will
> >>>appreciate my being back in action :-). It was a fairly
> >>>clear consensus in Orlando that applications broke up
> >>>their transfers into reasonably small chunks i.e. they
> >>>did not have very long running transfers.
> >>>
> >>>Therefore the consensus was that a command level recovery
> >>>mechanism was sufficient instead of an ack/sack for each
> >>>data PDU.
> >>>
> >>>The SACK mechanism was a post Orlando invention. Without
> >>>an ack mechanism (for every data PDU), the SACK mechanism
> >>>just imposes additional burden on either end of the session,
> >>>without really much benefit.
> >>>
> >>>The benefit of having SACK is of saving bandwidth in case
> >>>the data part of the data PDU failed an integrity check
> >>>(but passed TCP checksum). This is a rare enough case that
> >>>as a percentage, the bandwidth loss from retransmitting
> >>>all the data associated with a read or write command is
> >>>very very small.
> >>>
> >>>In addition, it avoids the complexity of restarting
> >>>something from the middle, as compared to from the beginning.
> >>>
> >>>To me it seems that there is significant simplicity (from
> >>>implementation, reliability and recovery process) from
> >>>having smaller data transfer per command.
> >>>
> >>>I would really like to get rid of the SACK command.
> >>>
> >>>Somesh
> >>>
> >>>> -----Original Message-----
> >>>> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> >>>> julian_satran@il.ibm.com
> >>>> Sent: Wednesday, March 28, 2001 6:57 AM
> >>>> To: ips@ece.cmu.edu
> >>>> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> Mallikarjun,
> >>>>
> >>>> Last summer I thought that recovery within a connection should be left to
> >>>> TCP. It is simple and could be made available through IPsec (if no new
> >>>> option of any form can be added).
> >>>>
> >>>> Two things killed this:
> >>>>
> >>>>    The requirement to have a data encapsulation that can pass through
> >>>>    application proxies (like a storage router)
> >>>>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
> >>>>    header
> >>>>
> >>>>
> >>>> As for the ACK - I am very much in favor of it (it is a no brainer) and
> >>>> implementations are in fact allowed to drop even unacked data.
> >>>>
> >>>> I am bound by the Orlando meeting decision to drop it. Except for the
> >>>> regular "oppose everything" crowd, the two vocal opponents were Somesh
> >>>> Gupta and Matt Wakeley.
> >>>>
> >>>> David may or may not want to re-open the issue - I am not going to ask
> >>>> for it.
> >>>>
> >>>> Regards,
> >>>> Julo
> >>>>
> >>>> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
> >>>>
> >>>> Please respond to cbm@rose.hp.com
> >>>>
> >>>> To:   Black_David@emc.com
> >>>> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com, someshg@yahoo.com,
> >>>>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
> >>>>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
> >>>> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> David and Julian,
> >>>>
> >>>> I appreciate both your views, and should I say that they're
> >>>> along predicted lines :-)
> >>>>
> >>>> - David's right in saying that the situation is akin to FC's.
> >>>>   However, I would like to point out that FC is an unreliable
> >>>>   transport, and hence is forced to pick up a lot of the transport
> >>>>   baggage (at least in FCP-2, as I understand), in addition
> >>>>   to being a SCSI encapsulation layer.  Unfortunately, even with
> >>>>   TCP being the "reliable" transport, iSCSI is going along the
> >>>>   same lines - ie. transport baggage + SCSI encapsulation.  My
> >>>>   point is - if this is indeed a necessary evil, why don't we
> >>>>   complete iSCSI's transport functionality by data-ACKs?
> >>>>
> >>>> - If data SACK is introduced mostly to make up for TCP's shortcomings,
> >>>>   we're making its usage (and implementation) drastically less appealing
> >>>>   since the only way error recovery algorithms can *rely* on data SACK
> >>>>   is when replay is supported (or, "ReplaySupport=yes" in my proposal),
> >>>>   which is extremely expensive.  IOW, we're defining data SACK in the
> >>>>   draft and not providing any incentives to implement and use it!
> >>>>
> >>>> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
> >>>>   protocol in its definition so far (and I believe, rightly so - mandating
> >>>>   command ordering, bi-di support, SCSI CRN support to name a few examples),
> >>>>   the perfectly SCSI-legal R/W interactions that break in other transports
> >>>>   *do not* have to break in iSCSI.
> >>>>
> >>>> - A last idea (may seem radical at this point) in regards to iSCSI
> >>>>   being a "full transport". This provides us an opportunity to "cast
> >>>>   off" the transport baggage in future when we truly move to a "reliable"
> >>>>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
> >>>>   keeping the encapsulation stuff separate from the transport stuff.
> >>>>   (Julian, I heard from Randy that ideas similar to this were explored
> >>>>   in your Haifa meeting.  And yes, he recalls they were given up since
> >>>>   TCP was supposed to be reliable and granularity of recovery was deemed
> >>>>   one I/O.)
> >>>>
> >>>> With that said, may I request David (with his co-chair hat on, :-))
> >>>> to add some binding comments/observations on this discussion?
> >>>>
> >>>> If we decide to leave data SACKs as unattractive to implement, the draft
> >>>> should in the least add a statement like - "Note that satisfying all
> >>>> possible data SACK requests for a task with an unacknowledged status
> >>>> implies implementing the I/O replay buffer on the part of targets."
> >>>> --
> >>>> Mallikarjun
> >>>>
> >>>>
> >>>> Mallikarjun Chadalapaka
> >>>> Networked Storage Architecture
> >>>> Network Storage Solutions Organization
> >>>> MS 5668   Hewlett-Packard, Roseville.
> >>>> cbm@rose.hp.com
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> >I think Julian's basically right -- I would point
> >>>> >out that any case of write after read that breaks
> >>>> >over iSCSI will also break over Fibre Channel.
> >>>> >On FC, the scenario starts with a frame CRC failure
> >>>> >on read data at the Initiator, so applications
> >>>> >have to cope and typically do so by enforcing
> >>>> >ordering at the app rather than using SCSI task
> >>>> >ordering.
> >>>> >
> >>>> >While SCSI has clever tools like ACA and task
> >>>> >ordering that appear to allow dependent operations
> >>>> >to be sent to the target concurrently, in practice
> >>>> >they don't work and/or aren't used (funny thing,
> >>>> >those two reinforce each other ;-) ).  Hence
> >>>> >a minimal approach to them is in order:
> >>>> >- Make sure the result will interoperate.
> >>>> >- Make sure T10 doesn't ding us for leaving something
> >>>> >    completely out.
> >>>> >- Don't specify anything not needed for the above.
> >>>> >
> >>>> >My 0.02,
> >>>> >--David
> >>>> >
> >>>> >> -----Original Message-----
> >>>> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> >>>> >> Sent:  Tuesday, March 27, 2001 9:23 AM
> >>>> >> To:    cbm@rose.hp.com
> >>>> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
> >>>> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
> >>>> >> Black_David@emc.com
> >>>> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Mallikarjun,
> >>>> >>
> >>>> >> I commiserate with you at the lack of ack for data but the Orlando meeting
> >>>> >> stated - no.  Recall that I kept the number only as a mechanism to detect
> >>>> >> missing packets.
> >>>> >>
> >>>> >> You can achieve the effect you want by keeping around data for a while
> >>>> >> (you determine how long and then discard).
> >>>> >>
> >>>> >> If a SACK comes and you can recover - fine. If not you either reaccess the
> >>>> >> media (if you know how) or reject and let the initiator retry.
> >>>> >>
> >>>> >> You should not worry about R/W conflicts as programs bound to have such
> >>>> >> conflicts either:
> >>>> >>
> >>>> >> 1) can live with them or
> >>>> >> 2) protect themselves through some locks and rely on "operation-end-status"
> >>>> >>    to keep results deterministic.
> >>>> >>
> >>>> >> Regards,
> >>>> >> Julo
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
> >>>> >>
> >>>> >> Please respond to cbm@rose.hp.com
> >>>> >>
> >>>> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
> >>>> >>       Julian Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> >>>> >> cc:   Black_David@emc.com
> >>>> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Hi Error Recovery Team,
> >>>> >>
> >>>> >> iSCSI can discard PDUs because of digest errors and request
> >>>> >> retransmissions using the iSCSI data SACK.  To deal with such
> >>>> >> an eventuality, targets that want to support data SACK have
> >>>> >> the following options:
> >>>> >>
> >>>> >> (A) maintain a complete "replay" buffer for the entire I/O since
> >>>> >>   a SACK could come anytime before the status is ack'ed by the
> >>>> >>   initiator. [ simple, but extremely expensive in memory resources]
> >>>> >>
> >>>> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
> >>>> >>   This enables keeping only those I/O buffers that haven't been ack'ed
> >>>> >>   by the initiator. IOW, become a real full transport! [ everyone
> >>>> >>   disliked it earlier...]
> >>>> >>
> >>>> >> (C) re-access the medium for data retransmission requests.  Now there
> >>>> >>   are 3 sub-cases in this to handle the changed data on the medium in a
> >>>> >>   write-after-read scenario.  (SEE NOTE.1 at the bottom on how it is
> >>>> >>   legal.)
> >>>> >>      (1) On seeing any write, stall till status is ack'ed for all the
> >>>> >>          previous reads (basically drain the pipe). [simple, but incurs
> >>>> >>          an additional roundtrip delay for all writes].
> >>>> >>      (2) A variation of the above, keep an eye only on the prior
> >>>> >>          overlapping reads. [more BW efficient, but complicated to
> >>>> >>          resolve the block dependencies in a stream of reads followed
> >>>> >>          by writes]
> >>>> >>      (3) Document the caveat and leave it up to the applications
> >>>> >>          to avoid this case since this leads to data integrity issues.
> >>>> >>          [pushing to apps since the transport can't get it right!]
> >>>> >>
> >>>> >>
> >>>> >> My first preference is (B), followed by (A), and I suggest we not go
> >>>> >> to (C) at all with its inherent dangers.
> >>>> >>
> >>>> >> Doing (B) naturally completes the transport job that iSCSI has taken
> >>>> >> on itself in view of TCP's claimed unreliable checksum.  That is the
> >>>> >> right thing to do architecturally instead of being a "semi-transport"!
> >>>> >>
> >>>> >> Comments?
> >>>> >> --
> >>>> >> Mallikarjun
> >>>> >>
> >>>> >>
> >>>> >> Mallikarjun Chadalapaka
> >>>> >> Networked Storage Architecture
> >>>> >> Network Storage Solutions Organization
> >>>> >> MS 5668   Hewlett-Packard, Roseville.
> >>>> >> cbm@rose.hp.com
> >>>> >>
> >>>> >>
> >>>>
> >>>> >> __________________________________________________________________________
> >>>> >>
> >>>> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly legal
> >>>> >>         if SCSI sets the ORDERED task attribute on both the commands AND
> >>>> >>         sets the NACA bit to one to indicate that Write shall be executed
> >>>> >>         only if the Read did not fail (result in a Check Condition).
> >>>> >>
> >>>> >>         In the current case, since Read completed just fine from SCSI's
> >>>> >>         point of view, SCSI is moving on to execute Write.  Those read
> >>>> >>         buffers had been freed up since iSCSI received an ACK at the TCP
> >>>> >>         level, and since iSCSI has no other way to have the data ack'ed!
> >>
> >

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Thu Apr  5 19:24:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA16464
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 19:24:18 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f35Im9D12812
	for ips-outgoing; Thu, 5 Apr 2001 14:48:09 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f35IlMr12772
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 14:47:22 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id A1AB6132D; Thu,  5 Apr 2001 11:47:11 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id LAA28139;
	Thu, 5 Apr 2001 11:46:27 -0700 (PDT)
Message-ID: <3ACCBEBE.A3AA98D3@cup.hp.com>
Date: Thu, 05 Apr 2001 11:51:42 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Mark Bakke <mbakke@cisco.com>
Cc: Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <200104051355.JAA01158@lub1028.lss.emc.com> <3ACCA828.82194501@cisco.com>
Content-Type: multipart/mixed;
 boundary="------------93DA6649401A4C0BBA094201"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------93DA6649401A4C0BBA094201
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Mark Bakke wrote:

> Part of the reason that the session recovery mechanism was added
> was to support "non-idempotent" devices, such as tape drives,
> media changers, printers (although there are better ways to get
> to printers over a network than iSCSI), and so on.  Disks can
> usually deal with error recovery at the SCSI level, since it's
> not so bad to (carefully) re-try reads and writes.  Multipath
> drivers, such as PowerPath, deal with these over multiple interfaces
> as well.  However, stream commands do not include the concept
> of a block offset, so sending a write twice will write two blocks
> on the tape, rather than just writing the same location twice as
> in a disk.  SCSI tape drivers do not attempt this sort of recovery;
> it has to either be handled by the application (most applications
> just abort the backup, restore, or whatever), or by the transport,
> which is what the use of StatSN was created to provide.

Mark,

First off, we are talking about scenarios caused by a TCP checksum
escape that was caught by an iSCSI digest [CRC]. Given the TCP checksum
escape rates that Pierre was quoting in his earlier mail, this is too
low a rate to justify inclusion of intra-task recovery mechanisms in
iSCSI.
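
As a rough illustration of the rate argument, here is a back-of-the-envelope
sketch in Python.  It assumes each TCP segment independently carries an error
that escapes the TCP checksum with probability p.  The values used for p (one
in ten million) and k (64 segments per command) are illustrative assumptions,
not measurements, and the function names are made up for this example.

```python
def whole_command_overhead(p: float, k: int) -> float:
    """Expected extra fraction of data sent when one bad segment
    forces all k segments of the command to be resent."""
    success = (1.0 - p) ** k          # probability the whole command gets through
    return 1.0 / success - 1.0        # expected attempts minus the first one


def per_pdu_overhead(p: float) -> float:
    """Expected extra fraction of data sent when only the bad
    segment is resent (the intra-task recovery case)."""
    return p / (1.0 - p)


if __name__ == "__main__":
    p = 1e-7   # assumed checksum-escape probability per segment
    k = 64     # assumed number of segments per command
    print("whole-command retry overhead:", whole_command_overhead(p, k))
    print("per-PDU retry overhead:      ", per_pdu_overhead(p))
```

Under these assumptions both overheads are tiny fractions of the data sent,
so the bandwidth saved by intra-task recovery is correspondingly small.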

Such non-idempotent devices can respond with a Reject indicating a
reason of "command retry reject" when they see a command with the
"retry" bit set. Given the low rate of TCP checksum escapes, the impact
to sequential access media type devices is minimal, and the cost of
retaining all data buffers until StatSN has been acknowledged may well
outweigh the benefits derived from intra-task recovery.
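
The idempotency point and the Reject response can be sketched as below.  This
is a toy model, not draft text: the Disk and Tape classes, the
handle_tape_write function and the returned strings are all hypothetical
names chosen for the illustration.

```python
class Disk:
    """Block device: a write names its block, so a repeat is harmless."""
    def __init__(self):
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = data       # same location overwritten


class Tape:
    """Stream device: a write has no offset, so a repeat appends again."""
    def __init__(self):
        self.records = []

    def write(self, data):
        self.records.append(data)     # always lands after the previous record


def handle_tape_write(tape, data, retry):
    """Hypothetical target-side handling: refuse retried stream writes
    instead of keeping state to replay them."""
    if retry:
        return "REJECT: command retry reject"
    tape.write(data)
    return "GOOD"


if __name__ == "__main__":
    disk, tape = Disk(), Tape()
    for _ in range(2):                # the same write sent twice
        disk.write(7, b"payload")
        tape.write(b"payload")
    print(len(disk.blocks), len(tape.records))
    print(handle_tape_write(Tape(), b"payload", retry=True))
```

The disk ends up with one block and the tape with two records, which is why a
blind retry is unsafe for stream devices, while a Reject leaves the decision
to the initiator or application.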

There is also the question of protocol stability and simplicity. If we
keep iSCSI down to minimal additional error recovery other than existing
TCP and SCSI recovery mechanisms, we are enhancing the stability of the
solution since it relies on existing mature error recovery schemes.
Continuing to add new additional error recovery schemes into iSCSI on
top of TCP and SCSI error recovery mechanisms will turn iSCSI into
another nightmare like FC in terms of complexity and interop.

Regards,
Santosh
--------------93DA6649401A4C0BBA094201
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------93DA6649401A4C0BBA094201--



From owner-ips@ece.cmu.edu  Thu Apr  5 20:13:31 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA16794
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 20:13:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f35MTH327642
	for ips-outgoing; Thu, 5 Apr 2001 18:29:17 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from lub1028.lss.emc.com ([168.159.39.28])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f35MSYr27602
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 18:28:35 -0400 (EDT)
Received: from emc.com (IDENT:jhall@localhost.localdomain [127.0.0.1])
	by lub1028.lss.emc.com (8.9.3/8.9.3) with ESMTP id SAA04439
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 18:28:28 -0400
Message-Id: <200104052228.SAA04439@lub1028.lss.emc.com>
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
Date: Thu, 05 Apr 2001 18:28:28 -0400
From: "Jon Hall" <jhall@emc.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Mark,

Thanks for the info.  I don't claim to understand all of the
issues with stream commands, although the idempotent issue is
one that I had heard before.  Still, there seems to be no
supporting information that would suggest that this specific
error case is anything other than extremely rare.

What level of complexity should be attempted to recover from
rare errors?  Complexity causes its own error cases (as has
been discussed on the list).  And, as you observe, stream
devices must support abort and retry for extreme errors in
any case.

Given that the error is extremely rare, IMO the protocol would
benefit from the removal of StatSN/SNACK/SACK.  If there is
information that shows that this error case is not very rare,
that would be interesting.  If there is no such information,
then I think the horse is well and truly dead :-).

-Jon

Mark Bakke writes:
>Jon-
>
>Part of the reason that the session recovery mechanism was added
>was to support "non-idempotent" devices, such as tape drives,
>media changers, printers (although there are better ways to get
>to printers over a network than iSCSI), and so on.  Disks can
>usually deal with error recovery at the SCSI level, since it's
>not so bad to (carefully) re-try reads and writes.  Multipath
>drivers, such as PowerPath, deal with these over multiple interfaces
>as well.  However, stream commands do not include the concept
>of a block offset, so sending a write twice will write two blocks
>on the tape, rather than just writing the same location twice as
>in a disk.  SCSI tape drivers do not attempt this sort of recovery;
>it has to either be handled by the application (most applications
>just abort the backup, restore, or whatever), or by the transport,
>which is what the use of StatSN was created to provide.
>
>DataSN and S(N)ACK was added later to provide a finer-grained
>recovery when a single command is transferring a large amount of
>data, but I agree that that part is overkill.  Just re-sending the
>whole command with all of its data is good enough.  Nevertheless,
>we still need to keep and acknowledge status using StatSN on at
>least the non-disk devices.
>
>So it's not just a matter of cheap/expensive devices, it's also
>a matter of device type.
>
>Regards,
>
>Mark
>
>Jon Hall wrote:
>> 
>> Julian,
>> 
>> I don't understand.  Are you saying that an "expensive" target
>> will implement specific error recovery mechanisms for very rare
>> events?  Or are you saying that this case is not a rare event?
>> 
>> If the former, there is a problem of completeness (e.g., should
>> there be recovery procedures for when the sun goes nova :-).
>> If the latter, this would be very interesting and useful to
>> know about...
>> 
>> -Jon
>> 
>> julian_satran@il.ibm.com writes:
>> >
>> >Jon,
>> >
>> >Inexpensive implementations are always free to do away with recovery. That
>> >is true for targets too.
>> >But by not specifying the mechanism for the more expensive ones we make them
>> >non-interoperable.
>> >
>> >Julo
>> >
>> >"Jon Hall" <jhall@emc.com> on 04/04/2001 22:55:40
>> >
>> >Please respond to "Jon Hall" <jhall@emc.com>
>> >
>> >To:   ips@ece.cmu.edu
>> >cc:
>> >Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >
>> >
>> >But CRC errors are not really the issue.  It is the
>> >singular case of a TCP cksum failing to detect what a
>> >CRC succeeds in detecting, and this occurring to a TCP
>> >segment containing an iSCSI hdr with a StatSN.
>> >
>> >Is there a reason to believe that iSCSI StatSNs will be
>> >lost at a higher rate than is currently documented for TCP
>> >cksum failure?  Or, is the problem a loss of one TCP segment
>> >in tens (possibly hundreds) of millions of segments.  Where
>> >the bad segment may contain a StatSN but probably doesn't
>> >because it is a data pdu.  If the latter, why does a SCSI-level
>> >timeout and retry (on the initiator) not suffice?  [Note,
>> >an initiator timeout/retry does not require a connection
>> >to be closed.]
>> >
>> >I realize that I am being annoyingly repetitious, but it is
>> >not an idle question.  For some targets, retained rsp status
>> >is not cheap (and retained rsp data is not tractable at all).
>> >
>> >IMO there appears to be no real need for SNACK.  And, more
>> >radically, there appears to be no need for StatSNs.
>> >
>> >Maybe, as Somesh said, this is a dead horse but why include
>> >something in the spec which suggests a need for target-side
>> >complexity, while not solving a clear and compelling
>> >requirement?
>> >
>> >-Jon
>> >
>> >julian_satran@il.ibm.com writes:
>> >>
>> >>SNACK is here for two reasons - Status retry (which is cheap) and Data
>> >>retry as a side benefit.
>> >>CRC errors are not that rare (although we don't have real data, the
>> >>simulations with file systems seem to indicate that numbers could be as high
>> >>as 0.0002%). A restart of the link is expensive (slow start), and even if the
>> >>numbers are far lower, for many applications a slow start is a painful event.
>> >>
>> >>Removing them from the spec is not a path we should take lightly.
>> >>
>> >>Julo
>> >>
>> >>"Jon Hall" <jhall@emc.com> on 02/04/2001 16:13:35
>> >>
>> >>Please respond to "Jon Hall" <jhall@emc.com>
>> >>
>> >>To:   ips@ece.cmu.edu
>> >>cc:
>> >>Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>I agree with Somesh.  And would go farther -- the complexity
>> >>that results from retaining enough target-side state to respond
>> >>to a SACK/SNACK request is non-trivial and needs clear justification.
>> >>Intuitively, a CRC that discovers an error in an iSCSI pdu header
>> >>(that the TCP cksum missed) seems like it should be a rare event.
>> >>
>> >>What is the frequency of this event?  IMO the answer to this
>> >>question should be written into the protocol spec -- assuming
>> >>that it substantiates the benefit of SACK/SNACK.  Otherwise, the
>> >>SACK/SNACK pdu should be removed.
>> >>
>> >>-Jon
>> >>
>> >>julian_satran@il.ibm.com writes:
>> >>>
>> >>>Somesh,
>> >>>
>> >>>As I stated earlier - the DataSN was created to detect missing data PDUs.
>> >>>SNACK is needed to recover missing StatusSN and missing dataSN is only a
>> >>>bonus if the target wants to support it.  It is a trivial mechanism and I
>> >>>think it should stay.
>> >>>
>> >>>Julo
>> >>>
>> >>>"Somesh Gupta" <someshg@yahoo.com> on 31/03/2001 02:25:52
>> >>>
>> >>>Please respond to someshg@yahoo.com
>> >>>
>> >>>To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
>> >>>cc:
>> >>>Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>Sorry to have been missing for a while. Hope you will
>> >>>appreciate my being back in action :-). It was a fairly
>> >>>clear consensus in Orlando that applications broke up
>> >>>their transfers into reasonably small chunks i.e. they
>> >>>did not have very long running transfers.
>> >>>
>> >>>Therefore the consensus was that a command level recovery
>> >>>mechanism was sufficient instead of an ack/sack for each
>> >>>data PDU.
>> >>>
>> >>>The SACK mechanism was a post Orlando invention. Without
>> >>>an ack mechanism (for every data PDU), the SACK mechanism
>> >>>just imposes additional burden on either end of the session,
>> >>>without really much benefit.
>> >>>
>> >>>The benefit of having SACK is of saving bandwidth in case
>> >>>the data part of the data PDU failed an integrity check
>> >>>(but passed TCP checksum). This is a rare enough case that
>> >>>as a percentage, the bandwidth loss from retransmitting
>> >>>all the data associated with a read or write command is
>> >>>very very small.
>> >>>
>> >>>In addition, it avoids the complexity of restarting
>> >>>something from the middle, as compared to from the beginning.
>> >>>
>> >>>To me it seems that there is significant simplicity (from
>> >>>implementation, reliability and recovery process) from
>> >>>having smaller data transfer per command.
>> >>>
>> >>>I would really like to get rid of the SACK command.
>> >>>
>> >>>Somesh
>> >>>
>> >>>> -----Original Message-----
>> >>>> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
>> >>>> julian_satran@il.ibm.com
>> >>>> Sent: Wednesday, March 28, 2001 6:57 AM
>> >>>> To: ips@ece.cmu.edu
>> >>>> Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> Mallikarjun,
>> >>>>
>> >>>> Last summer I thought that recovery within a connection should be left to
>> >>>> TCP. It is simple and could be made available through IPsec (if no new
>> >>>> option of any form can be added).
>> >>>>
>> >>>> Two things killed this:
>> >>>>
>> >>>>    The requirement to have a data encapsulation that can pass through
>> >>>>    application proxies (like a storage router)
>> >>>>    The "NO WAY" message we got from IESG-Security on a CRC only IPSec
>> >>>>    header
>> >>>>
>> >>>>
>> >>>> As for the ACK - I am very much in favor of it (it is a no brainer) and
>> >>>> implementations are in fact allowed to drop even unacked data.
>> >>>>
>> >>>> I am bound by the Orlando meeting decision to drop it. Except for the
>> >>>> regular "oppose everything" crowd, the two vocal opponents were Somesh
>> >>>> Gupta and Matt Wakeley.
>> >>>>
>> >>>> David may or may not want to re-open the issue - I am not going to ask
>> >>>> for it.
>> >>>>
>> >>>> Regards,
>> >>>> Julo
>> >>>>
>> >>>> "Mallikarjun C." <cbm@rose.hp.com> on 28/03/2001 00:45:02
>> >>>>
>> >>>> Please respond to cbm@rose.hp.com
>> >>>>
>> >>>> To:   Black_David@emc.com
>> >>>> cc:   Julian Satran/Haifa/IBM@IBMIL, cbm@rose.hp.com, someshg@yahoo.com,
>> >>>>       steph@cs.uchicago.edu, John Hufferd/San Jose/IBM@IBMUS,
>> >>>>       ldalleore@snapserver.com, venkat@rhapsodynetworks.com
>> >>>> Subject:  RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> David and Julian,
>> >>>>
>> >>>> I appreciate both your views, and should I say that they're
>> >>>> along predicted lines :-)
>> >>>>
>> >>>> - David's right in saying that the situation is akin to FC's.
>> >>>>   However, I would like to point out that FC is an unreliable
>> >>>>   transport, and hence is forced to pick up a lot of the transport
>> >>>>   baggage (at least in FCP-2, as I understand), in addition
>> >>>>   to being a SCSI encapsulation layer.  Unfortunately, even with
>> >>>>   TCP being the "reliable" transport, iSCSI is going along the
>> >>>>   same lines - ie. transport baggage + SCSI encapsulation.  My
>> >>>>   point is - if this is indeed a necessary evil, why don't we
>> >>>>   complete iSCSI's transport functionality by data-ACKs?
>> >>>>
>> >>>> - If data SACK is introduced mostly to make up for TCP's shortcomings,
>> >>>>   we're making its usage (and implementation) drastically less appealing
>> >>>>   since the only way error recovery algorithms can *rely* on data SACK
>> >>>>   is when replay is supported (or, "ReplaySupport=yes" in my proposal),
>> >>>>   which is extremely expensive.  IOW, we're defining data SACK in the
>> >>>>   draft and not providing any incentives to implement and use it!
>> >>>>
>> >>>> - I submit that since iSCSI is being hailed as the ideal SCSI Transport
>> >>>>   protocol in its definition so far (and I believe, rightly so - mandating
>> >>>>   command ordering, bi-di support, SCSI CRN support to name a few examples),
>> >>>>   the perfectly SCSI-legal R/W interactions that break in other transports
>> >>>>   *do not* have to break in iSCSI.
>> >>>>
>> >>>> - A last idea (may seem radical at this point) in regards to iSCSI
>> >>>>   being a "full transport". This provides us an opportunity to "cast
>> >>>>   off" the transport baggage in future when we truly move to a "reliable"
>> >>>>   transport (perhaps TCP with CRCs/SCTP ?) - if we do a good job of
>> >>>>   keeping the encapsulation stuff separate from the transport stuff.
>> >>>>   (Julian, I heard from Randy that ideas similar to this were explored
>> >>>>   in your Haifa meeting.  And yes, he recalls they were given up since
>> >>>>   TCP was supposed to be reliable and granularity of recovery was deemed
>> >>>>   one I/O.)
>> >>>>
>> >>>> With that said, may I request David (with his co-chair hat on, :-))
>> >>>> to add some binding comments/observations on this discussion?
>> >>>>
>> >>>> If we decide to leave data SACKs as unattractive to implement, the
>> >>>> draft should at the least add a statement like - "Note that
>> >>>> satisfying all possible data SACK requests for a task with an
>> >>>> unacknowledged status implies implementing the I/O replay buffer on
>> >>>> the part of targets."
>> >>>> --
>> >>>> Mallikarjun
>> >>>>
>> >>>>
>> >>>> Mallikarjun Chadalapaka
>> >>>> Networked Storage Architecture
>> >>>> Network Storage Solutions Organization
>> >>>> MS 5668   Hewlett-Packard, Roseville.
>> >>>> cbm@rose.hp.com
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> >I think Julian's basically right -- I would point
>> >>>> >out that any case of write after read that breaks
>> >>>> >over iSCSI will also break over Fibre Channel.
>> >>>> >On FC, the scenario starts with a frame CRC failure
>> >>>> >on read data at the Initiator, so applications
>> >>>> >have to cope and typically do so by enforcing
>> >>>> >ordering at the app rather than using SCSI task
>> >>>> >ordering.
>> >>>> >
>> >>>> >While SCSI has clever tools like ACA and task
>> >>>> >ordering that appear to allow dependent operations
>> >>>> >to be sent to the target concurrently, in practice
>> >>>> >they don't work and/or aren't used (funny thing,
>> >>>> >those two reinforce each other ;-) ).  Hence
>> >>>> >a minimal approach to them is in order:
>> >>>> >- Make sure the result will interoperate.
>> >>>> >- Make sure T10 doesn't ding us for leaving something
>> >>>> >    completely out.
>> >>>> >- Don't specify anything not needed for the above.
>> >>>> >
>> >>>> >My 0.02,
>> >>>> >--David
>> >>>> >
>> >>>> >> -----Original Message-----
>> >>>> >> From:  julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
>> >>>> >> Sent:  Tuesday, March 27, 2001 9:23 AM
>> >>>> >> To:    cbm@rose.hp.com
>> >>>> >> Cc:    someshg@yahoo.com; steph@cs.uchicago.edu; hufferd@us.ibm.com;
>> >>>> >> cbm@rose.hp.com; ldalleore@snapserver.com; Venkat Rangan;
>> >>>> >> Black_David@emc.com
>> >>>> >> Subject:    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> Mallikarjun,
>> >>>> >>
>> >>>> >> I commiserate with you at the lack of ack for data but the Orlando
>> >>>> >> meeting stated - no.  Recall that I kept the number only as a
>> >>>> >> mechanism to detect missing packets.
>> >>>> >>
>> >>>> >> You can achieve the effect you want by keeping around data for a
>> >>>> >> while (you determine how long and then discard).
>> >>>> >>
>> >>>> >> If a SACK comes and you can recover - fine. If not you either
>> >>>> >> reaccess the media (if you know how) or reject and let the
>> >>>> >> initiator retry.
>> >>>> >>
>> >>>> >> You should not worry about R/W conflicts as programs bound to have
>> >>>> >> such conflicts either:
>> >>>> >>
>> >>>> >> 1)can live with them or
>> >>>> >> 2)protect themselves through some locks and rely on
>> >>>> >> "operation-end-status" to keep results deterministic.
>> >>>> >>
>> >>>> >> Regards,
>> >>>> >> Julo
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 03:34:16
>> >>>> >>
>> >>>> >> Please respond to cbm@rose.hp.com
>> >>>> >>
>> >>>> >> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu,
>> >>>Julian
>> >>>> >>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
>> >>>> >> cc:   Black_David@emc.com
>> >>>> >> Subject:  iSCSI ERT: data SACK/replay buffer/"semi-transport"
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> Hi Error Recovery Team,
>> >>>> >>
>> >>>> >> iSCSI can discard PDUs because of digest errors and request
>> >>>> >> retransmissions using the iSCSI data SACK.  To deal with such
>> >>>> >> an eventuality, targets that want to support data SACK have
>> >>>> >> the following options:
>> >>>> >>
>> >>>> >> (A) maintain a complete "replay" buffer for the entire I/O since
>> >>>> >>   a SACK could come anytime before the status is ack'ed by the
>> >>>> >>   initiator. [ simple, but extremely expensive in memory resources]
>> >>>> >>
>> >>>> >> (B) (re-introduce data-ACKs into the draft, and) implement data-ACKs.
>> >>>> >>   This enables keeping only those I/O buffers that haven't been
>> >>>> >>   ack'ed by the initiator. IOW, become a real full transport!
>> >>>> >>   [everyone disliked it earlier...]
>> >>>> >>
>> >>>> >> (C) re-access the medium for data retransmission requests.  Now
>> >>>> >>   there are 3 sub-cases in this to handle the changed data on the
>> >>>> >>   medium in a write-after-read scenario.  (SEE NOTE.1 at the bottom
>> >>>> >>   on how it is legal.)
>> >>>> >>      (1) On seeing any write, stall till status is ack'ed for all
>> >>>> >>          the previous reads (basically drain the pipe). [simple, but
>> >>>> >>          incurs an additional roundtrip delay for all writes]
>> >>>> >>      (2) A variation of the above, keep an eye only on the prior
>> >>>> >>          overlapping reads. [more BW efficient, but complicated to
>> >>>> >>          resolve the block dependencies in a stream of reads
>> >>>> >>          followed by writes]
>> >>>> >>      (3) Document the caveat and leave it up to the applications
>> >>>> >>          to avoid this case since this leads to data integrity
>> >>>> >>          issues. [pushing to apps since the transport can't get it
>> >>>> >>          right!]
>> >>>> >>
>> >>>> >> My first preference is (B), followed by (A), and I suggest we not go
>> >>>> >> to (C) at all with its inherent dangers.
>> >>>> >>
>> >>>> >> Doing (B) naturally completes the transport job that iSCSI has taken
>> >>>> >> on itself in view of TCP's claimed unreliable checksum.  That is the
>> >>>> >> right thing to do architecturally instead of being a
>> >>"semi-transport"!
>> >>>> >>
>> >>>> >> Comments?
>> >>>> >> --
>> >>>> >> Mallikarjun
>> >>>> >>
>> >>>> >>
>> >>>> >> Mallikarjun Chadalapaka
>> >>>> >> Networked Storage Architecture
>> >>>> >> Network Storage Solutions Organization
>> >>>> >> MS 5668   Hewlett-Packard, Roseville.
>> >>>> >> cbm@rose.hp.com
>> >>>> >>
>> >>>> >>
>> >>>>
>> >>>__________________________________________________________________________
>> >
>> >>>> >> Note.1: A Read followed by a Write (to the same blocks) is perfectly
>> >>>> >>         legal if SCSI sets the ORDERED task attribute on both the
>> >>>> >>         commands AND sets the NACA bit to one to indicate that Write
>> >>>> >>         shall be executed only if the Read did not fail (result in a
>> >>>> >>         Check Condition).
>> >>>> >>
>> >>>> >>         In the current case, since Read completed just fine from
>> >>>> >>         SCSI's point of view, SCSI is moving on to execute Write.
>> >>>> >>         Those read buffers had been freed up since iSCSI received an
>> >>>> >>         ACK at the TCP level, and since iSCSI has no other way to
>> >>>> >>         have the data ack'ed!
>> >>
>> >
>
>-- 
>Mark A. Bakke
>Cisco Systems
>mbakke@cisco.com
>763.398.1054
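
For readers following the replay-buffer options (A) and (B) quoted above, here is a minimal sketch of the memory trade-off. It is only an illustration under stated assumptions: the class ReplayBuffer and its methods record, ack_through and replay are hypothetical names, DataSN is treated as a plain integer, and sequence-number wraparound is ignored. It is not taken from the draft.

```python
from collections import OrderedDict


class ReplayBuffer:
    """Hypothetical per-task store of data PDUs already sent by a target."""

    def __init__(self):
        self._pdus = OrderedDict()  # DataSN -> payload bytes

    def record(self, data_sn, payload):
        # Keep a copy of every data PDU as it is sent.
        self._pdus[data_sn] = payload

    def ack_through(self, data_sn):
        # Option (B): a data-ACK frees every PDU up to and including data_sn.
        for sn in [sn for sn in self._pdus if sn <= data_sn]:
            del self._pdus[sn]

    def replay(self, first_sn, count):
        # Serve a data SACK; return None if any requested PDU is already gone.
        wanted = range(first_sn, first_sn + count)
        if any(sn not in self._pdus for sn in wanted):
            return None
        return [self._pdus[sn] for sn in wanted]

    def bytes_held(self):
        return sum(len(p) for p in self._pdus.values())


buf = ReplayBuffer()
for sn in range(8):
    buf.record(sn, bytes(512))

# Option (A): without data-ACKs everything stays until status is ack'ed.
held_without_acks = buf.bytes_held()

# Option (B): a data-ACK through DataSN 5 leaves only PDUs 6 and 7.
buf.ack_through(5)
held_with_acks = buf.bytes_held()
```

With data-ACKs the target can still satisfy a SACK for PDUs 6 and 7, while a SACK that reaches back into the acknowledged range can no longer be served from memory.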


From owner-ips@ece.cmu.edu  Thu Apr  5 20:13:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA16805
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 20:13:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f35MfNU28398
	for ips-outgoing; Thu, 5 Apr 2001 18:41:23 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from msgbas1t.cos.agilent.com (msgbas1tx.cos.agilent.com [192.6.9.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f35Mf4r28374
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 18:41:04 -0400 (EDT)
Received: from msgrel1.and.agilent.com (msgrel1.and.agilent.com [130.30.33.104])
	by msgbas1t.cos.agilent.com (Postfix) with ESMTP id 260A9BC3
	for <ips@ece.cmu.edu>; Thu,  5 Apr 2001 16:41:04 -0600 (MDT)
Received: from rtl.rose.agilent.com (rtl.rose.agilent.com [156.140.232.231])
	by msgrel1.and.agilent.com (Postfix) with ESMTP id 4584716B
	for <ips@ece.cmu.edu>; Thu,  5 Apr 2001 18:41:03 -0400 (EDT)
Received: from agilent.com (matt5670.rose.agilent.com [156.140.234.148])
	by rtl.rose.agilent.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.1.0) with ESMTP id PAA01807
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 15:41:01 -0700 (PDT)
Message-ID: <3ACCF4FC.1C01C623@agilent.com>
Date: Thu, 05 Apr 2001 15:43:08 -0700
From: Matt Wakeley <matt_wakeley@agilent.com>
Reply-To: Matt Wakeley <matt_wakeley@agilent.com>
Organization: Agilent Technologies
X-Mailer: Mozilla 4.77 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: IPS Reflector <ips@ece.cmu.edu>
Subject: Re: Initiator-detected format or digest errors
References: <C1256A24.00633D8D.00@d12mta02.de.ibm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Why not send a reject back to the originator of the bad frame, and just let
the task time out?

-Matt

julian_satran@il.ibm.com wrote:
> 
> Tom,
> 
> An initiator will pass the appropriate response to the SCSI layer and will
> abort the task if it can identify one.  Further behavior of initiators and
> targets is implementation dependent.
> 
> 6.2 specifies this.
> 
> Julo
> 
> "Thomas McSweeney" <rf42tpme@us.ibm.com> on 04/04/2001 13:57:20
> 
> Please respond to "Thomas McSweeney" <rf42tpme@us.ibm.com>
> 
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Initiator-detected format or digest errors
> 
> Section "2.20 Reject" talks about the target receiving a bad frame and
> sending Reject.  What should an Initiator do if it receives a PDU with a
> format or digest error?  Should it send Reject?  If so, we'll need to
> ensure that the Initiator fields of the MIB include an object to count
> Reject commands transmitted.
> 
> Tom McSweeney
> iSCSI Development, Storage Systems Group, IBM
> Email: rf42tpme@us.ibm.com
> Phone: (USA) 919-254-5634  (tie line: 444-5634)
> Fax:   (USA) 919-254-0391  (tie line: 444-0391)


From owner-ips@ece.cmu.edu  Thu Apr  5 20:14:16 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA16817
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 20:14:13 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f35LVB523866
	for ips-outgoing; Thu, 5 Apr 2001 17:31:11 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f35LUer23832
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 17:30:44 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id XAA181396;
	Thu, 5 Apr 2001 23:30:32 +0200
From: biran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id XAA171440;
	Thu, 5 Apr 2001 23:27:19 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A25.0075DB5D ; Thu, 5 Apr 2001 23:27:18 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: shepler@eng.sun.com
cc: ips@ece.cmu.edu
Message-ID: <C1256A25.0075D953.00@d12mta02.de.ibm.com>
Date: Fri, 6 Apr 2001 00:25:13 +0300
Subject: Re: Public key AuthMethod
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Spencer,

Thanks for pointing out this info. I had another correspondence
with the CAT-WG chair, and first, the erroneous Experimental-labeled
version has now been replaced on the IETF web site.

For the advancing question, in regard to RFC-2847 he said:

"The question of whether/when RFC-2025 will advance to Draft is a
separate and more complex one, and isn't equivalent to the question of
advancing RFC-2847 (which is a derived specification, based on a subset of
RFC-2025).  Either/both would be IESG decisions, based in part on the
status of interoperable implementations"

In view of all this, I will go ahead and define the
public key AuthMethod based on RFC-2025 (mainly because
this is the only standard we have for this) - unless
I hear strong opposition / a better suggestion.

  Regards,
    Ofer


Ofer Biran
Storage and Systems Technology
IBM Research Lab in Haifa
biran@il.ibm.com  972-4-8296253


Spencer Shepler <shepler@eng.sun.com> on 04/03/2001 09:07:20 PM

Please respond to shepler@eng.sun.com

To:   Ofer Biran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu, mike@eisler.com
Subject:  Re: Public key AuthMethod





Note that NFSv4 specifies LIPKEY (RFC2847) and SPKM-3 as mandatory to
implement.  LIPKEY/RFC2847 builds upon the SPKM RFC and defines the
SPKM-3 mechanism which meets the needs of NFSv4.  SPKM-3/LIPKEY is
being implemented and therefore the SPKM RFC may be moving forward as
part of the NFSv4 work.

Spencer

On Tue, biran@il.ibm.com wrote:
>
>
> In Minneapolis I proposed to add the public key AuthMethod based on
> SPKM (public key implementation of GSS-API, RFC-2025). SPKM is really
> suitable (it gives the exact definition of tokens to be exchanged in
> iSCSI text messages for public key authentication, including optional
> certificate exchange, and a MAC digest based on a shared key generated
> by the exchange, which might be negotiated in the iSCSI login).
>
> However, there is a question mark about the status of RFC-2025. It is on
> the standards track at Proposed Standard level, but it is from 1996... I had
> a correspondence with the CAT-WG chair, and here are two citations:
>
> "I'm unaware, however, of any current plans for advancement of this
> document beyond Proposed and it hasn't been actively discussed within
> the WG for some time. I'm also unsure as to its number of existing
> implementations."
>
> "Nonetheless, I believe that it remains well suited as a specification
> for an X.509-based authentication mechanism.  I'm not aware of an
> alternative specification with comparable scope currently defined within
> an Internet standards-track RFC"
>
> (BTW, if you look at the version linked from the RFC pages, the
> "Status of this Memo" section states:
> "This memo defines an Experimental Protocol for the Internet
> community..."
> however the same section in the version fetched from the RFC Editor-pages
> states:
> "This document specifies an Internet standards track protocol for the
>    Internet community..."
> the CAT-WG chair confirmed that the first copy is a mistake.)
>
> In any case, can we (or should we) rely on an RFC whose path to becoming
> a standard is not clear at all?
>
> Another option for the public key AuthMethod might be a reduced version
> of the TLS handshake (implemented in the iSCSI text messages, not using
> the TLS record layer).  This can provide authentication (with optional
> certificate exchange) and a shared secret that can be used for MAC
> digest according to the TLS MAC specification (but used of course as
> optional iSCSI digest and not inside TLS).
>
> I believe it's preferable to adopt an existing security standard as much
> as possible rather than inventing something new for iSCSI.
>
> I'd like to hear some opinions on these before we decide how to define
> the public key AuthMethod.
>
> Regards,
>   Ofer
>
>
> Ofer Biran
> Storage and Systems Technology
> IBM Research Lab in Haifa
> biran@il.ibm.com  972-4-8296253
>
>

--

- Spencer -







From owner-ips@ece.cmu.edu  Thu Apr  5 23:37:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA20787
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 23:37:30 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f361gIH08683
	for ips-outgoing; Thu, 5 Apr 2001 21:42:18 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f361fPr08629
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 21:41:26 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id DAA270982
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 03:41:17 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id DAA83922
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 03:39:23 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A26.0008FB49 ; Fri, 6 Apr 2001 03:38:06 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A26.0008FB1A.00@d12mta02.de.ibm.com>
Date: Fri, 6 Apr 2001 03:41:42 +0200
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

SNACK and SACK are the same thing (I just renamed them to avoid confusion
with TCP SACK).
The status is acked by ExpStatSN (and only indirectly by SNACK). SNACK
enables fast recovery of a hole (without having to resort to a timeout).
We decided long ago against individual acks, as bulk acking through a
window is cheaper and safer (repetition).

Julo
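
As an illustration of the cumulative acknowledgement described above, here is a rough initiator-side sketch. It assumes StatSN values are plain integers without wraparound, and the class StatSNTracker with its receive and hole methods is a made-up name for this example, not something defined in the draft.

```python
class StatSNTracker:
    """Hypothetical initiator-side bookkeeping for status sequence numbers."""

    def __init__(self, first_expected):
        # Next in-order StatSN; everything below it is acknowledged in bulk.
        self.exp_stat_sn = first_expected
        # StatSNs that arrived beyond a hole and cannot be acknowledged yet.
        self._early = set()

    def receive(self, stat_sn):
        if stat_sn < self.exp_stat_sn or stat_sn in self._early:
            return  # duplicate, nothing to do
        self._early.add(stat_sn)
        # Advance over every contiguous StatSN now available.
        while self.exp_stat_sn in self._early:
            self._early.remove(self.exp_stat_sn)
            self.exp_stat_sn += 1

    def hole(self):
        # Return (first_missing, run_length) to request via SNACK, or None.
        if not self._early:
            return None
        return (self.exp_stat_sn, min(self._early) - self.exp_stat_sn)


tracker = StatSNTracker(first_expected=10)
for sn in (10, 11, 13, 14):      # StatSN 12 never arrives
    tracker.receive(sn)
stalled_at = tracker.exp_stat_sn
missing = tracker.hole()

tracker.receive(12)              # the SNACK-triggered resend fills the hole
resumed_at = tracker.exp_stat_sn
```

The point of the sketch is that ExpStatSN stops at the first gap, and a single resend lets it jump past everything that was already received.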

Santosh Rao <santoshr@cup.hp.com> on 05/04/2001 21:06:18

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"




Julian,

The existing StatSN SNACK mechanism will NOT work if it is made
optional. The original request that was made in the thread
http://ips.pdl.cs.cmu.edu/mail/msg03257.html was to allow a SACK
mechanism that would allow individual Status PDUs to be acknowledged
[not SNACK which requests re-send of missing Status PDU, and thereby,
requires the target to retain state information until StatSN is
acknowledged].

If StatSN SNACK is made optional, a target that does not support SNACK
will result in holes never being filled in StatSN sequence, and thereby,
initiators being unable to acknowledge status PDUs received after the
hole. This can cause targets to hold onto stale I/O state information
for very long periods. [or forever].

With the current StatSN SNACK scheme, a target can NEVER discard its old
I/O state information, since if it does so, it cannot satisfy SNACK
requests. If SNACK requests are not satisfied, holes remain in the
StatSN sequence at the initiator and it cannot acknowledge Status PDUs
received thereafter.

If we must retain the StatSN mechanism in iSCSI, then the SACK
mechanism [as opposed to a SNACK], wherein the initiator ack's
individual status PDUs received when a hole occurs, should be the
preferred scheme. This allows both sides to continue the handshake of
resource release even in the presence of holes, without imposing
requirements on targets to retain I/O state information.

The holes created in StatSN are implicitly filled by the initiator based
on the result of its "retry" of the failed command. Alternatively, the
StatSN hole is considered to be filled if the initiator chooses not to
retry the command [ex: on ULP timeout].

- Santosh
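
The retention problem described above can be sketched from the target side. This is a simplified model under assumptions: TargetStatusStore and its methods are hypothetical names, StatSN is a plain integer, and the only release signal modelled is the cumulative ExpStatSN.

```python
class TargetStatusStore:
    """Hypothetical target-side store of status PDUs awaiting acknowledgement."""

    def __init__(self):
        self._kept = {}  # StatSN -> status PDU kept for a possible SNACK

    def send_status(self, stat_sn, pdu):
        self._kept[stat_sn] = pdu

    def on_exp_stat_sn(self, exp_stat_sn):
        # Free every status the initiator has cumulatively acknowledged.
        for sn in [sn for sn in self._kept if sn < exp_stat_sn]:
            del self._kept[sn]

    def retained(self):
        return sorted(self._kept)


target = TargetStatusStore()
for sn in range(100, 105):
    target.send_status(sn, f"status-{sn}")

# StatSN 101 is lost in transit, so the initiator can only report
# ExpStatSN=101 even though it has received 102 through 104.
target.on_exp_stat_sn(101)
stuck = target.retained()

# Once the hole is filled the initiator reports ExpStatSN=105.
target.on_exp_stat_sn(105)
after_fill = target.retained()
```

If the hole is never filled, the entries in `stuck` stay allocated indefinitely, which is the stale-state concern raised in the message.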


julian_satran@il.ibm.com wrote:


>
> Mallikarjun,
>
> You are right. Too much travel and jet lag.  The reason we made SACK and
> status recovery practically a MUST is that without them we are bound to
> have only session drop as an alternative.
> If the target does not keep any information after it has sent out status
> it can't even retry a command.  And if it can retry a command it should be
> able to do SACK.
>
> But perhaps there is a place in the market for the kind of devices Somesh
> is suggesting that do all recovery at SCSI level (and that can't copy a
> terabyte of data without a session drop).
>
> If that is true (which I doubt) we can make SNACK support optional.
>
> Julo
>
> "Mallikarjun C." <cbm@rose.hp.com> on 05/04/2001 04:09:46
>
> Please respond to cbm@rose.hp.com
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
> >
> >Santosh,
> >
> >I can't find the place where this is stated. SNACK as a PDU type is
> >mandated. But it can be rejected outright.
>
> Sorry, you agreed that status SACK is mandatory in ERT forum last
> week in response to my comments.  Has there been a change in your
> opinion?
>
> Attached is the email in a long email thread (issue 3) where you agreed
> to make this explicit in rev06.
> --
> Mallikarjun
>
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668   Hewlett-Packard, Roseville.
> cbm@rose.hp.com
>
> >1.2.2.2 shows explicitly that SACK can be rejected. We can add a
> >protocol-specific parameter in the target Logical Unit Control Page
> >(non-settable) by which the target will indicate support for SNACK.
> >
> >Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 04/04/2001 23:53:32
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   Julian Satran/Haifa/IBM@IBMIL
> cc:   Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
> julian_satran@il.ibm.com wrote:
> >
> > Jon,
> >
> > Inexpensive implementations are always free to do away with recovery.
> > That is true for targets too.
>
> That's not the interpretation one gets from reading the spec and prior
> discussions on this list. Per the spec, support for Status SACK is
> mandatory while support for data SACK is optional.
>
> IOW, targets MUST retain state information to satisfy a potential
> status SACK request.
>
> - Santosh
>
>
------------------------------------------------------------------------------

>
> >From julian_satran@il.ibm.com Tue Mar 27 05:16:54 PST 2001
> Received: from mailhub.rose.hp.com (mailhub.rose.hp.com [15.96.64.24]) by
> core.rose.hp.com with ESMTP (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id
> FAA26277 for <cbm@core.rose.hp.com>; Tue, 27 Mar 2001 05:16:52 -0800
(PST)
> From: julian_satran@il.ibm.com
> Received: from atlrel2.hp.com (atlrel2.hp.com [15.10.184.10]) by
> mailhub.rose.hp.com with ESMTP (8.7.1/8.7.3 SMKit7.02) id FAA10600 for
> <cbm@rose.hp.com>; Tue, 27 Mar 2001 05:15:51 -0800 (PST)
> Received: from d06lmsgate.uk.ibm.COM (d06lmsgate.uk.ibm.com
[195.212.29.1])
>      by atlrel2.hp.com (Postfix) with ESMTP id AFC11120
>      for <cbm@rose.hp.com>; Tue, 27 Mar 2001 08:15:49 -0500 (EST)
> Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com
[9.165.215.22])
>      by d06lmsgate.uk.ibm.COM (1.0.0) with ESMTP id NAA50078;
>      Tue, 27 Mar 2001 13:55:50 +0100
> Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
>      by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id PAA174358;
>      Tue, 27 Mar 2001 15:09:58 +0200
> Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2
5-20-1999))
> id C1256A1C.00483A46 ; Tue, 27 Mar 2001 15:08:55 +0200
> X-Lotus-FromDomain: IBMIL@IBMDE
> To: cbm@rose.hp.com
> Cc: someshg@yahoo.com, steph@cs.uchicago.edu, hufferd@us.ibm.com,
>         cbm@rose.hp.com, ldalleore@snapserver.com,
>         Venkat Rangan <venkat@rhapsodynetworks.com>, Black_David@emc.com
> Message-ID: <C1256A1C.0048399C.00@d12mta02.de.ibm.com>
> Date: Tue, 27 Mar 2001 15:12:01 +0200
> Subject: Re: iSCSI ERT: error recovery comments
> Mime-Version: 1.0
> Content-type: text/plain; charset=us-ascii
> Content-Disposition: inline
> Status: RO
>
> Comments in text.  Thanks Julo
>
> "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 01:41:48
>
> Please respond to cbm@rose.hp.com
>
> To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu, Julian
>       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> cc:   Black_David@emc.com
> Subject:  iSCSI ERT: error recovery comments
>
> Hi Julian and the Team,
>
> Here are some comments on error recovery issues.  I hope these will
> be addressed soon.  Thanks.
>
> 1. The draft should clearly state that if a target doesn't support
>   retry (replay in my previous memo's terminology), it must not silently
>   accept a command with retry bit and re-do the I/O.
>
> 2. Consequent to the above -
>      - Clarification required on section 6.7.1, page 83, last para.
>           Please confirm and clarify in the draft: If the target sends
>           a response with an iSCSI error response of "SACK-rejected" that
>           implicitly terminates the task - no retries are allowed. If the
>           target sends a Reject PDU with "Data SACK Reject" code, the
>           task stays open and the initiator may try to recover using
>           SACK/retry.
> +++ I will clarify
> it will read:
>    An iSCSI target MAY reject a data-SNACK and terminate the command with
>    an iSCSI command response of SNACK rejected. In this case, the task is
>    terminated and no future action is expected at target and initiator.
>
>    Alternatively, an iSCSI target MAY reject a data-SNACK with a reject
>    response of data SNACK rejected. In this case the task is still open
>    and may be recovered using the retry.
>
> +++
>         - On a data digest error on a data PDU without the F-bit, the
>           draft states that the target must wait for a data PDU with the
>           F-bit (per section 6.2), then a command termination is signalled
>           with a Reject PDU!  I like the formulation in 2.4.2 better.  I
>           strongly recommend that similarly, the target shall send a SCSI
>           Response with an iSCSI response of "delivery subsystem failure".
>           In general, I suggest that anytime a target terminates a task
>           internally, it must generate a SCSI Response PDU with an
>           appropriate response code.
> +++ It reads now:
>
>    When a target receives an iSCSI PDU with a header digest error or a
>    payload digest error, it MUST answer with a Reject iSCSI PDU with a
>    Reason-code of Header-Digest-error or Data-Digest-Error and discard
>    the offending PDU.  If the error is a Data-Digest-Error in a
>    Data-PDU, the target MUST either request retransmission with an R2T or
>    answer with a command response PDU with a response-code of
>    delivery-subsystem-failure and abort the task. If the target is
>    answering with an error in the command response PDU, it must wait
>    until it has received all the data (signaled by a Data PDU with the
>    final bit set for all outstanding R2Ts) before sending the command
>    response PDU.
> ++++
>
> 3. While the following is implied in different sections, it is not
>   obvious.  Please clarify the following in the draft - "Status SACK
>   support is mandatory, whereas data SACK support is not."
>
> +++ will do in 2.16.1 +++
>
> 4. The general policy of retry should be that all ordered commands
>   shall support retry bit, since the loss of an ordered command
>   creates a hole in target scoreboarding and stalls the target
>   pipeline.  Retry hopefully can plug the hole quickly to avoid this.
>
> 5. As a fallout of the above comment, Retry bit must be supported
>   for Text Commands.
>
> +++ I have added the X-bit.  The reason I did not earlier was that I
> could not foresee a case in which the command is not idempotent - it can
> always be resent - but I guess it is cleaner with the X +++
>
> 6. Section 2.20, page 71 on Reject must specify if a retry of the
>   operation is allowed for each Reject PDU reason code.  Lack of
>   specification could lead to interoperability issues down the road with
>   "retry wars" raging between heterogeneous implementations (ex., target
>   rejects the retry bit, initiator retries the "retry" bit,....).
> +++ the part now reads:
>
>    The reject Reason is coded as follows:
>
>       1 - Format Error
>       2 - Header Digest Error
>       3 - Data (payload) Digest Error
>       4 - Data-SNACK Reject
>       5 - Command Retry Reject
>       15 - Full Feature Phase Command before login
>
>       Some of the reject reasons terminate or prevent the creation of a
>       task at the target and no retry is possible in those cases. Format
>       error for a command, Command Retry Reject and Full Feature Phase
>       Command before login are in this category.
>
> 7. NOP-OUT does not require CmdSNs.  Why make it an ordered command
>   and run the risk of a digest error on it leading to a hole in
>   command ordering?
>
> +++ the reason I wanted it ordered is to check the whole command path -
> but you may try to convince me that it is not a good idea +++
> --
> Mallikarjun
>
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668   Hewlett-Packard, Roseville.
> cbm@rose.hp.com
 - santoshr.vcf
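
The Reject reason codes listed in item 6 of the quoted comments could be captured roughly as follows. The numeric values come from the quoted text; the Python names, the NO_RETRY_REASONS set and the rules_out_retry helper are illustrative assumptions. The quoted text limits the no-retry rule for code 1 to a format error "for a command"; this sketch simplifies by treating every Format Error that way.

```python
from enum import IntEnum


class RejectReason(IntEnum):
    """Reason codes as listed in the quoted draft text (names are illustrative)."""
    FORMAT_ERROR = 1
    HEADER_DIGEST_ERROR = 2
    DATA_DIGEST_ERROR = 3
    DATA_SNACK_REJECT = 4
    COMMAND_RETRY_REJECT = 5
    FULL_FEATURE_BEFORE_LOGIN = 15


# Reasons the quoted text places in the "no retry is possible" category.
# Assumption: Format Error is treated as always being for a command.
NO_RETRY_REASONS = frozenset({
    RejectReason.FORMAT_ERROR,
    RejectReason.COMMAND_RETRY_REJECT,
    RejectReason.FULL_FEATURE_BEFORE_LOGIN,
})


def rules_out_retry(reason_code: int) -> bool:
    # True when the quoted text says the task is terminated or never created.
    return RejectReason(reason_code) in NO_RETRY_REASONS
```

For example, `rules_out_retry(5)` is true for Command Retry Reject, while `rules_out_retry(4)` is false, matching the statement that a task stays open after a Data-SNACK Reject.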





From owner-ips@ece.cmu.edu  Thu Apr  5 23:39:01 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA20798
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 23:39:00 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f361mHE08975
	for ips-outgoing; Thu, 5 Apr 2001 21:48:17 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com (mxic1.isus.emc.com [168.159.211.82])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f361m3r08969
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 21:48:03 -0400 (EDT)
Received: by mxic1.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <HMMD4158>; Thu, 5 Apr 2001 21:49:24 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080153CD@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: jhall@emc.com, ips@ece.cmu.edu
Subject: SNACK and recovery
Date: Thu, 5 Apr 2001 21:47:55 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This turns out to be a matter not just of rarity,
but also one of consequences.  As Mark points
out, for tapes and similar devices, the consequences
are disastrous - the backup aborts, and when
"those in charge" come in the next morning,
they have no usable backup tape, and are very
unhappy.  While Jon says "streams devices must
support abort and retry for extreme errors in
any case", the abort may well be the entire
backup and the retry might be next weekend ...
not a good situation.

Over in Fibre Channel world, FCP-2 contains
recovery support that resulted from the
discovery that despite the fact that non-
delivery of a Fibre Channel frame (Class 2 or
3 - it doesn't matter which) is "extremely
rare":
- Buffer overrun is prevented by both link
	and end-to-end buffer usage controls.
- FC switches are engineered to not drop
	frames to the maximum extent possible
	due in part to these consequences.
- There's a 32-bit CRC covering the entire
	FC frame.
failure to deliver a frame happens often enough
that a recovery mechanism is needed to avoid
tape backup aborts and the like.  Unlike TCP,
Fibre Channel has no built-in retransmit
mechanism.

In contrast to Fibre Channel, we are dealing
with something rarer because TCP retransmit will
take care of most things that can go wrong in
switches and there's a 16 bit checksum whose
failure will trigger retransmits.  What this
appears to come down to is:

- Does a 16-bit TCP checksum catch enough of
the corruption events to make it acceptable to
take drastic measures like aborting a backup
when a 32 bit CRC fails on a response that
made it through the 16 bit checksum?

The discussion's been a bit convoluted.  Some
simple yes/no answers to the above question
accompanied by short reasoning would be appreciated.
I think Julian's said "no" and quoted a filesystem
number that we're awaiting a reference to.
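
One concrete class of corruption that the 16-bit checksum cannot see,
as a minimal sketch: the TCP checksum is a one's-complement sum of 16-bit
words (RFC 1071), and addition is commutative, so two words that trade
places leave the checksum unchanged.  The names inet_checksum and
swap_words are made up for illustration.  This shows that escapes exist;
it says nothing about how often they occur.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 style 16-bit one's-complement checksum.
 * Sketch only: handles whole 16-bit words and short buffers. */
static uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(len % 2 == 0);
    for (i = 0; i < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    while (sum >> 16)                       /* fold carries into the low 16 bits */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Exchange the 16-bit words at word indexes a and b (hypothetical helper). */
static void swap_words(uint8_t *buf, size_t a, size_t b)
{
    uint8_t hi = buf[2 * a], lo = buf[2 * a + 1];

    buf[2 * a] = buf[2 * b];
    buf[2 * a + 1] = buf[2 * b + 1];
    buf[2 * b] = hi;
    buf[2 * b + 1] = lo;
}
```

A 32-bit CRC over the same bytes does depend on word position, which is
why a digest can fail on data that passed the TCP checksum.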

Just to muddy the waters further, let me point out
that tape targets tend to be less complex than
disk targets.  Tapes don't reorder commands, and
often don't even queue them.  Saving the last N
responses is not that difficult when the responses
go out in the order that the commands came in
(easier to organize saving them), and the initiator
has to be very careful about the number of commands
in flight to avoid disasters caused by dropped
commands (should lead to reasonable results from
relatively small values of N).
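
As a rough sketch of "saving the last N responses" on an in-order target:
if responses leave in StatSN order, a fixed ring indexed by StatSN modulo N
always overwrites the oldest entry.  REPLAY_SLOTS, RESP_MAX and the
function names below are assumptions for illustration, not anything from
the draft.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REPLAY_SLOTS 8u     /* hypothetical N; should divide 2**32 so wrap is harmless */
#define RESP_MAX     64u    /* hypothetical cap on a saved response */

struct saved_resp {
    int      valid;
    uint32_t stat_sn;
    size_t   len;
    uint8_t  bytes[RESP_MAX];
};

struct replay_buf {
    struct saved_resp slot[REPLAY_SLOTS];
};

/* Remember a response as it is sent; the entry N responses back is overwritten. */
static void replay_save(struct replay_buf *rb, uint32_t stat_sn,
                        const void *resp, size_t len)
{
    struct saved_resp *s = &rb->slot[stat_sn % REPLAY_SLOTS];

    assert(len <= RESP_MAX);
    s->valid = 1;
    s->stat_sn = stat_sn;
    s->len = len;
    memcpy(s->bytes, resp, len);
}

/* Look up a response for re-send; NULL if it has already been overwritten. */
static const struct saved_resp *replay_find(const struct replay_buf *rb,
                                            uint32_t stat_sn)
{
    const struct saved_resp *s = &rb->slot[stat_sn % REPLAY_SLOTS];

    return (s->valid && s->stat_sn == stat_sn) ? s : NULL;
}
```

The initiator's limit on commands in flight is what bounds how far back a
re-send request can reach, so N only needs to cover that window.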

--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Thu Apr  5 23:40:38 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA20827
	for <ips-archive@odin.ietf.org>; Thu, 5 Apr 2001 23:40:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f362HKp10524
	for ips-outgoing; Thu, 5 Apr 2001 22:17:20 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f362Gwr10501
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 22:16:58 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 722B41873; Thu,  5 Apr 2001 19:16:57 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id TAA00888;
	Thu, 5 Apr 2001 19:16:52 -0700 (PDT)
Message-ID: <3ACD2851.4D298CB3@cup.hp.com>
Date: Thu, 05 Apr 2001 19:22:09 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <C1256A26.0008FB1A.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------DE45A889A782FB2E095710DE"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------DE45A889A782FB2E095710DE
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

julian_satran@il.ibm.com wrote:
> 
> Santosh,
> 
> SNACK and SACK are the same thing (I just renamed them to avoid confusion
> with TCP SACK).
> The status is acked by ExpStatSN (and only indirectly by SNACK). SNACK
> enables fast recovery of
> a hole (without having to resort to a timeout).

Julian,

The bottom line is that the current SNACK mechanism as defined in the
spec will NOT work if it is made optional, and at the same time, it is
too expensive to mandate the SNACK mechanism. 

The current SNACK mechanism is really a negative ACK requesting the
target to re-send the status PDU. This mechanism has 2 disadvantages:

a) requires targets to retain I/O state information until StatSN is
acknowledged.
b) Does not allow forward progress with the release of I/O resources in
the event that a target could not retain that state information or for
some other reason could not service the SNACK.

I am suggesting that the alternate model of SACK be used, wherein, a
SACK is an individual ACK of a received status PDU. This SACK only kicks
in on detection of a hole. The hole is implicitly plugged by the
initiator on eventual completion of the command 
[on timeout followed by abort or retry].

The advantages of this alternate model are:
a) Does not require state information to be stored at targets beyond I/O
completion.
b) Allows a more reliable mechanism of resource release.

The disadvantage of this mechanism is:
a) It results in I/O timeout when Status PDU was dropped due to a digest
error.

Once again, the question boils down to the rate of TCP checksum escapes
and the probability of such escapes affecting status PDUs. If this is
low enough, such a timeout on a digest error of a status PDU should be
acceptable. 

>  We decided long ago
> against individual acks as bulk acking through a window is cheaper and
> safer (repetition).

I am not suggesting removal of bulk ack scheme. My suggestion is that
SACK kick in on a hole and the initiator revert to bulk ACK scheme once
it considers the hole to be plugged (thru the eventual completion of the
I/O on the timeout path followed by abort or retry).
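
To make the proposal concrete, here is a minimal sketch of the initiator
side, under my own assumptions: a 32-entry bitmap of statuses seen past the
hole, and invented names (stat_tracker, on_status, plug_hole).  Nothing
here is taken from the draft.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum ack_action { ACK_BULK, ACK_INDIVIDUAL, ACK_IGNORE };

struct stat_tracker {
    uint32_t exp_stat_sn;   /* lowest StatSN not yet accounted for (the hole) */
    uint32_t beyond_hole;   /* bit i set: StatSN exp_stat_sn + 1 + i was received */
};

/* Step past the hole and any statuses already received right behind it. */
static void advance_past_hole(struct stat_tracker *t)
{
    t->exp_stat_sn++;
    while (t->beyond_hole & 1u) {
        t->exp_stat_sn++;
        t->beyond_hole >>= 1;
    }
    t->beyond_hole >>= 1;   /* re-base the bitmap on the new exp_stat_sn */
}

/* Decide how to acknowledge an arriving status PDU. */
static enum ack_action on_status(struct stat_tracker *t, uint32_t stat_sn)
{
    uint32_t ahead;

    assert(t != NULL);
    ahead = stat_sn - t->exp_stat_sn;       /* distance modulo 2**32 */
    if (ahead == 0) {
        advance_past_hole(t);
        return ACK_BULK;                    /* covered by ExpStatSN */
    }
    if (ahead <= 32) {
        t->beyond_hole |= (uint32_t)1 << (ahead - 1);
        return ACK_INDIVIDUAL;              /* hole present: ack this PDU alone */
    }
    return ACK_IGNORE;                      /* duplicate, stale, or too far ahead */
}

/* The command behind the hole completed (timeout, then abort or retry). */
static void plug_hole(struct stat_tracker *t)
{
    advance_past_hole(t);
}
```

Once plug_hole() runs, ExpStatSN jumps past everything that was acked
individually and the initiator is back on the bulk ack path.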

- Santosh


> 
> Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 05/04/2001 21:06:18
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   Julian Satran/Haifa/IBM@IBMIL
> cc:   ips@ece.cmu.edu
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> 
> Julian,
> 
> The existing StatSN SNACK mechanism will NOT work if it is made
> optional. The original request that was made in the thread
> http://ips.pdl.cs.cmu.edu/mail/msg03257.html was to allow a SACK
> mechanism that would allow individual Status PDUs to be acknowledged
> [not SNACK which requests re-send of missing Status PDU, and thereby,
> requires the target to retain state information until StatSN is
> acknowledged].
> 
> If StatSN SNACK is made optional, a target that does not support SNACK
> will result in holes never being filled in StatSN sequence, and thereby,
> initiators being unable to acknowledge status PDUs received after the
> hole. This can cause targets to hold onto stale I/O state information
> for very long periods. [or forever].
> 
> With the current StatSN SNACK scheme, a target can NEVER discard its old
> I/O state information, since if it does so, it cannot satisfy SNACK
> requests. If SNACK requests are not satisfied, holes remain in the
> StatSN sequence at the initiator and it cannot acknowledge Status PDUs
> received thereafter.
> 
> If we must retain the StatSN mechanism in iSCSI, then, the SACK
> mechanism [as opposed to a SNACK], wherein, the initiator ack's
> individual status PDUs received when a hole occurs should be the
> preferred scheme. This allows both sides to continue the handshake of
> resource release even in the presence of holes, without imposing
> requirements on targets to retain I/O state information.
> 
> The holes created in StatSN are implicitly filled by the initiator based
> on the result of its "retry" of the failed command. Alternatively, the
> StatSN hole is considered to be filled if the initiator chooses not to
> retry the command [ex: on ULP timeout].
> 
> - Santosh
> 
> julian_satran@il.ibm.com wrote:
> 
> >
> > Mallikarjun,
> >
> > You are right. Too much travel and jet lag.  The reason we made SACK and
> > status recovery practically a MUST is that without them we are bound to
> > have only session drop as an alternative.
> > If the target does not keep any information after it has sent out status
> > it can't even retry a command.  And if it can retry a command it should be
> > able to do SACK.
> >
> > But perhaps there is a place in the market for the kind of devices Somesh
> > is suggesting that do all recovery at SCSI level (and that can't copy a
> > terabyte of data without a session drop).
> >
> > If that is true (which I doubt) we can make SNACK support optional.
> >
> > Julo
> >
> > "Mallikarjun C." <cbm@rose.hp.com> on 05/04/2001 04:09:46
> >
> > Please respond to cbm@rose.hp.com
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> > >
> > >Santosh,
> > >
> > >I can't find the place where this is stated. SNACK as a PDU type is
> > >mandated. But it can be rejected outright.
> >
> > Sorry, you agreed that status SACK is mandatory in ERT forum last
> > week in response to my comments.  Has there been a change in your
> > opinion?
> >
> > Attached is the email in a long email thread (issue 3) where you agreed
> > to make this explicit in rev06.
> > --
> > Mallikarjun
> >
> > Mallikarjun Chadalapaka
> > Networked Storage Architecture
> > Network Storage Solutions Organization
> > MS 5668   Hewlett-Packard, Roseville.
> > cbm@rose.hp.com
> >
> > >1.2.2.2 shows explicitly that SACK can be rejected. We can add a
> > >protocol-specific parameter in the target Logical Unit Control Page
> > >(non-settable) by which the target will indicate support for SNACK.
> > >
> > >Julo
> >
> > Santosh Rao <santoshr@cup.hp.com> on 04/04/2001 23:53:32
> >
> > Please respond to Santosh Rao <santoshr@cup.hp.com>
> >
> > To:   Julian Satran/Haifa/IBM@IBMIL
> > cc:   Jon Hall <jhall@emc.com>, ips@ece.cmu.edu
> > Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> > julian_satran@il.ibm.com wrote:
> > >
> > > Jon,
> > >
> > > Inexpensive implementations are always free to do away with recovery.
> > > That is true for targets too.
> >
> > That's not the interpretation one gets from reading the spec and prior
> > discussions on this list. Per the spec, support for Status SACK is
> > mandatory while support for data SACK is optional.
> >
> > IOW, targets MUST retain state information to satisfy a potential
> > status SACK request.
> >
> > - Santosh
> >
> >
> ------------------------------------------------------------------------------
> 
> >
> > >From julian_satran@il.ibm.com Tue Mar 27 05:16:54 PST 2001
> > Received: from mailhub.rose.hp.com (mailhub.rose.hp.com [15.96.64.24]) by
> > core.rose.hp.com with ESMTP (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id
> > FAA26277 for <cbm@core.rose.hp.com>; Tue, 27 Mar 2001 05:16:52 -0800
> (PST)
> > From: julian_satran@il.ibm.com
> > Received: from atlrel2.hp.com (atlrel2.hp.com [15.10.184.10]) by
> > mailhub.rose.hp.com with ESMTP (8.7.1/8.7.3 SMKit7.02) id FAA10600 for
> > <cbm@rose.hp.com>; Tue, 27 Mar 2001 05:15:51 -0800 (PST)
> > Received: from d06lmsgate.uk.ibm.COM (d06lmsgate.uk.ibm.com
> [195.212.29.1])
> >      by atlrel2.hp.com (Postfix) with ESMTP id AFC11120
> >      for <cbm@rose.hp.com>; Tue, 27 Mar 2001 08:15:49 -0500 (EST)
> > Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com
> [9.165.215.22])
> >      by d06lmsgate.uk.ibm.COM (1.0.0) with ESMTP id NAA50078;
> >      Tue, 27 Mar 2001 13:55:50 +0100
> > Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
> >      by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id PAA174358;
> >      Tue, 27 Mar 2001 15:09:58 +0200
> > Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2
> 5-20-1999))
> > id C1256A1C.00483A46 ; Tue, 27 Mar 2001 15:08:55 +0200
> > X-Lotus-FromDomain: IBMIL@IBMDE
> > To: cbm@rose.hp.com
> > Cc: someshg@yahoo.com, steph@cs.uchicago.edu, hufferd@us.ibm.com,
> >         cbm@rose.hp.com, ldalleore@snapserver.com,
> >         Venkat Rangan <venkat@rhapsodynetworks.com>, Black_David@emc.com
> > Message-ID: <C1256A1C.0048399C.00@d12mta02.de.ibm.com>
> > Date: Tue, 27 Mar 2001 15:12:01 +0200
> > Subject: Re: iSCSI ERT: error recovery comments
> > Mime-Version: 1.0
> > Content-type: text/plain; charset=us-ascii
> > Content-Disposition: inline
> > Status: RO
> >
> > Comments in text.  Thanks Julo
> >
> > "Mallikarjun C." <cbm@rose.hp.com> on 27/03/2001 01:41:48
> >
> > Please respond to cbm@rose.hp.com
> >
> > To:   cbm@rose.hp.com, someshg@yahoo.com, steph@cs.uchicago.edu, Julian
> >       Satran/Haifa/IBM@IBMIL, John Hufferd/San Jose/IBM@IBMUS
> > cc:   Black_David@emc.com
> > Subject:  iSCSI ERT: error recovery comments
> >
> > Hi Julian and the Team,
> >
> > Here are some comments on error recovery issues.  I hope these will
> > be addressed soon.  Thanks.
> >
> > 1. The draft should clearly state that if a target doesn't support
> >   retry (replay in my previous memo's terminology), it must not silently
> >   accept a command with retry bit and re-do the I/O.
> >
> > 2. Consequent to the above -
> >      - Clarification required on section 6.7.1, page 83, last para.
> >           Please confirm and clarify in the draft: If the target sends
> >           a response with an iSCSI error response of "SACK-rejected" that
> >           implicitly terminates the task - no retries are allowed. If the
> >           target sends a Reject PDU with "Data SACK Reject" code, the
> >           task stays open and the initiator may try to recover using
> >           SACK/retry.
> > +++ I will clarify
> > it will read:
> >    An iSCSI target MAY reject a data-SNACK and terminate the command with
> >    an iSCSI command response of SNACK rejected. In this case, the task is
> >    terminated and no future action is expected at target and initiator.
> >
> >    Alternatively, an iSCSI target MAY reject a data-SNACK with a reject
> >    response of data SNACK rejected. In this case the task is still open
> >    and may be recovered using the retry.
> >
> > +++
> >         - On a data digest error on a data PDU without the F-bit, the
> >           draft states that the target must wait for a data PDU with the
> >           F-bit (per section 6.2), then a command termination is signalled
> >           with a Reject PDU!  I like the formulation in 2.4.2 better.  I
> >           strongly recommend that similarly, the target shall send a SCSI
> >           Response with an iSCSI response of "delivery subsystem failure".
> >           In general, I suggest that anytime a target terminates a task
> >           internally, it must generate a SCSI Response PDU with an
> >           appropriate response code.
> > +++ It reads now:
> >
> >    When a target receives an iSCSI PDU with a header digest error or a
> >    payload digest error, it MUST answer with a Reject iSCSI PDU with a
> >    Reason-code of Header-Digest-error or Data-Digest-Error and discard
> >    the offending PDU.  If the error is a Data-Digest-Error in a
> >    Data-PDU, the target MUST either request retransmission with an R2T or
> >    answer with a command response PDU with a response-code of
> >    delivery-subsystem-failure and abort the task. If the target is
> >    answering with an error in the command response PDU, it must wait to
> >    receive all the data (signaled by a Data PDU with the final bit set
> >    for all outstanding R2Ts) before sending the command response PDU.
> > ++++
> >
> > 3. While the following is implied in different sections, it is not
> >   obvious.  Please clarify the following in the draft - "Status SACK
> >   support is mandatory, whereas data SACK support is not."
> >
> > +++ will do in 2.16.1 +++
> >
> > 4. The general policy of retry should be that all ordered commands
> >   shall support retry bit, since the loss of an ordered command
> >   creates a hole in target scoreboarding and stalls the target
> >   pipeline.  Retry hopefully can plug the hole quickly to avoid this.
> >
> > 5. As a fallout of the above comment, Retry bit must be supported
> >   for Text Commands.
> >
> > +++ I have added the X-bit.  The reason I did not do so earlier was that
> > I could not foresee a case in which the command is not idempotent - it
> > can always be resent - but I guess it is cleaner with the X +++
> >
> > 6. Section 2.20, page 71 on Reject must specify if a retry of the
> >   operation is allowed for each Reject PDU reason code.  Lack of
> >   specification could lead to interoperability issues down the road with
> >   "retry wars" raging between heterogeneous implementations (ex., target
> >   rejects the retry bit, initiator retries the "retry" bit,....).
> > +++ the part now reads:
> >
> >    The reject Reason is coded as follows:
> >
> >       1 - Format Error
> >       2 - Header Digest Error
> >       3 - Data (payload) Digest Error
> >       4 - Data-SNACK Reject
> >       5 - Command Retry Reject
> >       15 - Full Feature Phase Command before login
> >
> >       Some of the reject reasons terminate or prevent the creation of a
> >       task at the target and no retry is possible in those cases. Format
> >       error for a command, Command Retry Reject and Full Feature Phase
> >       Command before login are in this category.
> >
> > 7. NOP-OUT does not require CmdSNs.  Why make it an ordered command
> >   and run the risk of a digest error on it leading to a hole in
> >   command ordering?
> >
> > +++ the reason I wanted it ordered is to check the whole command path -
> > but you may try to convince me that it is not a good idea +++
> > --
> > Mallikarjun
> >
> > Mallikarjun Chadalapaka
> > Networked Storage Architecture
> > Network Storage Solutions Organization
> > MS 5668   Hewlett-Packard, Roseville.
> > cbm@rose.hp.com
>  - santoshr.vcf
--------------DE45A889A782FB2E095710DE
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------DE45A889A782FB2E095710DE--



From owner-ips@ece.cmu.edu  Fri Apr  6 01:36:52 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA25229
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 01:36:50 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f363oMF15503
	for ips-outgoing; Thu, 5 Apr 2001 23:50:22 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dirty.research.bell-labs.com (dirty.research.bell-labs.com [204.178.16.6])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f363o1r15477
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 23:50:01 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by dirty; Thu Apr  5 23:48:00 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Thu Apr  5 23:48:00 EDT 2001
Received: (from sandeepj@localhost)
	by aura.research.bell-labs.com (8.9.1/8.9.1) id XAA00104
	for ips@ece.cmu.edu; Thu, 5 Apr 2001 23:47:59 -0400 (EDT)
Date: Thu, 5 Apr 2001 23:47:59 -0400 (EDT)
Message-Id: <200104060347.XAA00104@aura.research.bell-labs.com>
From: sandeepj@research.bell-labs.com (Sandeep Joshi)
To: ips@ece.cmu.edu
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple connections.
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> Unfortunately, I think we have an impossible situation.  It appears to me
> that
> we have to pick at most two of the following three goals, as I have yet to
> see
> any way to achieve all three for a single task management command on a
> multiple connection session:
> 
> (1) The command takes effect immediately and its status/response
>         is available immediately.
> (2) The command affects all commands in flight, and its status/response
>        is delayed until all such effects are complete.
> (3) There is no significant visible departure from existing SCSI task
>        management behavior.
> ...
> ...
> Sandeep's proposal to create state in the target either fails to achieve
> (1) [if the response is delayed until the state is removed] or violates SAM2
> [returns the response to the task management command before the task
> management command is complete].  Having state linger after a completed
> LUN or TARGET RESET is almost certainly wrong.

Hi David,

Still wading thru related emails but I believe that if a refCmdSN
is added to the task management PDU (not present currently but
could be added for task-related management commands), then it 
might fix the above-mentioned flaws and allow for safe execution 
and immediate delivery of the abort task to the target.  

refCmdSN (cmdSN of refTaskTag) can tell you where the abort task 
command was received in the target command stream.

Processing at target :
1) Initiator task tag reuse : should not happen before refCmdSN
   so can be used for comparisons until refCmdSN expires.
2) State deletion : can be done by target after refCmdSN PDU
   has arrived and is processed/dropped.
3) If Abort task command is early (refCmdSN not arrived) : 
   then create state and drop PDUs when they arrive.
4) If Abort task command is Late (executing now beyond refCmdSN): 
   target sends a task response of task-not-found (this return 
   code exists in the draft).   Otherwise, we may cancel the
   wrong task if the initiator task tag has been reused!
5) If Abort task reaches when target is executing refCmdSN : 
   pass abort task to SCSI layer and return response.
6) Task response can be returned as appropriate to conform with
   SAM2 - either after in-flight commands arrive or immediately
   since the target knows what needs to be done later.  I am slightly 
   confused here since your goals (1)&(2) appear to be contradictory
   for application to in-flight commands.. it depends on semantics 
   what "taking effect" implies ?
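
Cases 3) to 5) above can be sketched as a serial-number comparison, under
the assumption (mine, for illustration) that the target can name the CmdSN
it is currently executing.  sn_before and classify_abort are hypothetical
names; the ordering follows RFC 1982 for 32-bit serial numbers.

```c
#include <assert.h>
#include <stdint.h>

enum abort_timing { ABORT_EARLY, ABORT_ON_TIME, ABORT_LATE };

/* RFC 1982 style "a precedes b" for 32-bit serial numbers. */
static int sn_before(uint32_t a, uint32_t b)
{
    return a != b && (uint32_t)(b - a) < 0x80000000u;
}

/* Where does an abort task for ref_cmd_sn land relative to target progress? */
static enum abort_timing classify_abort(uint32_t ref_cmd_sn,
                                        uint32_t executing_cmd_sn)
{
    /* RFC 1982 leaves a distance of exactly 2**31 undefined. */
    assert((uint32_t)(ref_cmd_sn - executing_cmd_sn) != 0x80000000u);

    if (sn_before(executing_cmd_sn, ref_cmd_sn))
        return ABORT_EARLY;     /* case 3: create state, drop the PDU on arrival */
    if (sn_before(ref_cmd_sn, executing_cmd_sn))
        return ABORT_LATE;      /* case 4: answer task-not-found */
    return ABORT_ON_TIME;       /* case 5: pass the abort to the SCSI layer */
}
```

The wrap-around case matters here: a refCmdSN of 2 is still "ahead" of a
target executing 0xFFFFFFFE.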

One question is, is it reasonable to assume that the initiator
knows the cmdSN of an issued original task(cmdSN<->initiator taskTag).  
This knowledge may be anyway needed if commands are to be re-issued 
in cmdSN order during recovery of a broken connection.  It's an 
implementation issue but can be debated.

Any other holes that we see..
1) multi-NIC initiators ?
2) don't want to introduce state at target ?
3) may affect iSCSI routers or gateways in strange ways ?
4) linked command issues ?

thanks,
-Sandeep


From owner-ips@ece.cmu.edu  Fri Apr  6 01:37:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA25296
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 01:37:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f363aM114786
	for ips-outgoing; Thu, 5 Apr 2001 23:36:22 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f363Zlr14760
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 23:35:47 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f364i9094766;
	Thu, 5 Apr 2001 21:44:09 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Ips" <ips@ece.cmu.edu>
Cc: "David Black" <Black_David@emc.com>,
        "Elizabeth G Rodriguez \(Elizabeth\)" <egrodriguez@lucent.com>
Subject: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Thu, 5 Apr 2001 20:34:04 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJKEAHCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Ver 5 Pg. 10

   "Command numbering is session-wide and is used for ordered command
   delivery over multiple connections.  It can also be used as a
   mechanism for command flow control over a session."

"1.2.2.1 Command Numbering and Acknowledging

   iSCSI supports ordered command delivery within a session.  All
   commands (initiator-to-target) are numbered."

   "Commands in transit from the initiator SCSI layer to the SCSI target
   layer are numbered by iSCSI; the number is carried by the iSCSI PDU
   as CmdSN (Command-Sequence-Number).  The numbering is session-wide.
   All iSCSI PDUs that have a task association carry this number. CmdSNs
   are allocated by the initiator iSCSI within a 32-bit unsigned counter
   (modulo 2**32).  The value 0 is reserved and used to mean immediate
   delivery. Comparisons and arithmetic on CmdSN SHOULD use Serial
   Number Arithmetic as defined in [RFC1982] where SERIAL_BITS = 32.

   Not covered in this document are the means by which the SCSI layer may
   request immediate delivery for a command or by which iSCSI will
   decide by itself to mark a PDU for immediate delivery."


Not all commands are serialized as a result of serial number 0 representing
a special case for immediate delivery.  This has an impact on flow control,
acknowledgement, and a deterministic recovery in the face of an error
situation.  With the exception of those commands with null serialization,
all commands MUST be sequenced at the point of network aggregation described
here as a PDU sequencer that issues commands to the SCSI target.  This focal
point is likely to find situations where its normal operation is curtailed.

	1) Prolonged device operation resulting in a resource constraint.
	2) Digest Error causing a missing sequence.
	3) Connection loss causing a missing sequence.

The technique of using a null sequence to bypass the sequencer has some
inherent problems.  In the first case, those queued commands MAY become invalid
following management that terminates prolonged operation with a command that
has bypassed the sequencer.  Those invalidated commands queued within the
sequencer can not be cancelled in an orderly manner within the existing
scheme.  This sequencer MUST be used as defined in the iSCSI proposal and,
as a result, these queued commands are beyond SCSI and iSCSI controls.
Issuing a null sequence command followed by a replicate serialized command
will have differing results depending on the target but will not result in a
deterministic treatment of these pending commands.

The use of the sequencer bypass technique (null serialization) should signal
an extreme measure where logically, those commands being bypassed become
suspect.  The conservative approach to this situation would be to reject all
bypassed commands.  As a result of this conservative behavior, a technique
that does not use a null sequence would be to institute a flag such as
"Exigent" that signals an extreme condition exists and that all pending
commands within the sequencer are to be rejected.

Note:
In addition to checking for the next PDU sequence, the sequencer should be
checking for PDUs with a serialization number that are prior to the desired
next sequence.  This examination would look something like this:

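/* sequencer_CmdSN: CmdSN of the PDU under examination;
   next_sequence:   next CmdSN the sequencer expects (both 32-bit unsigned). */
if ((uint32_t)(sequencer_CmdSN - next_sequence) > (1u << (SERIAL_BITS - 1)))
	{
	/* CmdSN is prior to the desired next sequence */
	reject_pdu(sequencer_CmdSN, SEQUENCER_INVALIDATION);
	}

if (sequencer_CmdSN == next_sequence)
	{
	send_pdu(sequencer_CmdSN);
	next_sequence++;
	}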

Upon receipt of a PDU flagged as "Exigent", the 'next sequence' value
immediately becomes the serial number of this command as well as ExpCmd
advancing to this value plus one.  The effect of this is to have the
sequencer reject all pending commands up to the "Exigent" command.  This has
the benefit of removing these now suspect commands as well as allowing this
highly urgent command to be sent to the target immediately without
accidentally affecting subsequent commands as is now possible.
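
A minimal sketch of that behavior, with the queue itself left out: the
sequencer keeps only the 'next sequence' value, and a held command is
simply offered again later.  The exigent parameter stands in for the
proposed "Exigent" flag; the struct and function names are invented for
illustration, and the reserved CmdSN value 0 is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SERIAL_BITS 32

enum seq_verdict { SEQ_DELIVER, SEQ_HOLD, SEQ_REJECT };

struct sequencer {
    uint32_t next_sequence;     /* next CmdSN to hand to the SCSI target */
};

/* Decide what to do with a PDU carrying cmd_sn. */
static enum seq_verdict sequencer_accept(struct sequencer *s,
                                         uint32_t cmd_sn, int exigent)
{
    assert(s != NULL);
    if (exigent) {
        /* Jump to this command; everything before it is now stale. */
        s->next_sequence = cmd_sn + 1;
        return SEQ_DELIVER;
    }
    if ((uint32_t)(cmd_sn - s->next_sequence) >
        ((uint32_t)1 << (SERIAL_BITS - 1)))
        return SEQ_REJECT;      /* prior to the desired next sequence */
    if (cmd_sn == s->next_sequence) {
        s->next_sequence++;
        return SEQ_DELIVER;
    }
    return SEQ_HOLD;            /* ahead of a hole: wait in the sequencer */
}
```

A command that was held before the "Exigent" PDU arrived is rejected when
it is examined again, because its CmdSN now falls behind 'next sequence'.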

As it is now, one other possible use of a null sequence would be to always
bypass the sequencer as perhaps the sequencer is viewed as unnecessary.  The
use of zero to allow commands to bypass the sequencer then represents a
problem with respect to resource management as now the flow control scheme
no longer works.  If bypassing the sequencer is the desired behavior and
this command will not impact validity of those commands serialized prior to
this command, then this PDU flagged "Casual" would allow this command to be
issued directly to the target.  These commands should still include a serial
number to allow flow control and acknowledgement to remain functional.  The
task of acknowledgement would be to comprise a min-max list of those
commands sent and to look for the highest sequential value.

In the case of a lost connection, waiting to time-out on those holes created
within the sequencer would be one method of handling this situation.
Rejecting those commands within the sequencer using an "Exigent" command
would instead mean there are no holes left to fill.  Those
commands received would be rejected and the initiator could then resend
these commands on a different connection without stumbling through a process
of repeated timeouts.  Should the method used to recover from a digest
error mean terminating the connection, then those commands can also be
quickly shifted over to a new connection.

This does not resolve all issues created in an error event, but it does
provide a simpler solution for most of those events concerning the sequencer
and gets rid of the special case handling of the command serial number.  Not
having flow control, acknowledgement, and a method of dealing with queued
commands appears to be a serious flaw in the present protocol.  I hope I have
presented this in a clear manner and I am interested in finding how others
deal with these situations.

Doug



From owner-ips@ece.cmu.edu  Fri Apr  6 02:12:31 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA05411
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 02:12:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f362CJX10178
	for ips-outgoing; Thu, 5 Apr 2001 22:12:19 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f362Bqr10154
	for <ips@ece.cmu.edu>; Thu, 5 Apr 2001 22:11:52 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id EAA13538
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 04:11:45 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id EAA43638
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 04:09:50 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A26.000BC61F ; Fri, 6 Apr 2001 04:08:36 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A26.000BC542.00@d12mta02.de.ibm.com>
Date: Fri, 6 Apr 2001 03:59:58 +0200
Subject: Re: Initiator-detected format or digest errors
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Matt,

3 reasons:

a. the initiator drives recovery. What can a target do with the reject?
(he is probably in need of a reset)

b. the initiator is unaware of the target state after the error

c. there is reject command (and response) -:)

Julo

Matt Wakeley <matt_wakeley@agilent.com> on 06/04/2001 00:43:08

Please respond to Matt Wakeley <matt_wakeley@agilent.com>

To:   IPS Reflector <ips@ece.cmu.edu>
cc:
Subject:  Re: Initiator-detected format or digest errors




Why not send a reject back to the originator of the bad frame, and just let
the task time out?

-Matt

julian_satran@il.ibm.com wrote:
>
> Tom,
>
> An initiator will pass the appropriate response to the SCSI layer and
> will abort the task if it can identify one.  Further behavior of
> initiators and targets is implementation dependent.
>
> 6.2 specifies this.
>
> Julo
>
> "Thomas McSweeney" <rf42tpme@us.ibm.com> on 04/04/2001 13:57:20
>
> Please respond to "Thomas McSweeney" <rf42tpme@us.ibm.com>
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Initiator-detected format or digest errors
>
> Section "2.20 Reject" talks about the target receiving a bad frame and
> sending Reject.  What should an Initiator do if it receives a PDU with a
> format or digest error?  Should it send Reject?  If so, we'll need to
> ensure that the Initiator fields of the MIB include an object to count
> Reject commands transmitted.
>
> Tom McSweeney
> iSCSI Development, Storage Systems Group, IBM
> Email: rf42tpme@us.ibm.com
> Phone: (USA) 919-254-5634  (tie line: 444-5634)
> Fax:   (USA) 919-254-0391  (tie line: 444-0391)





From owner-ips@ece.cmu.edu  Fri Apr  6 11:46:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA12602
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 11:46:16 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f36DCg724490
	for ips-outgoing; Fri, 6 Apr 2001 09:12:42 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f36DCRr24473
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 09:12:27 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <2G6K711C>; Fri, 6 Apr 2001 09:03:16 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080153D1@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net, ips@ece.cmu.edu
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic reco
	very
Date: Fri, 6 Apr 2001 09:11:26 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Let me see if I can distill out the issue in Doug's long
message on this subject ...

When a CmdSN of zero is used to mark a command for immediate
delivery, the CmdSN-based acknowledgement and windowing
mechanisms don't apply to that command.  This means that
any command sent for immediate delivery (CmdSN = 0):
(A) Cannot recover from a CRC (digest) error via CmdSN-based
	retransmit.  The Initiator can still time this out,
	but that seems like a poor way to initiate recovery
	of something that was supposed to be done immediately.
(B) Does not have its use of resources (e.g. command buffer)
	on the target controlled by the CmdSN windowing
	mechanism, complicating target resource management.
	Targets have better control over their resources if
	all inbound commands use one set of resources,
	described/controlled by the CmdSN window, rather
	than having to set aside separate resources just
	in case the initiator sends an immediate command. 
The alternate proposal that Doug envisions is to apply
CmdSN (and hence its error recovery and resource management)
to immediate commands, and use a flag  bit elsewhere in
the header to indicate that the command is to be delivered
to SCSI immediately by iSCSI rather than waiting for
"missing" commands to show up.  That seems reasonable,
and comments are invited.  One of the things that this
does is transfer the responsibility to keep some space
in the CmdSN window open for immediate commands (and
determine how much is appropriate) from the Target to
the Initiator - all other things being equal, this
is the right direction to move functionality in a SCSI
system.

Note Well: This concept of "immediate" delivery is an
iSCSI concept that affects only the iSCSI CmdSN sequence
- this does not affect TCP's "deliver in order" behavior.
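
A minimal sketch of the alternate proposal above, under stated assumptions:
every command carries a CmdSN and is checked against the CmdSN window, and a
hypothetical "immediate" flag only changes when the command is handed to
SCSI. The class and field names, the fixed window size, and the omission of
serial-number wraparound are all illustrative and not taken from the draft.

```python
from dataclasses import dataclass


@dataclass
class Command:
    cmd_sn: int
    immediate: bool = False  # hypothetical header flag bit


class Sequencer:
    """Target-side CmdSN sequencer (wraparound ignored for brevity)."""

    def __init__(self, exp_cmd_sn=1, window=4):
        self.exp_cmd_sn = exp_cmd_sn  # next CmdSN expected in order
        self.window = window          # commands the target will buffer
        self.held = {}                # CmdSN -> command waiting for a gap
        self.done_early = set()       # CmdSNs already delivered as immediate
        self.delivered = []           # order in which SCSI saw the commands

    def receive(self, cmd):
        # The window applies to every command, flagged or not.
        if not self.exp_cmd_sn <= cmd.cmd_sn < self.exp_cmd_sn + self.window:
            return False
        if cmd.cmd_sn in self.held or cmd.cmd_sn in self.done_early:
            return False  # duplicate
        if cmd.immediate:
            # Delivered now, but its CmdSN still occupies a window slot.
            self.delivered.append(cmd.cmd_sn)
            self.done_early.add(cmd.cmd_sn)
        else:
            self.held[cmd.cmd_sn] = cmd
        self._drain()
        return True

    def _drain(self):
        # Advance over every CmdSN that is now accounted for.
        while True:
            if self.exp_cmd_sn in self.done_early:
                self.done_early.discard(self.exp_cmd_sn)
            elif self.exp_cmd_sn in self.held:
                self.delivered.append(self.held.pop(self.exp_cmd_sn).cmd_sn)
            else:
                break
            self.exp_cmd_sn += 1
```

In this sketch a flagged command that arrives ahead of a gap reaches SCSI
right away, yet the expected CmdSN only advances past it once the gap is
filled, so acknowledgement and windowing keep working for it.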

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------

> -----Original Message-----
> From:	Douglas Otis [SMTP:dotis@sanlight.net]
> Sent:	Thursday, April 05, 2001 11:34 PM
> To:	Ips
> Cc:	David Black; Elizabeth G Rodriguez (Elizabeth)
> Subject:	iSCSI:flow control, acknowledgement, and a deterministic
> recovery
> 
> Ver 5 Pg. 10
> 
>    "Command numbering is session-wide and is used for ordered command
>    delivery over multiple connections.  It can also be used as a
>    mechanism for command flow control over a session."
> 
> "1.2.2.1 Command Numbering and Acknowledging
> 
>    iSCSI supports ordered command delivery within a session.  All
>    commands (initiator-to-target) are numbered."
> 
>    "Commands in transit from the initiator SCSI layer to the SCSI target
>    layer are numbered by iSCSI; the number is carried by the iSCSI PDU
>    as CmdSN (Command-Sequence-Number).  The numbering is session-wide.
>    All iSCSI PDUs that have a task association carry this number. CmdSNs
>    are allocated by the initiator iSCSI within a 32-bit unsigned counter
>    (modulo 2**32).  The value 0 is reserved and used to mean immediate
>    delivery. Comparisons and arithmetic on CmdSN SHOULD use Serial
>    Number Arithmetic as defined in [RFC1982] where SERIAL_BITS = 32.
> 
>    Not covered in this document are the means by which the SCSI layer may
>    request immediate delivery for a command or by which iSCSI will
>    decide by itself to mark a PDU for immediate delivery."
> 
> 
> Not all commands are serialized as a result of serial number 0
> representing a special case for immediate delivery.  This has an impact
> on flow control, acknowledgement, and a deterministic recovery in the
> face of an error situation.  With the exception of those commands with
> null serialization, all commands MUST be sequenced at the point of
> network aggregation described here as a PDU sequencer that issues
> commands to the SCSI target.  This focal point is likely to find
> situations where its normal operation is curtailed.
> 
> 	1) Prolonged device operation resulting in a resource constraint.
> 	2) Digest Error causing a missing sequence.
> 	3) Connection loss causing a missing sequence.
> 
> The technique of using a null sequence to bypass the sequencer has some
> inherent problems.  In the first case, those queued commands MAY become
> invalid
> following management that terminates prolonged operation with a command
> that
> has bypassed the sequencer.  Those invalidated commands queued within the
> sequencer can not be cancelled in an orderly manner within the existing
> scheme.  This sequencer MUST be used as defined in the iSCSI proposal and,
> as a result, these queued commands are beyond SCSI and iSCSI controls.
> Issuing a null sequence command followed by a replicate serialized command
> will have differing results depending on the target but will not result in
> a
> deterministic treatment of these pending commands.
> 
> The use of the sequencer bypass technique (null serialization) should
> signal
> an extreme measure where logically, those commands being bypassed become
> suspect.  The conservative approach to this situation would be to reject
> all
> bypassed commands.  As a result of this conservative behavior, a technique
> that does not use a null sequence would be to institute a flag such as
> "Exigent" that signals an extreme condition exists and that all pending
> commands within the sequencer are to be rejected.
> 
> Note:
> In addition to checking for the next PDU sequence, the sequencer should be
> checking for PDUs with a serialization number that are prior to the
> desired
> next sequence.  This examination would look something like this:
> 
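> /* pdu_cmd_sn: CmdSN of the PDU held in the sequencer (uint32_t)  */
> /* next_sequence: the desired next sequence (uint32_t)            */
> if ((uint32_t)(pdu_cmd_sn - next_sequence) > (1u << (SERIAL_BITS - 1)))
> 	{
> 	/* PDU is prior to the desired next sequence */
> 	reject_pdu(pdu_cmd_sn, SEQUENCER_INVALIDATION);
> 	}
> 
> if (pdu_cmd_sn == next_sequence)
> 	{
> 	send_pdu(pdu_cmd_sn);
> 	next_sequence++;
> 	}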
> 
> Upon receipt of a PDU flagged as "Exigent", the 'next sequence' value
> immediately becomes the serial number of this command as well as ExpCmd
> advancing to this value plus one.  The effect of this is to have the
> sequencer reject all pending commands up to the "Exigent" command.  This
> has
> the benefit of removing these now suspect commands as well as allowing
> this
> highly urgent command to be sent to the target immediately without
> accidentally affecting subsequent commands as is now possible.
> 
> As it is now, one other possible use of a null sequence would be to always
> bypass the sequencer as perhaps the sequencer is viewed as unnecessary.
> The
> use of zero to allow commands to bypass the sequencer then represents a
> problem with respect to resource management as now the flow control scheme
> no longer works.  If bypassing the sequencer is the desired behavior and
> this command will not impact validity of those commands serialized prior
> to
> this command, then this PDU flagged "Casual" would allow this command to
> be
> issued directly to the target.  These commands should still include a
> serial
> number to allow flow control and acknowledgement to remain functional.
> The
> task of acknowledgement would be to comprise a min-max list of those
> commands sent and to look for the highest sequential value.
> 
> In the case of a lost connection, waiting to time-out on those holes
> created
> within the sequencer would be one method of handling this situation.  A
> means of rejecting those commands within the sequencer using an
> "Exigent" command would mean there are no holes left to fill.  Those
> commands received would be rejected and the initiator could then resend
> these commands on a different connection without stumbling through a
> process
> of repeated timeouts.  Should the method used to recover from a digest
> error mean terminating the connection, then those commands can also be
> quickly shifted over to a new connection.
> 
> This does not resolve all issues created in an error event, but it does
> provide a simpler solution for most of those events concerning the
> sequencer and gets rid of the special case handling of the command serial
> number.  Not having flow control, acknowledgement, and a method of dealing
> with queued commands appears to be a serious flaw in the present protocol.
> I hope I have
> presented this in a clear manner and I am interested in finding how others
> deal with these situations.
> 
> Doug
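
The sequencer check quoted above relies on the Serial Number Arithmetic of
[RFC1982] with SERIAL_BITS = 32. The following is a minimal sketch of that
comparison; the function names and the three result strings are
illustrative assumptions, and the reserved value 0 is not treated specially
here.

```python
SERIAL_BITS = 32
MODULUS = 1 << SERIAL_BITS
HALF = 1 << (SERIAL_BITS - 1)


def serial_lt(a, b):
    """True if serial number a precedes b in the RFC 1982 sense.

    When the two values are exactly 2**(SERIAL_BITS - 1) apart the
    comparison is undefined by RFC 1982; this sketch returns False.
    """
    return a != b and ((b - a) % MODULUS) < HALF


def classify(pdu_cmd_sn, next_sequence):
    """Decide what a sequencer would do with a PDU (hypothetical policy)."""
    if pdu_cmd_sn == next_sequence:
        return "deliver"  # the desired next sequence
    if serial_lt(pdu_cmd_sn, next_sequence):
        return "reject"   # prior to the desired next sequence
    return "hold"         # ahead of the desired next sequence
```

The modulo form keeps the comparison correct across the 2**32 wrap, which a
plain integer comparison of the two CmdSN values would not.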


From owner-ips@ece.cmu.edu  Fri Apr  6 13:02:04 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA14472
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 13:02:03 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f36FTjl03505
	for ips-outgoing; Fri, 6 Apr 2001 11:29:45 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mail.brocade.com (asbestos.brocade.com [63.121.140.244])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f36FNAr02946
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 11:23:11 -0400 (EDT)
Received: from thor.brocade.com (thor [192.168.126.45])
	by mail.brocade.com (8.8.8+Sun/8.8.8) with ESMTP id IAA11357
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 08:23:04 -0700 (PDT)
Received: by thor.brocade.com with Internet Mail Service (5.5.2653.19)
	id <FXLAZKW4>; Fri, 6 Apr 2001 08:23:04 -0700
Message-ID: <FFD40DB4943CD411876500508BAD02797D469E@sj5-ex2.brocade.com>
From: Robert Snively <rsnively@Brocade.COM>
To: "'ips@ece.cmu.edu'" <ips@ece.cmu.edu>
Subject: FW: iSCSI: Out Of Sequence due to null sequence with multiple con
	nections.
Date: Fri, 6 Apr 2001 08:23:01 -0700 
X-Mailer: Internet Mail Service (5.5.2653.19)
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This apparently did not get out on the ips reflector.  Trying again.

-----Original Message-----
From: Robert Snively 
Sent: Tuesday, April 03, 2001 3:11 PM
To: 'Black_David@emc.com'; CBinford@pirus.com; ips@ece.cmu.edu
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple
connections.


I support David's conclusions that task management should be sent
once.  Fortunately, the ordered/unordered question need not even
be asked, because SAM-2 compliant SCSI applications already know 
what will happen.

At present, SCSI applications do not have a clear guarantee of
the order between task management functions and the processing or
delivery of any particular task.  In fact, the concept of 
an ORDERED attribute applied to a task management function is
unknown.  As a result, SCSI drivers have to be aware of 
the implications.  Those implications include the possibility that
a command may be in any state of delivery or completion as a
task management function is executed and therefore may or may not
be included in the scope of the task management function.  This
has been partially clarified by the recently created status value
developed by Charles Binford, "TASK ABORTED", which at least indicates
which tasks for other initiators have been cleared by a task 
management function.

In general, the approach is as follows:

	When a task management operation is received by a target,
	tasks are treated in one of three ways, depending on where
	they are with respect to the timing of the execution of
	the command.

   a)	If the command has been completed, it is perfectly possible that
	completion status for the command is transmitted in spite of the
	fact that the task management operation has been received.

   b)	If the command is in progress or enqueued, it is likely that
	the command will be managed according to the rules of the
	task management command unless a) or c) happen.

   c)	If the command has not been received by the time the task 
	management function is completed, the command will be received
	and treated as a command that has occurred after the task
	management function, even if it were sent before that.  That
	typically involves presentation of a UNIT ATTENTION condition
	or some other notification.

	When a task management operation passes through an initiator,
	the initiator has the option of acting on the commands 
	whose state it knows, including the possibility of discarding
	or presenting both expected and unexpected returned status,
	in a vendor specific manner.

It is possible that all this work of creating a synchronous behavior
for actions that are designed to be asynchronous may be unnecessary.
The present drivers know what to do to clean up the mess.  All you
have done is statistically increase the number of tasks that may 
be involved in behaviors a) and c) above.
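
The three cases a) through c) above can be summarized as a small lookup.
This is only an illustrative sketch: the enum, the function name, and the
wording of the returned strings are assumptions made here, and case b)
remains "likely" rather than guaranteed, as stated above.

```python
from enum import Enum


class TaskState(Enum):
    COMPLETED = "completed"
    IN_PROGRESS = "in progress"
    ENQUEUED = "enqueued"
    NOT_RECEIVED = "not yet received"


def likely_outcome(state):
    """Map a command's state at task management time to its treatment."""
    if state is TaskState.COMPLETED:
        # Case a): status may be sent despite the task management function.
        return "completion status may still be transmitted"
    if state in (TaskState.IN_PROGRESS, TaskState.ENQUEUED):
        # Case b): likely, unless a) or c) applies.
        return "managed by the task management function"
    # Case c): handled as a command that arrived afterwards.
    return "treated as sent after the task management function"
```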

References in support of this include:

>From SAM-2, Rev 15, 4.6.2:

	For convenience, the SCSI architecture model assumes 
	in-order delivery to be a property of the service delivery
	subsystem. This assumption is made to simplify the 
	description of behavior and does not constitute a requirement.
	In addition, this specification makes no assumption about, 
	or places any requirement on the ordering of requests or
	responses between one sending device and several receiving devices.

>From SAM-2, Rev 15, 4.7.4

	The order in which task management requests are 
	executed is not specified by this standard. In particular, this
	standard does not require in-order delivery of such 
	requests, as defined in 4.6.2, or execution by the task manager
	in the order received. To guarantee the execution order 
	of task management requests referencing a specific logical
	unit, an initiator should, therefore, not have more than 
	one such request pending to that logical unit.

>  -----Original Message-----
>  From: Black_David@emc.com [mailto:Black_David@emc.com]
>  Sent: Tuesday, April 03, 2001 12:47 PM
>  To: CBinford@pirus.com; ips@ece.cmu.edu
>  Subject: RE: iSCSI: Out Of Sequence due to null sequence 
>  with multiple
>  con nections.
>  
>  
>  > I would state this much stronger.  Applications had better not have to
>  > know that it is iSCSI underneath vs. FCP or parallel SCSI else I believe
>  > we missed the objective (granted, some things such as target address
>  > space are unavoidably different, but I believe task management functions
>  > should be the same).  The transport needs to handle the transport issues
>  > without exposing quirks to the SCSI or application layer.
>  
>  Unfortunately, I think we have an impossible situation.  It appears to me
>  that we have to pick at most two of the following three goals, as I have
>  yet to see any way to achieve all three for a single task management
>  command on a multiple connection session:
>  
>  (1) The command takes effect immediately and its status/response
>  	is available immediately.
>  (2) The command affects all commands in flight, and its 
>  status/response
>  	is delayed until all such effects are complete.
>  (3) There is no significant visible departure from existing SCSI task
>  	management behavior.
>  
>  The problem is that trying to do both (1) and (2) either 
>  requires SCSI to
>  "execute" the task management command twice or requires that iSCSI do
>  some task management (e.g., on the in-flight commands) on 
>  SCSI's behalf
>  (or worse like having SCSI prolong the execution of the task 
>  management
>  command until everything in flight in iSCSI arrives).  All 
>  of these appear
>  to lead to problems with (3) in one form or another - two executions
>  result in two SCSI status/responses that have to be merged, and iSCSI
>  task management will sooner or later do something different from SCSI
>  (e.g., I sincerely doubt that a Target in a bridge will ever 
>  get this 100%
>  identical to the devices that are being bridged).
>  
>  The current iSCSI draft provides the choice of  [(1)] XOR [(2), (3)];
>  the reason for not getting (3) with (1) is the possibility 
>  of the task
>  management command bypassing commands that it's supposed to
>  affect.  Charles' original proposal is [(2), (3)] because it 
>  has to time out
>  a stuck connection before executing the command, and is roughly
>  equivalent to sending the command for ordered delivery and having
>  the implementation treat any queue between iSCSI and SCSI as
>  being on the SCSI side of the line.  Doug Otis's counter-proposal
>  falls into the category of iSCSI doing task management on SCSI's
>  behalf and provides an example of how this results in visible changes
>  in behavior -- for the CLEAR ACA task management command,
>  aborting all tasks that are queued or in flight is generally 
>  incorrect.
>  
>  I would note that this issue does not arise on single 
>  connection sessions,
>  because sending the command for immediate delivery plus some care not
>  to reorder things in the iSCSI Target (i.e., consider the 
>  iSCSI to SCSI
>  queue
>  to be in "SCSI" and hence subject to the task management command)
>  obtains all of (1) through (3).
>  
>  Going out on a limb, I suspect applications will generally 
>  want [(2), (3)]
>  -- send for ordered delivery and wait for the dust to settle 
>  because that
>  provides the best odds of having some weird device get into a known
>  state from which further progress is possible.  This allows 
>  the application
>  to not know whether parallel SCSI, FCP or iSCSI is underneath and
>  relies on other iSCSI recovery procedures to make sure that the task
>  management command is delivered and executed (e.g., unstick and/or
>  close "stuck" connections).  There will be cases in which (1) is
>  needed (e.g., observe tape robot doing something obviously wrong,
>  and get it to stop immediately), but those may involve fairly blunt
>  instruments (e.g., LUN RESET) and the need to clean up any collateral
>  damage.
>  
>  Sandeep's proposal to create state in the target either 
>  fails to achieve
>  (1) [if the response is delayed until the state is removed] 
>  or violates SAM2
>  [returns the response to the task management command before the task
>  management command is complete].  Having state linger after 
>  a completed
>  LUN or TARGET RESET is almost certainly wrong.
>  
>  So, I think I'm down to sending task management functions 
>  once, usually
>  for ordered delivery with the application making the ordered 
>  vs. immediate
>  delivery choice (and sending the task management function twice if it
>  so chooses).  I think apps will generally choose ordered 
>  delivery, choosing
>  predictable behavior over immediacy concerns.  Aside from a longer
>  discussion of this issue, I still don't see the need for additional
>  mechanism(s) to task management - what have I missed in the above
>  discussion?
>  
>  --David
>  
>  ---------------------------------------------------
>  David L. Black, Senior Technologist
>  EMC Corporation, 42 South St., Hopkinton, MA  01748
>  +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
>  black_david@emc.com       Mobile: +1 (978) 394-7754
>  ---------------------------------------------------
>  
>  


From owner-ips@ece.cmu.edu  Fri Apr  6 15:16:58 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA17873
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 15:16:57 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f36GMmJ07005
	for ips-outgoing; Fri, 6 Apr 2001 12:22:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f36GMZr06995
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 12:22:35 -0400 (EDT)
Received: from hpindlm.cup.hp.com (hpindlm.cup.hp.com [15.13.95.89])
	by palrel3.hp.com (Postfix) with ESMTP
	id 07138AC6; Fri,  6 Apr 2001 09:22:34 -0700 (PDT)
Received: from mk731913.cup.hp.com (mk731912.cup.hp.com [15.8.80.111])
	by hpindlm.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id JAA21508;
	Fri, 6 Apr 2001 09:26:30 -0700 (PDT)
Message-Id: <5.0.2.1.2.20010405095356.029c8e90@esalpha2.cup.hp.com>
X-Sender: krause@esalpha2.cup.hp.com
X-Mailer: QUALCOMM Windows Eudora Version 5.0.2
Date: Thu, 05 Apr 2001 10:09:53 -0700
To: Stephen Bailey <steph@cs.uchicago.edu>
From: Michael Krause <krause@cup.hp.com>
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
Cc: ips@ece.cmu.edu
In-Reply-To: <20010403131353.B6D6494006@sandmail.sandburst.com>
References: <Message from Robert Snively <rsnively@Brocade.COM>
 <FFD40DB4943CD411876500508BAD02797D467D@sj5-ex2.brocade.com>
 <FFD40DB4943CD411876500508BAD02797D467D@sj5-ex2.brocade.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

At 09:12 AM 4/3/2001 -0400, Stephen Bailey wrote:
> > The Stone and Partridge paper is mostly not applicable to an iSCSI
> > environment.  The principal failure mechanisms were major software
> > bugs in the driver stack of PC-oriented machines.

People make mistakes in all implementations.  Examination of other similar 
packet processing technology for mistakes is applicable to any effort and 
one should perform a risk assessment as to the probability of the mistakes 
being repeated here.  The fact that the mistakes were in PC-oriented 
machines is basically irrelevant and storage is not immune from having 
similar mistakes (have seen storage implementations that were just as poor 
in terms of quality as any other segment of the industry).


>I'm in complete agreement with Bob.
>
>I haven't seen a good analysis of TCP checksum escapes which resulted
>from intermediary manipulation (I haven't read the papers, but
>hopefully soon), but my hunch is that it's incredibly rare.
>
>An endpoint precipitated TCP checksum `escape' would also escape a CRC or any
>other similar integrity check.  That is why I think all this
>additional integrity checking (on iSCSI headers & data), is an
>incredible amount of extra work (not just in computing the CRCs, but
>also in designing the SACK mechanism and recovery for digest failures)
>for no real gain.

I agree that some of the recovery is overkill but disagree that error 
detection is as well.  At a minimum, one needs to have a strong end-to-end 
error detection mechanism.  Many believe a 16-bit checksum is not adequate 
to protect their data and given the importance of this data to our 
customers, most feel the specification must define such a mechanism (with 
some having strong feelings that this mechanism should NOT be 
optional).  Now whether we need to have 2 CRCs, etc. is a separate debate 
but they need to be there and most of us will require that they be used in 
any product / solution delivered to the customer.
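
One concrete way to see the difference in detection strength discussed
above: the 16-bit Internet checksum is a one's-complement sum, so it cannot
notice two aligned 16-bit words changing places, while a CRC can. The
sketch below is only an illustration; zlib.crc32 stands in for whatever CRC
the specification ends up selecting, and the sample bytes are arbitrary.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as used by TCP (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


original = b"\x12\x34\x56\x78\x9a\xbc\xde\xf0"
# Swap the first two 16-bit words; the payload is now different.
swapped = original[2:4] + original[0:2] + original[4:]

same_checksum = internet_checksum(original) == internet_checksum(swapped)
same_crc = zlib.crc32(original) == zlib.crc32(swapped)
print("checksum unchanged:", same_checksum)
print("CRC unchanged:", same_crc)
```

The swap changes only 32 contiguous bits, and a 32-bit CRC detects every
error burst of that length, so the CRC values differ while the checksum
does not.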

>The real loss is that it's immensely slowing time-to-market for iSCSI 
>(both in the front end specification and the back end implementation).

A fast TTM solution that is not the highest quality (prevents silent data 
corruption) will lead to customer distrust and a repeat of the FC adoption 
rate - only 10 years later has it really started to penetrate customer 
solutions.


>A straw-man proposal (very unpopular given where we are, I know) would
>be to specify iSCSI without additional integrity checks (other than
>what you can get through security mechanisms, which is probably not
>visible to iSCSI anyway), and if that `fails' (I'm sure it won't), we
>can put an integrity shim between iSCSI and the transport.
>
>One example of how to do this would be Julian's TAF.  Another would be
>the WARP RDMA layer.

If another layer is put in place that provides data integrity, then it is 
redundant to do this at the iSCSI layer as well and this is one place where 
an option can be used, i.e. one negotiates the underlying framing mechanism 
(e.g. WARP) and if it is present, then iSCSI does not activate the CRC 
services.  If it is not, then it does, thereby ensuring that there is always 
end-to-end data integrity present in the solution.


>We don't have to specify how to do this now

If this is to be supported then it should be specified now (can be done 
rather opaquely by just setting a "transport services" attribute for strong 
end-to-end data integrity protection).

>, and the point is that
>it's hard to do so, because we really don't know what problem we're
>solving with it.  We're OK as long as we have a way to address it in
>the future without completely chucking what already exists.
>
>The other point to remember is that iSCSI still has to make the
>ID->Proposed->Draft->Internet traversal, and anybody that thinks it's
>going to do that on the first try is kidding themselves.  It's more
>important to get SOMETHING out there that exposes the implementation
>holes than to design a cathedral on paper.

Nothing is perfect the first time out but in the tightening economy and 
increasing customer quality demands from the get-go, the trade-off between 
quality / reliability and TTM is not something people should rush to 
make.  The market is not what it used to be where good enough was alright; 
customers expect more today and with good cause.

Mike



From owner-ips@ece.cmu.edu  Fri Apr  6 15:23:57 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA18047
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 15:23:56 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f36EMKe28835
	for ips-outgoing; Fri, 6 Apr 2001 10:22:20 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from bramg1.net.external.hp.com (bramg1.net.external.hp.com [192.6.126.73])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f36ELUr28787
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 10:21:31 -0400 (EDT)
Received: from quasit.br.itc.hp.com (quasit.br.itc.hp.com [15.145.8.135])
	by bramg1.net.external.hp.com (Postfix) with ESMTP id EE0B9ED
	for <ips@ece.cmu.edu>; Fri,  6 Apr 2001 16:21:28 +0200 (METDST)
Received: from loddon.br.itc.hp.com (loddon.br.itc.hp.com [15.145.8.166])
	by quasit.br.itc.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit6.0.6 OpenMail) with SMTP id PAA19369
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 15:21:27 +0100 (BST)
Received: from 15.145.8.166 by loddon.br.itc.hp.com (InterScan E-Mail VirusWall NT); Fri, 06 Apr 2001 15:21:27 +0100 (GMT Daylight Time)
Received: by loddon.br.itc.hp.com with Internet Mail Service (5.5.2650.21)
	id <H0C2YCQS>; Fri, 6 Apr 2001 15:21:27 +0100
Message-ID: <0B9A57FF1D57D411B47500D0B73E5CC101E7A665@dickens.bri.hp.com>
From: "BURBRIDGE,MATTHEW (HP-UnitedKingdom,ex2)" <matthew_burbridge@hp.com>
To: "'ips@ece.cmu.edu'" <ips@ece.cmu.edu>
Subject: FW: iSCSI:flow control, acknowledgement, and a deterministic reco
	 very
Date: Fri, 6 Apr 2001 15:21:24 +0100 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="ISO-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Alternatively, only allow one command with CmdSN=0 outstanding at any one
time.  This has the advantage of only having to reserve one buffer at the
target, the retry bit could still be used if the command or response has a
digest error, and finally it's simple!  The obvious disadvantage is that
there can only be one message in progress at a time - is this an issue?
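
A minimal sketch of the rule suggested above, seen from the initiator: at
most one CmdSN=0 command may be in flight, and a retry resends that same
command. The class name, the use of a task tag to identify the command, and
the method names are assumptions for illustration only.

```python
class ImmediateGate:
    """Initiator-side gate allowing a single CmdSN=0 command in flight."""

    def __init__(self):
        self.outstanding = None  # task tag of the in-flight immediate command

    def try_send(self, task_tag):
        """Return True if the command may be sent now with CmdSN=0."""
        if self.outstanding is not None:
            return False  # the target's single reserved buffer is in use
        self.outstanding = task_tag
        return True

    def retry(self):
        """Task tag to resend (with the retry bit) after a digest error."""
        return self.outstanding

    def complete(self, task_tag):
        """Release the slot once the response for task_tag has arrived."""
        if task_tag == self.outstanding:
            self.outstanding = None
```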

Matthew Burbridge
NIS-Bristol
Hewlett Packard
Telnet: 312 7010
E-mail: matthewb@bri.hp.com


> -----Original Message-----
> From: Black_David@emc.com [mailto:Black_David@emc.com]
> Sent: 06 April 2001 14:11
> To: dotis@sanlight.net; ips@ece.cmu.edu
> Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic
> reco very
> 
> 
> Let me see if I can distill out the issue in Doug's long
> message on this subject ...
> 
> When a CmdSN of zero is used to mark a command for immediate
> delivery, the CmdSN-based acknowledgement and windowing
> mechanisms don't apply to that command.  This means that
> any command sent for immediate delivery (CmdSN = 0):
> (A) Cannot recover from a CRC (digest) error via CmdSN-based
> 	retransmit.  The Initiator can still time this out,
> 	but that seems like a poor way to initiate recovery
> 	of something that was supposed to be done immediately.
> (B) Does not have its use of resources (e.g. command buffer)
> 	on the target controlled by the CmdSN windowing
> 	mechanism, complicating target resource management.
> 	Targets have better control over their resources if
> 	all inbound commands use one set of resources,
> 	described/controlled by the CmdSN window, rather
> 	than having to set aside separate resources just
> 	in case the initiator sends an immediate command. 
> The alternate proposal that Doug envisions is to apply
> CmdSN (and hence its error recovery and resource management)
> to immediate commands, and use a flag  bit elsewhere in
> the header to indicate that the command is to be delivered
> to SCSI immediately by iSCSI rather than waiting for
> "missing" commands to show up.  That seems reasonable,
> and comments are invited.  One of the things that this
> does is transfer the responsibility to keep some space
> in the CmdSN window open for immediate commands (and
> determine how much is appropriate) from the Target to
> the Initiator - all other things being equal, this
> is the right direction to move functionality in a SCSI
> system.
> 
> Note Well: This concept of "immediate" delivery is an
> iSCSI concept that affects only the iSCSI CmdSN sequence
> - this does not affect TCP's "deliver in order" behavior.
> 
> Thanks,
> --David
> 
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
> 
> > -----Original Message-----
> > From:	Douglas Otis [SMTP:dotis@sanlight.net]
> > Sent:	Thursday, April 05, 2001 11:34 PM
> > To:	Ips
> > Cc:	David Black; Elizabeth G Rodriguez (Elizabeth)
> > Subject:	iSCSI:flow control, acknowledgement, and a deterministic
> > recovery
> > 
> > Ver 5 Pg. 10
> > 
> >    "Command numbering is session-wide and is used for ordered command
> >    delivery over multiple connections.  It can also be used as a
> >    mechanism for command flow control over a session."
> > 
> > "1.2.2.1 Command Numbering and Acknowledging
> > 
> >    iSCSI supports ordered command delivery within a session.  All
> >    commands (initiator-to-target) are numbered."
> > 
> >    "Commands in transit from the initiator SCSI layer to the SCSI target
> >    layer are numbered by iSCSI; the number is carried by the iSCSI PDU
> >    as CmdSN (Command-Sequence-Number).  The numbering is session-wide.
> >    All iSCSI PDUs that have a task association carry this number. CmdSNs
> >    are allocated by the initiator iSCSI within a 32-bit unsigned counter
> >    (modulo 2**32).  The value 0 is reserved and used to mean immediate
> >    delivery. Comparisons and arithmetic on CmdSN SHOULD use Serial
> >    Number Arithmetic as defined in [RFC1982] where SERIAL_BITS = 32.
> > 
> >    Not covered in this document are the means by which the SCSI layer may
> >    request immediate delivery for a command or by which iSCSI will
> >    decide by itself to mark a PDU for immediate delivery."
> > 
> > 
> > Not all commands are serialized as a result of serial number 0
> > representing a special case for immediate delivery.  This has an impact
> > on flow control, acknowledgement, and a deterministic recovery in the
> > face of an error situation.  With the exception of those commands with
> > null serialization, all commands MUST be sequenced at the point of
> > network aggregation described here as a PDU sequencer that issues
> > commands to the SCSI target.  This focal point is likely to find
> > situations where its normal operation is curtailed.
> > 
> > 	1) Prolonged device operation resulting in a resource constraint.
> > 	2) Digest Error causing a missing sequence.
> > 	3) Connection loss causing a missing sequence.
> > 
> > The technique of using a null sequence to bypass the sequencer has some
> > inherent problems.  In the first case, those queued commands MAY become
> > invalid following management that terminates prolonged operation with a
> > command that has bypassed the sequencer.  Those invalidated commands
> > queued within the sequencer cannot be cancelled in an orderly manner
> > within the existing scheme.  This sequencer MUST be used as defined in
> > the iSCSI proposal and, as a result, these queued commands are beyond
> > SCSI and iSCSI controls.  Issuing a null sequence command followed by a
> > replicate serialized command will have differing results depending on
> > the target but will not result in a deterministic treatment of these
> > pending commands.
> > 
> > The use of the sequencer bypass technique (null serialization) should
> > signal an extreme measure where logically, those commands being bypassed
> > become suspect.  The conservative approach to this situation would be to
> > reject all bypassed commands.  As a result of this conservative
> > behavior, a technique that does not use a null sequence would be to
> > institute a flag such as "Exigent" that signals an extreme condition
> > exists and that all pending commands within the sequencer are to be
> > rejected.
> > 
> > Note:
> > In addition to checking for the next PDU sequence, the sequencer should
> > be checking for PDUs with a serialization number that are prior to the
> > desired next sequence.  This examination would look something like this:
> > 
> > /* 32-bit unsigned arithmetic; the subtraction wraps modulo 2^32 */
> > if ((sequencer_CmdSN - next_sequence) > (1u << (SERIAL_BITS - 1)))
> > 	{
> > 	reject_pdu(sequencer_CmdSN, SEQUENCER_INVALIDATION);
> > 	}
> > 
> > if (sequencer_CmdSN == next_sequence)
> > 	{
> > 	send_pdu(sequencer_CmdSN);
> > 	next_sequence++;
> > 	}
> > 
> > Upon receipt of a PDU flagged as "Exigent", the 'next sequence' value
> > immediately becomes the serial number of this command as well as ExpCmd
> > advancing to this value plus one.  The effect of this is to have the
> > sequencer reject all pending commands up to the "Exigent" command.  This
> > has the benefit of removing these now suspect commands as well as
> > allowing this highly urgent command to be sent to the target immediately
> > without accidentally affecting subsequent commands as is now possible.
> > 
> > As it is now, one other possible use of a null sequence would be to
> > always bypass the sequencer, as perhaps the sequencer is viewed as
> > unnecessary.  The use of zero to allow commands to bypass the sequencer
> > then represents a problem with respect to resource management, as now
> > the flow control scheme no longer works.  If bypassing the sequencer is
> > the desired behavior and this command will not impact validity of those
> > commands serialized prior to this command, then this PDU flagged
> > "Casual" would allow this command to be issued directly to the target.
> > These commands should still include a serial number to allow flow
> > control and acknowledgement to remain functional.  The task of
> > acknowledgement would be to compile a min-max list of those commands
> > sent and to look for the highest sequential value.
> > 
> > In the case of a lost connection, waiting to time-out on those holes
> > created within the sequencer would be one method of handling this
> > situation.  Rejecting those commands within the sequencer by means of an
> > "Exigent" command would mean there are no holes left to fill.  Those
> > commands received would be rejected and the initiator could then resend
> > these commands on a different connection without stumbling through a
> > process of repeated timeouts.  Should the method used to recover from a
> > digest error mean terminating the connection, then those commands can
> > also be quickly shifted over to a new connection.
> > 
> > This does not resolve all issues created in an error event, but it does
> > provide a simpler solution for most of those events concerning the
> > sequencer and gets rid of the special case handling of the command
> > serial number.  Not having flow control, acknowledgement, and a method
> > of dealing with queued commands appears to be a serious flaw in the
> > present protocol.  I hope I have presented this in a clear manner and I
> > am interested in finding how others deal with these situations.
> > 
> > Doug
> 


From owner-ips@ece.cmu.edu  Fri Apr  6 18:06:07 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA21585
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 18:06:01 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f36Jg0q20191
	for ips-outgoing; Fri, 6 Apr 2001 15:42:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f36JfFr20113
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 15:41:19 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f36KnZ095894;
	Fri, 6 Apr 2001 13:49:35 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Robert Snively" <rsnively@Brocade.COM>, <ips@ece.cmu.edu>
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple connections.
Date: Fri, 6 Apr 2001 12:39:31 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJIEAMCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <FFD40DB4943CD411876500508BAD02797D469E@sj5-ex2.brocade.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Robert,

The modification of using a flag to indicate "Exigent" rather than use of a
null sequence is a means to handle the event of a PDU bypassing many queued
commands within the iSCSI delivery system.  This queue at the iSCSI
sequencer is not the same as described in SCSI for a target device.  I do
not know of any corollary behavior within existing SCSI delivery schemes
where potentially a substantial number of commands are held beyond the
control of the initiator and the target.  In addition, due to the null
serialization, this command may be applied after subsequent commands have
been sent due to the unpredictable nature of IP networks with false
assumptions made as to the assured expediency of these networks.  Yes, place
in enough time-outs and this problem becomes a bit more predictable in some
cases.  Suggesting only a single null sequence PDU be used at any point in
time does not resolve the principal issues of dealing with the sequencer
queue, nor is this command acknowledged, which is sure to lead to this rule
being violated upon a retry, placing additional wild cards into the system.

Although SAM may allow some flexibility in how management is handled, the
situation at the iSCSI sequencer goes beyond the intent of this flexibility
as I do not think they envisioned this trapped queue of commands.  Although
additional status interlocks within the target are welcome, those commands
held in the iSCSI sequencer will not be handled according to SCSI rules
unless a new abort behavior is in place at the SCSI target and the SCSI
management command signals this behavior.  The initiator should be given the
option of how these bypassed commands are to be handled.  Rejecting those
commands within the sequencer at the time of the bypass or "Exigent" command
allows the initiator this option without adding additional behavior models
to the target.
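
As a rough illustration of the rejection behavior described above, the
following Python sketch models a sequencer that honors an "Exigent" flag.
The names Sequencer, receive and serial_lt are hypothetical and appear in no
iSCSI draft; the comparison assumes RFC 1982 serial number arithmetic with
SERIAL_BITS = 32, and CmdSN 0 gets no special treatment here.

```python
SERIAL_BITS = 32
MOD = 1 << SERIAL_BITS
HALF = 1 << (SERIAL_BITS - 1)


def serial_lt(a, b):
    """True if a precedes b under RFC 1982 serial number arithmetic."""
    return a != b and ((b - a) % MOD) < HALF


class Sequencer:
    """Hypothetical model of the iSCSI PDU sequencer with an Exigent flag."""

    def __init__(self, next_cmdsn=1):
        self.next_cmdsn = next_cmdsn  # next CmdSN to hand to the SCSI target
        self.pending = {}             # CmdSN -> command held back by a hole
        self.delivered = []           # commands issued to the SCSI target
        self.rejected = []            # CmdSNs rejected back to the initiator

    def receive(self, cmdsn, command, exigent=False):
        if exigent:
            # Reject everything still queued, deliver this command at once,
            # and move the expected CmdSN to just past it.
            self.rejected.extend(sorted(self.pending))
            self.pending.clear()
            self.delivered.append(command)
            self.next_cmdsn = (cmdsn + 1) % MOD
            return
        if serial_lt(cmdsn, self.next_cmdsn):
            # A PDU numbered before the expected sequence is stale.
            self.rejected.append(cmdsn)
            return
        self.pending[cmdsn] = command
        # Deliver in order for as long as there is no hole.
        while self.next_cmdsn in self.pending:
            self.delivered.append(self.pending.pop(self.next_cmdsn))
            self.next_cmdsn = (self.next_cmdsn + 1) % MOD


seq = Sequencer(next_cmdsn=5)
seq.receive(6, "write-6")                 # held: CmdSN 5 is missing
seq.receive(7, "write-7")                 # held
seq.receive(9, "abort", exigent=True)     # rejects 6 and 7, delivers abort
seq.receive(8, "late-8")                  # arrives late, rejected as stale
seq.receive(10, "read-10")                # delivered in order
```

In this model the initiator learns which commands were rejected and can
resend them, on another connection if it wishes, without waiting for a
timeout on the missing CmdSN.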

This flag solves three problems: (1) late arrival of a management command,
(2) predictable handling of commands without SCSI target behavior
modifications, and (3) acknowledgement of these critical management commands.
It seems the means of handling this event as described requires less effort
than as now envisioned and relies less on SCSI trapping iSCSI errors.

Doug

> This apparently did not get out on the ips reflector.  Trying again.
>
> -----Original Message-----
> From: Robert Snively
> Sent: Tuesday, April 03, 2001 3:11 PM
> To: 'Black_David@emc.com'; CBinford@pirus.com; ips@ece.cmu.edu
> Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple
> connections.
>
>
> I support David's conclusions that task management should be sent
> once.  Fortunately, the ordered/unordered question need not even
> be asked, because SAM-2 compliant SCSI applications already know
> what will happen.
>
> At present, SCSI applications do not have a clear guarantee of
> the order between task management functions and the processing or
> delivery of any particular task.  In fact, the concept of
> an ORDERED attribute applied to a task management function is
> unknown.  As a result, SCSI drivers have to be aware of
> the implications.  Those implications include the possibility that
> a command may be in any state of delivery or completion as a
> task management function is executed and therefore may or may not
> be included in the scope of the task management function.  This
> has been partially clarified by the recently created status value
> developed by Charles Binford, "TASK ABORTED", which at least indicates
> which tasks for other initiators have been cleared by a task
> management function.
>
> In general, the approach is as follows:
>
> 	When a task management operation is received by a target,
> 	tasks are treated in one of three ways, depending on where
> 	they are with respect to the timing of the execution of
> 	the command.
>
>    a)	If the command has been completed, it is perfectly possible that
> 	completion status for the command is transmitted in spite of the
> 	fact that the task management operation has been received.
>
>    b)	If the command is in progress or enqueued, it is likely that
> 	the command will be managed according to the rules of the
> 	task management command unless a) or c) happen.
>
>    c)	If the command has not been received by the time the task
> 	management function is completed, the command will be received
> 	and treated as a command that has occurred after the task
> 	management function, even if it were sent before that.  That
> 	typically involves presentation of a UNIT ATTENTION condition
> 	or some other notification.
>
> 	When a task management operation passes through an initiator,
> 	the initiator has the option of acting on the commands
> 	whose state it knows, including the possibility of discarding
> 	or presenting both expected and unexpected returned status,
> 	in a vendor specific manner.
>
> It is possible that all this work of creating a synchronous behavior
> for actions that are designed to be asynchronous may be unnecessary.
> The present drivers know what to do to clean up the mess.  All you
> have done is statistically increase the number of tasks that may
> be involved in behaviors a) and c) above.
>
> References in support of this include:
>
> From SAM-2, Rev 15, 4.6.2:
>
> 	For convenience, the SCSI architecture model assumes
> 	in-order delivery to be a property of the service delivery
> 	subsystem. This assumption is made to simplify the
> 	description of behavior and does not constitute a requirement.
> 	In addition, this specification makes no assumption about,
> 	or places any requirement on the ordering of requests or
> 	responses between one sending device and several receiving devices.
>
> From SAM-2, Rev 15, 4.7.4:
>
> 	The order in which task management requests are
> 	executed is not specified by this standard. In particular, this
> 	standard does not require in-order delivery of such
> 	requests, as defined in 4.6.2, or execution by the task manager
> 	in the order received. To guarantee the execution order
> 	of task management requests referencing a specific logical
> 	unit, an initiator should, therefore, not have more than
> 	one such request pending to that logical unit.
>
> >  -----Original Message-----
> >  From: Black_David@emc.com [mailto:Black_David@emc.com]
> >  Sent: Tuesday, April 03, 2001 12:47 PM
> >  To: CBinford@pirus.com; ips@ece.cmu.edu
> >  Subject: RE: iSCSI: Out Of Sequence due to null sequence
> >  with multiple connections.
> >
> >
> >  > I would state this much stronger.  Applications had better not have
> >  > to know that it is iSCSI underneath vs. FCP or parallel SCSI else I
> >  > believe we missed the objective (granted, some things such as target
> >  > address space are unavoidably different, but I believe task
> >  > management functions should be the same).  The transport needs to
> >  > handle the transport issues without exposing quirks to the SCSI or
> >  > application layer.
> >
> >  Unfortunately, I think we have an impossible situation.  It appears
> >  to me that we have to pick at most two of the following three goals,
> >  as I have yet to see any way to achieve all three for a single task
> >  management command on a multiple connection session:
> >
> >  (1) The command takes effect immediately and its status/response
> >  	is available immediately.
> >  (2) The command affects all commands in flight, and its status/response
> >  	is delayed until all such effects are complete.
> >  (3) There is no significant visible departure from existing SCSI task
> >  	management behavior.
> >
> >  The problem is that trying to do both (1) and (2) either requires
> >  SCSI to "execute" the task management command twice or requires that
> >  iSCSI do some task management (e.g., on the in-flight commands) on
> >  SCSI's behalf (or worse like having SCSI prolong the execution of the
> >  task management command until everything in flight in iSCSI arrives).
> >  All of these appear to lead to problems with (3) in one form or
> >  another - two executions result in two SCSI status/responses that
> >  have to be merged, and iSCSI task management will sooner or later do
> >  something different from SCSI (e.g., I sincerely doubt that a Target
> >  in a bridge will ever get this 100% identical to the devices that are
> >  being bridged).
> >
> >  The current iSCSI draft provides the choice of [(1)] XOR [(2), (3)];
> >  the reason for not getting (3) with (1) is the possibility of the
> >  task management command bypassing commands that it's supposed to
> >  affect.  Charles' original proposal is [(2), (3)] because it has to
> >  time out a stuck connection before executing the command, and is
> >  roughly equivalent to sending the command for ordered delivery and
> >  having the implementation treat any queue between iSCSI and SCSI as
> >  being on the SCSI side of the line.  Doug Otis's counter-proposal
> >  falls into the category of iSCSI doing task management on SCSI's
> >  behalf and provides an example of how this results in visible changes
> >  in behavior -- for the CLEAR ACA task management command, aborting
> >  all tasks that are queued or in flight is generally incorrect.
> >
> >  I would note that this issue does not arise on single connection
> >  sessions, because sending the command for immediate delivery plus
> >  some care not to reorder things in the iSCSI Target (i.e., consider
> >  the iSCSI to SCSI queue to be in "SCSI" and hence subject to the task
> >  management command) obtains all of (1) through (3).
> >
> >  Going out on a limb, I suspect applications will generally want
> >  [(2), (3)] -- send for ordered delivery and wait for the dust to
> >  settle because that provides the best odds of having some weird
> >  device get into a known state from which further progress is
> >  possible.  This allows the application to not know whether parallel
> >  SCSI, FCP or iSCSI is underneath and relies on other iSCSI recovery
> >  procedures to make sure that the task management command is delivered
> >  and executed (e.g., unstick and/or close "stuck" connections).  There
> >  will be cases in which (1) is needed (e.g., observe tape robot doing
> >  something obviously wrong, and get it to stop immediately), but those
> >  may involve fairly blunt instruments (e.g., LUN RESET) and the need
> >  to clean up any collateral damage.
> >
> >  Sandeep's proposal to create state in the target either fails to
> >  achieve (1) [if the response is delayed until the state is removed]
> >  or violates SAM2 [returns the response to the task management command
> >  before the task management command is complete].  Having state linger
> >  after a completed LUN or TARGET RESET is almost certainly wrong.
> >
> >  So, I think I'm down to sending task management functions once,
> >  usually for ordered delivery with the application making the ordered
> >  vs. immediate delivery choice (and sending the task management
> >  function twice if it so chooses).  I think apps will generally choose
> >  ordered delivery, choosing predictable behavior over immediacy
> >  concerns.  Aside from a longer discussion of this issue, I still
> >  don't see the need for additional mechanism(s) to task management -
> >  what have I missed in the above discussion?
> >
> >  --David
> >
> >  ---------------------------------------------------
> >  David L. Black, Senior Technologist
> >  EMC Corporation, 42 South St., Hopkinton, MA  01748
> >  +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> >  black_david@emc.com       Mobile: +1 (978) 394-7754
> >  ---------------------------------------------------
> >
> >
>



From owner-ips@ece.cmu.edu  Fri Apr  6 20:11:43 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA23523
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 20:11:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f36M4vf29653
	for ips-outgoing; Fri, 6 Apr 2001 18:04:57 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mail.brocade.com (asbestos.brocade.com [63.121.140.244])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f36KRar23298
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 16:27:36 -0400 (EDT)
Received: from thor.brocade.com (thor [192.168.126.45])
	by mail.brocade.com (8.8.8+Sun/8.8.8) with ESMTP id NAA02514;
	Fri, 6 Apr 2001 13:17:15 -0700 (PDT)
Received: by thor.brocade.com with Internet Mail Service (5.5.2653.19)
	id <FXLAZRDJ>; Fri, 6 Apr 2001 13:17:15 -0700
Message-ID: <FFD40DB4943CD411876500508BAD02797D46A7@sj5-ex2.brocade.com>
From: Robert Snively <rsnively@Brocade.COM>
To: "'Michael Krause'" <krause@cup.hp.com>,
        Stephen Bailey
	 <steph@cs.uchicago.edu>
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Date: Fri, 6 Apr 2001 13:17:13 -0700 
X-Mailer: Internet Mail Service (5.5.2653.19)
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Michael,

Let me explain why I feel that the iSCSI environment is
different than the college-student music-downloading
environment studied by Stone and Partridge.

All iSCSI targets are new designs.  They do not ride on
obsolete consumer TCP/IP stacks, but rather on proprietary
TCP/IP stacks, hardware assisted TCP/IP stacks, and 
robust embedded Unix TCP/IP stacks.

Most iSCSI initiators are new designs.  While some may ride
on present host TCP/IP stacks, it is likely that most
will be new designs or hosted by robust and up-to-date
TCP/IP stacks.

All storage applications are debugged in an end-to-end
manner impossible for most other applications.  The 
"write/read/compare" and performance testing required of 
storage eliminates almost all the bugs that were contributing
to the higher numbers found in the Stone and Partridge
environment.

That leaves only the residual error rate, which may allow
TCP/IP delivery of segments with undetected errors less
than one time in ten billion, not 1 in 16 million.
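
To put the two rates side by side, here is a back-of-the-envelope
calculation.  The 1 TB transfer volume and the 1460-byte segment payload
are assumptions chosen purely for illustration, not measurements, and the
function name is made up for this sketch.

```python
# Assumed workload (illustrative only): 1 TB moved in 1460-byte TCP segments.
BYTES_TRANSFERRED = 10**12
SEGMENT_PAYLOAD = 1460


def expected_undetected(one_in):
    """Expected segments delivered with undetected errors at 1 in `one_in`."""
    segments = BYTES_TRANSFERRED / SEGMENT_PAYLOAD
    return segments / one_in


pessimistic = expected_undetected(16e6)  # 1 in 16 million
residual = expected_undetected(1e10)     # 1 in ten billion

print(f"1 in 16 million: {pessimistic:.1f} bad segments per TB")
print(f"1 in 10 billion: {residual:.3f} bad segments per TB")
```

Under these assumptions the first rate gives dozens of silently corrupted
segments per terabyte, while the second gives well under one.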

The most interesting problem is that most of the additional
verification and the packet CRC is calculated through the
same hardware stack that calculates the TCP/IP checksum and
is therefore susceptible to many of the same errors, but now
blessed by a valid CRC value and valid TCP/IP checksum.

The net I draw from this is that careful design is key to
success and that CRC or positionally 
dependent checksum on the iSCSI data packets is probably 
a good idea.  However, retry of iSCSI data packets may not
be necessary.
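
A small sketch of why position dependence matters: the 16-bit
ones'-complement sum used by TCP does not change when aligned 16-bit words
are reordered, whereas a CRC does.  The payload below is made up, and
zlib.crc32 stands in for whatever CRC iSCSI might eventually adopt; it is
not a statement about the polynomial the working group will choose.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum over 16-bit words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"                            # pad to a whole word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                             # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"BLOCK-A!BLOCK-B!"                     # two 8-byte "blocks"
swapped = original[8:] + original[:8]              # same blocks, wrong order

same_sum = internet_checksum(original) == internet_checksum(swapped)
same_crc = zlib.crc32(original) == zlib.crc32(swapped)

print("checksum unchanged by reordering:", same_sum)
print("CRC-32 unchanged by reordering:  ", same_crc)
```

The checksum cannot see the swap because addition is commutative; the CRC
depends on where each bit sits in the message.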

Bob 

>  -----Original Message-----
>  From: Michael Krause [mailto:krause@cup.hp.com]
>  Sent: Thursday, April 05, 2001 10:10 AM
>  To: Stephen Bailey
>  Cc: ips@ece.cmu.edu
>  Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>  
>  
>  At 09:12 AM 4/3/2001 -0400, Stephen Bailey wrote:
>  > > The Stone and Partridge paper is mostly not applicable to an iSCSI
>  > > environment.  The principal failure mechanisms were major software
>  > > bugs in the driver stack of PC-oriented machines.
>  
>  People make mistakes in all implementations.  Examination of other
>  similar packet processing technology for mistakes is applicable to any
>  effort and one should perform a risk assessment as to the probability
>  of the mistakes being repeated here.  The fact that the mistakes were
>  in PC-oriented machines is basically irrelevant and storage is not
>  immune from having similar mistakes (have seen storage implementations
>  that were just as poor in terms of quality as any other segment of the
>  industry).
>  
>  
>  >I'm in complete agreement with Bob.
>  >
>  >I haven't seen a good analysis of TCP checksum escapes which resulted
>  >from intermediary manipulation (I haven't read the papers, but
>  >hopefully soon), but my hunch is that it's incredibly rare.
>  >
>  >An endpoint-precipitated TCP checksum `escape' will also escape a CRC
>  >or any other similar integrity check.  That is why I think all this
>  >additional integrity checking (on iSCSI headers & data), is an
>  >incredible amount of extra work (not just in computing the CRCs, but
>  >also in designing the SACK mechanism and recovery for digest failures)
>  >for no real gain.
>  
>  I agree that some of the recovery is overkill but disagree that error
>  detection is as well.  At a minimum, one needs to have a strong
>  end-to-end error detection mechanism.  Many believe a 16-bit checksum
>  is not adequate to protect their data and given the importance of this
>  data to our customers, most feel the specification must define such a
>  mechanism (with some having strong feelings that this mechanism should
>  NOT be optional).  Now whether we need to have 2 CRCs, etc. is a
>  separate debate but they need to be there and most of us will require
>  that they be used in any product / solution delivered to the customer.
>  
>  >The real loss is that it's immensely slowing time-to-market for iSCSI
>  >(both in the front end specification and the back end implementation).
>  
>  A fast TTM solution that is not the highest quality (prevents silent
>  data corruption) will lead to customer distrust and a repeat of the FC
>  adoption rate - only 10 years later has it really started to penetrate
>  customer solutions.
>  
>  
>  >A straw-man proposal (very unpopular given where we are, I know) would
>  >be to specify iSCSI without additional integrity checks (other than
>  >what you can get through security mechanisms, which is probably not
>  >visible to iSCSI anyway), and if that `fails' (I'm sure it won't), we
>  >can put an integrity shim between iSCSI and the transport.
>  >
>  >One example of how to do this would be Julian's TAF.  Another would be
>  >the WARP RDMA layer.
>  
>  If another layer is put in place that provides data integrity, then it
>  is redundant to do this at the iSCSI layer as well and this is one
>  place where an option can be used, i.e. one negotiates the underlying
>  framing mechanism (e.g. WARP) and if it is present, then iSCSI does
>  not activate the CRC services.  If it is not, then it does, thereby
>  ensuring that there is always end-to-end data integrity present in the
>  solution.
>  
>  
>  >We don't have to specify how to do this now
>  
>  If this is to be supported then it should be specified now (can be
>  done rather opaquely by just setting a "transport services" attribute
>  for strong end-to-end data integrity protection).
>  
>  >, and the point is that
>  >it's hard to do so, because we really don't know what problem we're
>  >solving with it.  We're OK as long as we have a way to address it in
>  >the future without completely chucking what already exists.
>  >
>  >The other point to remember is that iSCSI still has to make the
>  >ID->Proposed->Draft->Internet traversal, and anybody that thinks it's
>  >going to do that on the first try is kidding themselves.  It's more
>  >important to get SOMETHING out there that exposes the implementation
>  >holes than to design a cathedral on paper.
>  
>  Nothing is perfect the first time out but in the tightening economy
>  and increasing customer quality demands from the get-go, the trade-off
>  between quality / reliability and TTM is not something people should
>  rush to make.  The market is not what it used to be where good enough
>  was alright; customers expect more today and with good cause.
>  
>  Mike
>  
>  


From owner-ips@ece.cmu.edu  Fri Apr  6 20:11:47 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA23534
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 20:11:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f36MBuM00146
	for ips-outgoing; Fri, 6 Apr 2001 18:11:56 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f36MB8r00070
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 18:11:09 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f36NJY096010
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 16:19:34 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Ips" <ips@ece.cmu.edu>
Subject: iSCSI: Exigent trivia
Date: Fri, 6 Apr 2001 15:09:31 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJEEAOCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

FYI,

Not wishing to reuse terminology found in TCP or SCSI, a word commonly used
with respect to law enforcement seemed to relate to iSCSI as a flag to
substitute for the null sequence as a means of rejecting bypassed commands.

Legal definition: "Exigent Circumstances"

Emergency situations in which warrantless searches are permissible in order
to protect lives or prevent the destruction of evidence.

Doug



From owner-ips@ece.cmu.edu  Fri Apr  6 20:13:25 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA23556
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 20:13:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f36M91b29906
	for ips-outgoing; Fri, 6 Apr 2001 18:09:01 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f36M8Dr29866
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 18:08:13 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <2G6K895X>; Fri, 6 Apr 2001 17:59:02 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080153DB@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: sandeepj@research.bell-labs.com, ips@ece.cmu.edu
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple con
	 nections.
Date: Fri, 6 Apr 2001 18:08:06 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Sandeep,

> Still wading thru related emails but I believe that if a refCmdSN
> is added to the task management PDU (not present currently but
> could be added for task-related management commands), then it 
> might fix the above-mentioned flaws and allow for safe execution 
> and immediate delivery of the abort task to the target.  

I think this is a functional subset of Doug Otis's suggestion to always
use a CmdSN and add a header flag indicating that the command should
be executed immediately at the target rather than waiting for those with
prior CmdSNs to arrive.  Doug's suggestion also consumes less space
in the header.  As to (1) vs. (2):

> (1) The command takes effect immediately and its status/response
>         is available immediately.
> (2) The command affects all commands in flight, and its status/response
>        is delayed until all such effects are complete.

I think you've covered most of the ground in:

> 6) Task response can be returned as appropriate to conform with
>    SAM2 - either after in-flight commands arrive or immediately
>    since the target knows what needs to be done later.  I am slightly 
>    confused here since your goals (1)&(2) appear to be contradictory
>    for application to in-flight commands.. it depends on semantics 
>    what "taking effect" implies ?

IMHO, returning the response before task execution is complete
("the target knows what needs to be done later") not only does
not comply with SAM2, but can also yield rather unexpected
behavior (e.g., situations in which an in-flight command is aborted
significantly after the response to the abort command has been
returned to the Initiator).  If this is correct, goals (1) and (2) are
contradictory because there's only one response and achieving
both goals requires sending that one response at two different
times (not a good idea).  Hence the Initiator has to choose
between (1) and (2) for each task management command.

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Fri Apr  6 21:47:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA25233
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 21:47:28 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f36NX5E04858
	for ips-outgoing; Fri, 6 Apr 2001 19:33:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from lub1028.lss.emc.com ([168.159.39.28])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f36NWor04846
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 19:32:50 -0400 (EDT)
Received: from emc.com (IDENT:jhall@localhost.localdomain [127.0.0.1])
	by lub1028.lss.emc.com (8.9.3/8.9.3) with ESMTP id TAA11777
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 19:32:44 -0400
Message-Id: <200104062332.TAA11777@lub1028.lss.emc.com>
To: ips@ece.cmu.edu
Subject: Re: SNACK and recovery 
Date: Fri, 06 Apr 2001 19:32:44 -0400
From: "Jon Hall" <jhall@emc.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

David,

Since you addressed this to me, I'll reply, but be forewarned
I really have nothing more to add to this thread :-).

Black_David writes:
>This turns out to be a matter not just of rarity,
>but also one of consequences.  As Mark points
>out, for tapes and similar devices, the consequences
>are disastrous - the backup aborts, and when
>"those in charge" come in the next morning,
>they have no usable backup tape, and are very
>unhappy.  While Jon says "streams devices must
>support abort and retry for extreme errors in
>any case", the abort may well be the entire
>backup and the retry might be next weekend ...
>not a good situation.

Perhaps, but could you describe the process that supports this
scenario?  A tape backup procedure that must succeed entirely,
and, if not, may only be repeated the following weekend -- seems
like it would be hard to sustain (quite apart from this discussion).

>Over in Fibre Channel world, FCP-2 contains
>recovery support that resulted from the
>discovery that despite the fact that non-
>delivery of a Fibre Channel frame (Class 2 or
>3 - it doesn't matter which) is "extremely
>rare":
>- Buffer overrun is prevented by both link
>	and end-to-end buffer usage controls.
>- FC switches are engineered to not drop
>	frames to the maximum extent possible
>	due in part to these consequences.
>- There's a 32-bit CRC covering the entire
>	FC frame.
>failure to deliver a frame happens often enough
>that a recovery mechanism is needed to avoid
>tape backup aborts and the like.  Unlike TCP,
>Fibre Channel has no built-in retransmit
>mechanism.

My objection to the complexity inherent in StatSN/SNACK/SACK
is in part motivated by the experience of those that run SCSI
commands over FC on high speed links.  In this FC context
retained status on the target is simply not supported.  But you
(and many on the list) know this  -- I'm not sure what to make
of your points above, are we just agreeing on this fact?

>In contrast to Fibre Channel, we are dealing
>with something rarer because TCP retransmit will
>take care of most things that can go wrong in
>switches and there's a 16 bit checksum whose
>failure will trigger retransmits.  What this
>appears to come down to is:
>
>- Does a 16-bit TCP checksum catch enough of
>the corruption events to make it acceptable to
>take drastic measures like aborting a backup
>when a 32 bit CRC fails on a response that
>made it through the 16 bit checksum?

Is it correct to ignore link-level error correction?

>The discussion's been a bit convoluted.  Some
>simple yes/no answers to the above question
>accompanied by short reasoning would be appreciated.
>I think Julian's said "no" and quoted a filesystem
>number that we're awaiting a reference to.

I tried to be clear in my opinion and its basis, but I don't
claim specific tape experience.  You have the paper by Stone
and Partridge, could we agree to a number within the range
that they set out.  What do you like, say 1 in 5 billion
packets have a TCP cksum failure?

Now, what to say about tapes.  Just naive conjecture on my
part but here goes.  Assume a 20 gig disk being backed up to
tape over iSCSI; what xfer size do we like for the write CDBs?
Would one Meg be OK for one write command?  Would that then be
20480 responses covered by StatSNs to backup the 20 gig?

Assume that each response is a distinct TCP segment, and ignore the
fact that the corrupt data may not actually be in the iSCSI header
part of the TCP segment.  Then one backup would fail for every
244,140 attempts.  Assuming that we do the backup every day, that
means we must redo the backup (for this specific error case) once
every 668 years (ignoring leap year days).
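
The arithmetic above can be checked with a short Python sketch.  The
constant names are illustrative only, and the inputs are just the
assumptions already stated in this message (20 gig disk, one Meg per
write, one response per TCP segment, 1 in 5 billion segments affected,
one backup per day); none of them are measured values.

```python
# Assumed inputs, taken from the figures quoted in this message.
DISK_BYTES = 20 * 1024**3               # 20 gig disk
WRITE_BYTES = 1024**2                   # one Meg per write command
PACKETS_PER_BAD_CKSUM = 5_000_000_000   # 1 in 5 billion packets
BACKUPS_PER_YEAR = 365                  # daily backup, ignoring leap days

# One response (one StatSN) per write command.
responses_per_backup = DISK_BYTES // WRITE_BYTES

# Number of backups between failures, if each response is one TCP segment.
backups_per_failure = PACKETS_PER_BAD_CKSUM // responses_per_backup

# Whole years between failed backups.
years_per_failure = backups_per_failure // BACKUPS_PER_YEAR

print(responses_per_backup, backups_per_failure, years_per_failure)
# prints: 20480 244140 668
```

Changing the write size or the assumed failure rate scales the result
linearly, so other tape assumptions are easy to try.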

[ Note, the error is detected, no corrupted data has gone
 unrecognized, the downside is that the backup must be redone. ]

Now, I don't credit my tape assumptions (though I hope they
are generous wrt the counter argument) -- those who know tape
processes and iSCSI flows should adjust them.  But, if this
scenario is in the ballpark, then "yes" seems to be the
answer to your question.

>Just to muddy the waters further, let me point out
>that tape targets tend to be less complex than
>disk targets.  Tapes don't reorder commands, and
>often don't even queue them.  Saving the last N
>responses is not that difficult when the responses
>go out in the order that the commands came in
>(easier to organize saving them), and the initiator
>has to be very careful about the number of commands
>in flight to avoid disasters caused by dropped
>commands (should lead to reasonable results from
>relatively small values of N).

Maybe, and are there other rare errors to consider?  Where
is the line drawn (or how many pages of error recovery state
diagrams are enough? :-)

-Jon


From owner-ips@ece.cmu.edu  Fri Apr  6 21:50:30 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA25298
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 21:50:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f370W9K07954
	for ips-outgoing; Fri, 6 Apr 2001 20:32:09 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f370Vjr07939
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 20:31:45 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f371dw096114;
	Fri, 6 Apr 2001 18:39:59 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Sandeep Joshi" <sandeepj@research.bell-labs.com>, <Black_David@emc.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple  connections.
Date: Fri, 6 Apr 2001 17:29:55 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJKEAPCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
In-Reply-To: <3ACE4345.A4899584@research.bell-labs.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Sandeep,

I understand your concern as you view the sequencer as potentially holding
an inordinate number of commands.  Flow control should help keep this queue
lean.  Consider the consequence of a connection failure creating a series of
holes in the sequence.  The alternative to an immediate rejection of
commands may be to time out on each hole in the sequencer.  In this case,
resending a list of rejected commands is a better solution than an endless
sequence of timeouts if this queue is really that large.  Data-wise, this
list should represent only a small amount to be resent.  If the CmdSN were
not session-wide but LUN-wide, there would be less to recover.  I do not see
this happening often enough for there to be a concern about the number of
commands rejected.
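
A toy model of the behaviour described above, as a rough Python sketch.
The class and method names (CmdSequencer, receive, holes, reject_pending)
are hypothetical and do not come from any iSCSI draft; the sketch also
ignores 32-bit CmdSN wraparound and flow control.  It only illustrates
the trade-off between rejecting everything queued behind the holes at
once and waiting on each hole separately.

```python
class CmdSequencer:
    """Hypothetical target-side sequencer releasing commands in CmdSN order."""

    def __init__(self, first_cmd_sn=1):
        self.expected = first_cmd_sn   # next CmdSN allowed to execute
        self.pending = {}              # CmdSN -> command held behind a hole

    def receive(self, cmd_sn, command):
        """Queue a command; return the commands that may now run, in order."""
        self.pending[cmd_sn] = command
        released = []
        while self.expected in self.pending:
            released.append(self.pending.pop(self.expected))
            self.expected += 1
        return released

    def holes(self):
        """CmdSNs still missing below the highest queued CmdSN."""
        if not self.pending:
            return []
        top = max(self.pending)
        return [sn for sn in range(self.expected, top) if sn not in self.pending]

    def reject_pending(self):
        """Reject every queued command in one step (no per-hole timeout).

        Returns the rejected CmdSNs, which the initiator would resend;
        the sequencer then resumes after the highest rejected CmdSN.
        """
        rejected = sorted(self.pending)
        if rejected:
            self.expected = rejected[-1] + 1
        self.pending.clear()
        return rejected


seq = CmdSequencer()
first = seq.receive(1, "read A")    # runs at once
stuck = seq.receive(3, "write C")   # held: CmdSN 2 is missing
seq.receive(5, "read E")            # held: CmdSN 2 and 4 are missing
missing = seq.holes()
resend = seq.reject_pending()
print(first, stuck, missing, resend)
# prints: ['read A'] [] [2, 4] [3, 5]
```

In this model a failed connection that carried CmdSN 2 and 4 costs the
initiator one resend of two small command PDUs, rather than two
separate hole timeouts at the target.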

Doug


> Black_David@emc.com wrote:
> >
> > Sandeep,
> >
> > > Still wading thru related emails but I believe that if a refCmdSN
> > > is added to the task management PDU (not present currently but
> > > could be added for task-related management commands), then it
> > > might fix the above-mentioned flaws and allow for safe execution
> > > and immediate delivery of the abort task to the target.
> >
> > I think this is a functional subset of Doug Otis's suggestion to always
> > use a CmdSN and add a header flag indicating that the command should
> > be executed immediately at the target rather than waiting for those with
> > prior CmdSNs to arrive.  Doug's suggestion also consumes less space
> > in the header.
>
> Yes, i now understand Doug's proposal.  The "PDU sequencer" was
> new terminology which baffled me some.. :-)
>
> His solution does seem like a cleaner approach (no target state)
>
> In some ways, it reminds one of the M_FLUSH function used
> in the STREAMS layer.   If we stand back a bit, we see that the
> scenario is essentially the same -> Stream module processing is
> normally asynchronous.   With M_FLUSH, the stream head desires to
> clean out all the downstream module queues.
>
> The only concern with that solution is that it may delete
> a whole lot of unrelated SCSI I/O tasks as the sequencer cmdSN
> moves way up to the cmdSN of the task_mgmt command.  "Abort task"
> as such will not exist in the iSCSI world.   These unrelated
> commands will now have to be retried by the initiator.
>
> Do we have any numbers on how often ABORT_TASK/ABORT_TASK_SET actually
> occurs in filesystem/SCSI workloads ?
>
> regards,
> -Sandeep
>



From owner-ips@ece.cmu.edu  Fri Apr  6 22:07:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA25424
	for <ips-archive@odin.ietf.org>; Fri, 6 Apr 2001 22:07:21 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f370I0m07238
	for ips-outgoing; Fri, 6 Apr 2001 20:18:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f370Hrr07230
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 20:17:53 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <2G6K91LA>; Fri, 6 Apr 2001 20:08:42 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080153DF@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: sandeepj@research.bell-labs.com
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple  co
	nnections.
Date: Fri, 6 Apr 2001 20:17:46 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> The only concern with that solution is that it may delete
> a whole lot of unrelated SCSI I/O tasks as the sequencer cmdSN
> moves way up to the cmdSN of the task_mgmt command.  "Abort task" 
> as such will not exist in the iSCSI world.   These unrelated 
> commands will now have to be retried by the initiator.  

FWIW, that's not a necessary consequence of the proposal to use
a flag instead of a zero CmdSN, although it does appear to be the
way that Doug envisions implementing this.

--David

> -----Original Message-----
> From:	Sandeep Joshi [SMTP:sandeepj@research.bell-labs.com]
> Sent:	Friday, April 06, 2001 6:29 PM
> To:	Black_David@emc.com
> Cc:	ips@ece.cmu.edu
> Subject:	Re: iSCSI: Out Of Sequence due to null sequence with
> multiple  connections.
> 
> Black_David@emc.com wrote:
> > 
> > Sandeep,
> > 
> > > Still wading thru related emails but I believe that if a refCmdSN
> > > is added to the task management PDU (not present currently but
> > > could be added for task-related management commands), then it
> > > might fix the above-mentioned flaws and allow for safe execution
> > > and immediate delivery of the abort task to the target.
> > 
> > I think this is a functional subset of Doug Otis's suggestion to always
> > use a CmdSN and add a header flag indicating that the command should
> > be executed immediately at the target rather than waiting for those with
> > prior CmdSNs to arrive.  Doug's suggestion also consumes less space
> > in the header. 
> 
> Yes, i now understand Doug's proposal.  The "PDU sequencer" was
> new terminology which baffled me some.. :-)
> 
> His solution does seem like a cleaner approach (no target state)
> 
> In some ways, it reminds one of the M_FLUSH function used
> in the STREAMS layer.   If we stand back a bit, we see that the
> scenario is essentially the same -> Stream module processing is 
> normally asynchronous.   With M_FLUSH, the stream head desires to 
> clean out all the downstream module queues.  
> 
> The only concern with that solution is that it may delete
> a whole lot of unrelated SCSI I/O tasks as the sequencer cmdSN
> moves way up to the cmdSN of the task_mgmt command.  "Abort task" 
> as such will not exist in the iSCSI world.   These unrelated 
> commands will now have to be retried by the initiator.  
> 
> Do we have any numbers on how often ABORT_TASK/ABORT_TASK_SET actually 
> occurs in filesystem/SCSI workloads ?
> 
> regards,
> -Sandeep


From owner-ips@ece.cmu.edu  Sat Apr  7 00:35:17 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA27820
	for <ips-archive@odin.ietf.org>; Sat, 7 Apr 2001 00:35:16 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f372S7h13774
	for ips-outgoing; Fri, 6 Apr 2001 22:28:07 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f372RZr13756
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 22:27:35 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT3BKXY>; Fri, 6 Apr 2001 22:28:54 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080153E2@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: DRAFT Minneapolis Minutes
Date: Fri, 6 Apr 2001 22:27:23 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Many thanks to Mark Carlson and Elizabeth Rodriguez for
taking the minutes.  Please send corrections to the list,
as well as any objections to rough consensus decisions
reached in the meeting.  The deadline for changes to
get incorporated in the official minutes is the end of
the day on Wednesday, April 11th.  Thanks, --David

IPS WG Meeting Minutes - DRAFT
IETF #50
Minneapolis MN


Interim Meeting - There will be an interim meeting for the IPS working group.
It will be co-located with T10 in Nashua, NH on April 30 & May 1.  
FC related topics will be covered on Monday, April 30.  
SCSI related documents will be covered on Tuesday, May 1.
In addition to the IPS meetings, on Wednesday, May 2, during the
T10 CAP meeting, a discussion of a SCSI MIB will be held. 
Details of this meeting, including location and hotel room information,
have been sent to the IPS mailing list.

TCP framing discussion will be in the TSVWG meeting the evening of Monday,
March 19.  IPS working group attendees are encouraged to attend, be
familiar with the draft, and ask questions.  The draft was cross-posted to
both the IPS and TSVWG mailing lists.  To subscribe to the TSVWG mailing
list or to view the archive, see
www.ietf.org/mailman/listinfo/tsvwg

RDMA - RDMA is a separate effort from the framing discussion.
A separate mailing list has been formed to address RDMA.
To join, send an email to rdma-subscribe@yahoogroups.com .
***NOTE***: This address has been corrected from the one discussed in the
meeting.

T10 is considering a new SCSI model.  T10 participants agreed that it may
have advantages but will be very different from the current SCSI model.
Therefore it is currently deferred to SAM-3.  This model may be of interest
to iSCSI, as it will have advantages for iSCSI.

-- iSCSI Requirements --

iSCSI requirements - document is almost complete.
draft-ietf-ips-iscsi-rqmts-01.txt.
One more draft will be sent out, and then a last call will occur in
approximately 1 month, with the hope of submitting the result as an
informational RFC around the time of the interim meeting (end of April).
Everyone should review the draft on the list and especially check that the
MUSTs and SHOULDs are correct and nothing is missing.

-- iSCSI specification - draft-ietf-ips-iscsi-05.txt --

- CRCs for iSCSI.  Two presentations were made on CRC vs. checksum usage
in iSCSI.  The recommendation was to use a 32-bit CRC; three were mentioned
as candidates - CRC-32C, CRC-32Q and CCITT-CRC32.  It was reported that
there really is no need for a 64-bit CRC.
Consensus call on use of CRC instead of checksum: loud CRC hum, no hum for
checksum, so use of CRC rather than checksums is the rough consensus of
this meeting.
A consensus call on which CRC to use will be taken at the interim meeting,
so that the WG has more time to investigate and understand the options
presented.

Request made for a single mandatory to implement CRC algorithm as opposed
to making the CRC algorithm selection negotiable.
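
For readers unfamiliar with the candidates, here is a minimal bit-at-a-time
Python sketch of one of them, CRC-32C (Castagnoli), using its commonly
published parameters (reflected polynomial 0x82F63B78, initial value and
final XOR of 0xFFFFFFFF).  The meeting did not select an algorithm, so this
is illustration only; the function name crc32c is an assumption of the
sketch, and a real implementation would normally be table-driven or done
in hardware.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected form."""
    crc = 0xFFFFFFFF                      # initial value
    for byte in data:
        crc ^= byte
        for _ in range(8):                # process one bit at a time, LSB first
            if crc & 1:
                crc = (crc >> 1) ^ 0x82F63B78
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF               # final XOR


# Standard check string used for CRC catalogues.
check = crc32c(b"123456789")
print(hex(check))   # prints: 0xe3069283
```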

- iSCSI Digests.  Current draft has multiple digests.
3 proposals presented for digest and related header formats.
Request made for use of TLV format in whatever solution is finalized.
Consensus call made to have 1 header digest instead of multiple.
Barry Reinhold and Julian Satran were asked to work and come back with a
single proposed format at the Wed meeting.  The rough consensus of the
meeting was that the data length should always be in the same place in
the header.

Consensus call - descriptors and diagrams should always be kept together in
the document.  This is the rough consensus of the meeting.

Error Recovery will be addressed at the interim meeting


- Security
Public Key and TLS were removed in version 05 of the iSCSI draft.
Public Key will be reinstated in the 06 draft.
MUST provide authentication and data integrity.
MAY provide data privacy (encryption).

Need to have at least 1 mandatory to implement security protocol.
IPSec seems to be the selection in current draft.
Consensus call - Make IPSec mandatory to implement?  Arguments against, no
consensus.
Consensus call - mandatory to implement SRP?  Hum against.  Will be taken to
interim meeting.
Both Public Key and Radius authentication mechanisms will be added to the
next version of the draft.

- SCSI ACA discussion

ACA (Auto Contingent Allegiance) is an optional SCSI mechanism that stops
execution of a sequence of dependent SCSI commands when one of them fails.
The situation surrounding it is complex - T10 specifies ACA in SAM2, and
hence iSCSI has to specify it and endeavor to make sure that ACA gets
implemented sufficiently (two independent interoperable implementations)
to avoid dropping ACA in the transition from Proposed Standard to Draft
Standard.  On the list David Black noted that this would make ACA
implementation at least a "SHOULD" rather than a "MAY".

The current iSCSI draft has language requiring use of ACA rather than
implementation; that's overspecified (it's ok for cooperating iSCSI nodes
to decide not to use ACA), and will be changed.

In practice, ACA is not a complete solution (e.g., if a Fibre Channel
switch drops a frame due to a CRC error, ACA will not kick in).  T10 has
been working on other mechanisms that address problems that ACA addresses
in a more comprehensive fashion, but has not moved to drop ACA from SAM2,
hence iSCSI has to deal with it.


-- iSCSI Boot presentation - draft-ietf-ips-iscsi-boot-02.txt

This draft is relatively stable - the bulk of this draft will become an
informational RFC rather than standards track.  David Black will figure
out whether some piece of it needs to be standards track.

-- iSCSI MIB work presented. - draft-bakke-iscsimib-02.txt

Difficulty in making this an iSCSI MIB only,
in that there is no SCSI MIB currently, but people want to see both iSCSI
and SCSI information addressed.
SCSI MIB will be a topic on the agenda at the next T10 CAP meeting, May 2.  
iSCSI MIB accepted as a WG item; next submission will be official WG item.

-- iSCSI naming and Discovery presented - draft-ietf-ips-iscsi-name-disc-00.txt

Two discovery methods presented - iSNS and SLP based.

iSNS is a new name server, specific to IPS.  Question raised as to why not
use SLP.  A separate SLP presentation followed; both iSNS and SLP can be
used.  SLP works for basic discovery whereas iSNS provides additional
information/capabilities, including management functionality.

-- URN Namespace document presented - draft-bakke-iscsi-wwui-urn-00.txt

While the meeting consensus was to accept this as an official WG work item,
this was overridden by direction from the IESG subsequent to the meeting.

-- SLP document presented - draft-bakke-iscsi-slp-00.txt 

Meeting consensus was to accept this as an official WG item.

-- Back to iSCSI headers (first thing Wednesday)

Follow up from Monday:  Two header formats presented to WG for
consideration.   One has data length in fixed location, but limits size
of data.  Second more flexible, but data length field not in fixed location.
More details on each format to be written up and posted to list; 
decision on which header format to follow to be decided on list; 
Julian requests decision within 10 days.  The list subsequently
chose the "Format 2" prepared/proposed by Barry Reinhold.

-- iSNS draft presented - draft-ietf-ips-isns-01.txt

Primarily a status report and description of how iSNS applies to all three
protocols in the WG (iFCP, iSCSI, FCIP).  It's required by iFCP, but
optional for the others.

Rationale for why SLP isn't enough and iSNS is needed will be sent to list.

A concern was raised about relying on iSNS for certificate distribution vs.
certificate exchange between the two ends of an IP Storage connection.  Not
resolved in the meeting and will need further consideration as part of
protocol security work.

-- Fibre Channel related topics: --

Fibre Channel Common Encapsulation team being formed.  
Small team, with a target size of 6 people.  
Interested people to see David or Elizabeth (co-chairs) by Friday, March 23.

Three encapsulation presentations made:
  Two had encapsulation format recommendations -
	draft-weber-fcip-encaps-00.txt and
	a presentation from Vi Chau (FCIP draft co-author)
  Third consisted of requirements needed for iFCP -
	draft-monia-ips-ifcpenc-00.txt

Discussions of expectations from common encapsulation format -
should provide some means of data integrity & synchronization by
guarding against accidental interpretation of encapsulated data as an FC
frame if framing synchronization to the data stream is lost and being
recovered; this is not a requirement to provide a means of preventing
intentional hijacking; simply a means of validation that what is seen is
actually current and valid.
The WG co-chair made it clear that the encapsulation design team must take
this issue seriously.

FC Frames have CRC around them, so no CRC needed there,
but what needs to be wrapped around the common encapsulation piece (header
and/or header + SOF/EOF codes) to ensure that data is correct?
Statement of direction from WG, based on show of hands, is that CRC should
be used.  This is the rough consensus of the meeting.

One requirement of the draft was initially to make the common header
extensible - not needed by FCIP or iFCP.  Consensus call - remove this
requirement.  This is the rough consensus of this meeting.

-- FCIP draft update discussed - draft-ietf-ips-fcovertcpip-01.txt

No draft submitted since Dec IETF meeting.
Changes were made and discussed by the authors, but major pending changes
made it unreasonable to update the draft for this meeting.
The next draft will include changes recommended at both IETF-49 and the
interim meeting, as well as a better description of the FCIP port models,
encapsulation, etc.

Next draft due April, prior to next interim meeting.

-- FCIP Model presentation - Mike O'Donnell

FCIP port model usage.  In the current FCIP draft, it is unclear whether
this is following a B-Port, E-Port or some other kind of port model.  The
presentation makes clear that both E-Ports and B-Ports can be used by the
FCIP device.  In the absence of FCIP, an inter-switch link in Fibre Channel
connects two E-ports.  If FCIP is implemented in the Fibre Channel switch,
the result is a logical E-port communicating over FCIP.  If FCIP is
implemented in an external bridge, a real E-port on the switch
communicates with a B-port on the bridge and the result is still a logical
E-port on the FCIP side of the switch.  These implementation structures
will interoperate (i.e., a logical E-port in a switch can communicate via
FCIP with a logical E-port implemented in a bridge that uses a B-port to
talk to an E-port on a switch - the FCIP protocol is the same).

FC-BB2 meeting announced - during T11 week in Toronto, Canada on April 9.
Meeting is open to all.  FC-BB2 handles the aspects of FCIP usage that
require Fibre Channel standards.


-- iFCP status - draft-ietf-ips-ifcp-00.txt

3 additional versions of the draft anticipated between now and August 2001.

Target is that this August draft will be complete, w/o any TBDs.

Noting that the draft contained a MIB led to a more general discussion of
MIBs.  In general, MIBs should be advanced as separate documents, and
authors need to look at the FC MIBs developed in the ipfc WG to avoid
duplication.
The iSCSI MIB may provide a model for some of the iFCP MIB.

Final note - somehow, we managed to get 3 anagrammed acronyms.  FCIP and
iFCP are protocols in this WG, and IPFC is the IP over Fibre Channel
protocol done by the ipfc WG.



From owner-ips@ece.cmu.edu  Sat Apr  7 00:39:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA27846
	for <ips-archive@odin.ietf.org>; Sat, 7 Apr 2001 00:39:18 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f36MUxo01305
	for ips-outgoing; Fri, 6 Apr 2001 18:30:59 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dirty.research.bell-labs.com (dirty.research.bell-labs.com [204.178.16.6])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f36MU3r01205
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 18:30:03 -0400 (EDT)
Received: from scummy.research.bell-labs.com ([135.104.2.10]) by dirty; Fri Apr  6 18:29:26 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by scummy; Fri Apr  6 18:29:25 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id SAA16270;
	Fri, 6 Apr 2001 18:29:25 -0400 (EDT)
Message-ID: <3ACE4345.A4899584@research.bell-labs.com>
Date: Fri, 06 Apr 2001 18:29:25 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Black_David@emc.com
CC: ips@ece.cmu.edu
Subject: Re: iSCSI: Out Of Sequence due to null sequence with multiple 
 connections.
References: <0F31E5C394DAD311B60C00E029101A07080153DB@corpmx9.isus.emc.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Black_David@emc.com wrote:
> 
> Sandeep,
> 
> > Still wading thru related emails but I believe that if a refCmdSN
> > is added to the task management PDU (not present currently but
> > could be added for task-related management commands), then it
> > might fix the above-mentioned flaws and allow for safe execution
> > and immediate delivery of the abort task to the target.
> 
> I think this is a functional subset of Doug Otis's suggestion to always
> use a CmdSN and add a header flag indicating that the command should
> be executed immediately at the target rather than waiting for those with
> prior CmdSNs to arrive.  Doug's suggestion also consumes less space
> in the header. 

Yes, i now understand Doug's proposal.  The "PDU sequencer" was
new terminology which baffled me some.. :-)

His solution does seem like a cleaner approach (no target state)

In some ways, it reminds one of the M_FLUSH function used
in the STREAMS layer.   If we stand back a bit, we see that the
scenario is essentially the same -> Stream module processing is 
normally asynchronous.   With M_FLUSH, the stream head desires to 
clean out all the downstream module queues.  

The only concern with that solution is that it may delete
a whole lot of unrelated SCSI I/O tasks as the sequencer cmdSN
moves way up to the cmdSN of the task_mgmt command.  "Abort task" 
as such will not exist in the iSCSI world.   These unrelated 
commands will now have to be retried by the initiator.  

Do we have any numbers on how often ABORT_TASK/ABORT_TASK_SET actually 
occurs in filesystem/SCSI workloads ?

regards,
-Sandeep


From owner-ips@ece.cmu.edu  Sat Apr  7 02:22:55 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA11697
	for <ips-archive@odin.ietf.org>; Sat, 7 Apr 2001 02:22:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f371L2o10513
	for ips-outgoing; Fri, 6 Apr 2001 21:21:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f371Klr10502
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 21:20:47 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT3BKVY>; Fri, 6 Apr 2001 21:22:10 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080153E0@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: jhall@emc.com, ips@ece.cmu.edu
Subject: RE: SNACK and recovery 
Date: Fri, 6 Apr 2001 21:20:39 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> My objection to the complexity inherent in StatSN/SNACK/SACK
> is in part motivated by the experience of those that run SCSI
> commands over FC on high speed links.  In this FC context
> retained status on the target is simply not supported.  But you
> (and many on the list) know this  -- I'm not sure what to make
> of your points above, are we just agreeing on this fact?

And at the moment SNACK is not required by the iSCSI
specification.  Such a target can choose to continue not to
retain status and hence reject all SNACKs (although the
result may well be that the TCP connection closes).  Whether
we have too many error recovery options and mechanisms
is a separate issue - as long as the SNACK mechanism
is optional, targets that find it burdensome don't have to
implement it.

> >- Does a 16-bit TCP checksum catch enough of
> >the corruption events to make it acceptable to
> >take drastic measures like aborting a backup
> >when a 32 bit CRC fails on a response that
> >made it through the 16 bit checksum?
> 
> Is it correct to ignore link-level error correction?

No, if the link corrects the error, it's not a corruption
event visible to an end system because it cannot
cause a TCP retransmit or iSCSI error recovery of
any sort.

> I tried to be clear in my opinion and its basis, but I don't
> claim specific tape experience.  You have the paper by Stone
> and Partridge, could we agree to a number within the range
> that they set out.  What do you like, say 1 in 5 billion
> packets have a TCP cksum failure?

This begins an interesting math adventure.  Let me play devil's
advocate here, and accept the 1 in 5 billion number (of
failures undetected by the TCP checksum) to start,
and assume that the CRC catches essentially 100%
of those failures.  FWIW, the 1 in 5 billion rate translates
to about twice a day for 1k packets in one direction of
a saturated gigabit link, so if the 1 in 5 billion number
is correct, the case for data CRCs is crystal clear
(this is not related to Jon's point because there's a
lot more data flowing than status).
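
As a rough check on the "about twice a day" figure, here is a minimal
sketch of the arithmetic.  It assumes 1k means 1024 bytes, a gigabit
means 10**9 bits/s, no header overhead, and independent errors; none of
these assumptions are stated in the thread, and the function name is
made up for illustration.

```python
# Sketch under stated assumptions: saturated link, 1024-byte packets,
# one checksum-escaping error per 5 billion packets.

def seconds_between_failures(link_bits_per_s, packet_bytes, packets_per_failure):
    """Mean seconds between errors that slip past the TCP checksum."""
    packets_per_s = link_bits_per_s / (packet_bytes * 8)
    return packets_per_failure / packets_per_s

secs = seconds_between_failures(10**9, 1024, 5 * 10**9)
print(f"{secs / 3600:.1f} hours between failures, {86400 / secs:.1f} per day")
```

With these inputs the interval comes out a little over 11 hours, which
is consistent with roughly two events a day in one direction.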

> Now, what to say about tapes.  Just naive conjecture on my
> part but here goes.  Assume a 20 gig disk being backed up to
> tape over iSCSI; what xfer size do we like for the write CDBs?
> Would one Meg be OK for one write command?  Would that then be
> 20480 responses covered by StatSNs to backup the 20 gig?

1 Meg seems way too large.  Let's try 32K - this is
1/32nd of 1 Meg and makes everything happen 32
times as often.

> Assuming that each response is a distinct TCP segment and
> ignoring the fact that the corrupt data may not actually be
> in the iSCSI header part of the TCP segment.  Then one backup
> would fail for every 244,140 attempts.  Assuming that we do
> the backup every day, that means we must redo the backup (for
> this specific error case) once every 668 years (ignoring leap
> year days).

Divide by 32 and we get once every 21 years.  Now, let's try to
back up a terabyte - that's 50 times 20 Gig and the failure
occurs once every 5 months.  That's not good, and if 150
sites try to do this every night, on average there will
be one failure a night.  That's often enough to be a real problem.
If a terabyte a day seems excessive (it's not, but ...) let's
try it once a week.  Across 1000 sites, the average is again
around a failure a day.
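
The chain of estimates above can be reproduced with a short sketch.
It assumes binary units (20 gig = 20 * 2**30 bytes, 1 Meg = 2**20,
32K = 32 * 2**10), one status response per write command, and one
failed backup per corrupted response; the names are hypothetical and
the 1 in 5 billion rate is only the figure accepted for the sake of
argument.

```python
# Sketch under stated assumptions; not a measurement of real tape workloads.
PACKETS_PER_FAILURE = 5 * 10**9
GIB = 2**30

def backups_between_failures(backup_bytes, write_bytes):
    """Number of backups, on average, before one response is corrupted."""
    responses = backup_bytes // write_bytes   # one response per write command
    return PACKETS_PER_FAILURE / responses

jon = backups_between_failures(20 * GIB, 2**20)             # 20 gig, 1 Meg writes
small = backups_between_failures(20 * GIB, 32 * 2**10)      # 20 gig, 32K writes
tera = backups_between_failures(50 * 20 * GIB, 32 * 2**10)  # 50 times 20 Gig

print("1 Meg writes:", int(jon), "backups, about", int(jon / 365), "years nightly")
print("32K writes: about", round(small / 365), "years nightly")
print("terabyte, 32K writes: about", round(tera), "nights per site")
```

The last figure is also the number of sites at which one failure per
night is expected, which is where the 150-site estimate comes from.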

I'm not claiming that my numbers are any more realistic than
Jon's.  Does anyone on the list want to paint the "tape expert"
target on their back and tell us where on this range the
numbers that correspond to reality lie?

> Maybe, and are there are other rare errors to consider?  Where
> is the line drawn (or how many pages of error recovery state
> diagrams are enough? :-)

Believe it or not, that's an open issue.  There are a number of folks
toiling away off-line on figuring out just what it takes to fully
describe error recovery based on the current state of things (or
some modifications that allow the task to be completed in a
reasonable amount of time).  With luck, we'll be able to expose
the result of their hard work to the group in the near future so
that between the list and the Nashua meeting we can have an
informed discussion about what is necessary in the way of
error recovery.

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Sat Apr  7 03:59:58 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA12026
	for <ips-archive@odin.ietf.org>; Sat, 7 Apr 2001 03:59:53 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f375jBH23782
	for ips-outgoing; Sat, 7 Apr 2001 01:45:11 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f375iKr23760
	for <ips@ece.cmu.edu>; Sat, 7 Apr 2001 01:44:20 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id BAA56574
	for <ips@ece.cmu.edu>; Sat, 7 Apr 2001 01:36:58 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.95) with ESMTP id XAA242166
	for <ips@ece.cmu.edu>; Fri, 6 Apr 2001 23:44:19 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
To: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OFFA04D059.E53AEA85-ON88256A27.001F4E72@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Fri, 6 Apr 2001 22:44:10 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/06/2001 11:44:18 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

OK, if you go at it long enough you are punished with my two cents.

We of course need some real numbers on the probability of a CRC-detected
error that TCP does not detect.

Given the fact that we do not have that information,  I could only just use
some of the numbers that have been kicking around on this thread.

A billion is NOT a large number, especially when we are talking about 10
Gigabit links.  (Vendors will be sampling 10 Gigabit/sec HBAs next year,
some shipping them in general availability (GA) in that year, and the rest
in 2003.  And yes, I also have information about a company that is currently
developing 100 Gigabit links.)  So when I looked at some of the numbers, I
found that they meant a link would see a failure about every twenty
minutes; some came out at 200 minutes, etc.

I have one war story that might apply.  Years ago when we were first
thinking about putting the small disk drives in our large storage
controllers, we had folks calculate the Mean Time to Failure (MTF), of the
various Desktop HD.   Some individual MTF numbers sounded large for any
given drive.  But when we computed the number of drives we would have in a
large installation, it turned out that we would have a drive failure at
least every day.  (Thankfully, we had significantly better MTF numbers in
the drives that were actually used.)   So, the point is that sometimes
these large numbers come back to bite you in ways you had not considered at
first, when you think about it in a large installation.

OK, back to the thread.

Now I see sites all the time with 10s to 100s of Tape Units, and these
units.  In many cases this will mean that there will be a tape unit failure
that causes the critical  backup job to fail, somewhere on the computing
room floor, about every 2, 20, or 200 min.  This is a major impact on a
computing center that must process hundreds of backups each day.

Therefore, those of you that think you are talking about  very rare events,
should at least compute the 10 Gigabit/second Rates, and then the number of
paths, etc. that might be in an enterprise installation, and then state how
often a computing center will see such an event.  When many of these things
are done at night with unattended operations, these can be a significant
issue.  If it is probable that only one failure will occur per night, then
you are certain that when a disaster does occur,  they will not have a
valid backup over some amount of the data.

OK, I am not saying who's right or wrong here, but just that some of the
numbers I have heard on this thread are not that impressive when looked at
with 10 Gigabit/sec links and many paths.  (Let alone the future 100
Gigabit/sec links.)  (Oh, by the way, remember a 10 Gigabit link is really a
20 Gigabit link when you factor in full duplex.)

So it might be useful, for the Rare Event folks to do the calculations on
their numbers and tell us what they mean in terms of Minutes between
failures on 10 Gigabit links.  Then the rest of us can compute our own
picture on how many links we will probably have in our installation.
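
One way to run the calculation requested above is sketched here.  The
1 in 5 billion rate and the 1024-byte packet size are assumptions
borrowed from earlier in the thread, the links are assumed saturated,
and the function name is hypothetical.

```python
# Sketch only: the error rate, packet size and link counts are assumptions.
PACKETS_PER_FAILURE = 5 * 10**9   # figure floated earlier in the thread
PACKET_BYTES = 1024               # assumed packet size

def minutes_between_failures(gbit_per_s, links=1, full_duplex=False):
    """Mean minutes between checksum-escaping errors across saturated links."""
    directions = 2 if full_duplex else 1
    bits_per_s = gbit_per_s * 10**9 * links * directions
    packets_per_s = bits_per_s / (PACKET_BYTES * 8)
    return PACKETS_PER_FAILURE / packets_per_s / 60

for links in (1, 10, 100):
    print(links, "link(s):", round(minutes_between_failures(10, links), 2), "min")
```

Because the error rates of independent links add, the interval shrinks
in proportion to the number of links, and halves again if both
directions of a full-duplex link are counted.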



.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com



From owner-ips@ece.cmu.edu  Sat Apr  7 07:49:08 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id HAA13328
	for <ips-archive@odin.ietf.org>; Sat, 7 Apr 2001 07:49:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f379YId04091
	for ips-outgoing; Sat, 7 Apr 2001 05:34:18 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f379Xxr04081
	for <ips@ece.cmu.edu>; Sat, 7 Apr 2001 05:33:59 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id LAA41212
	for <ips@ece.cmu.edu>; Sat, 7 Apr 2001 11:33:43 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id LAA83540
	for <ips@ece.cmu.edu>; Sat, 7 Apr 2001 11:31:46 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A27.00343A13 ; Sat, 7 Apr 2001 11:30:27 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A27.00343993.00@d12mta02.de.ibm.com>
Date: Sat, 7 Apr 2001 11:27:48 +0200
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic reco
		very
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Doug & David,

Commands can be recovered based on their ITT (that is the main identifier).

Nevertheless the idea of a separate immediate delivery flag is appealing.

The idea was raised many months ago and I will have to review some old
notes to find why we dropped it in the first place (except that FCP at the
time and SCSI now use the same marking - i.e. 0 for immediate).

Julo

Black_David@emc.com on 06/04/2001 15:11:26

Please respond to Black_David@emc.com

To:   dotis@sanlight.net, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI:flow control, acknowledgement, and a deterministic reco
      very




Let me see if I can distill out the issue in Doug's long
message on this subject ...

When a CmdSN of zero is used to mark a command for immediate
delivery, the CmdSN-based acknowledgement and windowing
mechanisms don't apply to that command.  This means that
any command sent for immediate delivery (CmdSN = 0):
(A) Cannot recover from a CRC (digest) error via CmdSN-based
     retransmit.  The Initiator can still time this out,
     but that seems like a poor way to initiate recovery
     of something that was supposed to be done immediately.
(B) Does not have its use of resources (e.g. command buffer)
     on the target controlled by the CmdSN windowing
     mechanism, complicating target resource management.
     Targets have better control over their resources if
     all inbound commands use one set of resources,
     described/controlled by the CmdSN window, rather
     than having to set aside separate resources just
     in case the initiator sends an immediate command.
The alternate proposal that Doug envisions is to apply
CmdSN (and hence its error recovery and resource management)
to immediate commands, and use a flag  bit elsewhere in
the header to indicate that the command is to be delivered
to SCSI immediately by iSCSI rather than waiting for
"missing" commands to show up.  That seems reasonable,
and comments are invited.  One of the things that this
does is transfer the responsibility to keep some space
in the CmdSN window open for immediate commands (and
determine how much is appropriate) from the Target to
the Initiator - all other things being equal, this
is the right direction to move functionality in a SCSI
system.
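
To make the alternate proposal concrete, here is a minimal sketch of a
target-side sequencer in which every command carries a CmdSN and an
immediate flag.  The class, its fields and the window check are
illustrative assumptions, not text from the draft.

```python
# Sketch of the flag-based proposal; all names here are hypothetical.
MOD = 2**32

class Sequencer:
    def __init__(self, exp_cmd_sn, window):
        self.exp_cmd_sn = exp_cmd_sn   # next CmdSN expected in order
        self.window = window           # number of CmdSNs the target accepts
        self.held = {}                 # CmdSN -> command waiting for a hole to fill
        self.delivered = []            # stands in for delivery to SCSI

    def receive(self, cmd_sn, name, immediate=False):
        # Every command, immediate or not, must fit in the CmdSN window.
        if (cmd_sn - self.exp_cmd_sn) % MOD >= self.window or cmd_sn in self.held:
            return False
        if immediate:
            self.delivered.append(name)   # do not wait for missing commands
            self.held[cmd_sn] = None      # the slot is still accounted for
        else:
            self.held[cmd_sn] = name
        # Deliver in order as far as the sequence is complete.
        while self.exp_cmd_sn in self.held:
            queued = self.held.pop(self.exp_cmd_sn)
            if queued is not None:
                self.delivered.append(queued)
            self.exp_cmd_sn = (self.exp_cmd_sn + 1) % MOD
        return True

seq = Sequencer(exp_cmd_sn=1, window=4)
seq.receive(2, "write-B")                 # held: CmdSN 1 is missing
seq.receive(3, "abort", immediate=True)   # delivered at once
seq.receive(1, "write-A")                 # fills the hole
print(seq.delivered)
```

In this sketch an immediate command still consumes a window slot, so
the initiator has to leave room for it, which is the shift of
responsibility described above.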

Note Well: This concept of "immediate" delivery is an
iSCSI concept that affects only the iSCSI CmdSN sequence
- this does not affect TCP's "deliver in order" behavior.

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------

> -----Original Message-----
> From:   Douglas Otis [SMTP:dotis@sanlight.net]
> Sent:   Thursday, April 05, 2001 11:34 PM
> To:     Ips
> Cc:     David Black; Elizabeth G Rodriguez (Elizabeth)
> Subject:     iSCSI:flow control, acknowledgement, and a deterministic
> recovery
>
> Ver 5 Pg. 10
>
>    "Command numbering is session-wide and is used for ordered command
>    delivery over multiple connections.  It can also be used as a
>    mechanism for command flow control over a session."
>
> "1.2.2.1 Command Numbering and Acknowledging
>
>    iSCSI supports ordered command delivery within a session.  All
>    commands (initiator-to-target) are numbered."
>
>    "Commands in transit from the initiator SCSI layer to the SCSI target
>    layer are numbered by iSCSI; the number is carried by the iSCSI PDU
>    as CmdSN (Command-Sequence-Number).  The numbering is session-wide.
>    All iSCSI PDUs that have a task association carry this number. CmdSNs
>    are allocated by the initiator iSCSI within a 32-bit unsigned counter
>    (modulo 2**32).  The value 0 is reserved and used to mean immediate
>    delivery. Comparisons and arithmetic on CmdSN SHOULD use Serial
>    Number Arithmetic as defined in [RFC1982] where SERIAL_BITS = 32.
>
>    Not covered in this document are the means by which the SCSI layer may
>    request immediate delivery for a command or by which iSCSI will
>    decide by itself to mark a PDU for immediate delivery."
>
>
> Not all commands are serialized as a result of serial number 0
> representing a special case for immediate delivery.  This has an impact
> on flow control, acknowledgement, and a deterministic recovery in the
> face of an error situation.  With the exception of those commands with
> null serialization, all commands MUST be sequenced at the point of
> network aggregation described here as a PDU sequencer that issues
> commands to the SCSI target.  This focal point is likely to find
> situations where its normal operation is curtailed.
>
>    1) Prolonged device operation resulting in a resource constraint.
>    2) Digest Error causing a missing sequence.
>    3) Connection loss causing a missing sequence.
>
> The technique of using a null sequence to bypass the sequencer has some
> inherent problems.  In the first case, those queued commands MAY become
> invalid following management that terminates prolonged operation with a
> command that has bypassed the sequencer.  Those invalidated commands
> queued within the sequencer cannot be cancelled in an orderly manner
> within the existing scheme.  This sequencer MUST be used as defined in
> the iSCSI proposal and, as a result, these queued commands are beyond
> SCSI and iSCSI controls.  Issuing a null sequence command followed by a
> replicate serialized command will have differing results depending on
> the target but will not result in a deterministic treatment of these
> pending commands.
>
> The use of the sequencer bypass technique (null serialization) should
> signal an extreme measure where logically, those commands being bypassed
> become suspect.  The conservative approach to this situation would be to
> reject all bypassed commands.  As a result of this conservative behavior,
> a technique that does not use a null sequence would be to institute a
> flag such as "Exigent" that signals an extreme condition exists and that
> all pending commands within the sequencer are to be rejected.
>
> Note:
> In addition to checking for the next PDU sequence, the sequencer should
> be checking for PDUs with a serialization number that are prior to the
> desired next sequence.  This examination would look something like this:
>
> /* Unsigned 32-bit subtraction wraps, giving the RFC 1982 comparison. */
> if ((uint32_t)(CmdSN - next_sequence) > (1u << (SERIAL_BITS - 1)))
>    {
>    /* CmdSN is serially before next_sequence */
>    reject_pdu(CmdSN, SEQUENCER_INVALIDATION);
>    }
>
> if (CmdSN == next_sequence)
>    {
>    send_pdu(CmdSN);
>    next_sequence++;
>    }
>
> Upon receipt of a PDU flagged as "Exigent", the 'next sequence' value
> immediately becomes the serial number of this command as well as ExpCmd
> advancing to this value plus one.  The effect of this is to have the
> sequencer reject all pending commands up to the "Exigent" command.  This
> has the benefit of removing these now suspect commands as well as
> allowing this highly urgent command to be sent to the target immediately
> without accidentally affecting subsequent commands as is now possible.
>
> As it is now, one other possible use of a null sequence would be to
> always bypass the sequencer as perhaps the sequencer is viewed as
> unnecessary.  The use of zero to allow commands to bypass the sequencer
> then represents a problem with respect to resource management as now the
> flow control scheme no longer works.  If bypassing the sequencer is the
> desired behavior and this command will not impact validity of those
> commands serialized prior to this command, then this PDU flagged "Casual"
> would allow this command to be issued directly to the target.  These
> commands should still include a serial number to allow flow control and
> acknowledgement to remain functional.  The task of acknowledgement would
> be to comprise a min-max list of those commands sent and to look for the
> highest sequential value.
>
> In the case of a lost connection, waiting to time-out on those holes
> created within the sequencer would be one method of handling this
> situation.  If there was a means of rejecting those commands within the
> sequencer, using an "Exigent" command would mean there are no holes left
> to fill.  Those commands received would be rejected and the initiator
> could then resend these commands on a different connection without
> stumbling through a process of repeated timeouts.  Should the method used
> to recover from a digest error mean terminating the connection, then
> those commands can also be quickly shifted over to a new connection.
>
> This does not resolve all issues created in an error event, but it does
> provide a simpler solution for most of those events concerning the
> sequencer and gets rid of the special case handling of the command serial
> number.  Not having flow control, acknowledgement, and a method of
> dealing with queued commands appears to be a serious flaw in the present
> protocol.  I hope I have presented this in a clear manner and I am
> interested in finding how others deal with these situations.
>
> Doug





From owner-ips@ece.cmu.edu  Sat Apr  7 17:19:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA18029
	for <ips-archive@odin.ietf.org>; Sat, 7 Apr 2001 17:19:10 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f37GvZ819262
	for ips-outgoing; Sat, 7 Apr 2001 12:57:35 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f37GvCr19221
	for <ips@ece.cmu.edu>; Sat, 7 Apr 2001 12:57:12 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id SAA20434
	for <ips@ece.cmu.edu>; Sat, 7 Apr 2001 18:57:00 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id SAA171824
	for <ips@ece.cmu.edu>; Sat, 7 Apr 2001 18:53:41 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A27.005CD0F1 ; Sat, 7 Apr 2001 18:53:48 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A27.005CD0A8.00@d12mta02.de.ibm.com>
Date: Sat, 7 Apr 2001 18:57:35 +0200
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple con
	 nections.
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



CmdSN is meant to be short lived.  Command numbers are not retained after
a command starts executing.  However, you would like to have a task
management command operate on an executing task.

Julo

sandeepj@research.bell-labs.com (Sandeep Joshi) on 06/04/2001 05:47:59

Please respond to sandeepj@research.bell-labs.com (Sandeep Joshi)

To:   ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI: Out Of Sequence due to null sequence with multiple con
      nections.




> Unfortunately, I think we have an impossible situation.  It appears to
> me that we have to pick at most two of the following three goals, as I
> have yet to see any way to achieve all three for a single task
> management command on a multiple connection session:
>
> (1) The command takes effect immediately and its status/response
>         is available immediately.
> (2) The command affects all commands in flight, and its status/response
>        is delayed until all such effects are complete.
> (3) There is no significant visible departure from existing SCSI task
>        management behavior.
> ...
> ...
> Sandeep's proposal to create state in the target either fails to achieve
> (1) [if the response is delayed until the state is removed] or violates
> SAM2 [returns the response to the task management command before the
> task management command is complete].  Having state linger after a
> completed LUN or TARGET RESET is almost certainly wrong.

Hi David,

Still wading thru related emails but I believe that if a refCmdSN
is added to the task management PDU (not present currently but
could be added for task-related management commands), then it
might fix the above-mentioned flaws and allow for safe execution
and immediate delivery of the abort task to the target.

refCmdSN (cmdSN of refTaskTag) can tell you where the abort task
command was received in the target command stream.

Processing at target :
1) Initiator task tag reuse : should not happen before refCmdSN
   so can be used for comparisons until refCmdSN expires.
2) State deletion : can be done by target after refCmdSN PDU
   has arrived and is processed/dropped.
3) If Abort task command is early (refCmdSN not arrived) :
   then create state and drop PDUs when they arrive.
4) If Abort task command is Late (executing now beyond refCmdSN):
   target sends a task response of task-not-found (this return
   code exists in the draft).   Otherwise, we may cancel the
   wrong task if the initiator task tag has been reused!
5) If Abort task reaches when target is executing refCmdSN :
   pass abort task to SCSI layer and return response.
6) Task response can be returned as appropriate to conform with
   SAM2 - either after in-flight commands arrive or immediately
   since the target knows what needs to be done later.  I am slightly
   confused here since your goals (1)&(2) appear to be contradictory
   for application to in-flight commands.. it depends on semantics
   what "taking effect" implies ?
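
Cases 3) to 5) above can be sketched as a small decision function.
This is only an illustration: it assumes the target can still tell
which CmdSNs are currently executing (which is itself debatable), uses
RFC 1982 style serial comparison, and all names are hypothetical.

```python
# Sketch of the refCmdSN checks; names and return strings are made up.
MOD = 2**32
HALF = 2**31

def serial_lt(a, b):
    """True if a precedes b in 32-bit serial number arithmetic."""
    return a != b and (b - a) % MOD < HALF

def handle_abort(ref_cmd_sn, exp_cmd_sn, executing):
    """Decide what the target does with an abort task for refCmdSN.

    exp_cmd_sn is the next CmdSN the target expects to start;
    executing is the set of CmdSNs assumed to be running now.
    """
    if ref_cmd_sn in executing:
        return "pass-to-scsi"        # case 5: the task is executing
    if serial_lt(ref_cmd_sn, exp_cmd_sn):
        return "task-not-found"      # case 4: target is already beyond refCmdSN
    return "remember-and-drop"       # case 3: drop the PDU when it shows up

print(handle_abort(10, 8, set()))
print(handle_abort(5, 8, {5}))
print(handle_abort(5, 8, set()))
```

The state created in the "remember-and-drop" case can be deleted once
the PDU with refCmdSN has arrived and been dropped, as in case 2).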

One question: is it reasonable to assume that the initiator
knows the cmdSN of an issued original task (cmdSN<->initiator taskTag)?
This knowledge may be needed anyway if commands are to be re-issued
in cmdSN order during recovery of a broken connection.  It's an
implementation issue but can be debated.

Any other holes that we see..
1) multi-NIC initiators ?
2) don't want to introduce state at target ?
3) may affect iSCSI routers or gateways in strange ways ?
4) linked command issues ?

thanks,
-Sandeep





From owner-ips@ece.cmu.edu  Sat Apr  7 17:19:51 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA18040
	for <ips-archive@odin.ietf.org>; Sat, 7 Apr 2001 17:19:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f37GnXi18829
	for ips-outgoing; Sat, 7 Apr 2001 12:49:33 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c007.snv.cp.net (c007-h000.c007.snv.cp.net [209.228.33.206])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f37Gn5r18815
	for <ips@ece.cmu.edu>; Sat, 7 Apr 2001 12:49:05 -0400 (EDT)
Received: (cpmta 1065 invoked from network); 7 Apr 2001 09:48:58 -0700
Received: from unknown (HELO ljoy) (64.130.130.105)
  by smtp.telocity.com (209.228.33.206) with SMTP; 7 Apr 2001 09:48:58 -0700
X-Sent: 7 Apr 2001 16:48:58 GMT
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>, <sandeepj@research.bell-labs.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: iSCSI: Out Of Sequence due to null sequence with multiple  connections.
Date: Sat, 7 Apr 2001 09:47:27 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJEEBCCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
In-Reply-To: <0F31E5C394DAD311B60C00E029101A07080153DF@corpmx9.isus.emc.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

I agree. Just using a flag rather than a zero CmdSN for immediate delivery
is a substantial improvement and this could be the extent of the change.
Without the sequencer reject, it does require interlocked target behavior
thus forcing the SCSI layer and the target to ensure a deterministic
recovery.  I tend to be conservative on how errors are handled.  To argue
rarity, ensuring a certain result is helpful following these types of errors
with the additional benefit of not needing timeouts for hole filling.  The
downside is there will be a potential list of rejected commands from
innocent LUNs to retry if the problem is LUN related.  I'd prefer a list
over a series of timeouts flushing a connection sputter that may result in a
series of sequence holes.

A close of a failing connection should also reject commands being held in
the sequencer with allegiance to the now closed connection.  As in this
case, reject does not have allegiance but the retry methods should look the
same.  In other words, nothing is being added to what is already in place
for a low code impact change category.

As a side note, if you were attempting an urgent shutdown, you may wish to
send more than one such immediate command rather than waiting for a response
from each LUN before sending the next.  As such, flow control and
acknowledgement for this activity becomes desired.

Doug

> > The only concern with that solution is that it may delete
> > a whole lot of unrelated SCSI I/O tasks as the sequencer cmdSN
> > moves way up to the cmdSN of the task_mgmt command.  "Abort task"
> > as such will not exist in the iSCSI world.   These unrelated
> > commands will now have to be retried by the initiator.
>
> FWIW, that's not a necessary consequence of the proposal to use
> a flag instead of a zero CmdSN, although it does appear to be the
> way that Doug envisions implementing this.
>
> --David
>



From owner-ips@ece.cmu.edu  Sat Apr  7 17:29:36 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA18104
	for <ips-archive@odin.ietf.org>; Sat, 7 Apr 2001 17:29:35 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f37JPbn26783
	for ips-outgoing; Sat, 7 Apr 2001 15:25:37 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f37JPZr26778
	for <ips@ece.cmu.edu>; Sat, 7 Apr 2001 15:25:35 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id EE3A19EA; Sat,  7 Apr 2001 12:25:33 -0700 (PDT)
Received: (from santoshr@localhost)
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) id MAA11593;
	Sat, 7 Apr 2001 12:25:29 -0700 (PDT)
From: Santosh Rao <santoshr@cup.hp.com>
Message-Id: <200104071925.MAA11593@hpcuhe.cup.hp.com>
Subject: Re: SNACK and recovery
To: Black_David@emc.com
Date: Sat, 7 Apr 2001 12:25:29 -0700 (PDT)
Cc: jhall@emc.com, ips@ece.cmu.edu
In-Reply-To: <0F31E5C394DAD311B60C00E029101A07080153E0@corpmx9.isus.emc.com> from "Black_David@emc.com" at Apr 06, 2001 09:20:39 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

> And at the moment SNACK is not required by the iSCSI
> specification.  Such a target can choose to continue not to
> retain status and hence reject all SNACKs (although the
> result may well be that the TCP connection closes).  

The iSCSI rev 05 does mandate Status SNACK [though there has been talk in
this thread of changing that]. The Status SNACK mechanism, as currently
defined in the spec, will block forward progress of resource release at
targets if it is made optional, [eventually resulting in connection
close].

The lack of ability to handle StatSN SNACK must not result in TCP
connection closes. This is a result of the current SNACK model which only
allows forward progress with resource release if targets support status
SNACK. 

An alternate SACK mechanism which allowed initiators to selectively
acknowledge status PDUs that were received would not result in TCP
connection closes in such scenarios.

The current SNACK mechanism is biased towards targets that retain I/O
state information and penalizes implementations that do not do this [by
blocking forward progress of resource release when holes occur in StatSN].

- Santosh


From owner-ips@ece.cmu.edu  Sun Apr  8 10:16:48 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id KAA09613
	for <ips-archive@odin.ietf.org>; Sun, 8 Apr 2001 10:16:47 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f38C85F24656
	for ips-outgoing; Sun, 8 Apr 2001 08:08:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f38C7Gr24623
	for <ips@ece.cmu.edu>; Sun, 8 Apr 2001 08:07:16 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id OAA135700
	for <ips@ece.cmu.edu>; Sun, 8 Apr 2001 14:07:02 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id OAA70918
	for <ips@ece.cmu.edu>; Sun, 8 Apr 2001 14:03:41 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A28.0042428F ; Sun, 8 Apr 2001 14:03:44 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A28.00424221.00@d12mta02.de.ibm.com>
Date: Sun, 8 Apr 2001 09:50:17 +0200
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic reco
		very
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



The main reason for selecting 0 and not a flag for immediate delivery was
to enable immediate delivery even when the command window is closed.

However we can achieve the same effect by using an immediate flag and
using the current CmdSN without advancing it.  With this we get a
reference to where in the stream the command was supposed to act if the
stream order is important. All commands that reached the target having
CmdSN less than the immediate CmdSN were sent before our command and all
those with a number equal or higher were sent after. Immediate commands
are the only ones that can have a CmdSN higher (by 1) than the allowed
window.
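
A rough sketch of the rule described above may help. It assumes plain
integer sequence numbers (no 32-bit wraparound) and uses hypothetical
names (Initiator, accept_command) that are not taken from the draft:

```python
class Initiator:
    """Hypothetical initiator-side CmdSN allocator."""

    def __init__(self, first_cmd_sn=1):
        self.cmd_sn = first_cmd_sn

    def next_command(self, immediate=False):
        # An immediate command carries the current CmdSN but does not
        # advance it; a regular command consumes the number.
        sn = self.cmd_sn
        if not immediate:
            self.cmd_sn += 1
        return sn, immediate


def accept_command(cmd_sn, immediate, exp_cmd_sn, max_cmd_sn):
    """Hypothetical target-side window check (wraparound ignored)."""
    if immediate:
        # Only immediate commands may sit one past the allowed window.
        return exp_cmd_sn <= cmd_sn <= max_cmd_sn + 1
    return exp_cmd_sn <= cmd_sn <= max_cmd_sn
```

With the window closed (for example ExpCmdSN 11 and MaxCmdSN 10), a
regular command with CmdSN 11 fails this check while an immediate command
with CmdSN 11 passes it.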

Julo




From owner-ips@ece.cmu.edu  Sun Apr  8 10:16:53 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id KAA09624
	for <ips-archive@odin.ietf.org>; Sun, 8 Apr 2001 10:16:51 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f38C87A24659
	for ips-outgoing; Sun, 8 Apr 2001 08:08:07 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f38C7Cr24620
	for <ips@ece.cmu.edu>; Sun, 8 Apr 2001 08:07:13 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id OAA28018
	for <ips@ece.cmu.edu>; Sun, 8 Apr 2001 14:07:04 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id OAA163944
	for <ips@ece.cmu.edu>; Sun, 8 Apr 2001 14:03:43 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A28.004244EB ; Sun, 8 Apr 2001 14:03:50 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A28.00424363.00@d12mta02.de.ibm.com>
Date: Sun, 8 Apr 2001 10:17:20 +0200
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

I have to think about it a bit more.  The main ack mechanism is ExpStatSN
(as I indicated in a previous note).  SNACK is meant to simplify recovering
holes without having to reissue commands or wait for a timeout. Rejecting
SNACK with an "unsupported" indication but resending the status on command
retry can hardly be considered a good alternative, while rejecting SNACK
with "no status to recover" has to be bubbled up to SCSI, and that can be
bad for tapes.
I think that if you keep status until it is acked, SNACK only makes life
easier, as it makes the recovery request explicit and specific, unlike the
command retry, which is vague and unspecific.

Julo

Santosh Rao <santoshr@cup.hp.com> on 06/04/2001 20:26:47

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   santoshr@cup.hp.com, David Black <Black_David@emc.com>
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"




Julian,

I did not hear back on this and am re-sending in case you did not
receive the same. Your comments would be appreciated.

(Can you clarify if you intend to make the current SNACK mechanism
optional and if so, how it is expected to solve the "holes in StatSN"
problem for targets that don't implement StatSN SNACK ?)

Regards,
Santosh

-------------------------------------------------------------------------------------


Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Date:     Thu, 05 Apr 2001 19:22:09 -0700
From:     Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
To:       julian_satran@il.ibm.com
CC:       ips@ece.cmu.edu

julian_satran@il.ibm.com wrote:
>
> Santosh,
>
> SNACK and SACK are the same thing (I just renamed them to avoid confusion
> with TCP SACK).
> The status is acked by ExpStatSN (and only indirectly by SNACK). SNACK
> enables fast recovery of
> a hole (without having to resort to a timeout).

Julian,

The bottom line is that the current SNACK mechanism as defined in the
spec will NOT work if it is made optional, and at the same time, it is
too expensive to mandate the SNACK mechanism.

The current SNACK mechanism is really a negative ACK requesting the
target to re-send the status PDU. This mechanism has 2 dis-advantages :

a) requires targets to retain I/O state information until StatSN is
acknowledged.
b) Does not allow forward progress with the release of I/O resources in
the event that a target could not retain that state information or for
some other reason could not service the SNACK.

I am suggesting that the alternate model of SACK be used, wherein, a
SACK is an individual ACK of a received status PDU. This SACK only kicks
in on detection of a hole. The hole is implicitly plugged by the
initiator on eventual completion of the command
[on timeout followed by abort or retry].

The advantages of this alternate model are :
a) Does not require state information to be stored at targets beyond I/O
completion.
b) Allows a more reliable mechanism of resource release.

The dis-advantage of this mechanism is :
a) It results in I/O timeout when Status PDU was dropped due to a digest
error.

Once again, the question boils down to the rate of TCP checksum escapes
and the probability of such escapes affecting status PDUs. If this is
low enough, such a timeout on a digest error of a status PDU should be
acceptable.

>  We decided long ago
> against individual acks as bulk acking through a window is cheaper and
> safer (repetition).

I am not suggesting removal of bulk ack scheme. My suggestion is that
SACK kick in on a hole and the initiator revert to bulk ACK scheme once
it considers the hole to be plugged (thru the eventual completion of the
I/O on the timeout path followed by abort or retry).
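
To illustrate the proposal, here is a minimal sketch of initiator-side
bookkeeping under this alternate model. The class and method names are
hypothetical, sequence-number wraparound is ignored, and the sketch only
tracks which StatSNs would be individually acknowledged; it does not
build any PDU:

```python
class StatusTracker:
    """Hypothetical initiator-side StatSN bookkeeping."""

    def __init__(self, exp_stat_sn=1):
        self.exp_stat_sn = exp_stat_sn   # bulk ack: next StatSN expected
        self.beyond_hole = set()         # statuses received past a hole

    def receive(self, stat_sn):
        """Record a status PDU; return the StatSNs to SACK individually."""
        if stat_sn < self.exp_stat_sn or stat_sn in self.beyond_hole:
            return []                    # duplicate, nothing to do
        if stat_sn == self.exp_stat_sn:
            self._advance()              # in order: bulk ack covers it
            return []
        self.beyond_hole.add(stat_sn)    # a hole exists: SACK kicks in
        return [stat_sn]

    def plug_hole(self):
        """The missing status's command completed on the timeout path
        (abort or retry), so the initiator treats the hole as plugged."""
        if self.beyond_hole:
            self._advance()

    def _advance(self):
        self.exp_stat_sn += 1
        while self.exp_stat_sn in self.beyond_hole:
            self.beyond_hole.remove(self.exp_stat_sn)
            self.exp_stat_sn += 1
```

Once the hole is plugged, ExpStatSN jumps past every status that was
individually acknowledged and the initiator is back on the bulk ack path.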

- Santosh





From owner-ips@ece.cmu.edu  Sun Apr  8 15:43:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA12032
	for <ips-archive@odin.ietf.org>; Sun, 8 Apr 2001 15:43:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f38HIQb09355
	for ips-outgoing; Sun, 8 Apr 2001 13:18:26 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c007.snv.cp.net (c007-h000.c007.snv.cp.net [209.228.33.206])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f38HHKr09282
	for <ips@ece.cmu.edu>; Sun, 8 Apr 2001 13:17:20 -0400 (EDT)
Received: (cpmta 29335 invoked from network); 8 Apr 2001 10:12:15 -0700
Received: from unknown (HELO ljoy) (64.130.130.105)
  by smtp.telocity.com (209.228.33.206) with SMTP; 8 Apr 2001 10:12:15 -0700
X-Sent: 8 Apr 2001 17:12:15 GMT
From: "Douglas Otis" <dotis@sanlight.net>
To: <julian_satran@il.ibm.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Sun, 8 Apr 2001 10:10:45 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJAEBGCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <C1256A28.00424221.00@d12mta02.de.ibm.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,

If you do not reject commands pending within the sequencer, then you will
need to log these out of sequence events to ensure the command taken out of
sequence will not cause a hole later.  You could allow an exception for
window size but if you are attempting to issue many of these urgent
commands, then a single exception does not provide much flexibility.

Should you reject all prior commands within the sequencer, there is no issue
regarding the logging of these out of sequence commands.  Should there be a
sequence of urgent commands, flagging the first such command forces the
window to be cleared to allow these urgent commands their immediate
deployment.

This rejection technique also has the advantage of deleting command holes
without timeouts should there be a problem with a connection.  This
technique supplemented with SNACK appears to provide a rather immediate
recovery again without reliance on the SCSI layer.

As there should already be code ready to handle rejected commands, using
this mechanism does not add code and allows the transport to use this very
simple mechanism.  This also does not depend on the target within the SCSI
layer to handle these out of sequence events so there is less likelihood of
needing the SCSI layer tailored to use iSCSI.

Placing this immediate function into a flag ensures that such a command is
acknowledged: eventually if rejection is not used, and immediately if
commands are rejected.  The flag also provides the vital information of
command placement, which prevents such a command from applying after
subsequent commands; without this flag there is no assurance of when such
a command is received.

Doug



> The main reason for selecting 0 and not a flag for immediate delivery was
> to enable immediate delivery even when the command window is closed.
>
> However we can achieve the same effect by using an immediate flag and
> using the current CmdSN without advancing it.  With this we get a
> reference to where in the stream the command was supposed to act if the
> stream order is important. All commands that reached the target having
> CmdSN less than the immediate CmdSN were sent before our command and all
> those with a number equal or higher were sent after. Immediate commands
> are the only ones that can have a CmdSN higher (by 1) than the allowed
> window.
>
> Julo
>
>
>



From owner-ips@ece.cmu.edu  Sun Apr  8 18:48:43 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA12874
	for <ips-archive@odin.ietf.org>; Sun, 8 Apr 2001 18:48:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f38KnJp19994
	for ips-outgoing; Sun, 8 Apr 2001 16:49:19 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c007.snv.cp.net (c007-h014.c007.snv.cp.net [209.228.33.221])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f38Kj8r19772
	for <ips@ece.cmu.edu>; Sun, 8 Apr 2001 16:45:08 -0400 (EDT)
Received: (cpmta 13862 invoked from network); 8 Apr 2001 13:42:26 -0700
Received: from dsl-64-130-130-105.telocity.com (HELO ljoy) (64.130.130.105)
  by smtp.telocity.com (209.228.33.221) with SMTP; 8 Apr 2001 13:42:26 -0700
X-Sent: 8 Apr 2001 20:42:26 GMT
From: "Douglas Otis" <dotis@sanlight.net>
To: "Douglas Otis" <dotis@sanlight.net>, <julian_satran@il.ibm.com>,
        <ips@ece.cmu.edu>
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Sun, 8 Apr 2001 13:40:55 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJKEBGCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <NEBBJGDMMLHHCIKHGBEJAEBGCGAA.dotis@sanlight.net>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Following on,

As an overview, there is a change allowing commands trapped within the
sequencer to be rejected for replay at a new sequence.  There would need to
be a means of re-associating the PDU with a new CmdSN, or of using the X-bit
to reissue.  Sequence numbers should be invalidated only for commands
rejected due to an immediate command bypass, and the X-bit disallowed for
them.  To replay, these sequence associations would require reassignment.

The advantages would be:
  1) No need to score-board any sequencer bypass.
  2) Immediate opening of the window for urgent commands.
  3) Quick repairs for a connection problem.
  4) Immediate acknowledgement for all executed commands.
  5) No need to trap iSCSI errors within SCSI layer.

You would want to invalidate sequences only for commands rejected due to an
immediate command placement.  A PDU rejected due to a digest error appears
to invalidate the sequence within the iSCSI proposal.  What is the abort
task action: retire the SCSI tag?  Is the target making assurances of a
free tag?  Targets detecting data digest errors may abort tasks by issuing
a Reject PDU after waiting for all data if they do not support R2T.  Is a
SCSI layer abort the only method of control, and should this be an option?

Notes:
Ver 5 Page 80:

"6.2 Digest Errors

   When a target receives an iSCSI PDU with a header digest error or a
   payload digest error in an iSCSI PDU, it MUST answer with a Reject
   iSCSI PDU with a Reason-code of Header-Digest-error or Data-Digest-
   Error and discard the offending PDU.  If the error is a Data-Digest-
   Error in a Data-PDU, the target MUST either request retransmission
   with a R2T or answer with a Reject iSCSI PDU and abort the task.
   However if the error is detected while data from the initiator is
   still expected (the command PDU did not contain all the data and the
   target has not received a Data PDU with the final bit Set) the target
   MUST wait until it receives a Data PDU with the F bit set before
   sending the Reject PDU.

   When an initiator receives an iSCSI PDU with a header digest error,
   it MUST discard it.  When an initiator receives any iSCSI PDU other
   than a data PDU, with a Data-Digest-Error, and this PDU is part of a
   task (has an Initiator Task Tag set), it MUST discard the PDU. It MAY
   restart the task (reissue the command with the same Initiator Task
   Tag and the X-bit set to 1).  If the reissued command is a SCSI
   command and it implies Read Data (Expected Data Length is not 0), the
   reissued command also includes the sequence number of the Next Data
   Packet expected by the initiator (0 if there was no data packet yet).

   When an initiator receives an iSCSI data PDU with a Data-Digest
   error, it must discard the PDU and it MUST either request the missing
   data PDUs through SACK or abort the task and terminate the command
   with an error.

6.3 Sequence Errors

   When an initiator receives an iSCSI data PDU with an out-of-order
   DataSN or a SCSI command response PDU with an EndDataSN implying
   missing data PDUs it MAY request the missing data PDUs through a data
   SACK PDU or handle this case as a connection failure.  In its turn,
   the target MUST either reject the SACK with a Reject PDU with a
   reason-code of Data-SACK-Reject or resend the data PDU.

   When an initiator receives an iSCSI status PDU with an out-of-order
   StatSN implying missing responses, it MUST either request the missing
   response PDUs through a status SACK or handle this case as a
   connection failure.  The target MUST reissue the missing responses.
   As a side effect of receiving the missing responses, the initiator
   may discover missing data PDUs. The initiator MUST NOT acknowledge
   (either explicitly through ExpStatRN or implicitly through a status
   SACK) the received responses until it has completed receiving all the
   data PDUs of a SCSI command."

Page 83:

"6.7.1.1 Recovery Within-connection

   At the initiator, the following cases lend themselves to within-
   connection recovery:

      (1)Lost iSCSI numbered Response recognized by either receiving
      it with a data digest error or receiving a Response PDU with a
      higher StatSN than expected. The initiator MAY request the
      missing responses through SACK, in which case the target MUST
      reissue them.
      (2)Requests not acknowledged for a long time. Requests are
      acknowledged explicitly through ExpCmdSN or implicitly by
      receiving data and/or status. The initiator MAY reissue non-
      acknowledged commands. The reissued, non-acknowledged commands
      MUST carry their original CmdSN and the X (retry) flag set to
      1.  Note that this is the only case in which the reissued
      command carries the same CmdSN as the "original".
      N.B. While the original connection for a command is still
      "active" (i.e., has not been logged-out or restarted), any
      command MUST be retried only on the original connection. After
      logging out the original connection, commands can be retried on
      a different connection, but must still carry the original
      CmdSN."

Doug



From owner-ips@ece.cmu.edu  Sun Apr  8 19:46:52 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA13041
	for <ips-archive@odin.ietf.org>; Sun, 8 Apr 2001 19:46:51 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f38KRI718863
	for ips-outgoing; Sun, 8 Apr 2001 16:27:18 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f38KQlr18845
	for <ips@ece.cmu.edu>; Sun, 8 Apr 2001 16:26:48 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id WAA68762
	for <ips@ece.cmu.edu>; Sun, 8 Apr 2001 22:26:40 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id WAA228528
	for <ips@ece.cmu.edu>; Sun, 8 Apr 2001 22:24:40 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A28.007000C7 ; Sun, 8 Apr 2001 22:23:22 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A28.00700095.00@d12mta02.de.ibm.com>
Date: Sun, 8 Apr 2001 22:27:12 +0200
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic
	 recovery
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Doug,

I think I need a translation in plain English.

Julo

"Douglas Otis" <dotis@sanlight.net> on 08/04/2001 19:10:45

Please respond to "Douglas Otis" <dotis@sanlight.net>

To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI:flow control, acknowledgement, and a deterministic
      recovery




Julian,

If you do not reject commands pending within the sequencer, then you will
need to log these out of sequence events to ensure the command taken out of
sequence will not cause a hole later.  You could allow an exception for
window size but if you are attempting to issue many of these urgent
commands, then a single exception does not provide much flexibility.

Should you reject all prior commands within the sequencer, there is no issue
regarding the logging of these out of sequence commands.  Should there be a
sequence of urgent commands, by flagging the first such command forces the
window to be cleared to allow these urgent commands their immediate
deployment.

This rejection technique also has the advantage of deleting command holes
without timeouts should there be a problem with a connection.  This
technique supplemented with SNACK appears to provide a rather immediate
recovery again without reliance on the SCSI layer.

As there should already be code ready to handle rejected commands, using
this mechanism does not add code and allows the transport to use this very
simple mechanism.  This also does not depend on the target within the SCSI
layer to handle these out of sequence events so there is less likelihood of
needing the SCSI layer tailored to use iSCSI.

While placing this immediate function into a flag ensures that such a
command is acknowledged eventually if rejection is not used and immediately
if commands are rejected, this flag however provides the vital information
of command placement to prevent such a command from applying after
subsequent commands for without this flag there is no assurance when such a
command is received.

Doug



> The main reason for selecting 0 and not a flag for immediate delivery was
> to enable immediate delivery even when the command window is closed.
>
> However we can achieve the same effect by using an immediate flag and
> using the current CmdSN without advancing it.  With this we get a
> reference to where in the stream the command was supposed to act if the
> stream order is important. All commands that reached the target having
> CmdSN less than the immediate CmdSN were sent before our command and all
> those with a number equal or higher were sent after. Immediate commands
> are the only ones that can have a CmdSN higher (by 1) than the allowed
> window.
>
> Julo
>
>
>






From owner-ips@ece.cmu.edu  Sun Apr  8 22:24:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA14626
	for <ips-archive@odin.ietf.org>; Sun, 8 Apr 2001 22:24:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f390CPi00210
	for ips-outgoing; Sun, 8 Apr 2001 20:12:25 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c007.snv.cp.net (c007-h013.c007.snv.cp.net [209.228.33.220])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f390CBr00197
	for <ips@ece.cmu.edu>; Sun, 8 Apr 2001 20:12:11 -0400 (EDT)
Received: (cpmta 10662 invoked from network); 8 Apr 2001 17:12:00 -0700
Received: from dsl-64-130-130-105.telocity.com (HELO ljoy) (64.130.130.105)
  by smtp.telocity.com (209.228.33.220) with SMTP; 8 Apr 2001 17:12:00 -0700
X-Sent: 9 Apr 2001 00:12:00 GMT
From: "Douglas Otis" <dotis@sanlight.net>
To: <julian_satran@il.ibm.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Sun, 8 Apr 2001 17:10:29 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJKEBHCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <C1256A28.00700095.00@d12mta02.de.ibm.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,

I am attempting to provide a name for a function within iSCSI.  iSCSI
requires sequential ordering of commands with the exception of those
commands indicated with zero or null serialization.  This server function is
being described as a PDU sequencer.  If a CmdSN is assigned to a command
that bypasses the sequencer using a flag rather than a null sequence, then
the function of the sequencer will either need to create a list to ensure
these commands are not waited for at a later time or reject all pending
commands to move the valid CmdSN window up to this bypassing command.

The rejection scheme does modify some initiator behavior but provides
greater control during an error.  The major concept is invalidating a
sequence range.  The reject status should include a copy of the ExpCmdSN
at the time of the reject, the new ExpCmdSN, and bypass mode as the reason,
so that the list of rejected commands can be determined.  These sequences
are thus retired and, as such, a new CmdSN will need to be assigned to
reissue these rejected commands.  Limiting this bypass mode to a single
unacknowledged command is not desirable if more than one command is
required.
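
As a rough sketch of the initiator side of this idea: the function below
assumes that the retired range runs from the old ExpCmdSN up to, but not
including, the new ExpCmdSN, which is one reading of the paragraph above
rather than anything in the draft.  The function name and the dict of
outstanding commands are hypothetical, wraparound is ignored, and
next_cmd_sn is assumed to be larger than every outstanding CmdSN:

```python
def reissue_rejected(outstanding, old_exp_cmd_sn, new_exp_cmd_sn, next_cmd_sn):
    """Reassign CmdSNs for commands whose sequence numbers were retired.

    outstanding maps CmdSN -> command for unacknowledged commands.
    Returns the updated mapping and the next unused CmdSN.
    """
    updated = {}
    for sn in sorted(outstanding):
        if old_exp_cmd_sn <= sn < new_exp_cmd_sn:
            # Retired sequence number: the command needs a fresh CmdSN.
            updated[next_cmd_sn] = outstanding[sn]
            next_cmd_sn += 1
        else:
            # Outside the rejected range: keeps its original CmdSN.
            updated[sn] = outstanding[sn]
    return updated, next_cmd_sn
```

For example, with commands outstanding at CmdSN 5, 6 and 7 and a reject
carrying old ExpCmdSN 5 and new ExpCmdSN 7, the first two are reissued as
8 and 9 while the third keeps CmdSN 7.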

The advantages of rejection would be:
  1) No need to list any sequencer bypasses.
  2) Immediate opening of the window for urgent commands.
  3) Quick repairs for a connection problem avoiding timeouts.
  4) Immediate acknowledgement for all commands sent to the target.
  5) No need to trap iSCSI errors within the SCSI layer.

The advantages of using a flag would be:
  1) No late inadvertent applications due to null serialization.
  2) Acknowledgement of this critical function.
  3) Flow control still desired for deploying critical commands.

Doug

> Doug,
>
> I think I need a translation in plain English.
>
> Julo



From owner-ips@ece.cmu.edu  Mon Apr  9 13:37:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA12750
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 13:37:17 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39FVkH26194
	for ips-outgoing; Mon, 9 Apr 2001 11:31:46 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39FUAr26030
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 11:30:10 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT3CXHM>; Mon, 9 Apr 2001 11:31:33 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080153EC@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: julian_satran@il.ibm.com, ips@ece.cmu.edu
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic reco
	very
Date: Mon, 9 Apr 2001 11:30:02 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

In lieu of the requested translation ...

> If you do not reject commands pending within the sequencer, then you will
> need to log these out of sequence events to ensure the command taken out of
> sequence will not cause a hole later.

Doug is assuming that the sequencer that issues commands runs off of CmdSN.
Needless to say, that's not necessary, as the sequencer could run off
whatever it likes (e.g., pointers in memory).  If the sequencer runs off
something internal instead of CmdSN, immediate commands don't cause "holes"
in its sequence because the CmdSN-based "sequence" compresses them out as
commands are released to the sequencer from the code that handles
CmdSN-related state.

As near as I can tell, the entire discussion of rejecting queued commands
when an immediate command shows up is based on this implementation-specific
assumption about not only the existence of a "sequencer" but also the use
of CmdSN to implement it.  While those are valid ways to design a system,
neither is necessary.  CmdSN tracking is only necessary in the portion of
the system that generates the ExpCmdSN and MaxCmdSN responses and holds
commands that have to wait for missing ones to show up.  An array-based
implementation rather than a queue-based sequencer can handle this without
the command rejection side-effects of Doug's envisioned sequencer, and
there are doubtless more clever ways to do it.
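
One possible shape for such an array-based implementation is sketched
below.  Everything here is illustrative: the class name, the fixed window
size, and the "released" list standing in for the internal execution
sequencer are assumptions, wraparound is ignored, and immediate commands
are passed straight through without any window check:

```python
class CmdSNTracker:
    """Hypothetical array-based holder for out-of-order commands."""

    def __init__(self, exp_cmd_sn=1, window=8):
        self.exp_cmd_sn = exp_cmd_sn
        self.window = window
        self.slots = [None] * window   # indexed by CmdSN modulo window
        self.released = []             # stand-in for the internal sequencer

    @property
    def max_cmd_sn(self):
        return self.exp_cmd_sn + self.window - 1

    def receive(self, cmd_sn, command, immediate=False):
        """Accept a command (assumed not None); return False if outside
        the window."""
        if immediate:
            # Bypasses CmdSN ordering and leaves no hole: the internal
            # sequencer never sees CmdSN at all.
            self.released.append(command)
            return True
        if not (self.exp_cmd_sn <= cmd_sn <= self.max_cmd_sn):
            return False
        self.slots[cmd_sn % self.window] = command
        # Release the longest in-order run starting at ExpCmdSN.
        while self.slots[self.exp_cmd_sn % self.window] is not None:
            idx = self.exp_cmd_sn % self.window
            self.released.append(self.slots[idx])
            self.slots[idx] = None
            self.exp_cmd_sn += 1
        return True
```

Commands that arrive early simply wait in their slot until the missing
ones show up, and nothing queued has to be rejected when an immediate
command arrives.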

Unless someone other than Doug wants to speak up on the importance
of a CmdSN-based sequencer implementation, I think discussion of that
needs to be dropped, and changes beyond the use of an immediate
flag instead of a zero CmdSN set aside as only being needed by this
specific implementation and not of general applicability.  This includes
Doug's latest message about commands being "rejected for replay at a
new sequence".

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------
 



From owner-ips@ece.cmu.edu  Mon Apr  9 13:37:27 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA12763
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 13:37:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39GPqe29587
	for ips-outgoing; Mon, 9 Apr 2001 12:25:52 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39GPXr29571
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 12:25:33 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 8BE3C94009
	for <ips@ece.cmu.edu>; Mon,  9 Apr 2001 12:25:32 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: SNACK and recovery 
In-Reply-To: Message from Black_David@emc.com 
   of "Thu, 05 Apr 2001 21:47:55 EDT." <0F31E5C394DAD311B60C00E029101A07080153CD@corpmx9.isus.emc.com> 
References: <0F31E5C394DAD311B60C00E029101A07080153CD@corpmx9.isus.emc.com> 
Date: Mon, 09 Apr 2001 12:24:09 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010409162532.8BE3C94009@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> - Does a 16-bit TCP checksum catch enough of
> the corruption events to make it acceptable to
> take drastic measures like aborting a backup
> when a 32 bit CRC fails on a response that
> made it through the 16 bit checksum?

Absolutely.

Events which create end-to-end integrity check errors are as handily
caught by TCP checksum as a CRC.  Link errors are caught by link
integrity checks, so that is not for the e2e check to protect.  The
remaining errors which are detectable by an e2e check have a signature
that most any check that's not blind stupid will detect.  For example,
back in the day, VMS's clustering software ran on Ethernet, and there
were many problems as a result of an early generation Ethernet
controller (my group...) corrupting data.  So, the VMS folks said, to
heck with performance, we're going to put a checksum on every cluster
packet.  Problem absolutely solved.  I don't know what the checksum
algorithm was, but it was not a CRC.  It was more like the TCP
checksum.
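
To make the comparison concrete, here is a minimal Python sketch (an
illustration only, not anything taken from the VMS or iSCSI work) of a
16-bit ones'-complement sum of the kind TCP uses, next to a 32-bit CRC.
The name internet_checksum and the sample payload are made up for this
example, and zlib.crc32 is simply the standard-library CRC-32, not
necessarily the polynomial iSCSI would choose.

```python
import zlib

def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement sum in the style of the TCP checksum."""
    if len(data) % 2:
        data += b"\x00"           # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:            # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

payload = b"iSCSI data block"                        # hypothetical sample data
flipped = bytes([payload[0] ^ 0x01]) + payload[1:]   # single-bit error
swapped = payload[2:4] + payload[0:2] + payload[4:]  # two 16-bit words exchanged

sum_detects_flip = internet_checksum(payload) != internet_checksum(flipped)
sum_detects_swap = internet_checksum(payload) != internet_checksum(swapped)
crc_detects_flip = zlib.crc32(payload) != zlib.crc32(flipped)
crc_detects_swap = zlib.crc32(payload) != zlib.crc32(swapped)
```

Both checks catch the bit flip; only the CRC notices the exchanged
words, because a plain sum does not depend on word order.  Whether
real corruption events look like the second case often enough to
matter is the open question.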

The TCP checksum escape evidence in the papers seems to be primarily in
paths which are not actually protected by it (host end points).

Looking at it from the other direction, backups have historically
always had to handle occasional problems, which has resulted in the
implementation of high-level recovery mechanisms.

Who can say with absolute certainty, and first-hand experience, that
there WILL be a high frequency of checksum escapes which don't also
escape a CRC?  It seems a somewhat unlikely scenario, and my concern
is that we're making complicated, incremental improvements for
handling a situation which will not occur.

It would be one thing if there were NO e2e check, or if the e2e check
also had to protect against link errors, or if the existing e2e check
were completely trivial, but that is just not the case here.

Steph


From owner-ips@ece.cmu.edu  Mon Apr  9 13:37:40 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA12774
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 13:37:38 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39G32p28237
	for ips-outgoing; Mon, 9 Apr 2001 12:03:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from lub1028.lss.emc.com ([168.159.39.28])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39G2Hr28202
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 12:02:17 -0400 (EDT)
Received: from emc.com (IDENT:jhall@localhost.localdomain [127.0.0.1])
	by lub1028.lss.emc.com (8.9.3/8.9.3) with ESMTP id MAA28626
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 12:02:10 -0400
Message-Id: <200104091602.MAA28626@lub1028.lss.emc.com>
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
Date: Mon, 09 Apr 2001 12:02:10 -0400
From: "Jon Hall" <jhall@emc.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

"John Hufferd" writes:
>OK, if you go at it long enough you are punished with my two cents.

:-)

>We of course need some real numbers on what the probability of the CRC
>detected error, when TCP does not detect it.
>
>Given the fact that we do not have that information,  I could only just use
>some of the numbers that have been kicking around on this thread.
>
>A Billion is NOT a large number, especially when we are talking about 10
>Gigabit Links ( vendors sampling 10 Gigabit/sec HBAs, next year, some
>shipping them in general availability (GA) in that year, and the rest in
>2003.  And yes I also got information of a company that is currently
>developing  100 Gigabit Links.)  So when I looked at some of the numbers, I
>found that it meant that a link would see a failure about every twenty
>minutes, some went for 200 minutes, etc.

This is exactly why it's necessary to understand the flow.  In a tape
context is it OK to assume that the data flowing from the target to
the initiator is responses to cmds, and that only a part of that is
iSCSI headers with StatSNs?  If that's right, then run the numbers
against the flow (don't use my numbers, they are riddled with guesses).
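
Running the numbers against the flow could look like the Python sketch
below.  Every input is a placeholder, not a measurement: the 1500-byte
packet size, the one-in-a-billion escape probability per packet, and
the 120 Mbit/s tape stream are all assumptions chosen only to show how
the result depends on the flow rate rather than on the link rate.

```python
def seconds_between_escapes(flow_bits_per_sec, packet_bytes, escape_probability):
    """Mean time between undetected errors for one flow, in seconds."""
    packets_per_sec = flow_bits_per_sec / (packet_bytes * 8)
    return 1.0 / (packets_per_sec * escape_probability)

# Assumed inputs (placeholders, see the note above).
PACKET_BYTES = 1500
ESCAPE_PROBABILITY = 1e-9

# A flow that saturates a 10 Gbit/s link.
saturated_link = seconds_between_escapes(10e9, PACKET_BYTES, ESCAPE_PROBABILITY)

# A single tape stream assumed to run at 120 Mbit/s regardless of link speed.
tape_stream = seconds_between_escapes(120e6, PACKET_BYTES, ESCAPE_PROBABILITY)
```

With these inputs the saturated link gives about 1200 seconds (twenty
minutes) between events, while the single tape stream gives about
100000 seconds (a bit over a day).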

>I have one war story that might apply.  Years ago when we were first
>thinking about putting the small disk drives in our large storage
>controllers, we had folks calculate the Mean Time to Failure (MTF), of the
>various Desktop HD.   Some individual MTF numbers sounded large, for any
>given drive.  But then we computed the number of drives we would have in a
>large installation,  it ended up that we would have a drive failure at
>least every day.  (Thankfully, we had significantly better MTF numbers in
>the drives that were actually used.)   So, the point is that sometimes
>these large numbers come back to bite you in ways you had not considered at
>first, when you think about it in a large installation.
>
>OK, back to the thread.
>
>Now I see sites all the time with 10s to 100s of Tape Units, and these
>units.  In many cases this will mean that there will be a tape unit failure
>that causes the critical  backup job to fail, somewhere on the computing
>room floor, about every 2, 20, or 200 min.  This is a major impact on a
>computing center that must process hundreds of backup each day.

Exactly, I've worked in this context (though it's been some years now).
It was true (at one time) that tape had a tractability limit, e.g.,
a tape backup of a terabyte was out of the question.  Has that changed?

>Therefore, those of you that think you are talking about  very rare events,
>should at least compute the 10 Gigabit/second Rates, and then the number of
>paths, etc. that might be in an enterprise installation, and then state how
>often a computing center will see such an event.  When many of these things
>are done at night with unattended operations, these can be a significant
>issue.  If it is probable that only one failure will occur per night, then
>you are certain that when a disaster does occur,  they will not have a
>valid backup over some amount of the data.
>
>OK, I am not saying who's right or wrong here, but just that some of the
>numbers I have heard, on this thread are not that impressive when looked at
>with a 10 Gigabit/sec links and many paths.  (Let alone the future 100
>Gigabit/sec links)  (Oh by the way, remember a 10 Gigabit link is really a
>20 Gigabit link when you factor in full duplex.)
>
>So it might be useful, for the Rare Event folks to do the calculations on
>their numbers and tell us what they mean in terms of Minutes between
>failures on 10 Gigabit links.  Then the rest of us can compute our own
>picture on how many links we will probably have in our installation.

But why does the fact that we may someday run at 10 gig change the
question?  Is there some reason to believe that at 10 gig the nature
of a tape flow has changed?  You could certainly have more flows, but
the number of packets per flow will not increase.  The speed of tape
access won't change.  You could do more tapes simultaneously, but
you still have the tractability of handling large numbers of tapes.

As an aside, is a "Rare Event" person, like a flat-earth person?
(I want to get my role right :-).

-Jon


From owner-ips@ece.cmu.edu  Mon Apr  9 13:38:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA12805
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 13:38:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39FCiB24942
	for ips-outgoing; Mon, 9 Apr 2001 11:12:44 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39FBur24904
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 11:11:56 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT3CVX1>; Mon, 9 Apr 2001 11:13:16 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080153EB@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: julian_satran@il.ibm.com, ips@ece.cmu.edu
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic reco
	 very
Date: Mon, 9 Apr 2001 11:11:44 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> The main reason for selecting 0 and not a flag for immediate delivery was
> to enable immediate delivery even when the command window is closed.
> 
> However we can achieve the same effect by using an immediate flag and
> using the current CmdSN without advancing it.  With this we get a
> reference to where in the stream the command where supposed to act if the
> stream order is important.

I think this causes resource management complications on the target
because reusing the CmdSN for an immediate command requires a new
command buffer, unless we allow the target to arbitrarily reject/abort some
command to make room, which seems wrong (the immediate command
may not be a task management command, and even then arbitrary
rejects/aborts may not be desirable). 

How is this better than requiring the Initiator to not permit the command
window to close (i.e., the Initiator always keeps one slot in the window
open for one immediate command, or N slots for N immediate commands,
or no slots if it wants to live dangerously)?  I think existing (SCSI and FC)
Initiators have to do this sort of resource management anyway, as I don't
believe immediate commands are exempt from TASK_SET_FULL.

There appears to be a "no free lunch" principle here in that some piece
of the iSCSI Initiator-Target system somewhere has to be cognizant
of the fact that resources for immediate commands are being 
reserved on the target, and Initiator control of the command window
(coupled with a requirement that MaxCmdSN never decrease) seems
to be the easiest way to get there. 
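
As a rough illustration of Initiator-side control, here is a minimal
Python sketch.  The class and method names (CommandWindow, try_send,
on_max_cmdsn) are hypothetical, every command is assumed to consume one
CmdSN, and serial-number wraparound is ignored for brevity.

```python
class CommandWindow:
    """Initiator view of the command window, holding back reserved slots."""

    def __init__(self, next_cmdsn, max_cmdsn, reserved=1):
        self.next_cmdsn = next_cmdsn   # CmdSN the next command will carry
        self.max_cmdsn = max_cmdsn     # highest CmdSN the target will accept
        self.reserved = reserved       # slots kept open for immediate commands

    def free_slots(self):
        return max(0, self.max_cmdsn - self.next_cmdsn + 1)

    def try_send(self, immediate=False):
        """Return the CmdSN used, or None if the command must wait."""
        needed = 1 if immediate else 1 + self.reserved
        if self.free_slots() < needed:
            return None
        cmdsn = self.next_cmdsn
        self.next_cmdsn += 1
        return cmdsn

    def on_max_cmdsn(self, value):
        # MaxCmdSN is required never to decrease, so stale values are ignored.
        self.max_cmdsn = max(self.max_cmdsn, value)


window = CommandWindow(next_cmdsn=1, max_cmdsn=4, reserved=1)
ordinary = [window.try_send() for _ in range(4)]    # fourth one is held back
urgent = window.try_send(immediate=True)            # uses the reserved slot
```

Setting reserved=0 is the "live dangerously" case: ordinary commands
may then close the window completely.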

--David
---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Mon Apr  9 13:39:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA12831
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 13:39:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39GPka29582
	for ips-outgoing; Mon, 9 Apr 2001 12:25:46 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39GPWr29570
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 12:25:32 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 64C5494006
	for <ips@ece.cmu.edu>; Mon,  9 Apr 2001 12:25:32 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI: frame formats 
In-Reply-To: Message from "Barry Reinhold" <bbrtrebia@mediaone.net> 
   of "Mon, 02 Apr 2001 14:59:29 EDT." <BJEIKPAFDFPFNCPPBCGPAEMKCEAA.bbrtrebia@mediaone.net> 
References: <BJEIKPAFDFPFNCPPBCGPAEMKCEAA.bbrtrebia@mediaone.net> 
Date: Mon, 09 Apr 2001 12:24:09 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010409162532.64C5494006@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> No one felt that a max data size of 16 megs in an iSCSI PDU was an
> issue

I did, I was just exhausted.  Furthermore, the size the PDU could
handle shrank (from 64M to 16M) right there before our ears in a split
second.

The `question' was also not asked in a form which I could effectively
answer or discuss.  There were two proposals presented, and they had
very different characteristics.  I figured I would refrain from
comment until I a) understood the proposals better b) got a sense of
which way we might be going.

I have a hard time swallowing a PDU that won't accommodate the data
from even a READ10 or WRITE10.
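
The arithmetic behind that, as a small Python sketch.  The 512-byte
block size and the reading of "16 megs" as 2**24 bytes are assumptions;
the 16-bit transfer length is the field defined for the READ(10) and
WRITE(10) CDBs.

```python
READ10_MAX_BLOCKS = 0xFFFF     # 16-bit TRANSFER LENGTH field in the CDB
BLOCK_BYTES = 512              # assumption: 512-byte logical blocks
PDU_LIMIT_BYTES = 2**24        # assumption: "16 megs" read as 2**24 bytes

read10_max_bytes = READ10_MAX_BLOCKS * BLOCK_BYTES
fits_in_one_pdu = read10_max_bytes <= PDU_LIMIT_BYTES
pdus_needed = -(-read10_max_bytes // PDU_LIMIT_BYTES)   # ceiling division
```

Under these assumptions a maximal READ(10) moves 33,553,920 bytes,
which needs two PDUs at the 16 MB limit; larger block sizes only widen
the gap.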

If people expect to tile each TCP segment with an iSCSI PDU, I think
that's a mistake.  The headers are too bulky for that.  If they think
they're going to gain something meaningful in hardware implementation
complexity (e.g. bounding the amount of reassembly buffering) by
assuming small multiples of segment size PDUs, I think that's a
mistake too.  I'd like to see a show of hands of people who have
actually implemented this approach in hardware.  Anybody?

If you (eventually) accept this, then you're back to the model that
there's a lower layer which is providing your data steering on a
per-segment basis, and your iSCSI PDU is the granularity at which the
iSCSI implementation delivers data to this layer.  It is clearly a
granularity at which software (transfer scheduling) occurs. 

At 10 Gbit/s, a 16 MB PDU is only 16 ms worth of data.
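
A quick Python check of that order of magnitude.  Reading "16 MB" as
16 MiB and treating 1 GB/s as the effective payload rate of a
10 Gbit/s link are both assumptions made for this sketch; the function
name is hypothetical.

```python
def transfer_time_ms(size_bytes, link_bits_per_sec):
    """Time to move size_bytes at the given bit rate, in milliseconds."""
    return size_bytes * 8 / link_bits_per_sec * 1000.0

PDU_BYTES = 16 * 2**20                              # assumption: 16 MiB
at_line_rate = transfer_time_ms(PDU_BYTES, 10e9)    # raw 10 Gbit/s
at_1_gbyte_s = transfer_time_ms(PDU_BYTES, 8e9)     # assumed 1 GB/s payload rate
```

The raw line rate gives roughly 13.4 ms and the 1 GB/s assumption
roughly 16.8 ms, so either way one maximal PDU is on the order of
fifteen milliseconds of link time.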

Steph


From owner-ips@ece.cmu.edu  Mon Apr  9 14:39:47 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA13905
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 14:39:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39G4jK28315
	for ips-outgoing; Mon, 9 Apr 2001 12:04:45 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39G43r28282
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 12:04:04 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id SAA272888
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 18:03:55 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id SAA74508
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 18:01:53 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A29.0057F03B ; Mon, 9 Apr 2001 18:00:31 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A29.0057F014.00@d12mta02.de.ibm.com>
Date: Mon, 9 Apr 2001 18:04:03 +0200
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic reco
		 very
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



David,

Resource "stress" is the same as for immediate marked with 0 - as immediate
commands are not controlled by a window in any case.

The "new scheme" might use a flag and the CmdSN field will carry the next
CmdSN (immediate commands will not cause the counters to advance - but 0
does not advance either).

The effect you get is that the target will have a "reference point" in the
ordered stream where it was supposed to act.  For some commands that could
be important, for others not.

Obviously the immediate commands get ordered with reference to
non-immediate commands but not between themselves if issued without
intervening ordered commands (a partial order).
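
A minimal Python sketch of the numbering rule, to show where the
partial order comes from.  The class name CmdSNAssigner is
hypothetical and wraparound is ignored; this is an illustration of the
idea, not draft text.

```python
class CmdSNAssigner:
    """Initiator-side numbering: immediate commands do not advance CmdSN."""

    def __init__(self, start=1):
        self.next_cmdsn = start

    def assign(self, immediate=False):
        cmdsn = self.next_cmdsn          # every command carries the next CmdSN
        if not immediate:
            self.next_cmdsn += 1         # only ordered commands advance it
        return cmdsn


assigner = CmdSNAssigner(start=10)
issued = [
    ("ordered",   assigner.assign()),
    ("immediate", assigner.assign(immediate=True)),
    ("immediate", assigner.assign(immediate=True)),
    ("ordered",   assigner.assign()),
]
```

Both immediate commands carry 11, so the target can place them after
command 10 and relative to command 11, but nothing in the numbering
orders them against each other.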

Obviously I will have to reformulate the "rules of engagement" for command
retry but I think it is worth doing.

And yes - there is no free lunch - I will have to put in some work -:)

Thanks for keeping the subject up long enough for me to get the feeling
that there might be a problem (and a simple solution) out there.

Regards,
Julo

Black_David@emc.com on 09/04/2001 17:11:44

Please respond to Black_David@emc.com

To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI:flow control, acknowledgement, and a deterministic reco
       very




> The main reason for selecting 0 and not a flag for immediate delivery was
> to enable immediate delivery even when the command window is closed.
>
> However we can achieve the same effect by using an immediate flag and
> using the current CmdSN without advancing it.  With this we get a
> reference to where in the stream the command where supposed to act if the
> stream order is important.

I think this causes resource management complications on the target
because reusing the CmdSN for an immediate command requires a new
command buffer, unless we allow the target to arbitrarily reject/abort some
command to make room, which seems wrong (the immediate command
may not be a task management command, and even then arbitrary
rejects/aborts may not be desirable).

How is this better than requiring the Initiator to not permit the command
window to close (i.e., the Initiator always keeps one slot in the window
open for one immediate command, or N slots for N immediate commands,
or no slots if it wants to live dangerously)?  I think existing (SCSI and FC)
Initiators have to do this sort of resource management anyway, as I don't
believe immediate commands are exempt from TASK_SET_FULL.

There appears to be a "no free lunch" principle here in that some piece
of the iSCSI Initiator-Target system somewhere has to be cognizant
of the fact that resources for immediate commands are being
reserved on the target, and Initiator control of the command window
(coupled with a requirement that MaxCmdSN never decrease) seems
to be the easiest way to get there.

--David
---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------






From owner-ips@ece.cmu.edu  Mon Apr  9 16:00:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA15469
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 16:00:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39HSpX03626
	for ips-outgoing; Mon, 9 Apr 2001 13:28:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dirty.research.bell-labs.com (dirty.research.bell-labs.com [204.178.16.6])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f39HS6r03598
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 13:28:06 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by dirty; Mon Apr  9 13:27:48 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Mon Apr  9 13:27:48 EDT 2001
Received: (from sandeepj@localhost)
	by aura.research.bell-labs.com (8.9.1/8.9.1) id NAA27077
	for ips@ece.cmu.edu; Mon, 9 Apr 2001 13:27:46 -0400 (EDT)
Date: Mon, 9 Apr 2001 13:27:46 -0400 (EDT)
Message-Id: <200104091727.NAA27077@aura.research.bell-labs.com>
From: sandeepj@research.bell-labs.com (Sandeep Joshi)
To: ips@ece.cmu.edu
Subject: iSCSI: session login and ISID
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

There seems to be a problem in distinguishing session logins, using 
only the ISID field in the Login Command.   It is possible that 
different initiators could try to start a session using the same ISID 
value.   One of those attempts will get rejected, since the ISID is 
the sole key to find if a session already exists.  (Note: the TSID is
sent as zero for the leading connection of a session.)

The initiator WWUI does not seem to be available at this time.
a) Appendix D.10 states that InitiatorWWUI is optional and defaults
   to iSCSI.
b) Section 2.10.9 on Login Command states that "initiator MAY provide 
   some basic parameters".

On the other hand, Section 1.2.7 states that "the initiator MUST
present both its initiator WWUI and target WWUI to which it wishes
to connect during the login phase".

The WWUI is also needed if we are to support multiple I_T nexuses 
between the same initiator and target.  

So it seems like Section 1.2.7 has the right spec.   Appendix D and 
Section 2.10.9 must then be corrected.  The descriptions in Sec 4.1 
and Section 1.2.3 may also need to be changed to reflect the fact 
that initiator WWUI must be supplied at session login.
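
The collision can be sketched in a few lines of Python.  The function
names, the dictionary-based session table, and the WWUI strings are all
hypothetical; the point is only the choice of lookup key on the target.

```python
def login_keyed_by_isid(sessions, initiator_wwui, isid):
    """Target lookup using the ISID alone; returns True if login is accepted."""
    if isid in sessions:
        return False                      # looks like an existing session
    sessions[isid] = initiator_wwui
    return True

def login_keyed_by_wwui_and_isid(sessions, initiator_wwui, isid):
    """Target lookup using (initiator WWUI, ISID) as the key."""
    key = (initiator_wwui, isid)
    if key in sessions:
        return False
    sessions[key] = initiator_wwui
    return True


isid_only = {}
a_isid_only = login_keyed_by_isid(isid_only, "initiator-a", 7)
b_isid_only = login_keyed_by_isid(isid_only, "initiator-b", 7)   # rejected

combined = {}
a_combined = login_keyed_by_wwui_and_isid(combined, "initiator-a", 7)
b_combined = login_keyed_by_wwui_and_isid(combined, "initiator-b", 7)
```

With the ISID as the only key, the second initiator is turned away
even though it has no session; with the WWUI in the key, both logins
succeed.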

-Sandeep



From owner-ips@ece.cmu.edu  Mon Apr  9 16:00:48 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA15480
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 16:00:47 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39I8mV06238
	for ips-outgoing; Mon, 9 Apr 2001 14:08:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39I8fr06230
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 14:08:41 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f39JGj002416;
	Mon, 9 Apr 2001 12:16:46 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>, <julian_satran@il.ibm.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Mon, 9 Apr 2001 11:06:46 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJIEBKCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <0F31E5C394DAD311B60C00E029101A07080153EC@corpmx9.isus.emc.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

Ver 5 Page 11:

   "The iSCSI target layer MUST deliver the commands to the SCSI target
   layer in the order specified by CmdSN."

I am not making an assumption about there being a CmdSN sequence function.
It is a MUST within the proposal!  Yes, you can bridge holes within the CmdSN
sequence using pointers; that would be expected, and I described this as a
list.  This list must still be sorted between sequential and immediate
commands, especially when multiple connections are used, to determine the
next in sequence.  You would not want the creation of these lists to block
until the next in sequence is seen, as you seem to imply.  You view this as a
pointer list; again, I agree.  The use of rejection still greatly simplifies
this process.  You are wrong about the use of CmdSN being limited to just
the flow control window.
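
For reference, in-order delivery with holes can be sketched as follows
in Python.  This is one possible structure under stated assumptions,
not the behaviour the draft mandates: the class name CmdSNSequencer is
hypothetical, immediate commands are assumed to bypass the ordering
entirely, and wraparound and window checks are left out.

```python
class CmdSNSequencer:
    """Hold out-of-order commands and release them in CmdSN order."""

    def __init__(self, first_cmdsn=1):
        self.exp_cmdsn = first_cmdsn   # next CmdSN owed to the SCSI layer
        self.pending = {}              # commands waiting behind a hole
        self.delivered = []            # stands in for the SCSI target layer

    def receive(self, cmdsn, command, immediate=False):
        if immediate:
            # Assumption: immediate commands are delivered at once.
            self.delivered.append(command)
            return
        self.pending[cmdsn] = command
        # Release every command that is now contiguous with exp_cmdsn.
        while self.exp_cmdsn in self.pending:
            self.delivered.append(self.pending.pop(self.exp_cmdsn))
            self.exp_cmdsn += 1


seq = CmdSNSequencer(first_cmdsn=1)
seq.receive(2, "B")                          # arrives early, held
seq.receive(3, "C")                          # held behind the same hole
seq.receive(1, "ABORT", immediate=True)      # bypasses the ordering
seq.receive(1, "A")                          # fills the hole, releases A, B, C
```

Nothing here blocks while a hole is open; the open questions are how
many commands may sit in pending and what happens to them when an
immediate command arrives.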

To not support rejects, you are then relying on the target to handle these
now out-of-sequence commands.  In addition, spare overhead must be
allotted to handle this undetermined number of "immediate" commands.  You
will then also not see any immediate acknowledgement of these "immediate"
commands sent to the target, unlike ALL other such commands.

Can you answer how many "immediate" commands are allowed?  When would you
expect these "immediate" commands to be acknowledged to allow more such
commands?

Doug

> In lieu of the requested translation ...
>
> > If you do not reject commands pending within the sequencer,
> then you will
> > need to log these out of sequence events to ensure the command taken out
> of
> > sequence will not cause a hole later.
>
> Doug is assuming that the sequencer that issues commands runs off
> of CmdSN.
> Needless to say, that's not necessary, as the sequencer could run off
> whatever
> it likes (e.g., pointers in memory).  If the sequencer runs off something
> internal
> instead of CmdSN, immediate commands don't cause "holes" in its sequence
> because the CmdSN-based "sequence" compresses them out as commands
> are released to the sequencer from the code that handles CmdSN-related
> state.
>
> As near as I can tell, the entire discussion of rejecting queued commands
> when an immediate command shows up is based
> on this implementation-specific assumption about not only the existence of
> a "sequencer" but also the use of CmdSN to implement it.  While those are
> valid ways to design a system, neither are necessary.  CmdSN tracking is
> only necessary in the portion of the system that generates the ExpCmdSN
> and MaxCmdSN responses and holds commands that have to wait for missing
> ones to show up.  An array-based implementation rather than a queue-based
> sequencer can handle this without the command rejection side-effects of
> Doug's envisioned sequencer, and there are doubtless more clever
> ways to do
> it.
>
> Unless someone other than Doug wants to speak up on the importance
> of a CmdSN-based sequencer implementation, I think discussion of that
> needs to be dropped, and changes beyond the use of an immediate
> flag instead of a zero CmdSN set aside as only being needed by this
> specific implementation and not of general applicability.  This includes
> Doug's latest message about commands being "rejected for replay at a
> new sequence".
>
> Thanks,
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
>
>



From owner-ips@ece.cmu.edu  Mon Apr  9 17:01:20 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA16988
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 17:01:15 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39IwvH09453
	for ips-outgoing; Mon, 9 Apr 2001 14:58:57 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39Iwhr09439
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 14:58:43 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 6696094009
	for <ips@ece.cmu.edu>; Mon,  9 Apr 2001 14:58:43 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
In-Reply-To: Message from "Jon Hall" <jhall@emc.com> 
   of "Mon, 09 Apr 2001 12:02:10 EDT." <200104091602.MAA28626@lub1028.lss.emc.com> 
References: <200104091602.MAA28626@lub1028.lss.emc.com> 
Date: Mon, 09 Apr 2001 14:57:19 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010409185843.6696094009@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> Exactly, I've worked in this context (though its been some years now).
> It was true (at one time) that tape had a tractability limit, e.g.,
> a tape backup of a terabyte was out of the question.  Has that changed?

I think this is precisely the point.  Existing, off-the-shelf SCSI
solutions DO NOT presently solve this problem.  Both ||SCSI and FCP
burp the operation at an expectable, O(days) failure rate.  The rate of
adoption for the FCP-2 command recovery feature is overwhelming to the
point that the tape guys have been talking about end-running the
problem with explicitly addressed commands.

What we have running iSCSI on TCP is such a drastic improvement in
what you can expect from your SCSI service that we can eventually
expect a disruptive change.  Trying to engineer it to the point where
it's 2^100 times more disruptive, when we don't really know where it's
taking us in the first place is meaningless.

[Warning: repetition ahead]

TCP + link layer error detection is engineered precisely to ensure
reliable data delivery.  It's clear from an engineering standpoint
that it is likely (not guaranteed, what is?) to do this quite well.
In spite of much research, it seems like nobody here has come up with
a strong indication that TCP + link layer error detection does NOT do
its job well.  I do not think this is because nobody has ever looked
at the problem.

The lack of concrete information to support the case that TCP + link
layer error detection is inadequate has us chasing our tails.

Given the layer iSCSI occupies in the protocol layer cake, if we don't
try to solve a problem which is presently assigned to a lower layer, it seems
quite comfortable to shim additional checks or recovery, or even a completely
different transport substrate underneath if we do discover TCP + link
layer error detection is not doing the trick, but it really seems like
folly to engineer based upon an assumption that nobody has done a good
job documenting.

Steph



From owner-ips@ece.cmu.edu  Mon Apr  9 17:01:45 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA17001
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 17:01:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39Ix6309465
	for ips-outgoing; Mon, 9 Apr 2001 14:59:06 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39Iw7r09411
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 14:58:08 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f39K6P002455;
	Mon, 9 Apr 2001 13:06:25 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <julian_satran@il.ibm.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Mon, 9 Apr 2001 11:56:25 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJGEBLCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <C1256A29.0057F014.00@d12mta02.de.ibm.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,

How many such "immediate" commands do you expect to support?  Would you need
to see the status returned from the first before sending the next?  By
copying the CmdSN of the previous non-immediate command, you forgo
acknowledgement.  Although this does relieve the sequencer of the problem of
patching holes, it comes at the expense of no longer having any flow control
or acknowledgement.  If this interface supported just a single target, then
a single such immediate command may be appropriate.  As this interface
supports many such targets, rejection of commands pending within the
sequencer seems far more appropriate in allowing the needed latitude in
error conditions.  Your scheme seems to expect a methodical technique of
sending a command, waiting for status, sending the next, etc.  If the problem
were indeed urgent, then rejecting these bypassed commands would be far more
responsive, provide acknowledgement and flow control, not depend on the target
model handling such out-of-sequence commands, and not require any spare
resources to be maintained.  To me, this subject seems to merit discussion
if only to modify the zero sequence to a copy of the previous command CmdSN
with a flag added.

Doug

> David,
>
> Resource "stress" is the same as for immediate marked with 0 - as
> immediate
> commands are not
> controlled by a window in any case.
>
> The "new scheme" might use a flag and the CmdSN field will carry the next
> CmdSN (immediate commands will not cause the counters to advance - but 0
> does not advance either).
>
> The effect you get is that the target will have a "reference point" in the
> ordered stream where it was supposed to act.  For some commands that could
> be important, for others not.
>
> Obviously the immediate commands get ordered with reference to
> non-immediate commands but not between themselves if issued without
> intervening ordered commands (a partial order).
>
> Obviously I will have to reformulate the "rules of engagement" for command
> retry but I think it worth doing.
>
> And yes - there is no free lunch - I will have to put in some work -:)
>
> Thanks for keeping the subject up long enough for me to get the feeling
> that there might be a problem (and a simple solution) out there.
>
> Regards,
> Julo
>
> Black_David@emc.com on 09/04/2001 17:11:44
>
> Please respond to Black_David@emc.com
>
> To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> cc:
> Subject:  RE: iSCSI:flow control, acknowledgement, and a deterministic
>        recovery
>
>
>
>
> > The main reason for selecting 0 and not a flag for immediate delivery was
> > to enable immediate delivery even when the command window is closed.
> >
> > However we can achieve the same effect by using an immediate flag and
> > using the current CmdSN without advancing it.  With this we get a
> > reference to where in the stream the command was supposed to act if the
> > stream order is important.
>
> I think this causes resource management complications on the target
> because reusing the CmdSN for an immediate command requires a new
> command buffer, unless we allow the target to arbitrarily reject/abort
> some command to make room, which seems wrong (the immediate command
> may not be a task management command, and even then arbitrary
> rejects/aborts may not be desirable).
>
> How is this better than requiring the Initiator to not permit the command
> window to close (i.e., the Initiator always keeps one slot in the window
> open for one immediate command, or N slots for N immediate commands,
> or no slots if it wants to live dangerously)?  I think existing (SCSI and
> FC) Initiators have to do this sort of resource management anyway, as I don't
> believe immediate commands are exempt from TASK_SET_FULL.
>
> There appears to be a "no free lunch" principle here in that some piece
> of the iSCSI Initiator-Target system somewhere has to be cognizant
> of the fact that resources for immediate commands are being
> reserved on the target, and Initiator control of the command window
> (coupled with a requirement that MaxCmdSN never decrease) seems
> to be the easiest way to get there.
>
> --David
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
>
>
>
>



From owner-ips@ece.cmu.edu  Mon Apr  9 17:07:23 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA17107
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 17:07:22 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39J8nZ10117
	for ips-outgoing; Mon, 9 Apr 2001 15:08:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39J8Ur10102
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 15:08:30 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT3DHGD>; Mon, 9 Apr 2001 15:09:53 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080153F8@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net, ips@ece.cmu.edu
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic
	recovery
Date: Mon, 9 Apr 2001 15:08:18 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> Ver 5 Page 11:
> 
>    "The iSCSI target layer MUST deliver the commands to the SCSI target
>    layer in the order specified by CmdSN."
> 
> I am not making an assumption about there being a CmdSN sequence function.
> It is a MUST within the proposal!  Yes, you can bridge holes within CmdSN
> sequence using pointers and that would be expected and I described this as
> a list.  This list still must be sorted between sequential and immediate
> especially when multiple connections are used to determine a next in
> sequence.

Aha ... here's a problem.  The text Doug quotes can't apply to existing
immediate commands, because the current use of a zero CmdSN would require
them to be delivered before all other commands - what if the first immediate
command shows up after 3000 non-immediate commands?  If we adopt the proposal
to use regular CmdSNs for immediate commands, we'll need some text indicating
that the above MUST does NOT apply to immediate commands - i.e., iSCSI is
expected to deliver them to SCSI immediately after TCP delivers them to iSCSI
no matter what state the CmdSN window is in.  There is no need for rejection -
if worst comes to worst, a dummy command whose "deliver to SCSI"
operation is "do nothing" will keep Doug's sequencer happy even though
the actual command was delivered out of order.

> This list still must be sorted between sequential and immediate
> especially when multiple connections are used to determine a next in
> sequence.

That's one option.  Another is to have two lists so the immediate commands
don't get mixed with those for ordered delivery.  As I said, this is within
the realm of implementation choices.
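
As a rough illustration of the two-list option (a sketch only, not text from
the draft): the Python below keeps ordered commands in a holding area keyed by
CmdSN and hands immediate commands to SCSI as soon as they arrive.  The names
TargetSequencer, receive and delivered are made up for the example, it assumes
immediate commands do not advance ExpCmdSN, and it ignores CmdSN wraparound.

```python
class TargetSequencer:
    """Hypothetical target-side sequencer; immediate and ordered commands kept apart."""

    def __init__(self, first_cmd_sn=1):
        self.exp_cmd_sn = first_cmd_sn  # next CmdSN due for in-order delivery
        self.held = {}                  # ordered commands that arrived early, keyed by CmdSN
        self.delivered = []             # commands handed to SCSI, in hand-off order

    def receive(self, cmd_sn, name, immediate=False):
        if immediate:
            # Immediate: deliver at once; the ordered stream is left untouched.
            self.delivered.append(name)
            return
        self.held[cmd_sn] = name
        # Drain every held command that is now next in sequence.
        while self.exp_cmd_sn in self.held:
            self.delivered.append(self.held.pop(self.exp_cmd_sn))
            self.exp_cmd_sn += 1


seq = TargetSequencer()
seq.receive(2, "write-B")                 # early arrival, held until CmdSN 1 shows up
seq.receive(2, "abort", immediate=True)   # bypasses the ordered list
seq.receive(1, "write-A")                 # fills the hole; 1 and 2 drain in order
print(seq.delivered)                      # ['abort', 'write-A', 'write-B']
```

Nothing here blocks waiting for "next in sequence"; early arrivals simply sit
in the holding area until the hole is filled.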

> You would not want the creation of these lists to be blocking
> until the next in sequence is seen as you seem to imply.

I suspect Doug has misread what I wrote - most of the time, there's
no need to block for "next in sequence". OTOH, there is a "no free lunch"
principle in here that says Target resources are limited - if the Target is
temporarily out of resources, not much is going to happen until it gets
resources, and in that case, blocking is a reasonable first step.
Rejecting perfectly valid commands just to free resources doesn't seem
like the first thing that a Target should do (e.g., vs. waiting for SCSI to
finish something).

> You view this as a pointer list, again I agree.  The use of rejection
> still greatly simplifies this process.

I still believe that "greatly simplifies" is implementation-specific until
someone else speaks up in support of it.

> You are wrong about the use of CmdSN being limited to just
> the flow control window.

That's incorrect.  Once the command is ready for delivery to SCSI,
CmdSN is not needed to track it if there's some other way to do so.
"Deliver in CmdSN order" doesn't require that an implementation use
CmdSN to track that order, and the fact that immediate commands
put holes in this ordering may lead an implementation to use
something else.

> To not support rejects, you are then relying on the target to handle these
> now out of sequence commands.  

Immediate commands are "out of sequence" by definition - immediate
means deliver immediately without regard to CmdSN order.  Only the
target can do this, so I don't see any problem with relying on the Target
to do so.

>  In addition, that there be spare overhead allotted to handle these
> undetermined number of "immediate" commands.  

That's open to debate and depends on whether we require the CmdSN
to be within the window (as originally proposed) or allow one (more
than one?) beyond the window to be used (as Julian has proposed).

> You will then also not see any immediate acknowledgement of these
> "immediate" commands sent to the target unlike ALL other such commands.

In most of the cases, the TCP ACK will indicate that the bytes got there.
If there's an iSCSI CRC failure, then we're back to the CmdSN sequencing
to catch it and clean up, and that might not be "immediate" - but is that
going to happen often enough to be worth any additional effort (rejects,
immediate ACKs, something else?) to optimize for performance?  I'm sceptical,
especially if a Cmd SACK/SNACK is proposed to correct this.

> Can you answer how many "immediate" commands are allowed?

Initiator determines this by appropriate management of usage of the CmdSN
window.  If it keeps N slots open in the window, N immediate commands can be
sent immediately.

> When would you expect these "immediate" commands to be acknowledged to
> allow more such commands?

When the resources from executing an immediate command are no longer
needed on the Target, MaxCmdSN would advance to allow an additional command
in, even if ExpCmdSN has not moved.  The Initiator has to spend some of its
time dealing with keeping ExpCmdSN moving (i.e., resend the command
at ExpCmdSN) because Targets are unlikely to support arbitrarily large
windows.
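
A minimal sketch of that Initiator-side bookkeeping, under stated assumptions:
immediate commands take a CmdSN inside the window (the originally proposed
variant), MaxCmdSN never decreases, and sequence-number wraparound is ignored.
InitiatorWindow and its methods are hypothetical names for illustration, not
anything defined in the draft.

```python
class InitiatorWindow:
    """Hypothetical Initiator view of the CmdSN window with slots held back."""

    def __init__(self, exp_cmd_sn, max_cmd_sn, reserved_for_immediate):
        self.next_cmd_sn = exp_cmd_sn           # next CmdSN the Initiator will assign
        self.max_cmd_sn = max_cmd_sn            # highest CmdSN the Target will accept
        self.reserved = reserved_for_immediate  # N slots kept open for immediates

    def open_slots(self):
        return self.max_cmd_sn - self.next_cmd_sn + 1

    def try_send(self, immediate=False):
        # Ordinary commands must leave the reserved slots untouched.
        needed = 1 if immediate else 1 + self.reserved
        if self.open_slots() < needed:
            return None
        cmd_sn = self.next_cmd_sn
        self.next_cmd_sn += 1
        return cmd_sn

    def on_response(self, max_cmd_sn):
        # Assumption from the thread: MaxCmdSN is never allowed to decrease.
        self.max_cmd_sn = max(self.max_cmd_sn, max_cmd_sn)


win = InitiatorWindow(exp_cmd_sn=1, max_cmd_sn=4, reserved_for_immediate=1)
sent = [win.try_send() for _ in range(4)]   # [1, 2, 3, None]: last slot is held back
urgent = win.try_send(immediate=True)       # 4: uses the reserved slot
blocked = win.try_send(immediate=True)      # None: window is now closed
win.on_response(max_cmd_sn=5)               # Target frees resources, MaxCmdSN advances
again = win.try_send(immediate=True)        # 5
```

With reserved_for_immediate=0 the Initiator "lives dangerously": ordinary
commands may close the window and an immediate command then has to wait.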

--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Mon Apr  9 17:07:44 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA17118
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 17:07:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39J9mN10168
	for ips-outgoing; Mon, 9 Apr 2001 15:09:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39J9Ur10155
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 15:09:31 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT3DH2D>; Mon, 9 Apr 2001 15:10:54 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080153F9@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: jojy_michael@agilent.com, ips@ece.cmu.edu
Subject: RE: Multiple targets behind a single IP address
Date: Mon, 9 Apr 2001 15:09:22 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

No, it's in the Login message, and thereafter is
implicit in the Session to which the connection belongs.
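
In rough Python terms (an illustrative sketch; the table and function names
are hypothetical, and the WWUI strings are made up), the target identifier is
recorded once at Login and every later PDU is resolved through the session its
connection belongs to:

```python
sessions = {}      # session id -> target WWUI supplied in the Login message
connections = {}   # connection id -> session id


def login(conn_id, session_id, target_wwui):
    """Record the target at Login; later connections join the same session."""
    sessions.setdefault(session_id, target_wwui)
    connections[conn_id] = session_id


def target_for(conn_id):
    """Resolve the target for a PDU; nothing in the PDU header names it."""
    return sessions[connections[conn_id]]


login("tcp-1", "s1", "wwui-tape")
login("tcp-2", "s1", "wwui-tape")   # second connection in the same session
login("tcp-3", "s2", "wwui-disk")   # different target, same IP address and port
print(target_for("tcp-2"))          # wwui-tape
```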

--David

> -----Original Message-----
> From:	jojy_michael@agilent.com [SMTP:jojy_michael@agilent.com]
> Sent:	Monday, April 09, 2001 2:51 PM
> To:	ips@ece.cmu.edu
> Subject:	Multiple targets behind a single IP address
> 
> Section 1.2.7 states that WWUIs are used in iSCSI to provide a target
> identifier for configurations that present multiple targets behind a
> single IP address and port.
> 
> To provide this support I would expect the target WWUI to be in the BHS,
> but, it does not seem to be there. Is the support missing from the spec?
> 
> - Jojy


From owner-ips@ece.cmu.edu  Mon Apr  9 18:28:50 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA18197
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 18:28:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39KMua15022
	for ips-outgoing; Mon, 9 Apr 2001 16:22:56 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39KMWr14993
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 16:22:32 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f39LUi002519;
	Mon, 9 Apr 2001 14:30:44 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>
Cc: "Ips" <ips@ece.cmu.edu>
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Mon, 9 Apr 2001 13:20:45 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJGEBMCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <0F31E5C394DAD311B60C00E029101A07080153F8@corpmx9.isus.emc.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

David,

> > To not support rejects, you are then relying on the target to
> > handle these now out of sequence commands.
>
> Immediate commands are "out of sequence" by definition - immediate
> means deliver immediately without regard to CmdSN order.  Only the
> target can do this, so I don't see any problem with relying on the Target
> to do so.

The commands that are NOW out of sequence are those commands bypassed by the
immediate command.  If this immediate command was Rewind, then a long list
of write commands become invalid as a new tape is now needed for those
operations.  You are expecting the target to handle a long list of
invalid commands made possible by the iSCSI sequencer.  Commands invalidated
by immediate commands become out of sequence commands.
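
To illustrate the alternative I am arguing for (a sketch of the position only,
not agreed behaviour; handle_immediate and the command names are hypothetical):
when an immediate command arrives, commands still pending in the sequencer are
rejected back to the initiator rather than delivered after it.

```python
def handle_immediate(held, immediate_cmd):
    """Reject every command pending in the sequencer, then pass the immediate on.

    held maps CmdSN -> command still waiting for in-order delivery.
    Returns (command for SCSI, rejected commands in CmdSN order).
    """
    rejected = [held[sn] for sn in sorted(held)]
    held.clear()  # nothing bypassed by the immediate command stays trapped
    return immediate_cmd, rejected


pending = {7: "write-7", 9: "write-9", 8: "write-8"}
to_scsi, rejected = handle_immediate(pending, "rewind")
print(to_scsi, rejected)   # rewind ['write-7', 'write-8', 'write-9']
```

The rejects double as acknowledgement: the initiator learns exactly which
writes never reached the tape before the Rewind.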

I'll reserve further comment until Julian has made some progress in
addressing these concerns.  I also suspect it is a mistake to allow commands
to remain trapped within the sequencer during emergency or abnormal events
signified by the use of these immediate commands.

Forgive my advocacy, but if one does not attempt to support a position, then
the subject is not explored.

Doug



From owner-ips@ece.cmu.edu  Mon Apr  9 19:38:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA19336
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 19:38:20 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39IpnZ09014
	for ips-outgoing; Mon, 9 Apr 2001 14:51:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from msgbas1t.cos.agilent.com (msgbas1tx.cos.agilent.com [192.6.9.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39IpWr08980
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 14:51:32 -0400 (EDT)
Received: from msgrel1.cos.agilent.com (msgrel1.cos.agilent.com [130.29.152.77])
	by msgbas1t.cos.agilent.com (Postfix) with ESMTP id A9D0D603
	for <ips@ece.cmu.edu>; Mon,  9 Apr 2001 12:51:31 -0600 (MDT)
Received: from axcsbh1.cos.agilent.com (axcsbh1.cos.agilent.com [130.29.152.143])
	by msgrel1.cos.agilent.com (Postfix) with SMTP id 9B57B5E
	for <ips@ece.cmu.edu>; Mon,  9 Apr 2001 12:51:31 -0600 (MDT)
Received: from 130.29.152.143 by axcsbh1.cos.agilent.com (InterScan E-Mail VirusWall NT); Mon, 09 Apr 2001 12:51:31 -0600 (Mountain Daylight Time)
Received: by axcsbh1.cos.agilent.com with Internet Mail Service (5.5.2653.19)
	id <2PL3Y6CX>; Mon, 9 Apr 2001 12:51:31 -0600
Message-ID: <FEEBE78C8360D411ACFD00D0B7477971D01EE0@xsj02.sjs.agilent.com>
From: jojy_michael@agilent.com
To: ips@ece.cmu.edu
Subject: Multiple targets behind a single IP address
Date: Mon, 9 Apr 2001 12:51:29 -0600 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="ISO-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Section 1.2.7 states that WWUIs are used in iSCSI to provide a target
identifier for configurations that present multiple targets behind a single
IP address and port.

To provide this support I would expect the target WWUI to be in the BHS,
but, it does not seem to be there. Is the support missing from the spec?

- Jojy



From owner-ips@ece.cmu.edu  Mon Apr  9 19:52:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA19700
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 19:52:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39Lnv620359
	for ips-outgoing; Mon, 9 Apr 2001 17:49:57 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39LnAr20319
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 17:49:10 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id D6F6894006
	for <ips@ece.cmu.edu>; Mon,  9 Apr 2001 17:49:09 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
In-Reply-To: Message from julian_satran@il.ibm.com 
   of "Thu, 05 Apr 2001 11:27:47 +0200." <C1256A25.00513DFA.00@d12mta02.de.ibm.com> 
References: <C1256A25.00513DFA.00@d12mta02.de.ibm.com> 
Date: Mon, 09 Apr 2001 17:47:46 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010409214909.D6F6894006@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> But perhaps there is a place in the market for the kind of devices Somesh
> is suggesting that do all recovery at SCSI level (and that can't copy a
> terabyte of data without a session drop).

The market clearly demonstrates this, since that's the way all but a
vanishingly small portion of the devices behave today.

Steph


From owner-ips@ece.cmu.edu  Mon Apr  9 19:52:51 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA19711
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 19:52:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39Lqqw20557
	for ips-outgoing; Mon, 9 Apr 2001 17:52:52 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39LqNr20543
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 17:52:23 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f39N0a002590;
	Mon, 9 Apr 2001 16:00:37 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Stephen Bailey" <steph@cs.uchicago.edu>, <ips@ece.cmu.edu>
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
Date: Mon, 9 Apr 2001 14:50:37 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJKEBNCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <20010409185843.6696094009@sandmail.sandburst.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Steph,

I am aware of telco equipment that does not have any error checking between
interfaces.  I would not place a limit on capacity as a means of justifying
full reliance on TCP checksums.

Doug


> > Exactly, I've worked in this context (though it's been some years now).
> > It was true (at one time) that tape had a tractability limit, e.g.,
> > a tape backup of a terabyte was out of the question.  Has that changed?
>
> I think this is precisely the point.  Existing, off-the-shelf SCSI
> solutions DO NOT presently solve this problem.  Both ||SCSI and FCP
> burp the operation on an expectable, O(days) failure rate.  The rate of
> adoption for the FCP-2 command recovery feature is overwhelming to the
> point that the tape guys have been talking about end-running the
> problem with explicitly addressed commands.
>
> What we have running iSCSI on TCP is such a drastic improvement in
> what you can expect from your SCSI service that we can eventually
> expect a disruptive change.  Trying to engineer it to the point where
> it's 2^100 times more disruptive, when we don't really know where it's
> taking us in the first place is meaningless.
>
> [Warning: repetition ahead]
>
> TCP + link layer error detection is engineered precisely to ensure
> reliable data delivery.  It's clear from an engineering stand point
> that it is likely (not guaranteed, what is?) to do this quite well.
> In spite of much research, it seems like nobody here has come up with
> a strong indication that TCP + link layer error detection does NOT do
> its job well.  I do not think this is because nobody has ever looked
> at the problem.
>
> The lack of concrete information to support the case that TCP + link
> layer error detection is inadequate has us chasing our tails.
>
> Given the layer iSCSI occupies in the protocol layer cake, if we don't
> try to solve what is presently assigned to a lower layer, it seems
> quite comfortable to shim additional checks or recovery, or even a
> completely different transport substrate underneath if we do discover
> TCP + link layer error detection is not doing the trick, but it really
> seems like folly to engineer based upon an assumption that nobody has
> done a good job documenting.
>
> Steph
>
>



From owner-ips@ece.cmu.edu  Mon Apr  9 19:52:56 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA19728
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 19:52:55 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39MP0J22504
	for ips-outgoing; Mon, 9 Apr 2001 18:25:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39MNur22449
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 18:23:56 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 51DC694006
	for <ips@ece.cmu.edu>; Mon,  9 Apr 2001 18:23:56 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
In-Reply-To: Message from Venkat Rangan <venkat@rhapsodynetworks.com> 
   of "Mon, 09 Apr 2001 14:53:23 PDT." <15851BD69CFCD41186B100B0D0AABE650C1B49@med.corp.rhapsodynetworks.com> 
References: <15851BD69CFCD41186B100B0D0AABE650C1B49@med.corp.rhapsodynetworks.com> 
Date: Mon, 09 Apr 2001 18:22:32 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010409222356.51DC694006@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Venkat,

> Not to beat a dead horse, the reason link level CRCs may not be of
> much help is because of the following.

I understand.  My point is that when you start talking about `bit
error rates', you're usually in a path which is covered by link error
detection.

The packet permutations which the TCP checksum (and other e2e checks)
protect against are usually not meaningfully discussed in bit error
rate terms.  Not that CRC isn't a good (or even better) end-to-end
mechanism, it's just that what we have is checksum, and for these
types of errors, it's still pretty darned likely to detect them.
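
For anyone who wants to poke at the checksum-versus-CRC comparison, here is a
minimal Python sketch (an illustration, not a measurement).  internet_checksum
is a hypothetical helper implementing the 16-bit ones'-complement sum that TCP
uses; zlib.crc32 is the standard-library CRC-32, used only as a stand-in for
whatever CRC an iSCSI digest might adopt.  It shows one known blind spot of the
checksum (reordered 16-bit words) and says nothing about how often such
permutations occur in practice.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement sum of 16-bit words, as used by TCP."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = bytes.fromhex("123456789abc")
bit_flip = bytes.fromhex("123556789abc")     # one bit changed
swapped = bytes.fromhex("567812349abc")      # first two 16-bit words exchanged

# Both checks catch the single-bit error.
print(internet_checksum(original) != internet_checksum(bit_flip))   # True
print(zlib.crc32(original) != zlib.crc32(bit_flip))                 # True

# The sum is order-independent, so the word swap slips past the checksum
# but not past the CRC.
print(internet_checksum(original) == internet_checksum(swapped))    # True
print(zlib.crc32(original) != zlib.crc32(swapped))                  # True
```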

When I mentioned link error detection, I was explicitly trying to
factor out detecting that portion of the error behavior with an end to
end check.

> So, the escape rate depends quite a bit on number of middle boxes and the
> exposure of data paths. How much do we rely on middle boxes to never
> introduce an error during the exposure?

No.  However, any middlebox-proofing we do will be circumvented when the
middle box decides it wants to look in iSCSI as a L7 protocol.  I know
it sounds horrible, but there are zillions of companies doing this for
HTTP now.  The reason why middle boxes manipulate the data is because
that allows them to provide the desired behavior.  It seems like a
really natural product idea for some people (personally I don't get
it) to plumb HTTP and iSCSI (hey, why not CIFS, NFS, SIP and RTP while
we're at it :^) into one big, happy L7 router/switch type box.
Fundamentally, if the middle box is going to diddle in the payload,
there's not squat we can do.

An ultimate solution to this is to run secured end-to-end.  That will
keep those middleboxes from messing with the data :^) The ones that
want and need to will just break.  As long as the protocol is
in-the-clear (and without a security significant digest), there's a
lot less we can do.  The box vendors seem to operate on the `easier to
get forgiveness than permission' model.

I am somewhat ambivalent about CRC digests (I'd rather have end-to-end
security and kill all those birds with the same stone), but what I'm
really averse to is assuming that digest failures are frequent, and a
less than brute force (connection bounce) recovery mechanism is
required.

Steph


From owner-ips@ece.cmu.edu  Mon Apr  9 21:11:30 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA20624
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 21:11:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39Mxtm24650
	for ips-outgoing; Mon, 9 Apr 2001 18:59:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from philotas.hosting.pacbell.net (philotas.hosting.pacbell.net [216.100.99.24])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39LDgr18179
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 17:13:42 -0400 (EDT)
Received: from somesh (sdsl-216-36-75-164.dsl.sjc.megapath.net [216.36.75.164])
	by philotas.hosting.pacbell.net
	id RAA05703; Mon, 9 Apr 2001 17:13:38 -0400 (EDT)
	[ConcentricHost SMTP Relay 1.7]
Reply-To: <somesh_gupta@silverbacksystems.com>
From: "Somesh Gupta" <somesh_gupta@silverbacksystems.com>
To: "IPS" <ips@ece.cmu.edu>
Subject: Calling for Tape & Backup Application Experts
Date: Mon, 9 Apr 2001 14:08:01 -0700
Message-ID: <NMEALCLOIBCHBDHLCMIJOENDCCAA.somesh_gupta@silverbacksystems.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

For Tape and Backup App Experts,

If you have followed the running debates (subjects could be 
Snack & Recovery; or ERT), you have seen people making
(reasonably good in some cases) attempts at how backups
are done etc.

If you are out there on the list and can help with
creating a model of the backup app, that will help
resolve some of the issues.

Thanks,
Somesh


From owner-ips@ece.cmu.edu  Mon Apr  9 22:30:01 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA22395
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 22:29:56 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3A02tW28497
	for ips-outgoing; Mon, 9 Apr 2001 20:02:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3A01xr28455
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 20:01:59 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP id BC6A76CD
	for <ips@ece.cmu.edu>; Mon,  9 Apr 2001 17:01:24 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id RAA12699
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 17:01:10 -0700 (PDT)
Message-ID: <3AD24E8E.A79E8198@cup.hp.com>
Date: Mon, 09 Apr 2001 17:06:38 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <C1256A28.00424363.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------E7D3685563283921AD9ABEE1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------E7D3685563283921AD9ABEE1
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Julian & All,

Do we agree on the following requirements for SNACK :

a) iSCSI MUST NOT mandate either data or status S[N]ACK for intra-task
error recovery. Initiators MUST be allowed to perform command
granularity error recovery.

b) iSCSI MUST provide a mechanism by which targets can continue with I/O
resource release upon completion of an I/O. Such a mechanism may be
based on an explicit StatSN acknowledgement, (if the target supports
StatSN SNACK), or allow immediate resource clean-up upon I/O
completion. 

c) Such a mechanism MUST NOT block forward progress when holes occur in
StatSN sequence, due to format or digest errors encountered at the
initiator.

In order to meet the above requirements, "StatSN S[N]ACK" support can be
negotiated at login time and if StatSN SNACK is not supported by the
target, it MUST NOT use StatSN sequence numbering. (i.e. StatSN = 0). 

By not using StatSN numbering, the "holes in StatSN" problem does not
occur, thereby, meeting requirements (a) ,(b) & (c) for targets that do
not retain I/O state information.

For targets that do retain I/O state information, StatSN SNACK is turned
on along with StatSN numbering. 
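
A rough Python sketch of the two target behaviours, to make the proposal
concrete.  It is an illustration under assumptions, not spec text: the class
and method names are hypothetical, numbering is assumed to start at 1, and
ExpStatSN is taken to acknowledge every StatSN below it.

```python
class StatusNumbering:
    """Hypothetical target-side status numbering, chosen at login time."""

    def __init__(self, snack_supported):
        self.snack_supported = snack_supported
        self.next_stat_sn = 1   # assumed starting value
        self.retained = {}      # StatSN -> status kept for a possible SNACK resend

    def send_status(self, status):
        if not self.snack_supported:
            # No numbering: StatSN is always 0 and I/O state is released at once.
            return 0
        stat_sn = self.next_stat_sn
        self.next_stat_sn += 1
        self.retained[stat_sn] = status
        return stat_sn

    def on_exp_stat_sn(self, exp_stat_sn):
        # The initiator has seen everything below ExpStatSN; release that state.
        for stat_sn in [sn for sn in self.retained if sn < exp_stat_sn]:
            del self.retained[stat_sn]


stateless = StatusNumbering(snack_supported=False)
print(stateless.send_status("GOOD"), stateless.retained)   # 0 {}

stateful = StatusNumbering(snack_supported=True)
stateful.send_status("GOOD")             # StatSN 1, retained
stateful.send_status("CHECK CONDITION")  # StatSN 2, retained
stateful.on_exp_stat_sn(2)               # acknowledges StatSN 1 only
print(stateful.retained)                 # {2: 'CHECK CONDITION'}
```

With numbering off there is no sequence for a hole to appear in, which is how
requirement (c) is met for targets that keep no I/O state.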

- Santosh


julian_satran@il.ibm.com wrote:
> 
> Santosh,
> 
> I have to think about it a bit more.  The main ack mechanism is ExpStatSN
> (as I indicated in a previous note).  SNACK is meant to simplify recovering
> holes without having to reissue commands or wait for timeout. Rejecting
> SNACK with an "unsupported" indication but resending the status on command
> retry can hardly be considered a good alternative while rejecting SNACK
> with "no status to recover" has to be bubbled up to SCSI and that can be
> bad for tapes.
> I think that if you keep status until ack-ed, SNACK only makes life easier
> as it makes the recovery request explicit and specific - unlike the command
> retry that is vague and unspecific.
> 
> Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 06/04/2001 20:26:47
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   Julian Satran/Haifa/IBM@IBMIL
> cc:   santoshr@cup.hp.com, David Black <Black_David@emc.com>
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> 
> Julian,
> 
> I did not hear back on this and am re-sending in case you did not
> receive the same. Your comments would be appreciated.
> 
> (Can you clarify if you intend to make the current SNACK mechanism
> optional and if so, how it is expected to solve the "holes in StatSN"
> problem for targets that don't implement StatSN SNACK ?)
> 
> Regards,
> Santosh
> 
> -------------------------------------------------------------------------------------
> 
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> Date:     Thu, 05 Apr 2001 19:22:09 -0700
> From:     Santosh Rao <santoshr@cup.hp.com>
> Organization: Hewlett Packard, Cupertino.
> To:       julian_satran@il.ibm.com
> CC:       ips@ece.cmu.edu
> 
> julian_satran@il.ibm.com wrote:
> >
> > Santosh,
> >
> > SNACK and SACK are the same thing (I just renamed them to avoid confusion
> > with TCP SACK).
> > The status is acked by ExpStatSN (and only indirectly by SNACK). SNACK
> > enables fast recovery of
> > a hole (without having to resort to a timeout).
> 
> Julian,
> 
> The bottom line is that the current SNACK mechanism as defined in the
> spec will NOT work if it is made optional, and at the same time, it is
> too expensive to mandate the SNACK mechanism.
> 
> The current SNACK mechanism is really a negative ACK requesting the
> target to re-send the status PDU. This mechanism has 2 dis-advantages :
> 
> a) requires targets to retain I/O state information until StatSN is
> acknowledged.
> b) Does not allow forward progress with the release of I/O resources in
> the event that a target could not retain that state information or for
> some other reason could not service the SNACK.
> 
> I am suggesting that the alternate model of SACK be used, wherein, a
> SACK is an individual ACK of a received status PDU. This SACK only kicks
> in on detection of a hole. The hole is implicitly plugged by the
> initiator on eventual completion of the command
> [on timeout followed by abort or retry].
> 
> The advantage of this alternate model is :
> a) Does not require state information to be stored at targets beyond I/O
> completion.
> b) Allows a more reliable mechanism of resource release.
> 
> The dis-advantage of this mechanism is :
> a) It results in I/O timeout when Status PDU was dropped due to a digest
> error.
> 
> Once again, the question boils down to the rate of TCP checksum escapes
> and the probability of such escapes affecting status PDUs. If this is
> low enough, such a timeout on a digest error of a status PDU should be
> acceptable.
> 
> >  We decided long ago
> > against individual acks as bulk acking through a window is cheaper and
> > safer (repetition).
> 
> I am not suggesting removal of bulk ack scheme. My suggestion is that
> SACK kick in on a hole and the initiator revert to bulk ACK scheme once
> it considers the hole to be plugged (thru the eventual completion of the
> I/O on the timeout path followed by abort or retry).
> 
> - Santosh
>  - santoshr.vcf
--------------E7D3685563283921AD9ABEE1
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------E7D3685563283921AD9ABEE1--



From owner-ips@ece.cmu.edu  Mon Apr  9 23:20:03 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA23831
	for <ips-archive@odin.ietf.org>; Mon, 9 Apr 2001 23:20:02 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3A113q01804
	for ips-outgoing; Mon, 9 Apr 2001 21:01:03 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3A102r01709
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 21:00:03 -0400 (EDT)
Received: from xparelay2.corp.hp.com (xparelay2.corp.hp.com [15.58.137.112])
	by palrel1.hp.com (Postfix) with ESMTP
	id E332D1093; Mon,  9 Apr 2001 18:00:01 -0700 (PDT)
Received: from xatlbh2.atl.hp.com (xatlbh2.atl.hp.com [15.45.89.187])
	by xparelay2.corp.hp.com (Postfix) with ESMTP
	id 27B791F50A; Mon,  9 Apr 2001 20:58:30 -0400 (EDT)
Received: by xatlbh2.atl.hp.com with Internet Mail Service (5.5.2653.19)
	id <2M5JGTMY>; Mon, 9 Apr 2001 20:59:57 -0400
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08F95@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "'jojy_michael@agilent.com'" <jojy_michael@agilent.com>, ips@ece.cmu.edu
Subject: RE: Multiple targets behind a single IP address
Date: Mon, 9 Apr 2001 20:59:54 -0400 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

The Target Node Address is used in the login phase of the iSCSI protocol.
In order to attempt login, a TCP connection must be first set up.  It is
this connection which is associated with the target when the initiator is
successfully logged in (any PDUs received on that connection are addressed
to the "target the initiator logged into").  There is no connection
association to a target until an initiator is successfully logged in, and
any PDUs received before successful login that are not related to the login
phase (such as a command PDU) should be discarded (not "rejected", since
that could constitute a denial-of-service attack opportunity).

If an initiator is talking to multiple targets behind a single IP address,
it will have at least 1 TCP connection to each target.

The short answer is "the TCP connection is the target identifier" :-)
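
To make the above concrete, here is a minimal sketch in Python of a portal
that serves several targets behind one IP address and port. It is only an
illustration under assumptions: the class name, the dictionary-shaped PDUs and
the return values are all hypothetical and are not taken from the draft.

```python
class TargetPortal:
    """Hypothetical portal serving several targets behind one IP address and port."""

    def __init__(self, target_names):
        self.targets = set(target_names)
        self.bound = {}  # connection id -> target name, filled in only after login

    def on_pdu(self, conn_id, pdu):
        target = self.bound.get(conn_id)
        if target is not None:
            # Logged in: the connection itself identifies the target.
            return ("deliver", target)
        if pdu.get("type") == "login":
            if pdu.get("target") in self.targets:
                self.bound[conn_id] = pdu["target"]
                return ("login-ok", pdu["target"])
            return ("login-failed", None)
        # Any non-login PDU before a successful login is silently discarded.
        return ("discard", None)
```

Note that no PDU after login needs to carry the target name; the lookup is
keyed on the connection alone.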

Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com 

> -----Original Message-----
> From: jojy_michael@agilent.com [mailto:jojy_michael@agilent.com]
> Sent: Monday, April 09, 2001 11:51 AM
> To: ips@ece.cmu.edu
> Subject: Multiple targets behind a single IP address
> 
> 
> Section 1.2.7 states that WWUIs are used in iSCSI to provide a target
> identifier for configurations that present multiple targets 
> behind a single
> IP address and port.
> 
> To provide this support I would expect the target WWUI to be 
> in the BHS,
> but, it does not seem to be there. Is the support missing 
> from the spec?
> 
> - Jojy
> 


From owner-ips@ece.cmu.edu  Tue Apr 10 01:08:07 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA25266
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 01:08:06 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3A2Ixf05777
	for ips-outgoing; Mon, 9 Apr 2001 22:18:59 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3A2I7r05752
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 22:18:07 -0400 (EDT)
Received: from xparelay2.corp.hp.com (xparelay2.corp.hp.com [15.58.137.112])
	by palrel1.hp.com (Postfix) with ESMTP
	id A5B07F5B; Mon,  9 Apr 2001 19:18:06 -0700 (PDT)
Received: from xpabh4.corp.hp.com (xpabh4.corp.hp.com [15.58.136.1])
	by xparelay2.corp.hp.com (Postfix) with ESMTP
	id 1C1EB1F50A; Mon,  9 Apr 2001 22:16:30 -0400 (EDT)
Received: by xpabh4.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <2PW5K9D7>; Mon, 9 Apr 2001 19:18:00 -0700
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08F97@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "'Black_David@emc.com'" <Black_David@emc.com>, ips@ece.cmu.edu
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces
Date: Mon, 9 Apr 2001 19:17:59 -0700 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

How does this relate to URLs?  Does this mean we can't specify a URL format
for iSCSI resources?  Can you provide the logic behind the IESG pronouncing
that the IESG won't approve another global namespace?  It sounds like the
logic the TCP group used to reject the need for framing the first time round ("If
we allowed every yahoo that wanted changes to the TCP header, TCP wouldn't
be what it is today..")

It's not clear to me why the N&D group thinks iSCSI devices need globally
unique names.  It seems like a host name is unique enough, and behind that
it's up to the host to ensure uniqueness locally.

Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com 

> -----Original Message-----
> From: Black_David@emc.com [mailto:Black_David@emc.com]
> Sent: Thursday, March 22, 2001 9:48 PM
> To: ips@ece.cmu.edu
> Subject: iSCSI Naming: WWUIs, URNs, and namespaces
> 
> 
> <RANT> I don't like naming issues. </RANT> :-) :-)
> 
> After suitable consulting with some members of
> the IESG and IAB, I have some news to convey about
> the current approach to iSCSI naming.
> 
> The IESG will not approve another global namespace
> for iSCSI's use - this means that WWUIs as currently
> designed will need to be revised out of the
> naming and discovery draft, and that it will not be
> possible to proceed with the WWUI URN draft
> as an official IPS WG work item.  The best course of
> action would probably be to allow the WWUI URN draft
> to expire without further revision.
> 
> To a first approximation, WWUIs are/were a "grand
> unified theory" of naming, in that any namespace could
> be glued into the WWUI world (as several were).
> The WG is being directed to instead focus on the
> individual namespaces and make sure that the ones that
> are used are in fact necessary.  iSCSI can use text
> keys to identify which sort of name is being used
> (one key for each sort of format, for each instance
> in which a name is used), and it may be possible
> to encode the name format in the parse rules for the
> values of iSCSI keys to avoid proliferation of keys.
> 
> Taking a look at the namespaces in the current iSCSI
> naming and discovery draft, here's some initial
> guidance from this WG co-chair:
>   iscsi - canonical target -- This should be fine.
>   eui - WWNs -- The use of these for storage makes eminent
> 	sense.  I don't see a problem here.
>   dns - hostnames -- Use of these should be abandoned as
> 	not only are they not really URNs (as indicated
> 	in the draft), but their intended usage is straying
> 	into the tarpit known as "URN resolution".  Faster
> 	progress will be made by staying out.  A DNS hostname
> 	can be put into an Alias or something new can be
> 	invented to carry it as a Location Hint, BUT the
> 	relevant URN WG RFCs and drafts on URN resolution
> 	should be reviewed before proceeding too far in this
> 	direction.
>   iscsi - Reverse DNS and oui - Org. Unique Identifier --
> 	The rationale for these is not clear to me.
> 	Assuming that WWNs are going to be available for
> 	use in naming iSCSI Initiators and Targets, what
> 	are the problems that these sorts of names solve
> 	that WWNs don't?  It appears that one of the problems
> 	may be who can get/create them.  Discussion of this
> 	on the list would be appropriate.
> In any case, the fewer name formats we have to deal with,
> the better.
> 
> I want to try to anticipate an objection to this, which
> would note that from a functional viewpoint the basic
> impact of this is to move some characters from one text 
> string to another (e.g., from a WWUI type designator
> to part of an iSCSI text key), and wonder if this is
> a distinction without a difference.  One of the reasons
> for the <RANT> that started this post is that a functional
> view is not sufficient for naming - how things are named,
> the intended usage of names and their scope matter a lot.
> This is particularly true when considering the structure
> of a namespace and how that structure may be extended.
> The upshot is that avoiding introduction of something
> claiming to be yet another global namespace is important
> (i.e., use existing namespaces with global scope in preference
> to inventing new ones).  The resulting need to define
> the name spaces/formats in the main iSCSI spec. is
> probably a "feature" as it forces us to pay more
> attention to the sorts of names we use and raises the
> bar for adding additional sorts of names in the future.
> 
> I will be working with
> the naming and discovery team in my "copious spare time"
> to make sure that we don't lose the valuable work and
> progress they've made to date as a consequence of this
> change.  Discussion on the list about what sort
> of names are important (e.g., the Reverse DNS and OUI
> namespaces) and why would be useful. 
> 
> Thanks,
> --David
> 
> 
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
> 


From owner-ips@ece.cmu.edu  Tue Apr 10 02:46:49 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA08913
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 02:46:48 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f39LrrK20605
	for ips-outgoing; Mon, 9 Apr 2001 17:53:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from med.corp.rhapsodynetworks.com (64-160-62-201.rhapsodynetworks.com [64.160.62.201] (may be forged))
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f39LrVr20586
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 17:53:31 -0400 (EDT)
Received: by med.corp.rhapsodynetworks.com with Internet Mail Service (5.5.2653.19)
	id <2R7P21P0>; Mon, 9 Apr 2001 14:53:24 -0700
Message-ID: <15851BD69CFCD41186B100B0D0AABE650C1B49@med.corp.rhapsodynetworks.com>
From: Venkat Rangan <venkat@rhapsodynetworks.com>
To: "'Stephen Bailey'" <steph@cs.uchicago.edu>, ips@ece.cmu.edu
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
Date: Mon, 9 Apr 2001 14:53:23 -0700 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Steph,

Not to beat a dead horse, but the reason link level CRCs may not be of much
help is the following.

The paper "When the CRC and TCP Checksum Disagree" section 5.1 describes the
data transmission path and potential for error introduction at various
points in the path.

At a layer 3 device, upon receipt of a frame, you have:

1. The existing link-level CRC verified and stripped.

2. The payload (IP packet) DMA'ed into some buffers, preserving the original
IP header checksums and TCP checksums.

3. Create a new link-level header.

4. Compute a new CRC.

5. Data sent to the next hop.

If an error is introduced (software or hardware) in steps 2 and 3, the new
CRC introduced in step 4 isn't of any help. The introduced error can be:

1. In the IP header (such as IP address bytes were munged).

2. In the TCP header (such as the port got corrupted).

3. In the TCP checksum itself.

4. In the payload.

Error categories 1 and 2 may cause the packet not to be delivered at all. It
is okay if we do not detect these because they are not delivered to the
iSCSI processing layer. Error 3 would cause the packet to be rejected. Error
4 should normally be caught by the TCP checksum, but escapes detection at a
rate of 1 in 10e8. (Actually, given the error bias toward the headers, I'm
not sure whether this rate is the rate within the payload of a TCP segment.)
The iSCSI header and data digests are present to detect that escape.
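
The forwarding steps above can be sketched in a few lines of Python. This is
an illustration under assumptions, not a model of any real device: zlib.crc32
stands in for both the link-level CRC and the end-to-end digest, the TCP
checksum is left out entirely, and the function names are made up.

```python
import zlib

def link_frame(payload: bytes) -> bytes:
    """Append a link-level CRC-32 to the payload (stand-in for a link FCS)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def link_check(frame: bytes) -> bytes:
    """Verify and strip the link-level CRC; raise if it does not match."""
    payload, fcs = frame[:-4], frame[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != fcs:
        raise ValueError("link CRC mismatch")
    return payload

def router_forward(frame: bytes, corrupt: bool) -> bytes:
    """Hypothetical layer 3 hop: strip the CRC, buffer, then add a new CRC."""
    payload = bytearray(link_check(frame))   # step 1: verify and strip
    if corrupt:
        payload[0] ^= 0x01                   # steps 2-3: bit flipped in the box
    return link_frame(bytes(payload))        # step 4: fresh CRC over bad data

data = b"iSCSI data segment"
digest = zlib.crc32(data)                    # end-to-end digest from the sender

# The next hop accepts the frame: link_check() does not raise.
received = link_check(router_forward(link_frame(data), corrupt=True))
digest_ok = zlib.crc32(received) == digest   # the end-to-end check fails
clean = link_check(router_forward(link_frame(data), corrupt=False))
```

The new link CRC is computed over data that is already damaged, so it
validates the damage; only a check carried end to end notices it.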

In the presence of middle boxes that do more than layer 2 forwarding, (say a
box that terminates a TCP connection and re-initiates a new connection) and
if the middle box retains the iSCSI header and data digests but only
computes a new checksum, the transmission path exposure is similar to 2 and
3 above. The header and data digests will enable detection of that.

If the middle box does more than just terminate TCP connections and changes
the iSCSI header and recomputes a new iSCSI header digest and leaves the
data digest alone, at least the data part is protected, but not the header.
If it changes both header and data, there is no protection. In order to get
true end-to-end protection, the application needs to apply a separate
digest, such as creating a 516-byte data block for every 512-byte sector of
data and storing that in the media.

So, the escape rate depends quite a bit on number of middle boxes and the
exposure of data paths. How much do we rely on middle boxes to never
introduce an error during the exposure? Since the referred papers suggest
correct end-to-end delivery of TCP segments with checksum errors in them,
the presence of exposed paths in the middle boxes has been a factor. Still,
the rates quoted (1 in 200 million or 1 in 300 million) suggest that it is
necessary to have very strong CRC and detection mechanisms, but that it may
not be necessary to optimize the recovery options so that we are able to
recover with the smallest amount of retransmission of data.

I haven't studied the two other references on the subject, but again I
suspect there is evidence to suggest that errors will creep in at
intermediate processing elements.

Venkat Rangan
Rhapsody Networks Inc.
http://www.rhapsodynetworks.com


-----Original Message-----
From: Stephen Bailey [mailto:steph@cs.uchicago.edu]
Sent: Monday, April 09, 2001 11:57 AM
To: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 


> Exactly, I've worked in this context (though its been some years now).
> It was true (at one time) that tape had a tractability limit, e.g.,
> a tape backup of a terabyte was out of the question.  Has that changed?

I think this is precisely the point.  Existing, off-the-shelf SCSI
solutions DO NOT presently solve this problem.  Both ||SCSI and FCP
burp the operation at an expectable, O(days) failure rate.  The rate of
adoption for the FCP-2 command recovery feature is overwhelming to the
point that the tape guys have been talking about end-running the
problem with explicitly addressed commands.

What we have running iSCSI on TCP is such a drastic improvement in
what you can expect from your SCSI service that we can eventually
expect a disruptive change.  Trying to engineer it to the point where
it's 2^100 times more disruptive, when we don't really know where it's
taking us in the first place is meaningless.

[Warning: repetition ahead]

TCP + link layer error detection is engineered precisely to ensure
reliable data delivery.  It's clear from an engineering stand point
that it is likely (not guaranteed, what is?) to do this quite well.
In spite of much research, it seems like nobody here has come up with
a strong indication that TCP + link layer error detection does NOT do
its job well.  I do not think this is because nobody has ever looked
at the problem.

The lack of concrete information to support the case that TCP + link
layer error detection is inadequate has us chasing our tails.

Given the layer iSCSI occupies in the protocol layer cake, if we don't
try to solve what is presently assigned to a lower layer, it seems
quite comfortable to shim additional checks or recovery, or even a
completely different transport substrate, underneath if we do discover
TCP + link layer error detection is not doing the trick, but it really
seems like folly to engineer based upon an assumption that nobody has
done a good job documenting.

Steph


From owner-ips@ece.cmu.edu  Tue Apr 10 03:06:06 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA09085
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 03:06:05 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3A1Zvt03604
	for ips-outgoing; Mon, 9 Apr 2001 21:35:57 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3A1Yxr03516
	for <ips@ece.cmu.edu>; Mon, 9 Apr 2001 21:34:59 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id D67A4A5E; Mon,  9 Apr 2001 18:34:57 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id SAA18079;
	Mon, 9 Apr 2001 18:34:53 -0700 (PDT)
Message-ID: <3AD26485.58EEBDCF@cup.hp.com>
Date: Mon, 09 Apr 2001 18:40:21 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Douglas Otis <dotis@sanlight.net>
Cc: Ips <ips@ece.cmu.edu>
Subject: Re: iSCSI:flow control, acknowledgement, and a deterministic recovery
References: <NEBBJGDMMLHHCIKHGBEJGEBMCGAA.dotis@sanlight.net>
Content-Type: multipart/mixed;
 boundary="------------DD775FDA37A334CED458F581"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------DD775FDA37A334CED458F581
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Doug [& All],

Some comments on this thread :

1) The immediate command feature can be exploited by initiators which
are not required to provide strict command ordering for their SCSI
sub-system. (The majority of today's scsi stacks). IOW, all commands can
be sent with CmdSN=0 to indicate no ordering required.

Any scheme to restrict the number of "immediate commands" that can be
sent prohibits such a feature and is un-desirable.

2) The (ExpCmdSN, MaxCmdSN) based acknowledgement and flow control is a
freebie that targets can use to implement additional flow control.
There's nothing in the spec that mandates that a target MUST use this
feature to throttle its window and apply flow control.

Hence, any dependence and assumptions on the existence of CmdSN based
flow control may be incorrect, since the target may be using QUEUE FULL
(TASK SET FULL) to apply flow control.

> I also suspect it is a mistake to allow commands
> to remain trapped within the sequencer during emergency or abnormal events
> signified by the use of these immediate commands.

3) If the initiator suspects that the CmdSN queue at the target is
stuck, it can always use a task mgmt command or even an iSCSI login with
the X (restart) bit to perform a cleanup of stuck commands at the
target. There's no need to build in implicit assumptions that a CmdSN
with the immediate flag results in a clean-up of previously pending
commands at the CmdSN queue of the target.

- Santosh


Douglas Otis wrote:
> 
> David,
> 
> > > To not support rejects, you are then relying on the target to
> > > handle these now out of sequence commands.
> >
> > Immediate commands are "out of sequence" by definition - immediate
> > means deliver immediately without regard to CmdSN order.  Only the
> > target can do this, so I don't see any problem with relying on the Target
> > to do so.
> 
> The commands that are NOW out of sequence are those commands bypassed by the
> immediate command.  If this immediate command was Rewind, then a long list
> of write commands become invalid as a new tape is now needed for those
> operations.  You are expecting the target to handle a long list of
> invalid commands made possible by the iSCSI sequencer.  Commands invalidated
> by immediate commands become out of sequence commands.
> 
> I'll reserve further comment until Julian has made some progress in

> 
> Forgive my advocacy, but if one does not attempt to support a position, then
> the subject is not explored.
> 
> Doug
--------------DD775FDA37A334CED458F581
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------DD775FDA37A334CED458F581--



From owner-ips@ece.cmu.edu  Tue Apr 10 03:48:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA09397
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 03:48:22 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3A590u14898
	for ips-outgoing; Tue, 10 Apr 2001 01:09:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3A58cr14885
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 01:08:38 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id HAA12564
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 07:08:30 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id HAA94610
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 07:06:27 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A2A.001BEF25 ; Tue, 10 Apr 2001 07:05:06 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A2A.001BE84B.00@d12mta02.de.ibm.com>
Date: Tue, 10 Apr 2001 07:08:43 +0200
Subject: Re: iSCSI: frame formats
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Steph,

Read10 and Write10 can still be accommodated with several data PDUs as we
maintained the total read/write count (expected data) as a 32 bit field.
It is only the individual PDU that went down to 16M.
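
A quick worked example of the PDU count, as a Python sketch. The 512-byte
block size and the reading of "16M" as a 24-bit length field (at most
2^24 - 1 bytes per PDU) are assumptions made for illustration; the 16-bit
block count is the transfer length field of READ(10)/WRITE(10).

```python
BLOCK_SIZE = 512                  # assumed logical block size in bytes
MAX_BLOCKS_RW10 = 0xFFFF          # READ(10)/WRITE(10) carry a 16-bit block count
MAX_PDU_DATA = (1 << 24) - 1      # assumed: 24-bit per-PDU length field ("16M")

def data_pdus_needed(transfer_bytes: int, max_pdu_data: int = MAX_PDU_DATA) -> int:
    """Number of data PDUs needed to carry transfer_bytes (integer ceiling)."""
    return -(-transfer_bytes // max_pdu_data)

largest = MAX_BLOCKS_RW10 * BLOCK_SIZE      # 33,553,920 bytes
pdus = data_pdus_needed(largest)            # 2 under these assumptions
fits_32_bit_count = largest < (1 << 32)     # total still fits the 32-bit field
```

With larger blocks the PDU count grows, but the total still fits the 32-bit
expected-data count (65535 blocks of 4096 bytes is about 268 MB).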

It is worth noticing also that the CRCs we are contemplating have a
guaranteed Hamming distance of 4 for blocklengths of less than 2^31 bits.

Julo

Stephen Bailey <steph@cs.uchicago.edu> on 09/04/2001 18:24:09

Please respond to Stephen Bailey <steph@cs.uchicago.edu>

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI: frame formats




> No one felt that a max data size of 16 megs in an iSCSI PDU was an
> issue

I did, I was just exhausted.  Furthermore, the size the PDU could
handle shrank (from 64M to 16M) right there before our ears in a split
second.

The `question' was also not asked in a form which I could effectively
answer or discuss.  There were two proposals presented, and they had
very different characteristics.  I figured I would refrain from
comment until I a) understood the proposals better b) got a sense of
which way we might be going.

I have a hard time swallowing a PDU that won't accommodate the data
from even a READ10 or WRITE10.

If people expect to tile each TCP segment with an iSCSI PDU, I think
that's a mistake.  The headers are too bulky for that.  If they think
they're going to gain something meaningful in hardware implementation
complexity (e.g. bounding the amount of reassembly buffering) by
assuming small multiples of segment size PDUs, I think that's a
mistake too.  I'd like to see a show of hands of people who have
actually implemented this approach in hardware.  Anybody?

If you (eventually) accept this, then you're back to the model that
there's a lower layer which is providing your data steering on a
per-segment basis, and your iSCSI PDU is the granularity at which the
iSCSI implementation delivers data to this layer.  It is clearly a
granularity at which software (transfer scheduling) occurs.

At 10 Gbit/s, a 16 MB PDU is only 16 ms worth of data.
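
For reference, the arithmetic behind that figure can be checked with a short
Python sketch. It assumes 16 MB means 16 * 2^20 bytes and ignores all header
and framing overhead; the function name is made up for illustration.

```python
def transfer_time_ms(size_bytes: int, link_bits_per_s: float) -> float:
    """Time to serialize size_bytes onto a link, in milliseconds."""
    return size_bytes * 8 / link_bits_per_s * 1000

pdu_bytes = 16 * 2**20                  # assumed meaning of "16 MB"
t_ms = transfer_time_ms(pdu_bytes, 10e9)
```

Under these assumptions the result is a little over 13 ms, the same order of
magnitude as the figure quoted.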

Steph





From owner-ips@ece.cmu.edu  Tue Apr 10 03:50:02 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA09408
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 03:49:56 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3A614u17544
	for ips-outgoing; Tue, 10 Apr 2001 02:01:04 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3A60Tr17517
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 02:00:29 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id IAA74176
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 08:00:22 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id HAA137904
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 07:58:19 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A2A.0020ADF2 ; Tue, 10 Apr 2001 07:56:56 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A2A.0020ADB6.00@d12mta02.de.ibm.com>
Date: Tue, 10 Apr 2001 09:00:49 +0200
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

Case a is what we have today.

Status numbering is not meant for ordering - it is only a helper for ack
(bulk ack).
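
As an illustration of the bulk ack idea, here is a minimal Python sketch of
the initiator side. It is a sketch under assumptions: the class and method
names are hypothetical, StatSN is assumed to start at 1, and 32-bit
wrap-around is ignored.

```python
class StatusAckTracker:
    """Initiator-side view of StatSN: cumulative (bulk) ack with hole detection."""

    def __init__(self, first_stat_sn: int = 1):
        self.exp_stat_sn = first_stat_sn   # next StatSN expected in sequence
        self.pending = set()               # StatSNs received beyond a hole

    def receive(self, stat_sn: int) -> None:
        if stat_sn < self.exp_stat_sn:
            return                         # duplicate, already covered by the ack
        self.pending.add(stat_sn)
        # Advance the cumulative ack over every contiguous StatSN we now hold.
        while self.exp_stat_sn in self.pending:
            self.pending.remove(self.exp_stat_sn)
            self.exp_stat_sn += 1

    def holes(self):
        """StatSNs missing below the highest one received so far."""
        if not self.pending:
            return []
        return [sn for sn in range(self.exp_stat_sn, max(self.pending))
                if sn not in self.pending]
```

ExpStatSN acknowledges everything below it in one value; a missing status
simply stalls it until the hole is filled or the command is cleaned up.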

All the resources required for status retransmission are the control block
and the status.
If you give them up, you give up all forms of recovery (as command retry
will not help either).

The only recovery path remaining is a SCSI timeout and probably some form
of task management clear (as SCSI does not know what went wrong).

That  is what I had in mind when I said that we can make it optional.

However - a long time ago when we suggested making it optional for targets
most of the list wanted it mandatory.

Julo


Santosh Rao <santoshr@cup.hp.com> on 10/04/2001 02:06:38

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"




Julian & All,

Do we agree on the following requirements for SNACK :

a) iSCSI MUST NOT mandate either data or status S[N]ACK for intra-task
error recovery. Initiators MUST be allowed to perform command
granularity error recovery.

b) iSCSI MUST provide a mechanism by which targets can continue with I/O
resource release upon completion of an I/O. Such a mechanism may be
based on an explicit StatSN acknowledgement, (if the target supports
StatSN SNACK), or allow immediate resource clean-up upon I/O
completion.

c) Such a mechanism MUST NOT block forward progress when holes occur in
StatSN sequence, due to format or digest errors encountered at the
initiator.

In order to meet the above requirements, "StatSN S[N]ACK" support can be
negotiated at login time and if StatSN SNACK is not supported by the
target, it MUST NOT use StatSN sequence numbering. (i.e. StatSN = 0).

By not using StatSN numbering, the "holes in StatSN" problem does not
occur, thereby, meeting requirements (a) ,(b) & (c) for targets that do
not retain I/O state information.

For targets that do retain I/O state information, StatSN SNACK is turned
on along with StatSN numbering.

- Santosh


julian_satran@il.ibm.com wrote:
>
> Santosh,
>
> I have to think about it a bit more.  The main ack mechanism is ExpStatSN
> (as I indicated in a previous note).  SNACK is meant to simplify recovering
> holes without having to reissue commands or wait for timeout. Rejecting
> SNACK with an "unsupported" indication but resending the status on command
> retry can hardly be considered a good alternative while rejecting SNACK
> with "no status to recover" has to be bubbled up to SCSI and that can be
> bad for tapes.
> I think that if you keep status until ack-ed, SNACK only makes life easier
> as it makes the recovery request explicit and specific - unlike the command
> retry that is vague and unspecific.
>
> Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 06/04/2001 20:26:47
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   Julian Satran/Haifa/IBM@IBMIL
> cc:   santoshr@cup.hp.com, David Black <Black_David@emc.com>
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
> Julian,
>
> I did not hear back on this and am re-sending in case you did not
> receive the same. Your comments would be appreciated.
>
> (Can you clarify if you intend to make the current SNACK mechanism
> optional and if so, how it is expected to solve the "holes in StatSN"
> problem for targets that don't implement StatSN SNACK ?)
>
> Regards,
> Santosh
>
>
> -----------------------------------------------------------------------
>
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> Date:     Thu, 05 Apr 2001 19:22:09 -0700
> From:     Santosh Rao <santoshr@cup.hp.com>
> Organization: Hewlett Packard, Cupertino.
> To:       julian_satran@il.ibm.com
> CC:       ips@ece.cmu.edu
>
> julian_satran@il.ibm.com wrote:
> >
> > Santosh,
> >
> > SNACK and SACK are the same thing (I just renamed them to avoid confusion
> > with TCP SACK).
> > The status is acked by ExpStatSN (and only indirectly by SNACK). SNACK
> > enables fast recovery of a hole (without having to resort to a timeout).
>
> Julian,
>
> The bottom line is that the current SNACK mechanism as defined in the
> spec will NOT work if it is made optional, and at the same time, it is
> too expensive to mandate the SNACK mechanism.
>
> The current SNACK mechanism is really a negative ACK requesting the
> target to re-send the status PDU. This mechanism has 2 dis-advantages :
>
> a) requires targets to retain I/O state information until StatSN is
> acknowledged.
> b) Does not allow forward progress with the release of I/O resources in
> the event that a target could not retain that state information or for
> some other reason could not service the SNACK.
>
> I am suggesting that the alternate model of SACK be used, wherein, a
> SACK is an individual ACK of a received status PDU. This SACK only kicks
> in on detection of a hole. The hole is implicitly plugged by the
> initiator on eventual completion of the command
> [on timeout followed by abort or retry].
>
> The advantage of this alternate model is :
> a) Does not require state information to be stored at targets beyond I/O
> completion.
> b) Allows a more reliable mechanism of resource release.
>
> The dis-advantage of this mechanism is :
> a) It results in I/O timeout when Status PDU was dropped due to a digest
> error.
>
> Once again, the question boils down to the rate of TCP checksum escapes
> and the probability of such escapes affecting status PDUs. If this is
> low enough, such a timeout on a digest error of a status PDU should be
> acceptable.
>
> >  We decided long ago
> > against individual acks as bulk acking through a window is cheaper and
> > safer (repetition).
>
> I am not suggesting removal of bulk ack scheme. My suggestion is that
> SACK kick in on a hole and the initiator revert to bulk ACK scheme once
> it considers the hole to be plugged (thru the eventual completion of the
> I/O on the timeout path followed by abort or retry).
>
> - Santosh





From owner-ips@ece.cmu.edu  Tue Apr 10 04:40:43 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA09791
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 04:40:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3A6F3R18203
	for ips-outgoing; Tue, 10 Apr 2001 02:15:03 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3A6EYr18176
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 02:14:34 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3A7J0002960;
	Tue, 10 Apr 2001 00:19:05 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <santoshr@cup.hp.com>
Cc: "Ips" <ips@ece.cmu.edu>
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Mon, 9 Apr 2001 23:09:01 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJMECBCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
In-Reply-To: <3AD26485.58EEBDCF@cup.hp.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Santosh,

In a change recommending a flag and a valid CmdSN and not a null value, I
included a flag named "Casual" to provide this mode of operation you
describe which still allows flow control.  I would not wish to permit this
Casual mode, but as David feels such lists or threads are easy to manage,
this should not represent any greater effort for feedback.  But then again,
if you are not using flow control, you are also not using any
acknowledgement other than an eventual SCSI status.  If there is no flow
control, this removes resource control and reception acknowledgement.  If
this is not needed, then such flow control should be removed for simplicity
as there then must always be an abundance by design of adequate resources.
For cases where command sequence is important, CmdSN together with
acknowledgement must remain or only one command at a time can be issued.  I
can envision three modes, a Sequential mode, a Casual mode, and an Exigent
mode where the command is given head of sequence status (with rejection for
feedback simplicity ;).

You seem to be placing many things on the table for reconsideration.  Flow
control, acknowledgement, sequential delivery, and perhaps even a means of
moving a command up the queue feeding SCSI targets.  One command at a time
for sequential assurance is not impossible.  Is flow control needed?  Do you
care if there is any reception acknowledgement if all feedback is in the
SCSI layer?  It would seem in this barebones case, iSCSI adds little to
ensure integrity.  The FC encapsulation places a time stamp within the PDU.
Perhaps a time stamp should be employed as an alternative to sequence
control?  At least after a timeout, you are assured a missing command is
forever gone.  Otherwise, the check was to see that CmdSN is not too small
to ensure it is not a stray.
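
For reference, the "not too small" stray check can be sketched as below. This
is only an illustration: it assumes 32-bit sequence numbers compared with
wrap-around (serial number) arithmetic, and the function names sn_lt and
classify are made up for the example. ExpCmdSN and MaxCmdSN are the window
bounds mentioned in the quoted message.

```python
MOD = 1 << 32  # assumed 32-bit sequence-number space


def sn_lt(a, b):
    """True if a precedes b in wrap-around (serial number) order."""
    return a != b and ((b - a) % MOD) < (1 << 31)


def classify(cmd_sn, exp_cmd_sn, max_cmd_sn):
    """Classify a received CmdSN against the (ExpCmdSN, MaxCmdSN) window."""
    if sn_lt(cmd_sn, exp_cmd_sn):
        return "stray"          # too small: an old or duplicate command
    if sn_lt(max_cmd_sn, cmd_sn):
        return "beyond-window"  # the initiator overran the advertised window
    return "in-window"
```

A time stamp, as suggested above, would replace the first test with an age
check; the window test would then no longer be available for flow control.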

Doug


> Doug [& All],
>
> Some comments on this thread :
>
> 1) The immediate command feature can be exploited by initiators which
> are not required to provide strict command ordering for their SCSI
> sub-system. (The majority of today's scsi stacks). IOW, all commands can
> be sent with CmdSN=0 to indicate no ordering required.
>
> Any scheme to restrict the number of "immediate commands" that can be
> sent prohibits such a feature and is un-desirable.
>
> 2) The (ExpCmdSN, MaxCmdSN) based acknowledgement and flow control is a
> freebie that targets can use to implement additional flow control.
> There's nothing in the spec that mandates that a target MUST use this
> feature to throttle its window and apply flow control.
>
> Hence, any dependence and assumptions on the existence of CmdSN based
> flow control may be incorrect, since the target may be using QUEUE FULL
> (TASK SET FULL) to apply flow control.
>
> > I also suspect it is a mistake to allow commands
> > to remain trapped within the sequencer during emergency or abnormal events
> > signified by the use of these immediate commands.
>
> 3) If the initiator suspects that the CmdSN queue at the target is
> stuck, it can always use a task mgmt command or even an iSCSI login with
> the X (restart) bit to perform a cleanup of stuck commands at the
> target. There's no need to build in implicit assumptions that a CmdSN
> with the immediate flag result in a clean-up of previously pending
> commands at the CmdSN queue of the target.
>
> - Santosh
>
>
> Douglas Otis wrote:
> >
> > David,
> >
> > > > To not support rejects, you are then relying on the target to
> > > > handle these now out of sequence commands.
> > >
> > > Immediate commands are "out of sequence" by definition - immediate
> > > means deliver immediately without regard to CmdSN order.  Only the
> > > target can do this, so I don't see any problem with relying on the Target
> > > to do so.
> >
> > The commands that are NOW out of sequence are those commands bypassed by the
> > immediate command.  If this immediate command was Rewind, then a long list
> > of write commands become invalid as a new tape is now needed for those
> > operations.  You are expecting the target to handle a long list of
> > invalid commands made possible by the iSCSI sequencer.  Commands invalidated
> > by immediate commands become out of sequence commands.
> >
> > I'll reserve further comment until Julian has made some progress in
>
> >
> > Forgive my advocacy, but if one does not attempt to support a position, then
> > the subject is not explored.
> >
> > Doug



From owner-ips@ece.cmu.edu  Tue Apr 10 10:03:38 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id KAA15641
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 10:03:37 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ABU9r20095
	for ips-outgoing; Tue, 10 Apr 2001 07:30:09 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ABTIr20071
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 07:29:18 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id NAA156556
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 13:29:10 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id NAA68148
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 13:25:42 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A2A.003EC7D2 ; Tue, 10 Apr 2001 13:25:43 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A2A.003EC784.00@d12mta02.de.ibm.com>
Date: Tue, 10 Apr 2001 14:34:10 +0300
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Steph,

You overlook several well documented papers that were referenced on the
list and the perennial issue of middleboxes.

Julo

P.S. - and my personal experience, which is that I get a corrupted file about 2-4
times a year and I am far from being a heavy user.



Stephen Bailey <steph@cs.uchicago.edu> on 09/04/2001 21:57:19

Please respond to Stephen Bailey <steph@cs.uchicago.edu>

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"




> Exactly, I've worked in this context (though its been some years now).
> It was true (at one time) that tape had a tractability limit, e.g.,
> a tape backup of a terabyte was out of the question.  Has that changed?

I think this is precisely the point.  Existing, off-the-shelf SCSI
solutions DO NOT presently solve this problem.  Both ||SCSI and FCP
burp the operation on an expectable, O(days) failure rate.  The rate of
adoption for the FCP-2 command recovery feature is overwhelming to the
point that the tape guys have been talking about end-running the
problem with explicitly addressed commands.

What we have running iSCSI on TCP is such a drastic improvement in
what you can expect from your SCSI service that we can eventually
expect a disruptive change.  Trying to engineer it to the point where
it's 2^100 times more disruptive, when we don't really know where it's
taking us in the first place is meaningless.

[Warning: repetition ahead]

TCP + link layer error detection is engineered precisely to ensure
reliable data delivery.  It's clear from an engineering stand point
that it is likely (not guaranteed, what is?) to do this quite well.
In spite of much research, it seems like nobody here has come up with
a strong indication that TCP + link layer error detection does NOT do
its job well.  I do not think this is because nobody has ever looked
at the problem.

The lack of concrete information to support the case that TCP + link
layer error detection is inadequate has us chasing our tails.

Given the layer iSCSI occupies in the protocol layer cake, if we don't
try to solve what is presently assigned to a lower layer, it seems
quite comfortable to shim additional checks or recovery, or even a
completely different transport substrate underneath if we do discover
TCP + link layer error detection is not doing the trick, but it really
seems like folly to engineer based upon an assumption that nobody has
done a good job documenting.

Steph






From owner-ips@ece.cmu.edu  Tue Apr 10 10:09:13 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id KAA15640
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 10:03:37 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ABH9C19506
	for ips-outgoing; Tue, 10 Apr 2001 07:17:09 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ABGxr19499
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 07:17:00 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id NAA129390
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 13:16:51 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id NAA220234
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 13:14:48 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A2A.003DA635 ; Tue, 10 Apr 2001 13:13:22 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A2A.003DA5C7.00@d12mta02.de.ibm.com>
Date: Tue, 10 Apr 2001 14:21:49 +0300
Subject: Re: iSCSI: session login and ISID
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



WWUI can be presented during login phase (2.10.9 is correct and in-line
with 1.2.7). Two sessions can have the same ISID but will have different
TSID. The question of whether more than one session should be allowed
between a pair of WWUIs is under debate.

Julo

sandeepj@research.bell-labs.com (Sandeep Joshi) on 09/04/2001 20:27:46

Please respond to sandeepj@research.bell-labs.com (Sandeep Joshi)

To:   ips@ece.cmu.edu
cc:
Subject:  iSCSI: session login and ISID




There seems to be a problem in distinguishing session logins, using
only the ISID field in the Login Command.   It is possible that
different initiators could try to start a session using the same ISID
value.   One of those attempts will get rejected, since the ISID is
the sole key to find if a session already exists. (note: TSID was
sent as zero for the leading connection of session)

The initiator WWUI does not seem to be available at this time.
a) Appendix D.10 states that InitiatorWWUI is optional and defaults
   to iSCSI.
b) Section 2.10.9 on Login Command states that "initiator MAY provide
   some basic parameters".

On the other hand, Section 1.2.7 states that "the initiator MUST
present both its initiator WWUI and target WWUI to which it wishes
to connect during the login phase".

The WWUI is also needed if we are to support multiple I_T nexuses
between the same initiator and target.

So it seems like Section 1.2.7 has the right spec.   Appendix D and
Section 2.10.9 must then be corrected.  The descriptions in Sec 4.1
and Section 1.2.3 may also need to be changed to reflect the fact
that initiator WWUI must be supplied at session login.
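
To make the collision concrete, here is a small sketch of a target's session
lookup. It is an illustration only: SessionTable, key_fields and the host
names are invented for the example, and the real login exchange carries more
than these two values.

```python
class SessionTable:
    """Hypothetical target-side session lookup for leading logins."""

    def __init__(self, key_fields):
        # Which login values form the lookup key, e.g. ("isid",)
        # or ("wwui", "isid").
        self.key_fields = key_fields
        self.sessions = {}

    def login(self, initiator_wwui, isid):
        """Return True if a new session is created, False if rejected."""
        values = {"wwui": initiator_wwui, "isid": isid}
        key = tuple(values[f] for f in self.key_fields)
        if key in self.sessions:
            return False  # looks like an existing session, so reject
        self.sessions[key] = values
        return True
```

With key_fields=("isid",) a second initiator that happens to pick the same
ISID is rejected; with key_fields=("wwui", "isid") both logins succeed.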

-Sandeep






From owner-ips@ece.cmu.edu  Tue Apr 10 12:46:48 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA20490
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 12:46:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ABQ9a19928
	for ips-outgoing; Tue, 10 Apr 2001 07:26:09 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ABP9r19842
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 07:25:09 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id NAA23318
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 13:25:01 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id NAA120818
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 13:22:57 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A2A.003E64B9 ; Tue, 10 Apr 2001 13:21:30 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A2A.003E62A7.00@d12mta02.de.ibm.com>
Date: Tue, 10 Apr 2001 14:29:54 +0300
Subject: Re: Multiple targets behind a single IP address
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



It has to reach the target only once, by the end of the login phase.  Why
should it be in the BHS? Do you have to carry it on every PDU?

Julo

jojy_michael@agilent.com on 09/04/2001 21:51:29

Please respond to jojy_michael@agilent.com

To:   ips@ece.cmu.edu
cc:
Subject:  Multiple targets behind a single IP address




Section 1.2.7 states that WWUIs are used in iSCSI to provide a target
identifier for configurations that present multiple targets behind a single
IP address and port.

To provide this support I would expect the target WWUI to be in the BHS,
but, it does not seem to be there. Is the support missing from the spec?

- Jojy






From owner-ips@ece.cmu.edu  Tue Apr 10 13:05:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA21185
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 13:05:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AEfH601763
	for ips-outgoing; Tue, 10 Apr 2001 10:41:17 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AEePr01691
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 10:40:25 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id KAA21384;
	Tue, 10 Apr 2001 10:32:56 -0400
Received: from f3n42e (d03nm042h.boulder.ibm.com [9.99.140.42])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.95) with ESMTP id IAA174718;
	Tue, 10 Apr 2001 08:40:15 -0600
Importance: Normal
Subject: RE: Multiple targets behind a single IP address
To: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>,
        ips@ece.cmu.edu
From: "Jim Hafner" <hafner@almaden.ibm.com>
Date: Tue, 10 Apr 2001 07:40:13 -0700
Message-ID: <OFF09D5D6C.31421BA7-ON88256A2A.00502660@LocalDomain>
X-MIMETrack: Serialize by Router on D03NM042/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/10/2001 07:40:14 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Marjorie,

I think I have to ask some questions concerning your quoted statement that
the "TCP connection is the target identifier".    First, the term "target
identifier" is a heavily loaded term in SCSI; is that the sense you meant
it in? Second, how do multiple connections per session, particularly with
connections spanning multiple TCP entities (IPaddresses and IPports) fit in
your picture?  Third, is it the connection itself or the ipaddress/ipport
you're referring to?

Thanks,
Jim Hafner


"KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>@ece.cmu.edu
on 04-09-2001 05:59:54 PM

Sent by:  owner-ips@ece.cmu.edu


To:   "'jojy_michael@agilent.com'" <jojy_michael@agilent.com>,
      ips@ece.cmu.edu
cc:
Subject:  RE: Multiple targets behind a single IP address



The Target Node Address is used in the login phase of the iSCSI protocol.
In order to attempt login, a TCP connection must be first set up.  It is
this connection which is associated with the target when the initiator is
successfully logged in (any PDUs received on that connection are addressed
to the "target the initiator logged into").  There is no connection
association to a target until an initiator is successfully logged in, and
any PDUs received before successful login that are not related to the login
phase (such as a command PDU) should be discarded (not "rejected", since
that could constitute a denial-of-service attack opportunity).

If an initiator is talking to multiple targets behind a single IP address,
it will have at least 1 TCP connection to each target.

The short answer is "the TCP connection is the target identifier" :-)
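
A minimal sketch of that association, assuming a portal that serves several
targets. The class name, the PDU type strings and the target names are
placeholders for the example, and login is collapsed into a single step.

```python
class Connection:
    """Hypothetical per-TCP-connection state on a multi-target portal."""

    def __init__(self, known_targets):
        self.known_targets = set(known_targets)
        self.target = None  # unbound until login succeeds

    def receive(self, pdu_type, target_name=None):
        """Decide what to do with one incoming PDU."""
        if self.target is None:
            if pdu_type != "login":
                # Silently dropped rather than rejected (see above).
                return "discard"
            if target_name not in self.known_targets:
                return "login-failed"
            # From here on the connection itself identifies the target.
            self.target = target_name
            return "login-ok"
        return "deliver:" + self.target
```

After a successful login no PDU needs to carry the target name; the
connection it arrives on selects the target.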

Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com

> -----Original Message-----
> From: jojy_michael@agilent.com [mailto:jojy_michael@agilent.com]
> Sent: Monday, April 09, 2001 11:51 AM
> To: ips@ece.cmu.edu
> Subject: Multiple targets behind a single IP address
>
>
> Section 1.2.7 states that WWUIs are used in iSCSI to provide a target
> identifier for configurations that present multiple targets
> behind a single
> IP address and port.
>
> To provide this support I would expect the target WWUI to be
> in the BHS,
> but, it does not seem to be there. Is the support missing
> from the spec?
>
> - Jojy
>





From owner-ips@ece.cmu.edu  Tue Apr 10 13:07:43 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA21226
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 13:07:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AF4HT03539
	for ips-outgoing; Tue, 10 Apr 2001 11:04:17 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AF44r03521
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 11:04:04 -0400 (EDT)
Received: from hpbs5000.boi.hp.com (hpbs5000.boi.hp.com [15.56.8.201])
	by palrel3.hp.com (Postfix) with ESMTP id 3FB1D5F1
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 08:04:03 -0700 (PDT)
Received: from xpabh3.corp.hp.com (xpabh3.corp.hp.com [15.58.136.223]) by hpbs5000.boi.hp.com with ESMTP (8.8.6 (PHNE_17135)/8.8.6 SMKit7.02) id JAA13521 for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 09:04:02 -0600 (MDT)
Received: by xpabh3.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <2M5MZ4VA>; Tue, 10 Apr 2001 08:03:49 -0700
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08F9A@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "Ips Reflector (E-mail)" <ips@ece.cmu.edu>
Subject: draft-ietf-ips-iscsi-rqmts-02.txt
Date: Tue, 10 Apr 2001 08:03:48 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

I've posted a new revision of the iSCSI Requirements Doc to the internet
drafts.  It is posted on Julian's website, the URL is
http://www.haifa.il.ibm.com/satran/ips/draft-ietf-ips-iscsi-rqmts-02.txt

Please review this document before the IPS interim meeting in Nashua
(5/1/01) - we will have an informal call for consensus to submit this
document as an informational RFC.

Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com 


From owner-ips@ece.cmu.edu  Tue Apr 10 13:08:16 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA21249
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 13:07:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AF4LY03547
	for ips-outgoing; Tue, 10 Apr 2001 11:04:21 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AF3nr03513
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 11:03:49 -0400 (EDT)
Received: from colosus2.cup.hp.com (colosus2.cup.hp.com [15.13.128.145])
	by palrel1.hp.com (Postfix) with ESMTP id AC0CCAC8
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 08:03:34 -0700 (PDT)
Received: from hp.com (IDENT:plabat@pl703521.cup.hp.com [15.13.133.216])
	by colosus2.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id IAA03121;
	Tue, 10 Apr 2001 08:03:32 -0700 (PDT)
Message-ID: <3AD322D6.FC79F5B1@hp.com>
Date: Tue, 10 Apr 2001 08:12:22 -0700
From: Pierre Labat <pierre_labat@hp.com>
Organization: Hewlett Packard ATM-SISL
X-Mailer: Mozilla 4.51 [en] (X11; I; Linux 2.2.5-15 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI: session login and ISID
References: <C1256A2A.003DA5C7.00@d12mta02.de.ibm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

julian_satran@il.ibm.com wrote:

> WWUI can be presented during login phase (2.10.9 is correct and in-line
> with 1.2.7). Two sessions can have the same ISID but will have different
> TSID. The question of whether more than one session should be allowed
> between a pair of WWUIs is under debate.
>

I think that several sessions should be allowed between a pair of WWUIs,
at least to handle the simple case where the initiator and the target have
two adapters each and each session wants to use a pair of adapters.

Regards,

Pierre

>
> Julo
>
> sandeepj@research.bell-labs.com (Sandeep Joshi) on 09/04/2001 20:27:46
>
> Please respond to sandeepj@research.bell-labs.com (Sandeep Joshi)
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  iSCSI: session login and ISID
>
> There seems to be a problem in distinguishing session logins, using
> only the ISID field in the Login Command.   It is possible that
> different initiators could try to start a session using the same ISID
> value.   One of those attempts will get rejected, since the ISID is
> the sole key to find if a session already exists. (note: TSID was
> sent as zero for the leading connection of session)
>
> The initiator WWUI does not seem to be available at this time.
> a) Appendix D.10 states that InitiatorWWUI is optional and defaults
>    to iSCSI.
> b) Section 2.10.9 on Login Command states that "initiator MAY provide
>    some basic parameters".
>
> On the other hand, Section 1.2.7 states that "the initiator MUST
> present both its initiator WWUI and target WWUI to which it wishes
> to connect during the login phase".
>
> The WWUI is also needed if we are to support multiple I_T nexuses
> between the same initiator and target.
>
> So it seems like Section 1.2.7 has the right spec.   Appendix D and
> Section 2.10.9 must then be corrected.  The descriptions in Sec 4.1
> and Section 1.2.3 may also need to be changed to reflect the fact
> that initiator WWUI must be supplied at session login.
>
> -Sandeep



From owner-ips@ece.cmu.edu  Tue Apr 10 15:24:50 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA24438
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 15:24:48 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AGtJ711455
	for ips-outgoing; Tue, 10 Apr 2001 12:55:19 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e33.bld.us.ibm.com (e33.co.us.ibm.com [32.97.110.131])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AGsUr11409
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 12:54:30 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e33.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id LAA19964;
	Tue, 10 Apr 2001 11:48:42 -0500
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.95) with ESMTP id KAA69478;
	Tue, 10 Apr 2001 10:54:24 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: Re: iSCSI: session login and ISID
To: Pierre Labat <pierre_labat@hp.com>
Cc: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OFA5558F80.26F9149C-ON88256A2A.0059D600@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Tue, 10 Apr 2001 09:54:14 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/10/2001 10:54:24 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Pierre, et al.
There should be no reason that a specific OS cannot have multiple sessions
to the same Target, as long as the ISID is different.  It is up to the
installation/vendor to determine how to arrive at the ISID.  They can also
determine that they want each iSCSI HBA to have a different ISID, and
arrive at unique ISIDs for each.  Then, as we discussed months and months
ago, it is up to a Wedge Driver to sort out who gets what requests, and
to attempt load balancing.

The wedge driver is needed to make sure that the application does not have
to care which Session it is connected to, since it has no way to
specify it anyhow.  That is, the wedge driver needs to ensure that it sends
no I/O to a LU on a specific session if it has sent previous I/O to that
same LU on a different session and still has that I/O outstanding.
Otherwise, out of order things will happen.  But this is a fairly easy
thing to handle in the Wedge Driver.  There are some issues with
Persistent Reserve, but those can be handled also, by sending I/O for
the Reserved LUs only on the same Session on which the Reserve was issued.

Remember, it is only the multiple connections per session model that does
not require the Wedge Driver in order to handle multiple connections.
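
A rough sketch of the ordering rule described above, for illustration only.
WedgeDriver, submit and complete are invented names, the round-robin choice
stands in for whatever load balancing a real driver would use, and the
reservation handling is left out.

```python
class WedgeDriver:
    """Hypothetical dispatcher that spreads LU I/O across sessions."""

    def __init__(self, sessions):
        self.sessions = list(sessions)
        self.owner = {}        # LU -> session carrying its outstanding I/O
        self.outstanding = {}  # LU -> number of I/Os not yet completed
        self.next = 0          # round-robin cursor

    def submit(self, lun):
        """Pick the session for a new I/O to the given LU."""
        if self.outstanding.get(lun, 0) > 0:
            # I/O to this LU is still in flight: stay on the same session
            # so that requests cannot pass each other.
            session = self.owner[lun]
        else:
            session = self.sessions[self.next % len(self.sessions)]
            self.next += 1
            self.owner[lun] = session
        self.outstanding[lun] = self.outstanding.get(lun, 0) + 1
        return session

    def complete(self, lun):
        """Record completion of one I/O to the given LU."""
        self.outstanding[lun] -= 1
```

An LU may move to another session only once all of its earlier I/O has
completed.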


.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


Pierre Labat <pierre_labat@hp.com>@ece.cmu.edu on 04/10/2001 08:12:22 AM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI: session login and ISID



julian_satran@il.ibm.com wrote:

> WWUI can be presented during login phase (2.10.9 is correct and in-line
> with 1.2.7). Two sessions can have the same ISID but will have different
> TSID. The question of whether more than one session should be allowed
> between a pair of WWUIs is under debate.
>

I think that several sessions should be allowed between a pair of WWUIs,
at least to handle the simple case where the initiator and the target have
two adapters each and each session wants to use a pair of adapters.

Regards,

Pierre

>
> Julo
>
> sandeepj@research.bell-labs.com (Sandeep Joshi) on 09/04/2001 20:27:46
>
> Please respond to sandeepj@research.bell-labs.com (Sandeep Joshi)
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  iSCSI: session login and ISID
>
> There seems to be a problem in distinguishing session logins, using
> only the ISID field in the Login Command.   It is possible that
> different initiators could try to start a session using the same ISID
> value.   One of those attempts will get rejected, since the ISID is
> the sole key to find if a session already exists. (note: TSID was
> sent as zero for the leading connection of session)
>
> The initiator WWUI does not seem to be available at this time.
> a) Appendix D.10 states that InitiatorWWUI is optional and defaults
>    to iSCSI.
> b) Section 2.10.9 on Login Command states that "initiator MAY provide
>    some basic parameters".
>
> On the other hand, Section 1.2.7 states that "the initiator MUST
> present both its initiator WWUI and target WWUI to which it wishes
> to connect during the login phase".
>
> The WWUI is also needed if we are to support multiple I_T nexuses
> between the same initiator and target.
>
> So it seems like Section 1.2.7 has the right spec.   Appendix D and
> Section 2.10.9 must then be corrected.  The descriptions in Sec 4.1
> and Section 1.2.3 may also need to be changed to reflect the fact
> that initiator WWUI must be supplied at session login.
>
> -Sandeep
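
The lookup problem described in the quoted message can be sketched in a few
lines of Python.  This is only an illustration under stated assumptions: the
class name, the "reject on duplicate key" behaviour, and the TSID counter are
hypothetical and are not taken from the draft.  It contrasts a session table
keyed on ISID alone with one keyed on (InitiatorWWUI, ISID).

```python
from dataclasses import dataclass, field


@dataclass
class SessionTable:
    """Toy target-side session table (hypothetical, not from the draft)."""
    use_wwui: bool                      # key on (WWUI, ISID) instead of ISID
    sessions: dict = field(default_factory=dict)
    next_tsid: int = 1                  # assumed: target hands out TSIDs

    def login(self, initiator_wwui, isid):
        """Leading login (TSID sent as zero). Returns a TSID or None."""
        key = (initiator_wwui, isid) if self.use_wwui else isid
        if key in self.sessions:
            return None                 # looks like an existing session
        tsid = self.next_tsid
        self.next_tsid += 1
        self.sessions[key] = tsid
        return tsid


# Two different initiators that happen to pick the same ISID value.
by_isid = SessionTable(use_wwui=False)
print(by_isid.login("initiator-a", 7), by_isid.login("initiator-b", 7))

by_wwui = SessionTable(use_wwui=True)
print(by_wwui.login("initiator-a", 7), by_wwui.login("initiator-b", 7))
```

With the ISID-only key the second initiator is turned away; with the WWUI in
the key both logins succeed, which is the argument for requiring the
initiator WWUI at session login.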






From owner-ips@ece.cmu.edu  Tue Apr 10 15:25:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA24450
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 15:25:16 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AHIKG12987
	for ips-outgoing; Tue, 10 Apr 2001 13:18:20 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AHIBr12977
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 13:18:11 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <2G6LCGDG>; Tue, 10 Apr 2001 13:08:56 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015404@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces
Date: Tue, 10 Apr 2001 13:18:05 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Marj,

> How does this relate to URLs?  Does this mean we can't specify a URL format
> for iSCSI resources?  Can you provide the logic behind the IESG pronouncing
> that the IESG won't approve another global namespace?

Use of URLs is fine (existing namespace), as is use of URL format.  The word
"global" has a number of meanings - in this context it has to do with scope
and context, namely globally unique identifiers for anything on the globe
that needs a unique identifier.  While it may not have been the intention to do
so, the WWUI approach leads to a new global naming framework.  The
IESG is saying "NO" to that sort of general framework in no uncertain terms
and asking the WG to focus on the storage naming problems we need to
solve and not go about defining a framework that is extensible to solve a
whole batch of other unrelated problems.

At the next level down, there are three sorts
of issues floating in here:
- Semantics: Unique identifiers, global scope,
	not tied to communication endpoints.
- Syntax: How those identifiers are represented
	in messages and related issues of control
	over use and extension of that representation
- Description: How the syntax is documented and
	the resulting suggestions/implications for
	its use.
To a first approximation, the above list is in
the order of importance to iSCSI (Semantics is
most important) and in REVERSE order of importance
to the IESG/IAB issues I've raised (i.e., the
biggest issues are with the Description, believe
it or not :-)).

Let me know if I need to explain more,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Tue Apr 10 16:30:15 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA25564
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 16:30:13 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AHfIr14568
	for ips-outgoing; Tue, 10 Apr 2001 13:41:18 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AHf1r14555
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 13:41:01 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT310H1>; Tue, 10 Apr 2001 13:42:25 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015405@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic reco
	very
Date: Tue, 10 Apr 2001 13:40:49 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> > Immediate commands are "out of sequence" by definition - immediate
> > means deliver immediately without regard to CmdSN order.  Only the
> > target can do this, so I don't see any problem with relying on the
> > Target to do so.
>
> The commands that are NOW out of sequence are those commands bypassed by
> the immediate command.  If this immediate command was Rewind, then a long
> list of write commands become invalid as a new tape is now needed for those
> operations.  You are expecting the target to handle a long list of
> invalid commands made possible by the iSCSI sequencer.  Commands
> invalidated by immediate commands become out of sequence commands.

And I still don't see a problem here.  Tapes have to have logic
to deal with a write issued when no tape is loaded - this situation
is yet another way to invoke that logic.  The target was asked
to execute the Rewind immediately, did so, and now processes the
writes in its new state of not having a tape loaded (presumably,
they all result in check conditions).  Even if we do something
clever in the iSCSI "sequencer", the same behavior can result if
the write commands have been delivered to SCSI and queued in SCSI.
I don't see the point of adding a new way (iSCSI rejects vs. SCSI
check conditions) for what's essentially the same set of errors
(tape was rewound, therefore can't be written) to manifest themselves.

If the Initiator didn't want this
behavior, it should have done something else, such as not
using immediate delivery for the Rewind, or sending
some suitable Task or Task Set Aborts.  

> I'll reserve further comment until Julian has made some progress in
> addressing these concerns.  I also suspect it is a mistake to allow
> commands to remain trapped within the sequencer during emergency or
> abnormal events signified by the use of these immediate commands.

Resetting the sequencer probably involves a LUN or Target
Reset, which may be necessary in any case to clean up
from these sorts of emergency or abnormal events.  Keep
in mind that SCSI Task Management is inherently
non-deterministic, and so asking for completely predictable
behavior in these sorts of circumstances is just plain
unreasonable.

--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Tue Apr 10 16:30:43 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA25575
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 16:30:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AHwLH15852
	for ips-outgoing; Tue, 10 Apr 2001 13:58:21 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AHvTr15793
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 13:57:29 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <2G6LC25H>; Tue, 10 Apr 2001 13:48:14 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015406@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic reco
	very
Date: Tue, 10 Apr 2001 13:57:22 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Doug,

There are a couple of confusions in here:

TASK_SET_FULL *is* a flow control mechanism, albeit a
rather primitive one - the Initiator issues commands until
it gets a FULL response and hence knows the capacity
of the Target's queue.  The statement that the Target loses
resource control if it only uses TASK_SET_FULL is wrong,
although it is relying on code at or above the SCSI driver
on the Initiator to handle the flow control.

Santosh did not propose to remove CmdSN, although he
did point out an issue that needs attention - we need
to say something about the interaction of CmdSN with
TASK_SET_FULL since both mechanisms may be implemented.
In particular, it's possible to use CmdSN for delivery
ACKs and TASK_SET_FULL for flow control, although that's
a "peculiar" combination (and perhaps a SHOULD NOT?).

Beyond this, your entire second paragraph is well off
into the weeds.  Please be careful about taking the
discussion off onto wild tangents.

Thanks,
--David
---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------

From:	Douglas Otis [dotis@sanlight.net]
Sent:	Tuesday, April 10, 2001 2:09 AM
To:	santoshr@cup.hp.com
Cc:	Ips
Subject:	RE: iSCSI:flow control, acknowledgement, and a deterministic
recovery

Santosh,

In a change recommending a flag and a valid CmdSN and not a null value, I
included a flag named "Casual" to provide this mode of operation you
describe which still allows flow control.  I would not wish to permit this
Casual mode, but as David feels such lists or threads are easy to manage,
this should not represent any greater effort for feedback.  But then again,
if you are not using flow control, you are also not using any
acknowledgement other than an eventual SCSI status.  If there is no flow
control, this removes resource control and reception acknowledgement.  If
this is not needed, then such flow control should be removed for simplicity
as there then must always be an abundance by design of adequate resources.
For cases where command sequence is important, CmdSN together with
acknowledgement must remain or only one command at a time can be issued.  I
can envision three modes, a Sequential mode, a Casual mode, and an Exigent
mode where the command is given head of sequence status (with rejection for
feedback simplicity ;).

You seem to be placing many things on the table for reconsideration.  Flow
control, acknowledgement, sequential delivery, and perhaps even a means of
moving a command up the queue feeding SCSI targets.  One command at a time
for sequential assurance is not impossible.  Is flow control needed?  Do you
care if there is any reception acknowledgement if all feedback is in the
SCSI layer?  It would seem in this barebones case, iSCSI adds little to
ensure integrity.  The FC encapsulation places a time stamp within the PDU.
Perhaps a time stamp should be employed as an alternative to sequence
control?  At least after a timeout, you are assured a missing command is
forever gone.  Otherwise, the check was to see that CmdSN is not too small
to ensure it is not a stray.

Doug


> Doug [& All],
>
> Some comments on this thread :
>
> 1) The immediate command feature can be exploited by initiators which
> are not required to provide strict command ordering for their SCSI
> sub-system. (The majority of today's scsi stacks). IOW, all commands can
> be sent with CmdSN=0 to indicate no ordering required.
>
> Any scheme to restrict the number of "immediate commands" that can be
> sent prohibits such a feature and is un-desirable.
>
> 2) The (ExpCmdSN, MaxCmdSN) based acknowledgement and flow control is a
> freebie that targets can use to implement additional flow control.
> There's nothing in the spec that mandates that a target MUST use this
> feature to throttle its window and apply flow control.
>
> Hence, any dependence and assumptions on the existence of CmdSN based
> flow control may be incorrect, since the target may be using QUEUE FULL
> (TASK SET FULL) to apply flow control.
>
> > I also suspect it is a mistake to allow commands
> > to remain trapped within the sequencer during emergency or
> > abnormal events signified by the use of these immediate commands.
>
> 3) If the initiator suspects that the CmdSN queue at the target is
> stuck, it can always use a task mgmt command or even an iSCSI login with
> the X (restart) bit to perform a cleanup of stuck commands at the
> target. There's no need to build in implicit assumptions that a CmdSN
> with the immediate flag result in a clean-up of previously pending
> commands at the CmdSN queue of the target.
>
> - Santosh
>
>
> Douglas Otis wrote:
> >
> > David,
> >
> > > > To not support rejects, you are then relying on the target to
> > > > handle these now out of sequence commands.
> > >
> > > Immediate commands are "out of sequence" by definition - immediate
> > > means deliver immediately without regard to CmdSN order.  Only the
> > > target can do this, so I don't see any problem with relying
> > > on the Target to do so.
> >
> > The commands that are NOW out of sequence are those commands
> > bypassed by the immediate command.  If this immediate command was
> > Rewind, then a long list of write commands become invalid as a new
> > tape is now needed for those operations.  You are expecting the
> > target to handle a long list of invalid commands made possible by
> > the iSCSI sequencer.  Commands invalidated by immediate commands
> > become out of sequence commands.
> >
> > I'll reserve further comment until Julian has made some progress in
>
> >
> > Forgive my advocacy, but if one does not attempt to support a
> > position, then the subject is not explored.
> >
> > Doug




From owner-ips@ece.cmu.edu  Tue Apr 10 17:56:03 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA26873
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 17:56:02 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AJVM522511
	for ips-outgoing; Tue, 10 Apr 2001 15:31:22 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AJV2r22496
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 15:31:02 -0400 (EDT)
Received: from xparelay2.corp.hp.com (xparelay2.corp.hp.com [15.58.137.112])
	by palrel3.hp.com (Postfix) with ESMTP
	id 331AC7D8; Tue, 10 Apr 2001 12:31:01 -0700 (PDT)
Received: from xatlbh2.atl.hp.com (xatlbh2.atl.hp.com [15.45.89.187])
	by xparelay2.corp.hp.com (Postfix) with ESMTP
	id 5FD4B1F508; Tue, 10 Apr 2001 15:29:28 -0400 (EDT)
Received: by xatlbh2.atl.hp.com with Internet Mail Service (5.5.2653.19)
	id <2M5J28V3>; Tue, 10 Apr 2001 15:30:41 -0400
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08F9D@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "'Jim Hafner'" <hafner@almaden.ibm.com>, ips@ece.cmu.edu
Subject: RE: Multiple targets behind a single IP address
Date: Tue, 10 Apr 2001 15:30:32 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Oops, sorry for using that loaded term!  I mean "identifier" in the most
generic way and didn't mean to imply that this is a SCSI target identifier!
I think Jojy was asking what part of the iSCSI PDU indicates which target
this packet is addressed to, and I was trying to convey that the connection
the PDU is received on should provide the association with the target.  You
are correct in pointing out that the association is actually higher than
that, in that a connection is associated with a session, and it's the
session that identifies the target.

To your third point, the connection = IP address + port pair in TCP.
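
As a rough illustration of the association described above (connection to
session, session to target), here is a minimal Python sketch.  Everything in
it is an assumption made for the example: the class and method names, the
use of a (local IP, local port, remote IP, remote port) tuple as the
connection key, and the addresses and port numbers themselves.  PDUs that
arrive on a connection with no completed login map to no target and would be
silently discarded.

```python
class Portal:
    """Toy lookup: TCP connection -> session -> target (hypothetical)."""

    def __init__(self):
        self.session_of_conn = {}    # connection tuple -> session id
        self.target_of_session = {}  # session id -> target name

    def login(self, conn, session_id, target_name):
        # The first login on a session fixes its target; later connections
        # added to the same session inherit that target.
        self.target_of_session.setdefault(session_id, target_name)
        self.session_of_conn[conn] = session_id

    def target_for(self, conn):
        """Return the target a PDU on `conn` is addressed to, else None."""
        session_id = self.session_of_conn.get(conn)
        if session_id is None:
            return None              # not logged in: discard, do not reject
        return self.target_of_session[session_id]


portal = Portal()
c1 = ("10.0.0.9", 3260, "10.0.0.1", 40001)   # example values only
c2 = ("10.0.0.9", 3260, "10.0.0.1", 40002)
portal.login(c1, session_id=1, target_name="disk-a")
portal.login(c2, session_id=1, target_name="disk-a")
print(portal.target_for(c1), portal.target_for(c2))
```

No target name needs to travel in each PDU header under this model, because
the lookup is driven entirely by which connection the PDU arrived on.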

Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com 

> -----Original Message-----
> From: Jim Hafner [mailto:hafner@almaden.ibm.com]
> Sent: Tuesday, April 10, 2001 7:40 AM
> To: KRUEGER,MARJORIE (HP-Roseville,ex1); ips@ece.cmu.edu
> Subject: RE: Multiple targets behind a single IP address
> 
> 
> 
> Marjorie,
> 
> I think I have to ask some questions concerning your quoted statement that
> the "TCP connection is the target identifier".  First, the term "target
> identifier" is a heavily loaded term in SCSI; is that the sense in which
> you meant it?  Second, how do multiple connections per session,
> particularly with connections spanning multiple TCP entities (IP addresses
> and IP ports), fit in your picture?  Third, is it the connection itself or
> the IP address/IP port you're referring to?
> 
> Thanks,
> Jim Hafner
> 
> 
> "KRUEGER,MARJORIE (HP-Roseville,ex1)" 
> <marjorie_krueger@hp.com>@ece.cmu.edu
> on 04-09-2001 05:59:54 PM
> 
> Sent by:  owner-ips@ece.cmu.edu
> 
> 
> To:   "'jojy_michael@agilent.com'" <jojy_michael@agilent.com>,
>       ips@ece.cmu.edu
> cc:
> Subject:  RE: Multiple targets behind a single IP address
> 
> 
> 
> The Target Node Address is used in the login phase of the iSCSI protocol.
> In order to attempt login, a TCP connection must be first set up.  It is
> this connection which is associated with the target when the initiator is
> successfully logged in (any PDUs received on that connection are addressed
> to the "target the initiator logged into").  There is no connection
> association to a target until an initiator is successfully logged in, and
> any PDUs received before successful login that are not related to the login
> phase (such as a command PDU) should be discarded (not "rejected", since
> that could constitute a denial-of-service attack opportunity).
> 
> If an initiator is talking to multiple targets behind a single IP address,
> it will have at least 1 TCP connection to each target.
> 
> The short answer is "the TCP connection is the target identifier" :-)
> 
> Marjorie Krueger
> Networked Storage Architecture
> Networked Storage Solutions Org.
> Hewlett-Packard
> tel: +1 916 785 2656
> fax: +1 916 785 0391
> email: marjorie_krueger@hp.com
> 
> > -----Original Message-----
> > From: jojy_michael@agilent.com [mailto:jojy_michael@agilent.com]
> > Sent: Monday, April 09, 2001 11:51 AM
> > To: ips@ece.cmu.edu
> > Subject: Multiple targets behind a single IP address
> >
> >
> > Section 1.2.7 states that WWUIs are used in iSCSI to provide a target
> > identifier for configurations that present multiple targets behind a
> > single IP address and port.
> >
> > To provide this support I would expect the target WWUI to be in the BHS,
> > but it does not seem to be there.  Is the support missing from the spec?
> >
> > - Jojy
> >
> 
> 
> 


From owner-ips@ece.cmu.edu  Tue Apr 10 17:56:21 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA26885
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 17:56:20 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AKrMH28291
	for ips-outgoing; Tue, 10 Apr 2001 16:53:22 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AKqsr28271
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 16:52:54 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3AM0x004036;
	Tue, 10 Apr 2001 15:01:04 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Tue, 10 Apr 2001 13:51:00 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJOECICGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <0F31E5C394DAD311B60C00E029101A0708015405@corpmx9.isus.emc.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

The example that I gave perhaps is not a valid case but I wanted to
illustrate which commands become invalidated by a command that bypasses the
sequencer.  Clearly, it is not the command that does the bypassing but those
commands bypassed.  My concern is regarding the effect this has with respect
to existing interfaces where there is not this potential for a large queue
of commands that cannot be purged except through action at the target at a
much later point in time.  The interface offered by iSCSI is perhaps not the
same as that seen by the iSCSI to SCSI layer.

Just as you could use a non-immediate SCSI command (which is why exigent is a
good name), you will also likely wish to insert additional commands, change the
reference on those issued, and the like.  With this being a potentially long
list of commands, you will not know when this list is exhausted until you
see ExpCmdSN has passed these commands.  The period of time for this process
is potentially long and will likely involve timeouts on other targets if a
tape device goes Busy.  The iSCSI sequencer could become generous as to the
concept of sequential delivery.  You imply that if staged in order
initially, then when these items are actually sent to the SCSI target layer
is not a concern.  That is not what the proposal says!  Now you have a three
minute non-immediate Rewind to wait for!

Ver 5 Pg 11.
   "The iSCSI target layer MUST deliver the commands to the SCSI target
   layer in the order specified by CmdSN."

If these commands are delivered to the target, then SCSI has the means to
handle these commands.  If stuck in the sequencer, there is NO means to
handle these commands until they poke their way through the iSCSI sequencer.
Expect this to impact the nature of SCSI handling as you suggested.

Rejecting commands within the sequencer should have zero effect on any SCSI
target.  None of these commands have ever been issued to the SCSI layer.  It
is simply a means to return these commands to the initiator to allow the
normal process of flushing.  Should there be commands to reissue, these
commands are reissued.  There is no need to reset anything.  Not requiring
commands bypassed to be rejected will cause someone to suggest a need for a
sequencer reset.  I think that an automatic rejection of bypassed commands
is the safest means to handle this situation.  Otherwise, you are going to
need to write a relaxed set of rules where sequencing happens at a point
that is never apparent or acknowledged.  The state of the system becomes
abstract.  At least with rejection, ExpCmdSN makes it extremely clear what
has been delivered to the SCSI layer.  With your scheme, you never know.

I am asking for predictable behavior in the event of the sequencer being
bypassed.  This allows for the management commands to be issued without risk
of being prevented by the command window while not being able to predict how
many commands it will take to handle these events.  Rejecting pending
commands within the sequencer at the event of a bypass should be harmless
and painless.

Doug
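
The two behaviours being argued over can be compared with a small Python
sketch.  It is a simplified model, not the draft's algorithm: the class name,
the `reject_on_bypass` switch, and the idea that "bypassed" commands are the
ones held while waiting for a CmdSN gap to fill are all assumptions made for
illustration, and serial-number wraparound is ignored.

```python
class Sequencer:
    """Toy CmdSN sequencer in front of the SCSI target layer (hypothetical)."""

    def __init__(self, first_cmd_sn=1, reject_on_bypass=False):
        self.exp_cmd_sn = first_cmd_sn   # next CmdSN to hand to SCSI
        self.held = {}                   # CmdSN -> command waiting on a gap
        self.delivered = []              # what the SCSI layer has seen
        self.rejected = []               # commands returned to the initiator
        self.reject_on_bypass = reject_on_bypass

    def receive(self, cmd, cmd_sn=None, immediate=False):
        if immediate:
            # Immediate commands skip the ordering check entirely.
            if self.reject_on_bypass:
                # Proposed variant: hand every bypassed command back.
                for sn in sorted(self.held):
                    self.rejected.append(self.held.pop(sn))
            self.delivered.append(cmd)
            return
        self.held[cmd_sn] = cmd
        # Deliver in CmdSN order for as long as there is no gap.
        while self.exp_cmd_sn in self.held:
            self.delivered.append(self.held.pop(self.exp_cmd_sn))
            self.exp_cmd_sn += 1


seq = Sequencer(reject_on_bypass=True)
seq.receive("write-1", cmd_sn=1)
seq.receive("write-3", cmd_sn=3)          # held: CmdSN 2 has not arrived
seq.receive("rewind", immediate=True)
print(seq.delivered, seq.rejected, seq.exp_cmd_sn)
```

With `reject_on_bypass=False` the held write stays in the sequencer and is
delivered after the Rewind once the gap fills, which is the case where the
SCSI layer's own error handling has to deal with it.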


> > > Immediate commands are "out of sequence" by definition - immediate
> > > means deliver immediately without regard to CmdSN order.  Only the
> > > target can do this, so I don't see any problem with relying on the
> > > Target to do so.
> >
> > The commands that are NOW out of sequence are those commands bypassed by
> > the immediate command.  If this immediate command was Rewind, then a long
> > list of write commands become invalid as a new tape is now needed for those
> > operations.  You are expecting the target to handle a long list of
> > invalid commands made possible by the iSCSI sequencer.  Commands
> > invalidated by immediate commands become out of sequence commands.
>
> And I still don't see a problem here.  Tapes have to have logic
> to deal with a write issued when no tape is loaded - this situation
> is yet another way to invoke that logic.  The target was asked
> to execute the Rewind immediately, did so, and now processes the
> writes in its new state of not having a tape loaded (presumably,
> they all result in check conditions).  Even if we do something
> clever in the iSCSI "sequencer", the same behavior can result if
> the write commands have been delivered to SCSI and queued in SCSI.
> I don't see the point of adding a new way (iSCSI rejects vs. SCSI
> check conditions) for what's essentially the same set of errors
> (tape was rewound, therefore can't be written) to manifest themselves.
>
> If the Initiator didn't want this
> behavior, it should have done something else, such as not
> using immediate delivery for the Rewind, or sending
> some suitable Task or Task Set Aborts.
>
> > I'll reserve further comment until Julian has made some progress in
> > addressing these concerns.  I also suspect it is a mistake to allow
> > commands to remain trapped within the sequencer during emergency or
> > abnormal events signified by the use of these immediate commands.
>
> Resetting the sequencer probably involves a Lun or Target
> Reset, which may be necessary in any case to clean up
> from these sort of emergency or abnormal events.  Keep
> in mind that SCSI Task Management is inherently
> non-deterministic, and so asking for completely predictable
> behavior in these sort of circumstances is just plain
> unreasonable.
>
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
>



From owner-ips@ece.cmu.edu  Tue Apr 10 17:56:43 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA26897
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 17:56:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AJjOV23487
	for ips-outgoing; Tue, 10 Apr 2001 15:45:24 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AJisr23440
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 15:44:54 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3AKqS003982;
	Tue, 10 Apr 2001 13:52:29 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "KRUEGER,MARJORIE \(HP-Roseville,ex1\)" <marjorie_krueger@hp.com>,
        "Ips Reflector \(E-mail\)" <ips@ece.cmu.edu>
Subject: RE: draft-ietf-ips-iscsi-rqmts-02.txt
Date: Tue, 10 Apr 2001 12:42:30 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJOECGCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <6BD67FFB937FD411A04F00D0B74FE87802A08F9A@xrose06.rose.hp.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Marjorie,

XML?

Pg 6.
    "(5)  Development of specifications for iSCSI device management as MIBs,
         XML schemas, etc."

XML is a method for data markup.  Hopefully a URN assigned by IANA will
become the XML Namespace's name.  Standard practice within W3C is to not use
PUBLIC identifiers and instead use 'well known' SYSTEM identifiers.  Is an
XML schema in lieu of an LDAP schema to allow use of XML namespace?

Pg 4.
   "The format for the iSCSI names MUST use existing naming authorities.

   An iSCSI name SHOULD be a human readable string in an international
   character set encoding.

   Standard internet lookup services SHOULD be used to resolve iSCSI names."

Pg 7.

   "The iSCSI protocol MUST provide a means of identifying an iSCSI storage
   device by a unique identifier that is independent of the path on which it
   is found.  This name will be used to correlate alternate paths to the same
   device.  The format for the iSCSI names MUST use existing naming
   authorities, to avoid creating new central administrative tasks.  An iSCSI
   name SHOULD be a human readable string in an international character set
   encoding."

Would a naming authority include XML?  Why was an XML schema chosen over an
LDAP schema as follow-on documentation?  XML is relatively new compared to
LDAP and will require coordination with W3C standards bodies.

Example of XML data types:

<!NOTATION string
   SYSTEM "urn:schemas-microsoft-com:datatypes/string">
<!NOTATION number
   SYSTEM "urn:schemas-microsoft-com:datatypes/number">
<!NOTATION int
   SYSTEM "urn:schemas-microsoft-com:datatypes/int">

The associated URI may be some other notation. Optionally, such referents
may be Java classes, XSL stylesheets, et al.

For an XML schema example see:
http://www.w3.org/2001/XMLSchema.xsd


Doug

> I've posted a new revision of the iSCSI Requirements Doc to the internet
> drafts.  It is posted on Julian's website, the URL is
> http://www.haifa.il.ibm.com/satran/ips/draft-ietf-ips-iscsi-rqmts-02.txt
>
> Please review this document before the IPS interim meeting in Nashua
> (5/1/01) - we will have an informal call for consensus to submit this
> document as an informational RFC.
>
> Marjorie Krueger
> Networked Storage Architecture
> Networked Storage Solutions Org.
> Hewlett-Packard
> tel: +1 916 785 2656
> fax: +1 916 785 0391
> email: marjorie_krueger@hp.com
>



From owner-ips@ece.cmu.edu  Tue Apr 10 18:02:54 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA27035
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 18:02:53 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AKHMB25805
	for ips-outgoing; Tue, 10 Apr 2001 16:17:22 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AKGhr25768
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 16:16:43 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTQ8YH>; Tue, 10 Apr 2001 13:16:32 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B1733F1@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: ips@ece.cmu.edu
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic reco
	 very
Date: Tue, 10 Apr 2001 13:16:26 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> TASK_SET_FULL *is* a flow control mechanism, albeit a
> rather primitive one - the Initiator issues commands until
> it gets a FULL response and hence knows the capacity
> of the Target's queue.

Using TASK_SET_FULL to probe in this way is a guess at best. It's certainly
not behavior specified by any standard. Getting a TASK_SET_FULL after "N"
commands does not guarantee that the LU will have room for "N" commands in
the future. For efficiency reasons, logical units are loath to make any such
resource guarantee to individual initiators.  Earlier postings have
discussed this at some length.

> Santosh did not propose to remove CmdSN, although he
> did point out an issue that needs attention - we need
> to say something about the interaction of CmdSN with
> TASK_SET_FULL since both mechanisms may be implemented.
> In particular, it's possible to use CmdSN for delivery
> ACKs and TASK_SET_FULL for flow control, although that's
> a "peculiar" combination (and perhaps a SHOULD NOT?).
> 

The only thing to be said is that there is no interaction.  For example, an
iSCSI gateway fronting a bunch of FC devices has no real way to tell how
many commands each device has room for.  So, a logical unit may return a
TASK_SET_FULL response even though the CmdSN window was wide open.
Naturally, the converse is also true. I.e., the CmdSN window may close even
though logical units have room for additional commands.
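
A minimal sketch of that independence, in Python. The class names, the
"WINDOW_CLOSED" result, and the plain integer window test (which ignores
sequence-number wraparound) are illustrative assumptions, not anything taken
from the draft:

```python
from dataclasses import dataclass

TASK_SET_FULL = "TASK_SET_FULL"
GOOD = "GOOD"


@dataclass
class LogicalUnit:
    """A logical unit whose queue depth is private; the session cannot see it."""
    free_slots: int

    def submit(self) -> str:
        # The LU decides on its own whether it has room for one more task.
        if self.free_slots == 0:
            return TASK_SET_FULL
        self.free_slots -= 1
        return GOOD


@dataclass
class Session:
    """Session-wide command window advertised as (ExpCmdSN, MaxCmdSN)."""
    exp_cmd_sn: int
    max_cmd_sn: int

    def window_open(self, cmd_sn: int) -> bool:
        # Hypothetical simplification: plain comparison, no wraparound handling.
        return self.exp_cmd_sn <= cmd_sn <= self.max_cmd_sn


def send(session: Session, lu: LogicalUnit, cmd_sn: int) -> str:
    """Apply the two checks in turn; neither one knows about the other."""
    if not session.window_open(cmd_sn):
        return "WINDOW_CLOSED"
    return lu.submit()
```

With an open window and a busy LU, send() returns TASK_SET_FULL; with a
closed window and an idle LU, the command is never offered to the LU at all.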

Charles




From owner-ips@ece.cmu.edu  Tue Apr 10 18:03:00 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA27050
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 18:02:59 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AJKNr21726
	for ips-outgoing; Tue, 10 Apr 2001 15:20:23 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AJJkr21683
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 15:19:46 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id BB4D66C; Tue, 10 Apr 2001 12:19:44 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id MAA14796;
	Tue, 10 Apr 2001 12:19:40 -0700 (PDT)
Message-ID: <3AD35E16.BD67560D@cup.hp.com>
Date: Tue, 10 Apr 2001 12:25:10 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
References: <C1256A2A.0020ADB6.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------220A4F6B6A8B72A46EF67CCF"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------220A4F6B6A8B72A46EF67CCF
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

julian_satran@il.ibm.com wrote:
> 
> Santosh,
> 
> Case a is what we have today.

Julian,

I may be missing something, but case (a) is NOT what we have today.
iSCSI rev 05 describes StatSN S[N]ACK support to be mandatory. 
(See http://ips.pdl.cs.cmu.edu/mail/msg04003.html for details).

> Status numbering is not meant for ordering - it is only a helper for ack
> (bulk ack).

Well understood.


> All the resources required for status retransmission are the control block
> and the status,
> If you give them up you give up all forms of recovery (as command retry
> will not help either).
> 
> The only recovery path remaining is a SCSI timeout and probably some form
> of task management clear (as SCSI does not know what went wrong).

That is the standard SCSI recovery historically followed by most SCSI
initiators and targets and, given the TCP checksum escape rate, this is
not an issue for disk I/O. For tape I/O, this is still under debate, and
timeout-based recovery may not be optimal in some scenarios
(not yet conclusive, though).


> That  is what I had in mind when I said that we can make it optional.

Does "it" refer to making StatSN optional or to "StatSN SNACK support"?

> 
> However - a long time ago when we suggested making it optional for targets
> most of the list wanted it mandatory.

Not sure what "it" is referring to.

Are we in agreement on requirements (b) & (c) ? Can "StatSN S[N]ACK"
support be negotiated at login time and StatSN numbering be only used if
Status SNACK is supported by the target ?
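
One possible reading of that question, as a minimal Python sketch. The
negotiation is reduced to two booleans and the names below are hypothetical;
the draft defines no such login key:

```python
def negotiate_stat_snack(initiator_wants: bool, target_supports: bool) -> bool:
    """Status SNACK is enabled only if both sides agree at login (assumed rule)."""
    return initiator_wants and target_supports


class StatusNumbering:
    """Hands out StatSN values for the responses a target sends."""

    def __init__(self, snack_enabled: bool) -> None:
        self.snack_enabled = snack_enabled
        self._next = 1  # hypothetical starting value

    def next_stat_sn(self) -> int:
        # Without Status SNACK the target keeps no per-status state,
        # so every response carries StatSN = 0 and no holes can appear.
        if not self.snack_enabled:
            return 0
        stat_sn = self._next
        self._next += 1
        return stat_sn
```

A target that does not retain I/O state would always emit 0, while a target
that does retain state would number its responses 1, 2, 3, and so on.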

- Santosh



> 
> Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 10/04/2001 02:06:38
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> 
> Julian & All,
> 
> Do we agree on the following requirements for SNACK :
> 
> a) iSCSI MUST NOT mandate either data or status S[N]ACK for intra-task
> error recovery. Initiators MUST be allowed to perform command
> granularity error recovery.
> 
> b) iSCSI MUST provide a mechanism by which targets can continue with I/O
> resource release upon completion of an I/O. Such a mechanism may be
> based on an explicit StatSN acknowledgement, (if the target supports
> StatSN SNACK), or allow immediate resource clean-up upon I/O
> completion.
> 
> c) Such a mechanism MUST NOT block forward progress when holes occur in
> StatSN sequence, due to format or digest errors encountered at the
> initiator.
> 
> In order to meet the above requirements, "StatSN S[N]ACK" support can be
> negotiated at login time and if StatSN SNACK is not supported by the
> target, it MUST NOT use StatSN sequence numbering. (i.e. StatSN = 0).
> 
> By not using StatSN numbering, the "holes in StatSN" problem does not
> occur, thereby, meeting requirements (a) ,(b) & (c) for targets that do
> not retain I/O state information.
> 
> For targets that do retain I/O state information, StatSn SNACK is turned
> on along with StatSN numbering.
> 
> - Santosh
--------------220A4F6B6A8B72A46EF67CCF
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------220A4F6B6A8B72A46EF67CCF--



From owner-ips@ece.cmu.edu  Tue Apr 10 20:25:08 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA28820
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 20:25:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ALvYe02758
	for ips-outgoing; Tue, 10 Apr 2001 17:57:34 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ALuwr02715
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 17:56:58 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3AN58004088;
	Tue, 10 Apr 2001 16:05:08 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>
Cc: "Ips" <ips@ece.cmu.edu>
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Tue, 10 Apr 2001 14:55:08 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJCECKCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <0F31E5C394DAD311B60C00E029101A0708015406@corpmx9.isus.emc.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

A limitation in the number of tasks for each or all targets would be a
control method or as a type of Xon/Xoff as with SEP, but not an iSCSI one as
it is now.  Santosh said all commands could use a null CmdSN in his first
statement.  Perhaps iSCSI should explicitly exclude this use.  This does
imply there are no acknowledgements, no flow control, and no sequential
delivery within iSCSI.

I pointed out in the second paragraph there is still a need to ensure timely
delivery or NO delivery.  That is not a property of IP.  With multiple
connections, if you are not going to use a valid CmdSN, or in his case a
null CmdSN for all commands, then there would be a requirement to include a
timestamp to meet a timely delivery requirement in the same manner as used
in the FC encapsulation.  He had made it clear he was not concerned about
sequential delivery and perhaps I made the rather absurd but valid point as
a result.

You have now offered the possibility of different flow control schemes.  I
started this effort to offer a solution for the existing scheme.  The flora
gets thick in this area it would appear.

Doug


> Doug,
>
> There are a couple of confusions in here:
>
> TASK_SET_FULL *is* a flow control mechanism, albeit a
> rather primitive one - the Initiator issues commands until
> it gets a FULL response and hence knows the capacity
> of the Target's queue.  The statement that the Target loses
> resource control if it only uses TASK_SET_FULL is wrong,
> although it is relying on code at or above the SCSI driver
> on the Initiator to handle the flow control.
>
> Santosh did not propose to remove CmdSN, although he
> did point out an issue that needs attention - we need
> to say something about the interaction of CmdSN with
> TASK_SET_FULL since both mechanisms may be implemented.
> In particular, it's possible to use CmdSN for delivery
> ACKs and TASK_SET_FULL for flow control, although that's
> a "peculiar" combination (and perhaps a SHOULD NOT?).
>
> Beyond this, your entire second paragraph is well off
> into the weeds.  Please be careful about taking the
> discussion off onto wild tangents.
>
> Thanks,
> --David
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
> From:	Douglas Otis [dotis@sanlight.net]
> Sent:	Tuesday, April 10, 2001 2:09 AM
> To:	santoshr@cup.hp.com
> Cc:	Ips
> Subject:	RE: iSCSI:flow control, acknowledgement, and a deterministic
> recovery
>
> Santosh,
>
> In a change recommending a flag and a valid CmdSN and not a null value, I
> included a flag named "Casual" to provide this mode of operation you
> describe which still allows flow control.  I would not wish to permit this
> Casual mode, but as David feels such lists or threads are easy to manage,
> this should not represent any greater effort for feedback.  But then again,
> if you are not using flow control, you are also not using any
> acknowledgement other than an eventual SCSI status.  If there is no flow
> control, this removes resource control and reception acknowledgement.  If
> this is not needed, then such flow control should be removed for simplicity
> as there then must always be an abundance by design of adequate resources.
> For cases where command sequence is important, CmdSN together with
> acknowledgement must remain or only one command at a time can be issued.  I
> can envision three modes, a Sequential mode, a Casual mode, and an Exigent
> mode where the command is given head of sequence status (with rejection for
> feedback simplicity ;).
>
> You seem to be placing many things on the table for reconsideration.  Flow
> control, acknowledgement, sequential delivery, and perhaps even a means of
> moving a command up the queue feeding SCSI targets.  One command at a time
> for sequential assurance is not impossible.  Is flow control needed?  Do you
> care if there is any reception acknowledgement if all feedback is in the
> SCSI layer?  It would seem in this barebones case, iSCSI adds little to
> ensure integrity.  The FC encapsulation places a time stamp within the PDU.
> Perhaps a time stamp should be employed as an alternative to sequence
> control?  At least after a timeout, you are assured a missing command is
> forever gone.  Otherwise, the check was to see that CmdSN is not too small
> to ensure it is not a stray.
>
> Doug
>
>
> > Doug [& All],
> >
> > Some comments on this thread :
> >
> > 1) The immediate command feature can be exploited by initiators which
> > are not required to provide strict command ordering for their SCSI
> > sub-system. (The majority of today's scsi stacks). IOW, all commands can
> > be sent with CmdSN=0 to indicate no ordering required.
> >
> > Any scheme to restrict the number of "immediate commands" that can be
> > sent prohibits such a feature and is un-desirable.
> >
> > 2) The (ExpCmdSN, MaxCmdSN) based acknowledgement and flow control is a
> > freebie that targets can use to implement additional flow control.
> > There's nothing in the spec that mandates that a target MUST use this
> > feature to throttle its window and apply flow control.
> >
> > Hence, any dependence and assumptions on the existence of CmdSN based
> > flow control may be incorrect, since the target may be using QUEUE FULL
> > (TASK SET FULL) to apply flow control.
> >
> > > I also suspect it is a mistake to allow commands
> > > to remain trapped within the sequencer during emergency or abnormal events
> > > signified by the use of these immediate commands.
> >
> > 3) If the initiator suspects that the CmdSN queue at the target is
> > stuck, it can always use a task mgmt command or even an iSCSI login with
> > the X (restart) bit to perform a cleanup of stuck commands at the
> > target. There's no need to build in implicit assumptions that a CmdSN
> > with the immediate flag result in a clean-up of previously pending
> > commands at the CmdSN queue of the target.
> >
> > - Santosh
> >
> >
> > Douglas Otis wrote:
> > >
> > > David,
> > >
> > > > > To not support rejects, you are then relying on the target to
> > > > > handle these now out of sequence commands.
> > > >
> > > > Immediate commands are "out of sequence" by definition - immediate
> > > > means deliver immediately without regard to CmdSN order.  Only the
> > > > target can do this, so I don't see any problem with relying on the Target
> > > > to do so.
> > >
> > > The commands that are NOW out of sequence are those commands bypassed by the
> > > immediate command.  If this immediate command was Rewind, then a long list
> > > of write commands become invalid as a new tape is now needed for those
> > > operations.  You are expecting the target to handle a long list of
> > > invalid commands made possible by the iSCSI sequencer.  Commands invalidated
> > > by immediate commands become out of sequence commands.
> > >
> > > I'll reserve further comment until Julian has made some progress in
> >
> > >
> > > Forgive my advocacy, but if one does not attempt to support a position, then
> > > the subject is not explored.
> > >
> > > Doug
>
>
>



From owner-ips@ece.cmu.edu  Tue Apr 10 20:28:00 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA28852
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 20:27:59 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3AMeQO05525
	for ips-outgoing; Tue, 10 Apr 2001 18:40:26 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel1.hp.com (atlrel1.hp.com [156.153.255.210])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3AMdkr05486
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 18:39:46 -0400 (EDT)
Received: from xatlrelay1.atl.hp.com (xatlrelay1.atl.hp.com [15.45.89.190])
	by atlrel1.hp.com (Postfix) with ESMTP id EC573332
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 18:39:45 -0400 (EDT)
Received: from xpabh4.corp.hp.com (xpabh4.corp.hp.com [15.58.136.1])
	by xatlrelay1.atl.hp.com (Postfix) with ESMTP id 5B62D1F508
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 18:37:59 -0400 (EDT)
Received: by xpabh4.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <2PW5NP4T>; Tue, 10 Apr 2001 15:39:44 -0700
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08F9F@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: ips@ece.cmu.edu
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces
Date: Tue, 10 Apr 2001 15:39:42 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> At the next level down, there are three sorts
> of issues floating in here:
> - Semantics: Unique identifiers, global scope,
> 	not tied to communication endpoints.
> - Syntax: How those identifiers are represented
> 	in messages and related issues of control
> 	over use and extension of that representation
> - Description: How the syntax is documented and
> 	the resulting suggestions/implications for
> 	its use.
> To a first approximation, the above list is in
> the order of importance to iSCSI (Semantics is
> most important) and in REVERSE order of importance
> to the IESG/IAB issues I've raised (i.e., the
> biggest issues are with the Description, believe
> it or not :-)).
> 

I think I'm agreeing with the IESG then.  It appears that the N&D team has
somehow decided that globally unique identifiers are necessary and of
primary importance.  That doesn't make sense to me.  I think we should be
focusing on defining a URL format for iSCSI resources.  The host name
provides the level of uniqueness necessary to allow iSCSI to ensure further
uniqueness within that host.  I'm waiting to hear justification from the N&D
team regarding their focus???

Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com 


From owner-ips@ece.cmu.edu  Tue Apr 10 21:41:33 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA00969
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 21:41:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ANQPw08362
	for ips-outgoing; Tue, 10 Apr 2001 19:26:26 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel1.hp.com (atlrel1.hp.com [156.153.255.210])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ANPvr08349
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 19:25:57 -0400 (EDT)
Received: from xatlrelay1.atl.hp.com (xatlrelay1.atl.hp.com [15.45.89.190])
	by atlrel1.hp.com (Postfix) with ESMTP id 7666730C
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 19:25:57 -0400 (EDT)
Received: from xpabh3.corp.hp.com (xpabh3.corp.hp.com [15.58.136.223])
	by xatlrelay1.atl.hp.com (Postfix) with ESMTP id CA9A01F50B
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 19:24:10 -0400 (EDT)
Received: by xpabh3.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <2M5M69QS>; Tue, 10 Apr 2001 16:25:55 -0700
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08FA2@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "Ips Reflector (E-mail)" <ips@ece.cmu.edu>
Subject: RE: draft-ietf-ips-iscsi-rqmts-02.txt
Date: Tue, 10 Apr 2001 16:25:50 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> XML?
> 
> Pg 6.
>     "(5)  Development of specifications for iSCSI device management as MIBs,
>          XML schemas, etc."
> 
> XML is a method for data markup.  Hopefully a URN assigned by IANA will
> become the XML Namespace's name.  Standard practice within W3C is to not use
> PUBLIC identifiers and instead use 'well known' SYSTEM identifiers.  Is an
> XML schema in lieu of an LDAP schema to allow use of XML namespace?

Sorry, I have no idea what you are concerned about here.  Do you want some
LDAP words included in the document?  Suggest wording that's clear and
unambiguous.  Do you have a statement of work for developing an LDAP
something or other?

XML is also used to describe and manage objects.  XML is not discussed
further in this document.  The reader is assumed to be able to educate
themselves.
 
Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com 


From owner-ips@ece.cmu.edu  Tue Apr 10 22:29:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA01376
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 22:29:23 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ANxS110272
	for ips-outgoing; Tue, 10 Apr 2001 19:59:28 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ANwor10247
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 19:58:50 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id 37EF3802; Tue, 10 Apr 2001 16:58:49 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id QAA12111;
	Tue, 10 Apr 2001 16:58:44 -0700 (PDT)
Message-ID: <3AD39F7F.19B16D78@cup.hp.com>
Date: Tue, 10 Apr 2001 17:04:15 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Douglas Otis <dotis@sanlight.net>
Cc: Ips <ips@ece.cmu.edu>
Subject: Re: iSCSI:flow control, acknowledgement, and a deterministic recovery
References: <NEBBJGDMMLHHCIKHGBEJCECKCGAA.dotis@sanlight.net>
Content-Type: multipart/mixed;
 boundary="------------093743EC183D4D8FD5C490AF"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------093743EC183D4D8FD5C490AF
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Douglas Otis wrote:
> 
> Santosh said all commands could use a null CmdSN in his first
> statement.  Perhaps iSCSI should explicitly exclude this use.  This does
> imply there are no acknowledgements, no flow control, and no sequential
> delivery within iSCSI.

Doug,

What you state above is no different than a traditional SCSI transport
implementation. The acknowledgements, flow control and sequential
delivery properties are derived from TCP. iSCSI behaves as an
encapsulation only. Most host O.S. stacks and data applications have no
expectations of strict ordering from the scsi transport. The QUEUE FULL
has served as a flow control mechanism in the past. 

IOW, simple implementations may choose to derive benefits from existing
mature TCP and SCSI algorithms rather than re-invent & re-implement all
of the transport capabilities within iSCSI.

There is no need to preclude implementations from sending all commands
with a 0 CmdSN. 

As for your second concern regarding I/O timeouts, there is no need for
any timestamp. An I/O timeout is dealt with by an Abort Task. The abort
task response guarantees that the abort reached the target and pushed
all intermediate stale frames. Failure to complete Abort Task leads to
higher level error recovery (ex : Logout, or some higher form of task
mgmt).

- Santosh
--------------093743EC183D4D8FD5C490AF
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------093743EC183D4D8FD5C490AF--



From owner-ips@ece.cmu.edu  Tue Apr 10 23:43:50 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA03133
	for <ips-archive@odin.ietf.org>; Tue, 10 Apr 2001 23:43:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3B1KYu15200
	for ips-outgoing; Tue, 10 Apr 2001 21:20:34 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3B1KCr15128
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 21:20:12 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3B2SG004236;
	Tue, 10 Apr 2001 19:28:24 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "KRUEGER,MARJORIE \(HP-Roseville,ex1\)" <marjorie_krueger@hp.com>,
        "Ips Reflector \(E-mail\)" <ips@ece.cmu.edu>
Subject: RE: draft-ietf-ips-iscsi-rqmts-02.txt
Date: Tue, 10 Apr 2001 18:18:17 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJIECOCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <6BD67FFB937FD411A04F00D0B74FE87802A08FA2@xrose06.rose.hp.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Marjorie,

As there are several mentions of the use of LDAP already in other documents,
perhaps you would not mind including LDAP schema rather than XML.  Is this
document recommending XML as a management tool?

Doug


> Sorry, I have no idea what you are concerned about here.  Do you want some
> LDAP words included in the document?  Suggest wording that's clear and
> unambiguous.  Do you have a statement of work for developing an LDAP
> something or other?
>
> XML is also used to describe and manage objects.  XML is not discussed
> further in this document.  The reader is assumed to be able to educate
> themselves.
>
> Marjorie Krueger
> Networked Storage Architecture
> Networked Storage Solutions Org.
> Hewlett-Packard
> tel: +1 916 785 2656
> fax: +1 916 785 0391
> email: marjorie_krueger@hp.com
>



From owner-ips@ece.cmu.edu  Wed Apr 11 02:20:50 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA18003
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 02:20:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3B1GUJ14944
	for ips-outgoing; Tue, 10 Apr 2001 21:16:30 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3B1FQr14870
	for <ips@ece.cmu.edu>; Tue, 10 Apr 2001 21:15:26 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3B2Je004226;
	Tue, 10 Apr 2001 19:19:56 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <santoshr@cup.hp.com>
Cc: "Ips" <ips@ece.cmu.edu>
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Tue, 10 Apr 2001 18:09:41 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJOECNCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <3AD39F7F.19B16D78@cup.hp.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Santosh,

You deleted my point :(

With multiple connections, if you are not going to use a valid CmdSN, or in
your case a null CmdSN for all commands, then there would be a need to
include a timestamp to meet a timely delivery requirement in the same manner
as used in FC encapsulation.  IP can deliver over any time period.  A
command could arrive at any time with respect to other connections.  With
all of your feedback now from just the SCSI layer, the SCSI layer is likely
to have timed out and restarted and now stray commands finally make an
appearance (the technician re-inserted the cable).  What did that do?  Yes,
if this were on a single connection, then TCP could provide some assurances,
(ignoring digest errors) but you must not make that assumption nor can you
assume all disruptions are symmetric.
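
The kind of check I have in mind can be sketched in Python as follows. This
is only an illustration: the function name, the 2-second limit, and the
assumption that both ends share a synchronized clock are all hypothetical:

```python
def accept_command(sent_at: float, now: float, max_age: float = 2.0) -> bool:
    """Accept a command only if its timestamp is recent enough.

    sent_at and now are seconds on a clock both ends are assumed to share.
    A command older than max_age (or stamped in the future) is treated as
    a stray and rejected, whichever connection it arrived on.
    """
    age = now - sent_at
    return 0.0 <= age <= max_age
```

A command delayed on one connection until after the SCSI layer has timed out
and restarted would then fail the age test instead of being executed late.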

If I was making an iSCSI device and expected to have an ability to control
resources, then your all null CmdSN implementation has made my equipment
dysfunctional.  My suggestion did not prevent your approach if the Casual
flag were adopted and it also kept the documented scheme for flow control
functional.  It also assures stray commands over multiple connections would
be rejected!  If you think ordered delivery is not needed as most disk
vendors do, then you should object to the requirements document just posted.
Sequential SCSI layer delivery as a mandate provides the most information
concerning the state of the SAN.

Doug

> Douglas Otis wrote:
> >
> > Santosh said all commands could use a null CmdSN in his first
> > statement.  Perhaps iSCSI should explicitly exclude this use.  This does
> > imply there are no acknowledgements, no flow control, and no sequential
> > delivery within iSCSI.
>
> Doug,
>
> What you state above is no different than a traditional SCSI transport
> implementation. The acknowledgements, flow control and sequential
> delivery properties are derived from TCP. iSCSI behaves as an
> encapsulation only. Most host O.S. stacks and data applications have no
> expectations of strict ordering from the scsi transport. The QUEUE FULL
> has served as a flow control mechanism in the past.
>
> IOW, simple implementations may choose to derive benefits from existing
> mature TCP and SCSI algorithms rather than re-invent & re-implement all
> of the transport capabilities within iSCSI.
>
> There is no need to preclude implementations from sending all commands
> with a 0 CmdSN.
>
> As for your second concern regarding I/O timeouts, there is no need for
> any timestamp. An I/O timeout is dealt with by an Abort Task. The abort
> task response guarantees that the abort reached the target and pushed
> all intermediate stale frames. Failure to complete Abort Task leads to
> higher level error recovery (ex : Logout, or some higher form of task
> mgmt).
>
> - Santosh



From owner-ips@ece.cmu.edu  Wed Apr 11 05:32:44 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA19229
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 05:32:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3B74eC05291
	for ips-outgoing; Wed, 11 Apr 2001 03:04:40 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel1.hp.com (atlrel1.hp.com [156.153.255.210])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3B747r05263
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 03:04:07 -0400 (EDT)
Received: from hpbs5001.boi.hp.com (hpbs5001.boi.hp.com [15.2.209.237])
	by atlrel1.hp.com (Postfix) with ESMTP
	id 4BB114C7; Wed, 11 Apr 2001 03:04:06 -0400 (EDT)
Received: from xpabh3.corp.hp.com (xpabh3.corp.hp.com [15.58.136.223]) by hpbs5001.boi.hp.com with ESMTP (8.7.1/8.7.3 SMKit7.02) id BAA19562; Wed, 11 Apr 2001 01:04:04 -0600 (MDT)
Received: by xpabh3.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <2M5M7ZKA>; Wed, 11 Apr 2001 00:03:15 -0700
Message-ID: <499DC368E25AD411B3F100902740AD65BC5AA2@xrose03.rose.hp.com>
From: "WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com>
To: ips@ece.cmu.edu
Cc: "'Vern Paxson'" <vern@ee.lbl.gov>, Craig Partridge <craig@aland.bbn.com>,
        "'Jonathan Stone'" <jonathan@DSG.Stanford.EDU>,
        "'chase@cs.duke.edu'" <chase@cs.duke.edu>, madler@alumni.caltech.edu,
        "'chsharp@cisco.com'" <chsharp@cisco.com>,
        "WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com>
Subject: TCP checksum escapes and iSCSI error recovery design
Date: Wed, 11 Apr 2001 00:03:11 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

All,
In designing the iSCSI error recovery mechanisms there has been considerable
focus on the general deficiency of the TCP checksum.  It seems that what iSCSI
error recovery should really be based upon is the overall profile of TCP
checksum escapes for the networks that will carry iSCSI traffic ("checksum
escapes" being defined as those cases where corrupted data is passed through
TCP and delivered to the upper layer protocol as good).  The design of iSCSI
error recovery needs to start with a clear and agreed upon set of
assumptions regarding the profile for TCP checksum escapes across space and
time. Thus, the question is not simply "How often is corrupted data passed
through TCP and delivered to iSCSI", but more along the lines of  "What is
the distribution of TCP checksum escapes across network paths and through
time?"

iSCSI error recovery should not be over-designed nor under-designed and will
be fundamentally different if the assumption is that checksum escapes or
bursts of such occur every 5 minutes or 5 hours or 5 days, and if the
assumption is that there are a few bad paths or that all paths are bad
sometimes. 

It would make sense for iSCSI to have minimal error recovery if the spatial
profile (discussed below) of TCP errors were bi-modal (i.e. there are good
paths and bad paths). IOW, if the checksum escapes happen deterministically
always or not at all, iSCSI can simply drop the connection on a digest
failure because the operation switched into error mode.  OTOH, we cannot
adopt this simplistic approach if TCP can be in the shades of gray between
the two modes.  The attached email discussion indicates that load levels and
memory usage models along the path play a key role in determining this
aspect. Employing an iSCSI-level digest/CRC appears to be the wise approach
regardless of the TCP operational model (bimodal or otherwise), to detect
escapes when they happen. IOW, the principle of "trust but verify" is apt to
be applied here. The question is what is the appropriate action for iSCSI to
take if the verification detects data corruption.

To start this discussion toward a set of assumptions regarding TCP checksum
escape profiles and an appropriate iSCSI error recovery design, this email
includes responses from several individuals working in this area (Vern
Paxson, Craig Partridge, Jonathan Stone), along with links to their original
papers.

Regards,
Jim Wendt
Networked Storage Architecture
Network Storage Solutions Organization
Hewlett-Packard Company / Roseville, CA
Tel: +1 916 785-5198
Fax: +1 916 785-0391
Email: jim_wendt@hp.com <mailto:jim_wendt@hp.com>


----------------------------------------------------------
Here is a copy of my original email to Vern Paxson, Craig Partridge,
Jonathan Stone, and Jeff Chase:

Hi All,
I'm e-mailing this to Vern, Craig, Jonathan, and Jeff in hopes of gathering
information around TCP checksum failure profiles and Internet data failure
characteristics in general.  I'm Jim Wendt and I've recently started working
on both the iSCSI error recovery and RDMA/TCP-Framing activities. I'm
emailing you directly because of your specific research and experience
regarding TCP checksum failures and behavior.

My immediate interest is in regards to TCP checksum failure rates
(undetected errors) and how iSCSI should handle these undetected errors.  As
you probably already know, the iSCSI protocol is a mapping of SCSI device
protocols onto TCP (or SCTP eventually).  I have read the three papers (When
the CRC and TCP Checksum Disagree, Performance of Checksums and CRCs over
Real Data - both versions, End-to-End Internet Packet Dynamics). 

The iSCSI WG has come to the conclusion that the TCP checksum provides an
insufficient level of data protection, and that a CRC-32 will be used for
data integrity verification on each iSCSI PDU (the specific CRC polynomial
to employ is being studied now). Thus, the assumption is that if corrupted
payload data does pass through TCP and is undetected by the TCP checksum,
then the corruption will be detected at the iSCSI PDU level via the CRC-32
check.
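
A minimal sketch of why a CRC-32 can catch corruption that the TCP checksum
passes. It assumes an RFC 1071 style 16-bit ones'-complement sum and uses
zlib's CRC-32 as a stand-in, since the polynomial for iSCSI had not been
chosen at this point. The payload bytes and the helper name are hypothetical.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical payload; the corruption exchanges its first two 16-bit words.
payload = bytes(range(16))
swapped = payload[2:4] + payload[0:2] + payload[4:]

# The ones'-complement sum is commutative, so word reordering is invisible.
print(internet_checksum(payload) == internet_checksum(swapped))  # True
# The error pattern spans fewer than 32 bits, so CRC-32 must detect it.
print(zlib.crc32(payload) == zlib.crc32(swapped))  # False
```

Word reordering is only one class of escape; the sketch says nothing about
how often such errors occur on real paths.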

What we are considering now is what level and philosophy of error handling
and recovery is put into iSCSI. It seems to me that fundamentally different
error handling behaviors would be put into iSCSI based on a known rate or
profile of occurrence of bad PDUs (those PDUs that are passed up through TCP
as being OK, but for which the iSCSI level CRC-32 indicates a data error).
Thus, iSCSI error handling might be defined differently if the expected PDU
error rate is one every 5 minutes as opposed to one every 5 hours (or 5
days). Also, the error handling might be different if errors are bursty
(i.e. a sequence of bad PDUs) rather than evenly spread through time.

I would appreciate hearing your thoughts (and any supporting data references
I could employ) regarding the nature of TCP checksum failure profiles across
time and space (i.e. network paths).   More specifically:

* How does the Internet work in the face of TCP error escapes?  If there
truly is a relatively high level of application level data corruption
(perhaps 1 in 16M packets), then how does the Internet and associated
applications manage to function?

* What is the distribution of errors across network paths?  Are most network
paths very good (low TCP error escapes) with only a few paths that are
really bad, or are TCP error escapes more evenly distributed across all
network paths?  If the spatial profile is that there are good and bad paths,
then do the error escapes rates on these two classes of paths correspond to
the high/low bounds for TCP error escapes (1 in 16 million / 1 in 10
billion)?

* What is the distribution of errors through time?  Do TCP error escapes
occur individually and randomly through time, or are TCP error escapes more
bi-modal where most of the time there are no errors and occasionally there
is a clump or burst of TCP error escapes?  If the temporal profile for TCP
error escapes is that there are good periods and infrequent but severe bad
periods, then what is the duty cycle for these periods (how bad/good for how
long), and what are the error escape rates during these periods?

* What about a stronger TCP checksum?  I don't believe that anyone ever
actually employed RFC1146 (TCP Alternate Checksum).  Has there been any
recent thinking about actually improving TCP's end-to-end data integrity
checking?  I suppose that the existing middle box infrastructure won't allow
for this. However, I'm considering submitting a draft and starting to push
for a stronger TCP checksum or CRC, but I would like to get feedback from
all of you on the technical feasibility and possible acceptance of this
proposal before taking it to the public forums.

* What about a stronger SCTP checksum? SCTP currently uses the Adler-32
checksum. Perhaps an optional stronger CRC-32 should be defined for SCTP.
Has there been any thinking in this direction? Again, I am considering
pursuing this with the SCTP folks but would appreciate any feedback you have
to offer first.
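
To put the escape-rate bounds in the second bullet above on a time scale,
here is a rough arithmetic sketch. The link rate (1 Gb/s), the packet size
(1500 bytes), a fully loaded link, and reading "16M" as 16 x 10^6 are all
assumptions made for illustration, not figures from the cited papers.

```python
def seconds_between_escapes(packets_per_escape: float,
                            link_bits_per_sec: float,
                            packet_bytes: int) -> float:
    """Mean time between checksum escapes on a fully loaded link."""
    bits_per_escape = packets_per_escape * packet_bytes * 8
    return bits_per_escape / link_bits_per_sec


LINK_BITS_PER_SEC = 1e9   # assumed link rate
PACKET_BYTES = 1500       # assumed packet size

bad_path = seconds_between_escapes(16e6, LINK_BITS_PER_SEC, PACKET_BYTES)
good_path = seconds_between_escapes(10e9, LINK_BITS_PER_SEC, PACKET_BYTES)

print(f"1 in 16 million: one escape every {bad_path / 60:.1f} minutes")
print(f"1 in 10 billion: one escape every {good_path / 3600:.1f} hours")
```

Under these assumptions the two bounds work out to roughly three minutes
and roughly 33 hours between escapes, which is why the choice of bound
matters for the recovery design.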

Thanks,
Jim

Jim Wendt
Networked Storage Architecture
Network Storage Solutions Organization
Hewlett-Packard Company / Roseville, CA
Tel: +1 916 785-5198
Fax: +1 916 785-0391
Email: jim_wendt@hp.com <mailto:jim_wendt@hp.com>


----------------------------------------------------------
Craig Partridge writes:

Hi Jim:

Here's my quick set of answers, which Vern, Jonathan and others can
refine.

>* How does the Internet work in the face of TCP error escapes?  If there
>truly is a relatively high level of application level data corruption
>(perhaps 1 in 16M packets), then how does the Internet and associated
>applications manage to function?

The answer is that applications appear to function because people don't
notice the errors or resort to backups, or whatever.  It isn't clear
exactly how we're managing to survive.  (Back in the old days when NFS
had these problems, people used to assume a disk failure had occurred when
in fact the network had trashed their data).

>* What is the distribution of errors across network paths?  Are most network
>paths very good (low TCP error escapes) with only a few paths that are
>really bad, or are TCP error escapes more evenly distributed across all
>network paths?  If the spatial profile is that there are good and bad paths,
>then do the error escapes rates on these two classes of paths correspond to
>the high/low bounds for TCP error escapes (1 in 16 million / 1 in 10
>billion)?

The answer is that there are good and bad paths.  On good paths you'll
probably see fewer escapes than 1 in 10 billion -- you can probably treat
them as essentially error free.  If you start seeing a modest number of
errors, then either (a) there's a broken router in your path or (b)
there's a broken end system.  If we had a better way of identifying the
source of errors, I'd say that if you ever see an error from a host,
declare it broken.

>* What is the distribution of errors through time?  Do TCP error escapes
>occur individually and randomly through time, or are TCP error escapes more
>bi-modal where most of the time there are no errors and occasionally there
>is a clump or burst of TCP error escapes?  If the temporal profile for TCP
>error escapes is that there are good periods and infrequent but severe bad
>periods, then what is the duty cycle for these periods (how bad/good for how
>long), and what are the error escape rates during these periods?

They're not random but I don't think we know enough about the timing to say.
My hunch is that a lot are based on end system load (that high loads
on network cards tends to encourage certain classes of bus errors).

>* What about a stronger TCP checksum?  I don't believe that anyone ever
>actually employed RFC1146 (TCP Alternate Checksum).  Has there been any
>recent thinking about actually improving TCP's end-to-end data integrity
>checking?  I suppose that the existing middle box infrastructure won't allow
>for this. However, I'm considering submitting a draft and starting to push
>for a stronger TCP checksum or CRC, but I would like to get feedback from
>all of you on the technical feasibility and possible acceptance of this
>proposal before taking it to the public forums.

That's a tricky question and not one I've thought enough about to have
a strong view -- except to say that it isn't all the checksum's fault --
Jonathan's done work showing that putting the checksum at the end
dramatically improves its efficacy in certain situations.

>* What about a stronger SCTP checksum? SCTP currently uses the Adler-32
>checksum. Perhaps an optional stronger CRC-32 should be defined for SCTP.
>Has there been any thinking in this direction? Again, I am considering
>pursuing this with the SCTP folks but would appreciate any feedback you have
>to offer first.

There's considerable reason to believe that Adler-32 is a lousy checksum --
it uses more bits (32 vs. 16) and will probably detect fewer errors than
either Fletcher-16 or the TCP checksum.  If one is going to invest energy
in fixing a checksum, fixing SCTP's checksum is probably the first priority.

Craig

----------------------------------------------------------
Vern Paxson writes:

Just a couple of follow-on points:

	- I suspect the Internet survives because the bulk of the
	  traffic isn't all that critical (Web pages, particularly
	  large items like images and perhaps video clips), so when
	  they're corrupted, nothing breaks in a major way.

	- One point to consider is that if you use IPSEC, then you
	  get very strong protection from its integrity guarantees

- Vern

----------------------------------------------------------
Jonathan Stone writes:

In message <200104092135.f39LZVS21092@daffy.ee.lbl.gov>Vern Paxson writes

Jim,

Craig's answer is an excellent summary.  Fixing the SCTP checksum is a
priority; I have a chapter in my thesis addressing some of the issues
there, and I believe Craig and I plan to write a paper.
Stronger TCP checksums are an interesting idea, but the lag on
deploying new TCP features seems to be  5 years or more.


If I re-did the study, I would try and log headers of ``good'' packets
(those where the checksum matches) as well as the entire packets with
checksum errors.  With only data on the `bad' packets (which is partly
to meet privacy concerns, which were a big hurdle for us),
it's hard to give good answers to your questions about the time or path
dependency of bad checksums.

One thing we can say is that there appear to be certain `bad' hosts
which emit very high rates of packets with incorrect checksums.  One
host we noticed in the Stanford trace sent 2 packets every second for
around three hours, totalling about 20% of the total packets in that
trace.  We can't say whether that's a host problem or a path problem;
but either way that error rate would worry me, if I were using I-SCSI
on that host (or path).  That said, we did find what looks to be a
router with a bad memory bit, which is clearly path-dependent
(though hitting that particular memory word may be time- and load-dependent
as well).

Further, some of the bad checksums appear to be due to software
bugs in specific OS releases.  As the Internet evolves (old OS revs
replaced by newer ones), the rate of those specific errors will
evolve over time.


Last, the DORM trace in our SIGCOMM 2000 paper is the only trace where
our monitoring point was directly adjacent to end-hosts, with no
intervening IP routers.  On that Ethernet segment, I saw about 1 in
80,000 packets with a valid frame CRC, but a bad IP checksum --
often a deletion of a 16-bit word from the IP address.  I tried
to `repair' some  IP headers (e.g., by inserting what seemed
to be a missing 16 bits of an IP source address).
After the repair, the IP checksum seemed to be correct.

That points to a problem occurring after the sending IP layer computes
its IP header checksum, but before the frame-level CRC is computed.
That suggests an error either in the NIC device driver, in its DMA
engine, or somewhere on the NIC itself (dropped from a FIFO?).

That led Craig and me to suggest that where possible, checksums be
computed early and verified late.  That's a caution about a potential
downside of outboard checksumming: it covers less of the path to an
application buffer than is covered by software checksums.  Software
checksums can catch errors which occur between the driver and the NIC
(bus errors, DMA, FIFO overruns, what-have-you); but outboard hardware
checksums do not.  (IEN-45 mentions this issue, but the two-pass
scheme suggested there doubles bus utilization for a given I/O rate.)

Putting middle boxes in the path just exacerbates this.

Whether or not to use outboard checksumming is entirely up to
NIC designers. We merely raise the issue that it checks less of
the datapath than is covered by software checksums.


>Just a couple of follow-on points:
>
>	- I suspect the Internet survives because the bulk of the
>	  traffic isn't all that critical (Web pages, particularly
>	  large items like images and perhaps video clips), so when
>	  they're corrupted, nothing breaks in a major way.
>
>	- One point to consider is that if you use IPSEC, then you
>	  get very strong protection from its integrity guarantees

For a software IPSEC implementation.

A hardware implementation with outboard crypto hardware could
potentially fall foul of the same kinds of local-to-source-host
errors, (DMA errors or whatever the ultimate cause is) which our data
indicates some NICs suffer from.  If the data has already been
curdled by the time the encrypting accelerator sees it,
I don't see how IPsec actually enhances integrity.

----------------------------------------------------------
Vern Paxson writes:

> >	- One point to consider is that if you use IPSEC, then you
> >	  get very strong protection from its integrity guarantees
> 
> For a software IPSEC implementation.

Good point!

		Vern

----------------------------------------------------------
Craig Partridge writes:

I'd like to clarify my statement to Jim about Adler-32 to be a bit more clear
about what the issues are.   I've also added Mark Adler to the cc list as
I'd promised Mark to get him results when we had them, and while the results
aren't quite cooked, if Jim is going to circulate a discussion, Mark
should see it first.

If you're designing a checksum, there are certain features you'd like it
to have.  Here's a starting point of a formal definition.  Given the set
V of all possible bit vectors and a checksum function C(), what we'd like
is:

	prob(C(v1) == C(v2)) is 1/2**(sizeof C())

that is given any two v1, v2 being different elements of V, the chance that
their checksum will collide is the best possible, namely 1 over 2 raised to
the power of the bitwidth of the result of C().

Three sub points:

    1. This is not quite the same as what the cryptographic checksum folks
    want.  They actually want it to be very hard [for some computational
    approximation of hard], given C(v1) to find a v2 such that C(v1) == C(v2).
    For network checksums, we don't care as we're protecting from errors,
    not attacks.

    2. If we do not pick v1 and v2 at random, but according to some distribution
    rule of likely packet sizes, packet contents, etc, we'd still like the
    equation to be true.  We don't want to be vulnerable to certain
    traffic patterns, etc.

    3. You can compare the effectiveness of checksums by how close they
    come to this ideal -- that is, how effectively do they use their
    range of values?

OK, so let's talk about Adler-32.  Adler-32 is a neat idea -- it seeks to
improve the performance of the Fletcher checksum by summing modulo a prime
number, rather than 255 or 256.

However, it sums bytes (8-bit quantities) into 16-bit fields.  As a result,
the high bits of the 16-bit fields take some time to fill (they only get
filled by propagating carries from lower bits) and until the packet
is quite big (thousands of bytes) you don't get enough mixing in the high
bits.  Two problems: (a) we're not fully using the 16-bit width, so for
smaller packets the chance of collision is much greater than 1/2**sizeof(C)
simply because some bits in the checksum are always (or with very
high probability) set to 0; and (b) it looks like (and Jonathan's still
working on this) that the law of large numbers will cause the values to
cluster still further [I think of this behavior as the result of just
looking at all the bits instead of just the low order bits mod a prime, we're
vulnerable to the fact that the sums are not evenly distributed, by
the law of large numbers]
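
Point (a) above can be checked directly. The sketch below uses Python's
zlib.adler32, whose low 16 bits hold the plain byte sum (starting from 1,
modulo 65521). The helper names and the packet sizes are chosen here for
illustration only.

```python
import os
import zlib

MOD_ADLER = 65521  # largest prime below 2**16, the Adler-32 modulus


def adler_low_half(data: bytes) -> int:
    """Low 16 bits of Adler-32: 1 + sum of all bytes, modulo 65521."""
    return zlib.adler32(data) & 0xFFFF


def max_low_half(n_bytes: int) -> int:
    """Upper bound on the low half for an n-byte input."""
    return min(1 + 255 * n_bytes, MOD_ADLER - 1)


def always_zero_high_bits(n_bytes: int) -> int:
    """Number of top bits of the low half that can never be set."""
    return 16 - max_low_half(n_bytes).bit_length()


for n in (16, 64, 128, 1480):
    print(n, max_low_half(n), always_zero_high_bits(n))

# Even all-0xFF input cannot reach the top two bits at 64 bytes.
print(hex(adler_low_half(b"\xff" * 64)))
```

For 64-byte inputs the top two bits of the low half are always zero, and
for 128-byte inputs the top bit is. This only bounds which values are
reachable; it does not show how evenly the reachable values are used.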

We're still working on whether the core idea behind Adler-32 (namely working
modulo a prime) is as powerful as it seems, but it is clear that to have
a hope of making it comparable to the TCP checksum or Fletcher, you have to
sum 16-bit quantities into 16-bit fields.

Craig

----------------------------------------------------------
Jonathan writes:

In message <200104101505.f3AF5BZ23145@aland.bbn.com>Craig Partridge writes

>I'd like to clarify my statement to Jim about Adler-32 to be a bit more clear
>about what the issues are.   I've also added Mark Adler to the cc list as
>I'd promised Mark to get him results when we had them, and while the results
>aren't quite cooked, if Jim is going to circulate a discussion, Mark
>should see it first.
>
>If you're designing a checksum, there are certain features you'd like it
>to have.  Here's a starting point of a formal definition.  Given the set
>V of all possible bit vectors and a checksum function C(), what we'd like is:
>
>	prob(C(v1) == C(v2)) is 1/2**(sizeof C())
>
>that is given any two v1, v2 being different elements of V, the chance that
>their checksum will collide is the best possible, namely 1 over 2 raised to
>the power of the bitwidth of the result of C().

Jim, Craig: 

Just to be picky, as I'm working right now on definitions of some of
these issues for my thesis:

One can list other desirable properties, like wanting each bit of the
checksum field to have informational entropy 1/2.  Craig's aggregate
definition falls out from that, with a few extra assumptions.

Also, less formally, desiring each bit of the input data to contribute
equally to flipping the final state of each output bit.
That is where Adler32 runs into trouble when given short inputs.


>Three sub points:
>
>    1. This is not quite the same as what the cryptographic checksum folks
>    want.  They actually want it to be very hard [for some computational
>    approximation of hard], given C(v1) to find a v2 such that C(v1) == C(v2).
>    For network checksums, we don't care as we're protecting from errors,
>    not attacks.

There are two versions of formal cryptographic invertibility; the
other criterion is that it be computationally intractable to find *any*
v1 and v2 such that C(v1) = C(v2).  Crypto folks would generally like
both.


>    2. If we do not pick v1 and v2 at random, but according to some distribution
>    rule of likely packet sizes, packet contents, etc, we'd still like the
>    equation to be true.  We don't want to be vulnerable to certain
>    traffic patterns, etc.
>
>    3. You can compare the effectiveness of checksums by how close they
>    come to this ideal -- that is, how effectively do they use their
>    range of values?
>
>OK, so let's talk about Adler-32.  Adler-32 is a neat idea -- it seeks to
>improve the performance of the Fletcher checksum by summing modulo a prime
>number, rather than 255 or 256.
>
>However, it sums bytes (8-bit quantities) into 16-bit fields.  As a result,
>the high bits of the 16-bit fields take some time to fill (they only get
>filled by propagating carries from lower bits) and until the packet
>is quite big (thousands of bytes) you don't get enough mixing in the high
>bits.  Two problems: (a) we're not fully using the 16-bit width, so for
>smaller packets the chance of collision is much greater than 1/2**sizeof(C)
>simply because some bits in the checksum are always (or with very
>high probability) set to 0; and (b) it looks like (and Jonathan's still
>working on this) that the law of large numbers will cause the values to
>cluster still further [I think of this behavior as the result of just
>looking at all the bits instead of just the low order bits mod a prime, we're
>vulnerable to the fact that the sums are not evenly distributed, by
>the law of large numbers]

To be fair, and to give the whole picture, there are a couple of
points here that should be expanded.

The first is that for large input data lengths, we can show that the
distribution of both 16-bit halves of the Adler-32 sum should actually
be well distributed.  That holds true for addition mod M, of any
repeated independent observations of a random variable.  (A proof
appears in the appendix of the 1998 ToN paper you cited already;
although  that version of the proof may not formally state
all the necessary assumptions about independent observations.)

However, for networking in general and SCTP in particular, there are
fairly modest hard upper bounds on the maximum input length.
SCTP forbids fragmentation.  For the increasingly-pervasive Ethernet
frame length of 1500 bytes that means SCTP checksums have no more
than 1480 input bytes.

The data we have -- and I am still working on it -- say that's too
short to get good coverage of either of the two sums in SCTP: the purely
additive commutative sum, or the higher-order running sum of the
commutative sum (which is position-dependent).

Craig's description gives a good intuition for the first, commutative
sum. For the position-dependent sum (the running sum of the first sum),
another good intuition for the computational modelling I've done is to
imagine a hash-table where we hash, independently, all possible values
at all possible offsets in a 1480-byte SCTP packet.  The intuition
isn't so much "law of large numbers" as that SCTP draws its per-byte
values from one particular corner of a two-dimensional space (the
hash-table vs. all possible bytes); so it ends up with an uneven
coverage of the space of all hash values.


>We're still working on whether the core idea behind Adler-32 (namely working
>modulo a prime) is as powerful as it seems, but it is clear that to have
>a hope of making it comparable to the TCP checksum or Fletcher, you have to
>sum 16-bit quantities into 16-bit fields.

Just so the reasoning is clear, another alternative is to sum 8-bit
inputs into 8-bit accumulators, modulo 251.  Given 32 bits of
available checksum field, the 16-bit sums are preferable.

Again, this is for short inputs.  For large inputs of several tens of
Kbytes, the Adler32 sum should do much better (at 64Kbytes it should
give very uniform coverage).  At those input sizes, the comparison comes
down to Adler32 having 15 pairs of values which are congruent, mod
65521, whereas a Fletcher sum with 16-bit inputs would have only one;
versus better `stirring' from a prime modulus.
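
One reading of the "15 pairs" versus "only one" comparison is a count of
distinct 16-bit values that collapse to the same residue under each
modulus. The sketch below counts them by brute force; the function name is
hypothetical, and 65535 is assumed as the modulus for a Fletcher sum over
16-bit inputs.

```python
from collections import Counter


def congruent_pairs(modulus: int, width_bits: int = 16) -> int:
    """Count pairs of distinct width_bits-wide values equal modulo modulus."""
    residues = Counter(v % modulus for v in range(1 << width_bits))
    return sum(c * (c - 1) // 2 for c in residues.values())


print(congruent_pairs(65521))  # Adler-32 modulus: 15 (0..14 vs 65521..65535)
print(congruent_pairs(65535))  # ones'-complement modulus: 1 (0 vs 65535)
```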

I don't know what the distribution of sizes of zlib-compressed files
is.  If they are generally large, then our work may not be applicable
to Adler-32 for its originally designed purpose.


I also don't know the history of why SCTP chose the Adler-32 sum
rather than, say, CRC-32. The gossip I hear from IETF-going friends in
the Bay Area is that there was concern about the performance of
CRC-32; a direct bit-by-bit shift-and-add was seen as too slow.  I
hear there was also a supposition that an efficient table-lookup
version would require very large tables (128 kbits?) and that
tables that large were prohibitive for PDAs and small handheld devices.

I am *not* suggesting that is an accurate report; it probably isn't.
But if there's any grain of truth in it, both four-bit and eight-bit
table-lookup algorithms for CRC32 exist.  Table lookup size need not
be an issue.  Perhaps we should draw the SCTP authors -- C. Sharp? --
into this discussion as well.
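
A sketch of the four-bit and eight-bit table-lookup variants, to make the
table-size point concrete. It assumes the reflected IEEE 802.3 polynomial
(0xEDB88320, the one zlib uses) purely so the result can be checked against
zlib.crc32; the function and table names are illustrative.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial (assumed for this sketch)


def make_table(bits: int) -> list:
    """Build a lookup table that consumes `bits` input bits per step."""
    table = []
    for i in range(1 << bits):
        crc = i
        for _ in range(bits):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
        table.append(crc)
    return table


NIBBLE_TABLE = make_table(4)  # 16 entries x 32 bits = 64 bytes
BYTE_TABLE = make_table(8)    # 256 entries x 32 bits = 1 KiB


def crc32_nibble(data: bytes) -> int:
    """CRC-32 using two 4-bit table lookups per input byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        crc = (crc >> 4) ^ NIBBLE_TABLE[crc & 0xF]
        crc = (crc >> 4) ^ NIBBLE_TABLE[crc & 0xF]
    return crc ^ 0xFFFFFFFF


def crc32_byte(data: bytes) -> int:
    """CRC-32 using one 8-bit table lookup per input byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ BYTE_TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF


message = b"123456789"
print(hex(crc32_nibble(message)), hex(crc32_byte(message)),
      hex(zlib.crc32(message)))
```

The nibble table trades two lookups per byte for a 64-byte table, which is
the kind of footprint a small handheld device could afford.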

----------------------------------------------------------
Jonathan writes:

A small terminological correction here:

In message <200104101505.f3AF5BZ23145@aland.bbn.com>Craig Partridge writes
>

[...]

>
>so for
>smaller packets the chance of collision is much greater than 1/2**sizeof(C)
>simply because some bits in the checksum are always (or with very
>high probability) set to 0; and (b) it looks like (and Jonathan's still
>working on this) that the law of large numbers will cause the values to
>cluster still further [I think of this behavior as the result of just
>looking at all the bits instead of just the low order bits mod a prime, we're
>vulnerable to the fact that the sums are not evenly distributed, by
>the law of large numbers]

I think it's actually a central limit theorem, not the law of large
numbers.  For addition modulo M, the "central limit theorem" says that
summation of many independent identically-distributed random variables
will tend to a uniform distribution, not a normal distribution
(as does the weighted sum)
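
The convergence to uniform can be seen numerically in a small case. The
sketch below computes the exact distribution of a sum of n independent
bytes modulo M and its total variation distance from uniform. The modulus
(251) and the skewed per-byte distribution (uniform over 0..15 only) are
assumptions picked to keep the computation small; they are not taken from
the trace data.

```python
MODULUS = 251                               # small prime, assumed for speed
PER_BYTE = {x: 1 / 16 for x in range(16)}   # hypothetical skewed byte values


def add_one_byte(dist: list) -> list:
    """Convolve the running-sum distribution with one more byte, mod M."""
    new = [0.0] * MODULUS
    for residue, p in enumerate(dist):
        if p == 0.0:
            continue
        for value, q in PER_BYTE.items():
            new[(residue + value) % MODULUS] += p * q
    return new


def sum_distribution(n_bytes: int) -> list:
    """Exact distribution of the sum of n_bytes independent bytes, mod M."""
    dist = [0.0] * MODULUS
    dist[0] = 1.0
    for _ in range(n_bytes):
        dist = add_one_byte(dist)
    return dist


def distance_from_uniform(dist: list) -> float:
    """Total variation distance between dist and the uniform distribution."""
    uniform = 1 / len(dist)
    return 0.5 * sum(abs(p - uniform) for p in dist)


for n in (1, 16, 256, 1480):
    print(n, round(distance_from_uniform(sum_distribution(n)), 6))
```

The distance shrinks toward zero as n grows. With a modulus as large as
65521 and byte-sized inputs, far more terms are needed before the sum
spreads over the whole range, which is the short-input concern raised
earlier in the thread.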

----------------------------------------------------------
Jonathan writes:

Jim,

I forwarded my previous message to Chip Sharp.  (Is the rfc2960
address, chsharp@cisco.com, still valid?)

He may wish to comment on the SCTP issues as well.


----------------------------------------------------------
----------------------------------------------------------
Links to papers:

End-to-End Internet Packet Dynamics, Vern Paxson
<http://citeseer.nj.nec.com/cache/papers2/cs/11598/ftp:zSzzSzftp.ee.lbl.govzSzpaperszSzvp-pkt-dyn-ton99.pdf/paxson97endtoend.pdf>

When the CRC and TCP Checksum Disagree, Jonathan Stone, Craig Partridge
<http://www.acm.org/sigcomm/sigcomm2000/conf/paper/sigcomm2000-9-1.pdf>

Performance of Checksums and CRCs over Real Data, Jonathan Stone, Michael
Greenwald, Craig Partridge, Jim Hughes
<http://citeseer.nj.nec.com/cache/papers2/cs/1909/ftp:zSzzSzftp.dsg.stanford.eduzSzpubzSzpaperszSzsplice-paper.pdf/performance-of-checksums-and.pdf>

----------------------------------------------------------




From owner-ips@ece.cmu.edu  Wed Apr 11 12:56:21 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA28719
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 12:56:19 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BEbsb13263
	for ips-outgoing; Wed, 11 Apr 2001 10:37:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BEbMr13229
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 10:37:22 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 37FF69400D
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 10:37:22 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: Calling for Tape & Backup Application Experts 
In-Reply-To: Message from "Somesh Gupta" <somesh_gupta@silverbacksystems.com> 
   of "Mon, 09 Apr 2001 14:08:01 PDT." <NMEALCLOIBCHBDHLCMIJOENDCCAA.somesh_gupta@silverbacksystems.com> 
References: <NMEALCLOIBCHBDHLCMIJOENDCCAA.somesh_gupta@silverbacksystems.com> 
Date: Wed, 11 Apr 2001 10:35:56 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010411143722.37FF69400D@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Somesh,

> If you are out there on the list and can help with creating a model
> of the backup app, that will help resolve some of the issues.

I expect I'm considered to be a biased observer, but the only backup
app that I've ever used that worked decently is amanda.

It's relatively configurable, but the typical configuration (the only
one I've seen used, at several, independent sites) is:

  1) backup (e.g. dump) streams from client to holding disk on backup server
  2) server streams data from holding disk to tape (prevents underruns) 
  3) if tape error occurs, notify operator and retain backup data on
     the holding disk
     if backup disk is full, stop backing up clients to holding disk
  
It also does all sorts of complicated stuff to choose among backup
levels, but that's not really germane here, except that it will allow
you to back up a massive storage foot print with a relatively small
number of tape drives and relatively small holding disk.

The upshot of this is that amanda uses programs like tar, dump and
cpio as primitives in which errors are expected to occur occasionally,
and recovery is required.  It is a mistake to rely solely upon simple
backup programs that have no operator intervention loop in them.

In my experience (I was an operator at one such site), a very common
reason why the backups failed was because high-density cartridge tape
drives require cleaning after a relatively small number of hours, and
(at least) Exabyte drives would not permit a tape to be used when a
cleaning was due.  It did give some advanced warning, like flashing
lights, but you know how easy it is to miss a flashing LED in a
machine room (they flash when the tape is operating too!), and with
the small interval between cleanings (when the tape drive is running
near 100% duty cycle), it seemed like it was ALWAYS in need of
cleaning.

I haven't used any Windows backup apps, but I can't help thinking the
Amanda approach is not novel for a serious backup app (as opposed to
NT backup or other `consumer grade' apps).

While we're gathering requirements, backup is only one of the two
major applications of tape.  The other is streaming data logging, also
known as the `telemetry problem'.  In this case, data is running to
the tape almost continuously, and an operation failure may not be a
matter of trying again tomorrow.  Then again, it might.  There's
usually a large direct access device (e.g. RAID box) used as a buffer
to the tape, but I'm not sure if it's sized to allow for operator
recovery or what.  I don't see how any other solution would be
possible.

Steph


From owner-ips@ece.cmu.edu  Wed Apr 11 12:56:44 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA28749
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 12:56:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BEbmI13250
	for ips-outgoing; Wed, 11 Apr 2001 10:37:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BEbMr13228
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 10:37:22 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 152BC9400C
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 10:37:22 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI: frame formats 
In-Reply-To: Message from julian_satran@il.ibm.com 
   of "Tue, 10 Apr 2001 07:08:43 +0200." <C1256A2A.001BE84B.00@d12mta02.de.ibm.com> 
References: <C1256A2A.001BE84B.00@d12mta02.de.ibm.com> 
Date: Wed, 11 Apr 2001 10:35:56 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010411143722.152BC9400C@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian,

> Read10 and Write10 can still be accommodated with several data PDUs as we
> maintained the total read/write count (expected data) as a 32 bit field.
> It is only the individual PDU that went down to 16M.

I realize that.  I said that Read10 could not be accommodated within a
SINGLE PDU.

I know the endpoints CAN tile the data into sequences of PDUs of
arbitrarily small size.  The question is whether they WANT to do that.
My endpoints don't.  It boils down to whether you want to tile iSCSI
PDUs in hardware or not.  If you use something like an RDMA layer to
do your data handling, it's better not to add an additional ULP tiling
requirement.  You want to have the protocol support arbitrarily large
(within some expansive limit like 2^32) PDUs.
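
To make the tiling cost concrete, here is a rough sketch of the arithmetic. The 512-byte block size and the 2^24 - 1 byte PDU limit (what a 24-bit length field would allow) are assumptions chosen for illustration; tile_transfer is a hypothetical helper, not anything from the draft.

```python
def tile_transfer(total_bytes, max_pdu_bytes):
    """Split a transfer into (offset, length) pairs, one per data PDU."""
    if max_pdu_bytes <= 0:
        raise ValueError("max_pdu_bytes must be positive")
    tiles = []
    offset = 0
    while offset < total_bytes:
        length = min(max_pdu_bytes, total_bytes - offset)
        tiles.append((offset, length))
        offset += length
    return tiles


BLOCK_SIZE = 512                 # assumed block size in bytes
MAX_PDU = 2**24 - 1              # assumed limit from a 24-bit length field
READ10_MAX_BLOCKS = 2**16 - 1    # Read10 carries a 16-bit transfer length

largest_read10 = READ10_MAX_BLOCKS * BLOCK_SIZE
tiles = tile_transfer(largest_read10, MAX_PDU)
```

Under these assumptions the largest Read10 needs two PDUs, while a 2^32 limit would let the same transfer travel as one.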

> It is worth noticing also that the CRCs we are contemplating have a
> guaranteed Hamming distance of 4 for blocklengths of less than 2^31 bits.

Understood.  Yet another reason to insert integrity below iSCSI.
Disabling CRC is presently supported, so I'm not worried about the
CRCs.

Steph


From owner-ips@ece.cmu.edu  Wed Apr 11 13:03:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA28942
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 13:03:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BEbpC13257
	for ips-outgoing; Wed, 11 Apr 2001 10:37:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BEbMr13227
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 10:37:22 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id DDDE29400A
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 10:37:21 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI: session login and ISID 
In-Reply-To: Message from sandeepj@research.bell-labs.com (Sandeep Joshi) 
   of "Mon, 09 Apr 2001 13:27:46 EDT." <200104091727.NAA27077@aura.research.bell-labs.com> 
References: <200104091727.NAA27077@aura.research.bell-labs.com> 
Date: Wed, 11 Apr 2001 10:35:56 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010411143721.DDDE29400A@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Sandeep,

> One of those attempts will get rejected, since the ISID is the sole
> key to find if a session already exists.

As Julian mentioned, TSID is actually the key a target uses to
associate a login to an existing session.  ISID is an opaque (to the
target) convenience to the initiator.

> (note: TSID was sent as zero for the leading connection of session)

The allocated TSID from a leading login is returned in the Login
Response(es).

> The initiator WWUI does not seem to be available at this time.
> a) Appendix D.10 states that InitiatorWWUI is optional and defaults
>    to iSCSI.
> b) Section 2.10.9 on Login Command states that "initiator MAY provide 
>    some basic parameters".
> 
> On the other hand, Section 1.2.7 states that "the initiator MUST
> present both its initiator WWUI and target WWUI to which it wishes
> to connect during the login phase".

Hm.  There does seem to be a contradiction.  I prefer the 1.2.7
stipulation for esoteric reasons which will either be revealed
eventually (in which case, they must have been intrinsically correct)
or squashed like a bug (in which case they are irrelevant).

> The WWUI is also needed if we are to support multiple I_T nexuses 
> between the same initiator and target.  

I don't see this.  I agree that we want to allow an arbitrary number
of sessions between a pair of endpoints.  Julian says it's an open
issue.  I don't see why.  Perhaps he's referring to the fact that we
do want to discourage the use of multiple connections (or sessions)
between an I and a T simply for the purposes of winning a bigger share
of a congested network link.

As far as I can tell, the (possibly between the lines) specified
mechanism supports multiple connections and sessions between an I and
a T.  If an initiator wants a new session, it sets TSID == 0, and it
gets a new session.  No reason why it couldn't have multiple
connections between the same endpoints within a session too.
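
A minimal sketch of that reading of the login rules, as seen from the target. The Target class, its session table and the choice to reject an unknown TSID are assumptions made for the example; the draft may well specify something different.

```python
import itertools


class Target:
    """Toy target-side session table keyed by TSID."""

    def __init__(self):
        self._next_tsid = itertools.count(1)  # 0 is reserved for "new session"
        self.sessions = {}  # TSID -> {"isid": ..., "connections": n}

    def login(self, isid, tsid):
        """Return the TSID for the Login Response, or None to reject."""
        if tsid == 0:
            # Leading login: allocate a fresh TSID. The ISID is stored
            # but never interpreted, since it is opaque to the target.
            new_tsid = next(self._next_tsid)
            self.sessions[new_tsid] = {"isid": isid, "connections": 1}
            return new_tsid
        session = self.sessions.get(tsid)
        if session is None:
            return None  # unknown TSID (assumed to be a rejection)
        session["connections"] += 1  # extra connection in an existing session
        return tsid
```

Two leading logins with the same ISID simply yield two sessions with different TSIDs, so nothing is rejected on the ISID alone.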

Steph


From owner-ips@ece.cmu.edu  Wed Apr 11 13:58:52 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA29929
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 13:58:51 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BGFrG20270
	for ips-outgoing; Wed, 11 Apr 2001 12:15:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BGFEr20172
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 12:15:14 -0400 (EDT)
Received: from hpbs5001.boi.hp.com (hpbs5001.boi.hp.com [15.2.209.237])
	by palrel3.hp.com (Postfix) with ESMTP id DC16AC25
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 09:15:12 -0700 (PDT)
Received: from xpabh4.corp.hp.com (xpabh4.corp.hp.com [15.58.136.1]) by hpbs5001.boi.hp.com with ESMTP (8.7.1/8.7.3 SMKit7.02) id KAA18564 for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 10:15:11 -0600 (MDT)
Received: by xpabh4.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <2PW5PAF4>; Wed, 11 Apr 2001 09:15:08 -0700
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08FA7@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "Ips Reflector (E-mail)" <ips@ece.cmu.edu>
Subject: RE: draft-ietf-ips-iscsi-rqmts-02.txt
Date: Wed, 11 Apr 2001 09:15:06 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> As there are several mentions of the use of LDAP already in 
> other documents,
> perhaps you would not mind including LDAP schema rather than 
> XML.

I will include LDAP, not substitute it.
 
>  Is this
> document recommending XML as a management tool?

Not necessarily *recommending*, rather acknowledging that there are
companies that wish to use it as such.

Marj


From owner-ips@ece.cmu.edu  Wed Apr 11 14:00:24 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA00016
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 14:00:23 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BG9no19800
	for ips-outgoing; Wed, 11 Apr 2001 12:09:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BG9Qr19745
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 12:09:27 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id SAA338612
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 18:09:15 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta02_cs0 [9.165.222.253])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.95) with SMTP id SAA81902
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 18:05:43 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A2B.00586B97 ; Wed, 11 Apr 2001 18:05:47 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A2B.00583B1C.00@d12mta02.de.ibm.com>
Date: Wed, 11 Apr 2001 19:10:24 +0300
Subject: Re: iSCSI initStatSN
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Ayman,

1.2.2.2 reads now:

   Status numbering starts after Login. During login, there is always only
   one outstanding command per connection and status numbering is not
   strictly needed but may be used as a sanity check.

   Regards,
   Julo


"Ayman Ghanem" <aghanem@cisco.com> on 11/04/2001 18:38:55

Please respond to "Ayman Ghanem" <aghanem@cisco.com>

To:   ips@ece.cmu.edu
cc:
Subject:  iSCSI  initStatSN




I apologize if this came up before but I couldn't find it in the archive.

Section 1.2.2.2
"During login, there is always only one outstanding command per connection
and status numbering is not needed"

Section 2.11.3 (InitStatSN)
"If the login phase involves two login responses then each of them will
hold for the subsequent responses."

I find these not consistent. According to 2.11.3, partial and final login
responses will each have initStatSN, but section 1.2.2.2 implies that
initStatSN comes after final login response, and text CMDs in login
negotiation need no status numbering.

-Ayman






From owner-ips@ece.cmu.edu  Wed Apr 11 14:00:59 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA00036
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 14:00:57 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BFgop17945
	for ips-outgoing; Wed, 11 Apr 2001 11:42:50 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from crufty.research.bell-labs.com (crufty.research.bell-labs.com [204.178.16.49])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3BFgEr17906
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 11:42:14 -0400 (EDT)
Received: from scummy.research.bell-labs.com ([135.104.2.10]) by crufty; Wed Apr 11 11:38:49 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by scummy; Wed Apr 11 11:40:59 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id LAA26211;
	Wed, 11 Apr 2001 11:40:59 -0400 (EDT)
Message-ID: <3AD47B0B.61CA7CD1@research.bell-labs.com>
Date: Wed, 11 Apr 2001 11:40:59 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Stephen Bailey <steph@cs.uchicago.edu>
CC: ips@ece.cmu.edu
Subject: Re: iSCSI: session login and ISID
References: <200104091727.NAA27077@aura.research.bell-labs.com> <20010411143721.DDDE29400A@sandmail.sandburst.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit


Stephen,

Yes, I did see the elegance of generating sessionID=ISID+TSID
(rudimentary Diffie-Hellman)

The questions have been resolved...I was looking for a way to
track initiator sessions at the target but misread appendix D to 
think initiatorWWUI is optional (it's optional only for the target,
which seems fine).

thanks,
-Sandeep

Stephen Bailey wrote:
> 
> Sandeep,
> 
> > One of those attempts will get rejected, since the ISID is the sole
> > key to find if a session already exists.
> 
> As Julian mentioned, TSID is actually the key a target uses to
> associate a login to an existing session.  ISID is an opaque (to the
> target) convenience to the initiator.
> 
> > (note: TSID was sent as zero for the leading connection of session)
> 
> The allocated TSID from a leading login is returned in the Login
> Response(es).
> 
> > The initiator WWUI does not seem to be available at this time.
> > a) Appendix D.10 states that InitiatorWWUI is optional and defaults
> >    to iSCSI.
> > b) Section 2.10.9 on Login Command states that "initiator MAY provide
> >    some basic parameters".
> >
> > On the other hand, Section 1.2.7 states that "the initiator MUST
> > present both its initiator WWUI and target WWUI to which it wishes
> > to connect during the login phase".
> 
> Hm.  There does seem to be a contradiction.  I prefer the 1.2.7
> stipulation for esoteric reasons which will either be revealed
> eventually (in which case, they must have been intrinsically correct)
> or squashed like a bug (in which case they are irrelevant).
> 
> > The WWUI is also needed if we are to support multiple I_T nexuses
> > between the same initiator and target.
> 
> I don't see this.  I agree that we want to allow an arbitrary number
> of sessions between a pair of endpoints.  Julian says it's an open
> issue.  I don't see why.  Perhaps he's referring to the fact that we
> do want to discourage the use of multiple connections (or sessions)
> between an I and a T simply for the purposes of winning a bigger share
> of a congested network link.
> 
> As far as I can tell, the (possibly between the lines) specified
> mechanism supports multiple connections and sessions between an I and
> a T.  If an initiator wants a new session, it sets TSID == 0, and it
> gets a new session.  No reason why it couldn't have multiple
> connections between the same endpoints within a session too.
> 
> Steph


From owner-ips@ece.cmu.edu  Wed Apr 11 14:51:27 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA00969
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 14:51:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BDEmv07499
	for ips-outgoing; Wed, 11 Apr 2001 09:14:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e23.nc.us.ibm.com (e23.nc.us.ibm.com [32.97.136.229])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BDDtr07455
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 09:13:55 -0400 (EDT)
Received: from southrelay02.raleigh.ibm.com (southrelay02.raleigh.ibm.com [9.37.3.209])
	by e23.nc.us.ibm.com (8.9.3/8.9.3) with ESMTP id JAA13810;
	Wed, 11 Apr 2001 09:06:11 -0500
Received: from d04nms25.raleigh.ibm.com (d04nms25.raleigh.ibm.com [9.67.228.6])
	by southrelay02.raleigh.ibm.com (8.11.1/NCO v4.95) with ESMTP id f3BDDn929012;
	Wed, 11 Apr 2001 09:13:49 -0400
Importance: Normal
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces 
To: marjorie_krueger@hp.com, ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OF09D0FD68.5D1C2638-ON85256A2B.0044CFA2@raleigh.ibm.com>
From: "Thomas McSweeney" <rf42tpme@us.ibm.com>
Date: Wed, 11 Apr 2001 09:13:49 -0400
X-MIMETrack: Serialize by Router on D04NMS25/04/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/11/2001 09:13:49 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Marjorie,
                                                                      
 >The host name provides the level of uniqueness necessary            
 >to allow iSCSI to ensure further uniqueness within that host.       
                                                                      



Not all iSCSI devices have a host name;  some use DHCP for IP address
assignment, particularly initiators.  An initiator needs a name that is
more unique than just within its host, so the target can identify it
(imagine all the initiators named "WindowsLaptop").  It is even possible
for a target to use DHCP, if all the initiators find its IP address using
some means other than DNS (e.g., iSNS and/or SLP).  Also, this would enable
storage administrators to configure a "hot backup" of a target on a second
host, so they can temporarily move the data to a backup machine to perform
some maintenance on the primary machine, without reconfiguring the
potentially thousands of geographically widespread initiators.  You could
preconfigure the initiators with a primary hostname and a backup hostname
for the target, or use iSNS/SLP to learn its IP address.

Tom McSweeney
iSCSI Development, Storage Systems Group, IBM
Email: rf42tpme@us.ibm.com
Phone: (USA) 919-254-5634  (tie line: 444-5634)
Fax:   (USA) 919-254-0391  (tie line: 444-0391)



From owner-ips@ece.cmu.edu  Wed Apr 11 16:43:47 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA03063
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 16:43:19 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BGItu20487
	for ips-outgoing; Wed, 11 Apr 2001 12:18:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ietf.org (odin.ietf.org [132.151.1.176])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BB1Sr29425
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 07:01:28 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id HAA20057;
	Wed, 11 Apr 2001 07:01:25 -0400 (EDT)
Message-Id: <200104111101.HAA20057@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: ips@ece.cmu.edu
From: Internet-Drafts@ietf.org
Reply-to: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-ietf-ips-iscsi-reqmts-02.txt
Date: Wed, 11 Apr 2001 07:01:25 -0400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iSCSI Requirements and Design Considerations
	Author(s)	: M. Krueger et al.
	Filename	: draft-ietf-ips-iscsi-reqmts-02.txt
	Pages		: 22
	Date		: 10-Apr-01
	
The IP Storage Working group is chartered with developing comprehensive
technology to transport block storage data over IP protocols.  This effort
includes a protocol to transport the Small Computer Systems Interface
(SCSI) protocol over the internet (iSCSI).

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-reqmts-02.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-iscsi-reqmts-02.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-iscsi-reqmts-02.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010410135017.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-iscsi-reqmts-02.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-iscsi-reqmts-02.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010410135017.I-D@ietf.org>

--OtherAccess--

--NextPart--




From owner-ips@ece.cmu.edu  Wed Apr 11 18:46:25 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA04505
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 18:46:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BKRw708174
	for ips-outgoing; Wed, 11 Apr 2001 16:27:58 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BKRZr08127
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 16:27:36 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id QAA83022
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 16:20:11 -0400
Received: from f3n42e (d03nm042h.boulder.ibm.com [9.99.140.42])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.95) with ESMTP id OAA110196
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 14:27:29 -0600
Importance: Normal
Subject: RE: iSCSI: session login and ISID
To: ips@ece.cmu.edu, <someshg@yahoo.com>
From: "Jim Hafner" <hafner@almaden.ibm.com>
Date: Wed, 11 Apr 2001 13:27:28 -0700
Message-ID: <OFB57CA69E.E5E4F8FC-ON88256A2B.00698F55@LocalDomain>
X-MIMETrack: Serialize by Router on D03NM042/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/11/2001 01:27:28 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Folks,

As the one that started the debate (somewhat privately) on whether one or
more sessions can be allowed between the same two named guys, I think I'd
better cut in here.

I'll warn you that I'm going to be vague and skip a lot of the details.

The N&DT is debating the (critical in my opinion) mapping of the SCSI
Architecture constructs of SCSI Device, SCSI Port, SCSI Nexus and the
constructs of iSCSI (things with "unique names", sessions, etc.).   This
has consequences to reservations, scsi access controls, unit attention
conditions, etc. (all those things that are essentially nexus state).  To
complicate matters, SAM-2 is evolving as we speak.

One has to keep in mind that SCSI is NOT completely stateless with respect
to individual IOs (e.g., they are not self-authenticating).  E.g., a given
IO is allowed to proceed ONLY if the nexus on which it is sent is not
blocked by some reservation state on another nexus.
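
As a very rough illustration of that per-nexus state, here is a toy model. It assumes a single exclusive reservation per logical unit and identifies a nexus by a (initiator port, target port) tuple; real SCSI reservation rules are considerably richer, and the LogicalUnit class is purely hypothetical.

```python
class LogicalUnit:
    """Toy logical unit holding at most one exclusive reservation."""

    def __init__(self):
        self.reserved_by = None  # nexus holding the reservation, if any

    def reserve(self, nexus):
        """Grant the reservation unless another nexus already holds it."""
        if self.reserved_by not in (None, nexus):
            return False
        self.reserved_by = nexus
        return True

    def release(self, nexus):
        """Only the holder can release its own reservation."""
        if self.reserved_by == nexus:
            self.reserved_by = None

    def io_allowed(self, nexus):
        """An IO proceeds only if no other nexus holds a reservation."""
        return self.reserved_by in (None, nexus)
```

Note that the check depends entirely on being able to tell one nexus from another. If two sessions shared the same port pair, this tuple key could not distinguish them.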

Now we come to the crux of the issue. In my opinion, there is a fundamental
assumption (not explicitly stated) in the SCSI architecture that there
never exists more than one nexus between the same two (named or addressed)
SCSI Ports.  Parallel and FCP get this for free because of protocol layer
constructs/limitations.   In my opinion, iSCSI needs to make a similar
restriction to meet this requirement.  (Why SAM-x needs it at all is rooted
in the nexus state; I could go into that, but I won't unless pressed,
preferably offline.)

The net of this is that whatever mapping of SAM-2 terms to iSCSI terms we
pick, there will need to be imposed some limitations on sessions (not clear
what yet) to meet this requirement.

The N&DT will soon be offering one approach to the mapping and the
resulting requirements.  Hopefully, it will meet everyone's requirements.

Jim Hafner


"Somesh Gupta" <someshg@yahoo.com>@ece.cmu.edu on 04-11-2001 11:57:04 AM

Please respond to <someshg@yahoo.com>

Sent by:  owner-ips@ece.cmu.edu


To:   Julian Satran/Haifa/IBM@IBMIL, <ips@ece.cmu.edu>
cc:
Subject:  RE: iSCSI: session login and ISID





> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> julian_satran@il.ibm.com
> Sent: Tuesday, April 10, 2001 4:22 AM
> To: ips@ece.cmu.edu
> Subject: Re: iSCSI: session login and ISID
>
>
>
>
> WWUI can be presented during login phase (2.10.9 is correct and in-line
> with 1.2.7) Two sessions can have the same ISID but will have different
> TSID. The question of whether more than one session should be allowed
> between a pair of WWUIs is under debate.

Why are we debating this? Multiple sessions (one
connection per session) is a simpler, more robust and higher
performance model than multiple connections per session.

Somesh
>
> Julo
>
> sandeepj@research.bell-labs.com (Sandeep Joshi) on 09/04/2001 20:27:46
>
> Please respond to sandeepj@research.bell-labs.com (Sandeep Joshi)
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  iSCSI: session login and ISID
>
>
>
>
> There seems to be a problem in distinguishing session logins, using
> only the ISID field in the Login Command.   It is possible that
> different initiators could try to start a session using the same ISID
> value.   One of those attempts will get rejected, since the ISID is
> the sole key to find if a session already exists. (note: TSID was
> sent as zero for the leading connection of session)
>
> The initiator WWUI does not seem to be available at this time.
> a) Appendix D.10 states that InitiatorWWUI is optional and defaults
>    to iSCSI.
> b) Section 2.10.9 on Login Command states that "initiator MAY provide
>    some basic parameters".
>
> On the other hand, Section 1.2.7 states that "the initiator MUST
> present both its initiator WWUI and target WWUI to which it wishes
> to connect during the login phase".
>
> The WWUI is also needed if we are to support multiple I_T nexuses
> between the same initiator and target.
>
> So it seems like Section 1.2.7 has the right spec.   Appendix D and
> Section 2.10.9 must then be corrected.  The descriptions in Sec 4.1
> and Section 1.2.3 may also need to be changed to reflect the fact
> that initiator WWUI must be supplied at session login.
>
> -Sandeep
>
>
>







From owner-ips@ece.cmu.edu  Wed Apr 11 18:47:33 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA04518
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 18:47:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BKEx007258
	for ips-outgoing; Wed, 11 Apr 2001 16:14:59 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e21.nc.us.ibm.com (e21.nc.us.ibm.com [32.97.136.227])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BKDvr07212
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 16:13:57 -0400 (EDT)
Received: from southrelay02.raleigh.ibm.com (southrelay02.raleigh.ibm.com [9.37.3.209])
	by e21.nc.us.ibm.com (8.9.3/8.9.3) with ESMTP id QAA138552;
	Wed, 11 Apr 2001 16:08:27 -0500
Received: from d04nms25.raleigh.ibm.com (d04nms25.raleigh.ibm.com [9.67.228.6])
	by southrelay02.raleigh.ibm.com (8.11.1/NCO v4.95) with ESMTP id f3BKDq945794;
	Wed, 11 Apr 2001 16:13:53 -0400
Importance: Normal
Subject: Re: iSCSI Naming: WWUIs, URNs, and namespaces
To: Brian Pawlowski <beepy@netapp.com>
Cc: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OF56397CDB.B0DAA5E0-ON85256A2B.0067BCDC@raleigh.ibm.com>
From: "Thomas McSweeney" <rf42tpme@us.ibm.com>
Date: Wed, 11 Apr 2001 16:13:50 -0400
X-MIMETrack: Serialize by Router on D04NMS25/04/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/11/2001 04:13:52 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Brian,

>Are "initiators" being connected to actively by "targets"?
>Via some call back or something?

Not that I know of.  The target has to map each initiator that is logging
in to a particular SCSI resource (or set of resources), which may be
different for each initiator.  For example, suppose you have an iSCSI
target node with disk space carved up evenly between two users.  When you
log in to that target, you want to be given access to your disk space
allocation, not mine.  The initiator identifies itself to the target via
the InitiatorWWUI text key of the Login.  The current spec declares that
InitiatorWWUI is globally unique, which makes this mapping easy.  If
InitiatorWWUI is unique only within the initiator node, multiple initiator
nodes can choose the same name (e.g., "WindowsLaptop").  We'll have to
qualify it with something else to make it unique, at least to a target that
uses that name to look up the mapping.
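
A small sketch of why the lookup breaks when names are not globally unique. The build_access_map helper, the resource labels and the example names are hypothetical; a real target would hold this mapping in its own configuration.

```python
def build_access_map(assignments):
    """Map InitiatorWWUI -> resource; report names claimed more than once.

    assignments is an iterable of (wwui, resource) pairs. The first claim
    on a name wins, and any conflicting later claim is recorded.
    """
    access = {}
    collisions = set()
    for wwui, resource in assignments:
        if wwui in access and access[wwui] != resource:
            collisions.add(wwui)
        access.setdefault(wwui, resource)
    return access, collisions


access, collisions = build_access_map([
    ("WindowsLaptop", "your-disk-space"),
    ("some-unique-name", "other-disk-space"),
    ("WindowsLaptop", "my-disk-space"),
])
```

Here the second "WindowsLaptop" ends up in collisions, and whoever logs in under that name gets the first claimant's disk space, which is exactly the case a globally unique InitiatorWWUI avoids.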

Tom McSweeney
iSCSI Development, Storage Systems Group, IBM
Email: rf42tpme@us.ibm.com
Phone: (USA) 919-254-5634  (tie line: 444-5634)
Fax:   (USA) 919-254-0391  (tie line: 444-0391)



From owner-ips@ece.cmu.edu  Wed Apr 11 18:49:15 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA04560
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 18:49:14 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BFf4X17820
	for ips-outgoing; Wed, 11 Apr 2001 11:41:04 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BFeCr17720
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 11:40:12 -0400 (EDT)
Received: from ahganemw2k (dhcp-161-44-68-139.cisco.com [161.44.68.139]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with SMTP id LAA22224 for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 11:40:06 -0400 (EDT)
From: "Ayman Ghanem" <aghanem@cisco.com>
To: <ips@ece.cmu.edu>
Subject: iSCSI  initStatSN
Date: Wed, 11 Apr 2001 10:38:55 -0500
Message-ID: <LOEPJENHBHAHEABBNDAJAEPACAAA.aghanem@cisco.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

I apologize if this came up before but I couldn't find it in the archive.

Section 1.2.2.2
"During login, there is always only one outstanding command per connection
and status numbering is not needed"

Section 2.11.3 (InitStatSN)
"If the login phase involves two login responses then each of them will hold
for the subsequent responses."

I find these inconsistent. According to 2.11.3, partial and final login
responses will each have initStatSN, but section 1.2.2.2 implies that
initStatSN comes after the final login response, and text CMDs in login
negotiation need no status numbering.

-Ayman



From owner-ips@ece.cmu.edu  Wed Apr 11 19:46:46 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA05156
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 19:46:45 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BJ38A02120
	for ips-outgoing; Wed, 11 Apr 2001 15:03:08 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from smtp014.mail.yahoo.com (smtp014.mail.yahoo.com [216.136.173.58])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3BJ2nr02103
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 15:02:50 -0400 (EDT)
Received: from sdsl-216-36-75-164.dsl.sjc.megapath.net (HELO somesh) (216.36.75.164)
  by smtp.mail.vip.sc5.yahoo.com with SMTP; 11 Apr 2001 19:02:44 -0000
X-Apparently-From: <someshg@yahoo.com>
Reply-To: <someshg@yahoo.com>
From: "Somesh Gupta" <someshg@yahoo.com>
To: <julian_satran@il.ibm.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI: session login and ISID
Date: Wed, 11 Apr 2001 11:57:04 -0700
Message-ID: <NMEALCLOIBCHBDHLCMIJIEOHCCAA.someshg@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
In-Reply-To: <C1256A2A.003DA5C7.00@d12mta02.de.ibm.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit



> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> julian_satran@il.ibm.com
> Sent: Tuesday, April 10, 2001 4:22 AM
> To: ips@ece.cmu.edu
> Subject: Re: iSCSI: session login and ISID
> 
> 
> 
> 
> WWUI can be presented during login phase (2.10.9 is correct and in-line
> with 1.2.7). Two sessions can have the same ISID but will have different
> TSID. The question of whether more than one session should be allowed
> between a pair of WWUIs is under debate.

Why are we debating this? Multiple sessions (one
connection per session) is a simpler, more robust and higher
performance model than multiple connections per session.

Somesh
> 
> Julo
> 
> sandeepj@research.bell-labs.com (Sandeep Joshi) on 09/04/2001 20:27:46
> 
> Please respond to sandeepj@research.bell-labs.com (Sandeep Joshi)
> 
> To:   ips@ece.cmu.edu
> cc:
> Subject:  iSCSI: session login and ISID
> 
> 
> 
> 
> There seems to be a problem in distinguishing session logins, using
> only the ISID field in the Login Command.   It is possible that
> different initiators could try to start a session using the same ISID
> value.   One of those attempts will get rejected, since the ISID is
> the sole key to find if a session already exists. (note: TSID was
> sent as zero for the leading connection of session)
> 
> The initiator WWUI does not seem to be available at this time.
> a) Appendix D.10 states that InitiatorWWUI is optional and defaults
>    to iSCSI.
> b) Section 2.10.9 on Login Command states that "initiator MAY provide
>    some basic parameters".
> 
> On the other hand, Section 1.2.7 states that "the initiator MUST
> present both its initiator WWUI and target WWUI to which it wishes
> to connect during the login phase".
> 
> The WWUI is also needed if we are to support multiple I_T nexuses
> between the same initiator and target.
> 
> So it seems like Section 1.2.7 has the right spec.   Appendix D and
> Section 2.10.9 must then be corrected.  The descriptions in Sec 4.1
> and Section 1.2.3 may also need to be changed to reflect the fact
> that initiator WWUI must be supplied at session login.
> 
> -Sandeep
> 
> 
> 




From owner-ips@ece.cmu.edu  Wed Apr 11 19:48:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA05173
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 19:48:36 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BIPs329342
	for ips-outgoing; Wed, 11 Apr 2001 14:25:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mx01-a.netapp.com (mx01-a.netapp.com [198.95.226.53])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BIMFr29084
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 14:22:15 -0400 (EDT)
Received: from frejya.corp.netapp.com (frejya.corp.netapp.com [10.10.20.91])
	by mx01-a.netapp.com (8.11.1/8.11.1/NTAP-1.2) with ESMTP id f3BI1qs02335;
	Wed, 11 Apr 2001 11:01:53 -0700 (PDT)
Received: from tooting-fe.eng.netapp.com (localhost [127.0.0.1])
	by frejya.corp.netapp.com (8.11.1/8.11.1/NTAP-1.2) with ESMTP id f3BI1qU03384;
	Wed, 11 Apr 2001 11:01:52 -0700 (PDT)
Received: (from beepy@localhost)
	by tooting-fe.eng.netapp.com (8.8.8+Sun/8.8.8) id LAA10744;
	Wed, 11 Apr 2001 11:01:47 -0700 (PDT)
From: Brian Pawlowski <beepy@netapp.com>
Message-Id: <200104111801.LAA10744@tooting-fe.eng.netapp.com>
Subject: Re: iSCSI Naming: WWUIs, URNs, and namespaces
In-Reply-To: <OF09D0FD68.5D1C2638-ON85256A2B.0044CFA2@raleigh.ibm.com> from Thomas McSweeney at "Apr 11, 1 09:13:49 am"
To: rf42tpme@us.ibm.com (Thomas McSweeney)
Date: Wed, 11 Apr 2001 11:01:46 -0700 (PDT)
Cc: marjorie_krueger@hp.com, ips@ece.cmu.edu
X-Mailer: ELM [version 2.4ME++ PL40 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Wait.

Are "initiators" being connected to actively by "targets"?
Via some call back or something?

I might be missing something here - I was doing a mental analogy
to NFS clients and servers - we had to come up with a server
name space, but the clients were initiating the mappings and
only had to be able to find file systems on servers (forming
an identifier).

> Marjorie,
>                                                                       
>  >The host name provides the level of uniqueness necessary            
>  >to allow iSCSI to ensure further uniqueness within that host.       
>                                                                       
> 
> 
> 
> Not all iSCSI devices have a host name;  some use DHCP for IP address
> assignment, particularly initiators.  An initiator needs a name that is
> more unique than just within its host, so the target can identify it
> (imagine all the initiators named "WindowsLaptop").  It is even possible
> for a target to use DHCP, if all the initiators find its IP address using
> some means other than DNS (e.g., iSNS and/or SLP).  Also, this would enable
> storage administrators to configure a "hot backup" of a target on a second
> host, so they can temporarily move the data to a backup machine to perform
> some maintenance on the primary machine, without reconfiguring the
> potentially thousands of geographically widespread initiators.  You could
> preconfigure the initiators with a primary hostname and a backup hostname
> for the target, or use iSNS/SLP to learn its IP address.
> 
> Tom McSweeney
> iSCSI Development, Storage Systems Group, IBM
> Email: rf42tpme@us.ibm.com
> Phone: (USA) 919-254-5634  (tie line: 444-5634)
> Fax:   (USA) 919-254-0391  (tie line: 444-0391)
> 



From owner-ips@ece.cmu.edu  Wed Apr 11 20:02:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA05275
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 20:02:05 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BGip022375
	for ips-outgoing; Wed, 11 Apr 2001 12:44:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BGi1r22333
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 12:44:01 -0400 (EDT)
Received: from hpbs5000.boi.hp.com (hpbs5000.boi.hp.com [15.56.8.201])
	by palrel3.hp.com (Postfix) with ESMTP id 114CACCF
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 09:43:59 -0700 (PDT)
Received: from xatlbh1.atl.hp.com (xatlbh1.atl.hp.com [15.45.89.186]) by hpbs5000.boi.hp.com with ESMTP (8.8.6 (PHNE_17135)/8.8.6 SMKit7.02) id KAA04611 for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 10:43:57 -0600 (MDT)
Received: by xatlbh1.atl.hp.com with Internet Mail Service (5.5.2653.19)
	id <2TX6BG35>; Wed, 11 Apr 2001 12:43:53 -0400
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08FA8@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: ips@ece.cmu.edu
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces 
Date: Wed, 11 Apr 2001 12:43:51 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

>  >The host name provides the level of uniqueness necessary            
>  >to allow iSCSI to ensure further uniqueness within that host.       
> 
> Not all iSCSI devices have a host name;  some use DHCP for IP address
> assignment, particularly initiators.  An initiator needs a name that is
> more unique than just within its host, so the target can identify it
> (imagine all the initiators named "WindowsLaptop"). 

My workstation uses DHCP to obtain an address, but it still has a hostname
that is unique.  That's the great thing about hostnames, they aren't
dependent on IP address - even when an initiator uses DHCP, its hostname
stays the same.

If all the initiators are named "WindowsLaptop", that's an administrative
problem, not iSCSI's problem to solve.  The administrators have the means to
solve this problem - they do today for many other applications that use
hostnames to identify workstations.

> You could preconfigure the initiators with a primary hostname and a 
> backup hostname for the target, or use iSNS/SLP to learn its IP address.
>

I'm not sure what you are advocating here, Thomas.  IP hosts typically
already have a hostname configured, and their full host name is a product of
their domain.  For instance, my domain-qualified hostname is
user55.rose.hp.com.  It's unique enough for storage object naming purposes.
I don't see any reason for iSCSI to define a new (or multiple) hostnames for
initiators to use.

I wouldn't think this is an "either or" situation.  Defining hostnames
(initiator name) doesn't necessarily have anything to do with learning IP
addresses.  I'm concerned that the N&D team is inventing things that don't
need to be invented without understanding current IP practices.

Marj


From owner-ips@ece.cmu.edu  Wed Apr 11 20:05:15 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA05301
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 20:05:14 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BLo2P13786
	for ips-outgoing; Wed, 11 Apr 2001 17:50:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BLnTr13756
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 17:49:29 -0400 (EDT)
Received: from hpindlm.cup.hp.com (hpindlm.cup.hp.com [15.13.95.89])
	by palrel1.hp.com (Postfix) with ESMTP
	id C258F18EA; Wed, 11 Apr 2001 14:44:30 -0700 (PDT)
Received: from mk731913.cup.hp.com (mk731912.cup.hp.com [15.8.80.111])
	by hpindlm.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id OAA06588;
	Wed, 11 Apr 2001 14:48:33 -0700 (PDT)
Message-Id: <5.0.2.1.2.20010411115842.02917b98@esalpha2.cup.hp.com>
X-Sender: krause@esalpha2.cup.hp.com
X-Mailer: QUALCOMM Windows Eudora Version 5.0.2
Date: Wed, 11 Apr 2001 11:59:33 -0700
To: Robert Snively <rsnively@Brocade.COM>
From: Michael Krause <krause@cup.hp.com>
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
Cc: ips@ece.cmu.edu
In-Reply-To: <FFD40DB4943CD411876500508BAD02797D46A7@sj5-ex2.brocade.com
 >
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

At 01:17 PM 4/6/2001 -0700, Robert Snively wrote:

>The net I draw from this is that careful design is key to
>success and that CRC or positionally
>dependent checksum on the iSCSI data packets is probably
>a good idea.  However, retry of iSCSI data packets may not
>be necessary.

Agree with CRC being always present and that retry is not needed.

Mike



From owner-ips@ece.cmu.edu  Wed Apr 11 20:07:13 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA05327
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 20:07:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BLj2c13419
	for ips-outgoing; Wed, 11 Apr 2001 17:45:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BLiIr13382
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 17:44:18 -0400 (EDT)
Received: from hpindlm.cup.hp.com (hpindlm.cup.hp.com [15.13.95.89])
	by palrel3.hp.com (Postfix) with ESMTP
	id 5AB4ABBC; Wed, 11 Apr 2001 14:44:17 -0700 (PDT)
Received: from mk731913.cup.hp.com (mk731912.cup.hp.com [15.8.80.111])
	by hpindlm.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id OAA06566;
	Wed, 11 Apr 2001 14:48:20 -0700 (PDT)
Message-Id: <5.0.2.1.2.20010411113332.028b5838@esalpha2.cup.hp.com>
X-Sender: krause@esalpha2.cup.hp.com
X-Mailer: QUALCOMM Windows Eudora Version 5.0.2
Date: Wed, 11 Apr 2001 11:43:05 -0700
To: Stephen Bailey <steph@cs.uchicago.edu>
From: Michael Krause <krause@cup.hp.com>
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
Cc: ips@ece.cmu.edu
In-Reply-To: <20010409222356.51DC694006@sandmail.sandburst.com>
References: <Message from Venkat Rangan <venkat@rhapsodynetworks.com>
 <15851BD69CFCD41186B100B0D0AABE650C1B49@med.corp.rhapsodynetworks.com>
 <15851BD69CFCD41186B100B0D0AABE650C1B49@med.corp.rhapsodynetworks.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

At 06:22 PM 4/9/2001 -0400, Stephen Bailey wrote:

>No.  However, middlebox-proofing we do will be circumvented when the
>middle box decides it wants to look in iSCSI as a L7 protocol.  I know
>it sounds horrible, but there are zillions of companies doing this for
>HTTP now.  The reason why middle boxes manipulate the data is because
>that allows them to provide the desired behavior.  It seems like a
>really natural product idea for some people (personally I don't get
>it) to plumb HTTP and iSCSI (hey, why not CIFS, NFS, SIP and RTP while
>we're at it :^) into one big, happy L7 router/switch type box.
>Fundamentally, if the middle box is going to diddle in the payload,
>there's not squat we can do.

This is why the concept of a variant CRC for those fields within the header 
subject to change and an invariant CRC for the data and header fields that 
are not subject to change is a preferred solution.  The middle boxes cannot 
screw up everything.  However, this is primarily practical only when PDUs do
not span multiple packets which many people seem to want for some reason.
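
A minimal sketch of the variant/invariant split described above, under stated assumptions: the three-part PDU layout and all field names are hypothetical, and zlib.crc32 (plain CRC-32) is only a stand-in for whatever polynomial would actually be chosen.

```python
import zlib

def crc(data: bytes) -> int:
    """Plain CRC-32 from zlib, used here only as a stand-in digest."""
    return zlib.crc32(data) & 0xFFFFFFFF

def build_pdu(mutable_hdr: bytes, fixed_hdr: bytes, payload: bytes) -> dict:
    """Hypothetical PDU: one CRC over mutable fields, one over the rest."""
    return {
        "mutable_hdr": mutable_hdr,
        "fixed_hdr": fixed_hdr,
        "payload": payload,
        "variant_crc": crc(mutable_hdr),            # may be recomputed in flight
        "invariant_crc": crc(fixed_hdr + payload),  # must survive end to end
    }

def middlebox_rewrite(pdu: dict, new_mutable_hdr: bytes) -> dict:
    """A middle box changes only the mutable fields and their CRC."""
    out = dict(pdu)
    out["mutable_hdr"] = new_mutable_hdr
    out["variant_crc"] = crc(new_mutable_hdr)
    return out

def verify(pdu: dict) -> bool:
    """Receiver checks both digests independently."""
    return (crc(pdu["mutable_hdr"]) == pdu["variant_crc"]
            and crc(pdu["fixed_hdr"] + pdu["payload"]) == pdu["invariant_crc"])

pdu = build_pdu(b"\x01", b"hdr", b"data")
rewritten = middlebox_rewrite(pdu, b"\x02")
tampered = dict(rewritten, payload=b"dat4")
```

Here the rewritten PDU still verifies, while a PDU whose payload was altered in transit fails the invariant check even though the middle box recomputed the variant CRC.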

>I am somewhat ambivalent about CRC digests (I'd rather have end-to-end
>security and kill all those birds with the same stone), but what I'm
>really averse to is assuming that digest failures are frequent, and a
>less than brute force (connection bounce) recovery mechanism is
>required.

Given security solutions can change over time, it is unclear whether this
would be practical. Doing something as simple as a secured hash for
modification detection is much more complex and performance inhibiting than
calculating a CRC.

Also, the frequency depends on many aspects: the number, type, and bandwidth
of the fabrics involved in the end-to-end solution come immediately to
mind.  Given these can vary in quality and rate of change (bandwidth is
already at 40 Gbps on some fabrics and rising to 100 Gbps and it is 
important for the same solution to work from DSL speeds to 100 Gbps and 
beyond), it is unclear whether any study done to date can quantify the 
actual frequency that will be present. As such, it is better to err on
the side of strong data integrity at the start (provide investment 
protection to the customer) and then adapt as required.  Not doing anything 
now leaves customers buying hardware that will be throw-away / of limited use.

Mike



From owner-ips@ece.cmu.edu  Wed Apr 11 21:04:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05795
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 21:04:30 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BNd4Y21325
	for ips-outgoing; Wed, 11 Apr 2001 19:39:04 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel1.hp.com (atlrel1.hp.com [156.153.255.210])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BNcir21273
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 19:38:48 -0400 (EDT)
Received: from hpbs5000.boi.hp.com (hpbs5000.boi.hp.com [15.56.8.201])
	by atlrel1.hp.com (Postfix) with ESMTP id 6A281564
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 19:38:19 -0400 (EDT)
Received: from xatlbh2.atl.hp.com (xatlbh2.atl.hp.com [15.45.89.187]) by hpbs5000.boi.hp.com with ESMTP (8.8.6 (PHNE_17135)/8.8.6 SMKit7.02) id RAA05516 for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 17:38:18 -0600 (MDT)
Received: by xatlbh2.atl.hp.com with Internet Mail Service (5.5.2653.19)
	id <2M5JNTC0>; Wed, 11 Apr 2001 19:38:16 -0400
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08FAC@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: ips@ece.cmu.edu
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces
Date: Wed, 11 Apr 2001 19:38:14 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


> >Are "initiators" being connected to actively by "targets"?
> >Via some call back or something?
> 
> Not that I know of.  The target has to map each initiator that is logging
> in to a particular SCSI resource (or set of resources), which may be
> different for each initiator.  For example, suppose you have an iSCSI
> target node with disk space carved up evenly between two users.  When you
> log in to that target, you want to be given access to your disk space
> allocation, not mine.  The initiator identifies itself to the target via
> the InitiatorWWUI text key of the Login.  The current spec  declares that
> InitiatorWWUI is globally unique, which makes this mapping easy.  If
> InitiatorWWUI is unique only within the initiator node, multiple initiator
> nodes can choose the same name (e.g., "WindowsLaptop").  We'll have to
> qualify it with something else to make it unique, at least to a target
> that uses that name to look up the mapping.
> 

As previously discussed in this thread, your terminology is obsolete - the
IESG axed the use of "WWUI" and you should be referring to "initiator name"
and "target name" or generically "storage node name".  You have not made a
case for a need for these names to be globally unique.

In any case, fully qualified host names *are* globally  unique, are already
a part of the IP protocols, and can be used as part of a URL syntax to
identify storage node objects.  You keep referring to "WindowsLaptop" as the
hostname, but that's only part of the hostname.  Are you familiar with IP
hostname constructs?


From owner-ips@ece.cmu.edu  Wed Apr 11 21:05:26 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05807
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 21:05:25 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BNi5Q21641
	for ips-outgoing; Wed, 11 Apr 2001 19:44:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BNhsr21634
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 19:43:54 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3C0pt005686;
	Wed, 11 Apr 2001 17:51:55 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Thomas McSweeney" <rf42tpme@us.ibm.com>,
        "Brian Pawlowski" <beepy@netapp.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces
Date: Wed, 11 Apr 2001 16:41:56 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJIEDOCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <OF56397CDB.B0DAA5E0-ON85256A2B.0067BCDC@raleigh.ibm.com>
Importance: Normal
X-MIMEOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Thomas,

Part of the justification for creating a new name server was to promulgate
changes to various consumers.  If you look at the food chain involved in the
iSCSI to SCSI configuration, the iSCSI server is a principal consumer
whereas the iSCSI clients need this transitory information to pass to the
iSCSI server.  Would you see an advantage in having a type of housekeeping signal
to be incorporated within the iSCSI protocol to signal a need to update
configuration information?  Management software that changes the information
could advise the iSCSI servers that in turn through the use of the protocol
signals, advise their clients.  It would seem this system naturally scales
better than a scheme that expects all clients to retain a persistent but
otherwise idle connection to a name server.  I can think of a few more cases
where this same management software may wish to advise of such a change in
advance as well as to advise after the fact.

Doug

> Brian,
>
> >Are "initiators" being connected to actively by "targets"?
> >Via some call back or something?
>
> Not that I know of.  The target has to map each initiator that is logging
> in to a particular SCSI resource (or set of resources), which may be
> different for each initiator.  For example, suppose you have an iSCSI
> target node with disk space carved up evenly between two users.  When you
> log in to that target, you want to be given access to your disk space
> allocation, not mine.  The initiator identifies itself to the target via
> the InitiatorWWUI text key of the Login.  The current spec declares that
> InitiatorWWUI is globally unique, which makes this mapping easy.  If
> InitiatorWWUI is unique only within the initiator node, multiple initiator
> nodes can choose the same name (e.g., "WindowsLaptop").  We'll have to
> qualify it with something else to make it unique, at least to a target
> that uses that name to look up the mapping.
>
> Tom McSweeney
> iSCSI Development, Storage Systems Group, IBM
> Email: rf42tpme@us.ibm.com
> Phone: (USA) 919-254-5634  (tie line: 444-5634)
> Fax:   (USA) 919-254-0391  (tie line: 444-0391)
>
>



From owner-ips@ece.cmu.edu  Wed Apr 11 21:05:50 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05818
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 21:05:48 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BNK5G19951
	for ips-outgoing; Wed, 11 Apr 2001 19:20:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel1.hp.com (atlrel1.hp.com [156.153.255.210])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BNJlr19943
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 19:19:47 -0400 (EDT)
Received: from xatlrelay1.atl.hp.com (xatlrelay1.atl.hp.com [15.45.89.190])
	by atlrel1.hp.com (Postfix) with ESMTP id C46F9999
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 19:19:46 -0400 (EDT)
Received: from xpabh4.corp.hp.com (xpabh4.corp.hp.com [15.58.136.1])
	by xatlrelay1.atl.hp.com (Postfix) with ESMTP id 0D2571F509
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 19:18:00 -0400 (EDT)
Received: by xpabh4.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <2PW5QR3B>; Wed, 11 Apr 2001 16:19:45 -0700
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08FAA@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces
Date: Wed, 11 Apr 2001 16:19:43 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com 

> -----Original Message-----
> From: Douglas Otis [mailto:dotis@sanlight.net]
> Sent: Wednesday, April 11, 2001 1:44 PM
> To: Brian Pawlowski; Thomas McSweeney
> Cc: marjorie_krueger@hp.com; ips@ece.cmu.edu
> Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces
> 
> 
> Brian and Marjorie,
> 
> There is a need to obtain iSCSI configuration information in a generalized
> way.  This information should allow the client to find the server using
> existing IP namespace information.  Forging connections within the server
> must use normal SCSI naming appended in some manner to the IP namespace.
> The SCSI naming should not be dereferenced to require an additional name
> server.  DHCP provides a standard means of indicating the location of LDAP.
> SLP also provides a standard means of locating LDAP.  There is a standard
> alias convention of ldap.example.com for finding the LDAP server, and LDAP
> has been working with many security schemes for years.  Once an LDAP schema
> is defined by the IPS group, then any Directory User Agent can obtain any
> auxiliary information without the need for yet another name server.
> 
> Marjorie had mentioned that some are considering XML to support this
> function.  You will need some way to find this service.  You will find
> little information regarding security and w3c.org conventions for naming are
> system not public based as I had mentioned.  Security issues regarding XML
> will be addressed.  Microsoft has announced their secure version of NT (XP)
> will use XML to manage updates through third parties.  I bet this will be as
> popular as Clippit.  With the desire to invent a new name server, XML
> namespace seems a likely bet.  It will be interesting to see if all this can
> be made secure.  It is a goal of Microsoft obviously and will keep w3c.org
> busy for many years.  It does seem rather bold to base management on such
> new code for something so critical.
> 
> Doug
> 
> > Wait.
> >
> > Are "initiators" being connected to actively by "targets"?
> > Via some call back or something?
> >
> > I might be missing something here - I was doing a mental analogy
> > to NFS clients and servers - we had to come up with a server
> > name space, but the clients were initiating the mappings and
> > only had to be able to find file systems on servers (forming
> > an identifier).
> 
> 
> 


From owner-ips@ece.cmu.edu  Wed Apr 11 21:05:58 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05830
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 21:05:57 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BNc3t21249
	for ips-outgoing; Wed, 11 Apr 2001 19:38:03 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BNbGr21186
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 19:37:16 -0400 (EDT)
Received: from xparelay2.corp.hp.com (xparelay2.corp.hp.com [15.58.137.112])
	by palrel1.hp.com (Postfix) with ESMTP id 1C84718A2
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 16:25:59 -0700 (PDT)
Received: from xpabh4.corp.hp.com (xpabh4.corp.hp.com [15.58.136.1])
	by xparelay2.corp.hp.com (Postfix) with ESMTP id 3C4301F514
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 19:24:26 -0400 (EDT)
Received: by xpabh4.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <2PW5QR8R>; Wed, 11 Apr 2001 16:25:42 -0700
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08FAB@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: ips@ece.cmu.edu
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces
Date: Wed, 11 Apr 2001 16:25:39 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> There is a need to obtain iSCSI configuration information in a generalized
> way.  This information should allow the client to find the server using
> existing IP namespace information.
..snip..

> 
> Marjorie had mentioned that some are considering XML to support this
> function.  You will need some way to find this service.  

You misread that paragraph in the iSCSI Requirements document.  There has
been no assertion that storage devices will use XML to discover each other!
To quote the requirements document:

"... Development of specifications for iSCSI device management as MIBs, XML
schemas, etc"

Device management is much different (and entirely separate) from service
discovery and authentication.  Naming is a construct for identity, which
is necessary for authentication.

Marj


From owner-ips@ece.cmu.edu  Wed Apr 11 21:06:08 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05842
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 21:06:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BNA5e19338
	for ips-outgoing; Wed, 11 Apr 2001 19:10:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sj-msg-core-1.cisco.com (sj-msg-core-1.cisco.com [171.71.163.11])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BN91r19296
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 19:09:02 -0400 (EDT)
Received: from sponge.cisco.com (sponge.cisco.com [171.71.61.25])
	by sj-msg-core-1.cisco.com (8.9.3/8.9.1) with ESMTP id QAA10192
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 16:08:57 -0700 (PDT)
Received: from dap02w2k (rtp-dial-1-244.cisco.com [10.83.97.244])
	by sponge.cisco.com (Mirapoint)
	with SMTP id AAW00792;
	Wed, 11 Apr 2001 18:08:54 -0500 (CDT)
From: "Dave Peterson" <dap@cisco.com>
To: "Ips@Ece. Cmu. Edu" <ips@ece.cmu.edu>
Subject: Tape drives and iSCSI
Date: Wed, 11 Apr 2001 18:06:31 -0500
Message-ID: <EDEKKDKNBFCABNBAAOBBKEKMCDAA.dap@cisco.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit


Don't believe a tape backup application model will add much value to the
discussion.
Here's my view:

Tape Requirements
-----------------
1. Data integrity
2. Data integrity
3. Data integrity
4. Perform successful uninterrupted backup (within backup window is a big
plus)

Tape Specifics
--------------
Large record sizes are recommended for performance. The largest (typical)
record size is 256KB.
Using a default DataPDULength = 8192 would require 32 data PDUs.
A typical higher-end head-to-tape transfer rate is 10 MB/sec, resulting in a
backup rate of ~36 GB/hour.
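
As a rough check of the figures above, here is a minimal sketch of the
arithmetic. The constant and function names are illustrative only, and
1 GB is taken as 1000 MB, which is an assumption about the rounding used.

```python
# Figures taken from the message above; names are illustrative only.
RECORD_SIZE = 256 * 1024       # largest typical tape record, in bytes
DATA_PDU_LENGTH = 8192         # default DataPDULength, in bytes
TAPE_RATE_MB_PER_SEC = 10      # higher-end head-to-tape transfer rate


def pdus_per_record(record_size: int, pdu_length: int) -> int:
    """Number of data PDUs needed to carry one record (ceiling division)."""
    return -(-record_size // pdu_length)


def backup_rate_gb_per_hour(mb_per_sec: float) -> float:
    """Sustained rate in GB/hour, assuming 1 GB = 1000 MB."""
    return mb_per_sec * 3600 / 1000


print(pdus_per_record(RECORD_SIZE, DATA_PDU_LENGTH))    # 32
print(backup_rate_gb_per_hour(TAPE_RATE_MB_PER_SEC))    # 36.0
```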

Ramblings & Tidbits
-------------------
FCP-2 error detection and recovery is implemented by at least one high-end
tape drive vendor (and they have a significant advantage over other vendors,
i.e., their backup will not fail due to a FC link-level error).

SCSI level timeout and retry is highly dependent on the backup application.
One must first determine the state/position of the tape drive before
proceeding. Tools such as READ POSITION and LOCATE are in place for the
backup app to attempt recovery from an error. The problem has been that few
backup vendors have implemented them (but things are getting better). Thus
FC-TAPE was born and the error detection and recovery mechanism was rolled
into FCP-2 as a standard. FCP-2 provides tools to determine the
state/position of the tape drive below the SCSI level allowing a "best
effort" attempt to complete the exchange/command and not return an error to
the application. At this point in time this is what is important, i.e., DO
NOT RETURN AN ERROR TO THE APPLICATION IF AT ALL POSSIBLE.

Refer to Matt Wakeley's I/O (command) recovery write-up for a description of
iSCSI error recovery that should be used as a starting point. Maybe this has
already been done, I'm not involved in the error recovery group. What we
need is the ability to detect an error and recover below the SCSI level
(i.e., the iSCSI transport level). Any further granularity is not needed
especially due to the low error rates that will be encountered.

Regarding tape devices and maintaining state, FC-TAPE enabled drives have
the following requirements:
For non-tagged command queuing operations, the target shall retain the
Exchange information until
a) the next FCP_CMND IU has been received for that LUN from the same
initiator;
b) an FCP_CONF IU is received for the Exchange; or
c) after RR_TOV times out.
For tagged command queuing operations, the target shall retain Exchange
information until
a) an FCP_CONF IU is received for the Exchange; or
b) after RR_TOV times out.
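
The retention rules above can be sketched as a single predicate. This is
only an illustration of the conditions as listed here; the class and field
names are hypothetical and do not come from FC-TAPE or FCP-2.

```python
from dataclasses import dataclass


@dataclass
class ExchangeState:
    """Hypothetical per-Exchange bookkeeping kept by a target."""
    tagged: bool                       # True for tagged command queuing
    next_cmnd_received: bool = False   # next FCP_CMND IU seen for that LUN
                                       # from the same initiator
    conf_received: bool = False        # FCP_CONF IU received for the Exchange
    rr_tov_expired: bool = False       # RR_TOV has timed out


def may_release(ex: ExchangeState) -> bool:
    """Return True once the target may discard the Exchange information."""
    # FCP_CONF or an RR_TOV timeout releases the state in both modes.
    if ex.conf_received or ex.rr_tov_expired:
        return True
    # Only non-tagged operation may also release on the next FCP_CMND IU.
    return (not ex.tagged) and ex.next_cmnd_received


print(may_release(ExchangeState(tagged=False, next_cmnd_received=True)))  # True
print(may_release(ExchangeState(tagged=True, next_cmnd_received=True)))   # False
```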

There is a work in progress for a new tape model in the T10 SSC-2 working
group. This new model will allow for a simpler error detection and recovery
procedure and a robust command queuing implementation.

Finally, I strongly agree with the sentiment of getting the first version
"out the door".
The issues surrounding CRCs and error recovery need to be put to rest
asap.

David Peterson
Lead Architect - Standards Development
Cisco Systems - SRBU
6450 Wedgwood Road
Maple Grove, MN 55311
Office: 763-398-1007
Cell: 612-802-3299
Email: dap@cisco.com



From owner-ips@ece.cmu.edu  Wed Apr 11 21:06:53 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05853
	for <ips-archive@odin.ietf.org>; Wed, 11 Apr 2001 21:06:52 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BNA3h19335
	for ips-outgoing; Wed, 11 Apr 2001 19:10:03 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sj-msg-core-1.cisco.com (sj-msg-core-1.cisco.com [171.71.163.11])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BN97r19301
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 19:09:08 -0400 (EDT)
Received: from sponge.cisco.com (sponge.cisco.com [171.71.61.25])
	by sj-msg-core-1.cisco.com (8.9.3/8.9.1) with ESMTP id QAA10244;
	Wed, 11 Apr 2001 16:09:01 -0700 (PDT)
Received: from dap02w2k (rtp-dial-1-244.cisco.com [10.83.97.244])
	by sponge.cisco.com (Mirapoint)
	with SMTP id AAW00793;
	Wed, 11 Apr 2001 18:08:58 -0500 (CDT)
From: "Dave Peterson" <dap@cisco.com>
To: "Stephen Bailey" <steph@cs.uchicago.edu>, <ips@ece.cmu.edu>
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport" 
Date: Wed, 11 Apr 2001 18:06:34 -0500
Message-ID: <EDEKKDKNBFCABNBAAOBBMEKMCDAA.dap@cisco.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <20010409185843.6696094009@sandmail.sandburst.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit



> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> Stephen Bailey
> Sent: Monday, April 09, 2001 1:57 PM
> To: ips@ece.cmu.edu
> Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>
>
> > Exactly, I've worked in this context (though its been some years now).
> > It was true (at one time) that tape had a tractability limit, e.g.,
> > a tape backup of a terabyte was out of the question.  Has that changed?
>
> I think this is precisely the point.  Existing, off-the-shelf SCSI
> solutions DO NOT presently solve this problem.  Both ||SCSI and FCP
> burp the operation on an expectable, O(days) failure rate.  The rate of
> adoption for the FCP-2 command recovery feature is overwhelming to the
> point that the tape guys have been talking about end-running the
> problem with explicitly addressed commands.
>

Not true. The reason for the explicit command set is a result of the
difficulties encountered when attempting to perform FC sequence-level
recovery in an out-of-order environment. The command set will also allow a
more robust tape command queuing implementation.

Dave



From owner-ips@ece.cmu.edu  Thu Apr 12 01:44:15 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA13916
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 01:44:14 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3BKlKM09542
	for ips-outgoing; Wed, 11 Apr 2001 16:47:20 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3BKk1r09455
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 16:46:02 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3BLs0005535;
	Wed, 11 Apr 2001 14:54:00 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Brian Pawlowski" <beepy@netapp.com>,
        "Thomas McSweeney" <rf42tpme@us.ibm.com>
Cc: <marjorie_krueger@hp.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces
Date: Wed, 11 Apr 2001 13:44:02 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJAEDMCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <200104111801.LAA10744@tooting-fe.eng.netapp.com>
X-MIMEOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Brian and Marjorie,

There is a need to obtain iSCSI configuration information in a generalized
way.  This information should allow the client to find the server using
existing IP namespace information.  Forging connections within the server
must use normal SCSI naming appended in some manner to the IP namespace.
The SCSI naming should not be dereferenced to require an additional name
server.  DHCP provides a standard means of indicating the location of LDAP.
SLP also provides a standard means of locating LDAP.  There is a standard
alias convention of ldap.example.com for finding the LDAP server, and LDAP has
been working with many security schemes for years.  Once an LDAP schema is
defined by the IPS group, then any Directory User Agent can obtain any
auxiliary information without the need for yet another name server.

Marjorie had mentioned that some are considering XML to support this
function.  You will need some way to find this service.  You will find
little information regarding security and w3c.org conventions for naming are
system not public based as I had mentioned.  Security issues regarding XML
will be addressed.  Microsoft has announced their secure version of NT (XP)
will use XML to manage updates through third parties.  I bet this will be as
popular as Clippit.  With the desire to invent a new name server, XML
namespace seems a likely bet.  It will be interesting to see if all this can
be made secure.  It is a goal of Microsoft obviously and will keep w3c.org
busy for many years.  It does seem rather bold to base management on such
new code for something so critical.

Doug

> Wait.
>
> Are "initiators" being connected to actively by "targets"?
> Via some call back or something?
>
> I might be missing something here - I was doing a mental analogy
> to NFS clients and servers - we had to come up with a server
> name space, but the clients were initiating the mappings and
> only had to be able to find file systems on servers (forming
> an identifier).





From owner-ips@ece.cmu.edu  Thu Apr 12 02:27:07 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA23834
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 02:27:06 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3C1a6C28487
	for ips-outgoing; Wed, 11 Apr 2001 21:36:06 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3C1Zdr28463
	for <ips@ece.cmu.edu>; Wed, 11 Apr 2001 21:35:39 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3C2hS005770;
	Wed, 11 Apr 2001 19:43:32 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "KRUEGER,MARJORIE \(HP-Roseville,ex1\)" <marjorie_krueger@hp.com>,
        <ips@ece.cmu.edu>
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces
Date: Wed, 11 Apr 2001 18:33:29 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJCEEACGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <6BD67FFB937FD411A04F00D0B74FE87802A08FAB@xrose06.rose.hp.com>
Importance: Normal
X-MIMEOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Marjorie,

Although there is nothing within requirements documentation regarding how an
XML database scheme is used within an iSCSI architecture, there must be some
relationship either directly or indirectly or it does not belong.  The
aspect of using this database is unclear, so your views in this area are
welcome in relating the function you understand this database fulfilling.  I
simply said there must be some way to find this service with a substratum of
XML in whatever form it takes.  This service should be comprehensive and
provide SCSI related items to allow a client, directly or indirectly, the
means for successful communication using iSCSI.  I did not indicate this
would be used as a discovery tool, but rather there would need to be such a
means in place to allow its use.  Centralized data removes the abstraction
server concept that has been found objectionable.  I was attempting to view
such data in that light.

Doug

> > There is a need to obtain iSCSI configuration information in
> > a generalized
> > way.  This information should allow the client to find the
> > server using
> > existing IP namespace information.
> ..snip..
>
> >
> > Marjorie had mentioned that some are considering XML to support this
> > function.  You will need some way to find this service.
>
> You misread that paragraph in the iSCSI Requirements document.  There has
> been no assertion that storage devices will use XML to discover
> each other!
> To quote the requirements document:
>
> "... Development of specifications for iSCSI device management as
> MIBs, XML
> schemas, etc"
>
> Device management is much different (and entirely separate) from service
> discovery and authentication.  Naming is an a construct for
> identity, which
> is necessary for authentication.
>
> Marj
>



From owner-ips@ece.cmu.edu  Thu Apr 12 05:59:30 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA25468
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 05:59:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3C8PGk20821
	for ips-outgoing; Thu, 12 Apr 2001 04:25:16 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3C8OWr20796
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 04:24:32 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT32A6F>; Thu, 12 Apr 2001 04:25:56 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A070801541A@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: marjorie_krueger@hp.com, ips@ece.cmu.edu
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces
Date: Thu, 12 Apr 2001 04:24:25 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Marj,

> > At the next level down, there are three sorts
> > of issues floating in here:
> > - Semantics: Unique identifiers, global scope,
> > 	not tied to communication endpoints.
> > - Syntax: How those identifiers are represented
> > 	in messages and related issues of control
> > 	over use and extension of that representation
> > - Description: How the syntax is documented and
> > 	the resulting suggestions/implications for
> > 	its use.
> > To a first approximation, the above list is in
> > the order of importance to iSCSI (Semantics is
> > most important) and in REVERSE order of importance
> > to the IESG/IAB issues I've raised (i.e., the
> > biggest issues are with the Description, believe
> > it or not :-)).
> 
> I think I'm agreeing with the IESG then.  It appears that the N&D team has
> somehow decided that globally unique identifiers are necessary and of
> primary importance.  That doesn't make sense to me.

Actually that's not the IESG/IAB concern.  If iSCSI needs globally unique
identifiers, that's ok, but we MUST not invent a new potentially
all-encompassing global namespace for them.  Reuse of existing
globally unique identifiers, like WWNs, is fine.

> I think we should be
> focusing on defining a URL format for iSCSI resources.  The host name
> provides the level of uniqueness necessary to allow iSCSI to ensure further
> uniqueness within that host.  I'm waiting to hear justification from the N&D
> team regarding their focus???

Off the top of my head, this gets caught between the rock of parallel
sessions on parallel network fabrics and the hard place of authentication.
Setting up parallel sessions requires naming the session endpoints
explicitly to avoid having the sessions cross-routed in a way that a single
network device failure could take all of them out.  Authentication would
like a single identity for each side of the set of sessions to simplify
identity management.  An implicit assumption in the parallel sessions
scenario is that the ability to have multiple TCP connections in a session
is not sufficient for parallel multi-path I/O - this is based on past
discussions about concerns like CmdSN sequencing for one session that is
spread across multiple iSCSI HBA cards.

DNS works well for naming endpoints because one DNS name can resolve
to one IP address. Trying to kill both birds with one stone results in
needing to resolve one DNS name to a set of IP addresses, which DNS isn't
very good at -- round-robin risks cross-routed paths, and if we insist on using
multi-homing, the network administration community will have
our heads.  This gets even worse if the parallel paths between Initiator
and Target aren't cross-connected (e.g., the interfaces on each side
are split across two parallel networks that aren't connected, or don't
have a routable connection between them due to VLAN configuration)
as this further complicates use of DNS by requiring something to
encode this lack of connectivity to avoid any attempt to set up connections
where communication is impossible.  Hence it appears that something
else is needed to name targets and probably also initiators, at least
for authentication purposes.  The NDT folks will need to comment on
how they arrived at the "something else" they are recommending.
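
To make the cross-routing concern concrete, here is a minimal sketch of
explicit endpoint pairing. It assumes each interface carries a label for the
parallel network it sits on; the Portal type, the "fabric" label and the
addresses (documentation ranges) are all hypothetical, and this is not a
proposal for how the naming draft should work.

```python
from typing import List, NamedTuple, Tuple


class Portal(NamedTuple):
    address: str   # IP address of one interface
    fabric: str    # hypothetical label for the parallel network it is on


def plan_sessions(initiator: List[Portal],
                  target: List[Portal]) -> List[Tuple[str, str]]:
    """Pair endpoints so each session stays on one fabric.

    At most one session is planned per fabric, and endpoints on
    unconnected fabrics are never paired with each other.
    """
    sessions = []
    used = set()
    for ip in initiator:
        for tp in target:
            if ip.fabric == tp.fabric and ip.fabric not in used:
                sessions.append((ip.address, tp.address))
                used.add(ip.fabric)
    return sessions


initiator = [Portal("192.0.2.10", "A"), Portal("198.51.100.10", "B")]
target = [Portal("192.0.2.20", "A"), Portal("198.51.100.20", "B")]
print(plan_sessions(initiator, target))
```

A single DNS name that resolves to both target addresses carries no such
label, which is why a round-robin answer can pair an interface on one
network with an address that is only reachable on the other.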

My 0.02,
--David
---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Thu Apr 12 05:59:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA25479
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 05:59:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3C7iFG18631
	for ips-outgoing; Thu, 12 Apr 2001 03:44:15 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3C7hSr18607
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 03:43:28 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <2G6LFGD3>; Thu, 12 Apr 2001 03:34:10 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015415@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net, ips@ece.cmu.edu
Subject: LDAP and XML
Date: Thu, 12 Apr 2001 03:43:20 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Doug,

With my WG co-chair hat on ... this thread needs to
stop, because Marjorie's offer to include LDAP in
addition to XML as an example is reasonable, and is
all that needs to be done for the iSCSI requirements
draft.  Among the examples of the use of XML for
management software is the XML dialect of CIM used
by both DMTF and SNIA, and CIM can clearly be extended
to include iSCSI.  Hence the suggestion that XML be
removed as an inappropriate example is hereby
rejected.  The inclusion of XML as an example
does not obligate the WG to develop an XML schema
any more than inclusion of LDAP would obligate
the WG to develop an LDAP schema.  FWIW, use of
XML does not require a "database" - e.g., some
of the data retrieved by SNMP from the MIB could
also be encoded in XML for retrieval via HTTP.
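
As an illustration of that last point, a minimal sketch of encoding a few
values as XML without any database behind them. The element names and sample
values are hypothetical; no iSCSI XML schema or MIB is implied.

```python
import xml.etree.ElementTree as ET


def values_to_xml(values: dict) -> str:
    """Encode a flat mapping of management values as a small XML document."""
    root = ET.Element("iscsiDevice")          # hypothetical element name
    for name, value in values.items():
        child = ET.SubElement(root, name)     # one element per value
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Sample values standing in for data that might be read via SNMP.
doc = values_to_xml({"sysName": "target-1", "sysUpTime": 123456})
print(doc)
```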

Comments from others who feel this topic is important
are welcome, but Doug's well-known opinion on LDAP is
noted and should not be repeated on the list in
discussion of the iSCSI requirements draft prior to
the interim meeting.

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------

> -----Original Message-----
> From:	Douglas Otis [SMTP:dotis@sanlight.net]
> Sent:	Wednesday, April 11, 2001 9:33 PM
> To:	KRUEGER,MARJORIE (HP-Roseville,ex1); ips@ece.cmu.edu
> Subject:	RE: iSCSI Naming: WWUIs, URNs, and namespaces
> 
> Marjorie,
> 
> Although there is nothing within requirements documentation regarding how
> an XML database scheme is used within an iSCSI architecture, there must be
> some relationship either directly or indirectly or it does not belong.  The
> aspect of using this database is unclear, so your views in this area are
> welcome in relating the function you understand this database fulfilling.
> I simply said there must be some way to find this service with a substratum
> of XML in whatever form it takes.  This service should be comprehensive and
> provide SCSI related items to allow a client, directly or indirectly, the
> means for successful communication using iSCSI.  I did not indicate this
> would be used as a discovery tool, but rather there would need to be such
> a means in place to allow its use.  Centralized data removes the abstraction
> server concept that has been found objectionable.  I was attempting to
> view such data in that light.
> 
> Doug
> 
> > > There is a need to obtain iSCSI configuration information in
> > > a generalized
> > > way.  This information should allow the client to find the
> > > server using
> > > existing IP namespace information.
> > ..snip..
> >
> > >
> > > Marjorie had mentioned that some are considering XML to support this
> > > function.  You will need some way to find this service.
> >
> > You misread that paragraph in the iSCSI Requirements document.  There has
> > been no assertion that storage devices will use XML to discover
> > each other!
> > To quote the requirements document:
> >
> > "... Development of specifications for iSCSI device management as
> > MIBs, XML
> > schemas, etc"
> >
> > Device management is much different (and entirely separate) from service
> > discovery and authentication.  Naming is an a construct for
> > identity, which
> > is necessary for authentication.
> >
> > Marj
> >


From owner-ips@ece.cmu.edu  Thu Apr 12 05:59:45 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA25490
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 05:59:44 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3C7sFn19153
	for ips-outgoing; Thu, 12 Apr 2001 03:54:15 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxbh4.isus.emc.com (mxbh4.isus.emc.com [168.159.208.52])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3C7rZr19133
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 03:53:35 -0400 (EDT)
Received: by mxbh4.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <2XNBRD71>; Thu, 12 Apr 2001 03:53:29 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015416@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: iSCSI Requirements Draft - Informal WG Last Call
Date: Thu, 12 Apr 2001 03:53:28 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

It is intended to submit draft-ietf-ips-iscsi-reqmts-02.txt
as an Informational RFC. There is no formal requirement for
a WG Last Call, but if you have any further substantive comments
on the document please raise them on this list within the next
two weeks, i.e. by April 27th at the latest.

If you have typographical/editorial comments please send them
direct to the document's author, Marjorie Krueger
<marjorie_krueger@hp.com>.

Thanks,
--David and Elizabeth, IPS WG co-chairs



From owner-ips@ece.cmu.edu  Thu Apr 12 06:59:25 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id GAA25817
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 06:59:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3C8VGr21184
	for ips-outgoing; Thu, 12 Apr 2001 04:31:16 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3C8USr21106
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 04:30:28 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <2G6LFHVM>; Thu, 12 Apr 2001 04:21:10 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A070801541B@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic reco
	very
Date: Thu, 12 Apr 2001 04:30:20 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Summarizing the state of this discussion:

(1) The idea for use of regular CmdSN numbering and
an "immediate" flag for immediate commands instead of
a zero CmdSN appears to have received a positive reception,
although the details of how CmdSN is used for immediate
commands are still open (use number in sequence vs.
duplicate what would be the next number).  Since
immediate commands are currently exempt from the
requirement to deliver from iSCSI to SCSI in order,
the default way to do this would be to exempt them
from the "MUST deliver in order" text that Doug
repeatedly quotes.

(2) It looks like some words are in order to advise
implementers to watch out for TASK_SET_FULL interactions
with CmdSN-based windowing possibly including restrictions
on issuing TASK_SET_FULL when the CmdSN window is still
open.  While the TASK_SET_FULL usage I described isn't
always present or used, there are Initiators and Targets
that behave in that fashion and hence we need some words
to advise those who plan to carry that mechanism forward
in order to avoid surprises when SCSI code (or code above
it) receives a TASK_SET_FULL.

Now the open issue:

(3) My understanding of the outstanding issue is the
rejection mechanism whereby arrival of an immediate
command with a CmdSN higher than the CmdSN of the last
command delivered to SCSI would then cause rejection
(return to sender) of all undelivered commands with
lower CmdSNs.  I think Doug is asking that support for
this mechanism be available but not be required.
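
For concreteness, here is a rough sketch of the behavior as I understand it
from items (1) and (3). It is an assumption-laden illustration, not text
from the draft: the Sequencer class and its fields are hypothetical, the
immediate command is assumed to carry its own CmdSN, and serial-number
wraparound is ignored.

```python
class Sequencer:
    """Target-side delivery of commands from iSCSI to SCSI (sketch only)."""

    def __init__(self, first_cmdsn=1, reject_on_bypass=False):
        self.expected = first_cmdsn       # next CmdSN to deliver in order
        self.held = {}                    # CmdSN -> command waiting on a gap
        self.delivered = []               # commands handed to SCSI
        self.rejected = []                # commands returned to the sender
        self.reject_on_bypass = reject_on_bypass

    def receive(self, cmdsn, name, immediate=False):
        if immediate:
            # Immediate commands are exempt from in-order delivery.
            last_delivered = self.expected - 1
            if self.reject_on_bypass and cmdsn > last_delivered:
                # Optional mechanism: return undelivered lower CmdSNs.
                for sn in sorted(s for s in self.held if s < cmdsn):
                    self.rejected.append(self.held.pop(sn))
            self.delivered.append(name)
            return
        # Regular commands wait until every lower CmdSN has been delivered.
        self.held[cmdsn] = name
        while self.expected in self.held:
            self.delivered.append(self.held.pop(self.expected))
            self.expected += 1


seq = Sequencer(reject_on_bypass=True)
seq.receive(2, "write-2")                 # held: CmdSN 1 has not arrived
seq.receive(3, "write-3")                 # held
seq.receive(4, "abort", immediate=True)   # bypasses the sequencer
print(seq.delivered)   # ['abort']
print(seq.rejected)    # ['write-2', 'write-3']
```

With reject_on_bypass left False, the two writes simply stay held until
CmdSN 1 arrives, which is the case where a queue of commands builds up
behind the gap.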

The overall goal for this is apparently:

> My concern is regarding the affect [on] existing
> interfaces where there is not this potential for
> a large queue of commands that can not be purged
> except through action at the target at a much
> later point in time.

> I am asking for predictable behavior in the event
> of the sequencer being bypassed.

First of all, we need realistic motivating scenarios for
this.  The Rewind scenario isn't realistic, as Doug has
agreed that it's an application bug.  Beyond this, the
statement that existing interfaces cannot build up a large
queue of unpurgeable commands is false - if the application
issues a bunch of writes just after the Rewind, the same
situation results, and again the application is buggy.

The original motivating scenarios involved task management
commands, but I thought the following comment from Bob
Snively closed out those scenarios:

  At present, SCSI applications do not have a clear
  guarantee of the order between task management functions
  and the processing or delivery of any particular task.
  In fact, the concept of an ORDERED attribute applied
  to a task management function is unknown.  As a result,
  SCSI drivers have to be aware of the implications.

I'm assuming that Bob's comment applies to tape as well
as disk (tape experts, please correct this if it's wrong).

Absent a plausible description of what's broken, it's
hard to justify designing a fix.  There are already
situations in which SCSI behavior is unpredictable,
so a general appeal to the goodness of predictability
does not suffice - it needs to be accompanied by one
or more realistic scenarios in which such a fix would
make a visible difference.

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Thu Apr 12 08:42:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id IAA28287
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 08:42:36 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3C84Fq19716
	for ips-outgoing; Thu, 12 Apr 2001 04:04:15 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from maho3msx2.isus.emc.com (maho3msx2.isus.emc.com [168.159.208.81])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3C83vr19705
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 04:03:57 -0400 (EDT)
Received: by maho3msx2.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <2H17N3WF>; Thu, 12 Apr 2001 04:03:48 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015417@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: Draft deadline for Interim meeting
Date: Thu, 12 Apr 2001 04:03:47 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Internet-Drafts intended for discussion at the
April 30th/May 1st interim meeting need to be
submitted to the Internet-Drafts servers by a
week from today, Thursday, April 19th.  That'll
give everyone at least a week to read them.
Earlier submission is better and will earn the
gratitude of your WG co-chairs.  Sorry for the
short notice, and we're bending the usual 2 week
timeframe to try to accommodate work in progress.    

Thanks,
--David and Elizabeth



From owner-ips@ece.cmu.edu  Thu Apr 12 13:23:58 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA05406
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 13:23:56 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3CFDUV24510
	for ips-outgoing; Thu, 12 Apr 2001 11:13:30 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e1.ny.us.ibm.com ([32.97.182.101])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3CFCrr24477
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 11:12:53 -0400 (EDT)
Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.117.200.22])
	by e1.ny.us.ibm.com (8.9.3/8.9.3) with ESMTP id LAA118550;
	Thu, 12 Apr 2001 11:11:19 -0400
Received: from d04nms25.raleigh.ibm.com (d04nms25.raleigh.ibm.com [9.67.228.6])
	by northrelay02.pok.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id LAA125866;
	Thu, 12 Apr 2001 11:08:03 -0400
Importance: Normal
Subject: RE: iSCSI Naming: WWUIs, URNs, and namespaces
To: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
Cc: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OF371A7397.9AB6BD7E-ON85256A2C.00473C6C@raleigh.ibm.com>
From: "Thomas McSweeney" <rf42tpme@us.ibm.com>
Date: Thu, 12 Apr 2001 11:12:34 -0400
X-MIMETrack: Serialize by Router on D04NMS25/04/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/12/2001 11:12:43 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Sorry for using obsolete terminology, but I feel uncomfortable using
anything but the latest published spellings for the text keys on the Login
Command.  If there is an updated copy of the spec available (more recent
than the -05 version), I'll go find it and use the corrected terms from it
in the future.

I know what DNS names look like, but I did a little experimenting with
my machine's host name.  I discovered that it does have one despite the
fact that I use DHCP - whoever installed Windows2000 must have typed in the
machine's serial number as the name (and set it to automatically append the
DNS suffix).  I tried to delete the name, but Windows2000 wouldn't let me.
I also learned that our DNS does not resolve that hostname to an IP address
-  I can't ping my hostname from anywhere, except from itself.  Finally, I
changed it to match the hostname of another machine in my office which does
not use DHCP, and rebooted.  It came up OK and connected to the net just
fine.  I'm not advocating this, in fact I quickly changed my hostname again
to make it unique (I agree that all fully qualified hostnames should be
unique).

This solidifies in my mind the difference between the iSCSI resource's name
and its location.  I don't object to using the hostname as (part of) the
iSCSI name (perhaps qualified with a locally-unique identifier of a
specific iSCSI resource), as long as the spec does not require an initiator
to find its target's location by passing the hostname portion of the iSCSI
name to a DNS.  It would be OK for an implementation to do such a location
lookup, but it restricts functionality (e.g., if the target is temporarily
moved to another host without being renamed, the initiator won't be able to
find it).  I think this is consistent with what David Black said in his
append.
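
A minimal sketch of the name/location distinction, in Python. Everything in it is an assumption made for illustration: the function `locate_target`, the "hostname:local-id" name layout, the location table, and the port number are hypothetical and are not taken from the spec. The point is only that the name is looked up as an identifier first, and a DNS lookup on its hostname portion is an optional convenience rather than a requirement.

```python
from typing import Callable, Dict, Optional, Tuple

Address = Tuple[str, int]  # (IP address, TCP port)


def locate_target(
    iscsi_name: str,
    location_table: Dict[str, Address],
    dns_lookup: Optional[Callable[[str], Optional[str]]],
    port: int,
) -> Optional[Address]:
    """Map an iSCSI name to a network location (hypothetical helper)."""
    # 1. Treat the name purely as an identifier: a configured or
    #    discovered location wins, wherever the target currently lives.
    if iscsi_name in location_table:
        return location_table[iscsi_name]
    # 2. Optional fallback: assume the part before the first ':' is a
    #    hostname and ask DNS.  This fails if the target has moved.
    if dns_lookup is not None:
        hostname = iscsi_name.split(":", 1)[0]
        ip = dns_lookup(hostname)
        if ip is not None:
            return (ip, port)
    return None


# Usage: the target was moved to another host without being renamed.
table = {"storage1.example.com:disk0": ("192.0.2.10", 5003)}
print(locate_target("storage1.example.com:disk0", table, None, 5003))
```

With the table entry present, the initiator still finds the target after the move; with only the DNS fallback it would look on the old host.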

Tom McSweeney
iSCSI Development, Storage Systems Group, IBM
Email: rf42tpme@us.ibm.com
Phone: (USA) 919-254-5634  (tie line: 444-5634)
Fax:   (USA) 919-254-0391  (tie line: 444-0391)



From owner-ips@ece.cmu.edu  Thu Apr 12 14:37:34 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA07215
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 14:37:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3CHQbg04603
	for ips-outgoing; Thu, 12 Apr 2001 13:26:37 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3CHPvr04566
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 13:25:58 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRBQT>; Thu, 12 Apr 2001 10:25:39 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B1733FD@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "Ips (E-mail)" <ips@ece.cmu.edu>
Subject: RE: DRAFT Minneapolis Minutes -- ACA Discussion
Date: Thu, 12 Apr 2001 10:25:39 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> - SCSI ACA discussion
>
> ACA (Auto Contingent Allegiance) is an optional SCSI mechanism that stops
> execution of a sequence of dependent SCSI commands when one of them fails.
> The situation surrounding it is complex - T10 specifies ACA in SAM2, and
> hence iSCSI has to specify it and endeavor to make sure that ACA gets
> implemented sufficiently (two independent interoperable implementations)
> to avoid dropping ACA in the transition from Proposed Standard to Draft
> Standard.  On the list David Black noted that this would make ACA
> implementation at least a "SHOULD" rather than a "MAY".
> 
A point of clarification:

Without going into the gory details, the ACA mechanism has been suggested as
a way of avoiding the loss of strict ordering that occurs when a logical
unit completes a command with an error status.
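
To make the ordering problem concrete, here is a toy model in Python. It is a simplified sketch and not SAM-2 behavior in full: the class `ToyLogicalUnit`, its methods, and the way commands are marked as failing are hypothetical names chosen for illustration, and real ACA handling (the NACA bit, per-initiator state, the CLEAR ACA task management function) is more involved.

```python
from typing import List

GOOD = "GOOD"
CHECK_CONDITION = "CHECK CONDITION"
ACA_ACTIVE = "ACA ACTIVE"


class ToyLogicalUnit:
    """Toy logical unit that processes commands strictly in arrival order."""

    def __init__(self, use_aca: bool) -> None:
        self.use_aca = use_aca
        self.aca_active = False
        self.executed: List[str] = []  # commands that actually ran

    def submit(self, name: str, succeeds: bool = True) -> str:
        # While ACA is in effect, later commands are not executed.
        if self.aca_active:
            return ACA_ACTIVE
        if succeeds:
            self.executed.append(name)
            return GOOD
        # A failing command establishes ACA only if ACA is in use.
        if self.use_aca:
            self.aca_active = True
        return CHECK_CONDITION

    def clear_aca(self) -> None:
        # Stand-in for the initiator clearing the condition after recovery.
        self.aca_active = False


for use_aca in (False, True):
    lu = ToyLogicalUnit(use_aca)
    statuses = [
        lu.submit("write-1"),
        lu.submit("write-2", succeeds=False),
        lu.submit("write-3"),
    ]
    print(use_aca, statuses, lu.executed)
```

Without ACA, write-3 runs even though write-2 failed, so the intended order is lost; with ACA, write-3 is held off until the initiator has dealt with the error.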

Keeping in mind the underlying iSCSI issue, I assume the question here is
not support for the ACA function as a SCSI option but whether or not iSCSI
will MANDATE the implementation of ACA as a condition for iSCSI compliance.
In that context, "dropping ACA" would amount to not requiring a logical unit
to implement the feature.

A subsidiary issue, not discussed, was whether to request a T10 extension to
the ACA semantics to cover command processing errors other than those
represented by the CHECK CONDITION status (e.g., QUEUE_FULL). 

Charles
> -----Original Message-----
> From: Black_David@emc.com [mailto:Black_David@emc.com]
> Sent: Friday, April 06, 2001 7:27 PM
> To: ips@ece.cmu.edu
> Subject: DRAFT Minneapolis Minutes
> 
> 
> Many thanks to Mark Carlson and Elizabeth Rodriguez for
> taking the minutes.  Please send corrections to the list,
> as well as any objections to rough consensus decisions
> reached in the meeting.  The deadline for changes to
> get incorporated in the official minutes is the end of
> the day on Wednesday, April 11th.  Thanks, --David
> 
> IPS WG Meeting Minutes - DRAFT
> IETF #50
> Minneapolis MN
> 
> 
> Interim Meeting - There will be an interim meeting for the IPS working
> group.  It will be co-located with T10 in Nashua, NH on April 30 & May 1.
> FC related topics will be covered on Monday, April 30.
> SCSI related documents will be covered on Tuesday, May 1.
> In addition to the IPS meetings, on Wednesday, May 2, during the
> T10 CAP meeting, a discussion of a SCSI MIB will be held.
> Details of this meeting, including location and hotel room information,
> have been sent to the IPS mailing list.
>
> TCP framing discussion will be in TSVWG WG meeting the evening of Monday,
> March 19.  IPS working group attendees are encouraged to attend, be
> familiar with the draft, and ask questions.  The draft was cross-posted
> to both IPS and TSVWG mailing lists.
> To subscribe to the TSVWG mailing list or to view the archive, see
> www.ietf.org/mailman/listinfo/tsvwg
>
> RDMA - RDMA is a separate effort from the framing discussion.
> A separate mailing list has been formed to address RDMA.
> To join, send an email to rdma-subscribe@yahoogroups.com .
> ***NOTE***: This address has been corrected from the one discussed in the
> meeting.
>
> T10 is considering a new SCSI model.  T10 participants agreed that it may
> have advantages but will be very different from the current SCSI model.
> Therefore it is currently deferred to SAM-3.
> This model may be of interest to iSCSI, as it will have advantages for
> iSCSI.
> 
> -- iSCSI Requirements --
> 
> iSCSI requirements - document is almost complete.
> draft-ietf-ips-iscsi-rqmts-01.txt.
> 1 more draft will be sent out, and then a last call will occur in
> approximately 1 month with the hope of submitting the result as an
> informational RFC around the time of the interim meeting (end of April).
> Everyone should review the draft on the list and especially check that
> the MUSTs and SHOULDs are correct and nothing is missing.
> 
> -- iSCSI specification - draft-ietf-ips-iscsi-05.txt --
> 
> - CRCs for iSCSI.  Two presentations made with respect to CRC vs checksum
> usage in iSCSI.
> Recommendation was to use a 32 bit CRC; three mentioned as candidates -
> CRC-32C, CRC-32Q and CCITT-CRC32.  Reported that there really is no need
> for a 64 bit CRC.
> Consensus call on use of CRC instead of checksum.  Loud CRC hum; no hum
> for checksum, so use of CRC rather than checksums is the rough consensus
> of this meeting.
> Consensus call will be taken on which CRC to use at interim meeting, so
> that WG has more time to investigate and understand the options presented.
>
> Request made for a single mandatory to implement CRC algorithm as opposed
> to making the CRC algorithm selection negotiable.
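
For readers unfamiliar with the first candidate named above, the following is a minimal bitwise sketch of CRC-32C (the Castagnoli polynomial) in Python. It is illustrative only: the minutes list CRC-32C as one of three candidates and record no choice, and the function name `crc32c` and the bit-at-a-time structure are assumptions of this sketch, not text from any draft. The parameters are the commonly published ones (reflected polynomial 0x82F63B78, initial value and final XOR of 0xFFFFFFFF).

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected


def crc32c(data: bytes) -> int:
    """Bit-at-a-time CRC-32C; slow but easy to check by hand."""
    crc = 0xFFFFFFFF  # initial value
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; fold in the polynomial if a 1 fell out.
            if crc & 1:
                crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF  # final XOR


# The standard check string for CRC catalogues.
print(hex(crc32c(b"123456789")))  # 0xe3069283
```

A production implementation would normally use a lookup table or hardware support; the loop above only shows what is being computed.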
> 
> - iSCSI Digests.  Current draft has multiple digests.
> 3 proposals presented for digest and related header formats.
> Request made for use of TLV format in whatever solution is finalized on.
> Consensus call made to have 1 header digest instead of multiple.
> Barry Reinhold and Julian Satran were asked to work and come back with a
> single proposed format at the Wed meeting.  The rough consensus of the
> meeting was that the data length should always be in the same place in
> the header.
>
> Consensus call - descriptors and diagrams should always be kept together
> in the document.  This is the rough consensus of the meeting.
>
> Error Recovery will be addressed at the interim meeting.
> 
> 
> - Security
> Public Key and TLS were removed in version 05 of the iSCSI draft.
> Public Key will be reinstated in the 06 draft.
> MUST provide authentication and data integrity.
> MAY provide data privacy (encryption).
> 
> Need to have at least 1 mandatory to implement security protocol.
> IPSec seems to be the selection in current draft.
> Consensus call - Make IPSec mandatory to implement?  Arguments against, no
> consensus.
> Consensus call - mandatory to implement SRP?  Hum against.  Will be taken
> to interim meeting.
> Both Public Key and Radius authentication mechanisms will be added to the
> next version of the draft.
> 
> - SCSI ACA discussion
> 
> ACA (Auto Contingent Allegiance) is an optional SCSI mechanism that stops
> execution of a sequence of dependent SCSI commands when one of them fails.
> The situation surrounding it is complex - T10 specifies ACA in SAM2, and
> hence iSCSI has to specify it and endeavor to make sure that ACA gets
> implemented sufficiently (two independent interoperable implementations)
> to avoid dropping ACA in the transition from Proposed Standard to Draft
> Standard.  On the list David Black noted that this would make ACA
> implementation at least a "SHOULD" rather than a "MAY".
>
> The current iSCSI draft has language requiring use of ACA rather than
> implementation; that's overspecified (it's ok for cooperating iSCSI nodes
> to decide not to use ACA), and will be changed.
>
> In practice, ACA is not a complete solution (e.g., if a Fibre Channel
> switch drops a frame due to a CRC error, ACA will not kick in).  T10 has
> been working on other mechanisms that address problems that ACA addresses
> in a more comprehensive fashion, but has not moved to drop ACA from SAM2,
> hence iSCSI has to deal with it.
> 
> 
> -- iSCSI Boot presentation - draft-ietf-ips-iscsi-boot-02.txt
> 
> This draft is relatively stable - the bulk of this draft will become an
> informational RFC rather than standards track.  David Black will figure
> out whether some piece of it needs to be standards track.
>
> -- iSCSI MIB work presented. - draft-bakke-iscsimib-02.txt
>
> Difficulty in making this an iSCSI MIB only,
> in that there is no SCSI MIB currently, but people want to see both iSCSI
> and SCSI information addressed.
> SCSI MIB will be a topic on the agenda at the next T10 CAP meeting, May 2.
> iSCSI MIB accepted as a WG item; next submission will be official WG item.
> 
> -- iSCSI naming and Discovery presented -
> draft-ietf-ips-iscsi-name-disc-00.txt
>
> Two discovery methods presented - iSNS and SLP based.
>
> iSNS is a new name server, specific to IPS.  Question raised as to why not
> use SLP.  A separate SLP presentation followed; both iSNS and SLP can be
> used.  SLP works for basic discovery whereas iSNS provides additional
> information/capabilities, including management functionality.
>
> -- URN Namespace document presented - draft-bakke-iscsi-wwui-urn-00.txt
>
> While the meeting consensus was to accept this as an official WG work
> item, this was overridden by direction from the IESG subsequent to the
> meeting.
> 
> -- SLP document presented - draft-bakke-iscsi-slp-00.txt 
> 
> Meeting consensus was to accept this as an official WG item.
> 
> -- Back to iSCSI headers (first thing Wednesday)
> 
> Follow up from Monday:  Two header formats presented to WG for
> consideration.  One has data length in fixed location, but limits size
> of data.  Second more flexible, but data length field not in fixed
> location.  More details on each format to be written up and posted to
> list; decision on which header format to follow to be decided on list;
> Julian requests decision within 10 days.  The list subsequently
> chose the "Format 2" prepared/proposed by Barry Reinhold.
>
> -- iSNS draft presented - draft-ietf-ips-isns-01.txt
>
> Primarily a status report and description of how iSNS applies to all
> three protocols in the WG (iFCP, iSCSI, FCIP).  It's required by iFCP,
> but optional for the others.
>
> Rationale for why SLP isn't enough and iSNS is needed will be sent to
> list.
>
> A concern was raised about relying on iSNS for certificate distribution
> vs. certificate exchange between the two ends of an IP Storage
> connection.  Not resolved in the meeting and will need further
> consideration as part of protocol security work.
> 
> -- Fibre Channel related topics: --
> 
> Fibre Channel Common Encapsulation team being formed.
> Small team, with a target size of 6 people.
> Interested people to see David or Elizabeth (co-chairs) by Friday,
> March 23.
>
> Three encapsulation presentations made:
>   Two had encapsulation format recommendations -
>     draft-weber-fcip-encaps-00.txt and
>     a presentation from Vi Chau (FCIP draft co-author)
>   Third consisted of requirements needed for iFCP -
>     draft-monia-ips-ifcpenc-00.txt
>
> Discussions of expectations from common encapsulation format -
> should provide some means of data integrity & synchronization by
> guarding against accidental interpretation of encapsulated data as an FC
> frame if framing synchronization to the data stream is lost and being
> recovered; this is not a requirement to provide a means of preventing
> intentional hijacking; simply a means of validation that what is seen is
> actually current and valid.
> The WG co-chair made it clear that the encapsulation design team must
> take this issue seriously.
>
> FC Frames have CRC around them, so no CRC needed there,
> but what needs to be wrapped around the common encapsulation piece
> (header and/or header + SOF/EOF codes) to ensure that data is correct?
> Statement of direction from WG, based on show of hands, is that CRC
> should be used.  This is the rough consensus of the meeting.
>
> One requirement of the draft was initially to make the common header
> extensible - not needed by FCIP or iFCP.  Consensus call - remove this
> requirement.  This is the rough consensus of this meeting.
> 
> -- FCIP draft update discussed - draft-ietf-ips-fcovertcpip-01.txt
> 
> No draft submitted since Dec IETF meeting.
> Changes made and discussed by authors, but major pending changes did
> not make updating the draft for this meeting reasonable.
> Next draft will include changes recommended at both IETF-49 and the
> interim meeting, as well as a better description of the FCIP port models,
> encapsulation, etc.
> 
> Next draft due April, prior to next interim meeting.
> 
> -- FCIP Model presentation - Mike O'Donnell
> 
> FCIP port model usage.  In current FCIP draft, unclear as to whether this
> is following a B-Port, E-Port or some other kind of port model.
> Presentation makes clear that both E-Ports and B-Ports can be used by the
> FCIP device.  In the absence of FCIP, an inter-switch link in Fibre
> Channel connects two E-ports.  If FCIP is implemented in the Fibre Channel
> switch, the result is a logical E-port communicating over FCIP.  If FCIP
> is implemented in an external bridge, a real E-port on the switch
> communicates with a B-port on the bridge and the result is still a logical
> E-port on the FCIP side of the switch.  These implementation structures
> will interoperate (i.e., a logical E-port in a switch can communicate via
> FCIP with a logical E-port implemented in a bridge that uses a B-port to
> talk to an E-port on a switch - the FCIP protocol is the same).
>
> FC-BB2 meeting announced - during T11 week in Toronto, Canada on April 9.
> Meeting is open to all.  FC-BB2 handles the aspects of FCIP usage that
> require Fibre Channel standards.
> 
> 
> -- iFCP status - draft-ietf-ips-ifcp-00.txt
> 
> 3 additional versions of the draft anticipated between now and August
> 2001.
>
> Target is that this August draft will be complete, w/o any TBDs.
>
> Noting that the draft contained a MIB led to a more general discussion of
> MIBs.  In general, MIBs should be advanced as separate documents, and
> authors need to look at the FC MIBs developed in the ipfc WG to avoid
> duplication.  The iSCSI MIB may provide a model for some of the iFCP MIB.
>
> Final note - somehow, we managed to get 3 anagrammed acronyms.  FCIP and
> iFCP are protocols in this WG, and IPFC is the IP over Fibre Channel
> protocol done by the ipfc WG.
> 


From owner-ips@ece.cmu.edu  Thu Apr 12 15:33:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA08362
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 15:33:21 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3CH2Zk02859
	for ips-outgoing; Thu, 12 Apr 2001 13:02:35 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from hoemlsrv.firewall.lucent.com (hoemail1.lucent.com [192.11.226.161])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3CH2Mr02851
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 13:02:22 -0400 (EDT)
Received: from hoemlsrv.firewall.lucent.com (localhost [127.0.0.1])
	by hoemlsrv.firewall.lucent.com (Pro-8.9.3/8.9.3) with ESMTP id NAA22413
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 13:02:10 -0400 (EDT)
Received: from nc8220exchange.ral.lucent.com (h135-92-100-21.lucent.com [135.92.100.21])
	by hoemlsrv.firewall.lucent.com (Pro-8.9.3/8.9.3) with ESMTP id NAA22393
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 13:02:10 -0400 (EDT)
content-class: urn:content-classes:message
Subject: RE: Draft deadline for Interim meeting
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Date: Thu, 12 Apr 2001 13:02:08 -0400
Disposition-Notification-To: "Elizabeth Rodriguez" <Elizabeth.Rodriguez@nc8220exchange.ral.lucent.com>
X-MimeOLE: Produced By Microsoft Exchange V6.0.4417.0
Message-ID: <D55EFF49CC829E468BE958686EDE9FDE04C9F9@nc8220exchange.ral.lucent.com>
Thread-Topic: Draft deadline for Interim meeting
Thread-Index: AcDDL8VUQ+wPUdQRRw2QgcdxnyvLygASmeiQ
From: "Elizabeth Rodriguez" <egrodriguez@lucent.com>
To: <Black_David@emc.com>, <ips@ece.cmu.edu>
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by ece.cmu.edu id f3CH2Us02853
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 8bit

Clarification --
Drafts must be sent in by 5pm ET on April 19.

Elizabeth


-----Original Message-----
From: Black_David@emc.com [mailto:Black_David@emc.com]
Sent: Thursday, April 12, 2001 3:04 AM
To: ips@ece.cmu.edu
Subject: Draft deadline for Interim meeting


Internet-Drafts intended for discussion at the
April 30th/May 1st interim meeting need to be
submitted to the Internet-Drafts servers by a
week from today, Thursday, April 19th.  That'll
give everyone at least a week to read them.
Earlier submission is better and will earn the
gratitude of your WG co-chairs.  Sorry for the
short notice, and we're bending the usual 2 week
timeframe to try to accommodate work in progress.    

Thanks,
--David and Elizabeth



From owner-ips@ece.cmu.edu  Thu Apr 12 18:44:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA12729
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 18:44:33 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3CKEga17122
	for ips-outgoing; Thu, 12 Apr 2001 16:14:42 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3CKE1r17084
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 16:14:01 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3CLLt007013;
	Thu, 12 Apr 2001 14:21:55 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>, <ips@ece.cmu.edu>
Subject: RE: LDAP and XML (RE: iSCSI Naming: WWUIs, URNs, and namespaces)
Date: Thu, 12 Apr 2001 13:11:57 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJCEEICGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <0F31E5C394DAD311B60C00E029101A0708015415@corpmx9.isus.emc.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

In my defense, whatever the database and whatever the function, it must be
related to the iSCSI architecture for it to belong within the IPS efforts.
This was a premise and not a request.  The comment was addressing a larger
concern of use of this information and not the underlying mechanism.  The
point you seem to have missed, although you will likely not agree, is that
having information abstracted through multiple servers is likely the area
causing greater concern with respect to namespace.  That was the issue
within this thread I was addressing.  I am not sure what you're reacting to,
except perhaps a misunderstanding of a general premise that I suspect you
agree with.

On a prior occasion I indicated a concern I had regarding inclusion of a
schema within the requirements document, as a related function was not
discernible, and I was seeking insight.  Even DMTF CIM standard security
requires the now unmentionable.  Am I wrong about the over-shadowing concern
of a naming server?  The comment made by Marjorie concerning IESG/IAB as it
relates to this was:

- Description: How the syntax is documented and the resulting
suggestions/implications for its use.

- Syntax: How those identifiers are represented in messages and related
issues of control over use and extension of that representation.

- Semantics: Unique identifiers, global scope, not tied to communication
endpoints.

Could you provide your understanding of how this relates to the current
architecture?  My take on these comments is to remove any new abstraction
server akin to DNS in a communication protocol.  The furtherance of that
would then imply a comprehensive means of obtaining information without this
additional abstraction layer.  In addition to providing an out-of-band
signal, what other function is disrupted with this type of change?

Doug


> Doug,
>
> With my WG co-chair hat on ... this thread needs to
> stop, because Marjorie's offer to include LDAP in
> addition to XML as an example is reasonable, and is
> all that needs to be done for the iSCSI requirements
> draft.  Among the examples of the use of XML for
> management software is the XML dialect of CIM used
> by both DMTF and SNIA, and CIM can clearly be extended
> to include iSCSI.  Hence the suggestion that XML be
> removed as an inappropriate example is hereby
> rejected.  The inclusion of XML as an example
> does not obligate the WG to develop an XML schema
> any more than inclusion of LDAP would obligate
> the WG to develop an LDAP schema.  FWIW, use of
> XML does not require a "database" - e.g., some
> of the data retrieved by SNMP from the MIB could
> also be encoded in XML for retrieval via HTTP.
>
> Comments from others who feel this topic is important
> are welcome, but Doug's well-known opinion on LDAP is
> noted and should not be repeated on the list in
> discussion of the iSCSI requirements draft prior to
> the interim meeting.
>
> Thanks,
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
> > -----Original Message-----
> > From:	Douglas Otis [SMTP:dotis@sanlight.net]
> > Sent:	Wednesday, April 11, 2001 9:33 PM
> > To:	KRUEGER,MARJORIE (HP-Roseville,ex1); ips@ece.cmu.edu
> > Subject:	RE: iSCSI Naming: WWUIs, URNs, and namespaces
> >
> > Marjorie,
> >
> > Although there is nothing within requirements documentation regarding
> > how an XML database scheme is used within an iSCSI architecture, there
> > must be some relationship either directly or indirectly or it does not
> > belong.  The aspect of using this database is unclear, so your views in
> > this area are welcome in relating the function you understand this
> > database fulfilling.  I simply said there must be some way to find this
> > service with a substratum of XML in whatever form it takes.  This
> > service should be comprehensive and provide SCSI related items to allow
> > a client, directly or indirectly, the means for successful
> > communication using iSCSI.  I did not indicate this would be used as a
> > discovery tool, but rather there would need to be such a means in place
> > to allow its use.  Centralized data removes the abstraction server
> > concept that has been found objectionable.  I was attempting to view
> > such data in that light.
> >
> > Doug
> >
> > > > There is a need to obtain iSCSI configuration information in
> > > > a generalized way.  This information should allow the client to
> > > > find the server using existing IP namespace information.
> > > ..snip..
> > >
> > > >
> > > > Marjorie had mentioned that some are considering XML to support this
> > > > function.  You will need some way to find this service.
> > >
> > > You misread that paragraph in the iSCSI Requirements document.  There
> > > has been no assertion that storage devices will use XML to discover
> > > each other!
> > > To quote the requirements document:
> > >
> > > "... Development of specifications for iSCSI device management as
> > > MIBs, XML schemas, etc"
> > >
> > > Device management is much different (and entirely separate) from
> > > service discovery and authentication.  Naming is a construct for
> > > identity, which is necessary for authentication.
> > >
> > > Marj
> > >
>



From owner-ips@ece.cmu.edu  Thu Apr 12 19:38:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA13532
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 19:37:58 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3CGGW229170
	for ips-outgoing; Thu, 12 Apr 2001 12:16:32 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxbh4.isus.emc.com (mxbh4.isus.emc.com [168.159.208.52])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3CGGOr29161
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 12:16:24 -0400 (EDT)
Received: by mxbh4.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <2XNBRM5Q>; Thu, 12 Apr 2001 12:16:18 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015426@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: FINAL  Minneapolis Minutes
Date: Thu, 12 Apr 2001 12:16:16 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Since there were no changes or corrections reported,
the DRAFT Minneapolis minutes posted to the list last
week have become the final minutes, and the rough
consensus agreements reached in the meeting are
now confirmed as the rough consensus of the WG.
A copy of those minutes (which have been submitted
for the official proceedings) is appended below.

Thanks,
--David
---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------


> IPS WG Meeting Minutes - FINAL
> IETF #50
> Minneapolis MN
> 
> 
> Interim Meeting - There will be an interim meeting for the IPS working
> group. 
> It will be co-located with T10 in Nashua, NH on April 30 & May 1.  
> FC related topics will be covered on Monday, April 30.  
> SCSI related documents will be covered on Tuesday, May 1.
> In addition to the IPS meetings, on Wednesday, May 2, during the
> T10 CAP meeting, a discussion of a SCSI MIB will be held. 
> Details of this meeting, including location and hotel room information,
> have been sent to the IPS mailing list.
> 
> TCP framing discussion will be in TSVWG WG meeting the evening of Monday,
> March 19.  IPS working group attendees are encouraged to attend, be
> familiar with the draft, and ask questions.  The draft was cross-posted
> to both IPS and TSVWG mailing lists.
> To subscribe to the TSVWG mailing list or to view the archive, see
> www.ietf.org/mailman/listinfo/tsvwg
>
> RDMA - RDMA is a separate effort from the framing discussion.
> A separate mailing list has been formed to address RDMA.
> To join, send an email to rdma-subscribe@yahoogroups.com .
> ***NOTE***: This address has been corrected from the one discussed in the
> meeting.
>
> T10 is considering a new SCSI model.  T10 participants agreed that it may
> have advantages but will be very different from the current SCSI model.
> Therefore it is currently deferred to SAM-3.
> This model may be of interest to iSCSI, as it will have advantages for
> iSCSI.
> 
> -- iSCSI Requirements --
> 
> iSCSI requirements - document is almost complete.
> draft-ietf-ips-iscsi-rqmts-01.txt.
> 1 more draft will be sent out, and then a last call will occur in
> approximately 1 month with the hope of submitting the result as an
> informational RFC around the time of the interim meeting (end of April).
> Everyone should review the draft on the list and especially check that
> the MUSTs and SHOULDs are correct and nothing is missing.
> 
> -- iSCSI specification - draft-ietf-ips-iscsi-05.txt --
> 
> - CRCs for iSCSI.  Two presentations made with respect to CRC vs checksum
> usage in iSCSI.
> Recommendation was to use a 32 bit CRC; three mentioned as candidates -
> CRC-32C, CRC-32Q and CCITT-CRC32.  Reported that there really is no need
> for a 64 bit CRC.
> Consensus call on use of CRC instead of checksum.  Loud CRC hum; no hum
> for checksum, so use of CRC rather than checksums is the rough consensus
> of this meeting.
> Consensus call will be taken on which CRC to use at interim meeting, so
> that WG has more time to investigate and understand the options presented.
>
> Request made for a single mandatory to implement CRC algorithm as opposed
> to making the CRC algorithm selection negotiable.
> 
> - iSCSI Digests.  Current draft has multiple digests.
> 3 proposals presented for digest and related header formats.
> Request made for use of TLV format in whatever solution is finalized on.
> Consensus call made to have 1 header digest instead of multiple.
> Barry Reinhold and Julian Satran were asked to work and come back with a
> single proposed format at the Wed meeting.  The rough consensus of the
> meeting was that the data length should always be in the same place in
> the header.
>
> Consensus call - descriptors and diagrams should always be kept together
> in the document.  This is the rough consensus of the meeting.
>
> Error Recovery will be addressed at the interim meeting.
> 
> 
> - Security
> Public Key and TLS were removed in version 05 of the iSCSI draft.
> Public Key will be reinstated in the 06 draft.
> MUST provide authentication and data integrity.
> MAY provide data privacy (encryption).
> 
> Need to have at least 1 mandatory to implement security protocol.
> IPSec seems to be the selection in current draft.
> Consensus call - Make IPSec mandatory to implement?  Arguments against, no
> consensus.
> Consensus call - mandatory to implement SRP?  Hum against.  Will be taken
> to interim meeting.
> Both Public Key and Radius authentication mechanisms will be added to the
> next version of the draft.
> 
> - SCSI ACA discussion
> 
> ACA (Auto Contingent Allegiance) is an optional SCSI mechanism that stops
> execution of a sequence of dependent SCSI commands when one of them fails.
> The situation surrounding it is complex - T10 specifies ACA in SAM2, and
> hence iSCSI has to specify it and endeavor to make sure that ACA gets
> implemented sufficiently (two independent interoperable implementations)
> to avoid dropping ACA in the transition from Proposed Standard to Draft
> Standard.  On the list David Black noted that this would make ACA
> implementation at least a "SHOULD" rather than a "MAY".
> 
> The current iSCSI draft has language requiring use of ACA rather than
> implementation; that's overspecified (it's ok for cooperating iSCSI nodes
> to decide not to use ACA), and will be changed.
> 
> In practice, ACA is not a complete solution (e.g., if a Fibre Channel
> switch drops a frame due to a CRC error, ACA will not kick in).  T10 has
> been working on other mechanisms that address problems that ACA addresses
> in a more comprehensive fashion, but has not moved to drop ACA from SAM2,
> hence iSCSI has to deal with it.
> 
> 
> -- iSCSI Boot presentation - draft-ietf-ips-iscsi-boot-02.txt
> 
> This draft is relatively stable - the bulk of this draft will become an
> informational RFC rather than standards track.  David Black will figure
> out whether some piece of it needs to be standards track.
> 
> -- iSCSI MIB work presented. - draft-bakke-iscsimib-02.txt
> 
> Difficulty in making this an iSCSI MIB only,
> in that there is no SCSI MIB currently, but people want to see both iSCSI
> and SCSI information addressed.
> SCSI MIB will be a topic on the agenda at the next T10 CAP meeting, May 2.
> 
> iSCSI MIB accepted as a WG item; next submission will be official WG item.
> 
> -- iSCSI naming and Discovery presented  -
> draft-ietf-ips-iscsi-name-disc-00.txt
> 
> Two discovery methods presented - iSNS and SLP based.
> 
> iSNS is a new name server, specific to IPS.  Question raised as to why
> not use SLP.
> Separate SLP presentation followed; both iSNS and SLP can be used.
> SLP works for basic discovery whereas iSNS provides additional
> information/capabilities, including management functionality.
> 
> -- URN Namespace document presented - draft-bakke-iscsi-wwui-urn-00.txt
> 
> While the meeting consensus was to accept this as an official WG work
> item,
> this was overridden by direction from the IESG subsequent to the meeting.
> 
> -- SLP document presented - draft-bakke-iscsi-slp-00.txt 
> 
> Meeting consensus was to accept this as an official WG item.
> 
> -- Back to iSCSI headers (first thing Wednesday)
> 
> Follow up from Monday:  Two header formats presented to WG for
> consideration.   One has data length in fixed location, but limits size
> of data.  Second more flexible, but data length field not in fixed
> location.
> More details on each format to be written up and posted to list; 
> decision on which header format to follow to be decided on list; 
> Julian requests decision within 10 days.  The list subsequently
> chose the "Format 2" prepared/proposed by Barry Reinhold.
> 
> -- iSNS draft presented - draft-ietf-ips-isns-01.txt
> 
> Primarily a status report and description of how iSNS applies to all three
> protocols in the WG (iFCP, iSCSI, FCIP).  It's required by iFCP, but
> optional for the others.
> 
> Rationale for why SLP isn't enough and iSNS is needed will be sent to
> list.
> 
> A concern was raised about relying on iSNS for certificate distribution
> vs. certificate exchange between the two ends of an IP Storage connection.
> Not resolved in the meeting and will need further consideration as part of
> protocol security work.
> 
> -- Fibre Channel related topics: --
> 
> Fibre Channel Common Encapsulation team being formed.  
> Small team, with a target size of 6 people.  
> Interested people to see David or Elizabeth (co-chairs) by Friday, March
> 23.
> 
> Three encapsulation presentations made:
>   Two had encapsulation format recommendations -
>   draft-weber-fcip-encaps-00.txt and a presentation from Vi Chau
>   (FCIP draft co-author)
>   Third consisted of requirements needed for iFCP -
>   draft-monia-ips-ifcpenc-00.txt
> 
> Discussions of expectations from common encapsulation format -
> should provide some means of data integrity & synchronization by
> guarding against accidental interpretation of encapsulated data as an FC
> frame if framing synchronization to the data stream is lost and being
> recovered; this is not a requirement to provide a means of preventing
> intentional hijacking, simply a means of validation that what is seen is
> actually current and valid.
> The WG co-chair made it clear that the encapsulation design team must take
> this issue seriously.
> 
> FC Frames have CRC around them, so no CRC needed there,
> but what needs to be wrapped around the common encapsulation piece (header
> and/or header + SOF/EOF codes) to ensure that data is correct?
> Statement of direction from WG, based on show of hands, is that CRC should
> be used.
> This is the rough consensus of the meeting.
> 
> One requirement of the draft was initially to make the common header
> extensible - not needed by FCIP or iFCP.  Consensus call - remove this
> requirement.  This is the rough consensus of this meeting.
> 
> -- FCIP draft update discussed - draft-ietf-ips-fcovertcpip-01.txt
> 
> No draft submitted since Dec IETF meeting.
> Changes made and discussed by authors, but major pending changes meant
> that updating the draft for this meeting was not reasonable.
> Next draft will include changes recommended at both IETF-49 and the
> interim meeting, as well as a better description of the FCIP port models,
> encapsulation, etc.
> 
> Next draft due April, prior to next interim meeting.
> 
> -- FCIP Model presentation - Mike O'Donnell
> 
> FCIP port model usage.  In current FCIP draft, unclear as to whether this
> is following a B-Port, E-Port or some other kind of port model.
> Presentation makes clear that both E-Ports and B-Ports can be used by the
> FCIP device.  In the absence of FCIP, an inter-switch link in Fibre
> Channel connects two E-ports.  If FCIP is implemented in the Fibre Channel
> switch, the result is a logical E-port communicating over FCIP.
> If FCIP is implemented in an external bridge, a real E-port on the switch
> communicates with a B-port on the bridge and the result is still a logical
> E-port on the FCIP side of the switch.  These implementation structures
> will interoperate (i.e., a logical E-port in a switch can communicate via
> FCIP with a logical E-port implemented in a bridge that uses a B-port to
> talk to an E-port on a switch - the FCIP protocol is the same).
> 
> FC-BB2 meeting announced - during T11 week in Toronto, Canada on April 9.
> Meeting is open to all.  FC-BB2 handles the aspects of FCIP usage that
> require Fibre Channel standards.
> 
> 
> -- iFCP status - draft-ietf-ips-ifcp-00.txt
> 
> 3 additional versions of the draft anticipated between now and August
> 2001.
> 
> Target is that this August draft will be complete, w/o any TBDs.
> 
> Noting that the draft contained a MIB led to a more general discussion of
> MIBs.
> In general, MIBs should be advanced as separate documents, and authors
> need to look at the FC MIBs developed in the ipfc WG to avoid duplication.
> The iSCSI MIB may provide a model for some of the iFCP MIB.
> 
> Final note - somehow, we managed to get 3 anagrammed acronyms.  FCIP and
> iFCP are protocols in this WG, and IPFC is the IP over Fibre Channel
> protocol done by the ipfc WG.


From owner-ips@ece.cmu.edu  Thu Apr 12 20:48:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA14367
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 20:48:13 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3CM0ms24767
	for ips-outgoing; Thu, 12 Apr 2001 18:00:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3CLxur24707
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 17:59:57 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP id EA17E1C7B
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 14:59:54 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id OAA03900
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 14:59:50 -0700 (PDT)
Message-ID: <3AD626A7.452D6579@cup.hp.com>
Date: Thu, 12 Apr 2001 15:05:27 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: IPS Reflector <ips@ece.cmu.edu>
Subject: iSCSI : More problems with Status SNACK !
Content-Type: multipart/mixed;
 boundary="------------DB282946F24C71A602A8FDFE"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------DB282946F24C71A602A8FDFE
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Julian & All,

Here are 2 more problems with the current model of Status SNACK. This
current model, IMO, is un-usable due to the reasons stated below, in
addition to reasons stated in earlier mails on this issue. iSCSI MUST
use a different Status SNACK model. See proposal below in that regard.

1)Section 2.16. S[N]ACK Request
-------------------------------
The rev 05 spec states that SACK also implicitly acknowledges data or
status PDUs.

Now, consider the following scenario :

ExpStatSN = 0
StatSNs 1 - 10 are sent from target to initiator.

-------> StatSN 1
-------> StatSN 2
-------> StatSN 4
-------> StatSN 5
-------> StatSN 6
-------> StatSN 8
-------> StatSN 9
-------> StatSN 10

If StatSN 3 & 7 are missing, the initiator would then issue a Status SACK
for 3 & 7. As per the above rule, SACK for 3 implicitly acknowledges 1 &
2. (ExpStatSN advances to 3). However, SACK for 7 will implicitly
acknowledge 3 - 7, whereas 3 is still a hole !

The current SNACK model MUST NOT be used as an implicit acknowledgement,
since it can cause status resources at the target to be freed prematurely,
before the initiator has received the status.
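
The failure mode in (1) can be reproduced with a small simulation. This is
only a sketch under stated assumptions: it models the rev 05 rule as "a SACK
for StatSN n implicitly acknowledges every StatSN below n", it ignores
sequence number wrap-around, and the names Target and on_status_sack are
hypothetical, not taken from the draft.

```python
class Target:
    """Toy target that retains status PDUs until they are acknowledged."""

    def __init__(self, stat_sns):
        # Status PDUs kept for possible retransmission, keyed by StatSN.
        self.retained = {sn: f"status-{sn}" for sn in stat_sns}

    def on_status_sack(self, missing_sn):
        """Handle a SACK for missing_sn under the assumed implicit-ack rule.

        Every retained StatSN below missing_sn is treated as acknowledged
        and freed. The requested status is then retransmitted if it is
        still retained; None means it has already been freed.
        """
        for sn in [s for s in self.retained if s < missing_sn]:
            del self.retained[sn]
        return self.retained.get(missing_sn)


def run_scenario():
    """Replay the scenario above: StatSN 1..10 sent, 3 and 7 lost."""
    target = Target(range(1, 11))
    first = target.on_status_sack(3)    # frees 1 and 2, retransmits 3
    second = target.on_status_sack(7)   # frees 3..6, although 3 is unconfirmed
    retry = target.on_status_sack(3)    # retransmitted 3 was lost: ask again
    return first, second, retry, sorted(target.retained)
```

If the retransmission of StatSN 3 is itself lost, the retry in run_scenario
finds nothing to resend, because the SACK for 7 has already freed it.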

2) The SNACK mechanism cannot be relied upon for resource cleanup, because
doing so requires both of the following assumptions to hold :

a) SNACK support MUST be mandatory at the target and the target can NEVER
fail a Status SNACK.
b) Initiators MUST always use a Status SNACK, which is not possible on
a ULP timeout. IOW, there exist I/O timeout and other circumstances in which
the initiator gives up and does not attempt a SACK (suppose the SACK itself
got a digest error at the target and timed out at the initiator !).

Since the current SNACK model is heavily dependent on the above
assumptions [which cannot be met], failure of SNACK blocks further
forward progress with resource cleanup at the target, since no
I/O completions beyond the hole in StatSN can be acknowledged.

In the worst case, any I/O timeout would imply session level error
recovery since the target will no longer be able to reclaim resources.

Proposal :
==========
1) Negotiate Status SACK support at login time.
2) Do not use StatSN when Status SACK is not supported.
3) Modify the current SNACK PDU to eliminate "Additional run Length"
(which is of no practical use currently) and replace with an explicit
positive ack run described by ack_begrun and ack_run_length.
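
A minimal sketch of how item 3 of the proposal might behave. The field names
ack_begrun and ack_run_length come from the proposal; their exact semantics
are an assumption here (a run covering StatSNs ack_begrun through
ack_begrun + ack_run_length - 1), the helper names are hypothetical, and
sequence number wrap-around is ignored.

```python
def ack_runs(received):
    """Group the StatSNs an initiator has received into (begin, length) runs."""
    runs = []
    for sn in sorted(received):
        if runs and runs[-1][0] + runs[-1][1] == sn:
            # sn directly follows the last run, so extend that run by one.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((sn, 1))
    return runs


def apply_ack_run(retained, ack_begrun, ack_run_length):
    """Free only the StatSNs explicitly covered by one positive ack run."""
    freed = []
    for sn in range(ack_begrun, ack_begrun + ack_run_length):
        if retained.pop(sn, None) is not None:
            freed.append(sn)
    return freed
```

With explicit runs, a target holding StatSN 1..10 and an initiator that
received everything except 3 and 7 ends up with exactly 3 and 7 still
retained, so a hole never blocks cleanup of the statuses around it.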

Comments ?

- Santosh

-----------------------------------------------------------------------------

Santosh Rao wrote:
> 
> julian_satran@il.ibm.com wrote:
> >
> > Santosh,
> >
> > Case a is what we have today.
> 
> Julian,
> 
> I may be missing something, but case (a) is NOT what we have today.
> iSCSI rev 05 describes StatSN S[N]ACK support to be mandatory.
> (See http://ips.pdl.cs.cmu.edu/mail/msg04003.html for details).
> 
> > Status numbering is not meant for ordering - it is only a helper for ack
> > (bulk ack).
> 
> Well understood.
> 
> > All the resources required for status retransmission are the control block
> > and the status,
> > If you give them up you give up all forms of recovery (as command retry
> > will not help either).
> >
> > The only recovery path remaining is a SCSI timeout and probably some form
> > of task management clear (as SCSI does not know what went wrong).
> 
> which is the standard SCSI recovery followed historically by most SCSI
> initiators and targets and, given the TCP checksum escape rate, this is
> not an issue for disk I/O. For tape I/O, this is still under debate and
> timeout based recovery may not be optimal in some scenarios for tape.
> (not yet conclusive though).
> 
> > That  is what I had in mind when I said that we can make it optional.
> 
> Does "it" refer to StatSN optional or "StatSN SNACK support" ?
> 
> >
> > However - a long time ago when we suggested making it optional for targets
> > most of the list wanted it mandatory.
> 
> Not sure what "it" is referring to.
> 
> Are we in agreement on requirements (b) & (c) ? Can "StatSN S[N]ACK"
> support be negotiated at login time and StatSN numbering be only used if
> Status SNACK is supported by the target ?
> 
> - Santosh
> 
> >
> > Julo
> >
> > Santosh Rao <santoshr@cup.hp.com> on 10/04/2001 02:06:38
> >
> > Please respond to Santosh Rao <santoshr@cup.hp.com>
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> > Julian & All,
> >
> > Do we agree on the following requirements for SNACK :
> >
> > a) iSCSI MUST NOT mandate either data or status S[N]ACK for intra-task
> > error recovery. Initiators MUST be allowed to perform command
> > granularity error recovery.
> >
> > b) iSCSI MUST provide a mechanism by which targets can continue with I/O
> > resource release upon completion of an I/O. Such a mechanism may be
> > based on an explicit StatSN acknowledgement, (if the target supports
> > StatSN SNACK), or allow immediate resource clean-up upon I/O
> > completion.
> >
> > c) Such a mechanism MUST NOT block forward progress when holes occur in
> > StatSN sequence, due to format or digest errors encountered at the
> > initiator.
> >
> > In order to meet the above requirements, "StatSN S[N]ACK" support can be
> > negotiated at login time and if StatSN SNACK is not supported by the
> > target, it MUST NOT use StatSN sequence numbering. (i.e. StatSN = 0).
> >
> > By not using StatSN numbering, the "holes in StatSN" problem does not
> > occur, thereby meeting requirements (a), (b) & (c) for targets that do
> > not retain I/O state information.
> >
> > For targets that do retain I/O state information, StatSN SNACK is turned
> > on along with StatSN numbering.
> >
> > - Santosh
--------------DB282946F24C71A602A8FDFE
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------DB282946F24C71A602A8FDFE--



From owner-ips@ece.cmu.edu  Thu Apr 12 23:04:16 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA17338
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 23:04:14 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3D1Ds506349
	for ips-outgoing; Thu, 12 Apr 2001 21:13:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3D1Ddr06335
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 21:13:39 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 8F1A9139A; Thu, 12 Apr 2001 18:13:38 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id SAA19206;
	Thu, 12 Apr 2001 18:13:32 -0700 (PDT)
Message-ID: <3AD6540D.D28D391A@cup.hp.com>
Date: Thu, 12 Apr 2001 18:19:09 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Douglas Otis <dotis@sanlight.net>
Cc: Ips <ips@ece.cmu.edu>
Subject: Re: iSCSI:flow control, acknowledgement, and a deterministic recovery
References: <NEBBJGDMMLHHCIKHGBEJOECNCGAA.dotis@sanlight.net>
Content-Type: multipart/mixed;
 boundary="------------92DD83708423953CB924E558"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------92DD83708423953CB924E558
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Douglas Otis wrote:
> 
> With multiple connections, if you are not going to use a valid CmdSN, or in
> your case a null CmdSN for all commands, then there would be a need to
> include a timestamp to meet a timely delivery requirement in the same manner
> as used in FC encapsulation.  IP can deliver over any time period.  A
> command could arrive at any time with respect to other connections.  With
> all of your feedback now from just the SCSI layer, the SCSI layer is likely
> to have timed out and restarted and now stray commands finally make an
> appearance (the technician re-inserted the cable).  What did that do?  Yes,
> if this were on a single connection, then TCP could provide some assurances,
> (ignoring digest errors) but you must not make that assumption nor can you
> assume all disruptions are symmetric.

Doug,

The below snippet from my last mail answered your above concern. The
Abort Task is sent on the same connection as the command. (Connection
allegiance applies to the Abort Task as well.) The Abort Task pushes the
stale data PDUs. There is no need for a timestamp on iSCSI PDUs.

> > As for your second concern regarding I/O timeouts, there is no need for
> > any timestamp. An I/O timeout is dealt with by an Abort Task. The abort
> > task response guarantees that the abort reached the target and pushed
> > all intermediate stale frames. Failure to complete Abort Task leads to
> > higher level error recovery (ex : Logout, or some higher form of task
> > mgmt).

- Santosh
--------------92DD83708423953CB924E558
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------92DD83708423953CB924E558--



From owner-ips@ece.cmu.edu  Thu Apr 12 23:41:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA18052
	for <ips-archive@odin.ietf.org>; Thu, 12 Apr 2001 23:41:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3D2Rqe10748
	for ips-outgoing; Thu, 12 Apr 2001 22:27:52 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3D2Rgr10707
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 22:27:43 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT3JXL3>; Thu, 12 Apr 2001 22:29:07 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A070801542A@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net, ips@ece.cmu.edu
Subject: RE: LDAP and XML (RE: iSCSI Naming: WWUIs, URNs, and namespaces)
Date: Thu, 12 Apr 2001 22:27:34 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Doug,

> In my defense, whatever the database and whatever the function, it must be
> related to the iSCSI architecture for it to belong within the IPS efforts.
> This was a premise and not a request.  

XML is not a database and does not require a database - a web server
can serve XML from scripts and files.  XML is an encoding format like
HTML, SGML, or even XDR, although both HTML and XDR have
significantly less ability to convey semantic content.  The XML
format can be used to deliver the same sort of management
information as is found in a MIB like the iSCSI MIB.  I would
think the resulting relationship to the iSCSI architecture is
clear and obvious.
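
As a minimal illustration of XML being produced by a script with no database
behind it, the sketch below encodes a few management counters as XML. The
element and counter names (iscsiNode, counter, loginFailures, digestErrors)
are invented for the example and are not taken from the iSCSI MIB or the
requirements document.

```python
import xml.etree.ElementTree as ET


def render_counters(node_name, counters):
    """Encode a dict of management counters as an XML string."""
    root = ET.Element("iscsiNode", name=node_name)
    for counter_name, value in counters.items():
        # One <counter name="..."> element per value, text holds the number.
        ET.SubElement(root, "counter", name=counter_name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def parse_counters(document):
    """Decode the XML produced by render_counters back into a dict."""
    root = ET.fromstring(document)
    return {c.get("name"): int(c.text) for c in root.findall("counter")}
```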

> Am I wrong about the over-shadowing concern of a naming server?

In this case yes, because it's not related to the place where XML appears in
the iSCSI requirements document.  The discussion of iSCSI naming is not
relevant in this context because the requirements document does not envision
or require that XML be used in iSCSI naming or name resolution.  Please
make sure you understand the paragraph beginning with "XML is not a
database" before responding to this paragraph.

As to the potentially broader issue of whether LDAP ought to be used/usable
as part of iSCSI naming and discovery, the short answer is
"Send Draft" (i.e., submit an Internet-Draft with a concrete proposal), as
past list discussion has generated no interest in LDAP beyond yourself,
and I don't believe that further list discussion is likely to change that.

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Fri Apr 13 01:34:09 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA21752
	for <ips-archive@odin.ietf.org>; Fri, 13 Apr 2001 01:34:00 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3D1Soi07246
	for ips-outgoing; Thu, 12 Apr 2001 21:28:50 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3D1Sbr07233
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 21:28:37 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id D7A201499; Thu, 12 Apr 2001 18:28:23 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id SAA20026;
	Thu, 12 Apr 2001 18:28:18 -0700 (PDT)
Message-ID: <3AD65779.AA75141D@cup.hp.com>
Date: Thu, 12 Apr 2001 18:33:45 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Cc: Black_David@emc.com
Subject: Re: iSCSI Requirements Draft - Informal WG Last Call
References: <0F31E5C394DAD311B60C00E029101A0708015416@corpmx9.isus.emc.com>
Content-Type: multipart/mixed;
 boundary="------------CBA23C4E0851EEA44E854AC9"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------CBA23C4E0851EEA44E854AC9
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

David & All,

I object to the following requirement :

" MUST support ordered delivery of SCSI commands from the initiator to
  the target, to support SCSI Task Queuing. "

Ordered delivery is not a requirement for disk based applications and
non tagged queueing tape applications, which form the majority of
today's data traffic. 

To impose strict ordering (even in the presence of errors ?) as a MUST
is penalizing the majority of today's data traffic that does not expect
ordering from the SCSI subsystem.

I am particularly concerned about the effect of the above requirement in
the presence of errors. Does iSCSI expect strict ordering to be
maintained even when individual I/O errors like ULP timeout occur ? 

On a ULP timeout (caused by, say, a hole in CmdSN), the initiator may
choose not to retry the command, but instead, error it back to the ULP.
In such a case, it can plug the hole in CmdSN with a NOP-OUT.
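
A rough sketch of the hole-plugging idea, under assumptions: the target is
modeled as holding PDUs until they are in CmdSN order, a NOP-OUT is assumed
to consume a CmdSN without delivering anything to the SCSI layer, and the
class name CmdWindow is hypothetical. Wrap-around is ignored.

```python
class CmdWindow:
    """Toy target-side queue that delivers commands to SCSI in CmdSN order."""

    def __init__(self, exp_cmd_sn=1):
        self.exp_cmd_sn = exp_cmd_sn   # next CmdSN the target expects
        self.pending = {}              # CmdSN -> opcode, held until in order
        self.delivered = []            # (CmdSN, opcode) handed to SCSI

    def receive(self, cmd_sn, opcode):
        """Accept one PDU and drain everything that is now in sequence."""
        self.pending[cmd_sn] = opcode
        while self.exp_cmd_sn in self.pending:
            op = self.pending.pop(self.exp_cmd_sn)
            if op != "NOP-OUT":
                # Real commands go to SCSI; a NOP only uses up the number.
                self.delivered.append((self.exp_cmd_sn, op))
            self.exp_cmd_sn += 1
```

In this model, commands 3 and 4 stay queued while CmdSN 2 is missing, and
are released as soon as a NOP-OUT carrying CmdSN 2 arrives.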

The above requirement cannot feasibly be met under such circumstances
and others similar to this. Mandating strict ordering on ULP timeouts
implies a session level error recovery whenever any individual I/O is
failed back from iSCSI to the SCSI ULP. This is a very heavy hammer to use
for error recovery and should not be imposed.

The above requirement must be changed to :
" SHOULD support ordered delivery of SCSI commands from the initiator to
  the target, to support SCSI Task Queuing. "

- Santosh



Black_David@emc.com wrote:
> 
> It is intended to submit draft-ietf-ips-iscsi-reqmts-02.txt
> as an Informational RFC. There is no formal requirement for
> a WG Last Call, but if you have any further substantive comments
> on the document please raise them on this list within the next
> two weeks, i.e. by April 27th at the latest.
> 
> If you have typographical/editorial comments please send them
> direct to the document's author, Marjorie Krueger
> <marjorie_krueger@hp.com>.
> 
> Thanks,
> --David and Elizabeth, IPS WG co-chairs
--------------CBA23C4E0851EEA44E854AC9
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------CBA23C4E0851EEA44E854AC9--



From owner-ips@ece.cmu.edu  Fri Apr 13 03:51:06 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA03191
	for <ips-archive@odin.ietf.org>; Fri, 13 Apr 2001 03:51:05 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3D0nok05060
	for ips-outgoing; Thu, 12 Apr 2001 20:49:50 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3D0mur05015
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 20:48:57 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP id 030BB4D4
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 17:48:56 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id RAA17763
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 17:48:51 -0700 (PDT)
Message-ID: <3AD64E44.277CD323@cup.hp.com>
Date: Thu, 12 Apr 2001 17:54:28 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: IPS Reflector <ips@ece.cmu.edu>
Subject: iSCSI : digest error handling violates EMDP/InDataOrder
Content-Type: multipart/mixed;
 boundary="------------9ACA765F01D479D96BADAB8D"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------9ACA765F01D479D96BADAB8D
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Where :
=======

Section 6.2 (pg 80). Digest Errors
-----------------------------------
"If the error is a Data-Digest-Error in a Data-PDU, the target MUST
either request retransmission with a R2T or answer with a Reject iSCSI
PDU and abort the task."

Problem :
---------
On a Data digest error detected by a target, it MUST NOT request
re-transmission of the data PDU through an R2T if the session login key
InDataOrder is set to yes. The current rev 05 draft violates the
InDataOrder/EMDP settings by allowing the target to request
re-transmission with an R2T.

Scenario :
==========
initiator			target
---------			------
EMDP=0
InDataOrder=YES
(exp_off=0)
	   offset=0,len=64k <------ R2T
	
--------> data PDUs
(exp_off = 64K)
                              data digest error results in
			      an 8K PDU being dropped at offset 24K.

	  offset=24K,len=8K  <------ R2T for missing PDU.

exp_off != offset
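
The check the initiator ends up making in this scenario can be sketched as
follows. This is an illustration under assumptions: InDataOrder=YES is
modeled as "every R2T must start at the initiator's expected offset", and
the class and method names are hypothetical.

```python
K = 1024


class InOrderTask:
    """Toy initiator-side view of one write task."""

    def __init__(self, in_data_order=True):
        self.in_data_order = in_data_order
        self.exp_off = 0   # next offset the initiator expects to be asked for

    def on_r2t(self, offset, length):
        """Return True if the R2T can be honored, False if it breaks ordering."""
        if self.in_data_order and offset != self.exp_off:
            return False
        # Data for [offset, offset + length) is sent; advance the expectation.
        self.exp_off = max(self.exp_off, offset + length)
        return True
```

Replaying the scenario: the first R2T (offset 0, length 64K) is accepted and
exp_off becomes 64K; the recovery R2T (offset 24K, length 8K) is then
rejected when in-order delivery was negotiated.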


- Santosh
--------------9ACA765F01D479D96BADAB8D
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------9ACA765F01D479D96BADAB8D--



From owner-ips@ece.cmu.edu  Fri Apr 13 04:21:03 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA03393
	for <ips-archive@odin.ietf.org>; Fri, 13 Apr 2001 04:21:02 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3D1up808880
	for ips-outgoing; Thu, 12 Apr 2001 21:56:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from maho3msx2.isus.emc.com (maho3msx2.isus.emc.com [168.159.208.81])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3D1ubr08835
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 21:56:37 -0400 (EDT)
Received: by maho3msx2.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <2H173G1K>; Thu, 12 Apr 2001 21:56:28 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015428@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: cmonia@nishansystems.com, ips@ece.cmu.edu
Subject: RE: DRAFT Minneapolis Minutes -- ACA Discussion
Date: Thu, 12 Apr 2001 21:56:26 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> - SCSI ACA discussion
> 
> > ACA (Auto Contingent Allegiance) is an optional SCSI mechanism that
> > stops execution of a sequence of dependent SCSI commands when one of
> > them fails.  The situation surrounding it is complex - T10 specifies
> > ACA in SAM2, and hence iSCSI has to specify it and endeavor to make
> > sure that ACA gets implemented sufficiently (two independent
> > interoperable implementations) to avoid dropping ACA in the transition
> > from Proposed Standard to Draft Standard.  On the list David Black
> > noted that this would make ACA implementation at least a "SHOULD"
> > rather than a "MAY".
>
> Keeping in mind the underlying iSCSI issue, I assume the question here is
> not support for the ACA function as a SCSI option but whether or not iSCSI
> will MANDATE the implementation of ACA as a condition for iSCSI
> compliance.

That is the current question - "SHOULD" and "MAY" refer to their
definitions in RFC 2119, and "MANDATE" would correspond to "MUST".
I apologize if this shorthand was confusing.

> In that context, "dropping ACA" would amount to not requiring a logical
> unit to implement the feature.

That's not correct because "dropping ACA" refers to a future transition of
iSCSI from Proposed Standard to Draft Standard status.  As part of that
transition, any feature without two independently developed interoperable
implementations is removed from the RFC as part of revising it for Draft
Standard.  The feature isn't just made optional - the text describing such
a feature is deleted.  The WG has a fair amount of control over when the
transition occurs, but the transition should be a goal, as the resulting
cleansing of unimplemented features from the specification is usually a
good thing.  In ACA's case, its removal would not be a good thing since it
would cause iSCSI to no longer be a complete SCSI transport mapping, and
hence it behooves us to word the specification now to encourage ACA to be
implemented.

As I said, this situation is complex - I hope this explanation is clearer.

--David
---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Fri Apr 13 06:05:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id GAA04198
	for <ips-archive@odin.ietf.org>; Fri, 13 Apr 2001 06:05:17 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3D7c1w28211
	for ips-outgoing; Fri, 13 Apr 2001 03:38:01 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3D7bZr28159
	for <ips@ece.cmu.edu>; Fri, 13 Apr 2001 03:37:36 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3D8jf007511;
	Fri, 13 Apr 2001 01:45:41 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>, <ips@ece.cmu.edu>
Subject: RE: LDAP and XML (RE: iSCSI Naming: WWUIs, URNs, and namespaces)
Date: Fri, 13 Apr 2001 00:35:40 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJMEENCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <0F31E5C394DAD311B60C00E029101A070801542A@corpmx9.isus.emc.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

The phrase that started this line of questioning was:
Req Doc Pg 6.
 "(5)  Development of specifications for iSCSI device management as MIBs,
XML schemas, etc."

XML Schema can be thought of as the subset of the XML-Data submission.  It
provides rules, structures, data-types often used in setting up a database.
I am aware of the ability of this application but this knowledge does not
provide intended use.  My focus is on this structured data and its function
as related to iSCSI.  I will be happy to support an effort to establish
these structures.  Although I see an advantage in reducing the number of
interacting protocols and servers, especially newly developed servers, my
interest remains in seeing the function of conveying iSCSI configuration
fulfilled.  If given no direction, you already know my inclination.  If
there is interest in other mechanisms, the only way to determine that is by
asking.  I do not wish to appear confrontational.

At the same time, I was also trying to understand the implication of the
namespace concerns from the IESG/IAB statement.  Could you provide your
understanding of how this relates to the current
architecture particularly with respect to obtaining configuration?

Doug


> Doug,
>
> > In my defense, whatever the database and whatever the function, it
> > must be related to the iSCSI architecture for it to belong within the
> > IPS efforts.  This was a premise and not a request.
>
> XML is not a database and does not require a database - a web server
> can serve XML from scripts and files.  XML is an encoding format like
> HTML, SGML, or even XDR, although both HTML and XDR have
> significantly less ability to convey semantic content.  The XML
> format can be used to deliver the same sort of management
> information as is found in a MIB like the iSCSI MIB.  I would
> think the resulting relationship to the iSCSI architecture is
> clear and obvious.
>
> > Am I wrong about the over-shadowing concern of a naming server?
>
> In this case yes, because it's not related to the place where XML
> appears in the iSCSI requirements document.  The discussion of iSCSI
> naming is not relevant in this context because the requirements document
> does not envision or require that XML be used in iSCSI naming or name
> resolution.  Please make sure you understand the paragraph beginning
> with "XML is not a database" before responding to this paragraph.
>
> As to the potentially broader issue of whether LDAP ought to be
> used/usable as part of iSCSI naming and discovery, the short answer is
> "Send Draft" (i.e., submit an Internet-Draft with a concrete proposal), as
> past list discussion has generated no interest in LDAP beyond yourself,
> and I don't believe that further list discussion is likely to change that.
>
> Thanks,
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
>



From owner-ips@ece.cmu.edu  Fri Apr 13 06:33:10 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id GAA04445
	for <ips-archive@odin.ietf.org>; Fri, 13 Apr 2001 06:33:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3D29pZ09603
	for ips-outgoing; Thu, 12 Apr 2001 22:09:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3D29Jr09581
	for <ips@ece.cmu.edu>; Thu, 12 Apr 2001 22:09:19 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <2G6LG8K4>; Thu, 12 Apr 2001 22:00:01 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015429@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: santoshr@cup.hp.com, ips@ece.cmu.edu
Cc: Black_David@emc.com
Subject: RE: iSCSI Requirements Draft - Informal WG Last Call
Date: Thu, 12 Apr 2001 22:09:12 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Santosh,

Objection noted - could you attempt to write
alternative text that would respect the original
motivation to avoid reordering of commands in
most situations without causing the problems
that concern you in error cases?  Fibre Channel
may be a useful analogy, as FC delivers in order
as long as there isn't a frame drop or CRC
failure (both of which are very rare events).

Thanks,
--David

> -----Original Message-----
> From:	Santosh Rao [SMTP:santoshr@cup.hp.com]
> Sent:	Thursday, April 12, 2001 9:34 PM
> To:	ips@ece.cmu.edu
> Cc:	Black_David@emc.com
> Subject:	Re: iSCSI Requirements Draft - Informal WG Last Call
> 
> David & All,
> 
> I object to the following requirement :
> 
> " MUST support ordered delivery of SCSI commands from the initiator to
> the 
>   target, to support SCSI Task Queuing. "
> 
> Ordered delivery is not a requirement for disk based applications and
> non tagged queueing tape applications, which form the majority of
> today's data traffic. 
> 
> To impose strict ordering (even in the presence of errors ?) as a MUST
> is penalizing the majority of today's data traffic that does not expect
> ordering from the SCSI subsystem.
> 
> I am particularly concerned about the effect of the above requirement in
> the presence of errors. Does iSCSI expect strict ordering to be
> maintained even when individual I/O errors like ULP timeout occur ? 
> 
> On a ULP timeout (caused by, say, a hole in CmdSN), the initiator may
> choose not to retry the command, but instead, error it back to the ULP.
> In such a case, it can plug the hole in CmdSN with a NOP-OUT.
> 
> The above requirement is not feasible to be met under such circumstances
> and others similar to this. Mandating strict ordering on ULP timeouts
> implies a session level error recovery on any individual I/O being
> failed back from iSCSI to SCSI ULP. This is a very heavy hammer to use
> as error recovery and should not be imposed.
> 
> The above requirement must be changed to :
> " SHOULD support ordered delivery of SCSI commands from the initiator to
> the 
>   target, to support SCSI Task Queuing. "
> 
> - Santosh
> 
> 
> 
> Black_David@emc.com wrote:
> > 
> > It is intended to submit draft-ietf-ips-iscsi-reqmts-02.txt
> > as an Informational RFC. There is no formal requirement for
> > a WG Last Call, but if you have any further substantive comments
> > on the document please raise them on this list within the next
> > two weeks, i.e. by April 27th at the latest.
> > 
> > If you have typographical/editorial comments please send them
> > direct to the document's author, Marjorie Krueger
> > <marjorie_krueger@hp.com>.
> > 
> > Thanks,
> > --David and Elizabeth, IPS WG co-chairs << File: Card for Santosh Rao >>
> 


From owner-ips@ece.cmu.edu  Fri Apr 13 16:47:15 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA18178
	for <ips-archive@odin.ietf.org>; Fri, 13 Apr 2001 16:47:13 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3DIQLj20934
	for ips-outgoing; Fri, 13 Apr 2001 14:26:21 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3DIPXr20845
	for <ips@ece.cmu.edu>; Fri, 13 Apr 2001 14:25:33 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3DJUq008393;
	Fri, 13 Apr 2001 12:31:10 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <santoshr@cup.hp.com>
Cc: "Ips" <ips@ece.cmu.edu>
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Fri, 13 Apr 2001 11:20:55 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJEEEPCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Importance: Normal
In-Reply-To: <3AD6540D.D28D391A@cup.hp.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Santosh,

I see a few problems with this approach.  Tasks as defined in iSCSI do not
maintain connection allegiance.  The driver binds all SCSI commands to their
connection for the most recent association.  Although there are several
places within the iSCSI proposal that make reference to a task having a
connection allegiance, this is in error.  Commands and not tasks carry such
allegiance.  Your recovery scheme will not allow a satisfactory recovery
with a sequential device.  In this case, repeating the command is not a
solution.  As a result, if one connection falters, it will become a difficult
situation.  In addition, iSCSI gives you no clue as to your delivery status.
You do not know if you are waiting for the target or if you are waiting for
the connection.  Some sequential devices have rather long time-outs with
these complications of deducing status created by the multiple connections.

The application will not know about these connection allegiance problems.
The iSCSI layer does not define interaction to provide additional
application status to allow these applications to respond in a manner that
may aid this situation nor should such additional information be required.
With your scheme the SCSI driver must examine the content of these commands
to make a guess as to the connection allegiance assignments.  Now the driver
is expected to understand what the intended action is of this SCSI
management command.  What signal is used to indicate a need for the iSCSI
immediate treatment?  The only obvious one seems to be the task attribute
argument.  With the way iSCSI has defined iSCSI immediate, I would expect
those commands to be treated in a LIFO rather than the normal FIFO fashion.

Doug


> Douglas Otis wrote:
> >
> > With multiple connections, if you are not going to use a valid
> > CmdSN, or in your case a null CmdSN for all commands, then there
> > would be a need to include a timestamp to meet a timely delivery
> > requirement in the same manner as used in FC encapsulation.  IP
> > can deliver over any time period.  A command could arrive at any
> > time with respect to other connections.  With all of your feedback
> > now from just the SCSI layer, the SCSI layer is likely to have timed
> > out and restarted and now stray commands finally make an appearance
> > (the technician re-inserted the cable).  What did that do?  Yes,
> > if this were on a single connection, then TCP could provide some
> > assurances, (ignoring digests errors) but you must not make that
> > assumption nor can you assume all disruptions are symmetric.
>
> Doug,
>
> The below snippet from my last mail answered your above concern. The
> Abort Task is sent on the same connection as the command. (connection
> allegiance applied to the abort task as well). The Abort task pushes the
> stale data PDUs. There is no need for a timestamp on iSCSI PDUs.
>
> > > As for your second concern regarding I/O timeouts, there is no need
> > > for any timestamp. An I/O timeout is dealt with by an Abort Task.
> > > The abort task response guarantees that the abort reached the target
> > > and pushed all intermediate stale frames. Failure to complete Abort
> > > Task leads to higher level error recovery (ex : Logout, or some
> > > higher form of task mgmt).
>
> - Santosh



From owner-ips@ece.cmu.edu  Fri Apr 13 19:43:01 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA20670
	for <ips-archive@odin.ietf.org>; Fri, 13 Apr 2001 19:43:00 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3DL8Wm02235
	for ips-outgoing; Fri, 13 Apr 2001 17:08:32 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from auemail2.firewall.lucent.com (auemail2.lucent.com [192.11.223.163])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3DL7or02190
	for <ips@ece.cmu.edu>; Fri, 13 Apr 2001 17:07:50 -0400 (EDT)
Received: from auemail2.firewall.lucent.com (localhost [127.0.0.1])
	by auemail2.firewall.lucent.com (Pro-8.9.3/8.9.3) with ESMTP id RAA29027
	for <ips@ece.cmu.edu>; Fri, 13 Apr 2001 17:07:49 -0400 (EDT)
Received: from nc8220exchange.ral.lucent.com (h135-92-100-21.lucent.com [135.92.100.21])
	by auemail2.firewall.lucent.com (Pro-8.9.3/8.9.3) with ESMTP id RAA29023
	for <ips@ece.cmu.edu>; Fri, 13 Apr 2001 17:07:48 -0400 (EDT)
content-class: urn:content-classes:message
Subject: IPS interim meeting agenda
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Date: Fri, 13 Apr 2001 17:07:48 -0400
Disposition-Notification-To: "Elizabeth Rodriguez" <Elizabeth.Rodriguez@nc8220exchange.ral.lucent.com>
Message-ID: <D55EFF49CC829E468BE958686EDE9FDE04C9FD@nc8220exchange.ral.lucent.com>
X-MimeOLE: Produced By Microsoft Exchange V6.0.4417.0
Thread-Topic: IPS interim agenda
Thread-Index: AcDETzBpR0hkG8UuRLCRCHa7q2sNMAADMW4A
From: "Elizabeth Rodriguez" <egrodriguez@lucent.com>
To: "IPS Mailing List (E-mail)" <ips@ece.cmu.edu>
Cc: <t10@t10.org>, <fc@network.com>
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by ece.cmu.edu id f3DL8Os02223
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 8bit


	As announced previously, the IETF IPS WG  will be holding an
interim meeting, in Nashua, NH on April 30 and May 1.
	This meeting will be co-located with the May T10 meeting.
	Any interested parties are welcome to attend.

	An initial draft of the agenda for that meeting is as follows:
>   
> 
> IP Storage WG (ips) Interim meeting
> 
> MONDAY, April 30 at 0800-1800 (8a - 6p) 
> TUESDAY, May 1 at 0800-1800 (8a - 6p) 
> ================================
> 
> CHAIRS:
>         David Black <black_david@emc.com>
>         Elizabeth Rodriguez <egrodriguez@lucent.com>
> 
> AGENDA: (subject to change)
> 
> April 30, 0800-1800
> ===================
> 
> - Agenda Bashing and Administrivia (15 min)
> 
> - Fibre Channel Common Encapsulation Draft (2 hrs)
>         This draft will be submitted next week.
> 
> - Naming and Discovery for FCIP and iFCP (30 minutes)
>  
>         iSNS and SLP proposals, as they apply to iFCP and FCIP
> 
> - FC MIB(s) for FCIP and iFCP (1 hr)
> 
> - Lunch (75 min)
> 
> - FCIP (3 hrs)
>         Common Encapsulation Header/Format usage (30 min)
>         Model (30 min)
>         TCP/IP interaction (30 min)
>         Data Integrity (30 min)
>         Multipath connectivity (30 min)
>         Security (15 min)
>         QoS (15 min)
> 
> - iFCP (1hr  40 min)
>         iFCP Implementation of the common encapsulation format (20 min)
>         iFCP Addressing modes (address-transparent vs.
>         address-translation) (20 min)
>         Augmented ELS handling in address-translation mode (20 min)
>         Fabric service profiles (20 min)
>         Introduction to the new draft-ietf-ips-ifcp-01.txt. What's
>         new (20 min)
>         Other (20 min)
>  
> 
> May 1, 0800-1800
> ===================
> 
> - iSCSI naming and discovery (3 hrs)
> 
> - iSCSI security  (2 hrs)
> 
> - Lunch (1 hr )
> 
> - iSCSI Error Recovery (2 hrs)
> 
> - Other (2 hrs )
> 
>  
> 
> 


From owner-ips@ece.cmu.edu  Fri Apr 13 23:26:52 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA24257
	for <ips-archive@odin.ietf.org>; Fri, 13 Apr 2001 23:26:51 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3DMiTp09071
	for ips-outgoing; Fri, 13 Apr 2001 18:44:29 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway1.readyhosting.com ([63.119.175.29])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3DMeCr08795
	for <ips@ece.cmu.edu>; Fri, 13 Apr 2001 18:40:12 -0400 (EDT)
Received: from mailserver16 [63.119.175.16] by gateway1.readyhosting.com with ESMTP
  (SMTPD32-6.06) id AECE289E00F4; Fri, 13 Apr 2001 17:33:50 -0500
Received: from eddylaptop [216.135.246.226] by mailserver16 with ESMTP
  (SMTPD32-6.06) id AEA92330120; Fri, 13 Apr 2001 17:33:13 -0500
Reply-To: <Eddy@Quicksall.com>
From: "Eddy Quicksall" <Eddy@Quicksall.com>
To: "Ips@Ece. Cmu. Edu \(E-mail\)" <ips@ece.cmu.edu>
Subject: aborting an out of sequence cmdSN
Date: Fri, 13 Apr 2001 18:39:54 -0400
Message-ID: <000101c0c46a$a9dd6450$ad01a8c0@eddylaptop>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.3018.1300
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit


If a command is received out of sequence, it is not passed to the SCSI
Target Layer. So, it must be queued waiting for the prior command. Now, if
the prior command never comes the operating system may attempt to abort the
lost command by sending an abort to the driver.

The abort can't be turned into an ABORT TASK TMF because the target iSCSI
has not acknowledged that it has received the original command yet. And,
even if the ABORT TASK was sent, it can't get to the SCSI Target Layer
because its cmdSN will be higher than the most recent command.

The iSCSI initiator driver could just jerk it out of its queue (knowing that
it hasn't been acknowledged yet) but then the iSCSI target may still execute
it when the missing PDU's arrive.

Does iSCSI have a way to abort the commands that are in the target but not
yet in the SCSI Target Layer?
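
The situation described above can be sketched as a small target-side holding queue. This is only an illustration under stated assumptions: commands are delivered to the SCSI Target Layer strictly in CmdSN order, the ABORT TASK carries its own CmdSN like any other command, and serial-number wraparound is ignored. The class and method names are hypothetical and are not taken from the draft.

```python
class CmdSNQueue:
    """Hypothetical target-side holding queue keyed by CmdSN."""

    def __init__(self, exp_cmd_sn):
        # Next CmdSN that may be passed to the SCSI Target Layer.
        self.exp_cmd_sn = exp_cmd_sn
        # Commands received ahead of a gap, waiting for it to fill.
        self.held = {}

    def receive(self, cmd_sn, command):
        """Accept a command and return the commands that are now
        deliverable, in CmdSN order."""
        self.held[cmd_sn] = command
        delivered = []
        while self.exp_cmd_sn in self.held:
            delivered.append(self.held.pop(self.exp_cmd_sn))
            self.exp_cmd_sn += 1
        return delivered


q = CmdSNQueue(exp_cmd_sn=5)
stuck = q.receive(6, "WRITE")                  # CmdSN 5 is missing, 6 is held
also_stuck = q.receive(7, "ABORT TASK for 6")  # the abort waits behind the gap
released = q.receive(5, "READ")                # gap fills, queue drains in order
```

In this sketch the abort cannot overtake the command it is meant to cancel: both stay queued until CmdSN 5 arrives, and the WRITE is then delivered before the ABORT TASK.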

Eddy





From owner-ips@ece.cmu.edu  Sat Apr 14 19:44:36 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA15311
	for <ips-archive@odin.ietf.org>; Sat, 14 Apr 2001 19:44:35 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3EGp4v19258
	for ips-outgoing; Sat, 14 Apr 2001 12:51:04 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3EGo4r19171
	for <ips@ece.cmu.edu>; Sat, 14 Apr 2001 12:50:04 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id SAA157068
	for <ips@ece.cmu.edu>; Sat, 14 Apr 2001 18:49:54 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id SAA182390
	for <ips@ece.cmu.edu>; Sat, 14 Apr 2001 18:49:53 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A2E.005C7349 ; Sat, 14 Apr 2001 18:49:48 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A2E.005C7285.00@d12mta02.de.ibm.com>
Date: Sat, 14 Apr 2001 19:54:55 +0300
Subject: draft version 05-91 available at my site
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Dear colleagues,

To save some time I am going to put several "preview" versions of the draft
on my site
(http://www.haifa.il.ibm.com/satran/ips).

I've just placed 05-91

Recovery is still "work in progress".

Main changes are in:

Formats
Immediate Delivery (flag instead of CmdSN=0)
Security (additions)

many editorials and clarifications.

Also my "todo" folder on iSCSI (including things that I have agreed to
change) is empty, and the only item I have under "toConsider" is the
StatusSN and associated SNACK (I could not make up my mind and the few
arguments I've heard against the current scheme are weak).
If you think I've forgotten something please send a note.

I will let you know when and how I post changes but I will spend little
time to answer non-urgent mail
during the next several days.


Regards,
Julo




From owner-ips@ece.cmu.edu  Sun Apr 15 12:54:10 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA10643
	for <ips-archive@odin.ietf.org>; Sun, 15 Apr 2001 12:54:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3FErnt10737
	for ips-outgoing; Sun, 15 Apr 2001 10:53:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3FEr4r10713
	for <ips@ece.cmu.edu>; Sun, 15 Apr 2001 10:53:04 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id QAA04556
	for <ips@ece.cmu.edu>; Sun, 15 Apr 2001 16:52:46 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id QAA157838
	for <ips@ece.cmu.edu>; Sun, 15 Apr 2001 16:52:45 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A2F.0051BA52 ; Sun, 15 Apr 2001 16:52:41 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A2F.0051B59F.00@d12mta02.de.ibm.com>
Date: Sun, 15 Apr 2001 17:48:12 +0300
Subject: Re: Tape drives and iSCSI
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Dave,

Thanks a lot for your insights.  We will do our best to have recovery
flexible enough to offer at least the same level as FCP-2.  The added
benefits of connection failover and dynamic bandwidth allocation
should keep iSCSI competitive in both LAN and WAN.

Regards,
Julo

"Dave Peterson" <dap@cisco.com> on 12/04/2001 02:06:31

Please respond to "Dave Peterson" <dap@cisco.com>

To:   "Ips@Ece. Cmu. Edu" <ips@ece.cmu.edu>
cc:
Subject:  Tape drives and iSCSI





Don't believe a tape backup application model will add much value to the
discussion.
Here's my view:

Tape Requirements
-----------------
1. Data integrity
2. Data integrity
3. Data integrity
4. Perform successful uninterrupted backup (within backup window is a
   big plus)

Tape Specifics
--------------
Large record sizes are recommended for performance. The largest (typical)
record size is 256KB.
Using a default DataPDULength = 8192 would require 32 data PDU's.
A typical higher-end head-to-tape transfer rate is 10 MB/sec, resulting in a
backup rate of ~36 GB/hour.
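
The arithmetic behind these figures, as a short Python check. It assumes the record size and DataPDULength are counted in binary units (1 KB = 1024 bytes) and the tape rate in decimal units (1 GB = 1000 MB); the variable names are illustrative only.

```python
RECORD_SIZE = 256 * 1024     # largest typical tape record, in bytes
DATA_PDU_LENGTH = 8192       # default DataPDULength quoted above, in bytes
TAPE_RATE_MB_PER_SEC = 10    # head-to-tape rate, decimal MB per second

# Number of data PDUs needed to carry one full record.
pdus_per_record = RECORD_SIZE // DATA_PDU_LENGTH

# Sustained backup rate: MB/sec * 3600 sec/hour, converted to GB.
backup_gb_per_hour = TAPE_RATE_MB_PER_SEC * 3600 / 1000

print(pdus_per_record)     # 32
print(backup_gb_per_hour)  # 36.0
```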

Ramblings & Tidbits
-------------------
FCP-2 error detection and recovery is implemented by at least one high-end
tape drive vendor (and they have a significant advantage over other
vendors, i.e., their backup will not fail due to a FC link-level error).

SCSI level timeout and retry is highly dependent on the backup application.
One must first determine the state/position of the tape drive before
proceeding. Tools such as READ POSITION and LOCATE are in place for the
backup app to attempt recovery from an error. The problem has been that few
backup vendors have implemented them (but things are getting better). Thus
FC-TAPE was born and the error detection and recovery mechanism was rolled
into FCP-2 as a standard. FCP-2 provides tools to determine the
state/position of the tape drive below the SCSI level allowing a "best
effort" attempt to complete the exchange/command and not return an error to
the application. At this point in time this is what is important, i.e., DO
NOT RETURN AN ERROR TO THE APPLICATION IF AT ALL POSSIBLE.

Refer to Matt Wakeley's I/O (command) recovery write-up for a description
of iSCSI error recovery that should be used as a starting point. Maybe this
has already been done, I'm not involved in the error recovery group. What
we need is the ability to detect an error and recover below the SCSI level
(i.e., the iSCSI transport level). Any further granularity is not needed,
especially due to the low error rates that will be encountered.

Regarding tape devices and maintaining state, FC-TAPE enabled drives have
the following requirements:
For non-tagged command queuing operations, the target shall retain the
Exchange information until
a) the next FCP_CMND IU has been received for that LUN from the same
initiator;
b) an FCP_CONF IU is received for the Exchange; or
c) after RR_TOV times out.
For tagged command queuing operations, the target shall retain Exchange
information until
a) an FCP_CONF IU is received for the Exchange; or
b) after RR_TOV times out.
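
The retention rules listed above can be restated as a single predicate. This is a sketch only: the field and function names are hypothetical, and it encodes just the conditions quoted in this message, not the full FC-TAPE or FCP-2 text.

```python
from dataclasses import dataclass


@dataclass
class ExchangeState:
    tagged: bool                      # True for tagged command queuing
    conf_received: bool = False       # FCP_CONF IU received for the Exchange
    next_cmnd_received: bool = False  # next FCP_CMND IU for that LUN from
                                      # the same initiator has arrived
    rr_tov_expired: bool = False      # RR_TOV has timed out


def may_release(state: ExchangeState) -> bool:
    """Return True once the target no longer has to retain the
    Exchange information under the rules quoted above."""
    # FCP_CONF or RR_TOV expiry ends retention in both queuing modes.
    if state.conf_received or state.rr_tov_expired:
        return True
    # Only non-tagged operation may also release on the next FCP_CMND.
    return (not state.tagged) and state.next_cmnd_received
```

For example, `may_release(ExchangeState(tagged=True, next_cmnd_received=True))` is False, because a following command does not end retention in the tagged case.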

There is a work in progress for a new tape model in the T10 SSC-2 working
group. This new model will allow for a simpler error detection and recovery
procedure and a robust command queuing implementation.

Finally, I strongly agree with the sentiment of getting the first version
"out the door".
The issues surrounding CRC's and error recovery need to be put to rest
asap.

David Peterson
Lead Architect - Standards Development
Cisco Systems - SRBU
6450 Wedgwood Road
Maple Grove, MN 55311
Office: 763-398-1007
Cell: 612-802-3299
Email: dap@cisco.com






From owner-ips@ece.cmu.edu  Sun Apr 15 12:54:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA10645
	for <ips-archive@odin.ietf.org>; Sun, 15 Apr 2001 12:54:10 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3FEsml10767
	for ips-outgoing; Sun, 15 Apr 2001 10:54:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3FErmr10736
	for <ips@ece.cmu.edu>; Sun, 15 Apr 2001 10:53:48 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id QAA231164
	for <ips@ece.cmu.edu>; Sun, 15 Apr 2001 16:53:40 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id QAA173492
	for <ips@ece.cmu.edu>; Sun, 15 Apr 2001 16:53:40 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A2F.0051CEFE ; Sun, 15 Apr 2001 16:53:34 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A2F.0051C849.00@d12mta02.de.ibm.com>
Date: Sun, 15 Apr 2001 17:58:15 +0300
Subject: CRCs
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Dear colleagues,

We will probably not be able to finish the CRC/checksum document in time
for Nashua but we hope it will be out very soon after that.   However I
would like to inform you that while in Orlando and Minneapolis we were
still talking about different CRCs, we (Dafna Sheinwald, Pat Thaler, Matt
Wakeley, Vince Cavanna and myself) have agreed on a CRC and the forthcoming
ID will give all the reasons why we recommend it.

Regards,
Julo




From owner-ips@ece.cmu.edu  Mon Apr 16 14:36:00 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA10946
	for <ips-archive@odin.ietf.org>; Mon, 16 Apr 2001 14:35:54 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3GGKoN19408
	for ips-outgoing; Mon, 16 Apr 2001 12:20:50 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3GGJvr19252
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 12:19:58 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id SAA14058
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 18:19:46 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id SAA145908
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 18:19:45 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A30.0059AFB1 ; Mon, 16 Apr 2001 18:19:36 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A30.0059AD02.00@d12mta02.de.ibm.com>
Date: Mon, 16 Apr 2001 18:31:35 +0300
Subject: Re: iSCSI Requirements Draft - Informal WG Last Call
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk





Ordered delivery of commands to ANY TYPE of devices will increase in
importance as network speeds increase and the need to hide latency
increases.

Today databases don't use queuing; they rely on trickling the commands to
devices one by one to ensure atomicity and order.
As latency will become the determining factor in performance this is bound
to change.

SCSI has done an excellent job in defining the queueing mechanism. We have
to make it work with good performance in our environment.



Julo

Santosh Rao <santoshr@cup.hp.com> on 13/04/2001 04:33:45

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   ips@ece.cmu.edu
cc:   Black_David@emc.com
Subject:  Re: iSCSI Requirements Draft - Informal WG Last Call




David & All,

I object to the following requirement :

" MUST support ordered delivery of SCSI commands from the initiator to
the
  target, to support SCSI Task Queuing. "

Ordered delivery is not a requirement for disk based applications and
non tagged queueing tape applications, which form the majority of
today's data traffic.

To impose strict ordering (even in the presence of errors ?) as a MUST
is penalizing the majority of today's data traffic that does not expect
ordering from the SCSI subsystem.

I am particularly concerned about the effect of the above requirement in
the presence of errors. Does iSCSI expect strict ordering to be
maintained even when individual I/O errors like ULP timeout occur ?

On a ULP timeout (caused by, say, a hole in CmdSN), the initiator may
choose not to retry the command, but instead, error it back to the ULP.
In such a case, it can plug the hole in CmdSN with a NOP-OUT.

The above requirement is not feasible to be met under such circumstances
and others similar to this. Mandating strict ordering on ULP timeouts
implies a session level error recovery on any individual I/O being
failed back from iSCSI to SCSI ULP. This is a very heavy hammer to use
as error recovery and should not be imposed.

The above requirement must be changed to :
" SHOULD support ordered delivery of SCSI commands from the initiator to
the
  target, to support SCSI Task Queuing. "

- Santosh



Black_David@emc.com wrote:
>
> It is intended to submit draft-ietf-ips-iscsi-reqmts-02.txt
> as an Informational RFC. There is no formal requirement for
> a WG Last Call, but if you have any further substantive comments
> on the document please raise them on this list within the next
> two weeks, i.e. by April 27th at the latest.
>
> If you have typographical/editorial comments please send them
> direct to the document's author, Marjorie Krueger
> <marjorie_krueger@hp.com>.
>
> Thanks,
> --David and Elizabeth, IPS WG co-chairs
 - santoshr.vcf





From owner-ips@ece.cmu.edu  Mon Apr 16 14:36:00 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA10948
	for <ips-archive@odin.ietf.org>; Mon, 16 Apr 2001 14:35:59 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3GGKrd19412
	for ips-outgoing; Mon, 16 Apr 2001 12:20:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3GGJur19249
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 12:19:56 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id SAA211546
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 18:19:44 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta05_cs0 [9.165.222.239])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id SAA140740
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 18:19:43 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A30.0059AEF2 ; Mon, 16 Apr 2001 18:19:34 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Santosh Rao <santoshr@cup.hp.com>
cc: IPS Reflector <ips@ece.cmu.edu>
Message-ID: <C1256A30.0059AD0E.00@d12mta05.de.ibm.com>
Date: Mon, 16 Apr 2001 18:01:01 +0300
Subject: Re: iSCSI : More problems with Status SNACK !
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

Comments in text.

Thanks,
Julo

Santosh Rao <santoshr@cup.hp.com> on 13/04/2001 01:05:27

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   IPS Reflector <ips@ece.cmu.edu>
cc:
Subject:  iSCSI : More problems with Status SNACK !




Julian & All,

Here are 2 more problems with the current model of Status SNACK. This
current model, IMO, is un-usable due to the reasons stated below, in
addition to reasons stated in earlier mails on this issue. iSCSI MUST
use a different Status SNACK model. See proposal below in that regard.

1)Section 2.16. S[N]ACK Request
-------------------------------
The rev 05 spec states that SACK also implicitly acknowledges data or
status PDUs.

Now, consider the following scenario :

ExpStatSN = 0
StatSNs 1 - 10 are sent from target to initiator.

-------> StatSN 1
-------> StatSN 2
-------> StatSN 4
-------> StatSN 5
-------> StatSn 6
-------> StatSN 8
-------> StatSN 9
-------> StatSN 10

If StatSn 3 & 7 are missing, the initiator would then issue status SACK
for 3 & 7. As per the above rule, SACK for 3 implicitly acknowledges 1 &
2. (ExpStatSN advances to 3). However, SACK for 7 will implicitly
acknowledge 3 - 7, whereas 3 is still a hole !

The current SNACK model MUST NOT be used as an implicit acknowledgement
since it can cause spurious free up of status resources at the target,
prior to initiator having gotten the status.

+++ Bad wording. Sorry - It reads now:

   SNACK request is used to request retransmission of status or data PDUs
   from the target.  The SNACK request indicates to the target the missed
   status or data runs, where a run is composed of an initial missed StatSN
   or DataSN and the number of additional missed Status or Data PDUs (0
   means only the initial). If a SNACK includes more than one run those
   have to be in increasing order and non-overlapping; the SNACK implicitly
   acknowledges data or status PDUs indicated by the intervals between
   runs.

   +++++
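
The run-based wording above can be illustrated with a small sketch. This is
only one reading of the reworded paragraph, not text from the draft: the
function names are hypothetical, and it assumes that only the intervals
strictly between consecutive runs count as implicitly acknowledged (the
paragraph does not say how PDUs before the first run are treated).

```python
def missing_runs(received, highest):
    """Return (first_missing_sn, additional_missing) runs for StatSNs 1..highest."""
    seen = set(received)
    runs = []
    sn = 1
    while sn <= highest:
        if sn in seen:
            sn += 1
            continue
        start = sn
        # Extend the run while consecutive StatSNs are still missing.
        while sn <= highest and sn not in seen:
            sn += 1
        runs.append((start, sn - start - 1))  # 0 means only the initial one
    return runs


def acked_between_runs(runs):
    """StatSNs in the intervals strictly between consecutive runs."""
    acked = []
    for (start, extra), (next_start, _) in zip(runs, runs[1:]):
        acked.extend(range(start + extra + 1, next_start))
    return acked


# Scenario from the mail: StatSN 3 and 7 never arrive.
runs = missing_runs([1, 2, 4, 5, 6, 8, 9, 10], 10)
print(runs)                      # [(3, 0), (7, 0)]
print(acked_between_runs(runs))  # [4, 5, 6]
```

With this reading, StatSN 3 is never treated as acknowledged, because it is
itself named in a run rather than lying between runs.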


2) SNACK mechanism cannot be relied upon for resource cleanup for the
following reasons :

a) SNACK support MUST be mandatory at the target and target can NEVER
fail a Status SNACK.
b) Initiators MUST always use a Status SNACK and this is not possible on
a ULP timeout. IOW, there exist I/O timeouts and other circumstances when
the initiator gives up and does not attempt SACK (suppose SACK itself
got a digest error at the target and timed out at the initiator !).

Since the current SNACK model is heavily dependent on the above
assumptions [which cannot be met], failure of SNACK blocks further
forward progress with resource cleanup at the target since all further
I/O completions beyond the hole StatSN cannot be acknowledged.

In the worst case, any I/O timeout would imply session level error
recovery since the target will no longer be able to reclaim resources.

+++ any UL timeout must include an abort for the task to clean up the
target++

Proposal :
==========
1) Negotiate Status SACK support at login time.
2) Do not use StatSN when Status SACK is not supported.
3) Modify the current SNACK PDU to eliminate "Additional run Length"
(which is of no practical use currently) and replace with an explicit
positive ack run described by ack_begrun and ack_run_length.

Comments ?
+++ I am basically against options, if I can avoid them.
I don't see how an optional SNACK and StatSN would simplify a
target/initiator while still allowing command recovery without popping
errors into SCSI +++
++++
- Santosh

-----------------------------------------------------------------------------


Santosh Rao wrote:
>
> julian_satran@il.ibm.com wrote:
> >
> > Santosh,
> >
> > Case a is what we have today.
>
> Julian,
>
> I may be missing something, but case (a) is NOT what we have today.
> iSCSI rev 05 describes StatSN S[N]ACK support to be mandatory.
> (See http://ips.pdl.cs.cmu.edu/mail/msg04003.html for details).
>
> > Status numbering is not meant for ordering - it is only a helper for
> > ack (bulk ack).
>
> Well understood.
>
> > All the resources required for status retransmission are the control
> > block and the status,
> > If you give them up you give up all forms of recovery (as command retry
> > will not help either).
> >
> > The only recovery path remaining is a SCSI timeout and probably some
> > form of task management clear (as SCSI does not know what went wrong).
>
> which is the standard SCSI recovery followed historically by most SCSI
> initiator and targets and given the TCP checksum escape rate, this is
> not an issue for disk I/O. For tape I/O, this is still under debate and
> timeout based recovery may not be optimal in some scenarios for tape.
> (not yet conclusive though).
>
> > That  is what I had in mind when I said that we can make it optional.
>
> Does "it" refer to StatSN optional or "StatSN SNACK support" ?
>
> >
> > However - a long time ago when we suggested making it optional for
> > targets most of the list wanted it mandatory.
>
> Not sure what "it" is referring to.
>
> Are we in agreement on requirements (b) & (c) ? Can "StatSN S[N]ACK"
> support be negotiated at login time and StatSN numbering be only used if
> Status SNACK is supported by the target ?
>
> - Santosh
>
> >
> > Julo
> >
> > Santosh Rao <santoshr@cup.hp.com> on 10/04/2001 02:06:38
> >
> > Please respond to Santosh Rao <santoshr@cup.hp.com>
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
> >
> > Julian & All,
> >
> > Do we agree on the following requirements for SNACK :
> >
> > a) iSCSI MUST NOT mandate either data or status S[N]ACK for intra-task
> > error recovery. Initiators MUST be allowed to perform command
> > granularity error recovery.
> >
> > b) iSCSI MUST provide a mechanism by which targets can continue with
> > I/O resource release upon completion of an I/O. Such a mechanism may be
> > based on an explicit StatSN acknowledgement, (if the target supports
> > StatSN SNACK), or allow immediate resource clean-up upon I/O
> > completion.
> >
> > c) Such a mechanism MUST NOT block forward progress when holes occur in
> > StatSN sequence, due to format or digest errors encountered at the
> > initiator.
> >
> > In order to meet the above requirements, "StatSN S[N]ACK" support can
> > be negotiated at login time and if StatSN SNACK is not supported by the
> > target, it MUST NOT use StatSN sequence numbering. (i.e. StatSN = 0).
> >
> > By not using StatSN numbering, the "holes in StatSN" problem does not
> > occur, thereby, meeting requirements (a) ,(b) & (c) for targets that do
> > not retain I/O state information.
> >
> > For targets that do retain I/O state information, StatSN SNACK is
> > turned on along with StatSN numbering.
> >
> > - Santosh
 - santoshr.vcf





From owner-ips@ece.cmu.edu  Mon Apr 16 14:36:16 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA10975
	for <ips-archive@odin.ietf.org>; Mon, 16 Apr 2001 14:36:15 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3GGKjk19365
	for ips-outgoing; Mon, 16 Apr 2001 12:20:45 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3GGK1r19256
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 12:20:01 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id SAA347184
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 18:19:44 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta05_cs0 [9.165.222.239])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id SAA140738
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 18:19:43 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A30.0059AF48 ; Mon, 16 Apr 2001 18:19:35 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Santosh Rao <santoshr@cup.hp.com>
cc: IPS Reflector <ips@ece.cmu.edu>
Message-ID: <C1256A30.0059AD0F.00@d12mta05.de.ibm.com>
Date: Mon, 16 Apr 2001 18:23:05 +0300
Subject: Re: iSCSI : digest error handling violates EMDP/InDataOrder
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

The bit and the interpretation are protocol specific.

FCP uses it like iSCSI - i.e. the order has to be maintained within a
sequence (an R2T-derived output or the entire input).
In that sense we are not violating the EMDP.

And BTW the recovery procedure in FCP is similar, although a bit more
complicated than ours, and also involves a link level sequence.

Julo



Santosh Rao <santoshr@cup.hp.com> on 13/04/2001 03:54:28

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   IPS Reflector <ips@ece.cmu.edu>
cc:
Subject:  iSCSI : digest error handling violates EMDP/InDataOrder




Where :
=======

Section 6.2 (pg 80). Digest Errors
-----------------------------------
"If the error is a Data-Digest-Error in a Data-PDU, the target MUST
either request retransmission with a R2T or answer with a Reject iSCSI
PDU and abort the task."

Problem :
---------
On a Data digest error detected by a target, it MUST NOT request
re-transmission of the data PDU thru an R2T if the session login key
InDataOrder is set to yes. The current rev 05 draft violates
InDataOrder/EMDP settings by allowing a re-transmission of R2T by
target.

Scenario :
==========
initiator           target
---------           ------
EMDP=0
InDataOrder=YES
(exp_off=0)
        offset=0,len=64k <------ R2T

--------> data PDUs
(exp_off = 64K)
                              data digest error results in
                     an 8K PDU being dropped at offset 24K.

       offset=24K,len=8K  <------ R2T for missing PDU.

exp_off != offset
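
The mismatch in the scenario above can be sketched as a strict in-order
check on the initiator side. This is only an illustration of the reading
argued in this mail: the class and method names are hypothetical, and a
protocol that only requires ordering within a single R2T-derived sequence
would not apply the check across R2Ts in this way.

```python
KIB = 1024


class InOrderTracker:
    """Tracks the next expected data offset under a strict in-order reading."""

    def __init__(self):
        self.expected_offset = 0

    def accept_r2t(self, offset, length):
        """Return True if the R2T continues exactly at the expected offset."""
        if offset != self.expected_offset:
            return False  # out-of-order request, e.g. a retransmission R2T
        self.expected_offset = offset + length
        return True


tracker = InOrderTracker()
print(tracker.accept_r2t(0, 64 * KIB))        # True, exp_off becomes 64K
print(tracker.accept_r2t(24 * KIB, 8 * KIB))  # False, exp_off != offset
```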


- Santosh




From owner-ips@ece.cmu.edu  Mon Apr 16 16:14:39 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA13120
	for <ips-archive@odin.ietf.org>; Mon, 16 Apr 2001 16:14:37 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3GI7Yn26845
	for ips-outgoing; Mon, 16 Apr 2001 14:07:34 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3GI7Mr26827
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 14:07:22 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id 577AF5E2; Mon, 16 Apr 2001 11:07:21 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id LAA00153;
	Mon, 16 Apr 2001 11:07:16 -0700 (PDT)
Message-ID: <3ADB3630.DF674238@cup.hp.com>
Date: Mon, 16 Apr 2001 11:13:04 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Douglas Otis <dotis@sanlight.net>
Cc: Ips <ips@ece.cmu.edu>
Subject: Re: iSCSI:flow control, acknowledgement, and a deterministic recovery
References: <NEBBJGDMMLHHCIKHGBEJEEEPCGAA.dotis@sanlight.net>
Content-Type: multipart/mixed;
 boundary="------------6493EA040AED328C5D48DD66"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------6493EA040AED328C5D48DD66
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Doug,

You seem to be referring to linked commands as a case wherein the
approach of Abort Task will not flush stale PDUs.

Linked Commands cannot work the way SCSI implementations are defined
today, since linked commands require the initiator task tag (I_T_L_x
nexus identifier in SAM-2 Execute Command terminology) to be generated
by the SCSI ULP. However, in practice, the Initiator Task Tag (or the FC
OX_ID) is typically generated in the SCSI LLP (or in some cases in the
adapter firmware). IOW, there is no common reference handle like the
task tag sent down from the ULP that allows for association of multiple
commands to a task in several/most implementations today.

When this is fixed up to get linked commands to work [& there exist
examples of its usage], there is no reason connection allegiance could
not be applied to all the commands within the task.

I fail to see why you think Abort Task will not work with sequential
devices (?).

- Santosh



Douglas Otis wrote:
> 
> Santosh,
> 
> I see a few problems with this approach.  Tasks as defined in iSCSI do not
> maintain connection allegiance.  The driver binds all SCSI commands to their
> connection for the most recent association.  Although there are several
> places within the iSCSI proposal that make reference to a task having a
> connection allegiance, this is in error.  Commands and not tasks carry such
> allegiance.  Your recovery scheme will not allow a satisfactory recovery
> with a sequential device.  In this case, repeating the command is not a
> solution.  As a result, if one connection falters, it becomes a difficult
> situation.  In addition, iSCSI gives you no clue about your delivery status.
> You do not know if you are waiting for the target or if you are waiting for
> the connection.  Some sequential devices have rather long time-outs with
> these complications of deducing status created by the multiple connections.
> 
> The application will not know about these connection allegiance problems.
> The iSCSI layer does not define interaction to provide additional
> application status to allow these applications to respond in a manner that
> may aid this situation nor should such additional information be required.
> With your scheme the SCSI driver must examine the content of these commands
> to make a guess as to the connection allegiance assignments.  Now the driver
> is expected to understand what the intended action is of this SCSI
> management command.  What signal is used to indicate a need for the iSCSI
> immediate treatment?  The only obvious seems to be the task attribute
> argument.  With the way iSCSI has defined iSCSI immediate, I would expect
> those commands to be treated in a LIFO rather than the normal FIFO fashion.
> 
> Doug
> 
> > Douglas Otis wrote:
> > >
> > > With multiple connections, if you are not going to use a valid
> > > CmdSN, or in your case a null CmdSN for all commands, then there
> > > would be a need to include a timestamp to meet a timely delivery
> > > requirement in the same manner as used in FC encapsulation.  IP
> > > can deliver over any time period.  A command could arrive at any
> > > time with respect to other connections.  With all of your feedback
> > > now from just the SCSI layer, the SCSI layer is likely to have timed
> > > out and restarted and now stray commands finally make an appearance
> > > (the technician re-inserted the cable).  What did that do?  Yes,
> > > if this were on a single connection, then TCP could provide some
> > > assurances, (ignoring digests errors) but you must not make that
> > > assumption nor can you assume all disruptions are symmetric.
> >
> > Doug,
> >
> > The below snippet from my last mail answered your above concern. The
> > Abort Task is sent on the same connection as the command. (connection
> > allegiance applied to the abort task as well). The Abort task pushes the
> > stale data PDUs. There is no need for a timestamp on iSCSI PDUs.
> >
> > > > As for your second concern regarding I/O timeouts, there is no need
> > > > for any timestamp. An I/O timeout is dealt with by an Abort Task. The
> > > > abort task response guarantees that the abort reached the target and pushed
> > > > all intermediate stale frames. Failure to complete Abort Task leads to
> > > > higher level error recovery (ex : Logout, or some higher form of task
> > > > mgmt).
> >
> > - Santosh
--------------6493EA040AED328C5D48DD66
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------6493EA040AED328C5D48DD66--



From owner-ips@ece.cmu.edu  Mon Apr 16 18:15:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA14653
	for <ips-archive@odin.ietf.org>; Mon, 16 Apr 2001 18:15:14 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3GJsXg04384
	for ips-outgoing; Mon, 16 Apr 2001 15:54:33 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3GJs9r04351
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 15:54:09 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id PAA26366; Mon, 16 Apr 2001 15:54:03 -0400 (EDT)
Message-ID: <3ADB4D88.D8B90C48@cisco.com>
Date: Mon, 16 Apr 2001 14:52:40 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: IPS <ips@ece.cmu.edu>
CC: gberthet@xmlnetworks.com
Subject: [Fwd: iSCSI MIB at mibCentral]
Content-Type: multipart/mixed;
 boundary="------------4A5D5F24FA53F3BD790003E0"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------4A5D5F24FA53F3BD790003E0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Dear iSCSI MIB enthusiasts-

The iSCSI MIB has been posted on mibCentral, in a browseable
HTML format.  If you are interested, please look at

    http://www.mibCentral.com

Click on "IETF WGs", then MIBs, then iSCSI-MIB.

Note:  This is the version posted prior to the Minneapolis
IETF meeting.  We hope to have another draft available shortly.

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054
--------------4A5D5F24FA53F3BD790003E0
Content-Type: message/rfc822
Content-Disposition: inline

Return-Path: gberthet@xmlnetworks.com
Received: from sj-msg-core-1.cisco.com (sj-msg-core-1.cisco.com [171.71.163.11]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id LAA13474 for <mbakke@dogwood.cisco.com>; Wed, 11 Apr 2001 11:53:19 -0400 (EDT)
Received: from sj-msg-av-3.cisco.com (sj-msg-av-3.cisco.com [171.69.2.19])
	by sj-msg-core-1.cisco.com (8.9.3/8.9.1) with ESMTP id IAA18491
	for <mbakke@sj-av.cisco.com>; Wed, 11 Apr 2001 08:53:18 -0700 (PDT)
Received: from proxy1.cisco.com (localhost [127.0.0.1])
	by sj-msg-av-3.cisco.com (8.10.1/8.10.1) with ESMTP id f3BFrIl22928
	for <mbakke@sj-av.cisco.com>; Wed, 11 Apr 2001 08:53:18 -0700 (PDT)
Received: from femail4.sdc1.sfba.home.com (femail4.sdc1.sfba.home.com [24.0.95.84])
	by proxy1.cisco.com (8.11.2/8.11.2) with ESMTP id f3BFrCV08522
	for <mbakke@cisco.com>; Wed, 11 Apr 2001 08:53:12 -0700 (PDT)
Received: from gerard ([24.7.82.153]) by femail4.sdc1.sfba.home.com
          (InterMail vM.4.01.03.20 201-229-121-120-20010223) with ESMTP
          id <20010411155022.RYNF29484.femail4.sdc1.sfba.home.com@gerard>;
          Wed, 11 Apr 2001 08:50:22 -0700
Message-Id: <4.2.0.58.20010411083450.009d4f00@192.168.0.12>
X-Sender: gjbxni#xmlnetworks.com@192.168.0.12 (Unverified)
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58 
Date: Wed, 11 Apr 2001 08:40:22 -0700
To: Mark Bakke <mbakke@cisco.com>
From: Gerard Berthet <gberthet@xmlnetworks.com>
Subject: iSCSI MIB at mibCentral
In-Reply-To: <3AB12619.43B47567@cisco.com>
References: <NMEALCLOIBCHBDHLCMIJOEDGCCAA.someshg@yahoo.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
X-Mozilla-Status2: 00000000

Mark,

I am pleased to inform you that mibCentral posted the iSCSI MIB
at http://www.mibCentral.com along with other IETF WG MIBs under
development.

mibCentral is a new and unique SNMP MIB Search Engine, which
allows searching and viewing SNMP MIBs through a full HTML
interface.

Could you please let your WG know about this site?
I would also welcome any feedback you might have about the site.
Regards,

Gerard Berthet
President
XML Networks, Inc.


--------------4A5D5F24FA53F3BD790003E0--



From owner-ips@ece.cmu.edu  Mon Apr 16 19:43:31 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA15304
	for <ips-archive@odin.ietf.org>; Mon, 16 Apr 2001 19:43:30 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3GLCc709916
	for ips-outgoing; Mon, 16 Apr 2001 17:12:38 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3GLBvr09878
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 17:11:57 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3GMJx015014;
	Mon, 16 Apr 2001 15:19:59 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <julian_satran@il.ibm.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI Requirements Draft - Informal WG Last Call
Date: Mon, 16 Apr 2001 14:10:07 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJCEFLCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <C1256A30.0059AD02.00@d12mta02.de.ibm.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,

I agree that without serialization of commands and then mandating a
sequential delivery scheme, there is no means of ensuring sequential
delivery or allowing atomicity of command groups.  This sequential delivery
also allows rejection of stray commands that may have become trapped within
the network.  A problem of duplicating serialization for iSCSI "immediate"
is that now the receive window can be closed by a "non-immediate" command
carrying the same value.  You will need to adopt connection allegiance for
repeated sequences to avoid a closed window on "immediate" commands.  (It
would also be an improvement if this iSCSI "immediate" term did not overload
the meaning used in SCSI.)

Also, there would appear to be no succession limit imposed on "immediate"
commands.  Santosh will take this freedom to imply all commands can be
treated in this fashion.  If so, terminology that will impose First In First
Out treatment on multiple "immediate" commands that carry the same
serialization will be needed and connection allegiance seems to provide this
feature.  For succession, perhaps unique serialization re-applies on
subsequent "immediate" commands to allow multiple "immediate" commands to be
serialized across multiple connections and to re-enable iSCSI flow control.
The problem this creates results in apparent skips for the sequencer
tracking only "non-immediate" commands.  (Rejecting commands bypassed is one
method to handle this problem.)  There would not need to be a
"non-immediate" command to cause this increment in the case of no
"non-immediate" commands ever presented.  This implies however two
sequencers are operating.  One for "immediate" and another for
"non-immediate."  In essence, not using "immediate" would be the same as
only using "immediate."  The "Casual" ordering sought by Santosh does not
seem possible within these constraints without an additional flag to
explicitly allow such treatment.  There would then need to be warnings about
SCSI Task Full flow control used in lieu of the iSCSI scheme.  How do you
envision the use of "immediate"?  The proposal does not provide much
clarification about potential limits on successive use.

The ability to flag these commands by the driver is left in doubt by this
proposal and termed out of scope.  Would it be prudent to adopt a scheme
where either Head of Queue or ACA tasks are marked for this treatment?
Depending on the implementation size of the buffers feeding the SCSI layer
on the target side, those commands bypassed can be problematic.  If the
definition of SCSI layer is to imply yet another queue where all commands
are held in sequence regardless of this iSCSI "immediate" flag, then
solutions are limited but then the function of the "immediate" flag is then
severely reduced as well.  If the "immediate" flag is still used within the
SCSI layer, then the definition of SCSI layer is a bit odd and should be
clarified.

You may wish to elaborate what you expect to be done in the case where tasks
to be acted upon are bypassed.  In my opinion, rejecting these commands back
to the initiator is the corrective action.  This rejection scheme also
allows "immediate" commands to carry their own serialization to benefit from
iSCSI flow control remaining valid, connection allegiance not a concern for
the initiator, the command window immediately opening, and "immediate"
commands then receive acknowledgement.  The only provision for rejection
would be that prior "immediate" commands would not be rejected back to the
initiator.

Ver 5-91, Pg 11,
   "If immediate delivery is used with task management commands, these
   commands may reach the SCSI target task management before the tasks
   they are supposed to act upon.  However, their CmdSN is a good
   reference for what commands the immediate command was supposed to act
   upon."

   "The target MUST NOT transmit a MaxCmdSN that is more than 2**31 - 1
   above the last ExpCmdSN.  For non-immediate commands, the CmdSN field
   can take any value from ExpCmdSN to MaxCmdSN. For immediate commands,
   the CmdSN field can take any value from ExpCmdSN to MaxCmdSN+1. The
   target MUST silently ignore any command outside this range or
   duplicates within the range that have not been flagged with the retry
   bit (the X bit in the opcode)."
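
As a rough illustration of the window rule quoted above, the range check
might look like the sketch below. The function name is hypothetical, and
the 32-bit wrap-around arithmetic is an assumption (the quoted text only
implies it through the 2**31 - 1 limit). Duplicate detection and the retry
bit are not modelled.

```python
MOD = 1 << 32   # assumed 32-bit sequence number space
HALF = 1 << 31


def in_window(cmd_sn, exp_cmd_sn, max_cmd_sn, immediate=False):
    """Return True if cmd_sn lies in the window the quoted text allows."""
    # Immediate commands may use one value past MaxCmdSN.
    upper = (max_cmd_sn + (1 if immediate else 0)) % MOD
    span = (upper - exp_cmd_sn) % MOD
    if span >= HALF:
        # MaxCmdSN is behind ExpCmdSN: the window is closed.
        return False
    return (cmd_sn - exp_cmd_sn) % MOD <= span


print(in_window(21, 10, 20))                  # False
print(in_window(21, 10, 20, immediate=True))  # True
print(in_window(10, 10, 9))                   # False, window closed
print(in_window(10, 10, 9, immediate=True))   # True
```

The last two calls show the case raised in this mail: with the window closed
for non-immediate commands, an immediate command can still use the value
ExpCmdSN, which a later non-immediate command will carry as well.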


Doug

> Ordered delivery of commands to ANY TYPE of devices will increase in
> importance as network speeds increase and the need to hide latency
> increases.
>
> Today databases don't use queuing and rely on trickling the commands to
> devices one by one to ensure atomicity and order.
> As latency will become the determining factor in performance this is bound
> to change.
>
> SCSI has done an excellent job in defining the queueing mechanism. We have
> to make it work with good performance in our environment.
>
>
>
> Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 13/04/2001 04:33:45
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   ips@ece.cmu.edu
> cc:   Black_David@emc.com
> Subject:  Re: iSCSI Requirements Draft - Informal WG Last Call
>
>
>
>
> David & All,
>
> I object to the following requirement :
>
> " MUST support ordered delivery of SCSI commands from the initiator to
> the
>   target, to support SCSI Task Queuing. "
>
> Ordered delivery is not a requirement for disk based applications and
> non tagged queueing tape applications, which form the majority of
> today's data traffic.
>
> To impose strict ordering (even in the presence of errors ?) as a MUST
> is penalizing the majority of today's data traffic that does not expect
> ordering from the SCSI subsystem.
>
> I am particularly concerned about the effect of the above requirement in
> the presence of errors. Does iSCSI expect strict ordering to be
> maintained even when individual I/O errors like ULP timeout occur ?
>
> On a ULP timeout (caused by, say, a hole in CmdSN), the initiator may
> choose not to retry the command, but instead, error it back to the ULP.
> In such a case, it can plug the hole in CmdSN with a NOP-OUT.
>
> The above requirement is not feasible to be met under such circumstances
> and others similar to this. Mandating strict ordering on ULP timeouts
> implies a session level error recovery on any individual I/O being
> failed back from iSCSI to SCSI ULP. This is a very heavy hammer to use
> as error recovery and should not be imposed.
>
> The above requirement must be changed to :
> " SHOULD support ordered delivery of SCSI commands from the initiator to
> the
>   target, to support SCSI Task Queuing. "
>
> - Santosh
>
>
>
> Black_David@emc.com wrote:
> >
> > It is intended to submit draft-ietf-ips-iscsi-reqmts-02.txt
> > as an Informational RFC. There is no formal requirement for
> > a WG Last Call, but if you have any further substantive comments
> > on the document please raise them on this list within the next
> > two weeks, i.e. by April 27th at the latest.
> >
> > If you have typographical/editorial comments please send them
> > direct to the document's author, Marjorie Krueger
> > <marjorie_krueger@hp.com>.
> >
> > Thanks,
> > --David and Elizabeth, IPS WG co-chairs
>
>
>
>



From owner-ips@ece.cmu.edu  Mon Apr 16 19:49:52 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA15349
	for <ips-archive@odin.ietf.org>; Mon, 16 Apr 2001 19:49:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3GLlbx12479
	for ips-outgoing; Mon, 16 Apr 2001 17:47:37 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3GLknr12421
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 17:46:49 -0400 (EDT)
Received: from ljoy ([10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3GMpe015040;
	Mon, 16 Apr 2001 15:52:24 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Santosh Rao" <santoshr@cup.hp.com>
Cc: "Ips" <ips@ece.cmu.edu>
Subject: RE: iSCSI:flow control, acknowledgement, and a deterministic recovery
Date: Mon, 16 Apr 2001 14:41:49 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJGEFMCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <3ADB3630.DF674238@cup.hp.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Santosh,

The iSCSI proposal ver 5-91 explicitly defines tasks and also includes the
option to allow linked commands to be sent across different connections.
Obviously, for sequential devices, particular attention must be paid to
command serialization as these commands tend to use relative addressing or
are dependent upon the successful completion of prior commands.  This
requirement is not helped with Auto-Sense and impels the need for a target
model change in SCSI.  An error injected into the SCSI layer as a result of
a network communication error will significantly reduce the utility of most
backup applications.  Such reliance on the SCSI layer to recover from such
uncertainty imposed as a result of the inability of the network transport to
do minimal handshakes and retries is the wrong approach.  Regardless, you
are burdening the driver with the duty of tracking a transient connection
allegiance status.  The latest version has improved language with the
exception of Pg. 16 in two places.

Ver 5-91
Pg. 15
"Connection allegiance is strictly per-command and not per-task."

Pg. 16
"tasks that have allegiance to the connection"
"all outstanding tasks that have allegiance to the connection to conclude
and send their status."

Doug

> Doug,
>
> You seem to be referring to linked commands as a case wherein the
> approach of Abort Task will not flush stale PDUs.
>
> Linked Commands cannot work the way SCSI implementations are defined
> today, since linked commands require the initiator task tag (I_T_L_x
> nexus identifier in SAM-2 Execute Command terminology) to be generated
> by the SCSI ULP. However, in practice, the Initiator Task Tag (or the FC
> OX_ID) is typically generated in the SCSI LLP (or in some cases in the
> adapter firmware). IOW, there is no common reference handle like the
> task tag sent down from the ULP that allows for association of multiple
> commands to a task in several/most implementations today.
>
> When this is fixed up to get linked commands to work [& there exist
> examples of its usage], there is no reason connection allegiance could
> not be applied to all the commands within the task.
>
> I fail to see why you think Abort Task will not work with sequential
> devices (?).
>
> - Santosh
>
> Douglas Otis wrote:
> >
> > Santosh,
> >
> > I see a few problems with this approach.  Tasks as defined in
> > iSCSI do not maintain connection allegiance.  The driver binds all
> > SCSI commands to their connection for the most recent association.
> > Although there are several places within the iSCSI proposal that
> > make reference to a task having a connection allegiance, this is
> > in error.  Commands and not tasks carry such allegiance.  Your
> > recovery scheme will not allow a satisfactory recovery with a
> > sequential device.  In this case, repeating the command is not a
> > solution.  As a result, if one connection falters, recovery becomes
> > difficult.  In addition, iSCSI gives you no clue as to your
> > delivery status.  You do not know if you are waiting for the target
> > or if you are waiting for the connection.  Some sequential devices
> > have rather long time-outs with these complications of deducing
> > status created by the multiple connections.
> >
> > The application will not know about these connection allegiance
> > problems. The iSCSI layer does not define interaction to provide
> > additional application status to allow these applications to respond
> > in a manner that may aid this situation nor should such additional
> > information be required.  With your scheme the SCSI driver must
> > examine the content of these commands to make a guess as to the
> > connection allegiance assignments.  Now the driver is expected to
> > understand what the intended action is of this SCSI management
> > command.  What signal is used to indicate a need for the iSCSI
> > immediate treatment?  The only obvious seems to be the task attribute
> > argument.  With the way iSCSI has defined iSCSI immediate, I
> > would expect those commands to be treated in a LIFO rather than the
> > normal FIFO fashion.
> >
> > Doug



From owner-ips@ece.cmu.edu  Mon Apr 16 22:26:23 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA17274
	for <ips-archive@odin.ietf.org>; Mon, 16 Apr 2001 22:26:18 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3GNjcj20319
	for ips-outgoing; Mon, 16 Apr 2001 19:45:38 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3GNjMr20302
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 19:45:22 -0400 (EDT)
Received: from core.rose.hp.com (core.rose.hp.com [15.43.208.100])
	by palrel3.hp.com (Postfix) with ESMTP id 69E64557
	for <ips@ece.cmu.edu>; Mon, 16 Apr 2001 16:45:19 -0700 (PDT)
Received: (from cbm@localhost) by core.rose.hp.com (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id QAA03387 for ips@ece.cmu.edu; Mon, 16 Apr 2001 16:46:21 -0700 (PDT)
Message-Id: <200104162346.QAA03387@core.rose.hp.com>
Subject: Re: TCP checksum escapes and iSCSI error recovery design
To: ips@ece.cmu.edu
Date: Mon, 16 Apr 2001 16:46:21 PDT
Reply-To: cbm@rose.hp.com
From: "Mallikarjun C." <cbm@rose.hp.com>
X-Mailer: Elm [revision: 212.4]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Let me add some comments to this excellent analysis from Jim - 
in particular what this means to the ERT work in progress.

Grossly, there seem to be three different camps of thinking (with
possible finer variations) -

A. trust TCP implicitly: checksum is adequate, no iSCSI digests.
B. trust TCP explicitly: verify TCP, and signal a service delivery
   subsystem failure to SCSI on a CRC (tear up the session) since 
   it happens so very rarely.
C. verify and recover: verify TCP, and do a full recovery as needed
   since TCP checksum escapes happen in a random, non-deterministic
   way (maybe frequently at times).

The attached discussion seems to indicate that the truth is somewhere
between camps B and C.  In the absence of other conclusive evidence
placing us fully in camp B, the recourse is to crisply and unambiguously
specify the error recovery mechanisms while ensuring interoperability 
with and minimal penalty for, camp B.  Speaking for the Error Recovery
Team, this for now continues to be the operating assumption of ERT as
we specify algorithms, trim options, and work on the wording.
--
Mallikarjun 


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668	Hewlett-Packard, Roseville.
cbm@rose.hp.com


>All,
>In designing the iSCSI error recovery mechanisms there has been considerable
>focus on general deficiency of the TCP checksum.  It seems that what iSCSI
>error recovery should really be based upon is the overall profile of TCP
>checksum escapes for the networks that will carry iSCSI traffic ("checksum
>escapes" being defined as those cases where corrupted data is passed through
>TCP and delivered to the upper layer protocol as good).  The design of iSCSI
>error recovery needs to start with a clear and agreed upon set of
>assumptions regarding the profile for TCP checksum escapes across space and
>time. Thus, the question is not simply "How often is corrupted data passed
>through TCP and delivered to iSCSI", but more along the lines of  "What is
>the distribution of TCP checksum escapes across network paths and through
>time?"
>
>iSCSI error recovery should be neither over-designed nor under-designed, and will
>be fundamentally different if the assumption is that checksum escapes or
>bursts of such occur every 5 minutes or 5 hours or 5 days, and if the
>assumption is that there are a few bad paths or that all paths are bad
>sometimes. 
>
>It would make sense for iSCSI to have minimal error recovery if the spatial
>profile (discussed below) of TCP errors were bi-modal (i.e. there are good
>paths and bad paths). IOW, if the checksum escapes happen deterministically
>always or not at all, iSCSI can simply drop the connection on a digest
>failure because the operation switched into error mode.  OTOH, we cannot
>adopt this simplistic approach if TCP can be in the shades of gray between
>the two modes.  Attached email discussion indicates that load levels and
>memory usage models along the path play a key role in determining this
>aspect. Employing an iSCSI-level digest/CRC appears to be the wise approach
>regardless of the TCP operational model (bimodal or otherwise), to detect
>escapes when they happen. IOW, the principle of "trust but verify" is apt to
>be applied here. The question is what is the appropriate action for iSCSI to
>take if the verification detects data corruption.
>
>To start this discussion toward a set of assumptions regarding TCP checksum
>escape profiles and an appropriate iSCSI error recovery design, this email
>includes responses from several individuals working in this area (Vern
>Paxson, Craig Partridge, Jonathan Stone), along with links to their original
>papers.
>
>Regards,
>Jim Wendt
>Networked Storage Architecture
>Network Storage Solutions Organization
>Hewlett-Packard Company / Roseville, CA
>Tel: +1 916 785-5198
>Fax: +1 916 785-0391
>Email: jim_wendt@hp.com <mailto:jim_wendt@hp.com>
>
>
>----------------------------------------------------------
>Here is a copy of my original email to Vern Paxson, Craig Partridge,
>Jonathan Stone, and Jeff Chase:
>
>Hi All,
>I'm e-mailing this to Vern, Craig, Jonathan, and Jeff in hopes of gathering
>information around TCP checksum failure profiles and Internet data failure
>characteristics in general.  I'm Jim Wendt and I've recently started working
>on both the iSCSI error recovery and RDMA/TCP-Framing activities. I'm
>emailing you directly because of your specific research and experience
>regarding TCP checksum failures and behavior.
>
>My immediate interest is in regards to TCP checksum failure rates
>(undetected errors) and how iSCSI should handle these undetected errors.  As
>you probably already know, the iSCSI protocol is a mapping of SCSI device
>protocols onto TCP (or SCTP eventually).  I have read the three papers (When
>the CRC and TCP Checksum Disagree, Performance of Checksums and CRCs over
>Real Data - both versions, End-to-End Internet Packet Dynamics). 
>
>The iSCSI WG has come to the conclusion that the TCP checksum provides an
>insufficient level of data protection, and that a CRC-32 will be used for
>data integrity verification on each iSCSI PDU (the specific CRC polynomial
>to employ is being studied now). Thus, the assumption is that if corrupted
>payload data does pass through TCP and is undetected by the TCP checksum,
>then the corruption will be detected at the iSCSI PDU level via the CRC-32
>check.
>
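
A small sketch may help show what a "checksum escape" looks like in
practice.  It is an illustration only: internet_checksum is a plain
16-bit ones'-complement sum written for this example, and zlib.crc32
stands in for the iSCSI digest, since the message above says the CRC
polynomial was still being studied.  Swapping two 16-bit words leaves
the ones'-complement sum unchanged, because addition is commutative,
while the CRC-32 changes.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum of the kind used by TCP and IP."""
    if len(data) % 2:
        data += b"\x00"                 # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    original = b"\x12\x34\xab\xcd\x00\x01"
    corrupted = b"\xab\xcd\x12\x34\x00\x01"   # first two 16-bit words swapped

    same_sum = internet_checksum(original) == internet_checksum(corrupted)
    same_crc = zlib.crc32(original) == zlib.crc32(corrupted)
    print("ones'-complement checksum unchanged:", same_sum)
    print("CRC-32 unchanged:", same_crc)
```

Word reordering is only one class of corruption; the example is not a
claim about which error patterns dominate on real paths.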
>What we are considering now is what level and philosophy of error handling
>and recovery is put into iSCSI. It seems to me that fundamentally different
>error handling behaviors would be put into iSCSI based on a known rate or
>profile of occurrence of bad PDUs (those PDUs that are passed up through TCP
>as being OK, but for which the iSCSI level CRC-32 indicates a data error).
>Thus, iSCSI error handling might be defined differently if the expected PDU
>error rate is one every 5 minutes as opposed to one every 5 hours (or 5
>days). Also, the error handling might be different if errors are bursty
>(i.e. a sequence of bad PDUs) rather than evenly spread through time.
>
>I would appreciate hearing your thoughts (and any supporting data references
>I could employ) regarding the nature of TCP checksum failure profiles across
>time and space (i.e. network paths).   More specifically:
>
>* How does the Internet work in the face of TCP error escapes?  If there
>truly is a relatively high level of application level data corruption
>(perhaps 1 in 16M packets), then how does the Internet and associated
>applications manage to function?
>
>* What is the distribution of errors across network paths?  Are most network
>paths very good (low TCP error escapes) with only a few paths that are
>really bad, or are TCP error escapes more evenly distributed across all
>network paths?  If the spatial profile is that there are good and bad paths,
>then do the error escapes rates on these two classes of paths correspond to
>the high/low bounds for TCP error escapes (1 in 16 million / 1 in 10
>billion)?
>
>* What is the distribution of errors through time?  Do TCP error escapes
>occur individually and randomly through time, or are TCP error escapes more
>bi-modal where most of the time there are no errors and occasionally there
>is a clump or burst of TCP error escapes?  If the temporal profile for TCP
>error escapes is that there are good periods and infrequent but severe bad
>periods, then what is the duty cycle for these periods (how bad/good for how
>long), and what are the error escape rates during these periods?
>
>* What about a stronger TCP checksum?  I don't believe that anyone ever
>actually employed RFC1146 (TCP Alternate Checksum).  Has there been any
>recent thinking about actually improving TCP's end-to-end data integrity
>checking?  I suppose that the existing middle box infrastructure won't allow
>for this. However, I'm considering submitting a draft and starting to push
>for a stronger TCP checksum or CRC, but I would like to get feedback from
>all of you on the technical feasibility and possible acceptance of this
>proposal before taking it to the public forums.
>
>* What about a stronger SCTP checksum? SCTP currently uses the Adler-32
>checksum. Perhaps an optional stronger CRC-32 should be defined for SCTP.
>Has there been any thinking in this direction? Again, I am considering
>pursuing this with the SCTP folks but would appreciate any feedback you have
>to offer first.
>
>Thanks,
>Jim
>
>Jim Wendt
>Networked Storage Architecture
>Network Storage Solutions Organization
>Hewlett-Packard Company / Roseville, CA
>Tel: +1 916 785-5198
>Fax: +1 916 785-0391
>Email: jim_wendt@hp.com <mailto:jim_wendt@hp.com>
>
>
>----------------------------------------------------------
>Craig Partridge writes:
>
>Hi Jim:
>
>Here's my quick set of answers, which Vern, Jonathan and others can
>refine.
>
>>* How does the Internet work in the face of TCP error escapes?  If there
>>truly is a relatively high level of application level data corruption
>>(perhaps 1 in 16M packets), then how does the Internet and associated
>>applications manage to function?
>
>The answer is that applications appear to function because people don't
>notice the errors or resort to backups, or whatever.  It isn't clear
>exactly how we're managing to survive.  (Back in the old days when NFS
>had these problems, people used to assume a disk failure had occurred when
>in fact the network had trashed their data).
>
>>* What is the distribution of errors across network paths?  Are most network
>>paths very good (low TCP error escapes) with only a few paths that are
>>really bad, or are TCP error escapes more evenly distributed across all
>>network paths?  If the spatial profile is that there are good and bad paths,
>>then do the error escapes rates on these two classes of paths correspond to
>>the high/low bounds for TCP error escapes (1 in 16 million / 1 in 10
>>billion)?
>
>The answer is that there are good and bad paths.  On good paths you'll
>probably see fewer escapes than 1 in 10 billion -- you can probably treat
>them as essentially error free.  If you start seeing a modest number of
>errors, then either (a) there's a broken router in your path or (b)
>there's a broken end system.  If we had a better way of identifying the
>source of errors, I'd say that if you ever see an error from a host,
>declare it broken.
>
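
To relate the escape rates mentioned in this thread (1 in 16 million and
1 in 10 billion packets) to the "every 5 minutes or 5 hours or 5 days"
framing, a rough back-of-the-envelope calculation can be sketched as
follows.  The link speed (1 Gb/s), the packet size (1500 bytes), the
reading of "16M" as 16,000,000, and both function names are assumptions
made for this example; none of them come from the discussion itself.

```python
def packets_per_second(link_bits_per_second: float, packet_bytes: int) -> float:
    """Packet rate of a fully loaded link carrying fixed-size packets."""
    return link_bits_per_second / (packet_bytes * 8)


def mean_seconds_between_escapes(packets_per_escape: float, pps: float) -> float:
    """Average time between escapes if one packet in packets_per_escape escapes."""
    return packets_per_escape / pps


if __name__ == "__main__":
    pps = packets_per_second(1e9, 1500)          # assumed: saturated 1 Gb/s link
    for label, odds in (("1 in 16 million", 16e6), ("1 in 10 billion", 1e10)):
        seconds = mean_seconds_between_escapes(odds, pps)
        print(f"{label}: about {seconds / 60:.1f} minutes "
              f"({seconds / 3600:.2f} hours) between escapes")
```

With these assumed inputs the two bounds differ by a factor of 625, so
they land in quite different recovery regimes, which is the reason the
spatial and temporal profile matters to the design.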
>>* What is the distribution of errors through time?  Do TCP error escapes
>>occur individually and randomly through time, or are TCP error escapes more
>>bi-modal where most of the time there are no errors and occasionally there
>>is a clump or burst of TCP error escapes?  If the temporal profile for TCP
>>error escapes is that there are good periods and infrequent but severe bad
>>periods, then what is the duty cycle for these periods (how bad/good for how
>>long), and what are the error escape rates during these periods?
>
>They're not random but I don't think we know enough about the timing to say.
>My hunch is that a lot are based on end system load (that high loads
>on network cards tend to encourage certain classes of bus errors).
>
>>* What about a stronger TCP checksum?  I don't believe that anyone ever
>>actually employed RFC1146 (TCP Alternate Checksum).  Has there been any
>>recent thinking about actually improving TCP's end-to-end data integrity
>>checking?  I suppose that the existing middle box infrastructure won't allow
>>for this. However, I'm considering submitting a draft and starting to push
>>for a stronger TCP checksum or CRC, but I would like to get feedback from
>>all of you on the technical feasibility and possible acceptance of this
>>proposal before taking it to the public forums.
>
>That's a tricky question and not one I've thought enough about to have
>a strong view -- except to say that it isn't all the checksum's fault --
>Jonathan's done work showing that putting the checksum at the end
>dramatically improves its efficacy in certain situations.
>
>>* What about a stronger SCTP checksum? SCTP currently uses the Adler-32
>>checksum. Perhaps an optional stronger CRC-32 should be defined for SCTP.
>>Has there been any thinking in this direction? Again, I am considering
>>pursuing this with the SCTP folks but would appreciate any feedback you have
>>to offer first.
>
>There's considerable reason to believe that Adler-32 is a lousy checksum --
>it uses more bits (32 vs. 16) and will probably detect fewer errors than
>either Fletcher-16 or the TCP checksum.  If one is going to invest energy
>in fixing a checksum, fixing SCTP's checksum is probably the first priority.
>
>Craig
>
>----------------------------------------------------------
>Vern Paxson writes:
>
>Just a couple of follow-on points:
>
>	- I suspect the Internet survives because the bulk of the
>	  traffic isn't all that critical (Web pages, particularly
>	  large items like images and perhaps video clips), so when
>	  they're corrupted, nothing breaks in a major way.
>
>	- One point to consider is that if you use IPSEC, then you
>	  get very strong protection from its integrity guarantees
>
>- Vern
>
>----------------------------------------------------------
>Jonathan Stone writes:
>
>In message <200104092135.f39LZVS21092@daffy.ee.lbl.gov>Vern Paxson writes
>
>Jim,
>
>Craig's answer is an excellent summary.  Fixing the SCTP checksum is a
>priority; I have a chapter in my thesis addressing some of the issues
>there, and I believe Craig and I plan to write a paper.
>Stronger TCP checksums are an interesting idea, but the lag on
>deploying new TCP features seems to be  5 years or more.
>
>
>If I re-did the study, I would try and log headers of ``good'' packets
>(those where the checksum matches) as well as the entire packets with
>checksum errors.  With only data on the `bad' packets (which is partly
>to meet privacy concerns, which were a big hurdle for us),
>it's hard to give good answers to your questions about the time or path
>dependency of bad checksums.
>
>One thing we can say is that there appear to be certain `bad' hosts
>which emit very high rates of packets with incorrect checksums.  One
>host we noticed in the Stanford trace sent 2 packets every second for
>around three hours, totalling about 20% of the total packets in that
>trace.  We can't say whether that's a host problem or a path problem;
>but either way that error rate would worry me, if I were using I-SCSI
>on that host (or path).  That said, we did find what looks to be a
>router with a bad memory bit, which is clearly path-dependent
>(though hitting that particular memory word may be time- and load-dependent
>as well).
>
>Further, some of the bad checksums appear to be due to software
>bugs in specific OS releases.  As the Internet evolves (old OS revs
>replaced by newer ones), the rate of those specific errors will
>evolve over time.
>
>
>Last, the DORM trace in our SIGCOMM 2000 paper is the only trace where
>our monitoring point was directly adjacent to end-hosts, with no
>intervening IP routers.  On that Ethernet segment, I saw about 1 in
>80,000 packets with a valid frame CRC, but a bad IP checksum --
>often a deletion of a 16-bit word from the IP address.  I tried
>to `repair' some  IP headers (e.g., by inserting what seemed
>to be a missing 16 bits of an IP source address).
>After the repair, the IP checksum seemed to be correct.
>
>That points to a problem occurring after the sending IP layer computes
>its IP header checksum, but before the frame-level CRC is computed.
>That suggests an error either in the NIC device driver, in its DMA
>engine, or somewhere on the NIC itself (dropped from a FIFO?).
>
>That led Craig and me to suggest that where possible, checksums be
>computed early and verified late.  That's a caution about a potential
>downside of outboard checksumming: it covers less of the path to an
>application buffer than is covered by software checksums.  Software
>checksums can catch errors which occur between the driver and the NIC
>(bus errors, DMA, FIFO overruns, what-have-you); but outboard hardware
>checksums do not.  (IEN-45 mentions this issue, but the two-pass
>scheme suggested there doubles bus utilization for a given I/O rate.)
>
>Putting middle boxes in the path just exacerbates this.
>
>Whether or not to use outboard checksumming is entirely up to
>NIC designers. We merely raise the issue that it checks less of
>the datapath than is covered by software checksums.
>
>
>>Just a couple of follow-on points:
>>
>>	- I suspect the Internet survives because the bulk of the
>>	  traffic isn't all that critical (Web pages, particularly
>>	  large items like images and perhaps video clips), so when
>>	  they're corrupted, nothing breaks in a major way.
>>
>>	- One point to consider is that if you use IPSEC, then you
>>	  get very strong protection from its integrity guarantees
>
>For a software IPSEC implementation.
>
>A hardware implementation with outboard crypto hardware could
>potentially fall foul of the same kinds of local-to-source-host
>errors, (DMA errors or whatever the ultimate cause is) which our data
>indicates some NICs suffer from.  If the data has already been
>curdled by the time the encrypting accelerator sees it,
>I don't see how IPsec actually enhances integrity.
>
>----------------------------------------------------------
>Vern Paxson writes:
>
>> >	- One point to consider is that if you use IPSEC, then you
>> >	  get very strong protection from its integrity guarantees
>> 
>> For a software IPSEC implementation.
>
>Good point!
>
>		Vern
>
>----------------------------------------------------------
>Craig Partridge writes:
>
>I'd like to clarify my statement to Jim about Adler-32 to be a bit more clear
>about what the issues are.   I've also added Mark Adler to the cc list as
>I'd promised Mark to get him results when we had them, and while the results
>aren't quite cooked, if Jim is going to circulate a discussion, Mark
>should see it first.
>
>If you're designing a checksum, there are certain features you'd like it
>to have.  Here's a starting point of a formal definition.  Given the set
>V of all possible bit vectors and a checksum function C(), what we'd like
>is:
>
>	prob(C(v1) == C(v2)) is 1/2**(sizeof C())
>
>that is given any two v1, v2 being different elements of V, the chance that
>their checksum will collide is the best possible, namely 1 over 2 raised to
>the power of the bitwidth of the result of C().
>
>Three sub points:
>
>    1. This is not quite the same as what the cryptographic checksum folks
>    want.  They actually want it to be very hard [for some computational
>    approximation of hard], given C(v1) to find a v2 such that C(v1) == C(v2).
>    For network checksums, we don't care as we're protecting from errors,
>    not attacks.
>
>    2. If we do not pick v1 and v2 at random, but according to some distribution
>    rule of likely packet sizes, packet contents, etc, we'd still like the
>    equation to be true.  We don't want to be vulnerable to certain
>    traffic patterns, etc.
>
>    3. You can compare the effectiveness of checksums by how close they
>    come to this ideal -- that is, how effectively do they use their
>    range of values?
>
>OK, so let's talk about Adler-32.  Adler-32 is a neat idea -- it seeks to
>improve the performance of the Fletcher checksum by summing modulo a prime
>number, rather than 255 or 256.
>
>However, it sums bytes (8-bit quantities) into 16-bit fields.  As a result,
>the high bits of the 16-bit fields take some time to fill (they only get
>filled by propagating carries from lower bits) and until the packet
>is quite big (thousands of bytes) you don't get enough mixing in the high
>bits.  Two problems: (a) we're not fully using the 16-bit width, so for
>smaller packets the chance of collision is much greater than 1/2**sizeof(C)
>simply because some bits in the checksum are always (or with very
>high probability) set to 0; and (b) it looks like (and Jonathan's still
>working on this) that the law of large numbers will cause the values to
>cluster still further [I think of this behavior as the result of just
>looking at all the bits instead of just the low order bits mod a prime, we're
>vulnerable to the fact that the sums are not evenly distributed, by
>the law of large numbers]
>
>We're still working on whether the core idea behind Adler-32 (namely working
>modulo a prime) is as powerful as it seems, but it is clear that to have
>a hope of making it comparable to the TCP checksum or Fletcher, you have to
>sum 16-bit quantities into 16-bit fields.
>
>Craig
>
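
Point (a) above, that the high bits of Adler-32's 16-bit fields stay
empty for short inputs, can be checked with a short sketch.  It uses
zlib.adler32 from the Python standard library; the helper names
adler32_halves and max_low_sum are made up for this example.  The bound
1 + 255 * length holds only while it stays below the modulus 65521,
that is for inputs of up to 256 bytes.

```python
import random
import zlib


def adler32_halves(data: bytes):
    """Split zlib's Adler-32 into its two 16-bit sums (A = low, B = high)."""
    value = zlib.adler32(data)
    return value & 0xFFFF, value >> 16


def max_low_sum(length: int) -> int:
    """Largest possible A for `length` bytes: 1 plus `length` bytes of 255."""
    return 1 + 255 * length


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed so runs are repeatable
    length = 64
    bits_seen = 0
    for _ in range(10000):
        message = bytes(rng.randrange(256) for _ in range(length))
        low, _high = adler32_halves(message)
        bits_seen |= low            # accumulate every bit of A ever set
    print(f"upper bound on A for {length}-byte input: {max_low_sum(length)}")
    print(f"bits of A ever set over 10000 samples: {bits_seen:016b}")
```

For a 64-byte input the bound is 16321, which is below 2**14, so the
top two bits of the low 16-bit field can never be set regardless of
content.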
>----------------------------------------------------------
>Jonathan writes:
>
>In message <200104101505.f3AF5BZ23145@aland.bbn.com>Craig Partridge writes
>
>>I'd like to clarify my statement to Jim about Adler-32 to be a bit more clear
>>about what the issues are.   I've also added Mark Adler to the cc list as
>>I'd promised Mark to get him results when we had them, and while the results
>>aren't quite cooked, if Jim is going to circulate a discussion, Mark
>>should see it first.
>>
>>If you're designing a checksum, there are certain features you'd like it
>>to have.  Here's a starting point of a formal definition.  Given the set
>>V of all possible bit vectors and a checksum function C(), what we'd like is:
>>
>>	prob(C(v1) == C(v2)) is 1/2**(sizeof C())
>>
>>that is given any two v1, v2 being different elements of V, the chance that
>>their checksum will collide is the best possible, namely 1 over 2 raised to
>>the power of the bitwidth of the result of C().
>
>Jim, Craig: 
>
>Just to be picky, as I'm working right now on definitions of some of
>these issues for my thesis:
>
>One can list other desirable properties, like wanting each bit of the
>checksum field to have informational entropy 1/2.  Craig's aggregate
>definition falls out from that, with a few extra assumptions.
>
>Also, less formally, desiring each bit of the input data to contribute
>equally to flipping the final state of each output bit.
>That is where Adler32 runs into trouble when given short inputs.
>
>
>>Three sub points:
>>
>>    1. This is not quite the same as what the cryptographic checksum folks
>>    want.  They actually want it to be very hard [for some computational
>>    approximation of hard], given C(v1) to find a v2 such that C(v1) == C(v2).
>>    For network checksums, we don't care as we're protecting from errors,
>>    not attacks.
>
>There are two versions of formal cryptographic invertibility; the
>other criterion is that it be computationally intractable to find *any*
>v1 and v2 such that C(v1) = C(v2).  Crypto folks would generally like
>both.
>
>
>>    2. If we do not pick v1 and v2 at random, but according to some
>>    distribution rule of likely packet sizes, packet contents, etc, we'd
>>    still like the equation to be true.  We don't want to be vulnerable
>>    to certain traffic patterns, etc.
>>
>>    3. You can compare the effectiveness of checksums by how close they
>>    come to this ideal -- that is, how effectively do they use their
>>    range of values?
>>
>>OK, so let's talk about Adler-32.  Adler-32 is a neat idea -- it seeks to
>>improve the performance of the Fletcher checksum by summing modulo a prime
>>number, rather than 255 or 256.
>>
>>However, it sums bytes (8-bit quantities) into 16-bit fields.  As a result,
>>the high bits of the 16-bit fields take some time to fill (they only get
>>filled by propagating carries from lower bits) and until the packet
>>is quite big (thousands of bytes) you don't get enough mixing in the high
>>bits.  Two problems: (a) we're not fully using the 16-bit width, so for
>>smaller packets the chance of collision is much greater than 1/2**sizeof(C)
>>simply because some bits in the checksum are always (or with very
>>high probability) set to 0; and (b) it looks like (and Jonathan's still
>>working on this) that the law of large numbers will cause the values to
>>cluster still further [I think of this behavior as the result of just
>>looking at all the bits instead of just the low order bits mod a prime,
>>we're vulnerable to the fact that the sums are not evenly distributed, by
>>the law of large numbers]
>
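A small Python check of point (a) above, using the standard zlib.adler32.
The 40-byte packet size and the helper name adler32_halves are assumptions
chosen for illustration. The low 16-bit half starts at 1 and adds one byte
per step, so for n input bytes it can be at most 1 + 255*n, which stays
below the modulus 65521 for any input of 256 bytes or fewer.

```python
import zlib

def adler32_halves(data: bytes):
    """Split zlib's Adler-32 into (A, B): A is the byte sum, B the running sum of A."""
    value = zlib.adler32(data)
    return value & 0xFFFF, value >> 16

# Worst case for a 40-byte input: every byte is 0xFF.
n = 40
a_max, _ = adler32_halves(b"\xff" * n)

# A starts at 1 and adds each byte, so it cannot exceed 1 + 255*n.
assert a_max == 1 + 255 * n

# Number of bits of the 16-bit A field that a 40-byte input can ever reach.
print(f"largest A for {n} bytes: {a_max}, bits used: {a_max.bit_length()} of 16")
```

For inputs this short the top bits of the A half are zero for every
possible payload, so the field carries less than 16 bits of information.
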
>To be fair, and to give the whole picture, there are a couple of
>points here that should be expanded.
>
>The first is that for large input data lengths, we can show that the
>distribution of both 16-bit halves of the Adler-32 sum should actually
>be well distributed.  That holds true for addition mod M, of any
>repeated independent observations of a random variable.  (A proof
>appears in the appendix of the 1998 ToN paper you cited already;
>although that version of the proof may not formally state
>all the necessary assumptions about independent observations.)
>
>However, for networking in general and SCTP in particular, there are
>fairly modest hard upper bounds on the maximum input length.
>SCTP forbids fragmentation.  For the increasingly-pervasive Ethernet
>frame length of 1500 bytes that means SCTP checksums have no more
>than 1480 input bytes.
>
>The data we have -- and I am still working on it -- say that's too
>short to get good coverage of either of the two sums in SCTP -- the purely
>additive commutative sum, or the higher-order running sum of the
>commutative sum (which is position-dependent).
>
>Craig's description gives a good intuition for the first, commutative
>sum. For the position-dependent sum (the running sum of the first sum),
>another good intuition for the computational modelling I've done is to
>imagine a hash-table where we hash, independently, all possible values
>at all possible offsets in a 1480-byte SCTP packet.  The intuition
>isn't so much "law of large numbers" as that SCTP draws its per-byte
>values from one particular corner of a two-dimensional space (the
>hash-table vs. all possible bytes); so it ends up with an uneven
>coverage of the space of all hash values.
>
>
>>We're still working on whether the core idea behind Adler-32 (namely
>>working modulo a prime) is as powerful as it seems, but it is clear that
>>to have a hope of making it comparable to the TCP checksum or Fletcher,
>>you have to sum 16-bit quantities into 16-bit fields.
>
>Just so the reasoning is clear, another alternative is to sum 8-bit
>inputs into 8-bit accumulators, modulo 251.  Given 32 bits of
>available checksum field, the 16-bit sums are preferable.
>
>Again, this is for short inputs.  For large inputs of several tens of
>Kbytes, the Adler32 sum should do much better (at 64Kbytes it should
>give very uniform coverage).  At those input sizes, the comparison comes
>down to Adler32 having 15 pairs of values which are congruent, mod
>65521, whereas a Fletcher sum with 16-bit inputs would have only one;
>versus better `stirring' from a prime modulus.
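The "15 pairs versus one" count can be reproduced with a few lines of
Python. This is only a sketch: it assumes the Fletcher variant meant here
reduces modulo 65535 (ones'-complement arithmetic), and the helper name
aliased_values is made up for the illustration. It lists every 16-bit
quantity that is congruent to a smaller one under each modulus.

```python
def aliased_values(modulus: int, width_bits: int = 16):
    """Pairs (small, large) of width_bits-wide values that are equal mod modulus."""
    return [(v - modulus, v) for v in range(modulus, 1 << width_bits)]

# Adler-32 reduces modulo 65521, the largest prime below 2**16.
adler_pairs = aliased_values(65521)

# Assumed Fletcher modulus for 16-bit inputs: 2**16 - 1.
fletcher_pairs = aliased_values(65535)

print(len(adler_pairs), len(fletcher_pairs))
```
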
>
>I don't know what the distribution of sizes of zlib-compressed files
>is.  If they are generally large, then our work may not be applicable
>to Adler-32 in its originally designed purpose.
>
>
>I also don't know the history of why SCTP chose the Adler-32 sum
>rather than, say, CRC-32. The gossip I hear from IETF-going friends in
>the Bay Area is that there was concern about the performance of
>CRC-32; a direct bit-by-bit shift-and-add was seen as too slow.  I
>hear there was also a supposition that an efficient table-lookup
>version would require very large tables (128 kbits?) and that
>tables that large were prohibitive for PDAs and small handheld devices.
>
>I am *not* suggesting that is an accurate report; it probably isn't.
>But if there's any grain of truth in it, both four-bit and eight-bit
>table-lookup algorithms for CRC32 exist.  Table lookup size need not
>be an issue.  Perhaps we should draw the SCTP authors -- C. Sharp? --
>into this discussion as well.
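To make the table-size point concrete, here is a minimal Python sketch of
four-bit and eight-bit table-lookup CRC-32. It assumes the common reflected
CRC-32 polynomial (the one zlib and Ethernet use), which is not necessarily
the polynomial a protocol would finally pick; the function names are
illustrative. The four-bit table has 16 entries of 32 bits (512 bits in
total) and the eight-bit table has 256 entries (8192 bits).

```python
import zlib

POLY = 0xEDB88320  # reflected form of the common CRC-32 polynomial (assumed here)

def make_table(bits: int):
    """Build a table that advances the CRC register by `bits` input bits at once."""
    table = []
    for i in range(1 << bits):
        crc = i
        for _ in range(bits):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
        table.append(crc)
    return table

TABLE4 = make_table(4)  # 16 entries
TABLE8 = make_table(8)  # 256 entries

def crc32_nibble(data: bytes) -> int:
    """CRC-32 using the 16-entry table: two lookups per input byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        crc = (crc >> 4) ^ TABLE4[crc & 0xF]
        crc = (crc >> 4) ^ TABLE4[crc & 0xF]
    return crc ^ 0xFFFFFFFF

def crc32_byte(data: bytes) -> int:
    """CRC-32 using the 256-entry table: one lookup per input byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ TABLE8[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF

sample = b"123456789"
print(hex(crc32_nibble(sample)), hex(crc32_byte(sample)), hex(zlib.crc32(sample)))
```

Both variants give the same result as zlib.crc32; the trade-off is table
memory against the number of lookups per byte.
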
>
>----------------------------------------------------------
>Jonathan writes:
>
>A small terminological correction here:
>
>In message <200104101505.f3AF5BZ23145@aland.bbn.com>Craig Partridge writes
>>
>
>[...]
>
>>
>>so for
>>smaller packets the chance of collision is much greater than 1/2**sizeof(C)
>>simply because some bits in the checksum are always (or with very
>>high probability) set to 0; and (b) it looks like (and Jonathan's still
>>working on this) that the law of large numbers will cause the values to
>>cluster still further [I think of this behavior as the result of just
>>looking at all the bits instead of just the low order bits mod a prime,
>>we're vulnerable to the fact that the sums are not evenly distributed, by
>>the law of large numbers]
>
>I think it's actually a central limit theorem, not the law of large
>numbers.  For addition modulo M, the "central limit theorem" says that
>summation of many independent identically-distributed random variables
>will tend to a uniform distribution, not a normal distribution
>(as does the weighted sum)
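The convergence to a uniform distribution under addition modulo M can be
checked exactly for a toy case. The sketch below is an illustration only:
the modulus 7, the skewed starting distribution and the count of 40 terms
are arbitrary assumptions, not values from the measurements discussed here.
It convolves the distribution with itself modulo M and reports how far the
result is from flat.

```python
from fractions import Fraction

def convolve_mod(p, q, m):
    """Exact distribution of (X + Y) mod m for independent X ~ p and Y ~ q."""
    out = [Fraction(0)] * m
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[(i + j) % m] += pi * qj
    return out

M = 7
# A deliberately skewed distribution that only ever produces 0, 1 or 2.
step = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)] + [Fraction(0)] * (M - 3)

dist = step
for _ in range(39):          # after the loop, dist is the sum of 40 terms mod M
    dist = convolve_mod(dist, step, M)

spread_one_term = max(step) - min(step)
spread_forty_terms = max(dist) - min(dist)
print(float(spread_one_term), float(spread_forty_terms))
```

The gap between the most and least likely residue shrinks geometrically
with the number of terms, which is why short inputs are the problem case.
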
>
>----------------------------------------------------------
>Jonathan writes:
>
>Jim,
>
>I forwarded my previous message to Chip Sharp.  (Is the RFC 2960
>address, chsharp@cisco.com, still valid?)
>
>He may wish to comment on the SCTP issues as well.
>
>
>----------------------------------------------------------
>----------------------------------------------------------
>Links to papers:
>
>End-to-End Internet Packet Dynamics, Vern Paxson
><http://citeseer.nj.nec.com/cache/papers2/cs/11598/ftp:zSzzSzftp.ee.lbl.govzSzpaperszSzvp-pkt-dyn-ton99.pdf/paxson97endtoend.pdf>
>
>When the CRC and TCP Checksum Disagree, Jonathan Stone, Craig Partridge
><http://www.acm.org/sigcomm/sigcomm2000/conf/paper/sigcomm2000-9-1.pdf>
>
>Performance of Checksums and CRCs over Real Data, Jonathan Stone, Michael
>Greenwald, Craig Partridge, Jim Hughes
><http://citeseer.nj.nec.com/cache/papers2/cs/1909/ftp:zSzzSzftp.dsg.stanford.eduzSzpubzSzpaperszSzsplice-paper.pdf/performance-of-checksums-and.pdf>
>
>----------------------------------------------------------
>
>
>




From owner-ips@ece.cmu.edu  Tue Apr 17 05:24:08 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA04639
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 05:24:03 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3H73pd15254
	for ips-outgoing; Tue, 17 Apr 2001 03:03:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3H73Sr15239
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 03:03:28 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT3NM47>; Tue, 17 Apr 2001 03:04:50 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015447@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: Eddy@quicksall.com, ips@ece.cmu.edu
Subject: RE: aborting an out of sequence cmdSN
Date: Tue, 17 Apr 2001 03:03:08 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Eddy,

> Does iSCSI have a way to abort the commands that are in the target but not
> yet in the SCSI Target Layer?

No, it doesn't.  Part of the problem is that just considering
iSCSI info, the Initiator can't tell whether such commands
are in the Target vs. still in-flight.  TCP ACK info may help,
but if TCP hasn't ACKed, the Initiator still doesn't know.

This is the issue that Doug's been posting about, advocating
that iSCSI always reject all commands prior to the task abort
on the assumption that the initiator can reissue them - given
the lack of support this has received, I'm going to take this
opportunity to state that I believe WG rough consensus exists
for NOT pursuing this "reject all prior commands" approach.
This is the last chance for anyone else to express support for
this reject-based approach.

If we were to address the original issue, I think some sort
of new mechanism is needed at the iSCSI level, as the
attempts to use a side effect of SCSI task aborts don't
seem to be getting much of anywhere (e.g., a previous
attempt at this involved having iSCSI automatically send
task management commands multiple times).  A SWAG at what
such a mechanism might look like is some sort of iSCSI
"Cancel" that names the CmdSN(s) to be cancelled and sent
in the same PDU as the SCSI Task Abort or Task Set Abort
that is supposed to abort them.  Canceling would apply
both to commands waiting for CmdSN in iSCSI as well as
those not yet received by the target (i.e., the target
has to have some notion of "pending cancel" for commands
it hasn't received yet).  The cancelled commands
probably have to return some sort of iSCSI service error
rather than SCSI sense data.  Something like this looks
like it can be made to work as desired (issue TASK 
ABORT with the corresponding Cancel for immediate delivery,
and if SCSI declares the TASK ABORT to have succeeded,
then that command will not subsequently execute), but it'll
add complexity.  Is that worth the benefit of being able
to reach out and definitively strangle an errant command?
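
A toy Python model of what such a "pending cancel" could look like on the
target side. Everything here is hypothetical: the class and method names
are invented for the sketch, no such Cancel exists in any iSCSI draft, and
the choice to let a cancelled CmdSN count as consumed (so later commands
are no longer held behind it) is an assumption of the sketch rather than
something proposed above.

```python
class TargetCmdQueue:
    """Toy target: commands reach the SCSI layer strictly in CmdSN order."""

    def __init__(self, first_cmdsn: int = 1):
        self.expected = first_cmdsn  # next CmdSN to hand to the SCSI layer
        self.waiting = {}            # CmdSN -> command held behind a gap
        self.pending_cancel = set()  # cancelled CmdSNs that have not arrived yet
        self.skipped = set()         # cancelled CmdSNs the sequencer may step over
        self.delivered = []          # commands passed to the SCSI target layer
        self.cancelled = []          # commands answered with an iSCSI-level error

    def cancel(self, cmdsns):
        """Hypothetical Cancel naming CmdSNs, whether or not they have arrived."""
        for sn in cmdsns:
            if sn in self.waiting:
                self.cancelled.append(self.waiting.pop(sn))
            else:
                self.pending_cancel.add(sn)  # remember it for a late arrival
            self.skipped.add(sn)             # assumption: the CmdSN is consumed
        self._drain()

    def receive(self, cmdsn: int, command: str):
        if cmdsn in self.pending_cancel:
            # The command was cancelled before it got here: never execute it.
            self.pending_cancel.discard(cmdsn)
            self.cancelled.append(command)
            return
        self.waiting[cmdsn] = command
        self._drain()

    def _drain(self):
        """Deliver in order, stepping over CmdSNs that were cancelled."""
        while True:
            if self.expected in self.waiting:
                self.delivered.append(self.waiting.pop(self.expected))
            elif self.expected in self.skipped:
                self.skipped.discard(self.expected)
            else:
                break
            self.expected += 1


q = TargetCmdQueue(first_cmdsn=1)
q.receive(2, "WRITE B")   # held: CmdSN 1 is missing
q.cancel([1, 2])          # cancel the held command and the one still in flight
q.receive(3, "READ C")    # no longer blocked behind the gap
q.receive(1, "WRITE A")   # late arrival of a cancelled command
print(q.delivered, q.cancelled)
```

The extra state (two sets besides the reorder queue) gives a feel for the
added complexity the paragraph above is weighing.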

--David

> -----Original Message-----
> From:	Eddy Quicksall [SMTP:Eddy@quicksall.com]
> Sent:	Friday, April 13, 2001 6:40 PM
> To:	Ips@Ece. Cmu. Edu (E-mail)
> Subject:	aborting an out of sequence cmdSN
> 
> 
> If a command is received out of sequence, it is not passed to the SCSI
> Target Layer. So, it must be queued waiting for the prior command. Now, if
> the prior command never comes the operating system may attempt to abort
> the lost command by sending an abort to the driver.
> 
> The abort can't be turned into an ABORT TASK TMF because the target iSCSI
> has not acknowledged that it has received the original command yet. And,
> even if the ABORT TASK was sent, it can't get to the SCSI Target Layer
> because its cmdSN will be higher than the most recent command.
> 
> The iSCSI initiator driver could just jerk it out of its queue (knowing
> that it hasn't been acknowledged yet) but then the iSCSI target may still
> execute it when the missing PDUs arrive.
> 
> Does iSCSI have a way to abort the commands that are in the target but not
> yet in the SCSI Target Layer?
> 
> Eddy
> 


From owner-ips@ece.cmu.edu  Tue Apr 17 07:01:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id HAA05245
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 07:01:28 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3H8elO20934
	for ips-outgoing; Tue, 17 Apr 2001 04:40:47 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3H8dwr20854
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 04:39:58 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id KAA99644
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 10:39:49 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id KAA188392
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 10:39:48 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A31.002F9521 ; Tue, 17 Apr 2001 10:39:43 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A31.002F930E.00@d12mta02.de.ibm.com>
Date: Tue, 17 Apr 2001 10:44:47 +0200
Subject: iSCSI & Linked Commands
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

Sorry to interrupt this captivating thread. Why do you think linked
commands won't work?

Julo

Santosh Rao <santoshr@cup.hp.com> on 16/04/2001 20:13:04

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Douglas Otis <dotis@sanlight.net>
cc:   Ips <ips@ece.cmu.edu>
Subject:  Re: iSCSI:flow control, acknowledgement, and a deterministic
      recovery




Doug,

You seem to be referring to linked commands as a case wherein the
approach of Abort Task will not flush stale PDUs.

Linked Commands cannot work the way SCSI implementations are defined
today, since linked commands require the initiator task tag (I_T_L_x
nexus identifier in SAM-2 Execute Command terminology) to be generated
by the SCSI ULP. However, in practice, the Initiator Task Tag (or the FC
OX_ID) is typically generated in the SCSI LLP (or in some cases in the
adapter firmware). IOW, there is no common reference handle like the
task tag sent down from the ULP that allows for association of multiple
commands to a task in several/most implementations today.

When this is fixed up to get linked commands to work [& there exist
examples of its usage], there is no reason connection allegiance could
not be applied to all the commands within the task.

I fail to see why you think Abort Task will not work with sequential
devices (?).

- Santosh



Douglas Otis wrote:
>
> Santosh,
>
> I see a few problems with this approach.  Tasks as defined in iSCSI do
> not maintain connection allegiance.  The driver binds all SCSI commands to
> their connection for the most recent association.  Although there are
> several places within the iSCSI proposal that make reference to a task
> having a connection allegiance, this is in error.  Commands and not tasks
> carry such allegiance.  Your recovery scheme will not allow a satisfactory
> recovery with a sequential device.  In this case, repeating the command is
> not a solution.  As a result, if one connection falters, it will become a
> difficult situation.  In addition, you have no clue from iSCSI as to your
> delivery status.  You do not know if you are waiting for the target or if
> you are waiting for the connection.  Some sequential devices have rather
> long time-outs with these complications of deducing status created by the
> multiple connections.
>
> The application will not know about these connection allegiance problems.
> The iSCSI layer does not define interaction to provide additional
> application status to allow these applications to respond in a manner
> that may aid this situation nor should such additional information be
> required.  With your scheme the SCSI driver must examine the content of
> these commands to make a guess as to the connection allegiance
> assignments.  Now the driver is expected to understand what the intended
> action is of this SCSI management command.  What signal is used to
> indicate a need for the iSCSI immediate treatment?  The only obvious seems
> to be the task attribute argument.  With the way iSCSI has defined iSCSI
> immediate, I would expect those commands to be treated in a LIFO rather
> than the normal FIFO fashion.
>
> Doug
>
> > Douglas Otis wrote:
> > >
> > > With multiple connections, if you are not going to use a valid
> > > CmdSN, or in your case a null CmdSN for all commands, then there
> > > would be a need to include a timestamp to meet a timely delivery
> > > requirement in the same manner as used in FC encapsulation.  IP
> > > can deliver over any time period.  A command could arrive at any
> > > time with respect to other connections.  With all of your feedback
> > > now from just the SCSI layer, the SCSI layer is likely to have timed
> > > out and restarted and now stray commands finally make an appearance
> > > (the technician re-inserted the cable).  What did that do?  Yes,
> > > if this were on a single connection, then TCP could provide some
> > > assurances, (ignoring digests errors) but you must not make that
> > > assumption nor can you assume all disruptions are symmetric.
> >
> > Doug,
> >
> > The below snippet from my last mail answered your above concern. The
> > Abort Task is sent on the same connection as the command. (connection
> > allegiance applied to the abort task as well). The Abort task pushes
> > the stale data PDUs. There is no need for a timestamp on iSCSI PDUs.
> >
> > > > As for your second concern regarding I/O timeouts, there is no
> > > > need for any timestamp. An I/O timeout is dealt with by an Abort
> > > > Task. The abort task response guarantees that the abort reached
> > > > the target and pushed all intermediate stale frames. Failure to
> > > > complete Abort Task leads to higher level error recovery (ex :
> > > > Logout, or some higher form of task mgmt).
> >
> > - Santosh





From owner-ips@ece.cmu.edu  Tue Apr 17 09:24:25 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id JAA07174
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 09:24:20 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HB9pj08251
	for ips-outgoing; Tue, 17 Apr 2001 07:09:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HB91r08223
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 07:09:01 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id NAA347646;
	Tue, 17 Apr 2001 13:08:50 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id NAA74918;
	Tue, 17 Apr 2001 13:08:50 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A31.003D3B0E ; Tue, 17 Apr 2001 13:08:48 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
cc: ralphoweber@compuserve.com
Message-ID: <C1256A31.003D3A23.00@d12mta02.de.ibm.com>
Date: Tue, 17 Apr 2001 12:30:51 +0200
Subject: iSCSI linked commands
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk




Doug,

I think you would want to go back to SAM.  Linked commands are broken by any
"irregularity" in execution.
The basic assumption is that the initiator is in charge of shipping linked
commands - one-by-one.
I assume that for high latency links they won't be very popular.

At a very early stage (about 2 years ago) we contemplated the idea of
"prefetching" linked commands and having the target
effect the serialization. We would have had to come up with a way of
conveying to the initiator which command broke the chain (if it broke) or
caused a unit attention (if it caused one) and it was not at all clear that
this was "in the spirit of SAM".
There were also more esoteric issues with later commands getting modified
by execution of prior commands etc. -:).

The 360 channels had the same class of issues.

I assume that T10 folks went over these issues many times.

Julo

"Douglas Otis" <dotis@sanlight.net> on 16/04/2001 23:41:49

Please respond to "Douglas Otis" <dotis@sanlight.net>

To:   "Santosh Rao" <santoshr@cup.hp.com>
cc:   "Ips" <ips@ece.cmu.edu>
Subject:  RE: iSCSI:flow control, acknowledgement, and a deterministic
      recovery




Santosh,

The iSCSI proposal ver 5-91 explicitly defines tasks and also includes the
option to allow linked commands to be sent across different connections.
Obviously, for sequential devices, particular attention must be paid to
command serialization as these commands tend to use relative addressing or
are dependent upon the successful completion of prior commands.  This
requirement is not helped with Auto-Sense and impels the need for a target
model change in SCSI.  An error injected into the SCSI layer as a result of
a network communication error will significantly reduce the utility of most
backup applications.  Such reliance on the SCSI layer to recover from such
uncertainty imposed as a result of the inability of the network transport
to do minimal handshakes and retries is the wrong approach.  Regardless, you
are burdening the driver with the duty of tracking a transient connection
allegiance status.  The latest version has improved language with the
exception of Pg. 16 in two places.

Ver 5-91
Pg. 15
"Connection allegiance is strictly per-command and not per-task."

Pg. 16
"tasks that have allegiance to the connection"
"all outstanding tasks that have allegiance to the connection to conclude
and send their status."

Doug

> Doug,
>
> You seem to be referring to linked commands as a case wherein the
> approach of Abort Task will not flush stale PDUs.
>
> Linked Commands cannot work the way SCSI implementations are defined
> today, since linked commands require the initiator task tag (I_T_L_x
> nexus identifier in SAM-2 Execute Command terminology) to be generated
> by the SCSI ULP. However, in practice, the Initiator Task Tag (or the FC
> OX_ID) is typically generated in the SCSI LLP (or in some cases in the
> adapter firmware). IOW, there is no common reference handle like the
> task tag sent down from the ULP that allows for association of multiple
> commands to a task in several/most implementations today.
>
> When this is fixed up to get linked commands to work [& there exist
> examples of its usage], there is no reason connection allegiance could
> not be applied to all the commands within the task.
>
> I fail to see why you think Abort Task will not work with sequential
> devices (?).
>
> - Santosh
>
> Douglas Otis wrote:
> >
> > Santosh,
> >
> > I see a few problems with this approach.  Tasks as defined in
> > iSCSI do not maintain connection allegiance.  The driver binds all
> > SCSI commands to their connection for the most recent association.
> > Although there are several places within the iSCSI proposal that
> > make reference to a task having a connection allegiance, this is
> > in error.  Commands and not tasks carry such allegiance.  Your
> > recovery scheme will not allow a satisfactory recovery with a
> > sequential device.  In this case, repeating the command is not a
> > solution.  As a result, if one connection falters, it will become a
> > difficult situation.  In addition, you have no clue from iSCSI as to your
> > delivery status.  You do not know if you are waiting for the target
> > or if you are waiting for the connection.  Some sequential devices
> > have rather long time-outs with these complications of deducing
> > status created by the multiple connections.
> >
> > The application will not know about these connection allegiance
> > problems. The iSCSI layer does not define interaction to provide
> > additional application status to allow these applications to respond
> > in a manner that may aid this situation nor should such additional
> > information be required.  With your scheme the SCSI driver must
> > examine the content of these commands to make a guess as to the
> > connection allegiance assignments.  Now the driver is expected to
> > understand what the intended action is of this SCSI management
> > command.  What signal is used to indicate a need for the iSCSI
> > immediate treatment?  The only obvious seems to be the task attribute
> > argument.  With the way iSCSI has defined iSCSI immediate, I
> > would expect those commands to be treated in a LIFO rather than the
> > normal FIFO fashion.
> >
> > Doug








From owner-ips@ece.cmu.edu  Tue Apr 17 09:24:36 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id JAA07190
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 09:24:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HB9sK08256
	for ips-outgoing; Tue, 17 Apr 2001 07:09:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HB95r08225
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 07:09:06 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id NAA348520
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 13:08:58 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id NAA112552
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 13:08:58 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A31.003D3E73 ; Tue, 17 Apr 2001 13:08:56 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A31.003D3D6C.00@d12mta02.de.ibm.com>
Date: Tue, 17 Apr 2001 12:54:43 +0200
Subject: Re: TCP checksum escapes and iSCSI error recovery design
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Mallikarjun,

I completely agree.  That is why I insisted on having the "must recover"
removed from the requirements doc.
However we should make it possible for those that are willing to make the
extra effort (and pay for it) to recover a transcontinental backup session
without blowing it up.

The one additional point we should consider is the impact iSCSI
gateways/proxies have on recovery as they are one of the reasons why we have
segment checksums that go beyond the "pure transport" end-to-end error
detection and forced us into iSCSI end-to-end mode.

Regards,
Julo

"Mallikarjun C." <cbm@rose.hp.com> on 17/04/2001 01:46:21

Please respond to cbm@rose.hp.com

To:   ips@ece.cmu.edu
cc:
Subject:  Re: TCP checksum escapes and iSCSI error recovery design




Let me add some comments to this excellent analysis from Jim -
in particular what this means to the ERT work in progress.

Grossly, there seem to be three different camps of thinking (with
possible finer variations) -

A. trust TCP implicitly: checksum is adequate, no iSCSI digests.
B. trust TCP explicitly: verify TCP, and signal a service delivery
   subsystem failure to SCSI on a CRC (tear up the session) since
   it happens so very rarely.
C. verify and recover: verify TCP, and do a full recovery as needed
   since TCP checksum escapes happen in a random, non-deterministic
   way (maybe frequently at times).
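
The three camps differ only in what a receiver does with the digest. A
minimal Python sketch, with several assumptions: zlib.crc32 stands in for
whatever digest iSCSI ends up using, and the enum, the function name and
the returned action strings are invented for the illustration.

```python
import zlib
from enum import Enum

class Camp(Enum):
    TRUST_IMPLICITLY = "A"    # rely on the TCP checksum, no iSCSI digest
    VERIFY_AND_FAIL = "B"     # verify, tear up the session on a mismatch
    VERIFY_AND_RECOVER = "C"  # verify, recover the damaged PDU

def on_pdu(camp: Camp, payload: bytes, digest: int) -> str:
    """Return the action a receiver in the given camp takes for one PDU."""
    if camp is Camp.TRUST_IMPLICITLY:
        return "deliver"          # any corruption goes unnoticed
    if zlib.crc32(payload) == digest:
        return "deliver"
    if camp is Camp.VERIFY_AND_FAIL:
        return "fail-session"     # report a service delivery failure to SCSI
    return "recover-pdu"          # keep the session, redo only this PDU

sent = b"block of data"
digest = zlib.crc32(sent)
damaged = b"block of dat\x00"     # one byte corrupted in transit

for camp in Camp:
    print(camp.name, on_pdu(camp, damaged, digest))
```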

The attached discussion seems to indicate that the truth is somewhere
between camps B and C.  In the absence of other conclusive evidence
placing us fully in camp B, the recourse is to crisply and unambiguously
specify the error recovery mechanisms while ensuring interoperability
with and minimal penalty for, camp B.  Speaking for the Error Recovery
Team, this for now continues to be the operating assumption of ERT as
we specify algorithms, trim options, and work on the wording.
--
Mallikarjun


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668   Hewlett-Packard, Roseville.
cbm@rose.hp.com


>All,
>In designing the iSCSI error recovery mechanisms there has been
>considerable focus on general deficiency of the TCP checksum.  It seems
>that what iSCSI error recovery should really be based upon is the overall
>profile of TCP checksum escapes for the networks that will carry iSCSI
>traffic ("checksum escapes" being defined as those cases where corrupted
>data is passed through TCP and delivered to the upper layer protocol as
>good).  The design of iSCSI error recovery needs to start with a clear and
>agreed upon set of assumptions regarding the profile for TCP checksum
>escapes across space and time. Thus, the question is not simply "How often
>is corrupted data passed through TCP and delivered to iSCSI", but more
>along the lines of "What is the distribution of TCP checksum escapes
>across network paths and through time?"
>
>iSCSI error recovery should be neither over-designed nor under-designed,
>and will be fundamentally different if the assumption is that checksum
>escapes or bursts of such occur every 5 minutes or 5 hours or 5 days, and
>if the assumption is that there are a few bad paths or that all paths are
>bad sometimes.
>
>It would make sense for iSCSI to have minimal error recovery if the
>spatial profile (discussed below) of TCP errors were bi-modal (i.e. there
>are good paths and bad paths). IOW, if the checksum escapes happen
>deterministically always or not at all, iSCSI can simply drop the
>connection on a digest failure because the operation switched into error
>mode.  OTOH, we cannot adopt this simplistic approach if TCP can be in the
>shades of gray between the two modes.  The attached email discussion
>indicates that load levels and memory usage models along the path play a
>key role in determining this aspect. Employing an iSCSI-level digest/CRC
>appears to be the wise approach regardless of the TCP operational model
>(bimodal or otherwise), to detect escapes when they happen. IOW, the
>principle of "trust but verify" is apt to be applied here. The question is
>what is the appropriate action for iSCSI to take if the verification
>detects data corruption.
>
>To start this discussion toward a set of assumptions regarding TCP
>checksum escape profiles and an appropriate iSCSI error recovery design,
>this email includes responses from several individuals working in this
>area (Vern Paxson, Craig Partridge, Jonathan Stone), along with links to
>their original papers.
>
>Regards,
>Jim Wendt
>Networked Storage Architecture
>Network Storage Solutions Organization
>Hewlett-Packard Company / Roseville, CA
>Tel: +1 916 785-5198
>Fax: +1 916 785-0391
>Email: jim_wendt@hp.com <mailto:jim_wendt@hp.com>
>
>
>----------------------------------------------------------
>Here is a copy of my original email to Vern Paxson, Craig Partridge,
>Jonathan Stone, and Jeff Chase:
>
>Hi All,
>I'm e-mailing this to Vern, Craig, Jonathan, and Jeff in hopes of
>gathering information around TCP checksum failure profiles and Internet
>data failure characteristics in general.  I'm Jim Wendt and I've recently
>started working on both the iSCSI error recovery and RDMA/TCP-Framing
>activities. I'm emailing you directly because of your specific research
>and experience regarding TCP checksum failures and behavior.
>
>My immediate interest is in regards to TCP checksum failure rates
>(undetected errors) and how iSCSI should handle these undetected errors.
>As you probably already know, the iSCSI protocol is a mapping of SCSI
>device protocols onto TCP (or SCTP eventually).  I have read the three
>papers (When the CRC and TCP Checksum Disagree, Performance of Checksums
>and CRCs over Real Data - both versions, End-to-End Internet Packet
>Dynamics).
>
>The iSCSI WG has come to the conclusion that the TCP checksum provides an
>insufficient level of data protection, and that a CRC-32 will be used for
>data integrity verification on each iSCSI PDU (the specific CRC polynomial
>to employ is being studied now). Thus, the assumption is that if corrupted
>payload data does pass through TCP and is undetected by the TCP checksum,
>then the corruption will be detected at the iSCSI PDU level via the CRC-32
>check.
>
>What we are considering now is what level and philosophy of error handling
>and recovery is put into iSCSI. It seems to me that fundamentally
>different error handling behaviors would be put into iSCSI based on a
>known rate or profile of occurrence of bad PDUs (those PDUs that are
>passed up through TCP as being OK, but for which the iSCSI level CRC-32
>indicates a data error). Thus, iSCSI error handling might be defined
>differently if the expected PDU error rate is one every 5 minutes as
>opposed to one every 5 hours (or 5 days). Also, the error handling might
>be different if errors are bursty (i.e. a sequence of bad PDUs) rather
>than evenly spread through time.
>
>I would appreciate hearing your thoughts (and any supporting data
>references I could employ) regarding the nature of TCP checksum failure
>profiles across time and space (i.e. network paths).   More specifically:
>
>* How does the Internet work in the face of TCP error escapes?  If there
>truly is a relatively high level of application level data corruption
>(perhaps 1 in 16M packets), then how does the Internet and associated
>applications manage to function?
>
>* What is the distribution of errors across network paths?  Are most
>network paths very good (low TCP error escapes) with only a few paths
>that are really bad, or are TCP error escapes more evenly distributed
>across all network paths?  If the spatial profile is that there are good
>and bad paths, then do the error escape rates on these two classes of
>paths correspond to the high/low bounds for TCP error escapes (1 in 16
>million / 1 in 10 billion)?
>
>* What is the distribution of errors through time?  Do TCP error escapes
>occur individually and randomly through time, or are TCP error escapes
>more bi-modal where most of the time there are no errors and occasionally
>there is a clump or burst of TCP error escapes?  If the temporal profile
>for TCP error escapes is that there are good periods and infrequent but
>severe bad periods, then what is the duty cycle for these periods (how
>bad/good for how long), and what are the error escape rates during these
>periods?
>
>* What about a stronger TCP checksum?  I don't believe that anyone ever
>actually employed RFC1146 (TCP Alternate Checksum).  Has there been any
>recent thinking about actually improving TCP's end-to-end data integrity
>checking?  I suppose that the existing middle box infrastructure won't
>allow for this. However, I'm considering submitting a draft and starting
>to push for a stronger TCP checksum or CRC, but I would like to get
>feedback from all of you on the technical feasibility and possible
>acceptance of this proposal before taking it to the public forums.
>
>* What about a stronger SCTP checksum? SCTP currently uses the Adler-32
>checksum. Perhaps an optional stronger CRC-32 should be defined for SCTP.
>Has there been any thinking in this direction? Again, I am considering
>pursuing this with the SCTP folks but would appreciate any feedback you
>have to offer first.
>
>Thanks,
>Jim
>
>Jim Wendt
>Networked Storage Architecture
>Network Storage Solutions Organization
>Hewlett-Packard Company / Roseville, CA
>Tel: +1 916 785-5198
>Fax: +1 916 785-0391
>Email: jim_wendt@hp.com <mailto:jim_wendt@hp.com>
>
>
>----------------------------------------------------------
>Craig Partridge writes:
>
>Hi Jim:
>
>Here's my quick set of answers, which Vern, Jonathan and others can
>refine.
>
>>* How does the Internet work in the face of TCP error escapes?  If there
>>truly is a relatively high level of application level data corruption
>>(perhaps 1 in 16M packets), then how does the Internet and associated
>>applications manage to function?
>
>The answer is that applications appear to function because people don't
>notice the errors or resort to backups, or whatever.  It isn't clear
>exactly how we're managing to survive.  (Back in the old days when NFS
>had these problems, people used to assume a disk failure had occurred when
>in fact the network had trashed their data).
>
>>* What is the distribution of errors across network paths?  Are most
>>network paths very good (low TCP error escapes) with only a few paths
>>that are really bad, or are TCP error escapes more evenly distributed
>>across all network paths?  If the spatial profile is that there are good
>>and bad paths, then do the error escape rates on these two classes of
>>paths correspond to the high/low bounds for TCP error escapes (1 in 16
>>million / 1 in 10 billion)?
>
>The answer is that there are good and bad paths.  On good paths you'll
>probably see fewer escapes than 1 in 10 billion -- you can probably treat
>them as essentially error free.  If you start seeing a modest number of
>errors, then either (a) there's a broken router in your path or (b)
>there's a broken end system.  If we had a better way of identifying the
>source of errors, I'd say that if you ever see an error from a host,
>declare it broken.
>
>>* What is the distribution of errors through time?  Do TCP error escapes
>>occur individually and randomly through time, or are TCP error escapes
>>more bi-modal where most of the time there are no errors and occasionally
>>there is a clump or burst of TCP error escapes?  If the temporal profile
>>for TCP error escapes is that there are good periods and infrequent but
>>severe bad periods, then what is the duty cycle for these periods (how
>>bad/good for how long), and what are the error escape rates during these
>>periods?
>
>They're not random but I don't think we know enough about the timing to
>say.  My hunch is that a lot are based on end system load (that high loads
>on network cards tend to encourage certain classes of bus errors).
>
>>* What about a stronger TCP checksum?  I don't believe that anyone ever
>>actually employed RFC1146 (TCP Alternate Checksum).  Has there been any
>>recent thinking about actually improving TCP's end-to-end data integrity
>>checking?  I suppose that the existing middle box infrastructure won't
>>allow for this. However, I'm considering submitting a draft and starting
>>to push for a stronger TCP checksum or CRC, but I would like to get
>>feedback from all of you on the technical feasibility and possible
>>acceptance of this proposal before taking it to the public forums.
>
>That's a tricky question and not one I've thought enough about to have
>a strong view -- except to say that it isn't all the checksum's fault --
>Jonathan's done work showing that putting the checksum at the end
>dramatically improves its efficacy in certain situations.
>
>>* What about a stronger SCTP checksum? SCTP currently uses the Adler-32
>>checksum. Perhaps an optional stronger CRC-32 should be defined for SCTP.
>>Has there been any thinking in this direction? Again, I am considering
>>pursuing this with the SCTP folks but would appreciate any feedback you
>>have to offer first.
>
>There's considerable reason to believe that Adler-32 is a lousy checksum --
>it uses more bits (32 vs. 16) and will probably detect fewer errors than
>either Fletcher-16 or the TCP checksum.  If one is going to invest energy
>in fixing a checksum, fixing SCTP's checksum is probably the first
>priority.
>
>Craig
>
>----------------------------------------------------------
>Vern Paxson writes:
>
>Just a couple of follow-on points:
>
>    - I suspect the Internet survives because the bulk of the
>      traffic isn't all that critical (Web pages, particularly
>      large items like images and perhaps video clips), so when
>      they're corrupted, nothing breaks in a major way.
>
>    - One point to consider is that if you use IPSEC, then you
>      get very strong protection from its integrity guarantees
>
>- Vern
>
>----------------------------------------------------------
>Jonathan Stone writes:
>
>In message <200104092135.f39LZVS21092@daffy.ee.lbl.gov>Vern Paxson writes
>
>Jim,
>
>Craig's answer is an excellent summary.  Fixing the SCTP checksum is a
>priority; I have a chapter in my thesis addressing some of the issues
>there, and I believe Craig and I plan to write a paper.
>Stronger TCP checksums are an interesting idea, but the lag on
>deploying new TCP features seems to be  5 years or more.
>
>
>If I re-did the study, I would try and log headers of ``good'' packets
>(those where the checksum matches) as well as the entire packets with
>checksum errors.  With only data on the `bad' packets (which is partly
>to meet privacy concerns, which were a big hurdle for us),
>it's hard to give good answers to your questions about the time or path
>dependency of bad checksums.
>
>One thing we can say is that there appear to be certain `bad' hosts
>which emit very high rates of packets with incorrect checksums.  One
>host we noticed in the Stanford trace sent 2 packets every second for
>around three hours, totalling about 20% of the total packets in that
>trace.  We can't say whether that's a host problem or a path problem;
>but either way that error rate would worry me, if I were using I-SCSI
>on that host (or path).  That said, we did find what looks to be a
>router with a bad memory bit, which is clearly path-dependent
>(though hitting that particular memory word may be time- and
>load-dependent as well).
>
>Further, some of the bad checksums appear to be due to software
>bugs in specific OS releases.  As the Internet evolves (old OS revs
>replaced by newer ones), the rate of those specific errors will
>evolve over time.
>
>
>Last, the DORM trace in our SIGCOMM 2000 paper is the only trace where
>our monitoring point was directly adjacent to end-hosts, with no
>intervening IP routers.  On that Ethernet segment, I saw about 1 in
>80,000 packets with a valid frame CRC, but a bad IP checksum --
>often a deletion of a 16-bit word from the IP address.  I tried
>to `repair' some  IP headers (e.g., by inserting what seemed
>to be a missing 16 bits of an IP source address).
>After the repair, the IP checksum seemed to be correct.
>
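The header repair described above can be illustrated with a short sketch.
This is not the procedure used in the study: the addresses, the header
field values, the two payload bytes that slide in, and the helper names
internet_checksum and build_header are all assumptions made up for
illustration. The sketch builds an IPv4 header, deletes one 16-bit word
from the source address so that the following bytes shift forward, and
shows the one's-complement checksum (RFC 1071 style) failing until the
missing word is put back.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """Complemented one's-complement sum of 16-bit words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(src: bytes, dst: bytes) -> bytes:
    """Hypothetical 20-byte IPv4 header with a correct header checksum."""
    # version/IHL, TOS, total length, ID, flags/fragment, TTL, protocol, checksum=0
    head = struct.pack("!BBHHHBBH", 0x45, 0, 40, 0x1234, 0, 64, 6, 0) + src + dst
    checksum = internet_checksum(head)
    return head[:10] + struct.pack("!H", checksum) + head[12:]

header = build_header(bytes([192, 0, 2, 1]), bytes([198, 51, 100, 7]))
header_ok = internet_checksum(header) == 0   # a valid header verifies to zero

# Drop the first 16-bit word of the source address (bytes 12-13); two
# made-up payload bytes slide into the end of the 20-byte header window.
damaged = header[:12] + header[14:] + b"\xaa\xbb"
damaged_ok = internet_checksum(damaged) == 0

# "Repair" by reinserting the missing word. Here the word is known; in a
# real trace it would have to be guessed from other packets.
repaired = damaged[:12] + header[12:14] + damaged[12:18]
repaired_ok = internet_checksum(repaired) == 0

print(header_ok, damaged_ok, repaired_ok)
```

Because the deletion shifts data by a whole 16-bit word, the checksum
only notices that one word was swapped for another; that is also why
reinserting the right word makes the sum come out correct again.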
>That points to a problem occurring after the sending IP layer computes
>its IP header checksum, but before the frame-level CRC is computed.
>That suggests an error either in the NIC device driver, in its DMA
>engine, or somewhere on the NIC itself (dropped from a FIFO?).
>
>That led Craig and me to suggest that where possible, checksums be
>computed early and verified late.  That's a caution about a potential
>downside of outboard checksumming: it covers less of the path to an
>application buffer than is covered by software checksums.  Software
>checksums can catch errors which occur between the driver and the NIC
>(bus errors, DMA, FIFO overruns, what-have-you); but outboard hardware
>checksums do not.  (IEN-45 mentions this issue, but the two-pass
>scheme suggested there doubles bus utilization for a given I/O rate.)
>
>Putting middle boxes in the path just exacerbates this.
>
>Whether or not to use outboard checksumming is entirely up to
>NIC designers. We merely raise the issue that it checks less of
>the datapath than is covered by software checksums.
>
>
>>Just a couple of follow-on points:
>>
>>   - I suspect the Internet survives because the bulk of the
>>     traffic isn't all that critical (Web pages, particularly
>>     large items like images and perhaps video clips), so when
>>     they're corrupted, nothing breaks in a major way.
>>
>>   - One point to consider is that if you use IPSEC, then you
>>     get very strong protection from its integrity guarantees
>
>For a software IPSEC implementation.
>
>A hardware implementation with outboard crypto hardware could
>potentially fall foul of the same kinds of local-to-source-host
>errors, (DMA errors or whatever the ultimate cause is) which our data
>indicates some NICs suffer from.  If the data has already been
>curdled by the time the encrypting accelerator sees it,
>I don't see how IPsec actually enhances integrity.
>
>----------------------------------------------------------
>Vern Paxson writes:
>
>> > - One point to consider is that if you use IPSEC, then you
>> >   get very strong protection from its integrity guarantees
>>
>> For a software IPSEC implementation.
>
>Good point!
>
>         Vern
>
>----------------------------------------------------------
>Craig Partridge writes:
>
>I'd like to clarify my statement to Jim about Adler-32 to be a bit more
>clear about what the issues are.   I've also added Mark Adler to the cc
>list as I'd promised Mark to get him results when we had them, and while
>the results aren't quite cooked, if Jim is going to circulate a
>discussion, Mark should see it first.
>
>If you're designing a checksum, there are certain features you'd like it
>to have.  Here's a starting point of a formal definition.  Given the set
>V of all possible bit vectors and a checksum function C(), what we'd like
>is:
>
>    prob(C(v1) == C(v2)) is 1/2**(sizeof C())
>
>that is given any two v1, v2 being different elements of V, the chance
>that their checksum will collide is the best possible, namely 1 over 2
>raised to the power of the bitwidth of the result of C().
>
>Three sub points:
>
>    1. This is not quite the same as what the cryptographic checksum folks
>    want.  They actually want it to be very hard [for some computational
>    approximation of hard], given C(v1) to find a v2 such that
>    C(v1) == C(v2).  For network checksums, we don't care as we're
>    protecting from errors, not attacks.
>
>    2. If we do not pick v1 and v2 at random, but according to some
>    distribution rule of likely packet sizes, packet contents, etc, we'd
>    still like the equation to be true.  We don't want to be vulnerable
>    to certain traffic patterns, etc.
>
>    3. You can compare the effectiveness of checksums by how close they
>    come to this ideal -- that is, how effectively do they use their
>    range of values?
>
>OK, so let's talk about Adler-32.  Adler-32 is a neat idea -- it seeks to
>improve the performance of the Fletcher checksum by summing modulo a prime
>number, rather than 255 or 256.
>
>However, it sums bytes (8-bit quantities) into 16-bit fields.  As a
>result, the high bits of the 16-bit fields take some time to fill (they
>only get filled by propagating carries from lower bits) and until the
>packet is quite big (thousands of bytes) you don't get enough mixing in
>the high bits.  Two problems: (a) we're not fully using the 16-bit width,
>so for smaller packets the chance of collision is much greater than
>1/2**sizeof(C) simply because some bits in the checksum are always (or
>with very high probability) set to 0; and (b) it looks like (and
>Jonathan's still working on this) that the law of large numbers will
>cause the values to cluster still further [I think of this behavior as
>the result of just looking at all the bits instead of just the low order
>bits mod a prime, we're vulnerable to the fact that the sums are not
>evenly distributed, by the law of large numbers]
>
>We're still working on whether the core idea behind Adler-32 (namely
>working modulo a prime) is as powerful as it seems, but it is clear that
>to have a hope of making it comparable to the TCP checksum or Fletcher,
>you have to sum 16-bit quantities into 16-bit fields.
>
>Craig
>
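Craig's point (a) about unused high bits can be checked with a small
sketch. It uses zlib.adler32 from the Python standard library; the helper
names adler_halves and guaranteed_zero_high_bits and the chosen input
lengths are assumptions for illustration. The sketch only demonstrates
the hard bound on the byte-sum half of the checksum for short inputs. It
says nothing about the clustering in point (b), which the thread
describes as still under study.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16, the Adler-32 modulus

def adler_halves(data: bytes) -> tuple[int, int]:
    """Split zlib's Adler-32 into (byte sum A, running sum B)."""
    value = zlib.adler32(data)
    return value & 0xFFFF, value >> 16

def guaranteed_zero_high_bits(length: int) -> int:
    """High bits of A that can never be set for an input of `length` bytes."""
    largest = 1 + 255 * length  # A starts at 1 and each byte adds at most 255
    if largest >= MOD_ADLER:
        return 0  # the sum can wrap around the modulus, so no bit is pinned
    return 16 - largest.bit_length()

# All-0xFF inputs give the largest possible byte sum for each length.
for length in (16, 40, 128, 1480):
    byte_sum, running_sum = adler_halves(b"\xff" * length)
    print(length, byte_sum, running_sum, guaranteed_zero_high_bits(length))
```

For a 40-byte input the byte sum cannot exceed 1 + 255*40 = 10201, so its
top two bits are always zero and that half of the checksum cannot use its
full 16-bit range.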
>----------------------------------------------------------
>Jonathan writes:
>
>In message <200104101505.f3AF5BZ23145@aland.bbn.com>Craig Partridge writes
>
>>I'd like to clarify my statement to Jim about Adler-32 to be a bit more
>>clear about what the issues are.   I've also added Mark Adler to the cc
>>list as I'd promised Mark to get him results when we had them, and while
>>the results aren't quite cooked, if Jim is going to circulate a
>>discussion, Mark should see it first.
>>
>>If you're designing a checksum, there are certain features you'd like it
>>to have.  Here's a starting point of a formal definition.  Given the set
>>V of all possible bit vectors and a checksum function C(), what we'd like
>>is:
>>
>>   prob(C(v1) == C(v2)) is 1/2**(sizeof C())
>>
>>that is given any two v1, v2 being different elements of V, the chance
>>that their checksum will collide is the best possible, namely 1 over 2
>>raised to the power of the bitwidth of the result of C().
>
>Jim, Craig:
>
>Just to be picky, as I'm working right now on definitions of some of
>these issues for my thesis:
>
>One can list other desirable properties, like wanting each bit of the
>checksum field to have informational entropy 1/2.  Craig's aggregate
>definition falls out from that, with a few extra assumptions.
>
>Also, less formally, desiring each bit of the input data to contribute
>equally to flipping the final state of each output bit.
>That is where Adler32 runs into trouble when given short inputs.
>
>
>>Three sub points:
>>
>>    1. This is not quite the same as what the cryptographic checksum
>>    folks want.  They actually want it to be very hard [for some
>>    computational approximation of hard], given C(v1) to find a v2 such
>>    that C(v1) == C(v2).  For network checksums, we don't care as we're
>>    protecting from errors, not attacks.
>
>There are two versions of formal cryptographic invertibility; the
>other criterion is that it be computationally intractable to find *any*
>v1 and v2 such that C(v1) = C(v2).  Crypto folks would generally like
>both.
>
>
>>    2. If we do not pick v1 and v2 at random, but according to some
>>    distribution rule of likely packet sizes, packet contents, etc, we'd
>>    still like the equation to be true.  We don't want to be vulnerable
>>    to certain traffic patterns, etc.
>>
>>    3. You can compare the effectiveness of checksums by how close they
>>    come to this ideal -- that is, how effectively do they use their
>>    range of values?
>>
>>OK, so let's talk about Adler-32.  Adler-32 is a neat idea -- it seeks to
>>improve the performance of the Fletcher checksum by summing modulo a
>>prime number, rather than 255 or 256.
>>
>>However, it sums bytes (8-bit quantities) into 16-bit fields.  As a
>>result, the high bits of the 16-bit fields take some time to fill (they
>>only get filled by propagating carries from lower bits) and until the
>>packet is quite big (thousands of bytes) you don't get enough mixing in
>>the high bits.  Two problems: (a) we're not fully using the 16-bit width,
>>so for smaller packets the chance of collision is much greater than
>>1/2**sizeof(C) simply because some bits in the checksum are always (or
>>with very high probability) set to 0; and (b) it looks like (and
>>Jonathan's still working on this) that the law of large numbers will
>>cause the values to cluster still further [I think of this behavior as
>>the result of just looking at all the bits instead of just the low order
>>bits mod a prime, we're vulnerable to the fact that the sums are not
>>evenly distributed, by the law of large numbers]
>
>To be fair, and to give the whole picture, there are a couple of
>points here that should be expanded.
>
>The first is that for large input data lengths, we can show that the
>distribution of both 16-bit halves of the Adler-32 sum should actually
>be well distributed.  That holds true for addition mod M, of any
>repeated independent observations of a random variable.  (A proof
>appears in the appendix of the 1998 ToN paper you cited already;
>although  that version of the proof may not formally state
>all the necessary assumptions about independent observations.)
>
>However, for networking in general and SCTP in particular, there are
>fairly modest hard upper bounds on the maximum input length.
>SCTP forbids fragmentation.  For the increasingly-pervasive Ethernet
>frame length of 1500 bytes that means SCTP checksums have no more
>than 1480 input bytes.
>
>The data we have -- and I am still working on it -- say that's too
>short to get good coverage of either of the two sums in SCTP -- the purely
>additive commutative sum, or the higher-order running sum of the
>commutative sum (which is position-dependent).
>
>Craig's description gives a good intuition for the first, commutative
>sum. For the position-dependent sum (the running sum of the first sum),
>another good intuition for the computational modelling I've done is to
>imagine a hash-table where we hash, independently, all possible values
>at all possible offsets in a 1480-byte SCTP packet.  The intuition
>isn't so much "law of large numbers" as that SCTP draws its per-byte
>values from one particular corner of a two-dimensional space (the
>hash-table vs. all possible bytes); so it ends up with an uneven
>coverage of the space of all hash values.
>
>
>>We're still working on whether the core idea behind Adler-32 (namely
>>working modulo a prime) is as powerful as it seems, but it is clear that
>>to have a hope of making it comparable to the TCP checksum or Fletcher,
>>you have to sum 16-bit quantities into 16-bit fields.
>
>Just so the reasoning is clear, another alternative is to sum 8-bit
>inputs into 8-bit accumulators, modulo 251.  Given 32 bits of
>available checksum field, the 16-bit sums are preferable.
>
>Again, this is for short inputs.  For large inputs of several tens of
>Kbytes, the Adler32 sum should do much better (at 64Kbytes it should
>give very uniform coverage).  At those input sizes, the comparison comes
>down to Adler32 having 15 pairs of values which are congruent, mod
>65521, whereas a Fletcher sum with 16-bit inputs would have only one;
>versus better `stirring' from a prime modulus.
>
>I don't know what the distribution of sizes of zlib-compressed files
>is.  If they are generally large, then our work may not be applicable
>to Adler-32 in its original designed purpose.
>
>
>I also don't know the history of why SCTP chose the Adler-32 sum
>rather than, say, CRC-32. The gossip I hear from IETF-going friends in
>the Bay Area is that there was concern about the performance of
>CRC-32; a direct bit-by-bit shift-and-add was seen as too slow.  I
>hear there was also a supposition that an efficient table-lookup
>version would require very large tables (128 kbits?) and that
>tables that large were prohibitive for PDAs and small handheld devices.
>
>I am *not* suggesting that is an accurate report; it probably isn't.
>But if there's any grain of truth in it, both four-bit and eight-bit
>table-lookup algorithms for CRC32 exist.  Table lookup size need not
>be an issue.  Perhaps we should draw the SCTP authors -- C. Sharp? --
>into this discussion as well.
>
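The four-bit and eight-bit table-lookup variants mentioned above can be
sketched as follows. The thread does not settle on a CRC polynomial, so
this sketch assumes the reflected IEEE 802.3 polynomial (the one
implemented by zlib.crc32) purely so the result can be checked against a
known implementation. The names make_table, crc32_nibble and crc32_byte
are illustrative, not taken from any SCTP or iSCSI draft.

```python
import zlib

POLY = 0xEDB88320  # reflected IEEE 802.3 CRC-32 polynomial (an assumption here)

def make_table(bits: int) -> list[int]:
    """Build a lookup table indexed by `bits` input bits at a time."""
    table = []
    for index in range(1 << bits):
        crc = index
        for _ in range(bits):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
        table.append(crc)
    return table

TABLE4 = make_table(4)  # 16 entries x 32 bits = 512 bits of table
TABLE8 = make_table(8)  # 256 entries x 32 bits = 8192 bits of table

def crc32_nibble(data: bytes) -> int:
    """CRC-32 using two 4-bit table lookups per input byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        crc = (crc >> 4) ^ TABLE4[crc & 0xF]
        crc = (crc >> 4) ^ TABLE4[crc & 0xF]
    return crc ^ 0xFFFFFFFF

def crc32_byte(data: bytes) -> int:
    """CRC-32 using one 8-bit table lookup per input byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ TABLE8[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF

sample = b"123456789"
print(hex(crc32_nibble(sample)), hex(crc32_byte(sample)), hex(zlib.crc32(sample)))
```

Under these assumptions the nibble table occupies 512 bits and the byte
table 8192 bits, both well below the 128 kbit figure quoted as gossip
above; the trade-off is two lookups per byte instead of one.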
>----------------------------------------------------------
>Jonathan writes:
>
>A small terminological correction here:
>
>In message <200104101505.f3AF5BZ23145@aland.bbn.com>Craig Partridge writes
>>
>
>[...]
>
>>
>>so for smaller packets the chance of collision is much greater than
>>1/2**sizeof(C) simply because some bits in the checksum are always (or
>>with very high probability) set to 0; and (b) it looks like (and
>>Jonathan's still working on this) that the law of large numbers will
>>cause the values to cluster still further [I think of this behavior as
>>the result of just looking at all the bits instead of just the low order
>>bits mod a prime, we're vulnerable to the fact that the sums are not
>>evenly distributed, by the law of large numbers]
>
>I think it's actually a central limit theorem, not the law of large
>numbers.  For addition modulo M, the "central limit theorem" says that
>summation of many independent identically-distributed random variables
>will tend to a uniform distribution, not a normal distribution
>(as does the weighted sum).
>
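The tendency toward a uniform distribution under addition modulo M can
be seen in a small exact computation. This is only a sketch under
assumed parameters: the modulus 251, the skewed input distribution
(uniform on 0..31 only), and the function names are chosen for
illustration and are not taken from the ToN paper's proof. It convolves
the distribution with itself term by term and reports the total
variation distance from uniform.

```python
MODULUS = 251          # small prime modulus, chosen only to keep the run fast
SUPPORT = range(32)    # skewed input: each term is uniform on 0..31 only

def add_one_term(dist):
    """Distribution of (S + X) mod MODULUS, given the distribution of S."""
    out = [0.0] * MODULUS
    weight = 1.0 / len(SUPPORT)
    for residue, prob in enumerate(dist):
        if prob:
            for value in SUPPORT:
                out[(residue + value) % MODULUS] += prob * weight
    return out

def distance_from_uniform(dist):
    """Total variation distance between dist and the uniform distribution."""
    uniform = 1.0 / len(dist)
    return 0.5 * sum(abs(p - uniform) for p in dist)

def distances(terms):
    """Distance from uniform after summing 1, 2, ..., `terms` terms."""
    dist = [1.0] + [0.0] * (MODULUS - 1)   # the empty sum is 0 with certainty
    result = []
    for _ in range(terms):
        dist = add_one_term(dist)
        result.append(distance_from_uniform(dist))
    return result

history = distances(256)
for terms in (1, 16, 64, 256):
    print(terms, round(history[terms - 1], 6))
```

With few terms the sum is still far from uniform, which matches the
concern about short inputs; the distance shrinks as more terms are added.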
>----------------------------------------------------------
>Jonathan writes:
>
>Jim,
>
>I forwarded my previous message to Chip Sharp.  (is the rfc2960
>address, chsharp@cisco.com, still valid?)
>
>He may wish to comment on the SCTP issues as well.
>
>
>----------------------------------------------------------
>----------------------------------------------------------
>Links to papers:
>
>End-to-End Internet Packet Dynamics, Vern Paxson
><http://citeseer.nj.nec.com/cache/papers2/cs/11598/ftp:zSzzSzftp.ee.lbl.govzSzpaperszSzvp-pkt-dyn-ton99.pdf/paxson97endtoend.pdf>
>
>When the CRC and TCP Checksum Disagree, Jonathan Stone, Craig Partridge
><http://www.acm.org/sigcomm/sigcomm2000/conf/paper/sigcomm2000-9-1.pdf>
>
>Performance of Checksums and CRCs over Real Data, Jonathan Stone, Michael
>Greenwald, Craig Partridge, Jim Hughes
><http://citeseer.nj.nec.com/cache/papers2/cs/1909/ftp:zSzzSzftp.dsg.stanford.eduzSzpubzSzpaperszSzsplice-paper.pdf/performance-of-checksums-and.pdf>
>
>----------------------------------------------------------
>
>
>







From owner-ips@ece.cmu.edu  Tue Apr 17 10:35:55 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id KAA08690
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 10:35:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HCQr112578
	for ips-outgoing; Tue, 17 Apr 2001 08:26:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c017.sfo.cp.net (c017-h020.c017.sfo.cp.net [209.228.12.234])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3HCQAr12519
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 08:26:10 -0400 (EDT)
Received: (cpmta 22108 invoked from network); 17 Apr 2001 05:26:02 -0700
Received: from sangate-GW.ser.netvision.net.il (HELO sangate.com) (212.143.114.146)
  by smtp.sangate.com (209.228.12.234) with SMTP; 17 Apr 2001 05:26:02 -0700
X-Sent: 17 Apr 2001 12:26:02 GMT
Message-ID: <3ADC2109.4306C6F0@sangate.com>
Date: Tue, 17 Apr 2001 13:55:05 +0300
From: Mark Mokryn <mark@sangate.com>
Organization: SANgate Systems
X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.2.16-22 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
CC: ips@ece.cmu.edu
Subject: Re: iSCSI & Linked Commands
References: <C1256A31.002F930E.00@d12mta02.de.ibm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian,

Santosh is right. Linked commands require an identical I_T_L_x nexus,
but many Fibre Channel (and possibly SCSI) adapters generate a queue tag
on-board, with no possibility of host software control. On such
adapters, the generation of linked commands is impossible, and clearly
today's SCSI layers are aware of this. 

This raises the entire issue of task management in iSCSI: Linked
commands date back to SCSI-2, where they indeed served a purpose.
In the SCSI bus protocol, the target controlled all SCSI bus phases
(following selection). Thus, in a linked command sequence, the target
may drive the command phase immediately following the status phase, thus
saving bus cycles (i.e. arbitration, selection, etc.). However, in the
serial protocols, I don't see how linked commands are of any use, since
there are no bus phases to save. In contrast with popular belief, linked
commands offer no atomicity. Even in SCSI bus protocol, a linked command
may be disconnected at any time (at the target's discretion), and a new
command (from any initiator) may be started. Linked commands have always
been optional, and indeed many target implementations today do not
support them. For instance, looking at the Shark SCSI reference manual,
according to the inquiry data, Shark does not support linked commands.

So, perhaps the wise thing to do is to not support linked commands in
iSCSI. It has always been an optional feature for logical units, and
today is outdated and often unsupported, both by targets and initiators.

-mark

julian_satran@il.ibm.com wrote:
> 
> Santosh,
> 
> Sorry to interrupt this captivating thread. Why do you think linked
> commands won't work?
> 
> Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 16/04/2001 20:13:04
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   Douglas Otis <dotis@sanlight.net>
> cc:   Ips <ips@ece.cmu.edu>
> Subject:  Re: iSCSI:flow control, acknowledgement, and a deterministic
>       recovery
> 
> Doug,
> 
> You seem to be referring to linked commands as a case wherein the
> approach of Abort Task will not flush stale PDUs.
> 
> Linked Commands cannot work the way SCSI implementations are defined
> today, since linked commands require the initiator task tag (I_T_L_x
> nexus identifier in SAM-2 Execute Command terminology) to be generated
> by the SCSI ULP. However, in practice, the Initiator Task Tag (or the FC
> OX_ID) is typically generated in the SCSI LLP (or in some cases in the
> adapter firmware). IOW, there is no common reference handle like the
> task tag sent down from the ULP that allows for association of multiple
> commands to a task in several/most implementations today.
> 
> When this is fixed up to get linked commands to work [& there exist
> examples of its usage], there is no reason connection allegiance could
> not be applied to all the commands within the task.
> 
> I fail to see why you think Abort Task will not work with sequential
> devices (?).
> 
> - Santosh
> 
> Douglas Otis wrote:
> >
> > Santosh,
> >
> > I see a few problems with this approach.  Tasks as defined in iSCSI do
> > not maintain connection allegiance.  The driver binds all SCSI commands
> > to their connection for the most recent association.  Although there are
> > several places within the iSCSI proposal that make reference to a task
> > having a connection allegiance, this is in error.  Commands and not
> > tasks carry such allegiance.  Your recovery scheme will not allow a
> > satisfactory recovery with a sequential device.  In this case, repeating
> > the command is not a solution.  As a result, if one connection falters,
> > it will become a difficult situation.  In addition, you have no clue
> > from iSCSI about your delivery status.  You do not know if you are
> > waiting for the target or if you are waiting for the connection.  Some
> > sequential devices have rather long time-outs with these complications
> > of deducing status created by the multiple connections.
> >
> > The application will not know about these connection allegiance problems.
> > The iSCSI layer does not define interaction to provide additional
> > application status to allow these applications to respond in a manner
> > that may aid this situation nor should such additional information be
> > required.  With your scheme the SCSI driver must examine the content of
> > these commands to make a guess as to the connection allegiance
> > assignments.  Now the driver is expected to understand what the intended
> > action is of this SCSI management command.  What signal is used to
> > indicate a need for the iSCSI immediate treatment?  The only obvious
> > seems to be the task attribute argument.  With the way iSCSI has defined
> > iSCSI immediate, I would expect those commands to be treated in a LIFO
> > rather than the normal FIFO fashion.
> >
> > Doug
> >
> > > Douglas Otis wrote:
> > > >
> > > > With multiple connections, if you are not going to use a valid
> > > > CmdSN, or in your case a null CmdSN for all commands, then there
> > > > would be a need to include a timestamp to meet a timely delivery
> > > > requirement in the same manner as used in FC encapsulation.  IP
> > > > can deliver over any time period.  A command could arrive at any
> > > > time with respect to other connections.  With all of your feedback
> > > > now from just the SCSI layer, the SCSI layer is likely to have timed
> > > > out and restarted and now stray commands finally make an appearance
> > > > (the technician re-inserted the cable).  What did that do?  Yes,
> > > > if this were on a single connection, then TCP could provide some
> > > > assurances, (ignoring digests errors) but you must not make that
> > > > assumption nor can you assume all disruptions are symmetric.
> > >
> > > Doug,
> > >
> > > The below snippet from my last mail answered your above concern. The
> > > Abort Task is sent on the same connection as the command. (connection
> > > allegiance applied to the abort task as well). The Abort task pushes
> > > the stale data PDUs. There is no need for a timestamp on iSCSI PDUs.
> > >
> > > > > As for your second concern regarding I/O timeouts, there is no
> > > > > need for any timestamp. An I/O timeout is dealt with by an Abort
> > > > > Task. The abort task response guarantees that the abort reached
> > > > > the target and pushed all intermediate stale frames. Failure to
> > > > > complete Abort Task leads to higher level error recovery (ex :
> > > > > Logout, or some higher form of task mgmt).
> > >
> > > - Santosh
>  - santoshr.vcf


From owner-ips@ece.cmu.edu  Tue Apr 17 12:41:44 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA11558
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 12:41:43 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HE4ux19283
	for ips-outgoing; Tue, 17 Apr 2001 10:04:56 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HE43r19235
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 10:04:04 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id QAA151186
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 16:03:56 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id QAA223438
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 16:03:56 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A31.004D42A6 ; Tue, 17 Apr 2001 16:03:53 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A31.004D40EF.00@d12mta02.de.ibm.com>
Date: Tue, 17 Apr 2001 16:08:15 +0200
Subject: draft version 05-92 available at my site
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk






Dear colleagues,

I've just placed 05-92 at http://www.haifa.il.com/satran/ips

Recovery is still "work in progress" and probably won't make it to Nashua
(not in full form).

Only minor changes:

Better wording for SNACK implicit ack.

Removed the x'fffffff' restriction on DataSN and explicitly put a ceiling
on the number of PDUs in a sequence and the number of R2Ts (guess what, it's
fffffffe)

Discourage empty data PDUs

Discourage void requesting R2Ts

Indicate use NOPs for acks when no other PDUs available.

Some typos (surely not all -:))

Regards,
Julo
Security (additions)

many editorials and clarifications.

Also my "todo" folder on iSCSI (including things that I have agreed to
change) is empty and the only item I have under "toConsider" is the
StatusSN and associated SNACK (I could not make up my mind and the few
arguments I've heard against the current scheme are weak).
If you think I've forgotten something please send a note.

I will let you know when and how I post changes but I will spend little
time to answer non-urgent mail
during the next several days.


Regards,
Julo





From owner-ips@ece.cmu.edu  Tue Apr 17 12:44:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA11605
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 12:44:30 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HEeu221974
	for ips-outgoing; Tue, 17 Apr 2001 10:40:56 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from crufty.research.bell-labs.com (crufty.research.bell-labs.com [204.178.16.49])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3HEGHr20186
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 10:16:17 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by crufty; Tue Apr 17 10:12:36 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Tue Apr 17 10:14:53 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id KAA25966
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 10:14:50 -0400 (EDT)
Message-ID: <3ADC4FDA.1010ACC2@research.bell-labs.com>
Date: Tue, 17 Apr 2001 10:14:50 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: aborting an out of sequence cmdSN
References: <0F31E5C394DAD311B60C00E029101A0708015447@corpmx9.isus.emc.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit


David,

The "iSCSI cancel" proposal you describe below has been presented 
once before.  If you recall, I was asking for a refCmdSN in the 
TaskMgmt PDU.  The refCmdSN would be the cmdSN of the *original*
task.   See http://ips.pdl.cs.cmu.edu/mail/msg04022.html

Your objection to this proposal then was that it did not comply 
with SAM2.  See http://ips.pdl.cs.cmu.edu/mail/msg04028.html
The only difference between that and what you now describe below 
is the addition of this concept of an "iSCSI service error" response.

We do need *some* mechanism for a deterministic abort_task.

Otherwise, initiator has to retain state to ensure that it will
process (or fail) any SCSI response or R2T PDUs which may 
come in later, since the abort_task did not deterministically
ensure that the target has deleted the original task.
(Note: somewhere in the spec, there is a statement that initiator
MUST always honour R2T...?)

The initiator state must now be retained for an iSCSI task
which has *NO* corresponding SCSI task - which gets as complicated
to cleanup as the target state which would result from a 
pending abort.  

So our options for abort_task boil down to..
(1) use connection allegiance for TASK MGMT PDU.
(2) reject all commands prior to cmdSN of TASK MGMT PDU.
(3) cmdSN of original task is sent with TASK MGMT PDU and
    target at the iSCSI layer keeps state.
(4) iSCSI initiator retains state for deleted tasks to ensure
    that R2T/Scsi Responses are appropriately handled.

-Sandeep

Black_David@emc.com wrote:
> 
> Eddy,
> 
> > Does iSCSI have a way to abort the commands that are in the target but not
> > yet in the SCSI Target Layer?
> 
> No, it doesn't.  Part of the problem is that just considering
> iSCSI info, the Initiator can't tell whether such commands
> are in the Target vs. still in-flight.  TCP ACK info may help,
> but if TCP hasn't ACKed, the Initiator still doesn't know.
> 
> This is the issue that Doug's been posting about, advocating
> that iSCSI always reject all commands prior to the task abort
> on the assumption that the initiator can reissue them - given
> the lack of support this has received, I'm going to take this
> opportunity to state that I believe WG rough consensus exists
> for NOT pursuing this "reject all prior commands" approach.
> This is the last chance for anyone else to express support for
> this reject-based approach.
> 
> If we were to address the original issue, I think some sort
> of new mechanism is needed at the iSCSI level, as the
> attempts to use a side effect of SCSI task aborts don't
> seem to be getting much of anywhere (e.g., a previous
> attempt at this involved having iSCSI automatically send
> task management commands multiple times).  A SWAG at what
> such a mechanism might look like is some sort of iSCSI
> "Cancel" that names the CmdSN(s) to be cancelled and sent
> in the same PDU as the SCSI Task Abort or Task Set Abort
> that is supposed to abort them.  Canceling would apply
> both to commands waiting for CmdSN in iSCSI as well as
> those not yet received by the target (i.e., the target
> has to have some notion of "pending cancel" for commands
> it hasn't received yet).  The cancelled commands
> probably have to return some sort of iSCSI service error
> rather than SCSI sense data.  Something like this looks
> like it can be made to work as desired (issue TASK
> ABORT with the corresponding Cancel for immediate delivery,
> and if SCSI declares the TASK ABORT to have succeeded,
> then that command will not subsequently execute), but it'll
> add complexity.  Is that worth the benefit of being able
> to reach out and definitively strangle an errant command?
> 
> --David
> 
> > -----Original Message-----
> > From: Eddy Quicksall [SMTP:Eddy@quicksall.com]
> > Sent: Friday, April 13, 2001 6:40 PM
> > To:   Ips@Ece. Cmu. Edu (E-mail)
> > Subject:      aborting an out of sequence cmdSN
> >
> >
> > If a command is received out of sequence, it is not passed to the SCSI
> > Target Layer. So, it must be queued waiting for the prior command. Now, if
> > the prior command never comes the operating system may attempt to abort
> > the lost command by sending an abort to the driver.
> >
> > The abort can't be turned into an ABORT TASK TMF because the target iSCSI
> > has not acknowledged that it has received the original command yet. And,
> > even if the ABORT TASK was sent, it can't get to the SCSI Target Layer
> > because its cmdSN will be higher than the most recent command.
> >
> > The iSCSI initiator driver could just jerk it out of its queue (knowing
> > that it hasn't been acknowledged yet) but then the iSCSI target may still
> > execute it when the missing PDUs arrive.
> >
> > Does iSCSI have a way to abort the commands that are in the target but not
> > yet in the SCSI Target Layer?
> >
> > Eddy
> >


From owner-ips@ece.cmu.edu  Tue Apr 17 12:44:44 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA11619
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 12:44:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HEjHg22344
	for ips-outgoing; Tue, 17 Apr 2001 10:45:17 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dirty.research.bell-labs.com (dirty.research.bell-labs.com [204.178.16.6])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3HEhxr22249
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 10:44:00 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by dirty; Tue Apr 17 10:44:01 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Tue Apr 17 10:43:59 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id KAA27874;
	Tue, 17 Apr 2001 10:43:56 -0400 (EDT)
Message-ID: <3ADC56AC.E93E6EB9@research.bell-labs.com>
Date: Tue, 17 Apr 2001 10:43:56 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
CC: ips@ece.cmu.edu
Subject: Re: iSCSI Requirements Draft - Informal WG Last Call
References: <C1256A30.0059AD02.00@d12mta02.de.ibm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit


Julian,

I believe you mean "pipelined delivery" as opposed to "ordered
delivery".  The latter is more strict and will introduce pipeline
stalls which we would wish to avoid.  The flow control mechanism
is getting tied into introducing false sequentiality.
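
To make the stall concrete, here is a minimal sketch of strict per-session
in-order delivery.  The class and method names are hypothetical, CmdSN
wraparound is ignored, and the NOP-OUT plug is the one Santosh suggests in
the quoted message further down, not something taken from the draft.

```python
class SessionQueue:
    """Hypothetical per-session in-order delivery keyed by CmdSN."""

    def __init__(self, first_cmd_sn=1):
        self.expected = first_cmd_sn  # next CmdSN allowed to reach SCSI
        self.held = {}                # CmdSN -> command waiting on a hole
        self.delivered = []           # commands handed to the SCSI layer

    def receive(self, cmd_sn, cmd):
        """Accept a command and deliver everything that is now in order."""
        self.held[cmd_sn] = cmd
        while self.expected in self.held:
            ready = self.held.pop(self.expected)
            if ready != "NOP-OUT":
                # A NOP-OUT only consumes its CmdSN; it is not a SCSI command.
                self.delivered.append(ready)
            self.expected += 1
        return list(self.delivered)
```

With CmdSN 2 missing, commands 3 and 4 sit in `held` no matter which LUN they
address; they are released only once something, for example a NOP-OUT, arrives
carrying CmdSN 2.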

As an aside, a colleague here wanted to know if we had considered
having per-LUN ordering as opposed to per-session command ordering.
The pipeline could also slow down because of an aging disk or a 
heavily loaded disk.  I do see some discussions at the Haifa meeting
  http://www.ece.cmu.edu/~ips/meetingMinutes/06.20.2000.txt

Was it just added complexity which forced the per-session choice?

> Point 3.4.3 brings up ordering of commands per LUN or ordering of commands
> per session?
> MW - reorder requirements - second before the first one
> RH - does anybody disagree with 3.4.3?
> CS - No. It basically says that you support SCSI queuing
> JH - In the wide area, a method of pipelining commands and responses
> is required
> CS - the requirement is more complex than saying you just support
> SCSI queuing.
> RH - delivering commands in order never hurts
> MT - Why keep order between logical units in commands?
> RH - SAM-2 does not require order between LUNs. However, it may make
> target implementation easier.

thanks,
-Sandeep


julian_satran@il.ibm.com wrote:
> 
> Ordered delivery of commands to ANY TYPE of devices will increase in
> importance as network speeds increase and the need to hide latency
> increases.
> 
> Today databases don't use queuing and instead rely on trickling the commands to
> devices 1 by 1 to ensure atomicity and order.
> As latency will become the determining factor in performance this is bound
> to change.
> 
> SCSI has done an excellent job in defining the queueing mechanism. We have
> to make it work with good performance in our environment.
> 
> Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 13/04/2001 04:33:45
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   ips@ece.cmu.edu
> cc:   Black_David@emc.com
> Subject:  Re: iSCSI Requirements Draft - Informal WG Last Call
> 
> David & All,
> 
> I object to the following requirement :
> 
> " MUST support ordered delivery of SCSI commands from the initiator to
> the
>   target, to support SCSI Task Queuing. "
> 
> Ordered delivery is not a requirement for disk based applications and
> non tagged queueing tape applications, which form the majority of
> today's data traffic.
> 
> To impose strict ordering (even in the presence of errors ?) as a MUST
> is penalizing the majority of today's data traffic that does not expect
> ordering from the SCSI subsystem.
> 
> I am particularly concerned about the effect of the above requirement in
> the presence of errors. Does iSCSI expect strict ordering to be
> maintained even when individual I/O errors like ULP timeout occur ?
> 
> On a ULP timeout (caused by, say, a hole in CmdSN), the initiator may
> choose not to retry the command, but instead, error it back to the ULP.
> In such a case, it can plug the hole in CmdSN with a NOP-OUT.
> 
> The above requirement is not feasible to be met under such circumstances
> and others similar to this. Mandating strict ordering on ULP timeouts
> implies a session level error recovery on any individual I/O being
> failed back from iSCSI to SCSI ULP. This is a very heavy hammer to use
> as error recovery and should not be imposed.
> 
> The above requirement must be changed to :
> " SHOULD support ordered delivery of SCSI commands from the initiator to
> the
>   target, to support SCSI Task Queuing. "
> 
> - Santosh
> 
> Black_David@emc.com wrote:
> >
> > It is intended to submit draft-ietf-ips-iscsi-reqmts-02.txt
> > as an Informational RFC. There is no formal requirement for
> > a WG Last Call, but if you have any further substantive comments
> > on the document please raise them on this list within the next
> > two weeks, i.e. by April 27th at the latest.
> >
> > If you have typographical/editorial comments please send them
> > direct to the document's author, Marjorie Krueger
> > <marjorie_krueger@hp.com>.
> >
> > Thanks,
> > --David and Elizabeth, IPS WG co-chairs
>  - santoshr.vcf


From owner-ips@ece.cmu.edu  Tue Apr 17 14:31:45 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA12996
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 14:31:43 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HFhxC26510
	for ips-outgoing; Tue, 17 Apr 2001 11:43:59 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c007.snv.cp.net (c007-h008.c007.snv.cp.net [209.228.33.214])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3HFgur26455
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 11:42:56 -0400 (EDT)
Received: (cpmta 26374 invoked from network); 17 Apr 2001 08:41:55 -0700
Received: from unknown (HELO ljoy) (64.130.130.105)
  by smtp.telocity.com (209.228.33.214) with SMTP; 17 Apr 2001 08:41:55 -0700
X-Sent: 17 Apr 2001 15:41:55 GMT
From: "Douglas Otis" <dotis@sanlight.net>
To: <julian_satran@il.ibm.com>, <ips@ece.cmu.edu>
Cc: <ralphoweber@compuserve.com>
Subject: RE: iSCSI linked commands
Date: Tue, 17 Apr 2001 08:40:10 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJEEGJCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <C1256A31.003D3A23.00@d12mta02.de.ibm.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,

I am aware that iSCSI can be used for parallel SCSI configurations as well as
Fibre Channel, which has a level of accommodation for tape linked commands, but
I am unaware of such implementations.  Without getting involved with this FC
discussion, it is clear that parallel SCSI is still a valid means of
implementing linking.  The point that I was making as it relates to the
proposal was that on page 16 you indicate that the task carries the
allegiance.  It is the command as you clearly indicate on the prior page.
As there can only be one command in play at any point in time, this task can
become spread over any number of connections as you state.  As such, rather
than indicating an allegiance associated with the task, you may wish to say
with outstanding commands.  This should be taken only as an editorial
concern.  I was trying to make the point that connection allegiance is not
constant with respect to the task.  Santosh wishes to exclude the sequential
model from his thinking.  It was the examination of the command and the
overhead of tracking these allegiances I saw as undesirable.  The change to
implementing serialization prevents the problem if you introduce allegiance
for commands carrying the same serialization so this thread should become
stale as to the original concern.  The reason to introduce allegiance for
these commands carrying the same serialization is to prevent the command
window being closed.

With your present scheme, "immediate" commands should not advance the window
and should be placed in a prior position on the same connection.   If I have
missed this requirement, ignore this concern.

Doug

> Doug,
>
> I think you would want to go back to SAM.  Linked commands are broken by
> any "irregularity" in execution.
> The basic assumption is that the initiator is in charge of shipping linked
> commands - one-by-one.
> I assume that for high latency links they won't be very popular.
>
> At a very early stage (about 2 years ago) we contemplated the idea of
> "prefetching" linked commands and having the target
> effect the serialization. We would have had to come up with a way of
> conveying to the initiator which command broke the chain (if it broke) or
> caused a unit attention (if it caused one) and it was not at all clear that
> this was "in the spirit of SAM".
> There were also more esoteric issues with later commands getting modified
> by execution of prior commands etc. -:).
>
> The 360 channels had the same class of issues.
>
> I assume that T10 folks went over these issues many times.
>
> Julo
>
> "Douglas Otis" <dotis@sanlight.net> on 16/04/2001 23:41:49
>
> Please respond to "Douglas Otis" <dotis@sanlight.net>
>
> To:   "Santosh Rao" <santoshr@cup.hp.com>
> cc:   "Ips" <ips@ece.cmu.edu>
> Subject:  RE: iSCSI:flow control, acknowledgement, and a deterministic
>       recovery
>
>
>
>
> Santosh,
>
> The iSCSI proposal ver 5-91 explicitly defines tasks and also includes the
> option to allow linked commands to be sent across different connections.
> Obviously, for sequential devices, particular attention must be paid to
> command serialization as these commands tend to use relative addressing or
> are dependent upon the successful completion of prior commands.  This
> requirement is not helped with Auto-Sense and impels the need for a target
> model change in SCSI.  An error injected into the SCSI layer as a result of
> a network communication error will significantly reduce the utility of most
> backup applications.  Such reliance on the SCSI layer to recover from such
> uncertainty imposed as a result of the inability of the network transport
> to do minimal handshakes and retries is the wrong approach.  Regardless, you
> are burdening the driver with the duty of tracking a transient connection
> allegiance status.  The latest version has improved language with the
> exception of Pg. 16 in two places.
>
> Ver 5-91
> Pg. 15
> "Connection allegiance is strictly per-command and not per-task."
>
> Pg. 16
> "tasks that have allegiance to the connection"
> "all outstanding tasks that have allegiance to the connection to conclude
> and send their status."
>
> Doug
>
> > Doug,
> >
> > You seem to be referring to linked commands as a case wherein the
> > approach of Abort Task will not flush stale PDUs.
> >
> > Linked Commands cannot work the way SCSI implementations are defined
> > today, since linked commands require the initiator task tag (I_T_L_x
> > nexus identifier in SAM-2 Execute Command terminology) to be generated
> > by the SCSI ULP. However, in practice, the Initiator Task Tag (or the FC
> > OX_ID) is typically generated in the SCSI LLP (or in some cases in the
> > adapter firmware). IOW, there is no common reference handle like the
> > task tag sent down from the ULP that allows for association of multiple
> > commands to a task in several/most implementations today.
> >
> > When this is fixed up to get linked commands to work [& there exist
> > examples of its usage], there is no reason connection allegiance could
> > not be applied to all the commands within the task.
> >
> > I fail to see why you think Abort Task will not work with sequential
> > devices (?).
> >
> > - Santosh
> >
> > Douglas Otis wrote:
> > >
> > > Santosh,
> > >
> > > I see a few problems with this approach.  Tasks as defined in
> > > iSCSI do not maintain connection allegiance.  The driver binds all
> > > SCSI commands to their connection for the most recent association.
> > > Although there are several places within the iSCSI proposal that
> > > make reference to a task having a connection allegiance, this is
> > > in error.  Commands and not tasks carry such allegiance.  Your
> > > recovery scheme will not allow a satisfactory recovery with a
> > > sequential device.  In this case, repeating the command is not a
> > > solution.  As a result, if one connection falters, it will become a
> > > difficult situation.  In addition, you have no clue from iSCSI as to
> > > your delivery status.  You do not know if you are waiting for the target
> > > or if you are waiting for the connection.  Some sequential devices
> > > have rather long time-outs with these complications of deducing
> > > status created by the multiple connections.
> > >
> > > The application will not know about these connection allegiance
> > > problems. The iSCSI layer does not define interaction to provide
> > > additional application status to allow these applications to respond
> > > in a manner that may aid this situation nor should such additional
> > > information be required.  With your scheme the SCSI driver must
> > > examine the content of these commands to make a guess as to the
> > > connection allegiance assignments.  Now the driver is expected to
> > > understand what the intended action is of this SCSI management
> > > command.  What signal is used to indicate a need for the iSCSI
> > > immediate treatment?  The only obvious seems to be the task attribute
> > > argument.  With the way iSCSI has defined iSCSI immediate, I
> > > would expect those commands to be treated in a LIFO rather than the
> > > normal FIFO fashion.
> > >
> > > Doug
>
>
>
>
>
>
>



From owner-ips@ece.cmu.edu  Tue Apr 17 15:12:44 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA13833
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 15:12:38 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HE52U19295
	for ips-outgoing; Tue, 17 Apr 2001 10:05:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HE46r19236
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 10:04:06 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id QAA30204
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 16:03:56 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id QAA223446
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 16:03:56 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A31.004D4142 ; Tue, 17 Apr 2001 16:03:49 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A31.004D404B.00@d12mta02.de.ibm.com>
Date: Tue, 17 Apr 2001 15:48:52 +0200
Subject: Re: iSCSI & Linked Commands
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Mark,

The argument about the HBAs generating tags is pretty weak as iSCSI will
have its own HBAs and iSCSI will generate the tags in any implementation.
As for the utility - the sequential and conditional execution of the linked
commands is guaranteed regardless of delivery or queuing order.  The only
reason they might get obsolete is their inability to hide latency but I
don't see any compelling reason to have them unsupported by iSCSI.

Julo

Mark Mokryn <mark@sangate.com> on 17/04/2001 12:55:05

Please respond to Mark Mokryn <mark@sangate.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu
Subject:  Re: iSCSI & Linked Commands




Julian,

Santosh is right. Linked commands require an identical I_T_L_x nexus,
but many Fibre Channel (and possibly SCSI) adapters generate a queue tag
on-board, with no possibility of host software control. On such
adapters, the generation of linked commands is impossible, and clearly
today's SCSI layers are aware of this.

This raises the entire issue of task management in iSCSI: Linked
commands date back to SCSI-2, where they indeed served a purpose.
In the SCSI bus protocol, the target controlled all SCSI bus phases
(following selection). Thus, in a linked command sequence, the target
may drive the command phase immediately following the status phase, thus
saving bus cycles (i.e. arbitration, selection, etc.). However, in the
serial protocols, I don't see how linked commands are of any use, since
there are no bus phases to save. In contrast with popular belief, linked
commands offer no atomicity. Even in SCSI bus protocol, a linked command
may be disconnected at any time (at the target's discretion), and a new
command (from any initiator) may be started. Linked commands have always
been optional, and indeed many target implementations today do not
support them. For instance, looking at the Shark SCSI reference manual,
according to the inquiry data, Shark does not support linked commands.

So, perhaps the wise thing to do is to not support linked commands in
iSCSI. It has always been an optional feature for logical units, and
today is outdated and often unsupported, both by targets and initiators.

-mark

julian_satran@il.ibm.com wrote:
>
> Santosh,
>
> Sorry to interrupt this captivating thread. Why do you think linked
> commands won't work?
>
> Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 16/04/2001 20:13:04
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   Douglas Otis <dotis@sanlight.net>
> cc:   Ips <ips@ece.cmu.edu>
> Subject:  Re: iSCSI:flow control, acknowledgement, and a deterministic
>       recovery
>
> Doug,
>
> You seem to be referring to linked commands as a case wherein the
> approach of Abort Task will not flush stale PDUs.
>
> Linked Commands cannot work the way SCSI implementations are defined
> today, since linked commands require the initiator task tag (I_T_L_x
> nexus identifier in SAM-2 Execute Command terminology) to be generated
> by the SCSI ULP. However, in practice, the Initiator Task Tag (or the FC
> OX_ID) is typically generated in the SCSI LLP (or in some cases in the
> adapter firmware). IOW, there is no common reference handle like the
> task tag sent down from the ULP that allows for association of multiple
> commands to a task in several/most implementations today.
>
> When this is fixed up to get linked commands to work [& there exist
> examples of its usage], there is no reason connection allegiance could
> not be applied to all the commands within the task.
>
> I fail to see why you think Abort Task will not work with sequential
> devices (?).
>
> - Santosh
>
> Douglas Otis wrote:
> >
> > Santosh,
> >
> > I see a few problems with this approach.  Tasks as defined in iSCSI do
> > not maintain connection allegiance.  The driver binds all SCSI commands
> > to their connection for the most recent association.  Although there are
> > several places within the iSCSI proposal that make reference to a task
> > having a connection allegiance, this is in error.  Commands and not tasks
> > carry such allegiance.  Your recovery scheme will not allow a
> > satisfactory recovery with a sequential device.  In this case, repeating
> > the command is not a solution.  As a result, if one connection falters,
> > it will become a difficult situation.  In addition, you have no clue
> > from iSCSI as to your delivery status.  You do not know if you are
> > waiting for the target or if you are waiting for the connection.  Some
> > sequential devices have rather long time-outs with these complications
> > of deducing status created by the multiple connections.
> >
> > The application will not know about these connection allegiance problems.
> > The iSCSI layer does not define interaction to provide additional
> > application status to allow these applications to respond in a manner
> > that may aid this situation nor should such additional information be
> > required.  With your scheme the SCSI driver must examine the content of
> > these commands to make a guess as to the connection allegiance
> > assignments.  Now the driver is expected to understand what the intended
> > action is of this SCSI management command.  What signal is used to
> > indicate a need for the iSCSI immediate treatment?  The only obvious seems
> > to be the task attribute argument.  With the way iSCSI has defined iSCSI
> > immediate, I would expect those commands to be treated in a LIFO rather
> > than the normal FIFO fashion.
> >
> > Doug
> >
> > > Douglas Otis wrote:
> > > >
> > > > With multiple connections, if you are not going to use a valid
> > > > CmdSN, or in your case a null CmdSN for all commands, then there
> > > > would be a need to include a timestamp to meet a timely delivery
> > > > requirement in the same manner as used in FC encapsulation.  IP
> > > > can deliver over any time period.  A command could arrive at any
> > > > time with respect to other connections.  With all of your feedback
> > > > now from just the SCSI layer, the SCSI layer is likely to have timed
> > > > out and restarted and now stray commands finally make an appearance
> > > > (the technician re-inserted the cable).  What did that do?  Yes,
> > > > if this were on a single connection, then TCP could provide some
> > > > assurances, (ignoring digest errors) but you must not make that
> > > > assumption nor can you assume all disruptions are symmetric.
> > >
> > > Doug,
> > >
> > > The below snippet from my last mail answered your above concern. The
> > > Abort Task is sent on the same connection as the command. (connection
> > > allegiance applied to the abort task as well). The Abort task pushes the
> > > stale data PDUs. There is no need for a timestamp on iSCSI PDUs.
> > >
> > > > > As for your second concern regarding I/O timeouts, there is no need for
> > > > > any timestamp. An I/O timeout is dealt with by an Abort Task. The abort
> > > > > task response guarantees that the abort reached the target and pushed
> > > > > all intermediate stale frames. Failure to complete Abort Task leads to
> > > > > higher level error recovery (ex : Logout, or some higher form of task
> > > > > mgmt).
> > >
> > > - Santosh
>  - santoshr.vcf





From owner-ips@ece.cmu.edu  Tue Apr 17 16:35:18 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA16232
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 16:35:17 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HI44a06683
	for ips-outgoing; Tue, 17 Apr 2001 14:04:04 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HI3Nr06649
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 14:03:24 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRHMT>; Tue, 17 Apr 2001 11:03:17 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173427@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "Ips (E-mail)" <ips@ece.cmu.edu>
Subject: RE: aborting an out of sequence cmdSN
Date: Tue, 17 Apr 2001 11:03:17 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

> > Does iSCSI have a way to abort the commands that are in the target but not
> > yet in the SCSI Target Layer?
> 
> No, it doesn't.  Part of the problem is that just considering
> iSCSI info, the Initiator can't tell whether such commands
> are in the Target vs. still in-flight.  TCP ACK info may help,
> but if TCP hasn't ACKed, the Initiator still doesn't know.
> 

I don't understand what problem is being addressed.

I assume that iSCSI is just an extension of the pipeline for SCSI commands
and task management functions.  As long as nothing impedes pipeline flow,
the appropriate operations should eventually be presented to the SCSI layer
for execution in the proper order.

I thought it was decided a long time ago that the SCSI layer would interact
with the iSCSI transport such that flow through the pipeline never blocked
indefinitely.

Is that the case?

Charles
> -----Original Message-----
> From: Black_David@emc.com [mailto:Black_David@emc.com]
> Sent: Tuesday, April 17, 2001 12:03 AM
> To: Eddy@quicksall.com; ips@ece.cmu.edu
> Subject: RE: aborting an out of sequence cmdSN
> 
> 
> Eddy,
> 
> > Does iSCSI have a way to abort the commands that are in the target but not
> > yet in the SCSI Target Layer?
> 
> No, it doesn't.  Part of the problem is that just considering
> iSCSI info, the Initiator can't tell whether such commands
> are in the Target vs. still in-flight.  TCP ACK info may help,
> but if TCP hasn't ACKed, the Initiator still doesn't know.
> 
> This is the issue that Doug's been posting about, advocating
> that iSCSI always reject all commands prior to the task abort
> on the assumption that the initiator can reissue them - given
> the lack of support this has received, I'm going to take this
> opportunity to state that I believe WG rough consensus exists
> for NOT pursuing this "reject all prior commands" approach.
> This is the last chance for anyone else to express support for
> this reject-based approach.
> 
> If we were to address the original issue, I think some sort
> of new mechanism is needed at the iSCSI level, as the
> attempts to use a side effect of SCSI task aborts don't
> seem to be getting much of anywhere (e.g., a previous
> attempt at this involved having iSCSI automatically send
> task management commands multiple times).  A SWAG at what
> such a mechanism might look like is some sort of iSCSI
> "Cancel" that names the CmdSN(s) to be cancelled and sent
> in the same PDU as the SCSI Task Abort or Task Set Abort
> that is supposed to abort them.  Canceling would apply
> both to commands waiting for CmdSN in iSCSI as well as
> those not yet received by the target (i.e., the target
> has to have some notion of "pending cancel" for commands
> it hasn't received yet).  The cancelled commands
> probably have to return some sort of iSCSI service error
> rather than SCSI sense data.  Something like this looks
> like it can be made to work as desired (issue TASK 
> ABORT with the corresponding Cancel for immediate delivery,
> and if SCSI declares the TASK ABORT to have succeeded,
> then that command will not subsequently execute), but it'll
> add complexity.  Is that worth the benefit of being able
> to reach out and definitively strangle an errant command?
> 
> --David
> 
> > -----Original Message-----
> > From:	Eddy Quicksall [SMTP:Eddy@quicksall.com]
> > Sent:	Friday, April 13, 2001 6:40 PM
> > To:	Ips@Ece. Cmu. Edu (E-mail)
> > Subject:	aborting an out of sequence cmdSN
> > 
> > 
> > If a command is received out of sequence, it is not passed to the SCSI
> > Target Layer. So, it must be queued waiting for the prior command. Now, if
> > the prior command never comes the operating system may attempt to abort
> > the lost command by sending an abort to the driver.
> > 
> > The abort can't be turned into an ABORT TASK TMF because the target iSCSI
> > has not acknowledged that it has received the original command yet. And,
> > even if the ABORT TASK was sent, it can't get to the SCSI Target Layer
> > because its cmdSN will be higher than the most recent command.
> > 
> > The iSCSI initiator driver could just jerk it out of its queue (knowing
> > that it hasn't been acknowledged yet) but then the iSCSI target may still
> > execute it when the missing PDUs arrive.
> > 
> > Does iSCSI have a way to abort the commands that are in the target but not
> > yet in the SCSI Target Layer?
> > 
> > Eddy
> > 
> 
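
The queueing problem Eddy describes and the "Cancel" idea David floats can
be sketched as a toy model in Python. This is only an illustration under
stated assumptions, not anything taken from the iSCSI draft: the class name
TargetCmdQueue, its methods, and the choice that a cancelled CmdSN fills its
own gap (so that later commands can proceed) are all hypothetical, and
32-bit CmdSN wraparound is ignored.

```python
class TargetCmdQueue:
    """Toy model of a target-side iSCSI queue that reorders by CmdSN.

    Hypothetical sketch: names and behavior are assumptions for illustration.
    """

    def __init__(self, first_cmdsn=1):
        self.exp_cmdsn = first_cmdsn  # next CmdSN to hand to the SCSI layer
        self.waiting = {}             # CmdSN -> command held behind a gap
        self.pending_cancel = set()   # CmdSNs cancelled but not yet reached
        self.delivered = []           # commands passed to the SCSI layer
        self.cancelled = []           # CmdSNs that would get a service error

    def receive(self, cmdsn, command):
        """Accept a command PDU; return False if it is stale and dropped."""
        if cmdsn < self.exp_cmdsn:
            return False              # slot already delivered or cancelled
        self.waiting[cmdsn] = command
        self._drain()
        return True

    def cancel(self, cmdsn):
        """Hypothetical immediate Cancel naming one CmdSN.

        Returns False when the command already reached the SCSI layer, in
        which case only a SCSI-level task abort can deal with it.
        """
        if cmdsn < self.exp_cmdsn:
            return False
        self.pending_cancel.add(cmdsn)
        self._drain()
        return True

    def _drain(self):
        """Deliver or cancel commands in CmdSN order until a gap is hit."""
        while True:
            sn = self.exp_cmdsn
            if sn in self.pending_cancel:
                self.pending_cancel.discard(sn)
                self.waiting.pop(sn, None)   # drop it if it had arrived
                self.cancelled.append(sn)
            elif sn in self.waiting:
                self.delivered.append(self.waiting.pop(sn))
            else:
                return                       # gap: wait for more input
            self.exp_cmdsn += 1


if __name__ == "__main__":
    q = TargetCmdQueue(first_cmdsn=1)
    q.receive(2, "WRITE B")      # held: CmdSN 1 is missing
    q.cancel(1)                  # cancel the lost command
    print(q.delivered, q.cancelled)
    print(q.receive(1, "WRITE A"))  # late arrival is dropped, not executed
```

In this model the late copy of CmdSN 1 is never executed, which is the
property Eddy is asking for. The cost David mentions shows up as the extra
pending_cancel state the target has to keep.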


From owner-ips@ece.cmu.edu  Tue Apr 17 16:36:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA16266
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 16:36:25 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HIj6Y09667
	for ips-outgoing; Tue, 17 Apr 2001 14:45:06 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel1.hp.com (atlrel1.hp.com [156.153.255.210])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HIiCr09619
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 14:44:12 -0400 (EDT)
Received: from amrelay1.boi.hp.com (amrelay1.boi.hp.com [15.56.8.24])
	by atlrel1.hp.com (Postfix) with ESMTP
	id EFB4080C; Tue, 17 Apr 2001 14:44:10 -0400 (EDT)
Received: from xpabh3.corp.hp.com (xpabh3.corp.hp.com [15.58.136.223])
	by amrelay1.boi.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id MAA14520;
	Tue, 17 Apr 2001 12:44:09 -0600 (MDT)
Received: by xpabh3.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <2M5N3ZGM>; Tue, 17 Apr 2001 11:43:55 -0700
Message-ID: <499DC368E25AD411B3F100902740AD65BC5AB2@xrose03.rose.hp.com>
From: "WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com>
To: "'julian_satran@il.ibm.com'" <julian_satran@il.ibm.com>
Cc: ips@ece.cmu.edu, tsvwg@ietf.org, "'Craig Partridge'" <craig@aland.bbn.com>,
        Jonathan Wood <Jonathan.Wood@sun.com>, xieqb@cig.mot.com,
        Jonathan Stone <jonathan@dsg.stanford.edu>,
        Randall Stewart <rrs@cisco.com>,
        "WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com>
Subject: Re: [Tsvwg] [SCTP checksum problems]
Date: Tue, 17 Apr 2001 11:43:51 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian,
The SCTP folks are right now discussing changing the SCTP checksum to be a
CRC-32 (or other). This is a very good thing and really what needs to happen
with SCTP for it to support iSCSI and other data-critical applications
effectively (and also relieve iSCSI from having to implement data integrity
checking and transport-like functionality over SCTP).

They are looking for inputs as to which CRC-32 or checksum to use. The iSCSI
WG's CRC investigation work and conclusion would be a valuable input into
their decision. The sooner that you can provide the iSCSI recommended CRC
and reasoning behind it to them, the better, even before the forthcoming I-D
is distributed.

Jim Wendt
Networked Storage Architecture
Hewlett-Packard Company
jim_wendt@hp.com 916-785-5198

-----------------------------------------------------------------------------

> -----Original Message-----
> From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> Sent: Sunday, April 15, 2001 7:58 AM
> To: ips@ece.cmu.edu
> Subject: CRCs
> 
> 
> 
> 
> Dear colleagues,
> 
> We will probably not be able to finish the CRC/checksum document in time
> for Nashua but we hope it will be out very soon after that.  However I
> would like to inform you that while in Orlando and Minneapolis we were
> still talking about different CRCs we (Dafna Sheinwald, Pat Thaler, Matt
> Wakeley, Vince Cavanna and myself) have agreed on a CRC and the forthcoming
> ID will give all the reasons why we recommend it.
> 
> Regards,
> Julo
> 

-----------------------------------------------------------------------------

-----Original Message-----
From: Randall Stewart [mailto:rrs@cisco.com]
Sent: Tuesday, April 17, 2001 4:31 AM
To: Jonathan Wood
Cc: xieqb@cig.mot.com; tsvwg@ietf.org; Jim Wendt; Jonathan Stone; Craig
Partridge
Subject: Re: [Tsvwg] [SCTP checksum problems]


Jonathan:

I will make sure everyone at the bakeoff is aware of the upcoming
"checksum" change... Now one of the big questions yet is
what checksum should we use?

I kinda lean towards crc-32 myself (but of course I have no technical
basis for this and need to keep silent on which one to use anyway :->),
but do we have other candidates besides fletcher-32 and possibly
modified Adler-32 (i.e. 16 bit adds instead of 8)??

I will take the above 3 and do a bit of performance work this
week and post some numbers... that's about all I can do i.e.  tell
how much time the options I know of take... 

If you have some other candidates let me know and I can possibly get
some performance numbers on these as well...

As far as which is the best... I encourage all of you check-sum
experts out there to please join the thread :)

Oh, I know Jonathan Stone's paper will NOT be ready until sometime
in May.. so we may want to proceed slowly so that Craig Partridge and
he can have some cycles to add to this discussion :)

R

Jonathan Wood wrote:
> 
> As an SCTP implementor and someone who will want to get the hardware folks to
> help with checksumming, I wholeheartedly agree with Randy. Remember that SCTP is
> just a proposed standard, and is as such not all that far along the
> standardization process. We should still be able to make changes like this if
> necessary.
> 
> Jon
> 
> >
> >Q:
> >
> >The only problem with an additional "CRC chunk" is that
> >it makes hardware assistance to error correction much
> >more difficult. It is better (I think) to just realize
> >we made a mistake. Get the opinions of the experts as to
> >what checksum to use... i.e.:
> >
> >- CRC-32
> >- Modified Adler-32 (16 bit word sums)
> >- Fletcher-32
> >- ???
> >
> >And then go with this as a replacement... Admit we were wrong
> >and fix the problem..
> >
> >This way you have ONE and only ONE checksum algorithm making
> >hardware designers life much easier...
> >
> >R
> >
> >Qiaobing Xie wrote:
> >>
> >> Another solution could be (I think I mentioned this to Randy and a few
> >> others at last IETF):
> >>
> >> - Define a CRC-32 (or other strong checksum) control chunk and when the
> >> sender wishes to use a stronger checksum protection, in addition to the
> >> Adler-32 in the common SCTP header it includes this CRC-32 chunk in the
> >> outbound packet. When the packet arrives, the receiver will do the
> >> Adler-32 first, and then if the receiver supports the CRC-32 and sees
> >> the presence of the CRC-32 chunk in the packet it will further verify
> >> the CRC-32.
> >>
> >> We could also use a bit pattern in the chunk type of the CRC-32 chunk so
> >> that if the receiver doesn't understand the CRC-32 chunk it would drop
> >> it with a report back to the sender.
> >>
> >> -Qiaobing
> >>
> >> _______________________________________________
> >> tsvwg mailing list
> >> tsvwg@ietf.org
> >> http://www1.ietf.org/mailman/listinfo/tsvwg
> >
> >--
> >Randall R. Stewart
> >Systems & Solutions Engineering
> >Cisco Systems Inc.
> >rrs@cisco.com 815-342-5222 or 815-477-2127
> >

-- 
Randall R. Stewart
Systems & Solutions Engineering
Cisco Systems Inc.
rrs@cisco.com 815-342-5222 or 815-477-2127

> 
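
For readers who want to try the candidates named in this thread, here is a
minimal Python sketch that computes a CRC-32, an Adler-32 and a Fletcher-32
over the same bytes. Assumptions to keep in mind: zlib.crc32 is used only
because it ships with Python, and the thread does not say which CRC-32
polynomial is under consideration; zlib.adler32 is the standard Adler-32
(byte-wise sums), not the "modified" 16-bit variant Randall mentions; the
fletcher32 function below is one common formulation (little-endian 16-bit
words, sums modulo 65535) and the helper name checksums is hypothetical.

```python
import zlib


def fletcher32(data: bytes) -> int:
    """Fletcher-32 over little-endian 16-bit words; odd input is zero-padded."""
    if len(data) % 2:
        data += b"\x00"
    sum1 = sum2 = 0
    for i in range(0, len(data), 2):
        word = data[i] | (data[i + 1] << 8)
        sum1 = (sum1 + word) % 65535
        sum2 = (sum2 + sum1) % 65535
    return (sum2 << 16) | sum1


def checksums(data: bytes) -> dict:
    """Return the three 32-bit check values for one payload."""
    return {
        "crc32": zlib.crc32(data) & 0xFFFFFFFF,
        "adler32": zlib.adler32(data) & 0xFFFFFFFF,
        "fletcher32": fletcher32(data),
    }


if __name__ == "__main__":
    for payload in (b"a", b"abcde", b"123456789"):
        values = checksums(payload)
        print(payload, {name: f"{v:08x}" for name, v in values.items()})
```

Running it on very short payloads shows the two sum-based checks leaving
many of their high-order bits at zero, while the CRC spreads over all 32
bits. This says nothing about speed, which is what the performance runs
promised above are meant to measure.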


From owner-ips@ece.cmu.edu  Tue Apr 17 16:39:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA16323
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 16:39:20 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HIE2T07423
	for ips-outgoing; Tue, 17 Apr 2001 14:14:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HIDLr07391
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 14:13:22 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP id 9F2429A5
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 11:13:20 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id LAA14256
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 11:13:15 -0700 (PDT)
Message-ID: <3ADC891A.87E70DB3@cup.hp.com>
Date: Tue, 17 Apr 2001 11:19:06 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI & Linked Commands
References: <C1256A31.004D404B.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------2915F1AD098ED8F72B3AC12F"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------2915F1AD098ED8F72B3AC12F
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Julian,

Mark has already replied to your question on why linked commands won't
work given today's implementations. In most cases, the initiator task
tag is either generated in the HBA driver or in the HBA firmware. 

Since there is no task tag generated from the SCSI ULP while handing an
I/O request to the SCSI LLP, making the association across different
commands within the same task (linked command) becomes impossible.

Is there any reason why iSCSI would want to apply connection allegiance
per command and not per task (?). If someone did use linked commands and
wanted a clean way of aborting the task on some error, an abort task
sent down the same connection as the rest of the task would ensure
flushing of stale PDUs. 

I agree that there is no need to say linked commands are unsupported
since this is a ULP feature and support or lack thereof is decided at
the initiator and target ULPs. 

- Santosh


julian_satran@il.ibm.com wrote:
> 
> Mark,
> 
> The argument about the HBAs generating tags is pretty weak as iSCSI will
> have its own HBAs and iSCSI will generate the tags in any implementation.
> As for the utility - the sequential and conditional execution of the linked
> commands is guaranteed regardless of delivery or queuing order.  The only
> reason they might become obsolete is their inability to hide latency but I
> don't see any compelling reason to have them unsupported by iSCSI.
> 
> Julo
> 
> Mark Mokryn <mark@sangate.com> on 17/04/2001 12:55:05
> 
> Please respond to Mark Mokryn <mark@sangate.com>
> 
> To:   Julian Satran/Haifa/IBM@IBMIL
> cc:   ips@ece.cmu.edu
> Subject:  Re: iSCSI & Linked Commands
> 
> Julian,
> 
> Santosh is right. Linked commands require an identical I_T_L_x nexus,
> but many Fibre Channel (and possibly SCSI) adapters generate a queue tag
> on-board, with no possibility of host software control. On such
> adapters, the generation of linked commands is impossible, and clearly
> today's SCSI layers are aware of this.
> 
> This raises the entire issue of task management in iSCSI: Linked
> commands date back to SCSI-2, where they indeed served a purpose.
> In the SCSI bus protocol, the target controlled all SCSI bus phases
> (following selection). Thus, in a linked command sequence, the target
> may drive the command phase immediately following the status phase, thus
> saving bus cycles (i.e. arbitration, selection, etc.). However, in the
> serial protocols, I don't see how linked commands are of any use, since
> there are no bus phases to save. Contrary to popular belief, linked
> commands offer no atomicity. Even in SCSI bus protocol, a linked command
> may be disconnected at any time (at the target's discretion), and a new
> command (from any initiator) may be started. Linked commands have always
> been optional, and indeed many target implementations today do not
> support them. For instance, looking at the Shark SCSI reference manual,
> according to the inquiry data, Shark does not support linked commands.
> 
> So, perhaps the wise thing to do is to not support linked commands in
> iSCSI. It has always been an optional feature for logical units, and
> today is outdated and often unsupported, both by targets and initiators.
> 
> -mark
> 
> julian_satran@il.ibm.com wrote:
> >
> > Santosh,
> >
> > Sorry to interrupt this captivating thread. Why do you think linked
> > commands won't work?
> >
> > Julo
> >
> > Santosh Rao <santoshr@cup.hp.com> on 16/04/2001 20:13:04
> >
> > Please respond to Santosh Rao <santoshr@cup.hp.com>
> >
> > To:   Douglas Otis <dotis@sanlight.net>
> > cc:   Ips <ips@ece.cmu.edu>
> > Subject:  Re: iSCSI:flow control, acknowledgement, and a deterministic
> >       recovery
> >
> > Doug,
> >
> > You seem to be referring to linked commands as a case wherein the
> > approach of Abort Task will not flush stale PDUs.
> >
> > Linked Commands cannot work the way SCSI implementations are defined
> > today, since linked commands require the initiator task tag (I_T_L_x
> > nexus identifier in SAM-2 Execute Command terminology) to be generated
> > by the SCSI ULP. However, in practice, the Initiator Task Tag (or the FC
> > OX_ID) is typically generated in the SCSI LLP (or in some cases in the
> > adapter firmware). IOW, there is no common reference handle like the
> > task tag sent down from the ULP that allows for association of multiple
> > commands to a task in several/most implementations today.
> >
> > When this is fixed up to get linked commands to work [& there exist
> > examples of its usage], there is no reason connection allegiance could
> > not be applied to all the commands within the task.
> >
> > I fail to see why you think Abort Task will not work with sequential
> > devices (?).
> >
> > - Santosh
> >
> > Douglas Otis wrote:
> > >
> > > Santosh,
> > >
> > > > I see a few problems with this approach.  Tasks as defined in iSCSI do not
> > > > maintain connection allegiance.  The driver binds all SCSI commands to their
> > > > connection for the most recent association.  Although there are several
> > > > places within the iSCSI proposal that make reference to a task having a
> > > > connection allegiance, this is in error.  Commands and not tasks carry such
> > > > allegiance.  Your recovery scheme will not allow a satisfactory recovery
> > > > with a sequential device.  In this case, repeating the command is not a
> > > > solution.  As a result, if one connection falters, it will become a difficult
> > > > situation.  In addition, you have no clue from iSCSI as to your delivery status.
> > > > You do not know if you are waiting for the target or if you are waiting for
> > > > the connection.  Some sequential devices have rather long time-outs with
> > > > these complications of deducing status created by the multiple connections.
> > > >
> > > > The application will not know about these connection allegiance problems.
> > > > The iSCSI layer does not define interaction to provide additional
> > > > application status to allow these applications to respond in a manner that
> > > > may aid this situation nor should such additional information be required.
> > > > With your scheme the SCSI driver must examine the content of these commands
> > > > to make a guess as to the connection allegiance assignments.  Now the driver
> > > > is expected to understand what the intended action is of this SCSI
> > > > management command.  What signal is used to indicate a need for the iSCSI
> > > > immediate treatment?  The only obvious one seems to be the task attribute
> > > > argument.  With the way iSCSI has defined iSCSI immediate, I would expect
> > > > those commands to be treated in a LIFO rather than the normal FIFO fashion.
> > >
> > > Doug
> > >
> > > > Douglas Otis wrote:
> > > > >
> > > > > With multiple connections, if you are not going to use a valid
> > > > > CmdSN, or in your case a null CmdSN for all commands, then there
> > > > > would be a need to include a timestamp to meet a timely delivery
> > > > > requirement in the same manner as used in FC encapsulation.  IP
> > > > > can deliver over any time period.  A command could arrive at any
> > > > > time with respect to other connections.  With all of your feedback
> > > > > > now from just the SCSI layer, the SCSI layer is likely to have timed
> > > > > > out and restarted and now stray commands finally make an appearance
> > > > > > (the technician re-inserted the cable).  What did that do?  Yes,
> > > > > > if this were on a single connection, then TCP could provide some
> > > > > > assurances, (ignoring digest errors) but you must not make that
> > > > > assumption nor can you assume all disruptions are symmetric.
> > > >
> > > > Doug,
> > > >
> > > > > The below snippet from my last mail answered your above concern. The
> > > > > Abort Task is sent on the same connection as the command. (connection
> > > > > allegiance applied to the abort task as well). The Abort task pushes the
> > > > > stale data PDUs. There is no need for a timestamp on iSCSI PDUs.
> > > >
> > > > > > > As for your second concern regarding I/O timeouts, there is no need for
> > > > > > > any timestamp. An I/O timeout is dealt with by an Abort Task. The abort
> > > > > > > task response guarantees that the abort reached the target and pushed
> > > > > > > all intermediate stale frames. Failure to complete Abort Task leads to
> > > > > > > higher level error recovery (ex : Logout, or some higher form of task
> > > > > > > mgmt).
> > > >
> > > > - Santosh
> >  - santoshr.vcf
--------------2915F1AD098ED8F72B3AC12F
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------2915F1AD098ED8F72B3AC12F--



From owner-ips@ece.cmu.edu  Tue Apr 17 16:39:34 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA16334
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 16:39:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HInId09993
	for ips-outgoing; Tue, 17 Apr 2001 14:49:18 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HIlxr09917
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 14:47:59 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 2585A18F6; Tue, 17 Apr 2001 11:41:28 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id LAA16473;
	Tue, 17 Apr 2001 11:41:24 -0700 (PDT)
Message-ID: <3ADC8FB2.141C6C96@cup.hp.com>
Date: Tue, 17 Apr 2001 11:47:14 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
Cc: IPS Reflector <ips@ece.cmu.edu>,
        Fibre Channel T11 reflector <fc@network.com>
Subject: Re: iSCSI : digest error handling violates EMDP/InDataOrder
References: <C1256A30.0059AD0F.00@d12mta05.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------ACE8F5E86B86C3A3BA6F6B15"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------ACE8F5E86B86C3A3BA6F6B15
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

julian_satran@il.ibm.com wrote:
> 
> Santosh,
> 
> The bit and the interpretation are protocol specific.
> 
> FCP uses it like iSCSI - i.e. the order has to be maintained within a sequence


Not true. If you take a look at FCP-2 rev 04 Section 10.1.1.7
description on EMDP, it explicitly states :
"The EMDP bit does not affect the order of frames within a sequence". 

For a WRITE command, an EMDP setting of 0 implies that the buffer offset
in R2T requests must be in continuous and increasing order whereas an
EMDP setting of 1 implies the buffer offset in R2T can be out of order.

For a READ command, an EMDP setting of 0 implies the buffer offset in
READ data PDUs is in continuous and increasing order, whereas, an EMDP
setting of 1 implies buffer offset in READ Data PDUs can be out of
order.

Based on the above rules, iSCSI violates the EMDP setting in its error
recovery for data digest errors detected by targets on Data PDUs.

- Santosh


> (a R2T derived output or the entire input).
> In that sense we are not violating the EMDP.

> 
> And BTW the recovery procedure in FCP is similar although a bit more
> complicated than ours and involves also
> a link level sequence.
> 
> Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 13/04/2001 03:54:28
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   IPS Reflector <ips@ece.cmu.edu>
> cc:
> Subject:  iSCSI : digest error handling violates EMDP/InDataOrder
> 
> Where :
> =======
> 
> Section 6.2 (pg 80). Digest Errors
> -----------------------------------
> "If the error is a Data-Digest-Error in a Data-PDU, the target MUST
> either request retransmission with a R2T or answer with a Reject iSCSI
> PDU and abort the task."
> 
> Problem :
> ---------
> On a Data digest error detected by a target, it MUST NOT request
> re-transmission of the data PDU thru an R2T if the session login key
> InDataOrder is set to yes. The current rev 05 draft violates
> InDataOrder/EMDP settings by allowing a re-transmission of R2T by
> target.
> 
> Scenario :
> ==========
> initiator           target
> ---------           ------
> EMDP=0
> InDataOrder=YES
> (exp_off=0)
>         offset=0,len=64k <------ R2T
> 
> --------> data PDUs
> (exp_off = 64K)
>                               data digest error results in
>                      an 8K PDU being dropped at offset 24K.
> 
>        offset=24K,len=8K  <------ R2T for missing PDU.
> 
> exp_off != offset
> 
> - Santosh
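
The scenario above can be written down as a small Python check. It is only
a sketch of the rule as stated in this thread (with InDataOrder=yes, each
R2T must start exactly where the previous one ended); the function name
first_out_of_order_r2t and the (offset, length) tuple layout are assumptions
made for illustration and do not come from the draft.

```python
KIB = 1024


def first_out_of_order_r2t(r2ts, in_data_order=True):
    """Return the first (offset, length) R2T that breaks in-order delivery.

    r2ts is the sequence of R2T requests the initiator sees for one WRITE.
    Returns None when every R2T is acceptable under the given setting.
    """
    expected = 0  # exp_off in the scenario: next offset the initiator expects
    for offset, length in r2ts:
        if in_data_order and offset != expected:
            return (offset, length)
        expected = max(expected, offset + length)
    return None


if __name__ == "__main__":
    # R2T for 64K at offset 0, then a recovery R2T for the 8K PDU lost at 24K.
    scenario = [(0, 64 * KIB), (24 * KIB, 8 * KIB)]
    print(first_out_of_order_r2t(scenario, in_data_order=True))
    print(first_out_of_order_r2t(scenario, in_data_order=False))
```

With in-order delivery required, the recovery R2T at offset 24K is flagged
because the expected offset has already advanced to 64K. With the setting
relaxed, the same sequence passes.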
--------------ACE8F5E86B86C3A3BA6F6B15
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------ACE8F5E86B86C3A3BA6F6B15--



From owner-ips@ece.cmu.edu  Tue Apr 17 16:40:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA16357
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 16:40:30 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HJ22c10909
	for ips-outgoing; Tue, 17 Apr 2001 15:02:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HJ1mr10877
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 15:01:48 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id 65D99C6E; Tue, 17 Apr 2001 12:01:47 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id MAA18883;
	Tue, 17 Apr 2001 12:01:09 -0700 (PDT)
Message-ID: <3ADC9450.3DCA0DFE@cup.hp.com>
Date: Tue, 17 Apr 2001 12:06:56 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI Requirements Draft - Informal WG Last Call
References: <C1256A30.0059AD02.00@d12mta02.de.ibm.com> <3ADC56AC.E93E6EB9@research.bell-labs.com>
Content-Type: multipart/mixed;
 boundary="------------5F33514FDBBA8D105F0F50CC"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------5F33514FDBBA8D105F0F50CC
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


Julian,

Ordered delivery can be achieved to better effect using the SAM-2 CRN
based ordering due to the following reasons :

1) CRN provides ordering on a per-lun basis and can be turned on and off
for a subset of I/Os to that LUN. This allows for flexible ordering
since ordering is a function of the I/O type from the application.
Applications that are doing READ only operations (like a search engine)
do not require any ordering. Ordering is required on metadata updates,
any form of synchronization I/Os, WRITEs interspersed with READS, etc.
Thus, an ordering solution should be flexible enough to be applied at
the scope of a subset of I/Os destined to a LUN.

Such an ordering scheme would also allow ordering to be turned on for
only tape applications if disk applications did not require ordering.

iSCSI's ordering solution does not provide this flexibility, whereas
usage of CRN would.

2) Such a fine granularity scope of ordering also minimizes the impact
of error recovery actions taken when loss of order occurs. The impact
with CRN would be a target-initiator handshake based on ACA + some
checkpoints to error back all the pending CRN enabled commands on that
LUN.

Comparing the equivalent error recovery in a CmdSN based iSCSI ordering
solution, the action taken would be to error back all the pending I/Os
destined to the entire session following loss of order. With high end
disk arrays having 1000+ LUN configurations, such error recovery is
extreme, especially when ordering may have been desired by the
application only on a small subset of I/Os to 1 LUN, and loss of ordering
for the remaining 999 LUNs was a don't care.

3) A CRN based ordering scheme works for all underlying SCSI transports
as opposed to CmdSN based ordering.

4) The generation of a stream of commands that expect strong ordering
will need to be accompanied by corresponding generation of a sequence
number at the same layer (CRN would provide such sequencing). Failure
to do so can result in silent loss of order that slips by undetected due
to potential points of failure in the stack between the SCSI ULP and the
physical bus/link (e.g., I/O failures within the HBA driver due to
resource allocation failures or other such conditions can cause loss of
order).

Attempts to enforce ordering at multiple layers of the stack (CRN at the
ULP and CmdSN at the LLP), especially when CmdSN does not provide all
the benefits that CRN would provide, over-engineer the solution to
the ordering problem. They also impact iSCSI performance.
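
As a rough illustration of the recovery scope compared in points 1) and
2) above, here is a toy Python sketch. It is not taken from SAM-2 or the
iSCSI draft: the PendingCmd record and both helper functions are
hypothetical names, and it assumes that loss of order under CRN errors
back only the CRN-enabled commands on the affected LUN, whereas loss of
order under CmdSN errors back every pending command in the session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingCmd:
    lun: int
    tag: int
    crn_enabled: bool  # hypothetical flag: command was issued with a CRN


def failed_back_with_crn(pending, lun):
    """Commands errored back when order is lost on one LUN (CRN model)."""
    return [c for c in pending if c.lun == lun and c.crn_enabled]


def failed_back_with_cmdsn(pending):
    """Commands errored back when order is lost in the session (CmdSN model)."""
    return list(pending)


# 1000 LUNs with two pending commands each; only LUN 7 asked for ordering.
pending = [PendingCmd(lun, tag, crn_enabled=(lun == 7))
           for lun in range(1000) for tag in (0, 1)]

print(len(failed_back_with_crn(pending, lun=7)))  # 2
print(len(failed_back_with_cmdsn(pending)))       # 2000
```

The numbers only restate the 1000-LUN example in the text; real
recovery behaviour depends on the target and on the ACA handshake
mentioned in point 2).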

- Santosh

> 
> julian_satran@il.ibm.com wrote:
> >
> > Ordered delivery of commands to ANY TYPE of devices will increase in
> > importance as network speeds increase and the need to hide latency
> > increases.
> >
> > Today databases don't use queuing and rely on trickling the commands to
> > devices 1 by 1 to ensure atomicity and order.
> > As latency will become the determining factor in performance this is bound
> > to change.
> >
> > SCSI has done an excellent job in defining the queueing mechanism. We have
> > to make it work with good performance in our environment.
> >
> > Julo
> >
> > Santosh Rao <santoshr@cup.hp.com> on 13/04/2001 04:33:45
> >
> > Please respond to Santosh Rao <santoshr@cup.hp.com>
> >
> > To:   ips@ece.cmu.edu
> > cc:   Black_David@emc.com
> > Subject:  Re: iSCSI Requirements Draft - Informal WG Last Call
> >
> > David & All,
> >
> > I object to the following requirement :
> >
> > " MUST support ordered delivery of SCSI commands from the initiator
> >   to the target, to support SCSI Task Queuing. "
> >
> > Ordered delivery is not a requirement for disk based applications and
> > non tagged queueing tape applications, which form the majority of
> > today's data traffic.
> >
> > To impose strict ordering (even in the presence of errors ?) as a MUST
> > is penalizing the majority of today's data traffic that does not expect
> > ordering from the SCSI subsystem.
> >
> > I am particularly concerned about the effect of the above requirement in
> > the presence of errors. Does iSCSI expect strict ordering to be
> > maintained even when individual I/O errors like ULP timeout occur ?
> >
> > On a ULP timeout (caused by, say, a hole in CmdSN), the initiator may
> > choose not to retry the command, but instead, error it back to the ULP.
> > In such a case, it can plug the hole in CmdSN with a NOP-OUT.
> >
> > The above requirement is not feasible to be met under such circumstances
> > and others similar to this. Mandating strict ordering on ULP timeouts
> > implies a session level error recovery on any individual I/O being
> > failed back from iSCSI to SCSI ULP. This is a very heavy hammer to use
> > as error recovery and should not be imposed.
> >
> > The above requirement must be changed to :
> > " SHOULD support ordered delivery of SCSI commands from the initiator
> >   to the target, to support SCSI Task Queuing. "
> >
> > - Santosh
> >
> > Black_David@emc.com wrote:
> > >
> > > It is intended to submit draft-ietf-ips-iscsi-reqmts-02.txt
> > > as an Informational RFC. There is no formal requirement for
> > > a WG Last Call, but if you have any further substantive comments
> > > on the document please raise them on this list within the next
> > > two weeks, i.e. by April 27th at the latest.
> > >
> > > If you have typographical/editorial comments please send them
> > > direct to the document's author, Marjorie Krueger
> > > <marjorie_krueger@hp.com>.
> > >
> > > Thanks,
> > > --David and Elizabeth, IPS WG co-chairs
--------------5F33514FDBBA8D105F0F50CC
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------5F33514FDBBA8D105F0F50CC--



From owner-ips@ece.cmu.edu  Tue Apr 17 18:56:30 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA17639
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 18:56:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HK85t15984
	for ips-outgoing; Tue, 17 Apr 2001 16:08:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HK7fr15928
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 16:07:43 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRHVB>; Tue, 17 Apr 2001 13:07:32 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B17342A@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: ips@ece.cmu.edu
Subject: RE: iSCSI & Linked Commands
Date: Tue, 17 Apr 2001 13:07:25 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

> Mark has already replied to your question on why linked commands won't
> work given today's implementations. In most cases, the initiator task
> tag is either generated in the HBA driver or in the HBA firmware. 

The fact that some adapter implementations on some interconnects may not
support linked commands does not grant a license to delete the capability
from the protocol specification.

Incidentally, in my opinion the problem of linked command support has more
to do with whether or not the adapter has enough intelligence to handle the
linked command semantics and retain its internal task state following the
receipt of status.  Making the tag accessible in some way is required if for
no other reason than to support the ABORT TASK function.

Charles

> -----Original Message-----
> From: Santosh Rao [mailto:santoshr@cup.hp.com]
> Sent: Tuesday, April 17, 2001 11:19 AM
> To: ips@ece.cmu.edu
> Subject: Re: iSCSI & Linked Commands
> 
> 
> Julian,
> 
> Mark has already replied to your question on why linked commands won't
> work given today's implementations. In most cases, the initiator task
> tag is either generated in the HBA driver or in the HBA firmware. 
> 
> Since there is no task tag generated from the SCSI ULP while handing an
> I/O request to the SCSI LLP, making the association across different
> commands within the same task (linked command) becomes impossible.
> 
> Is there any reason why iSCSI would want to apply connection allegiance
> per command and not per task (?). If someone did use linked commands and
> wanted a clean way of aborting the task on some error, an abort task
> sent down the same connection as the rest of the task would ensure
> flushing of stale PDUs. 
> 
> I agree that there is no need to say linked commands are unsupported
> since this is a ULP feature and support or lack thereof is decided at
> the initiator and target ULPs. 
> 
> - Santosh
> 
> 
> julian_satran@il.ibm.com wrote:
> > 
> > Mark,
> > 
> > The argument about the HBAs generating tags is pretty weak as iSCSI
> > will have its own HBAs and iSCSI will generate the tags in any
> > implementation. As for the utility - the sequential and conditional
> > execution of the linked commands is guaranteed regardless of delivery
> > or queuing order.  The only reason they might become obsolete is their
> > inability to hide latency, but I don't see any compelling reason to have
> > them unsupported by iSCSI.
> > 
> > Julo
> > 
> > Mark Mokryn <mark@sangate.com> on 17/04/2001 12:55:05
> > 
> > Please respond to Mark Mokryn <mark@sangate.com>
> > 
> > To:   Julian Satran/Haifa/IBM@IBMIL
> > cc:   ips@ece.cmu.edu
> > Subject:  Re: iSCSI & Linked Commands
> > 
> > Julian,
> > 
> > Santosh is right. Linked commands require an identical I_T_L_x nexus,
> > but many Fibre Channel (and possibly SCSI) adapters generate a queue tag
> > on-board, with no possibility of host software control. On such
> > adapters, the generation of linked commands is impossible, and clearly
> > today's SCSI layers are aware of this.
> > 
> > This raises the entire issue of task management in iSCSI: Linked
> > commands date back to SCSI-2, where they indeed served a purpose.
> > In the SCSI bus protocol, the target controlled all SCSI bus phases
> > (following selection). Thus, in a linked command sequence, the target
> > may drive the command phase immediately following the status phase, thus
> > saving bus cycles (i.e. arbitration, selection, etc.). However, in the
> > serial protocols, I don't see how linked commands are of any use, since
> > there are no bus phases to save. Contrary to popular belief, linked
> > commands offer no atomicity. Even in SCSI bus protocol, a linked command
> > may be disconnected at any time (at the target's discretion), and a new
> > command (from any initiator) may be started. Linked commands have always
> > been optional, and indeed many target implementations today do not
> > support them. For instance, looking at the Shark SCSI reference manual,
> > according to the inquiry data, Shark does not support linked commands.
> > 
> > So, perhaps the wise thing to do is to not support linked commands in
> > iSCSI. It has always been an optional feature for logical units, and
> > today is outdated and often unsupported, both by targets and initiators.
> > 
> > -mark
> > 
> > julian_satran@il.ibm.com wrote:
> > >
> > > Santosh,
> > >
> > > Sorry to interrupt this captivating thread. Why do you think linked
> > > commands won't work?
> > >
> > > Julo
> > >
> > > Santosh Rao <santoshr@cup.hp.com> on 16/04/2001 20:13:04
> > >
> > > Please respond to Santosh Rao <santoshr@cup.hp.com>
> > >
> > > To:   Douglas Otis <dotis@sanlight.net>
> > > cc:   Ips <ips@ece.cmu.edu>
> > > Subject:  Re: iSCSI:flow control, acknowledgement, and a deterministic
> > >       recovery
> > >
> > > Doug,
> > >
> > > You seem to be referring to linked commands as a case wherein the
> > > approach of Abort Task will not flush stale PDUs.
> > >
> > > Linked Commands cannot work the way SCSI implementations are defined
> > > today, since linked commands require the initiator task tag (I_T_L_x
> > > nexus identifier in SAM-2 Execute Command terminology) to be generated
> > > by the SCSI ULP. However, in practice, the Initiator Task Tag (or the FC
> > > OX_ID) is typically generated in the SCSI LLP (or in some cases in the
> > > adapter firmware). IOW, there is no common reference handle like the
> > > task tag sent down from the ULP that allows for association of multiple
> > > commands to a task in several/most implementations today.
> > >
> > > When this is fixed up to get linked commands to work [& there exist
> > > examples of its usage], there is no reason connection allegiance could
> > > not be applied to all the commands within the task.
> > >
> > > I fail to see why you think Abort Task will not work with sequential
> > > devices (?).
> > >
> > > - Santosh
> > >
> > > Douglas Otis wrote:
> > > >
> > > > Santosh,
> > > >
> > > > I see a few problems with this approach.  Tasks as defined in iSCSI do
> > > > not maintain connection allegiance.  The driver binds all SCSI commands
> > > > to their connection for the most recent association.  Although there are
> > > > several places within the iSCSI proposal that make reference to a task
> > > > having a connection allegiance, this is in error.  Commands and not
> > > > tasks carry such allegiance.  Your recovery scheme will not allow a
> > > > satisfactory recovery with a sequential device.  In this case, repeating
> > > > the command is not a solution.  As a result, if one connection falters,
> > > > it will become a difficult situation.  In addition, you have no clue
> > > > from iSCSI as to your delivery status.  You do not know if you are
> > > > waiting for the target or if you are waiting for the connection.  Some
> > > > sequential devices have rather long time-outs with these complications
> > > > of deducing status created by the multiple connections.
> > > >
> > > > The application will not know about these connection allegiance
> > > > problems.  The iSCSI layer does not define interaction to provide
> > > > additional application status to allow these applications to respond in
> > > > a manner that may aid this situation, nor should such additional
> > > > information be required.  With your scheme the SCSI driver must examine
> > > > the content of these commands to make a guess as to the connection
> > > > allegiance assignments.  Now the driver is expected to understand what
> > > > the intended action is of this SCSI management command.  What signal is
> > > > used to indicate a need for the iSCSI immediate treatment?  The only
> > > > obvious one seems to be the task attribute argument.  With the way iSCSI
> > > > has defined iSCSI immediate, I would expect those commands to be treated
> > > > in a LIFO rather than the normal FIFO fashion.
> > > >
> > > > Doug
> > > >
> > > > > Douglas Otis wrote:
> > > > > >
> > > > > > With multiple connections, if you are not going to use a valid
> > > > > > CmdSN, or in your case a null CmdSN for all commands, then there
> > > > > > would be a need to include a timestamp to meet a timely delivery
> > > > > > requirement in the same manner as used in FC encapsulation.  IP
> > > > > > can deliver over any time period.  A command could arrive at any
> > > > > > time with respect to other connections.  With all of your feedback
> > > > > > now from just the SCSI layer, the SCSI layer is likely to have timed
> > > > > > out and restarted and now stray commands finally make an appearance
> > > > > > (the technician re-inserted the cable).  What did that do?  Yes,
> > > > > > if this were on a single connection, then TCP could provide some
> > > > > > assurances (ignoring digest errors), but you must not make that
> > > > > > assumption nor can you assume all disruptions are symmetric.
> > > > >
> > > > > Doug,
> > > > >
> > > > > The below snippet from my last mail answered your above concern. The
> > > > > Abort Task is sent on the same connection as the command (connection
> > > > > allegiance applied to the abort task as well). The Abort task pushes
> > > > > the stale data PDUs. There is no need for a timestamp on iSCSI PDUs.
> > > > >
> > > > > > > As for your second concern regarding I/O timeouts, there is no
> > > > > > > need for any timestamp. An I/O timeout is dealt with by an Abort
> > > > > > > Task. The abort task response guarantees that the abort reached
> > > > > > > the target and pushed all intermediate stale frames. Failure to
> > > > > > > complete Abort Task leads to higher level error recovery (ex :
> > > > > > > Logout, or some higher form of task mgmt).
> > > > >
> > > > > - Santosh


From owner-ips@ece.cmu.edu  Tue Apr 17 20:50:52 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA19113
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 20:50:51 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HMV9D26789
	for ips-outgoing; Tue, 17 Apr 2001 18:31:09 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HMUZr26713
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 18:30:36 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP id 0ADCDBBB
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 15:30:35 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id PAA08308
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 15:30:30 -0700 (PDT)
Message-ID: <3ADCC565.ECE6D3A8@cup.hp.com>
Date: Tue, 17 Apr 2001 15:36:21 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: IPS Reflector <ips@ece.cmu.edu>
Subject: Re: iSCSI : More problems with Status SNACK !
References: <C1256A30.0059AD0E.00@d12mta05.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------74E59DCB0B1C420551E5FD17"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------74E59DCB0B1C420551E5FD17
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

julian_satran@il.ibm.com wrote:
> 
> 2) SNACK mechanism cannot be relied upon for resource cleanup for the
> following reasons :
> 
> a) SNACK support MUST be mandatory at the target and target can NEVER
> fail a Status SNACK.
> b) Initiators MUST always use a Status SNACK and this is not possible on
> a ULP timeout. IOW, there exist I/O timeout and other circumstances when
> the initiator gives up and does not attempt SACK (suppose SACK itself
> got a digest error at the target and timed out at the initiator !).
> 
> Since the current SNACK model is heavily dependent on the above
> assumptions [which cannot be met], failure of SNACK blocks further
> forward progress with resource cleanup at the target since all further
> I/O completions beyond the hole StatSN cannot be acknowledged.
> 
> In the worst case, any I/O timeout would imply session level error
> recovery since the target will no longer be able to reclaim resources.
> 
> +++ any UL timeout must include an abort for the task to clean up the
> target++

Julian,

The Abort Task sent by initiator on the ULP timeouts cleans up resources
for that specific task. 

The issue under debate was that the spec does not have a mechanism by
which, once a hole is created [which cannot be filled by the Status
SNACK], the initiator can switch back to bulk acknowledgements, i.e.,
while the timed out I/O resources may be released thru the Abort Task,
the remaining tasks completed thereafter are unable to be acknowledged
by the initiator.
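
To make the hole problem concrete, here is a minimal Python sketch of a
purely cumulative acknowledgement. It is an illustration only: the
function names are made up, sequence-number wraparound is ignored, and
it assumes that ExpStatSN acknowledges every StatSN below it and nothing
above it.

```python
def exp_stat_sn(received, start):
    """First StatSN at or after `start` that the initiator has not received."""
    n = start
    while n in received:
        n += 1
    return n


def retained_by_target(sent, exp):
    """Statuses the target must keep: everything not cumulatively acknowledged."""
    return sorted(sn for sn in sent if sn >= exp)


sent = set(range(1, 11))   # target sent StatSN 1..10
received = sent - {5}      # status 5 was lost and its task timed out

exp = exp_stat_sn(received, start=1)
print(exp)                            # 5
print(retained_by_target(sent, exp))  # [5, 6, 7, 8, 9, 10]
```

Even though statuses 6 through 10 arrived, ExpStatSN stays at 5, so the
target cannot release anything at or beyond the hole.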

> 
> Proposal :
> ==========
> 1) Negotiate Status SACK support at login time.
> 2) Do not use StatSN when Status SACK is not supported.
> 3) Modify the current SNACK PDU to eliminate "Additional run Length"
> (which is of no practical use currently) and replace with an explicit
> positive ack run described by ack_begrun and ack_run_length.
> 
> Comments ?
> +++ I am basically against options - If I can avoid them.
> I don't see how an optional SNACK and StatSN would simplify a
> target/initiator
> while still allowing command recovery without popping errors into SCSI+++

If a target does not support Status SNACK, then, such a target is
effectively releasing its I/O resources upon completion. This implies
that the target is neither capable of supporting SNACK nor the "retry"
(or replay) concept.

In such cases, command recovery may occur at the SCSI layer or iSCSI may
retry the task at its layer. For such simple implementations that don't
resort to complex status recovery techniques, StatSN has no value add
and only creates complexity by potential holes.

Of course, any such implementation may typically ignore ExpStatSN and
continue to release resources as the I/Os complete with a StatSN being
initialized only for compliance. Rather than have such a behaviour, the
spec should allow for these implementations by letting them use StatSN
of 0 as a don't care.

- Santosh
--------------74E59DCB0B1C420551E5FD17--



From owner-ips@ece.cmu.edu  Tue Apr 17 20:51:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA19136
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 20:51:23 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HN58Z29338
	for ips-outgoing; Tue, 17 Apr 2001 19:05:08 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HN4Vr29309
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 19:04:31 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id AAB90BBF; Tue, 17 Apr 2001 16:04:30 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id QAA12444;
	Tue, 17 Apr 2001 16:04:25 -0700 (PDT)
Message-ID: <3ADCCD59.E192EFCC@cup.hp.com>
Date: Tue, 17 Apr 2001 16:10:17 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Sandeep Joshi <sandeepj@research.bell-labs.com>
Cc: ips@ece.cmu.edu
Subject: Re: aborting an out of sequence cmdSN
References: <0F31E5C394DAD311B60C00E029101A0708015447@corpmx9.isus.emc.com> <3ADC4FDA.1010ACC2@research.bell-labs.com>
Content-Type: multipart/mixed;
 boundary="------------09D4CBA14595D9CB62F7DE30"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------09D4CBA14595D9CB62F7DE30
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

I'd think option 1 is the simplest (with the caveat that the task mgmt
PDU referred to is the Abort Task) and only impacts the affected
command/task.

Pierre Labat and I asked for this 4 months ago (see
http://ips.pdl.cs.cmu.edu/mail/msg02958.html). The concept of connection
allegiance should be extended to include the Abort Task. Also,
connection allegiance should apply to the task (which spans multiple
commands in the case of linked commands), allowing for a deterministic
clean up of stale PDUs of the task through the use of Abort Task.
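
The flushing argument can be sketched in a few lines of Python. This is
a toy model under stated assumptions, not draft behaviour: each
connection is a FIFO (as TCP delivers in order), every non-abort PDU in
the model belongs to the aborted task, and the target happens to service
connection B completely before connection A. The function name and PDU
labels are hypothetical.

```python
from collections import deque


def stale_after_abort(connections, drain_order):
    """Deliver PDUs connection by connection; return task PDUs seen after the abort."""
    aborted = False
    stale = []
    for conn_id in drain_order:
        queue = connections[conn_id]
        while queue:
            pdu = queue.popleft()
            if pdu == "ABORT_TASK":
                aborted = True
            elif aborted:
                stale.append(pdu)
    return stale


# Abort Task sent on the same connection as the task's PDUs.
same = {"A": deque(["CMD", "DATA-1", "DATA-2", "ABORT_TASK"]), "B": deque()}
# Abort Task sent on a different connection.
other = {"A": deque(["CMD", "DATA-1", "DATA-2"]), "B": deque(["ABORT_TASK"])}

print(stale_after_abort(same, ["B", "A"]))   # []
print(stale_after_abort(other, ["B", "A"]))  # ['CMD', 'DATA-1', 'DATA-2']
```

With allegiance, the abort can never overtake the task's own PDUs, so
nothing stale arrives afterwards; without it, the outcome depends on how
the connections interleave.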

- Santosh

Sandeep Joshi wrote:

> 
> So our options for abort_task boil down to..
> (1) use connection allegiance for TASK MGMT PDU.
> (2) reject all commands prior to cmdSN of TASK MGMT PDU.
> (3) cmdSN of original task is sent with TASK MGMT PDU and
>     target at the iSCSI layer keeps state.
> (4) iSCSI initiator retains state for deleted tasks to ensure
>     that R2T/Scsi Responses are appropriately handled.
--------------09D4CBA14595D9CB62F7DE30--



From owner-ips@ece.cmu.edu  Tue Apr 17 22:05:50 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA20691
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 22:05:45 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3I0EBR03689
	for ips-outgoing; Tue, 17 Apr 2001 20:14:11 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3I0Dfr03670
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 20:13:41 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id 82B25BE8; Tue, 17 Apr 2001 17:13:40 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id RAA17624;
	Tue, 17 Apr 2001 17:13:35 -0700 (PDT)
Message-ID: <3ADCDD8F.C510C763@cup.hp.com>
Date: Tue, 17 Apr 2001 17:19:27 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: IPS Reflector <ips@ece.cmu.edu>
Cc: T10 Reflector <t10@t10.org>
Subject: iSCSI : login keys & mode page settings
Content-Type: multipart/mixed;
 boundary="------------CC6594F50963FEA1121210B0"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------CC6594F50963FEA1121210B0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

All/Julian,

The iSCSI draft is lacking sufficient description on the subject of mode
page settings specific to iSCSI, their corresponding iSCSI login keys
and the interactions between these 2 mechanisms. Specific comments
enclosed below in that regard :


1) The iSCSI draft needs to describe the layout of the protocol specific
mode pages, namely, disconnect-reconnect mode page, protocol specific
lun page and protocol specific port page as applicable to iSCSI. Such a
figurative and textual description should be along the lines of that in
FCP-2 Section 10.


2) Specifically, the iSCSI draft lacks the description of the layout of
the protocol specific lun page and, in its absence, then describes a
field from this page called EnableCmdRN. This field is non-existent in
the SPC-2 description of this page in Section 8.3.10.


3) On a side note, the EnableCmdRN & CmdRN fields should be re-named to
EnableCRN and CRN to reflect the same semantics and context as the CRN
defined in SAM-2 and FCP-2.


4) The EnableCmdRN login key should be removed from the list of iSCSI
login keys as this is a per-LUN key and iSCSI login keys have the scope
of a session. IOW, EnableCmdRN should be negotiated through a mode
select only and not through iSCSI login.


5) On a more fundamental note, should iSCSI allow for 2 levels of
control of I-T[-L] nexus operational parameters thru both the mode
select/sense scsi mechanisms and iSCSI login key mechanisms ?

Ex :
---------------------------------------------------------------------
iSCSI login key         SCSI mode page parameter
---------------------------------------------------------------------
DataPDULength           Max Burst Size (disconnect-reconnect mode page)
FirstBurstSize          First Burst Size (disconnect-reconnect mode page)
InDataOrder             EMDP (disconnect-reconnect mode page)
EnableCmdRN             Enable CRN (LUN control mode page)

If such control is to be allowed at both the SCSI ULP and iSCSI
transport layers, a communication mechanism should be defined to
synchronize the state of these operational parameters across the 2
layers when a change is made in either layer through its corresponding
mechanisms.

Ex : 
a) Change thru iSCSI login key should result in an up call to update the
SCSI ULP.
b) Change made thru mode select should result in a down call to update
iSCSI LLP.
c) Change thru iSCSI login key should result in an up call to SCSI ULP
to cause a UNIT ATTENTION indicating "Mode Parameters Changed".
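
A minimal Python sketch of the up call / down call idea in a) to c)
follows. Everything in it is hypothetical: the NexusParameters class and
its methods are illustrative names rather than an interface from the
iSCSI draft or SPC-2, the values are arbitrary, and any unit conversion
between a login key and its mode page field is ignored.

```python
# Login keys and the mode page fields they shadow, as listed in the table above.
LOGIN_KEY_TO_MODE_FIELD = {
    "DataPDULength": "Max Burst Size",
    "FirstBurstSize": "First Burst Size",
    "InDataOrder": "EMDP",
    "EnableCmdRN": "Enable CRN",
}
MODE_FIELD_TO_LOGIN_KEY = {v: k for k, v in LOGIN_KEY_TO_MODE_FIELD.items()}


class NexusParameters:
    """Hypothetical holder for the parameters of one nexus."""

    def __init__(self):
        self.login_keys = {}       # iSCSI LLP view
        self.mode_fields = {}      # SCSI ULP view
        self.unit_attentions = []  # conditions queued for the initiator

    def set_login_key(self, key, value):
        """Cases a) and c): login change, up call, then UNIT ATTENTION."""
        self.login_keys[key] = value
        self.mode_fields[LOGIN_KEY_TO_MODE_FIELD[key]] = value
        self.unit_attentions.append("MODE PARAMETERS CHANGED")

    def mode_select(self, field, value):
        """Case b): mode select change, down call to the iSCSI LLP."""
        self.mode_fields[field] = value
        self.login_keys[MODE_FIELD_TO_LOGIN_KEY[field]] = value


nexus = NexusParameters()
nexus.set_login_key("FirstBurstSize", 64)
nexus.mode_select("Max Burst Size", 256)
print(nexus.login_keys)
print(nexus.mode_fields)
print(nexus.unit_attentions)
```

After either call both views agree, which is the synchronization
property the text asks for; how the calls would cross the ULP/LLP
boundary in a real stack is left open.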


6) If such a level of dual control is provided, the iSCSI login
keys listed above should be made LO (leading only) to allow for changes
to operational parameters only during session login. This is to
minimize/eliminate disruption of ongoing I/O activity that occurs due to
the generation of a UNIT ATTENTION CHECK CONDITION when any change is
made to the above parameters.


7) If only 1 mechanism of control is desired, which of the following
alternatives is desirable :
i) Only settable thru mode select and seen thru mode sense 
Pros :
------
- Allows 1 mechanism of control.
- Removes the need for synchronization of these values across SCSI ULP &
iSCSI.

Cons :
------
- Requires setting for all LUNs to enable for the entire session.


ii) Settable thru iSCSI and also settable/viewable thru mode sense.
Pros :
------
- flexible and allows control thru both scsi & iSCSI.
Cons :
------
- Can lead to synchronization overheads. 
- Needs SP(save page) setting also to be communicated in synching iSCSI
login to mode page values.


iii) Only settable thru iSCSI and viewable thru mode sense.
Pros:
-----
- Single mechanism of modification avoiding synchronization issues in
setting.
Cons :
------
- Denies traditional mechanism of modification (mode select).
- May break existing applications if enforced thru SPC-2.
- Requires changes to SPC-2.
- iSCSI compliance requires changes in the SCSI ULP so that it no longer
uses mode select for parameter changes that are shared with iSCSI.


8) If these operational parameters are allowed to be set thru iSCSI
login and they also impact mode page settings, the iSCSI spec should
describe the scope of the mode page setting in terms of whether this
setting is a saved page setting or not.

9) Should saved page settings be allowed thru iSCSI ?


- Santosh
--------------CC6594F50963FEA1121210B0--



From owner-ips@ece.cmu.edu  Tue Apr 17 23:33:46 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA22625
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 23:33:39 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HMt8X28604
	for ips-outgoing; Tue, 17 Apr 2001 18:55:08 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HMsBr28532
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 18:54:11 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTR2AM>; Tue, 17 Apr 2001 15:54:05 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B17342E@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: ips@ece.cmu.edu
Cc: Charles Monia <cmonia@NishanSystems.com>
Subject: RE: iSCSI & Linked Commands
Date: Tue, 17 Apr 2001 15:53:57 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi Santosh:

See below.

> -----Original Message-----
> From: Santosh Rao [mailto:santoshr@cup.hp.com]
> Sent: Tuesday, April 17, 2001 3:43 PM
> To: Charles Monia
> Cc: ips@ece.cmu.edu
> Subject: Re: iSCSI & Linked Commands
> 
> 
> Charles Monia wrote:
>  
> > > Mark has already replied to your question on why linked commands won't
> > > work given today's implementations. In most cases, the initiator task
> > > tag is either generated in the HBA driver or in the HBA firmware.
> > 
> > The fact that some adapter implementations on some interconnects may not
> > support linked commands does not grant a license to delete the capability
> > from the protocol specification.
> 
> Charles,
> 
> I wasn't saying that iSCSI must not support linked commands. See my
> comment on this :
> 

Point taken, sorry for the misunderstanding on my part.

<stuff deleted>

> The more interesting question is :
> Is there any particular reason connection allegiance would rather be
> enforced per command than per task ? (if linked commands are un-used,
> this is a moot question....). If it were enforced per task, then, an
> Abort Task would effectively flush all stale PDUs of that task ensuring
> no stale PDUs in the network upon I/O termination at the initiator.
> 

I don't see how the stale PDU scenario comes about. As Julo stated earlier,
the initiator doesn't send the next command in the chain until it receives
status for the previous one.  So, at any time, there should only be PDUs in
flight for one command in a linked command sequence.

In that respect, I believe connection allegiance per command gives a
marginally better opportunity for load balancing, since the initiator can
launch the next command in the chain on the most appropriate TCP connection.
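
As a rough illustration of per-command allegiance, the sketch below picks a
connection for the next command in a chain. It assumes that "most
appropriate" means "fewest outstanding PDUs", which is only one possible
policy. The function name `pick_connection` and the dictionary layout are
hypothetical and not taken from any iSCSI draft.

```python
from typing import Dict


def pick_connection(outstanding: Dict[int, int]) -> int:
    """Return the connection ID with the fewest outstanding PDUs.

    `outstanding` maps a (hypothetical) connection ID to the number of
    PDUs currently in flight on it. Ties go to the lowest connection ID.
    """
    if not outstanding:
        raise ValueError("no connections available")
    return min(outstanding, key=lambda cid: (outstanding[cid], cid))


# Each command in a linked sequence is sent only after status for the
# previous one arrives, so the choice can be made afresh per command.
next_conn = pick_connection({1: 4, 2: 1, 3: 1})
```

With allegiance per task, the same choice would have to be made once, when
the first command of the chain is launched.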

Charles


From owner-ips@ece.cmu.edu  Tue Apr 17 23:38:56 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA22658
	for <ips-archive@odin.ietf.org>; Tue, 17 Apr 2001 23:38:54 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HMc6o27291
	for ips-outgoing; Tue, 17 Apr 2001 18:38:06 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HMbJr27222
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 18:37:19 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id AC9D28AE; Tue, 17 Apr 2001 15:37:18 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id PAA09025;
	Tue, 17 Apr 2001 15:37:10 -0700 (PDT)
Message-ID: <3ADCC6F4.72C4EF2F@cup.hp.com>
Date: Tue, 17 Apr 2001 15:43:00 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Charles Monia <cmonia@NishanSystems.com>
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI & Linked Commands
References: <B300BD9620BCD411A366009027C21D9B17342A@ariel.nishansystems.com>
Content-Type: multipart/mixed;
 boundary="------------D9215E28DAB4630CAD34C91A"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------D9215E28DAB4630CAD34C91A
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Charles Monia wrote:
 
> > Mark has already replied to your question on why linked commands won't
> > work given today's implementations. In most cases, the initiator task
> > tag is either generated in the HBA driver or in the HBA firmware.
> 
> The fact that some adapter implementations on some interconnects may not
> support linked commands does not grant a license to delete the capability
> from the protocol specification.

Charles,

I wasn't saying that iSCSI must not support linked commands. See my
comment on this :

> I agree that there is no need to say linked commands are unsupported
> since this is a ULP feature and support or lack thereof is decided at
> the initiator and target ULPs.  

What I did say was that several/most implementations perform task tag
initialization at the LLP or further below, rendering this feature
unusable in several cases.

The more interesting question is :
Is there any particular reason connection allegiance would rather be
enforced per command than per task ? (if linked commands are un-used,
this is a moot question....). If it were enforced per task, then, an
Abort Task would effectively flush all stale PDUs of that task ensuring
no stale PDUs in the network upon I/O termination at the initiator.

- Santosh
--------------D9215E28DAB4630CAD34C91A
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------D9215E28DAB4630CAD34C91A--



From owner-ips@ece.cmu.edu  Wed Apr 18 00:35:44 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA23130
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 00:35:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HME5R25502
	for ips-outgoing; Tue, 17 Apr 2001 18:14:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HMDor25484
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 18:13:50 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id SAA83552;
	Tue, 17 Apr 2001 18:06:24 -0400
Received: from f6n94e (d03nm102h.boulder.ibm.com [9.99.140.94])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id QAA76968;
	Tue, 17 Apr 2001 16:13:46 -0600
Importance: Normal
Subject: Re: iSCSI & Linked Commands
To: Mark Mokryn <mark@sangate.com>
Cc: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.5  September 22, 2000
Message-ID: <OF56C1EFD7.8D35541E-ON87256A31.0079AE36@LocalDomain>
From: "Kenneth Hallam" <khallam@us.ibm.com>
Date: Tue, 17 Apr 2001 15:13:46 -0700
X-MIMETrack: Serialize by Router on D03NM102/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/17/2001 04:13:46 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Mark,

The Unisys Clearpath IX series of processors (these are mainframe-class
machines) makes use of SCSI commands with the Link option set for most I/O
operations. Thus the AS/400 is not alone in its use of this command
feature.

Regards,

Ken Hallam
Technical Support Marketing Leader
IBM Storage Systems Group            Voice T/L 321-5689 or 520-799-5689
Tucson, AZ 85744                     Fax   T/L 321-2097 or 520-799-2097
khallam@us.ibm.com



From owner-ips@ece.cmu.edu  Wed Apr 18 00:36:41 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA23143
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 00:36:39 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HLn5h23805
	for ips-outgoing; Tue, 17 Apr 2001 17:49:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c017.sfo.cp.net (c017-h020.c017.sfo.cp.net [209.228.12.234])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3HLm3r23754
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 17:48:03 -0400 (EDT)
Received: (cpmta 9062 invoked from network); 17 Apr 2001 14:47:51 -0700
Received: from ras4-p29.rvt.netvision.net.il (HELO sangate.com) (62.0.182.158)
  by smtp.sangate.com (209.228.12.234) with SMTP; 17 Apr 2001 14:47:51 -0700
X-Sent: 17 Apr 2001 21:47:51 GMT
Message-ID: <3ADCBB27.ADAFFA71@sangate.com>
Date: Tue, 17 Apr 2001 23:52:39 +0200
From: Mark Mokryn <mark@sangate.com>
X-Mailer: Mozilla 4.75 [en] (Win95; U)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI & Linked Commands
References: <C1256A31.004D404B.00@d12mta02.de.ibm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,

The HBA issue just underscores the fact that linked commands are simply
unused today. You can get sequential, conditional execution through
ordered commands. A precise emulation of the linked command sequence is
to send the second ordered command only after receiving status on the
first (we can even argue whether the ordered tags are necessary here or
anywhere else, but this is another issue). Thus, it appears to me that
linked commands are a redundant mechanism. The only implementation I
know of is on the AS/400 (if someone knows of others I'd like to know),
where they're used for skip reads and writes (i.e. LBA mask is sent with
first CDB, read/write with second CDB). But with all due respect to the
Rochester folks, if this is the chief implementation (plus these are
non-standard commands)... The reason I suggested obsoleting them is
that they're redundant, unused, and their benefit seems to be highly
questionable. Thus, linked commands (along with some other stuff) just
clutter the standards.
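
The emulation described above (send the next command only after status for
the previous one has arrived) can be sketched as follows. This is a minimal
illustration under assumed names: `run_chain` and the `send_and_wait`
callable are hypothetical stand-ins for whatever the initiator stack
provides, and the sketch says nothing about task attributes or atomicity.

```python
from typing import Callable, List, Sequence, Tuple

GOOD = 0x00             # SCSI status GOOD
CHECK_CONDITION = 0x02  # SCSI status CHECK CONDITION


def run_chain(cdbs: Sequence[bytes],
              send_and_wait: Callable[[bytes], int]) -> Tuple[int, List[int]]:
    """Issue each CDB only after status for the previous one has arrived.

    `send_and_wait` is a hypothetical blocking call that sends one command
    and returns its SCSI status byte. The chain stops at the first status
    other than GOOD, which gives the conditional execution of a linked
    sequence. Returns (commands completed with GOOD, statuses seen).
    """
    statuses: List[int] = []
    completed = 0
    for cdb in cdbs:
        status = send_and_wait(cdb)
        statuses.append(status)
        if status != GOOD:
            break  # later commands in the chain are never sent
        completed += 1
    return completed, statuses
```

As with real linked commands, nothing here prevents another task from being
executed between two commands of the chain.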

I believe linked commands were an attempt (and a weak one at that) to
mimic IBM channel command chaining. It's a poor effort, since another
task may be inserted into the task set (and possibly executed) at any
point between chain start and end. Mainframe command chaining works 
since the device (or at least device extent, with PAV) is locked once
the chain begins.

Now if you really want atomicity through chaining, then the way to go is
as you suggested earlier - to prefetch the commands. This is the way
it's done in Ficon - the entire chain is prefetched and then executed,
and only the final status counts. If recovery is needed, it is done at
the chain level.

-mark

julian_satran@il.ibm.com wrote:
> 
> Mark,
> 
> The argument about the HBAs generating tags is pretty weak as iSCSI will
> have its own HBAs and iSCSI will generate the tags in any implementation.
> As for the utility - the sequential and conditional execution of the linked
> commands is guaranteed regardless of delivery or queuing order.  The only
> reason they might get obsolete is their inability to hide latency but I
> don't see any compelling reason to have them unsupported by iSCSI.
> 
> Julo
> 
> Mark Mokryn <mark@sangate.com> on 17/04/2001 12:55:05
> 
> Please respond to Mark Mokryn <mark@sangate.com>
> 
> To:   Julian Satran/Haifa/IBM@IBMIL
> cc:   ips@ece.cmu.edu
> Subject:  Re: iSCSI & Linked Commands
> 
> Julian,
> 
> Santosh is right. Linked commands require an identical I_T_L_x nexus,
> but many Fibre Channel (and possibly SCSI) adapters generate a queue tag
> on-board, with no possibility of host software control. On such
> adapters, the generation of linked commands is impossible, and clearly
> today's SCSI layers are aware of this.
> 
> This raises the entire issue of task management in iSCSI: Linked
> commands are dated back to SCSI-2, where they indeed served a purpose.
> In the SCSI bus protocol, the target controlled all SCSI bus phases
> (following selection). Thus, in a linked command sequence, the target
> may drive the command phase immediately following the status phase, thus
> saving bus cycles (i.e. arbitration, selection, etc.). However, in the
> serial protocols, I don't see how linked commands are of any use, since
> there are no bus phases to save. Contrary to popular belief, linked
> commands offer no atomicity. Even in SCSI bus protocol, a linked command
> may be disconnected at any time (at the target's discretion), and a new
> command (from any initiator) may be started. Linked commands have always
> been optional, and indeed many target implementations today do not
> support them. For instance, looking at the Shark SCSI reference manual,
> according to the inquiry data, Shark does not support linked commands.
> 
> So, perhaps the wise thing to do is to not support linked commands in
> iSCSI. It has always been an optional feature for logical units, and
> today is outdated and often unsupported, both by targets and initiators.
> 
> -mark
> 
> julian_satran@il.ibm.com wrote:
> >
> > Santosh,
> >
> > Sorry to interrupt this captivating thread. Why do you think linked
> > commands won't work?
> >
> > Julo
> >
> > Santosh Rao <santoshr@cup.hp.com> on 16/04/2001 20:13:04
> >
> > Please respond to Santosh Rao <santoshr@cup.hp.com>
> >
> > To:   Douglas Otis <dotis@sanlight.net>
> > cc:   Ips <ips@ece.cmu.edu>
> > Subject:  Re: iSCSI:flow control, acknowledgement, and a deterministic
> >       recovery
> >
> > Doug,
> >
> > You seem to be referring to linked commands as a case wherein the
> > approach of Abort Task will not flush stale PDUs.
> >
> > Linked Commands cannot work the way SCSI implementations are defined
> > today, since linked commands require the initiator task tag (I_T_L_x
> > nexus identifier in SAM-2 Execute Command terminology) to be generated
> > by the SCSI ULP. However, in practice, the Initiator Task Tag (or the FC
> > OX_ID) is typically generated in the SCSI LLP (or in some cases in the
> > adapter firmware). IOW, there is no common reference handle like the
> > task tag sent down from the ULP that allows for association of multiple
> > commands to a task in several/most implementations today.
> >
> > When this is fixed up to get linked commands to work [& there exist
> > examples of its usage], there is no reason connection allegiance could
> > not be applied to all the commands within the task.
> >
> > I fail to see why you think Abort Task will not work with sequential
> > devices (?).
> >
> > - Santosh
> >
> > Douglas Otis wrote:
> > >
> > > Santosh,
> > >
> > > I see a few problems with this approach. Tasks as defined in iSCSI do
> > > not maintain connection allegiance. The driver binds all SCSI commands
> > > to their connection for the most recent association. Although there are
> > > several places within the iSCSI proposal that make reference to a task
> > > having a connection allegiance, this is in error. Commands and not tasks
> > > carry such allegiance. Your recovery scheme will not allow a
> > > satisfactory recovery with a sequential device. In this case, repeating
> > > the command is not a solution. As a result, if one connection falters,
> > > it will become a difficult situation. In addition, you have no clue from
> > > iSCSI as to your delivery status. You do not know if you are waiting for
> > > the target or if you are waiting for the connection. Some sequential
> > > devices have rather long time-outs with these complications of deducing
> > > status created by the multiple connections.
> > >
> > > The application will not know about these connection allegiance
> > > problems. The iSCSI layer does not define interaction to provide
> > > additional application status to allow these applications to respond in
> > > a manner that may aid this situation nor should such additional
> > > information be required. With your scheme the SCSI driver must examine
> > > the content of these commands to make a guess as to the connection
> > > allegiance assignments. Now the driver is expected to understand what
> > > the intended action is of this SCSI management command. What signal is
> > > used to indicate a need for the iSCSI immediate treatment? The only
> > > obvious one seems to be the task attribute argument. With the way iSCSI
> > > has defined iSCSI immediate, I would expect those commands to be treated
> > > in a LIFO rather than the normal FIFO fashion.
> > >
> > > Doug
> > >
> > > > Douglas Otis wrote:
> > > > >
> > > > > With multiple connections, if you are not going to use a valid
> > > > > CmdSN, or in your case a null CmdSN for all commands, then there
> > > > > would be a need to include a timestamp to meet a timely delivery
> > > > > requirement in the same manner as used in FC encapsulation.  IP
> > > > > can deliver over any time period.  A command could arrive at any
> > > > > time with respect to other connections.  With all of your feedback
> > > > > > now from just the SCSI layer, the SCSI layer is likely to have timed
> > > > > > out and restarted and now stray commands finally make an appearance
> > > > > (the technician re-inserted the cable).  What did that do?  Yes,
> > > > > if this were on a single connection, then TCP could provide some
> > > > > assurances, (ignoring digests errors) but you must not make that
> > > > > assumption nor can you assume all disruptions are symmetric.
> > > >
> > > > Doug,
> > > >
> > > > The below snippet from my last mail answered your above concern. The
> > > > Abort Task is sent on the same connection as the command. (connection
> > > > allegiance applied to the abort task as well). The Abort task pushes the
> > > > stale data PDUs. There is no need for a timestamp on iSCSI PDUs.
> > > >
> > > > > > > As for your second concern regarding I/O timeouts, there is no
> > > > > > > need for any timestamp. An I/O timeout is dealt with by an Abort
> > > > > > > Task. The abort task response guarantees that the abort reached
> > > > > > > the target and pushed all intermediate stale frames. Failure to
> > > > > > > complete Abort Task leads to higher level error recovery (ex :
> > > > > > > Logout, or some higher form of task mgmt).
> > > >
> > > > - Santosh
> >  - santoshr.vcf


From owner-ips@ece.cmu.edu  Wed Apr 18 00:39:18 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA23144
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 00:36:39 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HLt5W24201
	for ips-outgoing; Tue, 17 Apr 2001 17:55:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HLsAr24153
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 17:54:15 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3HMvt118478;
	Tue, 17 Apr 2001 15:58:21 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Santosh Rao" <santoshr@cup.hp.com>, <julian_satran@il.ibm.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: iSCSI Requirements Draft - Informal WG Last Call
Date: Tue, 17 Apr 2001 14:48:03 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJKEGOCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <3ADC9450.3DCA0DFE@cup.hp.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Santosh,

CRN does not resolve possibilities of trapped commands beyond the ULP
timeouts.  To resolve the trapped command problem, each command MUST pass
through the sequencer in order.  Stray commands are then identified by a
serialization beyond the current sequence point.  Any alternative scheme
would be complex.  This problem is not related to SCSI, it is an artifact of
IP networks with the potential for greatly delayed delivery.  The example
that I used was a technician temporarily unplugging the cable.

For security reasons, a stray command of this kind should not be rejected
silently.  The iSCSI proposal should be changed to indicate that commands
serialized outside the current window will be reported.  The paragraph on
page 11 should be changed with an error code assigned to indicate the
direction of the error, before or beyond the command window.

Ver 5-92 Pg 11
  "The target MUST NOT transmit a MaxCmdSN that is more than 2**31 - 1
   above the last ExpCmdSN.  For non-immediate commands, the CmdSN field
   can take any value from ExpCmdSN to MaxCmdSN. For immediate commands,
   the CmdSN field can take any value from ExpCmdSN to MaxCmdSN+1. The
   target MUST silently ignore any command outside this range or
   duplicates within the range that have not been flagged with the retry
   bit (the X bit in the opcode)."
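
The window test in the quoted paragraph, extended to report the direction of
the error as suggested above, might be sketched like this. It assumes
32-bit wraparound comparison in the style of serial number arithmetic; the
function names and the three result strings are hypothetical, and duplicate
detection and the retry (X) bit are deliberately left out.

```python
MOD = 1 << 32   # CmdSN values are assumed to wrap modulo 2**32
HALF = 1 << 31


def serial_diff(a: int, b: int) -> int:
    """Signed distance from b to a modulo 2**32, in [-2**31, 2**31)."""
    d = (a - b) % MOD
    return d - MOD if d >= HALF else d


def classify_cmdsn(cmd_sn: int, exp_cmd_sn: int, max_cmd_sn: int,
                   immediate: bool = False) -> str:
    """Classify a CmdSN against the window [ExpCmdSN, MaxCmdSN].

    For immediate commands the upper bound is MaxCmdSN+1, as in the
    quoted text. Returns 'in-window', 'before-window' or 'beyond-window'.
    """
    upper = (max_cmd_sn + (1 if immediate else 0)) % MOD
    if serial_diff(cmd_sn, exp_cmd_sn) < 0:
        return "before-window"
    if serial_diff(cmd_sn, upper) > 0:
        return "beyond-window"
    return "in-window"
```

A target could map the two out-of-window results to distinct error codes
instead of dropping the command silently.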

If iSCSI were just a single connection protocol, you would have some basis
for your complaint.  A topic still not addressed for the "immediate"
command designation is the limit to place on the number of successive
"immediate" commands to ensure enough resources are held in reserve.

Doug


> Julian,
>
> Ordered delivery can be achieved to better effect using the SAM-2 CRN
> based ordering due to the following reasons :
>
> 1) CRN provides ordering on a per-lun basis and can be turned on and off
> for a subset of I/Os to that LUN. This allows for flexible ordering
> since ordering is a function of the I/O type from the application.
> Applications that are doing READ only operations (like a search engine)
> do not require any ordering. Ordering is required on metadata updates,
> any form of synchronization I/Os, WRITEs interspersed with READS, etc.
> Thus, an ordering solution should be flexible enough to be applied at
> the scope of a subset of I/Os destined to a LUN.
>
> Such an ordering scheme would also allow ordering to be turned on for
> only tape applications if disk applications did not require ordering.
>
> iSCSI's ordering solution does not provide this flexibility, whereas
> usage of CRN would.
>
> 2) Such a fine granularity scope of ordering also minimizes the impact
> of error recovery actions taken when loss of order occurs. The impact
> with CRN would be a target-initiator handshake based on ACA + some
> checkpoints to error back all the pending CRN enabled commands on that
> LUN.
>
> Comparing the equivalent error recovery in a CmdSN based iSCSI ordering
> solution, the action taken would be to error back all the pending I/Os
> destined to the entire session following loss of order. With high end
> disk arrays having 1000+ LUN configurations, such error recovery is
> extreme, [especially when ordering may have been desired by the appln
> only on a small subset of I/Os to 1 LUN, and loss of ordering for the
> remaining 999 LUNs was a don't care].
>
> 3) A CRN based ordering scheme works for all underlying SCSI transports
> as opposed to CmdSN based ordering.
>
> 4) The generation of a stream of commands that expect strong ordering
> will need to be accompanied by corresponding generation of a sequence
> number at the same layer. (CRN would provide such a sequencing). Failure
> to do so can result in silent loss of order that slips un-detected due
> to potential points of failure in the stack b/n the SCSI ULP and the
> physical bus/link. (ex : I/O failures within the HBA driver due to
> resource allocation failures or other such conditions can cause loss of
> order.).
>
> Attempts to enforce ordering at multiple layers of the stack (CRN at the
> ULP and CmdSN at the LLP), especially when CmdSN does not provide all
> the benefits that CRN would provide is over-engineering the solution to
> the ordering problem. It also impacts iSCSI performance.
>
> - Santosh
>
> >
> > julian_satran@il.ibm.com wrote:
> > >
> > > Ordered delivery of commands to ANY TYPE of devices will increase in
> > > importance as network speeds increase and the need to hide latency
> > > increases.
> > >
> > > > Today databases don't use queuing and rely on trickling the commands to
> > > > devices 1 by 1 to ensure atomicity and order.
> > > > As latency will become the determining factor in performance this is
> > > > bound to change.
> > > >
> > > > SCSI has done an excellent job in defining the queueing mechanism. We
> > > > have to make it work with good performance in our environment.
> > >
> > > Julo
> > >
> > > Santosh Rao <santoshr@cup.hp.com> on 13/04/2001 04:33:45
> > >
> > > Please respond to Santosh Rao <santoshr@cup.hp.com>
> > >
> > > To:   ips@ece.cmu.edu
> > > cc:   Black_David@emc.com
> > > Subject:  Re: iSCSI Requirements Draft - Informal WG Last Call
> > >
> > > David & All,
> > >
> > > I object to the following requirement :
> > >
> > > > " MUST support ordered delivery of SCSI commands from the initiator to
> > > >   the target, to support SCSI Task Queuing. "
> > > >
> > > > Ordered delivery is not a requirement for disk based applications and
> > > > non tagged queueing tape applications, which form the majority of
> > > > today's data traffic.
> > > >
> > > > To impose strict ordering (even in the presence of errors ?) as a MUST
> > > > is penalizing the majority of today's data traffic that does not expect
> > > > ordering from the SCSI subsystem.
> > > >
> > > > I am particularly concerned about the effect of the above requirement in
> > > > the presence of errors. Does iSCSI expect strict ordering to be
> > > > maintained even when individual I/O errors like ULP timeout occur ?
> > > >
> > > > On a ULP timeout (caused by, say, a hole in CmdSN), the initiator may
> > > > choose not to retry the command, but instead, error it back to the ULP.
> > > > In such a case, it can plug the hole in CmdSN with a NOP-OUT.
> > > >
> > > > The above requirement is not feasible to be met under such circumstances
> > > > and others similar to this. Mandating strict ordering on ULP timeouts
> > > > implies a session level error recovery on any individual I/O being
> > > > failed back from iSCSI to SCSI ULP. This is a very heavy hammer to use
> > > > as error recovery and should not be imposed.
> > > >
> > > > The above requirement must be changed to :
> > > > " SHOULD support ordered delivery of SCSI commands from the initiator to
> > > >   the target, to support SCSI Task Queuing. "
> > >
> > > - Santosh
> > >
> > > Black_David@emc.com wrote:
> > > >
> > > > It is intended to submit draft-ietf-ips-iscsi-reqmts-02.txt
> > > > as an Informational RFC. There is no formal requirement for
> > > > a WG Last Call, but if you have any further substantive comments
> > > > on the document please raise them on this list within the next
> > > > two weeks, i.e. by April 27th at the latest.
> > > >
> > > > If you have typographical/editorial comments please send them
> > > > direct to the document's author, Marjorie Krueger
> > > > <marjorie_krueger@hp.com>.
> > > >
> > > > Thanks,
> > > > --David and Elizabeth, IPS WG co-chairs
> > >  - santoshr.vcf



From owner-ips@ece.cmu.edu  Wed Apr 18 00:48:58 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA23223
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 00:48:56 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3HNb9i01449
	for ips-outgoing; Tue, 17 Apr 2001 19:37:09 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sj-msg-core-4.cisco.com (sj-msg-core-4.cisco.com [171.71.163.10])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3HMLfr26047
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 18:21:41 -0400 (EDT)
Received: from mira-sjc5-2.cisco.com (mira-sjc5-2.cisco.com [171.71.163.16])
	by sj-msg-core-4.cisco.com (8.9.3/8.9.1) with ESMTP id PAA10208;
	Tue, 17 Apr 2001 15:16:44 -0700 (PDT)
Received: from cisco.com (rtp-dial-2-25.cisco.com [10.83.96.25])
	by mira-sjc5-2.cisco.com (Mirapoint)
	with ESMTP id ACN04821 (AUTH rrs);
	Tue, 17 Apr 2001 15:15:20 -0700 (PDT)
Message-ID: <3ADCC075.E15CDAC1@cisco.com>
Date: Tue, 17 Apr 2001 17:15:17 -0500
From: Randall Stewart <rrs@cisco.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.12 i386)
X-Accept-Language: en
MIME-Version: 1.0
To: "WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com>
CC: "'julian_satran@il.ibm.com'" <julian_satran@il.ibm.com>, ips@ece.cmu.edu,
        tsvwg@ietf.org, "'Craig Partridge'" <craig@aland.bbn.com>,
        Jonathan Wood <Jonathan.Wood@Sun.COM>, xieqb@cig.mot.com,
        Jonathan Stone <jonathan@dsg.stanford.edu>
Subject: Re: [Tsvwg] [SCTP checksum problems]
References: <499DC368E25AD411B3F100902740AD65BC5AB2@xrose03.rose.hp.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian:

Your input would be invaluable.. please send us any comments
or input when you are ready.. I think we need to be
looking seriously at this in the early May time frame... so
any input you could give would be most welcome. We are limited
to a 32 bit checksum (the size of the common header CRC area).

But within that restriction any input you may have as to the
best for data integrity would be wonderful!
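
For readers who want to experiment with 32-bit candidates of the kind named
later in this thread (CRC-32, Fletcher-32, Adler-32), here is a small
sketch. It makes several assumptions: `zlib.crc32` is the common IEEE 802.3
CRC-32, which is not necessarily the polynomial under discussion;
`zlib.adler32` is standard Adler-32, not the modified 16-bit-add variant;
and `fletcher32` is a straightforward textbook form over little-endian
16-bit words. The helper `all_checksums` is hypothetical.

```python
import zlib


def fletcher32(data: bytes) -> int:
    """Fletcher-32 over little-endian 16-bit words, zero-padded if odd."""
    if len(data) % 2:
        data += b"\x00"
    sum1 = sum2 = 0
    for i in range(0, len(data), 2):
        word = data[i] | (data[i + 1] << 8)
        sum1 = (sum1 + word) % 65535
        sum2 = (sum2 + sum1) % 65535
    return (sum2 << 16) | sum1


def all_checksums(data: bytes) -> dict:
    """Compute the three 32-bit check values for one buffer."""
    return {
        "crc32": zlib.crc32(data) & 0xFFFFFFFF,
        "adler32": zlib.adler32(data) & 0xFFFFFFFF,
        "fletcher32": fletcher32(data),
    }


for name, value in all_checksums(b"123456789").items():
    print(f"{name}: 0x{value:08X}")
```

All three fit the 32-bit field mentioned above. The sketch compares neither
their error-detection strength nor their speed.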

Regards

R


"WENDT,JIM (HP-Roseville,ex1)" wrote:
> 
> Julian,
> The SCTP folks are right now discussing changing the SCTP checksum to be a
> CRC-32 (or other). This is a very good thing and really what needs to happen
> with SCTP for it to support iSCSI and other data-critical applications
> effectively (and also relieve iSCSI from having to implement data integrity
> checking and transport-like functionality over SCTP).
> 
> They are looking for inputs as to which CRC-32 or checksum to use. The iSCSI
> WG's CRC investigation work and conclusion would be a valuable input into
> their decision. The sooner that you can provide the iSCSI recommended CRC
> and reasoning behind it to them, the better, even before the forthcoming I-D
> is distributed.
> 
> Jim Wendt
> Networked Storage Architecture
> Hewlett-Packard Company
> jim_wendt@hp.com 916-785-5198
> 
> -----------------------------------------------------------------------------
> 
> > -----Original Message-----
> > From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> > Sent: Sunday, April 15, 2001 7:58 AM
> > To: ips@ece.cmu.edu
> > Subject: CRCs
> >
> >
> >
> >
> > Dear colleagues,
> >
> > We will probably not be able to finish the CRC/checksum document in time
> > for Nashua but we hope it will be out very soon after that. However I
> > would like to inform you that while in Orlando and Minneapolis we were
> > still talking about different CRCs we (Dafna Sheinwald, Pat Thaler, Matt
> > Wakeley, Vince Cavanna and myself) have agreed on a CRC and the
> > forthcoming ID will give all the reasons why we recommend it.
> >
> > Regards,
> > Julo
> >
> 
> ----------------------------------------------------------------------------
> -
> 
> -----Original Message-----
> From: Randall Stewart [mailto:rrs@cisco.com]
> Sent: Tuesday, April 17, 2001 4:31 AM
> To: Jonathan Wood
> Cc: xieqb@cig.mot.com; tsvwg@ietf.org; Jim Wendt; Jonathan Stone; Craig
> Partridge
> Subject: Re: [Tsvwg] [SCTP checksum problems]
> 
> Jonathan:
> 
> I will make sure everyone at the bakeoff is aware of the upcoming
> "checksum" change... Now one of the big questions yet is
> what checksum should we use?
> 
> I kinda lean towards crc-32 myself (but of course I have no technical
> basis for this and need to keep silent on which one to use anyway :->),
> but do we have other candidates besides fletcher-32 and possibly modified
> Adler-32 (i.e. 16 bit adds instead of 8)??
> 
> I will take the above 3 and do a bit of performance work this
> week and post some numbers... that's about all I can do, i.e. tell
> how much time the options I know of take...
> 
> If you have some other candidates let me know and I can possibly get
> some performance numbers on these as well...
> 
> As far as which is the best... I encourage all of you check-sum
> experts out there to please join the thread :)
> 
> Oh, I know Jonathan Stone's paper will NOT be ready until sometime
> in May.. so we may want to proceed slowly so that Craig Partridge and
> he can have some cycles to add to this discussion :)
> 
> R
> 
> Jonathan Wood wrote:
> >
> > As an SCTP implementor and someone who will want to get the hardware
> > folks to help with checksumming, I wholeheartedly agree with Randy.
> > Remember that SCTP is just a proposed standard, and is as such not all
> > that far along the standardization process. We should still be able to
> > make changes like this if necessary.
> >
> > Jon
> >
> > >
> > >Q:
> > >
> > >The only problem with an additional "CRC chunk" is that
> > >it makes hardware assistance to error correction much
> > >more difficult. It is better (I think) to just realize
> > >we made a mistake. Get the opinions of the experts as to
> > >what checksum to use... i.e.:
> > >
> > >- CRC-32
> > >- Modified Adler-32 (16 bit word sums)
> > >- Fletcher-32
> > >- ???
> > >
> > >And then go with this as a replacement... Admit we were wrong
> > >and fix the problem..
> > >
> > >This way you have ONE and only ONE checksum algorithm making
> > >hardware designers' lives much easier...
> > >
> > >R
> > >
> > >Qiaobing Xie wrote:
> > >>
> > >> Another solution could be (I think I mentioned this to Randy and a few
> > >> others at last IETF):
> > >>
> > >> - Define a CRC-32 (or other strong checksum) control chunk and when the
> > >> sender wishes to use a stronger checksum protection, in addition to the
> > >> Adler-32 in the common SCTP header it includes this CRC-32 chunk in the
> > >> outbound packet. When the packet arrives, the receiver will do the
> > >> Adler-32 first, and then if the receiver supports the CRC-32 and sees
> > >> the presence of the CRC-32 chunk in the packet it will further verify
> > >> the CRC-32.
> > >>
> > >> We could also use a bit pattern in the chunk type of the CRC-32 chunk
> > >> so that if the receiver doesn't understand the CRC-32 chunk it would
> > >> drop it with a report back to the sender.
> > >>
> > >> -Qiaobing
> > >>
> > >> _______________________________________________
> > >> tsvwg mailing list
> > >> tsvwg@ietf.org
> > >> http://www1.ietf.org/mailman/listinfo/tsvwg
> > >
> > >--
> > >Randall R. Stewart
> > >Systems & Solutions Engineering
> > >Cisco Systems Inc.
> > >rrs@cisco.com 815-342-5222 or 815-477-2127
> > >
> > >_______________________________________________
> > >tsvwg mailing list
> > >tsvwg@ietf.org
> > >http://www1.ietf.org/mailman/listinfo/tsvwg
> 
> --
> Randall R. Stewart
> Systems & Solutions Engineering
> Cisco Systems Inc.
> rrs@cisco.com 815-342-5222 or 815-477-2127
> 
> >

-- 
Randall R. Stewart
Systems & Solutions Engineering
Cisco Systems Inc.
rrs@cisco.com 815-342-5222 or 815-477-2127


From owner-ips@ece.cmu.edu  Wed Apr 18 01:21:20 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA24895
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 01:21:19 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3I33FO13934
	for ips-outgoing; Tue, 17 Apr 2001 23:03:15 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sphmraaa.compuserve.com (hs-img-rel-1.compuserve.com [149.174.177.156])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3I337r13921
	for <ips@ece.cmu.edu>; Tue, 17 Apr 2001 23:03:07 -0400 (EDT)
Received: (from mailgate@localhost)
	by sphmraaa.compuserve.com (8.9.3/8.9.3/SUN-REL-1.3) id XAA09096
	for ips@ece.cmu.edu; Tue, 17 Apr 2001 23:03:01 -0400 (EDT)
Received: from compuserve.com (sfr-tgn-sfi-vty85.as.wcom.net [216.192.11.85])
	by sphmraaa.compuserve.com (8.9.3/8.9.3/SUN-REL-1.3) with ESMTP id XAA09041;
	Tue, 17 Apr 2001 23:02:54 -0400 (EDT)
Message-ID: <3ADD0141.DA223E52@compuserve.com>
Date: Tue, 17 Apr 2001 19:51:45 -0700
From: Ralph Weber <ralphoweber@compuserve.com>
Reply-To: ENDL_TX@computer.org
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD NSCPCD47  (Win98; I)
X-Accept-Language: en,pdf
MIME-Version: 1.0
To: ips@ece.cmu.edu
CC: julian_satran@il.ibm.com
Subject: Re: iSCSI linked commands
References: <C1256A31.003D3A23.00@d12mta02.de.ibm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,

julian_satran@il.ibm.com wrote:

> Doug,
>
> I think you would want to go back to SAM.  Linked commands are broken by any
> "irregularity" in execution.

The specific statement you are looking for appears in SAM-2 r16, at the
top of PDF page 66: "The receipt of any status, except INTERMEDIATE
or INTERMEDIATE-CONDITION MET, shall indicate that the associated
task has ended."

> The basic assumption is that the initiator is in charge of shipping linked
> commands - one-by-one.

I am not an expert on linked commands (there are only three such
people that I know of).  However, my understanding is that linked
commands are REQUIRED to be shipped one-by-one.  The initiator is
allowed to pick the next linked command based on the results
returned by the previous linked command.
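
The one-by-one rule can be pictured with a small sketch. It is only an
illustration under stated assumptions, not SAM-2 text: the helper names
run_linked_task, send_command and pick_next are hypothetical, and only the
two INTERMEDIATE status names come from the SAM-2 sentence quoted above.

```python
# Status names taken from the SAM-2 sentence quoted above.
INTERMEDIATE = "INTERMEDIATE"
INTERMEDIATE_CONDITION_MET = "INTERMEDIATE-CONDITION MET"
# Other status values, used here only as examples of "any other status".
GOOD = "GOOD"
CHECK_CONDITION = "CHECK CONDITION"

CONTINUING = {INTERMEDIATE, INTERMEDIATE_CONDITION_MET}


def run_linked_task(first, send_command, pick_next):
    """Ship linked commands one at a time (hypothetical initiator loop).

    send_command(cmd) -> status         transport hook, assumed for this sketch
    pick_next(cmd, status) -> cmd|None  initiator policy, assumed for this sketch
    Returns the (command, status) pairs in the order they were executed.
    """
    history = []
    cmd = first
    while cmd is not None:
        status = send_command(cmd)      # only one linked command is outstanding
        history.append((cmd, status))
        if status not in CONTINUING:    # any other status means the task ended
            break
        cmd = pick_next(cmd, status)    # the choice may depend on the last result
    return history
```

Because the next command is chosen only after the previous status arrives,
each link costs a full round trip, which is why the scheme fits poorly on
high-latency links.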

> I assume that for high latency links they won't be very popular.

They are not very popular, period.

> At a very early stage (about 2 years ago) we contemplated the idea of
> "prefetching" linked commands and having the target
> effect the serialization. We would have had to come up with a way of
> conveying to the initiator which command broke the chain (if it broke) or
> caused a unit attention (if it caused one) and it was not at all clear that
> this was "in the spirit of SAM".
> There were also more esoteric issues with later commands getting modified
> by execution of prior commands etc. -:).

Like I said above, I think pre-fetching linked commands violates
"the rules".

Thanks.

Ralph...





From owner-ips@ece.cmu.edu  Wed Apr 18 03:41:08 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA07683
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 03:41:06 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3I58MD22026
	for ips-outgoing; Wed, 18 Apr 2001 01:08:22 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3I57gr21968
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 01:07:42 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3I6Cx118801;
	Tue, 17 Apr 2001 23:13:11 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Santosh Rao" <santoshr@cup.hp.com>,
        "Charles Monia" <cmonia@NishanSystems.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: iSCSI & Linked Commands
Date: Tue, 17 Apr 2001 22:03:06 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJOEHCCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <3ADCC6F4.72C4EF2F@cup.hp.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Santosh,

Linked commands typically employ a feature that allows relative addressing
for sequential devices.  Link bits are still present within the CDB even for
fibre channel.  As only one linked command is ever seen on the network at any
point in time, identify the most recent command related to a task and then
use this information to associate the Abort Task command possibly issued
when the ULP times out.  This assumes that Abort Task would be the command
used in this event and that the backup application is able to recover from
such.

For your scheme to work, you must log every command's connection allegiance,
examine all CDBs to determine their nature, and identify the related task
based on the content of a management command.  You are suggesting that your
technique will flush the connection, but if it does not, do you tear down the
connection?
Without any acknowledgement as to what was delivered into the SCSI layer,
this forces a wait for the ULP to timeout on each command.  Potentially this
invites a large list of timeouts to stumble over before recovery while each
timeout may be met with a management command.  One wonders if you have saved
any effort.  This is compared to checking the sequence of commands at the
receiver which also provides acknowledgement.  This means there is no extra
effort to ensure sequential delivery and a problem can be detected and
overcome without involving the ULP.
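
A minimal sketch of what checking the sequence at the receiver could look
like follows. It is an illustration under assumptions, not the mechanism of
any iSCSI draft: the class and method names are hypothetical, sequence
numbers start at 1, and wrap-around is ignored.

```python
class SequenceReceiver:
    """Deliver commands in sequence order and report what is expected next."""

    def __init__(self, first_sn=1):
        self.expected = first_sn   # next sequence number to hand to SCSI
        self.pending = {}          # commands that arrived ahead of a hole
        self.delivered = []        # commands passed on, in order

    def receive(self, sn, command):
        """Accept one command; the return value doubles as the acknowledgement."""
        if sn >= self.expected:
            # Keep the first copy only; stale or duplicate numbers are ignored.
            self.pending.setdefault(sn, command)
        # Drain every command that is now contiguous with what was delivered.
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        return self.expected


rx = SequenceReceiver()
print(rx.receive(1, "WRITE a"))   # 2
print(rx.receive(3, "WRITE c"))   # 2  (hole at 2, nothing new delivered)
print(rx.receive(2, "WRITE b"))   # 4  (hole filled, b and c delivered)
```

If the returned number stops advancing, the sender learns where the hole is
without waiting for a ULP timeout.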

Without any acknowledgement within the iSCSI transport, all problems would
be detected only by timeouts within the ULP.  Each problem is
met with uncertainty of cause.  There would only be one command to flush per
such timeout however.  I was just trying to correct the language you used
with respect to command and not task allegiance.  You have however failed to
continue the operation of the sequential device based on a network problem.
In addition, you must ensure that the driver notices the intent of the
management command and finds the correct connection to send this command.
If this connector is indeed unplugged, then this command as well as the one
you are attempting to abort will timeout.  After two ULP timeouts, a
connection tear down will then create many more such events even if the
problem is not with the network.  Check by pinging first?  How long do you
wait for a ping to respond?

Without management being serialized, you could not be sure of their
placement relative to other connections.  All connections would need to stop
until this management command is acknowledged by the SCSI device and even
then these pending commands on other connections may not have already
arrived.  The first timeout caused some application to screech to a halt and
now the entire system will be required to wait for this recovery.  Should
this problem happen again, there will be uncertainty as to whether it is a
connection problem or a device problem.  A ping down the connection in
question may reveal a problem after an additional timeout. You have removed
communication with the sequencer with the exception of "rare" sequential
commands but recovery from a connection falter looks difficult, slow and
highly dependent upon the ULP being well written for this iSCSI environment.

Doug

> Charles Monia wrote:
>
> > > Mark has already replied to your question on why linked commands won't
> > > work given today's implementations. In most cases, the initiator task
> > > tag is either generated in the HBA driver or in the HBA firmware.
> >
> > The fact that some adapter implementations on some interconnects may not
> > support linked commands does not grant a license to delete the
> > capability from the protocol specification.
>
> Charles,
>
> I wasn't saying that iSCSI must not support linked commands. See my
> comment on this :
>
> > I agree that there is no need to say linked commands are unsupported
> > since this is a ULP feature and support or lack thereof is decided at
> > the initiator and target ULPs.
>
> What I did say was that several/most implementations perform task tag
> initialization at the LLP or further below rendering this feature
> un-usable in several cases.
>
> The more interesting question is :
> Is there any particular reason connection allegiance would rather be
> enforced per command than per task ? (if linked commands are un-used,
> this is a moot question....). If it were enforced per task, then, an
> Abort Task would effectively flush all stale PDUs of that task ensuring
> no stale PDUs in the network upon I/O termination at the initiator.
>
> - Santosh



From owner-ips@ece.cmu.edu  Wed Apr 18 04:40:24 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA08133
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 04:40:23 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3I7JKQ00234
	for ips-outgoing; Wed, 18 Apr 2001 03:19:20 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3I7Irr00221
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 03:18:53 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id JAA72664
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 09:18:46 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id JAA61428
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 09:18:46 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A32.00282716 ; Wed, 18 Apr 2001 09:18:34 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A32.00282585.00@d12mta02.de.ibm.com>
Date: Wed, 18 Apr 2001 09:03:39 +0200
Subject: Re: iSCSI Requirements Draft - Informal WG Last Call
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Sandeep,

Ordered vs. pipelined: in terms of delivery (not execution) they are the same.
Yes, we considered per-LUN ordering and dropped it as excessively complex
to implement (it would be the only thing requiring per-LUN state), and total
ordering will still be required for complete resets.

SAM still keeps the per-LUN ordering for CmdRN.

Julo

Sandeep Joshi <sandeepj@research.bell-labs.com> on 17/04/2001 16:43:56

Please respond to Sandeep Joshi <sandeepj@research.bell-labs.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu
Subject:  Re: iSCSI Requirements Draft - Informal WG Last Call





Julian,

I believe you mean "pipelined delivery" as opposed to "ordered
delivery".  The latter is more strict and will introduce pipeline
stalls which we would wish to avoid.  The flow control mechanism
is getting tied into introducing false sequentiality.

As an aside, a colleague here wanted to know if we had considered
having per-LUN ordering as opposed to per-session command ordering.
The pipeline could also slow down because of an aging disk or a
heavily loaded disk.  I do see some discussions at the Haifa meeting
  http://www.ece.cmu.edu/~ips/meetingMinutes/06.20.2000.txt

Was it just added complexity which forced the per-session choice?

> Point 3.4.3 brings up ordering of commands per LUN or ordering of
> commands per session?
> MW - reorder requirements - second before the first one
> RH - does anybody disagree with 3.4.3?
> CS - No. It basically says that you support SCSI queuing
> JH - In the wide area, a method of pipelining commands and responses
> is required
> CS - the requirement is more complex than saying you just support
> SCSI queuing.
> RH - delivering commands in order never hurts
> MT - Why keep order between logical units in commands?
> RH - SAM-2 does not require order between LUNs. However, it may make
> target implementation easier.

thanks,
-Sandeep


julian_satran@il.ibm.com wrote:
>
> Ordered delivery of commands to ANY TYPE of devices will increase in
> importance as network speeds increase and the need to hide latency
> increases.
>
> Today databases don't use queuing and rely on trickling commands to
> devices 1 by 1 to ensure atomicity and order.
> As latency will become the determining factor in performance this is
> bound to change.
>
> SCSI has done an excellent job in defining the queueing mechanism. We
> have to make it work with good performance in our environment.
>
> Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 13/04/2001 04:33:45
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   ips@ece.cmu.edu
> cc:   Black_David@emc.com
> Subject:  Re: iSCSI Requirements Draft - Informal WG Last Call
>
> David & All,
>
> I object to the following requirement :
>
> " MUST support ordered delivery of SCSI commands from the initiator to
>   the target, to support SCSI Task Queuing. "
>
> Ordered delivery is not a requirement for disk based applications and
> non tagged queueing tape applications, which form the majority of
> today's data traffic.
>
> To impose strict ordering (even in the presence of errors ?) as a MUST
> is penalizing the majority of today's data traffic that does not expect
> ordering from the SCSI subsystem.
>
> I am particularly concerned about the effect of the above requirement in
> the presence of errors. Does iSCSI expect strict ordering to be
> maintained even when individual I/O errors like ULP timeout occur ?
>
> On a ULP timeout (caused by, say, a hole in CmdSN), the initiator may
> choose not to retry the command, but instead, error it back to the ULP.
> In such a case, it can plug the hole in CmdSN with a NOP-OUT.
>
> The above requirement is not feasible to be met under such circumstances
> and others similar to this. Mandating strict ordering on ULP timeouts
> implies a session level error recovery on any individual I/O being
> failed back from iSCSI to SCSI ULP. This is a very heavy hammer to use
> as error recovery and should not be imposed.
>
> The above requirement must be changed to :
> " SHOULD support ordered delivery of SCSI commands from the initiator to
>   the target, to support SCSI Task Queuing. "
>
> - Santosh
>
> Black_David@emc.com wrote:
> >
> > It is intended to submit draft-ietf-ips-iscsi-reqmts-02.txt
> > as an Informational RFC. There is no formal requirement for
> > a WG Last Call, but if you have any further substantive comments
> > on the document please raise them on this list within the next
> > two weeks, i.e. by April 27th at the latest.
> >
> > If you have typographical/editorial comments please send them
> > direct to the document's author, Marjorie Krueger
> > <marjorie_krueger@hp.com>.
> >
> > Thanks,
> > --David and Elizabeth, IPS WG co-chairs





From owner-ips@ece.cmu.edu  Wed Apr 18 04:42:21 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA08158
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 04:42:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3I6cOv27905
	for ips-outgoing; Wed, 18 Apr 2001 02:38:24 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3I6aSr27808
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 02:36:29 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id IAA185040;
	Wed, 18 Apr 2001 08:28:30 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta05_cs0 [9.165.222.239])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id IAA243644;
	Wed, 18 Apr 2001 08:28:28 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A32.0023902D ; Wed, 18 Apr 2001 08:28:26 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Randall Stewart <rrs@cisco.com>
cc: "WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com>, ips@ece.cmu.edu,
        tsvwg@ietf.org, "'Craig Partridge'" <craig@aland.bbn.com>,
        Jonathan Wood <Jonathan.Wood@sun.com>, xieqb@cig.mot.com,
        Jonathan Stone <jonathan@dsg.stanford.edu>
Message-ID: <C1256A32.00238F3F.00@d12mta05.de.ibm.com>
Date: Wed, 18 Apr 2001 08:27:04 +0200
Subject: Re: [Tsvwg] [SCTP checksum problems]
Mime-Version: 1.0
Content-type: multipart/mixed; 
	Boundary="0__=XVHp06PpulT0oOPGDF6R8rp8CZ14LJlh0b0xIimfQU7bGeH59Tvveuqo"
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--0__=XVHp06PpulT0oOPGDF6R8rp8CZ14LJlh0b0xIimfQU7bGeH59Tvveuqo
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline



Randall,

We are going to publish an I-D justifying our choice in about 2 weeks.
It is more extensive than the attached (and has some additional authors)
but here is a preliminary version:

(See attached file: draft-sheinwald-iSCSI-CRC-00.txt)

The summary is:

-Checksums (Adler, Fletcher) are weak and sensitive to data bias
-CRCs are far better
-Not all CRCs are equal - out of the CRCs for which there is experimental
and analytical experience CRC32C is the best.
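
The first two points can be illustrated with a short sketch. It is only a
sketch under assumptions: the function names are hypothetical, the CRC-32C
parameters (reflected polynomial 0x82F63B78, all-ones initial value and
final XOR) are the commonly used convention and are not spelled out in this
message, and the Fletcher variant shown sums 16-bit words modulo 65535.

```python
import zlib


def fletcher32(data: bytes) -> int:
    """Fletcher-32 over 16-bit little-endian words (even-length input assumed)."""
    sum1 = sum2 = 0
    for i in range(0, len(data), 2):
        word = int.from_bytes(data[i:i + 2], "little")
        sum1 = (sum1 + word) % 65535
        sum2 = (sum2 + sum1) % 65535
    return (sum2 << 16) | sum1


def crc32c(data: bytes) -> int:
    """Bit-at-a-time CRC-32C (slow, but easy to check by eye)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


zeros = b"\x00" * 8
ones = b"\xff" * 8

# Fletcher sums modulo 65535, so a 0xFFFF word adds the same as 0x0000.
print(fletcher32(zeros) == fletcher32(ones))   # True: the change goes unnoticed
print(crc32c(zeros) == crc32c(ones))           # False: CRC-32C tells them apart

# Adler-32 on a short message: the low half is just 1 + the sum of the bytes,
# so most of its 16-bit range can never be reached.
print(zlib.adler32(b"\xff" * 4) & 0xFFFF)      # 1021
```

The all-zeros versus all-ones case is an extreme form of data bias, which is
where the additive checksums lose the most ground to a CRC.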

Regards,
Julo

Randall Stewart <rrs@cisco.com> on 18/04/2001 00:15:17

Please respond to Randall Stewart <rrs@cisco.com>

To:   "WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com>
cc:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu, tsvwg@ietf.org,
      "'Craig Partridge'" <craig@aland.bbn.com>, Jonathan Wood
      <Jonathan.Wood@sun.com>, xieqb@cig.mot.com, Jonathan Stone
      <jonathan@dsg.stanford.edu>
Subject:  Re: [Tsvwg] [SCTP checksum problems]




Julian:

Your input would be invaluable.. please send us any comments
or input when you are ready.. I think we need to be
looking seriously at this in the early May time frame... so
any input you could give would be most welcome. We are limited
to a 32 bit checksum (the size of the common header CRC area).

But within that restriction any input you may have as to the
best for data integrity would be wonderful!

Regards

R


"WENDT,JIM (HP-Roseville,ex1)" wrote:
>
> Julian,
> The SCTP folks are right now discussing changing the SCTP checksum to be a
> CRC-32 (or other). This is a very good thing and really what needs to happen
> with SCTP for it to support iSCSI and other data-critical applications
> effectively (and also relieve iSCSI from having to implement data integrity
> checking and transport-like functionality over SCTP).
>
> They are looking for inputs as to which CRC-32 or checksum to use. The iSCSI
> WG's CRC investigation work and conclusion would be a valuable input into
> their decision. The sooner that you can provide the iSCSI recommended CRC
> and reasoning behind it to them, the better, even before the forthcoming I-D
> is distributed.
>
> Jim Wendt
> Networked Storage Architecture
> Hewlett-Packard Company
> jim_wendt@hp.com 916-785-5198
>
>
> ----------------------------------------------------------------------------
> -
>
> > -----Original Message-----
> > From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> > Sent: Sunday, April 15, 2001 7:58 AM
> > To: ips@ece.cmu.edu
> > Subject: CRCs
> >
> >
> >
> >
> > Dear colleagues,
> >
> > We will probably not be able to finish the CRC/checksum document in time
> > for Nashua but we hope it will be out very soon after that. However, I
> > would like to inform you that while in Orlando and Minneapolis we were
> > still talking about different CRCs, we (Dafna Sheinwald, Pat Thaler, Matt
> > Wakeley, Vince Cavanna and myself) have agreed on a CRC, and the
> > forthcoming I-D will give all the reasons why we recommend it.
> >
> > Regards,
> > Julo
> >
>
>
> ----------------------------------------------------------------------------
> -
>
> -----Original Message-----
> From: Randall Stewart [mailto:rrs@cisco.com]
> Sent: Tuesday, April 17, 2001 4:31 AM
> To: Jonathan Wood
> Cc: xieqb@cig.mot.com; tsvwg@ietf.org; Jim Wendt; Jonathan Stone; Craig
> Partridge
> Subject: Re: [Tsvwg] [SCTP checksum problems]
>
> Jonathan:
>
> I will make sure everyone at the bakeoff is aware of the upcoming
> "checksum" change... Now one of the big questions yet is
> what checksum should we use?
>
> I kinda lean towards crc-32 myself (but of course I have no technical
> basis for this and need to keep silent on which one to use anyway :->),
> but do we have other candidates besides fletcher-32 and possibly modified
> Adler-32 (i.e. 16 bit adds instead of 8)??
>
> I will take the above 3 and do a bit of performance work this
> week and post some numbers... that's about all I can do, i.e. tell
> how much time the options I know of take...
>
> If you have some other candidates let me know and I can possibly get
> some performance numbers on these as well...
>
> As far as which is the best... I encourage all of you check-sum
> experts out there to please join the thread :)
>
> Oh, I know Jonathan Stone's paper will NOT be ready until sometime
> in May.. so we may want to proceed slowly so that Craig Partridge and
> he can have some cycles to add to this discussion :)
>
> R
>
> Jonathan Wood wrote:
> >
> > As an SCTP implementor and someone who will want to get the hardware
> > folks to help with checksumming, I wholeheartedly agree with Randy.
> > Remember that SCTP is just a proposed standard, and is as such not all
> > that far along the standardization process. We should still be able to
> > make changes like this if necessary.
> >
> > Jon
> >
> > >
> > >Q:
> > >
> > >The only problem with an additional "CRC chunk" is that
> > >it makes hardware assistance to error correction much
> > >more difficult. It is better (I think) to just realize
> > >we made a mistake. Get the opinions of the experts as to
> > >what checksum to use... i.e.:
> > >
> > >- CRC-32
> > >- Modified Adler-32 (16 bit word sums)
> > >- Fletcher-32
> > >- ???
> > >
> > >And then go with this as a replacement... Admit we were wrong
> > >and fix the problem..
> > >
> > >This way you have ONE and only ONE checksum algorithm making
> > >hardware designers' lives much easier...
> > >
> > >R
> > >
> > >Qiaobing Xie wrote:
> > >>
> > >> Another solution could be (I think I mentioned this to Randy and a few
> > >> others at last IETF):
> > >>
> > >> - Define a CRC-32 (or other strong checksum) control chunk and when the
> > >> sender wishes to use a stronger checksum protection, in addition to the
> > >> Adler-32 in the common SCTP header it includes this CRC-32 chunk in the
> > >> outbound packet. When the packet arrives, the receiver will do the
> > >> Adler-32 first, and then if the receiver supports the CRC-32 and sees
> > >> the presence of the CRC-32 chunk in the packet it will further verify
> > >> the CRC-32.
> > >>
> > >> We could also use a bit pattern in the chunk type of the CRC-32 chunk
> > >> so that if the receiver doesn't understand the CRC-32 chunk it would
> > >> drop it with a report back to the sender.
> > >>
> > >> -Qiaobing
> > >>
> > >> _______________________________________________
> > >> tsvwg mailing list
> > >> tsvwg@ietf.org
> > >> http://www1.ietf.org/mailman/listinfo/tsvwg
> > >
> > >--
> > >Randall R. Stewart
> > >Systems & Solutions Engineering
> > >Cisco Systems Inc.
> > >rrs@cisco.com 815-342-5222 or 815-477-2127
> > >
> > >_______________________________________________
> > >tsvwg mailing list
> > >tsvwg@ietf.org
> > >http://www1.ietf.org/mailman/listinfo/tsvwg
>
> --
> Randall R. Stewart
> Systems & Solutions Engineering
> Cisco Systems Inc.
> rrs@cisco.com 815-342-5222 or 815-477-2127
>
> >

--
Randall R. Stewart
Systems & Solutions Engineering
Cisco Systems Inc.
rrs@cisco.com 815-342-5222 or 815-477-2127


--0__=XVHp06PpulT0oOPGDF6R8rp8CZ14LJlh0b0xIimfQU7bGeH59Tvveuqo
Content-type: application/octet-stream; 
	name="draft-sheinwald-iSCSI-CRC-00.txt"
Content-Disposition: attachment; filename="draft-sheinwald-iSCSI-CRC-00.txt"
Content-Description: Text - character set unknown
Content-Transfer-Encoding: base64

DQoNCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgICAgICAgICAgIA0KSW50ZXJuZXQgRHJhZnQgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgICAgICAgICAgICBEYWZuYSBTaGVpbndhbGQgDQpEb2N1bWVudDogZHJhZnQtc2hl
aW53YWxkLWlTQ1NJLUNSQy0wMC50eHQgICAgICAgICAgICAgICAgSnVsaWFuIFNhdHJhbiANCkNh
dGVnb3J5OiBpbmZvcm1hdGlvbmFsICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgSUJNIA0KICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgDQogICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgUGF0IFRoYWxlciANCiAgICAgICAg
ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBB
Z2lsZW50IA0KICAgIA0KIA0KICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIE1lbW8g
DQogICAgICAgICAgICAgICAgICAgaVNDU0kgQ1JDL0NoZWNrc3VtIENvbnNpZGVyYXRpb25zIA0K
ICAgIA0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoN
Cg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KICANClNoZWlud2FsZCwgU2F0cmFuIElu
Zm9ybWF0aW9uYWwsIEV4cGlyZSBPY3RvYmVyIDIwMDEgICAgICAgICAgICAgICAgIDEgCgwNCiAg
ICAgICAgICAgICAgICAgICAgICAgaVNDU0kgQ1JDIGNvbnNpZGVyYXRpb25zICAgICAgRmVicnVh
cnkgMjYsIDIwMDEgDQogDQogDQogICAgDQpTdGF0dXMgb2YgdGhpcyBNZW1vIA0KICAgIA0KICAg
IA0KICAgVGhpcyBkb2N1bWVudCBpcyBhbiBJbnRlcm5ldC1EcmFmdCBhbmQgaXMgaW4gZnVsbCBj
b25mb3JtYW5jZSB3aXRoIA0KICAgYWxsIHByb3Zpc2lvbnMgb2YgU2VjdGlvbiAxMCBvZiBSRkMy
MDI2IFtSRkMyMDI2XS4gIA0KICAgIA0KICAgSW50ZXJuZXQtRHJhZnRzIGFyZSB3b3JraW5nIGRv
Y3VtZW50cyBvZiB0aGUgSW50ZXJuZXQgRW5naW5lZXJpbmcgDQogICBUYXNrIEZvcmNlIChJRVRG
KSwgaXRzIGFyZWFzLCBhbmQgaXRzIHdvcmtpbmcgZ3JvdXBzLiBOb3RlIHRoYXQgb3RoZXIgDQog
ICBncm91cHMgbWF5IGFsc28gZGlzdHJpYnV0ZSB3b3JraW5nIGRvY3VtZW50cyBhcyBJbnRlcm5l
dC1EcmFmdHMuIA0KICAgSW50ZXJuZXQtRHJhZnRzIGFyZSBkcmFmdCBkb2N1bWVudHMgdmFsaWQg
Zm9yIGEgbWF4aW11bSBvZiBzaXggbW9udGhzIA0KICAgYW5kIG1heSBiZSB1cGRhdGVkLCByZXBs
YWNlZCwgb3IgbWFkZSBvYnNvbGV0ZSBieSBvdGhlciBkb2N1bWVudHMgYXQgDQogICBhbnkgdGlt
ZS4gSXQgaXMgaW5hcHByb3ByaWF0ZSB0byB1c2UgSW50ZXJuZXQtIERyYWZ0cyBhcyByZWZlcmVu
Y2UgDQogICBtYXRlcmlhbCBvciB0byBjaXRlIHRoZW0gb3RoZXIgdGhhbiBhcyAid29yayBpbiBw
cm9ncmVzcy4iICANCiAgIFRoZSBsaXN0IG9mIGN1cnJlbnQgSW50ZXJuZXQtRHJhZnRzIGNhbiBi
ZSBhY2Nlc3NlZCBhdCANCiAgIGh0dHA6Ly93d3cuaWV0Zi5vcmcvaWV0Zi8xaWQtYWJzdHJhY3Rz
LnR4dCAgDQogICBUaGUgbGlzdCBvZiBJbnRlcm5ldC1EcmFmdCBTaGFkb3cgRGlyZWN0b3JpZXMg
Y2FuIGJlIGFjY2Vzc2VkIGF0IA0KICAgaHR0cDovL3d3dy5pZXRmLm9yZy9zaGFkb3cuaHRtbC4g
DQogICAgDQogICAgDQpBYnN0cmFjdCANCiAgICANCiAgIEN5Y2xpYyByZWR1bmRhbmN5IGNoZWNr
IChDUkMpIGNvZGVzIFtQZXRlcnNvbl0gYXJlIHNob3J0ZW5lZCBjeWNsaWMgDQogICBjb2RlcyB1
c2VkIGZvciBlcnJvciBkZXRlY3Rpb24uIEEgbnVtYmVyIG9mIENSQyBjb2RlcyBoYXZlIGJlZW4g
DQogICBhZG9wdGVkIGluIHN0YW5kYXJkczogQVRNLCBJRUMsIElFRUUsIENDSVRULCBJQk0tU0RM
QywgYW5kIG1vcmUgDQogICBbQmFpY2hdLiAgVGhlIG1vc3QgaW1wb3J0YW50IGV4cGVjdGF0aW9u
IGZyb20gc3VjaCBhIGNvZGUgaXMgYSB2ZXJ5IA0KICAgbG93IHByb2JhYmlsaXR5IGZvciB1bmRl
dGVjdGVkIGVycm9ycy4gIFRoZSBwcm9iYWJpbGl0eSBvZiB1bmRldGVjdGVkIA0KICAgZXJyb3Jz
IG9uIHN1Y2ggY29kZXMgaGFzIGJlZW4sIGFuZCBzdGlsbCBpcywgc3ViamVjdCB0byBleHRlbnNp
dmUgDQogICBzdHVkaWVzIHRoYXQgaGF2ZSBpbmNsdWRlZCBib3RoIGFuYWx5dGljYWwgbW9kZWxz
IGFuZCBzaW11bGF0aW9ucy4gDQogICBUaG9zZSBjb2RlcyBoYXZlIGJlZW4gdXNlZCBleHRlbnNp
dmVseSBpbiBjb21tdW5pY2F0aW9ucyBhbmQgbWFnbmV0aWMgDQogICByZWNvcmRpbmcgYXMgdGhl
eSBkZW1vbnN0cmF0ZSBnb29kICJidXJzdCBlcnJvciIgZGV0ZWN0aW9uIA0KICAgY2FwYWJpbGl0
aWVzIGJ1dCB0aGV5IGFyZSBnb29kIGFsc28gaW4gZGV0ZWN0aW5nIHNldmVyYWwgaW5kZXBlbmRl
bnQgDQogICBiaXQgZXJyb3JzLiAgSGFyZHdhcmUgaW1wbGVtZW50YXRpb25zIGFyZSB2ZXJ5IHNp
bXBsZSBhbmQgd2VsbCBrbm93biANCiAgICh0aGVpciBzaW1wbGljaXR5IGhhcyBtYWRlIHRoZW0g
cG9wdWxhciB3aXRoIGhhcmR3YXJlIGRldmVsb3BlcnMgZm9yIA0KICAgbWFueSB5ZWFycykgYnV0
IGFsZ29yaXRobXMgYW5kIHNvZnR3YXJlIGZvciBlZmZlY3RpdmUgaW1wbGVtZW50YXRpb25zIA0K
ICAgb2YgQ1JDIGFyZSBub3cgYWxzbyB3aWRlbHkgYXZhaWxhYmxlIFtXaWxsaWFtc10uIA0KICAg
IA0KICAgVGhlIHByb2JhYmlsaXR5IGZvciB1bmRldGVjdGVkIGVycm9ycyBkZXBlbmRzIG9uIHRo
ZSBwb2x5bm9taWFsIA0KICAgc2VsZWN0ZWQsIHRoZSBlcnJvciBkaXN0cmlidXRpb24gKGVycm9y
IG1vZGVsKSBhbmQgdGhlIGRhdGEgbGVuZ3RoLiANCiAgICANCiAgIEluIHRoaXMgbWVtbyB3ZSBh
dHRlbXB0IHRvIGdpdmUgc29tZSBlc3RpbWF0ZXMgZm9yIHRoZSBwcm9iYWJpbGl0eSBvZiANCiAg
IHVuZGV0ZWN0ZWQgZXJyb3JzIHRoYXQgd2lsbCBmYWNpbGl0YXRlIHRoZSBzZWxlY3Rpb24gb2Yg
YW4gZXJyb3IgDQogICBkZXRlY3Rpb24gY29kZSBmb3IgaVNDU0kuICANCiAgICANCiAgIFdlIHdp
bGwgYWxzbyBhdHRlbXB0IHRvIGNvbXBhcmUgQ1JDcyB3aXRoIG90aGVyIGNoZWNrc3VtIGZvcm1z
IA0KICAgKEZsZXRjaGVyLCBBZGxlciwgd2VpZ2h0ZWQgY2hlY2tzdW1zKSBpbmFzbXVjaCBhcyBh
dmFpbGFibGUgZGF0YSB3aWxsIA0KICAgcGVybWl0LiANCiAgDQpTYXRyYW4sIEouICAgICAgIFN0
YW5kYXJkcy1UcmFjaywgRXhwaXJlIE9jdG9iZXIgMjAwMSAgICAgICAgICAgICAgICAyIAoMDQog
ICAgICAgICAgICAgICAgICAgICAgIGlTQ1NJIENSQyBjb25zaWRlcmF0aW9ucyAgICAgIEZlYnJ1
YXJ5IDI2LCAyMDAxIA0KIA0KIA0KICAgIA0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0K
DQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoN
Cg0KDQoNCg0KDQoNCg0KDQogIA0KU2F0cmFuLCBKLiAgICAgICBTdGFuZGFyZHMtVHJhY2ssIEV4
cGlyZSBPY3RvYmVyIDIwMDEgICAgICAgICAgICAgICAgMyAKDA0KICAgICAgICAgICAgICAgICAg
ICAgICBpU0NTSSBDUkMgY29uc2lkZXJhdGlvbnMgICAgICBGZWJydWFyeSAyNiwgMjAwMSANCiAN
CiANCjEuIEVycm9yIG1vZGVscyBhbmQgZ29hbHMgDQogICAgDQogICBXZSB3aWxsIGFuYWx5emUg
dGhlIGNvZGUgYmVoYXZpb3IgdW5kZXIgdHdvIGNvbmRpdGlvbnM6IA0KICAgIA0KICAgICAgLSBu
b2lzeSBjaGFubmVsIC0gYnVyc3QgZXJyb3JzIG9mIGFuIGF2ZXJhZ2UgbGVuZ3RoIG9mIG4gYml0
cyANCiAgICAgIC0gbG93IG5vaXNlIGNoYW5uZWwgLSBpbmRlcGVuZGVudCBzaW5nbGUgYml0IGVy
cm9ycyANCiAgICAgICANCiAgIEJ1cnN0IGVycm9ycyBhcmUgdGhlIHByZXZhbGVudCBuYXR1cmFs
IHBoZW5vbWVub24gb24gY29tbXVuaWNhdGlvbiANCiAgIGxpbmVzIGFuZCByZWNvcmRpbmcgbWVk
aWEuIFRoZSBudW1iZXJzIHF1b3RlZCBmb3IgdGhvc2UgcmV2b2x2ZSANCiAgIGFyb3VuZCB0aGUg
QkVSIChiaXQgZXJyb3IgcmF0ZSkgYnV0IGZyZXF1ZW50bHkgdGhvc2UgbnVtYmVycyBhcmUgDQog
ICBub3RoaW5nIGVsc2UgdGhhbiBhIHJlZmxlY3Rpb24gb2YgdGhlIEJ1cnN0IEVycm9yIFJhdGUg
bXVsdGlwbGllZCBieSANCiAgIHRoZSBhdmVyYWdlIGJ1cnN0IGxlbmd0aC4gSW4gZmllbGQgZW5n
aW5lZXJpbmcgdGVzdHMgMyBudW1iZXJzIGFyZSANCiAgIHVzdWFsbHkgcXVvdGUgdG9nZXRoZXIg
LSBCRVIsIGVycm9yLWZyZWUtc2Vjb25kcyBhbmQgc2V2ZXJlbHktZXJyb3ItDQogICBzZWNvbmRz
IC0gYW5kIHRoaXMgaWxsdXN0cmF0ZXMgb3VyIHBvaW50LiANCiAgICANCiAgIEV2ZW4gYmV5b25k
IGNvbW11bmljYXRpb24gYW5kIHJlY29yZGluZyBtZWRpYSB0aGUgZWZmZWN0cyBvZiBlcnJvcnMg
DQogICB3aWxsIGJlICJidXJzdHkiICAtKGUuZy4sIGEgbWVtb3J5IGVycm9yIHdpbGwgYWZmZWN0
IG1vcmUgdGhhbiBhIA0KICAgc2luZ2xlIGJpdCBhbmQgdGhlIHRvdGFsIGVmZmVjdCB3aWxsIG5v
dCBiZSB2ZXJ5IGRpZmZlcmVudCBmcm9tIHRoZSANCiAgIGNvbW11bmljYXRpb24gZXJyb3IsIHNv
ZnR3YXJlIGVycm9ycyB3aGlsZSBtYW5pcHVsYXRpbmcgcGFja2V0cyB3aWxsIA0KICAgaGF2ZSBh
IGJ1cnN0IGVmZmVjdCkuICBTb2Z0d2FyZSBlcnJvcnMgcmVzdWx0IGFsc28gaW4gYnVyc3QgZXJy
b3JzLiANCiAgIEluIGFkZGl0aW9uIHNlcmlhbCBpbnRlcm5hbCBpbnRlcmNvbm5lY3RzIHdpbGwg
bWFrZSB0aGlzIHR5cGUgb2YgDQogICBlcnJvciB0aGUgbW9zdCBjb21tb24gd2l0aGluIG1hY2hp
bmVzIHRvby4gDQogICAgDQogICBXZSBhbmFseXplIGFsc28gdGhlIGVmZmVjdHMgb2Ygc2luZ2xl
IGluZGVwZW5kZW50IGJpdCBlcnJvcnMgLSBhcyANCiAgIHRob3NlIGNhbiBiZSBjYXVzZSBieSBz
b21lIGRlZmVjdHMuICANCiAgICANCiAgIE9uIGJ1cnN0IHdlIHdpbGwgYXNzdW1lIGFuIGF2ZXJh
Z2UgYnVyc3QgZXJyb3IgZHVyYXRpb24gb2YgYmQgdGhhdCBhdCANCiAgIGEgZ2l2ZW4gdHJhbnNt
aXNzaW9uIHJhdGUgcyB3aWxsIHJlc3VsdCBpbiBhbiBhdmVyYWdlIGJ1cnN0IG9mIGEgPSANCiAg
IGJkL3MgYml0cyANCiAgIChlLmcuLCBhbiBhdmVyYWdlIGJ1cnN0IGR1cmF0aW9uIG9mIDMgbnMg
YXQgMUdicyBnaXZlcyBhbiBhdmVyYWdlIA0KICAgYnVyc3Qgb2YgMyBiaXRzKS4gDQogICAgDQog
ICBGb3IgdGhlIGJ1cnN0IGVycm9yIHJhdGUgd2Ugd2lsbCB0YWtlIDEwXi0xMCAodGhlIG51bWJl
cnMgcXVvdGVkIGZvciANCiAgIEJFUiBvbiB3aXJlZCBjb21tdW5pY2F0aW9uIGNoYW5uZWxzIGFy
ZSBiZXR3ZWVuIDEwXi0xMCB0byAxMF4tMTIgYW5kIA0KICAgd2UgY29uc2lkZXIgdGhlIEJFUiBh
cyBidXJzdC1lcnJvci1yYXRlKmF2ZXJhZ2UtYnVyc3QtbGVuZ3RoKS4gIA0KICAgUGxlYXNlIGhv
d2V2ZXIga2VlcCBpbiBtaW5kIHRoYXQgaWYgdGhlIGNoYW5uZWwgaW5jbHVkZXMgd2lyZWxlc3Mg
DQogICBsaW5rcyB0aGUgZXJyb3IgcmF0ZXMgY2FuIGJlIHN1YnN0YW50aWFsbHkgaGlnaGVyLiAN
CiAgICANCiAgIEZvciBpbmRlcGVuZGVudCBzaW5nbGUgYml0IGVycm9ycyB3ZSB3aWxsIGFzc3Vt
ZSBhIDEwXi0xMSBlcnJvciByYXRlLiANCiAgICANCiAgIEFzIHRoZSBlcnJvciBkZXRlY3Rpb24g
bWVjaGFuaXNtcyB3aWxsIGhhdmUgdG8gdHJhbnNwb3J0IGxhcmdlIA0KICAgYW1vdW50cyBvZiBk
YXRhIChwZXRhYnl0ZXM9MTBeMTYgYml0cykgd2l0aG91dCBlcnJvcnMgd2Ugd2lsbCB0YXJnZXQg
DQogICB2ZXJ5IGxvdyBwcm9iYWJpbGl0aWVzIGZvciB1bmRldGVjdGVkIGVycm9ycyBmb3IgYWxs
IGJsb2NrIGxlbmd0aHMgDQogICAoYXQgMTBHYi9zIHRoYXQgbXVjaCBkYXRhIGNhbiBiZSBzZW50
IGluIGxlc3MgdGhhbiAyIHdlZWtzISBvbiBhIA0KICAgc2luZ2xlIGxpbmspLiANCiAgICANCg0K
ICANClNhdHJhbiwgSi4gICAgICAgU3RhbmRhcmRzLVRyYWNrLCBFeHBpcmUgT2N0b2JlciAyMDAx
ICAgICAgICAgICAgICAgIDQgCgwNCiAgICAgICAgICAgICAgICAgICAgICAgaVNDU0kgQ1JDIGNv
bnNpZGVyYXRpb25zICAgICAgRmVicnVhcnkgMjYsIDIwMDEgDQogDQogDQogICBBbHRlcm5hdGl2
ZWx5LCBhcyBpU0NTSSBoYXMgdG8gcGVyZm9ybSwgZWZmaWNpZW50bHkgd2Ugd2lsbCByZXF1aXJl
IA0KICAgdGhhdCB0aGUgZXJyb3IgZGV0ZWN0aW9uIGNhcGFiaWxpdHkgb2YgYSBzZWxlY3RlZCBw
cm90ZWN0aW9uIA0KICAgbWVjaGFuaXNtIHNob3VsZCBiZSB2ZXJ5IGdvb2QgYXQgbGVhc3QgdXAg
dG8gYmxvY2sgbGVuZ3RocyBvZiA4ayANCiAgIGJ5dGVzICg2NGtiaXRzKS4gDQogICAgDQogICBU
aGUgZXJyb3IgZGV0ZWN0aW9uIGNhcGFiaWxpdHkgc2hvdWxkIGtlZXAgdGhlIHByb2JhYmlsaXR5
IG9mIA0KICAgdW5kZXRlY3RlZCBlcnJvcnMgYXQgdmFsdWVzIHRoYXQgd291bGQgYmUgbWVhbiAi
bmV4dC10by1pbXBvc3NpYmxlIi4gIA0KICAgV2UgcmVjb2duaXplIGhvd2V2ZXIgdGhhdCBzdWNo
IGF0dHJpYnV0ZXMgYXJlIGhhcmQgdG8gcXVhbnRpZnkgYW5kIHdlIA0KICAgcmVzb3J0ZWQgdG8g
cGh5c2ljcyAtIDEwXjIzIGlzIHRoZSBBdm9nYWRybyBudW1iZXIgd2hpbGUgMTBeNDUgaXMgdGhl
IA0KICAgbnVtYmVyIG9mIGF0b21zIGluIHRoZSBrbm93biBVbml2ZXJzZSAob3IgaXQgd2FzIG1h
bnkgeWVhcnMgYWdvIHdoZW4gDQogICB3ZSByZWFkIGFib3V0IGl0KSBhbmQgdGhvc2Ugd291bGQg
dGhlIGJvdW5kcyBvZiBpbmNlcnRpdHVkZSB3ZSBjb3VsZCANCiAgIGxpdmUgd2l0aC4gKDEwXi0y
MyBhdCB3b3JzdCBhbmQgMTBeLTQ1IGlmIHdlIGNhbiBhZmZvcmQgaXQpLiBGb3IgOGsgDQogICBi
bG9ja3MgdGhlIHBlci9iaXQgZXF1aXZhbGVudCB3b3VsZCBiZSAoMTBeLTI4IHRvIDEwXi01MCkg
IA0KICAgIA0KICAgIA0KICAgIA0KICAgIA0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0K
DQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KICANClNhdHJhbiwgSi4gICAgICAg
U3RhbmRhcmRzLVRyYWNrLCBFeHBpcmUgT2N0b2JlciAyMDAxICAgICAgICAgICAgICAgIDUgCgwN
CiAgICAgICAgICAgICAgICAgICAgICAgaVNDU0kgQ1JDIGNvbnNpZGVyYXRpb25zICAgICAgRmVi
cnVhcnkgMjYsIDIwMDEgDQogDQogDQoyLiBCYWNrZ3JvdW5kIGFuZCBsaXRlcmF0dXJlIHN1cnZl
eSANCiAgICANCiAgIEVhY2ggY29kZXdvcmQgb2YgYSBiaW5hcnkgKG4saykgQ1JDIGNvZGUgQyBj
b25zaXN0cyBvZiBuID0gaytyIGJpdHMuIA0KICAgVGhlIGJsb2NrIG9mIHIgcGFyaXR5IGJpdHMg
aXMgY29tcHV0ZWQgZnJvbSB0aGUgYmxvY2sgb2YgayANCiAgIGluZm9ybWF0aW9uIGJpdHMuIFRo
ZSBjb2RlIGhhcyBhIGRlZ3JlZSByIGdlbmVyYXRvciBwb2x5bm9taWFsIGcoeCkuICANCiAgICAN
CiAgIFRoZSBjb2RlIGlzIGxpbmVhciBpbiB0aGUgc2Vuc2UgdGhhdCB0aGUgYml0d2lzZSBhZGRp
dGlvbiBvZiBhbnkgdHdvIA0KICAgY29kZXdvcmRzIHlpZWxkcyBhIGNvZGV3b3JkLiAgDQogICAg
DQogICBGb3IgdGhlIG1pbmltYWwgbSBzdWNoIHRoYXQgZyh4KSBkaXZpZGVzICh4Xm0pLTEsIGVp
dGhlciBuPW0sIGFuZCB0aGUgDQogICBjb2RlIEMgY29tcHJpc2VzIHRoZSBzZXQgRCBvZiBhbGwg
dGhlIG11bHRpcGxpY2F0aW9ucyBvZiBnKHgpIG1vZHVsbyANCiAgICh4Xm0pLTEsIG9yIG48bSwg
YW5kIEMgaXMgb2J0YWluZWQgZnJvbSBEIGJ5IHNob3J0ZW5pbmcgZWFjaCB3b3JkIGluIA0KICAg
dGhlIGxhdHRlciBpbiBtLW4gc3BlY2lmaWMgcG9zaXRpb25zICh3aGljaCBhbHNvIHJlZHVjZXMg
dGhlIG51bWJlciANCiAgIG9mIHdvcmRzIHNpbmNlIGFsbCB6ZXJvIHdvcmRzIGFyZSB0aGVuIGRp
c2NhcmRlZCBhbmQgZHVwbGljYXRlcyBhcmUgDQogICBub3QgbWFpbnRhaW5lZCkuIA0KICAgIA0K
ICAgRXJyb3IgZGV0ZWN0aW9uIGF0IHRoZSByZWNlaXZpbmcgZW5kIGlzIG1hZGUgYnkgY29tcHV0
aW5nIHRoZSBwYXJpdHkgDQogICBiaXRzIGZyb20gdGhlIHJlY2VpdmVkIGluZm9ybWF0aW9uIGJs
b2NrLCBhbmQgY29tcGFyaW5nIHRoZW0gd2l0aCB0aGUgDQogICByZWNlaXZlZCBwYXJpdHkgYml0
cy4gIA0KICAgICANCiAgIEFuIHVuZGV0ZWN0ZWQgZXJyb3Igb2NjdXJzIHdoZW4gdGhlIHJlY2Vp
dmVkIHdvcmQgYycgaXMgYSBjb2Rld29yZCANCiAgIGJ1dCBkaWZmZXJlbnQgZnJvbSB0aGUgb25l
IGMgdHJhbnNtaXR0ZWQuICANCiAgICANCiAgIFRoaXMgaXMgb25seSBwb3NzaWJsZSB3aGVuIHRo
ZSBlcnJvciBwYXR0ZXJuIGU9YyctYyBpcyBhIGNvZGV3b3JkIGJ5IA0KICAgaXRzZWxmIChiZWNh
dXNlIG9mIHRoZSBsaW5lYXJpdHkgb2YgdGhlIGNvZGUpLiBUaGUgcGVyZm9ybWFuY2Ugb2YgYSAN
CiAgIENSQyBjb2RlIGlzIG1lYXN1cmVkIGJ5IHRoZSBwcm9iYWJpbGl0eSBQdWQgb2YgdW5kZXRl
Y3RlZCBjaGFubmVsIA0KICAgZXJyb3JzLiANCiAgICANCiAgIExldCBBaSBkZW5vdGUgdGhlIG51
bWJlciBvZiBjb2Rld29yZHMgb2Ygd2VpZ2h0IGksIGkuZS4sIHdpdGggaSAxLQ0KICAgYml0cy4g
Rm9yIGEgYmluYXJ5IHN5bW1ldHJpYyBjaGFubmVsIChCU0MpLCB3aXRoIHNwb3JhZGljLCANCiAg
IGluZGVwZW5kZW50IGJpdCBlcnJvciByYXRlIDA8PWVwc2lsb248PTAuNSwgdGhlIHByb2JhYmls
aXR5IG9mIA0KICAgdW5kZXRlY3RlZCBlcnJvcnMgZm9yIHRoZSBjb2RlIEMgaXMgdGh1cyBnaXZl
biBieTogDQogICAgDQogICBQdWQoQyxlcHNpbG9uKSA9IFNpZ21hW2ZvciBpPWQgdG8gbl0gKEFp
KihlcHNpbG9uXmkpKigxLWVwc2lsb24pXihuLQ0KICAgaSkpIA0KICAgIA0KICAgd2hlcmUgZCBp
cyB0aGUgZGlzdGFuY2Ugb2YgdGhlIGNvZGU6IHRoZSBtaW5pbWFsIHdlaWdodCBkaWZmZXJlbmNl
IA0KICAgYmV0d2VlbiB0d28gY29kZSBpbiBDIHdoaWNoLCBieSB0aGUgbGluZWFyaXR5IG9mIHRo
ZSBjb2RlLCBpcyBhbHNvIA0KICAgdGhlIG1pbmltYWwgd2VpZ2h0IG9mIGFueSBjb2Rld29yZCBp
biB0aGUgY29kZS4gIFB1ZCBjYW4gYWxzbyBiZSANCiAgIGV4cHJlc3NlZCBieSB0aGUgd2VpZ2h0
IGRpc3RyaWJ1dGlvbiBvZiB0aGUgZHVhbCBjb2RlOiB0aGUgc2V0IG9mIA0KICAgd29yZHMgZWFj
aCBvZiB3aGljaCBpcyBwZXJwZW5kaWN1bGFyIChiaXR3aXNlIEFORCB5aWVsZHMgYW4gZXZlbiAN
CiAgIG51bWJlciBvZiAxLWJpdHMpIHRvIGV2ZXJ5IHdvcmQgb2YgQy4gIA0KICAgVGhlIGZhY3Qg
dGhhdCBQdWQgY2FuIGJlIGNvbXB1dGVkIHVzaW5nIHRoZSBkdWFsIGNvZGUgaXMgZXh0cmVtZWx5
IA0KICAgaW1wb3J0YW50OyByZWdhcmRsZXNzIG9mIHRoZSBsZW5ndGggb2YgdGhlIGNvZGUgYmxv
Y2sgLSB0aGUgbnVtYmVyIG9mIA0KICAgZGlmZmVyZW50IGNvZGVzIGluIHRoZSBkdWFsIGNvZGUg
aXMgMl5yLiBJZiB3ZSBkZW5vdGUgd2l0aCBCaSB0aGUgDQogICBudW1iZXIgb2YgY29kZXdvcmRz
IG9mIHdlaWdodCBpLCBpLmUuLCB3aXRoIGkgMS1iaXRzIHRoZW46IA0KDQogIA0KU2F0cmFuLCBK
LiAgICAgICBTdGFuZGFyZHMtVHJhY2ssIEV4cGlyZSBPY3RvYmVyIDIwMDEgICAgICAgICAgICAg
ICAgNiAKDA0KICAgICAgICAgICAgICAgICAgICAgICBpU0NTSSBDUkMgY29uc2lkZXJhdGlvbnMg
ICAgICBGZWJydWFyeSAyNiwgMjAwMSANCiANCiANCiAgICANCiAgIFB1ZCAoQyxlcHNpbG9uKSA9
IDJeLXIgU2lnbWEgW2ZvciBpPTAgdG8gbl0gQmkqKDEtMiplcHNpbG9uKV5pIC0gKDEtDQogICBl
cHNpbG9uKV5uIA0KICAgIA0KICAgV29sZiBbV29sZjk0b10gaW50cm9kdWNlZCBhbiBlZmZpY2ll
bnQgYWxnb3JpdGhtIGZvciBlbnVtZXJhdGluZyBhbGwgDQogICB0aGUgY29kZXdvcmRzIG9mIGEg
Y29kZSwgYW5kIGZpbmRpbmcgdGhlaXIgd2VpZ2h0IGRpc3RyaWJ1dGlvbi4gDQogICAgDQogICBX
b2xmIFtXb2xmODJdIGZvdW5kIHRoYXQsIGNvdW50ZXIgdG8gd2hhdCB3YXMgYXNzdW1lZCwgKDEp
IHRoZXJlIA0KICAgZXhpc3QgY29kZXMgZm9yIHdoaWNoIFB1ZChDLGVwc2lsb24pPlB1ZChDLDAu
NSkgZm9yIHNvbWUgZXBzaWxvbiBub3Q9IA0KICAgMC41IGFuZCAoMikgUHVkIGlzIG5vdCBhbHdh
eXMgaW5jcmVhc2luZyBmb3IgMDw9ZXBzaWxvbjw9MC41LiAgVGhlIA0KICAgdmFsdWUgb2Ygd2hh
dCB3YXMgYXNzdW1lZCB0byBiZSB0aGUgd29yc3QgUHVkIGlzIFB1ZChDLDAuNSk9KDJeLXIpIC0g
DQogICAoMl4tbikuIFRoaXMgc3RlbXMgZnJvbSB0aGUgZmFjdCB0aGF0IHdpdGggZXBzaWxvbj0w
LjUsIGFsbCAyXm4gDQogICByZWNlaXZlZCB3b3JkcyBhcmUgZXF1YWxseSBsaWtlbHkgYW5kIG91
dCBvZiB0aGVtIDJeKG4tciktMSB3aWxsIGJlIA0KICAgYWNjZXB0ZWQgYXMgY29kZXdvcmRzIG9m
IG5vIGVycm9ycywgYWx0aG91Z2ggdGhleSBhcmUgZGlmZmVyZW50IGZyb20gDQogICB0aGUgY29k
ZXdvcmQgdHJhbnNtaXR0ZWQuIA0KICAgIA0KICAgV29sZiBbV29sZjk0al0gaW52ZXN0aWdhdGVk
IHRoZSBDQ0lUVCAxNi1iaXQgcG9seW5vbWlhbC4gVGhpcyBpcyBhIA0KICAgY29kZSBvZiB0aGUg
ZmFtaWx5IG9mIChzaG9ydGVuZWQgY29kZXMgb2YpIGEgQkNIIGNvZGUgb2YgbGVuZ3RoIDJeKHIt
DQogICAxKSAtMSAocj0xNiBpbiB0aGUgQ0NJVFQgMTYtYml0IGNhc2UpIGdlbmVyYXRlZCBieSBh
IHBvbHlub21pYWwgb2YgDQogICB0aGUgZm9ybSBnKHgpID0oeCsxKXAoeCkgd2l0aCBwKHgpIGJl
aW5nIGEgcHJpbWl0aXZlIHBvbHlub21pYWwgb2YgDQogICBkZWdyZWUgci0xICg9MTUgaW4gdGhp
cyBjYXNlKS4gVGhlc2UgY29kZXMgaGF2ZSBhIEJDSCBkZXNpZ24gZGlzdGFuY2UgDQogICBvZiA0
LiBUaGF0IGlzLCB0aGUgbWluaW1hbCBkaXN0YW5jZSBiZXR3ZWVuIGFueSB0d28gY29kZXdvcmRz
IGluIHRoZSANCiAgIGNvZGUgaXMgYXQgbGVhc3QgNCBiaXRzICh3aGljaCBpcyBlYXJuZWQgYnkg
dGhlIGZhY3QgdGhhdCB0aGUgDQogICBzZXF1ZW5jZSBvZiBwb3dlcnMgb2YgYWxwaGEsIHRoZSBy
b290IG9mIHAoeCksIHdoaWNoIGFyZSByb290cyBvZiANCiAgIGcoeCksIGluY2x1ZGVzIHRocmVl
IGNvbnNlY3V0aXZlIHBvd2VyczoNICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgIGFscGhhXjAsIGFscGhhXjEsIGFscGhhXjIpLiANCiAgIEhlbmNlLCBldmVyeSAzIHNp
bmdsZSBiaXQgZXJyb3JzIGFyZSBkZXRlY3RhYmxlLiAgDQogICAgDQogICBXb2xmIGZvdW5kIHRo
YXQgZGlmZmVyZW50IHNob3J0ZW5lZCB2ZXJzaW9ucyBvZiBhIGdpdmVuIGNvZGUsIG9mIHRoZSAN
CiAgIHNhbWUgY29kZXdvcmQgbGVuZ3RoLCBwZXJmb3JtIHRoZSBzYW1lIChpbmRlcGVuZGVudCBv
ZiB3aGljaCBzcGVjaWZpYyANCiAgIGluZGV4ZXMgYXJlIG9taXR0ZWQgZnJvbSB0aGUgb3JpZ2lu
YWwgY29kZSkuIEhlIGFsc28gZm91bmQgdGhhdCBmb3IgDQogICB0aGUgdW5zaG9ydGVuZWQgY29k
ZXMsIGFsbCBwcmltaXRpdmUgcG9seW5vbWlhbHMgeWllbGQgY29kZXMgb2YgdGhlIA0KICAgc2Ft
ZSBwZXJmb3JtYW5jZS4gQnV0IGZvciB0aGUgc2hvcnRlbmVkIHZlcnNpb25zLCB0aGUgY2hvaWNl
IG9mIHRoZSANCiAgIHByaW1pdGl2ZSBwb2x5bm9taWFsIGRvZXMgbWFrZSBhIGRpZmZlcmVuY2Uu
IFdvbGYgW1dvbGY5NGogZm91bmQgYSANCiAgIHByaW1pdGl2ZSBwb2x5bm9taWFsIHdoaWNoICh3
aGVuIG11bHRpcGxpZWQgYnkgeCsxKSB5aWVsZHMgYSANCiAgIGdlbmVyYXRpbmcgcG9seW5vbWlh
bCB0aGF0IG91dHBlcmZvcm1zIHRoZSBDQ0lUVCBvbmUgYnkgYW4gb3JkZXIgb2YgDQogICBtYWdu
aXR1ZGUuIEZvciAzMi1iaXQsIGhlIGZvdW5kIGFuIGV4YW1wbGUgb2YgdHdvIHBvbHlub21pYWxz
IHRoYXQgDQogICBkaWZmZXIgaW4gdGhlaXIgcHJvYmFiaWxpdHkgb2YgdW5kZXRlY3RlZCBidXJz
dCBvZiBsZW5ndGggMzMgYnkgNCANCiAgIG9yZGVycyBvZiBtYWduaXR1ZGUuIA0KICAgIA0KICAg
SXQgc28gaGFwcGVucywgdGhhdCBmb3Igc29tZSBzaG9ydGVuZWQgY29kZXMsIHRoZSBtaW5pbXVt
IGRpc3RhbmNlLCANCiAgIG9yIHRoZSBkaXN0cmlidXRpb24gb2YgdGhlIHdlaWdodHMsIGlzIGJl
dHRlciB0aGFuIGZvciBvdGhlcnMgZGVyaXZlZCANCiAgIGZyb20gZGlmZmVyZW50IHVuc2hvcnRl
bmVkIGNvZGVzLiANCiAgICANCiAgIEJhaWNoZXZhIGV0IGFsIFtCYWljaGV2YV0gbWFkZSBhIGNv
bXByZWhlbnNpdmUgY29tcGFyaXNvbiBvZiANCiAgIGRpZmZlcmVudCBnZW5lcmF0aW5nIHBvbHlu
b21pYWxzIG9mIGRlZ3JlZSAxNiBvZiB0aGUgZm9ybSBnKHgpID0gDQogICAoeCsxKXAoeCksIGFu
ZCBvZiBvdGhlciBmb3Jtcy4gVGhleSBjb21wdXRlZCB0aGVpciBQdWQgZm9yIA0KDQogIA0KU2F0
cmFuLCBKLiAgICAgICBTdGFuZGFyZHMtVHJhY2ssIEV4cGlyZSBPY3RvYmVyIDIwMDEgICAgICAg
ICAgICAgICAgNyAKDA0KICAgICAgICAgICAgICAgICAgICAgICBpU0NTSSBDUkMgY29uc2lkZXJh
dGlvbnMgICAgICBGZWJydWFyeSAyNiwgMjAwMSANCiANCiANCiAgIGNvZGVsZW5ndGhzIHVwIHRv
IDEwMjQgYml0cy4gVGhleSBtZWFzdXJlZCB0aGVpciAiZ29vZG5lcyIgIC0tIGlmIA0KICAgUHVk
KEMsZXBzaWxvbikgIDw9IFB1ZChDLDAuNSkgYW5kICJ3ZWxsLWJlaGF2ZWQiIC0tIGlmIFB1ZChD
LGVwc2lsb24pIA0KICAgaXMgaW5jcmVhc2luZyB3aXRoIGluY3JlYXNpbmcgZXBzaWxvbiBpbiB0
aGUgcmFuZ2UgMCwwLjUuICBUaGUgcGFwZXIgDQogICBnaXZlcyBhIGNvbXByZWhlbnNpdmUgdGFi
bGUgdGhhdCBsaXN0cyB3aGljaCBvZiB0aGUgcG9seW5vbWlhbHMgaXMgDQogICBnb29kIGFuZCB3
aGljaCBpcyB3ZWxsLWJlaGF2ZWQgZm9yIGRpZmZlcmVudCBsZW5ndGggcmFuZ2VzLiANCiAgICAN
CiAgIEZvciBhIHNpbmdsZSBidXJzdCBlcnJvciwgV29sZiBbV29sZjk0Sl0gc3VnZ2VzdGVkIHRo
ZSBtb2RlbCBvZiAoYjpwKSANCiAgIGJ1cnN0IC0tIHRoZSBlcnJvcnMgb25seSBvY2N1ciB3aXRo
aW4gYSBzcGFuIG9mIGIgYml0cywgYW5kIHdpdGhpbiANCiAgIHRoYXQgc3BhbiwgdGhlIGVycm9y
cyBvY2N1ciByYW5kb21seSwgd2l0aCBiaXQgZXJyb3IgcHJvYmFiaWxpdHkgMCA8PSANCiAgIHAg
PD0gMS4gDQogICAgDQogICBGb3IgcD0wLjUsIHdoaWNoIHVzZWQgdG8gYmUgY29uc2lkZXJlZCB0
aGUgd29yc3QgY2FzZSwgaXQgaXMgd2VsbCANCiAgIGtub3duIHRoYXQgdGhlIHByb2JhYmlsaXR5
IG9mIHVuZGV0ZWN0ZWQgb25lIGJ1cnN0IGVycm9yIG9mIGxlbmd0aCBiIA0KICAgPD0gciBpcywg
b2YgbGVuZ3RoIGI9cisxIGlzIDJeLShyLTEpLCBhbmQgb2YgYiA+IHIrMSwgaXMgMl4tciwgDQog
ICBpbmRlcGVuZGVudGx5IG9mIHRoZSBjaG9pY2Ugb2YgdGhlIHByaW1pdGl2ZSBwb2x5bm9taWFs
LiAgDQogICAgDQogICBXaXRoIFdvbGYncyBkZWZpbml0aW9uLA0gICAgICAgICAgICAgICAgICAg
ICAgICAgIHdoZXJlIHAgY2FuIGJlIGRpZmZlcmVudCB0aGFuIDAuNSwgaW5kZWVkIGl0IA0KICAg
d2FzIGZvdW5kIHRoYXQgZm9yIGEgZ2l2ZW4gYiB0aGVyZSBhcmUgdmFsdWVzIG9mIHAsIGRpZmZl
cmVudCBmcm9tIA0KICAgMC41IHdoaWNoIG1heGltaXplIHRoZSBwcm9iYWJpbGl0eSBvZiB1bmRl
dGVjdGVkIChiOnApIGJ1cnN0IGVycm9yLg0gICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgDQogICAgDQogICBXb2xmIHBy
b3ZlZCB0aGF0IGZvciBhIGdpdmVuIGNvZGUsIGZvciBhbGwgYiBpbiB0aGUgcmFuZ2UgciA8IGIg
PCBuLCANCiAgIHRoZSBjb25kaXRpb25hbCBwcm9iYWJpbGl0eSBvZiB1bmRldGVjdGVkIGVycm9y
IGZvciB0aGUgKG4sIG4tcikgDQogICBjb2RlLCBnaXZlbiB0aGF0IGEgKGI6cCkgYnVyc3Qgb2Nj
dXJyZWQsIGlzIGVxdWFsIHRvIHRoZSBwcm9iYWJpbGl0eSANCiAgIG9mIHVuZGV0ZWN0ZWQgZXJy
b3JzIGZvciB0aGUgc2FtZSBjb2RlICh0aGUgc2FtZSBnZW5lcmF0aW5nIA0KICAgcG9seW5vbWlh
bCksIHNob3J0ZW5lZCB0byBibG9jayBsZW5ndGggYiwgd2hlbiB0aGlzIHNob3J0ZW5lZCBjb2Rl
IGlzIA0KICAgdXNlZCB3aXRoIGEgYmluYXJ5IHN5bW1ldHJpYyBjaGFubmVsIHdpdGggY2hhbm5l
bCAoc3BvcmFkaWMsIA0KICAgaW5kZXBlbmRlbnQpIGJpdCBlcnJvciBwcm9iYWJpbGl0eSBwLiAN
CiAgICANCiAgIEZvciB0aGUgSUVFRS04MDIuMyB1c2VkIENSQzMyLCBGdWppd2FyYSBbRnVqaXdh
cmE4OV0gbWVhc3VyZWQgdGhlIA0KICAgd2VpZ2h0cyBvZiBhbGwgd29yZHMgb2YgYWxsIHNob3J0
ZW5lZCB2ZXJzaW9ucyBvZiB0aGUgSUVFRSA4MDIuMyBjb2RlIA0KICAgb2YgMzIgY2hlY2sgYml0
cy4gVGhpcyBjb2RlIGlzIGdlbmVyYXRlZCBieSBhIHByaW1pdGl2ZSBwb2x5bm9taWFsIG9mIA0K
ICAgZGVncmVlIDMyOiAgDQogICBnKHgpID0geF4zMiArIHheMjYgKyB4XjIzICsgeF4yMiArIHhe
MTYgKyB4XjEyICsgeF4xMSArIHheMTAgKyB4XjggKyANCiAgIHheNyArIHheNSArIHheNCArIHhe
MiArIHggKyAxIGFuZCBoZW5jZSB0aGUgZGVzaWduZWQgZGlzdGFuY2Ugb2YgaXQgDQogICBpcyBv
bmx5IDMuIFRoaXMgZGlzdGFuY2UgaG9sZHMgZm9yIGNvZGVzIGFzIGxvbmcgYXMgMl4zMi0xLiBI
b3dldmVyLCANCiAgIHRoZSBmcmFtZSBmb3JtYXQgb2YgTUFDIChNZWRpYSBBY2Nlc3MgQ29udHJv
bCkgb2YgdGhlIGRhdGEgbGluayBsYXllciANCiAgIGluIElFRUUgODAyLjMsIGFzIHdlbGwgYXMg
dGhhdCBvZiB0aGUgZGF0YSBsaW5rIGxheWVyIGZvciB0aGUgDQogICBFdGhlcm5ldCAoMTk4MCkg
Zm9yYmlkIGxlbmd0aHMgZXhjZWVkaW5nIDEyLDE0NCBiaXRzLiBGdWppd2FyYSBvbmx5IA0KICAg
aW52ZXN0aWdhdGVkIHN1Y2ggYm91bmRlZCBsZW5ndGhzLiBUaGV5IGZvdW5kIHRoYXQgZm9yIHNo
b3J0ZW5lZCANCiAgIHZlcnNpb25zLCB0aGUgbWluaW11bSBkaXN0YW5jZSB3YXMgZm91bmQgdG8g
YmUgNCBmb3IgbGVuZ3RocyA0MDk2IHRvIA0KICAgMTIsMTQ0OyA1IGZvciBsZW5ndGhzIDUxMiB0
byAyMDQ4OyBhbmQgZXZlbiAxNSBmb3IgbGVuZ3RocyAzMyB0aHJvdWdoIA0KICAgNDIuICANCiAg
IEZ1aml3YXJhIGdpdmVzIGEgY2hhcnQgb2YgcmVzdWx0cyBvZiBjYWxjdWxhdGlvbnMgb2YgUHVk
IGZyb20gd2hpY2ggDQogICB3ZSBjYW4gc2VlIHRoYXQgZm9yIGNvZGVzIG9mIGxlbmd0aCAxMiwx
NDQgYW5kIEJTQyBvZiBlcHNpbG9uID0gMTBeLTUgDQogICAtIDEwXi00LCBQdWQ9IDEwXi0xNCAt
IDEwXi0xMyBhbmQgZm9yIGVwc2lsb24gPSAxMF4tNCAtIDEwXi0zLCAgDQogICBQdWQoNTEyLGVw
c2lsb24pID0gMTBeLTE1IA0KICANClNhdHJhbiwgSi4gICAgICAgU3RhbmRhcmRzLVRyYWNrLCBF
eHBpcmUgT2N0b2JlciAyMDAxICAgICAgICAgICAgICAgIDggCgwNCiAgICAgICAgICAgICAgICAg
ICAgICAgaVNDU0kgQ1JDIGNvbnNpZGVyYXRpb25zICAgICAgRmVicnVhcnkgMjYsIDIwMDEgDQog
DQogDQogICBQdWQoMTAyNCxlcHNpbG9uKSA9IDEwXi0xNCwgIA0KICAgUHVkKDIwNDgsZXBzaWxv
bikgPSAxMF4tMTMsIA0KICAgUHVkKDQwOTYsZXBzaWxvbikgPSAxMF4tMTIgLSAxMF4tMTEsIGFu
ZCANCiAgIFB1ZCg4MTkyLGVwc2lsb24pID0gMTBeLTEwIHdoaWNoIGlzIHJhdGhlciBjbG9zZSB0
byAyXi0zMi4gDQogICAgDQogICBbQ2FzdGFnbm9saTkzXSBleHRlbmRlZCBGdWppd2FyYSdzIHRl
Y2huaXF1ZSBmb3IgZWZmaWNpZW50bHkgdGhlIA0KICAgbWluaW11bSBkaXN0YW5jZSB0aHJvdWdo
IHRoZSB3ZWlnaHQgZGlzdHJpYnV0aW9uIG9mIHRoZSBkdWFsIGNvZGUgYW5kIA0KICAgZXhwbG9y
ZWQgYSBsYXJnZSBudW1iZXIgb2YgQ1JDIGNvZGVzIHdpdGggMjQgYW5kIDMyIGJpdC4gVGhleSAN
CiAgIGV4cGxvcmVkIHNldmVyYWwgY29kZXMgYnVpbHQgYXMgYSBtdWx0aXBsaWNhdGlvbiBvZiBz
ZXZlcmFsIGxvd2VyIA0KICAgZGVncmVlIGlycmVkdWNpYmxlIHBvbHlub21pYWxzLiANCiAgIElu
IHRoZSBwb3B1bGFyIGNsYXNzIG9mICh4KzEpKmRlZzMxLWlycmVkdWNpYmxlLXBvbHlub21pYWwg
dGhleSANCiAgIGV4cGxvcmVkIDQ3MDAwIHBvbHlub21pYWxzIChub3QgYWxsIHRoZSBwb3NzaWJs
ZSBvbmVzIC0gMiooMl4zMC0NCiAgIDEpLzMxKS4gVGhlIGJlc3QgdGhhdCB0aGV5IGZvdW5kIGhh
cyBkPTYgdXAgdG8gYmxvY2sgbGVuZ3RocyBvZiA1Mjc1IA0KICAgYW5kIGQ9NCB1cCB0byAyXjMx
LTEgKGJpdHMpLiANCiAgIFRoZSBpbnZlc3RpZ2F0aW9uIHdhcyBkb25lIGluIDE5OTMgd2l0aCBh
IHNwZWNpYWwgcHVycG9zZSBwcm9jZXNzb3IgDQogICAgDQogICBCeSBjb21wYXJpc29uIHRoZSBJ
RUVFLTgwMiBjb2RlIGhhcyBkPTQgdXAgdG8gYXQgbGVhc3QgNjQsMDAwIGJpdHMgDQogICBhbmQg
ZD0zIHVwIHRvIDJeMzItMSBiaXRzLiANCiAgICANCiAgIENSQzMyLzQgKHdlIHdpbGwgY2FsbCBp
dCBDUkMzMkMgaW4gdGhlIHJlc3Qgb2YgdGhpcyBtZW1vKSBpcyANCiAgIDExRURDNkY0MSANCiAg
IElFRUUtODAyIENSQyBpcyAxMDRDMTFEQjcgDQogICAgDQogICBbU3RvbmU5OF0gZXZhbHVhdGVk
IHRoZSBwZXJmb3JtYW5jZSBvZiBDUkMgKHRoZSBBQUw1IENSQyB0aGF0IGlzIHRoZSANCiAgIHNh
bWUgYXMgSUVFRTgwMikgYW5kIHRoZSBUQ1AgYW5kIEZsZXRjaGVyIGNoZWNrc3VtcyBvbiBsYXJn
ZSBhbW91bnRzIA0KICAgb2YgZGF0YS4gVGhlIHJlc3VsdHMgb2YgdGhpcyBleHBlcmltZW50IGlu
ZGljYXRlIGEgc2VyaW91cyB3ZWFrbmVzcyANCiAgIG9mIHRoZSBjaGVja3N1bXMgb24gcmVhbC1k
YXRhIHRoYXQgc3RlbXMgZnJvbSB0aGUgZmFjdCB0aGF0IGNoZWNrc3VtcyANCiAgIGRvIG5vdCBz
cHJlYWQgdGhlICJob3Qgc3BvdHMiIGluIGlucHV0IGRhdGEuICBIb3dldmVyLCB0aGUgcmVzdWx0
cyANCiAgIHNob3cgdGhhdCBGbGV0Y2hlciBiZWhhdmVzIGJ5IGEgZmFjdG9yIG9mIDIgYmV0dGVy
IHRoYW4gdGhlIHJlZ3VsYXIgDQogICBUQ1AgY2hlY2tzdW0uICANCiAgICANCiAgICANCg0KDQoN
Cg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQogIA0KU2F0cmFuLCBKLiAgICAgICBTdGFuZGFy
ZHMtVHJhY2ssIEV4cGlyZSBPY3RvYmVyIDIwMDEgICAgICAgICAgICAgICAgOSAKDA0KICAgICAg
ICAgICAgICAgICAgICAgICBpU0NTSSBDUkMgY29uc2lkZXJhdGlvbnMgICAgICBGZWJydWFyeSAy
NiwgMjAwMSANCiANCiANCjMuIFByb2JhYmlsaXR5IG9mIHVuZGV0ZWN0ZWQgZXJyb3JzIC0gYnVy
c3QgZXJyb3IgIA0KICAgIA0KMy4xIENSQzMyQyAoZGVyaXZhdGlvbnMgZnJvbSBbV29sZjk0al0p
IA0KICAgIA0KICAgV29sZiBbV29sZjk0al0gZm91bmQgYSAzMi1iaXQgcG9seW5vbWlhbCBvZiB0
aGUgZm9ybSBnKHgpID0gKDEreClwKHgpIA0KICAgZm9yIHdoaWNoIHRoZSBjb25kaXRpb25hbCBw
cm9iYWJpbGl0eSBvZiB1bmRldGVjdGVkIGVycm9yLCBnaXZlbiB0aGF0IA0KICAgYSBidXJzdCBv
ZiBsZW5ndGggMzMgb2NjdXJyZWQsIGlzIGF0IG1vc3QgKGkuZS4sIG1heGltaXplZCBvdmVyIGFs
bCANCiAgIHBvc3NpYmxlIGNoYW5uZWwgYml0IGVycm9yIHByb2JhYmlsaXRpZXMgd2l0aGluIHRo
ZSBidXJzdCkgNCAqIDEwXi0NCiAgIDEwLiANCiAgICANCiAgIFdlIHdpbGwgbm93IGZpbmQgdGhl
IHByb2JhYmlsaXR5IG9mIHVuZGV0ZWN0ZWQgZXJyb3IsIGdpdmVuIHRoYXQgYSANCiAgIGJ1cnN0
IG9mIGxlbmd0aCAzNCBvY2N1cnJlZCwgdXNpbmcgdGhlIHJlc3VsdCBkZXJpdmVkIGluIHRoaXMg
cGFwZXIsIA0KICAgbmFtZWx5IHRoYXQgZm9yIGEgZ2l2ZW4gY29kZSwgZm9yIGFsbCBiIGluIHRo
ZSByYW5nZSAzMiA8IGIgPCBuLCB0aGUgDQogICBjb25kaXRpb25hbCBwcm9iYWJpbGl0eSBvZiB1
bmRldGVjdGVkIGVycm9yIGZvciB0aGUgKG4sIG4tMzIpIGNvZGUsIA0KICAgZ2l2ZW4gdGhhdCBh
IChiOnApIGJ1cnN0IG9jY3VycmVkLCBpcyBlcXVhbCB0byB0aGUgcHJvYmFiaWxpdHkgb2YgDQog
ICB1bmRldGVjdGVkIGVycm9ycyBmb3IgdGhlIHNhbWUgY29kZSAodGhlIHNhbWUgZ2VuZXJhdGlu
ZyBwb2x5bm9taWFsKSwgDQogICBzaG9ydGVuZWQgdG8gYmxvY2sgbGVuZ3RoIGIsIHdoZW4gdGhp
cyBzaG9ydGVuZWQgY29kZSBpcyB1c2VkIHdpdGggYSANCiAgIGJpbmFyeSBzeW1tZXRyaWMgY2hh
bm5lbCB3aXRoIGNoYW5uZWwgKHNwb3JhZGljLCBpbmRlcGVuZGVudCkgYml0IA0KICAgZXJyb3Ig
cHJvYmFiaWxpdHkgcC4gDQogICAgDQogICBUaGUgYXBwcm94aW1hdGlvbiBmb3JtdWxhIGZvciBQ
dWQgb2Ygc3BvcmFkaWMgZXJyb3JzLCBhc3N1bWluZyB0aGF0IA0KICAgdGhlIHdlaWdodHMgQWkg
YXJlIGRpc3RyaWJ1dGVkIGJpbm9taWFsbHksIGlzIA0KICAgIA0KICAgUHVkKEMsIGVwc2lsb24p
ID1+PSBTaWdtYVtmb3IgaT1kIHRvIG5dKigobiBjaG9vc2UgaSkgLyAyXnIgKSooMS0NCiAgIGVw
c2lsb24pXihuLWkpICogZXBzaWxvbl5pIC4gDQogICAgDQogICBBc3N1bWluZyBhIHZlcnkgc21h
bGwgZXBzaWxvbiwgdGhpcyBleHByZXNzaW9uIGlzIGRvbWluYXRlZCBieSBpPWQuIA0KICAgRnJv
bSBbRnVqaXdhcmE4OV0gd2Uga25vdyB0aGF0IGZvciAzMi1iaXQgQ1JDLCBmb3Igc3VjaCBzbWFs
bCBuLCANCiAgIGQ9MTUuIFRodXMsIHdoZW4gbiBncm93cyBmcm9tIDMzIHRvIDM0LCB3ZSBmaW5k
IHRoYXQgdGhlIA0KICAgYXBwcm94aW1hdGlvbiBvZiBQdWQgZ3Jvd3MgYnkgMzQvMTk7IGFuZCB3
aGVuIG4gZ3Jvd3MgZnVydGhlciB0byAzNSwgDQogICBQdWQgZ3Jvd3MgYnkgYW5vdGhlciAzNS8y
MC4gVGFraW5nLCBmcm9tIFdvbGYgW1dvbGY5NGpdLCANCiAgIFB1ZChwKnwzMykgPSA0IHggMTBe
ey0xMH0sIHdlIGhhdmUgIFB1ZChwKnwzNCkgPSA3LjE1IHggMTBeey0xMH0gYW5kIA0KICAgUHVk
KHAqfDM1KSA9IDEuMjUgeCAxMF57LTl9LiANCiAgICANCiAgIEZvciB0aGUgZGVuc2l0eSBmdW5j
dGlvbiBvZiB0aGUgYnVyc3QgbGVuZ3RoLCB3ZSBhc3N1bWUgdGhlIFJheWxlaWdoIA0KICAgZGVu
c2l0eSBmdW5jdGlvbiAodGhlIGRpc2NyZXRpemF0aW9uIHRoZXJlb2YgdG8gaW50ZWdlciksIHdo
aWNoIGlzIA0KICAgdGhlIGRlbnNpdHkgb2YgdGhlIGFic29sdXRlIHZhbHVlcyBvZiBjb21wbGV4
IG51bWJlcnMgb2YgR2F1c3MgDQogICBkaXN0cmlidXRpb246IA0KICAgICAgICAgZih4KSA9IHgg
LyBhXjIgIGV4cCB7LXheMiAvIDJhXjIgfSAgICAgLCB4PjAgDQogICB0aGlzIGRlbnNpdHkgZnVu
Y3Rpb24gaGFzIGEgcGVhayBhdCB0aGUgcGFyYW1ldGVyIGEsIGFuZCBpdCBkZWNyZWFzZXMgDQog
ICBzbW9vdGhseSBmb3IgZ3Jvd2luZyB4LiANCiAgIFdlIHRha2UgdGhyZWUgY29uc2VjdXRpdmUg
Yml0cyBhcyB0aGUgbW9zdCBjb21tb24gYnVyc3QgZXZlbnQgb25jZSBhbiANCiAgIGVycm9yIGRv
ZXMgb2NjdXIsIGFuZCB0aHVzIGE9My4gDQogICAgDQogICBOb3csIHRoZSBwcm9iYWJpbGl0eSB0
aGF0IGEgYnVyc3Qgb2YgbGVuZ3RoIGIgb2NjdXJzIGluIGEgc3BlY2lmaWMgDQogICBwb3NpdGlv
biBpcyB0aGUgYnVyc3QgZXJyb3IgcmF0ZSwgd2hpY2ggd2UgZXN0aW1hdGUgYXMgMTBeey0xMH0s
IA0KICAgdGltZXMgZihiKS4gIA0KICANClNhdHJhbiwgSi4gICAgICAgU3RhbmRhcmRzLVRyYWNr
LCBFeHBpcmUgT2N0b2JlciAyMDAxICAgICAgICAgICAgICAgMTAgCgwNCiAgICAgICAgICAgICAg
ICAgICAgICAgaVNDU0kgQ1JDIGNvbnNpZGVyYXRpb25zICAgICAgRmVicnVhcnkgMjYsIDIwMDEg
DQogDQogDQogICBDYWxjdWxhdGluZyBmb3IgYj0zMyB3ZSBmaW5kIGYoMzMpID0gMS45NCB4IDEw
XnstMjZ9LiANCiAgIFRvZ2V0aGVyLCB3ZSBoYXZlIHRoYXQgdGhlIHByb2JhYmlsaXR5IHRoYXQg
YSBidXJzdCBvZiBsZW5ndGggMzMgDQogICBvY2N1cnJlZCB3aGljaCBzdGFydHMgYXQgYSBzcGVj
aWZpYyBwb3NpdGlvbiBpcyAxLjk0IHggMTBeey0zNn0uIA0KICAgTXVsdGlwbHlpbmcgdGhpcyBi
eSB0aGUgcHJvYmFiaWxpdHkgdGhhdCB0aGlzIGJ1cnN0IGVycm9yIGlzIG5vdCANCiAgIGRldGVj
dGVkLCBQdWQocCp8MzMpLCB3ZSBnZXQgdGhhdCB0aGUgcHJvYmFiaWxpdHkgdGhhdCBhIGJ1cnN0
IHRoYXQgDQogICBvY2N1cnJlZCBhdCBhIHNwZWNpZmljIHBvc2l0aW9uIGlzIG5vdCBkZXRlY3Rl
ZCBpcyA3Ljc5IHggMTAgXnstNDZ9LiANCiAgICANCiAgIEdvaW5nIGFnYWluIGFsb25nIHRoaXMg
cGF0aCBvZiBjYWxjdWxhdGlvbnMsIHRoaXMgdGltZSBmb3IgYj0zNCB3ZSANCiAgIGZpbmQgdGhh
dCBmKDM0KSA9IDQuODUqMTBeey0yOH0uIE11bHRpcGx5aW5nIGJ5IDEwXnstMTB9IGFuZCBieSAN
CiAgIFB1ZChwKnwzNCkgPSA3LjE1KjEwXnstMTB9IHdlIGdldCB0aGF0IHRoZSBwcm9iYWJpbGl0
eSB0aGF0IGEgYnVyc3QgDQogICBvZiBsZW5ndGggMzQgdGhhdCBvY2N1cnJlZCBhdCBhIHNwZWNp
ZmljIHBvc2l0aW9uIGlzIG5vdCBkZXRlY3RlZCAgDQogICBpcyAzLjQ2KjEwXnstNDd9LiANCiAg
ICANCiAgIExhc3QsIGNvbXB1dGluZyBmb3IgYj0zNSwgd2UgZ2V0IDEqMTBeey0yOX0gKiAxMF57
LTEwfSAqIDEuMjUqMTBeey05fSANCiAgID0gMS4yNSoxMF57LTQ4fS4gDQogICAgDQogICBJdCBs
b29rcyBsaWtlIHRoZSB0b3RhbCBjYW4gYmUgYXBwcm94aW1hdGVkIGF0IDEwXi00NSB3aXRoaW4g
dGhlIA0KICAgYm91bmRzIG9mIHdoYXQgd2UgYXJlIGxvb2tpbmcgZm9yLiANCiAgICANCiAgIFdo
ZW4gd2UgbXVsdGlwbHkgdGhpcyBieSB0aGUgbGVuZ3RoIG9mIHRoZSBjb2RlIChiZWNhdXNlIHRo
dXMgZmFyIHdlIA0KICAgY2FsY3VsYXRlZCBmb3IgYSBzcGVjaWZpYyBwb3NpdGlvbikgd2UgaGF2
ZSAxMF4tNDUgKiA2LjUqMTBeNCA9IA0KICAgNi41KjEwXi00MSBhcyBhbiB1cHBlciBib3VuZCBv
biB0aGUgcHJvYmFiaWxpdHkgb2YgdW5kZXRlY3RlZCBidXJzdCANCiAgIGVycm9yIGZvciBhIGNv
ZGUgb2YgbGVuZ3RoIDhLIEJ5dGVzLiANCiAgICANCiAgIElmIHdlIHN0YXJ0IHRoaXMgd2hvbGUg
Y2FsY3VsYXRpb24gb25jZSBhZ2Fpbiwgd2l0aCBpbml0aWFsIA0KICAgcHJvYmFiaWxpdHkgUChw
fDMzKSB3b3JzZSB0aGFuIHRoZSBiZXN0IHRoYXQgV29sZiBbV29sZjk0al0gZm91bmQuIA0KICAg
V2Ugd2lsbCB0YWtlIHRoZSB3b3JzdCB0aGF0IGhlIGZvdW5kLCB3aGljaCBoZSBwcmVzZW50ZWQg
YWdhaW5zdCB0aGUgDQogICBiZXN0IHRoYXQgaGUgZm91bmQuIEZvciB0aGlzIG9uZSwgUChwKnwz
MykgPSA1LjEqMTBeey02fS4gV2Ugd2lsbCANCiAgIHRodXMgbXVsdGlwbHkgdGhlIGVuZCByZXN1
bHQgd2Ugb2J0YWluZWQgYmVmb3JlLCAxMF57LTQ1fSBieSAxMF40LCANCiAgIHRoZSByYXRpbyBv
ZiB0aGUgYmVzdCBhbmQgdGhlIHdvcnN0IG9mIFdvbGYsIGFuZCBjb25jbHVkZSB0aGF0ICANCiAg
IFdlIGNhbiB0YWtlIDEwXnstNDF9IGFzIGFuIHVwcGVyIGJvdW5kIGZvciB0aGUgcHJvYmFiaWxp
dHkgdGhhdCBhIA0KICAgYnVyc3Qgb2NjdXJyZWQgYnV0IHdhcyBub3QgZGV0ZWN0ZWQgYnkgQ1JD
MzJDLiANCiAgICANCiAgIFdlIGNhbiBhbHNvIGFwcGx5IHRoaXMgb3ZlcmVzdGltYXRpb24gZm9y
IElFRUUgODAyLjMuIA0KICAgIA0KICAgQ29tbWVudDogDQogICAyXnstMzJ9ID0gMi4zMyoxMF57
LTEwfS4gDQogICAgDQozLjIgQ2hlY2tzdW1zIA0KICAgIA0KDQoNCg0KDQoNCg0KDQoNCiAgDQpT
YXRyYW4sIEouICAgICAgIFN0YW5kYXJkcy1UcmFjaywgRXhwaXJlIE9jdG9iZXIgMjAwMSAgICAg
ICAgICAgICAgIDExIAoMDQogICAgICAgICAgICAgICAgICAgICAgIGlTQ1NJIENSQyBjb25zaWRl
cmF0aW9ucyAgICAgIEZlYnJ1YXJ5IDI2LCAyMDAxIA0KIA0KIA0KNC4gUHJvYmFiaWxpdHkgb2Yg
dW5kZXRlY3RlZCBlcnJvcnMgLSBpbmRlcGVuZGVudCBlcnJvcnMgIA0KICAgIA0KNC4xIENSQyAo
ZGVyaXZhdGlvbnMgZnJvbSBbQ2FzdGFnbm9saTkzXSkgDQogICAgDQogICBJbiBbQ2FzdGFnbm9s
aTkzXSBpdCBpcyByZXBvcnRlZCB0aGF0IGZvciBlcHNpbG9uPTEwXi02LCBQdWQgZm9yIGEgDQog
ICBzaW5nbGUgYml0IGVycm9yLCBmb3IgYSBjb2RlIG9mIGxlbmd0aCA4S0IsIGZvciBib3RoIGNh
c2VzLCBJRUVFLQ0KICAgODAyLjMgYW5kIENSQzMyQyBpcyAxMF57LTIwfS4gVGhleSBhbHNvIHJl
cG9ydCB0aGF0IENSQzMyQyBoYXMgDQogICBkaXN0YW5jZSA0LCBhbmQgSUVFRSBlaXRoZXIgMyBv
ciA0IGZvciB0aGlzIGNvZGUgbGVuZ3RoLiBGcm9tIHRoaXMsIA0KICAgYW5kIHRoZSBtaW5pbXVt
IGRpc3RhbmNlIG9mIHRoZSBjb2RlIG9mIHRoaXMgbGVuZ3RoLCB3ZSBjb25jbHVkZSB0aGF0IA0K
ICAgd2l0aCBvdXIgZXN0aW1hdGlvbiBvZiBlcHNpbG9uLCBuYW1lbHkgMTBeey0xMX0sIHdlIHNo
b3VsZCBtdWx0aXBseSANCiAgIHRoZSByZXBvcnRlZCByZXN1bHQgYnkgezEwXnstNX19XjQgPSAx
MF57LTIwfSBmb3IgQ1JDMzJDLCBhbmQgZWl0aGVyIA0KICAgMTBeey0xNX0gb3IgMTBeey0yMH0g
Zm9yIElFRUU4MDIuMy4gDQogICAgDQo0LjIgQ2hlY2tzdW1zIA0KICAgIA0KICAgUGF0IFRoYWxl
ciByZXBvcnRlZCB0aGF0IGZvciBpbmRlcGVuZGVudCBiaXQgZXJyb3JzLCBQdWQgb2YgQ1JDIGlz
IA0KICAgYXBwcm94aW1hdGVseSAxMiwwMDAgYmV0dGVyIHRoYW4gRmxldGNoZXIsIGFuZCAyMiww
MDAgdGhhbiBBZGxlci4gRm9yIA0KICAgYnVyc3QgZXJyb3JzLCBieSB0aGUgc2ltcGxlIGV4YW1w
bGVzIHRoYXQgZXhpc3RzIGZvciB0aHJlZSANCiAgIGNvbnNlY3V0aXZlIHZhbHVlcyB0aGF0IGNh
biBwcm9kdWNlIGFuIHVuZGV0ZWN0ZWQgYnVyc3QsIHdlIHRha2UgdGhlIA0KICAgZmFjdG9yIHRv
IGJlIGF0IGxlYXN0IHRoZSBzYW1lLiANCiAgICANCiAgICANCg0KDQoNCg0KDQoNCg0KDQoNCg0K
DQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KICANClNhdHJhbiwgSi4gICAgICAg
U3RhbmRhcmRzLVRyYWNrLCBFeHBpcmUgT2N0b2JlciAyMDAxICAgICAgICAgICAgICAgMTIgCgwN
CiAgICAgICAgICAgICAgICAgICAgICAgaVNDU0kgQ1JDIGNvbnNpZGVyYXRpb25zICAgICAgRmVi
cnVhcnkgMjYsIDIwMDEgDQogDQogDQo1LiBTdW1tYXJ5IGFuZCBjb25jbHVzaW9ucyANCiAgICAN
CiAgICANCiAgIFRoZSBmb2xsb3dpbmcgdGFibGUgaXMgYSBzdW1tYXJ5IG9mIHRoZSBlcnJvciBk
ZXRlY3Rpb24gY2FwYWJpbGl0aWVzIA0KICAgb2YgdGhlIGRpZmZlcmVudCBjb2RlcyBhbmFseXpl
ZC4gSSB0aGUgdGFibGUgZCBpcyB0aGUgbWluaW1hbCANCiAgIGRpc3RhbmNlIGF0IGJsb2NrIGxl
bmd0aCBibG9jayAoaW4gYml0cyksIEh3IC0gaGFyZHdhcmUgbmVlZGVkIC0gQSANCiAgIG1lYW5z
IGFkZGVyLCBNIG1vZHVsbywgTEZTUiAtIGxpbmVhciBmZWVkYmFjayBzaGlmdCByZWdpc3Rlciwg
aS9ieXRlIA0KICAgLSBzb2Z0d2FyZSBpbnN0cnVjdGlvbnMvYnl0ZSwgVGFibGUgc2l6ZSAoaWYg
dGFibGUgbG9va3VwIG5lZWRlZCksIFQtDQogICBsb29rIG51bWJlciBvZiBsb29rdXBzL2J5dGUs
IFB1ZGIgLSBQdWQgYnVyc3QgYW5kIFB1ZHMgLSBQdWQgDQogICBzcG9yYWRpYzogDQogICAgDQog
ICArLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0NCiAgIC0tLSsgDQogICB8ICBDb2RlICAgICB8IGQgfCBibG9jayB8IEh3
IHwgaS9CeXRlIHwgVC1zaXplIHwgVC1sb29rfCBQdWRiICB8IFB1ZHMgICANCiAgIHwgDQogICAr
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0NCiAgIC0tLSsgDQogICB8IEZMZXRjaGVyMzJ8IDMgfCAyXjE5ICB8IEEgIHwg
MiAgICAgIHwgIC0gICAgIHwgIC0gICAgfCAxMF4tMzd8IDEwXi0NCiAgIDM2IHwgIA0KICAgKy0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tDQogICAtLS0rIA0KICAgfCBBZGxlcjMyICAgfCAzIHwgMl4xOSAgfCBBK018IDMg
ICAgICB8ICAtICAgICB8ICAtICAgIHwgMTBeLTM2fCAxMF4tDQogICAzNSB8IA0KICAgKy0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tDQogICAtLS0rIA0KICAgfCBJRUVFLTgwMiAgfCA0IHwgMl4xNiAgfExGU1J8IDIuNzUg
ICB8IDJeMTggICB8IDAuNS9iIHwgMTBeLTQxfCAxMF4tDQogICA0MCB8IA0KICAgfCAgICAgICAg
ICAgfCAgIHwgICAgICAgfCAgICB8ICAgICAgICB8ICAgICAgICB8ICAgICAgIHwgICAgICAgfC0x
MF4tDQogICAzNSB8IA0KICAgKy0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tDQogICAtLS0rIA0KICAgfCBDUkMzMkMgICAg
fCA0IHwgMl4zMS0xfExGU1J8IDIuNzUgICB8IDJeMTggICB8IDAuNS9iIHwgMTBeLTQxfCAxMF4t
DQogICA0MCB8IA0KICAgKy0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tDQogICAtLS0rIA0KICAgIA0KICAgVGhlIHByb2Jh
YmlsaXRpZXMgZm9yIHVuZGV0ZWN0ZWQgZXJyb3JzIGluIHRoZSBhYm92ZSB0YWJsZSBhcmUgDQog
ICBjb21wdXRlZCBhc3N1bWluZyB1bmlmb3JtbHkgZGlzdHJpYnV0ZWQgZGF0YS4gIEZvciByZWFs
IGRhdGEgLSB0aGF0IA0KICAgY2FuIGJlIGJpYXNlZCAtIFtTdG9uZTk4XSBjaGVja3N1bXMgYmVo
YXZlIHN1YnN0YW50aWFsbHkgd29yc2UgdGhhbiANCiAgIENSQ3MgDQogICAgDQogICBDb25zaWRl
cmluZyB0aGUgcHJvdGVjdGlvbiBsZXZlbCBpdCBvZmZlcnMsIHRoZSBsYWNrIG9mIHNlbnNpdGl2
aXR5IA0KICAgZm9yIGJpYXNlZCBkYXRhIGFuZCB0aGUgbGFyZ2UgYmxvY2sgaXQgY2FuIHByb3Rl
Y3Qgd2UgdGhpbmsgdGhhdCANCiAgIENSQzMyQyBpcyBhIGdvb2QgY2hvaWNlIGFzIGEgYmFzaWMg
ZXJyb3IgZGV0ZWN0aW9uIG1lY2hhbmlzbSBmb3IgDQogICBpU0NTSS4gDQogICAgDQoNCiAgDQpT
YXRyYW4sIEouICAgICAgIFN0YW5kYXJkcy1UcmFjaywgRXhwaXJlIE9jdG9iZXIgMjAwMSAgICAg
ICAgICAgICAgIDEzIAoMDQogICAgICAgICAgICAgICAgICAgICAgIGlTQ1NJIENSQyBjb25zaWRl
cmF0aW9ucyAgICAgIEZlYnJ1YXJ5IDI2LCAyMDAxIA0KIA0KIA0KICAgUGxlYXNlIG9ic2VydmUg
YWxzbyB0aGF0IGJ1cnN0IGVycm9ycyB0aGF0IGFyZSBjaGFyYWN0ZXJpemVkIGJ5IGEgDQogICBm
aXhlZCBhdmVyYWdlIHRpbWUgd2lsbCBoYXZlIGhpZ2hlciBpbXBhY3Qgb24gZXJyb3IgZGV0ZWN0
aW9uIA0KICAgY2FwYWJpbGl0eSBhcyB0aGUgc3BlZWQgb2YgdGhlIGNoYW5uZWxzIChtYWNoaW5l
cyBhbmQgbmV0d29ya3MpIA0KICAgaW5jcmVhc2VzLiBUaGUgb25seSBsb25nLXRlcm0gd2F5IHRv
IGtlZXAgdGhlIFB1ZCB3aXRoaW4gYm91bmRzIGlzIHRvIA0KICAgcmVkdWNlIHRoZSBCRVIgYnkg
dXNpbmcgYmV0dGVyIGNoYW5uZWwgY29kaW5nIChhcyBvcHBvc2VkIHRvIHNvdXJjZSANCiAgIGNv
ZGluZyB3ZSB3aGVyZSBkZWFsaW5nIHdpdGggaGVyZSkuICANCiAgICANCiAgICAgDQoNCg0KDQoN
Cg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0K
DQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KICANClNhdHJhbiwgSi4gICAgICAgU3RhbmRhcmRzLVRy
YWNrLCBFeHBpcmUgT2N0b2JlciAyMDAxICAgICAgICAgICAgICAgMTQgCgwNCiAgICAgICAgICAg
ICAgICAgICAgICAgaVNDU0kgQ1JDIGNvbnNpZGVyYXRpb25zICAgICAgRmVicnVhcnkgMjYsIDIw
MDEgDQogDQogDQo2LiBSZWZlcmVuY2VzIGFuZCBCaWJsaW9ncmFwaHkgDQogDQogICAgICBbQXJh
emldIEIgQXJhemkgQSBjb21tb25zZW5zZSBBcHByb2FjaCB0byB0aGUgVGhlb3J5IG9mIEVycm9y
IA0KICAgICAgQ29ycmVjdGluZyBjb2RlcyANCiAgICAgIFtCYWljaGV2YV1UIEJhaWNoZXZhLCBT
IERvZHVuZWtvdiBhbmQgUCBLYXpha292LiBVbmRldGVjdGVkIA0KICAgICAgZXJyb3IgcHJvYmFi
aWxpdHkgcGVyZm9ybWFuY2Ugb2YgY3ljbGljIHJlZHVuZGFuY3ktY2hlY2sgY29kZXMgDQogICAg
ICBvZiAxNi1iaXQgcmVkdW5kYW5jeS4gSUVFRSBQcm9jZWVkaW5ncyBvbiBDb21tdW5pY2F0aW9u
cywgDQogICAgICAxNDc6MjUzLTI1NiwgT2N0b2JlciAyMDAwIA0KICAgICAgW0JsYWNrXSAiRmFz
dCBDUkMzMiBpbiBTb2Z0d2FyZSIgIGJ5IFJpY2hhcmQgQmxhY2ssIDE5OTQsIGF0ICANCiAgICAg
IHd3dy5jbC5jYW0uYWMudWsvUmVzZWFyY2gvU1JHL2JsdWVib29rLzIxL2NyYy9jcmMuaHRtbCAg
DQogICAgICBbQ2FzdGFnbm9saTkzXSBHdXkgQ2FzdGFnbm9saSwgU3RlZmFuIEJyYWV1ZXIgYW5k
IE1hcnRpbiANCiAgICAgIEhlcnJtYW4gIk9wdGltaXphdGlvbiBvZiBDeWNsaWMgUmVkdW5kYW5j
eS1DaGVjayBDb2RlcyB3aXRoIDI0IA0KICAgICAgYW5kIDMyIFBhcml0eSBCaXRzIiwgSUVFRSBU
cmFuc2FjdC4gb24gQ29tbXVuaWNhdGlvbnMsIFZvbC4gNDEsIA0KICAgICAgTm8uIDYsIEp1bmUg
MTk5MyANCiAgICAgIFtGSVRTXSAiTkFTQSBGSVRTIGRvY3VtZW50cyIgIGF0IA0KICAgICAgaHR0
cDovL2hlYXNhcmMuZ3NmYy5uYXNhLmdvdi9kb2NzL2hlYXNhcmMvb2Z3Zy9kb2NzL2dlbmVyYWwv
Y2hlDQogICAgICBja3N1bS9ub2RlMjYuaHRtbCAgDQogICAgICBbRnVqaXdhcmE4OV0gVG9ydSBG
dWppd2FyYSwgVGFkYW8gS2FzYW1pLCBhbmQgU2h1IExpbi4gk0Vycm9yIA0KICAgICAgZGV0ZWN0
aW5nIGNhcGFiaWxpdGllcyBvZiB0aGUgc2hvcnRlbmVkIGhhbW1pbmcgY29kZXMgYWRvcHRlZCAN
CiAgICAgIGZvciBlcnJvciBkZXRlY3Rpb24gaW4gSUVFRSBzdGFuZGFyZCA4MDIuMyIuIElFRUUg
VHJhbnNhY3Rpb25zIA0KICAgICAgb24gQ29tbXVuaWNhdGlvbnMsIENPTS0zNzo5ODaWOTg5LCBT
ZXB0ZW1iZXIgMTk4OS4gDQogICAgICBbUGV0ZXJzb25dVyBXZXNsZXkgUGV0ZXJzb24gJiBFIEog
V2VsZG9uIC0gRXJyb3IgQ29ycmVjdGluZyANCiAgICAgIENvZGVzIC0gRmlyc3QgRWRpdGlvbiAx
OTYxL1NlY29uZCBFZGl0aW9uIDE5NzIgDQogICAgICBbUkZDMjAyNl0gQnJhZG5lciwgUy4sICJU
aGUgSW50ZXJuZXQgU3RhbmRhcmRzIFByb2Nlc3MgLS0gDQogICAgICBSZXZpc2lvbiAzIiwgUkZD
IDIwMjYsIE9jdG9iZXIgMTk5Ni4gDQogICAgICBbUG9seW5vbWlhbHNdICJJbmZvcm1hdGlvbiBv
biBQcmltaXRpdmUgYW5kIElycmVkdWNpYmxlIA0KICAgICAgUG9seW5vbWlhbHMiIGF0IA0KICAg
ICAgaHR0cDovL3d3dy50aGVvcnkuY3NjLnV2aWMuY2EvfmNvcy9pbmYvbmVjay9Qb2x5SW5mby5o
dG1sICANCiAgICAgIFtSRkMxMTQ2XSBUQ1AgQWx0ZXJuYXRlIENoZWNrc3VtIE9wdGlvbnMgDQog
ICAgICBbUkZDMTk1MF0gWkxJQiBDb21wcmVzc2VkIERhdGEgRm9ybWF0IFNwZWNpZmljYXRpb24g
dmVyc2lvbiAzLjMgDQogICAgICBbU3RvbmU5OF0gSi4gU3RvbmUgZXQuIGFsICJQZXJmb3JtYW5j
ZSBvZiBDaGVja3N1bXMgYW5kIENSQydzIA0KICAgICAgb3ZlciBSZWFsIERhdGEiIElFRUUvQUNN
IFRyYW5zYWN0aW9ucyBvbiBOZXR3b3JraW5nLCBWb2wuIDYsIA0KICAgICAgTm8uIDUsIE9jdG9i
ZXIgMTk5OCANCiAgICAgIFtXaWxsaWFtc10gIFJvc3MgV2lsbGlhbXMgLSBBIFBBSU5MRVNTIEdV
SURFIFRPIENSQyBFUlJPUiANCiAgICAgIERFVEVDVElPTiBBTEdPUklUSE1TIHdpZGVseSBhdmFp
bGFibGUgb24gdGhlIG5ldCAtIChlLmcuLCANCiAgICAgIGZ0cC5hZGVsYWlkZS5lZHUuYXUvcHVi
L3JvY2tzb2Z0L2NyY192My50eHQpIA0KICAgICAgW1dvbGY4Ml0gSi5LLiBXb2xmLCBBcm5vbGQg
TWljaGVsc29uIGFuZCBBbGxlbiBMZXZlc3F1ZS4gT24gdGhlIA0KICAgICAgcHJvYmFiaWxpdHkg
b2YgdW5kZXRlY3RlZCBlcnJvciBmb3IgbGluZWFyIGJsb2NrIGNvZGVzLiBJRUVFIA0KICAgICAg
VHJhbnNhY3Rpb25zIG9uIENvbW11bmljYXRpb25zLCBDT00tMzA6MzE3LTMyNCwgMTk4MiAgDQog
ICAgICBbV29sZjg4XSBKLksuIFdvbGYsIFIuRC4gQmxhY2tlbmV5IEFuIEV4YWN0IEV2YWx1YXRp
b24gb2YgdGhlIA0KICAgICAgUHJvYmFiaWxpdHkgb2YgVW5kZXRlY3RlZCBFcnJvciBmb3IgQ2Vy
dGFpbiBTaG9ydGVuZWQgQmluYXJ5IA0KICAgICAgQ1JDIENvZGVzIC0gIFByb2MuIE1JTENPTSAt
IElFRUUgMTk4OCAgDQogICAgICBbV29sZjk0Sl0gSi5LLiBXb2xmIGFuZCBEZXh0ZXIgQ2h1biBU
aGUgc2luZ2xlIGJ1cnN0IGVycm9yIA0KICAgICAgZGV0ZWN0aW9uIHBlcmZvcm1hbmNlIG9mIGJp
bmFyeSBjeWNsaWMgY29kZXMuIElFRUUgVHJhbnNhY3Rpb25zIA0KICAgICAgb24gQ29tbXVuaWNh
dGlvbnMgQ09NLTQyOjExLTEzLCBKYW51YXJ5IDE5OTQgDQoNCg0KICANClNhdHJhbiwgSi4gICAg
ICAgU3RhbmRhcmRzLVRyYWNrLCBFeHBpcmUgT2N0b2JlciAyMDAxICAgICAgICAgICAgICAgMTUg
CgwNCiAgICAgICAgICAgICAgICAgICAgICAgaVNDU0kgQ1JDIGNvbnNpZGVyYXRpb25zICAgICAg
RmVicnVhcnkgMjYsIDIwMDEgDQogDQogDQogICAgICBbV29sZjk0T10gRGV4dGVyIENodW4gYW5k
IEouSy4gV29sZi4gU3BlY2lhbCBIYXJkd2FyZSBmb3IgDQogICAgICBjb21wdXRpbmcgdGhlIHBy
b2JhYmlsaXR5IG9mIHVuZGV0ZWN0ZWQgZXJyb3IgZm9yIGNlcnRhaW4gDQogICAgICBiaW5hcnkg
Y3JjIGNvZGVzIGFuZCB0ZXN0IHJlc3VsdHMuIElFRUUgVHJhbnNhY3Rpb25zIG9uIA0KICAgICAg
Q29tbXVuaWNhdGlvbnMsIENPTS00MjoyNzY5LTI3NzIgDQogICAgICAgDQogICAgDQogICAgDQoN
Cg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0K
DQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQogIA0KU2F0cmFuLCBKLiAgICAgICBTdGFu
ZGFyZHMtVHJhY2ssIEV4cGlyZSBPY3RvYmVyIDIwMDEgICAgICAgICAgICAgICAxNiAKDA0KICAg
ICAgICAgICAgICAgICAgICAgICBpU0NTSSBDUkMgY29uc2lkZXJhdGlvbnMgICAgICBGZWJydWFy
eSAyNiwgMjAwMSANCiANCiANCjcuIEF1dGhvcidzIEFkZHJlc3NlcyANCiAgICANCiAgICAgICAg
RGFmbmEgU2hlaW53YWxkIA0KICAgICAgICBKdWxpYW4gU2F0cmFuIA0KICAgICAgICBJQk0sIEhh
aWZhIFJlc2VhcmNoIExhYiANCiAgICAgICAgTUFUQU0gLSBBZHZhbmNlZCBUZWNobm9sb2d5IENl
bnRlciANCiAgICAgICAgSGFpZmEgMzE5MDUsIElzcmFlbCANCiAgICAgICAgUGhvbmUgKzk3MiA0
IDgyOSA2MjExIA0KICAgICAgICBFbWFpbDogc2hlaW53YWxkQGlsLmlibS5jb20gSnVsaWFuX1Nh
dHJhbkBpbC5pYm0uY29tIA0KICAgICAgICAgICANCiAgICANCiAgICAgICAgUGF0IFRoYWxlciAN
CiAgICAgICAgQWdpbGVudCAgICAgICANCiAgICANCiAgICANCg0KDQoNCg0KDQoNCg0KDQoNCg0K
DQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCiAgDQpT
YXRyYW4sIEouICAgICAgIFN0YW5kYXJkcy1UcmFjaywgRXhwaXJlIE9jdG9iZXIgMjAwMSAgICAg
ICAgICAgICAgIDE3IAoMDQogICAgICAgICAgICAgICAgICAgICAgIGlTQ1NJIENSQyBjb25zaWRl
cmF0aW9ucyAgICAgIEZlYnJ1YXJ5IDI2LCAyMDAxIA0KIA0KIA0KICAgIA0KRnVsbCBDb3B5cmln
aHQgU3RhdGVtZW50IA0KIA0KICAgIkNvcHlyaWdodCAoQykgVGhlIEludGVybmV0IFNvY2lldHkg
KGRhdGUpLiBBbGwgUmlnaHRzIFJlc2VydmVkLiBUaGlzIA0KICAgZG9jdW1lbnQgYW5kIHRyYW5z
bGF0aW9ucyBvZiBpdCBtYXkgYmUgY29waWVkIGFuZCBmdXJuaXNoZWQgdG8gDQogICBvdGhlcnMs
IGFuZCBkZXJpdmF0aXZlIHdvcmtzIHRoYXQgY29tbWVudCBvbiBvciBvdGhlcndpc2UgZXhwbGFp
biBpdCANCiAgIG9yIGFzc2lzdCBpbiBpdHMgaW1wbGVtZW50YXRpb24gbWF5IGJlIHByZXBhcmVk
LCBjb3BpZWQsIHB1Ymxpc2hlZCANCiAgIGFuZCBkaXN0cmlidXRlZCwgaW4gd2hvbGUgb3IgaW4g
cGFydCwgd2l0aG91dCByZXN0cmljdGlvbiBvZiBhbnkgDQogICBraW5kLCBwcm92aWRlZCB0aGF0
IHRoZSBhYm92ZSBjb3B5cmlnaHQgbm90aWNlIGFuZCB0aGlzIHBhcmFncmFwaCBhcmUgDQogICBp
bmNsdWRlZCBvbiBhbGwgc3VjaCBjb3BpZXMgYW5kIGRlcml2YXRpdmUgd29ya3MuIEhvd2V2ZXIs
IHRoaXMgDQogICBkb2N1bWVudCBpdHNlbGYgbWF5IG5vdCBiZSBtb2RpZmllZCBpbiBhbnkgd2F5
LCBzdWNoIGFzIGJ5IHJlbW92aW5nIA0KICAgdGhlIGNvcHlyaWdodCBub3RpY2Ugb3IgcmVmZXJl
bmNlcyB0byB0aGUgSW50ZXJuZXQgU29jaWV0eSBvciBvdGhlciANCiAgIEludGVybmV0IG9yZ2Fu
aXphdGlvbnMsIGV4Y2VwdCBhcyBuZWVkZWQgZm9yIHRoZSBwdXJwb3NlIG9mIA0KICAgZGV2ZWxv
cGluZyBJbnRlcm5ldCBzdGFuZGFyZHMgaW4gd2hpY2ggY2FzZSB0aGUgcHJvY2VkdXJlcyBmb3Ig
DQogICBjb3B5cmlnaHRzIGRlZmluZWQgaW4gdGhlIEludGVybmV0IFN0YW5kYXJkcyBwcm9jZXNz
IG11c3QgYmUgDQogICBmb2xsb3dlZCwgb3IgYXMgcmVxdWlyZWQgdG8gdHJhbnNsYXRlIGl0IGlu
dG8gbGFuZ3VhZ2VzIG90aGVyIHRoYW4gICAgDQogICBFbmdsaXNoLiANCiAgICANCiAgIFRoZSBs
aW1pdGVkIHBlcm1pc3Npb25zIGdyYW50ZWQgYWJvdmUgYXJlIHBlcnBldHVhbCBhbmQgd2lsbCBu
b3QgYmUgICAgDQogICByZXZva2VkIGJ5IHRoZSBJbnRlcm5ldCBTb2NpZXR5IG9yIGl0cyBzdWNj
ZXNzb3JzIG9yIGFzc2lnbnMuIA0KICAgIA0KICAgVGhpcyBkb2N1bWVudCBhbmQgdGhlIGluZm9y
bWF0aW9uIGNvbnRhaW5lZCBoZXJlaW4gaXMgcHJvdmlkZWQgb24gYW4gDQogICAiQVMgSVMiIGJh
c2lzIGFuZCBUSEUgSU5URVJORVQgU09DSUVUWSBBTkQgVEhFIElOVEVSTkVUIEVOR0lORUVSSU5H
IA0KICAgVEFTSyBGT1JDRSBESVNDTEFJTVMgQUxMIFdBUlJBTlRJRVMsIEVYUFJFU1MgT1IgSU1Q
TElFRCwgSU5DTFVESU5HIA0KICAgQlVUIE5PVCBMSU1JVEVEIFRPIEFOWSBXQVJSQU5UWSBUSEFU
IFRIRSBVU0UgT0YgVEhFIElORk9STUFUSU9OIA0KICAgSEVSRUlOIFdJTEwgTk9UIElORlJJTkdF
IEFOWSBSSUdIVFMgT1IgQU5ZIElNUExJRUQgV0FSUkFOVElFUyBPRiAgIA0KICAgTUVSQ0hBTlRB
QklMSVRZIE9SIEZJVE5FU1MgRk9SIEEgUEFSVElDVUxBUiBQVVJQT1NFLiIgDQoNCg0KDQoNCg0K
DQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCg0KDQoNCiAgDQpTYXRyYW4sIEouICAgICAgIFN0
YW5kYXJkcy1UcmFjaywgRXhwaXJlIE9jdG9iZXIgMjAwMSAgICAgICAgICAgICAgIDE4IAoM

--0__=XVHp06PpulT0oOPGDF6R8rp8CZ14LJlh0b0xIimfQU7bGeH59Tvveuqo--



From owner-ips@ece.cmu.edu  Wed Apr 18 10:26:09 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id KAA11772
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 10:26:08 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ICcel01808
	for ips-outgoing; Wed, 18 Apr 2001 08:38:40 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ICbkr01746
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 08:37:46 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id OAA301976
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 14:37:34 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id OAA66276
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 14:37:34 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A32.00455A98 ; Wed, 18 Apr 2001 14:37:31 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A32.00455901.00@d12mta02.de.ibm.com>
Date: Wed, 18 Apr 2001 14:42:37 +0200
Subject: Re: iSCSI : digest error handling violates EMDP/InDataOrder
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



OK - I misread it. In any case we are not FCP and we don't violate iSCSI
rules.
Even in FCP the real reason for the ordering requirement is media order and
the need to minimize buffering for reordering.
Both can be maintained while requiring retransmission with R2T.

Julo

Santosh Rao <santoshr@cup.hp.com> on 17/04/2001 20:47:14

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   IPS Reflector <ips@ece.cmu.edu>, Fibre Channel T11 reflector
      <fc@network.com>
Subject:  Re: iSCSI : digest error handling violates EMDP/InDataOrder




julian_satran@il.ibm.com wrote:
>
> Santosh,
>
> The bit and the interpretation are protocol specific.
>
> FCP uses it like iSCSI - i.e. the order has to be maintained within a
> sequence


Not true. If you take a look at FCP-2 rev 04 Section 10.1.1.7
description on EMDP, it explicitly states :
"The EMDP bit does not affect the order of frames within a sequence".

For a WRITE command, an EMDP setting of 0 implies that the buffer offset
in R2T requests must be in continuous and increasing order whereas an
EMDP setting of 1 implies the buffer offset in R2T can be out of order.

For a READ command, an EMDP setting of 0 implies the buffer offset in
READ data PDUs is in continuous and increasing order, whereas, an EMDP
setting of 1 implies buffer offset in READ Data PDUs can be out of
order.

Based on the above rules, iSCSI is violating EMDP setting by its error
recovery for data digest errors detected by targets on Data PDUs.
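
The ordering rule above can be sketched as a small check. This is only an
illustration under stated assumptions: the function name check_r2t_order and
the (offset, length) tuple layout are hypothetical and are not taken from the
iSCSI draft or from FCP-2.

```python
def check_r2t_order(r2ts, emdp):
    """Return True if a list of (offset, length) R2T requests obeys the
    EMDP rule as described above.

    emdp == 0: every offset must equal the expected offset, so requests
               are continuous and increasing.
    emdp == 1: offsets may arrive in any order.
    """
    if emdp == 1:
        return True
    expected = 0
    for offset, length in r2ts:
        if offset != expected:
            return False  # out-of-order or repeated offset
        expected = offset + length
    return True


K = 1024
# One 64K R2T, then a retransmission request for 8K at offset 24K.
scenario = [(0, 64 * K), (24 * K, 8 * K)]
print(check_r2t_order(scenario, emdp=0))  # False
print(check_r2t_order(scenario, emdp=1))  # True
```

With EMDP=0 the second request fails the check because the expected offset is
already 64K, which is the exp_off != offset condition in the quoted scenario.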

- Santosh


> (a R2T derived output or the entire input).
> In that sense we are not violating the EMDP.

>
> And BTW the recovery procedure in FCP is similar although a bit more
> complicated than ours and involves also
> a link level sequence.
>
> Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 13/04/2001 03:54:28
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   IPS Reflector <ips@ece.cmu.edu>
> cc:
> Subject:  iSCSI : digest error handling violates EMDP/InDataOrder
>
> Where :
> =======
>
> Section 6.2 (pg 80). Digest Errors
> -----------------------------------
> "If the error is a Data-Digest-Error in a Data-PDU, the target MUST
> either request retransmission with a R2T or answer with a Reject iSCSI
> PDU and abort the task."
>
> Problem :
> ---------
> On a Data digest error detected by a target, it MUST NOT request
> re-transmission of the data PDU thru an R2T if the session login key
> InDataOrder is set to yes. The current rev 05 draft violates
> InDataOrder/EMDP settings by allowing a re-transmission of R2T by
> target.
>
> Scenario :
> ==========
> initiator           target
> ---------           ------
> EMDP=0
> InDataOrder=YES
> (exp_off=0)
>         offset=0,len=64k <------ R2T
>
> --------> data PDUs
> (exp_off = 64K)
>                               data digest error results in
>                      an 8K PDU being dropped at offset 24K.
>
>        offset=24K,len=8K  <------ R2T for missing PDU.
>
> exp_off != offset
>
> - Santosh
 - santoshr.vcf





From owner-ips@ece.cmu.edu  Wed Apr 18 10:29:58 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id KAA11815
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 10:29:56 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ID6hd03731
	for ips-outgoing; Wed, 18 Apr 2001 09:06:43 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from megapathdsl.net (snowbird.megapath.net [216.200.176.7])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ID6Or03715
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 09:06:25 -0400 (EDT)
Received: from [64.7.4.187] (HELO ntwlassie)
  by megapathdsl.net (CommuniGate Pro SMTP 3.4.3)
  with SMTP id 19770468 for ips@ece.cmu.edu; Wed, 18 Apr 2001 06:05:49 -0700
Message-ID: <002101c0c80b$6e057220$320114ac@ntwlassie>
Reply-To: "Justin R. Bendich" <bendich@TrelliSoft.com>
From: "Justin R. Bendich" <bendich@TrelliSoft.com>
To: <ips@ece.cmu.edu>
Subject: Some questions about naming (newbie)
Date: Wed, 18 Apr 2001 08:28:17 -0500
Organization: TrelliSoft, inc.
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----=_NextPart_000_001E_01C0C7E1.84CF21B0"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4029.2901
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4029.2901
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.

------=_NextPart_000_001E_01C0C7E1.84CF21B0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Hello. I'm a developer at TrelliSoft. We have a product that, among other
things, discovers the disks attached to the computer on which we run.
This information is centralized, and an attempt is made to identify disks
visible from multiple computers. That's why I'm interested in the answers
to the following questions. Perhaps some of these are answered in the
draft specification, but it's a long document, so any help (including
reference to specific sections of the draft) would be appreciated.

1. Can the node-name (formerly WWUID) of an iSCSI target change?
2. Can the WWN of a FC target change (sorry -- I'm not that familiar with FC)?
3. Is there any requirement/recommendation that iSCSI devices report their
   node-name in the INQUIRY return (I would suspect VPD page 0x83)?

My preferences for the answers to those questions would be:

1. No
2. No (I think that one's true)
3. Yes

and if the (naming) specification is not so evolved yet, perhaps input at
this point can still affect it?

Justin R. Bendich
TrelliSoft, inc.
(630) 545-0576 x14
bendich@TrelliSoft.com


------=_NextPart_000_001E_01C0C7E1.84CF21B0--



From owner-ips@ece.cmu.edu  Wed Apr 18 10:30:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id KAA11864
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 10:30:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ICReZ01131
	for ips-outgoing; Wed, 18 Apr 2001 08:27:40 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ICRIr01113
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 08:27:19 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id OAA74114;
	Wed, 18 Apr 2001 14:24:07 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta05_cs0 [9.165.222.239])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id OAA120802;
	Wed, 18 Apr 2001 14:24:03 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A32.00441B23 ; Wed, 18 Apr 2001 14:23:53 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: "WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com>
cc: ips@ece.cmu.edu, tsvwg@ietf.org, "'Craig Partridge'" <craig@aland.bbn.com>,
        Jonathan Wood <Jonathan.Wood@sun.com>, xieqb@cig.mot.com,
        Jonathan Stone <jonathan@dsg.stanford.edu>,
        Randall Stewart <rrs@cisco.com>,
        "WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com>
Message-ID: <C1256A32.004419D1.00@d12mta05.de.ibm.com>
Date: Wed, 18 Apr 2001 14:28:58 +0200
Subject: Re: [Tsvwg] [SCTP checksum problems]
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Jim,

Most of the reasons for the CRC are in the text I sent you (and very few of
the arguments against checksums - if you need more of those there are some
preliminary memos on my site http://www.haifa.il.ibm.com/satran/ips). The
CRC selected was from the "optimization" paper that appears in the
references.


Regards,
Julo

"WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com> on 17/04/2001 20:43:51

Please respond to "WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu, tsvwg@ietf.org, "'Craig Partridge'"
      <craig@aland.bbn.com>, Jonathan Wood <Jonathan.Wood@Sun.COM>,
      xieqb@cig.mot.com, Jonathan Stone <jonathan@dsg.stanford.edu>,
      Randall Stewart <rrs@cisco.com>, "WENDT,JIM (HP-Roseville,ex1)"
      <jim_wendt@hp.com>
Subject:  Re: [Tsvwg] [SCTP checksum problems]




Julian,
The SCTP folks are right now discussing changing the SCTP checksum to be a
CRC-32 (or other). This is a very good thing and really what needs to happen
with SCTP for it to support iSCSI and other data-critical applications
effectively (and also relieve iSCSI from having to implement data integrity
checking and transport-like functionality over SCTP).

They are looking for inputs as to which CRC-32 or checksum to use. The iSCSI
WG's CRC investigation work and conclusion would be a valuable input into
their decision. The sooner that you can provide the iSCSI recommended CRC
and reasoning behind it to them, the better, even before the forthcoming I-D
is distributed.

Jim Wendt
Networked Storage Architecture
Hewlett-Packard Company
jim_wendt@hp.com 916-785-5198

-----------------------------------------------------------------------------

> -----Original Message-----
> From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> Sent: Sunday, April 15, 2001 7:58 AM
> To: ips@ece.cmu.edu
> Subject: CRCs
>
>
>
>
> Dear colleagues,
>
> We will probably not be able to finish the CRC/checksum document in time
> for Nashua but we hope it will be out very soon after that. However I
> would like to inform you that while in Orlando and Minneapolis we were
> still talking about different CRCs we (Dafna Sheinwald, Pat Thaler, Matt
> Wakeley, Vince Cavanna and myself) have agreed on a CRC and the forthcoming
> ID will give all the reasons why we recommend it.
>
> Regards,
> Julo
>

-----------------------------------------------------------------------------

-----Original Message-----
From: Randall Stewart [mailto:rrs@cisco.com]
Sent: Tuesday, April 17, 2001 4:31 AM
To: Jonathan Wood
Cc: xieqb@cig.mot.com; tsvwg@ietf.org; Jim Wendt; Jonathan Stone; Craig
Partridge
Subject: Re: [Tsvwg] [SCTP checksum problems]


Jonathan:

I will make sure everyone at the bakeoff is aware of the upcoming
"checksum" change... Now one of the big questions yet is
what checksum should we use?

I kinda lean towards crc-32 myself (but of course I have no technical
basis for this and need to keep silent on which one to use anyway :->),
but do we have other candidates besides fletcher-32 and possibly modified
Adler-32 (i.e. 16 bit adds instead of 8)??

I will take the above 3 and do a bit of performance work this
week and post some numbers... that's about all I can do, i.e. tell
how much time the options I know of take...
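
A timing comparison of this kind could be sketched roughly as follows. This
is a minimal sketch and not the measurement described above: it assumes
Python, uses zlib.crc32 (the IEEE 802.3 CRC-32) and zlib.adler32 (standard
Adler-32, not the modified 16-bit variant), and adds a plain Fletcher-32 over
16-bit little-endian words. The helper name time_checksum is hypothetical.
The zlib routines are compiled while fletcher32 here is interpreted, so the
numbers only show the method and do not rank the algorithms.

```python
import time
import zlib


def fletcher32(data: bytes) -> int:
    """Fletcher-32 over 16-bit little-endian words, zero-padded, mod 65535."""
    if len(data) % 2:
        data += b"\x00"
    sum1 = sum2 = 0
    for i in range(0, len(data), 2):
        word = data[i] | (data[i + 1] << 8)
        sum1 = (sum1 + word) % 65535
        sum2 = (sum2 + sum1) % 65535
    return (sum2 << 16) | sum1


def time_checksum(func, data: bytes, rounds: int = 5) -> float:
    """Best-of-N wall-clock seconds for one pass of func over data."""
    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        func(data)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    payload = bytes(range(256)) * 256  # 64 KiB of test data
    for name, func in [("crc-32", zlib.crc32),
                       ("adler-32", zlib.adler32),
                       ("fletcher-32", fletcher32)]:
        print(f"{name:12s} {time_checksum(func, payload) * 1e6:10.1f} us")
```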

If you have some other candidates let me know and I can possibly get
some performance numbers on these as well...

As far as which is the best... I encourage all of you check-sum
experts out there to please join the thread :)

Oh, I know Jonathan Stone's paper will NOT be ready until sometime
in May.. so we may want to proceed slowly so that Craig Partridge and
he can have some cycles to add to this discussion :)

R

Jonathan Wood wrote:
>
> As an SCTP implementor and someone who will want to get the hardware folks to
> help with checksumming, I wholeheartedly agree with Randy. Remember that SCTP is
> just a proposed standard, and is as such not all that far along the
> standardization process. We should still be able to make changes like this if
> necessary.
>
> Jon
>
> >
> >Q:
> >
> >The only problem with an additional "CRC chunk" is that
> >it makes hardware assistance to error correction much
> >more difficult. It is better (I think) to just realize
> >we made a mistake. Get the opinions of the experts as to
> >what checksum to use... i.e.:
> >
> >- CRC-32
> >- Modified Adler-32 (16 bit word sums)
> >- Fletcher-32
> >- ???
> >
> >And then go with this as a replacement... Admit we were wrong
> >and fix the problem..
> >
> >This way you have ONE and only ONE checksum algorithm making
> >hardware designers life much easier...
> >
> >R
> >
> >Qiaobing Xie wrote:
> >>
> >> Another solution could be (I think I mentioned this to Randy and a few
> >> others at last IETF):
> >>
> >> - Define a CRC-32 (or other strong checksum) control chunk and when the
> >> sender wishes to use a stronger checksum protection, in addition to the
> >> Adler-32 in the common SCTP header it includes this CRC-32 chunk in the
> >> outbound packet. When the packet arrives, the receiver will do the
> >> Adler-32 first, and then if the receiver supports the CRC-32 and sees
> >> the presence of the CRC-32 chunk in the packet it will further verify
> >> the CRC-32.
> >>
> >> We could also use a bit pattern in the chunk type of the CRC-32 chunk so
> >> that if the receiver doesn't understand the CRC-32 chunk it would drop
> >> it with a report back to the sender.
> >>
> >> -Qiaobing
> >>
> >> _______________________________________________
> >> tsvwg mailing list
> >> tsvwg@ietf.org
> >> http://www1.ietf.org/mailman/listinfo/tsvwg
> >
> >--
> >Randall R. Stewart
> >Systems & Solutions Engineering
> >Cisco Systems Inc.
> >rrs@cisco.com 815-342-5222 or 815-477-2127
> >
> >_______________________________________________
> >tsvwg mailing list
> >tsvwg@ietf.org
> >http://www1.ietf.org/mailman/listinfo/tsvwg

--
Randall R. Stewart
Systems & Solutions Engineering
Cisco Systems Inc.
rrs@cisco.com 815-342-5222 or 815-477-2127

>





From owner-ips@ece.cmu.edu  Wed Apr 18 10:31:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id KAA11893
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 10:31:04 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ICggw02081
	for ips-outgoing; Wed, 18 Apr 2001 08:42:42 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ICgMr02058
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 08:42:22 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id OAA232194
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 14:42:15 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id OAA154290
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 14:42:15 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A32.0045C4CE ; Wed, 18 Apr 2001 14:42:03 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A32.0045C316.00@d12mta02.de.ibm.com>
Date: Wed, 18 Apr 2001 14:46:21 +0200
Subject: Re: iSCSI Requirements Draft - Informal WG Last Call
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



There were many reasons for going to a total ordering.
You will find them by looking up discussions starting with the Haifa
meeting last summer.

Julo

Santosh Rao <santoshr@cup.hp.com> on 17/04/2001 21:06:56

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu
Subject:  Re: iSCSI Requirements Draft - Informal WG Last Call





Julian,

Ordered delivery can be achieved to better effect using the SAM-2 CRN
based ordering due to the following reasons :

1) CRN provides ordering on a per-lun basis and can be turned on and off
for a subset of I/Os to that LUN. This allows for flexible ordering
since ordering is a function of the I/O type from the application.
Applications that are doing READ only operations (like a search engine)
do not require any ordering. Ordering is required on metadata updates,
any form of synchronization I/Os, WRITEs interspersed with READS, etc.
Thus, an ordering solution should be flexible enough to be applied at
the scope of a subset of I/Os destined to a LUN.

Such an ordering scheme would also allow ordering to be turned on for
only tape applications if disk applications did not require ordering.

iSCSI's ordering solution does not provide this flexibility, whereas
usage of CRN would.

2) Such a fine granularity scope of ordering also minimizes the impact
of error recovery actions taken when loss of order occurs. The impact
with CRN would be a target-initiator handshake based on ACA + some
checkpoints to error back all the pending CRN enabled commands on that
LUN.

Comparing the equivalent error recovery in a CmdSN based iSCSI ordering
solution, the action taken would be to error back all the pending I/Os
destined to the entire session following loss of order. With high end
disk arrays having 1000+ LUN configurations, such error recovery is
extreme, [especially when ordering may have been desired by the appln
only on a small subset of I/Os to 1 LUN, and loss of ordering for the
remaining 999 LUNs was a don't care].

3) A CRN based ordering scheme works for all underlying SCSI transports
as opposed to CmdSN based ordering.

4) The generation of a stream of commands that expect strong ordering
will need to be accompanied by corresponding generation of a sequence
number at the same layer. (CRN would provide such a sequencing). Failure
to do so can result in silent loss of order that slips un-detected due
to potential points of failure in the stack b/n the SCSI ULP and the
physical bus/link. (ex : I/O failures within the HBA driver due to
resource allocation failures or other such conditions can cause loss of
order.).

Attempts to enforce ordering at multiple layers of the stack (CRN at the
ULP and CmdSN at the LLP), especially when CmdSN does not provide all
the benefits that CRN would provide is over-engineering the solution to
the ordering problem. It also impacts iSCSI performance.
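
The difference in recovery scope argued in points 1 and 2 can be illustrated
with a toy model. This is a sketch under stated assumptions and does not
model actual SAM-2 or iSCSI behavior: the Cmd record and both function names
are hypothetical, and "failed back" simply means returned to the ULP with an
error after a loss of order.

```python
from collections import namedtuple

# Hypothetical record for a pending command; "ordered" marks commands the
# application asked to have delivered in order.
Cmd = namedtuple("Cmd", "tag lun ordered")


def failed_back_crn(pending, lun):
    """Per-LUN scope: only ordered commands on the affected LUN."""
    return [c.tag for c in pending if c.lun == lun and c.ordered]


def failed_back_cmdsn(pending):
    """Session scope: every pending command on the session."""
    return [c.tag for c in pending]


pending = [
    Cmd("a", lun=0, ordered=True),
    Cmd("b", lun=0, ordered=False),
    Cmd("c", lun=1, ordered=False),
    Cmd("d", lun=2, ordered=False),
]
print(failed_back_crn(pending, lun=0))  # ['a']
print(failed_back_cmdsn(pending))       # ['a', 'b', 'c', 'd']
```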

- Santosh

>
> julian_satran@il.ibm.com wrote:
> >
> > Ordered delivery of commands to ANY TYPE of devices will increase in
> > importance as network speeds increase and the need to hide latency
> > increases.
> >
> > Today databases don't use queuing and rely on trickling the commands to
> > devices 1 by 1 to ensure atomicity and order.
> > As latency will become the determining factor in performance this is bound
> > to change.
> >
> > SCSI has done an excellent job in defining the queueing mechanism. We have
> > to make it work with good performance in our environment.
> >
> > Julo
> >
> > Santosh Rao <santoshr@cup.hp.com> on 13/04/2001 04:33:45
> >
> > Please respond to Santosh Rao <santoshr@cup.hp.com>
> >
> > To:   ips@ece.cmu.edu
> > cc:   Black_David@emc.com
> > Subject:  Re: iSCSI Requirements Draft - Informal WG Last Call
> >
> > David & All,
> >
> > I object to the following requirement :
> >
> > " MUST support ordered delivery of SCSI commands from the initiator to
> > the
> >   target, to support SCSI Task Queuing. "
> >
> > Ordered delivery is not a requirement for disk based applications and
> > non tagged queueing tape applications, which form the majority of
> > today's data traffic.
> >
> > To impose strict ordering (even in the presence of errors ?) as a MUST
> > is penalizing the majority of today's data traffic that does not expect
> > ordering from the SCSI subsystem.
> >
> > I am particularly concerned about the effect of the above requirement in
> > the presence of errors. Does iSCSI expect strict ordering to be
> > maintained even when individual I/O errors like ULP timeout occur ?
> >
> > On a ULP timeout (caused by, say, a hole in CmdSN), the initiator may
> > choose not to retry the command, but instead, error it back to the ULP.
> > In such a case, it can plug the hole in CmdSN with a NOP-OUT.
> >
> > The above requirement cannot be met under such circumstances
> > and others similar to this. Mandating strict ordering on ULP timeouts
> > implies a session level error recovery on any individual I/O being
> > failed back from iSCSI to SCSI ULP. This is a very heavy hammer to use
> > as error recovery and should not be imposed.
> >
> > The above requirement must be changed to :
> > " SHOULD support ordered delivery of SCSI commands from the initiator
> >   to the target, to support SCSI Task Queuing. "
> >
> > - Santosh
> >
> > Black_David@emc.com wrote:
> > >
> > > It is intended to submit draft-ietf-ips-iscsi-reqmts-02.txt
> > > as an Informational RFC. There is no formal requirement for
> > > a WG Last Call, but if you have any further substantive comments
> > > on the document please raise them on this list within the next
> > > two weeks, i.e. by April 27th at the latest.
> > >
> > > If you have typographical/editorial comments please send them
> > > direct to the document's author, Marjorie Krueger
> > > <marjorie_krueger@hp.com>.
> > >
> > > Thanks,
> > > --David and Elizabeth, IPS WG co-chairs





From owner-ips@ece.cmu.edu  Wed Apr 18 11:37:01 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA13036
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 11:37:00 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IE2r007718
	for ips-outgoing; Wed, 18 Apr 2001 10:02:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IE20r07643
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 10:02:00 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id KAA23161; Wed, 18 Apr 2001 10:01:48 -0400 (EDT)
Message-ID: <3ADD9DF5.FD7711DF@cisco.com>
Date: Wed, 18 Apr 2001 09:00:21 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: "Justin R. Bendich" <bendich@TrelliSoft.com>
CC: ips@ece.cmu.edu
Subject: Re: Some questions about naming (newbie)
References: <002101c0c80b$6e057220$320114ac@ntwlassie>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Justin-

The answers to all three questions should generally be "no".

The iSCSI node name of a target is independent of the target's
addresses, and does not change over time.  I believe the same to
be true of a FC WWN.  However, care must be taken with FC WWNs;
there are several types:

1. A Target WWPN (world-wide port name) is different for each
   interface on an FC device, and does not uniquely identify the
   target.

2. A Target WWNN (world-wide node name) is often just set to be
   the same as the port name, perhaps with a high-order bit set.
   Although I think it was intended to be a way to correlate
   different ports on the same target, I don't think that all
   implementations have done it that way, so it may not be reliable.

3. A LUN WWN is at the SCSI LUN level, and can be used for identifying
   LUNs regardless of whether the underlying transport is FC, parallel
   SCSI, or iSCSI.  However, most FC devices do not support the LUN
   WWN, and use serial numbers, device IDs, and other mechanisms to
   correlate them.  To do this, one must implement device-type-specific
   code.

Anyway, those were just a few observations that might help.

On question 3, keep in mind that an INQUIRY is done at the SCSI
level, and is independent of the SCSI transport mechanism used.
Therefore, the VPD information would not return anything that would
reflect an iSCSI identifier.  The ability to return a LUN WWN (from
(3) above) on this page really is a SCSI-level thing, and has nothing
to do with FC other than that it shares the same address format.

At any rate, in order to connect to a SCSI device via iSCSI to do
the INQUIRY, you must first log in to the target at the iSCSI level.
To do this, you would already have the iSCSI name (formerly WWUI),
so you should already have the information you need without the
inquiry data.
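
For the disk-correlation use case in the original question, the points
above suggest keying on the iSCSI node name rather than on any address.
A minimal Python sketch, assuming the discovery tool records one
(host, address, node name) tuple per sighting. The function name and the
sample hosts, addresses, and names are hypothetical:

```python
from collections import defaultdict


def group_by_node_name(sightings):
    """Group (host, target_address, node_name) sightings by node name.

    The node name is the key because it is independent of the target's
    addresses; the addresses are kept only as attributes of the target.
    """
    groups = defaultdict(set)
    for host, address, node_name in sightings:
        groups[node_name].add((host, address))
    return dict(groups)


# Hypothetical sightings: two hosts reach the same target via different addresses.
sightings = [
    ("host-a", "192.0.2.10:5000", "example-target-1"),
    ("host-b", "192.0.2.11:5000", "example-target-1"),
    ("host-a", "192.0.2.20:5000", "example-target-2"),
]
groups = group_by_node_name(sightings)
for name in sorted(groups):
    print(name, sorted(groups[name]))
```

Both sightings of example-target-1 land in one group even though the
addresses differ, which is the property the tool needs.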

Please let me know if this will meet your requirements, or if you
have other naming requirements.  We will be publishing another naming
draft shortly.  The draft is very close to being finished, but we
would like to know if there are needs it does not address, or that
we need to document better.

Hope this helps,

Mark

> "Justin R. Bendich" wrote:
> 
> Hello. I'm a developer at TrelliSoft. We have a product that, among other
> things, discovers the disks attached to the computer on which we run.
> This information is centralized, and an attempt made to identify disks
> visible from multiple computers. That's why i'm interested in the answers
> to the following questions. Perhaps some of these are answered in the
> draft specification, but it's a long document, so any help (including refer-
> ence to specific sections of the draft) would be appreciated.
> 
> 1. Can the node-name (formerly WWUID) of an iSCSI target change?
> 2. Can the WWN of a FC target change (sorry -- i'm not that familiar with FC)?
> 3. Is there any requirement/recommendation that iSCSI devices report their
>    node-name in the INQUIRY return (i would suspect VPD page 0x83)?
> 
> My preferences for the answers to those questions would be:
> 
> 1. No
> 2. No (i think that one's true)
> 3. Yes
> 
> and if the (naming) specification is not so evolved yet, perhaps input at
> this point can still affect it?
> 
> Justin R. Bendich
> TrelliSoft, inc.
> (630) 545-0576 x14
> bendich@TrelliSoft.com

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Wed Apr 18 11:38:13 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA13086
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 11:38:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IEEj908523
	for ips-outgoing; Wed, 18 Apr 2001 10:14:45 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IEDfr08462
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 10:13:42 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id QAA93546
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 16:13:32 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id QAA140826
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 16:13:31 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A32.004E20AA ; Wed, 18 Apr 2001 16:13:21 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A32.004E1F02.00@d12mta02.de.ibm.com>
Date: Wed, 18 Apr 2001 16:18:25 +0200
Subject: Re: iSCSI : login keys & mode page settings
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

Comments in text.

Julo

Santosh Rao <santoshr@cup.hp.com> on 18/04/2001 02:19:27

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   IPS Reflector <ips@ece.cmu.edu>
cc:   T10 Reflector <t10@t10.org>
Subject:  iSCSI : login keys & mode page settings




All/Julian,

The iSCSI draft is lacking sufficient description on the subject of mode
page settings specific to iSCSI, their corresponding iSCSI login keys
and the interactions between these 2 mechanisms. Specific comments
enclosed below in that regard :


1) The iSCSI draft needs to describe the layout of the protocol specific
mode pages, namely, disconnect-reconnect mode page, protocol specific
lun page and protocol specific port page as applicable to iSCSI. Such a
figurative and textual description should be along the lines of that in
FCP-2 Section 10.

+++ I don't know what version you are reading - both 5.91 and 5.92 have
them. +++

2) Specifically, the iSCSI draft lacks a description of the layout of
the protocol specific lun page and, in its absence, describes a field
from this page called EnableCmdRN. This field is non-existent in the
SPC-2 description of this page in Section 8.3.10.

+++ See above - this is a protocol specific page
+++

3) On a side note, the EnableCmdRN  & CmdRN fields should be re-named to
EnableCRN and CRN to reflect the same semantics and context as the CRN
defined in SAM-2 and FCP-2.

+++ what's in a name... +++

4) The EnableCmdRN login key should be removed from the list of iSCSI
login keys as this is a per-LUN key and iSCSI login keys have the scope
of a session. IOW, EnableCmdRN should be negotiated through a mode
select only and not through iSCSI login.

+++ I realized this and you are reading old stuff !!+++


5) On a more fundamental note, should iSCSI allow for 2 levels of
control of I-T[-L] nexus operational parameters thru both the mode
select/sense scsi mechanisms and iSCSI login key mechanisms ?

+++ It is all about function - several people felt that the (primitive)
negotiation element in the text commands is better than trying to set a
parameter to an unacceptable value and finding this out through a mode
sense +++

Ex :
---------------------------------------------------------------------
iSCSI login key         SCSI mode page parameter
---------------------------------------------------------------------
DataPDULength           - Max Burst Size (Disc-reconn mode page)
FirstBurstSize          - First Burst Size (disc-reconn mode page)
InDataOrder             - EMDP (disconn-reconn mode page)
EnableCmdRN             - Enable CRN (LUN control mode page)

If such control is to be allowed at both the SCSI ULP and iSCSI
transport layers, a communication mechanism should be defined to
synchronize the state of these operational parameters across the 2
layers when a change is made in either layer through its corresponding
mechanisms.

Ex :
a) Change thru iSCSI login key should result in an up call to update the
SCSI ULP.
b) Change made thru mode select should result in a down call to update
iSCSI LLP.
c) Change thru iSCSI login key should result in an up call to SCSI ULP
to cause a UNIT ATTENTION indicating "Mode Parameters Changed".
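
The synchronization described in (a) through (c) can be sketched roughly
as follows. This is only an illustration in Python under stated
assumptions: the class and method names are hypothetical, the key-to-page
mapping is copied from the table above, and unit or encoding differences
between a login key value and its mode page field are ignored.

```python
# Mapping taken from the table above: login key -> (mode page, field).
KEY_TO_MODE_PARAM = {
    "DataPDULength": ("disconnect-reconnect", "Max Burst Size"),
    "FirstBurstSize": ("disconnect-reconnect", "First Burst Size"),
    "InDataOrder": ("disconnect-reconnect", "EMDP"),
    "EnableCmdRN": ("LUN control", "Enable CRN"),
}
MODE_PARAM_TO_KEY = {param: key for key, param in KEY_TO_MODE_PARAM.items()}


class NexusParameters:
    """Hypothetical holder of the parameters visible at both layers."""

    def __init__(self):
        self.login_keys = {}       # state seen by the iSCSI layer
        self.mode_pages = {}       # state seen by the SCSI ULP
        self.unit_attentions = []  # pending UNIT ATTENTION conditions

    def set_login_key(self, key, value):
        """Change made through an iSCSI login/text key."""
        self.login_keys[key] = value
        param = KEY_TO_MODE_PARAM.get(key)
        if param is not None:
            self.mode_pages[param] = value  # (a) up call to the SCSI ULP
            self.unit_attentions.append("MODE PARAMETERS CHANGED")  # (c)

    def mode_select(self, page, field, value):
        """Change made through a SCSI MODE SELECT."""
        self.mode_pages[(page, field)] = value
        key = MODE_PARAM_TO_KEY.get((page, field))
        if key is not None:
            self.login_keys[key] = value  # (b) down call to the iSCSI layer


nexus = NexusParameters()
nexus.set_login_key("FirstBurstSize", 64)
nexus.mode_select("disconnect-reconnect", "EMDP", 1)
print(nexus.mode_pages, nexus.login_keys, nexus.unit_attentions)
```

The sketch makes the cost visible: every shared parameter needs an entry
in both directions, and every login-key change queues a UNIT ATTENTION.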

+++ I view the ULP as the source for generating the text commands through a
new interface (give some credit to implementers). Whenever this is not the
case the SCSI will use the old mode set commands.
iSCSI will not set parameters on its own +++


6) If such a level of dual control is provided, the iSCSI login
keys listed above should be made LO (leading only) to allow for changes to
operational parameters only during session login. This is to
minimize/eliminate disruption of ongoing I/O activity that occurs due to
the generation of a UNIT ATTENTION CHECK CONDITION when any change is
made to the above parameters.


7) If only 1 mechanism of control is desired, which of the following
alternatives is desirable :
i) Only settable thru mode select and seen thru mode sense
Pros :
------
- Allows 1 mechanism of control.
- Removes the need for synchronization of these values across SCSI ULP &
iSCSI.

Cons :
------
- Requires setting for all LUNs to enable for the entire session.


ii) Settable thru iSCSI and also settable/viewable thru mode sense.
Pros :
------
- flexible and allows control thru both scsi & iSCSI.
Cons :
------
- Can lead to synchronization overheads.
- Needs SP(save page) setting also to be communicated in synching iSCSI
login to mode page values.


iii) Only settable thru iSCSI and viewable thru mode sense.
Pros:
-----
- Single mechanism of modification avoiding synchronization issues in
setting.
Cons :
------
- Denies the traditional mechanism of modification (mode select).
- May break existing applications if enforced thru SPC-2.
- Requires changes to SPC-2.
- iSCSI compliance requires changes in the SCSI ULP, which must stop
using mode select for parameter changes that are shared with iSCSI.


8) If these operational parameters are allowed to be set thru iSCSI
login and they also impact mode page settings, iSCSI spec should
describe the scope of the mode page setting in terms of whether this
setting is a saved page setting or not ?

9) Should saved page settings be allowed thru iSCSI ?


- Santosh





From owner-ips@ece.cmu.edu  Wed Apr 18 11:39:26 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA13107
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 11:39:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IDchL06010
	for ips-outgoing; Wed, 18 Apr 2001 09:38:43 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sj-msg-core-1.cisco.com (sj-msg-core-1.cisco.com [171.71.163.11])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IDcIr05990
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 09:38:18 -0400 (EDT)
Received: from sponge.cisco.com (sponge.cisco.com [171.71.61.25])
	by sj-msg-core-1.cisco.com (8.9.3/8.9.1) with ESMTP id GAA21241;
	Wed, 18 Apr 2001 06:38:12 -0700 (PDT)
Received: from dap02w2k (sjc-vpn-270.cisco.com [10.21.65.14])
	by sponge.cisco.com (Mirapoint)
	with SMTP id ABQ01620;
	Wed, 18 Apr 2001 08:38:06 -0500 (CDT)
From: "Dave Peterson" <dap@cisco.com>
To: "Douglas Otis" <dotis@sanlight.net>, "Santosh Rao" <santoshr@cup.hp.com>,
        "Charles Monia" <cmonia@NishanSystems.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: iSCSI & Linked Commands
Date: Wed, 18 Apr 2001 08:35:40 -0500
Message-ID: <EDEKKDKNBFCABNBAAOBBEEAHCEAA.dap@cisco.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
In-Reply-To: <NEBBJGDMMLHHCIKHGBEJOEHCCGAA.dotis@sanlight.net>
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> Douglas Otis
> Sent: Wednesday, April 18, 2001 12:03 AM
> To: Santosh Rao; Charles Monia
> Cc: ips@ece.cmu.edu
> Subject: RE: iSCSI & Linked Commands
>
>
> Santosh,
>
> Linked commands typically employ a feature that allows relative addressing
> for sequential devices.  Link bits are still present within the CDB even
> for fibre channel.

Humm, I'm not aware of this feature for sequential devices.
Also, for FC-TAPE compliant devices command linking is prohibited.
Dave



From owner-ips@ece.cmu.edu  Wed Apr 18 11:40:01 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA13119
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 11:39:59 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IDijQ06384
	for ips-outgoing; Wed, 18 Apr 2001 09:44:45 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IDiKr06363
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 09:44:20 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id PAA244394
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 15:44:11 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id PAA223486
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 15:44:12 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A32.004B734B ; Wed, 18 Apr 2001 15:44:07 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A32.004B733A.00@d12mta02.de.ibm.com>
Date: Wed, 18 Apr 2001 15:49:17 +0200
Subject: Re: iSCSI : More problems with Status SNACK !
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



If StatSN is going to become optional, a negative response to SNACK while
keeping up counters for compliance could be a good enough solution for a
good enough target.

But I am still not sure that this is the way to go.

Julo

Santosh Rao <santoshr@cup.hp.com> on 18/04/2001 00:36:21

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   IPS Reflector <ips@ece.cmu.edu>
cc:
Subject:  Re: iSCSI : More problems with Status SNACK !




julian_satran@il.ibm.com wrote:
>
> 2) SNACK mechanism cannot be relied upon for resource cleanup for the
> following reasons :
>
> a) SNACK support MUST be mandatory at the target and target can NEVER
> fail a Status SNACK.
> b) Initiators MUST always use a Status SNACK and this is not possible on
> a ULP timeout. IOW, there exist I/O timeout and other circumstances when
> the initiator gives up and does not attempt SACK (suppose SACK itself
> got a digest error at the target and timed out at the initiator !).
>
> Since the current SNACK model is heavily dependent on the above
> assumptions [which cannot be met], failure of SNACK blocks further
> forward progress with resource cleanup at the target since all further
> I/O completions beyond the hole StatSN cannot be acknowledged.
>
> In the worst case, any I/O timeout would imply session level error
> recovery since the target will no longer be able to reclaim resources.
>
> +++ any ULP timeout must include an abort for the task to clean up the
> target +++

Julian,

The Abort Task sent by initiator on the ULP timeouts cleans up resources
for that specific task.

The issue under debate was that the spec does not have a mechanism by
which, once a hole is created, [which cannot be filled by the Status
SNACK,] the initiator can switch back to bulk acknowledgements. i.e.
while the timed out I/O resources may be released thru the Abort Task,
the remaining tasks completed thereafter are unable to be acknowledged
by the initiator.
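
To make the hole problem concrete, here is a rough Python sketch. It
assumes a purely cumulative acknowledgement (ExpStatSN is the lowest
StatSN the initiator has not yet received) and ignores sequence number
wraparound; the function names are hypothetical and not from the draft.

```python
def exp_stat_sn(received, first_stat_sn):
    """Lowest StatSN not yet received, i.e. the cumulative ack point."""
    sn = first_stat_sn
    while sn in received:
        sn += 1
    return sn


def releasable(outstanding, exp):
    """StatSNs whose target resources the cumulative ack allows freeing."""
    return sorted(sn for sn in outstanding if sn < exp)


# Target sent status for StatSN 1..5; the response carrying StatSN 3 was
# lost and its task later timed out at the ULP.
outstanding = {1, 2, 3, 4, 5}
received = {1, 2, 4, 5}

exp = exp_stat_sn(received, first_stat_sn=1)
freed = releasable(outstanding, exp)
stuck = sorted(outstanding - set(freed))
print(exp, freed, stuck)
```

ExpStatSN stays at 3, so the target may free only 1 and 2. StatSN 4 and
5 were received but cannot be acknowledged until the hole at 3 is filled
or skipped by some other mechanism.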

>
> Proposal :
> ==========
> 1) Negotiate Status SACK support at login time.
> 2) Do not use StatSN when Status SACK is not supported.
> 3) Modify the current SNACK PDU to eliminate "Additional run Length"
> (which is of no practical use currently) and replace with an explicit
> positive ack run described by ack_begrun and ack_run_length.
>
> Comments ?
> +++ I am basically against options - If I can avoid them.
> I don't see how an optional SNACK and StatSN would simplify a
> target/initiator while still allowing command recovery without popping
> errors into SCSI +++

If a target does not support Status SNACK, then, such a target is
effectively releasing its I/O resources upon completion. This implies
that the target is neither capable of supporting SNACK nor the "retry"
(or replay) concept.

In such cases, command recovery may occur at the SCSI layer or iSCSI may
retry the task at its layer. For such simple implementations that don't
resort to complex status recovery techniques, StatSN has no value add
and only creates complexity by potential holes.

Of course, any such implementation may typically ignore ExpStatSN and
continue to release resources as the I/Os complete with a StatSN being
initialized only for compliance. Rather than have such a behaviour, the
spec should allow for these implementations by letting them use StatSN
of 0 as a don't care.

- Santosh





From owner-ips@ece.cmu.edu  Wed Apr 18 12:37:20 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA14038
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 12:37:19 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IERne09548
	for ips-outgoing; Wed, 18 Apr 2001 10:27:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from aland.bbn.com (aland.bbn.com [204.162.9.10])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IE8Ur08089
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 10:08:30 -0400 (EDT)
Received: from aland.bbn.com (localhost [127.0.0.1])
	by aland.bbn.com (8.11.3/8.11.3) with ESMTP id f3IE85P06380;
	Wed, 18 Apr 2001 10:08:05 -0400 (EDT)
Message-Id: <200104181408.f3IE85P06380@aland.bbn.com>
To: julian_satran@il.ibm.com
cc: Randall Stewart <rrs@cisco.com>,
        "WENDT,JIM (HP-Roseville, ex1)" <jim_wendt@hp.com>, ips@ece.cmu.edu,
        tsvwg@ietf.org, "'Craig Partridge'" <craig@aland.bbn.com>,
        Jonathan Wood <Jonathan.Wood@sun.com>, xieqb@cig.mot.com,
        Jonathan Stone <jonathan@dsg.stanford.edu>, craig@aland.bbn.com
Subject: Re: [Tsvwg] [SCTP checksum problems] 
In-Reply-To: Your message of "Wed, 18 Apr 2001 08:27:04 +0200."
             <C1256A32.00238F3F.00@d12mta05.de.ibm.com> 
Date: Wed, 18 Apr 2001 10:08:05 -0400
From: Craig Partridge <craig@aland.bbn.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Hi Julian:

I read your ID quickly and may have more comments later.

I think the bit error study is flawed.  The key issue here is that
the TCP, and Fletcher checksum (and I think Adler) will catch all
1 bit errors.  Indeed, the TCP checksum will catch up to 16 distinct
bit errors, provided that the bit offsets are all distinct, modulo 16.

Expressing this simply, assuming my system did its Poisson calculations
right:

    * we don't have to worry about single bit errors in a chunk of
      data, the checksum will always catch them.

    * the probability of 2 bit errors occurring in a single 1KB packet
      is about 1 in 3*10^15.  The probability that the two bits will
      have the same offset mod 16 is around .008 (if I did the distribution
      right) and the chance is 50% that the two bits will offset each other
      (i.e. they'll flip opposite ways).  Ergo, your chance of an
      undetected error is about 1 in 8*10^17.  I believe (without having
      done all the math) that is as good or better than you'll get for
      CRC32 in almost all cases.  Since the checksums are cheaper to compute,
      they're a win.

    * more than 2 bit errors per chunk is so rare we probably don't care

So the big issue is burst errors (and, incidentally, burst erasures).
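
The two claims above (single bit errors are always caught; two errors
escape only when they sit at the same offset mod 16 and flip opposite
ways) can be checked with a small Python sketch. It implements the usual
16-bit ones' complement sum over big-endian words, without the TCP
pseudo-header. The helper names and the test payload are hypothetical,
and the probabilities at the end are simply the figures quoted above,
not independently derived.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (bit 0 = MSB of byte 0)."""
    out = bytearray(data)
    out[bit_index // 8] ^= 0x80 >> (bit_index % 8)
    return bytes(out)


# 1 KB payload alternating an all-zero word and an all-one word.
payload = bytes([0x00, 0x00, 0xFF, 0xFF]) * 256
good = internet_checksum(payload)

one_flip = flip_bit(payload, 3)                      # single error
cancelling = flip_bit(flip_bit(payload, 3), 16 + 3)  # 0->1 and 1->0, same offset mod 16
same_way = flip_bit(flip_bit(payload, 3), 32 + 3)    # 0->1 twice, same offset mod 16

print(internet_checksum(one_flip) != good)    # single error is detected
print(internet_checksum(cancelling) == good)  # opposite flips cancel out
print(internet_checksum(same_way) != good)    # same-direction flips are detected

# Arithmetic from the message, using its own figures.
p_two_errors = 1 / 3e15   # two bit errors in one 1 KB packet
p_same_offset = 0.008     # same offset mod 16 (figure as quoted)
p_opposite = 0.5          # the two flips go opposite ways
p_undetected = p_two_errors * p_same_offset * p_opposite
print(f"about 1 in {1 / p_undetected:.1e}")
```

With those inputs the product works out to roughly 1 in 7.5*10^17,
which the message rounds to 8*10^17.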

Thanks!

Craig


From owner-ips@ece.cmu.edu  Wed Apr 18 15:12:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16309
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 15:12:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IERq009574
	for ips-outgoing; Wed, 18 Apr 2001 10:27:52 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IED2r08431
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 10:13:02 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <2G6LM7VR>; Wed, 18 Apr 2001 10:03:39 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015459@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: sandeepj@research.bell-labs.com, ips@ece.cmu.edu
Subject: RE: aborting an out of sequence cmdSN
Date: Wed, 18 Apr 2001 10:12:56 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Sandeep,

> The "iSCSI cancel" proposal you describe below has been presented 
> once before.  If you recall, I was asking for a refCmdSN in the 
> TaskMgmt PDU.
> 
> The only difference between that and what you now describe below 
> is the addition of this concept of an "iSCSI service error" response.

Close enough - sorry for the lack of acknowledgement.  Of the four
options, (1) and (3) make the most sense to me.  Assuming that
we document the iSCSI-based effects of task management operations
(probably should do this not only for task abort, but also task set
abort and task set clear):

> (1) use connection allegiance for TASK MGMT PDU.

Ok, with the possible concern about something going wrong on
that connection.

> (2) reject all commands prior to cmdSN of TASK MGMT PDU.

I don't see anyone aside from Doug interested in this, so I consider
WG rough consensus to have rejected this approach.

> (3) cmdSN of original task is sent with TASK MGMT PDU and
>     target at the iSCSI layer keeps state.

The state's not a big deal - in essence the iSCSI target pretends it
received a NOP instead of the actual command if the abort arrives
ahead of the command.

> (4) iSCSI initiator retains state for deleted tasks to ensure
>     that R2T/Scsi Responses are appropriately handled.

Ugly, although errant R2Ts may show up if the abort/clear is
across connections.  With the iSCSI enhancements, the
initiator should be able to throw these away.
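
As a rough illustration of option (3), here is a Python sketch of the
target-side state. It is only a sketch under stated assumptions: the
class and method names are hypothetical, sequence number wraparound is
ignored, and an abort for a command that was already delivered is left
to normal SCSI task management.

```python
class TargetCmdWindow:
    """Hypothetical target-side CmdSN bookkeeping for option (3)."""

    def __init__(self, exp_cmd_sn):
        self.exp_cmd_sn = exp_cmd_sn
        self.aborted = set()  # CmdSNs aborted before the command arrived
        self.pending = {}     # out-of-order arrivals: CmdSN -> command or None
        self.delivered = []   # commands handed to the SCSI layer, in order

    def abort(self, ref_cmd_sn):
        """Task management abort carrying the CmdSN of the original task."""
        if ref_cmd_sn < self.exp_cmd_sn:
            return  # already delivered: ordinary task abort, not shown here
        if ref_cmd_sn in self.pending:
            self.pending[ref_cmd_sn] = None  # queued command becomes a NOP
        else:
            self.aborted.add(ref_cmd_sn)     # remember until it shows up

    def receive(self, cmd_sn, command):
        """Command PDU arrives, possibly out of order across connections."""
        if cmd_sn in self.aborted:
            self.aborted.discard(cmd_sn)
            command = None  # pretend a NOP arrived instead
        self.pending[cmd_sn] = command
        while self.exp_cmd_sn in self.pending:
            ready = self.pending.pop(self.exp_cmd_sn)
            if ready is not None:
                self.delivered.append(ready)
            self.exp_cmd_sn += 1  # a NOP still consumes its CmdSN


window = TargetCmdWindow(exp_cmd_sn=10)
window.receive(11, "WRITE B")  # arrives early, must wait for CmdSN 10
window.abort(10)               # abort overtakes the command it refers to
window.receive(10, "WRITE A")  # treated as a NOP, fills the hole
print(window.delivered, window.exp_cmd_sn)
```

The aborted command never reaches the SCSI layer, yet its CmdSN is
consumed, so later commands are not blocked behind a hole.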

--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Wed Apr 18 16:09:17 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA16983
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 16:09:16 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IIN6P26443
	for ips-outgoing; Wed, 18 Apr 2001 14:23:06 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from crufty.research.bell-labs.com (crufty.research.bell-labs.com [204.178.16.49])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3IIMKr26415
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 14:22:20 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by crufty; Wed Apr 18 14:19:18 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Wed Apr 18 14:21:38 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id OAA02525
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 14:21:34 -0400 (EDT)
Message-ID: <3ADDDB2E.3FEED1E0@research.bell-labs.com>
Date: Wed, 18 Apr 2001 14:21:34 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: target discovery issue
References: <C1256A32.00621D20.00@d12mta05.de.ibm.com> <3ADDD8E4.C2D974C2@cisco.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


just realized that the reflector is not seeing this discussion.
the question at hand is how should target discovery notification 
be sent in the iSCSI world.

Mark Bakke wrote:
> 
> Actually, I think that part of it is an iSCSI issue.  That is,
> if a new target is created, that's at the SCSI level.  But if I
> add an iSCSI address on which to access that target, it now must
> be discovered first by the iSCSI layer on the host, before it
> can be presented to the SCSI layer.  In this case, we would need
> to send an iSCSI event indicating that there is a new target (or
> at least that there is some change in availability of targets);
> the host would then use SendTargets to find out the specifics.
> 
> This brings up Sandeep's question #2.  If I am a target, I can
> send this message either:
> 
>  a) On every iSCSI connection
> 
>   OR
> 
>  b) On all connections to canonical targets
> 
> Method a gives us better coverage, and does not require an
> initiator to keep its canonical target connection around in
> between these little sendtargets commands.  However, if an
> initiator logs into a canonical target, finds that it has no
> targets to connect to (yet), and one is added later, the
> initiator would only find out if it had kept its canonical
> target connection, unless it is using an out-of-band discovery
> mechanism.
> 
> Method a will also tend to bother connections to targets
> that are doing the "real" work (data path stuff).
> 
> Method b will keep these events away from the data path, and
> will not generally have to send so many events.  However, it
> would require each initiator that wanted to be notified to keep
> its canonical connection around.
> 
> There is a Method C, which is a combination of the above:
> 
>  c) The device will send this async event message on ONE of the
>     connections to each initiator name (formerly WWUI) that is
>     connected to it.  If one of these connections is to the
>     canonical target, the device will use that one.
> 
> Method c allows the initiator to choose whether it would rather
> keep an explicit canonical target connection around (e.g. if the
> other connections have been pushed down to hardware), or whether
> it would rather not keep the connection around, and be notified
> on one of the others.  The number of messages sent by targets
> would be identical to that in method b.
> 
> --
> Mark
> 
> julian_satran@il.ibm.com wrote:
> >
> > Sandeep,
> >
> > I think we are deep in T10 territory - this is a SCSI issue.
> >
> > Julo
> >
> > Sandeep Joshi <sandeepj@research.bell-labs.com> on 18/04/2001 16:21:06
> >
> > Please respond to Sandeep Joshi <sandeepj@research.bell-labs.com>
> >
> > To:   Julian Satran/Haifa/IBM@IBMIL, mbakke@cisco.com
> > cc:
> > Subject:  target discovery issue
> >
> > Julian & Mark,
> >
> > Friendly reminder... the issue mentioned below may not
> > have been resolved.
> >
> > 1) Is target discovery going to be the SCSI event or will
> >    it be wrapped up as an iSCSI event ?
> > 2) Do we have to keep a session to the canonical target
> >    always open to be able to do target discovery?
> >
> > -Sandeep
> >
> > > I am not sure.  There are some SCSI items in it too (SCSI handles now the
> > > appearance of new LUs).
> >
> > I will need a longer discussion with NDT to understand the semantics.
> >
> > Julo
> >
> > Sandeep Joshi <sandeepj@research.bell-labs.com> on 12/03/2001 22:25:27
> >
> > Julian,
> >
> > in case you skip this one..
> > your response is required on point (1) for amending iSCSI draft.
> >
> > -sandeep
> >
> > Sandeep-
> >
> > The problem you pointed out in item number 1 creates the need
> > for an additional iSCSI-level event.  Since the discovery of
> > targets happens at the iSCSI level, rather than at the SCSI
> > level, how about adding this to 2.18.1 (in iSCSI-05)?
> >
> >   4   Network entity indicates that a "target discovery" event
> >       has occurred.
> >
> > Upon receiving this message, the initiator should use SendTargets,
> > or whatever other methods of discovery it is using, to find out
> > what has changed.  Usually, this would be due to adding a new
> > target.
> >
> > We will fix items 2-4; thanks for pointing them out.
> >
> > Thanks,
> >
> > Mark
> >
> > Sandeep Joshi wrote:
> > >
> > > 1) Section 4.2 last line before Section 4.2.1
> > >     "target MUST send any iSCSI-level async on this session,
> > >      allowing the initiator to discover new targets.."
> > >
> > >    The session mentioned here is a session to the canonical target.
> > >
> > >    However, the iSCSI 05 draft does not mention any such condition
> > >    in Sec 2.18 on Async Message.   In there, a SCSI event (note: not
> > >    iSCSI) is used to notify availability of new targets.
> > >


From owner-ips@ece.cmu.edu  Wed Apr 18 16:12:33 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA17044
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 16:12:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IICwp25692
	for ips-outgoing; Wed, 18 Apr 2001 14:12:58 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IIBpr25597
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 14:11:51 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP id EF0B218FA
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 10:44:49 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA10326
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 10:44:44 -0700 (PDT)
Message-ID: <3ADDD3EE.A6A8F70A@cup.hp.com>
Date: Wed, 18 Apr 2001 10:50:38 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI : login keys & mode page settings
References: <C1256A32.004E1F02.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------C9143DB6A4DA5749134A2CE1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------C9143DB6A4DA5749134A2CE1
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

julian_satran@il.ibm.com wrote:

> +++ I don't know what version you are reading - both 5.91 and 5.92 have them.
> +++

Julian,

Thanks for the clarification. (My comments were written up prior to the
release of 5.9x, and it IS difficult to keep up with the reading, what with
daily new releases of the draft!)

Some additional discussion below.

> 
> 3) On a side note, the EnableCmdRN  & CmdRN fields should be re-named to
> EnableCRN and CRN to reflect the same semantics and context as the CRN
> defined in SAM-2 and FCP-2.
> 
> +++ what's in a name... +++

Consistency for one ! (Any strong reasons not to call this CRN, as SAM
and FCP do ?)


> 5) On a more fundamental note, should iSCSI allow for 2 levels of
> control of I-T[-L] nexus operational parameters thru both the mode
> select/sense scsi mechanisms and iSCSI login key mechanisms ?
> 
> +++ It is all about function - several people felt that the (primitive)
> negotiation element in the text commands is better than trying to set a
> parameter to an unacceptable value and finding this out through a mode
> sense
> ++++

Fundamentally, the login key mechanism allows negotiation at the scope
of a session while the mode select allows negotiation at the scope of a
LUN. For most of the parameters listed below (other than EnableCmdRN),
these are set on a per-session basis.

Based on that, it is more optimal to negotiate once using a login key
than set it on a per-LUN basis. 

However, having allowed 2 mechanisms to set these values, iSCSI MUST
then comment on the need to synchronize their settings in the 2 layers
and also comment on the need to trigger a UNIT ATTENTION when changed
through the login key mechanism.

> Ex :
> a) Change thru iSCSI login key should result in an up call to update the
> SCSI ULP.
> b) Change made thru mode select should result in a down call to update
> iSCSI LLP.
> c) Change thru iSCSI login key should result in an up call to SCSI ULP
> to cause a UNIT ATTENTION indicating "Mode Parameters Changed".
> 
> +++ I view the ULP as the source for generating the text commands through a
> new interface (give some credit to implementers). Whenever this is not the
> case the SCSI will use the old mode set commands.
> iSCSI will not set parameters on its own  +++

That would be a rather strange violation of layering. Why does the ULP
need to drive transport layer settings, and if it did need to do so, why
would it use a transport mechanism rather than a mode select (?).

Also, if you believe iSCSI will not set parameters on its own, this
should be stated explicitly in the draft. The draft is allowing for 2
mechanisms to control these settings without advocating the need to
synchronize b/n them and generate UA when changes are done thru login
keys.
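
The synchronization argued for above (items a-c in the quoted list) can be
sketched roughly as follows. This is only an illustration under stated
assumptions: the class names, the up_call/down_call methods, and the
"ExampleParam" key are hypothetical and do not come from the draft.

```python
from dataclasses import dataclass, field


@dataclass
class ScsiLayer:
    """Hypothetical stand-in for the SCSI ULP (mode page view)."""
    mode_params: dict = field(default_factory=dict)
    unit_attentions: list = field(default_factory=list)

    def up_call(self, key, value):
        # Invoked by the iSCSI layer after a login key changes a value.
        if self.mode_params.get(key) != value:
            self.mode_params[key] = value
            # Item (c): report the change through a UNIT ATTENTION.
            self.unit_attentions.append("MODE PARAMETERS CHANGED")


@dataclass
class IscsiLayer:
    """Hypothetical stand-in for the iSCSI LLP (session view)."""
    session_params: dict = field(default_factory=dict)

    def down_call(self, key, value):
        # Invoked by the SCSI layer after a MODE SELECT changes a value.
        self.session_params[key] = value


class Nexus:
    """Keeps both layers' copies of a parameter in step."""

    def __init__(self):
        self.scsi = ScsiLayer()
        self.iscsi = IscsiLayer()

    def set_via_login_key(self, key, value):
        self.iscsi.session_params[key] = value
        self.scsi.up_call(key, value)        # items (a) and (c)

    def set_via_mode_select(self, key, value):
        self.scsi.mode_params[key] = value
        self.iscsi.down_call(key, value)     # item (b)


nexus = Nexus()
nexus.set_via_login_key("ExampleParam", 8)    # hypothetical key name
nexus.set_via_mode_select("ExampleParam", 4)
print(nexus.iscsi.session_params, nexus.scsi.mode_params)
print(nexus.scsi.unit_attentions)
```

Whichever path sets the value, both layers end up holding the same setting,
and only the login-key path queues a UNIT ATTENTION in this sketch.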

> 
> 6) If such a level of dual control is provided, the iSCSI login
> keys listed above should be made LO (leading only) to allow for changes to
> operational parameters only during session login. This is to
> minimize/eliminate disruption of ongoing I/O activity that occurs due to
> the generation of a UNIT ATTENTION CHECK CONDITION when any change is
> made to the above parameters.

Are we in agreement on the above ? 


> 8) If these operational parameters are allowed to be set thru iSCSI
> login and they also impact mode page settings, iSCSI spec should
> describe the scope of the mode page setting in terms of whether this
> setting is a saved page setting or not ?
> 
> 9) Should saved page settings be allowed thru iSCSI ?

I did not see any comments on the above issues (?).
--------------C9143DB6A4DA5749134A2CE1
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------C9143DB6A4DA5749134A2CE1--



From owner-ips@ece.cmu.edu  Wed Apr 18 16:19:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA17185
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 16:19:18 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IDthL07167
	for ips-outgoing; Wed, 18 Apr 2001 09:55:43 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IDt1r07127
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 09:55:02 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id PAA08140
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 15:54:52 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id PAA72904
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 15:54:52 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A32.004C6E41 ; Wed, 18 Apr 2001 15:54:49 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A32.004C6CA4.00@d12mta02.de.ibm.com>
Date: Wed, 18 Apr 2001 15:59:52 +0200
Subject: Re: aborting an out of sequence cmdSN
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



First, the current draft does not stop you from doing what you want - it
just does not mandate it.
Second, with CmdSN you now have a better chance: before handing the command
over to SCSI task management, mark all the relevant commands in the iSCSI
queue as non-deliverable to a LUN or all LUNs, or keep a stack of barriers
for active aborts (I assume that this is a reasonably low number) to the
same effect.
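
One possible reading of the two suggestions above (marking queued commands
as non-deliverable, and keeping barriers for active aborts) is sketched
below. All names are hypothetical; the sketch uses a set rather than a
stack for the barriers, never retires a barrier, and ignores CmdSN
wraparound and gap detection.

```python
from dataclasses import dataclass


@dataclass
class QueuedCommand:
    cmd_sn: int
    task_tag: int
    lun: int
    deliverable: bool = True


class TargetCommandQueue:
    """Hypothetical iSCSI-level queue sitting in front of the SCSI layer."""

    def __init__(self):
        self.pending = []            # received, not yet handed to SCSI
        self.abort_barriers = set()  # task tags with an active abort

    def receive(self, cmd):
        # A command that arrives after its abort (for example, out of
        # sequence on another connection) is caught by the barrier.
        if cmd.task_tag in self.abort_barriers:
            cmd.deliverable = False
        self.pending.append(cmd)

    def abort_task(self, task_tag):
        # Mark already-queued commands of the task as non-deliverable,
        # then leave a barrier for any late arrivals.
        for cmd in self.pending:
            if cmd.task_tag == task_tag:
                cmd.deliverable = False
        self.abort_barriers.add(task_tag)

    def deliverable_in_order(self):
        ordered = sorted(self.pending, key=lambda c: c.cmd_sn)
        return [c.cmd_sn for c in ordered if c.deliverable]


queue = TargetCommandQueue()
queue.receive(QueuedCommand(cmd_sn=5, task_tag=0x51, lun=0))
queue.abort_task(0x50)                                         # abort arrives first
queue.receive(QueuedCommand(cmd_sn=4, task_tag=0x50, lun=0))   # late command
print(queue.deliverable_in_order())
```

Here the aborted command with CmdSN 4 shows up after its abort and is held
back, while the unrelated command with CmdSN 5 stays deliverable.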

Julo


Santosh Rao <santoshr@cup.hp.com> on 18/04/2001 01:10:17

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Sandeep Joshi <sandeepj@research.bell-labs.com>
cc:   ips@ece.cmu.edu
Subject:  Re: aborting an out of sequence cmdSN




I'd think option 1 is the simplest (with the caveat that the task mgmt
PDU referred to is the Abort Task.) and only impacts the affected
command/task.

Pierre Labat and I asked for this 4 months ago (see
http://ips.pdl.cs.cmu.edu/mail/msg02958.html). The concept of connection
allegiance should be extended to include the Abort Task. Also,
connection allegiance should apply to the task (which spans multiple
commands in the case of linked commands.), allowing for a deterministic
clean up of stale PDUs of the task through the use of Abort Task.

- Santosh

Sandeep Joshi wrote:

>
> So our options for abort_task boil down to..
> (1) use connection allegiance for TASK MGMT PDU.
> (2) reject all commands prior to cmdSN of TASK MGMT PDU.
> (3) cmdSN of original task is sent with TASK MGMT PDU and
>     target at the iSCSI layer keeps state.
> (4) iSCSI initiator retains state for deleted tasks to ensure
>     that R2T/Scsi Responses are appropriately handled.





From owner-ips@ece.cmu.edu  Wed Apr 18 17:15:08 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA18031
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 17:15:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IFwo415950
	for ips-outgoing; Wed, 18 Apr 2001 11:58:50 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from megapathdsl.net (snowbird.megapath.net [216.200.176.7])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IFwBr15915
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 11:58:12 -0400 (EDT)
Received: from [64.7.4.187] (HELO ntwlassie)
  by megapathdsl.net (CommuniGate Pro SMTP 3.4.3)
  with SMTP id 19794285 for ips@ece.cmu.edu; Wed, 18 Apr 2001 08:57:36 -0700
Message-ID: <003701c0c823$6d777610$320114ac@ntwlassie>
Reply-To: "Justin R. Bendich" <bendich@TrelliSoft.com>
From: "Justin R. Bendich" <bendich@TrelliSoft.com>
To: <ips@ece.cmu.edu>
Subject: Re: Some questions about naming (newbie)
Date: Wed, 18 Apr 2001 11:20:04 -0500
Organization: TrelliSoft, inc.
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----=_NextPart_000_0034_01C0C7F9.8451EE80"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4029.2901
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4029.2901
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.

------=_NextPart_000_0034_01C0C7F9.8451EE80
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

> 3. A LUN WWN is at the SCSI LUN level, and can be used for identifying
>    LUNs regardless of whether the underlying transport is FC, parallel
>    SCSI, or iSCSI.  However, most FC devices do not support the LUN
>    WWN, and use serial numbers, device IDs, and other mechanisms to
>    correlate them.  To do this, one must implement device-type-specific
>    code.

> Anyway, those were just a few observations that might help.
Thank you -- they do.

> On question 3, keep in mind that an INQUIRY is done at the SCSI
> level, and is independent of the SCSI transport mechanism used.
> Therefore, the VPD information would not return anything that would
> reflect an iSCSI identifier.  The ability to return a LUN WWN (from
> (3) above) on this page really is a SCSI-level thing, and has nothing
> to do with FC other than that it shares the same address format.
I think I understand your point:
A device might not be iSCSI only -- it might also have FC or plain SCSI
(what's the official name for that?) interfaces, so the device doesn't
necessarily know its node-name; that's the interface's responsibility?
I don't actually need a node-name -- any identifying string will do.
But the node-name is nice because the iSCSI naming spec requires it
(along with the actual LUN number) to uniquely identify the LUN.
That's something I don't know to be available through any other
specification. Am I right there?

> At any rate, in order to connect to a SCSI device via iSCSI to do
> the INQUIRY, you must first log in to the target at the iSCSI level.
> To do this, you would already have the iSCSI name (formerly WWUI),
> so you should already have the information you need without the
> inquiry data.
Yes, but in order for my (application-level) code to do that, it has to be
aware that the transport is iSCSI, and many OSs take great pains to hide
that information from application-level code. MS even tries to make ATA
hard disks look like SCSI (gag).
So it would be nice if:

1. Table 111 of the SCSI Primary Command set (T10 995d Rev. 11a)
   included an additional identifier type "iSCSI node-name"
2. Devices supporting iSCSI were strongly recommended (even though
   this can't be required for the reasons discussed above) to report this
   identifier on VPD page 0x83.

> Mark A. Bakke
> Cisco Systems
> mbakke@cisco.com
> 763.398.1054

Justin

------=_NextPart_000_0034_01C0C7F9.8451EE80--



From owner-ips@ece.cmu.edu  Wed Apr 18 17:15:10 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA18041
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 17:15:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IBwdt29350
	for ips-outgoing; Wed, 18 Apr 2001 07:58:39 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IBwAr29336
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 07:58:11 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id NAA317818
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 13:58:02 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id NAA219782
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 13:58:03 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A32.0041BB4B ; Wed, 18 Apr 2001 13:57:57 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A32.0041B837.00@d12mta02.de.ibm.com>
Date: Wed, 18 Apr 2001 13:22:03 +0200
Subject: RE: iSCSI linked commands
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk





Doug,

I did check the text on page 16. It says:

   However, consecutive commands that are part of a SCSI linked
   command-chain task MAY use different connections. Connection allegiance
   is strictly per-command and not per-task. During the iSCSI Full Feature
   Phase, the initiator and target MAY interleave unrelated SCSI commands,
   their SCSI Data and responses, over the session.


   What is the issue?

   Julo
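
The draft text quoted above (allegiance per command, not per task) can be
illustrated with a small sketch. It assumes, as the thread does, that a
linked task has at most one command outstanding at a time; the class and
method names are hypothetical.

```python
class Session:
    """Hypothetical initiator-side bookkeeping for connection allegiance."""

    def __init__(self, connection_ids):
        self.connection_ids = set(connection_ids)
        # task tag -> connection carrying that task's outstanding command
        self.outstanding = {}

    def send_command(self, task_tag, cid):
        if cid not in self.connection_ids:
            raise ValueError("unknown connection")
        if task_tag in self.outstanding:
            raise RuntimeError("previous linked command still outstanding")
        # Allegiance is recorded for this command only.
        self.outstanding[task_tag] = cid

    def complete_command(self, task_tag, cid):
        # Data and status for a command must use the command's connection.
        if self.outstanding.get(task_tag) != cid:
            raise RuntimeError("response arrived on the wrong connection")
        # Completion releases the allegiance; the task itself holds none.
        del self.outstanding[task_tag]


session = Session(connection_ids=[1, 2])
session.send_command(task_tag=0x10, cid=1)      # first linked command
session.complete_command(task_tag=0x10, cid=1)
session.send_command(task_tag=0x10, cid=2)      # next one, other connection
print(session.outstanding)
```

The same task tag moves from connection 1 to connection 2 between linked
commands, which is what "strictly per-command" permits.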

"Douglas Otis" <dotis@sanlight.net> on 17/04/2001 17:40:10

Please respond to "Douglas Otis" <dotis@sanlight.net>

To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:   ralphoweber@compuserve.com
Subject:  RE: iSCSI linked commands




Julian,

I am aware that iSCSI can be used for parallel SCSI configurations as well
as Fibre Channel, which has a level of accommodation for tape linked
commands, but I am unaware of such implementations.  Without getting
involved with this FC discussion, it is clear that parallel SCSI is still a
valid means of implementing linking.  The point that I was making as it
relates to the proposal was that on page 16 you indicate that the task
carries the allegiance.  It is the command, as you clearly indicate on the
prior page.  As there can only be one command in play at any point in time,
this task can become spread over any number of connections, as you state.
As such, rather than indicating an allegiance associated with the task, you
may wish to say with outstanding commands.  This should be taken only as an
editorial concern.  I was trying to make the point that connection
allegiance is not constant with respect to the task.  Santosh wishes to
exclude the sequential model from his thinking.  It was the examination of
the command and the overhead of tracking these allegiances I saw as
undesirable.  The change to implementing serialization prevents the problem
if you introduce allegiance for commands carrying the same serialization, so
this thread should become stale as to the original concern.  The reason to
introduce allegiance for these commands carrying the same serialization is
to prevent the command window being closed.

With your present scheme, "immediate" commands should not advance the
window and should be placed in a prior position on the same connection.
If I have missed this requirement, ignore this concern.

Doug

> Doug,
>
> I think you would want to go back to SAM.  Linked commands are broken
> by any "irregularity" in execution.
> The basic assumption is that the initiator is in charge of shipping
> linked commands - one-by-one.
> I assume that for high latency links they won't be very popular.
>
> At a very early stage (about 2 years ago) we contemplated the idea of
> "prefetching" linked commands and having the target
> effect the serialization. We would have had to come up with a way of
> conveying to the initiator which command broke the chain (if it broke) or
> caused a unit attention (if it caused one), and it was not at all clear
> that this was "in the spirit of SAM".
> There were also more esoteric issues with later commands getting modified
> by execution of prior commands etc. -:).
>
> The 360 channels had the same class of issues.
>
> I assume that T10 folks went over these issues many times.
>
> Julo
>
> "Douglas Otis" <dotis@sanlight.net> on 16/04/2001 23:41:49
>
> Please respond to "Douglas Otis" <dotis@sanlight.net>
>
> To:   "Santosh Rao" <santoshr@cup.hp.com>
> cc:   "Ips" <ips@ece.cmu.edu>
> Subject:  RE: iSCSI:flow control, acknowledgement, and a deterministic
>       recovery
>
>
>
>
> Santosh,
>
> The iSCSI proposal ver 5-91 explicitly defines tasks and also includes the
> option to allow linked commands to be sent across different connections.
> Obviously, for sequential devices, particular attention must be paid to
> command serialization as these commands tend to use relative addressing or
> are dependent upon the successful completion of prior commands.  This
> requirement is not helped with Auto-Sense and impels the need for a target
> model change in SCSI.  An error injected into the SCSI layer as a result of
> a network communication error will significantly reduce the utility of most
> backup applications.  Such reliance on the SCSI layer to recover from such
> uncertainty imposed as a result of the inability of the network transport
> to do minimal handshakes and retries is the wrong approach.  Regardless, you
> are burdening the driver with the duty of tracking a transient connection
> allegiance status.  The latest version has improved language with the
> exception of Pg. 16 in two places.
>
> Ver 5-91
> Pg. 15
> "Connection allegiance is strictly per-command and not per-task."
>
> Pg. 16
> "tasks that have allegiance to the connection"
> "all outstanding tasks that have allegiance to the connection to conclude
> and send their status."
>
> Doug
>
> > Doug,
> >
> > You seem to be referring to linked commands as a case wherein the
> > approach of Abort Task will not flush stale PDUs.
> >
> > Linked Commands cannot work the way SCSI implementations are defined
> > today, since linked commands require the initiator task tag (I_T_L_x
> > nexus identifier in SAM-2 Execute Command terminology) to be generated
> > by the SCSI ULP. However, in practice, the Initiator Task Tag (or the
> > FC OX_ID) is typically generated in the SCSI LLP (or in some cases in
> > the adapter firmware). IOW, there is no common reference handle like the
> > adapter firmware). IOW, there is no common reference handle like the
> > task tag sent down from the ULP that allows for association of multiple
> > commands to a task in several/most implementations today.
> >
> > When this is fixed up to get linked commands to work [& there exist
> > examples of its usage], there is no reason connection allegiance could
> > not be applied to all the commands within the task.
> >
> > I fail to see why you think Abort Task will not work with sequential
> > devices (?).
> >
> > - Santosh
> >
> > Douglas Otis wrote:
> > >
> > > Santosh,
> > >
> > > I see a few problems with this approach.  Tasks as defined in
> > > iSCSI do not maintain connection allegiance.  The driver binds all
> > > SCSI commands to their connection for the most recent association.
> > > Although there are several places within the iSCSI proposal that
> > > make reference to a task having a connection allegiance, this is
> > > in error.  Commands and not tasks carry such allegiance.  Your
> > > recovery scheme will not allow a satisfactory recovery with a
> > > sequential device.  In this case, repeating the command is not a
> > > solution.  As a result, if one connection falters, it will become a
> > > difficult situation.  In addition, you have no clue from iSCSI as to
> > > your delivery status.  You do not know if you are waiting for the target
> > > or if you are waiting for the connection.  Some sequential devices
> > > have rather long time-outs with these complications of deducing
> > > status created by the multiple connections.
> > >
> > > The application will not know about these connection allegiance
> > > problems. The iSCSI layer does not define interaction to provide
> > > additional application status to allow these applications to respond
> > > in a manner that may aid this situation nor should such additional
> > > information be required.  With your scheme the SCSI driver must
> > > examine the content of these commands to make a guess as to the
> > > connection allegiance assignments.  Now the driver is expected to
> > > understand what the intended action is of this SCSI management
> > > command.  What signal is used to indicate a need for the iSCSI
> > > immediate treatment?  The only obvious seems to be the task attribute
> > > argument.  With the way iSCSI has defined iSCSI immediate, I
> > > would expect those commands to be treated in a LIFO rather than the
> > > normal FIFO fashion.
> > >
> > > Doug








From owner-ips@ece.cmu.edu  Wed Apr 18 17:59:57 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA18676
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 17:59:56 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IH9r721104
	for ips-outgoing; Wed, 18 Apr 2001 13:09:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from msgbas1t.cos.agilent.com (msgbas1tx.cos.agilent.com [192.6.9.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IH9Nr21049
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 13:09:23 -0400 (EDT)
Received: from msgrel1.and.agilent.com (msgrel1.and.agilent.com [130.30.33.104])
	by msgbas1t.cos.agilent.com (Postfix) with ESMTP
	id 3EBFD1B4; Wed, 18 Apr 2001 11:09:22 -0600 (MDT)
Received: from axandbh3.and.agilent.com (axandbh3.and.agilent.com [130.30.32.200])
	by msgrel1.and.agilent.com (Postfix) with SMTP
	id 7E89C244; Wed, 18 Apr 2001 13:09:12 -0400 (EDT)
Received: from 130.30.32.200 by axandbh3.and.agilent.com (InterScan E-Mail VirusWall NT); Wed, 18 Apr 2001 13:09:12 -0400 (Eastern Daylight Time)
Received: by axandbh3.and.agilent.com with Internet Mail Service (5.5.2653.19)
	id <24YVHVYW>; Wed, 18 Apr 2001 13:09:12 -0400
Message-ID: <FEEBE78C8360D411ACFD00D0B74779719A8855@xsj02.sjs.agilent.com>
From: "CAVANNA,VICENTE V (A-Roseville,ex1)" <vince_cavanna@agilent.com>
To: "'WENDT,JIM (HP-Roseville,ex1)'" <jim_wendt@hp.com>,
        "'julian_satran@il.ibm.com'" <julian_satran@il.ibm.com>
Cc: ips@ece.cmu.edu, tsvwg@ietf.org, "'Craig Partridge'" <craig@aland.bbn.com>,
        Jonathan Wood <Jonathan.Wood@sun.com>, xieqb@cig.mot.com,
        Jonathan Stone <jonathan@dsg.stanford.edu>,
        Randall Stewart <rrs@cisco.com>,
        "CAVANNA,VICENTE V (A-Roseville,ex1)" <vince_cavanna@agilent.com>
Subject: RE: [Tsvwg] [SCTP checksum problems]
Date: Wed, 18 Apr 2001 13:09:08 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi Jim,
I don't think iSCSI can be completely relieved of performing some data
integrity checking as long as there exists the possibility of "middle boxes"
opening up the transport protocol's packet and thus potentially invalidating
any reliability guarantees the transport protocol makes.
Vince
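
For readers following the checksum candidates named in the quoted thread
below (CRC-32, Adler-32, Fletcher-32), here is a minimal sketch that
computes all three over the same bytes. Assumptions: zlib.crc32 uses the
IEEE 802.3 polynomial, which is not necessarily the CRC the iSCSI group
settled on (the thread does not say which); fletcher32 below is the common
16-bit-word variant with little-endian words; the "modified Adler-32" from
the thread is not implemented because its exact definition is not given.

```python
import zlib


def fletcher32(data: bytes) -> int:
    """Fletcher-32 over little-endian 16-bit words, zero-padding odd input."""
    if len(data) % 2:
        data += b"\x00"
    sum1 = sum2 = 0
    for i in range(0, len(data), 2):
        word = data[i] | (data[i + 1] << 8)
        sum1 = (sum1 + word) % 65535
        sum2 = (sum2 + sum1) % 65535
    return (sum2 << 16) | sum1


def checksums(data: bytes) -> dict:
    """Return the three candidate check values for one payload."""
    return {
        "crc32": zlib.crc32(data),      # IEEE 802.3 polynomial (assumption)
        "adler32": zlib.adler32(data),
        "fletcher32": fletcher32(data),
    }


payload = b"example iSCSI payload"      # arbitrary illustrative bytes
for name, value in checksums(payload).items():
    print(f"{name:10s} 0x{value:08X}")
```

A harness like this only shows how the values are produced; it says nothing
about relative error-detection strength or speed, which is what the thread
asks the checksum experts to weigh in on.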

|-----Original Message-----
|From: WENDT,JIM (HP-Roseville,ex1) [mailto:jim_wendt@hp.com]
|Sent: Tuesday, April 17, 2001 11:44 AM
|To: 'julian_satran@il.ibm.com'
|Cc: ips@ece.cmu.edu; tsvwg@ietf.org; 'Craig Partridge'; Jonathan Wood;
|xieqb@cig.mot.com; Jonathan Stone; Randall Stewart; WENDT,JIM
|(HP-Roseville,ex1)
|Subject: Re: [Tsvwg] [SCTP checksum problems]
|
|
|Julian,
|The SCTP folks are right now discussing changing the SCTP checksum to be a
|CRC-32 (or other). This is a very good thing and really what needs to happen
|with SCTP for it to support iSCSI and other data-critical applications
|effectively (and also relieve iSCSI from having to implement data integrity
|checking and transport-like functionality over SCTP).
|
|They are looking for inputs as to which CRC-32 or checksum to use. The iSCSI
|WG's CRC investigation work and conclusion would be a valuable input into
|their decision. The sooner that you can provide the iSCSI recommended CRC
|and reasoning behind it to them, the better, even before the forthcoming I-D
|is distributed.
|
|Jim Wendt
|Networked Storage Architecture
|Hewlett-Packard Company
|jim_wendt@hp.com 916-785-5198
|
|-----------------------------------------------------------------------------
|
|> -----Original Message-----
|> From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
|> Sent: Sunday, April 15, 2001 7:58 AM
|> To: ips@ece.cmu.edu
|> Subject: CRCs
|> 
|> 
|> 
|> 
|> Dear colleagues,
|> 
|> We will probably not be able to finish the CRC/checksum document in time
|> for Nashua but we hope it will be out very soon after that.  However, I
|> would like to inform you that while in Orlando and Minneapolis we were
|> still talking about different CRCs, we (Dafna Sheinwald, Pat Thaler, Matt
|> Wakeley, Vince Cavanna and myself) have agreed on a CRC, and the
|> forthcoming I-D will give all the reasons why we recommend it.
|> 
|> Regards,
|> Julo
|> 
|
|-----------------------------------------------------------------------------
|
|-----Original Message-----
|From: Randall Stewart [mailto:rrs@cisco.com]
|Sent: Tuesday, April 17, 2001 4:31 AM
|To: Jonathan Wood
|Cc: xieqb@cig.mot.com; tsvwg@ietf.org; Jim Wendt; Jonathan Stone; Craig
|Partridge
|Subject: Re: [Tsvwg] [SCTP checksum problems]
|
|
|Jonathan:
|
|I will make sure everyone at the bakeoff is aware of the upcoming
|"checksum" change... Now one of the big questions yet is
|what checksum should we use?
|
|I kinda lean towards crc-32 myself (but of course I have no technical
|basis for this and need to keep silent on which one to use anyway :->),
|but do we have other candidates besides fletcher-32 and possibly
|modified Adler-32 (i.e. 16 bit adds instead of 8)??
|
|I will take the above 3 and do a bit of performance work this
|week and post some numbers... that's about all I can do, i.e. tell
|how much time the options I know of take...
|
|If you have some other candidates let me know and I can possibly get
|some performance numbers on these as well...
|
|As far as which is the best... I encourage all of you check-sum
|experts out there to please join the thread :)
|
|Oh, I know Jonathan Stone's paper will NOT be ready until sometime
|in May.. so we may want to proceed slowly so that Craig Partridge and
|he can have some cycles to add to this discussion :)
|
|R
|
|Jonathan Wood wrote:
|> 
|> As an SCTP implementor and someone who will want to get the hardware folks
|> to help with checksumming, I wholeheartedly agree with Randy. Remember that
|> SCTP is just a proposed standard, and is as such not all that far along the
|> standardization process. We should still be able to make changes like this
|> if necessary.
|> 
|> Jon
|> 
|> >
|> >Q:
|> >
|> >The only problem with an additional "CRC chunk" is that
|> >it makes hardware assistance to error correction much
|> >more difficult. It is better (I think) to just realize
|> >we made a mistake. Get the opinions of the experts as to
|> >what checksum to use... i.e.:
|> >
|> >- CRC-32
|> >- Modified Adler-32 (16 bit word sums)
|> >- Fletcher-32
|> >- ???
|> >
|> >And then go with this as a replacement... Admit we were wrong
|> >and fix the problem..
|> >
|> >This way you have ONE and only ONE checksum algorithm making
|> >hardware designers life much easier...
|> >
|> >R
|> >
|> >Qiaobing Xie wrote:
|> >>
|> >> Another solution could be (I think I mentioned this to Randy and a few
|> >> others at last IETF):
|> >>
|> >> - Define a CRC-32 (or other strong checksum) control chunk and when the
|> >> sender wishes to use a stronger checksum protection, in addition to the
|> >> Adler-32 in the common SCTP header it includes this CRC-32 chunk in the
|> >> outbound packet. When the packet arrives, the receiver will do the
|> >> Adler-32 first, and then if the receiver supports the CRC-32 and sees
|> >> the presence of the CRC-32 chunk in the packet it will further verify
|> >> the CRC-32.
|> >>
|> >> We could also use a bit pattern in the chunk type of the CRC-32 chunk so
|> >> that if the receiver doesn't understand the CRC-32 chunk it would drop
|> >> it with a report back to the sender.
|> >>
|> >> -Qiaobing
|> >>
|> >> _______________________________________________
|> >> tsvwg mailing list
|> >> tsvwg@ietf.org
|> >> http://www1.ietf.org/mailman/listinfo/tsvwg
|> >
|> >--
|> >Randall R. Stewart
|> >Systems & Solutions Engineering
|> >Cisco Systems Inc.
|> >rrs@cisco.com 815-342-5222 or 815-477-2127
|> >
|
|-- 
|Randall R. Stewart
|Systems & Solutions Engineering
|Cisco Systems Inc.
|rrs@cisco.com 815-342-5222 or 815-477-2127
|
|> 
|
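
The three candidates named in the forwarded thread above (CRC-32,
Adler-32, Fletcher-32) can be compared side by side with a short sketch.
This is only an illustration: it uses the CRC-32 and Adler-32 that ship
in Python's zlib module (the thread does not say which CRC polynomial is
meant), and fletcher32() is a hand-written helper that assumes 16-bit
little-endian words with sums taken modulo 65535. The "modified
Adler-32" with 16-bit adds is not implemented here.

```python
import zlib


def fletcher32(data: bytes) -> int:
    """Fletcher-32 over 16-bit little-endian words, sums taken mod 65535.

    Assumption for this sketch: odd-length input is padded with one zero byte.
    """
    if len(data) % 2:
        data += b"\x00"
    sum1 = sum2 = 0
    for i in range(0, len(data), 2):
        word = data[i] | (data[i + 1] << 8)
        sum1 = (sum1 + word) % 65535
        sum2 = (sum2 + sum1) % 65535
    return (sum2 << 16) | sum1


def checksums(data: bytes) -> dict:
    """Return the three 32-bit check values for the same input."""
    return {
        "crc32": zlib.crc32(data) & 0xFFFFFFFF,      # zlib's CRC-32
        "adler32": zlib.adler32(data) & 0xFFFFFFFF,  # standard Adler-32
        "fletcher32": fletcher32(data),
    }


if __name__ == "__main__":
    for name, value in checksums(b"abcde").items():
        print(f"{name:10s} 0x{value:08X}")
```

One property worth noticing when running this on short inputs: the low
half of Adler-32 is 1 plus the byte sum, so for a 16-byte message it can
never exceed 255*16+1 and much of the 16-bit range goes unused.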


From owner-ips@ece.cmu.edu  Wed Apr 18 18:02:00 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA18740
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 18:01:54 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IHFu721639
	for ips-outgoing; Wed, 18 Apr 2001 13:15:56 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IHF7r21538
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 13:15:07 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP id 54024B36
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 10:15:05 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA07629
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 10:15:00 -0700 (PDT)
Message-ID: <3ADDCCF5.903D33E3@cup.hp.com>
Date: Wed, 18 Apr 2001 10:20:53 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI : digest error handling violates EMDP/InDataOrder
References: <C1256A32.00455901.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------8E194DDFD692F81CC108D8DF"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------8E194DDFD692F81CC108D8DF
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

julian_satran@il.ibm.com wrote:
> 
> OK - I misread it. In any case we are not FCP and we don't violate iSCSI
> rules.

Julian,

What reasons exist for EMDP behaviour to differ between iSCSI and FCP?

Also, a fundamental question is that the description only speaks about
incoming data PDUs. Are you implying that InDataOrder only controls the
ordering for READ data PDUs? If so, what is the mechanism to control
ordering for write data PDUs?

It is a useful control option for initiators to negotiate that R2T
requests be made in increasing continuous buffer offset order and R2T
requests not be sent out of order. Does iSCSI allow this ?

- Santosh

"31 InDataOrder 
    
   InDataOrder=<yes|no> 
    
   No is used by iSCSI to indicate that the incoming data PDUs can be in 
   any order (EMDP = 1). Yes is used to indicate that incoming data PDUs 
   have to be at continuously increasing addresses (EMDP = 0). 
    
   This also sets the Connect-Disconnect mode page EMDP bit. 
    
   The default is yes but targets MAY support no. "
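
The InDataOrder=yes rule quoted above can be sketched as a simple check
over a sequence of data PDUs. This is an illustration under stated
assumptions, not text from the draft: each PDU is modelled as a
hypothetical (buffer_offset, data_length) pair, and "continuously
increasing" is read as "each PDU starts exactly where the previous one
ended".

```python
from typing import Iterable, Tuple

# Hypothetical model of a data PDU for this sketch: (buffer_offset, data_length).
DataPdu = Tuple[int, int]


def in_data_order(pdus: Iterable[DataPdu]) -> bool:
    """Return True if every PDU begins where the previous one ended."""
    expected = None
    for offset, length in pdus:
        if expected is not None and offset != expected:
            return False  # gap, overlap, or out-of-order offset
        expected = offset + length
    return True


if __name__ == "__main__":
    print(in_data_order([(0, 512), (512, 512), (1024, 256)]))
    print(in_data_order([(0, 512), (1024, 512), (512, 512)]))
```

A receiver that negotiated InDataOrder=no (EMDP = 1) would simply skip
this check.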


------------------------------------------------------------------------------------

> > FCP uses it like iSCSI - i.e. the order has to be maintained within a
> > sequence
> 
> Not true. If you take a look at FCP-2 rev 04 Section 10.1.1.7
> description on EMDP, it explicitly states :
> "The EMDP bit does not affect the order of frames within a sequence".
> 
> For a WRITE command, an EMDP setting of 0 implies that the buffer offset
> in R2T requests must be in continuous and increasing order whereas an
> EMDP setting of 1 implies the buffer offset in R2T can be out of order.
> 
> For a READ command, an EMDP setting of 0 implies the buffer offset in
> READ data PDUs is in continuous and increasing order, whereas, an EMDP
> setting of 1 implies buffer offset in READ Data PDUs can be out of
> order.
> 
> Based on the above rules, iSCSI is violating EMDP setting by its error
> recovery for data digest errors detected by targets on Data PDUs.
> 
> - Santosh
--------------8E194DDFD692F81CC108D8DF
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------8E194DDFD692F81CC108D8DF--



From owner-ips@ece.cmu.edu  Wed Apr 18 18:02:58 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA18758
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 18:02:57 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IH6mo20873
	for ips-outgoing; Wed, 18 Apr 2001 13:06:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IH5tr20806
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 13:05:55 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP id 3B7A81680
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 10:04:38 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA06938
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 10:04:18 -0700 (PDT)
Message-ID: <3ADDCA74.400FE7E1@cup.hp.com>
Date: Wed, 18 Apr 2001 10:10:12 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI : More problems with Status SNACK !
References: <C1256A32.004B733A.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------799BD9567587F5B9E07867B6"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------799BD9567587F5B9E07867B6
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

julian_satran@il.ibm.com wrote:

> 
> If StatSN is going to become optional, a negative response to SNACK
> while keeping up counters for compliance could be a good enough
> solution for a good enough target.
> 
> But I am still not sure that this is the way to go.

Do we agree that Data & Status SNACK capability must be optional ?

If so, these capabilities must be negotiated at login time to optimize
against initiators attempting SNACK and getting rejects on each attempt. 

When both initiator and target know that SNACK is unsupported, what is
the value added by continuing to generate StatSN and ExpStatSN for such
an I-T nexus? It only adds extra overhead.

- Santosh
--------------799BD9567587F5B9E07867B6--



From owner-ips@ece.cmu.edu  Wed Apr 18 19:27:26 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA19605
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 19:27:25 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ILT5e10296
	for ips-outgoing; Wed, 18 Apr 2001 17:29:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ILSLr10256
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 17:28:21 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3IMUc119945;
	Wed, 18 Apr 2001 15:30:39 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Dave Peterson" <dap@cisco.com>, "Santosh Rao" <santoshr@cup.hp.com>,
        "Charles Monia" <cmonia@NishanSystems.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: iSCSI & Linked Commands
Date: Wed, 18 Apr 2001 14:20:49 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJKEHOCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <EDEKKDKNBFCABNBAAOBBEEAHCEAA.dap@cisco.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Dave,

You could be correct.  I have little knowledge about FC-TAPE.  At one time
there was a suggestion that intermediate status in the FCP response and link
bits be used to make this accommodation.  I am not aware whether this was ever
implemented.  Are you suggesting that linked commands be prohibited within
iSCSI as a result of it not being available within Fibre-Channel?  A locate
followed by a series of reads assumes a next block on a sequential device.
I see that as relative addressing.

Doug

> > Santosh,
> >
> > Linked commands typically employ a feature that allows relative
> > addressing for sequential devices.  Link bits are still present within
> > the CDB even for fibre channel.
>
> Humm, I'm not aware of this feature for sequential devices.
> Also, for FC-TAPE compliant devices command linking is prohibited.
> Dave
>
>



From owner-ips@ece.cmu.edu  Wed Apr 18 19:29:09 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA19627
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 19:29:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IMB1g13051
	for ips-outgoing; Wed, 18 Apr 2001 18:11:01 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IMAZr12973
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 18:10:35 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP
	id DA73F94006; Wed, 18 Apr 2001 18:10:34 -0400 (EDT)
To: "CAVANNA,VICENTE V (A-Roseville,ex1)" <vince_cavanna@agilent.com>
Cc: "'WENDT,JIM (HP-Roseville,ex1)'" <jim_wendt@hp.com>,
        "'julian_satran@il.ibm.com'" <julian_satran@il.ibm.com>,
        ips@ece.cmu.edu, tsvwg@ietf.org,
        "'Craig Partridge'" <craig@aland.bbn.com>,
        Jonathan Wood <Jonathan.Wood@sun.com>, xieqb@cig.mot.com,
        Jonathan Stone <jonathan@dsg.stanford.edu>,
        Randall Stewart <rrs@cisco.com>
Subject: Re: [Tsvwg] [SCTP checksum problems] 
In-Reply-To: Message from "CAVANNA,VICENTE V (A-Roseville,ex1)" <vince_cavanna@agilent.com> 
   of "Wed, 18 Apr 2001 13:09:08 EDT." <FEEBE78C8360D411ACFD00D0B74779719A8855@xsj02.sjs.agilent.com> 
References: <FEEBE78C8360D411ACFD00D0B74779719A8855@xsj02.sjs.agilent.com> 
Date: Wed, 18 Apr 2001 18:09:02 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010418221034.DA73F94006@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Vince,

> I don't think iSCSI can be completely relieved of performing some data
> integrity checking as long as there exists the possibility of "middle boxes"
> opening up the transport protocol's packet and thus potentially invalidating
> any reliability guarantees the transport protocol makes.

Any protection provided against this failure mode will only be
transient, so we must temper the desire to introduce such a
requirement with reality.

Middleboxes can just as easily open up to the iSCSI layer and tinker
with the payload, as they do with other ULPs running on TCP (e.g. HTTP)
today.  Short of securing the connection, there is ALWAYS a
possibility of a middlebox terminating and reoriginating an integrity
check.  In case you think this is a farfetched scenario, I do get the
impression that there is a high level of interest in `actively
middling' iSCSI once the specs crystallize.  Who shaves the barber?

An integrity check is not necessary as long as some lower layer
provides adequate integrity guarantees.

Adding an integrity check above the transport layer is based upon
documentation of the presence of a lot of crappy network hardware and
software and analyses of the transport integrity check (TCP checksum)
which suggests it might not be adequately strong against some such
observed errors.

I claim that the high incidence of `broken' (corruption introducing)
components is a result of a variety of factors which have shaped the
development of network components thus far.  The fact that integrity
checks are assumed to be performed in a network context substantially
lowers the bar for implementation correctness.

In a storage (or CPU) context, these types of implementation errors
are a) more easily detectable (more fatal) b) more carefully avoided
during implementation (because of the cost of a potential fatal
error).  If network components magically reached the same `quality
level' as storage and CPU components, there might be no justification
for additional integrity checks above the transport.  Similarly if the
transport (or whatever lower layer) integrity checks are very strong
(e.g. IPSec), there is, again, no need for a higher level integrity
check.

I am not disagreeing that we need an additional integrity check over
TCP in the present target environment, but I do disagree that iSCSI
will always need such a check, independently of what is running
beneath it.

Steph


From owner-ips@ece.cmu.edu  Wed Apr 18 23:57:30 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA23423
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 23:57:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IGtqd20107
	for ips-outgoing; Wed, 18 Apr 2001 12:55:52 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IGtgr20056
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 12:55:42 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 388BB21C; Wed, 18 Apr 2001 09:54:54 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id JAA05997;
	Wed, 18 Apr 2001 09:54:49 -0700 (PDT)
Message-ID: <3ADDC83B.336B1A94@cup.hp.com>
Date: Wed, 18 Apr 2001 10:00:43 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
Cc: ips@ece.cmu.edu
Subject: Re: aborting an out of sequence cmdSN
References: <C1256A32.004C6CA4.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------68BB3B5DB2F37016AD65253E"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------68BB3B5DB2F37016AD65253E
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Julian,

I understand that nothing in the current draft prevents usage of
connection allegiance for an abort task to flush the stale PDUs of a
command. 

However, as a target implementor, if such a guaranteed mechanism is not
mandated to flush stale PDUs of a command, targets will need to build
safeguards against possible stale PDUs from a command. Any such scheme
would cause targets to retire resources until a safe period (based on a
best guess !) has elapsed. The targets CANNOT depend on initiators
deploying this scheme in the absence of the spec mandating this and do
not have a guarantee that stale PDUs will not arrive after the Abort
Task has cleaned up the command at the target end.

Keeping in mind the above, I am asking that iSCSI mandate that the Abort
Task be sent on the same connection as the SCSI command being aborted.
In addition, [and this has been asked earlier], the iSCSI draft lacks a
section on I/O timeout handling under the error recovery section. Some
description must be provided on the actions an initiator should take on
an I/O timeout. (or timeout of other iSCSI PDUs).
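
To make the request above concrete, here is a minimal sketch of what a
target could do if the Abort Task were required to arrive on the same
connection as the command. Because a TCP connection delivers in order,
anything the initiator sent for that task before the abort has already
arrived by the time the abort is processed, so the task tag can be
retired at once. ToyTarget, its method names and the returned strings
are all hypothetical; they are not iSCSI response codes.

```python
from typing import Dict


class ToyTarget:
    """Tracks which connection each outstanding task arrived on."""

    def __init__(self) -> None:
        self.task_conn: Dict[int, int] = {}  # task tag -> connection id

    def on_command(self, conn_id: int, task_tag: int) -> None:
        self.task_conn[task_tag] = conn_id

    def on_abort_task(self, conn_id: int, task_tag: int) -> str:
        owner = self.task_conn.get(task_tag)
        if owner is None:
            return "unknown-task"
        if owner != conn_id:
            # Abort arrived on another connection: earlier PDUs for the
            # task may still be in flight, so the tag cannot be retired.
            return "rejected-wrong-connection"
        del self.task_conn[task_tag]  # safe to free resources immediately
        return "aborted"


if __name__ == "__main__":
    target = ToyTarget()
    target.on_command(conn_id=1, task_tag=0x10)
    print(target.on_abort_task(conn_id=2, task_tag=0x10))
    print(target.on_abort_task(conn_id=1, task_tag=0x10))
```

Without such a rule the target would instead have to hold the tag for
some guessed "safe period", which is the cost described above.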

For instance, should non-SCSI PDUs be timed (login, text, etc.)? If so,
what are their timeout values? What actions should an initiator take if
they time out? How is the task tag for that non-SCSI PDU cleaned up,
ensuring no stale PDUs are going to arrive for that task tag?

iSCSI Error Recovery descriptions MUST address task timeout handling. 

- Santosh


julian_satran@il.ibm.com wrote:
> 
> First the current draft does not stop you from doing what you want - it
> just does not mandate it.
> Second with CmdSN now you have a better chance - before handing the command
> over to SCSI task management
> mark all the relevant commands in the iSCSI queue as non-deliverable to a
> LUN or all LUNs or keep a stack of barriers for active aborts (I assume
> that this is a reasonably low number) to the same effect.
> 
> Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 18/04/2001 01:10:17
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   Sandeep Joshi <sandeepj@research.bell-labs.com>
> cc:   ips@ece.cmu.edu
> Subject:  Re: aborting an out of sequence cmdSN
> 
> I'd think option 1 is the simplest (with the caveat that the task mgmt
> PDU referred to is the Abort Task.) and only impacts the affected
> command/task.
> 
> Pierre Labat and I have asked for this 4 months ago. (See :
> http://ips.pdl.cs.cmu.edu/mail/msg02958.html). The concept of connection
> allegiance should be extended to include the Abort Task. Also,
> connection allegiance should apply to the task (which spans multiple
> commands in the case of linked commands.), allowing for a deterministic
> clean up of stale PDUs of the task through the use of Abort Task.
> 
> - Santosh
> 
> Sandeep Joshi wrote:
> 
> >
> > So our options for abort_task boil down to..
> > (1) use connection allegiance for TASK MGMT PDU.
> > (2) reject all commands prior to cmdSN of TASK MGMT PDU.
> > (3) cmdSN of original task is sent with TASK MGMT PDU and
> >     target at the iSCSI layer keeps state.
> > (4) iSCSI initiator retains state for deleted tasks to ensure
> >     that R2T/Scsi Responses are appropriately handled.
--------------68BB3B5DB2F37016AD65253E--



From owner-ips@ece.cmu.edu  Wed Apr 18 23:58:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA23452
	for <ips-archive@odin.ietf.org>; Wed, 18 Apr 2001 23:58:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3IM97712908
	for ips-outgoing; Wed, 18 Apr 2001 18:09:07 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3IM8Pr12874
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 18:08:25 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id SAA95238;
	Wed, 18 Apr 2001 18:00:55 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id QAA70318;
	Wed, 18 Apr 2001 16:08:18 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: iSCSI: target discovery issue
To: Sandeep Joshi <sandeepj@research.bell-labs.com>
Cc: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OF57872AC2.8D8C2AAF-ON88256A32.0077EC68@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Wed, 18 Apr 2001 15:07:56 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/18/2001 04:08:18 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


First off you need to understand that we are talking about Targets, NOT LUs
nor LUNs.

Next, this is a feature that you want to have in your management SW, and I
believe that iSNS can help here. Josh Tseng, please pipe in here.

Further, storage events don't usually just happen such that everyone needs
to know about them.  As a rule, storage is brought on to meet some
requirement, and that requirement usually comes from a specific host.  Just
because you add a Storage Controller does NOT mean that all the various
hosts should start using the storage.  The storage is first established as
LUs, and the LUs are assigned to specific authorized Hosts (or iSCSI Nodes,
as we are calling them now).  When the session is started and
authenticated, that Host can issue Report LUNs, and that may be the first
time an LU has been given a Number for that particular Host.  But none of
that should begin until the Host is sure that there is something for it to
find.  And the thing it wants to find is a LUN, not a SCSI Device/iSCSI
Node or Target.  Getting knowledge of a new LU that can be used by a
specific host is NOT an iSCSI thing.  It is SCSI, and very deep into the
Management SW and Admin Processes.

I punished you with this information so that you can see why I did not
understand why you were concerned about a Host being automatically told
about a new SCSI Device/iSCSI Node/Target.  If the administrator has gone
to the work of bringing in the storage controller, configuring it for
specific Hosts to use, and setting up the various LUs to be authorized for
the various Hosts, it seems reasonable for the Admin to ask the Host to
recycle its discovery functions.
.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


Sandeep Joshi <sandeepj@research.bell-labs.com>@ece.cmu.edu on 04/18/2001
11:21:34 AM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:
Subject:  Re: target discovery issue




just realized that the reflector is not seeing this discussion.
the question at hand is how should target discovery notification
be sent in the iSCSI world.

Mark Bakke wrote:
>
> Actually, I think that part of it is an iSCSI issue.  That is,
> if a new target is created, that's at the SCSI level.  But if I
> add an iSCSI address on which to access that target, it now must
> be discovered first by the iSCSI layer on the host, before it
> can be presented to the SCSI layer.  In this case, we would need
> to send an iSCSI event indicating that there is a new target (or
> at least that there is some change in availability of targets);
> the host would then use SendTargets to find out the specifics.
>
> This brings up Sandeep's question #2.  If I am a target, I can
> send this message either:
>
>  a) On every iSCSI connection
>
>   OR
>
>  b) On all connections to canonical targets
>
> Method a gives us better coverage, and does not require an
> initiator to keep its canonical target connection around in
> between these little sendtargets commands.  However, if an
> initiator logs into a canonical target, finds that it has no
> targets to connect to (yet), and one is added later, the
> initiator would only find out if it had kept its canonical
> target connection, unless it is using an out-of-band discovery
> mechanism.
>
> Method a will also tend to bother connections to targets
> that are doing the "real" work (data path stuff).
>
> Method b will keep these events away from the data path, and
> will not generally have to send so many events.  However, it
> would require each initiator that wanted to be notified to keep
> its canonical connection around.
>
> There is a Method C, which is a combination of the above:
>
>  c) The device will send this async event message on ONE of the
>     connections to each initiator name (formerly WWUI) that is
>     connected to it.  If one of these connections is to the
>     canonical target, the device will use that one.
>
> Method c allows the initiator to choose whether it would rather
> keep an explicit canonical target connection around (e.g. if the
> other connections have been pushed down to hardware), or whether
> it would rather not keep the connection around, and be notified
> on one of the others.  The number of messages sent by targets
> would be identical to that in method b.
>
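
Method c as described in the quoted note can be sketched as a selection
over the device's open connections: one connection per initiator name,
preferring a connection to the canonical target when one exists. The
Connection record and its field names are hypothetical and exist only
for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Connection:
    conn_id: int
    initiator_name: str
    to_canonical_target: bool


def pick_notify_connections(connections: Iterable[Connection]) -> List[Connection]:
    """Choose one connection per initiator name for the async event."""
    chosen: Dict[str, Connection] = {}
    for conn in connections:
        current = chosen.get(conn.initiator_name)
        # Take the first connection seen, but let a canonical-target
        # connection replace a non-canonical one.
        if current is None or (
            conn.to_canonical_target and not current.to_canonical_target
        ):
            chosen[conn.initiator_name] = conn
    return list(chosen.values())


if __name__ == "__main__":
    conns = [
        Connection(1, "init-a", False),
        Connection(2, "init-a", True),
        Connection(3, "init-b", False),
    ]
    for c in pick_notify_connections(conns):
        print(c.conn_id, c.initiator_name)
```

Method a would correspond to returning every connection, and method b
to returning only those with to_canonical_target set.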
> --
> Mark
>
> julian_satran@il.ibm.com wrote:
> >
> > Sandeep,
> >
> > I think we are deep in T10 territory - this is a SCSI issue.
> >
> > Julo
> >
> > Sandeep Joshi <sandeepj@research.bell-labs.com> on 18/04/2001 16:21:06
> >
> > Please respond to Sandeep Joshi <sandeepj@research.bell-labs.com>
> >
> > To:   Julian Satran/Haifa/IBM@IBMIL, mbakke@cisco.com
> > cc:
> > Subject:  target discovery issue
> >
> > Julian & Mark,
> >
> > Friendly reminder... the issue mentioned below may not
> > have been resolved.
> >
> > 1) Is target discovery going to be the SCSI event or will
> >    it be wrapped up as an iSCSI event ?
> > 2) Do we have to keep a session to the canonical target
> >    always open to be able to do target discovery?
> >
> > -Sandeep
> >
> > I am not sure.  There are some SCSI items in it too (SCSI now handles
> > the appearance of new LUs).
> >
> > I will need a longer discussion with NDT to understand the semantics.
> >
> > Julo
> >
> > Sandeep Joshi <sandeepj@research.bell-labs.com> on 12/03/2001 22:25:27
> >
> > Julian,
> >
> > in case you skip this one..
> > your response is required on point (1) for amending iSCSI draft.
> >
> > -sandeep
> >
> > Sandeep-
> >
> > The problem you pointed out in item number 1 creates the need
> > for an additional iSCSI-level event.  Since the discovery of
> > targets happens at the iSCSI level, rather than at the SCSI
> > level, how about adding this to 2.18.1 (in iSCSI-05)?
> >
> >   4   Network entity indicates that a "target discovery" event
> >       has occurred.
> >
> > Upon receiving this message, the initiator should use SendTargets,
> > or whatever other methods of discovery it is using, to find out
> > what has changed.  Usually, this would be due to adding a new
> > target.
> >
> > We will fix items 2-4; thanks for pointing them out.
> >
> > Thanks,
> >
> > Mark
> >
> > Sandeep Joshi wrote:
> > >
> > > 1) Section 4.2 last line before Section 4.2.1
> > >     "target MUST send any iSCSI-level async on this session,
> > >      allowing the initiator to discover new targets.."
> > >
> > >    The session mentioned here is a session to the canonical target.
> > >
> > >    However, the iSCSI 05 draft does not mention any such condition
> > >    in Sec 2.18 on Async Message.   In there, a SCSI event (note: not
> > >    iSCSI) is used to notify availability of new targets.
> > >





From owner-ips@ece.cmu.edu  Thu Apr 19 00:43:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA23753
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 00:43:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3J2lj704211
	for ips-outgoing; Wed, 18 Apr 2001 22:47:45 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3J2kdA04195
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 22:46:39 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRK3A>; Wed, 18 Apr 2001 19:46:31 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B3FC1DD@ariel.nishansystems.com>
From: Joshua Tseng <jtseng@NishanSystems.com>
To: "'John Hufferd'" <hufferd@us.ibm.com>,
        Sandeep Joshi
	 <sandeepj@research.bell-labs.com>
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI: target discovery issue
Date: Wed, 18 Apr 2001 19:46:22 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

John, Sandeep,

I would like to append to Mark's note that method D is
the iSNS.  The iSNS breaks from the device-by-device
management paradigm that the previous methods use.  It
provides for network-wide storage device discovery,
zoning and management.

The details of how iSNS works are documented in the
iSNS document, and an overview is in the iSCSI N&D
requirements document.  But the key concept is that
instead of going to device A, configure its access
list, then on to device B, configure its access list,
then on to initiator A, configure its list of targets,
etc....you instead go to a single entity, the iSNS
server, to gain a network-wide view of all storage
assets.  If all storage devices have slaved their
discovery and management functions to the iSNS server,
then the iSNS is a single management point that the
GUI can use to configure discovery and access privileges
for the entire storage network.

When a new target shows up on the network, it registers
in the iSNS.  The iSNS server then sends notifications to
interested iSNS clients (only those configured for this
notification with the proper zoning) informing them of the
new device. The iSNS client is a co-resident application
on iSCSI targets and initiators that maintains communication
with the iSNS server.
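
As a rough illustration of the register-then-notify flow described
above, here is a minimal Python sketch.  It is not the iSNS protocol
or its message formats; the class name, method names, zone names and
node names are all hypothetical, and zoning is reduced to "client and
target share at least one zone".

```python
from collections import defaultdict


class NameServer:
    """Toy stand-in for an iSNS-style server (hypothetical, not real iSNS)."""

    def __init__(self):
        self.zones = defaultdict(set)  # zone name -> member node names
        self.listeners = {}            # client name -> notification callback
        self.targets = set()           # registered target names

    def add_to_zone(self, zone, node):
        self.zones[zone].add(node)

    def subscribe(self, client, callback):
        # Only subscribed clients are ever notified.
        self.listeners[client] = callback

    def register_target(self, target):
        """Record a new target and notify subscribed clients in its zones."""
        self.targets.add(target)
        notified = []
        for client, callback in self.listeners.items():
            if client == target:
                continue
            shares_zone = any(
                client in members and target in members
                for members in self.zones.values()
            )
            if shares_zone:
                callback(target)
                notified.append(client)
        return sorted(notified)


server = NameServer()
seen = defaultdict(list)
for host in ("host-a", "host-b"):
    server.subscribe(host, lambda target, host=host: seen[host].append(target))
server.add_to_zone("zone-1", "host-a")
server.add_to_zone("zone-1", "array-1")
notified = server.register_target("array-1")
print(notified)
```

In this sketch only host-a is told about array-1, because host-b is
subscribed but shares no zone with the new target.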

Hope this helps.

Josh

> -----Original Message-----
> From: John Hufferd [mailto:hufferd@us.ibm.com]
> Sent: Wednesday, April 18, 2001 3:08 PM
> To: Sandeep Joshi
> Cc: ips@ece.cmu.edu
> Subject: iSCSI: target discovery issue
> 
> 
> 
> First off you need to understand that we are talking about Targets,
> NOT LUs nor LUNs.
>
> Next this is a feature that you want to have in your management SW,
> and I believe that iSNS can help here. Josh Tseng pipe in here.
>
> Further, storage events don't usually just happen that everyone needs
> to know about it.  As a rule, storage is brought on to meet some
> requirement.  The requirement is usually needed by a specific host.
> Just because you add a Storage Controller does NOT mean that all the
> various host should start using the storage.  The storage is first
> established as LUs and the LUs are assigned to specific authorized
> Host (or iSCSI Nodes as we are calling them now) and when the session
> is started, and authenticated, that Host can issue the Report LUNs
> and that maybe the first time an LU has been given a Number for that
> particular Host.   But none of that should begin until the Host is
> sure that there is something for it to find.  And the thing it wants
> to find is a LUN not a SCSI Device/iSCSI Node or Target.  Getting
> knowledge of a new LU that can be used by a specific host is NOT an
> iSCSI thing.  It is SCSI, and very deep into the Management SW and
> Admin Processes.
>
> I punished you with this information so that you can see why I did
> not understand why you were concerned about a Host being
> automatically told about a new SCSI Device/ iSCSI Node/Target.  I
> would expect that if the administrator had gone to the work to bring
> in the storage controller, configure it for specific Hosts to use,
> set up the various LUs to be authorized for the various Hosts.  It
> seems reasonable for the Admin to ask the Host to recycle its
> discovery functions.
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
> 
> 
> Sandeep Joshi <sandeepj@research.bell-labs.com>@ece.cmu.edu on 04/18/2001
> 11:21:34 AM
> 
> Sent by:  owner-ips@ece.cmu.edu
> 
> 
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: target discovery issue
> 
> 
> 
> 
> just realized that the reflector is not seeing this discussion.
> the question at hand is how should target discovery notification
> be sent in the iSCSI world.
> 
> Mark Bakke wrote:
> >
> > Actually, I think that part of it is an iSCSI issue.  That is,
> > if a new target is created, that's at the SCSI level.  But if I
> > add an iSCSI address on which to access that target, it now must
> > be discovered first by the iSCSI layer on the host, before it
> > can be presented to the SCSI layer.  In this case, we would need
> > to send an iSCSI event indicating that there is a new target (or
> > at least that there is some change in availability of targets);
> > the host would then use SendTargets to find out the specifics.
> >
> > This brings up Sandeep's question #2.  If I am a target, I can
> > send this message either:
> >
> >  a) On every iSCSI connection
> >
> >   OR
> >
> >  b) On all connections to canonical targets
> >
> > Method a gives us better coverage, and does not require an
> > initiator to keep its canonical target connection around in
> > between these little sendtargets commands.  However, if an
> > initiator logs into a canonical target, finds that it has no
> > targets to connect to (yet), and one is added later, the
> > initiator would only find out if it had kept its canonical
> > target connection, unless it is using an out-of-band discovery
> > mechanism.
> >
> > Method a will also tend to bother connections to targets
> > that are doing the "real" work (data path stuff).
> >
> > Method b will keep these events away from the data path, and
> > will not generally have to send so many events.  However, it
> > would require each initiator that wanted to be notified to keep
> > its canonical connection around.
> >
> > There is a Method C, which is a combination of the above:
> >
> >  c) The device will send this async event message on ONE of the
> >     connections to each initiator name (formerly WWUI) that is
> >     connected to it.  If one of these connections is to the
> >     canonical target, the device will use that one.
> >
> > Method c allows the initiator to choose whether it would rather
> > keep an explicit canonical target connection around (e.g. if the
> > other connections have been pushed down to hardware), or whether
> > it would rather not keep the connection around, and be notified
> > on one of the others.  The number of messages sent by targets
> > would be identical to that in method b.
> >
> > --
> > Mark
> >
> > julian_satran@il.ibm.com wrote:
> > >
> > > Sandeep,
> > >
> > > I think we are deep in T10 territory - this is a SCSI issue.
> > >
> > > Julo
> > >
> > > Sandeep Joshi <sandeepj@research.bell-labs.com> on 18/04/2001 16:21:06
> > >
> > > Please respond to Sandeep Joshi <sandeepj@research.bell-labs.com>
> > >
> > > To:   Julian Satran/Haifa/IBM@IBMIL, mbakke@cisco.com
> > > cc:
> > > Subject:  target discovery issue
> > >
> > > Julian & Mark,
> > >
> > > Friendly reminder... the issue mentioned below may not
> > > have been resolved.
> > >
> > > 1) Is target discovery going to be the SCSI event or will
> > >    it be wrapped up as an iSCSI event ?
> > > 2) Do we have to keep a session to the canonical target
> > >    always open to be able to do target discovery?
> > >
> > > -Sandeep
> > >
> > > I am not sure.  There are some SCSI items in it too (SCSI handles now
> > > the appearance of new LUs).
> > >
> > > I will need a longer discussion with NDT to understand the semantics.
> > >
> > > Julo
> > >
> > > Sandeep Joshi <sandeepj@research.bell-labs.com> on 12/03/2001 22:25:27
> > >
> > > Julian,
> > >
> > > in case you skip this one..
> > > your response is required on point (1) for amending iSCSI draft.
> > >
> > > -sandeep
> > >
> > > Sandeep-
> > >
> > > The problem you pointed out in item number 1 creates the need
> > > for an additional iSCSI-level event.  Since the discovery of
> > > targets happens at the iSCSI level, rather than at the SCSI
> > > level, how about adding this to 2.18.1 (in iSCSI-05)?
> > >
> > >   4   Network entity indicates that a "target discovery" event
> > >       has occurred.
> > >
> > > Upon receiving this message, the initiator should use SendTargets,
> > > or whatever other methods of discovery it is using, to find out
> > > what has changed.  Usually, this would be due to adding a new
> > > target.
> > >
> > > We will fix items 2-4; thanks for pointing them out.
> > >
> > > Thanks,
> > >
> > > Mark
> > >
> > > Sandeep Joshi wrote:
> > > >
> > > > 1) Section 4.2 last line before Section 4.2.1
> > > >     "target MUST send any iSCSI-level async on this session,
> > > >      allowing the initiator to discover new targets.."
> > > >
> > > >    The session mentioned here is a session to the canonical target.
> > > >
> > > >    However, the iSCSI 05 draft does not mention any such condition
> > > >    in Sec 2.18 on Async Message.   In there, a SCSI event (note: not
> > > >    iSCSI) is used to notify availability of new targets.
> > > >
> 
> 
> 


From owner-ips@ece.cmu.edu  Thu Apr 19 02:18:54 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA07714
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 02:18:53 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3J40jL07204
	for ips-outgoing; Thu, 19 Apr 2001 00:00:45 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from crufty.research.bell-labs.com (crufty.research.bell-labs.com [204.178.16.49])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3J40KA07157
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 00:00:20 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by crufty; Wed Apr 18 23:57:35 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Wed Apr 18 23:59:55 EDT 2001
Received: (from sandeepj@localhost)
	by aura.research.bell-labs.com (8.9.1/8.9.1) id XAA29944;
	Wed, 18 Apr 2001 23:59:51 -0400 (EDT)
Date: Wed, 18 Apr 2001 23:59:51 -0400 (EDT)
Message-Id: <200104190359.XAA29944@aura.research.bell-labs.com>
From: sandeepj@research.bell-labs.com (Sandeep Joshi)
To: hufferd@us.ibm.com, jtseng@NishanSystems.com
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI: target discovery issue
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Josh & John,

Thanks for the high-level perspective.  I do agree that iSNS
and/or SLP will do the trick here and that this functionality
falls into the OA&M domain.

Four score and seven days ago (..approx!), this thread started due 
to confusion with the following line in the naming and discovery 
draft, which had no equivalent message code defined in the iSCSI 
Async PDU.   

> 1) Section 4.2 last line before Section 4.2.1
>    "the target MUST send any iSCSI-level async on the canonical
>    session, to allow the initiator to discover new targets as
>    they are created.."

Can the issue be laid to rest by removing this statement ?

Thanks,
-Sandeep


> John, Sandeep,
> 
> I would like to append to Mark's note that method D is
> the iSNS.  The iSNS breaks from the device-by-device
> management paradigm that the previous methods use.  It
> provides for network-wide storage device discovery,
> zoning and management.
> 
> The details of how iSNS works is documented in the
> iSNS document, and an overview is in the iSCSI N&D
> requirements document.  But the key concept is that
> instead of going to device A, configure its access
> list, then on to device B, configure its access list,
> then on to initiator A, configure its list of targets,
> etc....you instead go to a single entity, the iSNS
> server, to gain a network-wide view of all storage
> assets.  If all storage devices have slaved their
> discovery and management functions to the iSNS server,
> then the iSNS is a single management point that the
> GUI can use to configure discovery and access privileges
> for the entire storage network.
> 
> When a new target shows up on the network, it registers
> in the iSNS.  The iSNS server then sends notifications to
> interested iSNS clients (only those configured for this
> notification with the proper zoning) informing them of the
> new device. The iSNS client is a co-resident application
> on iSCSI targets and initiators that maintains communication
> with the iSNS server.
> 
> Hope this helps.
> 
> Josh
> 
> > -----Original Message-----
> > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > Sent: Wednesday, April 18, 2001 3:08 PM
> > To: Sandeep Joshi
> > Cc: ips@ece.cmu.edu
> > Subject: iSCSI: target discovery issue
> > 
> > 
> > 
> > First off you need to understand that we are talking about Targets,
> > NOT LUs nor LUNs.
> >
> > Next this is a feature that you want to have in your management SW,
> > and I believe that iSNS can help here. Josh Tseng pipe in here.
> >
> > Further, storage events don't usually just happen that everyone needs
> > to know about it.  As a rule, storage is brought on to meet some
> > requirement.  The requirement is usually needed by a specific host.
> > Just because you add a Storage Controller does NOT mean that all the
> > various host should start using the storage.  The storage is first
> > established as LUs and the LUs are assigned to specific authorized
> > Host (or iSCSI Nodes as we are calling them now) and when the session
> > is started, and authenticated, that Host can issue the Report LUNs
> > and that maybe the first time an LU has been given a Number for that
> > particular Host.   But none of that should begin until the Host is
> > sure that there is something for it to find.  And the thing it wants
> > to find is a LUN not a SCSI Device/iSCSI Node or Target.  Getting
> > knowledge of a new LU that can be used by a specific host is NOT an
> > iSCSI thing.  It is SCSI, and very deep into the Management SW and
> > Admin Processes.
> >
> > I punished you with this information so that you can see why I did
> > not understand why you were concerned about a Host being
> > automatically told about a new SCSI Device/ iSCSI Node/Target.  I
> > would expect that if the administrator had gone to the work to bring
> > in the storage controller, configure it for specific Hosts to use,
> > set up the various LUs to be authorized for the various Hosts.  It
> > seems reasonable for the Admin to ask the Host to recycle its
> > discovery functions.
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> > 
> > 
> > Sandeep Joshi <sandeepj@research.bell-labs.com>@ece.cmu.edu on 04/18/2001
> > 11:21:34 AM
> > 
> > Sent by:  owner-ips@ece.cmu.edu
> > 
> > 
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  Re: target discovery issue
> > 
> > 
> > 
> > 
> > just realized that the reflector is not seeing this discussion.
> > the question at hand is how should target discovery notification
> > be sent in the iSCSI world.
> > 
> > Mark Bakke wrote:
> > >
> > > Actually, I think that part of it is an iSCSI issue.  That is,
> > > if a new target is created, that's at the SCSI level.  But if I
> > > add an iSCSI address on which to access that target, it now must
> > > be discovered first by the iSCSI layer on the host, before it
> > > can be presented to the SCSI layer.  In this case, we would need
> > > to send an iSCSI event indicating that there is a new target (or
> > > at least that there is some change in availability of targets);
> > > the host would then use SendTargets to find out the specifics.
> > >
> > > This brings up Sandeep's question #2.  If I am a target, I can
> > > send this message either:
> > >
> > >  a) On every iSCSI connection
> > >
> > >   OR
> > >
> > >  b) On all connections to canonical targets
> > >
> > > Method a gives us better coverage, and does not require an
> > > initiator to keep its canonical target connection around in
> > > between these little sendtargets commands.  However, if an
> > > initiator logs into a canonical target, finds that it has no
> > > targets to connect to (yet), and one is added later, the
> > > initiator would only find out if it had kept its canonical
> > > target connection, unless it is using an out-of-band discovery
> > > mechanism.
> > >
> > > Method a will also tend to bother connections to targets
> > > that are doing the "real" work (data path stuff).
> > >
> > > Method b will keep these events away from the data path, and
> > > will not generally have to send so many events.  However, it
> > > would require each initiator that wanted to be notified to keep
> > > its canonical connection around.
> > >
> > > There is a Method C, which is a combination of the above:
> > >
> > >  c) The device will send this async event message on ONE of the
> > >     connections to each initiator name (formerly WWUI) that is
> > >     connected to it.  If one of these connections is to the
> > >     canonical target, the device will use that one.
> > >
> > > Method c allows the initiator to choose whether it would rather
> > > keep an explicit canonical target connection around (e.g. if the
> > > other connections have been pushed down to hardware), or whether
> > > it would rather not keep the connection around, and be notified
> > > on one of the others.  The number of messages sent by targets
> > > would be identical to that in method b.
> > >
> > > --
> > > Mark
> > >
> > > julian_satran@il.ibm.com wrote:
> > > >
> > > > Sandeep,
> > > >
> > > > I think we are deep in T10 territory - this is a SCSI issue.
> > > >
> > > > Julo
> > > >
> > > > Sandeep Joshi <sandeepj@research.bell-labs.com> on 18/04/2001 16:21:06
> > > >
> > > > Please respond to Sandeep Joshi <sandeepj@research.bell-labs.com>
> > > >
> > > > To:   Julian Satran/Haifa/IBM@IBMIL, mbakke@cisco.com
> > > > cc:
> > > > Subject:  target discovery issue
> > > >
> > > > Julian & Mark,
> > > >
> > > > Friendly reminder... the issue mentioned below may not
> > > > have been resolved.
> > > >
> > > > 1) Is target discovery going to be the SCSI event or will
> > > >    it be wrapped up as an iSCSI event ?
> > > > 2) Do we have to keep a session to the canonical target
> > > >    always open to be able to do target discovery?
> > > >
> > > > -Sandeep
> > > >
> > > > I am not sure.  There are some SCSI items in it too (SCSI handles now
> > > > the appearance of new LUs).
> > > >
> > > > I will need a longer discussion with NDT to understand the semantics.
> > > >
> > > > Julo
> > > >
> > > > Sandeep Joshi <sandeepj@research.bell-labs.com> on 12/03/2001 22:25:27
> > > >
> > > > Julian,
> > > >
> > > > in case you skip this one..
> > > > your response is required on point (1) for amending iSCSI draft.
> > > >
> > > > -sandeep
> > > >
> > > > Sandeep-
> > > >
> > > > The problem you pointed out in item number 1 creates the need
> > > > for an additional iSCSI-level event.  Since the discovery of
> > > > targets happens at the iSCSI level, rather than at the SCSI
> > > > level, how about adding this to 2.18.1 (in iSCSI-05)?
> > > >
> > > >   4   Network entity indicates that a "target discovery" event
> > > >       has occurred.
> > > >
> > > > Upon receiving this message, the initiator should use SendTargets,
> > > > or whatever other methods of discovery it is using, to find out
> > > > what has changed.  Usually, this would be due to adding a new
> > > > target.
> > > >
> > > > We will fix items 2-4; thanks for pointing them out.
> > > >
> > > > Thanks,
> > > >
> > > > Mark
> > > >
> > > > Sandeep Joshi wrote:
> > > > >
> > > > > 1) Section 4.2 last line before Section 4.2.1
> > > > >     "target MUST send any iSCSI-level async on this session,
> > > > >      allowing the initiator to discover new targets.."
> > > > >
> > > > >    The session mentioned here is a session to the canonical target.
> > > > >
> > > > >    However, the iSCSI 05 draft does not mention any such condition
> > > > >    in Sec 2.18 on Async Message.   In there, a SCSI event (note: not
> > > > >    iSCSI) is used to notify availability of new targets.
> > > > >


From owner-ips@ece.cmu.edu  Thu Apr 19 02:29:38 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA07790
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 02:29:37 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ILlwu11549
	for ips-outgoing; Wed, 18 Apr 2001 17:47:58 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ILlkr11508
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 17:47:46 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id RAA51836
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 17:40:18 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id PAA152862
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 15:47:40 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: IPS-ALL
To: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OFF28C8038.1B7CF3CA-ON88256A32.00776F6F@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Wed, 18 Apr 2001 14:47:20 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/18/2001 03:47:40 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


This is just another reminder that your subject line should begin with
iSCSI:, or FCIP:, or iFCP: depending on the topic you are discussing.

.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


Sandeep Joshi <sandeepj@research.bell-labs.com>@ece.cmu.edu on 04/18/2001
11:21:34 AM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:
Subject:  Re: target discovery issue




just realized that the reflector is not seeing this discussion.
the question at hand is how should target discovery notification
be sent in the iSCSI world.

Mark Bakke wrote:
>
> Actually, I think that part of it is an iSCSI issue.  That is,
> if a new target is created, that's at the SCSI level.  But if I
> add an iSCSI address on which to access that target, it now must
> be discovered first by the iSCSI layer on the host, before it
> can be presented to the SCSI layer.  In this case, we would need
> to send an iSCSI event indicating that there is a new target (or
> at least that there is some change in availability of targets);
> the host would then use SendTargets to find out the specifics.
>
> This brings up Sandeep's question #2.  If I am a target, I can
> send this message either:
>
>  a) On every iSCSI connection
>
>   OR
>
>  b) On all connections to canonical targets
>
> Method a gives us better coverage, and does not require an
> initiator to keep its canonical target connection around in
> between these little sendtargets commands.  However, if an
> initiator logs into a canonical target, finds that it has no
> targets to connect to (yet), and one is added later, the
> initiator would only find out if it had kept its canonical
> target connection, unless it is using an out-of-band discovery
> mechanism.
>
> Method a will also tend to bother connections to targets
> that are doing the "real" work (data path stuff).
>
> Method b will keep these events away from the data path, and
> will not generally have to send so many events.  However, it
> would require each initiator that wanted to be notified to keep
> its canonical connection around.
>
> There is a Method C, which is a combination of the above:
>
>  c) The device will send this async event message on ONE of the
>     connections to each initiator name (formerly WWUI) that is
>     connected to it.  If one of these connections is to the
>     canonical target, the device will use that one.
>
> Method c allows the initiator to choose whether it would rather
> keep an explicit canonical target connection around (e.g. if the
> other connections have been pushed down to hardware), or whether
> it would rather not keep the connection around, and be notified
> on one of the others.  The number of messages sent by targets
> would be identical to that in method b.
>
> --
> Mark
>
> julian_satran@il.ibm.com wrote:
> >
> > Sandeep,
> >
> > I think we are deep in T10 territory - this is a SCSI issue.
> >
> > Julo
> >
> > Sandeep Joshi <sandeepj@research.bell-labs.com> on 18/04/2001 16:21:06
> >
> > Please respond to Sandeep Joshi <sandeepj@research.bell-labs.com>
> >
> > To:   Julian Satran/Haifa/IBM@IBMIL, mbakke@cisco.com
> > cc:
> > Subject:  target discovery issue
> >
> > Julian & Mark,
> >
> > Friendly reminder... the issue mentioned below may not
> > have been resolved.
> >
> > 1) Is target discovery going to be the SCSI event or will
> >    it be wrapped up as an iSCSI event ?
> > 2) Do we have to keep a session to the canonical target
> >    always open to be able to do target discovery?
> >
> > -Sandeep
> >
> > I am not sure.  There are some SCSI items in it too (SCSI handles now
> > the appearance of new LUs).
> >
> > I will need a longer discussion with NDT to understand the semantics.
> >
> > Julo
> >
> > Sandeep Joshi <sandeepj@research.bell-labs.com> on 12/03/2001 22:25:27
> >
> > Julian,
> >
> > in case you skip this one..
> > your response is required on point (1) for amending iSCSI draft.
> >
> > -sandeep
> >
> > Sandeep-
> >
> > The problem you pointed out in item number 1 creates the need
> > for an additional iSCSI-level event.  Since the discovery of
> > targets happens at the iSCSI level, rather than at the SCSI
> > level, how about adding this to 2.18.1 (in iSCSI-05)?
> >
> >   4   Network entity indicates that a "target discovery" event
> >       has occurred.
> >
> > Upon receiving this message, the initiator should use SendTargets,
> > or whatever other methods of discovery it is using, to find out
> > what has changed.  Usually, this would be due to adding a new
> > target.
> >
> > We will fix items 2-4; thanks for pointing them out.
> >
> > Thanks,
> >
> > Mark
> >
> > Sandeep Joshi wrote:
> > >
> > > 1) Section 4.2 last line before Section 4.2.1
> > >     "target MUST send any iSCSI-level async on this session,
> > >      allowing the initiator to discover new targets.."
> > >
> > >    The session mentioned here is a session to the canonical target.
> > >
> > >    However, the iSCSI 05 draft does not mention any such condition
> > >    in Sec 2.18 on Async Message.   In there, a SCSI event (note: not
> > >    iSCSI) is used to notify availability of new targets.
> > >





From owner-ips@ece.cmu.edu  Thu Apr 19 02:33:54 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA07836
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 02:33:53 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3J1lek01769
	for ips-outgoing; Wed, 18 Apr 2001 21:47:40 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3J1lDA01747
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 21:47:13 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP id 46AC0A88
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 18:47:12 -0700 (PDT)
Received: (from santoshr@localhost)
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) id SAA18720
	for ips@ece.cmu.edu; Wed, 18 Apr 2001 18:47:07 -0700 (PDT)
From: Santosh Rao <santoshr@cup.hp.com>
Message-Id: <200104190147.SAA18720@hpcuhe.cup.hp.com>
Subject: iSCSI : New PDU opcode usage in rev 5.92
To: ips@ece.cmu.edu (ips)
Date: Wed, 18 Apr 2001 18:47:07 -0700 (PDT)
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian & All,

I've got a quick question on how the new opcode layouts would work for
dual mode scsi implementations. (i.e. initiators that responded in
target mode or targets that acted as initiators also).

The new opcode layout is :

----------------
X|I| | | | | | |
----------------
7 6 5 4 3 2 1 0

where bits 5-0 -> opcode
X -> retry bit
I -> immediate bit

The same values are used for the command as well as response opcodes and
bits X & I are intended to both be set to 1 by targets.

i.e. the opcode for scsi command = scsi response = 0x01. The distinction
between command and response is based on targets setting the X & I bits to 1.

Now, if an initiator [capable of target mode] sent the following
commands, how would they be interpreted :

1) 0xc4.
is this a text command being retried in immediate mode, 
or is it a text response ?

2) 0xc1
is this a scsi command being retried in immediate mode,
or is it a scsi response ?

3) 0xc2
is this a scsi task mgmt command being retried in immediate mode,
or is it a scsi task mgmt response ?

etc.....
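
To make the ambiguity concrete, here is a small Python sketch that
splits the first byte according to the layout above (bit 7 = X,
bit 6 = I, bits 5-0 = opcode).  The function name is made up for
illustration and is not taken from the draft.

```python
def decode_opcode_byte(byte):
    """Split the first PDU byte into (X, I, opcode) per the layout above."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte")
    x_bit = (byte >> 7) & 1   # bit 7: retry bit
    i_bit = (byte >> 6) & 1   # bit 6: immediate bit
    opcode = byte & 0x3F      # bits 5-0: opcode
    return x_bit, i_bit, opcode


for value in (0xC4, 0xC1, 0xC2):
    print(hex(value), decode_opcode_byte(value))
```

All three values decode to X=1 and I=1 with opcodes 0x04, 0x01 and
0x02, so the byte alone cannot say whether it is a retried immediate
command or a response.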

- Santosh

-- 
#################################
Santosh Rao
Software Design Engineer,
HP, Cupertino.
email : santoshr@cup.hp.com
Phone : 408-447-3751
#################################


From owner-ips@ece.cmu.edu  Thu Apr 19 03:27:24 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA08175
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 03:27:23 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3J5wlB11399
	for ips-outgoing; Thu, 19 Apr 2001 01:58:47 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from med.corp.rhapsodynetworks.com (64-160-62-201.rhapsodynetworks.com [64.160.62.201] (may be forged))
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3J5wVA11392
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 01:58:32 -0400 (EDT)
Received: by med.corp.rhapsodynetworks.com with Internet Mail Service (5.5.2653.19)
	id <JG73VYJC>; Wed, 18 Apr 2001 22:58:24 -0700
Message-ID: <15851BD69CFCD41186B100B0D0AABE650C1B6C@med.corp.rhapsodynetworks.com>
From: Venkat Rangan <venkat@rhapsodynetworks.com>
To: "'Santosh Rao'" <santoshr@cup.hp.com>, ips@ece.cmu.edu
Subject: RE: iSCSI : New PDU opcode usage in rev 5.92
Date: Wed, 18 Apr 2001 22:58:23 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Santosh,

Is it not the case that requests go in the direction from the Initiator to
the Target, where the Target is the one "listening" for new connections on
the well-known port?  A dual-mode SCSI implementation therefore has two
separate sessions and sets of connections.  One set is [I->DualModeTarget]
and the other is [DualModeInitiator->T], and the connections are
independent.  If I and T happen to be the same system, you cannot use
a single connection for bidirectional sessions between the two.

So if you receive a PDU from a target, you can only do so with SourcePort
set to the well-known port, and it must be a Response from the target.
Maybe I'm assuming something that is not valid...
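
A minimal Python sketch of that reasoning, assuming the role on a
connection is fixed by who opened it (the function name and the
returned tuple are hypothetical, not from the draft):

```python
def interpret(byte, we_accepted_connection):
    """Classify a received PDU byte using the connection's direction.

    Assumption: the side that accepted the connection acts as target on
    it and therefore only receives requests; the side that opened it
    acts as initiator and only receives responses.
    """
    opcode = byte & 0x3F
    if we_accepted_connection:
        retry = bool(byte & 0x80)      # X bit is meaningful on requests
        immediate = bool(byte & 0x40)  # I bit is meaningful on requests
        return ("request", opcode, retry, immediate)
    # On a connection we opened, the peer is the target and sends responses.
    return ("response", opcode, None, None)


print(interpret(0xC1, we_accepted_connection=True))
print(interpret(0xC1, we_accepted_connection=False))
```

Under this assumption the same byte 0xc1 is a retried immediate scsi
command on an accepted connection and a scsi response on a connection
we opened ourselves.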

Venkat Rangan
Rhapsody Networks Inc.
http://www.rhapsodynetworks.com

-----Original Message-----
From: Santosh Rao [mailto:santoshr@cup.hp.com]
Sent: Wednesday, April 18, 2001 6:47 PM
To: ips@ece.cmu.edu
Subject: iSCSI : New PDU opcode usage in rev 5.92


Julian & All,

I've got a quick question on how the new opcode layouts would work for
dual mode scsi implementations. (i.e. initiators that responded in
target mode or targets that acted as initiators also).

The new opcode layout is :

----------------
X|I| | | | | | |
----------------
7 6 5 4 3 2 1 0

where bits 5-0 -> opcode
X -> retry bit
I -> immediate bit

The same values are used for the command as well as response opcodes and
bits X & I are intended to both be set to 1 by targets.

i.e. opcode for scsi command = scsi response = 0x01. the distinction b/n
command and response is based on targets setting X & I bits to 1.

Now, if an initiator [capable of target mode] sent the following
commands, how would they be interpreted :

1) 0xc4.
is this a text command being retried in immediate mode, 
or is it a text response ?

2) 0xc1
is this a scsi command being retried in immediate mode,
or is it a scsi response ?

3) 0xc2
is this a scsi task mgmt command being retried in immediate mode,
or is it a scsi task mgmt response ?

etc.....

- Santosh

-- 
#################################
Santosh Rao
Software Design Engineer,
HP, Cupertino.
email : santoshr@cup.hp.com
Phone : 408-447-3751
#################################


From owner-ips@ece.cmu.edu  Thu Apr 19 03:31:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA08205
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 03:31:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3J5TlJ10386
	for ips-outgoing; Thu, 19 Apr 2001 01:29:47 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3J5TAA10376
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 01:29:10 -0400 (EDT)
Received: from amrelay2.boi.hp.com (amrelay2.boi.hp.com [15.56.8.41])
	by palrel3.hp.com (Postfix) with ESMTP id 072002E7
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 22:29:10 -0700 (PDT)
Received: from xatlbh1.atl.hp.com (xatlbh1.atl.hp.com [15.45.89.186])
	by amrelay2.boi.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id XAA13755
	for <ips@ece.cmu.edu>; Wed, 18 Apr 2001 23:29:09 -0600 (MDT)
Received: by xatlbh1.atl.hp.com with Internet Mail Service (5.5.2653.19)
	id <JG4Y8RSA>; Thu, 19 Apr 2001 01:29:08 -0400
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08FCE@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: ips@ece.cmu.edu
Subject: RE: iSCSI: target discovery issue
Date: Thu, 19 Apr 2001 01:29:05 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Josh wrote:
> I would like to append to Mark's note that method D is
> the iSNS.  The iSNS breaks from the device-by-device
> management paradigm that the previous methods use.  It
> provides for network-wide storage device discovery,
> zoning and management.
> 

A more generic statement of "method D" is that there exist applications for
managing and administering networked resources (Novell eDirectory and
Microsoft ActiveDirectory products are examples).  One of the most important
tasks a network administrator has is making sure that users have access to
files, resources and services they need on the network.  iSCSI adds block
storage to the list of network resources, but resource assignment and
management is a problem that has existed for quite some time and solutions
exist today.  Network resource management evolved from a device-by-device
management paradigm long before iSCSI was conceived.  I don't want to
detract from the value of iSNS, but it's not correct to suggest it's the
only solution to resource management.

Typically users register/login to distributed resource management points
(domain servers) and these applications handle authentication,
authorization, and assignment of resources.  John makes important points in
his email - you don't want all users informed of new storage coming on line;
only those systems that are intended to have access should be notified, or
should explicitly "mount" the new storage.  It's not appropriate to burden
each storage device with this task; it is definitely a value-add feature
appropriate to a centralized resource management application.

Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com 


From owner-ips@ece.cmu.edu  Thu Apr 19 04:11:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA08528
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 04:11:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3J6emo12883
	for ips-outgoing; Thu, 19 Apr 2001 02:40:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3J6e1A12805
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 02:40:01 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id IAA204440
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 08:39:53 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id IAA227908
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 08:39:53 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A33.002499AE ; Thu, 19 Apr 2001 08:39:46 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A33.00249950.00@d12mta02.de.ibm.com>
Date: Thu, 19 Apr 2001 08:44:53 +0200
Subject: Re: iSCSI : digest error handling violates EMDP/InDataOrder
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

No, it is a mistake. But unlike FCP we will require ordering only within a
sequence (input data or unsolicited or answer to a R2T).

Thanks,
Julo

Santosh Rao <santoshr@cup.hp.com> on 18/04/2001 19:20:53

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:
cc:   ips@ece.cmu.edu
Subject:  Re: iSCSI : digest error handling violates EMDP/InDataOrder




julian_satran@il.ibm.com wrote:
>
> OK - I misread it. In any case we are not FCP and we don't violate iSCSI
> rules.

Julian,

What reasons exist to differ in EMDP behaviour b/n iSCSI and FCP ?

Also, a fundamental question is that the description only speaks about
incoming data PDUs. Are you implying that InDataOrder only controls the
ordering for READ data PDUs ? If so, what is the mechanism to control
ordering for write data PDUs ?

It is a useful control option for initiators to negotiate that R2T
requests be made in increasing continuous buffer offset order and R2T
requests not be sent out of order. Does iSCSI allow this ?

- Santosh

"31 InDataOrder

   InDataOrder=<yes|no>

   No is used by iSCSI to indicate that the incoming data PDUs can be in
   any order (EMDP = 1). Yes is used to indicate that incoming data PDUs
   have to be at continuously increasing addresses (EMDP = 0).

   This also sets the Connect-Disconnect mode page EMDP bit.

   The default is yes but targets MAY support no. "
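
Read literally, the "continuously increasing addresses" rule can be sketched
as a check over the (buffer offset, length) pairs of the data PDUs in one
sequence. This is only an illustration: the tuple representation and the
function name are assumptions, not draft text, and it assumes "continuously"
means each PDU starts exactly where the previous one ended.

```python
def in_data_order(pdus, start_offset=0):
    """Return True if each (offset, length) PDU starts where the previous ended."""
    expected = start_offset
    for offset, length in pdus:
        if offset != expected:
            return False          # gap, overlap, or out-of-order PDU
        expected = offset + length
    return True

print(in_data_order([(0, 512), (512, 512), (1024, 256)]))
print(in_data_order([(512, 512), (0, 512)]))
```

Under this reading, a PDU resent at an earlier offset after a digest error
would fail the check, which is the situation being questioned in this thread.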


------------------------------------------------------------------------------------


> > FCP uses it like iSCSI - i.e. the order has to be maintained within a
> > sequence
>
> Not true. If you take a look at FCP-2 rev 04 Section 10.1.1.7
> description on EMDP, it explicitly states :
> "The EMDP bit does not affect the order of frames within a sequence".
>
> For a WRITE command, an EMDP setting of 0 implies that the buffer offset
> in R2T requests must be in continuous and increasing order whereas an
> EMDP setting of 1 implies the buffer offset in R2T can be out of order.
>
> For a READ command, an EMDP setting of 0 implies the buffer offset in
> READ data PDUs is in continuous and increasing order, whereas, an EMDP
> setting of 1 implies buffer offset in READ Data PDUs can be out of
> order.
>
> Based on the above rules, iSCSI is violating EMDP setting by its error
> recovery for data digest errors detected by targets on Data PDUs.
>
> - Santosh





From owner-ips@ece.cmu.edu  Thu Apr 19 04:12:02 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA08541
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 04:12:00 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3J6Srk12451
	for ips-outgoing; Thu, 19 Apr 2001 02:28:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3J6SDA12436
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 02:28:13 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id IAA180390
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 08:28:06 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id IAA267400
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 08:28:06 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A33.002384BB ; Thu, 19 Apr 2001 08:27:57 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A33.002383BD.00@d12mta02.de.ibm.com>
Date: Thu, 19 Apr 2001 08:33:01 +0200
Subject: Re: aborting an out of sequence cmdSN
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

The version 06 (to appear today) addresses your concerns with regard to
Task management - although through the mechanism I was alluding to all the
time.

At David Black's suggestion - I've added a section to chapter 7 explaining
how to properly handle task management using CmdSN.
It is only a note to implementers - nothing needed to change.

The error recovery chapter is still work-in-progress.

I would also appreciate if you would "tone down" your notes - recall that
we are always limited by our own imagination.

Julo

Santosh Rao <santoshr@cup.hp.com> on 18/04/2001 19:00:43

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu
Subject:  Re: aborting an out of sequence cmdSN




Julian,

I understand that nothing in the current draft prevents usage of
connection allegiance for an abort task to flush the stale PDUs of a
command.

However, as a target implementor, if such a guaranteed mechanism is not
mandated to flush stale PDUs of a command, targets will need to build
safeguards against possible stale PDUs from a command. Any such scheme
would cause targets to retire resources until a safe period (based on a
best guess !) has elapsed. The targets CANNOT depend on initiators
deploying this scheme in the absence of the spec mandating this and do
not have a guarantee that stale PDUs will not arrive after the Abort
Task has cleaned up the command at the target end.

Keeping in mind the above, I am asking that iSCSI mandate that the Abort
Task be sent on the same connection as the SCSI command being aborted.
In addition, [and this has been asked earlier], the iSCSI draft lacks a
section on I/O timeout handling under the error recovery section. Some
description must be provided on the actions an initiator should take on
an I/O timeout. (or timeout of other iSCSI PDUs).

For instance, should non-SCSI PDUs be timed ? (login, text, etc). If so,
what are their timeout values ? What actions should an initiator take if
they time out ? How is the task tag for that non-SCSI PDU cleaned up,
ensuring no stale PDUs are going to arrive for that task tag ?

iSCSI Error Recovery descriptions MUST address task timeout handling.

- Santosh


julian_satran@il.ibm.com wrote:
>
> First the current draft does not stop you from doing what you want - it
> just does not mandate it.
> Second with CmdSN now you have a better chance - before handing the
> command over to SCSI task management
> mark all the relevant commands in the iSCSI queue as non-deliverable to a
> LUN or all LUNs or keep a stack of barriers for active aborts (I assume
> that this is a reasonably low number) to the same effect.
>
> Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 18/04/2001 01:10:17
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   Sandeep Joshi <sandeepj@research.bell-labs.com>
> cc:   ips@ece.cmu.edu
> Subject:  Re: aborting an out of sequence cmdSN
>
> I'd think option 1 is the simplest (with the caveat that the task mgmt
> PDU referred to is the Abort Task.) and only impacts the affected
> command/task.
>
> Pierre Labat and I asked for this 4 months ago. (See :
> http://ips.pdl.cs.cmu.edu/mail/msg02958.html). The concept of connection
> allegiance should be extended to include the Abort Task. Also,
> connection allegiance should apply to the task (which spans multiple
> commands in the case of linked commands.), allowing for a deterministic
> clean up of stale PDUs of the task through the use of Abort Task.
>
> - Santosh
>
> Sandeep Joshi wrote:
>
> >
> > So our options for abort_task boil down to..
> > (1) use connection allegiance for TASK MGMT PDU.
> > (2) reject all commands prior to cmdSN of TASK MGMT PDU.
> > (3) cmdSN of original task is sent with TASK MGMT PDU and
> >     target at the iSCSI layer keeps state.
> > (4) iSCSI initiator retains state for deleted tasks to ensure
> >     that R2T/Scsi Responses are appropriately handled.





From owner-ips@ece.cmu.edu  Thu Apr 19 04:14:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA08566
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 04:14:28 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3J76nE13821
	for ips-outgoing; Thu, 19 Apr 2001 03:06:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3J76XA13813
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 03:06:34 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id JAA302992
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 09:06:23 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id JAA157666
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 09:06:20 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A33.00270344 ; Thu, 19 Apr 2001 09:06:07 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A33.0026DB18.00@d12mta02.de.ibm.com>
Date: Thu, 19 Apr 2001 09:09:32 +0200
Subject: Re: iSCSI : New PDU opcode usage in rev 5.92
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

On a given nexus the roles are static aren't they?

Julo

Santosh Rao <santoshr@cup.hp.com> on 19/04/2001 03:47:07

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   ips@ece.cmu.edu (ips)
cc:
Subject:  iSCSI : New PDU opcode usage in rev 5.92




Julian & All,

I've got a quick question on how the new opcode layouts would work for
dual mode scsi implementations. (i.e. initiators that responded in
target mode or targets that acted as initiators also).

The new opcode layout is :

----------------
X|I| | | | | | |
----------------
7 6 5 4 3 2 1 0

where bits 5-0 -> opcode
X -> retry bit
I -> immediate bit

The same values are used for the command as well as response opcodes and
bits X & I are intended to both be set to 1 by targets.

i.e. opcode for scsi command = scsi response = 0x01. the distinction b/n
command and response is based on targets setting X & I bits to 1.

Now, if an initiator [capable of target mode] sent the following
commands, how would they be interpreted :

1) 0xc4.
is this a text command being retried in immediate mode,
or is it a text response ?

2) 0xc1
is this a scsi command being retried in immediate mode,
or is it a scsi response ?

3) 0xc2
is this a scsi task mgmt command being retried in immediate mode,
or is it a scsi task mgmt response ?

etc.....

- Santosh

--
#################################
Santosh Rao
Software Design Engineer,
HP, Cupertino.
email : santoshr@cup.hp.com
Phone : 408-447-3751
#################################





From owner-ips@ece.cmu.edu  Thu Apr 19 05:49:27 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA09451
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 05:49:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3J7mr715188
	for ips-outgoing; Thu, 19 Apr 2001 03:48:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3J7m4A15173
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 03:48:04 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id DAA28322;
	Thu, 19 Apr 2001 03:40:38 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id BAA247376;
	Thu, 19 Apr 2001 01:48:01 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
To: "Jon Hall" <jhall@emc.com>
Cc: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OF5A15214C.A418B059-ON88256A33.0028D2EC@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Thu, 19 Apr 2001 00:47:40 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/19/2001 01:48:00 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Jon,
You said:
...I've worked in this context (though it's been some years now).
It was true (at one time) that tape had a tractability limit, e.g.,
a tape backup of a terabyte was out of the question.  Has that changed?

Not knowing how far back you go, or what technology you were using: yes, I
guess it has changed.  Terabytes, and even more, are a common back-up size.
The current high-end tapes are really quite good, and their speed continues
to increase.  They are currently tracking Disk technology, maybe one half
to at most one generation behind in head technology, etc.

The disks are getting Larger and Larger, and the Tape technology has had to
track close to disk; the new LTO tapes continue this track.

Nightly backups of the Terabytes of data used by a Computing Center's
Servers are the rule.  And by the way, Tapes continue increasing their speed.
Of course they are not increasing at 10x, but understand that they are
being written in parallel by backup applications such as TSM (aka ADSM),
which write many tapes in parallel.  The Database of the backup
applications and the Tape Libraries have solved a lot of problems that we
used to have to deal with.


.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


"Jon Hall" <jhall@emc.com>@ece.cmu.edu on 04/09/2001 09:02:10 AM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"



"John Hufferd" writes:
>OK, if you go at it long enough you are punished with my two cents.

:-)

>We of course need some real numbers on what the probability of the CRC
>detected error, when TCP does not detect it.
>
>Given the fact that we do not have that information,  I could only just
>use some of the numbers that have been kicking around on this thread.
>
>A Billion is NOT a large number, especially when we are talking about 10
>Gigabit Links ( vendors sampling 10 Gigabit/sec HBAs, next year, some
>shipping them in general availability (GA) in that year, and the rest in
>2003.  And yes I also got information of a company that is currently
>developing  100 Gigabit Links.)  So when I looked at some of the numbers, I
>found that it meant that a link would see a failure about every twenty
>minutes, some went for 200 minutes, etc.

This is exactly why it's necessary to understand the flow.  In a tape
context is it OK to assume that the data flowing from the target to
the initiator is responses to cmds, and that only a part of that is
iSCSI headers with StatSNs?  If that's right, then run the numbers
against the flow (don't use my numbers, they are riddled with guesses).
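
One way to run such numbers is sketched below. Every input is an assumption
chosen for illustration (the link rate, a fixed 1500-byte packet size, and
one event per 10^9 or 10^10 packets); none of them is a measured value from
this thread, and the function name is hypothetical.

```python
def minutes_between_events(link_bits_per_sec, packet_bytes, packets_per_event):
    """Mean minutes between events on a fully loaded link.

    packets_per_event: on average, one event occurs per this many packets.
    """
    packets_per_sec = link_bits_per_sec / (packet_bytes * 8)
    return packets_per_event / packets_per_sec / 60.0

# Assumed inputs: saturated 10 Gigabit/sec link, 1500-byte packets.
for packets_per_event in (1e9, 1e10):
    print(packets_per_event, minutes_between_events(10e9, 1500, packets_per_event))
```

With these assumed inputs a saturated 10 Gigabit link sees an event roughly
every 20 or 200 minutes; a flow that keeps the link far less busy stretches
those intervals in proportion.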

>I have one war story that might apply.  Years ago when we were first
>thinking about putting the small disk drives in our large storage
>controllers, we had folks calculate the Mean Time to Failure (MTF), of the
>various Desktop HD.   Some individual MTF numbers sounded large, for any
>given drive.  But then we computed the number of drives we would have in a
>large installation,  it ended up that we would have a drive failure at
>least every day.  (Thankfully, we had significantly better MTF numbers in
>the drives that were actually used.)   So, the point is that sometimes
>these large numbers come back to bite you in ways you had not considered at
>first, when you think about it in a large installation.
>
>OK, back to the thread.
>
>Now I see sites all the time with 10s to 100s of Tape Units, and these
>units.  In many cases this will mean that there will be a tape unit failure
>that causes the critical  backup job to fail, somewhere on the computing
>room floor, about every 2, 20, or 200 min.  This is a major impact on a
>computing center that must process hundreds of backup each day.

Exactly, I've worked in this context (though it's been some years now).
It was true (at one time) that tape had a tractability limit, e.g.,
a tape backup of a terabyte was out of the question.  Has that changed?

>Therefore, those of you that think you are talking about  very rare events,
>should at least compute the 10 Gigabit/second Rates, and then the number of
>paths, etc. that might be in an enterprise installation, and then state how
>often a computing center will see such an event.  When many of these things
>are done at night with unattended operations, these can be a significant
>issue.  If it is probable that only one failure will occur per night, then
>you are certain that when a disaster does occur,  they will not have a
>valid backup over some amount of the data.
>
>OK, I am not saying who's right or wrong here, but just that some of the
>numbers I have heard, on this thread are not that impressive when looked at
>with a 10 Gigabit/sec links and many paths.  (Let alone the future 100
>Gigabit/sec links)  (Oh by the way, remember a 10 Gigabit link is really a
>20 Gigabit link when you factor in full duplex.)
>
>So it might be useful, for the Rare Event folks to do the calculations on
>their numbers and tell us what they mean in terms of Minutes between
>failures on 10 Gigabit links.  Then the rest of us can compute our own
>picture on how many links we will probably have in our installation.

But why does the fact that we may someday run at 10 gig change the
question?  Is there some reason to believe that at 10 gig the nature
of a tape flow has changed?  You could certainly have more flows, but
the number of packets per flow will not increase.  The speed of tape
access won't change.  You could do more tapes simultaneously, but
you still have the tractability of handling large numbers of tapes.

As an aside, is a "Rare Event" person, like a flat-earth person?
(I want to get my role right :-).

-Jon





From owner-ips@ece.cmu.edu  Thu Apr 19 05:52:10 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA09480
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 05:52:08 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3J6AlC11843
	for ips-outgoing; Thu, 19 Apr 2001 02:10:47 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3J6AXA11836
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 02:10:33 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id IAA114582;
	Thu, 19 Apr 2001 08:07:49 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta05_cs0 [9.165.222.239])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id IAA90832;
	Thu, 19 Apr 2001 08:07:48 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A33.0021ABE5 ; Thu, 19 Apr 2001 08:07:46 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Jonathan Stone <jonathan@dsg.stanford.edu>
cc: Randall Stewart <rrs@cisco.com>,
        "WENDT,JIM (HP-Roseville, ex1)" <jim_wendt@hp.com>, ips@ece.cmu.edu,
        tsvwg@ietf.org, "'Craig Partridge'" <craig@aland.bbn.com>,
        Jonathan Wood <Jonathan.Wood@sun.com>, xieqb@cig.mot.com,
        Jonathan Stone <jonathan@dsg.stanford.edu>
Message-ID: <C1256A33.0021AA0A.00@d12mta05.de.ibm.com>
Date: Thu, 19 Apr 2001 08:12:48 +0200
Subject: Re: [Tsvwg] [SCTP checksum problems]
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Jonathan,

Thanks for your comments.   We are aware that we really don't know what the
error model for the end to end transport is and we took a conservative
approach  - we do an end-to-end  data check above what TCP offers and not
aligned with the TCP packets.

We assume that storage boxes will be better built than other middle boxes
and hardware accelerators in the endpoints will not cause trouble.

For a completely garbled link we think that a combination of a good CRC
and format checks will keep us from passing around corrupted data and solid
recovery mechanisms will keep us from failing the QoS expected.

The one question that we could not get any decent answer to is - how  would
mechanisms other than CRC perform - mainly how will a connection protected
by cryptographic authenticators perform on connections with errors.

Regards,
Julo

Jonathan Stone <jonathan@dsg.stanford.edu> on 18/04/2001 17:50:24

Please respond to Jonathan Stone <jonathan@dsg.stanford.edu>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   Randall Stewart <rrs@cisco.com>, "WENDT,JIM (HP-Roseville,    ex1)"
      <jim_wendt@hp.com>, ips@ece.cmu.edu, tsvwg@ietf.org, "'Craig
      Partridge'" <craig@aland.bbn.com>, Jonathan Wood
      <Jonathan.Wood@sun.com>, xieqb@cig.mot.com, Jonathan Stone
      <jonathan@dsg.stanford.edu>
Subject:  Re: [Tsvwg] [SCTP checksum problems]





Julian,

I skimmed your i-d late last night.

I have not gone through the analysis of different CRCs. I'd like to
compare it to Raj Jain's analysis of the IEEE 802 CRC-32 in
http://www.cis.ohio-state.edu/~jain/papers/xie1.html; which I think
speaks to the single-bit-error point Craig has already raised.

The question I'd raise is a more fundamental one: whether link-level
bit and burst error rates are the appropriate model for an Internet
transport-level sum in the first place.

Craig Partridge and I examined that in our SIGCOMM 2000 paper.
We monitored packets at a number of points in the Internet, and looked
for packets with checksum mismatches-- packets where recomputing the
checksum did not match the content of the checksum field.  We also
looked for transport-level (TCP) retransmissions of the damaged
packets.

Let's  call packets where a recomputation of the TCP (or UDP) checksum
does not match the contents of the checksum field a "mismatch".
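
A mismatch in this sense can be sketched with the standard 16-bit
ones-complement Internet checksum (as described in RFC 1071). The sketch
below works on a bare byte string and leaves out the TCP pseudo-header and
the position of the checksum field within a real segment; the function names
are illustrative, not taken from the paper.

```python
def internet_checksum(data):
    """16-bit ones-complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"                           # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF

def is_mismatch(data, received_checksum):
    """True if recomputing the checksum does not give the received value."""
    return internet_checksum(data) != received_checksum
```

For instance, `is_mismatch(payload, checksum_from_header)` returns True for
exactly the packets counted as mismatches here, given a payload and header
value extracted by some capture tool (both names are placeholders).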

We observed mismatch rates of roughly 1 in 4,000 (average); best-case
around 1 in 30,000.  That's 5 or 6 orders of magnitude higher than the
link-level error rates you cite.  By comparing the checksum-mismatches
against a TCP-level retransmission, we were able to estimate how much
damage occurred to the mismatches.  Keep in mind that these errors were
caught at the TCP layer: they have already passed a link-level CRC
check, usually the 802.3 CRC-32. The very high observed error rate
suggests that these errors occur outside the protection of the
MAC-layer CRC.


For the iSCSI analysis, a fair synopsis is that half the packets were
so thoroughly curdled we couldn't even guess at what caused the
damage. There are more details in the SIGCOMM paper.
(There, I focused more on analyzing how the standard TCP sum would fare.
I am in the midst of recomputing total burst length and hamming
distances, for a polynomial-xor description of the errors rather
than the `minimum edit distance'.)

That characterization of errors is very different to the independent
bit-error and correlated-burst-error models used in the ID.


I think our data supports three conclusions relevant to this
discussion. (You may of course disagree.)

First, the Internet contains a variety of error sources above the
MAC-level: between two MAC-layer interface cards inside a router; or
inside an end-host, between its MAC-layer card and its TCP (or SCTP,
or UDP, or other transport protocol).

Second, errors from these sources occur at rates several
orders of magnitude higher than current link-level errors.

Third, the damage done by these error sources  just does not match
the individual-bit/ single-burst model common in coding theory,
and often used to characterize link errors.

While we did observe a (very) few single-bit errors and short bursts,
we also observed a lot of much longer bursts. Approximately half the
damaged packets were so thoroughly curdled that more than half the
bytes were incorrect.  (We also found similar rates and patterns in
packet traces from Vern Paxson; those are included in our SIGCOMM 2000
paper.)

It may be helpful to think of *some* of these errors as due to (for
example) a single-bit error affecting a DMA pointer: flipping an
address bit can cause a large change in the data stream going to or
from the network interface. If the bit position flipped is high
enough, it could even skip to another packet altogether.


It's not clear how the analysis and conclusions in your draft stand up
if, instead of a link-level single-bit/burst error model, we
substitute the error characteristics and rates we observed in
`in the wild' Internet traffic-- that is, error rates some 5 or
6 orders of magnitude higher, and where the errors cause either multiple
bursts per packet, or (if modeled as a single polynomial) vary from a few
dozen bits, up to a substantial fraction of the packet length.

The order-of-magnitude changes in error rate will obviously have an
impact.  I haven't thought in detail about whether the conclusions
about specific CRC polynomial choices hold up.


One final point is the computational cost of software CRCs.

If you buy our conclusion that the Internet contains very significant
error sources outside of "network interface cards", then outboard
acceleration of either checksums or CRCs is somewhat suspect:
error checks done inside the network card simply don't cover those
error sources.  Software CRC calculations are typically much slower than
either ones-complement, Fletcher, or Adler sums: Dave Feldmeier's
paper suggests roughly four times slower, for a total 32-bit check,
even for generator polynomials selected to minimize nonzero
coefficients (i.e., few taps).  For the IEEE 802 CRC, it's
faster to do a table-lookup, which is slower again.
I don't know whether the iSCSI community has considered that
issue, or where they/you stand on it.
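
For readers unfamiliar with the table-lookup approach mentioned above, here
is a minimal sketch of a byte-at-a-time, table-driven CRC-32 using the IEEE
802.3 polynomial in its reflected form (0xEDB88320). It is meant to show the
structure of the computation only; it says nothing about speed relative to
the checksums discussed, and the names are illustrative.

```python
def make_crc32_table():
    """Precompute the 256-entry table for the reflected polynomial 0xEDB88320."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            # Shift right one bit; XOR in the polynomial if a 1 bit fell off.
            crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        table.append(crc)
    return table

CRC32_TABLE = make_crc32_table()

def crc32(data):
    """CRC-32 of data: one table lookup, one shift and one XOR per byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ CRC32_TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF
```

The per-byte work is a memory lookup plus a shift and an XOR, compared with
a single add per 16-bit word for the ones-complement sum.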





From owner-ips@ece.cmu.edu  Thu Apr 19 05:52:41 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA09502
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 05:52:39 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3J7bo014840
	for ips-outgoing; Thu, 19 Apr 2001 03:37:50 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3J7anA14821
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 03:36:49 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id JAA264752;
	Thu, 19 Apr 2001 09:36:42 +0200
From: biran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id JAA182408;
	Thu, 19 Apr 2001 09:36:41 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A33.0029CC28 ; Thu, 19 Apr 2001 09:36:32 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Stephen Bailey <steph@cs.uchicago.edu>
cc: ips@ece.cmu.edu
Message-ID: <C1256A33.0029CB72.00@d12mta02.de.ibm.com>
Date: Thu, 19 Apr 2001 10:36:50 +0300
Subject: Re: [Tsvwg] [SCTP checksum problems]
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Steph,

> I am not disagreeing that we need an additional integrity check over
> TCP in the present target environment, but I do disagree that iSCSI
> will always need such a check, independently of what is running
> beneath it.

I guess this future is taken care of in the iSCSI draft by the ability
to negotiate "none" digests - so it will just be a configuration matter
once administrators become comfortable with network device reliability.

  Regards,
   Ofer


Ofer Biran
Storage and Systems Technology
IBM Research Lab in Haifa
biran@il.ibm.com  972-4-8296253





From owner-ips@ece.cmu.edu  Thu Apr 19 07:19:41 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id HAA09981
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 07:19:39 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3J9Drk18061
	for ips-outgoing; Thu, 19 Apr 2001 05:13:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3J9DhA18054
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 05:13:43 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id LAA198040
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 11:13:36 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id LAA30800
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 11:13:35 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A33.0032AA81 ; Thu, 19 Apr 2001 11:13:24 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A33.0032A8CC.00@d12mta02.de.ibm.com>
Date: Thu, 19 Apr 2001 11:18:13 +0200
Subject: Re: iSCSI & Linked Commands
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



According to your logic no FCP implementation can use linked commands?

Is this true for all OS's?  Is it a verified fact or folklore?  Is it so
also for the new MS StorPort driver?

Julo

Santosh Rao <santoshr@cup.hp.com> on 18/04/2001 20:07:39

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   santoshr@cup.hp.com
Subject:  Re: iSCSI & Linked Commands




julian_satran@il.ibm.com wrote:
>
> Santosh,
>
> iSCSI HBAs are being designed now and they will get a way to convey the
> tags.

Julian,

Perhaps I should make a better effort to come across more
clearly :-) The linked commands require task tags to be generated by the
SCSI ULP (which is an O.S. component, usually out of the control of HBA
vendors). Most O.S. SCSI ULPs do not generate task tags (or OX_IDs) but
leave this responsibility to the LLPs.

Hence, the fact that iSCSI HBAs are being designed now makes no
difference to this picture. The O.S. ULP implementations don't need to
be re-written for iSCSI (one would hope!) and therefore, O.S. ULP
architectures [that exist today] prevent the usage of linked commands.
(Also, the common feeling is that since linked commands are not used,
why change ULPs to do otherwise.)


> The question I asked myself when introducing a restriction was always not
> why not restricting but rather why restricting.

Targets need a reliable guarantee that once an initiator issues an Abort
Task for a task/command, the target will receive no further PDUs for that
task upon completion of the Abort Task.

In the absence of a mandate that forces initiators to comply with the
above, targets cannot reliably release & re-use their I/O resources since
they could get stale PDUs on that task later.

That is the reason I am requesting that initiators be mandated to apply
connection allegiance to their abort task.

> As for abort as linked command can
> exist only one at a time send the abort task wherever the current command
> is and don't initiate the next. Did I miss something?

I agree that linked commands treatment would be no different than normal
commands. IOW, connection allegiance per command should suffice, as long
as the abort task is included within its purview.


> The tags will be needed for recovery as well (I know that you think that
> isn't necessary either!).

On the contrary, I do believe that task tags need to be sent in the
Abort Task and this should not be an issue. The LLP is aware of the I/O
that timed out and its task tag, and it generates the Abort Task PDU and
populates the task tag. This is standard practise and nothing is
different here for iSCSI.

Regards,
Santosh





From owner-ips@ece.cmu.edu  Thu Apr 19 09:27:21 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id JAA10741
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 09:27:20 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JBnxx03686
	for ips-outgoing; Thu, 19 Apr 2001 07:49:59 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JBn2A03618
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 07:49:02 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3JCux121009
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 05:56:59 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Ips" <ips@ece.cmu.edu>
Subject: iSCSI: Serialization and Acknowledgement
Date: Thu, 19 Apr 2001 04:47:07 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJMEIGCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: multipart/mixed;
	boundary="----=_NextPart_000_0000_01C0C88B.C9CF76C0"
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.

------=_NextPart_000_0000_01C0C88B.C9CF76C0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

As I-D will not be processed until April 23 and time is short, please
forgive this posting.

I attempted to provide an overview of my concerns regarding the handling of
serialization.  It is not longer than some of my prior posts.

------=_NextPart_000_0000_01C0C88B.C9CF76C0
Content-Type: text/plain;
	name="draft-otis-iscsi-fullack-00.txt"
Content-Disposition: attachment;
	filename="draft-otis-iscsi-fullack-00.txt"
Content-Transfer-Encoding: quoted-printable

   Internet Draft                                 Douglas Otis
   Expires October 2001                           SANlight
                                                  April 19, 2001

                   iSCSI Full Acknowledgement
                 draft-otis-iscsi-fullack-00.txt

   Status of this Memo


   This document is an Internet-Draft and is in full conformance
   with all provisions of Section 10 of RFC2026 [1].

   Internet-Drafts are working documents of the Internet
   Engineering Task Force (IETF), its areas, and its working
   groups.  Note that other groups may also distribute working
   documents as Internet-Drafts. Internet-Drafts are draft
   documents valid for a maximum of six months and may be
   updated, replaced, or made obsolete by other documents at any
   time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt
   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   Abstract

   This document is illustrative of potential modifications to
   the iSCSI protocol proposal (draft-ietf-ips-iscsi-05+.txt).
   These changes are to create a means to do the following:

    - Ensure Management response is coherent.
    - Acknowledge ALL requests delivered to the Server.
    - Ensure integrity of the iSCSI request window.
    - Open request window during abnormal events.
    - Quickly eliminate invalidated requests.
    - Quickly expunge sequence holes.
    - Simplify the reception sequencer.

  draft-otis-iscsi-fullack-00.txt                     Page [2]

   Problem Statement:

   The iSCSI Service Delivery Subsystem provides a means of
   exchanging Device Service and Task Management requests and
   their associated data and responses between Client and
   Server.  The Service Delivery Subsystem is assumed to provide
   error-free requests and responses between the Client and
   Server.  (See SCSI Architecture Model-2.)

   This model assumes that the Service Delivery Subsystem
   enforces state synchronization transparently to the Server.
   The iSCSI Service Delivery Subsystem is also assumed able to
   provide sequential delivery between the Client's and Server's
   Service Delivery Ports.  Although the SAM-2 model assumes
   sequential delivery, the iSCSI protocol extends this to be a
   requirement to minimize complexity.  The Service Delivery
   Subsystem must also ensure responses to Task Management
   requests are delivered within the sequence of Server
   responses, or Task state information becomes corrupt.

   iSCSI uses the TCP protocol as a transport that, in general,
   ensures reliable and sequential exchanges.  Much of the
   complexity found in adhering to the sequential delivery
   requirement is created through the use of multiple TCP
   connections.  Multiple connections allow increased
   reliability and capacity through multiple paths or adapters
   and are not intended to capture increased network bandwidth.
   Because these connections are expected to traverse different
   physical equipment, their relative latency is expected to
   diverge.

   To ensure sequential delivery, all Client requests are
   serialized session wide with connection allegiance between
   Client request and Server response.  An exception to this
   serialization and sequential delivery is made for requests
   that are to be presented to the Server ahead of requests
   contained within the Service Delivery Subsystem.  If limited
   to a single instance, the present pending provision of using
   a flag and non-incremented serialization rather than null
   serialization allows the identification of this request
   relative to requests on differing connections.  If successive
   ahead-of-sequence requests are limited to the same connection
   as well as the subsequent normal request carrying the same
   serialization, then these requests' relative positions can
   also be determined.

   Serialization of requests session wide provides two
   functions.  First, it allows simple detection of requests
   that may have been repeated.  The underlying mechanism of TCP
   is connectionless IP and, as a result, does not provide an
   indication of communication loss.  TCP will eventually detect
   communication loss perhaps well after iSCSI attempts
   corrective action.  Second, session wide request
   serialization allows for sequential delivery to the Server as
   well as timely acknowledgement of reception.


   In the event of the Service Delivery Subsystem attempting
   corrective action, the suspected connection is terminated by
   a new TCP connection doing a Login Restart using the same
   connection ID while also acknowledging the response
   serialization, and the prior connection allegiance is
   transferred to this new connection.  Potential problems arise
   as the Service Delivery Subsystem does not acknowledge
   ahead-of-sequence requests and, if there are successive
   ahead-of-sequence requests, repeated requests cannot be
   determined during corrective action without examining Client
   Tags.

   There is not necessarily a one-to-one relationship between
   Task Management and affected tasks.  This creates a state
   synchronization problem as the connections returning to the
   Client are independently serialized.  The Task Management
   response may be seen out of context to Server responses as a
   result.  Task Management requests are identified to the
   Service Delivery Subsystem and will allow for special
   handling.

   Sequential delivery to the Server with a request window
   offers an additional problem.  The iSCSI Service Delivery
   Subsystem combines Logical Units.  Task Management is
   generally limited to one outstanding request per Logical
   Unit, but there is only a provision for one additional Task
   Management request if flagged ahead-of-sequence.  As a
   result, successive Task Management requests will carry the
   same serialization, and at least enough spare resources must
   be set aside to accommodate requests for the number of
   Logical Units.

   Sequential delivery potentially offers another problem
   depending on Logical Unit hierarchies and related delivery
   structures.  If the Logical Unit is a simple flat model, then
   delivery may be stopped by lack of associated resources
   together with a busy unit.  If the acknowledgement returned
   by the Service Delivery Subsystem is for delivery, then
   acknowledgement stops until resources become freed.

   The event created by an ahead-of-sequence or Task Management
   request will likely invalidate requests in-transit within the
   Service Delivery Subsystem.  The quantity and latency of this
   in-transit request queue may be problematic for applications
   that are not likely to anticipate this unusual situation.
   The Logical Unit must enter into an ACA condition to reject
   these requests that may extend beyond normal fabric timeouts.
   As iSCSI may include various SCSI models, this inter-locking
   mechanism to purge in-transit requests may not exist.


   Solutions and Benefits:

   To ensure proper context of a response to a Task Management
   request, it must not appear before prior Server responses.
   Server response serialization can be changed to session wide
   in the same manner as Client requests.  The benefit is Server
   resources can be freed without a response directed
   specifically to each connection.  Logging of Server responses
   can be compiled in a coherent fashion.  A connection failure
   becomes apparent across all connections at the Client.  This
   is important as the Client is expected to initiate recovery
   action.


   As ahead-of-sequence requests, which are most likely Task
   Management requests, do not increment the request
   serialization, these requests are without Service Delivery
   Subsystem acknowledgement and are without a simple sequential
   sorting variable.  With the exception of the sorting problem
   solved by examining the Client Tag, this technique keeps the
   sequencer handling of these events simple, but another
   technique would afford an even greater level of simplicity.
   The side effect of these requests is the likely invalidation
   of many in-transit requests.  Instead of creating a special
   case for ahead-of-sequence request serialization, treat these
   requests in the same manner as all other requests.  The
   mechanism to advance these requests, however, is to reject
   all prior pending non-ahead-of-sequence requests back to the
   Client.

   This has the advantage of instantly opening up the request
   window to its maximum.  No additional set-aside resources
   need to be allotted to handle ahead-of-sequence requests.
   The reject status could either indicate the range of requests
   rejected or each request could be individually rejected such
   that the Client is then freed to either purge or retry those
   requests as required.  It places no expectations on the
   Service Delivery Subsystem to interpret the nature of SCSI
   requests treated in this fashion.  It also ensures a timely
   removal of enqueued requests well within typical fabric
   timeouts.
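
   As a rough illustration of the range form of this rejection,
   the C sketch below removes every pending request that precedes
   an ahead-of-sequence request and reports the rejected range.
   All names (reject_range, sn_before, reject_prior), the 32-bit
   serial width and the flat array of pending serial numbers are
   assumptions made for illustration; none of them comes from the
   iSCSI draft.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the requests rejected back to the Client. */
struct reject_range {
    bool     any;    /* false when nothing was pending before ahead_sn */
    uint32_t first;  /* earliest rejected serial number */
    uint32_t last;   /* latest rejected serial number */
    size_t   count;  /* number of requests rejected */
};

/* True when serial a precedes serial b, allowing for 32-bit wrap-around. */
static bool sn_before(uint32_t a, uint32_t b)
{
    return a != b && (uint32_t)(b - a) < 0x80000000u;
}

/*
 * Reject every pending request that precedes the ahead-of-sequence
 * request ahead_sn.  pending[] holds the serial numbers of requests not
 * yet delivered; survivors are compacted to the front of the array and
 * *n_pending is updated to the number that remain.
 */
struct reject_range reject_prior(uint32_t *pending, size_t *n_pending,
                                 uint32_t ahead_sn)
{
    struct reject_range r = { false, 0, 0, 0 };
    size_t kept = 0;

    for (size_t i = 0; i < *n_pending; i++) {
        uint32_t sn = pending[i];
        if (sn_before(sn, ahead_sn)) {
            if (!r.any || sn_before(sn, r.first))
                r.first = sn;
            if (!r.any || sn_before(r.last, sn))
                r.last = sn;
            r.any = true;
            r.count++;
        } else {
            pending[kept++] = sn;   /* not invalidated, keep it queued */
        }
    }
    *n_pending = kept;
    return r;
}
```

   Reporting a single range keeps the reject status small; the
   per-request alternative mentioned above would instead emit one
   reject for each serial number counted here.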

   This rejection technique can also simplify the recovery of a
   terminated connection, as the failed connection serialization
   does not need to be recalled for recovery nor are timeouts
   required to discover the sequence holes created as a result
   of the connection termination.  This rejection technique also
   maintains the integrity of the iSCSI request window.  The
   technique removes the potentially sizeable amount of resources
   that must be set aside otherwise.

   If the sequencer, a term for the sequencing function within
   the Server side of the Service Delivery Subsystem, was unable
   to deliver a request, sending an over-ride of this request
   would create uncertainty, as it would be unknown whether
   progress continued as a result of the prior request being
   accepted or whether the over-ride had taken effect, without
   explicit status indicating such.  The rejection structure is
   already in place to allow the sequencer to simply be advanced
   to the point of this ahead-of-sequence request.


   The sequencer process could look something like the
   following:

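   /* Serial numbers are unsigned, modulo 2^SERIAL_BITS.  A difference
      larger than half the serial space means request_SN is behind
      next_request_SN, so the request has been invalidated. */
   if ((request_SN - next_request_SN) > (1u << (SERIAL_BITS - 1)))
        {
        reject_pdu(request_SN, SEQUENCER_INVALIDATION);
        }

   /* In-sequence request: deliver it and advance the sequencer. */
   if (request_SN == next_request_SN)
        {
        send_pdu(request_SN);
        next_request_SN++;
        }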

   Upon receipt of a request flagged ahead-of-sequence, the
   'next sequence' value immediately becomes the serialization
   of this request, and ExpCmd advances to this value plus one.
   Rather than being silently discarded, as is now defined,
   requests that fall behind this value should be rejected back
   to the Client.  Unexpected rejections would be an indication
   of nefarious spoofing attempts or a software bug.
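
   The behaviour described above might be sketched in C roughly
   as follows.  This is only one reading of the text: the type
   and function names (sequencer, seq_action, sequencer_step),
   the fixed 32-bit serial width, and the choice to reject an
   ahead-of-sequence request whose serial number is already
   behind the sequencer are all assumptions of this sketch, not
   statements from the iSCSI draft.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome of presenting one request to the sequencer. */
enum seq_action { SEQ_DELIVER, SEQ_REJECT, SEQ_HOLD };

struct sequencer {
    uint32_t next_request_SN;  /* next serial number to deliver */
    uint32_t exp_cmd;          /* value reported to the Client as ExpCmd */
};

/* True when serial a precedes serial b, allowing for 32-bit wrap-around. */
static bool sn_before(uint32_t a, uint32_t b)
{
    return a != b && (uint32_t)(b - a) < 0x80000000u;
}

enum seq_action sequencer_step(struct sequencer *s, uint32_t request_SN,
                               bool ahead_of_sequence)
{
    /* Behind the sequencer: reject back to the Client, never drop silently. */
    if (sn_before(request_SN, s->next_request_SN))
        return SEQ_REJECT;

    /* An ahead-of-sequence request pulls 'next sequence' up to itself. */
    if (ahead_of_sequence)
        s->next_request_SN = request_SN;

    if (request_SN == s->next_request_SN) {
        s->next_request_SN++;             /* wraps modulo 2^32 */
        s->exp_cmd = s->next_request_SN;  /* this request's serial plus one */
        return SEQ_DELIVER;
    }

    /* A gap remains before this request, so it waits for earlier ones. */
    return SEQ_HOLD;
}
```

   Requests that were held and are then overtaken by an
   ahead-of-sequence request come back as SEQ_REJECT the next
   time they are presented, which matches the intent that the
   Client always learns the fate of each request.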

   One unsatisfactory alternative would be a redefinition of
   Service Delivery Subsystem acknowledgement to indicate point
   of sequential reception without actual delivery.  This would
   then create a problem of again having a large quantity of
   enqueued requests but now beyond even the ability to remove
   these requests with an ahead-of-sequence flag.

   Using the ahead-of-sequence flag to create a response that
   indicates the range of commands rejected or rejection on an
   individual basis ensures the state of the Server can be
   quickly ascertained well within a fabric timeout.  This
   allows quick recovery of a connection termination, a Logical
   Unit hang condition, a flushing of invalidated requests, and
   an instant opening of the request window while still enabling
   the iSCSI flow control mechanism.  This technique also
   ensures all requests are provided a timely acknowledgement by
   the Service Delivery Subsystem as requests are delivered.


   Author's Address:

         Douglas Otis
         SANlight Inc.
         160 Saratoga Ave, #40
         Santa Clara, CA 95051
         Tel: (408) 260-1400 x2
         dotis@sanlight.net

   Full Copyright Statement
   Copyright (C) The Internet Society (2000).
     All Rights Reserved.

   This document and translations of it may be copied and
   furnished to others, and derivative works that comment on or
   otherwise explain it or assist in its implementation may be
   prepared, copied, published and distributed, in whole or in
   part, without restriction of any kind, provided that the
   above copyright notice and this paragraph are included on all
   such copies and derivative works.

   However, this document itself may not be modified in any way,
   such as by removing the copyright notice or references to the
   Internet Society or other Internet organizations, except as
   needed for the purpose of developing Internet standards in
   which case the procedures for copyrights defined in the
   Internet Standards process must be followed, or as required
   to translate it into languages other than English.

   The limited permissions granted above are perpetual and will
   not be revoked by the Internet Society or its successors or
   assigns.  This document and the information contained herein
   is provided on an "AS IS" basis and THE INTERNET SOCIETY AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY
   THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY
   RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
   FITNESS FOR A PARTICULAR PURPOSE.


------=_NextPart_000_0000_01C0C88B.C9CF76C0--



From owner-ips@ece.cmu.edu  Thu Apr 19 09:32:36 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id JAA10836
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 09:32:35 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JC10p04213
	for ips-outgoing; Thu, 19 Apr 2001 08:01:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from maile.surrey.ac.uk (maile.surrey.ac.uk [131.227.102.10])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3JBU2A02883
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 07:30:02 -0400 (EDT)
Received: from regan.ee.surrey.ac.uk by maile.surrey.ac.uk with SMTP Local (PP)
          with ESMTP; Thu, 19 Apr 2001 12:29:28 +0100
Date: Thu, 19 Apr 2001 12:29:27 +0100 (BST)
From: Lloyd Wood <L.Wood@surrey.ac.uk>
X-Sender: eep1lw@regan.ee.surrey.ac.uk
Reply-To: Lloyd Wood <L.Wood@surrey.ac.uk>
To: Stephen Bailey <steph@cs.uchicago.edu>
cc: ips@ece.cmu.edu, tsvwg@ietf.org
Subject: Re: [Tsvwg] [SCTP checksum problems]
In-Reply-To: <20010418221034.DA73F94006@sandmail.sandburst.com>
Message-ID: <Pine.GSO.4.21.0104191223250.9077-100000@regan.ee.surrey.ac.uk>
Organization: speaking for none
X-url: http://www.ee.surrey.ac.uk/Personal/L.Wood/
X-no-archive: yes
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

On Wed, 18 Apr 2001, Stephen Bailey wrote:

> An integrity check is not necessary as long as some lower layer
> provides adequate integrity guarantees.

That is the complete antithesis of the end-to-end argument. And you
earlier discussed middleboxes, where the information needed for such
guarantees can be faked...

Reality check, please.

L.

iSCSI. iNSANE.

<L.Wood@surrey.ac.uk>PGP<http://www.ee.surrey.ac.uk/Personal/L.Wood/>




From owner-ips@ece.cmu.edu  Thu Apr 19 09:32:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id JAA10837
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 09:32:35 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JB7xU02165
	for ips-outgoing; Thu, 19 Apr 2001 07:07:59 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JB7hA02155
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 07:07:44 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id NAA251034;
	Thu, 19 Apr 2001 13:05:27 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta05_cs0 [9.165.222.239])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id NAA87040;
	Thu, 19 Apr 2001 13:05:27 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A33.003CE803 ; Thu, 19 Apr 2001 13:05:15 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Stephen Bailey <steph@cs.uchicago.edu>
cc: "CAVANNA,VICENTE V (A-Roseville,ex1)" <vince_cavanna@agilent.com>,
        "'WENDT,JIM (HP-Roseville,ex1)'" <jim_wendt@hp.com>, ips@ece.cmu.edu,
        "'Craig Partridge'" <craig@aland.bbn.com>,
        Jonathan Wood <Jonathan.Wood@sun.com>, xieqb@cig.mot.com,
        Jonathan Stone <jonathan@dsg.stanford.edu>,
        Randall Stewart <rrs@cisco.com>
Message-ID: <C1256A33.003CE64E.00@d12mta05.de.ibm.com>
Date: Thu, 19 Apr 2001 13:09:19 +0200
Subject: Re: [Tsvwg] [SCTP checksum problems]
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Steph,

You may want to add that one of the reasons for having an iSCSI integrity
check is to enable iSCSI PDU handling by middle boxes (separate header and
data digests).
And the integrity check can be made tamper-proof to a greater extent by
changing from an error-detection digest (CRC) to a full-fledged
authentication digest (that option exists in iSCSI too).

As for the transience of the bad middle boxes - hard to believe - new ones
are born every day now (with more software!) -:)

Julo



Stephen Bailey <steph@cs.uchicago.edu> on 19/04/2001 00:09:02

Please respond to Stephen Bailey <steph@cs.uchicago.edu>

To:   "CAVANNA,VICENTE V (A-Roseville,ex1)" <vince_cavanna@agilent.com>
cc:   "'WENDT,JIM (HP-Roseville,ex1)'" <jim_wendt@hp.com>, Julian
      Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu, tsvwg@ietf.org, "'Craig
      Partridge'" <craig@aland.bbn.com>, Jonathan Wood
      <Jonathan.Wood@sun.com>, xieqb@cig.mot.com, Jonathan Stone
      <jonathan@dsg.stanford.edu>, Randall Stewart <rrs@cisco.com>
Subject:  Re: [Tsvwg] [SCTP checksum problems]




Vince,

> I don't think iSCSI can be completely relieved of performing some data
> integrity checking as long as there exists the possibility of "middle
> boxes" opening up the transport protocol's packet and thus potentially
> invalidating any reliability guarantees the transport protocol makes.

Any protection provided against this failure mode will only be
transient, so we must temper the desire to introduce such a
requirement with reality.

Middleboxes can just as easily open up to the iSCSI layer and tinker
with the payload, as they do with other ULPs running on TCP (e.g. HTTP)
today.  Short of securing the connection, there is ALWAYS a
possibility of a middlebox terminating and reoriginating an integrity
check.  In case you think this is a farfetched scenario, I do get the
impression that there is a high level of interest in `actively
middling' iSCSI once the specs crystallize.  Who shaves the barber?

An integrity check is not necessary as long as some lower layer
provides adequate integrity guarantees.

Adding an integrity check above the transport layer is based upon
documentation of the presence of a lot of crappy network hardware and
software and analyses of the transport integrity check (TCP checksum)
which suggest it might not be adequately strong against some such
observed errors.

I claim that the high incidence of `broken' (corruption introducing)
components is a result of a variety of factors which have shaped the
development of network components thus far.  The fact that integrity
checks are assumed to be performed in a network context substantially
lowers the bar for implementation correctness.

In a storage (or CPU) context, these types of implementation errors
are a) more easily detectable (more fatal) b) more carefully avoided
during implementation (because of the cost of a potential fatal
error).  If network components magically reached the same `quality
level' as storage and CPU components, there might be no justification
for additional integrity checks above the transport.  Similarly if the
transport (or whatever lower layer) integrity checks are very strong
(e.g. IPSec), there is, again, no need for a higher level integrity
check.

I am not disagreeing that we need an additional integrity check over
TCP in the present target environment, but I do disagree that iSCSI
will always need such a check, independently of what is running
beneath it.

Steph





From owner-ips@ece.cmu.edu  Thu Apr 19 10:51:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id KAA11986
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 10:51:36 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JCj0g05993
	for ips-outgoing; Thu, 19 Apr 2001 08:45:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JChxA05971
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 08:43:59 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id OAA120784
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 14:43:54 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id OAA40090
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 14:43:54 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A33.0045EDCA ; Thu, 19 Apr 2001 14:43:48 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A33.0045EC6F.00@d12mta02.de.ibm.com>
Date: Thu, 19 Apr 2001 14:48:04 +0200
Subject: draft version 06 available at my site
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk





Dear colleagues,

I've just placed 06 at http://www.haifa.il.com/satran/ips

and submitted it to ID.

Recovery is still "work in progress" and probably won't make it to Nashua.

Only minor changes vs. 05-92:

At David Black's suggestion I've added an explanation about how to use CmdSN
with task management (in chapter 7).

InDataOrder is now DataOrder - media that requires order will get it.

Some typos fixed (surely not all -:))

Regards,
Julo






From owner-ips@ece.cmu.edu  Thu Apr 19 11:42:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA13034
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 11:42:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JDp2x09295
	for ips-outgoing; Thu, 19 Apr 2001 09:51:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JDoIA09235
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 09:50:18 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id JAA15614; Thu, 19 Apr 2001 09:50:07 -0400 (EDT)
Message-ID: <3ADEECB6.3E083E77@cisco.com>
Date: Thu, 19 Apr 2001 08:48:38 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: Stephen Bailey <steph@cs.uchicago.edu>
CC: ips@ece.cmu.edu
Subject: Re: Some questions about naming (newbie)
References: <002101c0c80b$6e057220$320114ac@ntwlassie>  <3ADD9DF5.FD7711DF@cisco.com> <20010419133534.43DA69400A@sandmail.sandburst.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Steven-

The smaller targets (such as the individual Seagate drives) do support
LUN WWN.  However, the large storage arrays do not support LUN WWN
as a way to correlate different paths to the same LU.

--
Mark

Stephen Bailey wrote:
> 
> Mark,
> 
> >    However, most FC devices do not support the LUN WWN, and use
> >    serial numbers, device IDs, and other mechanisms to correlate
> >    them.
> 
> ?
> 
> Every FC target I recall checking this on reported what looked like a
> WWN.  Some (many?) ||SCSI targets seemed to do this too.  I know other
> formats are supported, like vendor ID+serial number, but I can't
> recall seeing a target which reported ONLY this and not a WWN as well.
> Am I all wet?
> 
> Thanks,
>   Steph

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Thu Apr 19 11:43:05 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA13067
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 11:43:04 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JDa3f08529
	for ips-outgoing; Thu, 19 Apr 2001 09:36:03 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JDZYA08509
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 09:35:34 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 43DA69400A
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 09:35:34 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: Some questions about naming (newbie) 
In-Reply-To: Message from Mark Bakke <mbakke@cisco.com> 
   of "Wed, 18 Apr 2001 09:00:21 CDT." <3ADD9DF5.FD7711DF@cisco.com> 
References: <002101c0c80b$6e057220$320114ac@ntwlassie>  <3ADD9DF5.FD7711DF@cisco.com> 
Date: Thu, 19 Apr 2001 09:34:01 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010419133534.43DA69400A@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Mark,

>    However, most FC devices do not support the LUN WWN, and use
>    serial numbers, device IDs, and other mechanisms to correlate
>    them.

?

Every FC target I recall checking this on reported what looked like a
WWN.  Some (many?) ||SCSI targets seemed to do this too.  I know other
formats are supported, like vendor ID+serial number, but I can't
recall seeing a target which reported ONLY this and not a WWN as well.
Am I all wet?

Thanks,
  Steph


From owner-ips@ece.cmu.edu  Thu Apr 19 11:44:13 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA13086
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 11:44:12 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JDa1i08525
	for ips-outgoing; Thu, 19 Apr 2001 09:36:01 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JDZqA08517
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 09:35:52 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 223DC94009
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 09:35:34 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI : login keys & mode page settings 
In-Reply-To: Message from julian_satran@il.ibm.com 
   of "Wed, 18 Apr 2001 16:18:25 +0200." <C1256A32.004E1F02.00@d12mta02.de.ibm.com> 
References: <C1256A32.004E1F02.00@d12mta02.de.ibm.com> 
Date: Thu, 19 Apr 2001 09:34:01 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010419133534.223DC94009@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> +++ It is all about function - several people felt that the (primitive)
> negotiation element in the text commands is better than trying to set a
> parameter to an unacceptable value and finding this out through a mode
> sense
> ++++

Several other people seem to have felt that it was better not to have
redundant mechanisms for manipulating this sort of parameter.  How do
you decide?

Steph


From owner-ips@ece.cmu.edu  Thu Apr 19 13:13:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA14617
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 13:13:18 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JCr0s06394
	for ips-outgoing; Thu, 19 Apr 2001 08:53:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JCq8A06373
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 08:52:08 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3JDxm121055;
	Thu, 19 Apr 2001 06:59:48 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Lloyd Wood" <L.Wood@surrey.ac.uk>
Cc: <ips@ece.cmu.edu>, <tsvwg@ietf.org>
Subject: RE: [Tsvwg] [SCTP checksum problems]
Date: Thu, 19 Apr 2001 05:49:57 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJEEIHCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <Pine.GSO.4.21.0104191223250.9077-100000@regan.ee.surrey.ac.uk>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Lloyd,

I am aware of efforts to compare CRC with Adler-32.  CRC is primarily
aimed at providing burst error detection, but while trying various
techniques, this modification may be interesting.

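  #define BASE 65521           /* largest prime below 2^16 */
  unsigned s1 = 0x5555;
  unsigned s2 = 0;
  unsigned short *dat_buf;     /* input, 16 bit words in network order */

  while (length >= 2)          /* length is in bytes, assumed even */
    {
    s1 += ntohs(*dat_buf++);   /* 16 bit summing */
    s2 += s1;
    if (s2 >= BASE)            /* Adler modulo for s2 only  */
      s2 -= BASE;
    length -= 2;
    }
  return (s2 << 16) | (s1 & 0xffff);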

This would exercise more bits for small packets, improve burst error
sensitivity and trade a modulo function for a network to host swap in some
cases.  It seems to become a comparison against burst errors vs. missing
segment and stuck bit sensitivity.
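
For comparison, here is a minimal sketch of the unmodified Adler-32 as it
is usually described (RFC 1950): byte-wise sums, with both s1 and s2
reduced modulo 65521.  The name adler32_ref and the ADLER_BASE macro are
assumptions made for this illustration only; they do not come from any
draft or from the proposal above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u   /* largest prime below 2^16 */

/* Reference Adler-32: s1 is the running byte sum (seeded with 1),
 * s2 is the running sum of s1; both are kept modulo ADLER_BASE. */
uint32_t adler32_ref(const unsigned char *buf, size_t len)
{
    uint32_t s1 = 1;
    uint32_t s2 = 0;

    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        s1 = (s1 + buf[i]) % ADLER_BASE;
        s2 = (s2 + s1) % ADLER_BASE;
    }
    return (s2 << 16) | s1;
}
```

For the 9 byte string "Wikipedia" this yields 0x11E60398.  The low half
is only 0x0398 because nine bytes cannot push s1 anywhere near 65521,
which is the small-packet weakness that 16 bit summing with a non-trivial
seed is meant to reduce.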

The alternative code for CRC would look something like this.

Here is an example using a 256 entry (1024 byte) table.

  unsigned char* dat_buf;
  unsigned long crc_syn = 0xffffffff;

  while(length--)
    crc_syn = (crc_syn >> 8) ^ crc32_table[(crc_syn & 0xff) ^ *dat_buf++];

  return (crc_syn ^ 0xffffffff);
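
The loop above assumes crc32_table has already been filled in.  A minimal
sketch of one way to build it follows, assuming the common reflected
CRC-32 polynomial 0xEDB88320; the polynomial is not stated above, so treat
it as an assumption.  The function names crc32_init_table and crc32_bytes
are likewise made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t crc32_table[256];   /* 256 entries of 4 bytes each */

/* Fill the table: entry n is the CRC remainder of the single byte n,
 * computed bit by bit with the reflected polynomial 0xEDB88320. */
void crc32_init_table(void)
{
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = n;
        for (int bit = 0; bit < 8; bit++)
            c = (c & 1u) ? (c >> 1) ^ 0xEDB88320u : (c >> 1);
        crc32_table[n] = c;
    }
}

/* Same byte-at-a-time loop as in the message, wrapped in a function. */
uint32_t crc32_bytes(const unsigned char *dat_buf, size_t length)
{
    uint32_t crc_syn = 0xffffffffu;

    assert(dat_buf != NULL || length == 0);

    while (length--)
        crc_syn = (crc_syn >> 8) ^ crc32_table[(crc_syn & 0xffu) ^ *dat_buf++];

    return crc_syn ^ 0xffffffffu;
}
```

Call crc32_init_table() once before the first crc32_bytes() call.  With
this polynomial the usual check string "123456789" gives 0xCBF43926.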


Doug



From owner-ips@ece.cmu.edu  Thu Apr 19 15:00:15 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16496
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 15:00:14 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JGN9F17231
	for ips-outgoing; Thu, 19 Apr 2001 12:23:09 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from msgbas1t.cos.agilent.com (msgbas1tx.cos.agilent.com [192.6.9.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JGMpA17218
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 12:22:51 -0400 (EDT)
Received: from msgrel1.cos.agilent.com (msgrel1.cos.agilent.com [130.29.152.77])
	by msgbas1t.cos.agilent.com (Postfix) with ESMTP
	id DA9AC30C; Thu, 19 Apr 2001 10:22:46 -0600 (MDT)
Received: from axcsbh4.cos.agilent.com (axcsbh4.cos.agilent.com [130.29.152.145])
	by msgrel1.cos.agilent.com (Postfix) with SMTP
	id 5168E9E; Thu, 19 Apr 2001 10:22:46 -0600 (MDT)
Received: from 130.29.152.145 by axcsbh4.cos.agilent.com (InterScan E-Mail VirusWall NT); Thu, 19 Apr 2001 10:22:46 -0600 (Mountain Daylight Time)
Received: by axcsbh4.cos.agilent.com with Internet Mail Service (5.5.2653.19)
	id <JHNKSBY6>; Thu, 19 Apr 2001 10:22:46 -0600
Message-ID: <FEEBE78C8360D411ACFD00D0B74779719A8859@xsj02.sjs.agilent.com>
From: vince_cavanna@agilent.com
To: steph@cs.uchicago.edu, vince_cavanna@agilent.com
Cc: jim_wendt@hp.com, julian_satran@il.ibm.com, ips@ece.cmu.edu,
        tsvwg@ietf.org, craig@aland.bbn.com, Jonathan.Wood@sun.com,
        xieqb@cig.mot.com, jonathan@dsg.stanford.edu, rrs@cisco.com
Subject: RE: [Tsvwg] [SCTP checksum problems] 
Date: Thu, 19 Apr 2001 10:22:29 -0600
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Stephen,

I have to admit that I do not have much direct experience with middle boxes,
BUT I did have fairly direct and recent experience with a popular NAT router
from a popular vendor that was corrupting data in a network of Macintoshes. 

Apple's TCP was unaware of any problem as was Apple's Filing Protocol and
most applications. The only applications that detected the corruption were
those that performed an integrity check of their own. Those applications
that assumed a reliable transport (and file system) were doomed to
experience the indirect effects of the corruption at some later time. The
corruption only happened when large amounts of data were transferred
quickly.  The router vendor fixed the problem once; then fixed it again;
then fixed it one last time before the data corruption finally
"disappeared". After several weeks of continuous operation the router
appeared to get into a mode where it was once again corrupting data. Power
cycling the router "fixed it". The story apparently has not yet ended.

I admit I may have given too much significance to this single incident that
I have personally experienced but on the other hand I don't see the
mechanisms in place to prevent this type of problem in the future other than
the end to end integrity checks.

Incidentally, this incident changed my behavior when transferring data over a
network. I will always use a compression utility; not only for reducing the
data to be transmitted but to ensure the integrity of my data is protected
end to end by the utility's CRC mechanism.

I believe quite firmly that we DO need a mechanism to allow us to tolerate
poor implementations of middle boxes and cannot simply hope that eventually
such poor implementations will vanish, nor that we will have the luxury of
being able to select only good implementations for every component of our
storage network.

Vince

|-----Original Message-----
|From: Stephen Bailey [mailto:steph@cs.uchicago.edu]
|Sent: Wednesday, April 18, 2001 3:09 PM
|To: CAVANNA,VICENTE V (A-Roseville,ex1)
|Cc: 'WENDT,JIM (HP-Roseville,ex1)'; 'julian_satran@il.ibm.com';
|ips@ece.cmu.edu; tsvwg@ietf.org; 'Craig Partridge'; Jonathan Wood;
|xieqb@cig.mot.com; Jonathan Stone; Randall Stewart
|Subject: Re: [Tsvwg] [SCTP checksum problems] 
|
|
|Vince,
|
|> I don't think iSCSI can be completely relieved of performing some data
|> integrity checking as long as there exists the possibility of "middle
|> boxes" opening up the transport protocol's packet and thus potentially
|> invalidating any reliability guarantees the transport protocol makes.
|
|Any protection provided against this failure mode will only be
|transient, so we must temper the desire to introduce such a
|requirement with reality.
|
|Middleboxes can just as easily open up to the iSCSI layer and tinker
|with the payload, as they do with other ULPs running on TCP (e.g HTTP)
|today.  Short of securing the connection, there is ALWAYS a
|possibility of a middlebox terminating and reoriginating an integrity
|check.  In case you think this is a farfetched scenario, I do get the
|impression that there is a high level of interest in `actively
|middling' iSCSI once the specs crystalize.  Who shaves the barber?
|
|An integrity check is not necessary as long as some lower layer
|provides adequate integrity guarantees.
|
|Adding an integrity check above the transport layer is based upon
|documentation of the presence of a lot of crappy network hardware and
|software and analyses of the transport integrity check (TCP checksum)
|which suggests it might not be adequately strong against some such
|observed errors.
|
|I claim that the high incidence of `broken' (corruption introducing)
|components is a result of a variety of factors which have shaped the
|development of network components thus far.  The fact that integrity
|checks are assumed to be performed in a network context substantially
|lowers the bar for implementation correctness.
|
|In a storage (or CPU) context, these types of implementation errors
|are a) more easily detectable (more fatal) b) more carefully avoided
|during implementation (because of the cost of a potential fatal
|error).  If network components magically reached the same `quality
|level' as storage and CPU components, there might be no justification
|for additional integrity checks above the transport.  Similarly if the
|transport (or whatever lower layer) integrity checks are very strong
|(e.g. IPSec), there is, again, no need for a higher level integrity
|check.
|
|I am not disagreeing that we need an additional integrity check over
|TCP in the present target environment, but I do disagree that iSCSI
|will always need such a check, independently of what is running
|beneath it.
|
|Steph
|


From owner-ips@ece.cmu.edu  Thu Apr 19 15:00:45 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16524
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 15:00:44 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JGtGS18847
	for ips-outgoing; Thu, 19 Apr 2001 12:55:16 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JGsPA18791
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 12:54:25 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id SAA139760
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 18:54:18 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id SAA103960
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 18:54:18 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A33.005CDB39 ; Thu, 19 Apr 2001 18:54:14 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A33.005CD9E2.00@d12mta02.de.ibm.com>
Date: Thu, 19 Apr 2001 18:59:18 +0200
Subject: Re: iSCSI : login keys & mode page settings
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



The mechanisms are not exactly redundant.

Julo

Stephen Bailey <steph@cs.uchicago.edu> on 19/04/2001 15:34:01

Please respond to Stephen Bailey <steph@cs.uchicago.edu>

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI : login keys & mode page settings




> +++ It is all about function - several people felt that the (primitive)
> negotiation element in the text commands is better than trying to set a
> parameter to an unacceptable value and finding this out through a mode
> sense
> ++++

Several other people seem to have felt that it was better not to have
redundant mechanisms for manipulating this sort of parameter.  How do
you decide?

Steph





From owner-ips@ece.cmu.edu  Thu Apr 19 15:01:25 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16552
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 15:01:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JHVB120821
	for ips-outgoing; Thu, 19 Apr 2001 13:31:11 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sj-msg-core-2.cisco.com (sj-msg-core-2.cisco.com [171.69.43.88])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JGsVA18798
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 12:54:32 -0400 (EDT)
Received: from sj-msg-av-2.cisco.com (sj-msg-av-2.cisco.com [171.69.43.85])
	by sj-msg-core-2.cisco.com (8.9.3/8.9.1) with ESMTP id JAA11475;
	Thu, 19 Apr 2001 09:54:32 -0700 (PDT)
Received: from mailman.cisco.com (localhost [127.0.0.1])
	by sj-msg-av-2.cisco.com (8.10.1/8.10.1) with ESMTP id f3JGs4A08167;
	Thu, 19 Apr 2001 09:54:04 -0700 (PDT)
Received: from chsharp-w2k.cisco.com (rtp-vpn-175.cisco.com [10.82.192.175]) by mailman.cisco.com (8.9.3/CISCO.SERVER.1.2) with ESMTP id JAA11641; Thu, 19 Apr 2001 09:54:01 -0700 (PDT)
Message-Id: <4.3.2.7.2.20010419123133.01d8a210@dogwood.cisco.com>
X-Sender: chsharp@dogwood.cisco.com
X-Mailer: QUALCOMM Windows Eudora Version 4.3.2
Date: Thu, 19 Apr 2001 12:53:53 -0400
To: vince_cavanna@agilent.com
From: Chip Sharp <chsharp@cisco.com>
Subject: RE: [Tsvwg] [SCTP checksum problems] 
Cc: steph@cs.uchicago.edu, vince_cavanna@agilent.com, jim_wendt@hp.com,
        julian_satran@il.ibm.com, ips@ece.cmu.edu, tsvwg@ietf.org,
        craig@aland.bbn.com, Jonathan.Wood@sun.com, xieqb@cig.mot.com,
        jonathan@dsg.stanford.edu, rrs@cisco.com
In-Reply-To: <FEEBE78C8360D411ACFD00D0B74779719A8859@xsj02.sjs.agilent.c
 om>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

As was pointed out previously, middle box operations (such as NATs) tend to 
creep up the protocol stack and into applications.

Take SIP for example.  It includes IP addresses in its INVITE.  In order to 
work across a NAT, the IP addresses it exchanges have to be replaced with 
the NATed address.  One way is for the NAT to reach up into the SIP INVITE 
and change the address.  This modifies the TCP or UDP checksum.  Now SIP 
could have included its own integrity check to protect against corrupted or 
modified TCP checksums, but all that would have happened is that NATs would 
have changed the SIP checksum in addition to the TCP/UDP checksum.

Therefore, even if iSCSI included its own integrity check, if a middle box 
is going to futz with iSCSI packets it will just strip the check, do 
whatever it does and then recalculate the check.

If this is what you want to protect against you will have to go to some 
type of digital signature.
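
The difference between the two cases can be sketched in a few lines.  This
is an illustration only: struct toy_pdu and the toy_* functions are
hypothetical and have nothing to do with the iSCSI PDU layout, and the
bitwise CRC-32 (reflected polynomial 0xEDB88320) merely stands in for
whatever unkeyed check a protocol might carry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PDU: a payload plus an unkeyed digest computed over it. */
struct toy_pdu {
    unsigned char payload[16];
    uint32_t digest;
};

/* Bitwise CRC-32; anyone on the path can compute it, no secret needed. */
uint32_t toy_digest(const unsigned char *p, size_t n)
{
    uint32_t crc = 0xffffffffu;

    assert(p != NULL || n == 0);

    while (n--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return crc ^ 0xffffffffu;
}

/* Sender side: stamp the digest. */
void toy_seal(struct toy_pdu *pdu)
{
    pdu->digest = toy_digest(pdu->payload, sizeof pdu->payload);
}

/* Receiver side: recompute and compare. */
bool toy_verify(const struct toy_pdu *pdu)
{
    return pdu->digest == toy_digest(pdu->payload, sizeof pdu->payload);
}

/* Accidental corruption: a bit flips, the digest is left untouched. */
void toy_corrupt(struct toy_pdu *pdu)
{
    pdu->payload[3] ^= 0x40;
}

/* Deliberate rewrite: the middle box changes the payload and then
 * recalculates the digest, exactly as a NAT does with TCP checksums. */
void toy_middlebox_rewrite(struct toy_pdu *pdu)
{
    pdu->payload[3] ^= 0x40;
    toy_seal(pdu);
}
```

After toy_corrupt() the receiver's toy_verify() fails, which is the buggy
router case.  After toy_middlebox_rewrite() it succeeds even though the
payload changed; only a keyed check, where the middle box lacks the key,
would catch that.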

At 12:22 PM 4/19/2001, vince_cavanna@agilent.com wrote:
>Stephen,
>
>I have to admit that I do not have much direct experience with middle boxes,
>BUT I did have fairly direct and recent experience with a popular NAT router
>from a popular vendor that was corrupting data in a network of Macintoshes.
>
>Apple's TCP was unaware of any problem as was Apple's Filing Protocol and
>most applications. The only applications that detected the corruption were
>those that performed an integrity check of their own. Those applications
>that assumed a reliable transport (and file system) were doomed to
>experiencing the indirect effects of the corruption at some later time. The
>corruption only happened when large amounts of data were transferred
>quickly.  The router vendor fixed the problem once; then fixed it again;
>then fixed it one last time before the data corruption finally
>"disappeared". After several weeks of continuous operation the router
>appeared to get into a mode where it was once again corrupting data. Power
>cycling the router "fixed it". The story apparently has not yet ended.
>
>I admit I may have given too much significance to this single incident that
>I have personally experienced but on the other hand I don't see the
>mechanisms in place to prevent this type of problem in the future other than
>the end to end integrity checks.
>
>Incidentally, this incident changed my behavior when transferring data over a
>network. I will always use a compression utility; not only for reducing the
>data to be transmitted but to ensure the integrity of my data is protected
>end to end by the utility's CRC mechanism.
>
>I believe quite firmly that we DO need a mechanism to allow us to tolerate
>poor implementations of middle boxes and cannot simply hope that eventually
>such poor implementations will vanish, nor that we will have the luxury of
>being able to select only good implementations for every component of our
>storage network.
>
>Vince
>
>|-----Original Message-----
>|From: Stephen Bailey [mailto:steph@cs.uchicago.edu]
>|Sent: Wednesday, April 18, 2001 3:09 PM
>|To: CAVANNA,VICENTE V (A-Roseville,ex1)
>|Cc: 'WENDT,JIM (HP-Roseville,ex1)'; 'julian_satran@il.ibm.com';
>|ips@ece.cmu.edu; tsvwg@ietf.org; 'Craig Partridge'; Jonathan Wood;
>|xieqb@cig.mot.com; Jonathan Stone; Randall Stewart
>|Subject: Re: [Tsvwg] [SCTP checksum problems]
>|
>|
>|Vince,
>|
>|> I don't think iSCSI can be completely relieved of performing some data
>|> integrity checking as long as there exists the possibility of "middle
>|> boxes" opening up the transport protocol's packet and thus potentially
>|> invalidating any reliability guarantees the transport protocol makes.
>|
>|Any protection provided against this failure mode will only be
>|transient, so we must temper the desire to introduce such a
>|requirement with reality.
>|
>|Middleboxes can just as easily open up to the iSCSI layer and tinker
>|with the payload, as they do with other ULPs running on TCP (e.g HTTP)
>|today.  Short of securing the connection, there is ALWAYS a
>|possibility of a middlebox terminating and reoriginating an integrity
>|check.  In case you think this is a farfetched scenario, I do get the
>|impression that there is a high level of interest in `actively
>|middling' iSCSI once the specs crystalize.  Who shaves the barber?
>|
>|An integrity check is not necessary as long as some lower layer
>|provides adequate integrity guarantees.
>|
>|Adding an integrity check above the transport layer is based upon
>|documentation of the presence of a lot of crappy network hardware and
>|software and analyses of the transport integrity check (TCP checksum)
>|which suggests it might not be adequately strong against some such
>|observed errors.
>|
>|I claim that the high incidence of `broken' (corruption introducing)
>|components is a result of a variety of factors which have shaped the
>|development of network components thus far.  The fact that integrity
>|checks are assumed to be performed in a network context substantially
>|lowers the bar for implementation correctness.
>|
>|In a storage (or CPU) context, these types of implementation errors
>|are a) more easily detectable (more fatal) b) more carefully avoided
>|during implementation (because of the cost of a potential fatal
>|error).  If network components magically reached the same `quality
>|level' as storage and CPU components, there might be no justification
>|for additional integrity checks above the transport.  Similarly if the
>|transport (or whatever lower layer) integrity checks are very strong
>|(e.g. IPSec), there is, again, no need for a higher level integrity
>|check.
>|
>|I am not disagreeing that we need an additional integrity check over
>|TCP in the present target environment, but I do disagree that iSCSI
>|will always need such a check, independently of what is running
>|beneath it.
>|
>|Steph
>|


-------------------------------------------------------------------
Chip Sharp                       Consulting Engineering
Cisco Systems
-------------------------------------------------------------------



From owner-ips@ece.cmu.edu  Thu Apr 19 15:01:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16562
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 15:01:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JH18619207
	for ips-outgoing; Thu, 19 Apr 2001 13:01:08 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JH0vA19193
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 13:00:57 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id 5DB34BBF; Thu, 19 Apr 2001 10:00:47 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA02001;
	Thu, 19 Apr 2001 10:00:42 -0700 (PDT)
Message-ID: <3ADF1978.57B6B731@cup.hp.com>
Date: Thu, 19 Apr 2001 09:59:36 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu, Julian Satran <julian_satran@il.ibm.com>
Cc: Stephen Bailey <steph@cs.uchicago.edu>, David Black <Black_David@emc.com>
Subject: Re: iSCSI : login keys & mode page settings
References: <C1256A32.004E1F02.00@d12mta02.de.ibm.com> <20010419133534.223DC94009@sandmail.sandburst.com>
Content-Type: multipart/mixed;
 boundary="------------208204CF7FF8E3B09B34C918"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------208204CF7FF8E3B09B34C918
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Stephen Bailey wrote:
> 
> > +++ It is all about function - several people felt that the (primitive)
> > negotiation element in the text commands is better than trying to set a
> > parameter to an unacceptable value and finding this out through a mode
> > sense
> > ++++
> 
> Several other people seem to have felt that it was better not to have
> redundant mechanisms for manipulating this sort of parameter.  How do
> you decide?
> 
> Steph


Julian,

I would prefer a single mechanism as opposed to multiple redundant
mechanisms to set negotiation elements (reduce/eliminate options). In
addition, I have the following concerns [some of which I have raised
repeatedly and have NOT gotten a reply. I would appreciate it if you could
kindly comment on ALL of these.]:

1) The draft makes repeated references to an "iSCSI LUN Control Mode
Page". There is NO such page per SPC-2. The references must be changed
to "iSCSI protocol specific LUN page".

2) The new daily release of the draft (5.92 when I last checked) has now
introduced EnableACA as a negotiable login key. All references to
EnableACA are redundant and should be removed for the following
reasons:

a) An initiator knows whether a target supports ACA from the NormACA bit
in the INQUIRY response. When a target indicates support for ACA, the
initiator can use it by setting the NACA bit in the CDBs it sends. There
is NO need for any sort of negotiation of this behaviour above and
beyond what is already provided thru SCSI mechanisms.

b) The ACA is a SCSI ULP concept and iSCSI should not be negotiating its
use or lack thereof. This is done thru the NACA bit in CDBs.

c) (As a side note, the description of EnableACA on pg 127 refers to its
presence in the lun control mode page, but it is actually present in the
protocol specific port page.)

d) ACA is a LUN-level (more an I/O level) control option. It MUST NOT be
negotiated on a per-session basis. SCSI allows initiators to request ACA
behaviour on a per I/O basis through the use of NACA bit in the CDBs.


2) 
> On a side note, the EnableCmdRN  & CmdRN fields should be re-named to
> EnableCRN and CRN to reflect the same semantics and context as the CRN
> defined in SAM-2 and FCP-2.
>
> +++ what's in a name... +++

Consistency for one ! (Any strong reasons not to call this CRN, as SAM
and FCP do ?)


3) However, having allowed 2 mechanisms to set negotiation elements,
iSCSI MUST then comment on the need to synchronize their settings in the
2 layers and also comment on the need to trigger a UNIT ATTENTION when
changed through the login key mechanism.
Again, I would vote for only 1 mechanism for setting these control
options, rather than have to define communication schemes between the ULP
and LLP to keep their values in sync and generate UNIT ATTENTION.


4)
> If such a level of dual control is provided, the iSCSI login
> keys listed above be made LO (leading only) to allow for changes to
> operational parameters only during session login. This is to
> minimize/eliminate disruption of ongoing I/O activity that occurs due to
> the generation of a UNIT ATTENTION CHECK CONDITION when any change is
> made to the above paramters.

Are we in agreement on the above ?


5)
> If these operational parameters are allowed to be set thru iSCSI
> login and they also impact mode page settings, iSCSI spec should
> describe the scope of the mode page setting in terms of whether this
> setting is a saved page setting or not ?
>

6) 
> Should saved page settings be allowed thru iSCSI ?

I did not see any comments on the above issues (?).
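
As an aside on points 2 a) and 2 d) above, the existing SCSI mechanism
can be sketched as follows.  This assumes the SPC-2 and SAM-2 layouts as
commonly documented: NormACA in bit 5 of byte 3 of the standard INQUIRY
data, and NACA in bit 2 of the CONTROL byte, which is the last byte of a
fixed-length CDB.  The macro and function names are hypothetical; please
check the offsets against the standards before relying on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INQUIRY_NORMACA_BYTE 3      /* standard INQUIRY data, byte 3 */
#define INQUIRY_NORMACA_MASK 0x20u  /* bit 5: NormACA                */
#define CDB_CONTROL_NACA     0x04u  /* CONTROL byte, bit 2: NACA     */

/* True if the INQUIRY data says the logical unit accepts NACA = 1. */
bool target_supports_aca(const uint8_t *inquiry_data, size_t len)
{
    assert(inquiry_data != NULL);

    return len > INQUIRY_NORMACA_BYTE &&
           (inquiry_data[INQUIRY_NORMACA_BYTE] & INQUIRY_NORMACA_MASK) != 0;
}

/* Request (or decline) ACA for one command by setting or clearing NACA
 * in the CONTROL byte.  Fixed-length CDBs only (6, 10, 12, 16 bytes). */
void cdb_set_naca(uint8_t *cdb, size_t cdb_len, bool naca)
{
    assert(cdb != NULL && cdb_len > 0);

    if (naca)
        cdb[cdb_len - 1] |= CDB_CONTROL_NACA;
    else
        cdb[cdb_len - 1] &= (uint8_t)~CDB_CONTROL_NACA;
}
```

The choice is made per command: an initiator can send one READ(10) with
NACA set and the next one without it, which is why a per-session login
key adds nothing here.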

- Santosh
--------------208204CF7FF8E3B09B34C918
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------208204CF7FF8E3B09B34C918--



From owner-ips@ece.cmu.edu  Thu Apr 19 15:10:41 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16706
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 15:10:36 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JHaUL21153
	for ips-outgoing; Thu, 19 Apr 2001 13:36:30 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JHZRA21069
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 13:35:28 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRLJ0>; Thu, 19 Apr 2001 10:35:17 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B3FC1E2@ariel.nishansystems.com>
From: Joshua Tseng <jtseng@NishanSystems.com>
To: "'KRUEGER,MARJORIE (HP-Roseville,ex1)'" <marjorie_krueger@hp.com>,
        ips@ece.cmu.edu
Subject: RE: iSCSI: target discovery issue
Date: Thu, 19 Apr 2001 10:35:16 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Marjorie,
> 
> A more generic statement of "method D" is that there exist 
> applications for
> managing and administering networked resources (Novell eDirectory and
> Microsoft ActiveDirectory products are examples).  One of the 
> most important
> tasks a network administrator has is making sure that users 
> have access to
> files, resources and services they need on the network.  
> iSCSI adds block
> storage to the list of network resources, but resource assignment and
> management is a problem that has existed for quite some time 
> and solutions
> exist today.  Network resource management evolved from a 
> device-by-device
> management paradigm long before iSCSI was conceived.  I don't want to
> detract from the value of iSNS, but it's not correct to 
> suggest it's the
> only solution to resource management.

eDirectory and ActiveDirectory are implementations of LDAP with 
significant amounts of proprietary extensions.  So you must be talking
about the basic LDAP protocol itself, since as much as Microsoft
would desire, I can't imagine the storage industry embedding
ActiveDirectory clients into each and every iSCSI storage device.
Nor could I imagine IETF standardizing on a Microsoft implementation.
Same thing with Novell.

If LDAP is what you are talking about, please see my previous notes
to Doug Otis regarding how LDAP is basically a generic directory service
that passively stores information deposited by clients, without any
regard to what that information is used for.  iSNS is also a protocol,
but since it is tailored to storage, it can interpret information
registered by clients, and take appropriate action.  That isn't to
say LDAP can't be used to accomplish some of the discovery and management
functions that iSNS has, but because iSNS has state-consciousness of
its client storage devices, it has more capabilities than a basic LDAP
server would have in managing storage devices.  David has already invited
anyone interested to write a draft on how LDAP can be used, and I would
be equally interested.  If you or someone else would oblige, then we
would have a "method E" of discovering iSCSI devices.

Finally, I would like to note that iSNS capabilities are modeled on
those provided by T11's FC-GS-3.  Presumably, these capabilities are
based upon real-life lessons learned by the Fibre Channel community in
managing and operating large enterprise-class storage networks.  I like
to think that we are incorporating the fruit of those lessons into iSNS.
We looked at LDAP--we really did--to see if it could provide comparable
FC-GS-3 services in the IP domain.  But there are shortcomings which
forced us to create the iSNS protocol.  These shortcomings are documented
in the iSNS document.

> 
> Typically users register/login to distributed resource management pts
> (domain servers) and these applications handle authentication,
> authorization, and assignment of resources.  John makes 
> important points in
> his email - you don't want all users informed of new storage 
> coming on line,
> those systems that are intended to have access should be 
> notified, or should
> explicitly "mount" the new storage.  It's not appropriate to 
> burden each
> storage device with this task, it is definitely a value add feature
> appropriate to a centralized resource management application. 

Yes, this is where iSNS with the discovery domain feature provides
significant value in a large storage network.

Regards,
Josh

> 
> Marjorie Krueger
> Networked Storage Architecture
> Networked Storage Solutions Org.
> Hewlett-Packard
> tel: +1 916 785 2656
> fax: +1 916 785 0391
> email: marjorie_krueger@hp.com 
> 


From owner-ips@ece.cmu.edu  Thu Apr 19 15:10:45 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16707
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 15:10:39 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JHPBx20434
	for ips-outgoing; Thu, 19 Apr 2001 13:25:11 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JHP4A20395
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 13:25:04 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP id 79691B13
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 10:25:02 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA03658
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 10:24:56 -0700 (PDT)
Message-ID: <3ADF1F24.394D683F@cup.hp.com>
Date: Thu, 19 Apr 2001 10:23:48 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: IPS Reflector <ips@ece.cmu.edu>
Subject: [Fwd: FCP-2 : EMDP setting in disconnect-reconnect mode page.]
Content-Type: multipart/mixed;
 boundary="------------8266C342A6BAD3F2BCE4D413"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------8266C342A6BAD3F2BCE4D413
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Julian,

I am forwarding responses from T11 regarding EMDP behaviour in FCP. (And
knowing that Dave Peterson is on this list, perhaps, he can comment
further).

The same level of control that FCP provides must be made available to
iSCSI. (Incidentally, I noticed that SRP had a proposal to have the same
semantics of allowing both WRITE and READ ordering to be controlled
across the entire SCSI command.)

Initiators would find it useful to have an option that lets them request
that targets ask for data in order, i.e. that the buffer offsets in
consecutive R2Ts be in increasing order. EMDP in iSCSI must allow this
level of control.
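
As a rough illustration of the ordering rule being requested (a sketch
only: the function name and its arguments are hypothetical and come from
neither FCP-2 nor the iSCSI draft), a receiver could check the buffer
offsets of consecutive R2Ts like this:

```python
def offsets_honour_emdp(offsets, emdp):
    """Check a list of R2T buffer offsets against the EMDP setting.

    offsets: buffer offsets of consecutive R2Ts for one SCSI command.
    emdp:    0 means offsets must be continuously increasing,
             1 means the target may request data in any order.
    """
    if emdp == 1:
        # Out-of-order requests are permitted; nothing to check.
        return True
    # Every offset must be strictly greater than the one before it.
    return all(prev < cur for prev, cur in zip(offsets, offsets[1:]))


in_order = offsets_honour_emdp([0, 4096, 8192], emdp=0)
out_of_order = offsets_honour_emdp([4096, 0, 8192], emdp=0)
```

With EMDP=0 the first sequence passes and the second does not; with
EMDP=1 both would be accepted.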

- Santosh


Dave Peterson wrote:
> 
> Howdy Santosh,
> 
> EMDP applies to both READs and WRITEs:
> Yes, if EMDP=0, the target shall send continuously increasing DATA_RO values
> in its FCP_XFER_RDY requests.
> Yes, if EMDP=0, the initiator shall send out FCP_DATA IUs in order in
> response to FCP_XFER_RDY.
> 
> If the EMDP bit is set to one the target may request data out of order using
> the FCP_DATA_RO field in the FCP_XFER_RDY.
> 
> The initiator shall (always) deliver FCP_DATA IUs as specified in the
> FCP_XFER_RDY (i.e. the target is in control).
> The exception is when FCP_XFER_RDY is disabled and then it is only
> applicable to the first FCP_DATA IU.
> 
> Dave
> 
> > -----Original Message-----
> > From: owner-fc@network.com [mailto:owner-fc@network.com]On Behalf Of
> > Santosh Rao
> > Sent: Wednesday, April 18, 2001 1:31 PM
> > To: Fibre Channel T11 reflector
> > Subject: FCP-2 : EMDP setting in disconnect-reconnect mode page.
> >
> >
> > All,
> >
> > I've got a question regarding the EMDP bit semantics in the
> > disconnect-reconnect mode page as described in Section 10.1.1.7 of FCP-2
> > (Rev 04).
> >
> > The description of EMDP reads :
> > "....indicates whether or not the target may reorder FCP_DATA IUs for a
> > single SCSI command.
> > EMDP == 0  => target shall generate continuously increasing DATA_RO
> > values for each FCP_DATA sequence for a single SCSI command.
> > EMDP == 1 => target may transfer the FCP_DATA IUs for a single SCSI
> > command in any order."
> >
> > The above description reads fine for READ I/O operations where FCP_DATA
> > IUs flow from target to originator.
> >
> > What are the semantics of EMDP in the context of a WRITE I/O operation ?
> > Does EMDP of 0 imply that target shall send continuously increasing
> > DATA_RO values in its FCP_XFER_RDY requests ? Does it also imply that the
> > initiator shall send out FCP_DATA IUs in order in response to
> > FCP_XFER_RDY ?
> >
> > Or is the EMDP only applicable to READ I/Os and has no effect on WRITEs
> > ? If so, why is this the case ?
> >
> > Regards,
> > Santosh Rao
--------------8266C342A6BAD3F2BCE4D413--



From owner-ips@ece.cmu.edu  Thu Apr 19 15:12:14 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16749
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 15:11:43 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JI7Do22871
	for ips-outgoing; Thu, 19 Apr 2001 14:07:13 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JI6IA22798
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 14:06:18 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id UAA162900
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 20:05:48 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id UAA203378
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 20:05:47 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A33.006365CF ; Thu, 19 Apr 2001 20:05:41 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A33.00631D5F.00@d12mta02.de.ibm.com>
Date: Thu, 19 Apr 2001 20:07:44 +0200
Subject: Re: iSCSI : New PDU opcode usage in rev 5.92
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



The former bit of differentiation is gone. The main reason we had it was for
analyzers. This is also why I chose the bits as I did.
As for the explanation - I'll check what's already in there.

Julo

Santosh Rao <santoshr@cup.hp.com> on 19/04/2001 19:08:10

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI : New PDU opcode usage in rev 5.92




That's not a restriction other SCSI transports impose. FCP allows both
the initiator and target bit to be set in PRLI indicating dual mode
behaviour within a single I-T nexus.

However, the point I'm trying to make is that the opcode cannot be
interpreted by itself and needs interpretation in the context of the
"role" the receiver is playing. i.e. Opcode parsing must be in
conjunction with the check of the receiver's role (is it receiving the
command while operating as a target or initiator ?)

Seems like such a description is missing in the draft.

- Santosh



julian_satran@il.ibm.com wrote:
>
> Santosh,
>
> On a given nexus the roles are static aren't they?
>
> Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 19/04/2001 03:47:07
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   ips@ece.cmu.edu (ips)
> cc:
> Subject:  iSCSI : New PDU opcode usage in rev 5.92
>
> Julian & All,
>
> I've got a quick question on how the new opcode layouts would work for
> dual mode scsi implementations. (i.e. initiators that responded in
> target mode or targets that acted as initiators also).
>
> The new opcode layout is :
>
> ----------------
> X|I| | | | | | |
> ----------------
> 7 6 5 4 3 2 1 0
>
> where bits 5-0 -> opcode
> X -> retry bit
> I -> immediate bit
>
> The same values are used for the command as well as response opcodes and
> bits X & I are intended to both be set to 1 by targets.
>
> i.e. opcode for scsi command = scsi response = 0x01. the distinction b/n
> command and response is based on targets setting X & I bits to 1.
>
> Now, if an initiator [capable of target mode] sent the following
> commands, how would they be interpreted :
>
> 1) 0xc4.
> is this a text command being retried in immediate mode,
> or is it a text response ?
>
> 2) 0xc1
> is this a scsi command being retried in immediate mode,
> or is it a scsi response ?
>
> 3) 0xc2
> is this a scsi task mgmt command being retried in immediate mode,
> or is it a scsi task mgmt response ?
>
> etc.....
>
> - Santosh
>
> --
> #################################
> Santosh Rao
> Software Design Engineer,
> HP, Cupertino.
> email : santoshr@cup.hp.com
> Phone : 408-447-3751
> #################################
 - santoshr.vcf
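
To make the ambiguity in the quoted question concrete, here is a minimal
sketch of decoding the opcode byte with the layout given above (bit 7 = X,
bit 6 = I, bits 5-0 = opcode). The function names and the "role" argument
are hypothetical; the point is only that the same byte decodes differently
depending on the role the receiver is playing.

```python
RETRY_BIT = 0x80      # bit 7, "X" in the quoted layout
IMMEDIATE_BIT = 0x40  # bit 6, "I" in the quoted layout
OPCODE_MASK = 0x3F    # bits 5-0


def decode_opcode_byte(byte):
    """Split the first PDU byte into its three fields."""
    return {
        "retry": bool(byte & RETRY_BIT),
        "immediate": bool(byte & IMMEDIATE_BIT),
        "opcode": byte & OPCODE_MASK,
    }


def interpret(byte, receiver_role):
    """Classify a PDU using the receiver's role (an assumption of this
    sketch): a receiver acting as initiator sees responses, a receiver
    acting as target sees commands."""
    fields = decode_opcode_byte(byte)
    kind = "response" if receiver_role == "initiator" else "command"
    return kind, fields


as_target = interpret(0xC4, "target")
as_initiator = interpret(0xC4, "initiator")
```

Both calls see X=1, I=1 and opcode 4 (text); only the role argument
decides whether 0xc4 is a retried immediate text command or a text
response.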





From owner-ips@ece.cmu.edu  Thu Apr 19 15:13:14 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16788
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 15:13:13 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JHrTY22066
	for ips-outgoing; Thu, 19 Apr 2001 13:53:29 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JHpuA22001
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 13:51:56 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id NAA18870; Thu, 19 Apr 2001 13:51:24 -0400 (EDT)
Message-ID: <3ADF2543.74AA0951@cisco.com>
Date: Thu, 19 Apr 2001 12:49:55 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: Chip Sharp <chsharp@cisco.com>
CC: vince_cavanna@agilent.com, steph@cs.uchicago.edu, jim_wendt@hp.com,
        julian_satran@il.ibm.com, ips@ece.cmu.edu, tsvwg@ietf.org,
        craig@aland.bbn.com, Jonathan.Wood@sun.com, xieqb@cig.mot.com,
        jonathan@dsg.stanford.edu, rrs@cisco.com
Subject: Re: [Tsvwg] [SCTP checksum problems]
References: <4.3.2.7.2.20010419123133.01d8a210@dogwood.cisco.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Chip-

You are correct on all counts.  However, this does not reduce the
need for an iSCSI-level CRC; it does mean that the type of integrity
provided depends somewhat on network engineering:

For a layer-2 network, with no routers or middle boxes, there's no
need for iSCSI CRC; this is analogous to a Fibre Channel network
today.

For a carefully-built layer-3 network, the TCP checksum can be
good enough, and again, the iSCSI CRC is not always required.

That leaves us with two other types of networks:

1. Networks where the quality of the various routers and middle
   boxes is not certain, and where they may cause accidental harm that is
   not detected by TCP.  These might include larger, private
   networks, leased metro networks, or networks subject to
   cheap equipment purchases (small business, home, etc).

2. Networks where there is a real threat of intentional harm to
   the data.

Note that network type (1) would also include networks of type (2)
where there may be minimal threat, and the cost of dealing with
the threat is not justified.

Network type (2) requires a signature on the data; IPSec would
probably be used for that case.  The expense for doing IPSec
would be justified.

Network type (1) can use a non-cryptographic, iSCSI-level CRC.
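
As an illustration of the kind of accidental harm that TCP's checksum can
miss, here is a small sketch. It assumes the standard 16-bit
one's-complement Internet checksum and uses zlib's CRC-32 purely as a
stand-in for whatever CRC iSCSI ends up choosing; the sample bytes are
made up.

```python
import zlib


def internet_checksum(data):
    """16-bit one's-complement sum of 16-bit words, as used by TCP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold carries back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\x56\x78\x9a\xbc\xde\xf0"
# Swap the first two 16-bit words, as a buggy box might do.
corrupted = original[2:4] + original[0:2] + original[4:]

same_checksum = internet_checksum(original) == internet_checksum(corrupted)
same_crc = zlib.crc32(original) == zlib.crc32(corrupted)
```

Because the checksum is a plain sum, reordering whole 16-bit words leaves
it unchanged (same_checksum is True), while the CRC of the two buffers
differs (same_crc is False).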

I have a few more comments below.

--
Mark

Chip Sharp wrote:
> 
> As was pointed out previously, middle box operations (such as NATs) tend to
> creep up the protocol stack and into applications.

This will certainly happen for iSCSI.

> Take SIP for example.  It includes IP addresses in its INVITE.  In order to
> work across a NAT, the IP addresses it exchanges have to be replaced with
> the NATed address.  One way is for the NAT to reach up into the SIP INVITE
> and change the address.  This modifies the TCP or UDP checksum.  Now SIP
> could have included its own integrity check to protect against corrupted or
> modified TCP checksums, but all that would have happened is that NATs would
> have changed the SIP checksum in addition to the TCP/UDP checksum.
> 
> Therefore, even if iSCSI included its own integrity check, if a middle box
> is going to futz with iSCSI packets it will just strip the check, do
> whatever it does and then recalculate the check.

That is the reason that we have separate integrity checks on the header and
the data.  If the middle box is trusted to do so, we have to enable this
sort of behavior.  By separating the checks, we at least have end-to-end
integrity on the data, and could (in theory) do a modification of the header
check without completely throwing it away and recalculating it.  This would
be expensive to do on the data, but would probably not be prohibitive for
the headers (someone correct me if I'm wrong, but I recall some discussions
on the possibilities for modifying CRCs in this way).
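
A rough sketch of what I mean, with everything in it an assumption: the
48-byte header, the payload, and the use of zlib's CRC-32 are placeholders
rather than anything from the draft. It keeps separate digests for header
and data, and patches the header digest from the changed bits alone,
relying on the fact that a CRC is affine under XOR for equal-length
inputs.

```python
import zlib


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))


header = b"\x01\x00\x00\x00" + bytes(44)  # hypothetical 48-byte header
data = b"payload bytes that the middle box must not touch"
hdr_crc = zlib.crc32(header)
data_crc = zlib.crc32(data)

# A trusted middle box rewrites one header byte.
new_header = header[:1] + b"\x7f" + header[2:]
delta = xor_bytes(header, new_header)

# crc(a ^ d) == crc(a) ^ crc(d) ^ crc(zeros) when all three inputs have
# the same length, so the old digest can be patched rather than rebuilt
# from the full new header.
patched_crc = hdr_crc ^ zlib.crc32(delta) ^ zlib.crc32(bytes(len(header)))
```

Here patched_crc equals the CRC computed directly over new_header, and
data_crc is never touched, so the data keeps its end-to-end check. This
version still runs a CRC over a header-sized delta; a real implementation
would presumably exploit the fact that the delta is mostly zeros.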

> If this is what you want to protect against you will have to go to some
> type of digital signature.

Again, there are times when you want to protect against middle boxes.  In
this case, the digital signature is needed, and perhaps IPSec is the solution.
There are other times when this behavior must be enabled.  In either case
(even with IPSec), the iSCSI CRC can provide end-to-end protection.

> At 12:22 PM 4/19/2001, vince_cavanna@agilent.com wrote:
> >Stephen,
> >
> >I have to admit that I do not have much direct experience with middle boxes,
> >BUT I did have fairly direct and recent experience with a popular NAT router
> >from a popular vendor that was corrupting data in a network of Macintoshes.
> >
> >Apple's TCP was unaware of any problem as was Apple's Filing Protocol and
> >most applications. The only applications that detected the corruption were
> >those that performed an integrity check of their own. Those applications
> >that assumed a reliable transport (and file system) were doomed to
> >experiencing the indirect effects of the corruption at some later time. The
> >corruption only happened when large amounts of data were transferred
> >quickly.  The router vendor fixed the problem once; then fixed it again;
> >then fixed it one last time before the data corruption finally
> >"disappeared". After several weeks of continuous operation the router
> >appeared to get into a mode where it was once again corrupting data. Power
> >cycling the router "fixed it". The story apparently has not yet ended.
> >
> >I admit I may have given too much significance to this single incident that
> >I have personally experienced but on the other hand I don't see the
> >mechanisms in place to prevent this type of problem in the future other than
> >the end to end integrity checks.
> >
> >Incidentally this incident changed my behavior when transferring data over a
> >network. I will always use a compression utility; not only for reducing the
> >data to be transmitted but to ensure the integrity of my data is protected
> >end to end by the utility's CRC mechanism.
> >
> >I believe quite firmly that we DO need a mechanism to allow us to tolerate
> >poor implementations of middle boxes and cannot simply hope that eventually
> >such poor implementations will vanish, nor that we will have the luxury of
> >being able to select only good implementations for every component of our
> >storage network.
> >
> >Vince
> >
> >|-----Original Message-----
> >|From: Stephen Bailey [mailto:steph@cs.uchicago.edu]
> >|Sent: Wednesday, April 18, 2001 3:09 PM
> >|To: CAVANNA,VICENTE V (A-Roseville,ex1)
> >|Cc: 'WENDT,JIM (HP-Roseville,ex1)'; 'julian_satran@il.ibm.com';
> >|ips@ece.cmu.edu; tsvwg@ietf.org; 'Craig Partridge'; Jonathan Wood;
> >|xieqb@cig.mot.com; Jonathan Stone; Randall Stewart
> >|Subject: Re: [Tsvwg] [SCTP checksum problems]
> >|
> >|
> >|Vince,
> >|
> >|> I don't think iSCSI can be completely relieved of performing
> >|some data
> >|> integrity checking as long as there exists the possibility
> >|of "middle boxes"
> >|> opening up the transport protocol's packet and thus
> >|potentially invalidating
> >|> any reliability guarantees the transport protocol makes.
> >|
> >|Any protection provided against this failure mode will only be
> >|transient, so we must temper the desire to introduce such a
> >|requirement with reality.
> >|
> >|Middleboxes can just as easily open up to the iSCSI layer and tinker
> >|with the payload, as they do with other ULPs running on TCP (e.g HTTP)
> >|today.  Short of securing the connection, there is ALWAYS a
> >|possibility of a middlebox terminating and reoriginating an integrity
> >|check.  In case you think this is a farfetched scenario, I do get the
> >|impression that there is a high level of interest in `actively
> >|middling' iSCSI once the specs crystallize.  Who shaves the barber?
> >|
> >|An integrity check is not necessary as long as some lower layer
> >|provides adequate integrity guarantees.
> >|
> >|Adding an integrity check above the transport layer is based upon
> >|documentation of the presence of a lot of crappy network hardware and
> >|software and analyses of the transport integrity check (TCP checksum)
> >|which suggests it might not be adequately strong against some such
> >|observed errors.
> >|
> >|I claim that the high incidence of `broken' (corruption introducing)
> >|components is a result of a variety of factors which have shaped the
> >|development of network components thus far.  The fact that integrity
> >|checks are assumed to be performed in a network context substantially
> >|lowers the bar for implementation correctness.
> >|
> >|In a storage (or CPU) context, these types of implementation errors
> >|are a) more easily detectable (more fatal) b) more carefully avoided
> >|during implementation (because of the cost of a potential fatal
> >|error).  If network components magically reached the same `quality
> >|level' as storage and CPU components, there might be no justification
> >|for additional integrity checks above the transport.  Similarly if the
> >|transport (or whatever lower layer) integrity checks are very strong
> >|(e.g. IPSec), there is, again, no need for a higher level integrity
> >|check.
> >|
> >|I am not disagreeing that we need an additional integrity check over
> >|TCP in the present target environment, but I do disagree that iSCSI
> >|will always need such a check, independently of what is running
> >|beneath it.
> >|
> >|Steph
> >|
> 
> -------------------------------------------------------------------
> Chip Sharp                       Consulting Engineering
> Cisco Systems
> -------------------------------------------------------------------

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Thu Apr 19 15:16:16 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16866
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 15:16:15 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JHB8W19762
	for ips-outgoing; Thu, 19 Apr 2001 13:11:08 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JHAaA19741
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 13:10:36 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP id 3EFA5623
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 10:10:35 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA02650
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 10:09:15 -0700 (PDT)
Message-ID: <3ADF1B7A.3B2A5414@cup.hp.com>
Date: Thu, 19 Apr 2001 10:08:10 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI : New PDU opcode usage in rev 5.92
References: <C1256A33.0026DB18.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------5C89862E6EBD1D4BAA80818C"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------5C89862E6EBD1D4BAA80818C
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

That's not a restriction other SCSI transports impose. FCP allows both
the initiator and target bit to be set in PRLI indicating dual mode
behaviour within a single I-T nexus. 

However, the point I'm trying to make is that the opcode cannot be
interpreted by itself and needs interpretation in the context of the
"role" the receiver is playing. i.e. Opcode parsing must be in
conjunction with the check of the receiver's role (is it receiving the
command while operating as a target or initiator ?)

Seems like such a description is missing in the draft.

- Santosh



julian_satran@il.ibm.com wrote:
> 
> Santosh,
> 
> On a given nexus the roles are static aren't they?
> 
> Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 19/04/2001 03:47:07
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   ips@ece.cmu.edu (ips)
> cc:
> Subject:  iSCSI : New PDU opcode usage in rev 5.92
> 
> Julian & All,
> 
> I've got a quick question on how the new opcode layouts would work for
> dual mode scsi implementations. (i.e. initiators that responded in
> target mode or targets that acted as initiators also).
> 
> The new opcode layout is :
> 
> ----------------
> X|I| | | | | | |
> ----------------
> 7 6 5 4 3 2 1 0
> 
> where bits 5-0 -> opcode
> X -> retry bit
> I -> immediate bit
> 
> The same values are used for the command as well as response opcodes and
> bits X & I are intended to both be set to 1 by targets.
> 
> i.e. opcode for scsi command = scsi response = 0x01. the distinction b/n
> command and response is based on targets setting X & I bits to 1.
> 
> Now, if an initiator [capable of target mode] sent the following
> commands, how would they be interpreted :
> 
> 1) 0xc4.
> is this a text command being retried in immediate mode,
> or is it a text response ?
> 
> 2) 0xc1
> is this a scsi command being retried in immediate mode,
> or is it a scsi response ?
> 
> 3) 0xc2
> is this a scsi task mgmt command being retried in immediate mode,
> or is it a scsi task mgmt response ?
> 
> etc.....
> 
> - Santosh
> 
> --
> #################################
> Santosh Rao
> Software Design Engineer,
> HP, Cupertino.
> email : santoshr@cup.hp.com
> Phone : 408-447-3751
> #################################
--------------5C89862E6EBD1D4BAA80818C--



From owner-ips@ece.cmu.edu  Thu Apr 19 15:18:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16916
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 15:18:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JI7AE22868
	for ips-outgoing; Thu, 19 Apr 2001 14:07:10 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JI5wA22787
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 14:05:59 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id UAA301202
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 20:05:47 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id UAA203376
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 20:05:47 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A33.0063661C ; Thu, 19 Apr 2001 20:05:42 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A33.00635A8B.00@d12mta02.de.ibm.com>
Date: Thu, 19 Apr 2001 20:10:20 +0200
Subject: RE: draft version 06 available at my site
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



It's my dyslexic finger day:

http://www.haifa.il.ibm.com/satran/ips

Julo

Arindam Paul <apaul@Zambeel.com> on 19/04/2001 19:26:32

Please respond to Arindam Paul <apaul@Zambeel.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:
Subject:  RE: draft version 06 available at my site




doesn't work still

-----Original Message-----
From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
Sent: Thursday, April 19, 2001 10:01 AM
To: ips@ece.cmu.edu
Subject: Re: draft version 06 available at my site




Brian,

Sorry - The site name is:

http://www.il.ibm.com/satran/ips

Julo

Brian Pawlowski <beepy@netapp.com> on 19/04/2001 16:39:16

Please respond to Brian Pawlowski <beepy@netapp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:
Subject:  Re: draft version 06 available at my site




I am not finding it.

>
> Dear colleagues,
>
> I've just placed 06 at http://www.haifa.il.com/satran/ips
>
> and submitted it to ID.
>
> Recovery is still "work in progress" and probably won't make it to
> Nashua.
>
> Only minor changes vs. 05-92:
>
> At David Black's suggestion I've added an explanation about how to use
> CmdSN with task management (in chapter 7).
>
> InDataOrder is now DataOrder - media that requires order will get it.
>
> Some typos (surely not all :-))
>
> Regards,
> Julo
>
>
>
>








From owner-ips@ece.cmu.edu  Thu Apr 19 16:38:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA18096
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 16:38:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JIYEL24404
	for ips-outgoing; Thu, 19 Apr 2001 14:34:14 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JIXpA24395
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 14:33:51 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id OAA08197; Thu, 19 Apr 2001 14:33:33 -0400 (EDT)
Message-ID: <3ADF2F23.50CAC421@cisco.com>
Date: Thu, 19 Apr 2001 13:32:03 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: ENDL_TX@computer.org
CC: julian_satran@il.ibm.com, Sandeep Joshi <sandeepj@research.bell-labs.com>,
        IPS <ips@ece.cmu.edu>
Subject: Re: target discovery issue
References: <C1256A33.003358D4.00@d12mta05.de.ibm.com> <3ADEDFC7.2B926B74@compuserve.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

(This is a re-send of another message about my wanting to
have an async event.  My apologies to those who have received
it twice).

The background is that Ralph mentioned that from an
architectural point-of-view, targets are not supposed to
know about each other, and that the out-of-band discovery
method should really be used instead.


That's technically correct; given that, we should probably
not send such an event on a connection to a specific target.
That leaves the canonical target connection as the "discovery
port", that could send events about other targets.  Since the
canonical target is an iSCSI-level thing, rather than SCSI-level
thing, there's nothing layer-violating about it.

I agree that in an iSNS-managed environment, iSNS' event
mechanism is the way to do this, and this event would not be
needed.  The same MAY be true for an SLP environment, depending
on where we end up with the proposed notification mechanism
from RFC 3082.  I'm not completely sure yet on that one.

That leaves us with one particular environment that I am
concerned our discovery methods don't completely cover:

  Let's say that a pair of iSCSI gateways connect Storage
  Service Provider A with Customer B.  The gateways are
  configured with each other's IP addresses; and may run
  over a private connection or secured tunnel.  In this
  case, the Customer does not need to discover targets at
  *new* addresses; it just needs to discover additional
  targets created behind the SSP's gateway.  If the customer
  gateway keeps a connection open to the SSP gateway's
  canonical target, it could receive notifications of new
  targets behind the gateway without having to deploy
  separate discovery protocols such as iSNS and SLP over
  this network.  It would make the tunneling model much
  easier in this case to keep it in-band. 

In the normal-case environment, where there are a bunch of
initiators wanting to discover a bunch of targets, it would
be better to use SLP or iSNS to do this.  They could even
be used to do this in the gateway-to-gateway case.  However,
I don't think that running separate, out-of-band discovery
protocols in this environment will be happily accepted by
the end customers, who would rather deal with fewer protocols
to firewall and tunnel.

At that rate, we could just say that the event is sent ONLY
on connections to the canonical target; initiators using the
other methods of discovery would not have to keep a canonical
target connection open for this, and would not have to implement
support for the event.  Targets could support sending the
event only if it was felt worthwhile, such as in a gateway
implementation.
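
A minimal sketch of that rule on the target (or gateway) side, in C.
The names struct iscsi_conn and select_event_conns are hypothetical
and come from no draft; the sketch only assumes the device can tell
which of its connections are to the canonical target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-connection state kept by a target or gateway. */
struct iscsi_conn {
    int  id;                   /* illustrative connection handle */
    bool to_canonical_target;  /* true if logged in to the canonical target */
};

/*
 * Copy the ids of the connections that should get the "target
 * discovery" async event into out_ids; return how many were selected.
 * Only canonical-target connections qualify.
 */
static size_t select_event_conns(const struct iscsi_conn *conns, size_t n,
                                 int *out_ids, size_t out_cap)
{
    size_t count = 0;

    assert(conns != NULL || n == 0);
    assert(out_ids != NULL || out_cap == 0);
    for (size_t i = 0; i < n && count < out_cap; i++) {
        if (conns[i].to_canonical_target)
            out_ids[count++] = conns[i].id;
    }
    return count;
}
```

Connections doing data-path work are never selected, so initiators
that use SLP or iSNS never see the event.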

This could also be done as a vendor-unique event, but there
is no defined way for a vendor to add async event messages
in iSCSI, and it would be better to just define the event and
have it used only where necessary.  We can update the description
of the event to make its use more clear.

Does that work better?

--
Mark

Ralph Weber wrote:
> 
> Gentlemen,
> 
> Target discovery is not covered in SAM so this area
> is one where I have no official standing.  That said,
> I will stick my oar in just this once.
> 
> I have more than a little difficulty seeing how
> one iSCSI portal can usefully notify logged in
> initiators of new targets appearing.  The new
> targets the initiator cares about could easily
> be coming available through an iSCSI portal
> where the initiator has not yet logged in.
> 
> If the iSCSI portal is to notify an entity of
> new target creation, it seems to me that the
> entity notified should be the iSNS server.
> Then initiators that want to be notified of
> new targets would register for such notifications
> with the iSNS server, who would be the
> clearinghouse for such activities.
> 
> Just my $0.02.
> 
> Thanks.
> 
> Ralph...

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Thu Apr 19 16:38:53 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA18110
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 16:38:51 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JIoLL25529
	for ips-outgoing; Thu, 19 Apr 2001 14:50:21 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JIncA25471
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 14:49:38 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3JJvL121305;
	Thu, 19 Apr 2001 12:57:21 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Douglas Otis" <dotis@sanlight.net>, "Lloyd Wood" <L.Wood@surrey.ac.uk>
Cc: <ips@ece.cmu.edu>, <tsvwg@ietf.org>
Subject: RE: [Tsvwg] [SCTP checksum problems]
Date: Thu, 19 Apr 2001 11:47:31 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJEEIOCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <NEBBJGDMMLHHCIKHGBEJEEIHCGAA.dotis@sanlight.net>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

All,

A few corrections to this post.

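   #define BASE 65521
   unsigned short s1 = 0x5555;   /* 16 bits wide, so the sum wraps */
   unsigned s2 = 0;
   unsigned short *dat_buf;      /* points at the data to be summed */

   while (length >= 2)           /* length in bytes, assumed even */
     {
     s1 += ntohs(*dat_buf++);    /* 16 bit summing */
     s2 += s1;
     if (s2 >= BASE)             /* Adler modulo for s2 only  */
       s2 -= BASE;
     length -= 2;
     }
   return (s2 << 16) | s1;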

The 32 bit CRC table for an 8 bit lookup would be 1k byte in size and not
512 bytes.
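
To make the table-size arithmetic concrete, here is a minimal sketch.
It assumes the common reflected CRC-32 polynomial 0xEDB88320, since
the post does not say which polynomial is meant, and the function
names crc32_init_table and crc32_bytes are illustrative only.  A
byte-wise lookup needs 256 entries of 32 bits each, so
256 * 4 = 1024 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 256 entries x 4 bytes = 1024 bytes (1 KB) for a byte-wise lookup. */
static uint32_t crc32_table[256];

/* Build the table for the reflected polynomial 0xEDB88320 (assumed). */
static void crc32_init_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int bit = 0; bit < 8; bit++)
            c = (c & 1u) ? (c >> 1) ^ 0xEDB88320u : (c >> 1);
        crc32_table[i] = c;
    }
}

/* Table-driven loop in the style of the quoted message below. */
static uint32_t crc32_bytes(const unsigned char *dat_buf, size_t length)
{
    uint32_t crc_syn = 0xffffffffu;

    assert(dat_buf != NULL || length == 0);
    while (length--)
        crc_syn = (crc_syn >> 8) ^ crc32_table[(crc_syn & 0xffu) ^ *dat_buf++];
    return crc_syn ^ 0xffffffffu;
}
```

After calling crc32_init_table() once, crc32_bytes over the nine
ASCII bytes "123456789" gives 0xCBF43926, the usual CRC-32 check
value for this polynomial.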

> Lloyd,
>
> I am aware of efforts to compare CRC with Adler-32.  CRC is primarily
> aimed at providing burst error detection, but if you are trying various
> techniques, this modification may be interesting.
>
>   #define BASE 65521
>   unsigned s1 = 0x5555;
>   unsigned s2 = 0;
>   unsigned short dat_buf*
>
>   while (length -= 2)
>     {
>     s1 += ntoh(*dat_buf++);  /* 16 bit summing */
>     s2 += s1;
>     if (s2 >= BASE)          /* Adler modulo for s2 only  */
>       s2 -= BASE;
>     }
>   return (s2 << 16) | (s1 & 0xffff);
>
> This would exercise more bits for small packets, improve burst error
> sensitivity and trade a modulo function for a network to host swap in some
> cases.  It seems to become a comparison against burst errors vs. missing
> segment and stuck bit sensitivity.
>
> The alternative code for CRC would look something like this.
>
> Here is an example using a 256 entry (512 byte) table.
>
>   unsigned char* dat_buf;
>   unsigned long crc_syn = 0xffffffff;
>
>   while(length--)
>     crc_syn = (crc_syn >> 8) ^ crc32_table[(crc_syn & 0xff) ^ *dat_buf++];
>
>   return (crc_syn ^ 0xffffffff);
>
>
> Doug



From owner-ips@ece.cmu.edu  Thu Apr 19 16:39:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA18122
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 16:39:10 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JIUXc24211
	for ips-outgoing; Thu, 19 Apr 2001 14:30:33 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JITYA24124
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 14:29:35 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id OAA03376; Thu, 19 Apr 2001 14:29:15 -0400 (EDT)
Message-ID: <3ADF2E22.43BACB44@cisco.com>
Date: Thu, 19 Apr 2001 13:27:46 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: Sandeep Joshi <sandeepj@research.bell-labs.com>
CC: hufferd@us.ibm.com, jtseng@NishanSystems.com, ips@ece.cmu.edu
Subject: Re: iSCSI: target discovery issue
References: <200104190359.XAA29944@aura.research.bell-labs.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Sandeep-

The reason I had wanted the iSCSI-level message on the canonical
target connection was to allow the use of in-band discovery for
a gateway-to-gateway connection.  You are right; in most cases,
iSNS or SLP will be there, but in the gateway-to-gateway
configuration, it is not usually desirable to have to tunnel
additional protocols through firewalls and tunneling devices.

How about if we change the statement to a MAY instead of a MUST
(only gateways would implement it, and initiators can ignore it)?
I could add to the description to say that it need not be implemented
on the actual hosts and devices themselves, which should use
one of the other mechanisms.

I really would like to use an async event to do this in-band, and
iSCSI provides no method to do vendor-unique async events.

--
Mark


Sandeep Joshi wrote:
> 
> Josh & John,
> 
> Thanks for the hi-level perspective.  I do agree that iSNS
> and/or SLP will do the trick here and that this functionality
> falls into the OA&M domain.
> 
> Four score and seven days ago (..approx!), this thread started due
> to confusion with the following line in the naming and discovery
> draft, which had no equivalent message code defined in the iSCSI
> Async PDU.
> 
> > 1) Section 4.2 last line before Section 4.2.1
> >    "the target MUST send any iSCSI-level async on the canonical
> >    session, to allow the initiator to discover new targets as
> >    they are created.."
> 
> Can the issue be laid to rest by removing this statement ?
> 
> Thanks,
> -Sandeep
> 
> > John, Sandeep,
> >
> > I would like to append to Mark's note that method D is
> > the iSNS.  The iSNS breaks from the device-by-device
> > management paradigm that the previous methods use.  It
> > provides for network-wide storage device discovery,
> > zoning and management.
> >
> > The details of how iSNS works are documented in the
> > iSNS document, and an overview is in the iSCSI N&D
> > requirements document.  But the key concept is that
> > instead of going to device A, configure its access
> > list, then on to device B, configure its access list,
> > then on to initiator A, configure its list of targets,
> > etc....you instead go to a single entity, the iSNS
> > server, to gain a network-wide view of all storage
> > assets.  If all storage devices have slaved their
> > discovery and management functions to the iSNS server,
> > then the iSNS is a single management point that the
> > GUI can use to configure discovery and access privileges
> > for the entire storage network.
> >
> > When a new target shows up on the network, it registers
> > in the iSNS.  The iSNS server then sends notifications to
> > interested iSNS clients (only those configured for this
> > notification with the proper zoning) informing them of the
> > new device. The iSNS client is a co-resident application
> > on iSCSI targets and initiators that maintains communication
> > with the iSNS server.
> >
> > Hope this helps.
> >
> > Josh
> >
> > > -----Original Message-----
> > > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > > Sent: Wednesday, April 18, 2001 3:08 PM
> > > To: Sandeep Joshi
> > > Cc: ips@ece.cmu.edu
> > > Subject: iSCSI: target discovery issue
> > >
> > >
> > >
> > > First off you need to understand that we are talking about
> > > Targets, NOT LUs
> > > nor LUNs.
> > >
> > > Next this is a feature that you want to have in your
> > > management SW, and I
> > > believe that iSNS can help here. Josh Tseng pipe in here.
> > >
> > > Further, storage events don't usually just happen that
> > > everyone needs to
> > > know about it.  As a rule, storage is brought on to meet some
> > > requirement.
> > > The requirement is usually needed by a specific host.  Just
> > > because you add
> > > a Storage Controller does NOT mean that all the various host
> > > should start
> > > using the storage.  The storage is first established as LUs
> > > and the LUs are
> > > assigned to specific authorized Host (or iSCSI Nodes as we
> > > are calling them
> > > now) and when the session is started, and authenticated, that Host can
> > > issue the Report LUNs and that maybe the first time an LU has
> > > been given a
> > > Number for that particular Host.   But none of that should
> > > begin until the
> > > Host is sure that there is something for it to find.  And the thing it
> > > wants to find is a LUN not a SCSI Device/iSCSI Node or
> > > Target.  Getting
> > > knowledge of a new LU that can be used by a specific host is
> > > NOT an iSCSI
> > > thing.  It is SCSI, and very deep into the Management SW and Admin
> > > Processes.
> > >
> > > I punished you with this information so that you can see why I did not
> > > understand why you were concerned about a Host being
> > > automatically told
> > > about a new SCSI Device/ iSCSI Node/Target.  I would expect
> > > that if the
> > > administrator had gone to the work to bring in the storage controller,
> > > configure it for specific Hosts to use, set up the various LUs to be
> > > authorized for the various Hosts.  It seems reasonable for
> > > the Admin to ask
> > > the Host to recycle its discovery functions.
> > > .
> > > .
> > > .
> > > John L. Hufferd
> > > Senior Technical Staff Member (STSM)
> > > IBM/SSG San Jose Ca
> > > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > > Internet address: hufferd@us.ibm.com
> > >
> > >
> > > Sandeep Joshi <sandeepj@research.bell-labs.com>@ece.cmu.edu
> > > on 04/18/2001
> > > 11:21:34 AM
> > >
> > > Sent by:  owner-ips@ece.cmu.edu
> > >
> > >
> > > To:   ips@ece.cmu.edu
> > > cc:
> > > Subject:  Re: target discovery issue
> > >
> > >
> > >
> > >
> > > just realized that the reflector is not seeing this discussion.
> > > the question at hand is how should target discovery notification
> > > be sent in the iSCSI world.
> > >
> > > Mark Bakke wrote:
> > > >
> > > > Actually, I think that part of it is an iSCSI issue.  That is,
> > > > if a new target is created, that's at the SCSI level.  But if I
> > > > add an iSCSI address on which to access that target, it now must
> > > > be discovered first by the iSCSI layer on the host, before it
> > > > can be presented to the SCSI layer.  In this case, we would need
> > > > to send an iSCSI event indicating that there is a new target (or
> > > > at least that there is some change in availability of targets);
> > > > the host would then use SendTargets to find out the specifics.
> > > >
> > > > This brings up Sandeep's question #2.  If I am a target, I can
> > > > send this message either:
> > > >
> > > >  a) On every iSCSI connection
> > > >
> > > >   OR
> > > >
> > > >  b) On all connections to canonical targets
> > > >
> > > > Method a gives us better coverage, and does not require an
> > > > initiator to keep its canonical target connection around in
> > > > between these little sendtargets commands.  However, if an
> > > > initiator logs into a canonical target, finds that it has no
> > > > targets to connect to (yet), and one is added later, the
> > > > initiator would only find out if it had kept its canonical
> > > > target connection, unless it is using an out-of-band discovery
> > > > mechanism.
> > > >
> > > > Method a will also tend to bother connections to targets
> > > > that are doing the "real" work (data path stuff).
> > > >
> > > > Method b will keep these events away from the data path, and
> > > > will not generally have to send so many events.  However, it
> > > > would require each initiator that wanted to be notified to keep
> > > > its canonical connection around.
> > > >
> > > > There is a Method C, which is a combination of the above:
> > > >
> > > >  c) The device will send this async event message on ONE of the
> > > >     connections to each initiator name (formerly WWUI) that is
> > > >     connected to it.  If one of these connections is to the
> > > >     canonical target, the device will use that one.
> > > >
> > > > Method c allows the initiator to choose whether it would rather
> > > > keep an explicit canonical target connection around (e.g. if the
> > > > other connections have been pushed down to hardware), or whether
> > > > it would rather not keep the connection around, and be notified
> > > > on one of the others.  The number of messages sent by targets
> > > > would be identical to that in method b.
> > > >
> > > > --
> > > > Mark
> > > >
> > > > julian_satran@il.ibm.com wrote:
> > > > >
> > > > > Sandeep,
> > > > >
> > > > > I think we are deep in T10 territory - this is a SCSI issue.
> > > > >
> > > > > Julo
> > > > >
> > > > > Sandeep Joshi <sandeepj@research.bell-labs.com> on
> > > 18/04/2001 16:21:06
> > > > >
> > > > > Please respond to Sandeep Joshi <sandeepj@research.bell-labs.com>
> > > > >
> > > > > To:   Julian Satran/Haifa/IBM@IBMIL, mbakke@cisco.com
> > > > > cc:
> > > > > Subject:  target discovery issue
> > > > >
> > > > > Julian & Mark,
> > > > >
> > > > > Friendly reminder... the issue mentioned below may not
> > > > > have been resolved.
> > > > >
> > > > > 1) Is target discovery going to be the SCSI event or will
> > > > >    it be wrapped up as an iSCSI event ?
> > > > > 2) Do we have to keep a session to the canonical target
> > > > >    always open to be able to do target discovery?
> > > > >
> > > > > -Sandeep
> > > > >
> > > > > > I am not sure.  There are some SCSI items in it too (SCSI
> > > > > > now handles the appearance of new LUs).
> > > > > >
> > > > > > I will need a longer discussion with NDT to understand
> > > > > > the semantics.
> > > > >
> > > > > Julo
> > > > >
> > > > > Sandeep Joshi <sandeepj@research.bell-labs.com> on
> > > 12/03/2001 22:25:27
> > > > >
> > > > > Julian,
> > > > >
> > > > > in case you skip this one..
> > > > > your response is required on point (1) for amending iSCSI draft.
> > > > >
> > > > > -sandeep
> > > > >
> > > > > Sandeep-
> > > > >
> > > > > The problem you pointed out in item number 1 creates the need
> > > > > for an additional iSCSI-level event.  Since the discovery of
> > > > > targets happens at the iSCSI level, rather than at the SCSI
> > > > > level, how about adding this to 2.18.1 (in iSCSI-05)?
> > > > >
> > > > >   4   Network entity indicates that a "target discovery" event
> > > > >       has occurred.
> > > > >
> > > > > Upon receiving this message, the initiator should use SendTargets,
> > > > > or whatever other methods of discovery it is using, to find out
> > > > > what has changed.  Usually, this would be due to adding a new
> > > > > target.
> > > > >
> > > > > We will fix items 2-4; thanks for pointing them out.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Mark
> > > > >
> > > > > Sandeep Joshi wrote:
> > > > > >
> > > > > > 1) Section 4.2 last line before Section 4.2.1
> > > > > >     "target MUST send any iSCSI-level async on this session,
> > > > > >      allowing the initiator to discover new targets.."
> > > > > >
> > > > > > >    The session mentioned here is a session to the canonical target.
> > > > > > >
> > > > > > >    However, the iSCSI 05 draft does not mention any such condition
> > > > > > >    in Sec 2.18 on Async Message.   In there, a SCSI event (note: not
> > > > > > >    iSCSI) is used to notify availability of new targets.
> > > > > >

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Thu Apr 19 17:38:39 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA18804
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 17:38:38 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JGu9618963
	for ips-outgoing; Thu, 19 Apr 2001 12:56:09 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JGtuA18941
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 12:55:57 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id SAA286764
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 18:55:50 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id SAA147272
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 18:55:49 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A33.005CFE9E ; Thu, 19 Apr 2001 18:55:44 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A33.005CFE6E.00@d12mta02.de.ibm.com>
Date: Thu, 19 Apr 2001 19:00:50 +0200
Subject: Re: draft version 06 available at my site
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Brian,

Sorry - The site name is:

http://www.il.ibm.com/satran/ips

Julo

Brian Pawlowski <beepy@netapp.com> on 19/04/2001 16:39:16

Please respond to Brian Pawlowski <beepy@netapp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:
Subject:  Re: draft version 06 available at my site




I am not finding it.

>
> Dear colleagues,
>
> I've just placed 06 at http://www.haifa.il.com/satran/ips
>
> and submitted it to ID.
>
> Recovery is still "work in progress" and probably won't make it to
> Nashua.
>
> Only minor changes vs. 05-92:
>
> At David Black's suggestion I've added an explanation about how to use
> CmdSN with task management (in chapter 7).
>
> InDataOrder is now DataOrder - media that requires order will get it.
>
> Some typos (sure not all -:))
>
> Regards,
> Julo
>
>
>
>






From owner-ips@ece.cmu.edu  Thu Apr 19 19:07:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA19513
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 19:07:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JKmXq02044
	for ips-outgoing; Thu, 19 Apr 2001 16:48:33 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JKlhA02017
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 16:47:43 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3JLtC121402;
	Thu, 19 Apr 2001 14:55:12 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Tsvwg" <tsvwg@ietf.org>
Cc: "Ips" <ips@ece.cmu.edu>
Subject: [tsvwg][ips] TCP framing 
Date: Thu, 19 Apr 2001 13:45:22 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJMEJBCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: multipart/mixed;
	boundary="----=_NextPart_000_0015_01C0C8D6.FAE2D4A0"
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.

------=_NextPart_000_0015_01C0C8D6.FAE2D4A0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

All,

Attached is a proposal to allow framing compatible with SCTP. It is very
small.

Doug


------=_NextPart_000_0015_01C0C8D6.FAE2D4A0
Content-Type: text/plain;
	name="draft-otis-tcp-framing-00.txt"
Content-Disposition: attachment;
	filename="draft-otis-tcp-framing-00.txt"
Content-Transfer-Encoding: quoted-printable

   Internet Draft                                     Douglas Otis
   Expires October 2001                                   SANlight
                                                    April 19, 2001

                      TCP Framing Header
                 draft-otis-tcp-framing-00.txt

   Status of this Memo


   This document is an Internet-Draft and is in full conformance
   with all provisions of Section 10 of RFC2026 [1].

   Internet-Drafts are working documents of the Internet
   Engineering Task Force (IETF), its areas, and its working
   groups.  Note that other groups may also distribute working
   documents as Internet-Drafts. Internet-Drafts are draft
   documents valid for a maximum of six months and may be
   updated, replaced, or made obsolete by other documents at any
   time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt
   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   Abstract

   This document defines a header structure suitable for
   implementing a framing structure that encapsulates SCTP (RFC 2960)
   within TCP (RFC 793).  This provides the following:

    - Frame Validation within Frame Header.
    - Additional Frame Error Checking.
    - Both Fixed and Variable Length Framing.
    - Identifies Payloads and incorporates SCTP Conventions.


   Introduction

   This framing scheme incorporates SCTP within TCP.  Unlike TCP, SCTP
   is a framing protocol; this TCP framing scheme can be employed to
   create a natural transition to SCTP.

   To allow a generic method of framing records within TCP byte streams,
   the following two constructs are used: Fixed Interval and Variable
   Interval header placement.  In the Fixed Interval mode, the actual
   SCTP frame size is indicated in the Pseudo-Frame Length, with the
   remainder padded to this fixed interval.  The Fixed Interval is set
   by the 32-bit byte length Fixed Interval parameter within the
   Initiation (INIT) Chunk as assigned by IANA.  In the Variable
   Interval mode, the Pseudo-Frame Length indicates the byte
   displacement to the next Modified SCTP Common Header.
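
   The two placement rules can be sketched in C as below.  This is an
   illustration under stated assumptions, not part of the proposal:
   offsets are taken to be measured from the start of the current
   header, and the names tcpf_next_header, TCPF_MODE_FIXED and
   TCPF_MODE_VARIABLE are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TCPF_MODE_FIXED = 0, TCPF_MODE_VARIABLE = 1 };

/*
 * Given the stream offset of the current header (cur), compute the
 * offset of the next header and the number of padding bytes.
 * Assumption: displacements count from the start of the current header.
 * Returns 0 on success, -1 if the inputs are inconsistent.
 */
static int tcpf_next_header(unsigned mode, uint64_t cur, uint32_t pf_len,
                            uint32_t fixed_interval,
                            uint64_t *next, uint32_t *pad)
{
    assert(next != NULL && pad != NULL);
    switch (mode) {
    case TCPF_MODE_FIXED:
        /* Frame must fit in the interval; the remainder is padding. */
        if (fixed_interval == 0 || pf_len > fixed_interval)
            return -1;
        *next = cur + fixed_interval;
        *pad  = fixed_interval - pf_len;
        return 0;
    case TCPF_MODE_VARIABLE:
        /* Length is the displacement itself; zero would never advance. */
        if (pf_len == 0)
            return -1;
        *next = cur + pf_len;
        *pad  = 0;
        return 0;
    default:
        return -1;   /* modes 2 - 15 are reserved */
    }
}
```

   With a Fixed Interval of 1460 bytes and a Pseudo-Frame Length of
   1000, the next header is 1460 bytes on and 460 bytes are padding.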

   Modified SCTP Common Header Format

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Mode  | Ver   |         Pseudo-Frame Length in Bytes          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Verification Tag                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Checksum                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Mode field identifies the possible framing mechanisms used.
     0 = Fixed Interval Modified SCTP Common Header Placement
     1 = Variable Interval Modified SCTP Common Header Placement
     2 - 15 = Reserved.

   The Ver field identifies different levels of compatibility.
   Presently this field is defined as 0.  Verification Tag and Checksum
   are defined within the SCTP specification.  The flow control schemes
   and physical framing are determined by TCP.
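
   As a hedged illustration of the layout above, the sketch below
   unpacks the 12-byte header in C.  It assumes network byte order and
   that Mode occupies the high-order four bits of the first byte, as
   the bit numbering in the diagram suggests; struct tcpf_hdr,
   rd_be32 and tcpf_parse are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCPF_HDR_LEN 12u   /* three 32-bit words */

struct tcpf_hdr {
    uint8_t  mode;      /* 0 = fixed interval, 1 = variable interval */
    uint8_t  ver;       /* presently 0 */
    uint32_t pf_len;    /* Pseudo-Frame Length in bytes (24 bits) */
    uint32_t vtag;      /* SCTP Verification Tag */
    uint32_t checksum;  /* SCTP Checksum */
};

/* Read a 32-bit big-endian (network order) value. */
static uint32_t rd_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is short or Mode is reserved. */
static int tcpf_parse(const uint8_t *buf, size_t len, struct tcpf_hdr *h)
{
    uint32_t w0;

    assert(buf != NULL && h != NULL);
    if (len < TCPF_HDR_LEN)
        return -1;
    w0          = rd_be32(buf);
    h->mode     = (uint8_t)(w0 >> 28);
    h->ver      = (uint8_t)((w0 >> 24) & 0x0fu);
    h->pf_len   = w0 & 0x00ffffffu;
    h->vtag     = rd_be32(buf + 4);
    h->checksum = rd_be32(buf + 8);
    return (h->mode > 1) ? -1 : 0;
}
```

   The checksum is only extracted here; verifying it is left to the
   SCTP rules the header refers to.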

   An illustrative example of Fixed Interval Header Placement:

    | Ethernet Frame |   |         |      |      |        |
    |  Pseudo-Frame  |   Pseudo-Frame    |  Pseudo-Frame  |
    |xxxxxxxxxxx00000|xxxxxxxxxxxxxxxxxx0|xxxxxxxxxxxxxxxx|
    [Chunk|Chunk|----[ Chunk |   Chunk   [        | Chunk [...

   [ = Modified SCTP Common Header
   0 = Padding
   x = payload
   | = boundary

   The reasons for using either fixed or variable length modes and their
   interaction with Upper Layer Protocols are beyond the scope of this
   document.




   Author's Address:

         Douglas Otis
         SANlight Inc.
         160 Saratoga Ave, #40
         Santa Clara, CA 95051
         Tel: (408) 260-1400 x2
         dotis@sanlight.net

   Full Copyright Statement
   Copyright (C) The Internet Society (2000).
     All Rights Reserved.

   This document and translations of it may be copied and
   furnished to others, and derivative works that comment on or
   otherwise explain it or assist in its implementation may be
   prepared, copied, published and distributed, in whole or in
   part, without restriction of any kind, provided that the
   above copyright notice and this paragraph are included on all
   such copies and derivative works.

   However, this document itself may not be modified in any way,
   such as by removing the copyright notice or references to the
   Internet Society or other Internet organizations, except as
   needed for the purpose of developing Internet standards in
   which case the procedures for copyrights defined in the
   Internet Standards process must be followed, or as required
   to translate it into languages other than English.

   The limited permissions granted above are perpetual and will
   not be revoked by the Internet Society or its successors or
   assigns.  This document and the information contained herein
   is provided on an "AS IS" basis and THE INTERNET SOCIETY AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY
   THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY
   RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
   FITNESS FOR A PARTICULAR PURPOSE.


------=_NextPart_000_0015_01C0C8D6.FAE2D4A0--



From owner-ips@ece.cmu.edu  Thu Apr 19 19:07:51 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA19531
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 19:07:50 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JLUI004239
	for ips-outgoing; Thu, 19 Apr 2001 17:30:18 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JLTRA04187
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 17:29:27 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT3TCZG>; Thu, 19 Apr 2001 17:30:53 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A070801546B@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net
Cc: ips@ece.cmu.edu
Subject: TCP framing and SCTP Checksum: TSVWG only
Date: Thu, 19 Apr 2001 17:29:18 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Both TCP framing and SCTP checksums are topics
specific to TSVWG - please post on these only
to the tsvwg list and do not cross-post to the
ips list, as we already have more than enough
traffic.

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Thu Apr 19 19:10:27 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA19579
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 19:10:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JLuQH05514
	for ips-outgoing; Thu, 19 Apr 2001 17:56:26 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sj-msg-core-2.cisco.com (sj-msg-core-2.cisco.com [171.69.43.88])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JLPbA04041
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 17:25:37 -0400 (EDT)
Received: from mira-sjc5-2.cisco.com (mira-sjc5-2.cisco.com [171.71.163.16])
	by sj-msg-core-2.cisco.com (8.9.3/8.9.1) with ESMTP id OAA20645;
	Thu, 19 Apr 2001 14:21:33 -0700 (PDT)
Received: from cisco.com (rtp-dial-2-159.cisco.com [10.83.96.159])
	by mira-sjc5-2.cisco.com (Mirapoint)
	with ESMTP id ACX00962 (AUTH rrs);
	Thu, 19 Apr 2001 14:20:58 -0700 (PDT)
Message-ID: <3ADF56B7.A85B5A19@cisco.com>
Date: Thu, 19 Apr 2001 16:20:55 -0500
From: Randall Stewart <rrs@cisco.com>
X-Mailer: Mozilla 4.73 [en]C-CCK-MCD   (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Qiaobing Xie <xieqb@cig.mot.com>
CC: tsvwg@ietf.org, Chip Sharp <chsharp@cisco.com>, vince_cavanna@agilent.com,
        steph@cs.uchicago.edu, jim_wendt@hp.com, julian_satran@il.ibm.com,
        ips@ece.cmu.edu, craig@aland.bbn.com, Jonathan.Wood@sun.com,
        "Qiaobing.Xie" <Qiaobing_Xie-QXIE1@email.mot.com>,
        jonathan@dsg.stanford.edu
Subject: Re: [Tsvwg] [SCTP checksum problems]
References: <4.3.2.7.2.20010419123133.01d8a210@dogwood.cisco.com> <200104192024.PAA09924@agevole.cig.mot.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Q:

Good thoughts except for one minor detail... see below...

Qiaobing Xie wrote:

> Hi, all,
>
> Please correct me if I am under the wrong impression, but what I've been
> hearing so far from the discussion in this thread seems to be:
>
>  1) There is uncertainty on what error model should be used for
>     evaluating transport error check mechanisms;
>
>  2) iSCSI level integrity/authentication seems inevitable no matter how
>     strong the transport error check is;
>
>  3) Some uncertainty about the implementation cost (hw and sw) on some
>     of the proposed CRC-based schemes, but almost certainly any
>     CRC-based scheme will be more expensive than checksum schemes;
>
>  4) Discussion of requirements has been so far very much limited to
>     iSCSI data transport, and it seems that a CRC-based scheme is
>     strongly desirable for iSCSI case _if_ the transport error check
>     alone is all we will get to meet the iSCSI requirements.
>
> Now my question to the group is: if a weaker but much cheaper SCTP error
> checking mechanism can be found to be sufficient to the majority of the
> applications (remember SCTP is a general purpose transport), and the
> ultra-low error rate applications such as iSCSI will eventually rely on
> their own integrity check, should we adopt a selection approach which
> puts more emphasis on the ease-of-implementation side?
>
> (Maybe after all the Adler-32 in the current RFC2960 is good enough :-)
> )
>

The major weakness of Adler-32 (if I understand Jonathan and Craig
correctly) is that in short messages (very common in signalling) the
upper two bytes of the checksum are not being incremented very often.
This is due to the 8-bit adds happening at the input of data. I am not
sure of the strength of this sum, but it is reportedly weaker than the
TCP checksum (if I have heard right). A SIMPLE fix to this is to change
the adds to 16-bit increments, so input to the checksum is an array of
16-bit ints instead of 8-bit ints. This halves the work you must do to
produce the checksum and causes the upper bits to be incremented better.
So, I think no matter what we do, CRC or checksum, a change is needed.
I think that, to take the minimalist approach, changing the summing to
16 bits is the way to go. Other options, such as a 32-bit TCP-type sum,
scare me a bit since they will require a few more operations in any
machine that does not have a 64-bit accumulator (which is many, many,
many small devices).
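
As a rough illustration of the short-message concern, here is a minimal
Python sketch of standard Adler-32 with byte-wide input. The helper names
are made up for this example. For a 16-byte message neither running sum
ever reaches the modulus, so much of the 32-bit range goes unused.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16, as used by Adler-32


def adler32_parts(data: bytes) -> tuple[int, int]:
    """Return the two 16-bit halves (S1, S2) of standard Adler-32."""
    s1, s2 = 1, 0
    for b in data:
        s1 = (s1 + b) % MOD_ADLER   # S1: running sum of the input bytes
        s2 = (s2 + s1) % MOD_ADLER  # S2: running sum of the S1 values
    return s1, s2


def adler32(data: bytes) -> int:
    """Combine the halves the usual way: S2 in the high word, S1 in the low."""
    s1, s2 = adler32_parts(data)
    return (s2 << 16) | s1


# Worst case for a 16-byte message: every byte is 0xFF.
short = b"\xff" * 16
s1, s2 = adler32_parts(short)
print(f"S1 = {s1} (max possible for 16 bytes), modulus = {MOD_ADLER}")
print(f"S2 = {s2}")
```

S1 can grow by at most 255 per byte, so it cannot wrap until the message
is at least 257 bytes long. Feeding 16-bit words instead lets S1 grow by
up to 65535 per step.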

I see the real issue here is that we can choose either a CRC-32 variant
or a modified Adler.

R




>
> regards,
> -Qiaobing



From owner-ips@ece.cmu.edu  Thu Apr 19 21:29:30 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA20618
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 21:29:28 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JLtKS05422
	for ips-outgoing; Thu, 19 Apr 2001 17:55:20 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from motgate2.mot.com (motgate2.mot.com [136.182.1.10])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JKQ0A00736
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 16:26:00 -0400 (EDT)
Received: [from mothost.mot.com (mothost.mot.com [129.188.137.101]) by motgate2.mot.com (motgate2 2.1) with ESMTP id NAA22689; Thu, 19 Apr 2001 13:24:46 -0700 (MST)]
Received: [from relay2.cig.mot.com (relay2.cig.mot.com [136.182.15.24]) by mothost.mot.com (MOT-mothost 2.0) with ESMTP id NAA11029; Thu, 19 Apr 2001 13:24:46 -0700 (MST)]
Received: from agevole.cig.mot.com (agevole [136.182.3.251]) by relay2.cig.mot.com (8.9.0/SCERG-RELAY-1.11b) with ESMTP id PAA06027; Thu, 19 Apr 2001 15:24:45 -0500 (CDT)
Received: from cig.mot.com (d42-506b.cig.mot.com [160.15.80.107]) by agevole.cig.mot.com (8.7.5 Motorola CIG/ITS v1.1 (Solaris 2.5)) with ESMTP id PAA09924; Thu, 19 Apr 2001 15:24:44 -0500 (CDT)
Message-Id: <200104192024.PAA09924@agevole.cig.mot.com>
Date: Thu, 19 Apr 2001 15:26:53 -0500
From: Qiaobing Xie <xieqb@cig.mot.com>
X-Mailer: Mozilla 4.75 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: tsvwg@ietf.org
CC: Chip Sharp <chsharp@cisco.com>, vince_cavanna@agilent.com,
        steph@cs.uchicago.edu, jim_wendt@hp.com, julian_satran@il.ibm.com,
        ips@ece.cmu.edu, craig@aland.bbn.com, Jonathan.Wood@sun.com,
        "Qiaobing.Xie" <Qiaobing_Xie-QXIE1@email.mot.com>,
        jonathan@dsg.stanford.edu, rrs@cisco.com
Subject: Re: [Tsvwg] [SCTP checksum problems]
References: <4.3.2.7.2.20010419123133.01d8a210@dogwood.cisco.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Hi, all,

Please correct me if I am under the wrong impression, but what I've been
hearing so far from the discussion in this thread seems to be:

 1) There is uncertainty on what error model should be used for
    evaluating transport error check mechanisms;

 2) iSCSI level integrity/authentication seems inevitable no matter how
    strong the transport error check is;

 3) Some uncertainty about the implementation cost (hw and sw) on some
    of the proposed CRC-based schemes, but almost certainly any 
    CRC-based scheme will be more expensive than checksum schemes;

 4) Discussion of requirements has been so far very much limited to
    iSCSI data transport, and it seems that a CRC-based scheme is 
    strongly desirable for iSCSI case _if_ the transport error check 
    alone is all we will get to meet the iSCSI requirements.

Now my question to the group is: if a weaker but much cheaper SCTP error
checking mechanism can be found to be sufficient to the majority of the
applications (remember SCTP is a general purpose transport), and the
ultra-low error rate applications such as iSCSI will eventually rely on
their own integrity check, should we adopt a selection approach which
puts more emphasis on the ease-of-implementation side?

(Maybe after all the Adler-32 in the current RFC2960 is good enough :-)
)

regards,
-Qiaobing


From owner-ips@ece.cmu.edu  Thu Apr 19 22:55:03 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA23053
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 22:55:02 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3K0sOh13436
	for ips-outgoing; Thu, 19 Apr 2001 20:54:24 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3K0rYA13417
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 20:53:34 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRMNC>; Thu, 19 Apr 2001 17:53:28 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173443@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "Ips (E-mail)" <ips@ece.cmu.edu>
Subject: iFCP Revision 01 Draft Available
Date: Thu, 19 Apr 2001 17:53:26 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

Although the latest iFCP rev was submitted to the IETF archive within the
publication deadline, it appears that archive submittals may be temporarily
on hold.  In the interest of timely availability, the draft can be obtained
from

http://www.nishansystems.com/ietf/draft-ietf-ips-iFCP-01.txt

Charles
Charles Monia
Senior Technology Consultant
Nishan Systems
email: cmonia@nishansystems.com
voice: (408) 519-3986
fax:   (408) 435-8385


From owner-ips@ece.cmu.edu  Thu Apr 19 23:01:30 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA23098
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 23:01:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3K0NN012167
	for ips-outgoing; Thu, 19 Apr 2001 20:23:23 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from maho3msx2.isus.emc.com (maho3msx2.isus.emc.com [128.221.11.32])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3K0NKA12162
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 20:23:20 -0400 (EDT)
Received: by maho3msx2.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <28S741QA>; Thu, 19 Apr 2001 20:23:14 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015472@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net, Black_David@emc.com
Cc: ips@ece.cmu.edu
Subject: RE: TCP framing and SCTP Checksum: TSVWG only
Date: Thu, 19 Apr 2001 20:23:13 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Doug - please follow instructions from your WG co-chair.
Both TCP framing and SCTP Checksums are tsvwg topics
and need to be taken up there.  If solutions emerge
there, they can be adopted by reference here - you
can be assured that you're not the only person on
both mailing lists.

--David

> -----Original Message-----
> From:	Douglas Otis [SMTP:dotis@sanlight.net]
> Sent:	Thursday, April 19, 2001 7:37 PM
> To:	Black_David@emc.com
> Cc:	ips@ece.cmu.edu
> Subject:	RE: TCP framing and SCTP Checksum: TSVWG only
> 
> David,
> 
> The various proposals on the table for either iFCP or FCIP encapsulation
> may benefit from utilization of a generalized framing layer suitable for
> both TCP and SCTP.  In that respect, I thought it advantageous to consider
> these aspects.  This goes into the generalized aspects of extended error
> detection, frame validation, higher level retry on a digest failure, and
> identification of the underlying payload and was my reasoning for
> including this within the IPS WG.  I understand if this proposal is to
> proceed on a standards path, it will be within TSVWG.  If there is no
> interest within this WG, then it will not be mentioned further.
> 
> Doug
> 
> 
> > Both TCP framing and SCTP checksums are topics
> > specific to TSVWG - please post on these only
> > to the tsvwg list and do not cross-post to the
> > ips list, as we already have more than enough
> > traffic.
> >
> > Thanks,
> > --David
> >
> > ---------------------------------------------------
> > David L. Black, Senior Technologist
> > EMC Corporation, 42 South St., Hopkinton, MA  01748
> > +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> > black_david@emc.com       Mobile: +1 (978) 394-7754
> > ---------------------------------------------------
> >
> >


From owner-ips@ece.cmu.edu  Thu Apr 19 23:23:05 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA23262
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 23:23:04 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3K1fQ415649
	for ips-outgoing; Thu, 19 Apr 2001 21:41:26 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3K1fDA15605
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 21:41:13 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <2G6LQ181>; Thu, 19 Apr 2001 21:31:47 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015476@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: Internet-Draft delays
Date: Thu, 19 Apr 2001 21:41:06 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

There appears to be a serious delay in getting submitted
drafts posted to the Internet-Draft servers.  Authors of
drafts for the interim meeting should post them to a
web site and send the URL to the mailing list.  Who would
have thought that we needed to coordinate the interim
meeting schedule with the I-D administrator :-( ??

--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Thu Apr 19 23:30:03 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA23295
	for <ips-archive@odin.ietf.org>; Thu, 19 Apr 2001 23:30:01 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3K1HP914551
	for ips-outgoing; Thu, 19 Apr 2001 21:17:25 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3K1GVA14515
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 21:16:31 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3K2OP121630;
	Thu, 19 Apr 2001 19:24:25 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: TCP framing and SCTP Checksum: TSVWG only
Date: Thu, 19 Apr 2001 18:14:35 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJGEJHCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <0F31E5C394DAD311B60C00E029101A0708015472@corpmx9.isus.emc.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

I did not start the checksum thread, but I will be happy to follow your
instructions.  I cannot guarantee the actions of others.  I understand the
common interests where both groups have WIP related to evaluating end-to-end
error detection.  I have already given assurance with respect to TCP framing
if there is no related interest, although I feel here again this is
potentially related to iFCP and FCIP encapsulation.  Opportunities are rare
for sharing efforts with mutual benefits.

Doug

> Doug - please follow instructions from your WG co-chair.
> Both TCP framing and SCTP Checksums are tsvwg topics
> and need to be taken up there.  If solutions emerge
> there, they can be adopted by reference here - you
> can be assured that you're not the only person on
> both mailing lists.
>
> --David



From owner-ips@ece.cmu.edu  Fri Apr 20 02:15:58 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA08102
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 02:15:56 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3K1RPQ15017
	for ips-outgoing; Thu, 19 Apr 2001 21:27:25 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3K1QxA14982
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 21:26:59 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP id 53AD41494
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 18:26:58 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id SAA05073
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 18:26:53 -0700 (PDT)
Message-ID: <3ADF901D.46537CB3@cup.hp.com>
Date: Thu, 19 Apr 2001 18:25:49 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: IPS Reflector <ips@ece.cmu.edu>
Subject: iSCSI : More issues.... Digest related.
Content-Type: multipart/mixed;
 boundary="------------C27800805D6F5B14AF44B88E"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------C27800805D6F5B14AF44B88E
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Julian & All,

2 more issues :

1) If the DataSegmentLength in the BHS excludes padding bytes, how does
the initiator determine the location of the data digest [which is placed
after the padded data] ?
There is no knowledge of what amount of padding is in use, since padding
can be 4 bytes or a multiple of that quantity.


2) While on the subject of digests... aren't there supposed to be
login keys to indicate the use or non-use of header and data digests ? I
can't seem to find any such login keys in the latest revs
5.91....5.92....6.000...(?)

(Section 2.2.1 states :
"The digest types are negotiated during the login phase. ").
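
To make issue 1 concrete, here is a minimal Python sketch of how a receiver
could locate the data digest if the pad boundary were fixed. Everything in
it is an assumption for illustration and not taken from the draft: a
48-byte BHS, no AHS, no header digest, and padding to a 4-byte boundary.
The function names are hypothetical.

```python
BHS_LEN = 48        # assumption: fixed-size basic header segment
PAD_BOUNDARY = 4    # assumption: data is padded to a 4-byte boundary


def pad_len(data_segment_length: int, boundary: int = PAD_BOUNDARY) -> int:
    """Number of pad bytes needed to reach the next multiple of boundary."""
    return (-data_segment_length) % boundary


def data_digest_offset(data_segment_length: int,
                       header_len: int = BHS_LEN,
                       boundary: int = PAD_BOUNDARY) -> int:
    """Byte offset of the data digest from the start of the PDU."""
    return header_len + data_segment_length + pad_len(data_segment_length,
                                                      boundary)


# A 5-byte data segment needs 3 pad bytes, so the digest starts at 48+5+3.
print(pad_len(5), data_digest_offset(5))
```

If the boundary could instead be any multiple of 4, the same
DataSegmentLength would give different offsets for different boundaries,
which is the ambiguity raised above.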

- Santosh
--------------C27800805D6F5B14AF44B88E
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------C27800805D6F5B14AF44B88E--



From owner-ips@ece.cmu.edu  Fri Apr 20 02:42:30 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA08193
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 02:42:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JNQUT09662
	for ips-outgoing; Thu, 19 Apr 2001 19:26:30 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JNPXA09606
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 19:25:34 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3K0UM121532;
	Thu, 19 Apr 2001 17:30:22 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <tsvwg@ietf.org>, "Qiaobing Xie" <xieqb@cig.mot.com>
Cc: "End2end-Interest" <end2end-interest@postel.org>, <rrs@cisco.com>,
        <jonathan@dsg.stanford.edu>,
        "Qiaobing.Xie" <Qiaobing_Xie-QXIE1@email.mot.com>,
        <Jonathan.Wood@sun.com>, <craig@aland.bbn.com>, <ips@ece.cmu.edu>,
        <julian_satran@il.ibm.com>, <jim_wendt@hp.com>,
        <steph@cs.uchicago.edu>, <vince_cavanna@agilent.com>,
        "Chip Sharp" <chsharp@cisco.com>
Subject: RE: [Tsvwg] [SCTP checksum problems]
Date: Thu, 19 Apr 2001 16:20:32 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJAEJECGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <200104192024.PAA09924@agevole.cig.mot.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Qiaobing,

In essence, the problems for iSCSI and SCTP are similar.  Both wish to
improve error detection over TCP checksums.  Examination of Adler-32
indicated weak burst error detection and low sensitivity for small frames,
a principal use of SCTP at this point.  A realistic means of examining the
transport end-to-end error detection is needed for both of these protocols,
although SCTP is indeed more generalized.  Moving forward, it would be
beneficial if a common scheme were employed and agreed to be adequate.
There are problems with various Telco components compared with more robust
Ethernet equipment.  DSLAM interfaces and poorly aggregated SONET packets
may be areas of greater concern than Gaussian noise or burst errors.
Without a better understanding of the present error sources, assessing any
scheme is difficult.

A 128k byte table for CRC is not practical for most applications, so CRC
requires about 8 instructions per byte with a 1K lookup table.  The modified
Adler-32, where S1 becomes a simple unsigned summation of 16-bit values
seeded with 0x5555 but S2 retains the Adler modulo, requires only about
2 instructions per byte versus about 3 for the prior algorithm.  Depending
on the host-to-network endian arrangement, this modification reduces the
present overhead and should help with both burst error detection and
smaller packets.  It would also reduce the overhead associated with a
hardware implementation of this scheme.
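
One possible reading of the modified Adler-32 described above, as a minimal
Python sketch. Several details are assumptions and not stated in this
message: S2 starts at 0, words are taken in big-endian (network) order,
odd-length input is zero-padded, and S1 wraps at 16 bits with no modulo.

```python
MOD_ADLER = 65521   # Adler modulus, retained for S2 only
S1_SEED = 0x5555    # seed value for S1 as described in the text


def modified_adler32(data: bytes) -> int:
    """Sketch of a 16-bit-word Adler variant (one interpretation only)."""
    if len(data) % 2:
        data += b"\x00"  # assumption: zero-pad odd-length input
    s1, s2 = S1_SEED, 0  # assumption: S2 seeded with 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]  # assumption: big-endian words
        s1 = (s1 + word) & 0xFFFF            # plain unsigned 16-bit sum
        s2 = (s2 + s1) % MOD_ADLER           # S2 keeps the Adler modulo
    return (s2 << 16) | s1


print(hex(modified_adler32(b"\x00\x01")))
```

Each loop iteration consumes two bytes, which is where the reduction in
per-byte work would come from in this reading.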

Those working with storage products are expert at assessing burst error
detection, and it is not likely that any scheme other than CRC will provide
superior fault coverage for burst errors.  This assessment is not likely
realistic if burst errors or Gaussian noise are not a predominant source.
Virtually any error scheme done in addition to the IEEE CRC found on the
various physical serial transports will provide adequate coverage.  The
problems arise from errors outside this CRC protective blanket.  The
assessment may concern the bit distribution of algorithms and susceptibility
to various pathological patterns.  Again, CRC holds advantages in this area,
but it may also become a weakness with respect to some error modes where bit
coloring can be advantageous.
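
For reference, a minimal Python sketch of a byte-at-a-time, table-driven
CRC of the kind mentioned earlier in this message. It uses the IEEE 802.3
CRC-32 polynomial in reflected form; 256 entries of 4 bytes each would
occupy 1K in a C implementation. The burst helper is hypothetical and
numbers bits least-significant-first within each byte, matching the order
in which this reflected CRC consumes them.

```python
import zlib

POLY_REFLECTED = 0xEDB88320  # IEEE 802.3 CRC-32 polynomial, bit-reversed


def make_crc32_table() -> list[int]:
    """Build the 256-entry lookup table (one entry per input byte value)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ POLY_REFLECTED if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = make_crc32_table()


def crc32(data: bytes) -> int:
    """One table lookup, one shift and two XORs per input byte."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = CRC_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def flip_burst(data: bytes, start_bit: int, burst_bits: int) -> bytes:
    """Invert burst_bits consecutive bits, LSB-first within each byte."""
    out = bytearray(data)
    for bit in range(start_bit, start_bit + burst_bits):
        out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)


payload = bytes(range(64))
damaged = flip_burst(payload, start_bit=100, burst_bits=32)
print(hex(crc32(payload)), hex(crc32(damaged)))
```

A 32-bit CRC detects every single burst of 32 bits or fewer, so the two
printed values always differ for bursts in that range.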

This area is not easily resolved because of the many facets involved.
Given the problems uncovered with Adler-32, it would appear this work
should be revisited and hopefully put to rest.

Doug

> Hi, all,
>
> Please correct me if I am under the wrong impression, but what I've been
> hearing so far from the discussion in this thread seems to be:
>
>  1) There is uncertainty on what error model should be used for
>     evaluating transport error check mechanisms;
>
>  2) iSCSI level integrity/authentication seems inevitable no matter how
>     strong the transport error check is;
>
>  3) Some uncertainty about the implementation cost (hw and sw) on some
>     of the proposed CRC-based schemes, but almost certainly any
>     CRC-based scheme will be more expensive than checksum schemes;
>
>  4) Discussion of requirements has been so far very much limited to
>     iSCSI data transport, and it seems that a CRC-based scheme is
>     strongly desirable for iSCSI case _if_ the transport error check
>     alone is all we will get to meet the iSCSI requirements.
>
> Now my question to the group is: if a weaker but much cheaper SCTP error
> checking mechanism can be found to be sufficient to the majority of the
> applications (remember SCTP is a general purpose transport), and the
> ultra-low error rate applications such as iSCSI will eventually rely on
> their own integrity check, should we adopt a selection approach which
> puts more emphasis on the ease-of-implementation side?
>
> (Maybe after all the Adler-32 in the current RFC2960 is good enough :-)
> )
>
> regards,
> -Qiaobing



From owner-ips@ece.cmu.edu  Fri Apr 20 02:46:33 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA08215
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 02:46:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3K2Aos16957
	for ips-outgoing; Thu, 19 Apr 2001 22:10:50 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3K29RA16854
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 22:09:27 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3K3HB121676;
	Thu, 19 Apr 2001 20:17:12 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Joshua Tseng" <jtseng@NishanSystems.com>,
        "'KRUEGER,MARJORIE \(HP-Roseville,ex1\)'" <marjorie_krueger@hp.com>,
        <ips@ece.cmu.edu>
Subject: RE: iSCSI: target discovery issue
Date: Thu, 19 Apr 2001 19:07:20 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJCEJICGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <B300BD9620BCD411A366009027C21D9B3FC1E2@ariel.nishansystems.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Joshua,

Although generic, it can also provide directory-based privileges.  The
various request structures can also provide tailored responses.  Not to
dwell on what has become a sensitive subject, there are alternatives to
signaling other than that proposed by iSNS.  It seems a signaling method
from IP based SCSI servers for informing clients of a need to check
configuration would be far more useful with respect to scaling this
architecture.

Clients would naturally already have persistent connections to these servers
greatly reducing the number of connections involved.  Signaling a change to
affected servers can be handled through any number of IPCs especially if
this number is within the realm of a few dozen servers and not thousands of
clients.  The SCSI server should be able to filter these signals adequately
based on common knowledge shared between server and client.  One other
principal area of concern comes from the need to create this global name
space without a means of promulgating information across vendors.

The signaling already present for Fibre-Channel switches and controllers may
need agents adopted for IP, but I would not be surprised if this already
exists and most of the tools for managing this equipment already share these
popular interfaces.  Signaling a change and distributing information should
be two separate functions.  Distributing information is easily handled by
generic servers.  A SAN should not be flapping at such a rate as to mandate
specialized persistent connections to each and every client, nor high levels
of overhead for servers to inform their clients if they find they were
affected by a recent update.  Each server must manage its own domain, and
informing clients of a change seems like such a small matter.  This should
not justify adding a name server for this purpose.

Doug

> If LDAP is what you are talking about, please see my previous notes
> to Doug Otis regarding how LDAP is basically a generic directory service
> that passively stores information deposited by clients, without any
> regard to what that information is used for.  iSNS is also a protocol,
> but since it is tailored to storage, it can interpret information
> registered by clients, and take appropriate action.  That isn't to
> say LDAP can't be used to accomplish some of the discovery and management
> functions that iSNS has, but because iSNS has state-consciousness of
> its client storage devices, it has more capabilities than a basic LDAP
> server would have in managing storage devices.  David has already invited
> anyone interested to write a draft on how LDAP can be used, and I would
> be equally interested.  If you or someone else would oblige, then we
> would have a "method E" of discovery iSCSI devices.
>
> Finally, I would like to note that iSNS capabilities are modeled on
> those provided by T11's FC-GS-3.  Presumably, these capabilities are
> based upon real-life lessons learned by the Fibre Channel community in
> managing and operating large enterprise-class storage networks.  I like
> to think that we are incorporating the fruit of those lessons into iSNS.
> We looked at LDAP--we really did--to see if it could provide comparable
> FC-GS-3 services in the IP domain.  But there are shortcomings which
> forced us to create the iSNS protocol.  These shortcomings are documented
> in the iSNS document.
>
> >
> > Typically users register/login to distributed resource management pts
> > (domain servers) and these applications handle authentication,
> > authorization, and assignment of resources.  John makes
> > important points in
> > his email - you don't want all users informed of new storage
> > coming on line,
> > those systems that are intended to have access should be
> > notified, or should
> > explicitly "mount" the new storage.  It's not appropriate to
> > burden each
> > storage device with this task, it is definitely a value add feature
> > appropriate to a centralized resource management application.
>
> Yes, this is where iSNS with the discovery domain feature provides
> significant value in a large storage network.
>
> Regards,
> Josh



From owner-ips@ece.cmu.edu  Fri Apr 20 02:46:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA08225
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 02:46:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3JNdLc10185
	for ips-outgoing; Thu, 19 Apr 2001 19:39:21 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3JNd6A10174
	for <ips@ece.cmu.edu>; Thu, 19 Apr 2001 19:39:06 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3K0kr121545;
	Thu, 19 Apr 2001 17:46:53 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: TCP framing and SCTP Checksum: TSVWG only
Date: Thu, 19 Apr 2001 16:37:03 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJEEJFCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <0F31E5C394DAD311B60C00E029101A070801546B@corpmx9.isus.emc.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

David,

The various proposals on the table for either iFCP or FCIP encapsulation may
benefit from a generalized framing layer suitable for both TCP and SCTP, so
I thought it worthwhile to raise these aspects here.  The proposal covers
extended error detection, frame validation, higher-level retry on a digest
failure, and identification of the underlying payload; that was my reasoning
for including the IPS WG.  I understand that if this proposal is to proceed
on a standards path, it will be within TSVWG.  If there is no interest within
this WG, then it will not be mentioned further.

Doug


> Both TCP framing and SCTP checksums are topics
> specific to TSVWG - please post on these only
> to the tsvwg list and do not cross-post to the
> ips list, as we already have more than enough
> traffic.
>
> Thanks,
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
>



From owner-ips@ece.cmu.edu  Fri Apr 20 04:26:05 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA08860
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 04:26:03 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3K66aS26737
	for ips-outgoing; Fri, 20 Apr 2001 02:06:36 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3K65nA26684
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 02:05:49 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id IAA99670
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 08:05:41 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id IAA108608
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 08:05:41 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A34.002178D5 ; Fri, 20 Apr 2001 08:05:36 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A34.00216398.00@d12mta02.de.ibm.com>
Date: Fri, 20 Apr 2001 07:33:46 +0200
Subject: RE: iSCSI : New PDU opcode usage in rev 5.92
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



I would like to add to Venkat's remarks only that this asymmetry has been
with us in iSCSI forever, and we even had a statement to the effect that
targets should not issue initiator codes etc. (this is irrelevant now as
the codes overlap).

The reason I took out the "direction bit" (meant more for observers) was
that I felt that we are low on codes :-)

Julo

Venkat Rangan <venkat@rhapsodynetworks.com> on 19/04/2001 07:58:23

Please respond to Venkat Rangan <venkat@rhapsodynetworks.com>

To:   "'Santosh Rao'" <santoshr@cup.hp.com>, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI : New PDU opcode usage in rev 5.92




Santosh,

Is it not the case that requests go in the direction from the Initiator to
the Target, where the Target is the one "listening" for new connections on
the well-known port?  A dual mode scsi implementation therefore has two
separate sessions and sets of connections.  One set is [I->DualModeTarget]
and the other is [DualModeInitiator->T], and the connections are
independent.  If I and T happen to be the same system, you cannot use a
single connection for bidirectional sessions between the two.

So if you receive a PDU from a target, you can only do so with SourcePort
set to the well-known port, and it must be a Response from the target.
Maybe I'm assuming something that is not valid...

Venkat Rangan
Rhapsody Networks Inc.
http://www.rhapsodynetworks.com

-----Original Message-----
From: Santosh Rao [mailto:santoshr@cup.hp.com]
Sent: Wednesday, April 18, 2001 6:47 PM
To: ips@ece.cmu.edu
Subject: iSCSI : New PDU opcode usage in rev 5.92


Julian & All,

I've got a quick question on how the new opcode layouts would work for
dual mode scsi implementations. (i.e. initiators that responded in
target mode or targets that acted as initiators also).

The new opcode layout is :

----------------
X|I| | | | | | |
----------------
7 6 5 4 3 2 1 0

where bits 5-0 -> opcode
X -> retry bit
I -> immediate bit

The same values are used for the command as well as response opcodes, and
bits X & I are intended to both be set to 1 by targets.

i.e. the opcode for scsi command = scsi response = 0x01.  The distinction
between command and response is based on targets setting the X & I bits to 1.

Now, if an initiator [capable of target mode] sent the following
commands, how would they be interpreted :

1) 0xc4.
is this a text command being retried in immediate mode,
or is it a text response ?

2) 0xc1
is this a scsi command being retried in immediate mode,
or is it a scsi response ?

3) 0xc2
is this a scsi task mgmt command being retried in immediate mode,
or is it a scsi task mgmt response ?

etc.....
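
A minimal sketch of the ambiguity in the examples above, assuming only the
bit layout quoted in this message (bit 7 = X, bit 6 = I, bits 5-0 = opcode).
The names decode_opcode_byte and OpcodeByte are illustrative and do not come
from the draft.

```python
from typing import NamedTuple

RETRY_BIT = 0x80      # X, bit 7
IMMEDIATE_BIT = 0x40  # I, bit 6
OPCODE_MASK = 0x3F    # opcode, bits 5-0


class OpcodeByte(NamedTuple):
    retry: bool
    immediate: bool
    opcode: int


def decode_opcode_byte(value: int) -> OpcodeByte:
    """Split one opcode byte into the X bit, the I bit and the 6-bit opcode."""
    if not 0 <= value <= 0xFF:
        raise ValueError("opcode byte must fit in 8 bits")
    return OpcodeByte(
        retry=bool(value & RETRY_BIT),
        immediate=bool(value & IMMEDIATE_BIT),
        opcode=value & OPCODE_MASK,
    )


# The three byte values from the examples above.
for raw in (0xC4, 0xC1, 0xC2):
    print(hex(raw), decode_opcode_byte(raw))
```

All three bytes decode with X and I both set, which is the same bit pattern
a target would use to mark a response; the byte alone does not say which
reading is meant.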

- Santosh

--
#################################
Santosh Rao
Software Design Engineer,
HP, Cupertino.
email : santoshr@cup.hp.com
Phone : 408-447-3751
#################################





From owner-ips@ece.cmu.edu  Fri Apr 20 04:26:26 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA08871
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 04:26:25 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3K66ck26741
	for ips-outgoing; Fri, 20 Apr 2001 02:06:38 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3K65nA26685
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 02:05:49 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id IAA75930
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 08:05:41 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id IAA108606
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 08:05:41 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A34.00217917 ; Fri, 20 Apr 2001 08:05:36 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A34.00216472.00@d12mta02.de.ibm.com>
Date: Fri, 20 Apr 2001 08:07:59 +0200
Subject: Re: iSCSI : login keys & mode page settings
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

answers in text.

And again - please tone down your notes.

Julo

Santosh Rao <santoshr@cup.hp.com> on 19/04/2001 18:59:36

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   ips@ece.cmu.edu, Julian Satran/Haifa/IBM@IBMIL
cc:   Stephen Bailey <steph@cs.uchicago.edu>, David Black
      <Black_David@emc.com>
Subject:  Re: iSCSI : login keys & mode page settings




Stephen Bailey wrote:
>
> > +++ It is all about function - several people felt that the (primitive)
> > negotiation element in the text commands is better than trying to set a
> > parameter to an unacceptable value and finding this out through a mode
> > sense
> > ++++
>
> Several other people seem to have felt that it was better not to have
> redundant mechanisms for manipulating this sort of parameter.  How do
> you decide?
>
> Steph


Julian,

I would prefer a single mechanism as opposed to multiple redundant
mechanisms to set negotiation elements. (Reduce/eliminate options.). In
addition, I have the following concerns [some of which I have raised
repeatedly and have NOT gotten a reply. I would appreciate if you could
kindly comment on ALL of these.] :

+++ I will if you will stop SCREAMING. And remember that I am not bound
by any rules other than respect for the individual +++

1) The draft makes repeated references to a "iSCSI LUN Control Mode
Page". There is NO such page per SPC-2. The references must be changed
to "iSCSI protocol specific LUN page".

+++ It is called the iSCSI LU control mode page, the same way FCP calls it.
And this is one of those instances in which the tone of your note is
irritating, to say the least.
+++

2) The new daily release of the draft (5.92 when I last checked) has now
introduced EnableACA as a negotiable login key. All references to
EnableACA are redundant and should be removed for the following reasons
:

a) An initiator knows whether a target supports ACA from the NACA bit in
the INQUIRY response. When a target indicates support for ACA, the
initiator can use it by setting the NACA bit in the CDBs it sends. There
is NO need for any sort of negotiation of this behaviour above and
beyond what is already provided thru SCSI mechanisms.

b) The ACA is a SCSI ULP concept and iSCSI should not be negotiating its
use or lack thereof. This is done thru the NACA bit in CDBs.

c) (As a side note, the description of EnableACA on pg 127 refers to its
presence in the lun control mode page, but it is actually present in the
protocol specific port page.)

d) ACA is a LUN-level (more an I/O level) control option. It MUST NOT be
negotiated on a per-session basis. SCSI allows initiators to request ACA
behaviour on a per I/O basis through the use of NACA bit in the CDBs.

+++ We have required ACA to be supported by all new iSCSI targets, and
several actions require the target to enter ACA state.
It was brought to our attention that many initiators will not react
properly to a target entering ACA state (not do the reset).
The EnableACA bit and key are meant to enable an initiator to control this
iSCSI specific ACA behaviour.  This behaviour is related to asynchronous
events and is not controlled by the NACA CDB bit.

++++


2)
> On a side note, the EnableCmdRN  & CmdRN fields should be re-named to
> EnableCRN and CRN to reflect the same semantics and context as the CRN
> defined in SAM-2 and FCP-2.
>
> +++ what's in a name... +++

Consistency for one ! (Any strong reasons not to call this CRN, as SAM
and FCP do ?)

+++ No strong reason except the fact that iSCSI has its own naming style,
but I've changed this.
+++

3) However, having allowed 2 mechanisms to set negotiation elements,
iSCSI MUST then comment on the need to synchronize their settings in the
2 layers and also comment on the need to trigger a UNIT ATTENTION when
changed through the login key mechanism.
Again, I would vote for only 1 mechanism for setting these control
options, rather than have to define communication schemes between the ULP
and LLP to keep their values in sync and generate UNIT ATTENTION.

+++ Parameter changes originate from SCSI and iSCSI only enables another
mechanism to convey them.
This is an implementation issue +++

4)
> If such a level of dual control is provided, the iSCSI login
> keys listed above be made LO (leading only) to allow for changes to
> operational parameters only during session login. This is to
> minimize/eliminate disruption of ongoing I/O activity that occurs due to
> the generation of a UNIT ATTENTION CHECK CONDITION when any change is
> made to the above parameters.

Are we in agreement on the above ?

+++ No +++
5)
> If these operational parameters are allowed to be set through iSCSI
> login and they also impact mode page settings, iSCSI spec should
> describe the scope of the mode page setting in terms of whether this
> setting is a saved page setting or not ?
>
+++ I don't know - I would rather think not +++
6)
> Should saved page settings be allowed thru iSCSI ?
+++ I don't know - I would rather think not +++
6)
I did not see any comments on the above issues (?).

- Santosh
 - santoshr.vcf





From owner-ips@ece.cmu.edu  Fri Apr 20 06:41:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id GAA09945
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 06:41:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3K80oV01596
	for ips-outgoing; Fri, 20 Apr 2001 04:00:50 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3K80KA01577
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 04:00:21 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id JAA301128;
	Fri, 20 Apr 2001 09:57:29 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta05_cs0 [9.165.222.239])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id JAA211478;
	Fri, 20 Apr 2001 09:57:28 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A34.002BB4F6 ; Fri, 20 Apr 2001 09:57:23 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Chip Sharp <chsharp@cisco.com>
cc: vince_cavanna@agilent.com, steph@cs.uchicago.edu, jim_wendt@hp.com,
        ips@ece.cmu.edu, tsvwg@ietf.org, craig@aland.bbn.com,
        Jonathan.Wood@sun.com, xieqb@cig.mot.com, jonathan@dsg.stanford.edu,
        rrs@cisco.com
Message-ID: <C1256A34.002BB1E8.00@d12mta05.de.ibm.com>
Date: Fri, 20 Apr 2001 10:02:22 +0200
Subject: RE: [Tsvwg] [SCTP checksum problems]
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Chip,

CRCs are not meant to protect against malicious middle boxes, but rather
against boxes that strip the strong link CRCs and let the end system rely
on the weak TCP checksum.
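
A small sketch of why the TCP checksum counts as weak in this sense: a 16-bit
ones' complement sum does not depend on word order, so reordered 16-bit words
go unnoticed, while a CRC catches them.  The checksum below follows the
RFC 1071 style; zlib.crc32 stands in for whichever CRC iSCSI ends up using,
and the sample bytes are invented for illustration.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


original = b"\x12\x34\x56\x78\x9a\xbc\xde\xf0"
# Swap the first two 16-bit words, as a faulty middle box might.
corrupted = original[2:4] + original[0:2] + original[4:]

same_checksum = internet_checksum(original) == internet_checksum(corrupted)
same_crc = zlib.crc32(original) == zlib.crc32(corrupted)
print("checksum unchanged:", same_checksum)
print("CRC-32 unchanged:", same_crc)
```

The swap leaves the sum of the words unchanged, so the checksum cannot see
it; the CRC depends on bit positions and does.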

NAT boxes have good reason to recompute TCP checksums, but unless they are
malicious no reason to recompute iSCSI CRCs.

And against malicious boxes iSCSI has cryptographic digests as options.

And I was not aware that we are discussing - in this forum - iSCSI data
integrity options.

Julo

Chip Sharp <chsharp@cisco.com> on 19/04/2001 18:53:53

Please respond to Chip Sharp <chsharp@cisco.com>

To:   vince_cavanna@agilent.com
cc:   steph@cs.uchicago.edu, vince_cavanna@agilent.com, jim_wendt@hp.com,
      Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu, tsvwg@ietf.org,
      craig@aland.bbn.com, Jonathan.Wood@sun.com, xieqb@cig.mot.com,
      jonathan@dsg.stanford.edu, rrs@cisco.com
Subject:  RE: [Tsvwg] [SCTP checksum problems]




As was pointed out previously, middle box operations (such as NATs) tend to
creep up the protocol stack and into applications.

Take SIP for example.  It includes IP addresses in its INVITE.  In order to
work across a NAT, the IP addresses it exchanges have to be replaced with
the NATed address.  One way is for the NAT to reach up into the SIP INVITE
and change the address.  This modifies the TCP or UDP checksum.  Now SIP
could have included its own integrity check to protect against corrupted or
modified TCP checksums, but all that would have happened is that NATs would
have changed the SIP checksum in addition to the TCP/UDP checksum.

Therefore, even if iSCSI included its own integrity check, if a middle box
is going to futz with iSCSI packets it will just strip the check, do
whatever it does and then recalculate the check.

If this is what you want to protect against you will have to go to some
type of digital signature.
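
To make the point above concrete, here is a hedged sketch in which a middle
box rewrites a payload and recomputes an unkeyed check, which the receiver
then accepts, while a keyed digest fails because the middle box does not hold
the key.  HMAC-SHA256 is used only as a stand-in for "some type of digital
signature"; the addresses, key, and function names are made up for the
example.

```python
import hashlib
import hmac
import zlib


def crc_ok(payload: bytes, check: int) -> bool:
    """Receiver-side test for an unkeyed CRC."""
    return zlib.crc32(payload) == check


def make_tag(key: bytes, payload: bytes) -> bytes:
    """Keyed digest over the payload (HMAC-SHA256)."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def tag_ok(key: bytes, payload: bytes, tag: bytes) -> bool:
    """Receiver-side test for the keyed digest."""
    return hmac.compare_digest(make_tag(key, payload), tag)


def middle_box_rewrite(payload: bytes) -> bytes:
    """Hypothetical NAT-style rewrite of an address inside the payload."""
    return payload.replace(b"192.0.2.1", b"198.51.100.7")


shared_key = b"known only to the two end systems"
payload = b"INVITE contact=192.0.2.1"

# Unkeyed check: the middle box rewrites and simply recomputes the CRC.
rewritten = middle_box_rewrite(payload)
recomputed_crc = zlib.crc32(rewritten)
print("rewritten payload passes CRC:", crc_ok(rewritten, recomputed_crc))

# Keyed digest: the middle box can only forward the sender's tag unchanged.
sender_tag = make_tag(shared_key, payload)
print("rewritten payload passes keyed digest:",
      tag_ok(shared_key, rewritten, sender_tag))
```

The unkeyed check only guards against accidental corruption; detecting a
deliberate rewrite needs a secret the middle box does not have.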

At 12:22 PM 4/19/2001, vince_cavanna@agilent.com wrote:
>Stephen,
>
>I have to admit that I do not have much direct experience with middle
>boxes, BUT I did have fairly direct and recent experience with a popular
>NAT router from a popular vendor that was corrupting data in a network of
>Macintoshes.
>
>Apple's TCP was unaware of any problem as was Apple's Filing Protocol and
>most applications. The only applications that detected the corruption were
>those that performed an integrity check of their own. Those applications
>that assumed a reliable transport (and file system) were doomed to
>experiencing the indirect effects of the corruption at some later time.
>The corruption only happened when large amounts of data were transferred
>quickly.  The router vendor fixed the problem once; then fixed it again;
>then fixed it one last time before the data corruption finally
>"disappeared". After several weeks of continuous operation the router
>appeared to get into a mode where it was once again corrupting data. Power
>cycling the router "fixed it". The story apparently has not yet ended.
>
>I admit I may have given too much significance to this single incident
>that I have personally experienced but on the other hand I don't see the
>mechanisms in place to prevent this type of problem in the future other
>than the end to end integrity checks.
>
>Incidentally this incident changed my behavior when transferring data over
>a network. I will always use a compression utility; not only for reducing
>the data to be transmitted but to ensure the integrity of my data is
>protected end to end by the utility's CRC mechanism.
>
>I believe quite firmly that we DO need a mechanism to allow us to tolerate
>poor implementations of middle boxes and cannot simply hope that
>eventually such poor implementations will vanish, nor that we will have
>the luxury of being able to select only good implementations for every
>component of our storage network.
>
>Vince
>
>|-----Original Message-----
>|From: Stephen Bailey [mailto:steph@cs.uchicago.edu]
>|Sent: Wednesday, April 18, 2001 3:09 PM
>|To: CAVANNA,VICENTE V (A-Roseville,ex1)
>|Cc: 'WENDT,JIM (HP-Roseville,ex1)'; 'julian_satran@il.ibm.com';
>|ips@ece.cmu.edu; tsvwg@ietf.org; 'Craig Partridge'; Jonathan Wood;
>|xieqb@cig.mot.com; Jonathan Stone; Randall Stewart
>|Subject: Re: [Tsvwg] [SCTP checksum problems]
>|
>|
>|Vince,
>|
>|> I don't think iSCSI can be completely relieved of performing some data
>|> integrity checking as long as there exists the possibility of "middle
>|> boxes" opening up the transport protocol's packet and thus potentially
>|> invalidating any reliability guarantees the transport protocol makes.
>|
>|Any protection provided against this failure mode will only be
>|transient, so we must temper the desire to introduce such a
>|requirement with reality.
>|
>|Middleboxes can just as easily open up to the iSCSI layer and tinker
>|with the payload, as they do with other ULPs running on TCP (e.g. HTTP)
>|today.  Short of securing the connection, there is ALWAYS a
>|possibility of a middlebox terminating and reoriginating an integrity
>|check.  In case you think this is a farfetched scenario, I do get the
>|impression that there is a high level of interest in `actively
>|middling' iSCSI once the specs crystallize.  Who shaves the barber?
>|
>|An integrity check is not necessary as long as some lower layer
>|provides adequate integrity guarantees.
>|
>|Adding an integrity check above the transport layer is based upon
>|documentation of the presence of a lot of crappy network hardware and
>|software and analyses of the transport integrity check (TCP checksum)
>|which suggests it might not be adequately strong against some such
>|observed errors.
>|
>|I claim that the high incidence of `broken' (corruption introducing)
>|components is a result of a variety of factors which have shaped the
>|development of network components thus far.  The fact that integrity
>|checks are assumed to be performed in a network context substantially
>|lowers the bar for implementation correctness.
>|
>|In a storage (or CPU) context, these types of implementation errors
>|are a) more easily detectable (more fatal) b) more carefully avoided
>|during implementation (because of the cost of a potential fatal
>|error).  If network components magically reached the same `quality
>|level' as storage and CPU components, there might be no justification
>|for additional integrity checks above the transport.  Similarly if the
>|transport (or whatever lower layer) integrity checks are very strong
>|(e.g. IPSec), there is, again, no need for a higher level integrity
>|check.
>|
>|I am not disagreeing that we need an additional integrity check over
>|TCP in the present target environment, but I do disagree that iSCSI
>|will always need such a check, independently of what is running
>|beneath it.
>|
>|Steph
>|


-------------------------------------------------------------------
Chip Sharp                       Consulting Engineering
Cisco Systems
-------------------------------------------------------------------






From owner-ips@ece.cmu.edu  Fri Apr 20 06:43:10 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id GAA09961
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 06:43:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3K8f2703264
	for ips-outgoing; Fri, 20 Apr 2001 04:41:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3K8dmA03169
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 04:39:48 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id KAA09026
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 10:39:41 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id KAA84076
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 10:39:41 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A34.002F9272 ; Fri, 20 Apr 2001 10:39:36 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A34.002F9227.00@d12mta02.de.ibm.com>
Date: Fri, 20 Apr 2001 10:44:47 +0200
Subject: Re: iSCSI : More issues.... Digest related.
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



1. The padding is to the next 4-byte word boundary.
2. There is a Security appendix

and there is a numbering/formatting error in the appendix
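
Under the 4-byte rule in point 1, the location of the data digest follows
from the unpadded DataSegmentLength alone.  The sketch below shows the
arithmetic; the function names are illustrative, and the starting offset of
the data segment is passed in as a parameter rather than assumed.

```python
WORD = 4  # padding granularity in bytes, per point 1 above


def padded_length(data_segment_length: int) -> int:
    """Round the unpadded length up to the next 4-byte boundary."""
    if data_segment_length < 0:
        raise ValueError("length must be non-negative")
    return (data_segment_length + WORD - 1) // WORD * WORD


def pad_bytes(data_segment_length: int) -> int:
    """Number of pad bytes that follow the data (always 0 to 3)."""
    return padded_length(data_segment_length) - data_segment_length


def data_digest_offset(data_start: int, data_segment_length: int) -> int:
    """Offset of the data digest, which sits right after the padded data."""
    return data_start + padded_length(data_segment_length)


for length in (0, 1, 4, 5, 10):
    print(length, padded_length(length), pad_bytes(length))
```

Because the pad count is never more than 3 bytes and is fully determined by
the length, the receiver needs no separate field to find the digest.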

Julo

Santosh Rao <santoshr@cup.hp.com> on 20/04/2001 03:25:49

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   IPS Reflector <ips@ece.cmu.edu>
cc:
Subject:  iSCSI : More issues.... Digest related.




Julian & All,

2 more issues :

1) If the DataSegmentLength in the BHS excludes padding bytes, how does
the initiator determine the location of the data digest [which is placed
after the padded data] ?
There is no knowledge of what amount of padding is in use, since padding
can be 4 bytes or a multiple of that quantity.


2) While on the subject of digests.....aren't there supposed to be
login keys to indicate the use or non-use of header and data digests ? I
can't seem to find any such login keys in the latest revs
5.91....5.92....6.000...(?)

(Section 2.2.1 states :
"The digest types are negotiated during the login phase. ").

- Santosh
 - santoshr.vcf





From owner-ips@ece.cmu.edu  Fri Apr 20 12:00:06 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA13647
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 12:00:04 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KDbAY19736
	for ips-outgoing; Fri, 20 Apr 2001 09:37:10 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KDZxA19674
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 09:36:00 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id PAA292314
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 15:35:43 +0200
From: biran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id PAA21890
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 15:35:43 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A34.004AAD1A ; Fri, 20 Apr 2001 15:35:39 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A34.004AAC3F.00@d12mta02.de.ibm.com>
Date: Fri, 20 Apr 2001 16:30:29 +0300
Subject: Criteria for selecting the mandatory security
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk




The main open security issue out of Minneapolis is the 'mandatory
to implement' method. The first step is to agree on the criteria
for selecting it. Following is an initial proposed list (thanks to
Steve Senum for his help) - any comments / additional criteria /
order of importance are welcome.

Regards,
  Ofer


  Criteria for selecting the mandatory security method

1. Suitability for iSCSI implementation scenarios
The role of the iSCSI initiator / target / proxy target from the security
aspect. Is the method suitable for the typical scenarios? E.g., should
initiators be defined as 'users' on target systems? Which identity
should be authenticated for making the authorization decisions?
Naming and Discovery considerations. iSNS requirements / interoperability.
Is a central security server appropriate ?  Corporate intranet aspects -
firewalls etc.

2. Administration
The ease of security administration is probably the most important
issue for customers and system administrators. If we consider only
the authentication and privacy aspects of a security platform, the
administration includes:
  - Getting the system into operational state (i.e., initial
    configuration).
  - Adding / removing users and service principals.
  - Maintenance (password replacements, certificate revocations,
    security servers, security databases)
  - Policy (e.g. password expiration/ certificate revocation)
There are other aspects related to authorization and setting of
services that may need to be considered.

3. Standardization, existing code & implementations
Is the security method based on a formal standard? Is there existing
code (open source, commercial libraries) and are there existing
implementations? How much experience and acceptance does it have?

4. Code complexity
What is the code complexity for implementation ?  (code size,
programming & testing effort).

5. Performance / hardware acceleration
Authentication performance is less of an issue since it occurs only once
per iSCSI connection. Performance of generation and verification of digests
for message authentication/integrity, and encryption performance (if
used), are very important for iSCSI requirements. Are there existing
hardware accelerators for the involved digest / encryption algorithms ?

6. Security considerations
This criterion is about the security quality achieved by the method.
Which attacks does it protect against, are there known deficiencies in
the cryptographic algorithms that are used, and are there other security
problems with the method scheme?

7. Licensing
Does implementation of the method involve licensing / royalties for
patents ?





Ofer Biran
Storage and Systems Technology
IBM Research Lab in Haifa
biran@il.ibm.com  972-4-8296253




From owner-ips@ece.cmu.edu  Fri Apr 20 12:04:46 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA13748
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 12:04:45 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KDkVY20258
	for ips-outgoing; Fri, 20 Apr 2001 09:46:31 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KDj7A20130
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 09:45:07 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id JAA28718
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 09:37:39 -0400
Received: from d03nmx41.almaden.ibm.com (d03nmx41.almaden.ibm.com [9.1.26.87])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id HAA135636
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 07:44:56 -0600
Importance: Normal
Subject: iSCSI: draft.ietf.ips-iscsi-name-disc-01.txt
To: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.5  September 22, 2000
Message-ID: <OFDCC88EDD.A298549D-ON88256A34.004B1BA4@almaden.ibm.com>
From: "Kaladhar Voruganti" <kaladhar@us.ibm.com>
Date: Fri, 20 Apr 2001 06:44:55 -0700
X-MIMETrack: Serialize by Router on D03NMX41/03/M/IBM(Release 5.0.6a |January 17, 2001) at
 04/20/2001 06:44:56 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

The latest version of the iSCSI Naming and Discovery draft is now available
at :

http://www.haifa.il.ibm.com/satran/ips   web site.



Kaladhar Voruganti
Storage Group
Computer Science Department
IBM Almaden Lab
San Jose, CA





From owner-ips@ece.cmu.edu  Fri Apr 20 12:10:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA13862
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 12:10:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KDq1w20541
	for ips-outgoing; Fri, 20 Apr 2001 09:52:01 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KDpeA20528
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 09:51:40 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id JAA08800 for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 09:51:34 -0400 (EDT)
Message-ID: <3AE03E8A.CBC3FB67@cisco.com>
Date: Fri, 20 Apr 2001 08:50:02 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: IPS <ips@ece.cmu.edu>
Subject: New Draft: iSCSI/SLP
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


The draft "Finding iSCSI Targets and Name Servers using SLP" has
been posted as a WG document.  It will be available as
draft-ietf-ips-iscsi-slp-00.txt.  However, the draft submissions
are taking some time to get through.  For now, Julian has posted
it on his web page at:

http://www.haifa.il.ibm.com/satran/ips/draft-ietf-ips-iscsi-slp-00.txt

Regards,

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Fri Apr 20 12:10:59 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id MAA13892
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 12:10:58 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KEBtL21586
	for ips-outgoing; Fri, 20 Apr 2001 10:11:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxbh4.isus.emc.com ([168.159.1.81])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KEBGA21546
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 10:11:16 -0400 (EDT)
Received: by mxbh4.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <JG3Z9YB2>; Fri, 20 Apr 2001 10:11:10 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A070801547A@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: iSCSI Reqts: In-Order Delivery
Date: Fri, 20 Apr 2001 10:11:07 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

About a week ago, Santosh Rao wrote:

> I object to the following requirement :
> 
> " MUST support ordered delivery of SCSI commands from the initiator to
> the  target, to support SCSI Task Queuing. "

In the hopes of driving this discussion to closure,
let me summarize my understanding of this situation:

- SAM2 expects ordered delivery, but does not mandate
	it.  Section 4.6.2 of SAM2r16 says:

  For convenience, the SCSI architecture model assumes in-order delivery
  to be a property of the service delivery subsystem. This assumption is
  made to simplify the description of behavior and does not constitute a
  requirement.

	I will observe that there is well-known code for disk
	initiators (e.g., staircase/elevator/etc. command queue
	reordering for seek optimization) that relies on this
	expectation for performance (not correctness).  IMHO,
	making in-order delivery purely OPTIONAL would not be
	consistent with the spirit/intent of the above SAM2 text.

- In existing SCSI transports, transport errors (e.g., CRC failure)
	can cause ordering issues at higher levels.  Both Parallel SCSI
	and Fibre Channel initially drop anything that has a transport
	error.  IF the higher level code retries AND other commands were
	executed before the error was noticed by the retry logic, THEN
	the retry occurs in a different place in the command sequence
	from the original command.  I believe that existing tape
	implementations usually either don't retry, or spoon-feed the
	tape one command at a time to make sure no other command can
	get in the way.  FCP-2 has the ability to prevent other commands
	from being executed ...

- iSCSI has to face this issue because we're incorporating a retry
	mechanism into the base protocol.  Parallel SCSI and FCP-1 could
	ignore this because they didn't do retries.  FCP-2 has a retry
	mechanism, but makes order preservation of retries negotiable -
	order of command delivery in the face of retries is only assured
	if CRN is in use, and that is controlled by a settable bit in a
	mode page (FCP2r7, 4.3).  The Initiator can choose not to set
	this bit, and the Target can refuse to allow the Initiator to set
	it.

- CRN is a relatively new mechanism, and both it and the associated
	FCP-2 retry mechanism are likely to be implemented for/used by
	tapes.  Disks will still have an expectation of in-order delivery
	in some cases, even though neither of these mechanisms is in use.
	IMHO, Santosh's proposal to use CRN for all expectations/requirements
	of ordering doesn't seem like a good idea for this reason.

So, let me try the following proposal to resolve this issue/objection:

(1) MUST provide ordered delivery of SCSI commands from
	the initiator to the target in the absence of transport
	errors visible to iSCSI (e.g., iSCSI CRC failure,
	unexpected TCP connection closure).
(2) MUST specify the ability to preserve ordered delivery
	of SCSI commands even in the presence of transport
	errors.  A mechanism MUST be provided to allow
	Initiators and Targets to negotiate this preservation
	on a per-session or finer granularity basis.

The Rationale for (1) is the combination of the SAM2 expectation
plus the fact that there are situations in which disks expect
this ordering in the absence of mechanisms like CRN that can
enforce it.  The Rationale for (2) is to provide support
analogous to FCP-2 - this should be sufficient for tapes to
obtain the behavior they require.

Comments?

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Fri Apr 20 14:13:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA15686
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 14:13:35 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KG3uD28018
	for ips-outgoing; Fri, 20 Apr 2001 12:03:56 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KG2uA27978
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 12:02:56 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRNBJ>; Fri, 20 Apr 2001 09:02:51 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B3FC1EE@ariel.nishansystems.com>
From: Joshua Tseng <jtseng@NishanSystems.com>
To: ips@ece.cmu.edu
Subject: iSNS revision 02 available
Date: Fri, 20 Apr 2001 09:02:50 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Friends,

The latest version of the iSNS document is now available at the following
URL:

http://www.nishansystems.com/ietf/draft-ietf-ips-isns-02.txt

Regards,
Josh Tseng


From owner-ips@ece.cmu.edu  Fri Apr 20 14:13:55 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA15697
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 14:13:50 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KFw0t27666
	for ips-outgoing; Fri, 20 Apr 2001 11:58:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxbh4.isus.emc.com ([168.159.1.81])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KFvIA27632
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 11:57:18 -0400 (EDT)
Received: by mxbh4.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <JG3Z95KV>; Fri, 20 Apr 2001 11:57:12 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A070801547C@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: FC Encapsulation Draft
Date: Fri, 20 Apr 2001 11:57:08 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

In the interim until it appears on the Internet-Draft
servers, the common FC encapsulation (for both FCIP
and iFCP) can be found at:

http://www.ultranet.com/~dlb237/draft-ietf-ips-fcencapsulation-00.txt

Anyone needing a draft posted for the interim meeting
who doesn't have convenient access to a web site can send
the draft to me and I'll post it on the same site.

Many thanks to the design team for producing an initial
draft in a timely fashion.

--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Fri Apr 20 15:13:23 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16614
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 15:13:21 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KGiwT00246
	for ips-outgoing; Fri, 20 Apr 2001 12:44:58 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from crufty.research.bell-labs.com (crufty.research.bell-labs.com [204.178.16.49])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3KGiIA00228
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 12:44:19 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by crufty; Fri Apr 20 12:40:47 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Fri Apr 20 12:43:07 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id MAA12845
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 12:43:02 -0400 (EDT)
Message-ID: <3AE06716.3A7BC65D@research.bell-labs.com>
Date: Fri, 20 Apr 2001 12:43:02 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: iSCSI:out-of-order notification proposal
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


This is a proposal to allow the initiator to inform the target 
if out-of-order execution within the command stream is possible.

The target can switch between "in-cmdSN-order" and "out-of-order"
execution at runtime, as signalled by the initiator.

Appreciate comments on subtleties that I may have missed.

thanks,
-Sandeep

http://ips.pdl.cs.cmu.edu/mail/msg03152.html
   by Costa provides a good summary of the issue at hand.

http://ips.pdl.cs.cmu.edu/mail/msg04255.html
  David provides the "new" reqts.  In particular, this one

   > MUST specify the ability to preserve ordered delivery
   > of SCSI commands even in the presence of transport
   > errors.  A mechanism MUST be provided to allow
   > Initiators and Targets to negotiate this preservation
   > on a per-session or finer granularity basis

Note :
======
1) This does not rely on SCSI cmdRN, but operates at iSCSI level.
2) Flow control using cmdSN works as designed.
3) This solution is not a per-session negotiation option but can be
   disabled and re-enabled again at "runtime" by the initiator if it
   notices that Ordered/HOQ tasks (or any other need) have entered
   the iSCSI command stream which is being dispatched.

Problem:
=========
In case of out-of-order arrival or digest errors, it is NOT possible
to know if the initiator had sent an ordered command before the one
which was received.

Solution:
=========
To notify the target of in-flight ordered commands, we set
a flag on *every* PDU following the ordered command *until* the target
moves its expCmdSN above the cmdSN of the ordered command.  The
expCmdSN indicates that the target has found the ordered command.

(Those familiar with "ECN over IP" (Floyd, et al.) may see that this is
similar to how a congestion bit keeps being set until the sender acks
that it has received the notification.)

Figuratively:
=============
So, assume there was a new "strict_order" flag in the BHS.
In the figure below, parentheses show the values of (cmdSN, strict_order).

Initiator                       Target
---------                       -------

simple cmd (cmdSN=100, strict_order=0) ->
simple cmd (101, 0) ->
simple cmd (102, 0) ->

   these may get reordered or have digest errors ->

                        target executes as they arrive
                             exec simple cmd(102, 0)
                             exec simple cmd(100, 0)
                             exec simple cmd(101, 0)

Now say the initiator wants to send an ordered (or HOQ) task.
It sets strict_order=1 on that PDU and on all PDUs that follow.

ordered cmd (103, 1) ->
simple  cmd (104, 1) ->
simple  cmd (105, 1) ->
                        in case of reordering or digest errors
                        target must wait & execute in cmdSN order!
simple  cmd (106, 1) ->
simple  cmd (107, 1) ->

         <---- now target sends expCmdSN=103

This implies target has seen command(cmdSN=103) and target will do the
appropriate ordering and delivery to SCSI layer.  This is left to
the target implementation to tackle.
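
One possible target-side policy, purely as a sketch: execute strict_order=0
PDUs on arrival, and hold strict_order=1 PDUs until every lower cmdSN has
been received, then release them in cmdSN order.  The class and method
names are hypothetical, sequence-number wraparound is ignored, and the
sketch does not settle the subtleties this proposal asks about.

```python
from typing import List, Set


class StrictOrderTarget:
    """Hypothetical target-side handling of (cmdSN, strict_order) PDUs."""

    def __init__(self, first_cmd_sn: int = 100) -> None:
        self.exp_cmd_sn = first_cmd_sn   # lowest cmdSN not yet received
        self.received: Set[int] = set()  # cmdSNs received at or above exp_cmd_sn
        self.held: Set[int] = set()      # strict_order=1 commands awaiting release
        self.executed: List[int] = []    # order handed to the SCSI layer

    def receive(self, cmd_sn: int, strict_order: int) -> None:
        # Record the arrival and advance exp_cmd_sn past any contiguous run.
        self.received.add(cmd_sn)
        while self.exp_cmd_sn in self.received:
            self.received.remove(self.exp_cmd_sn)
            self.exp_cmd_sn += 1
        if strict_order == 0:
            # Unflagged commands are executed as they arrive.
            self.executed.append(cmd_sn)
        else:
            self.held.add(cmd_sn)
        # Release held commands whose predecessors have all arrived,
        # lowest cmdSN first.
        for sn in sorted(s for s in self.held if s < self.exp_cmd_sn):
            self.held.remove(sn)
            self.executed.append(sn)


def replay_arrivals() -> List[int]:
    """Feed a reordered arrival sequence and return the execution order."""
    target = StrictOrderTarget(first_cmd_sn=100)
    arrivals = [(102, 0), (100, 0), (101, 0), (104, 1), (105, 1), (103, 1)]
    for cmd_sn, flag in arrivals:
        target.receive(cmd_sn, flag)
    return target.executed
```

Here the three unflagged commands run in arrival order, while 104 and 105
wait until 103 shows up and are then released behind it.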

Initiator checks (expCmdSN >= cmdSN of ordered cmd) and then resets
strict_order to zero for all subsequent PDUs.

simple  cmd (108, 0) ->
simple  cmd (109, 0) ->
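
The initiator side of the exchange above can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions: the class
and method names are hypothetical, the clearing test is the
(expCmdSN >= cmdSN of ordered cmd) check given above, and sequence-number
wraparound is ignored.

```python
from typing import List, Optional, Tuple


class StrictOrderInitiator:
    """Hypothetical initiator-side bookkeeping for the strict_order flag."""

    def __init__(self, first_cmd_sn: int = 100) -> None:
        self.next_cmd_sn = first_cmd_sn
        # cmdSN of the latest ordered command the target has not yet
        # acknowledged via expCmdSN; None when no such command is in flight.
        self.pending_ordered_sn: Optional[int] = None

    def send(self, ordered: bool = False) -> Tuple[int, int]:
        """Return the (cmdSN, strict_order) pair for the next command PDU."""
        cmd_sn = self.next_cmd_sn
        self.next_cmd_sn += 1
        if ordered:
            self.pending_ordered_sn = cmd_sn
        strict_order = 0 if self.pending_ordered_sn is None else 1
        return (cmd_sn, strict_order)

    def on_exp_cmd_sn(self, exp_cmd_sn: int) -> None:
        """Clear the flag once expCmdSN >= cmdSN of the ordered command."""
        if (self.pending_ordered_sn is not None
                and exp_cmd_sn >= self.pending_ordered_sn):
            self.pending_ordered_sn = None


def replay_figure() -> List[Tuple[int, int]]:
    """Replay the figure: 3 simple, 1 ordered, 4 simple, ack, 2 simple."""
    ini = StrictOrderInitiator(first_cmd_sn=100)
    pdus = [ini.send() for _ in range(3)]
    pdus.append(ini.send(ordered=True))
    pdus += [ini.send() for _ in range(4)]
    ini.on_exp_cmd_sn(103)
    pdus += [ini.send() for _ in range(2)]
    return pdus
```

Calling replay_figure() yields the same (cmdSN, strict_order) pairs as the
figure: flag 0 for 100-102, flag 1 for 103-107, and flag 0 again for 108-109.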

Other issues:
=============
If the basic scheme is OK, then we could later tackle other questions
such as "what about multiple ordered commands?" and the like.


From owner-ips@ece.cmu.edu  Fri Apr 20 15:21:33 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16751
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 15:21:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KI20R04297
	for ips-outgoing; Fri, 20 Apr 2001 14:02:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gordan.pl.gadzoox.com (i248.gadzoox.com [216.52.31.248])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KI1iA04286
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 14:01:44 -0400 (EDT)
Received: by gordan.pl.gadzoox.com with Internet Mail Service (5.5.2653.19)
	id <2HYZM5JQ>; Fri, 20 Apr 2001 10:59:29 -0700
Message-ID: <FEBDF95C7316D5119D7D00B0D0B0B7EA06B3DC@gordan.pl.gadzoox.com>
From: Vi Chau <vchau@gadzoox.com>
To: ips@ece.cmu.edu
Subject: URL for FCIP draft
Date: Fri, 20 Apr 2001 10:59:28 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Dear Colleagues,

   The updated FCIP draft has been submitted and may be found at:

http://www.gadzoox.com/download/draft-ietf-ips-fcovertcpip-02.txt


Regards,
Vi Chau



From owner-ips@ece.cmu.edu  Fri Apr 20 15:57:46 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA17217
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 15:57:44 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KIV3f05855
	for ips-outgoing; Fri, 20 Apr 2001 14:31:03 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mail.wrs.com (unknown-1-11.wrs.com [147.11.1.11])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KIUbA05831
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 14:30:37 -0400 (EDT)
Received: from london ([147.11.45.217])
	by mail.wrs.com (8.9.3/8.9.1) with SMTP id LAA02094
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 11:30:24 -0700 (PDT)
From: "Rod Harrison" <rod.harrison@windriver.com>
To: <ips@ece.cmu.edu>
Subject: iSCSI : Negotiable padding, was More issues.... Digest related.
Date: Fri, 20 Apr 2001 11:30:14 -0700
Message-ID: <NEBBKMMOEMCINPLCHKGMEEJOCGAA.rod.harrison@windriver.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
In-Reply-To: <C1256A34.002F9227.00@d12mta02.de.ibm.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian,

	I have a request with respect to data padding. Can we make
the pad size login negotiable please? Preferably on a per
direction basis. This would allow the pad to be optimized
for a receiver's specific requirements, e.g. cache line
alignment. Restricting padding to powers of 2 by specifying
the size as a power of 2 seems reasonable.

	For example:

	IPadSize=<any-power-of-two>
	TPadSize=<any-power-of-two>

	IPadSize=0
	TPadSize=2

	Would result in the initiator padding data to 4 byte
boundaries for the target, and the target inserting no pad
for the initiator.
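
	To make the arithmetic concrete, here is a small Python sketch.
It assumes, as in the example above, that the key value is the exponent
of the power of two; the function names are illustrative only.

```python
def padded_length(data_len: int, pad_size_exp: int) -> int:
    """Round data_len up to a multiple of 2**pad_size_exp bytes."""
    if data_len < 0 or pad_size_exp < 0:
        raise ValueError("length and exponent must be non-negative")
    boundary = 1 << pad_size_exp
    # Standard round-up-to-power-of-two mask trick.
    return (data_len + boundary - 1) & ~(boundary - 1)


def pad_bytes(data_len: int, pad_size_exp: int) -> int:
    """Number of pad bytes the sender would append."""
    return padded_length(data_len, pad_size_exp) - data_len
```

	With TPadSize=2, a 10 byte data segment sent to the target is
padded to 12 bytes (pad_bytes(10, 2) == 2); with IPadSize=0, the same
segment sent to the initiator is left at 10 bytes (pad_bytes(10, 0) == 0).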

	Also, a related question: if the pad is to remain mandatory,
is it expected that data will be padded if no data digests are in
use?

	- Rod

-----Original Message-----
From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu] On Behalf Of
julian_satran@il.ibm.com
Sent: Friday, April 20, 2001 1:45 AM
To: ips@ece.cmu.edu
Subject: Re: iSCSI : More issues.... Digest related.




1. The padding is to the next 4-byte word boundary.
2. There is a Security - Appendix

and there is a numbering/formatting error in the appendix.

Julo

Santosh Rao <santoshr@cup.hp.com> on 20/04/2001 03:25:49

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   IPS Reflector <ips@ece.cmu.edu>
cc:
Subject:  iSCSI : More issues.... Digest related.




Julian & All,

2 more issues :

1) If the DataSegmentLength in the BHS excludes padding bytes, how does
the initiator determine the location of the data digest [which is placed
after the padded data]?
There is no knowledge of what amount of padding is in use, since padding
can be 4 bytes or a multiple of that quantity.


2) While on the subject of digests... aren't there supposed to be
login keys to indicate the use or non-use of header and data digests? I
can't seem to find any such login keys in the latest revs
5.91....5.92....6.000...(?)

(Section 2.2.1 states :
"The digest types are negotiated during the login phase. ").

- Santosh
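
On issue 1 above: if padding is always to the next 4 byte word boundary,
as Julian's reply states, the digest position follows from
DataSegmentLength alone.  A minimal Python sketch, where data_start (the
byte offset at which the data segment begins) is a hypothetical parameter
and not a value taken from the draft:

```python
def data_digest_offset(data_start: int, data_segment_length: int) -> int:
    """Offset of the data digest, assuming pad to the next 4-byte boundary."""
    if data_start < 0 or data_segment_length < 0:
        raise ValueError("offsets and lengths must be non-negative")
    # DataSegmentLength excludes padding, so round it up to a multiple of 4.
    padded = (data_segment_length + 3) & ~3
    return data_start + padded
```

For an arbitrary data_start of 48, a 10 byte segment puts the digest at
offset 60, and so does a 12 byte segment.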





From owner-ips@ece.cmu.edu  Fri Apr 20 15:59:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA17241
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 15:59:19 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KIEOM04937
	for ips-outgoing; Fri, 20 Apr 2001 14:14:24 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KIDoA04910
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 14:13:50 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRNK6>; Fri, 20 Apr 2001 11:13:38 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173448@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: ips@ece.cmu.edu
Subject: RE: iSCSI Reqts: In-Order Delivery
Date: Fri, 20 Apr 2001 11:13:37 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

One minor comment:

> (1) MUST provide ordered delivery of SCSI commands from
> 	the initiator to the target in the absence of transport
> 	errors visible to iSCSI (e.g., iSCSI CRC failure,
> 	unexpected TCP connection closure).

Does the term "SCSI commands" include task management functions as well?  If
not, it should.

Charles

> -----Original Message-----
> From: Black_David@emc.com [mailto:Black_David@emc.com]
> Sent: Friday, April 20, 2001 7:11 AM
> To: ips@ece.cmu.edu
> Subject: iSCSI Reqts: In-Order Delivery
> 
> 
> About a week ago, Santosh Rao wrote:
> 
> > I object to the following requirement :
> > 
> > " MUST support ordered delivery of SCSI commands from the initiator to
> > the  target, to support SCSI Task Queuing. "
> 
> In the hopes of driving this discussion to closure,
> let me summarize my understanding of this situation:
> 
> - SAM2 expects ordered delivery, but does not mandate
> 	it.  Section 4.6.2 of SAM2r16 says:
> 
>   For convenience, the SCSI architecture model assumes in-order delivery
>   to be a property of the service delivery subsystem. This assumption is
>   made to simplify the description of behavior and does not constitute a
>   requirement.
> 
> 	I will observe that there is well-known code for disk
> 	initiators (e.g., staircase/elevator/etc. command queue
> 	reordering for seek optimization) that relies on this
> 	expectation for performance (not correctness).  IMHO,
> 	making in-order delivery purely OPTIONAL would not be
> 	consistent with the spirit/intent of the above SAM2 text.
> 
> - In existing SCSI transports, transport errors (e.g., CRC failure)
> 	can cause ordering issues at higher levels.  Both Parallel SCSI
> 	and Fibre Channel initially drop anything that has a transport
> 	error.  IF the higher level code retries AND other commands were
> 	executed before the error was noticed by the retry logic, THEN
> 	the retry occurs in a different place in the command sequence
> 	from the original command.  I believe that existing tape
> 	implementations usually either don't retry, or spoon-feed the
> 	tape one command at the time to make sure no other command can
> 	get in the way.  FCP-2 has the ability to prevent other commands
> 	from being executed ...
> 
> - iSCSI has to face this issue because we're incorporating a retry
> 	mechanism into the base protocol.  Parallel SCSI and FCP-1 could
> 	ignore this because they didn't do retries.  FCP-2 has a retry
> 	mechanism, but makes order preservation of retries negotiable -
> 	order of command delivery in the face of retries is only assured
> 	if CRN is in use, and that is controlled by a settable bit in a
> 	mode page (FCP2r7, 4.3).  The Initiator can choose not to set
> 	this bit, and the Target can refuse to allow the Initiator to set
> 	it.
> 
> - CRN is a relatively new mechanism, and both it and the associated
> 	FCP-2 retry mechanism are likely to be implemented for/used by
> 	tapes.  Disks will still have an expectation of in-order delivery
> 	in some cases, even though neither of these mechanisms are in use.
> 	IMHO, Santosh's proposal to use CRN for all expectations/requirements
> 	of ordering doesn't seem like a good idea for this reason.
> 
> So, let me try the following proposal to resolve this issue/objection:
> 
> (1) MUST provide ordered delivery of SCSI commands from
> 	the initiator to the target in the absence of transport
> 	errors visible to iSCSI (e.g., iSCSI CRC failure,
> 	unexpected TCP connection closure).
> (2) MUST specify the ability to preserve ordered delivery
> 	of SCSI commands even in the presence of transport
> 	errors.  A mechanism MUST be provided to allow
> 	Initiators and Targets to negotiate this preservation
> 	on a per-session or finer granularity basis, 
> 
> The Rationale for (1) is the combination of the SAM2 expectation
> plus the fact that there are situations in which disks expect
> this ordering in the absence of mechanisms like CRN that can
> enforce it.  The Rationale for (2) is to provide support
> analogous to FCP-2 - this should be sufficient for tapes to
> obtain the behavior they require.
> 
> Comments?
> 
> Thanks,
> --David
> 
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
> 


From owner-ips@ece.cmu.edu  Fri Apr 20 17:06:54 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA18064
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 17:06:53 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KJ92l08059
	for ips-outgoing; Fri, 20 Apr 2001 15:09:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from msgbas1.cos.agilent.com (msgbas1x.cos.agilent.com [192.6.9.33])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KJ8DA08035
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 15:08:13 -0400 (EDT)
Received: from msgrel1.and.agilent.com (msgrel1.and.agilent.com [130.30.33.104])
	by msgbas1.cos.agilent.com (Postfix) with ESMTP id 8BFFE1631
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 13:08:12 -0600 (MDT)
Received: from rtl.rose.agilent.com (rtl.rose.agilent.com [156.140.232.231])
	by msgrel1.and.agilent.com (Postfix) with ESMTP id A0864146
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 15:08:10 -0400 (EDT)
Received: from mail.rose.agilent.com (mailsrv@bellhop [156.140.233.51])
	by rtl.rose.agilent.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.1.0) with ESMTP id MAA13957
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 12:08:08 -0700 (PDT)
Received: from agilent.com (matt5670.rose.agilent.com [156.140.234.123])
          by mail.rose.agilent.com (Netscape Messaging Server 3.6)
           with ESMTP id AAA5347; Fri, 20 Apr 2001 12:08:04 -0700
Message-ID: <3AE089DB.8A08023D@agilent.com>
Date: Fri, 20 Apr 2001 12:11:23 -0700
From: "Matt Wakeley" <matt_wakeley@agilent.com>
Reply-To: Matt Wakeley <matt_wakeley@agilent.com>
Organization: Agilent Technologies
X-Mailer: Mozilla 4.77 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com, ips@ece.cmu.edu
Cc: David Black <Black_David@emc.com>
Subject: Re: iSCSI : New PDU opcode usage in rev 5.92
References: <C1256A34.00216398.00@d12mta02.de.ibm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian,

I like the "direction bit".  It is a way to exactly determine what the message
is, without the need to know the context (especially useful when analyzing
what's happening on the link).

If you think we are running short on opcodes, then don't use a "direction
bit", but just renumber the opcodes for unique values (for example, 0x01 =
command, 0x02 = response, etc.).

On another point, I don't think you should be changing things as fundamental
as this without direction from the group.  This "standard" is supposed to be
stabilizing, not constantly changing.

-Matt Wakeley
Agilent Technologies


julian_satran@il.ibm.com wrote:
> 
> I would like to add to Venkat's remarks only that this asymmetry has been
> with us in iSCSI forever
> and we had even a statement to the effect that targets should not issue
> initiator codes etc. (this is irrelevant now as the codes overlap).
> 
> The reason I took out the "direction bit" (meant more for observers) was
> that I felt that we are low on codes :-)
> 
> Julo
> 
> Venkat Rangan <venkat@rhapsodynetworks.com> on 19/04/2001 07:58:23
> 
> Please respond to Venkat Rangan <venkat@rhapsodynetworks.com>
> 
> To:   "'Santosh Rao'" <santoshr@cup.hp.com>, ips@ece.cmu.edu
> cc:
> Subject:  RE: iSCSI : New PDU opcode usage in rev 5.92
> 
> Santosh,
> 
> Is it not the case that requests go in the direction from the Initiator to
> Target, where Target is the one "listening" for new connections on the
> well-known port?
> A dual mode scsi implementation therefore has two separate sessions and
> sets of connections.
> One set is [I->DualModeTarget] and the other is [DualModeInitiator->T]
> and the connections are independent. If I and T happen to be the same
> system, you cannot use a single connection for bidirectional sessions
> between the two.
> 
> So if you receive a PDU from a target, you can only do so with SourcePort
> set to well-known-port, and it must be a Response from target. Maybe I'm
> assuming something that is not valid...
> 
> Venkat Rangan
> Rhapsody Networks Inc.
> http://www.rhapsodynetworks.com
> 
> -----Original Message-----
> From: Santosh Rao [mailto:santoshr@cup.hp.com]
> Sent: Wednesday, April 18, 2001 6:47 PM
> To: ips@ece.cmu.edu
> Subject: iSCSI : New PDU opcode usage in rev 5.92
> 
> Julian & All,
> 
> I've got a quick question on how the new opcode layouts would work for
> dual mode scsi implementations. (i.e. initiators that responded in
> target mode or targets that acted as initiators also).
> 
> The new opcode layout is :
> 
> ----------------
> X|I| | | | | | |
> ----------------
> 7 6 5 4 3 2 1 0
> 
> where bits 5-0 -> opcode
> X -> retry bit
> I -> immediate bit
> 
> The same values are used for the command as well as response opcodes and
> bits X & I are intended to both be set to 1 by targets.
> 
> i.e. opcode for scsi command = scsi response = 0x01. the distinction b/n
> command and response is based on targets setting X & I bits to 1.
> 
> Now, if an initiator [capable of target mode] sent the following
> commands, how would they be interpreted :
> 
> 1) 0xc4.
> is this a text command being retried in immediate mode,
> or is it a text response ?
> 
> 2) 0xc1
> is this a scsi command being retried in immediate mode,
> or is it a scsi response ?
> 
> 3) 0xc2
> is this a scsi task mgmt command being retried in immediate mode,
> or is it a scsi task mgmt response ?
> 
> etc.....
> 
> - Santosh
> 
> --
> #################################
> Santosh Rao
> Software Design Engineer,
> HP, Cupertino.
> email : santoshr@cup.hp.com
> Phone : 408-447-3751
> #################################
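
The ambiguity in the quoted examples can be reproduced with a short Python
sketch.  It decodes the byte using the layout given in the quoted message
(bit 7 = X, bit 6 = I, bits 5-0 = opcode) and lists every reading that
layout allows.  The opcode names are only those that can be read off the
quoted examples; the function names are illustrative.

```python
from typing import List, Tuple

# Opcode values as they appear in the quoted examples (0xc1, 0xc2, 0xc4).
OPCODE_NAMES = {0x01: "scsi", 0x02: "scsi task mgmt", 0x04: "text"}


def split_opcode_byte(byte: int) -> Tuple[int, int, int]:
    """Return (X, I, opcode) for the layout X|I|opcode[5:0]."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte")
    return (byte >> 7) & 1, (byte >> 6) & 1, byte & 0x3F


def readings(byte: int) -> List[str]:
    """List the interpretations the layout permits for one opcode byte."""
    x, i, opcode = split_opcode_byte(byte)
    name = OPCODE_NAMES.get(opcode, "opcode 0x%02x" % opcode)
    command = name + " command"
    if x:
        command += ", retried"
    if i:
        command += ", immediate"
    result = [command]
    # Targets mark responses by setting both X and I, so a byte with both
    # bits set can equally be read as a response.
    if x and i:
        result.append(name + " response")
    return result
```

readings(0xc4) returns two entries, "text command, retried, immediate" and
"text response", which is exactly the question raised in the quoted
message; a byte such as 0x01 has only one reading.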


From owner-ips@ece.cmu.edu  Fri Apr 20 17:06:57 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA18075
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 17:06:56 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KJDCq08300
	for ips-outgoing; Fri, 20 Apr 2001 15:13:12 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KJCNA08268
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 15:12:23 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP id 8B4B03183
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 12:12:19 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id MAA11980
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 12:12:14 -0700 (PDT)
Message-ID: <3AE089D0.86FC46B7@cup.hp.com>
Date: Fri, 20 Apr 2001 12:11:12 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: iSCSI : EnableACA
References: <C1256A34.00216472.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------654F07A3229416C2DD4AD619"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------654F07A3229416C2DD4AD619
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Julian,

Would the following not satisfy the requirements for dealing with this
ACA issue :

1) Initiators determine the target support for ACA through the NACA bit
in the INQUIRY response. (Assuming iSCSI targets have implemented ACA in
good faith, this would be supported.)

2) Initiators set the NACA bit in the CDBs of commands that need strong
ordering. (This could be a small subset of the I/O traffic to one or
more LUNs within the session and not required for all the I/Os in that
session.)

3) Any exception condition on a SCSI I/O for which the NACA bit was set
results in ACA being established.
Thus, ACA would only be applied if some I/O traffic that required strong
ordering was affected by the exception condition.

4) Since the initiator is ACA capable based on its usage of the NACA
bit, it should also be capable of performing the desired Clear ACA to
recover from this condition.
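
Steps 1 and 2 can be sketched in Python as follows.  The bit positions are
assumptions recalled from the SCSI standards and should be checked against
them: NormACA as bit 5 of byte 3 of the standard INQUIRY data, and NACA as
bit 2 of the CONTROL byte, which is the last byte of a fixed-length 6, 10,
12 or 16 byte CDB.  The function names are illustrative.

```python
NORMACA_MASK = 0x20  # assumed: byte 3, bit 5 of standard INQUIRY data
NACA_MASK = 0x04     # assumed: bit 2 of the CDB CONTROL byte


def target_supports_aca(inquiry_data: bytes) -> bool:
    """Step 1: check the ACA support bit in the INQUIRY response."""
    if len(inquiry_data) < 4:
        raise ValueError("INQUIRY data too short")
    return bool(inquiry_data[3] & NORMACA_MASK)


def set_naca(cdb: bytes) -> bytes:
    """Step 2: return a copy of a fixed-length CDB with NACA set."""
    if len(cdb) not in (6, 10, 12, 16):
        raise ValueError("only fixed-length CDBs are handled here")
    out = bytearray(cdb)
    out[-1] |= NACA_MASK  # CONTROL is the last byte of these CDB formats
    return bytes(out)
```

Only the commands that need strong ordering would be passed through
set_naca(); all other I/O in the session keeps NACA clear.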

Such an approach would only apply ACA and its corresponding recovery
when some strongly ordered I/O encountered an exception condition,
rather than applying ACA on a session granularity.

To summarize, the above approach allows :
- ACA to be turned on/off for a subset of I/Os headed to a LUN
- ACA based recovery only used where needed.
- Keeps iSCSI ACA-unaware, and rightly so, since this is a property of
the SCSI ULP.
- Avoids applying ACA recovery on a session granularity.

What am I missing here? Why is an EnableACA needed?

- Santosh


julian_satran@il.ibm.com wrote:

> All references to
> EnableACA are redundant and should be removed for the following reasons
> :
> 
> a) An initiator knows whether a target supports ACA from the NACA bit in
> the INQUIRY response. When a target indicates support for ACA, the
> initiator can use it by setting the NACA bit in the CDBs it sends. There
> is NO need for any sort of negotiation of this behaviour above and
> beyond what is already provided thru SCSI mechanisms.
> 
> b) The ACA is a SCSI ULP concept and iSCSI should not be negotiating its
> use or lack thereof. This is done thru the NACA bit in CDBs.
> 
> c) (As a side note, the description of EnableACA on pg 127 refers to its
> presence in the lun control mode page, but it is actually present in the
> protocol specific port page.)
> 
> d) ACA is a LUN-level (more an I/O level) control option. It MUST NOT be
> negotiated on a per-session basis. SCSI allows initiators to request ACA
> behaviour on a per I/O basis through the use of NACA bit in the CDBs.
> 

 +++ We have required ACA to be supported by all new iSCSI targets and
 several actions require the target to enter ACA state.
 It was brought to our attention that many initiators will not react
 properly to a target entering ACA state (not do the reset).
 The EnableACA bit and key are meant to enable an initiator to control
 this iSCSI specific ACA behaviour.  This behaviour is related to
 asynchronous events and is not controlled by the NACA CDB bit.
 
 ++++
--------------654F07A3229416C2DD4AD619
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------654F07A3229416C2DD4AD619--



From owner-ips@ece.cmu.edu  Fri Apr 20 17:10:24 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA18119
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 17:10:22 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KIm5i06784
	for ips-outgoing; Fri, 20 Apr 2001 14:48:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel3.hp.com (palrel3.hp.com [156.153.255.226])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KIlCA06735
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 14:47:12 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel3.hp.com (Postfix) with ESMTP
	id 16B1A17E1; Fri, 20 Apr 2001 11:47:12 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id LAA08690;
	Fri, 20 Apr 2001 11:47:07 -0700 (PDT)
Message-ID: <3AE083ED.87D1A468@cup.hp.com>
Date: Fri, 20 Apr 2001 11:46:05 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Charles Monia <cmonia@NishanSystems.com>
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI Reqts: In-Order Delivery
References: <B300BD9620BCD411A366009027C21D9B173448@ariel.nishansystems.com>
Content-Type: multipart/mixed;
 boundary="------------95203DFCD74B1CAD8DD4BBBE"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------95203DFCD74B1CAD8DD4BBBE
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Charles Monia wrote:

> > (1) MUST provide ordered delivery of SCSI commands from
> >       the initiator to the target in the absence of transport
> >       errors visible to iSCSI (e.g., iSCSI CRC failure,
> >       unexpected TCP connection closure).
> 
> Does the term "SCSI commands" include task management functions as well?  If
> not, it should.


Charles,

Could iSCSI use a variant of the approach FCP-2 takes to solve the
ordering issue for task mgmt error recovery ?

The FCP-2 task management error recovery scheme is :
- task mgmt function uses CRN 0
- task mgmt function is executed immediately with no ordering latencies
- both initiator & target clear all resources that can be cleared
un-ambiguously.
- any ambiguous exchanges shall be aborted by the port that detects the
ambiguous state.

In the case of iSCSI, an analogous approach could be :
- task mgmt function uses immediate delivery flag for the task mgmt PDU.
- task mgmt fn executed immediately avoiding any ordering latencies.
- initiator & target clear all resources that can be cleared
un-ambiguously.
- initiator uses Abort Task to explicitly abort all active outstanding
I/Os at the time the task mgmt fn was issued to avoid any ambiguous
stale PDUs of an exchange from appearing at the target. 

Such an approach would avoid latencies on the execution of the task mgmt
fn while still flushing out all the stale PDUs upon completion of the
initiator actions for that task mgmt fn.
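
The iSCSI variant above can be sketched as follows.  This is a toy
model only: the Initiator class, the PDU tuples and the function name
strings are hypothetical, and sending the Abort Task requests as
immediate is an assumption carried over from the first bullet.

```python
from dataclasses import dataclass, field


@dataclass
class Initiator:
    """Toy initiator that records the PDUs it would send (hypothetical model)."""
    outstanding: list = field(default_factory=list)  # task tags of active I/Os
    sent: list = field(default_factory=list)         # (pdu_type, tag, immediate)

    def issue_task_mgmt(self, function: str) -> None:
        # Snapshot the I/Os that are active when the function is issued.
        active = list(self.outstanding)
        # Send the task management PDU flagged for immediate delivery.
        self.sent.append((function, None, True))
        # Explicitly abort every I/O that was outstanding, so that no stale
        # PDU for those tasks can later appear at the target.
        for tag in active:
            self.sent.append(("ABORT_TASK", tag, True))
            self.outstanding.remove(tag)
```

I/Os started after the task mgmt fn was issued are not in the snapshot
and are therefore not aborted.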

Regards,
Santosh



> > -----Original Message-----
> > From: Black_David@emc.com [mailto:Black_David@emc.com]
> > Sent: Friday, April 20, 2001 7:11 AM
> > To: ips@ece.cmu.edu
> > Subject: iSCSI Reqts: In-Order Delivery
> >
> >
> > About a week ago, Santosh Rao wrote:
> >
> > > I object to the following requirement :
> > >
> > > " MUST support ordered delivery of SCSI commands from the initiator to
> > > the  target, to support SCSI Task Queuing. "
> >
> > In the hopes of driving this discussion to closure,
> > let me summarize my understanding of this situation:
> >
> > - SAM2 expects ordered delivery, but does not mandate
> >       it.  Section 4.6.2 of SAM2r16 says:
> >
> >   For convenience, the SCSI architecture model assumes in-order delivery
> >   to be a property of the service delivery subsystem. This assumption is
> >   made to simplify the description of behavior and does not constitute a
> >   requirement.
> >
> >       I will observe that there is well-known code for disk
> >       initiators (e.g., staircase/elevator/etc. command queue
> >       reordering for seek optimization) that relies on this
> >       expectation for performance (not correctness).  IMHO,
> >       making in-order delivery purely OPTIONAL would not be
> >       consistent with the spirit/intent of the above SAM2 text.
> >
> > - In existing SCSI transports, transport errors (e.g., CRC failure)
> >       can cause ordering issues at higher levels.  Both Parallel SCSI
> >       and Fibre Channel initially drop anything that has a transport
> >       error.  IF the higher level code retries AND other commands were
> >       executed before the error was noticed by the retry logic, THEN
> >       the retry occurs in a different place in the command sequence
> >       from the original command.  I believe that existing tape
> >       implementations usually either don't retry, or spoon-feed the
> >       tape one command at the time to make sure no other command can
> >       get in the way.  FCP-2 has the ability to prevent other commands
> >       from being executed ...
> >
> > - iSCSI has to face this issue because we're incorporating a retry
> >       mechanism into the base protocol.  Parallel SCSI and FCP-1 could
> >       ignore this because they didn't do retries.  FCP-2 has a retry
> >       mechanism, but makes order preservation of retries negotiable -
> >       order of command delivery in the face of retries is only assured
> >       if CRN is in use, and that is controlled by a settable bit in a
> >       mode page (FCP2r7, 4.3).  The Initiator can choose not to set
> >       this bit, and the Target can refuse to allow the Initiator to set
> >       it.
> >
> > - CRN is a relatively new mechanism, and both it and the associated
> >       FCP-2 retry mechanism are likely to be implemented for/used by
> >       tapes.  Disks will still have an expectation of in-order delivery
> >       in some cases, even though neither of these mechanisms are in use.
> >       IMHO, Santosh's proposal to use CRN for all expectations/requirements
> >       of ordering doesn't seem like a good idea for this reason.
> >
> > So, let me try the following proposal to resolve this issue/objection:
> >
> > (1) MUST provide ordered delivery of SCSI commands from
> >       the initiator to the target in the absence of transport
> >       errors visible to iSCSI (e.g., iSCSI CRC failure,
> >       unexpected TCP connection closure).
> > (2) MUST specify the ability to preserve ordered delivery
> >       of SCSI commands even in the presence of transport
> >       errors.  A mechanism MUST be provided to allow
> >       Initiators and Targets to negotiate this preservation
> >       on a per-session or finer granularity basis,
> >
> > The Rationale for (1) is the combination of the SAM2 expectation
> > plus the fact that there are situations in which disks expect
> > this ordering in the absence of mechanisms like CRN that can
> > enforce it.  The Rationale for (2) is to provide support
> > analogous to FCP-2 - this should be sufficient for tapes to
> > obtain the behavior they require.
> >
> > Comments?
> >
> > Thanks,
> > --David
> >
> > ---------------------------------------------------
> > David L. Black, Senior Technologist
> > EMC Corporation, 42 South St., Hopkinton, MA  01748
> > +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> > black_david@emc.com       Mobile: +1 (978) 394-7754
> > ---------------------------------------------------
> >
--------------95203DFCD74B1CAD8DD4BBBE--



From owner-ips@ece.cmu.edu  Fri Apr 20 17:13:56 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA18155
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 17:13:55 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KJKDf08729
	for ips-outgoing; Fri, 20 Apr 2001 15:20:13 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KJJnA08705
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 15:19:49 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3KKRa122871;
	Fri, 20 Apr 2001 13:27:36 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI Reqts: In-Order Delivery
Date: Fri, 20 Apr 2001 12:17:45 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJGEJOCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <0F31E5C394DAD311B60C00E029101A070801547A@corpmx9.isus.emc.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

In my prior message:
http://ips.pdl.cs.cmu.edu/mail/msg04218.html

I included a concern with respect to the in-order delivery of Server
responses.  In particular, it concerns Management responses with respect to
prior Server responses.  With multiple connections, the per-connection
serialization from the Server does not allow any assurances with respect to
sequential delivery of responses.  This may cause unexpected return of
status assumed encompassed by a management response affected by Management
requests sent on different connections.
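
To illustrate the concern (a sketch only, with made-up response names):
if each connection delivers its own responses in order but nothing
orders responses across connections, the initiator can observe any
merge of the two per-connection streams.

```python
def arrival_orders(conn_a, conn_b):
    """Yield every merge of two lists that keeps each list's own order."""
    if not conn_a or not conn_b:
        yield list(conn_a) + list(conn_b)
        return
    # Next response to arrive comes from connection A ...
    for rest in arrival_orders(conn_a[1:], conn_b):
        yield [conn_a[0]] + rest
    # ... or from connection B.
    for rest in arrival_orders(conn_a, conn_b[1:]):
        yield [conn_b[0]] + rest


# Status for a command travels on one connection, the Management
# response on another (both names are hypothetical).
orders = list(arrival_orders(["status(cmd1)"], ["mgmt_response"]))
```

One of the two orders has the Management response arriving ahead of the
status it was assumed to cover.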

In general, the Must requirement seems appropriate.  There may be some gray
area with respect to SCSI layer delivery.  Admittedly, I have not fully
reviewed the latest version of the iSCSI specifications.

Doug


> About a week ago, Santosh Rao wrote:
>
> > I object to the following requirement :
> >
> > " MUST support ordered delivery of SCSI commands from the initiator to
> > the  target, to support SCSI Task Queuing. "
>
> In the hopes of driving this discussion to closure,
> let me summarize my understanding of this situation:
>
> - SAM2 expects ordered delivery, but does not mandate
> 	it.  Section 4.6.2 of SAM2r16 says:
>
>   For convenience, the SCSI architecture model assumes in-order delivery
>   to be a property of the service delivery subsystem. This assumption is
>   made to simplify the description of behavior and does not constitute a
>   requirement.
>
> 	I will observe that there is well-known code for disk
> 	initiators (e.g., staircase/elevator/etc. command queue
> 	reordering for seek optimization) that relies on this
> 	expectation for performance (not correctness).  IMHO,
> 	making in-order delivery purely OPTIONAL would not be
> 	consistent with the spirit/intent of the above SAM2 text.
>
> - In existing SCSI transports, transport errors (e.g., CRC failure)
> 	can cause ordering issues at higher levels.  Both Parallel SCSI
> 	and Fibre Channel initially drop anything that has a transport
> 	error.  IF the higher level code retries AND other commands were
> 	executed before the error was noticed by the retry logic, THEN
> 	the retry occurs in a different place in the command sequence
> 	from the original command.  I believe that existing tape
> 	implementations usually either don't retry, or spoon-feed the
> 	tape one command at the time to make sure no other command can
> 	get in the way.  FCP-2 has the ability to prevent other commands
> 	from being executed ...
>
> - iSCSI has to face this issue because we're incorporating a retry
> 	mechanism into the base protocol.  Parallel SCSI and FCP-1 could
> 	ignore this because they didn't do retries.  FCP-2 has a retry
> 	mechanism, but makes order preservation of retries negotiable -
> 	order of command delivery in the face of retries is only assured
> 	if CRN is in use, and that is controlled by a settable bit in a
> 	mode page (FCP2r7, 4.3).  The Initiator can choose not to set
> 	this bit, and the Target can refuse to allow the Initiator to set
> 	it.
>
> - CRN is a relatively new mechanism, and both it and the associated
> 	FCP-2 retry mechanism are likely to be implemented for/used by
> 	tapes.  Disks will still have an expectation of in-order delivery
> 	in some cases, even though neither of these mechanisms are in use.
> 	IMHO, Santosh's proposal to use CRN for all expectations/requirements
> 	of ordering doesn't seem like a good idea for this reason.
>
> So, let me try the following proposal to resolve this issue/objection:
>
> (1) MUST provide ordered delivery of SCSI commands from
> 	the initiator to the target in the absence of transport
> 	errors visible to iSCSI (e.g., iSCSI CRC failure,
> 	unexpected TCP connection closure).
> (2) MUST specify the ability to preserve ordered delivery
> 	of SCSI commands even in the presence of transport
> 	errors.  A mechanism MUST be provided to allow
> 	Initiators and Targets to negotiate this preservation
> 	on a per-session or finer granularity basis,
>
> The Rationale for (1) is the combination of the SAM2 expectation
> plus the fact that there are situations in which disks expect
> this ordering in the absence of mechanisms like CRN that can
> enforce it.  The Rationale for (2) is to provide support
> analogous to FCP-2 - this should be sufficient for tapes to
> obtain the behavior they require.
>
> Comments?
>
> Thanks,
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
>



From owner-ips@ece.cmu.edu  Fri Apr 20 17:50:38 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA18538
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 17:50:37 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KHCwi01777
	for ips-outgoing; Fri, 20 Apr 2001 13:12:58 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxbh4.isus.emc.com ([168.159.1.81])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KHCDA01761
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 13:12:13 -0400 (EDT)
Received: by mxbh4.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <JG3Z96MG>; Fri, 20 Apr 2001 13:12:07 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A070801547E@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: julian_satran@il.ibm.com, ips@ece.cmu.edu
Subject: RE: iSCSI : login keys & mode page settings
Date: Fri, 20 Apr 2001 13:12:04 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Without getting into the technical details of the
discussion, I have a couple of observations:

(A) The issue of whether to allow mode page
	access to and modification of iSCSI parameters
	will need to be taken up at the interim
	meeting.  IMHO, access seems like a good
	idea, so that SCSI-generic code that doesn't
	know specifically about iSCSI can find
	what it expects where it expects it, but
	I'm unsure about modification because it
	may carry a risk of code that's iSCSI-unaware
	getting something wrong.  The mode page
	commands should be transparent to iSCSI.

(B) The mode page and text key mechanisms have
	to access the same data.  Section 3 of the
	-06 version says this, but needs some editing
	to enforce it by using "MUST" or its equivalent
	(cf. RFC 2119).  This is to prevent an
	implementation from having two instances of
	the same parameter - one for the mode page and
	one for the text keys - which would be a bad
	thing.
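
A minimal sketch of point (B), assuming a hypothetical SessionParams
class: both the text key path and the mode page path read the one
stored value, so there is no second instance to drift out of sync.
The mode page path is read-only here, which is only one possible
answer to the open question in (A).

```python
class SessionParams:
    """Single backing store shared by text keys and mode pages (hypothetical)."""

    def __init__(self):
        self._values = {}  # the only copy of each parameter

    def set_from_text_key(self, key, value):
        # Text key negotiation writes the shared store.
        self._values[key] = value

    def get_text_key(self, key):
        return self._values[key]

    def mode_page_view(self):
        # Mode page reads are served from the same store; a copy is
        # returned so callers cannot modify parameters through this path.
        return dict(self._values)
```

For example, after set_from_text_key("EnableACA", "yes"), the value
reported by mode_page_view() is the same "yes".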

--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Fri Apr 20 20:57:39 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA20520
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 20:57:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KK24Q10967
	for ips-outgoing; Fri, 20 Apr 2001 16:02:04 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KK1TA10947
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 16:01:29 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT34Z9X>; Fri, 20 Apr 2001 16:02:56 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015483@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: RE: Internet-Draft delays
Date: Fri, 20 Apr 2001 16:01:21 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Update on this - all the drafts submitted
by about mid-day today should hit the I-D
servers tonight and be announced to the list
tomorrow.  --David

> -----Original Message-----
> From:	Black_David@emc.com [SMTP:Black_David@emc.com]
> Sent:	Thursday, April 19, 2001 9:41 PM
> To:	ips@ece.cmu.edu
> Subject:	Internet-Draft delays
> 
> There appears to be a serious delay in getting submitted
> drafts posted to the Internet-Draft servers.  Authors of
> drafts for the interim meeting should post them to a
> web site and send the URL to the mailing list.  Who would
> have thought that we needed to coordinate the interim
> meeting schedule with the I-D administrator :-( ??
> 
> --David
> 
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------



From owner-ips@ece.cmu.edu  Fri Apr 20 20:58:44 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA20554
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 20:58:38 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KKcEk12842
	for ips-outgoing; Fri, 20 Apr 2001 16:38:14 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from maho3msx2.isus.emc.com (maho3msx2.isus.emc.com [128.221.11.32])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KKbdA12810
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 16:37:39 -0400 (EDT)
Received: by maho3msx2.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <28S7VKDH>; Fri, 20 Apr 2001 16:37:34 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015485@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: RE: iSCSI Reqts: In-Order Delivery
Date: Fri, 20 Apr 2001 16:37:33 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Focusing solely on the discussion needed to resolve the
(last call) issue in the requirements draft:

(A) Charles suggests that "ordered delivery of SCSI commands"
	should include task management commands.  That
	was the intent of the proposal and words should be
	added to make this clear.  Section 7.3 of the -06
	version of the main iSCSI document contains an
	initial version of a description of how task management
	commands can be executed immediately but have the
	effects they would have had if delivered in order.

(B) Doug is concerned that the task management response
	may arrive before the responses to one or more
	commands that were affected by the task management
	command.  While his technical concern is valid,
	and has/is being discussed, I don't think foreclosing
	that discussion by requiring session-wide
	synchronization of responses in the requirements
	document is the right thing to do.  Hence I would
	not change the proposal to require such synchronization.
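
As a rough illustration of requirement (1) and of point (A), the sketch
below delivers commands in the order of a session-wide sequence number
while letting an immediate command bypass the queue.  The class, the
cmd_sn parameter and the immediate flag are assumptions made for this
example, and sequence number wraparound is ignored.

```python
class OrderedDelivery:
    """Toy target-side reorder buffer (hypothetical, no wraparound handling)."""

    def __init__(self, first_sn=1):
        self.expected = first_sn  # next sequence number to hand to SCSI
        self.pending = {}         # commands that arrived ahead of their turn
        self.delivered = []       # commands in the order SCSI sees them

    def receive(self, cmd_sn, command, immediate=False):
        if immediate:
            # Immediate commands (e.g. task management) are not queued.
            self.delivered.append(command)
            return
        self.pending[cmd_sn] = command
        # Release every command that is now in sequence.
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1
```

A command that arrives early on one connection simply waits in pending
until the gap before it is filled.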

Thanks,
--David

p.s. Here's the proposal and brief Rationale:

> > So, let me try the following proposal to resolve this issue/objection:
> >
> > (1) MUST provide ordered delivery of SCSI commands from
> > 	the initiator to the target in the absence of transport
> > 	errors visible to iSCSI (e.g., iSCSI CRC failure,
> > 	unexpected TCP connection closure).
> > (2) MUST specify the ability to preserve ordered delivery
> > 	of SCSI commands even in the presence of transport
> > 	errors.  A mechanism MUST be provided to allow
> > 	Initiators and Targets to negotiate this preservation
> > 	on a per-session or finer granularity basis,
> >
> > The Rationale for (1) is the combination of the SAM2 expectation
> > plus the fact that there are situations in which disks expect
> > this ordering in the absence of mechanisms like CRN that can
> > enforce it.  The Rationale for (2) is to provide support
> > analogous to FCP-2 - this should be sufficient for tapes to
> > obtain the behavior they require.
> 
---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Fri Apr 20 21:03:20 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA20617
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 21:03:15 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KKxAX13904
	for ips-outgoing; Fri, 20 Apr 2001 16:59:10 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KKw6A13869
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 16:58:06 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT347YR>; Fri, 20 Apr 2001 16:59:34 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015486@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: matt_wakeley@agilent.com, ips@ece.cmu.edu
Subject: RE: iSCSI : New PDU opcode usage in rev 5.92
Date: Fri, 20 Apr 2001 16:57:58 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> On another point, I don't think you should be changing things as fundamental
> as this without direction from the group.  This "standard" is supposed to be
> stabilizing, not constantly changing.

With my WG co-chair hat on, Matt has a valid point.  Given the stage we're at
with the base iSCSI specification, any substantive change like this really
needs to be reviewed on the mailing list before being made, lest we repeat
the drawn-out experience with the header changes (which were initially made
shortly after the January meeting in Orlando).  At this juncture the right
thing to do is tee this up for discussion in Nashua with a request to Julian
to be more careful about these sort of changes in the future - even if the
technical reasons for doing this were 100% correct, not discussing this sort
of thing on the list beforehand risks having to do it over at great length
(timewise) and consternation.

Matt - in case this slips into a crack, please make sure it's brought up
at the Tuesday AM agenda bashing if it doesn't appear on the agenda.

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Fri Apr 20 22:01:07 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA22031
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 22:01:06 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KNqDZ21913
	for ips-outgoing; Fri, 20 Apr 2001 19:52:13 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KNq3A21906
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 19:52:04 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3L0xs123070;
	Fri, 20 Apr 2001 17:59:54 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI Reqts: In-Order Delivery
Date: Fri, 20 Apr 2001 16:50:04 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJKEKBCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <0F31E5C394DAD311B60C00E029101A0708015485@corpmx9.isus.emc.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

I suggested one solution that has several benefits.  This one suggestion is
not the only option to resolve this issue.  Connection Allegiance does not
resolve state with respect to Management requests.  Off hand I can think of
several other options as these requests are clearly indicated.  How this
problem is resolved should be considered a separate issue, but there is this
requirement that should not be overlooked.  My interpretation of SAM places
this obligation on the transport.

Doug


> Focusing solely on the discussion needed to resolve the
> (last call) issue in the requirements draft:
>
> (A) Charles suggests that "ordered delivery of SCSI commands"
> 	should include task management commands.  That
> 	was the intent of the proposal and words should be
> 	added to make this clear.  Section 7.3 of the -06
> 	version of the main iSCSI document contains an
> 	initial version of a description of how task management
> 	commands can be executed immediately but have the
> 	effects they would have had if delivered in order.
>
> (B) Doug is concerned that the task management response
> 	may arrive before the responses to one or more
> 	commands that were affected by the task management
> 	command.  While his technical concern is valid,
> 	and has/is being discussed, I don't think foreclosing
> 	that discussion by requiring session-wide
> 	synchronization of responses in the requirements
> 	document is the right thing to do.  Hence I would
> 	not change the proposal to require such synchronization.
>
> Thanks,
> --David
>



From owner-ips@ece.cmu.edu  Fri Apr 20 22:02:34 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA22043
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 22:02:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KNiEn21568
	for ips-outgoing; Fri, 20 Apr 2001 19:44:14 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KNhXA21552
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 19:43:33 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP id 6F71F4D9
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 16:43:32 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id QAA14377
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 16:43:27 -0700 (PDT)
Message-ID: <3AE0C961.7AB697B0@cup.hp.com>
Date: Fri, 20 Apr 2001 16:42:25 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI : login keys & mode page settings
References: <C1256A34.00216472.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------B91985E2B5DCB4AC83FF123A"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------B91985E2B5DCB4AC83FF123A
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

julian_satran@il.ibm.com wrote:

> 3) However, having allowed 2 mechanisms to set negotiation elements,
> iSCSI MUST
> then comment on the need to synchronize their settings in the 2 layers
> and also comment on the need to trigger a UNIT ATTENTION when changed
> through the login key mechanism.
> Again, I would vote for only 1 mechanism for setting these control
> options, rather than have to define communication schemes b/n the ULP
> and LLP to keep their values in synch and generate UNIT ATTENTION.
> 
> +++ Parameter changes originate from SCSI and iSCSI only enables another
> mechanism to convey them.
> This is an implementation issue +++

Julian,

From reading the spec, it is difficult to arrive at the above conclusion
that only 1 layer is currently allowed to make the changes, albeit
through 2 different mechanisms. If the changes are intended to always
originate from the SCSI layer (and I'm not sure why this should be the
case), would it not be more apt for the SCSI ULP to use a mode select,
which is the mechanism available at this layer, rather than invent down
calls into the iSCSI layer to map to an equivalent login/text key
setting?

Again, I would like to request that the setting of these control options
be made from 1 layer only. 


> 4)
> > If such a level of dual control is provided, the iSCSI login
> > keys listed above be made LO (leading only) to allow for changes to
> > operational parameters only during session login. This is to
> > minimize/eliminate disruption of ongoing I/O activity that occurs due to
> > the generation of a UNIT ATTENTION CHECK CONDITION when any change is
> > made to the above parameters.
> 
> Are we in agreement on the above ?
> 
> +++ No +++

Could you please throw light on the basis for LO classification ? If
some key negotiation after a login causes disruption of all outstanding
I/O (keeping in mind its impact on tape type devices), shouldn't such
a key be made LO (?)

IMHO, any negotiation key change that can disrupt I/O must be restricted
to login-time negotiation only (LO). This would prevent potential iSCSI
windows for disruption of tape I/O.


> 5)
> > If these operational parameters are allowed to be set through iSCSI
> > login and they also impact mode page settings, iSCSI spec should
> > describe the scope of the mode page setting in terms of whether this
> > setting is a saved page setting or not ?
> >
> +++ I don't know - I would rather think not +++
> 6)
> > Should saved page settings be allowed thru iSCSI ?
> +++ I don't know - I would rather think not +++

I agree with the above. Perhaps, the draft could explicitly state the
same. (Then again, setting these options through 1 mechanism alone would
solve this issue.)

- Santosh
--------------B91985E2B5DCB4AC83FF123A
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------B91985E2B5DCB4AC83FF123A--



From owner-ips@ece.cmu.edu  Fri Apr 20 22:06:15 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA22054
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 22:06:14 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3L0BEL22621
	for ips-outgoing; Fri, 20 Apr 2001 20:11:14 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3L0B3A22616
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 20:11:03 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 9580FFF; Fri, 20 Apr 2001 17:10:29 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id RAA19231;
	Fri, 20 Apr 2001 17:10:25 -0700 (PDT)
Message-ID: <3AE0CFB3.360AEB32@cup.hp.com>
Date: Fri, 20 Apr 2001 17:09:23 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Black_David@emc.com
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI : login keys & mode page settings
References: <0F31E5C394DAD311B60C00E029101A070801547E@corpmx9.isus.emc.com>
Content-Type: multipart/mixed;
 boundary="------------C3D9AE49124519FF346B3E20"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------C3D9AE49124519FF346B3E20
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

David,

Some clarification on the basis for classifying login keys as LO would
also be helpful. Should login keys that can disrupt I/O on their change
be allowed to be non-LO ?

Thanks,
Santosh

Black_David@emc.com wrote:
> 
> Without getting into the technical details of the
> discussion, I have a couple of observations:
> 
> (A) The issue of whether to allow mode page
>         access to and modification of iSCSI parameters
>         will need to be taken up at the interim
>         meeting.  IMHO, access seems like a good
>         idea, so that SCSI-generic code that doesn't
>         know specifically about iSCSI can find
>         what it expects where it expects it, but
>         I'm unsure about modification because it
>         may carry a risk of code that's iSCSI-unaware
>         getting something wrong.  The mode page
>         commands should be transparent to iSCSI.
> 
> (B) The mode page and text key mechanisms have
>         to access the same data.  Section 3 of the
>         -06 version says this, but needs some editing
>         to enforce it by using "MUST" or its equivalent
>         (cf. RFC 2119).  This is to prevent an
>         implementation from having two instances of
>         the same parameter - one for the mode page and
>         one for the text keys - which would be a bad
>         thing.
> 
> --David
> 
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
--------------C3D9AE49124519FF346B3E20
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------C3D9AE49124519FF346B3E20--



From owner-ips@ece.cmu.edu  Fri Apr 20 22:07:26 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA22065
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 22:07:25 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3L0PFo23116
	for ips-outgoing; Fri, 20 Apr 2001 20:25:15 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3L0OoA23095
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 20:24:50 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <2G6LR6P5>; Fri, 20 Apr 2001 20:15:23 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015488@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net, ips@ece.cmu.edu
Subject: RE: iSCSI Reqts: In-Order Delivery
Date: Fri, 20 Apr 2001 20:24:43 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Doug,

Attempting a fast exit ... I agree with the interpretation
of SAM insofar as SCSI responses are concerned - the
description of ABORT TASK in SAM (6.1) is clear that a
SCSI response to an aborted task must not be delivered to
an initiator after the FUNCTION COMPLETE from the ABORT
TASK that aborted it is, and similarly for both ABORT
TASK SET and CLEAR TASK SET.

Since this requirement is contained in the
existing requirement to adhere to SAM, we don't need
any additional text in the iSCSI requirements draft,
right ;-) ?

???,
--David

> -----Original Message-----
> From:	Douglas Otis [SMTP:dotis@sanlight.net]
> Sent:	Friday, April 20, 2001 7:50 PM
> To:	Black_David@emc.com; ips@ece.cmu.edu
> Subject:	RE: iSCSI Reqts: In-Order Delivery
> 
> David,
> 
> I suggested one solution that has several benefits.  This one suggestion
> is not the only option to resolve this issue.  Connection Allegiance does
> not resolve state with respect to Management requests.  Offhand I can
> think of several other options as these requests are clearly indicated.
> How this problem is resolved should be considered a separate issue, but
> there is this requirement that should not be overlooked.  My
> interpretation of SAM places this obligation on the transport.
> 
> Doug
> 
> 
> > Focusing solely on the discussion needed to resolve the
> > (last call) issue in the requirements draft:
> >
> > (A) Charles suggests that "ordered delivery of SCSI commands"
> > 	should include task management commands.  That
> > 	was the intent of the proposal and words should be
> > 	added to make this clear.  Section 7.3 of the -06
> > 	version of the main iSCSI document contains an
> > 	initial version of a description of how task management
> > 	commands can be executed immediately but have the
> > 	effects they would have had if delivered in order.
> >
> > (B) Doug is concerned that the task management response
> > 	may arrive before the responses to one or more
> > 	commands that were affected by the task management
> > 	command.  While his technical concern is valid,
> > 	and has/is being discussed, I don't think foreclosing
> > 	that discussion by requiring session-wide
> > 	synchronization of responses in the requirements
> > 	document is the right thing to do.  Hence I would
> > 	not change the proposal to require such synchronization.
> >
> > Thanks,
> > --David
> >


From owner-ips@ece.cmu.edu  Fri Apr 20 23:14:49 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA23547
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 23:14:48 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KLO8p15161
	for ips-outgoing; Fri, 20 Apr 2001 17:24:08 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel2.hp.com (atlrel2.hp.com [156.153.255.202])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KLNUA15141
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 17:23:30 -0400 (EDT)
Received: from core.rose.hp.com (core.rose.hp.com [15.43.208.100])
	by atlrel2.hp.com (Postfix) with ESMTP id 7378F287
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 17:23:29 -0400 (EDT)
Received: (from cbm@localhost) by core.rose.hp.com (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id OAA22477 for ips@ece.cmu.edu; Fri, 20 Apr 2001 14:24:32 -0700 (PDT)
Message-Id: <200104202124.OAA22477@core.rose.hp.com>
Subject: Re: iSCSI:out-of-order notification proposal
To: ips@ece.cmu.edu
Date: Fri, 20 Apr 2001 14:24:31 PDT
Reply-To: cbm@rose.hp.com
From: "Mallikarjun C." <cbm@rose.hp.com>
X-Mailer: Elm [revision: 212.4]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Sandeep,

Some comments on your proposal.

- The "strict_order" flag that you mention appears to carry
  information that's already contained by the SCSI task attributes
  (ATTR field).  

- This proposal requires the target iSCSI layer to look at more
  variables (SCSI-level information at that) for making its
  sequencing decisions. 

- Simple commands (intended to be) received prior to an Ordered 
  command must be executed before the Ordered one.  If you lose a 
  Simple command, you must wait for it before you act on the Ordered 
  that you received out-of-order.

This appears to be more of an implementation optimization, where desired,
than something we want to get into the draft.  You are proposing that
even in the absence of errors, iSCSI should make ordering decisions
based on SCSI task attributes.  I disagree with that.
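
The constraint in the third point above can be sketched on the target
side as follows. This is only an illustration of that constraint, not a
mechanism proposed for the draft. All names are hypothetical, cmdSN
wraparound is ignored, and the sketch assumes a Simple command may run on
arrival unless an already received Ordered command with a lower cmdSN is
still being held.

```python
from typing import Dict, List, Set


class TargetSequencer:
    """Hypothetical target-side sequencing of Simple and Ordered commands."""

    def __init__(self, first_cmd_sn: int) -> None:
        self.exp_cmd_sn = first_cmd_sn      # lowest cmdSN not yet received
        self.received: Set[int] = set()
        self.waiting: Dict[int, bool] = {}  # cmdSN -> True if Ordered
        self.executed: List[int] = []       # cmdSNs in execution order

    def receive(self, cmd_sn: int, ordered: bool = False) -> None:
        self.received.add(cmd_sn)
        # Advance expCmdSN past every contiguously received cmdSN.
        while self.exp_cmd_sn in self.received:
            self.exp_cmd_sn += 1
        self.waiting[cmd_sn] = ordered
        self._run_ready()

    def _run_ready(self) -> None:
        for sn in sorted(self.waiting):
            if self.waiting[sn] and self.exp_cmd_sn < sn:
                # An Ordered command with a gap below it: hold it, and
                # everything after it, until the lost command shows up.
                break
            self.executed.append(sn)
            del self.waiting[sn]


def demo() -> List[int]:
    target = TargetSequencer(100)
    target.receive(102)                # Simple, runs on arrival
    target.receive(103, ordered=True)  # Ordered, 100 and 101 still missing
    target.receive(104)                # held behind the Ordered command
    target.receive(100)                # Simple, runs on arrival
    target.receive(101)                # gap closed: 101, 103, 104 run
    return target.executed
```

Here demo() returns [102, 100, 101, 103, 104]: the Ordered command 103
waits for the lost Simple commands, and 104 waits behind it.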
--
Mallikarjun 


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668	Hewlett-Packard, Roseville.
cbm@rose.hp.com


>This is a proposal to allow the initiator to inform the target 
>if out-of-order execution within the command stream is possible.
>
>The target execution can rotate between "in-cmdSN-order" and 
>"out-of-order" during runtime as informed by the initiator..
>
>Appreciate comments on subtleties that I may have missed.
>
>thanks,
>-Sandeep
>
>http://ips.pdl.cs.cmu.edu/mail/msg03152.html
>   by Costa provides a good summary of the issue at hand.
>
>http://ips.pdl.cs.cmu.edu/mail/msg04255.html
>  David provides the "new" reqts.  In particular, this one
>
>   > MUST specify the ability to preserve ordered delivery
>   > of SCSI commands even in the presence of transport
>   > errors.  A mechanism MUST be provided to allow
>   > Initiators and Targets to negotiate this preservation
>   > on a per-session or finer granularity basis
>
>Note :
>======
>1) This does not rely on SCSI cmdRN, but operates at iSCSI level.
>2) Flow control using cmdSN works as designed.
>3) This solution is not a per-session negotiation option but can be
>   disabled and re-enabled again at "runtime" by the initiator if it
>   notices that Ordered/HOQ tasks (or any other need) have entered
>   the iSCSI command stream which is being dispatched.
>
>Problem:
>=========
>In case of out-of-order arrival or digest errors, it is NOT possible
>to know if the initiator had sent an ordered command before the one
>which was received.
>
>Solution:
>=========
>To notify the target of the presence of in-flight ordered commands, we set
>a flag on *every* PDU following the ordered command *until* the target
>moves its expCmdSN above the cmdSN of the ordered command.  The
>expCmdSN indicates that the target has found the ordered command.
>
>( Those familiar with "ECN over IP" (Floyd, et al) may see this is
>similar to how a congestion bit keeps being set until the sender acks
>that it has received the notification)
>
>Figuratively:
>=============
>So, assume there was a new "strict_order" flag in the BHS.
>In the figure below, parentheses show the values of (cmdSN, strict_order)
>
>Initiator                       Target
>---------                       -------
>
>simple cmd (cmdSN=100, strict_order=0) ->
>simple cmd (101, 0) ->
>simple cmd (102, 0) ->
>
>   these may get reordered or have digest errors ->
>
>                        target executes as they arrive
>                             exec simple cmd(102, 0)
>                             exec simple cmd(100, 0)
>                             exec simple cmd(101, 0)
>
>Now say the initiator wants to send a HOQ task.
>It sets strict_order=1 on all PDUs
>
>ordered cmd (103, 1) ->
>simple  cmd (104, 1) ->
>simple  cmd (105, 1) ->
>                        in case of reordering or digest errors
>                        target must wait & execute in cmdSN order!
>simple  cmd (106, 1) ->
>simple  cmd (107, 1) ->
>
>         <---- now target sends expCmdSN=103
>
>This implies target has seen command(cmdSN=103) and target will do the
>appropriate ordering and delivery to SCSI layer.  This is left to
>the target implementation to tackle.
>
>Initiator checks (expCmdSN >= cmdSN of ordered cmd) and then resets
>strict_order to zero for all subsequent PDUs.
>
>simple  cmd (108, 0) ->
>simple  cmd (109, 0) ->
>
>Other issues:
>=============
>If the basic scheme is ok, then we could later tackle other questions
>"what about multiple ordered commands" and the like..
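
The initiator side of the proposal quoted above can be sketched as
follows. This is a minimal illustration, not text from the proposal: the
class and method names are hypothetical, cmdSN wraparound is ignored, and
when several ordered commands are in flight the sketch simply tracks the
most recent one (the proposal leaves that case open). The flag is cleared
using the (expCmdSN >= cmdSN of ordered cmd) check from the figure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pdu:
    cmd_sn: int
    ordered: bool      # True for an Ordered/HOQ task
    strict_order: int  # the hypothetical BHS flag from the proposal


class Initiator:
    """Hypothetical initiator-side bookkeeping for strict_order."""

    def __init__(self, first_cmd_sn: int) -> None:
        self.next_cmd_sn = first_cmd_sn
        self.pending_ordered_sn: Optional[int] = None

    def send(self, ordered: bool = False) -> Pdu:
        if ordered:
            # Remember the ordered command the target has yet to find.
            self.pending_ordered_sn = self.next_cmd_sn
        flag = 1 if self.pending_ordered_sn is not None else 0
        pdu = Pdu(self.next_cmd_sn, ordered, flag)
        self.next_cmd_sn += 1
        return pdu

    def on_exp_cmd_sn(self, exp_cmd_sn: int) -> None:
        # The target has seen the ordered command: stop setting the flag.
        if (self.pending_ordered_sn is not None
                and exp_cmd_sn >= self.pending_ordered_sn):
            self.pending_ordered_sn = None


def replay_figure() -> List[int]:
    """Replay cmdSN 100..109 from the figure; return the flags sent."""
    ini = Initiator(100)
    flags = [ini.send().strict_order for _ in range(3)]    # 100..102
    flags.append(ini.send(ordered=True).strict_order)      # 103
    flags += [ini.send().strict_order for _ in range(4)]   # 104..107
    ini.on_exp_cmd_sn(103)                                 # target reports 103
    flags += [ini.send().strict_order for _ in range(2)]   # 108, 109
    return flags
```

replay_figure() returns [0, 0, 0, 1, 1, 1, 1, 1, 0, 0], matching the
strict_order values shown for cmdSN 100 through 109 in the figure.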




From owner-ips@ece.cmu.edu  Fri Apr 20 23:20:50 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA23579
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 23:20:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3KLJA914910
	for ips-outgoing; Fri, 20 Apr 2001 17:19:10 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sphmraaa.compuserve.com (hs-img-rel-1.compuserve.com [149.174.177.156])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3KLICA14881
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 17:18:12 -0400 (EDT)
Received: (from mailgate@localhost)
	by sphmraaa.compuserve.com (8.9.3/8.9.3/SUN-REL-1.3) id RAA04786
	for ips@ece.cmu.edu; Fri, 20 Apr 2001 17:18:06 -0400 (EDT)
Received: from compuserve.com (sfr-tgn-sfp-vty45.as.wcom.net [216.192.35.45])
	by sphmraaa.compuserve.com (8.9.3/8.9.3/SUN-REL-1.3) with ESMTP id RAA04732
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 17:17:56 -0400 (EDT)
Message-ID: <3AE0A7D6.2100FE26@compuserve.com>
Date: Fri, 20 Apr 2001 14:19:18 -0700
From: Ralph Weber <ralphoweber@compuserve.com>
Reply-To: ENDL_TX@computer.org
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD NSCPCD47  (Win98; I)
X-Accept-Language: en,pdf
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI : EnableACA
References: <C1256A34.00216472.00@d12mta02.de.ibm.com> <3AE089D0.86FC46B7@cup.hp.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

I like what Santosh is proposing and have big time reservations
about this hack EnableACA bit.  The cases that Julian is trying
to cover with the EnableACA bit should be covered via the
proposal Ed Gardner is supposed to bring to T10 to make Unit
Attention "sticky" and turn things like BUSY status into
CHECK CONDITIONs.

There is one tiny nit in Santosh's proposal.  The bit in the
standard INQUIRY data is called NormACA, not NACA.  NACA is
the bit in the CDB, only.

Thanks.

Ralph...

Santosh Rao wrote:

>
> Julian,
>
> Would the following not satisfy the requirements for dealing with this
> ACA issue :
>
> 1) Initiators determine the target support for ACA through the NACA bit
> in the INQUIRY response. (Assuming iSCSI targets have implemented ACA in
> good faith, this would be supported.)
>
> 2) Initiators set the NACA bit in the CDBs of commands that need strong
> ordering. (This could be a small subset of the I/O traffic to one or
> more LUNs within the session and not required for all the I/Os in that
> session.)
>
> 3) Any exception condition on a SCSI I/O, for which the NACA bit was set
> results in ACA being established.
> Thus, ACA would only be applied if some I/O traffic that required strong
> ordering was affected by the exception condition.
>
> 4) Since the initiator is ACA capable based on its usage of the NACA
> bit, it should also be capable of performing the desired Clear ACA to
> recover from this condition.
>
> Such an approach would only apply ACA and its corresponding recovery
> when some strongly ordered I/O encountered an exception condition,
> rather than applying ACA on a session granularity.
>
> To summarize, the above approach allows :
> - ACA to be turned on/off for a subset of I/Os headed to a LUN
> - ACA based recovery only used where needed.
> - Keeps iSCSI ACA un-aware and rightly so, since this is a property of
> the SCSI ULP.
> - Avoids applying ACA recovery on a session granularity.
>
> What am I missing here (?). Why is an EnableACA needed ?
>
> - Santosh
>
> julian_satran@il.ibm.com wrote:
>
> > All references to
> > EnableACA are redundant and should be removed for the following reasons
> > :
> >
> > a) An initiator knows whether a target supports ACA from the NACA bit in
> > the INQUIRY response. When a target indicates support for ACA, the
> > initiator can use it by setting the NACA bit in the CDBs it sends. There
> > is NO need for any sort of negotiation of this behaviour above and
> > beyond what is already provided thru SCSI mechanisms.
> >
> > b) The ACA is a SCSI ULP concept and iSCSI should not be negotiating its
> > use or lack thereof. This is done thru the NACA bit in CDBs.
> >
> > c) (As a side note, the description of EnableACA on pg 127 refers to its
> > presence in the lun control mode page, but it is actually present in the
> > protocol specific port page.)
> >
> > d) ACA is a LUN-level (more an I/O level) control option. It MUST NOT be
> > negotiated on a per-session basis. SCSI allows initiators to request ACA
> > behaviour on a per I/O basis through the use of NACA bit in the CDBs.
> >
>
>  +++ We have required ACA to be supported by all new iSCSI targets and
>  several actions require the target to enter ACA state.
>  It was brought to our attention that many initiators will not react
>  properly to a target entering ACA state (not do the reset).
>  The EnableACA bit and key are meant to enable an initiator to control
>  this iSCSI specific ACA behaviour.  This behaviour is related to
>  asynchronous events and is not controlled by the NACA CDB bit.
>
>  ++++




From owner-ips@ece.cmu.edu  Fri Apr 20 23:42:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA24032
	for <ips-archive@odin.ietf.org>; Fri, 20 Apr 2001 23:42:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3L1dIB25890
	for ips-outgoing; Fri, 20 Apr 2001 21:39:18 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3L1d2A25883
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 21:39:03 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3L2kr123149;
	Fri, 20 Apr 2001 19:46:53 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <Black_David@emc.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI Reqts: In-Order Delivery
Date: Fri, 20 Apr 2001 18:37:03 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJIEKDCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <0F31E5C394DAD311B60C00E029101A0708015488@corpmx9.isus.emc.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

iSCSI has presently made providing this impossible.  You cannot make
assumptions about relative delivery rates between connections.  This can and
should be fixed.  As you know, I like my solution but there are many others.

Doug


> Doug,
>
> Attempting a fast exit ... I agree with the interpretation
> of SAM insofar as SCSI responses are concerned - the
> description of ABORT TASK in SAM (6.1) is clear that a
> SCSI response to an aborted task must not be delivered to
> an initiator after the FUNCTION COMPLETE from the ABORT
> TASK that aborted it is, and similarly for both ABORT
> TASK SET and CLEAR TASK SET.
>
> Since this requirement is contained in the
> existing requirement to adhere to SAM, we don't need
> any additional text in the iSCSI requirements draft,
> right ;-) ?
>
> ???,
> --David
>
> > -----Original Message-----
> > From:	Douglas Otis [SMTP:dotis@sanlight.net]
> > Sent:	Friday, April 20, 2001 7:50 PM
> > To:	Black_David@emc.com; ips@ece.cmu.edu
> > Subject:	RE: iSCSI Reqts: In-Order Delivery
> >
> > David,
> >
> > I suggested one solution that has several benefits.  This one suggestion
> > is not the only option to resolve this issue.  Connection Allegiance does
> > not resolve state with respect to Management requests.  Offhand I can
> > think of several other options as these requests are clearly indicated.
> > How this problem is resolved should be considered a separate issue, but
> > there is this requirement that should not be overlooked.  My
> > interpretation of SAM places this obligation on the transport.
> >
> > Doug
> >
> >
> > > Focusing solely on the discussion needed to resolve the
> > > (last call) issue in the requirements draft:
> > >
> > > (A) Charles suggests that "ordered delivery of SCSI commands"
> > > 	should include task management commands.  That
> > > 	was the intent of the proposal and words should be
> > > 	added to make this clear.  Section 7.3 of the -06
> > > 	version of the main iSCSI document contains an
> > > 	initial version of a description of how task management
> > > 	commands can be executed immediately but have the
> > > 	effects they would have had if delivered in order.
> > >
> > > (B) Doug is concerned that the task management response
> > > 	may arrive before the responses to one or more
> > > 	commands that were affected by the task management
> > > 	command.  While his technical concern is valid,
> > > 	and has/is being discussed, I don't think foreclosing
> > > 	that discussion by requiring session-wide
> > > 	synchronization of responses in the requirements
> > > 	document is the right thing to do.  Hence I would
> > > 	not change the proposal to require such synchronization.
> > >
> > > Thanks,
> > > --David
> > >
>



From owner-ips@ece.cmu.edu  Sat Apr 21 00:27:56 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA24295
	for <ips-archive@odin.ietf.org>; Sat, 21 Apr 2001 00:27:55 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3L0XGM23437
	for ips-outgoing; Fri, 20 Apr 2001 20:33:16 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxbh4.isus.emc.com (mxbh4.isus.emc.com [128.221.10.33])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3L0WOA23417
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 20:32:24 -0400 (EDT)
Received: by mxbh4.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <JG3Z90VJ>; Fri, 20 Apr 2001 20:32:19 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015489@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: santoshr@cup.hp.com
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI : login keys & mode page settings
Date: Fri, 20 Apr 2001 20:32:17 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

I'm not sure -- this sounds somewhat like the
old principle of not asking why there's a hole
in one's foot when one has aimed the gun at
it and pulled the trigger.  For the tape
example, if some tape driver changes a
Target iSCSI parameter that disrupts that
driver's own tape I/O in a fashion that the
driver can't recover from, I think it's
clear where the fault lies.  If one Initiator
can damage another in this fashion, then we
may indeed have a problem.

Comments?,
--David

> -----Original Message-----
> From:	Santosh Rao [SMTP:santoshr@cup.hp.com]
> Sent:	Friday, April 20, 2001 8:09 PM
> To:	Black_David@emc.com
> Cc:	ips@ece.cmu.edu
> Subject:	Re: iSCSI : login keys & mode page settings
> 
> David,
> 
> Some clarification on the basis for classifying login keys as LO would
> also be helpful. Should login keys that can disrupt I/O on their change
> be allowed to be non-LO ?
> 
> Thanks,
> Santosh
> 
> Black_David@emc.com wrote:
> > 
> > Without getting into the technical details of the
> > discussion, I have a couple of observations:
> > 
> > (A) The issue of whether to allow mode page
> >         access to and modification of iSCSI parameters
> >         will need to be taken up at the interim
> >         meeting.  IMHO, access seems like a good
> >         idea, so that SCSI-generic code that doesn't
> >         know specifically about iSCSI can find
> >         what it expects where it expects it, but
> >         I'm unsure about modification because it
> >         may carry a risk of code that's iSCSI-unaware
> >         getting something wrong.  The mode page
> >         commands should be transparent to iSCSI.
> > 
> > (B) The mode page and text key mechanisms have
> >         to access the same data.  Section 3 of the
> >         -06 version says this, but needs some editing
> >         to enforce it by using "MUST" or its equivalent
> >         (cf. RFC 2119).  This is to prevent an
> >         implementation from having two instances of
> >         the same parameter - one for the mode page and
> >         one for the text keys - which would be a bad
> >         thing.
> > 
> > --David
> > 
> > ---------------------------------------------------
> > David L. Black, Senior Technologist
> > EMC Corporation, 42 South St., Hopkinton, MA  01748
> > +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> > black_david@emc.com       Mobile: +1 (978) 394-7754
> > ---------------------------------------------------


From owner-ips@ece.cmu.edu  Sat Apr 21 00:51:50 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA24412
	for <ips-archive@odin.ietf.org>; Sat, 21 Apr 2001 00:51:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3L2ZLZ27766
	for ips-outgoing; Fri, 20 Apr 2001 22:35:21 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3L2YdA27703
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 22:34:39 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTR3D7>; Fri, 20 Apr 2001 19:34:32 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B17344E@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "Santosh Rao (E-mail)" <santoshr@cup.hp.com>
Cc: "Ips (E-mail)" <ips@ece.cmu.edu>
Subject: RE: iSCSI Reqts: In-Order Delivery
Date: Fri, 20 Apr 2001 19:34:32 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi Santosh:

Please see below.

> -----Original Message-----
> From: Santosh Rao [mailto:santoshr@cup.hp.com]
> Sent: Friday, April 20, 2001 11:46 AM
> To: Charles Monia
> Cc: ips@ece.cmu.edu
> Subject: Re: iSCSI Reqts: In-Order Delivery
> 
> 
> Charles Monia wrote:
> 
> > > (1) MUST provide ordered delivery of SCSI commands from
> > >       the initiator to the target in the absence of transport
> > >       errors visible to iSCSI (e.g., iSCSI CRC failure,
> > >       unexpected TCP connection closure).
> > 
> > > Does the term "SCSI commands" include task management functions as
> > > well?  If not, it should.
> 
> 
> Charles,
> 
> Could iSCSI use a variant of the approach FCP-2 takes to solve the
> ordering issue for task mgmt error recovery ?
> 
> The FCP-2 task management error recovery scheme is :
> - task mgmt function uses CRN 0
> - task mgmt function is executed immediately with no ordering
>   latencies
> - both initiator & target clear all resources that can be cleared
>   un-ambiguously.
> - any ambiguous exchanges shall be aborted by the port that detects
>   the ambiguous state.
> 
> In the case of iSCSI, an analogous approach could be :
> - task mgmt function uses immediate delivery flag for the task mgmt
>   PDU.
> - task mgmt fn executed immediately avoiding any ordering latencies.
> - initiator & target clear all resources that can be cleared
>   un-ambiguously.
> - initiator uses Abort Task to explicitly abort all active outstanding
>   I/Os at the time the task mgmt fn was issued to avoid any ambiguous
>   stale PDUs of an exchange from appearing at the target.
> 
> Such an approach would avoid latencies on the execution of the task
> mgmt fn while still flushing out all the stale PDUs upon completion of
> the initiator actions for that task mgmt fn.
> 

The problem is to avoid scenarios where the initiator's and target's views of
the task set are out of step.  Specifically, we must avoid the case where an
initiator receives a PDU from a task it believes has been terminated.

In that respect, the technique you describe above should work for an ABORT
TASK operation.

In the case of ABORT TASK SET, the function could be emulated by issuing a
series of ABORT TASK requests. For CLEAR TASK SET, an initiator would
probably want to do the individual ABORT TASK operations, followed by a
CLEAR TASK SET to terminate tasks from other initiators.  I assume TARGET
RESET and LUN RESET would be emulated in a manner similar to CLEAR TASK SET.
In all of these cases there may be some "atomicity" side effects caused by
doing things one at a time instead of all at once.

The only sticky problem is ensuring that the CLEAR ACA function works right.
By that I mean that you don't want to issue the function until all prior
SCSI commands that were in flight when the ACA occurred have been terminated
with the ACA ACTIVE status.  You can't simply replicate the command on each
connection since you might inadvertently clear a subsequent ACA. (Yes -- I
know these are all edge cases, but we may as well try to get it right.)
Maybe the thing to do is implement the function such that the ACA interlock
is not cleared until the CLEAR ACA function is sent on all the connections
comprising the session.
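
A minimal sketch of the interlock idea in the paragraph above, in Python.
The class and method names (AcaInterlock, establish, on_clear_aca) and the
connection identifiers are hypothetical and do not come from any draft. It
only illustrates "do not clear ACA until CLEAR ACA has been seen on every
connection of the session"; it does not model the termination of in-flight
commands with ACA ACTIVE status.

```python
class AcaInterlock:
    """Hypothetical target-side bookkeeping for one session."""

    def __init__(self, connection_ids):
        # All connections that make up the session (assumed fixed here).
        self.connections = frozenset(connection_ids)
        self.seen = set()      # connections on which CLEAR ACA has arrived
        self.active = False    # True while the ACA condition is in effect

    def establish(self):
        # A new ACA condition starts a fresh round of CLEAR ACA collection.
        self.active = True
        self.seen.clear()

    def on_clear_aca(self, conn_id):
        """Record a CLEAR ACA; return True only when ACA is actually cleared."""
        if conn_id not in self.connections:
            raise ValueError("unknown connection: %r" % (conn_id,))
        if not self.active:
            return False
        self.seen.add(conn_id)
        if self.seen == self.connections:
            self.active = False
            self.seen.clear()
            return True
        return False


lock = AcaInterlock(["c1", "c2", "c3"])
lock.establish()
results = [lock.on_clear_aca(c) for c in ("c1", "c2", "c3")]
print(results, lock.active)
```

Because the interlock is released only by the last connection to deliver
the function, a CLEAR ACA that arrives early on one connection cannot by
itself clear the condition.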

One minor distinction worth noting is that CRN is enforced in the SCSI
layer, whereas cmdSN is enforced in the iSCSI transport.  So, a CRN of 0
doesn't take effect until the transport presents the command to the SCSI
layer for processing.  In that case, leapfrogging of PDU ordering never
occurs.

Incidentally, I've made the tacit assumption that commands on a given
connection are presented to the SCSI layer in the order they were sent,
regardless of whether or not cmdSN was set to 0.  I assume the framing
mechanisms that have been discussed for buffer offloading do not affect this
behavior.  I.e., a fully formed PDU slated for immediate delivery won't be
passed to the SCSI layer before a partially complete PDU that was received
earlier.

If that's true, immediate delivery seems to have no meaning in a
single-connection scenario.  What's more, in all cases, the iSCSI layer
doesn't really have to be aware of task management semantics -- unless
someone decides to intermix immediate and sequential commands in a
multi-connection session.  Then all bets are off.

Charles


From owner-ips@ece.cmu.edu  Sat Apr 21 04:25:10 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA08742
	for <ips-archive@odin.ietf.org>; Sat, 21 Apr 2001 04:25:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3L1rJH26369
	for ips-outgoing; Fri, 20 Apr 2001 21:53:19 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3L1qZA26354
	for <ips@ece.cmu.edu>; Fri, 20 Apr 2001 21:52:36 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT3VDXM>; Fri, 20 Apr 2001 21:54:02 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A070801548C@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net, ips@ece.cmu.edu
Subject: RE: iSCSI Reqts: In-Order Delivery
Date: Fri, 20 Apr 2001 21:52:25 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Doug,

At the moment, I'm trying to resolve the last call
issue around the requirements draft.  In what you've
written, I see an issue with the iSCSI specification
draft, but not a need for any change to the
requirements draft.

Thanks,
--David

> -----Original Message-----
> From:	Douglas Otis [SMTP:dotis@sanlight.net]
> Sent:	Friday, April 20, 2001 9:37 PM
> To:	Black_David@emc.com; ips@ece.cmu.edu
> Subject:	RE: iSCSI Reqts: In-Order Delivery
> 
> David,
> 
> iSCSI has presently made providing this impossible.  You cannot make
> assumptions about relative delivery rates between connections.  This can
> and should be fixed.  As you know, I like my solution but there are many
> others.
> 
> Doug
> 
> 
> > Doug,
> >
> > Attempting a fast exit ... I agree with the interpretation
> > of SAM insofar as SCSI responses are concerned - the
> > description of ABORT TASK in SAM (6.1) is clear that a
> > SCSI response to an aborted task must not be delivered to
> > an initiator after the FUNCTION COMPLETE from the ABORT
> > TASK that aborted it is, and similarly for both ABORT
> > TASK SET and CLEAR TASK SET.
> >
> > Since this requirement is contained in the
> > existing requirement to adhere to SAM, we don't need
> > any additional text in the iSCSI requirements draft,
> > right ;-) ?
> >
> > ???,
> > --David
> >
> > > -----Original Message-----
> > > From:	Douglas Otis [SMTP:dotis@sanlight.net]
> > > Sent:	Friday, April 20, 2001 7:50 PM
> > > To:	Black_David@emc.com; ips@ece.cmu.edu
> > > Subject:	RE: iSCSI Reqts: In-Order Delivery
> > >
> > > David,
> > >
> > > I suggested one solution that has several benefits.  This one suggestion
> > > is not the only option to resolve this issue.  Connection Allegiance does
> > > not resolve state with respect to Management requests.  Offhand I can
> > > think of several other options as these requests are clearly indicated.
> > > How this problem is resolved should be considered a separate issue, but
> > > there is this requirement that should not be overlooked.  My
> > > interpretation of SAM places this obligation on the transport.
> > >
> > > Doug
> > >
> > >
> > > > Focusing solely on the discussion needed to resolve the
> > > > (last call) issue in the requirements draft:
> > > >
> > > > (A) Charles suggests that "ordered delivery of SCSI commands"
> > > > 	should include task management commands.  That
> > > > 	was the intent of the proposal and words should be
> > > > 	added to make this clear.  Section 7.3 of the -06
> > > > 	version of the main iSCSI document contains an
> > > > 	initial version of a description of how task management
> > > > 	commands can be executed immediately but have the
> > > > 	effects they would have had if delivered in order.
> > > >
> > > > (B) Doug is concerned that the task management response
> > > > 	may arrive before the responses to one or more
> > > > 	commands that were affected by the task management
> > > > 	command.  While his technical concern is valid,
> > > > 	and has/is being discussed, I don't think foreclosing
> > > > 	that discussion by requiring session-wide
> > > > 	synchronization of responses in the requirements
> > > > 	document is the right thing to do.  Hence I would
> > > > 	not change the proposal to require such synchronization.
> > > >
> > > > Thanks,
> > > > --David
> > > >
> >


From owner-ips@ece.cmu.edu  Sat Apr 21 13:19:31 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA11286
	for <ips-archive@odin.ietf.org>; Sat, 21 Apr 2001 13:19:30 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3LFtrR04556
	for ips-outgoing; Sat, 21 Apr 2001 11:55:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3LFsrA04486
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 11:54:53 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id LAA11973 for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 11:54:51 -0400 (EDT)
Message-ID: <3AE1ACED.75A64A2F@cisco.com>
Date: Sat, 21 Apr 2001 10:53:17 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: IPS <ips@ece.cmu.edu>
Subject: iSCSI MIB: New draft
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


The iSCSI MIB has been submitted as an IPS WG document.  It
should appear at the IETF site this weekend as:

draft-ietf-ips-iscsi-mib-00.txt

In the meantime, Julian has posted it on his web site at:

http://www.haifa.il.ibm.com/satran/ips/draft-ietf-ips-iscsi-mib-00.txt

Happy reading,

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Sat Apr 21 13:19:59 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA11304
	for <ips-archive@odin.ietf.org>; Sat, 21 Apr 2001 13:19:59 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3LFSpg03590
	for ips-outgoing; Sat, 21 Apr 2001 11:28:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3LFRwA03580
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 11:27:58 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id RAA159624
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 17:27:50 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id RAA146688
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 17:27:50 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A35.0054F118 ; Sat, 21 Apr 2001 17:27:47 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A35.0054EFFF.00@d12mta02.de.ibm.com>
Date: Sat, 21 Apr 2001 17:32:55 +0200
Subject: RE: iSCSI : login keys & mode page settings
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk





Black_David@emc.com on 20/04/2001 19:12:04

Please respond to Black_David@emc.com

To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI : login keys & mode page settings




Without getting into the technical details of the
discussion, I have a couple of observations:

(A) The issue of whether to allow mode page
     access to and modification of iSCSI parameters
     will need to be taken up at the interim
     meeting.  IMHO, access seems like a good
     idea, so that SCSI-generic code that doesn't
     know specifically about iSCSI can find
     what it expects where it expects it, but
     I'm unsure about modification because it
     may carry a risk of code that's iSCSI-unaware
     getting something wrong.  The mode page
     commands should be transparent to iSCSI.

+++ I would like to point out that there are three mode pages that have
parameters that are either completely protocol specific (LU & Port) or with
semantics that are protocol specific(disconnect/reconnect). Code that is
completely iSCSI unaware must not touch protocol specific parameters.

The setting is a more delicate issue, and our solution just brings the
"negotiation" into the modern age.  It is not radically better, but it gives
some uniformity to iSCSI driver and target logic.

I would have suggested removing mode set access for those parameters but
this is hard to enforce in a port/miniport of high-level/low-level code
structure.  :-)

+++

(B) The mode page and text key mechanisms have
     to access the same data.  Section 3 of the
     -06 version says this, but needs some editing
     to enforce it by using "MUST" or its equivalent
     (cf. RFC 2119).  This is to prevent an
     implementation from having two instances of
     the same parameter - one for the mode page and
     one for the text keys - which would be a bad
     thing.
+++ I will try to - although boldface is forbidden in text-only :-) +++
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------






From owner-ips@ece.cmu.edu  Sat Apr 21 13:24:45 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA11287
	for <ips-archive@odin.ietf.org>; Sat, 21 Apr 2001 13:19:30 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3LEn2102301
	for ips-outgoing; Sat, 21 Apr 2001 10:49:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3LEmOA02293
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 10:48:24 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id QAA316918
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 16:48:17 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id QAA228500
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 16:48:17 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A35.00515070 ; Sat, 21 Apr 2001 16:48:10 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A35.0051504D.00@d12mta02.de.ibm.com>
Date: Sat, 21 Apr 2001 16:53:20 +0200
Subject: Re: iSCSI:out-of-order notification proposal
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



What exactly does it buy you? (that is not already in).

And FWIW David (B) raised a question related to the requirements doc.

Julo

Sandeep Joshi <sandeepj@research.bell-labs.com> on 20/04/2001 18:43:02

Please respond to Sandeep Joshi <sandeepj@research.bell-labs.com>

To:   ips@ece.cmu.edu
cc:
Subject:  iSCSI:out-of-order notification proposal





This is a proposal to allow the initiator to inform the target
if out-of-order execution within the command stream is possible.

The target execution can rotate between "in-cmdSN-order" and
"out-of-order" during runtime as informed by the initiator.

Appreciate comments on subtleties that I may have missed.

thanks,
-Sandeep

http://ips.pdl.cs.cmu.edu/mail/msg03152.html
   by Costa provides a good summary of the issue at hand.

http://ips.pdl.cs.cmu.edu/mail/msg04255.html
  David provides the "new" reqts.  In particular, this one

   > MUST specify the ability to preserve ordered delivery
   > of SCSI commands even in the presence of transport
   > errors.  A mechanism MUST be provided to allow
   > Initiators and Targets to negotiate this preservation
   > on a per-session or finer granularity basis

Note :
======
1) This does not rely on SCSI cmdRN, but operates at iSCSI level.
2) Flow control using cmdSN works as designed.
3) This solution is not a per-session negotiation option but can be
   disabled and re-enabled again at "runtime" by the initiator if it
   notices that Ordered/HOQ tasks (or any other need) have entered
   the iSCSI command stream which is being dispatched.

Problem:
=========
In case of out-of-order arrival or digest errors, it is NOT possible
to know if the initiator had sent an ordered command before the one
which was received.

Solution:
=========
To notify the target of the presence of in-flight ordered commands, we set
a flag on *every* PDU following the ordered command *until* the target
moves its expCmdSN above the cmdSN of the ordered command.  The
expCmdSN indicates the target has found the ordered command.

(Those familiar with "ECN over IP" (Floyd, et al.) may see that this is
similar to how a congestion bit keeps being set until the sender acks
that it has received the notification.)

Figuratively:
=============
So, assume there was a new "strict_order" flag in the BHS.
In the figure below, the parentheses show the values of (cmdSN, strict_order).

Initiator                       Target
---------                       -------

simple cmd (cmdSN=100, strict_order=0) ->
simple cmd (101, 0) ->
simple cmd (102, 0) ->

   these may get reordered or have digest errors ->

                        target executes as they arrive
                             exec simple cmd(102, 0)
                             exec simple cmd(100, 0)
                             exec simple cmd(101, 0)

Now say the initiator wants to send a HOQ task.
It sets strict_order=1 on that PDU and on all PDUs that follow it.

ordered cmd (103, 1) ->
simple  cmd (104, 1) ->
simple  cmd (105, 1) ->
                        in case of reordering or digest errors
                        target must wait & execute in cmdSN order!
simple  cmd (106, 1) ->
simple  cmd (107, 1) ->

         <---- now target sends expCmdSN=103

This implies target has seen command(cmdSN=103) and target will do the
appropriate ordering and delivery to SCSI layer.  This is left to
the target implementation to tackle.

Initiator checks (expCmdSN >= cmdSN of ordered cmd) and then resets
strict_order to zero for all subsequent PDUs.

simple  cmd (108, 0) ->
simple  cmd (109, 0) ->
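
The initiator side of the figure above can be sketched as follows (Python,
illustrative only). The Pdu and Initiator names are hypothetical, and
strict_order is the proposed flag, not an existing BHS field. The sketch
uses the check exactly as written in the figure (expCmdSN >= cmdSN of the
ordered cmd); reading "above" in the Solution text strictly would make it
">" instead. Serial-number wraparound of cmdSN is ignored, and several
outstanding ordered commands are handled only by tracking the most recent
one, which the proposal leaves open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pdu:
    cmd_sn: int
    strict_order: int  # proposed flag: 1 = target must execute in cmdSN order
    ordered: bool


class Initiator:
    """Hypothetical initiator-side bookkeeping for the strict_order flag."""

    def __init__(self, first_cmd_sn: int) -> None:
        self.next_cmd_sn = first_cmd_sn
        # cmdSN of the latest ordered command not yet covered by expCmdSN.
        self.pending_ordered_sn: Optional[int] = None

    def send(self, ordered: bool = False) -> Pdu:
        sn = self.next_cmd_sn
        self.next_cmd_sn += 1
        if ordered:
            self.pending_ordered_sn = sn
        # The flag stays set on every PDU while an ordered command is pending.
        flag = 1 if self.pending_ordered_sn is not None else 0
        return Pdu(sn, flag, ordered)

    def on_exp_cmd_sn(self, exp_cmd_sn: int) -> None:
        # Check as given in the figure; ignores cmdSN wraparound.
        if (self.pending_ordered_sn is not None
                and exp_cmd_sn >= self.pending_ordered_sn):
            self.pending_ordered_sn = None


ini = Initiator(first_cmd_sn=100)
trace = [ini.send() for _ in range(3)]      # simple cmds 100..102
trace.append(ini.send(ordered=True))        # ordered cmd 103
trace += [ini.send() for _ in range(4)]     # simple cmds 104..107
ini.on_exp_cmd_sn(103)                      # target reports expCmdSN=103
trace += [ini.send() for _ in range(2)]     # simple cmds 108..109
print([(p.cmd_sn, p.strict_order) for p in trace])
```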

Other issues:
=============
If the basic scheme is ok, then we could later tackle other questions
such as "what about multiple ordered commands" and the like.





From owner-ips@ece.cmu.edu  Sat Apr 21 13:28:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA11328
	for <ips-archive@odin.ietf.org>; Sat, 21 Apr 2001 13:28:21 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3LFxrb04703
	for ips-outgoing; Sat, 21 Apr 2001 11:59:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3LFx2A04645
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 11:59:02 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id RAA53654
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 17:58:55 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id RAA127452
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 17:58:55 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A35.0057C968 ; Sat, 21 Apr 2001 17:58:51 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A35.0057C8E2.00@d12mta02.de.ibm.com>
Date: Sat, 21 Apr 2001 18:04:02 +0200
Subject: Re: iSCSI : New PDU opcode usage in rev 5.92
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Matt,

Sorry - I was running late and still hoping to get the recovery done.
As for the matter at hand - if the "fake indication" offered by the 2
initial bits is not good enough, we might reinstate a direction bit and
reduce the vendor-specific space.

I am sure that David can take this on the list.

Regards,
Julo

"Matt Wakeley" <matt_wakeley@agilent.com> on 20/04/2001 21:11:23

Please respond to Matt Wakeley <matt_wakeley@agilent.com>

To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:   David Black <Black_David@emc.com>
Subject:  Re: iSCSI : New PDU opcode usage in rev 5.92




Julian,

I like the "direction bit".  It is a way to exactly determine what the
message is, without the need to know the context (especially useful when
analyzing what's happening on the link).

If you think we are running short on opcodes, then don't use a "direction
bit", but just renumber the opcodes for unique values (for example, 0x01 =
command, 0x02 = response, etc.).

On another point, I don't think you should be changing things as
fundamental as this without direction from the group.  This "standard" is
supposed to be stabilizing, not constantly changing.

-Matt Wakeley
Agilent Technologies


julian_satran@il.ibm.com wrote:
>
> I would like to add to Venkat's remarks only that this asymmetry has been
> with us in iSCSI forever,
> and we even had a statement to the effect that targets should not issue
> initiator codes etc. (this is irrelevant now as the codes overlap).
>
> The reason I took out the "direction bit" (meant more for observers) was
> that I felt that we are low on codes :-)
>
> Julo
>
> Venkat Rangan <venkat@rhapsodynetworks.com> on 19/04/2001 07:58:23
>
> Please respond to Venkat Rangan <venkat@rhapsodynetworks.com>
>
> To:   "'Santosh Rao'" <santoshr@cup.hp.com>, ips@ece.cmu.edu
> cc:
> Subject:  RE: iSCSI : New PDU opcode usage in rev 5.92
>
> Santosh,
>
> Is it not the case that requests go in the direction from the Initiator
> to Target, where Target is the one "listening" for new connections on the
> well-known port?
> A dual mode scsi implementation therefore has two separate sessions and
> sets of connections.
> One set is [I->DualModeTarget] and the other is [DualModeInitiator->T]
> and the connections are independent. If I and T happen to be the same
> system, you cannot use a single connection for bidirectional sessions
> between the two.
>
> So if you receive a PDU from a target, you can only do so with SourcePort
> set to well-known-port, and it must be a Response from target. Maybe I'm
> assuming something that is not valid...
>
> Venkat Rangan
> Rhapsody Networks Inc.
> http://www.rhapsodynetworks.com
>
> -----Original Message-----
> From: Santosh Rao [mailto:santoshr@cup.hp.com]
> Sent: Wednesday, April 18, 2001 6:47 PM
> To: ips@ece.cmu.edu
> Subject: iSCSI : New PDU opcode usage in rev 5.92
>
> Julian & All,
>
> I've got a quick question on how the new opcode layouts would work for
> dual mode scsi implementations. (i.e. initiators that responded in
> target mode or targets that acted as initiators also).
>
> The new opcode layout is :
>
> ----------------
> X|I| | | | | | |
> ----------------
> 7 6 5 4 3 2 1 0
>
> where bits 5-0 -> opcode
> X -> retry bit
> I -> immediate bit
>
> The same values are used for the command as well as response opcodes and
> bits X & I are intended to both be set to 1 by targets.
>
> i.e. opcode for scsi command = scsi response = 0x01. the distinction b/n
> command and response is based on targets setting X & I bits to 1.
>
> Now, if an initiator [capable of target mode] sent the following
> commands, how would they be interpreted :
>
> 1) 0xc4.
> is this a text command being retried in immediate mode,
> or is it a text response ?
>
> 2) 0xc1
> is this a scsi command being retried in immediate mode,
> or is it a scsi response ?
>
> 3) 0xc2
> is this a scsi task mgmt command being retried in immediate mode,
> or is it a scsi task mgmt response ?
>
> etc.....
>
> - Santosh
>
> --
> #################################
> Santosh Rao
> Software Design Engineer,
> HP, Cupertino.
> email : santoshr@cup.hp.com
> Phone : 408-447-3751
> #################################
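
To make the ambiguity in Santosh's quoted question concrete, here is a
small Python sketch that splits the first byte according to the layout he
describes (bit 7 = X/retry, bit 6 = I/immediate, bits 5-0 = opcode). The
function names and the name table are hypothetical; the opcode values in
the table are only those implied by his examples (0x01 scsi, 0x02 scsi
task mgmt, 0x04 text), not a complete or authoritative list.

```python
# Opcode names inferred from the examples in the quoted message only.
OPCODE_NAMES = {0x01: "scsi", 0x02: "scsi task mgmt", 0x04: "text"}


def decode(byte):
    """Split the first byte into (retry X, immediate I, 6-bit opcode)."""
    retry = (byte >> 7) & 1
    immediate = (byte >> 6) & 1
    opcode = byte & 0x3F
    return retry, immediate, opcode


def readings(byte):
    """List every way the byte can be read without knowing the sender's role."""
    retry, immediate, opcode = decode(byte)
    name = OPCODE_NAMES.get(opcode, "opcode 0x%02x" % opcode)
    if retry and immediate:
        # Targets set both bits, so a response is indistinguishable from
        # a command that is being retried in immediate mode.
        return [name + " command retried in immediate mode",
                name + " response"]
    text = name + " command"
    if retry:
        text += " retried"
    if immediate:
        text += " in immediate mode"
    return [text]


for b in (0xC4, 0xC1, 0xC2):
    print(hex(b), decode(b), readings(b))
```

Each of the three bytes in the question yields two readings, which is the
point of the question: without a direction bit or unique opcode values, the
byte alone does not say whether the PDU is a command or a response.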





From owner-ips@ece.cmu.edu  Sat Apr 21 13:52:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA11411
	for <ips-archive@odin.ietf.org>; Sat, 21 Apr 2001 13:52:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3LGQsa05752
	for ips-outgoing; Sat, 21 Apr 2001 12:26:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3LGQ9A05709
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 12:26:10 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id SAA51172
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 18:26:03 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id SAA45776
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 18:26:02 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A35.005A44FA ; Sat, 21 Apr 2001 18:25:58 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A35.005A43E7.00@d12mta02.de.ibm.com>
Date: Sat, 21 Apr 2001 18:31:07 +0200
Subject: Re: iSCSI : EnableACA
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Ralph & Santosh,

You are right. I was confused by the NACA and thought that it is again the
CDB bit.
This bit conveys the right information, and if it had been a target-wide bit
it would have fit the bill.
Some of the events that have raised the need for the EnableACA are
target-wide.

As it is specified today, NormACA is tied to a "device server" (i.e. a
LU) and has many other items that are LU-specific.

I did place the EnableACA in a target-wide page.

It is a hack in the sense that it applies only to iSCSI-mandated ACA
behavior, and it is bound to be misinterpreted sooner or later.

Julo



Ralph Weber <ralphoweber@compuserve.com> on 20/04/2001 23:19:18

Please respond to ENDL_TX@computer.org

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI : EnableACA




I like what Santosh is proposing and have big time reservations
about this hack EnableACA bit.  The cases that Julian is trying
to cover with the EnableACA bit should be covered via the
proposal Ed Gardner is supposed to bring to T10 to make Unit
Attention "sticky" and turn things like BUSY status into
CHECK CONDITIONs.

There is one tiny nit in Santosh's proposal.  The bit in the
standard INQUIRY data is called NormACA, not NACA.  NACA is
the bit in the CDB, only.

Thanks.

Ralph...

Santosh Rao wrote:

>
> Julian,
>
> Would the following not satisfy the requirements for dealing with this
> ACA issue :
>
> 1) Initiators determine the target support for ACA through the NACA bit
> in the INQUIRY response. (Assuming iSCSI targets have implemented ACA in
> good faith, this would be supported.)
>
> 2) Initiators set the NACA bit in the CDBs of commands that need strong
> ordering. (This could be a small subset of the I/O traffic to one or
> more LUNs within the session and not required for all the I/Os in that
> session.)
>
> 3) Any exception condition on a SCSI I/O, for which the NACA bit was set
> results in ACA being established.
> Thus, ACA would only be applied if some I/O traffic that required strong
> ordering was affected by the exception condition.
>
> 4) Since the initiator is ACA capable based on its usage of the NACA
> bit, it should also be capable of performing the desired Clear ACA to
> recover from this condition.
>
> Such an approach would only apply ACA and its corresponding recovery
> when some strongly ordered I/O encountered an exception condition,
> rather than applying ACA on a session granularity.
>
> To summarize, the above approach allows :
> - ACA to be turned on/off for a subset of I/Os headed to a LUN
> - ACA based recovery only used where needed.
> - Keeps iSCSI ACA un-aware and rightly so, since this is a property of
> the SCSI ULP.
> - Avoids applying ACA recovery on a session granularity.
>
> What am I missing here (?). Why is an EnableACA needed ?
>
> - Santosh
>
> julian_satran@il.ibm.com wrote:
>
> > All references to
> > EnableACA are redundant and should be removed for the following reasons
> > :
> >
> > a) An initiator knows whether a target supports ACA from the NACA bit in
> > the INQUIRY response. When a target indicates support for ACA, the
> > initiator can use it by setting the NACA bit in the CDBs it sends. There
> > is NO need for any sort of negotiation of this behaviour above and
> > beyond what is already provided thru SCSI mechanisms.
> >
> > b) The ACA is a SCSI ULP concept and iSCSI should not be negotiating its
> > use or lack thereof. This is done thru the NACA bit in CDBs.
> >
> > c) (As a side note, the description of EnableACA on pg 127 refers to its
> > presence in the lun control mode page, but it is actually present in the
> > protocol specific port page.)
> >
> > d) ACA is a LUN-level (more an I/O level) control option. It MUST NOT be
> > negotiated on a per-session basis. SCSI allows initiators to request ACA
> > behaviour on a per I/O basis through the use of NACA bit in the CDBs.
> >
>
>  +++ We have required ACA to be supported by all new iSCSI targets and
>  several actions require the target to enter ACA state.
>  It was brought to our attention that many initiators will not react
>  properly to a target entering ACA state (not do the reset).
>  The EnableACA bit and key are meant to enable an initiator to control
>  this iSCSI specific ACA behaviour.  This behaviour is related to
>  asynchronous events and is not controlled by the NACA CDB bit.
>
>  ++++







From owner-ips@ece.cmu.edu  Sat Apr 21 14:02:03 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA11477
	for <ips-archive@odin.ietf.org>; Sat, 21 Apr 2001 14:02:02 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3LFlr104265
	for ips-outgoing; Sat, 21 Apr 2001 11:47:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3LFkpA04246
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 11:46:52 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id RAA141074
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 17:46:47 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id RAA165550
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 17:46:48 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A35.0056ADE4 ; Sat, 21 Apr 2001 17:46:46 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A35.0056AD35.00@d12mta02.de.ibm.com>
Date: Sat, 21 Apr 2001 17:51:54 +0200
Subject: Re: iSCSI : EnableACA
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

You read it wrong. It is all about initiators that do not have the right
support for ACA.
They will want to prevent new iSCSI targets from acting in their canonical
manner, using a tiny piece of code in the iSCSI miniport or low-level driver.

Julo

Santosh Rao <santoshr@cup.hp.com> on 20/04/2001 21:11:12

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   ips@ece.cmu.edu
cc:
Subject:  iSCSI : EnableACA




Julian,

Would the following not satisfy the requirements for dealing with this
ACA issue :

1) Initiators determine the target support for ACA through the NACA bit
in the INQUIRY response. (Assuming iSCSI targets have implemented ACA in
good faith, this would be supported.)

2) Initiators set the NACA bit in the CDBs of commands that need strong
ordering. (This could be a small subset of the I/O traffic to one or
more LUNs within the session and not required for all the I/Os in that
session.)

3) Any exception condition on a SCSI I/O, for which the NACA bit was set
results in ACA being established.
Thus, ACA would only be applied if some I/O traffic that required strong
ordering was affected by the exception condition.

4) Since the initiator is ACA capable based on its usage of the NACA
bit, it should also be capable of performing the desired Clear ACA to
recover from this condition.
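
As a rough illustration of steps 1 and 2 above, here is a minimal sketch.
It is not taken from any draft. It assumes the bit positions I recall from
SPC/SAM: the INQUIRY-side flag (named NormACA there) at bit 5 of byte 3 of
the standard INQUIRY data, and NACA at bit 2 of the CONTROL byte, which is
the last byte of a fixed-length CDB. The function names are hypothetical.

```python
NORMACA_BIT = 0x20  # assumed: bit 5 of byte 3 in standard INQUIRY data
NACA_BIT = 0x04     # assumed: bit 2 of the CDB CONTROL byte


def target_supports_aca(inquiry_data: bytes) -> bool:
    """Step 1: check the ACA support flag in the INQUIRY response."""
    return len(inquiry_data) > 3 and bool(inquiry_data[3] & NORMACA_BIT)


def with_naca(cdb: bytes) -> bytes:
    """Step 2: return a copy of a fixed-length CDB with NACA set.

    Only 6/10/12/16-byte CDBs are handled, where CONTROL is the last byte.
    """
    if len(cdb) not in (6, 10, 12, 16):
        raise ValueError("only fixed-length CDBs are handled in this sketch")
    out = bytearray(cdb)
    out[-1] |= NACA_BIT
    return bytes(out)
```

An initiator would call with_naca() only for the commands that need strong
ordering and leave all other CDBs untouched.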

Such an approach would only apply ACA and its corresponding recovery
when some strongly ordered I/O encountered an exception condition,
rather than applying ACA on a session granularity.

To summarize, the above approach allows :
- ACA to be turned on/off for a subset of I/Os headed to a LUN
- ACA based recovery only used where needed.
- Keeps iSCSI ACA un-aware and rightly so, since this is a property of
the SCSI ULP.
- Avoids applying ACA recovery on a session granularity.

What am I missing here (?). Why is an EnableACA needed ?

- Santosh


julian_satran@il.ibm.com wrote:

> All references to
> EnableACA are redundant and should be removed for the following reasons
> :
>
> a) An initiator knows whether a target supports ACA from the NACA bit in
> the INQUIRY response. When a target indicates support for ACA, the
> initiator can use it by setting the NACA bit in the CDBs it sends. There
> is NO need for any sort of negotiation of this behaviour above and
> beyond what is already provided thru SCSI mechanisms.
>
> b) The ACA is a SCSI ULP concept and iSCSI should not be negotiating its
> use or lack thereof. This is done thru the NACA bit in CDBs.
>
> c) (As a side note, the description of EnableACA on pg 127 refers to its
> presence in the lun control mode page, but it is actually present in the
> protocol specific port page.)
>
> d) ACA is a LUN-level (more an I/O level) control option. It MUST NOT be
> negotiated on a per-session basis. SCSI allows initiators to request ACA
> behaviour on a per I/O basis through the use of NACA bit in the CDBs.
>

 +++ We have required ACA to be supported by all new iSCSI targets and
 several actions require the target to enter ACA state.
 It was brought to our attention that many initiators will not react
 properly to a target entering ACA state (not do the reset).
 The EnableACA bit and key are meant to enable an initiator to control
 this iSCSI specific ACA behaviour.  This behaviour is related to
 asynchronous events and is not controlled by the NACA CDB bit.

 ++++





From owner-ips@ece.cmu.edu  Sat Apr 21 16:07:17 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA12082
	for <ips-archive@odin.ietf.org>; Sat, 21 Apr 2001 16:07:13 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3LHQQE07829
	for ips-outgoing; Sat, 21 Apr 2001 13:26:26 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3LHPtA07816
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 13:25:55 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id TAA14300
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 19:25:48 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id TAA26126
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 19:25:48 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A35.005FBA7E ; Sat, 21 Apr 2001 19:25:36 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A35.005FB9D9.00@d12mta02.de.ibm.com>
Date: Sat, 21 Apr 2001 19:30:45 +0200
Subject: Re: iSCSI : Negotiable padding, was More issues.... Digest
	 related.
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Rod,

On-the-wire padding is not required at all, and many of us resisted
padding up to the advent of markers.

Please explain what a padding option would do that you can't do by
yourself in the end system with buffer alignment, or how padding on the
wire can stop you from getting buffer alignment all wrong.

Julo

"Rod Harrison" <rod.harrison@windriver.com> on 20/04/2001 20:30:14

Please respond to "Rod Harrison" <rod.harrison@windriver.com>

To:   ips@ece.cmu.edu
cc:
Subject:  iSCSI : Negotiable padding, was More issues.... Digest related.




Julian,

     I have a request with respect to data padding. Can we make
the pad size login negotiable please? Preferably on a per
direction basis. This would allow the pad to be optimized
for a receiver's specific requirements, e.g. cache line
alignment. Restricting padding to powers of 2 by specifying
the size as a power of 2 seems reasonable.

     For example:

     IPadSize=<any-power-of-two>
     TPadSize=<any-power-of-two>

     IPadSize=0
     TPadSize=2

     Would result in the initiator padding data to 4 byte
boundaries for the target, and the target inserting no pad
for the initiator.
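
     To make the arithmetic concrete, a small sketch follows. It
assumes the negotiated value is the exponent of the boundary, as in
the example above (0 means a 1-byte boundary, so no pad; 2 means a
4-byte boundary). The function name is hypothetical.

```python
def pad_bytes(data_len: int, pad_exp: int) -> int:
    """Number of pad bytes needed to reach a 2**pad_exp byte boundary."""
    boundary = 1 << pad_exp
    # Python's % always yields a non-negative result for a positive modulus.
    return -data_len % boundary
```

     With pad_exp=2, a 5-byte data segment gets 3 pad bytes; with
pad_exp=0, no segment is ever padded.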

     Also, a related question, if the pad is to remain mandatory
is it expected data will be padded if no data digests are in
use?

     - Rod

-----Original Message-----
From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu] On Behalf Of julian_satran@il.ibm.com
Sent: Friday, April 20, 2001 1:45 AM
To: ips@ece.cmu.edu
Subject: Re: iSCSI : More issues.... Digest related.




1. The padding is to the next 4-byte word boundary.
2. There is a Security Appendix

and there is a numbering/formatting error in the appendix
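
With padding fixed at the next 4-byte boundary, the receiver can locate
the data digest from the unpadded DataSegmentLength alone. A minimal
sketch, assuming the caller already knows the total header length (BHS
plus any AHS and header digest); the function name and the header_len
parameter are illustrative, not from the draft:

```python
def data_digest_offset(header_len: int, data_segment_length: int) -> int:
    """Offset of the data digest from the start of the PDU.

    data_segment_length excludes padding; the pad is whatever is needed
    to reach the next 4-byte boundary (0 to 3 bytes).
    """
    pad = -data_segment_length % 4
    return header_len + data_segment_length + pad
```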

Julo

Santosh Rao <santoshr@cup.hp.com> on 20/04/2001 03:25:49

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   IPS Reflector <ips@ece.cmu.edu>
cc:
Subject:  iSCSI : More issues.... Digest related.




Julian & All,

2 more issues :

1) If the DataSegmentLength in the BHS excludes padding bytes, how does
the initiator determine the location of the data digest [which is placed
after the padded data] ?
There is no knowledge of what amount of padding is in use, since padding
can be 4 bytes or a multiple of that quantity.


2) While on the subject of digests.....aren't there supposed to be
login keys to indicate the use or non-use of header and data digests ? I
can't seem to find any such login keys in the latest revs
5.91....5.92....6.000...(?)

(Section 2.2.1 states :
"The digest types are negotiated during the login phase. ").

- Santosh








From owner-ips@ece.cmu.edu  Sat Apr 21 18:21:51 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA12537
	for <ips-archive@odin.ietf.org>; Sat, 21 Apr 2001 18:21:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3LJu4213308
	for ips-outgoing; Sat, 21 Apr 2001 15:56:04 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dirty.research.bell-labs.com (dirty.research.bell-labs.com [204.178.16.6])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3LJtuA13303
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 15:55:56 -0400 (EDT)
Received: from scummy.research.bell-labs.com ([135.104.2.10]) by dirty; Sat Apr 21 15:55:20 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by scummy; Sat Apr 21 15:55:19 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id PAA28947
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 15:55:14 -0400 (EDT)
Message-ID: <3AE1E5A2.F186F5C5@research.bell-labs.com>
Date: Sat, 21 Apr 2001 15:55:14 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI:out-of-order notification proposal
References: <200104202124.OAA22477@core.rose.hp.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Answering two responses in one email.  This would be much
easier if there were a whiteboard around here :-)

> What exactly does it buy you? (that is not already in).

The ability to execute subsequent PDUs in the presence of holes.
(Holes due to multiple connections OR due to skipping onto next marker)

If the hole is due to a REAL ordering issue (e.g. an Ordered task
was lost) then the target will pause for the hole to fill.

But if the hole is due to a simple task, then the target
can continue execution of the new commands it is receiving.

Currently, the target will *always* pause due to cmdSN ordering,
which is somewhat like a primitive microprocessor which won't
reorder its instruction stream, though it's feasible.

> 
> And FWIW David (B) raised a question related to the requirements doc.

My mistake; the rephrasing of requirements seemed to match the intent
of previous discussions on why total cmdSN ordering is overkill.
That was the only point I was trying to reinforce.

> 
> Julo



"Mallikarjun C." wrote:
> 
> Sandeep,
> 
> Some comments on your proposal.
> 
> - The "strict_order" flag that you mention appears to carry
>   information that's already contained by the SCSI task attributes
>   (ATTR field).

But now the information is also contained in PDUs following it.
(e.g. A simple command PDU may also have strict_order=1)

If an Ordered task PDU is lost, the target *knows* there is a REAL
ordering issue (later PDUs have info) so it must wait for the hole 
to be filled.

> 
> - This proposal requires the target iSCSI layer to look at more
>   variables (SCSI-level information in that) for making its
>   sequencing decisions.

Only one more bit.. to decide if the command can be passed
to SCSI layer now or later in cmdSN sequence.  

To build an efficient SCSI transport, some hints would need
to be passed between layers.

> 
> - Simple commands (intended to be) received prior to an Ordered
>   command must be executed before the Ordered one.  If you lose a
>   Simple command, you must wait for it before you act on the Ordered
>   that you received out-of-order.

Good point.  I forgot to mention this below.  The target must clear
out all PDUs received with strict_order=0 before commencing
execution of PDUs received with strict_order=1.
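
One way a target could apply that rule is sketched below. This is only
an illustration under simplifying assumptions: cmdSN wraparound is
ignored, one session-wide sequence is used, and "execute" just records
the cmdSN. The class and method names are hypothetical.

```python
class TargetDispatcher:
    """Run strict_order=0 commands on arrival; hold strict_order=1
    commands until every lower cmdSN has been received."""

    def __init__(self, first_cmd_sn: int):
        self.exp_cmd_sn = first_cmd_sn  # lowest cmdSN not yet received
        self.received = set()           # cmdSNs received at/above exp_cmd_sn
        self.held = set()               # strict_order=1 cmdSNs awaiting release
        self.executed = []              # execution order, for inspection

    def receive(self, cmd_sn: int, strict_order: int) -> None:
        self.received.add(cmd_sn)
        if strict_order:
            self.held.add(cmd_sn)
        else:
            # No ordering constraint signalled: run despite any holes.
            self.executed.append(cmd_sn)
        # Advance past the contiguous prefix, releasing held commands
        # in cmdSN order. All earlier strict_order=0 commands have
        # already run by the time a held command is released.
        while self.exp_cmd_sn in self.received:
            self.received.discard(self.exp_cmd_sn)
            if self.exp_cmd_sn in self.held:
                self.held.discard(self.exp_cmd_sn)
                self.executed.append(self.exp_cmd_sn)
            self.exp_cmd_sn += 1
```

Feeding it (102,0), (100,0), (101,0), (104,1), (105,1), (103,1) runs
the first three on arrival and then 103, 104, 105 in cmdSN order once
103 fills the hole.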

> 
> This appears more an implementaion optimization where desired than
> something we want to get into the draft.  You are proposing that
> even in the absence of errors, iSCSI should make ordering decisions
> based on SCSI task attributes.  I disagree with that.

The optimization needs a bit in the PDU, otherwise I wouldn't
have gone to the trouble of writing this out!  And you would agree
that there are a bunch of TCP optimizations in various drafts.

Lastly, what is the point in building a sync-and-steering layer to
skip ahead in the command stream if you cannot execute that next
command?
With this little more work, you could as well continue execution.

> --
> Mallikarjun
> 
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668 Hewlett-Packard, Roseville.
> cbm@rose.hp.com
> 
> >This is a proposal to allow the initiator to inform the target
> >if out-of-order execution within the command stream is possible.
> >
> >The target execution can rotate between "in-cmdSN-order" and
> >"out-of-order" during runtime as informed by the initiator..
> >
> >Appreciate comments on subtleties that I may have missed.
> >
> >thanks,
> >-Sandeep
> >
> >http://ips.pdl.cs.cmu.edu/mail/msg03152.html
> >   by Costa provides a good summary of the issue at hand.
> >
> >http://ips.pdl.cs.cmu.edu/mail/msg04255.html
> >  David provides the "new" reqts.  In particular, this one
> >
> >   > MUST specify the ability to preserve ordered delivery
> >   > of SCSI commands even in the presence of transport
> >   > errors.  A mechanism MUST be provided to allow
> >   > Initiators and Targets to negotiate this preservation
> >   > on a per-session or finer granularity basis
> >
> >Note :
> >======
> >1) This does not rely on SCSI cmdRN, but operates at iSCSI level.
> >2) Flow control using cmdSN works as designed.
> >3) This solution is not a per-session negotiation option but can be
> >   disabled and re-enabled again at "runtime" by the initiator if it
> >   notices that Ordered/HOQ tasks (or any other need) have entered
> >   the iSCSI command stream which is being dispatched.
> >
> >Problem:
> >=========
> >In case of out-of-order arrival or digest errors, it is NOT possible
> >to know if the initiator had sent an ordered command before the one
> >which was received.
> >
> >Solution:
> >=========
> >To notify target of presence of in-flight ordered commands, we set
> >a flag on *every* PDU following the ordered command *until* the target
> >moves it expCmdSN above the cmdSN of the ordered command.  The
> >expCmdSN indicates target has found the ordered command.
> >
> >( Those familiar with "ECN over IP" (Floyd, et al) may see this is
> >similar to how a congestion bit keeps being set until the sender acks
> >that it has received the notification)
> >
> >Figuratively:
> >=============
> >So, assume there was a new "strict_order" flag in the BHS.
> >In the figure below, braces shows value of (cmdSN, strict_order)
> >
> >Initiator                       Target
> >---------                       -------
> >
> >simple cmd (cmdSN=100, strict_order=0) ->
> >simple cmd (101, 0) ->
> >simple cmd (102, 0) ->
> >
> >   these may get reordered or have digest errors ->
> >
> >                        target executes as they arrive
> >                             exec simple cmd(102, 0)
> >                             exec simple cmd(100, 0)
> >                             exec simple cmd(101, 0)
> >
> >Now say the initiator wants to send a HOQ task.
> >It sets strict_order=1 on all PDUs
> >
> >ordered cmd (103, 1) ->
> >simple  cmd (104, 1) ->
> >simple  cmd (105, 1) ->
> >                        in case of reordering or digest errors
> >                        target must wait & execute in cmdSN order!
> >simple  cmd (106, 1) ->
> >simple  cmd (107, 1) ->
> >
> >         <---- now target sends expCmdSN=103
> >
> >This implies target has seen command(cmdSN=103) and target will do the
> >appropriate ordering and delivery to SCSI layer.  This is left to
> >the target implementation to tackle.
> >
> >Initiator checks (expCmdSN >= cmdSN of ordered cmd) and then resets
> >strict_order to zero for all subsequent PDUs.
> >
> >simple  cmd (108, 0) ->
> >simple  cmd (109, 0) ->
> >
> >Other issues:
> >=============
> >If the basic scheme is ok, then we could later tackle other questions
> >"what about multiple ordered commands" and the like..


From owner-ips@ece.cmu.edu  Sun Apr 22 02:08:03 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA29759
	for <ips-archive@odin.ietf.org>; Sun, 22 Apr 2001 02:08:02 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3M3jHn00678
	for ips-outgoing; Sat, 21 Apr 2001 23:45:17 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel2.hp.com (atlrel2.hp.com [156.153.255.202])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3M3iLA00629
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 23:44:21 -0400 (EDT)
Received: from core.rose.hp.com (core.rose.hp.com [15.43.208.100])
	by atlrel2.hp.com (Postfix) with ESMTP id 6CF9FE2F
	for <ips@ece.cmu.edu>; Sat, 21 Apr 2001 23:44:20 -0400 (EDT)
Received: (from cbm@localhost) by core.rose.hp.com (8.8.6 (PHNE_14041)/8.8.6 SMKit7.02) id UAA27619 for ips@ece.cmu.edu; Sat, 21 Apr 2001 20:45:23 -0700 (PDT)
Message-Id: <200104220345.UAA27619@core.rose.hp.com>
Subject: Re: iSCSI:out-of-order notification proposal
To: ips@ece.cmu.edu
Date: Sat, 21 Apr 2001 20:45:23 PDT
Reply-To: cbm@rose.hp.com
From: "Mallikarjun C." <cbm@rose.hp.com>
X-Mailer: Elm [revision: 212.4]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Sandeep,

It's not just a single SCSI info-bit that the target iSCSI layer should 
look at for the optimization you're proposing - the LUN needs to be 
examined as well since the task attributes are scoped per-LU.

In effect, what you're suggesting is an iSCSI-level per-LU ordering with 
a session-global command sequencing scheme - that is quite a bit of 
complexity for targets to manage!  

BTW, the synch and steering layer is *not* meant to help iSCSI (or SCSI)
process payloads out-of-order.  It is targeted only at out-of-order
data placement to reduce implementation costs, as the current draft 
clearly states.

Regards.
--
Mallikarjun 


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668	Hewlett-Packard, Roseville.
cbm@rose.hp.com


>Answering two responses in one email.  This would be mighty 
>easier if there was a whiteboard around here :-)
>
>> What exactly does it buy you? (that is not already in).
>
>The ability to execute subsequent PDUs in the presence of holes.
>(Holes due to multiple connections OR due to skipping onto next marker)
>
>If the hole is due to a REAL ordering issue (e.g. ORdered task
>was lost) then the target will pause for the hole to fill.
>
>But if the hole is due to simple task, then the target
>can continue execution of the new commands it is receiving.
>
>Currently, target will *always* pause due to cmdSN ordering,
>which is somewhat like a primitive microprocessor which wont 
>reorder its instruction stream, though its feasible.
>
>> 
>> And FWIW David (B) raised a question related to the requirements doc.
>
>My mistake, the rephrasing of requirements seemed to match the intent 
>of previous discussions on why total cmdSN ordering is overkill.
>That was the only point i was trying to reinforce.
>
>> 
>> Julo
>
>
>
>"Mallikarjun C." wrote:
>> 
>> Sandeep,
>> 
>> Some comments on your proposal.
>> 
>> - The "strict_order" flag that you mention appears to carry
>>   information that's already contained by the SCSI task attributes
>>   (ATTR field).
>
>But now the information is also contained in PDUs following it.
>(e.g. A simple command PDU may also have strict_order=1)
>
>If an Ordered task PDU is lost, the target *knows* there is a REAL
>ordering issue (later PDUs have info) so it must wait for the hole 
>to be filled.
>
>> 
>> - This proposal requires the target iSCSI layer to look at more
>>   variables (SCSI-level information in that) for making its
>>   sequencing decisions.
>
>Only one more bit.. to decide if the command can be passed
>to SCSI layer now or later in cmdSN sequence.  
>
>To build an efficient SCSI transport, some hints would need
>to be passed between layers.
>
>> 
>> - Simple commands (intended to be) received prior to an Ordered
>>   command must be executed before the Ordered one.  If you lose a
>>   Simple command, you must wait for it before you act on the Ordered
>>   that you received out-of-order.
>
>Good pt.  This, I forgot to mention below.  The target must clear
>out all PDUs received with strict-order=0 before commencing 
>execution of PDUs received with strict-order=1.
>
>> 
>> This appears more an implementaion optimization where desired than
>> something we want to get into the draft.  You are proposing that
>> even in the absence of errors, iSCSI should make ordering decisions
>> based on SCSI task attributes.  I disagree with that.
>
>The optimization needs a bit in the PDU, otherwise I wouldnt
>have gone to the trouble of writing this out!  And you would agree
>that there are a bunch of TCP optimizations in various drafts.
>
>Lastly, what is the point in building a sync-and-steering layer to
>skip ahead in the command stream if you cannot execute that next command
>?
>With this little more work, you could as well continue execution.
>
>> --
>> Mallikarjun
>> 
>> Mallikarjun Chadalapaka
>> Networked Storage Architecture
>> Network Storage Solutions Organization
>> MS 5668 Hewlett-Packard, Roseville.
>> cbm@rose.hp.com
>> 
>> >This is a proposal to allow the initiator to inform the target
>> >if out-of-order execution within the command stream is possible.
>> >
>> >The target execution can rotate between "in-cmdSN-order" and
>> >"out-of-order" during runtime as informed by the initiator..
>> >
>> >Appreciate comments on subtleties that I may have missed.
>> >
>> >thanks,
>> >-Sandeep
>> >
>> >http://ips.pdl.cs.cmu.edu/mail/msg03152.html
>> >   by Costa provides a good summary of the issue at hand.
>> >
>> >http://ips.pdl.cs.cmu.edu/mail/msg04255.html
>> >  David provides the "new" reqts.  In particular, this one
>> >
>> >   > MUST specify the ability to preserve ordered delivery
>> >   > of SCSI commands even in the presence of transport
>> >   > errors.  A mechanism MUST be provided to allow
>> >   > Initiators and Targets to negotiate this preservation
>> >   > on a per-session or finer granularity basis
>> >
>> >Note :
>> >======
>> >1) This does not rely on SCSI cmdRN, but operates at iSCSI level.
>> >2) Flow control using cmdSN works as designed.
>> >3) This solution is not a per-session negotiation option but can be
>> >   disabled and re-enabled again at "runtime" by the initiator if it
>> >   notices that Ordered/HOQ tasks (or any other need) have entered
>> >   the iSCSI command stream which is being dispatched.
>> >
>> >Problem:
>> >=========
>> >In case of out-of-order arrival or digest errors, it is NOT possible
>> >to know if the initiator had sent an ordered command before the one
>> >which was received.
>> >
>> >Solution:
>> >=========
>> >To notify target of presence of in-flight ordered commands, we set
>> >a flag on *every* PDU following the ordered command *until* the target
>> >moves it expCmdSN above the cmdSN of the ordered command.  The
>> >expCmdSN indicates target has found the ordered command.
>> >
>> >( Those familiar with "ECN over IP" (Floyd, et al) may see this is
>> >similar to how a congestion bit keeps being set until the sender acks
>> >that it has received the notification)
>> >
>> >Figuratively:
>> >=============
>> >So, assume there was a new "strict_order" flag in the BHS.
>> >In the figure below, braces shows value of (cmdSN, strict_order)
>> >
>> >Initiator                       Target
>> >---------                       -------
>> >
>> >simple cmd (cmdSN=100, strict_order=0) ->
>> >simple cmd (101, 0) ->
>> >simple cmd (102, 0) ->
>> >
>> >   these may get reordered or have digest errors ->
>> >
>> >                        target executes as they arrive
>> >                             exec simple cmd(102, 0)
>> >                             exec simple cmd(100, 0)
>> >                             exec simple cmd(101, 0)
>> >
>> >Now say the initiator wants to send a HOQ task.
>> >It sets strict_order=1 on all PDUs
>> >
>> >ordered cmd (103, 1) ->
>> >simple  cmd (104, 1) ->
>> >simple  cmd (105, 1) ->
>> >                        in case of reordering or digest errors
>> >                        target must wait & execute in cmdSN order!
>> >simple  cmd (106, 1) ->
>> >simple  cmd (107, 1) ->
>> >
>> >         <---- now target sends expCmdSN=103
>> >
>> >This implies target has seen command(cmdSN=103) and target will do the
>> >appropriate ordering and delivery to SCSI layer.  This is left to
>> >the target implementation to tackle.
>> >
>> >Initiator checks (expCmdSN >= cmdSN of ordered cmd) and then resets
>> >strict_order to zero for all subsequent PDUs.
>> >
>> >simple  cmd (108, 0) ->
>> >simple  cmd (109, 0) ->
>> >
>> >Other issues:
>> >=============
>> >If the basic scheme is ok, then we could later tackle other questions
>> >"what about multiple ordered commands" and the like..
>




From owner-ips@ece.cmu.edu  Sun Apr 22 07:21:00 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id HAA08106
	for <ips-archive@odin.ietf.org>; Sun, 22 Apr 2001 07:21:00 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3M8xZc11939
	for ips-outgoing; Sun, 22 Apr 2001 04:59:35 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3M8xLA11932
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 04:59:21 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id EAA90764
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 04:51:55 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id CAA21368
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 02:59:15 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: Target Reset
To: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OFFC3A08D4.11A95275-ON88256A36.003018BB@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Sun, 22 Apr 2001 01:59:03 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/22/2001 02:59:18 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

I thought we had a number of discussions previously about Target Reset (Warm
or Cold).  I thought there was a general feeling that this command is so
dangerous that it should not be supported by iSCSI.  The long distance
capability of iSCSI makes the risks involved unmanageable.  There should
only be an Admin way to do this.

Some folks have said that we could permit it and have special authorization
etc.  This would probably require a separate section in the spec to define
the authorization approach, and whatever other security is needed to
prevent this from being used inappropriately.  All for what purpose?  This
cannot be part of error recovery from a normal initiator.  The widespread
effect is too great for that.

I would like to hear from the list about their feelings on this item.



.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com



From owner-ips@ece.cmu.edu  Sun Apr 22 13:32:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA11772
	for <ips-archive@odin.ietf.org>; Sun, 22 Apr 2001 13:32:21 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3MFEpW12664
	for ips-outgoing; Sun, 22 Apr 2001 11:14:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from magic.adaptec.com (magic.adaptec.com [208.236.45.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MFE0A12645
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 11:14:00 -0400 (EDT)
Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11])
	by magic.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id IAA25172
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 08:13:51 -0700 (PDT)
Received: from otcexc01.otc.adaptec.com (otcexc01.otc.adaptec.com [10.12.1.27])
	by redfish.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id IAA09442
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 08:04:39 -0700 (PDT)
Received: by otcexc01.otc.adaptec.com with Internet Mail Service (5.5.2650.21)
	id <JK8AHXRK>; Sun, 22 Apr 2001 11:13:50 -0400
Message-ID: <50DB155AD0CED411988E009027D61DB32AB157@otcexc01.otc.adaptec.com>
From: "Dillard, David" <david_dillard@adaptec.com>
To: ips@ece.cmu.edu
Subject: RE: Target Reset
Date: Sun, 22 Apr 2001 11:13:49 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

John,

I understand the danger of issuing a target reset and I agree that it should
not be a part of an initiator's normal error recovery procedure.
However, looking at this from a management perspective I'd like to see a
standardized way of resetting a target.  I don't want to see a variety of
vendor unique methods of resetting targets sprout up.

If resetting a target using the protocol is not desirable from your
perspective would incorporating this feature into the MIB be acceptable?
(MIBs are for management after all)

----------------------------------------------------------------
David Dillard                          david_dillard@adaptec.com
Management Software Group
Adaptec, Inc.                          www.adaptec.com




-----Original Message-----
From: John Hufferd [mailto:hufferd@us.ibm.com]
Sent: Sunday, April 22, 2001 4:59 AM
To: ips@ece.cmu.edu
Subject: Target Reset


I thought we had a number of discussion previously about Target Reset (Warm
or Cold).  I thought there was general feeling that this command is so
dangerous that it should not be supported by iSCSI.  The long distance
capability of iSCSI makes the risks involved unmanageable.  There should
only be an Admin way to do this.

Some folks have said that we could permit it and have special authorization
etc.  This would probably cause a separate section in the spec. to define
the authorization approach,  and what ever other security is needed to
prevent this from inappropriately being used.  All for what purpose?  This
can not be part of error recovery from a normal initiator.  The wide spread
effect is too great for that.

I would like to hear from the list about their feeling on this item.



.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


From owner-ips@ece.cmu.edu  Sun Apr 22 14:40:26 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA12121
	for <ips-archive@odin.ietf.org>; Sun, 22 Apr 2001 14:40:25 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3MGct116277
	for ips-outgoing; Sun, 22 Apr 2001 12:38:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dirty.research.bell-labs.com (dirty.research.bell-labs.com [204.178.16.6])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3MGbvA16262
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 12:37:57 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by dirty; Sun Apr 22 12:36:25 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Sun Apr 22 12:36:22 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id MAA14218
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 12:36:19 -0400 (EDT)
Message-ID: <3AE30883.CA51ACA0@research.bell-labs.com>
Date: Sun, 22 Apr 2001 12:36:19 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI:out-of-order notification proposal
References: <200104220345.UAA27619@core.rose.hp.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Comments in text..

"Mallikarjun C." wrote:
> 
> Sandeep,
> 
> It's just not a single SCSI info-bit that the target iSCSI layer should
> look at for the optimization you're proposing - the LUN needs to be
> examined as well since the task attributes are scoped per-LU.
> 
> In effect, what you're suggesting is an iSCSI-level per-LU ordering with
> a session-global command sequencing scheme - that is a quite a bit of
> complexity for targets to manage!

Ah, I did not understand what you were getting at.

Let me emphasize - this is just a HINT of what's in the pipeline, 
and not a method to replace or duplicate SCSI functionality.

In the (common?) case where you have an incoming stream of  
unrelated tasks and a large command window, you might as well 
pass them on to the SCSI layer right away.

In the worst case, the hint will be a bad one and the target will 
perform no better than is currently planned.

A high-performance transport architecture with multiple connections and
multi-NIC support is bound to see PDU reordering, and consequently
suffer if it does not overcome this strict cmdSN ordering rule. 
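
To make the hint concrete, here is a minimal sketch of how a target might
act on it.  It is only an illustration under stated assumptions, not
anything in the draft: strict_order is the flag proposed in the quoted
text further down, and the Target class, its fields and receive() are
hypothetical names.  It ignores cmdSN wraparound, the command window and
the per-LU scoping raised above.

```python
class Target:
    """Hypothetical target-side handling of the proposed strict_order hint."""

    def __init__(self, exp_cmd_sn=100):
        self.exp_cmd_sn = exp_cmd_sn  # lowest cmdSN not yet received
        self.received = {}            # cmdSN -> strict_order, for cmdSN >= exp_cmd_sn
        self.dispatched = set()       # cmdSNs already handed to the SCSI layer
        self.executed = []            # order in which commands reached the SCSI layer

    def _dispatch(self, cmd_sn):
        # Hand a command to the SCSI layer at most once.
        if cmd_sn not in self.dispatched:
            self.dispatched.add(cmd_sn)
            self.executed.append(cmd_sn)

    def receive(self, cmd_sn, strict_order):
        self.received[cmd_sn] = strict_order
        if strict_order == 0:
            # Hint says no ordered command is in flight: pass it on right away,
            # even if earlier cmdSNs are still missing.
            self._dispatch(cmd_sn)
        # Advance expCmdSN across every contiguous cmdSN now present.  PDUs that
        # were held because strict_order=1 are released here, in cmdSN order,
        # and only once every earlier PDU has been received and dispatched.
        while self.exp_cmd_sn in self.received:
            self._dispatch(self.exp_cmd_sn)
            del self.received[self.exp_cmd_sn]
            self.exp_cmd_sn += 1


if __name__ == "__main__":
    t = Target(exp_cmd_sn=100)
    for pdu in [(102, 0), (100, 0), (101, 0), (104, 1), (105, 1), (103, 1)]:
        t.receive(*pdu)
    print(t.executed, t.exp_cmd_sn)
```

With strict_order=0 the commands go to SCSI in arrival order; with
strict_order=1 the target falls back to plain cmdSN ordering, which is
the "no better than currently planned" worst case.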

> 
> BTW, synch and steering layer is *not* meant to help iSCSI (or SCSI)
> process payloads out-of-order.  It is targeted only for out-of-order
> data placements to reduce implementation costs, as the current draft
> clearly states.

OK, I haven't looked at this section in depth yet :-)

> 
> Regards.
> --
> Mallikarjun
> 
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668 Hewlett-Packard, Roseville.
> cbm@rose.hp.com
> 
> >Answering two responses in one email.  This would be mighty
> >easier if there was a whiteboard around here :-)
> >
> >> What exactly does it buy you? (that is not already in).
> >
> >The ability to execute subsequent PDUs in the presence of holes.
> >(Holes due to multiple connections OR due to skipping onto next marker)
> >
> >If the hole is due to a REAL ordering issue (e.g. an Ordered task
> >was lost) then the target will pause for the hole to fill.
> >
> >But if the hole is due to a Simple task, then the target
> >can continue execution of the new commands it is receiving.
> >
> >Currently, the target will *always* pause due to cmdSN ordering,
> >which is somewhat like a primitive microprocessor that won't
> >reorder its instruction stream even though it's feasible.
> >
> >>
> >> And FWIW David (B) raised a question related to the requirements doc.
> >
> >My mistake, the rephrasing of requirements seemed to match the intent
> >of previous discussions on why total cmdSN ordering is overkill.
> >That was the only point i was trying to reinforce.
> >
> >>
> >> Julo
> >
> >
> >
> >"Mallikarjun C." wrote:
> >>
> >> Sandeep,
> >>
> >> Some comments on your proposal.
> >>
> >> - The "strict_order" flag that you mention appears to carry
> >>   information that's already contained by the SCSI task attributes
> >>   (ATTR field).
> >
> >But now the information is also contained in PDUs following it.
> >(e.g. A simple command PDU may also have strict_order=1)
> >
> >If an Ordered task PDU is lost, the target *knows* there is a REAL
> >ordering issue (later PDUs have info) so it must wait for the hole
> >to be filled.
> >
> >>
> >> - This proposal requires the target iSCSI layer to look at more
> >>   variables (SCSI-level information in that) for making its
> >>   sequencing decisions.
> >
> >Only one more bit.. to decide if the command can be passed
> >to SCSI layer now or later in cmdSN sequence.
> >
> >To build an efficient SCSI transport, some hints would need
> >to be passed between layers.
> >
> >>
> >> - Simple commands (intended to be) received prior to an Ordered
> >>   command must be executed before the Ordered one.  If you lose a
> >>   Simple command, you must wait for it before you act on the Ordered
> >>   that you received out-of-order.
> >
> >Good pt.  This, I forgot to mention below.  The target must clear
> >out all PDUs received with strict-order=0 before commencing
> >execution of PDUs received with strict-order=1.
> >
> >>
> >> This appears more an implementation optimization where desired than
> >> something we want to get into the draft.  You are proposing that
> >> even in the absence of errors, iSCSI should make ordering decisions
> >> based on SCSI task attributes.  I disagree with that.
> >
> >The optimization needs a bit in the PDU, otherwise I wouldn't
> >have gone to the trouble of writing this out!  And you would agree
> >that there are a bunch of TCP optimizations in various drafts.
> >
> >Lastly, what is the point in building a sync-and-steering layer to
> >skip ahead in the command stream if you cannot execute that next
> >command?  With a little more work, you might as well continue
> >execution.
> >
> >> --
> >> Mallikarjun
> >>
> >> Mallikarjun Chadalapaka
> >> Networked Storage Architecture
> >> Network Storage Solutions Organization
> >> MS 5668 Hewlett-Packard, Roseville.
> >> cbm@rose.hp.com
> >>
> >> >This is a proposal to allow the initiator to inform the target
> >> >if out-of-order execution within the command stream is possible.
> >> >
> >> >The target execution can rotate between "in-cmdSN-order" and
> >> >"out-of-order" during runtime as informed by the initiator..
> >> >
> >> >Appreciate comments on subtleties that I may have missed.
> >> >
> >> >thanks,
> >> >-Sandeep
> >> >
> >> >http://ips.pdl.cs.cmu.edu/mail/msg03152.html
> >> >   by Costa provides a good summary of the issue at hand.
> >> >
> >> >http://ips.pdl.cs.cmu.edu/mail/msg04255.html
> >> >  David provides the "new" reqts.  In particular, this one
> >> >
> >> >   > MUST specify the ability to preserve ordered delivery
> >> >   > of SCSI commands even in the presence of transport
> >> >   > errors.  A mechanism MUST be provided to allow
> >> >   > Initiators and Targets to negotiate this preservation
> >> >   > on a per-session or finer granularity basis
> >> >
> >> >Note :
> >> >======
> >> >1) This does not rely on SCSI cmdRN, but operates at iSCSI level.
> >> >2) Flow control using cmdSN works as designed.
> >> >3) This solution is not a per-session negotiation option but can be
> >> >   disabled and re-enabled again at "runtime" by the initiator if it
> >> >   notices that Ordered/HOQ tasks (or any other need) have entered
> >> >   the iSCSI command stream which is being dispatched.
> >> >
> >> >Problem:
> >> >=========
> >> >In case of out-of-order arrival or digest errors, it is NOT possible
> >> >to know if the initiator had sent an ordered command before the one
> >> >which was received.
> >> >
> >> >Solution:
> >> >=========
> >> >To notify target of presence of in-flight ordered commands, we set
> >> >a flag on *every* PDU following the ordered command *until* the target
> >> >moves its expCmdSN above the cmdSN of the ordered command.  The
> >> >expCmdSN indicates target has found the ordered command.
> >> >
> >> >( Those familiar with "ECN over IP" (Floyd, et al) may see this is
> >> >similar to how a congestion bit keeps being set until the sender acks
> >> >that it has received the notification)
> >> >
> >> >Figuratively:
> >> >=============
> >> >So, assume there was a new "strict_order" flag in the BHS.
> >> >In the figure below, parentheses show the value of (cmdSN, strict_order)
> >> >
> >> >Initiator                       Target
> >> >---------                       -------
> >> >
> >> >simple cmd (cmdSN=100, strict_order=0) ->
> >> >simple cmd (101, 0) ->
> >> >simple cmd (102, 0) ->
> >> >
> >> >   these may get reordered or have digest errors ->
> >> >
> >> >                        target executes as they arrive
> >> >                             exec simple cmd(102, 0)
> >> >                             exec simple cmd(100, 0)
> >> >                             exec simple cmd(101, 0)
> >> >
> >> >Now say the initiator wants to send an Ordered (or HOQ) task.
> >> >It sets strict_order=1 on all PDUs
> >> >
> >> >ordered cmd (103, 1) ->
> >> >simple  cmd (104, 1) ->
> >> >simple  cmd (105, 1) ->
> >> >                        in case of reordering or digest errors
> >> >                        target must wait & execute in cmdSN order!
> >> >simple  cmd (106, 1) ->
> >> >simple  cmd (107, 1) ->
> >> >
> >> >         <---- now target sends expCmdSN=103
> >> >
> >> >This implies target has seen command(cmdSN=103) and target will do the
> >> >appropriate ordering and delivery to SCSI layer.  This is left to
> >> >the target implementation to tackle.
> >> >
> >> >Initiator checks (expCmdSN >= cmdSN of ordered cmd) and then resets
> >> >strict_order to zero for all subsequent PDUs.
> >> >
> >> >simple  cmd (108, 0) ->
> >> >simple  cmd (109, 0) ->
> >> >
> >> >Other issues:
> >> >=============
> >> >If the basic scheme is ok, then we could later tackle other questions
> >> >"what about multiple ordered commands" and the like..
> >
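
The initiator side of the scheme quoted above can be sketched as
follows.  This is a rough illustration, not part of the proposal text:
the Initiator class and its method names are hypothetical, the reset
check is the one written in the mail (expCmdSN >= cmdSN of the ordered
command), cmdSN wraparound is ignored, and tracking only the most recent
ordered command is an assumption, since "multiple ordered commands" is
left open above.

```python
class Initiator:
    """Hypothetical initiator-side bookkeeping for the proposed strict_order bit."""

    def __init__(self, first_cmd_sn=100):
        self.next_cmd_sn = first_cmd_sn
        # cmdSN of the most recent Ordered/HOQ command the target has not yet
        # acknowledged via expCmdSN; None means no ordered command is in flight.
        self.pending_ordered_sn = None

    def send(self, ordered=False):
        """Assign a cmdSN and return the (cmdSN, strict_order) pair for the PDU."""
        cmd_sn = self.next_cmd_sn
        self.next_cmd_sn += 1
        if ordered:
            self.pending_ordered_sn = cmd_sn
        # The flag is set on the ordered command and on *every* PDU after it
        # until the target's expCmdSN catches up.
        strict_order = 1 if self.pending_ordered_sn is not None else 0
        return (cmd_sn, strict_order)

    def on_exp_cmd_sn(self, exp_cmd_sn):
        """Process an expCmdSN reported by the target."""
        # Check as written in the mail: expCmdSN >= cmdSN of the ordered command.
        if (self.pending_ordered_sn is not None
                and exp_cmd_sn >= self.pending_ordered_sn):
            self.pending_ordered_sn = None


if __name__ == "__main__":
    ini = Initiator(first_cmd_sn=100)
    pdus = [ini.send() for _ in range(3)]        # simple commands 100..102
    pdus.append(ini.send(ordered=True))          # ordered command 103
    pdus += [ini.send() for _ in range(4)]       # simple commands 104..107
    ini.on_exp_cmd_sn(103)                       # target reports expCmdSN=103
    pdus += [ini.send() for _ in range(2)]       # simple commands 108..109
    print(pdus)
```

Replaying the figure this way gives strict_order=0 for 100-102, 1 for
103-107, and 0 again for 108-109.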


From owner-ips@ece.cmu.edu  Sun Apr 22 17:41:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA13707
	for <ips-archive@odin.ietf.org>; Sun, 22 Apr 2001 17:41:12 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3MJL2H25285
	for ips-outgoing; Sun, 22 Apr 2001 15:21:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MJKkA25275
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 15:20:46 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id PAA16670;
	Sun, 22 Apr 2001 15:13:21 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id NAA43834;
	Sun, 22 Apr 2001 13:20:45 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: RE: Target Reset
To: "Dillard, David" <david_dillard@adaptec.com>
Cc: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OFCAEBAA8A.97EC3912-ON88256A36.0068A802@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Sun, 22 Apr 2001 12:20:31 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/22/2001 01:20:44 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


This is at least better.  But I do not have an issue with it being vendor
unique.  This is a shut down and restart of the complete Target, and will
probably be part of the vendors' operator console or their own remote
support functions.  It is not clear that it needs to be a general management
function that works the same on all iSCSI Storage Controllers.

Many of the major Storage Controllers do not support this feature today.

I do not believe that most SNMP implementations are very secure.  Most
folks do not want to have a changeable MIB until they have secure SNMP, and
even though there is a version of SNMP that has security features, this has
not been well supported.

I do NOT think that Target Reset should be in the base iSCSI protocol, but
I think it is reasonable to hold this discussion apart from the base
protocol document, and the question should be asked if this is a general
management function or a vendor specific console or remote support
function.



.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


"Dillard, David" <david_dillard@adaptec.com>@ece.cmu.edu on 04/22/2001
08:13:49 AM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:
Subject:  RE: Target Reset



John,

I understand the danger of issuing a target reset and I agree that it
should not be a part of an initiator's normal error recovery procedure.
However, looking at this from a management perspective I'd like to see a
standardized way of resetting a target.  I don't want to see a variety of
vendor unique methods of resetting targets sprout up.

If resetting a target using the protocol is not desirable from your
perspective would incorporating this feature into the MIB be acceptable?
(MIBs are for management after all)

----------------------------------------------------------------
David Dillard                          david_dillard@adaptec.com
Management Software Group
Adaptec, Inc.                          www.adaptec.com




-----Original Message-----
From: John Hufferd [mailto:hufferd@us.ibm.com]
Sent: Sunday, April 22, 2001 4:59 AM
To: ips@ece.cmu.edu
Subject: Target Reset


I thought we had a number of discussions previously about Target Reset (Warm
or Cold).  I thought there was a general feeling that this command is so
dangerous that it should not be supported by iSCSI.  The long distance
capability of iSCSI makes the risks involved unmanageable.  There should
only be an Admin way to do this.

Some folks have said that we could permit it and have special authorization
etc.  This would probably require a separate section in the spec. to define
the authorization approach, and whatever other security is needed to
prevent this from inappropriately being used.  All for what purpose?  This
cannot be part of error recovery from a normal initiator.  The widespread
effect is too great for that.

I would like to hear from the list about their feeling on this item.



.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com





From owner-ips@ece.cmu.edu  Sun Apr 22 19:44:03 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA14241
	for <ips-archive@odin.ietf.org>; Sun, 22 Apr 2001 19:44:03 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3MLr6A03319
	for ips-outgoing; Sun, 22 Apr 2001 17:53:06 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MLqAA03285
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 17:52:10 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id RAA11196
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 17:44:45 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id PAA37290
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 15:52:09 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: iSCSI:Target Reset
To: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OFFC3A08D4.11A95275-ON88256A36.003018BB@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Sun, 22 Apr 2001 14:51:56 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/22/2001 03:52:08 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

(resend of message with iSCSI in Subject)
I thought we had a number of discussions previously about Target Reset (Warm
or Cold).  I thought there was a general feeling that this command is so
dangerous that it should not be supported by iSCSI.  The long distance
capability of iSCSI makes the risks involved unmanageable.  There should
only be an Admin way to do this.

Some folks have said that we could permit it and have special authorization
etc.  This would probably require a separate section in the spec. to define
the authorization approach, and whatever other security is needed to
prevent this from inappropriately being used.  All for what purpose?  This
cannot be part of error recovery from a normal initiator.  The widespread
effect is too great for that.

I would like to hear from the list about their feeling on this item.



.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com



From owner-ips@ece.cmu.edu  Sun Apr 22 19:46:48 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA14253
	for <ips-archive@odin.ietf.org>; Sun, 22 Apr 2001 19:46:43 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3MLrAj03325
	for ips-outgoing; Sun, 22 Apr 2001 17:53:10 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MLqeA03308
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 17:52:40 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id RAA46736;
	Sun, 22 Apr 2001 17:45:15 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id PAA43996;
	Sun, 22 Apr 2001 15:52:39 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: RE: iSCSI Target Reset
To: "Dillard, David" <david_dillard@adaptec.com>
Cc: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OFCAEBAA8A.97EC3912-ON88256A36.0068A802@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Sun, 22 Apr 2001 14:52:25 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/22/2001 03:52:38 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


This is at least better.  But I do not have an issue with it being vendor
unique.  This is a shut down and restart of the complete Target, and will
probably be part of the vendors' operator console or their own remote
support functions.  It is not clear that it needs to be a general management
function that works the same on all iSCSI Storage Controllers.

Many of the major Storage Controllers do not support this feature today.

I do not believe that most SNMP implementations are very secure.  Most
folks do not want to have a changeable MIB until they have secure SNMP, and
even though there is a version of SNMP that has security features, this has
not been well supported.

I do NOT think that Target Reset should be in the base iSCSI protocol, but
I think it is reasonable to hold this discussion apart from the base
protocol document, and the question should be asked if this is a general
management function or a vendor specific console or remote support
function.



.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


"Dillard, David" <david_dillard@adaptec.com>@ece.cmu.edu on 04/22/2001
08:13:49 AM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:
Subject:  RE: Target Reset



John,

I understand the danger of issuing a target reset and I agree that it
should not be a part of an initiator's normal error recovery procedure.
However, looking at this from a management perspective I'd like to see a
standardized way of resetting a target.  I don't want to see a variety of
vendor unique methods of resetting targets sprout up.

If resetting a target using the protocol is not desirable from your
perspective would incorporating this feature into the MIB be acceptable?
(MIBs are for management after all)

----------------------------------------------------------------
David Dillard                          david_dillard@adaptec.com
Management Software Group
Adaptec, Inc.                          www.adaptec.com




-----Original Message-----
From: John Hufferd [mailto:hufferd@us.ibm.com]
Sent: Sunday, April 22, 2001 4:59 AM
To: ips@ece.cmu.edu
Subject: Target Reset


I thought we had a number of discussions previously about Target Reset (Warm
or Cold).  I thought there was a general feeling that this command is so
dangerous that it should not be supported by iSCSI.  The long distance
capability of iSCSI makes the risks involved unmanageable.  There should
only be an Admin way to do this.

Some folks have said that we could permit it and have special authorization
etc.  This would probably require a separate section in the spec. to define
the authorization approach, and whatever other security is needed to
prevent this from inappropriately being used.  All for what purpose?  This
cannot be part of error recovery from a normal initiator.  The widespread
effect is too great for that.

I would like to hear from the list about their feeling on this item.



.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com





From owner-ips@ece.cmu.edu  Mon Apr 23 00:53:42 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA18085
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 00:53:40 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3N2taM19815
	for ips-outgoing; Sun, 22 Apr 2001 22:55:36 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ietf.org (odin.ietf.org [132.151.1.176])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MEAjA04677
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 10:10:45 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA08746;
	Sun, 22 Apr 2001 10:10:44 -0400 (EDT)
Message-Id: <200104221410.KAA08746@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: ips@ece.cmu.edu
From: Internet-Drafts@ietf.org
Reply-to: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-ietf-ips-fcencapsulation-00.txt
Date: Sun, 22 Apr 2001 10:10:43 -0400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: FC Frame Encapsulation
	Author(s)	: R. Weber, M. Rajagopal, F. Travostino, V. Chau, M. O'Donnell, C. Monia, M. Merhar
	Filename	: draft-ietf-ips-fcencapsulation-00.txt
	Pages		: 11
	Date		: 20-Apr-01
	
This is the ips (IP Storage) working group draft describing the
common encapsulation format for use by any IETF protocol that
encapsulates Fibre Channel (FC) frames. This draft describes a
frame header containing information mandated for encapsulating,
transmitting, and de-encapsulating FC frames.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-fcencapsulation-00.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-fcencapsulation-00.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-fcencapsulation-00.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154240.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-fcencapsulation-00.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-fcencapsulation-00.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154240.I-D@ietf.org>

--OtherAccess--

--NextPart--




From owner-ips@ece.cmu.edu  Mon Apr 23 00:54:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA18109
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 00:54:06 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3N2tit19830
	for ips-outgoing; Sun, 22 Apr 2001 22:55:44 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ietf.org (odin.ietf.org [132.151.1.176])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MEDAA04783
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 10:13:10 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA08804;
	Sun, 22 Apr 2001 10:13:08 -0400 (EDT)
Message-Id: <200104221413.KAA08804@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: ips@ece.cmu.edu
From: Internet-Drafts@ietf.org
Reply-to: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-ietf-ips-ifcp-01.txt
Date: Sun, 22 Apr 2001 10:13:08 -0400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iFCP - A Protocol for Internet Fibre Channel Storage Networking
	Author(s)	: C. Monia
	Filename	: draft-ietf-ips-ifcp-01.txt
	Pages		: 54
	Date		: 20-Apr-01
	
This document specifies iFCP, a gateway-to-gateway protocol for the
implementation of a Fibre Channel fabric in which TCP/IP switching
and routing elements replace Fibre Channel components. The
protocol enables the attachment of existing Fibre Channel storage
products to an IP network by supporting the subset of fabric
services required by such devices.
The encapsulation described in this version of the document is
obsolete.  It will be replaced by an encapsulation format which
will be common to both the iFCP and FCIP protocols.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-ifcp-01.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-ifcp-01.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-ifcp-01.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154315.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-ifcp-01.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-ifcp-01.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154315.I-D@ietf.org>

--OtherAccess--

--NextPart--




From owner-ips@ece.cmu.edu  Mon Apr 23 00:55:27 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA18124
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 00:55:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3N2tdF19820
	for ips-outgoing; Sun, 22 Apr 2001 22:55:39 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ietf.org (odin.ietf.org [132.151.1.176])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MEK2A05045
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 10:20:03 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA08938;
	Sun, 22 Apr 2001 10:20:00 -0400 (EDT)
Message-Id: <200104221420.KAA08938@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: ips@ece.cmu.edu
From: Internet-Drafts@ietf.org
Reply-to: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-ietf-ips-isns-02.txt
Date: Sun, 22 Apr 2001 10:20:00 -0400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iSNS Internet Storage Name Service
	Author(s)	: K. Gibbons
	Filename	: draft-ietf-ips-isns-02.txt
	Pages		: 74
	Date		: 20-Apr-01
	
This document provides a generic framework centering around use of
the iSNS for discovery and management of storage entities in an
enterprise-scale IP storage network.  iSNS is an application that
stores client attributes and monitors the availability and
reachability of storage assets in an integrated IP storage network.
Due to its role as a consolidated information repository, iSNS
provides for more efficient and scalable management of IP storage
assets.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-isns-02.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-isns-02.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-isns-02.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154426.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-isns-02.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-isns-02.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154426.I-D@ietf.org>

--OtherAccess--

--NextPart--




From owner-ips@ece.cmu.edu  Mon Apr 23 00:57:40 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA18135
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 00:57:39 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3N2tkg19833
	for ips-outgoing; Sun, 22 Apr 2001 22:55:46 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ietf.org (odin.ietf.org [132.151.1.176])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MEAtA04692
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 10:10:55 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA08762;
	Sun, 22 Apr 2001 10:10:53 -0400 (EDT)
Message-Id: <200104221410.KAA08762@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: ips@ece.cmu.edu
From: Internet-Drafts@ietf.org
Reply-to: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-ietf-ips-fcovertcpip-02.txt
Date: Sun, 22 Apr 2001 10:10:53 -0400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: Fibre Channel Over TCP/IP (FCIP)
	Author(s)	: M. Rajagopal, R. Bhagwat
	Filename	: draft-ietf-ips-fcovertcpip-02.txt
	Pages		: 26
	Date		: 20-Apr-01
	
Fibre Channel (FC) is a dominant technology used in Storage Area
Networks (SAN). The purpose of this draft is to specify a standard
way of encapsulating FC frames over TCP/IP and to describe mechanisms
that allow islands of FC SANs to be interconnected over IP-based
networks. FC over TCP/IP relies on IP-based network services to
provide the connectivity between the SAN islands over LANs, MANs, or
WANs.  The FC over TCP/IP specification relies upon TCP for
congestion control and management and upon both TCP and FC for data
error and data loss recovery.  FC over TCP/IP treats all classes of
FC frames the same -- as datagrams.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-fcovertcpip-02.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-fcovertcpip-02.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-fcovertcpip-02.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154253.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-fcovertcpip-02.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-fcovertcpip-02.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154253.I-D@ietf.org>

--OtherAccess--

--NextPart--




From owner-ips@ece.cmu.edu  Mon Apr 23 00:58:09 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA18146
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 00:58:08 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3N2tKa19793
	for ips-outgoing; Sun, 22 Apr 2001 22:55:20 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ietf.org (odin.ietf.org [132.151.1.176])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MEIVA05003
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 10:18:31 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA08905;
	Sun, 22 Apr 2001 10:18:29 -0400 (EDT)
Message-Id: <200104221418.KAA08905@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: ips@ece.cmu.edu
From: Internet-Drafts@ietf.org
Reply-to: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-ietf-ips-iscsi-name-disc-01.txt
Date: Sun, 22 Apr 2001 10:18:29 -0400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iSCSI Naming and Discovery Requirements
	Author(s)	: M. Bakke
	Filename	: draft-ietf-ips-iscsi-name-disc-01.txt
	Pages		: 32
	Date		: 20-Apr-01
	
This document describes the iSCSI [7] naming and discovery
requirements. The requirements presented in this document have been
agreed to by the members of the iSCSI naming and discovery team. This
document complements the iSCSI IETF draft.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-name-disc-01.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-iscsi-name-disc-01.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-iscsi-name-disc-01.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154353.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-iscsi-name-disc-01.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-iscsi-name-disc-01.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154353.I-D@ietf.org>

--OtherAccess--

--NextPart--




From owner-ips@ece.cmu.edu  Mon Apr 23 01:02:05 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA18177
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 01:02:04 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3N2tWE19810
	for ips-outgoing; Sun, 22 Apr 2001 22:55:32 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ietf.org (odin.ietf.org [132.151.1.176])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MEAOA04672
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 10:10:24 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA08728;
	Sun, 22 Apr 2001 10:10:22 -0400 (EDT)
Message-Id: <200104221410.KAA08728@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: ips@ece.cmu.edu
From: Internet-Drafts@ietf.org
Reply-to: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-ietf-ips-iscsi-mib-00.txt
Date: Sun, 22 Apr 2001 10:10:21 -0400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: Definitions of Managed Objects for iSCSI
	Author(s)	: M. Bakke
	Filename	: draft-ietf-ips-iscsi-mib-00.txt
	Pages		: 120
	Date		: 20-Apr-01
	
This memo defines a portion of the Management Information Base (MIB)
for use with network management protocols in TCP/IP based internets.
In particular it defines objects for managing a client using the
iSCSI (SCSI over TCP) protocol.  It is meant to match the latest
version of iSCSI defined in [ISCSI].

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-mib-00.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-iscsi-mib-00.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-iscsi-mib-00.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154216.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-iscsi-mib-00.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-iscsi-mib-00.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154216.I-D@ietf.org>

--OtherAccess--

--NextPart--




From owner-ips@ece.cmu.edu  Mon Apr 23 01:02:06 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA18175
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 01:02:01 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3N2tTh19806
	for ips-outgoing; Sun, 22 Apr 2001 22:55:29 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ietf.org (odin.ietf.org [132.151.1.176])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MEDIA04786
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 10:13:18 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA08820;
	Sun, 22 Apr 2001 10:13:16 -0400 (EDT)
Message-Id: <200104221413.KAA08820@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: ips@ece.cmu.edu
From: Internet-Drafts@ietf.org
Reply-to: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-ietf-ips-iscsi-06.txt
Date: Sun, 22 Apr 2001 10:13:15 -0400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iSCSI
	Author(s)	: J. Satran
	Filename	: draft-ietf-ips-iscsi-06.txt
	Pages		: 133
	Date		: 20-Apr-01
	
The Small Computer Systems Interface (SCSI) is a popular family of 
protocols for communicating with I/O devices, especially storage 
devices.  This memo describes a transport protocol for SCSI that 
operates on top of TCP.  The iSCSI protocol aims to be fully 
compliant with the requirements laid out in the SCSI Architecture 
Model - 2 [SAM2] document.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-06.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-iscsi-06.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-iscsi-06.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154329.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-iscsi-06.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-iscsi-06.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154329.I-D@ietf.org>

--OtherAccess--

--NextPart--




From owner-ips@ece.cmu.edu  Mon Apr 23 03:25:59 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA02348
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 03:25:59 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3N2tfk19826
	for ips-outgoing; Sun, 22 Apr 2001 22:55:41 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ietf.org (odin.ietf.org [132.151.1.176])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MEIPA05001
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 10:18:25 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA08889;
	Sun, 22 Apr 2001 10:18:19 -0400 (EDT)
Message-Id: <200104221418.KAA08889@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: ips@ece.cmu.edu
From: Internet-Drafts@ietf.org
Reply-to: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-otis-iscsi-fullack-00.txt
Date: Sun, 22 Apr 2001 10:18:19 -0400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iSCSI Full Acknowledgement
	Author(s)	: D. Otis
	Filename	: draft-otis-iscsi-fullack-00.txt
	Pages		: 7
	Date		: 20-Apr-01
	
This document is illustrative of potential modifications to
the iSCSI protocol proposal (draft-ietf-ips-iscsi-05+.txt).
These changes are to create a means to do the following:

 - Ensure Management response is coherent.

 - Acknowledge ALL requests delivered to the Server.

 - Ensure integrity of the iSCSI request window.

 - Open request window during abnormal events.

 - Quickly eliminate invalidated requests.

 - Quickly expunge sequence holes.

 - Simplify the reception sequencer.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-otis-iscsi-fullack-00.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-otis-iscsi-fullack-00.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-otis-iscsi-fullack-00.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154342.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-otis-iscsi-fullack-00.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-otis-iscsi-fullack-00.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154342.I-D@ietf.org>

--OtherAccess--

--NextPart--




From owner-ips@ece.cmu.edu  Mon Apr 23 03:29:10 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA02370
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 03:29:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3N2tPu19801
	for ips-outgoing; Sun, 22 Apr 2001 22:55:25 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ietf.org (odin.ietf.org [132.151.1.176])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MED3A04778
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 10:13:03 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA08788;
	Sun, 22 Apr 2001 10:13:00 -0400 (EDT)
Message-Id: <200104221413.KAA08788@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: ips@ece.cmu.edu
From: Internet-Drafts@ietf.org
Reply-to: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-ietf-ips-iscsi-slp-00.txt
Date: Sun, 22 Apr 2001 10:13:00 -0400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: Finding iSCSI Targets and Name Servers Using SLP
	Author(s)	: M. Bakke
	Filename	: draft-ietf-ips-iscsi-slp-00.txt
	Pages		: 17
	Date		: 20-Apr-01
	
The iSCSI protocol provides a way for hosts to access SCSI devices
over an IP network.  This document defines the use of the Service
Location Protocol (SLP) by iSCSI hosts, devices, and name services,
along with the SLP service type templates that describe the services
they provide.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-slp-00.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-iscsi-slp-00.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-iscsi-slp-00.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154304.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-iscsi-slp-00.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-iscsi-slp-00.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154304.I-D@ietf.org>

--OtherAccess--

--NextPart--




From owner-ips@ece.cmu.edu  Mon Apr 23 07:42:54 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id HAA03567
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 07:42:54 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3N9XhW14734
	for ips-outgoing; Mon, 23 Apr 2001 05:33:43 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3N9WxA14717
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 05:32:59 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id LAA111772
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 11:32:45 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id LAA146556
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 11:32:44 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A37.00346D18 ; Mon, 23 Apr 2001 11:32:38 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A37.00346CA6.00@d12mta02.de.ibm.com>
Date: Mon, 23 Apr 2001 11:37:49 +0200
Subject: Re: iSCSI:out-of-order notification proposal
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Sandeep,

If you take a look at what is already in the draft (see also 7.3), you
will find that what you want to do can be achieved with what is already
there, even if the exact mechanism is not completely spelled out (the
draft is not an implementation).

Regards,
Julo

Sandeep Joshi <sandeepj@research.bell-labs.com> on 21/04/2001 21:55:14

Please respond to Sandeep Joshi <sandeepj@research.bell-labs.com>

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI:out-of-order notification proposal





Answering two responses in one email.  This would be much
easier if there were a whiteboard around here :-)

> What exactly does it buy you? (that is not already in).

The ability to execute subsequent PDUs in the presence of holes.
(Holes due to multiple connections OR due to skipping onto next marker)

If the hole is due to a REAL ordering issue (e.g. an Ordered task
was lost) then the target will pause for the hole to fill.

But if the hole is due to a simple task, then the target
can continue execution of the new commands it is receiving.

Currently, the target will *always* pause due to cmdSN ordering,
which is somewhat like a primitive microprocessor which won't
reorder its instruction stream, though it's feasible.

>
> And FWIW David (B) raised a question related to the requirements doc.

My mistake, the rephrasing of requirements seemed to match the intent
of previous discussions on why total cmdSN ordering is overkill.
That was the only point I was trying to reinforce.

>
> Julo



"Mallikarjun C." wrote:
>
> Sandeep,
>
> Some comments on your proposal.
>
> - The "strict_order" flag that you mention appears to carry
>   information that's already contained by the SCSI task attributes
>   (ATTR field).

But now the information is also contained in PDUs following it.
(e.g. A simple command PDU may also have strict_order=1)

If an Ordered task PDU is lost, the target *knows* there is a REAL
ordering issue (later PDUs have info) so it must wait for the hole
to be filled.

>
> - This proposal requires the target iSCSI layer to look at more
>   variables (SCSI-level information in that) for making its
>   sequencing decisions.

Only one more bit.. to decide if the command can be passed
to SCSI layer now or later in cmdSN sequence.

To build an efficient SCSI transport, some hints would need
to be passed between layers.

>
> - Simple commands (intended to be) received prior to an Ordered
>   command must be executed before the Ordered one.  If you lose a
>   Simple command, you must wait for it before you act on the Ordered
>   that you received out-of-order.

Good pt.  This, I forgot to mention below.  The target must clear
out all PDUs received with strict_order=0 before commencing
execution of PDUs received with strict_order=1.

>
> This appears more an implementation optimization where desired than
> something we want to get into the draft.  You are proposing that
> even in the absence of errors, iSCSI should make ordering decisions
> based on SCSI task attributes.  I disagree with that.

The optimization needs a bit in the PDU, otherwise I wouldn't
have gone to the trouble of writing this out!  And you would agree
that there are a bunch of TCP optimizations in various drafts.

Lastly, what is the point in building a sync-and-steering layer to
skip ahead in the command stream if you cannot execute that next
command?  With this little more work, you could as well continue
execution.

> --
> Mallikarjun
>
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668 Hewlett-Packard, Roseville.
> cbm@rose.hp.com
>
> >This is a proposal to allow the initiator to inform the target
> >if out-of-order execution within the command stream is possible.
> >
> >The target execution can rotate between "in-cmdSN-order" and
> >"out-of-order" during runtime as informed by the initiator..
> >
> >Appreciate comments on subtleties that I may have missed.
> >
> >thanks,
> >-Sandeep
> >
> >http://ips.pdl.cs.cmu.edu/mail/msg03152.html
> >   by Costa provides a good summary of the issue at hand.
> >
> >http://ips.pdl.cs.cmu.edu/mail/msg04255.html
> >  David provides the "new" reqts.  In particular, this one
> >
> >   > MUST specify the ability to preserve ordered delivery
> >   > of SCSI commands even in the presence of transport
> >   > errors.  A mechanism MUST be provided to allow
> >   > Initiators and Targets to negotiate this preservation
> >   > on a per-session or finer granularity basis
> >
> >Note :
> >======
> >1) This does not rely on SCSI cmdRN, but operates at iSCSI level.
> >2) Flow control using cmdSN works as designed.
> >3) This solution is not a per-session negotiation option but can be
> >   disabled and re-enabled again at "runtime" by the initiator if it
> >   notices that Ordered/HOQ tasks (or any other need) have entered
> >   the iSCSI command stream which is being dispatched.
> >
> >Problem:
> >=========
> >In case of out-of-order arrival or digest errors, it is NOT possible
> >to know if the initiator had sent an ordered command before the one
> >which was received.
> >
> >Solution:
> >=========
> >To notify target of presence of in-flight ordered commands, we set
> >a flag on *every* PDU following the ordered command *until* the target
> >moves its expCmdSN above the cmdSN of the ordered command.  The
> >expCmdSN indicates target has found the ordered command.
> >
> >( Those familiar with "ECN over IP" (Floyd, et al) may see this is
> >similar to how a congestion bit keeps being set until the sender acks
> >that it has received the notification)
> >
> >Figuratively:
> >=============
> >So, assume there was a new "strict_order" flag in the BHS.
> >In the figure below, the parentheses show the values of (cmdSN, strict_order)
> >
> >Initiator                       Target
> >---------                       -------
> >
> >simple cmd (cmdSN=100, strict_order=0) ->
> >simple cmd (101, 0) ->
> >simple cmd (102, 0) ->
> >
> >   these may get reordered or have digest errors ->
> >
> >                        target executes as they arrive
> >                             exec simple cmd(102, 0)
> >                             exec simple cmd(100, 0)
> >                             exec simple cmd(101, 0)
> >
> >Now say the initiator wants to send a HOQ task.
> >It sets strict_order=1 on all PDUs
> >
> >ordered cmd (103, 1) ->
> >simple  cmd (104, 1) ->
> >simple  cmd (105, 1) ->
> >                        in case of reordering or digest errors
> >                        target must wait & execute in cmdSN order!
> >simple  cmd (106, 1) ->
> >simple  cmd (107, 1) ->
> >
> >         <---- now target sends expCmdSN=103
> >
> >This implies target has seen command(cmdSN=103) and target will do the
> >appropriate ordering and delivery to SCSI layer.  This is left to
> >the target implementation to tackle.
> >
> >Initiator checks (expCmdSN >= cmdSN of ordered cmd) and then resets
> >strict_order to zero for all subsequent PDUs.
> >
> >simple  cmd (108, 0) ->
> >simple  cmd (109, 0) ->
> >
> >Other issues:
> >=============
> >If the basic scheme is ok, then we could later tackle other questions
> >"what about multiple ordered commands" and the like..
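
The quoted scheme can be sketched as a small simulation. This is only an
illustration under stated assumptions, not anything taken from the iSCSI
draft: the class names Initiator and Target, their methods, and the
bookkeeping fields are all hypothetical. It also assumes that expCmdSN
names the next cmdSN the target expects, so the initiator clears the flag
once expCmdSN is above the cmdSN of the ordered command.

```python
class Initiator:
    """Hypothetical initiator: tags PDUs with strict_order while an
    ordered command has not yet been acknowledged via expCmdSN."""

    def __init__(self, first_cmd_sn):
        self.next_cmd_sn = first_cmd_sn
        self.last_ordered_sn = None  # cmdSN of the newest unacknowledged ordered command

    def send(self, ordered=False):
        """Return (cmdSN, strict_order) for the next command PDU."""
        cmd_sn = self.next_cmd_sn
        self.next_cmd_sn += 1
        if ordered:
            self.last_ordered_sn = cmd_sn
        strict_order = 1 if self.last_ordered_sn is not None else 0
        return cmd_sn, strict_order

    def on_exp_cmd_sn(self, exp_cmd_sn):
        """Clear the flag once the target has moved past the ordered command."""
        if self.last_ordered_sn is not None and exp_cmd_sn > self.last_ordered_sn:
            self.last_ordered_sn = None


class Target:
    """Hypothetical target: runs strict_order=0 commands on arrival and
    holds strict_order=1 commands until every earlier cmdSN has arrived."""

    def __init__(self, first_cmd_sn):
        self.exp_cmd_sn = first_cmd_sn
        self.received = set()  # cmdSNs at or above exp_cmd_sn that have arrived
        self.held = set()      # strict commands waiting for a hole to fill
        self.executed = []     # order in which commands reached the SCSI layer

    def receive(self, cmd_sn, strict_order):
        """Process one PDU and return the expCmdSN to report back."""
        self.received.add(cmd_sn)
        if strict_order:
            self.held.add(cmd_sn)
        else:
            self.executed.append(cmd_sn)  # no ordering constraint: run now
        # Advance over the contiguous run, releasing held commands in cmdSN order.
        while self.exp_cmd_sn in self.received:
            if self.exp_cmd_sn in self.held:
                self.held.remove(self.exp_cmd_sn)
                self.executed.append(self.exp_cmd_sn)
            self.received.remove(self.exp_cmd_sn)
            self.exp_cmd_sn += 1
        return self.exp_cmd_sn
```

Because a held command is released only when expCmdSN reaches it, every
earlier strict_order=0 command has already been dispatched by then, which
matches the rule that strict_order=0 PDUs are cleared out first. Multiple
in-flight ordered commands are handled here simply by tracking the newest
one; that is a simplification of this sketch, not part of the proposal.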





From owner-ips@ece.cmu.edu  Mon Apr 23 11:02:02 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA05255
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 11:02:01 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NDHgd23860
	for ips-outgoing; Mon, 23 Apr 2001 09:17:42 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NDH7A23842
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 09:17:08 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id E36F09400C
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 09:17:06 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI : Negotiable padding, was More issues.... Digest related. 
In-Reply-To: Message from "Rod Harrison" <rod.harrison@windriver.com> 
   of "Fri, 20 Apr 2001 11:30:14 PDT." <NEBBKMMOEMCINPLCHKGMEEJOCGAA.rod.harrison@windriver.com> 
References: <NEBBKMMOEMCINPLCHKGMEEJOCGAA.rod.harrison@windriver.com> 
Date: Mon, 23 Apr 2001 09:15:30 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010423131706.E36F09400C@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> I have a request with respect to data padding. Can we make
> the pad size login negotiable please? Preferably on a per
> direction basis. This would allow the pad to be optimized
> for a receiver's specific requirements, e.g. cache line
> alignment.
> Restricting padding to powers of 2 by specifying
> the size as a power of 2 seems reasonable.

This is exactly what we did with SST, and it scratched the itch well.
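As a rough sketch of what a power-of-2 pad negotiation implies for the
sender, assuming the negotiated value is the exponent (the name pad_exp
and the function pad_len are made up for illustration; no such login key
is defined here):

```python
def pad_len(length: int, pad_exp: int) -> int:
    """Bytes of padding needed to round length up to a 2**pad_exp boundary."""
    align = 1 << pad_exp
    # Masking the negated length gives the distance to the next multiple.
    return (-length) & (align - 1)


# Per-direction exponents, as requested in the quoted message (values are
# arbitrary examples: 4-byte alignment one way, 64-byte the other).
pad_exp_by_direction = {"initiator_to_target": 2, "target_to_initiator": 6}
```

With pad_exp = 2 a 13-byte segment gets 3 pad bytes; with pad_exp = 6 a
100-byte segment gets 28.  A segment already on the boundary gets none.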

Steph


From owner-ips@ece.cmu.edu  Mon Apr 23 11:03:18 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA05267
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 11:03:17 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NDHkn23865
	for ips-outgoing; Mon, 23 Apr 2001 09:17:46 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NDH7A23843
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 09:17:08 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP
	id 3634D9400D; Mon, 23 Apr 2001 09:17:07 -0400 (EDT)
To: ips@ece.cmu.edu, tsvwg@ietf.org
Cc: L.Wood@surrey.ac.uk
Subject: Re: [Tsvwg] [SCTP checksum problems] 
In-Reply-To: Message from Lloyd Wood <L.Wood@surrey.ac.uk> 
   of "Thu, 19 Apr 2001 12:29:27 BST." <Pine.GSO.4.21.0104191223250.9077-100000@regan.ee.surrey.ac.uk> 
References: <Pine.GSO.4.21.0104191223250.9077-100000@regan.ee.surrey.ac.uk> 
Date: Mon, 23 Apr 2001 09:15:30 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010423131707.3634D9400D@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Lloyd,

Thanks for your input.

> Reality check, please.

Is your point that no protocol should ever assume that there's an
end-to-end integrity check operating at some layer below it?

The point of the iSCSI integrity check is not to assume that
underlying layers (the transport and/or security & integrity) have
abdicated responsibility for end-to-end integrity.  It is a
trust-but-verify model.  In a suitably trustworthy [or suitably devious]
environment you're wasting your time verifying.

Steph


From owner-ips@ece.cmu.edu  Mon Apr 23 11:03:56 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA05278
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 11:03:50 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NCEfo20670
	for ips-outgoing; Mon, 23 Apr 2001 08:14:41 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ietf.org (odin.ietf.org [132.151.1.176])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3MEAOA04672
	for <ips@ece.cmu.edu>; Sun, 22 Apr 2001 10:10:24 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA08728;
	Sun, 22 Apr 2001 10:10:22 -0400 (EDT)
Message-Id: <200104221410.KAA08728@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: ips@ece.cmu.edu
From: Internet-Drafts@ietf.org
Reply-to: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-ietf-ips-iscsi-mib-00.txt
Date: Sun, 22 Apr 2001 10:10:21 -0400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: Definitions of Managed Objects for iSCSI
	Author(s)	: M. Bakke
	Filename	: draft-ietf-ips-iscsi-mib-00.txt
	Pages		: 120
	Date		: 20-Apr-01
	
This memo defines a portion of the Management Information Base (MIB)
for use with network management protocols in TCP/IP based internets.
In particular it defines objects for managing a client using the
iSCSI (SCSI over TCP) protocol.  It is meant to match the latest
version of iSCSI defined in [ISCSI].

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-mib-00.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-iscsi-mib-00.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-iscsi-mib-00.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154216.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-iscsi-mib-00.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-iscsi-mib-00.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154216.I-D@ietf.org>

--OtherAccess--

--NextPart--




From owner-ips@ece.cmu.edu  Mon Apr 23 11:04:45 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA05290
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 11:04:45 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NDKhk24046
	for ips-outgoing; Mon, 23 Apr 2001 09:20:43 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NDK3A23991
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 09:20:03 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id PAA41252
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 15:19:44 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id PAA45084
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 15:19:44 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A37.00493736 ; Mon, 23 Apr 2001 15:19:42 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A37.0049365B.00@d12mta02.de.ibm.com>
Date: Mon, 23 Apr 2001 13:36:07 +0300
Subject: Re: iSCSI : login keys & mode page settings
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Comment in text

Julo

Santosh Rao <santoshr@cup.hp.com> on 21/04/2001 02:42:25

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI : login keys & mode page settings




julian_satran@il.ibm.com wrote:

> 3) However, having allowed 2 mechanisms to set negotiation elements,
> iSCSI MUST
> then comment on the need to synchronize their settings in the 2 layers
> and also comment on the need to trigger a UNIT ATTENTION when changed
> through the login key mechanism.
> Again, I would vote for only 1 mechanism for setting these control
> options, rather than have to define communication schemes b/n the ULP
> and LLP to keep their values in synch and generate UNIT ATTENTION.
>
> +++ Parameter changes originate from SCSI and iSCSI only enables another
> mechanism to convey them.
> This is an implementation issue +++

Julian,

From reading the spec, it is difficult to arrive at the above conclusion
that only 1 layer is currently allowed to make the changes, albeit
through 2 different mechanisms. If the changes are intended to always
originate from the SCSI layer (and I'm not sure why this should be the
case), would it not be more apt for the SCSI ULP to use a mode select,
which is the mechanism available at this layer, rather than invent
down-calls into the iSCSI layer to map to an equivalent login/text key
setting?

Again, I would like to request that the setting of these control options
be made from 1 layer only.

> 4)
> > If such a level of dual control is provided, the iSCSI login
> > keys listed above be made LO (leading only) to allow for changes to
> > operational parameters only during session login. This is to
> > minimize/eliminate disruption of ongoing I/O activity that occurs due to
> > the generation of a UNIT ATTENTION CHECK CONDITION when any change is
> > made to the above parameters.
>
> Are we in agreement on the above ?
>
> +++ No +++

Could you please throw light on the basis for LO classification? If
some key negotiation after a login causes disruption of all outstanding
I/O (keeping in mind its impact on tape-type devices), shouldn't such
a key be made LO?

IMHO, any negotiation key change that can disrupt I/O must be restricted
to login time negotiation only. (LO). This would prevent potential iSCSI
windows for disruption of tape I/O.

+++ LO stands for Leading Only. As for the general principle, that is
motherhood, and I agree that I/O should not be disrupted.
The exact mechanisms to achieve this are a more complicated issue.
Since several logins are possible in a session, even with one
connection, I thought that I had placed whatever can be disruptive in
the LO category (stronger than login-only).
If there are more, we can discuss each case. +++

> 5)
> > If these operational parameters are allowed to be set through iSCSI
> > login and they also impact mode page settings, iSCSI spec should
> > describe the scope of the mode page setting in terms of whether this
> > setting is a saved page setting or not ?
> >
> +++ I don't know - I would rather think not +++
> 6)
> > Should saved page settings be allowed thru iSCSI ?
> +++ I don't know - I would rather think not +++

I agree with the above. Perhaps, the draft could explicitly state the
same. (Then again, setting these options through 1 mechanism alone would
solve this issue.)

+++ let's hear other opinions too +++

- Santosh





From owner-ips@ece.cmu.edu  Mon Apr 23 11:06:49 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA05302
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 11:06:48 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NDHn723868
	for ips-outgoing; Mon, 23 Apr 2001 09:17:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NDH8A23844
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 09:17:08 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 6719E9400E
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 09:17:07 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: iSCSI: Re: iSCSI & Linked Commands 
In-Reply-To: Message from julian_satran@il.ibm.com 
   of "Thu, 19 Apr 2001 11:18:13 +0200." <C1256A33.0032A8CC.00@d12mta02.de.ibm.com> 
References: <C1256A33.0032A8CC.00@d12mta02.de.ibm.com> 
Date: Mon, 23 Apr 2001 09:15:30 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010423131707.6719E9400E@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian,

> According to your logic no FCP implementation can use linked commands?
> Is this true for all OS's?  Is it a verified fact or folklore?

In my experience it's fact.  I have never used a SCSI stack which both
supported AND used linked commands.  Like some others here, I always
assumed AIX might :^) Ralph has pointed out that T10 is well aware
that the feature is not popular.  There are other ways of
accomplishing the same thing that are less likely to blow up in your
face.

> Is it so also for the new MS StorPort driver?

I don't know, but I'd be really surprised if they did use linked
commands.  You have to be pretty nuts to rely on a feature that's not
even exercised by most SCSI implementations.

Steph


From owner-ips@ece.cmu.edu  Mon Apr 23 13:12:06 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA07106
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 13:12:05 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NEHk227311
	for ips-outgoing; Mon, 23 Apr 2001 10:17:46 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from opus.ece.cmu.edu (root@OPUS.ECE.CMU.EDU [128.2.134.91])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NEGiA27279
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 10:16:44 -0400 (EDT)
Received: from opus.ece.cmu.edu (bassoon@localhost [127.0.0.1])
	by opus.ece.cmu.edu (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id KAA11750
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 10:16:43 -0400
Message-Id: <200104231416.KAA11750@opus.ece.cmu.edu>
To: ips@ece.cmu.edu
Subject: owner-ips@ece.cmu.edu: BOUNCE ips@ece.cmu.edu: Non-member submission from [Internet-Drafts@ietf.org]
Date: Mon, 23 Apr 2001 10:16:43 -0400
From: Dave Nagle <bassoon@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


- --NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iSNS Internet Storage Name Service
	Author(s)	: K. Gibbons
	Filename	: draft-ietf-ips-isns-02.txt
	Pages		: 74
	Date		: 20-Apr-01
	
This document provides a generic framework centering around use of
the iSNS for discovery and management of storage entities in an
enterprise-scale IP storage network.  iSNS is an application that
stores client attributes and monitors the availability and
reachability of storage assets in an integrated IP storage network.
Due to its role as a consolidated information repository, iSNS
provides for more efficient and scalable management of IP storage
assets.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-isns-02.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-isns-02.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-isns-02.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

- --NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

- --OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154426.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-isns-02.txt

- --OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-isns-02.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154426.I-D@ietf.org>

- --OtherAccess--

- --NextPart--



------- End of Forwarded Message



From owner-ips@ece.cmu.edu  Mon Apr 23 13:14:39 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA07149
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 13:14:38 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NFqt602587
	for ips-outgoing; Mon, 23 Apr 2001 11:52:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from maho3msx2.isus.emc.com (maho3msx2.isus.emc.com [128.221.11.32])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NFpuA02542
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 11:51:56 -0400 (EDT)
Received: by maho3msx2.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <28S7WWZ0>; Mon, 23 Apr 2001 11:51:46 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015494@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: Proof that CRCs are not secure
Date: Mon, 23 Apr 2001 11:51:38 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

A while back, there was some interest in using a CRC
as a secure integrity checksum (e.g., by adding it
to IPSec).  This was rejected on principle as being
insecure.  I recently stumbled across a practical
demonstration of this insecurity.

The WEP protocol for 802.11b wireless security made
the mistake of using a CRC as its integrity checksum,
and as a result can be attacked by tampering with
data in flight - the fact that the data was tampered
with is undetectable even when everything is encrypted
because it's too easy to figure out which bits have
to be flipped in the CRC to hide the tampering.  This
is described in the fourth paragraph of the "Problems"
section at:

http://www.isaac.cs.berkeley.edu/isaac/wep-faq.html

<SECURITY-EXPERT-COMMENT>
Lest I get jumped on by the security experts, a
contributing factor to this WEP vulnerability is
the use of a stream cipher (RC4) rather than a
block cipher (e.g., DES, AES) - i.e., this simple
bit-flipping attack won't work against a good block
cipher, but such an approach requires confidentiality
(i.e., encryption) to obtain cryptographic integrity.
The usual design approach is to use something like
a keyed HMAC (cf. RFC 2104) to support cryptographic
integrity without requiring confidentiality.
</SECURITY-EXPERT-COMMENT>
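
For anyone who wants to see the bit-flipping for themselves, here is a
small sketch.  It is not WEP: a random keystream stands in for RC4, the
framing (message followed by a 4-byte little-endian CRC-32) is an
assumption, and all function names are made up.  It relies only on the
fact that CRC-32 is affine over XOR for equal-length inputs, so the
attacker can patch the encrypted checksum without knowing the key.  A
keyed HMAC tag over the same data is shown for contrast.

```python
import hashlib
import hmac
import os
import zlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def seal(msg: bytes, keystream: bytes) -> bytes:
    """Append CRC-32 of msg, then XOR everything with the keystream."""
    icv = zlib.crc32(msg).to_bytes(4, "little")
    return xor_bytes(msg + icv, keystream)


def unseal(ct: bytes, keystream: bytes):
    """Decrypt and report whether the CRC-32 check passes."""
    pt = xor_bytes(ct, keystream)
    msg, icv = pt[:-4], pt[-4:]
    return msg, zlib.crc32(msg).to_bytes(4, "little") == icv


def tamper(ct: bytes, delta: bytes) -> bytes:
    """Flip the bits in delta and fix up the CRC, using no key material.

    For equal-length inputs crc(m ^ d) == crc(m) ^ crc(d) ^ crc(zeros),
    so the checksum correction depends only on delta.
    """
    zeros = bytes(len(delta))
    icv_fix = (zlib.crc32(delta) ^ zlib.crc32(zeros)).to_bytes(4, "little")
    return xor_bytes(ct, delta + icv_fix)


def mac_tag(key: bytes, data: bytes) -> bytes:
    """Keyed integrity tag; HMAC-SHA256 is chosen here only as an example."""
    return hmac.new(key, data, hashlib.sha256).digest()
```

Sealing a message, XORing in the difference between the original and the
desired text via tamper(), and unsealing yields the altered message with
a passing CRC.  The same alteration changes the HMAC tag, and the
attacker cannot recompute it without the key.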

This doesn't change anything the WG is currently doing -
there is no active proposal to use a CRC for cryptographic
integrity, and this does not affect the use of CRCs
in header digests, data digests, and the like.  This
is just an FYI based on past discussion of this topic.

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Mon Apr 23 13:15:33 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA07186
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 13:15:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NFJqf00594
	for ips-outgoing; Mon, 23 Apr 2001 11:19:52 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NFJiA00578
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 11:19:44 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT3WYNZ>; Mon, 23 Apr 2001 11:21:11 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A0708015493@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: hufferd@us.ibm.com, ips@ece.cmu.edu
Subject: RE: iSCSI:Target Reset
Date: Mon, 23 Apr 2001 11:19:29 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

With my co-chair hat off, my inclination would be to
specify it (since it's in SAM) but include words about
the risks, make support for it OPTIONAL, and point out
that implementations may want to have access controls
on which initiators are permitted to do this.  I'm 
assuming that at least two implementations will do
this so that we don't get into the issue of potentially
needing to remove this in order to go from Proposed
Standard to Draft Standard in the future.

--David

> -----Original Message-----
> From:	John Hufferd [SMTP:hufferd@us.ibm.com]
> Sent:	Sunday, April 22, 2001 5:52 PM
> To:	ips@ece.cmu.edu
> Subject:	iSCSI:Target Reset
> 
> (resend of message with iSCSI in Subject)
> I thought we had a number of discussions previously about Target Reset
> (Warm or Cold).  I thought there was a general feeling that this command
> is so dangerous that it should not be supported by iSCSI.  The long
> distance capability of iSCSI makes the risks involved unmanageable.
> There should only be an Admin way to do this.
> 
> Some folks have said that we could permit it and have special
> authorization, etc.  This would probably require a separate section in
> the spec to define the authorization approach, and whatever other
> security is needed to prevent this from being used inappropriately.
> All for what purpose?  This cannot be part of error recovery from a
> normal initiator.  The widespread effect is too great for that.
> 
> I would like to hear from the list about their feeling on this item.
> 
> 
> 
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com


From owner-ips@ece.cmu.edu  Mon Apr 23 13:16:45 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA07322
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 13:16:44 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NFJn900587
	for ips-outgoing; Mon, 23 Apr 2001 11:19:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c017.sfo.cp.net (c017-h020.c017.sfo.cp.net [209.228.12.234])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3NFJ6A00557
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 11:19:06 -0400 (EDT)
Received: (cpmta 15833 invoked from network); 23 Apr 2001 08:18:59 -0700
Received: from sangate-GW.ser.netvision.net.il (HELO sangate.com) (212.143.114.146)
  by smtp.sangate.com (209.228.12.234) with SMTP; 23 Apr 2001 08:18:59 -0700
X-Sent: 23 Apr 2001 15:18:59 GMT
Message-ID: <3AE43297.520061E5@sangate.com>
Date: Mon, 23 Apr 2001 16:48:07 +0300
From: Mark Mokryn <mark@sangate.com>
Organization: SANgate Systems
X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.2.16-22 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI: Re: iSCSI & Linked Commands
References: <C1256A33.0032A8CC.00@d12mta02.de.ibm.com> <20010423131707.6719E9400E@sandmail.sandburst.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Stephen Bailey wrote:
> 
> Julian,
> 
> > According to your logic no FCP implementation can use linked commands?
> > Is this true for all OS's?  Is it a verified fact or folklore?
> 
> In my experience it's fact.  I have never used a SCSI stack which both
> supported AND used linked commands.  Like some others here, I always
> assumed AIX might :^) Ralph has pointed out that T10 is well aware
> that the feature is not popular.  There are other ways of
> accomplishing the same thing that are less likely to blow up in your
> face.

According to the Shark SCSI spec, from the inquiry data specifically for
an AIX host, linked commands are not supported, both for SCSI and FCP.

-mark


From owner-ips@ece.cmu.edu  Mon Apr 23 13:18:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA07390
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 13:18:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NFM4l00808
	for ips-outgoing; Mon, 23 Apr 2001 11:22:04 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mail.brocade.com (asbestos.brocade.com [63.121.140.244])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NEl0A28880
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 10:47:00 -0400 (EDT)
Received: from thor.brocade.com (thor [192.168.126.45])
	by mail.brocade.com (8.8.8+Sun/8.8.8) with ESMTP id HAA03941;
	Mon, 23 Apr 2001 07:46:50 -0700 (PDT)
Received: by thor.brocade.com with Internet Mail Service (5.5.2653.19)
	id <FXLA0P0Z>; Mon, 23 Apr 2001 07:46:50 -0700
Message-ID: <FFD40DB4943CD411876500508BAD0279026B20EF@sj5-ex2.brocade.com>
From: Robert Snively <rsnively@Brocade.COM>
To: julian_satran@il.ibm.com, ips@ece.cmu.edu
Subject: RE: iSCSI & Linked Commands
Date: Mon, 23 Apr 2001 07:46:48 -0700
X-Mailer: Internet Mail Service (5.5.2653.19)
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Folks,

There are no implementation problems with linked commands other than
the fact that nobody has really found a good use for them outside
a few specialized environments.  It is for that reason that they
are relatively rarely called upon, but strongly defended by their
few users.

This thread has created a significant amount of confusion.

First:

	A task includes all commands executed in a single
	link.  It has the same Exchange identifier for all 
	steps in executing a command.  That takes care of the
	concerns expressed by Santosh.

Second:

	A set of linked commands is executed in order, by
	definition, but their ordering with respect to other
	tasks from the same or other initiators is not defined
	except through the ordinary command ordering stuff.
	The task ordering applies to the entire task, although
	this is not as explicitly defined in SAM as would be
	desirable.  This eliminates them as being tremendously
	useful in ordered execution.

Third:

	I am not aware of any usage of commands using relative
	addressing.  They are denigrated in all profiles.  They
	are a historic anomaly dating back to when SCSI was thought
	of as a total replacement for the IBM OEM channel.
	Relative addressing is defined for disk devices only,
	not sequential devices.  Sequential devices use the 
	more traditional space, skip, and locate commands.
	This eliminates relative addressing for linked commands as
	being useful to sequential devices.

Fourth:

	Stale PDUs are a fact of life in any stream of data that
	is not acknowledged and sequenced on a PDU by PDU basis.
	Both Fibre Channel and iSCSI have to deal with that.
	Fibre Channel has a function called "recovery qualifier"
	that automatically discards stale PDUs.  It has an additional
	function called R_A_TOV which guarantees that stale PDUs
	will not appear later than a certain time.  With those two
	mechanisms, stale PDUs become a non-problem.  In FCP,
	an additional mechanism is provided to handle the stale
	PDUs created by Abort Task actions.

	One way or another, iSCSI will have to deal with the same
	problem.
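
One way to picture the combination of "discard by identifier" and "bounded
lifetime" described above is the sketch below.  It is a loose analogy
only, not the Fibre Channel recovery qualifier or R_A_TOV as specified:
the class name StaleTagFilter, the single-tag granularity, and the
hold_seconds parameter are all assumptions made for illustration.

```python
import time


class StaleTagFilter:
    """Drop PDUs for aborted task tags until a hold time has passed."""

    def __init__(self, hold_seconds: float, clock=time.monotonic) -> None:
        self.hold = hold_seconds
        self.clock = clock
        self.quarantined = {}  # task tag -> time at which the tag is usable again

    def abort(self, tag: int) -> None:
        """Start discarding PDUs that carry this tag."""
        self.quarantined[tag] = self.clock() + self.hold

    def accept(self, tag: int) -> bool:
        """Return False for a PDU that must be treated as stale."""
        expiry = self.quarantined.get(tag)
        if expiry is None:
            return True
        if self.clock() >= expiry:
            # Hold time elapsed: no stale PDU can still be in flight.
            del self.quarantined[tag]
            return True
        return False
```

The hold time plays the role of the upper bound on how long a stale PDU
can survive in the network; once it has elapsed the tag can be reused
safely.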

Notes of possible interest:

Command linking is an anachronism.  It has been used by a few
devices that think of themselves as having IBM OEM channel-like
characteristics, where a list of commands is the normal I/O
execution unit.  It was also used by Sun Microsystems in early
SCSI implementations that used software selection algorithms that
were so high in overhead that command linking provided improved
performance.  Aside from those cases that have been listed here
(AS400 and some Unisys systems), command linking is not in modern
use as far as I know.  Command queuing and buffering have provided
performance improvements that are so notable that command linking
has fallen aside.


From owner-ips@ece.cmu.edu  Mon Apr 23 14:52:52 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA10273
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 14:52:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NGcnE05281
	for ips-outgoing; Mon, 23 Apr 2001 12:38:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NGcDA05254
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 12:38:13 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3NHk3129237;
	Mon, 23 Apr 2001 10:46:03 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Stephen Bailey" <steph@cs.uchicago.edu>, <ips@ece.cmu.edu>
Subject: RE: iSCSI: Re: iSCSI & Linked Commands 
Date: Mon, 23 Apr 2001 09:36:21 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJOELACGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <20010423131707.6719E9400E@sandmail.sandburst.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Stephen,

Unlike random access devices, sequential access devices operate with
relative addressing.  For random access devices, this is a seldom used
option.  There is a requirement to bind commands together to ensure order of
execution on these devices.  By popular, you mean not sequential?

Doug


> Julian,
>
> > According to your logic no FCP implementation can use linked commands?
> > Is this true for all OS's?  Is it a verified fact or folklore?
>
> In my experience it's fact.  I have never used a SCSI stack which both
> supported AND used linked commands.  Like some others here, I always
> assumed AIX might :^) Ralph has pointed out that T10 is well aware
> that the feature is not popular.  There are other ways of
> accomplishing the same thing that are less likely to blow up in your
> face.
>
> > Is it so also for the new MS StorPort driver?
>
> I don't know, but I'd be really surprised if they did use linked
> commands.  You have to be pretty nuts to rely on a feature that's not
> even exercised by most SCSI implementations.
>
> Steph
>



From owner-ips@ece.cmu.edu  Mon Apr 23 14:54:54 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA10335
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 14:54:52 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NGiwh05592
	for ips-outgoing; Mon, 23 Apr 2001 12:44:58 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from hotmail.com (oe71.law11.hotmail.com [64.4.16.206])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NGiCA05568
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 12:44:12 -0400 (EDT)
Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC;
	 Mon, 23 Apr 2001 09:44:05 -0700
X-Originating-IP: [66.31.72.237]
From: "Eddy Quicksall" <ESQuicksall@hotmail.com>
To: "Dillard, David" <david_dillard@adaptec.com>,
        "John Hufferd" <hufferd@us.ibm.com>
Cc: <ips@ece.cmu.edu>
References: <OFCAEBAA8A.97EC3912-ON88256A36.0068A802@LocalDomain>
Subject: Re: Target Reset
Date: Mon, 23 Apr 2001 12:44:05 -0400
MIME-Version: 1.0
Content-Type: text/plain;	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.3018.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.3018.1300
Message-ID: <OE713E1ioWwr2UAaTzv0000246d@hotmail.com>
X-OriginalArrivalTime: 23 Apr 2001 16:44:05.0968 (UTC) FILETIME=[9C46ED00:01C0CC14]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

I am wondering how clustering will work on NT without some sort of reset.

On parallel SCSI, NT will issue a SCSI BUS RESET to break reservations
during a challenge for the quorum drive.

On iSCSI, there is no full equivalent to the SCSI BUS RESET so I would
assume the NT driver would have to issue a TARGET RESET to each target that
it is supporting.

How would you propose this would be done without a TARGET RESET?

Eddy

----- Original Message -----
From: "John Hufferd" <hufferd@us.ibm.com>
To: "Dillard, David" <david_dillard@adaptec.com>
Cc: <ips@ece.cmu.edu>
Sent: Sunday, April 22, 2001 3:20 PM
Subject: RE: Target Reset


>
> This is at least better.  But I do not have the issue of it being vendor
> unique.  This is a shut down and restart of the complete Target, and will
> probably be part of the vendors' operator console or their own remote
> support functions.  It is not clear that it needs to be a general
> management function that works the same on all iSCSI Storage Controllers.
>
> Many of the major Storage Controllers do not support this feature today.
>
> I do not believe that most SNMP implementations are very secure.  Most
> folks do not want to have a changeable MIB until they have secure SNMP,
> and even though there is a version of SNMP that has security features,
> this has not been well supported.
>
> I do NOT think that Target Reset should be in the base iSCSI protocol, but
> I think it is reasonable to hold this discussion apart from the base
> protocol document, and the question should be asked if this is a general
> management function or a vendor specific console or remote support
> function.
>
>
>
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
>
>
> "Dillard, David" <david_dillard@adaptec.com>@ece.cmu.edu on 04/22/2001
> 08:13:49 AM
>
> Sent by:  owner-ips@ece.cmu.edu
>
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  RE: Target Reset
>
>
>
> John,
>
> I understand the danger of issuing a target reset and I agree that it
> should not be a part of an initiator's normal error recovery procedure.
> However, looking at this from a management perspective I'd like to see a
> standardized way of resetting a target.  I don't want to see a variety of
> vendor unique methods of resetting targets sprout up.
>
> If resetting a target using the protocol is not desirable from your
> perspective would incorporating this feature into the MIB be acceptable?
> (MIBs are for management after all)
>
> ----------------------------------------------------------------
> David Dillard                          david_dillard@adaptec.com
> Management Software Group
> Adaptec, Inc.                          www.adaptec.com
>
>
>
>
> -----Original Message-----
> From: John Hufferd [mailto:hufferd@us.ibm.com]
> Sent: Sunday, April 22, 2001 4:59 AM
> To: ips@ece.cmu.edu
> Subject: Target Reset
>
>
> I thought we had a number of discussions previously about Target Reset
> (Warm or Cold).  I thought there was general feeling that this command is
> so dangerous that it should not be supported by iSCSI.  The long distance
> capability of iSCSI makes the risks involved unmanageable.  There should
> only be an Admin way to do this.
>
> Some folks have said that we could permit it and have special
> authorization etc.  This would probably cause a separate section in the
> spec. to define the authorization approach, and whatever other security is
> needed to prevent this from inappropriately being used.  All for what
> purpose?  This cannot be part of error recovery from a normal initiator.
> The widespread effect is too great for that.
>
> I would like to hear from the list about their feeling on this item.
>
>
>
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
>
>
>
>


From owner-ips@ece.cmu.edu  Mon Apr 23 14:57:47 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA10392
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 14:57:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NGZvY05119
	for ips-outgoing; Mon, 23 Apr 2001 12:35:57 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel2.hp.com (atlrel2.hp.com [156.153.255.202])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NGZSA05097
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 12:35:28 -0400 (EDT)
Received: from xatlrelay1.atl.hp.com (xatlrelay1.atl.hp.com [15.45.89.190])
	by atlrel2.hp.com (Postfix) with ESMTP id 4E4A51941
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 12:35:28 -0400 (EDT)
Received: from xatlbh2.atl.hp.com (xatlbh2.atl.hp.com [15.45.89.187])
	by xatlrelay1.atl.hp.com (Postfix) with ESMTP id 686AD1F509
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 12:33:30 -0400 (EDT)
Received: by xatlbh2.atl.hp.com with Internet Mail Service (5.5.2653.19)
	id <JKNP1VH1>; Mon, 23 Apr 2001 12:35:14 -0400
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A08FF0@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "Ips Reflector (E-mail)" <ips@ece.cmu.edu>
Subject: FW: I-D ACTION:draft-ietf-ips-iscsi-reqmts-03.txt
Date: Mon, 23 Apr 2001 12:35:09 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: multipart/mixed;
	boundary="----_=_NextPart_000_01C0CC13.5C48E560"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.

------_=_NextPart_000_01C0CC13.5C48E560
Content-Type: text/plain;
	charset="iso-8859-1"

I haven't seen the I-D ACTION for the iSCSI Requirements Document, but it is
posted on the IETF web site, so I'm forwarding this link.  

Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com 

> -----Original Message-----
> Subject: I-D ACTION:draft-ietf-ips-iscsi-reqmts-03.txt
> 
> 
> A New Internet-Draft is available from the on-line 
> Internet-Drafts directories.
> This draft is a work item of the IP Storage Working Group of the IETF.
> 
> 	Title		: iSCSI Requirements and Design Considerations
> 	Author(s)	: M. Krueger et al.
> 	Filename	: draft-ietf-ips-iscsi-reqmts-03.txt
> 	Pages		: 22
> 	Date		: 16-Apr-01
> 	
> The IP Storage Working group is chartered with developing 
> comprehensive 
> technology to transport block storage data over IP protocols. 
>  This effort includes a protocol to transport the Small 
> Computer Systems Interface (SCSI) protocol over the internet (iSCSI).
> 
> A URL for this Internet-Draft is:
> http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-reqmts-03.txt
> 
> Internet-Drafts are also available by anonymous FTP. Login 
> with the username
> "anonymous" and a password of your e-mail address. After logging in,
> type "cd internet-drafts" and then
> 	"get draft-ietf-ips-iscsi-reqmts-03.txt".
> 
> A list of Internet-Drafts directories can be found in
> http://www.ietf.org/shadow.html 
> or ftp://ftp.ietf.org/ietf/1shadow-sites.txt
> 
> 
> Internet-Drafts can also be obtained by e-mail.
> 
> Send a message to:
> 	mailserv@ietf.org.
> In the body type:
> 	"FILE /internet-drafts/draft-ietf-ips-iscsi-reqmts-03.txt".
> 	
> NOTE:	The mail server at ietf.org can return the document in
> 	MIME-encoded form by using the "mpack" utility.  To use this
> 	feature, insert the command "ENCODING mime" before the "FILE"
> 	command.  To decode the response(s), you will need "munpack" or
> 	a MIME-compliant mail reader.  Different MIME-compliant 
> mail readers
> 	exhibit different behavior, especially when dealing with
> 	"multipart" MIME messages (i.e. documents which have been split
> 	up into multiple messages), so check your local documentation on
> 	how to manipulate these messages.
> 		
> 		
> Below is the data which will enable a MIME compliant mail reader
> implementation to automatically retrieve the ASCII version of the
> Internet-Draft.
> 


------_=_NextPart_000_01C0CC13.5C48E560
Content-Type: message/rfc822

To: 
Subject: 
Date: Wed, 11 Apr 2001 12:49:40 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: multipart/mixed;
	boundary="----_=_NextPart_002_01C0CC13.5C48E560"


------_=_NextPart_002_01C0CC13.5C48E560
Content-Type: text/plain



------_=_NextPart_002_01C0CC13.5C48E560
Content-Type: application/octet-stream;
	name="ATT10464"
Content-Disposition: attachment;
	filename="ATT10464"

Content-type: message/external-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010410135017.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-iscsi-reqmts-02.txt

------_=_NextPart_002_01C0CC13.5C48E560
Content-Type: message/external-body;
	site="internet-drafts";
	dir="draft-ietf-ips-iscsi-reqmts-02.txt";
	mode="ftp.ietf.org";
	access-type="anon-ftp"


------_=_NextPart_002_01C0CC13.5C48E560--

------_=_NextPart_000_01C0CC13.5C48E560--


From owner-ips@ece.cmu.edu  Mon Apr 23 15:02:05 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA10554
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 15:02:04 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NGpto06019
	for ips-outgoing; Mon, 23 Apr 2001 12:51:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from msgbas1.cos.agilent.com (msgbas1x.cos.agilent.com [192.6.9.33])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NGp3A05961
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 12:51:03 -0400 (EDT)
Received: from msgrel1.cos.agilent.com (msgrel1.cos.agilent.com [130.29.152.77])
	by msgbas1.cos.agilent.com (Postfix) with ESMTP
	id 5AFE911A5; Mon, 23 Apr 2001 10:51:02 -0600 (MDT)
Received: from axcsbh4.cos.agilent.com (axcsbh4.cos.agilent.com [130.29.152.145])
	by msgrel1.cos.agilent.com (Postfix) with SMTP
	id 03AF2E6; Mon, 23 Apr 2001 10:51:02 -0600 (MDT)
Received: from 130.29.152.145 by axcsbh4.cos.agilent.com (InterScan E-Mail VirusWall NT); Mon, 23 Apr 2001 10:51:01 -0600 (Mountain Daylight Time)
Received: by axcsbh4.cos.agilent.com with Internet Mail Service (5.5.2653.19)
	id <JHNKVL6A>; Mon, 23 Apr 2001 10:51:01 -0600
Message-ID: <FEEBE78C8360D411ACFD00D0B74779719A8861@xsj02.sjs.agilent.com>
From: vince_cavanna@agilent.com
To: Black_David@emc.com, ips@ece.cmu.edu
Cc: vince_cavanna@agilent.com
Subject: RE: Proof that CRCs are not secure
Date: Mon, 23 Apr 2001 10:50:52 -0600
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi David,
In fact it is VERY EASY to figure out what to change to make the CRC come
out the same. See, for example, at http://surf.to/anarchriz , the technique
used by hackers to "reverse" the CRC. Follow the links to "CRC Essay". In
essence, for data protected by a 32 bit CRC, if you want to make an
arbitrary change in the data you can always compensate by changing an
additional 32 bits so the CRC will be consistent. The paper explains, in a
long-winded way, how to compute the compensatory 32 bits. Basically you need
to compute the 32 input bits that will take the CRC circuit (a 32 bit state
machine) from one arbitrary (what it is after the changes) present state to
the desired (what it would have been without the changes) next state. The
CRC is far from being a one-way hash function since it is very easy to find
an alternate message that results in the same CRC as the original message.
Vince
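
The compensation step described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: it assumes
the common reflected CRC-32 (the one zlib computes), it places the 32
compensating bits at the end of the altered message, and the name
forge_suffix and the sample messages are invented here. It is not the code
from the essay referenced above.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial, as used by zlib


def make_table():
    """Build the usual 256-entry byte-at-a-time CRC-32 table."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ POLY if c & 1 else c >> 1
        table.append(c)
    return table


TABLE = make_table()
# The top byte of every table entry is unique, so it identifies the index.
INDEX_BY_TOP_BYTE = {entry >> 24: i for i, entry in enumerate(TABLE)}


def forge_suffix(altered, target_crc):
    """Return 4 bytes that make crc32(altered + suffix) equal target_crc."""
    # Backward pass: recover which table entries must be used, last first.
    want = target_crc ^ 0xFFFFFFFF
    indices = []
    for _ in range(4):
        i = INDEX_BY_TOP_BYTE[want >> 24]
        indices.append(i)
        want = ((want ^ TABLE[i]) << 8) & 0xFFFFFFFF
    indices.reverse()

    # Forward pass: from the register state after the altered data,
    # choose each input byte so that it selects the required table entry.
    state = zlib.crc32(altered) ^ 0xFFFFFFFF
    suffix = bytearray()
    for i in indices:
        suffix.append(i ^ (state & 0xFF))
        state = TABLE[i] ^ (state >> 8)
    return bytes(suffix)


original = b"pay Alice 100"
altered = b"pay Mallory 9999"
patched = altered + forge_suffix(altered, zlib.crc32(original))
print(zlib.crc32(patched) == zlib.crc32(original))
```

The backward pass works because the final register value depends only on
the four table entries selected by the last four input bytes, which is why
32 freely chosen bits are always enough to reach any desired CRC.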

|-----Original Message-----
|From: Black_David@emc.com [mailto:Black_David@emc.com]
|Sent: Monday, April 23, 2001 8:52 AM
|To: ips@ece.cmu.edu
|Subject: Proof that CRCs are not secure
|
|
|A while back, there was some interest in using a CRC
|as a secure integrity checksum (e.g., by adding it
|to IPSec).  This was rejected on principle as being
|insecure.  I recently stumbled across a practical
|demonstration of this insecurity.
|
|The WEP protocol for 802.11b wireless security made
|the mistake of using a CRC as their integrity checksum,
|and as a result can be attacked by tampering with
|data in flight - the fact that the data was tampered
|with is undetectable even when everything is encrypted
|because it's too easy to figure out which bits have
|to be flipped in the CRC to hide the tampering.  This
|is described in the fourth paragraph of the "Problems"
|section at:
|
|http://www.isaac.cs.berkeley.edu/isaac/wep-faq.html
|
|<SECURITY-EXPERT-COMMENT>
|Lest I get jumped on by the security experts, a
|contributing factor to this WEP vulnerability is
|the use of a stream cipher (RC4) rather than a
|block cipher (e.g., DES, AES) - i.e., this simple
|bit-flipping attack won't work against a good block
|cipher, but such an approach requires confidentiality
|(i.e., encryption) to obtain cryptographic integrity.
|The usual design approach is to use something like
|a keyed HMAC (cf. RFC 2104) to support cryptographic
|integrity without requiring confidentiality.
|</SECURITY-EXPERT-COMMENT>
|
|This doesn't change anything the WG is currently doing -
|there is no active proposal to use a CRC for cryptographic
|integrity, and this does not affect the use of CRCs
|in header digests, data digests, and the like.  This
|is just an FYI based on past discussion of this topic.
|
|Thanks,
|--David
|
|---------------------------------------------------
|David L. Black, Senior Technologist
|EMC Corporation, 42 South St., Hopkinton, MA  01748
|+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
|black_david@emc.com       Mobile: +1 (978) 394-7754
|---------------------------------------------------
|


From owner-ips@ece.cmu.edu  Mon Apr 23 15:42:26 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA11365
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 15:42:25 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NFM0000804
	for ips-outgoing; Mon, 23 Apr 2001 11:22:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mail.brocade.com (asbestos.brocade.com [63.121.140.244])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NElBA28887
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 10:47:11 -0400 (EDT)
Received: from thor.brocade.com (thor [192.168.126.45])
	by mail.brocade.com (8.8.8+Sun/8.8.8) with ESMTP id HAA03947;
	Mon, 23 Apr 2001 07:46:55 -0700 (PDT)
Received: by thor.brocade.com with Internet Mail Service (5.5.2653.19)
	id <FXLA0P05>; Mon, 23 Apr 2001 07:46:55 -0700
Message-ID: <FFD40DB4943CD411876500508BAD0279026B20F0@sj5-ex2.brocade.com>
From: Robert Snively <rsnively@brocade.com>
To: julian_satran@il.ibm.com, ips@ece.cmu.edu
Subject: RE: iSCSI : digest error handling violates EMDP/InDataOrder
Date: Mon, 23 Apr 2001 07:46:51 -0700
X-Mailer: Internet Mail Service (5.5.2653.19)
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Seems to me that there are some unclarities in this area as well.

There are really two pieces being discussed as one:

	EMDP (a SCSI functionality)

	Random relative offset (a transport functionality)

EMDP is used to allow a target to request or deliver its data
out of order.  This is used for things like passing a stripe
segment from a RAID data extent as soon as it has been accumulated,
rather than waiting until all previous parts of the RAID data
extent have also been accumulated and delivered.  It is also used
for things like "start anywhere" reading of a disk track.

It says nothing about the ordering of data within a PDU or sequence,
which must be ordered according to the rules of the protocol.  Fibre
Channel allows the data within a sequence to be transmitted in order
or out of order by using the login parameter "random relative offset".
Almost all devices choose to log in and require "continuously increasing
relative offset".
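
As a rough illustration of the transport-level rule, the check below
assumes that "continuously increasing relative offset" means each data
segment in a sequence starts exactly where the previous one ended. The
function name and the (offset, length) tuple layout are hypothetical and
are not taken from any Fibre Channel or iSCSI document.

```python
def is_continuously_increasing(segments):
    """segments: (relative_offset, payload_length) pairs in arrival order.

    Returns True if every segment begins where the previous one ended.
    The first segment may start at any offset.
    """
    expected = None
    for offset, length in segments:
        if expected is not None and offset != expected:
            return False
        expected = offset + length
    return True


in_order = [(0, 512), (512, 512), (1024, 256)]
out_of_order = [(512, 512), (0, 512), (1024, 256)]
print(is_continuously_increasing(in_order))
print(is_continuously_increasing(out_of_order))
```

Under this reading, EMDP only decides which extent a target may send next;
a check like this applies separately, inside each sequence.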



From owner-ips@ece.cmu.edu  Mon Apr 23 15:44:43 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA11406
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 15:44:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NFNll00907
	for ips-outgoing; Mon, 23 Apr 2001 11:23:47 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from prue.eim.surrey.ac.uk (IDENT:exim@prue.eim.surrey.ac.uk [131.227.76.5])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NFAhA00132
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 11:10:43 -0400 (EDT)
Received: from regan.ee.surrey.ac.uk ([131.227.89.11])
	by prue.eim.surrey.ac.uk with esmtp (Exim 3.16 #1)
	id 14rhYf-0005HK-00; Mon, 23 Apr 2001 15:43:21 +0100
Date: Mon, 23 Apr 2001 15:43:21 +0100 (BST)
From: Lloyd Wood <l.wood@eim.surrey.ac.uk>
X-Sender: eep1lw@regan.ee.surrey.ac.uk
Reply-To: Lloyd Wood <L.Wood@eim.surrey.ac.uk>
To: Stephen Bailey <steph@cs.uchicago.edu>
cc: ips@ece.cmu.edu, tsvwg@ietf.org
Subject: Re: [Tsvwg] [SCTP checksum problems]
In-Reply-To: <20010423131707.3634D9400D@sandmail.sandburst.com>
Message-ID: <Pine.GSO.4.21.0104231524570.11071-100000@regan.ee.surrey.ac.uk>
Organization: speaking for none
X-url: http://www.ee.surrey.ac.uk/Personal/L.Wood/
X-no-archive: yes
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Scanner: exiscan *14rhYf-0005HK-00*qdsUrkUA33g* http://duncanthrax.net/exiscan/
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

On Mon, 23 Apr 2001, Stephen Bailey wrote:

> Lloyd,
> 
> Thanks for your input.
> 
> > Reality check, please.
> 
> Is your point that no protocol should ever assume that there's an
> end-to-end integrity check operating at some layer below it?

No. (How can an *end-to-end* integrity check operate at a lower
layer?)

> The point of the iSCSI integrity check is not that to assume that
> underlying layers (the transport and/or security & integrity) have
> abdicated responsibility for end-to-end integrity. 

I've spent several minutes trying to parse your sentence and its
opposed negatives. I don't think that sentence means whatever you
think it means.

> It is a trust, but verify model.  In a suitably trustworthy [or
> suitably devious] environment you're wasting your time verifying.

In a *perfectly* trustworthy or *perfectly* devious environment that
might be the case. But environments are not perfect - and they change,
too.

So, always verify.

L.

<L.Wood@surrey.ac.uk>PGP<http://www.ee.surrey.ac.uk/Personal/L.Wood/>





From owner-ips@ece.cmu.edu  Mon Apr 23 16:29:07 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA12260
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 16:29:05 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NI2u810293
	for ips-outgoing; Mon, 23 Apr 2001 14:02:56 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ietf.org (odin.ietf.org [132.151.1.176])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NHIjA07476
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 13:18:45 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id NAA07418;
	Mon, 23 Apr 2001 13:18:43 -0400 (EDT)
Message-Id: <200104231718.NAA07418@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: ips@ece.cmu.edu
From: Internet-Drafts@ietf.org
Reply-to: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-ietf-ips-iscsi-reqmts-03.txt
Date: Mon, 23 Apr 2001 13:18:43 -0400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iSCSI Requirements and Design Considerations
	Author(s)	: M. Krueger
	Filename	: draft-ietf-ips-iscsi-reqmts-03.txt
	Pages		: 23
	Date		: 20-Apr-01
	
The IP Storage Working group is chartered with developing comprehensive 
technology to transport block storage data over IP protocols.  This effort includes
a protocol to transport the Small Computer Systems Interface (SCSI) protocol over
the internet (iSCSI).

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-reqmts-03.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-iscsi-reqmts-03.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-iscsi-reqmts-03.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154406.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-iscsi-reqmts-03.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-iscsi-reqmts-03.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154406.I-D@ietf.org>

--OtherAccess--

--NextPart--




From owner-ips@ece.cmu.edu  Mon Apr 23 16:29:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA12281
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 16:29:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NJGr319565
	for ips-outgoing; Mon, 23 Apr 2001 15:16:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NJG2A19514
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 15:16:02 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRP4V>; Mon, 23 Apr 2001 12:15:47 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173450@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: ips@ece.cmu.edu
Subject: RE: iSCSI: Re: iSCSI & Linked Commands 
Date: Mon, 23 Apr 2001 12:15:47 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

This seems like a non-issue.

As I recall, what triggered this thread was the concern that the iSCSI
transport had to do something special to support linked commands. While
there were and continue to be issues with existing implementations, these
seem to be strictly confined to the SCSI layer.

I don't believe linked commands impose any requirements on iSCSI over and
above what's needed to support normal unlinked commands.

Charles
> -----Original Message-----
> From: Stephen Bailey [mailto:steph@cs.uchicago.edu]
> Sent: Monday, April 23, 2001 6:16 AM
> To: ips@ece.cmu.edu
> Subject: iSCSI: Re: iSCSI & Linked Commands 
> 
> 
> Julian,
> 
> > According to your logic no FCP implementation can use linked commands?
> > Is this true for all OS's?  Is it a verified fact or folklore?
> 
> In my experience it's fact.  I have never used a SCSI stack which both
> supported AND used linked commands.  Like some others here, I always
> assumed AIX might :^) Ralph has pointed out that T10 is well aware
> that the feature is not popular.  There are other ways of
> accomplishing the same thing that are less likely to blow up in your
> face.
> 
> > Is it so also for the new MS StorPort driver?
> 
> I don't know, but I'd be really surprised if they did use linked
> commands.  You have to be pretty nuts to rely on a feature that's not
> even exercised by most SCSI implementations.
> 
> Steph
> 


From owner-ips@ece.cmu.edu  Mon Apr 23 16:30:10 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA12298
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 16:30:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NHqsd09644
	for ips-outgoing; Mon, 23 Apr 2001 13:52:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from zmamail04.zma.compaq.com (zmamail04.zma.compaq.com [161.114.64.104])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NHq4A09615
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 13:52:04 -0400 (EDT)
Received: by zmamail04.zma.compaq.com (Postfix, from userid 12345)
	id 6EB2D5CA7; Mon, 23 Apr 2001 13:51:59 -0400 (EDT)
Received: from exchou-gh02.cca.cpqcorp.net (exchou-gh02.cca.cpqcorp.net [16.110.248.202])
	by zmamail04.zma.compaq.com (Postfix) with ESMTP id 37D755E25
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 13:51:59 -0400 (EDT)
Received: by exchou-gh02.cca.cpqcorp.net with Internet Mail Service (5.5.2652.78)
	id <J2342KMX>; Mon, 23 Apr 2001 12:51:58 -0500
Message-ID: <78AF3C342AEAEF4BA33B35A8A15668C659C176@cceexc17.americas.cpqcorp.net>
From: "Elliott, Robert" <Robert.Elliott@compaq.com>
To: ips@ece.cmu.edu
Subject: RE: iSCSI Target Reset
Date: Mon, 23 Apr 2001 12:51:50 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2652.78)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

[iSCSI added to subject]

> -----Original Message-----
> From: Eddy Quicksall [mailto:ESQuicksall@hotmail.com]
> Sent: Monday, April 23, 2001 11:44 AM
> Cc: ips@ece.cmu.edu
> Subject: Re: Target Reset
> 
> I am wondering how clustering will work on NT without some 
> sort of reset.
> 
> On parallel SCSI, NT will issue a SCSI BUS RESET to break reservations
> during a challenge for the quorum drive.
> 
> On iSCSI, there is no full equivalent to the SCSI BUS RESET so I would
> assume the NT driver would have to issue a TARGET RESET to 
> each target that it is supporting.
> 
> How would you propose this would be done without a TARGET RESET?
> 
> Eddy

LOGICAL UNIT RESET breaks the reservations with the correct scope.

According to a WinHEC presentation, the new STORPORT port driver will 
attempt resets in this order:
1. LOGICAL UNIT RESET
2. TARGET RESET
3. Bus reset 

LOGICAL UNIT RESET isn't supported by all existing logical units, so 
a fallback scheme (with its unfortunate side effects) is necessary.  
LOGICAL UNIT RESET is mandatory as of SAM-2 revision 16, so new devices
should start supporting it (on all protocols).

TARGET RESET is the first level of fallback (medium hammer).  This 
is optional for protocols to support as of SAM-2 revision 16.  
Protocols that don't support it require their logical units to play 
nicely and support LOGICAL UNIT RESET.  Protocols that do support
it leave themselves open to all the problems Fibre Channel went
through.

Bus reset is the final level of fallback (big hammer), which doesn't
exist on serial busses.

A miniport driver for SCSIPORT (the existing port driver) can convert 
a "target reset" request into LOGICAL UNIT RESETs for each logical 
unit known to that system to help limit the damage.

From what I understand, Linux also relies on target resets and 
bus resets and will need some mid-layer work to play well in a fabric.
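
The escalation order and the miniport conversion described above could be
modeled roughly as follows. This is a sketch only: the names Reset,
escalate, expand_target_reset and the try_reset callback are invented for
illustration and do not correspond to any STORPORT, SCSIPORT or Linux
interface.

```python
from enum import Enum


class Reset(Enum):
    LOGICAL_UNIT_RESET = 1  # narrowest scope
    TARGET_RESET = 2        # medium hammer
    BUS_RESET = 3           # big hammer


# Narrowest scope first, matching the order listed above.
ESCALATION = [Reset.LOGICAL_UNIT_RESET, Reset.TARGET_RESET, Reset.BUS_RESET]


def escalate(try_reset):
    """Attempt each reset in order; return the first that succeeds.

    try_reset is a caller-supplied function taking a Reset and returning
    True on success. Returns None if every level fails.
    """
    for reset in ESCALATION:
        if try_reset(reset):
            return reset
    return None


def expand_target_reset(known_luns):
    """Turn one "target reset" request into per-LUN LOGICAL UNIT RESETs."""
    return [(Reset.LOGICAL_UNIT_RESET, lun) for lun in known_luns]


# A device that lacks LOGICAL UNIT RESET, on a transport with no bus reset.
supported = {Reset.TARGET_RESET}
print(escalate(lambda r: r in supported))
print(expand_target_reset([0, 1, 5]))
```

Expanding a target reset into per-LUN resets limits the effect to the
logical units the issuing system knows about, which is the damage-limiting
point made above.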

---
Rob Elliott, Compaq Server Storage
Robert.Elliott@compaq.com


From owner-ips@ece.cmu.edu  Mon Apr 23 16:32:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA12347
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 16:32:06 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NJKr319790
	for ips-outgoing; Mon, 23 Apr 2001 15:20:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from packer.xiotech.com (ftp.xiotech.com [209.46.118.18] (may be forged))
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NJJpA19692
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 15:19:51 -0400 (EDT)
Message-ID: <ED8EDD517E0AA84FA2C36C8D6D205C1367344F@alfred.xiotech.com>
From: "Peglar, Robert" <robert_peglar@xiotech.com>
To: "'Eddy Quicksall'" <ESQuicksall@hotmail.com>,
        "Dillard, David"
	 <david_dillard@adaptec.com>,
        John Hufferd <hufferd@us.ibm.com>
Cc: ips@ece.cmu.edu
Subject: RE: Target Reset
Date: Mon, 23 Apr 2001 14:19:47 -0500
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Forgive my stepping into this thread, but
would not PERSISTENT RESERVE OUT/IN (10) suffice?
(SPC 7.12,7.13).  Seems like bus reset is a
somewhat hard approach to perform reservation
management.

Rob


> -----Original Message-----
> From: Eddy Quicksall [mailto:ESQuicksall@hotmail.com]
> Sent: Monday, April 23, 2001 11:44 AM
> To: Dillard, David; John Hufferd
> Cc: ips@ece.cmu.edu
> Subject: Re: Target Reset
> 
> 
> I am wondering how clustering will work on NT without some 
> sort of reset.
> 
> On parallel SCSI, NT will issue a SCSI BUS RESET to break reservations
> during a challenge for the quorum drive.
> 
> On iSCSI, there is no full equivalent to the SCSI BUS RESET so I would
> assume the NT driver would have to issue a TARGET RESET to 
> each target that
> it is supporting.
> 
> How would you propose this would be done without a TARGET RESET?
> 
> Eddy
> 
> 


From owner-ips@ece.cmu.edu  Mon Apr 23 16:35:50 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA12418
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 16:35:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NJH0e19577
	for ips-outgoing; Mon, 23 Apr 2001 15:17:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NJGHA19522
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 15:16:17 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3NKJW129353;
	Mon, 23 Apr 2001 13:19:33 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Charles Monia" <cmonia@NishanSystems.com>,
        "Santosh Rao \(E-mail\)" <santoshr@cup.hp.com>
Cc: "Ips \(E-mail\)" <ips@ece.cmu.edu>
Subject: RE: iSCSI Reqts: In-Order Delivery
Date: Mon, 23 Apr 2001 12:09:44 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJIELDCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <B300BD9620BCD411A366009027C21D9B17344E@ariel.nishansystems.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Charles,

Your solution requires a fair amount of tracking of commands based solely on
their Client Tags.  These Tags are randomly generated but will need to
retain sequential order for your scheme.  The transport must remember the
type of command sent together with their relative placement based only on
the Client Tag.  In addition, these commands will need to be placed into
different categories: those commands executed out of sequence by means of a
bypass flag, those commands that are Task Management commands, and commands
affected by these other types of commands.  It seems that, in large part,
these concerns can be met with proper handling of the transport without such
laborious sorting of the Client Tags.  The out-of-sequence or bypass flag
also depends on the transport sorting the Client Tag.  In addition to
disabling flow-control, this technique of not incrementing the serialization
of these commands requires all commands with the same serialization value
to be sent on the same connection without acknowledgment, if these commands
are also to be kept in sequence.  This connection requirement is yet to be
specified.

Ver 6, Pg 12:
   "iSCSI may avoid delivering some command to the
   SCSI layer if so required by some prior SCSI or iSCSI action (e.g.,
   clear task set Task Management request received before all the
   commands it was supposed to act on)."

Here, there seem to be expectations of the iSCSI transport interpreting the
content of the SCSI commands.  How this is done is not obvious.  Is the
transport expected to generate SCSI responses?

In addition, although iSCSI presently relies on ACA, there are few
applications that implement ACA.  It would appear for iSCSI to work with the
present protocol, significant application changes are required.  With the
proposal I am suggesting, this is not a problem as all bypassed commands are
rejected back to the Initiator.  The drivers that implement iSCSI will be
required to provide handling for these commands that bypass other commands.
The amount of information contained in a rejected command list should be
relatively small and these occasions for such Management rare.  Without
proper handling of these events, there will be 2:00 AM alarm pagers going
off.

Here in the proposal, sorting CmdSN based on LUN values takes place within a
"Barrier List."  I cannot tell what is implied by these recovery
instructions.  What is meant by Remove, Release, Drop, Cleanup, Placeholder,
and ALL?  What is the intended feedback to the initiator for this Clean-up?
It would appear the transport works on behalf of the target.  In the
proposal that I am suggesting, there are no actions within the transport on
behalf of the target.  All decisions are made either by the Target or the
Initiator.  None by the transport.

The concept is simple.  Keep the transport simple.  Do not expect the
transport to decipher SCSI commands.  Do not expect the transport to respond
on behalf of the Target.  Do not expect the transport to sort pending
commands based on LUN value.  Do not expect the transport to require SCSI
and iSCSI ACA.

In the case of session wide serialization, what is good for the goose is
also good for the gander.  It is important from the perspective of quickly
detecting an error and knowing the server state to also use session wide
serialization from the server.  The technique of replicating Management
commands down each connection in addition to changing global commands into
specific commands already overburdens the set-aside that must be made to
handle these non-serialized management commands.  My proposal eliminates the
problem of set-aside resources and loss of server state.  Rather than
silently rejecting commands out-of-sequence, these rejections are reported.
Once done, this feature can be used to extract pending commands in a simple
and direct manner without burdening the transport.

As attempts are made to support the SCSI architecture, rather than
increasing the intelligence of the transport, efforts should be made to
simplify the transport.  Every additional field that the transport must
manipulate invites complexity and non-uniform implementation.

See:
http://www.ietf.org/internet-drafts/draft-otis-iscsi-fullack-00.txt

Ver 6, Pg 92:
     "N.B. As an alternative to Logout and reissue commands, the
      initiator MAY instead reset the target and terminate all
      outstanding commands with a service response indicating
      Delivery Subsystem Failure. The initiator MUST perform one of
      the two actions."

...

Ver 6, Pg 93:
   "The following general mechanism can be used to achieve the effect of
   ordered delivery for task management commands while enabling the
   "urgent" delivery that some of them imply and immediate execution of
   the task management commands without:

      At Initiator when a relevant task management command is issued:

         a) if ExpCmdSN is equal to CmdSN skip to step c
         b) mark all pending commands with a CmdSN field between
         ExpCmdSN and the current CmdSN and a relevant LUN as
         candidates for cleanup and retain CmdSN in a "barrier list".
         c) send the task management command for immediate delivery
         to the target

      At initiator when updating ExpCmdSN:

         a) if the "barrier list" is empty or ExpCmdSN is less than
         the first entry in the barrier list then skip to step d
         b) remove the barrier list entry and remove and drop all
         entries marked for cleanup having a CmdSN field less than
         ExpCmdSN
         c) go to step a
         d) release all queued entries between the old and new
         ExpCmdSN from the queue

      At target when receiving a relevant task management command for
      immediate delivery:

         a) if ExpCmdSN is equal to CmdSN skip to step c
         b) mark all pending entries (commands received and
         placeholders) with a CmdSN field between ExpCmdSN and the
         current CmdSN as candidates for cleanup and retain CmdSN in
         a "barrier list" including the referenced LUN (or an ALL
         marker)
         c) send the task management command to SCSI for immediate
         execution

      At target when updating ExpCmdSN (releasing ordered commands to
      SCSI):

         a) if the "barrier list" is empty or ExpCmdSN is less than
         the first entry in the barrier list then skip to step d
         b) remove the barrier list entry and remove and drop all
         entries marked for cleanup and having the same LUN as the
         barrier entry (any if the barrier is marked ALL) and a CmdSN
         field less than ExpCmdSN
         c) go to step a
         d) release all queued entries between the old and new
         ExpCmdSN from the queue

   Note that this scheme will withstand connection recovery."
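
To make the questions above concrete, here is a rough Python sketch of the
target-side steps quoted from the draft, read as literally as I can manage.
It is only an illustration under stated assumptions, not the draft's
definition: the class and method names are invented, "between ExpCmdSN and
the current CmdSN" is taken to include ExpCmdSN and exclude CmdSN, and CmdSN
wraparound is ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

ALL = None  # hypothetical stand-in for the draft's "ALL marker" (any LUN)


@dataclass
class Pending:
    cmd_sn: int
    lun: int
    received: bool = True   # False models a placeholder for a missing command
    cleanup: bool = False   # set when marked as a candidate for cleanup


@dataclass
class Barrier:
    cmd_sn: int
    lun: Optional[int]      # ALL (None) or the referenced LUN


@dataclass
class TargetQueue:
    exp_cmd_sn: int
    pending: Dict[int, Pending] = field(default_factory=dict)
    barriers: List[Barrier] = field(default_factory=list)

    def add(self, cmd_sn: int, lun: int, received: bool = True) -> None:
        self.pending[cmd_sn] = Pending(cmd_sn, lun, received)

    def on_task_mgmt(self, cmd_sn: int, lun: Optional[int]) -> None:
        """Target receives a task management command for immediate delivery."""
        if self.exp_cmd_sn != cmd_sn:                      # step a
            for entry in self.pending.values():            # step b
                if self.exp_cmd_sn <= entry.cmd_sn < cmd_sn:
                    entry.cleanup = True
            self.barriers.append(Barrier(cmd_sn, lun))
        # step c (handing the command to SCSI) is left to the caller

    def advance(self, new_exp: int) -> Tuple[List[int], List[int]]:
        """Target updates ExpCmdSN; returns (dropped, released) CmdSN lists."""
        old, self.exp_cmd_sn = self.exp_cmd_sn, new_exp
        dropped: List[int] = []
        # steps a-c: consume barriers that ExpCmdSN has reached
        while self.barriers and new_exp >= self.barriers[0].cmd_sn:
            barrier = self.barriers.pop(0)
            for sn in sorted(self.pending):
                entry = self.pending[sn]
                lun_match = barrier.lun is ALL or entry.lun == barrier.lun
                if entry.cleanup and lun_match and sn < new_exp:
                    dropped.append(sn)
                    del self.pending[sn]
        # step d: release what is left between the old and new ExpCmdSN
        released = [sn for sn in sorted(self.pending) if old <= sn < new_exp]
        for sn in released:
            del self.pending[sn]
        return dropped, released
```

Even in this small form the transport has to compare LUNs and silently drop
commands, and nothing in the returned "dropped" list is reported back to the
initiator unless some further mechanism is added.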

Doug

> Hi Santosh:
>
> Please see below.
>
> > Charles Monia wrote:
> >
> > > > (1) MUST provide ordered delivery of SCSI commands from
> > > >       the initiator to the target in the absence of transport
> > > >       errors visible to iSCSI (e.g., iSCSI CRC failure,
> > > >       unexpected TCP connection closure).
> > >
> > > Does the term "SCSI commands" include task management
> > functions as well?  If
> > > not, it should.
> >
> >
> > Charles,
> >
> > Could iSCSI use a variant of the approach FCP-2 takes to solve the
> > ordering issue for task mgmt error recovery ?
> >
> > The FCP-2 task management error recovery scheme is :
> > - task mgmt function uses CRN 0
> > - task mgmt function is executed immediately with no ordering
> > latencies
> > - both initiator & target clear all resources that can be cleared
> > un-ambiguously.
> > - any ambiguous exchanges shall be aborted by the port that
> > detects the
> > ambiguous state.
> >
> > In the case of iSCSI, an analogous approach could be :
> > - task mgmt function uses immediate delivery flag for the
> > task mgmt PDU.
> > - task mgmt fn executed immediately avoiding any ordering latencies.
> > - initiator & target clear all resources that can be cleared
> > un-ambiguously.
> > - initiator uses Abort Task to explicitly abort all active outstanding
> > I/Os at the time the task mgmt fn was issued to avoid any ambiguous
> > stale PDUs of an exchange from appearing at the target.
> >
> > Such an approach would avoid latencies on the execution of
> > the task mgmt
> > fn while still flushing out all the stale PDUs upon completion of the
> > initiator actions for that task mgmt fn.
> >
>
> The problem is to avoid scenarios where the initiator and target's view of
> the task set are out of step.  Specifically, we must avoid the
> case where an
> initiator receives a PDU from a task it believes has been terminated.
>
> In that respect, the technique you describe above should work for an ABORT
> TASK operation.
>
> In the case of ABORT TASK SET, the function could be emulated by issuing a
> series of ABORT TASK requests. For CLEAR TASK SET, an initiator would
> probably want to do the individual ABORT TASK operations, followed by a
> CLEAR TASK SET to terminate tasks from other initiators.  I assume TARGET
> RESET and LUN RESET would be emulated in a manner similar to
> CLEAR TASK SET.
> In all of these cases there may be some "atomicity" side effects caused by
> doing things one at a time instead of all at once.
>
> The only sticky problem is ensuring that the CLEAR ACA function
> works right.
> By that I mean that you don't want to issue the function until all prior
> SCSI commands that were in flight when the ACA occurred have been
> terminated
> with the ACA ACTIVE status.  You can't simply replicate the
> command on each
> connection since you might inadvertently clear a subsequent ACA. (Yes -- I
> know these are all edge cases, but we may as well try to get it right.)
> Maybe the thing to do is implement the function such that the ACA
> interlock
> is not cleared until the CLEAR ACA function is sent on all the connections
> comprising the session.
>
> One minor distinction worth noting is that CRN is enforced in the SCSI
> layer, whereas cmdSN is enforced in the iSCSI transport.  So, a CRN of 0
> doesn't take effect until the transport presents the command to the SCSI
> layer for processing.  In that case, leapfrogging of PDU ordering never
> occurs.
>
> Incidentally, I've made the tacit assumption that commands on a given
> connection are presented to the SCSI layer in order they were sent,
> regardless of whether or not cmdSN was set to 0.  I assume the framing
> mechanisms that have been discussed for buffer offloading do not
> affect this
> behavior.  I.e., a fully formed PDU slated for immediate delivery won't be
> passed to the SCSI layer before a partially complete PDU that was received
> earlier.
>
> If that's true, immediate delivery seems to have no meaning in a
> single-connection scenario.  What's more, in all cases, the iSCSI layer
> doesn't really have to be aware of task management semantics -- unless
> someone decides to intermix immediate and sequential commands in a
> multi-connection session.  Then all bets are off.
>
> Charles
>
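
The CLEAR ACA suggestion quoted above (do not clear the ACA interlock until
the function has been sent on every connection of the session) could be
sketched roughly as follows.  This is a hypothetical illustration only: the
class name, the connection identifiers, and the choice to ignore a CLEAR ACA
that arrives when no ACA is active are all assumptions, not anything taken
from the draft.

```python
class AcaInterlock:
    """Clears an ACA condition only after CLEAR ACA arrived on every connection."""

    def __init__(self, connection_ids):
        self.connections = set(connection_ids)  # connections in the session
        self.seen = set()                       # connections that delivered CLEAR ACA
        self.active = False

    def raise_aca(self):
        # A new ACA condition starts a fresh round of CLEAR ACA tracking.
        self.active = True
        self.seen.clear()

    def on_clear_aca(self, connection_id):
        """Record one CLEAR ACA; return True only when the interlock is cleared."""
        if connection_id not in self.connections:
            raise ValueError("unknown connection")
        if not self.active:
            return False                        # nothing to clear (assumed behaviour)
        self.seen.add(connection_id)
        if self.seen == self.connections:
            self.active = False
            self.seen.clear()
            return True
        return False
```

Because the interlock stays set until the last connection reports, a
CLEAR ACA that overtakes in-flight commands on one connection cannot by
itself clear the condition.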



From owner-ips@ece.cmu.edu  Mon Apr 23 17:13:48 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA12992
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 17:13:47 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NJNtK19971
	for ips-outgoing; Mon, 23 Apr 2001 15:23:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from magic.adaptec.com (magic.adaptec.com [208.236.45.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NJNSA19954
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 15:23:28 -0400 (EDT)
Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11])
	by magic.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id MAA16201;
	Mon, 23 Apr 2001 12:22:01 -0700 (PDT)
Received: from otcexc01.otc.adaptec.com (otcexc01.otc.adaptec.com [10.12.1.27])
	by redfish.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id MAA03485;
	Mon, 23 Apr 2001 12:12:47 -0700 (PDT)
Received: by otcexc01.otc.adaptec.com with Internet Mail Service (5.5.2650.21)
	id <JK8AHX8B>; Mon, 23 Apr 2001 15:22:00 -0400
Message-ID: <50DB155AD0CED411988E009027D61DB32AB161@otcexc01.otc.adaptec.com>
From: "Dillard, David" <david_dillard@adaptec.com>
To: "'Elliott, Robert'" <Robert.Elliott@compaq.com>
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI Target Reset
Date: Mon, 23 Apr 2001 15:22:00 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> LOGICAL UNIT RESET breaks the reservations with the correct scope.
> 
> According to a WinHEC presentation, the new STORPORT port driver will 
> attempt resets in this order:
> 1. LOGICAL UNIT RESET
> 2. TARGET RESET
> 3. Bus reset 

When will STORPORT be generally available?  The latest STORPORT document
that I found on the MS web site is version 0.6a, dated March 18, 2001.
Given this it seems like STORPORT might not be available soon.  In that case
do you know what happens with the current drivers?  Are we going to be
telling customers that if they want to use iSCSI and NT clustering they have
to update to Whistler?
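
For reference, the escalation order quoted above amounts to something like
the following sketch.  It assumes that each step is attempted only when the
previous one fails, which the quoted list does not actually say, and the
function and callback names are invented for illustration; this is not the
STORPORT interface.

```python
from typing import Callable, Optional, Sequence

# Order taken from the quoted WinHEC list.
RESET_ORDER = ("LOGICAL UNIT RESET", "TARGET RESET", "BUS RESET")


def escalate_reset(try_reset: Callable[[str], bool],
                   order: Sequence[str] = RESET_ORDER) -> Optional[str]:
    """Try each reset in order; return the first that succeeds, else None."""
    for kind in order:
        if try_reset(kind):   # hypothetical callback that issues one reset
            return kind
    return None
```

On iSCSI the last step has no full equivalent, so a driver following this
order would presumably stop at TARGET RESET.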

----------------------------------------------------------------
David Dillard                          david_dillard@adaptec.com
Management Software Group
Adaptec, Inc.                          www.adaptec.com


From owner-ips@ece.cmu.edu  Mon Apr 23 18:14:40 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA13843
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 18:14:40 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKU0Q23948
	for ips-outgoing; Mon, 23 Apr 2001 16:30:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from opus.ece.cmu.edu (root@OPUS.ECE.CMU.EDU [128.2.134.91])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKT2A23888
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:29:02 -0400 (EDT)
Received: from opus.ece.cmu.edu (bassoon@localhost [127.0.0.1])
	by opus.ece.cmu.edu (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id QAA20689
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:29:01 -0400
Message-Id: <200104232029.QAA20689@opus.ece.cmu.edu>
To: ips@ece.cmu.edu
Subject: iSCSI IETF annoucement
Date: Mon, 23 Apr 2001 16:29:01 -0400
From: Dave Nagle <bassoon@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: FC Frame Encapsulation
	Author(s)	: R. Weber, M. Rajagopal, F. Travostino, V. Chau, M. O'Donnell, C. Monia, M. Merhar
	Filename	: draft-ietf-ips-fcencapsulation-00.txt
	Pages		: 11
	Date		: 20-Apr-01
	
This is the ips (IP Storage) working group draft describing the
common encapsulation format for use by any IETF protocol that
encapsulates Fibre Channel (FC) frames. This draft describes a
frame header containing information mandated for encapsulating,
transmitting, and de-encapsulating FC frames.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-fcencapsulation-00.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-fcencapsulation-00.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-fcencapsulation-00.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

- --NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

- --OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154240.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-fcencapsulation-00.txt

- --OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-fcencapsulation-00.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154240.I-D@ietf.org>

- --OtherAccess--

- --NextPart--



------- End of Forwarded Message



From owner-ips@ece.cmu.edu  Mon Apr 23 18:14:53 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA13854
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 18:14:48 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKS3G23835
	for ips-outgoing; Mon, 23 Apr 2001 16:28:03 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from opus.ece.cmu.edu (root@OPUS.ECE.CMU.EDU [128.2.134.91])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKRqA23814
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:27:52 -0400 (EDT)
Received: from opus.ece.cmu.edu (bassoon@localhost [127.0.0.1])
	by opus.ece.cmu.edu (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id QAA20535
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:27:51 -0400
Message-Id: <200104232027.QAA20535@opus.ece.cmu.edu>
To: ips@ece.cmu.edu
Subject: iSCSI IETF annoucement
Date: Mon, 23 Apr 2001 16:27:51 -0400
From: Dave Nagle <bassoon@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

- --NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iSCSI Full Acknowledgement
	Author(s)	: D. Otis
	Filename	: draft-otis-iscsi-fullack-00.txt
	Pages		: 7
	Date		: 20-Apr-01
	
This document is illustrative of potential modifications to
the iSCSI protocol proposal (draft-ietf-ips-iscsi-05+.txt).
These changes are to create a means to do the following:

 - Ensure Management response is coherent.

 - Acknowledge ALL requests delivered to the Server.

 - Ensure integrity of the iSCSI request window.

 - Open request window during abnormal events.

 - Quickly eliminate invalidated requests.

 - Quickly expunge sequence holes.

 - Simplify the reception sequencer.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-otis-iscsi-fullack-00.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-otis-iscsi-fullack-00.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-otis-iscsi-fullack-00.txt".
	
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

- --NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

- --OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154342.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-otis-iscsi-fullack-00.txt

- --OtherAccess
Content-Type: Message/External-body;
	name="draft-otis-iscsi-fullack-00.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154342.I-D@ietf.org>

- --OtherAccess--

- --NextPart--



------- End of Forwarded Message



From owner-ips@ece.cmu.edu  Mon Apr 23 18:15:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA13865
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 18:15:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKSxK23882
	for ips-outgoing; Mon, 23 Apr 2001 16:28:59 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from opus.ece.cmu.edu (root@OPUS.ECE.CMU.EDU [128.2.134.91])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKSfA23865
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:28:41 -0400 (EDT)
Received: from opus.ece.cmu.edu (bassoon@localhost [127.0.0.1])
	by opus.ece.cmu.edu (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id QAA20636
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:28:41 -0400
Message-Id: <200104232028.QAA20636@opus.ece.cmu.edu>
To: ips@ece.cmu.edu
Subject: iSCSI IETF annoucement
Date: Mon, 23 Apr 2001 16:28:41 -0400
From: Dave Nagle <bassoon@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



- --NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: Finding iSCSI Targets and Name Servers Using SLP
	Author(s)	: M. Bakke
	Filename	: draft-ietf-ips-iscsi-slp-00.txt
	Pages		: 17
	Date		: 20-Apr-01
	
The iSCSI protocol provides a way for hosts to access SCSI devices
over an IP network.  This document defines the use of the Service
Location Protocol (SLP) by iSCSI hosts, devices, and name services,
along with the SLP service type templates that describe the services
they provide.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-slp-00.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-iscsi-slp-00.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-iscsi-slp-00.txt".
	
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

- --NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

- --OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154304.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-iscsi-slp-00.txt

- --OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-iscsi-slp-00.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154304.I-D@ietf.org>

- --OtherAccess--

- --NextPart--



------- End of Forwarded Message



From owner-ips@ece.cmu.edu  Mon Apr 23 18:15:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA13876
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 18:15:18 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NK81e22596
	for ips-outgoing; Mon, 23 Apr 2001 16:08:01 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NK7MA22558
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:07:23 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id PAA89144;
	Mon, 23 Apr 2001 15:59:57 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id OAA40042;
	Mon, 23 Apr 2001 14:07:18 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: Re: Target Reset
To: "Eddy Quicksall" <ESQuicksall@hotmail.com>
Cc: <ips@ece.cmu.edu>
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OF82EA9DE5.A9FA747B-ON88256A37.006E0F65@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Mon, 23 Apr 2001 13:07:01 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/23/2001 02:07:16 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Eddy,
A target like an IBM Shark or EMC Symmetrix will have thousands of LUs and
10s to 100s of Hosts connected to it, and you want to reset the whole
Target?  I do not think that is a good idea.  Perhaps Task Reset or LU
reset etc. but not Target Reset.

.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


"Eddy Quicksall" <ESQuicksall@hotmail.com> on 04/23/2001 09:44:05 AM

To:   "Dillard, David" <david_dillard@adaptec.com>, John Hufferd/San
      Jose/IBM@IBMUS
cc:   <ips@ece.cmu.edu>
Subject:  Re: Target Reset



I am wondering how clustering will work on NT without some sort of reset.

On parallel SCSI, NT will issue a SCSI BUS RESET to break reservations
during a challenge for the quorum drive.

On iSCSI, there is no full equivalent to the SCSI BUS RESET so I would
assume the NT driver would have to issue a TARGET RESET to each target that
it is supporting.

How would you propose this would be done without a TARGET RESET?

Eddy

----- Original Message -----
From: "John Hufferd" <hufferd@us.ibm.com>
To: "Dillard, David" <david_dillard@adaptec.com>
Cc: <ips@ece.cmu.edu>
Sent: Sunday, April 22, 2001 3:20 PM
Subject: RE: Target Reset


>
> This is at least better.  But I do not have the issue of it being vendor
> unique.  This is a shutdown and restart of the complete Target, and will
> probably be part of the vendors' operator console or their own remote
> support functions.  It is not clear that it needs to be a general
> management function that works the same on all iSCSI Storage Controllers.
>
> Many of the major Storage Controllers do not support this feature today.
>
> I do not believe that most SNMP implementations are very secure.  Most
> folks do not want to have a changeable MIB until they have secure SNMP,
> and even though there is a version of SNMP that has security features,
> this has not been well supported.
>
> I do NOT think that Target Reset should be in the base iSCSI protocol,
> but I think it is reasonable to hold this discussion apart from the base
> protocol document, and the question should be asked if this is a general
> management function or a vendor specific console or remote support
> function.
>
>
>
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
>
>
> "Dillard, David" <david_dillard@adaptec.com>@ece.cmu.edu on 04/22/2001
> 08:13:49 AM
>
> Sent by:  owner-ips@ece.cmu.edu
>
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  RE: Target Reset
>
>
>
> John,
>
> I understand the danger of issuing a target reset and I agree that it
> should not be a part of an initiator's normal error recovery procedure.
> However, looking at this from a management perspective I'd like to see a
> standardized way of resetting a target.  I don't want to see a variety of
> vendor unique methods of resetting targets sprout up.
>
> If resetting a target using the protocol is not desirable from your
> perspective would incorporating this feature into the MIB be acceptable?
> (MIBs are for management after all)
>
> ----------------------------------------------------------------
> David Dillard                          david_dillard@adaptec.com
> Management Software Group
> Adaptec, Inc.                          www.adaptec.com
>
>
>
>
> -----Original Message-----
> From: John Hufferd [mailto:hufferd@us.ibm.com]
> Sent: Sunday, April 22, 2001 4:59 AM
> To: ips@ece.cmu.edu
> Subject: Target Reset
>
>
> I thought we had a number of discussions previously about Target Reset
> (Warm or Cold).  I thought there was general feeling that this command is
> so dangerous that it should not be supported by iSCSI.  The long distance
> capability of iSCSI makes the risks involved unmanageable.  There should
> only be an Admin way to do this.
>
> Some folks have said that we could permit it and have special
> authorization etc.  This would probably cause a separate section in the
> spec. to define the authorization approach, and whatever other security is
> needed to prevent this from inappropriately being used.  All for what
> purpose?  This cannot be part of error recovery from a normal initiator.
> The widespread effect is too great for that.
>
> I would like to hear from the list about their feeling on this item.
>
>
>
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
>
>
>
>





From owner-ips@ece.cmu.edu  Mon Apr 23 18:16:01 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA13888
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 18:15:55 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKPxQ23689
	for ips-outgoing; Mon, 23 Apr 2001 16:25:59 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKP2A23585
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:25:02 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP id 5737A14C2
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 13:18:39 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id NAA29417
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 13:24:56 -0700 (PDT)
Message-ID: <3AE48F62.1D438D85@cup.hp.com>
Date: Mon, 23 Apr 2001 13:24:02 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI Target Reset
References: <50DB155AD0CED411988E009027D61DB32AB161@otcexc01.otc.adaptec.com>
Content-Type: multipart/mixed;
 boundary="------------DE796855AA34D25047341D10"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------DE796855AA34D25047341D10
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

"Dillard, David" wrote:
> 
> When will STORPORT be generally available?  The latest STORPORT document
> that I found on the MS web site is version 0.6a, dated March 18, 2001.
> Given this it seems like STORPORT might not be available soon.  In that case
> do you know what happens with the current drivers?  Are we going to be
> telling customers that if they want to use iSCSI and NT clustering they have
> to update to Whistler?


[One would hope that this list does not turn into a Microsoft
release/product discussion mailing list (?) ]

Without going into specifics of A certain O.S., does it suffice to
require that iSCSI not break existing legacy SCSI applications ?

If the above is a valid requirement, then, knowing that legacy
applications continue to use SCSI-2 Reserve/Release and the target reset
as a mechanism of breaking SCSI-2 reservations, shouldn't iSCSI continue
to support the target reset ?

- Santosh
--------------DE796855AA34D25047341D10
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------DE796855AA34D25047341D10--



From owner-ips@ece.cmu.edu  Mon Apr 23 18:17:54 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA13908
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 18:17:53 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKT5t23897
	for ips-outgoing; Mon, 23 Apr 2001 16:29:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from opus.ece.cmu.edu (root@OPUS.ECE.CMU.EDU [128.2.134.91])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKSHA23853
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:28:17 -0400 (EDT)
Received: from opus.ece.cmu.edu (bassoon@localhost [127.0.0.1])
	by opus.ece.cmu.edu (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id QAA20594
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:28:17 -0400
Message-Id: <200104232028.QAA20594@opus.ece.cmu.edu>
To: ips@ece.cmu.edu
Subject: iSCSI IETF annoucement
Date: Mon, 23 Apr 2001 16:28:17 -0400
From: Dave Nagle <bassoon@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


- --NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iFCP - A Protocol for Internet Fibre Channel Storage Networking
	Author(s)	: C. Monia
	Filename	: draft-ietf-ips-ifcp-01.txt
	Pages		: 54
	Date		: 20-Apr-01
	
This document specifies iFCP, a gateway-to-gateway protocol for the
implementation of a Fibre Channel fabric in which TCP/IP switching
and routing elements replace Fibre Channel components. The
protocol enables the attachment of existing Fibre Channel storage
products to an IP network by supporting the subset of fabric
services required by such devices.
The encapsulation described in this version of the document is
obsolete.  It will be replaced by an encapsulation format which
will be common to both the iFCP and FCIP protocols.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-ifcp-01.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-ifcp-01.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-ifcp-01.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

- --NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

- --OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154315.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-ifcp-01.txt

- --OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-ifcp-01.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154315.I-D@ietf.org>

- --OtherAccess--

- --NextPart--



------- End of Forwarded Message



From owner-ips@ece.cmu.edu  Mon Apr 23 18:18:15 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA13924
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 18:18:14 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NK1LM22184
	for ips-outgoing; Mon, 23 Apr 2001 16:01:21 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from storeage_nt.store-age.com ([199.203.178.211])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NK0NA22109
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:00:24 -0400 (EDT)
Received: from e4u5e0 (ras14-p403.hfa.netvision.net.il [62.0.113.163]) by storeage_nt.store-age.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21)
	id HN4P8JVQ; Mon, 23 Apr 2001 22:57:12 +0200
Message-ID: <002e01c0cc38$2edacce0$a371003e@e4u5e0>
From: "Nelson Nahum" <nnahum@store-age.com>
To: "Robert Snively" <rsnively@Brocade.COM>, <ips@ece.cmu.edu>
References: <FFD40DB4943CD411876500508BAD0279026B20EF@sj5-ex2.brocade.com>
Subject: Re: iSCSI & Linked Commands
Date: Mon, 23 Apr 2001 22:58:42 +0200
MIME-Version: 1.0
Content-Type: text/plain;
	charset="windows-1255"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2615.200
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2615.200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

The Skip Read and Skip Write commands, which the AS400 uses to skip sectors
already in cache, must use linked commands.
I cannot see how command queuing can be used instead, as the data of the
first command is used as the parameter for the subsequent one.
(We need to ensure that the two commands execute one after the other.)
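
The dependency described above can be sketched in a few lines of Python.
This is only an illustrative model, not the AS400 command set: the names
Task, run_task, produce_mask and use_mask are hypothetical, and the byte
values are made up.  It shows one task whose linked commands run back to
back, with each command receiving the previous command's data.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical model: a command receives the previous command's output
# (None for the first command in the link) and returns its own output.
Command = Callable[[Optional[bytes]], bytes]


@dataclass
class Task:
    """One task holding every command of a single link, in order."""
    name: str
    commands: List[Command]


def run_task(task: Task, log: List[str]) -> Optional[bytes]:
    """Run the linked commands back to back, feeding each result forward."""
    result: Optional[bytes] = None
    for index, command in enumerate(task.commands):
        result = command(result)
        log.append(f"{task.name}[{index}]")
    return result


def produce_mask(_previous: Optional[bytes]) -> bytes:
    # Stand-in for a first command whose returned data is needed later.
    return bytes([0b00001111])


def use_mask(previous: Optional[bytes]) -> bytes:
    # Stand-in for a second command that takes that data as its parameter.
    assert previous is not None, "must follow the command that produces the mask"
    return bytes([~previous[0] & 0xFF])


log: List[str] = []
linked = Task("linked", [produce_mask, use_mask])
output = run_task(linked, log)
```

With independently queued commands nothing in this model would prevent
another task from running between produce_mask and use_mask, which is the
concern raised above.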

Nelson
----- Original Message -----
From: Robert Snively <rsnively@Brocade.COM>
To: <julian_satran@il.ibm.com>; <ips@ece.cmu.edu>
Sent: Monday, April 23, 2001 4:46 PM
Subject: RE: iSCSI & Linked Commands


> Folks,
>
> There are no implementation problems with linked commands other than
> the fact that nobody has really found a good use for them outside
> a few specialized environments.  It is for that reason that they
> are relatively rarely called upon, but strongly defended by their
> few users.
>
> This thread has created a significant amount of confusion.
>
> First:
>
> A task includes all commands executed in a single
> link.  It has the same Exchange identifier for all
> steps in executing a command.  That takes care of the
> concerns expressed by Santosh.
>
> Second:
>
> A set of linked commands is executed in order, by
> definition, but their ordering with respect to other
> tasks from the same or other initiators is not defined
> except through the ordinary command ordering stuff.
> The task ordering applies to the entire task, although
> this is not as explicitly defined in SAM as would be
> desirable.  This eliminates them as being tremendously
> useful in ordered execution.
>
> Third:
>
> I am not aware of any usage of commands using relative
> addressing.  They are denigrated in all profiles.  They
> are a historic anomaly dating back to when SCSI was thought
> of as a total replacement for the IBM OEM channel.
> Relative addressing is defined for disk devices only,
> not sequential devices.  Sequential devices use the
> more traditional space, skip, and locate commands.
> This eliminates relative addressing for linked commands as
> being useful to sequential devices.
>
> Fourth:
>
> Stale PDUs are a fact of life in any stream of data that
> is not acknowledged and sequenced on a PDU by PDU basis.
> Both Fibre Channel and iSCSI have to deal with that.
> Fibre Channel has a function called "recovery qualifier"
> that automatically discards stale PDUs.  It has an additional
> function called R_A_TOV which guarantees that stale PDUs
> will not appear later than a certain time.  With those two
> mechanisms, stale PDUs become a non-problem.  In FCP,
> an additional mechanism is provided to handle the stale
> PDUs created by Abort Task actions.
>
> One way or another, iSCSI will have to deal with the same
> problem.
>
> Notes of possible interest:
>
> Command linking is an anachronism.  It has been used by a few
> devices that think of themselves as having IBM OEM channel-like
> characteristics, where a list of commands is the normal I/O
> execution unit.  It was also used by Sun Microsystems in early
> SCSI implementations that used software selection algorithms that
> were so high in overhead that command linking provided improved
> performance.  Aside from those cases that have been listed here
> (AS400 and some Unisys systems), command linking is not in modern
> use as far as I know.  Command queuing and buffering have provided
> performance improvements that are so notable that command linking
> has fallen aside.
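
The stale-PDU handling described in the fourth point of the quoted message
can be sketched as follows.  This is a simplified model under stated
assumptions, not the Fibre Channel definition: the class StalePduFilter,
its methods, and the single exchange_id key are hypothetical, and
hold_time merely plays the role that R_A_TOV plays in the description
above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StalePduFilter:
    """Discard PDUs of aborted exchanges until a hold time has elapsed."""
    hold_time: float  # assumed stand-in for R_A_TOV, in seconds
    quarantined: Dict[int, float] = field(default_factory=dict)

    def abort(self, exchange_id: int, now: float) -> None:
        # Remember the exchange so that late PDUs for it are dropped.
        self.quarantined[exchange_id] = now + self.hold_time

    def accept(self, exchange_id: int, now: float) -> bool:
        expiry = self.quarantined.get(exchange_id)
        if expiry is None:
            return True   # never aborted: deliver normally
        if now >= expiry:
            # Hold time elapsed: no stale PDU can still be in flight,
            # so the identifier may be used again.
            del self.quarantined[exchange_id]
            return True
        return False      # stale PDU from an aborted exchange: discard


pdu_filter = StalePduFilter(hold_time=10.0)
pdu_filter.abort(exchange_id=7, now=0.0)
```

The design choice illustrated here is that the receiver needs both pieces:
a record of what to discard and a bound on how long stale PDUs can survive
in the network.  Without the bound the record could never be cleared.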



From owner-ips@ece.cmu.edu  Mon Apr 23 18:18:18 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA13923
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 18:18:13 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKT8c23900
	for ips-outgoing; Mon, 23 Apr 2001 16:29:08 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from opus.ece.cmu.edu (root@OPUS.ECE.CMU.EDU [128.2.134.91])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKSqA23872
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:28:52 -0400 (EDT)
Received: from opus.ece.cmu.edu (bassoon@localhost [127.0.0.1])
	by opus.ece.cmu.edu (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id QAA20657
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:28:52 -0400
Message-Id: <200104232028.QAA20657@opus.ece.cmu.edu>
To: ips@ece.cmu.edu
Subject: iSCSI IETF annoucement
Date: Mon, 23 Apr 2001 16:28:51 -0400
From: Dave Nagle <bassoon@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: Fibre Channel Over TCP/IP (FCIP)
	Author(s)	: M. Rajagopal, R. Bhagwat
	Filename	: draft-ietf-ips-fcovertcpip-02.txt
	Pages		: 26
	Date		: 20-Apr-01
	
Fibre Channel (FC) is a dominant technology used in Storage Area
Networks (SAN). The purpose of this draft is to specify a standard
way of encapsulating FC frames over TCP/IP and to describe mechanisms
that allow islands of FC SANs to be interconnected  over IP-based
networks. FC over TCP/IP relies on IP-based network services to
provide the connectivity between the SAN islands over LANs, MANs, or
WANs.  The FC over TCP/IP specification relies upon TCP for
congestion control and management and upon both TCP and FC for data
error and data loss recovery.  FC over TCP/IP treats all classes of
FC frames the same -- as datagrams.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-fcovertcpip-02.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-fcovertcpip-02.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-fcovertcpip-02.txt".
	
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

- --NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

- --OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154253.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-fcovertcpip-02.txt

- --OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-fcovertcpip-02.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154253.I-D@ietf.org>

- --OtherAccess--

- --NextPart--



------- End of Forwarded Message



From owner-ips@ece.cmu.edu  Mon Apr 23 18:18:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA13946
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 18:18:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKT2e23890
	for ips-outgoing; Mon, 23 Apr 2001 16:29:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from opus.ece.cmu.edu (root@OPUS.ECE.CMU.EDU [128.2.134.91])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKSSA23857
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:28:28 -0400 (EDT)
Received: from opus.ece.cmu.edu (bassoon@localhost [127.0.0.1])
	by opus.ece.cmu.edu (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id QAA20615
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:28:27 -0400
Message-Id: <200104232028.QAA20615@opus.ece.cmu.edu>
To: ips@ece.cmu.edu
Subject: iSCSI IETF annoucement
Date: Mon, 23 Apr 2001 16:28:27 -0400
From: Dave Nagle <bassoon@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


- --NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iSCSI
	Author(s)	: J. Satran
	Filename	: draft-ietf-ips-iscsi-06.txt
	Pages		: 133
	Date		: 20-Apr-01
	
The Small Computer Systems Interface (SCSI) is a popular family of 
protocols for communicating with I/O devices, especially storage 
devices.  This memo describes a transport protocol for SCSI that 
operates on top of TCP.  The iSCSI protocol aims to be fully 
compliant with the requirements laid out in the SCSI Architecture 
Model - 2 [SAM2] document.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-06.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-iscsi-06.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-iscsi-06.txt".
	
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

- --NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

- --OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154329.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-iscsi-06.txt

- --OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-iscsi-06.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154329.I-D@ietf.org>

- --OtherAccess--

- --NextPart--



------- End of Forwarded Message



From owner-ips@ece.cmu.edu  Mon Apr 23 18:18:40 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA13961
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 18:18:39 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKPup23685
	for ips-outgoing; Mon, 23 Apr 2001 16:25:56 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from magic.adaptec.com (magic.adaptec.com [208.236.45.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKPXA23666
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:25:34 -0400 (EDT)
Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11])
	by magic.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id NAA24499;
	Mon, 23 Apr 2001 13:25:03 -0700 (PDT)
Received: from otcexc01.otc.adaptec.com (otcexc01.otc.adaptec.com [10.12.1.27])
	by redfish.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id NAA10723;
	Mon, 23 Apr 2001 13:15:49 -0700 (PDT)
Received: by otcexc01.otc.adaptec.com with Internet Mail Service (5.5.2650.21)
	id <JK8AHX9T>; Mon, 23 Apr 2001 16:25:03 -0400
Message-ID: <50DB155AD0CED411988E009027D61DB32AB162@otcexc01.otc.adaptec.com>
From: "Dillard, David" <david_dillard@adaptec.com>
To: "'John Hufferd'" <hufferd@us.ibm.com>
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI Target Reset
Date: Mon, 23 Apr 2001 16:25:02 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> This is at least better.  But I do not have the issue of it being vendor
> unique.  This is a shut down and restart of the complete Target, and will
> probably be part of the vendors' operator console or their own remote
> support functions, it is not clear that it needs to be a general
> management function that works the same on all iSCSI Storage Controllers.

From a management vendor's perspective they want to be able to fully manage
storage devices so the less vendor unique stuff there is the better.  From a
customer's perspective many customers do not like having to spawn off
separate consoles within the main management application (e.g. products from
Veritas, Tivoli, et al.) to manage specific features of a device.  So,
standardizing features makes it easier for the management app vendors to support
those features, making some customers happier.


 
> I do not believe that most SNMP implementations are very secure.  Most
> folks do not want to have a changeable MIB until they have 
> secure SNMP, and even though there is a version of SNMP that has security 
> features, this has not been well supported.

This is an excellent point (slapping forehead! - should have thought before
I sent my message).  Target reset is obviously something that must be
secure, so if it were put into the MIB it could only be read/write if the
method of setting the value were secure.

----------------------------------------------------------------
David Dillard                          david_dillard@adaptec.com
Management Software Group
Adaptec, Inc.                          www.adaptec.com


From owner-ips@ece.cmu.edu  Mon Apr 23 18:19:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA13981
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 18:19:28 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKpvv25415
	for ips-outgoing; Mon, 23 Apr 2001 16:51:57 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from hotmail.com (oe46.law11.hotmail.com [64.4.16.18])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKpdA25404
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:51:39 -0400 (EDT)
Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC;
	 Mon, 23 Apr 2001 13:51:32 -0700
X-Originating-IP: [66.31.72.237]
From: "Eddy Quicksall" <ESQuicksall@hotmail.com>
To: "Peglar, Robert" <robert_peglar@xiotech.com>,
        "Dillard, David" <david_dillard@adaptec.com>,
        "John Hufferd" <hufferd@us.ibm.com>
Cc: <ips@ece.cmu.edu>
References: <ED8EDD517E0AA84FA2C36C8D6D205C1367344F@alfred.xiotech.com>
Subject: Re: Target Reset
Date: Mon, 23 Apr 2001 16:51:31 -0400
MIME-Version: 1.0
Content-Type: text/plain;	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.3018.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.3018.1300
Message-ID: <OE46KvPEkGhAj5p1Fjx000022da@hotmail.com>
X-OriginalArrivalTime: 23 Apr 2001 20:51:32.0580 (UTC) FILETIME=[2D8C2240:01C0CC37]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

The problem is that Microsoft has not told us if they will be using that
yet. And, if they do use it in the future, it won't be used on legacy
systems.

Eddy

----- Original Message -----
From: "Peglar, Robert" <robert_peglar@xiotech.com>
To: "'Eddy Quicksall'" <ESQuicksall@hotmail.com>; "Dillard, David"
<david_dillard@adaptec.com>; "John Hufferd" <hufferd@us.ibm.com>
Cc: <ips@ece.cmu.edu>
Sent: Monday, April 23, 2001 3:19 PM
Subject: RE: Target Reset


> Forgive my stepping into this thread, but
> would not PERSISTENT RESERVE OUT/IN (10) suffice?
> (SPC 7.12,7.13).  Seems like bus reset is a
> somewhat hard approach to perform reservation
> management.
>
> Rob
>
>
> > -----Original Message-----
> > From: Eddy Quicksall [mailto:ESQuicksall@hotmail.com]
> > Sent: Monday, April 23, 2001 11:44 AM
> > To: Dillard, David; John Hufferd
> > Cc: ips@ece.cmu.edu
> > Subject: Re: Target Reset
> >
> >
> > I am wondering how clustering will work on NT without some
> > sort of reset.
> >
> > On parallel SCSI, NT will issue a SCSI BUS RESET to break reservations
> > during a challenge for the quorum drive.
> >
> > On iSCSI, there is no full equivalent to the SCSI BUS RESET so I would
> > assume the NT driver would have to issue a TARGET RESET to
> > each target that
> > it is supporting.
> >
> > How would you propose this would be done without a TARGET RESET?
> >
> > Eddy
> >
> >
>


From owner-ips@ece.cmu.edu  Mon Apr 23 19:05:36 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA14393
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 19:05:35 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NLZ0E27914
	for ips-outgoing; Mon, 23 Apr 2001 17:35:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NLYdA27890
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 17:34:39 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id RAA91858;
	Mon, 23 Apr 2001 17:27:14 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id PAA146336;
	Mon, 23 Apr 2001 15:34:38 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: Re: iSCSI Target Reset
To: Santosh Rao <santoshr@cup.hp.com>
Cc: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OF1FDE19E2.A3D61BC6-ON88256A37.007604FE@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Mon, 23 Apr 2001 14:34:21 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/23/2001 03:34:34 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Absolutely not.  Why would we think that impacting 32 other initiators
is an OK thing to do?  By the way, there are lots more Initiators
possible with FC on Shark, and I would hope that there would be even more
with iSCSI.

I have been told that these large Storage Controllers do not support Target
Reset today.  So I see no loss in not supporting such an item in iSCSI
especially since many Initiators will be beyond even the distances and
mischief that is possible with FC.

.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001 01:24:02 PM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI Target Reset



"Dillard, David" wrote:
>
> When will STORPORT be generally available?  The latest STORPORT document
> that I found on the MS web site is version 0.6a, dated March 18, 2001.
> Given this it seems like STORPORT might not be available soon.  In that
> case do you know what happens with the current drivers?  Are we going to be
> telling customers that if they want to use iSCSI and NT clustering they
> have to update to Whistler?


[One would hope that this list does not turn into a Microsoft
release/product discussion mailing list (?) ]

Without going into specifics of A certain O.S., does it suffice to
require that iSCSI not break existing legacy SCSI applications ?

If the above is a valid requirement, then, knowing that legacy
applications continue to use SCSI-2 Reserve/Release and the target reset
as a mechanism of breaking SCSI-2 reservations, shouldn't iSCSI continue
to support the target reset ?

- Santosh
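
The trade-off argued in this exchange can be illustrated with a toy model.
It is a sketch under stated assumptions, not iSCSI or SCSI-2 behaviour in
detail: the Target class and its methods are hypothetical, and the only
properties modelled are that a reservation blocks other initiators and
that a target reset clears every reservation and every outstanding task.

```python
from typing import Dict, Set


class Target:
    """Toy target tracking per-LUN reservations and per-initiator tasks."""

    def __init__(self) -> None:
        self.reservations: Dict[int, str] = {}       # LUN -> holding initiator
        self.active_tasks: Dict[str, Set[int]] = {}  # initiator -> task tags

    def reserve(self, initiator: str, lun: int) -> bool:
        # Succeeds only if the LUN is free or already held by this initiator.
        holder = self.reservations.setdefault(lun, initiator)
        return holder == initiator

    def start_task(self, initiator: str, tag: int) -> None:
        self.active_tasks.setdefault(initiator, set()).add(tag)

    def target_reset(self) -> Set[str]:
        # Returns every initiator that loses a reservation or a task.
        affected = set(self.reservations.values())
        affected |= {name for name, tags in self.active_tasks.items() if tags}
        self.reservations.clear()
        self.active_tasks.clear()
        return affected


target = Target()
held = target.reserve("A", lun=0)       # initiator A holds LUN 0
blocked = target.reserve("B", lun=0)    # initiator B is refused
target.start_task("C", tag=1)           # unrelated initiator with work in flight
affected = target.target_reset()        # B (or anyone) resets the target
retry = target.reserve("B", lun=0)      # B can now take the reservation
```

In this model the reset does break A's reservation, which is the legacy
use described above, but it also discards C's outstanding task even though
C had nothing to do with the reservation.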






From owner-ips@ece.cmu.edu  Mon Apr 23 19:57:00 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA14799
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 19:56:59 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NJGvN19572
	for ips-outgoing; Mon, 23 Apr 2001 15:16:57 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NJFrA19507
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 15:15:54 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel2.hp.com (Postfix) with ESMTP id 16681D12
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 12:09:16 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id MAA23551
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 12:15:48 -0700 (PDT)
Message-ID: <3AE47F2E.3D5BB3CF@cup.hp.com>
Date: Mon, 23 Apr 2001 12:14:54 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: IPS Reflector <ips@ece.cmu.edu>
Subject: iSCSI : LUN field in NOP-OUT & NOP-IN PDUs.
Content-Type: multipart/mixed;
 boundary="------------900380A96804E78CEAFF3B07"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------900380A96804E78CEAFF3B07
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


At the risk of raising a subject that has been beaten to death [and
still not resolved ?], could someone please clarify what the outcome
was of the prior thread discussing the usage of the LUN field in
NOP-OUT & NOP-IN PDUs ?

As has been discussed in the past, the NOP-OUT and NOP-IN PDUs are
transport specific and are used without any LUN context. Hence, it is
not clear why a LUN field is required in either the NOP-OUT or NOP-IN
PDUs. 

Some side notes on this subject :
a) the LUN field description is missing in the NOP-OUT description and
it is not clear from either the figure or the text whether this field
can be reserved. If so, what would be the reserved value for LUN in
NOP-OUT ? 

b) The NOP-IN PDU description shows a value of LUN 0 as reserved in the
NOP-IN PDU diagram. Is not LUN 0 a valid LU number for a LU (?) 

- Santosh
--------------900380A96804E78CEAFF3B07
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------900380A96804E78CEAFF3B07--



From owner-ips@ece.cmu.edu  Mon Apr 23 20:12:14 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA15023
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 20:12:12 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NMS1b02423
	for ips-outgoing; Mon, 23 Apr 2001 18:28:01 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NMRXA02398
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 18:27:33 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRQCF>; Mon, 23 Apr 2001 15:27:27 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173455@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "Ips (E-mail)" <ips@ece.cmu.edu>
Subject: RE: iSCSI Reqts: In-Order Delivery
Date: Mon, 23 Apr 2001 15:27:26 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

The point of my original posting was to suggest ways in which the semantics
of all the task management functions could be preserved in multi-connection,
command striping implementations without a lot of complicated bookkeeping.

In that regard, the proposed solution imposes no additional tracking
requirements on initiators aside from those that would be needed anyhow to
issue the ABORT TASK request. For the most part, that amounts to keeping
track of each pending I/O request including a handle by which the task can
be referenced and a pointer to the connection the SCSI command was issued
on.

I did neglect one restriction however: Specifically, that the initiator have
no more than one task management request pending at a time to a specific
target. 

In other respects, as long as ordered delivery to the SCSI layer is
preserved for individual connections, I don't see a problem.

> ....Those commands executed out of sequence by means of a bypass flag,
> those commands that are Task Management .....

I apparently don't understand how the bypass flag is supposed to work.  I'd
assumed its function was to maximize the benefits of command striping by
allowing commands on other connections in the session to be bypassed. I'd
assumed that commands on the same connection are never bypassed (since there
appears to be no benefit in doing so).

Hence my statement:

> > .....I've made the tacit assumption that commands on a given
> > connection are presented to the SCSI layer in the order they were sent,
> > regardless of whether or not cmdSN was set to 0.  I assume the framing
> > mechanisms that have been discussed for buffer offloading do not
> > affect this behavior.  I.e., a fully formed PDU slated for immediate
> > delivery won't be passed to the SCSI layer before a partially complete
> > PDU that was received earlier.

Is this assumption incorrect?

Charles

> -----Original Message-----
> From: Douglas Otis [mailto:dotis@sanlight.net]
> Sent: Monday, April 23, 2001 12:10 PM
> To: Charles Monia; Santosh Rao (E-mail)
> Cc: Ips (E-mail)
> Subject: RE: iSCSI Reqts: In-Order Delivery
> 
> 
> Charles,
> 
> Your solution requires a fair amount of tracking of commands based solely
> on their Client Tags.  These Tags are randomly generated but will need to
> retain sequential order for your scheme.  The transport must remember the
> type of command sent together with their relative placement based only on
> the Client Tag.  In addition, these commands will need to be placed into
> different categories:  those commands executed out of sequence by means of
> a bypass flag, those commands that are Task Management commands, and
> commands affected by these other types of commands.  It seems that in
> large part, these concerns can be met with proper handling of the
> transport without such laborious sorting of the Client Tags.  The
> out-of-sequence or bypass flag also depends on the transport sorting the
> Client Tag.  In addition to disabling flow-control, this technique of not
> incrementing the serialization of these commands requires all commands
> with the same serialization value to be sent on the same connection
> without acknowledgment, if these commands are also to be kept in
> sequence.  This connection requirement is yet to be specified.
> 
> Ver 6, Pg 12:
>    "iSCSI may avoid delivering some command to the
>    SCSI layer if so required by some prior SCSI or iSCSI action (e.g.,
>    clear task set Task Management request received before all the
>    commands it was supposed to act on)."
> 
> Here, there seem to be expectations of the iSCSI transport interpreting
> the content of the SCSI commands.  How this is done is not obvious.  Is
> the transport expected to generate SCSI responses?
> 
> In addition, although iSCSI presently relies on ACA, there are few
> applications that implement ACA.  It would appear that for iSCSI to work
> with the present protocol, significant application changes are required.
> With the proposal I am suggesting, this is not a problem as all bypassed
> commands are rejected back to the Initiator.  The drivers that implement
> iSCSI will be required to provide handling for these commands that bypass
> other commands.  The amount of information contained in a rejected
> command list should be relatively small and the occasions for such
> Management rare.  Without proper handling of these events, there will be
> 2:00 AM alarm pagers going off.
> 
> Here in the proposal, sorting CmdSN based on LUN values takes place within
> a "Barrier List."  I cannot tell what is implied by these recovery
> instructions.  What is meant by Remove, Release, Drop, Cleanup,
> Placeholder, and ALL?  What is the intended feedback to the initiator for
> this Clean-up?  It would appear the transport works on behalf of the
> target.  In the proposal that I am suggesting, there are no actions
> within the transport on behalf of the target.  All decisions are made
> either by the Target or the Initiator.  None by the transport.
> 
> The concept is simple.  Keep the transport simple.  Do not expect the
> transport to decipher SCSI commands.  Do not expect the transport to
> respond on behalf of the Target.  Do not expect the transport to sort
> pending commands based on LUN value.  Do not expect the transport to
> require SCSI and iSCSI ACA.
> 
> In the case of session wide serialization, what is good for the goose is
> also good for the gander.  It is important from the perspective of
> quickly detecting an error and knowing the server state to also use
> session wide serialization from the server.  The technique of replicating
> Management commands down each connection in addition to changing global
> commands into specific commands already overburdens the set-aside that
> must be made to handle these non-serialized management commands.  My
> proposal eliminates the problem of set-aside resources and loss of server
> state.  Rather than silently rejecting commands out-of-sequence, these
> rejections are reported.  Once done, this feature can be used to extract
> pending commands in a simple and direct manner without burdening the
> transport.
> 
> As attempts are made to support the SCSI architecture, rather than
> increasing the intelligence of the transport, efforts should be made to
> simplify the transport.  The number of fields that the transport must
> manipulate will be met with complexity and non-uniform implementation.
> 
> See:
> http://www.ietf.org/internet-drafts/draft-otis-iscsi-fullack-00.txt
> 
> Ver 6, Pg 92:
>      "N.B. As an alternative to Logout and reissue commands, the
>       initiator MAY instead reset the target and terminate all
>       outstanding commands with a service response indicating
>       Delivery Subsystem Failure. The initiator MUST perform one of
>       the two actions."
> 
> ...
> 

> Ver 6, Pg 93:
>    "The following general mechanism can be used to achieve 
> the effect of
>    ordered delivery for task management commands while enabling the
>    "urgent" delivery that some of them imply and immediate 
> execution of
>    the task management commands without:
> 
>       At Initiator when a relevant task management command is issued:
> 
>          a) if ExpCmdSN is equal to CmdSN skip to step c
>          b) mark all pending commands with a CmdSN field between
>          ExpCmdSN and the current CmdSN and a relevant LUN as
>          candidates for cleanup and retain CmdSN in a "barrier list".
>          c) send the task management command for immediate delivery
>          to the target
> 
>       At initiator when updating ExpCmdSN:
> 
>          a) if the "barrier list" is empty or ExpCmdSN is less than
>          the first entry in the barrier list then skip to step d
>          b) remove the barrier list entry and remove and drop all
>          entries marked for cleanup having a CmdSN field less than
>          ExpCmdSN
>          c) go to step a
>          d) release all queued entries between the old and new
>          ExpCmdSN from the queue
> 
>       At target when receiving a relevant task management command for
>       immediate delivery:
> 
>          a) if ExpCmdSN is equal to CmdSN skip to step c
>          b) mark all pending entries (commands received and
>          placeholders) with a CmdSN field between ExpCmdSN and the
>          current CmdSN as candidates for cleanup and retain CmdSN in
>          a "barrier list" including the referenced LUN (or an ALL
>          marker)
>          c) send the task management command to SCSI for immediate
>          execution
> 
>       At target when updating ExpCmdSN (releasing ordered commands to
>       SCSI):
> 
>          a) if the "barrier list" is empty or ExpCmdSN is less than
>          the first entry in the barrier list then skip to step d
>          b) remove the barrier list entry and remove and drop all
>          entries marked for cleanup and having the same LUN as the
>          barrier entry (any if the barrier is marked ALL) and a CmdSN
>          field less than ExpCmdSN
>          c) go to step a
>          d) release all queued entries between the old and new
>          ExpCmdSN from the queue
> 
>    Note that this scheme will withstand connection recovery."
> 
> Doug
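
The target-side "barrier list" steps quoted above (Ver 6, Pg 93) can be
read in more than one way, as Doug notes. The Python sketch below is one
possible reading only. Its assumptions: CmdSN wraparound is ignored, a
command that arrives late with a CmdSN below an outstanding barrier is
treated as a filled-in "placeholder" and marked for cleanup, and the
names TargetQueue, receive_task_management and advance are hypothetical.

```python
ALL = None  # barrier marker meaning "any LUN"


class TargetQueue:
    """Illustrative model of the target's ordered-delivery queue."""

    def __init__(self, exp_cmd_sn=1):
        self.exp_cmd_sn = exp_cmd_sn
        self.queue = {}      # cmd_sn -> {"lun": lun, "cleanup": bool}
        self.barriers = []   # (cmd_sn, lun or ALL), kept sorted by cmd_sn

    def receive_command(self, cmd_sn, lun):
        # Assumption: a late arrival below an outstanding barrier fills a
        # placeholder that was already a candidate for cleanup.
        marked = any(cmd_sn < barrier_sn for barrier_sn, _ in self.barriers)
        self.queue[cmd_sn] = {"lun": lun, "cleanup": marked}

    def receive_task_management(self, cmd_sn, lun=ALL):
        if self.exp_cmd_sn != cmd_sn:                      # step a
            for sn, entry in self.queue.items():           # step b
                if self.exp_cmd_sn <= sn < cmd_sn:
                    entry["cleanup"] = True
            self.barriers.append((cmd_sn, lun))
            self.barriers.sort(key=lambda b: b[0])
        # step c: the request itself goes to SCSI immediately (not modeled)

    def advance(self, new_exp_cmd_sn):
        """Update ExpCmdSN; return (released, dropped) CmdSN lists."""
        old = self.exp_cmd_sn
        self.exp_cmd_sn = new_exp_cmd_sn
        dropped = []
        # steps a-c: consume every barrier that ExpCmdSN has reached
        while self.barriers and self.exp_cmd_sn >= self.barriers[0][0]:
            _, barrier_lun = self.barriers.pop(0)
            for sn in sorted(self.queue):
                entry = self.queue[sn]
                same_lun = barrier_lun is ALL or barrier_lun == entry["lun"]
                if entry["cleanup"] and same_lun and sn < self.exp_cmd_sn:
                    dropped.append(sn)
                    del self.queue[sn]
        # step d: release what is left between the old and new ExpCmdSN
        released = [sn for sn in sorted(self.queue) if old <= sn < new_exp_cmd_sn]
        for sn in released:
            del self.queue[sn]
        return released, dropped
```

For example, with ExpCmdSN at 1, commands 2 (LUN 1) and 3 (LUN 0)
queued, a task management request carrying CmdSN 4 for LUN 0, and
command 1 (LUN 0) arriving afterwards, advancing ExpCmdSN to 4 drops
commands 1 and 3 and releases command 2 under this reading.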

< remainder deleted >


From owner-ips@ece.cmu.edu  Mon Apr 23 20:46:51 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA15464
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 20:46:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKUtK24045
	for ips-outgoing; Mon, 23 Apr 2001 16:30:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from opus.ece.cmu.edu (root@OPUS.ECE.CMU.EDU [128.2.134.91])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKUoA24028
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:30:50 -0400 (EDT)
Received: from opus.ece.cmu.edu (bassoon@localhost [127.0.0.1])
	by opus.ece.cmu.edu (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id QAA20796
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:30:50 -0400
Message-Id: <200104232030.QAA20796@opus.ece.cmu.edu>
To: ips@ece.cmu.edu
subject: ips / ietf documents
Date: Mon, 23 Apr 2001 16:30:50 -0400
From: Dave Nagle <bassoon@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


I've resent the IETF announcements (the mailer got jammed last night)
to the mailing list.  Also, you'll find the most up to date copies of
the IETF documents at:

 www.ece.cmu.edu/~ips/Docs/docs.html

dave.................


From owner-ips@ece.cmu.edu  Mon Apr 23 20:48:36 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA15490
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 20:48:35 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKTwK23945
	for ips-outgoing; Mon, 23 Apr 2001 16:29:58 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from opus.ece.cmu.edu (root@OPUS.ECE.CMU.EDU [128.2.134.91])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKTHA23910
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:29:17 -0400 (EDT)
Received: from opus.ece.cmu.edu (bassoon@localhost [127.0.0.1])
	by opus.ece.cmu.edu (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id QAA20718
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:29:17 -0400
Message-Id: <200104232029.QAA20718@opus.ece.cmu.edu>
To: ips@ece.cmu.edu
Subject: iSCSI IETF annoucement
Date: Mon, 23 Apr 2001 16:29:17 -0400
From: Dave Nagle <bassoon@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


- --NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: Definitions of Managed Objects for iSCSI
	Author(s)	: M. Bakke
	Filename	: draft-ietf-ips-iscsi-mib-00.txt
	Pages		: 120
	Date		: 20-Apr-01
	
This memo defines a portion of the Management Information Base (MIB)
for use with network management protocols in TCP/IP based internets.
In particular it defines objects for managing a client using the
iSCSI (SCSI over TCP) protocol.  It is meant to match the latest
version of iSCSI defined in [ISCSI].

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-mib-00.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-iscsi-mib-00.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-iscsi-mib-00.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

- --NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

- --OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154216.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-iscsi-mib-00.txt

- --OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-iscsi-mib-00.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154216.I-D@ietf.org>

- --OtherAccess--

- --NextPart--



------- End of Forwarded Message



From owner-ips@ece.cmu.edu  Mon Apr 23 20:49:07 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA15501
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 20:49:06 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKS1U23829
	for ips-outgoing; Mon, 23 Apr 2001 16:28:01 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from opus.ece.cmu.edu (root@OPUS.ECE.CMU.EDU [128.2.134.91])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKRdA23808
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:27:39 -0400 (EDT)
Received: from opus.ece.cmu.edu (bassoon@localhost [127.0.0.1])
	by opus.ece.cmu.edu (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id QAA20514
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:27:39 -0400
Message-Id: <200104232027.QAA20514@opus.ece.cmu.edu>
To: ips@ece.cmu.edu
Subject: iSCSI IETF annoucement
Date: Mon, 23 Apr 2001 16:27:39 -0400
From: Dave Nagle <bassoon@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iSNS Internet Storage Name Service
	Author(s)	: K. Gibbons
	Filename	: draft-ietf-ips-isns-02.txt
	Pages		: 74
	Date		: 20-Apr-01
	
This document provides a generic framework centering around use of
the iSNS for discovery and management of storage entities in an
enterprise-scale IP storage network.  iSNS is an application that
stores client attributes and monitors the availability and
reachability of storage assets in an integrated IP storage network.
Due to its role as a consolidated information repository, iSNS
provides for more efficient and scalable management of IP storage
assets.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-isns-02.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-isns-02.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-isns-02.txt".
	



------- End of Forwarded Message



From owner-ips@ece.cmu.edu  Mon Apr 23 21:04:27 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA15613
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 21:04:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NNC2K04695
	for ips-outgoing; Mon, 23 Apr 2001 19:12:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NNBxA04686
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 19:11:59 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRQFC>; Mon, 23 Apr 2001 16:11:53 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173458@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: ips@ece.cmu.edu
Subject: RE: iSCSI Target Reset
Date: Mon, 23 Apr 2001 16:11:53 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

These seem to be implementation decisions. I don't see how that justifies
removing support from the protocol.

Charles

> -----Original Message-----
> From: John Hufferd [mailto:hufferd@us.ibm.com]
> Sent: Monday, April 23, 2001 2:34 PM
> To: Santosh Rao
> Cc: ips@ece.cmu.edu
> Subject: Re: iSCSI Target Reset
> 
> 
> 
> Absolutely not.  Why would we think that impacting 32 different other
> initiators is an OK thing to do?  By the way, there are lots more
> Initiators possible with FC on Shark, and I would hope that there would
> be even more with iSCSI.
> 
> I have been told that these large Storage Controllers do not support
> Target Reset today.  So I see no loss in not supporting such an item in
> iSCSI, especially since many Initiators will be beyond even the distances
> and mischief that is possible with FC.
> 
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
> 
> 
> Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001 
> 01:24:02 PM
> 
> Sent by:  owner-ips@ece.cmu.edu
> 
> 
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: iSCSI Target Reset
> 
> 
> 
> "Dillard, David" wrote:
> >
> > When will STORPORT be generally available?  The latest STORPORT document
> > that I found on the MS web site is version 0.6a, dated March 18, 2001.
> > Given this it seems like STORPORT might not be available soon.  In that
> > case do you know what happens with the current drivers?  Are we going to
> > be telling customers that if they want to use iSCSI and NT clustering
> > they have to update to Whistler?
> 
> 
> [One would hope that this list does not turn into a Microsoft
> release/product discussion mailing list (?) ]
> 
> Without going into specifics of A certain O.S., does it suffice to
> require that iSCSI not break existing legacy SCSI applications ?
> 
> If the above is a valid requirement, then, knowing that legacy
> applications continue to use SCSI-2 Reserve/Release and the target reset
> as a mechanism of breaking SCSI-2 reservations, shouldn't iSCSI continue
> to support the target reset ?
> 
> - Santosh
> 
> 
> 
> 


From owner-ips@ece.cmu.edu  Mon Apr 23 21:05:08 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA15627
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 21:05:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NNW4G05628
	for ips-outgoing; Mon, 23 Apr 2001 19:32:04 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxbh4.isus.emc.com (mxbh4.isus.emc.com [128.221.10.33])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NNVWA05609
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 19:31:32 -0400 (EDT)
Received: by mxbh4.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <JG3Z0VAG>; Mon, 23 Apr 2001 19:31:27 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A070801549F@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: cmonia@NishanSystems.com, ips@ece.cmu.edu
Subject: RE: iSCSI Target Reset
Date: Mon, 23 Apr 2001 19:31:24 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

I agree with Charles that this is an implementation
issue.  If a Shark wants to reset all 32 adapters
when it receives a Target Reset on one of them, that's
a Shark implementation decision.  It's completely valid
to reset only the adapter that the Target Reset is
received on (common Fibre Channel behavior) or
only the iSCSI target to which the Target Reset is
addressed if there's more than one Target behind  
the adapter.

As for leaving things out of iSCSI - the default modus
operandi should be to put in everything that's described
in SAM2 unless we can convince T10 to take the feature
out of SAM2.  Let's not go deciding to cast things out
of SCSI on T10's behalf.

Thanks,
--David

> -----Original Message-----
> From:	Charles Monia [SMTP:cmonia@nishansystems.com]
> Sent:	Monday, April 23, 2001 7:12 PM
> To:	ips@ece.cmu.edu
> Subject:	RE: iSCSI Target Reset
> 
> Hi:
> 
> These seem to be implementation decisions. I don't see how that justifies
> removing support from the protocol.
> 
> Charles
> 
> > -----Original Message-----
> > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > Sent: Monday, April 23, 2001 2:34 PM
> > To: Santosh Rao
> > Cc: ips@ece.cmu.edu
> > Subject: Re: iSCSI Target Reset
> > 
> > 
> > 
> > Absolutely not.  Why would we think that impacting 32 different other
> > initiators is an OK thing to do?  By the way, there are lots more
> > Initiators possible with FC on Shark, and I would hope that there would
> > be even more with iSCSI.
> > 
> > I have been told that these large Storage Controllers do not support
> > Target Reset today.  So I see no loss in not supporting such an item in
> > iSCSI, especially since many Initiators will be beyond even the
> > distances and mischief that is possible with FC.
> > 
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> > 
> > 
> > Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001 
> > 01:24:02 PM
> > 
> > Sent by:  owner-ips@ece.cmu.edu
> > 
> > 
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  Re: iSCSI Target Reset
> > 
> > 
> > 
> > "Dillard, David" wrote:
> > >
> > > When will STORPORT be generally available?  The latest STORPORT
> > > document that I found on the MS web site is version 0.6a, dated
> > > March 18, 2001.  Given this it seems like STORPORT might not be
> > > available soon.  In that case do you know what happens with the
> > > current drivers?  Are we going to be telling customers that if they
> > > want to use iSCSI and NT clustering they have to update to Whistler?
> > 
> > 
> > [One would hope that this list does not turn into a Microsoft
> > release/product discussion mailing list (?) ]
> > 
> > Without going into specifics of A certain O.S., does it suffice to
> > require that iSCSI not break existing legacy SCSI applications ?
> > 
> > If the above is a valid requirement, then, knowing that legacy
> > applications continue to use SCSI-2 Reserve/Release and the target
> > reset as a mechanism of breaking SCSI-2 reservations, shouldn't iSCSI
> > continue to support the target reset ?
> > 
> > - Santosh
> > 
> > 
> > 
> > 


From owner-ips@ece.cmu.edu  Mon Apr 23 21:07:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA15666
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 21:07:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NNK2705000
	for ips-outgoing; Mon, 23 Apr 2001 19:20:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from maho3msx2.isus.emc.com (maho3msx2.isus.emc.com [128.221.11.32])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NNJYA04986
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 19:19:34 -0400 (EDT)
Received: by maho3msx2.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <28S7XGP8>; Mon, 23 Apr 2001 19:19:28 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A070801549E@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net, marjorie_krueger@hp.com, ips@ece.cmu.edu
Cc: mankin@east.isi.edu, Black_David@emc.com, egrodriguez@lucent.com,
        sob@harvard.edu
Subject: iSCSI reqmts and Ethernet adapters
Date: Mon, 23 Apr 2001 19:19:24 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> This requirements document makes it clear there is expectation of
> modifying Ethernet adapters to support this protocol.  Should this
> required hardware support be made in a general fashion to allow common
> use among other protocols?

There are at least two announced iSCSI products and
an open source driver that do not require any
modifications to existing Ethernet adapters,
so such modifications are clearly not a requirement.

> This hardware requirement is primarily based on two
> requirements, to increase the level of error detection and to allow
> framing.

Error detection (i.e., CRC or checksum) can be done in
software as Doug has frequently pointed out on this list.
I don't think the statement about IPS and TSVWG pursuing
a common error detection algorithm is correct -- while
I'll defer to the ADs (who are the chairs of TSVWG), I
believe TSVWG has significant interest in an improved
checksum (e.g., Adler-32 based on adding 16-bit quantities
instead of 8-bit), whereas IPS intends to use CRCs.
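
As a rough illustration of computing both kinds of check in software,
the sketch below uses Python's standard zlib module. Note the
assumptions: zlib.adler32 is the ordinary byte-wise Adler-32 (not the
16-bit variant mentioned above), and zlib.crc32 is the CRC-32 used by
Ethernet and zip, which is not necessarily the polynomial IPS would
choose. The helper names digests and crc_matches are hypothetical.

```python
import zlib


def digests(payload: bytes):
    """Return (adler32, crc32) of the payload as unsigned 32-bit values."""
    adler = zlib.adler32(payload) & 0xFFFFFFFF
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return adler, crc


def crc_matches(payload: bytes, expected_crc: int) -> bool:
    """Receiver-side check: recompute the CRC and compare it."""
    return (zlib.crc32(payload) & 0xFFFFFFFF) == expected_crc


if __name__ == "__main__":
    data = b"123456789"
    adler, crc = digests(data)
    print(f"adler32={adler:#010x} crc32={crc:#010x}")

    # Flip one bit to show the receiver-side check failing.
    corrupted = bytes([data[0] ^ 0x01]) + data[1:]
    print("intact ok:", crc_matches(data, crc))
    print("corrupted ok:", crc_matches(corrupted, crc))
```

Neither call needs adapter support, which is the point being made here;
whether software computation is fast enough at line rate is a separate
question.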

Framing is optional and being pursued in a layered fashion
as called for by the WG charter.  The instructions in
the WG charter should be sufficient - adding text to the
iSCSI requirements document that introduces otherwise
unneeded dependencies on other protocol specification
efforts (i.e., iFCP, FCIP) is a bad idea.  Heaven help
us if we have to submit all the protocol drafts for the
IPS WG to the IESG in one big bundle - at the very least
the IESG will be annoyed, and annoying the IESG has all
sorts of bad side effects ;-).

I don't see that any change to the iSCSI requirements draft
is needed in either of these areas based on this set of
comments.

Doug also posted a pointer to the wrong draft - I suspect
he meant to point to draft-otis-tcp-framing-00.txt.

--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------


> -----Original Message-----
> From:	Douglas Otis [SMTP:dotis@sanlight.net]
> Sent:	Monday, April 23, 2001 4:15 PM
> To:	KRUEGER,MARJORIE (HP-Roseville,ex1); Ips Reflector (E-mail)
> Cc:	Allison Mankin; David Black; Elizabeth G Rodriguez (Elizabeth);
> Scott Bradner
> Subject:	RE: I-D ACTION:draft-ietf-ips-iscsi-reqmts-03.txt
> 
> Marjorie,
> 
> This requirements document makes it clear there is expectation of
> modifying Ethernet adapters to support this protocol.  Should this
> required hardware support be made in a general fashion to allow common
> use among other protocols?  This hardware requirement is primarily based
> on two requirements, to increase the level of error detection and to
> allow framing.  Presently, IETF supports a framing protocol that also
> increases the level of error detection.
> 
> Presently TSVWG and IPS are working on a common error detection algorithm.
> In addition, there are two other protocols expecting hardware for framing
> and error detection:  iFCP and FCIP.
> 
> See:
> http://www.ietf.org/internet-drafts/draft-ietf-ips-fcencapsulation-00.txt
> 
> It is possible to have all these protocols use the same error detection
> and framing.  If this MUST be done using TCP, as this requirements
> document demands, then here is a possible general-purpose header that
> would allow common use of hardware and an easy transition into SCTP.
> 
> I will be happy to define a mapping from the present protocols into this
> generalized form.  The advantage should be obvious.  One Ethernet adapter
> can handle these various protocols without specialized hardware for each.
> 
> For those wishing to update and route based on encapsulated headers, a
> fix-up field at the end of these headers will allow use of a common error
> scheme using header fix-up.
> 
> Here is an example of how TCP can be made to look like SCTP.
> http://www.ietf.org/internet-drafts/draft-otis-iscsi-fullack-00.txt
> 
> This header could become a TCP option field to allow for negotiation.
> 
> P.S.
> One additional question however.
> 
> On page 18,
>    "The iSCSI protocol document SHOULD NOT define the management
>    architecture for iSCSI within the network infrastructure."
> 
> What does this mean?
> 
> Doug
> 
> 
> > The IP Storage Working group is chartered with developing
> > comprehensive technology to transport block storage data
> > over IP protocols.  This effort includes a protocol to
> > transport the Small Computer Systems Interface (SCSI)
> > protocol over the internet (iSCSI).
> >
> > A URL for this Internet-Draft is:
> > http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-reqmts-03.txt
> >


From owner-ips@ece.cmu.edu  Mon Apr 23 21:24:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA15839
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 21:24:21 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NImtx13144
	for ips-outgoing; Mon, 23 Apr 2001 14:48:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c017.sfo.cp.net (c017-h015.c017.sfo.cp.net [209.228.12.229])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3NImLA13130
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 14:48:22 -0400 (EDT)
Received: (cpmta 4929 invoked from network); 23 Apr 2001 11:48:14 -0700
Received: from sangate-GW.ser.netvision.net.il (HELO sangate.com) (212.143.114.146)
  by smtp.sangate.com (209.228.12.229) with SMTP; 23 Apr 2001 11:48:14 -0700
X-Sent: 23 Apr 2001 18:48:14 GMT
Message-ID: <3AE463A2.2E994455@sangate.com>
Date: Mon, 23 Apr 2001 20:17:22 +0300
From: Mark Mokryn <mark@sangate.com>
Organization: SANgate Systems
X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.2.16-22 i686)
X-Accept-Language: en
MIME-Version: 1.0
CC: ips@ece.cmu.edu
Subject: Re: iSCSI: Re: iSCSI & Linked Commands
References: <NEBBJGDMMLHHCIKHGBEJOELACGAA.dotis@sanlight.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Doug,

Whether or not a device uses relative addressing has nothing to do with
command linking. A LOCATE command moves the head, and a following READ
reads the data. Using command linking, relative addressing, etc. does
not protect the sender of the above commands from another command, sent
by that initiator or any other, which may move the head, and thus cause
the original READ to get the data from the wrong place. There is no
protection between the first and any subsequent commands under command
linking. Thus, the proper way to use LOCATE and READ is with
RESERVE/RELEASE. I assume that it was in order to settle this confusion,
that FC-TAPE decided to ban linked commands.
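
The race described above can be illustrated with a toy model. This is
only a sketch under stated assumptions: the TapeModel class, the
ReservationConflict exception, and the initiator names "A" and "B" are
hypothetical, and the reservation logic is a simplification of SCSI-2
RESERVE/RELEASE rather than a faithful implementation of it.

```python
class ReservationConflict(Exception):
    """Raised when a command comes from an initiator that does not hold the reservation."""


class TapeModel:
    """Toy sequential device: one shared head position, one optional reservation holder."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.head = 0
        self.holder = None  # initiator holding the reservation, if any

    def _check(self, initiator):
        # Any command from a non-holder is rejected while a reservation exists.
        if self.holder is not None and self.holder != initiator:
            raise ReservationConflict(initiator)

    def reserve(self, initiator):
        self._check(initiator)
        self.holder = initiator

    def release(self, initiator):
        # Simplification: a release from a non-holder is silently ignored.
        if self.holder == initiator:
            self.holder = None

    def locate(self, initiator, position):
        self._check(initiator)
        self.head = position

    def read(self, initiator):
        self._check(initiator)
        data = self.blocks[self.head]
        self.head += 1
        return data


tape = TapeModel(["b0", "b1", "b2", "b3"])

# No reservation: B's LOCATE lands between A's LOCATE and A's READ.
tape.locate("A", 2)
tape.locate("B", 0)
unprotected = tape.read("A")  # A wanted block 2 but reads from B's position

# With a reservation: B's LOCATE is rejected, so A's READ sees the right block.
tape.reserve("A")
tape.locate("A", 2)
try:
    tape.locate("B", 0)
    blocked = False
except ReservationConflict:
    blocked = True
protected = tape.read("A")
tape.release("A")
```

In this model nothing about how A issues its two commands (linked or not)
changes the outcome; only the reservation keeps B from moving the head.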

-mark

Douglas Otis wrote:
> 
> Stephen,
> 
> Unlike random access devices, sequential access devices operate with
> relative addressing.  For random access devices, this is a seldom used
> option.  There is a requirement to bind commands together to ensure order of
> execution on these devices.  By popular, you mean not sequential?
> 
> Doug
> 
> > Julian,
> >
> > > According to your logic no FCP implementation can use linked commands?
> > > Is this true for all OS's?  Is it a verified fact or folklore?
> >
> > In my experience it's fact.  I have never used a SCSI stack which both
> > supported AND used linked commands.  Like some others here, I always
> > assumed AIX might :^) Ralph has pointed out that T10 is well aware
> > that the feature is not popular.  There are other ways of
> > accomplishing the same thing that are less likely to blow up in your
> > face.
> >
> > > Is it so also for the new MS StorPort driver?
> >
> > I don't know, but I'd be really surprised if they did use linked
> > commands.  You have to be pretty nuts to rely on a feature that's not
> > even exercised by most SCSI implementations.
> >
> > Steph
> >

-- 
Mark Mokryn      SANgate Systems Inc.      mark@SANgate.com
Phone: +972-9-8919821                Mobile: +972-54-270030
Fax: +972-9-8919449                  http://www.SANgate.com
P.O. Box 1486 41 Hameyasdim St., Even Yehuda 40500 Israel


From owner-ips@ece.cmu.edu  Mon Apr 23 21:37:34 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA16885
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 21:37:33 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKSv223877
	for ips-outgoing; Mon, 23 Apr 2001 16:28:57 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from opus.ece.cmu.edu (root@OPUS.ECE.CMU.EDU [128.2.134.91])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKS6A23839
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:28:06 -0400 (EDT)
Received: from opus.ece.cmu.edu (bassoon@localhost [127.0.0.1])
	by opus.ece.cmu.edu (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id QAA20571
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:28:06 -0400
Message-Id: <200104232028.QAA20571@opus.ece.cmu.edu>
To: ips@ece.cmu.edu
Subject: iSCSI IETF annoucement
Date: Mon, 23 Apr 2001 16:28:05 -0400
From: Dave Nagle <bassoon@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iSCSI Naming and Discovery Requirements
	Author(s)	: M. Bakke
	Filename	: draft-ietf-ips-iscsi-name-disc-01.txt
	Pages		: 32
	Date		: 20-Apr-01
	
This document describes the  iSCSI [7] naming and discovery 
requirements. The  requirements presented in this document have been 
agreed to by the members of  the iSCSI naming and discovery team. This
document complements the iSCSI IETF  draft.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-name-disc-01.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-iscsi-name-disc-01.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-iscsi-name-disc-01.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		



------- End of Forwarded Message



From owner-ips@ece.cmu.edu  Mon Apr 23 21:41:43 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA16923
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 21:41:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKRwq23826
	for ips-outgoing; Mon, 23 Apr 2001 16:27:58 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from opus.ece.cmu.edu (root@OPUS.ECE.CMU.EDU [128.2.134.91])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKRSA23793
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:27:28 -0400 (EDT)
Received: from opus.ece.cmu.edu (bassoon@localhost [127.0.0.1])
	by opus.ece.cmu.edu (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) with ESMTP id QAA20493
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:27:28 -0400
Message-Id: <200104232027.QAA20493@opus.ece.cmu.edu>
To: ips@ece.cmu.edu
Subject: iSCSI IETF annoucement
Date: Mon, 23 Apr 2001 16:27:27 -0400
From: Dave Nagle <bassoon@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: iSCSI Requirements and Design Considerations
	Author(s)	: M. Krueger
	Filename	: draft-ietf-ips-iscsi-reqmts-03.txt
	Pages		: 23
	Date		: 20-Apr-01
	
The IP Storage Working group is chartered with developing comprehensive 
technology to transport block storage data over IP protocols.  This effort includes
a protocol to transport the Small Computer Systems Interface (SCSI) protocol over
the internet (iSCSI).

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-reqmts-03.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-iscsi-reqmts-03.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-iscsi-reqmts-03.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		



------- End of Forwarded Message



From owner-ips@ece.cmu.edu  Mon Apr 23 21:54:04 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA17054
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 21:54:03 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NL3C526073
	for ips-outgoing; Mon, 23 Apr 2001 17:03:12 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from hotmail.com (oe31.law11.hotmail.com [64.4.16.88])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NL2gA26056
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 17:02:43 -0400 (EDT)
Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC;
	 Mon, 23 Apr 2001 14:02:32 -0700
X-Originating-IP: [66.31.72.237]
From: "Eddy Quicksall" <ESQuicksall@hotmail.com>
To: "John Hufferd" <hufferd@us.ibm.com>
Cc: <ips@ece.cmu.edu>
References: <OF82EA9DE5.A9FA747B-ON88256A37.006E0F65@LocalDomain>
Subject: Re: Target Reset
Date: Mon, 23 Apr 2001 17:02:33 -0400
MIME-Version: 1.0
Content-Type: text/plain;	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.3018.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.3018.1300
Message-ID: <OE31FckeB2PcSoN5gfU0000248c@hotmail.com>
X-OriginalArrivalTime: 23 Apr 2001 21:02:32.0720 (UTC) FILETIME=[B7058100:01C0CC38]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Actually, I made an assumption but did not make a suggestion; I only wanted
to bring up the point that there has to be some way to emulate what NT is
doing at this time.

On the current NT MSCS, there will only be 2 hosts on the target. I have
heard that they have a multi-node cluster but I don't know how they do a
challenge there (I suspect that they don't use RESET).

Regarding the number of LUs, NT is limited to a small number of LUs on a
target and a small number of targets on a bus.

Eddy

----- Original Message -----
From: "John Hufferd" <hufferd@us.ibm.com>
To: "Eddy Quicksall" <ESQuicksall@hotmail.com>
Cc: <ips@ece.cmu.edu>
Sent: Monday, April 23, 2001 4:07 PM
Subject: Re: Target Reset


>
> Eddy,
> A target like an IBM Shark or EMC Symmetrix will have thousands of LUs and
> 10s to 100s of Hosts connected to it, and you want to reset the whole
> Target?  I do not think that is a good idea.  Perhaps Task Reset or LU
> reset etc. but not Target Reset.
>
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
>
>
> "Eddy Quicksall" <ESQuicksall@hotmail.com> on 04/23/2001 09:44:05 AM
>
> To:   "Dillard, David" <david_dillard@adaptec.com>, John Hufferd/San
>       Jose/IBM@IBMUS
> cc:   <ips@ece.cmu.edu>
> Subject:  Re: Target Reset
>
>
>
> I am wondering how clustering will work on NT without some sort of reset.
>
> On parallel SCSI, NT will issue a SCSI BUS RESET to break reservations
> during a challenge for the quorum drive.
>
> On iSCSI, there is no full equivalent to the SCSI BUS RESET so I would
> assume the NT driver would have to issue a TARGET RESET to each target
> that it is supporting.
>
> How would you propose this would be done without a TARGET RESET?
>
> Eddy
>
> ----- Original Message -----
> From: "John Hufferd" <hufferd@us.ibm.com>
> To: "Dillard, David" <david_dillard@adaptec.com>
> Cc: <ips@ece.cmu.edu>
> Sent: Sunday, April 22, 2001 3:20 PM
> Subject: RE: Target Reset
>
>
> >
> > This is at least better.  But I do not have the issue of it being vendor
> > unique.  This is a shut down and restart of the complete Target, and will
> > probably be part of the vendors' operator console or their own remote
> > support functions; it is not clear that it needs to be a general
> > management function that works the same on all iSCSI Storage Controllers.
> >
> > Many of the major Storage Controllers do not support this feature today.
> >
> > I do not believe that most SNMP implementations are very secure.  Most
> > folks do not want to have a changeable MIB until they have secure SNMP,
> > and even though there is a version of SNMP that has security features,
> > this has not been well supported.
> >
> > I do NOT think that Target Reset should be in the base iSCSI protocol,
> > but I think it is reasonable to hold this discussion apart from the base
> > protocol document, and the question should be asked if this is a general
> > management function or a vendor specific console or remote support
> > function.
> >
> >
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> >
> >
> > "Dillard, David" <david_dillard@adaptec.com>@ece.cmu.edu on 04/22/2001
> > 08:13:49 AM
> >
> > Sent by:  owner-ips@ece.cmu.edu
> >
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  RE: Target Reset
> >
> >
> >
> > John,
> >
> > I understand the danger of issuing a target reset and I agree that it
> > should not be a part of an initiator's normal error recovery procedure.
> > However, looking at this from a management perspective I'd like to see a
> > standardized way of resetting a target.  I don't want to see a variety of
> > vendor unique methods of resetting targets sprout up.
> >
> > If resetting a target using the protocol is not desirable from your
> > perspective would incorporating this feature into the MIB be acceptable?
> > (MIBs are for management after all)
> >
> > ----------------------------------------------------------------
> > David Dillard                          david_dillard@adaptec.com
> > Management Software Group
> > Adaptec, Inc.                          www.adaptec.com
> >
> >
> >
> >
> > -----Original Message-----
> > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > Sent: Sunday, April 22, 2001 4:59 AM
> > To: ips@ece.cmu.edu
> > Subject: Target Reset
> >
> >
> > I thought we had a number of discussions previously about Target Reset
> > (Warm or Cold).  I thought there was general feeling that this command is
> > so dangerous that it should not be supported by iSCSI.  The long distance
> > capability of iSCSI makes the risks involved unmanageable.  There should
> > only be an Admin way to do this.
> >
> > Some folks have said that we could permit it and have special
> > authorization etc.  This would probably require a separate section in the
> > spec. to define the authorization approach, and whatever other security is
> > needed to prevent this from being used inappropriately.  All for what
> > purpose?  This cannot be part of error recovery from a normal initiator.
> > The widespread effect is too great for that.
> >
> > I would like to hear from the list about their feeling on this item.
> >
> >
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> >
> >
> >
> >
>
>
>
>


From owner-ips@ece.cmu.edu  Mon Apr 23 22:25:33 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA17255
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 22:25:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3O0K9r07903
	for ips-outgoing; Mon, 23 Apr 2001 20:20:09 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from zmamail04.zma.compaq.com (zmamail04.zma.compaq.com [161.114.64.104])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3O0JqA07884
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 20:19:52 -0400 (EDT)
Received: by zmamail04.zma.compaq.com (Postfix, from userid 12345)
	id 0A3D55F85; Mon, 23 Apr 2001 20:19:47 -0400 (EDT)
Received: from exchou-gh02.cca.cpqcorp.net (exchou-gh02.cca.cpqcorp.net [16.110.248.202])
	by zmamail04.zma.compaq.com (Postfix) with ESMTP
	id AA21D5F7A; Mon, 23 Apr 2001 20:19:46 -0400 (EDT)
Received: by exchou-gh02.cca.cpqcorp.net with Internet Mail Service (5.5.2652.78)
	id <J234J4PQ>; Mon, 23 Apr 2001 19:19:46 -0500
Message-ID: <78AF3C342AEAEF4BA33B35A8A15668C659C179@cceexc17.americas.cpqcorp.net>
From: "Elliott, Robert" <Robert.Elliott@compaq.com>
To: ips@ece.cmu.edu, "'Black_David@emc.com'" <Black_David@emc.com>,
        "'cmonia@nishansystems.com'" <cmonia@NishanSystems.com>
Subject: RE: iSCSI Target Reset
Date: Mon, 23 Apr 2001 19:19:42 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2652.78)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

It's not a very good situation when each device chooses the
interpretation of TARGET RESET that it thinks is appropriate.  
IBM's Shark might choose a different hack than Compaq's
RA8000.  How is software supposed to make sense of this?

The rule in SAM-2 is that TARGET RESET resets every logical
unit (subject to access controls, if implemented).  The fact
is that in Fibre Channel, not many multi-LUN multi-port
targets followed that rule.  The result is that software 
cannot tell what's going to happen and may have to handle
targets from each vendor differently.  This is not very 
interoperable.

Charles suggested in the T10 meeting that we allow it to 
be implemented as a no-op rather than let protocols drop 
support for it.  That doesn't help software like Windows that 
does expect certain effects - a no-op implementation 
would break clustering.  By removing it from the protocol, 
software is forced to find a suitable replacement (e.g. use 
LOGICAL UNIT RESET or switch to persistent reservations).
In Windows, this can be done at the port driver level
(STORPORT improving on SCSIPORT) or at the miniport level
(convert each target reset request into multiple LOGICAL UNIT
RESETs).
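
As a rough illustration of the miniport-level option, the sketch below
fans one target reset request out into per-LU resets over only the LUs
a host can see. Everything in it is a toy model: the LogicalUnit class,
target_reset_via_lu_resets(), and the host names are hypothetical, and
the assumed effect of LOGICAL UNIT RESET (queued tasks dropped, SCSI-2
style reservation cleared) is a simplification, not the SAM-2 wording.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LogicalUnit:
    lun: int
    reserved_by: Optional[str] = None  # holder of a SCSI-2 style reservation
    queued_tasks: List[str] = field(default_factory=list)

    def logical_unit_reset(self) -> None:
        # Toy effect of LOGICAL UNIT RESET: drop queued tasks and the reservation.
        self.queued_tasks.clear()
        self.reserved_by = None


def target_reset_via_lu_resets(visible_lus: List[LogicalUnit]) -> List[int]:
    """Convert one target reset request into one LU reset per visible LU."""
    reset_luns = []
    for lu in visible_lus:
        lu.logical_unit_reset()
        reset_luns.append(lu.lun)
    return reset_luns


# Hypothetical target with four LUs; the cluster host is only given LUs 0 and 1.
target = [
    LogicalUnit(0, "nodeA", ["write"]),
    LogicalUnit(1),
    LogicalUnit(2),
    LogicalUnit(3, "other-host", ["read"]),
]
cluster_view = target[:2]

reset_luns = target_reset_via_lu_resets(cluster_view)
```

In this model the reservation on LU 0 is broken, which is what a quorum
challenge needs, while LU 3 and its unrelated host are left untouched.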

Note that other protocols like NFS and HTTP over IP don't
seem to have "server resets."

The recent SAM-2 change was expressly designed to encourage 
iSCSI and SRP to drop support for TARGET RESET.  Please don't 
keep it because you think T10 would be offended :-)

---
Rob Elliott, Compaq Server Storage
Robert.Elliott@compaq.com






> -----Original Message-----
> From: Black_David@emc.com [mailto:Black_David@emc.com]
> Sent: Monday, April 23, 2001 6:31 PM
> To: cmonia@NishanSystems.com; ips@ece.cmu.edu
> Subject: RE: iSCSI Target Reset
> 
> 
> I agree with Charles that this is an implementation
> issue.  If a Shark wants to reset all 32 adapters
> when it receives a Target Reset on one of them, that's
> a Shark implementation decision.  It's completely valid
> to reset only the adapter that the Target Reset is
> received on (common Fibre Channel behavior) or
> only the iSCSI target to which the Target Reset is
> addressed if there's more than one Target behind  
> the adapter.
> 
> As for leaving things out of iSCSI - the default modus
> operandi should be to put in everything that's described
> in SAM2 unless we can convince T10 to take the feature
> out of SAM2.  Let's not go deciding to cast things out
> of SCSI on T10's behalf.
> 
> Thanks,
> --David
> 
> > -----Original Message-----
> > From:	Charles Monia [SMTP:cmonia@nishansystems.com]
> > Sent:	Monday, April 23, 2001 7:12 PM
> > To:	ips@ece.cmu.edu
> > Subject:	RE: iSCSI Target Reset
> > 
> > Hi:
> > 
> > These seem to be implementation decisions. I don't see how that justifies
> > removing support from the protocol.
> > 
> > Charles
> > 
> > > -----Original Message-----
> > > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > > Sent: Monday, April 23, 2001 2:34 PM
> > > To: Santosh Rao
> > > Cc: ips@ece.cmu.edu
> > > Subject: Re: iSCSI Target Reset
> > > 
> > > 
> > > 
> > > Absolutely not.  Why would we think that impacting 32 different other
> > > initiators is an OK thing to do?  By the way, there are lots more
> > > Initiators possible with FC on Shark, and I would hope that there would
> > > be even more with iSCSI.
> > > 
> > > I have been told that these large Storage Controllers do not support
> > > Target Reset today.  So I see no loss in not supporting such an item in
> > > iSCSI, especially since many Initiators will be beyond even the
> > > distances and mischief that is possible with FC.
> > > 
> > > .
> > > .
> > > .
> > > John L. Hufferd
> > > Senior Technical Staff Member (STSM)
> > > IBM/SSG San Jose Ca
> > > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > > Internet address: hufferd@us.ibm.com
> > > 
> > > 
> > > Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001 
> > > 01:24:02 PM
> > > 
> > > Sent by:  owner-ips@ece.cmu.edu
> > > 
> > > 
> > > To:   ips@ece.cmu.edu
> > > cc:
> > > Subject:  Re: iSCSI Target Reset
> > > 
> > > 
> > > 
> > > "Dillard, David" wrote:
> > > >
> > > > When will STORPORT be generally available?  The latest STORPORT
> > > > document that I found on the MS web site is version 0.6a, dated
> > > > March 18, 2001.  Given this it seems like STORPORT might not be
> > > > available soon.  In that case do you know what happens with the
> > > > current drivers?  Are we going to be telling customers that if they
> > > > want to use iSCSI and NT clustering they have to update to Whistler?
> > > 
> > > 
> > > [One would hope that this list does not turn into a Microsoft
> > > release/product discussion mailing list (?) ]
> > > 
> > > Without going into specifics of A certain O.S., does it suffice to
> > > require that iSCSI not break existing legacy SCSI applications ?
> > > 
> > > If the above is a valid requirement, then, knowing that legacy
> > > applications continue to use SCSI-2 Reserve/Release and the target
> > > reset as a mechanism of breaking SCSI-2 reservations, shouldn't iSCSI
> > > continue to support the target reset ?
> > > 
> > > - Santosh
> > > 
> > > 
> > > 
> > > 
> 


From owner-ips@ece.cmu.edu  Mon Apr 23 23:18:40 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA18488
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 23:18:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3O0s9H09388
	for ips-outgoing; Mon, 23 Apr 2001 20:54:09 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3O0rxA09380
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 20:53:59 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel2.hp.com (Postfix) with ESMTP
	id 047551178; Mon, 23 Apr 2001 17:47:22 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id RAA05667;
	Mon, 23 Apr 2001 17:53:54 -0700 (PDT)
Message-ID: <3AE4CE6D.56ECFA64@cup.hp.com>
Date: Mon, 23 Apr 2001 17:53:01 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: John Hufferd <hufferd@us.ibm.com>
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI Target Reset
References: <OF1FDE19E2.A3D61BC6-ON88256A37.007604FE@LocalDomain>
Content-Type: multipart/mixed;
 boundary="------------F04CD1612172AB99084124F3"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------F04CD1612172AB99084124F3
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Such large targets are also capable of features like target zoning, which
can restrict the impact caused by host[s] using the target reset.  Host
environments that deploy such forms of recovery are zoned to avoid
disrupting I/O activity from other OSes.  This restricts the scope of
error recovery while not breaking legacy apps dependent on target reset.

I'm aware of at least 2 high end FC arrays (one being the EMC Symmetrix)
[and several mid-range arrays] that do support the target reset. 

- Santosh



John Hufferd wrote:
> 
> Absolutely not.  Why would we think that impacting 32 different other
> initiators is an OK thing to do?  By the way, there are lots more Initiators
> possible with FC on Shark, and I would hope that there would be even more
> with iSCSI.
> 
> I have been told that these large Storage Controllers do not support Target
> Reset today.  So I see no loss in not supporting such an item in iSCSI
> especially since many Initiators will be beyond even the distances and
> mischief that is possible with FC.
> 
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
> 
> Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001 01:24:02 PM
> 
> Sent by:  owner-ips@ece.cmu.edu
> 
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: iSCSI Target Reset
> 
> "Dillard, David" wrote:
> >
> > When will STORPORT be generally available?  The latest STORPORT document
> > that I found on the MS web site is version 0.6a, dated March 18, 2001.
> > Given this it seems like STORPORT might not be available soon.  In that
> > case do you know what happens with the current drivers?  Are we going to
> > be telling customers that if they want to use iSCSI and NT clustering
> > they have to update to Whistler?
> 
> [One would hope that this list does not turn into a Microsoft
> release/product discussion mailing list (?) ]
> 
> Without going into specifics of A certain O.S., does it suffice to
> require that iSCSI not break existing legacy SCSI applications ?
> 
> If the above is a valid requirement, then, knowing that legacy
> applications continue to use SCSI-2 Reserve/Release and the target reset
> as a mechanism of breaking SCSI-2 reservations, shouldn't iSCSI continue
> to support the target reset ?
> 
> - Santosh
--------------F04CD1612172AB99084124F3
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------F04CD1612172AB99084124F3--



From owner-ips@ece.cmu.edu  Mon Apr 23 23:48:03 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA19075
	for <ips-archive@odin.ietf.org>; Mon, 23 Apr 2001 23:47:58 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NLD3V26662
	for ips-outgoing; Mon, 23 Apr 2001 17:13:03 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NLC9A26624
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 17:12:09 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id RAA26948;
	Mon, 23 Apr 2001 17:04:44 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id PAA169182;
	Mon, 23 Apr 2001 15:12:07 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: Re: Target Reset
To: "Eddy Quicksall" <ESQuicksall@hotmail.com>
Cc: <ips@ece.cmu.edu>
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OF76CCB0F2.2BE34D8D-ON88256A37.0074321C@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Mon, 23 Apr 2001 14:11:49 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/23/2001 03:12:04 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Eddy,
All that might be true of NT, but the Shark and Symmetrix have many other
clusters and non-clusters connected to them.  The Shark, for example, has 32
different SCSI connections.  A target reset will affect every one, not just
a specific NT cluster.

.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


"Eddy Quicksall" <ESQuicksall@hotmail.com> on 04/23/2001 02:02:33 PM

To:   John Hufferd/San Jose/IBM@IBMUS
cc:   <ips@ece.cmu.edu>
Subject:  Re: Target Reset



Actually, I made an assumption but did not make a suggestion; I only wanted
to bring up the point that there has to be some way to emulate what NT is
doing at this time.

On the current NT MSCS, there will only be 2 hosts on the target. I have
heard that they have a multi-node cluster but I don't know how they do a
challenge there (I suspect that they don't use RESET).

Regarding the number of LUs, NT is limited to a small number of LUs on a
target and a small number of targets on a bus.

Eddy

----- Original Message -----
From: "John Hufferd" <hufferd@us.ibm.com>
To: "Eddy Quicksall" <ESQuicksall@hotmail.com>
Cc: <ips@ece.cmu.edu>
Sent: Monday, April 23, 2001 4:07 PM
Subject: Re: Target Reset


>
> Eddy,
> A target like an IBM Shark or EMC Symmetrix will have thousands of LUs and
> 10s to 100s of Hosts connected to it, and you want to reset the whole
> Target?  I do not think that is a good idea.  Perhaps Task Reset or LU
> reset etc. but not Target Reset.
>
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
>
>
> "Eddy Quicksall" <ESQuicksall@hotmail.com> on 04/23/2001 09:44:05 AM
>
> To:   "Dillard, David" <david_dillard@adaptec.com>, John Hufferd/San
>       Jose/IBM@IBMUS
> cc:   <ips@ece.cmu.edu>
> Subject:  Re: Target Reset
>
>
>
> I am wondering how clustering will work on NT without some sort of reset.
>
> On parallel SCSI, NT will issue a SCSI BUS RESET to break reservations
> during a challenge for the quorum drive.
>
> On iSCSI, there is no full equivalent to the SCSI BUS RESET so I would
> assume the NT driver would have to issue a TARGET RESET to each target
> that it is supporting.
>
> How would you propose this would be done without a TARGET RESET?
>
> Eddy
>
> ----- Original Message -----
> From: "John Hufferd" <hufferd@us.ibm.com>
> To: "Dillard, David" <david_dillard@adaptec.com>
> Cc: <ips@ece.cmu.edu>
> Sent: Sunday, April 22, 2001 3:20 PM
> Subject: RE: Target Reset
>
>
> >
> > This is at least better.  But I do not have the issue of it being vendor
> > unique.  This is a shut down and restart of the complete Target, and will
> > probably be part of the vendors' operator console or their own remote
> > support functions; it is not clear that it needs to be a general
> > management function that works the same on all iSCSI Storage Controllers.
> >
> > Many of the major Storage Controllers do not support this feature today.
> >
> > I do not believe that most SNMP implementations are very secure.  Most
> > folks do not want to have a changeable MIB until they have secure SNMP,
> > and even though there is a version of SNMP that has security features,
> > this has not been well supported.
> >
> > I do NOT think that Target Reset should be in the base iSCSI protocol,
> > but I think it is reasonable to hold this discussion apart from the base
> > protocol document, and the question should be asked if this is a general
> > management function or a vendor specific console or remote support
> > function.
> >
> >
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> >
> >
> > "Dillard, David" <david_dillard@adaptec.com>@ece.cmu.edu on 04/22/2001
> > 08:13:49 AM
> >
> > Sent by:  owner-ips@ece.cmu.edu
> >
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  RE: Target Reset
> >
> >
> >
> > John,
> >
> > I understand the danger of issuing a target reset and I agree that it
> > should not be a part of an initiator's normal error recovery procedure.
> > However, looking at this from a management perspective I'd like to see a
> > standardized way of resetting a target.  I don't want to see a variety of
> > vendor unique methods of resetting targets sprout up.
> >
> > If resetting a target using the protocol is not desirable from your
> > perspective would incorporating this feature into the MIB be acceptable?
> > (MIBs are for management after all)
> >
> > ----------------------------------------------------------------
> > David Dillard                          david_dillard@adaptec.com
> > Management Software Group
> > Adaptec, Inc.                          www.adaptec.com
> >
> >
> >
> >
> > -----Original Message-----
> > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > Sent: Sunday, April 22, 2001 4:59 AM
> > To: ips@ece.cmu.edu
> > Subject: Target Reset
> >
> >
> > I thought we had a number of discussions previously about Target Reset
> > (Warm or Cold).  I thought there was general feeling that this command is
> > so dangerous that it should not be supported by iSCSI.  The long distance
> > capability of iSCSI makes the risks involved unmanageable.  There should
> > only be an Admin way to do this.
> >
> > Some folks have said that we could permit it and have special
> > authorization etc.  This would probably cause a separate section in the
> > spec. to define the authorization approach, and whatever other security is
> > needed to prevent this from inappropriately being used.  All for what
> > purpose?  This cannot be part of error recovery from a normal initiator.
> > The widespread effect is too great for that.
> >
> > I would like to hear from the list about their feeling on this item.
> >
> >
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> >
> >
> >
> >
>
>
>
>





From owner-ips@ece.cmu.edu  Tue Apr 24 01:29:00 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA22945
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 01:28:55 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3O3bDB16866
	for ips-outgoing; Mon, 23 Apr 2001 23:37:13 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3O3aWA16835
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 23:36:32 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRQRW>; Mon, 23 Apr 2001 20:36:21 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B17345A@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: ips@ece.cmu.edu
Cc: Charles Monia <cmonia@NishanSystems.com>
Subject: RE: iSCSI Target Reset
Date: Mon, 23 Apr 2001 20:36:20 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

Sorry to reopen old issues, but unfortunately that seems necessary since
something has been lost in translation.

My concerns about TARGET RESET center on three areas:

1.  Adverse side effects at the transport layer that could affect other
    users.
2.  Similar adverse side effects on the affected devices.
3.  The impact on legacy software.

The gist of my opinion on the first issue, (as expressed on the T10
reflector) is as follows:

"> > > > The ... issue is whether mechanisms, such as target reset,
> > > > are appropriate for a given transport.  In my view, the only
> > > > immutable requirement is to preserve the transport-independent
> > > > part of the semantics.  The definition of transport-specific
> > > > side effects is best handled in the appropriate transport
> > > > specification."

The point of the above is that, in my view, a protocol specification has
leeway to define the protocol-specific side effects in a rational and benign
manner provided that the observable effects on the attached devices are
preserved.

Regarding adverse device-level side effects, I also stated that:

"....restricting the operation [target reset] means providing hooks so that
only a trusted class of initiators can perform the function.  It's a bit like
controlling access to a file so that lots of users can read it but only a
trusted few can perform a write or delete operation."

Finally, with regard to legacy implementations, my main intent was to avoid
the situation where an operation that was previously legal becomes illegal.
In that regard, I also recall suggesting that the function be treated as a
LUN RESET broadcast to all the logical units to which the initiator had
access privileges.  That would also support the notion of preventing adverse
effects on other users.
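
To make the "LUN RESET broadcast" reading concrete, here is a rough sketch of
how a target might apply it. This is only an illustration under stated
assumptions: the LogicalUnit class, the per-LU access set, and the function
name are hypothetical and do not come from SAM-2 or any iSCSI draft.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalUnit:
    lun: int
    # Initiator names allowed to access this LU (hypothetical access model).
    allowed: set = field(default_factory=set)
    reset_count: int = 0

    def reset(self) -> None:
        # Stand-in for a real LOGICAL UNIT RESET (abort tasks, clear
        # reservations, and so on); here it only counts invocations.
        self.reset_count += 1


def target_reset_as_lu_broadcast(initiator: str, units: list) -> list:
    """Reset only the LUs the requesting initiator may access.

    Returns the LUNs that were reset, so LUs owned by other hosts are
    visibly left untouched.
    """
    affected = []
    for lu in units:
        if initiator in lu.allowed:
            lu.reset()
            affected.append(lu.lun)
    return affected


units = [
    LogicalUnit(0, {"nt-node-a", "nt-node-b"}),
    LogicalUnit(1, {"nt-node-a"}),
    LogicalUnit(2, {"other-host"}),
]
print(target_reset_as_lu_broadcast("nt-node-a", units))  # [0, 1]
```

LU 2, which belongs to a different host, is never reset, which is the point of
restricting the broadcast to the requester's access privileges.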

At any rate, since there is a proposal to make LUN RESET mandatory (see
below), I believed this was a reasonable and consistent alternative. I
therefore felt (and continue to believe) there is no justification for
making the function optional.

Anyhow, these views did not prevail at the meeting in question. For that
reason, the so-called NOP proposal was made as a last-ditch effort to
accommodate legacy implementations by preserving a measure of backwards
compatibility (albeit token compatibility). In that regard, it seemed the
lesser evil.

> > As for leaving things out of iSCSI - the default modus
> > operandi should be to put in everything that's described
> > in SAM2 unless we can convince T10 to take the feature
> > out of SAM2.  Let's not go deciding to cast things out
> > of SCSI on T10's behalf.

So, I guess that anyone wishing to support a change in the spec is, of
course, free to pursue it in that forum. 

Charles

PS: The proposal referenced above can be found at:
ftp://ftp.t10.org/t10/document.01/01-015r2.pdf

> -----Original Message-----
> From: Elliott, Robert [mailto:Robert.Elliott@COMPAQ.com]
> Sent: Monday, April 23, 2001 5:20 PM
> To: ips@ece.cmu.edu; 'Black_David@emc.com'; 'cmonia@nishansystems.com'
> Subject: RE: iSCSI Target Reset
> 
> 
> It's not a very good situation when each device chooses the
> interpretation of TARGET RESET that it thinks is appropriate.  
> IBM's Shark might choose a different hack than Compaq's
> RA8000.  How is software supposed to make sense of this?
> 
> The rule in SAM-2 is that TARGET RESET resets every logical
> unit (subject to access controls, if implemented).  The fact
> is that in Fibre Channel, not many multi-LUN multi-port
> targets followed that rule.  The result is that software 
> cannot tell what's going to happen and may have to handle
> targets from each vendor differently.  This is not very 
> interoperable.
> 
> Charles suggested in the T10 meeting that we allow it to 
> be implemented as a no-op rather than let protocols drop 
> support for it.  That doesn't help software like Windows that 
> does expect certain effects - a no-op implementation 
> would break clustering.  By removing it from the protocol, 
> software is forced to find a suitable replacement (e.g. use 
> LOGICAL UNIT RESET or switch to persistent reservations).
> In Windows, this can be done at the port driver level
> (STORPORT improving on SCSIPORT) or at the miniport level
> (convert each target reset request into multiple LOGICAL UNIT
> RESETs).
> 
> Note that other protocols like NFS and HTTP over IP don't
> seem to have "server resets."
> 
> The recent SAM-2 change was expressly designed to encourage 
> iSCSI and SRP to drop support for TARGET RESET.  Please don't 
> keep it because you think T10 would be offended :-)
> 
> ---
> Rob Elliott, Compaq Server Storage
> Robert.Elliott@compaq.com
> 
> 
> 
> 
> 
> 
> > -----Original Message-----
> > From: Black_David@emc.com [mailto:Black_David@emc.com]
> > Sent: Monday, April 23, 2001 6:31 PM
> > To: cmonia@NishanSystems.com; ips@ece.cmu.edu
> > Subject: RE: iSCSI Target Reset
> > 
> > 
> > I agree with Charles that this is an implementation
> > issue.  If a Shark wants to reset all 32 adapters
> > when it receives a Target Reset on one of them, that's
> > a Shark implementation decision.  It's completely valid
> > to reset only the adapter that the Target Reset is
> > received on (common Fibre Channel behavior) or
> > only the iSCSI target to which the Target Reset is
> > addressed if there's more than one Target behind  
> > the adapter.
> > 
> > As for leaving things out of iSCSI - the default modus
> > operandi should be to put in everything that's described
> > in SAM2 unless we can convince T10 to take the feature
> > out of SAM2.  Let's not go deciding to cast things out
> > of SCSI on T10's behalf.
> > 
> > Thanks,
> > --David
> > 
> > > -----Original Message-----
> > > From:	Charles Monia [SMTP:cmonia@nishansystems.com]
> > > Sent:	Monday, April 23, 2001 7:12 PM
> > > To:	ips@ece.cmu.edu
> > > Subject:	RE: iSCSI Target Reset
> > > 
> > > Hi:
> > > 
> > > These seem to be implementation decisions. I don't see how that
> > > justifies removing support from the protocol.
> > > 
> > > Charles
> > > 
> > > > -----Original Message-----
> > > > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > > > Sent: Monday, April 23, 2001 2:34 PM
> > > > To: Santosh Rao
> > > > Cc: ips@ece.cmu.edu
> > > > Subject: Re: iSCSI Target Reset
> > > > 
> > > > 
> > > > 
> > > > Absolutely not.  Why would we think that impacting 32 different other
> > > > initiators is an OK thing to do?  By the way, there are lots more
> > > > Initiators possible with FC on Shark, and I would hope that there
> > > > would be even more with iSCSI.
> > > >
> > > > I have been told that these large Storage Controllers do not support
> > > > Target Reset today.  So I see no loss in not supporting such an item
> > > > in iSCSI, especially since many Initiators will be beyond even the
> > > > distances and mischief that is possible with FC.
> > > > 
> > > > .
> > > > .
> > > > .
> > > > John L. Hufferd
> > > > Senior Technical Staff Member (STSM)
> > > > IBM/SSG San Jose Ca
> > > > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > > > Internet address: hufferd@us.ibm.com
> > > > 
> > > > 
> > > > Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001 
> > > > 01:24:02 PM
> > > > 
> > > > Sent by:  owner-ips@ece.cmu.edu
> > > > 
> > > > 
> > > > To:   ips@ece.cmu.edu
> > > > cc:
> > > > Subject:  Re: iSCSI Target Reset
> > > > 
> > > > 
> > > > 
> > > > "Dillard, David" wrote:
> > > > >
> > > > > When will STORPORT be generally available?  The latest STORPORT
> > > > > document that I found on the MS web site is version 0.6a, dated
> > > > > March 18, 2001.  Given this it seems like STORPORT might not be
> > > > > available soon.  In that case do you know what happens with the
> > > > > current drivers?  Are we going to be telling customers that if they
> > > > > want to use iSCSI and NT clustering they have to update to Whistler?
> > > > 
> > > > 
> > > > [One would hope that this list does not turn into a Microsoft
> > > > release/product discussion mailing list (?) ]
> > > > 
> > > > Without going into specifics of A certain O.S., does it suffice to
> > > > require that iSCSI not break existing legacy SCSI applications ?
> > > >
> > > > If the above is a valid requirement, then, knowing that legacy
> > > > applications continue to use SCSI-2 Reserve/Release and the target
> > > > reset as a mechanism of breaking SCSI-2 reservations, shouldn't iSCSI
> > > > continue to support the target reset ?
> > > > 
> > > > - Santosh
> > > > 
> > > > 
> > > > 
> > > > 
> > 
> 


From owner-ips@ece.cmu.edu  Tue Apr 24 02:50:36 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA04267
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 02:50:30 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKIh523227
	for ips-outgoing; Mon, 23 Apr 2001 16:18:43 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKH3A23160
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:17:04 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3NLOf129405;
	Mon, 23 Apr 2001 14:24:42 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "KRUEGER,MARJORIE \(HP-Roseville,ex1\)" <marjorie_krueger@hp.com>,
        "Ips Reflector \(E-mail\)" <ips@ece.cmu.edu>
Cc: "Allison Mankin" <mankin@east.isi.edu>,
        "David Black" <Black_David@emc.com>,
        "Elizabeth G Rodriguez \(Elizabeth\)" <egrodriguez@lucent.com>,
        "Scott Bradner" <sob@harvard.edu>
Subject: RE: I-D ACTION:draft-ietf-ips-iscsi-reqmts-03.txt
Date: Mon, 23 Apr 2001 13:14:48 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJEELICGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <6BD67FFB937FD411A04F00D0B74FE87802A08FF0@xrose06.rose.hp.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Marjorie,

This requirements document makes it clear there is an expectation of modifying
Ethernet adapters to support this protocol.  Should this required hardware
support be made in a general fashion to allow common use among other
protocols?  This hardware requirement is primarily based on two
requirements: to increase the level of error detection and to allow framing.
Presently, IETF supports a framing protocol that also increases the level of
error detection.

Presently TSVWG and IPS are working on a common error detection algorithm.
In addition, there are two other protocols expecting hardware for framing
and error detection: iFCP and FCIP.

See:
http://www.ietf.org/internet-drafts/draft-ietf-ips-fcencapsulation-00.txt

It is possible to have all these protocols use the same error detection and
framing.  If this MUST be done using TCP, as this requirements document
demands, then here is a possible general-purpose header that would allow
common use of hardware and an easy transition into SCTP.

I will be happy to define a mapping from the present protocols into this
generalized form.  The advantage should be obvious.  One Ethernet adapter
can handle these various protocols without specialized hardware for each.

For those wishing to update and route based on encapsulated headers, a
fix-up field at the end of these headers will allow use of a common error
scheme using header fix-up.

Here is an example of how TCP can be made to look like SCTP.
http://www.ietf.org/internet-drafts/draft-otis-iscsi-fullack-00.txt

This header could become a TCP option field to allow for negotiation.

P.S.
One additional question however.

On page 18,
   "The iSCSI protocol document SHOULD NOT define the management
   architecture for iSCSI within the network infrastructure."

What does this mean?

Doug


> The IP Storage Working group is chartered with developing
> comprehensive technology to transport block storage data
> over IP protocols.  This effort includes a protocol to
> transport the Small Computer Systems Interface (SCSI)
> protocol over the internet (iSCSI).
>
> A URL for this Internet-Draft is:
> http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-reqmts-03.txt
>



From owner-ips@ece.cmu.edu  Tue Apr 24 02:54:25 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA04314
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 02:54:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NKL0423382
	for ips-outgoing; Mon, 23 Apr 2001 16:21:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mail.wrs.com (unknown-1-11.wrs.com [147.11.1.11])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NKKHA23304
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 16:20:22 -0400 (EDT)
Received: from london ([147.11.45.217])
	by mail.wrs.com (8.9.3/8.9.1) with SMTP id NAA10138
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 13:19:59 -0700 (PDT)
From: "Rod Harrison" <rod.harrison@windriver.com>
To: <ips@ece.cmu.edu>
Subject: RE: iSCSI : Negotiable padding, was More issues.... Digest related.
Date: Mon, 23 Apr 2001 13:19:44 -0700
Message-ID: <NEBBKMMOEMCINPLCHKGMKEKKCGAA.rod.harrison@windriver.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
In-Reply-To: <C1256A35.005FB9D9.00@d12mta02.de.ibm.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,

	I'm confused. I've never seen the need for padding on the
wire, but somewhere along the line I've gotten the
impression that's what we had decided to do. This being the
case it seems a small step to have the sending side pad to
whatever alignment the receiver wants, especially since most
will want the currently specified value of 4 bytes.

	To answer your question, I believe the end system can
achieve whatever data alignment it requires regardless of
padding on the wire. My request was made on the basis that
if we were going to pad to 4 bytes on the wire anyway we
might as well go the small extra distance and make that
padding negotiable. We can set a default of 4 to be used if
padding is not negotiated.

	So, straighten me out. Are we padding on the wire or not?
And if not, why are we mentioning padding at all? Also, do
we expect to see any data padding if there are no data
digests in use?

	Thanks,

	- Rod

-----Original Message-----
From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu] On Behalf Of
julian_satran@il.ibm.com
Sent: Saturday, April 21, 2001 10:31 AM
To: ips@ece.cmu.edu
Subject: Re: iSCSI : Negotiable padding, was More issues.... Digest related.




Rod,

On-the-wire padding is not required at all, and many of us have resisted
padding up to the advent of markers.

Please do explain what a padding option will do that you can't do by
yourself in the end system with buffer alignment, or how padding on the
wire can stop you from getting buffer alignment all bad.

Julo

"Rod Harrison" <rod.harrison@windriver.com> on 20/04/2001 20:30:14

Please respond to "Rod Harrison" <rod.harrison@windriver.com>

To:   ips@ece.cmu.edu
cc:
Subject:  iSCSI : Negotiable padding, was More issues.... Digest related.




Julian,

     I have a request with respect to data padding. Can we make
the pad size login negotiable please? Preferably on a per
direction basis. This would allow the pad to be optimized
for a receiver's specific requirements, e.g. cache line
alignment. Restricting padding to powers of 2 by specifying
the size as a power of 2 seems reasonable.

     For example:

     IPadSize=<any-power-of-two>
     TPadSize=<any-power-of-two>

     IPadSize=0
     TPadSize=2

     Would result in the initiator padding data to 4 byte (2^2)
boundaries for the target, and the target inserting no pad
(2^0 = 1 byte alignment) for the initiator.
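
     As a rough sketch of the arithmetic this proposal implies (an
illustration only: IPadSize and TPadSize are the proposed key names
above, while the pad_len helper and the reading of each value as an
exponent of 2 are assumptions made for this example):

```python
def pad_len(data_len: int, pad_exp: int) -> int:
    """Bytes of padding needed to align data_len to a 2**pad_exp boundary."""
    if data_len < 0 or pad_exp < 0:
        raise ValueError("lengths and exponents must be non-negative")
    mask = (1 << pad_exp) - 1
    # (-data_len) & mask is the distance up to the next multiple of 2**pad_exp.
    return (-data_len) & mask


# Values from the example above: the target asks for 4 byte alignment,
# the initiator asks for none.
negotiated = {"IPadSize": 0, "TPadSize": 2}

# Data sent initiator -> target is padded per TPadSize.
print(pad_len(5, negotiated["TPadSize"]))  # 3
# Data sent target -> initiator is padded per IPadSize.
print(pad_len(5, negotiated["IPadSize"]))  # 0
```

     A receiver wanting 64 byte cache line alignment would, under the
same assumption, ask for a value of 6.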

     Also, a related question, if the pad is to remain mandatory
is it expected data will be padded if no data digests are in
use?

     - Rod

-----Original Message-----
From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu] On Behalf Of
julian_satran@il.ibm.com
Sent: Friday, April 20, 2001 1:45 AM
To: ips@ece.cmu.edu
Subject: Re: iSCSI : More issues.... Digest related.




1. The padding is to the next 4 byte word boundary.
2. There is a Security - Appendix

and there is a numbering/formatting error in the appendix
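
With the pad fixed at the next 4 byte boundary, the receiver can locate the
data digest from the unpadded length alone. A minimal sketch, assuming
header_len stands for everything that precedes the data segment in the PDU
(the function name and the example value of 48 are hypothetical, not taken
from the draft):

```python
def data_digest_offset(header_len: int, data_segment_len: int) -> int:
    """Offset of the data digest from the start of the PDU.

    Assumes the data segment is padded up to the next 4 byte boundary
    and the digest immediately follows the padded data.
    """
    if header_len < 0 or data_segment_len < 0:
        raise ValueError("lengths must be non-negative")
    padded_len = (data_segment_len + 3) & ~3  # round up to a multiple of 4
    return header_len + padded_len


# 13 bytes of data are padded to 16, so the digest starts at 48 + 16.
print(data_digest_offset(48, 13))  # 64
```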

Julo

Santosh Rao <santoshr@cup.hp.com> on 20/04/2001 03:25:49

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   IPS Reflector <ips@ece.cmu.edu>
cc:
Subject:  iSCSI : More issues.... Digest related.




Julian & All,

2 more issues :

1) If the DataSegmentLength in the BHS excludes padding bytes, how does
the initiator determine the location of the data digest [which is placed
after the padded data] ?
There is no knowledge of what amount of padding is in use, since padding
can be 4 bytes or a multiple of that quantity.


2) While on the subject of digests.....aren't there supposed to be
login keys to indicate the use or non-use of header and data digests ? I
can't seem to find any such login keys in the latest revs
5.91....5.92....6.000...(?)

(Section 2.2.1 states :
"The digest types are negotiated during the login phase. ").

- Santosh








From owner-ips@ece.cmu.edu  Tue Apr 24 03:40:02 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA04581
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 03:40:01 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3O2mBb14605
	for ips-outgoing; Mon, 23 Apr 2001 22:48:11 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from hotmail.com (oe10.law11.hotmail.com [64.4.16.114])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3O2lNA14585
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 22:47:23 -0400 (EDT)
Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC;
	 Mon, 23 Apr 2001 19:47:17 -0700
X-Originating-IP: [66.31.72.237]
From: "Eddy Quicksall" <ESQuicksall@hotmail.com>
To: "John Hufferd" <hufferd@us.ibm.com>
Cc: <ips@ece.cmu.edu>
References: <OF76CCB0F2.2BE34D8D-ON88256A37.0074321C@LocalDomain>
Subject: Re: Target Reset
Date: Mon, 23 Apr 2001 22:47:17 -0400
MIME-Version: 1.0
Content-Type: text/plain;	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.3018.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.3018.1300
Message-ID: <OE10TuHZncJfKHmj9n10000276c@hotmail.com>
X-OriginalArrivalTime: 24 Apr 2001 02:47:17.0029 (UTC) FILETIME=[DFD49550:01C0CC68]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

I think the point is whether or not you should support a TARGET RESET, not
whether or not your system should issue it.

On systems like the ones you mention, you should probably not do a TARGET RESET.

On NT MSCS, you could probably do a LOGICAL UNIT RESET for all known
devices. Since the driver doesn't know which device is the quorum device, it
would have to reset all known devices.

Eddy
----- Original Message -----
From: "John Hufferd" <hufferd@us.ibm.com>
To: "Eddy Quicksall" <ESQuicksall@hotmail.com>
Cc: <ips@ece.cmu.edu>
Sent: Monday, April 23, 2001 5:11 PM
Subject: Re: Target Reset


>
> Eddy,
> All that might be true of NT, but the Shark and Symmetrix have many other
> clusters and non-clusters connected to them.  The Shark, for example, has
> 32 different SCSI connections.  A target reset will affect every one, not
> just a specific NT cluster.
>
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
>
>
> "Eddy Quicksall" <ESQuicksall@hotmail.com> on 04/23/2001 02:02:33 PM
>
> To:   John Hufferd/San Jose/IBM@IBMUS
> cc:   <ips@ece.cmu.edu>
> Subject:  Re: Target Reset
>
>
>
> Actually, I made an assumption but did not make a suggestion; I only
> wanted to bring up the point that there has to be some way to emulate
> what NT is doing at this time.
>
> On the current NT MSCS, there will only be 2 hosts on the target. I have
> heard that they have a multi-node cluster but I don't know how they do a
> challenge there (I suspect that they don't use RESET).
>
> Regarding the number of LUs, NT is limited to a small number of LUs on
> a target and a small number of targets on a bus.
>
> Eddy
>
> ----- Original Message -----
> From: "John Hufferd" <hufferd@us.ibm.com>
> To: "Eddy Quicksall" <ESQuicksall@hotmail.com>
> Cc: <ips@ece.cmu.edu>
> Sent: Monday, April 23, 2001 4:07 PM
> Subject: Re: Target Reset
>
>
> >
> > Eddy,
> > A target like an IBM Shark or EMC Symmetrix will have thousands of LUs
> > and 10s to 100s of Hosts connected to it, and you want to reset the whole
> > Target?  I do not think that is a good idea.  Perhaps Task Reset or LU
> > reset etc. but not Target Reset.
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> >
> >
> > "Eddy Quicksall" <ESQuicksall@hotmail.com> on 04/23/2001 09:44:05 AM
> >
> > To:   "Dillard, David" <david_dillard@adaptec.com>, John Hufferd/San
> >       Jose/IBM@IBMUS
> > cc:   <ips@ece.cmu.edu>
> > Subject:  Re: Target Reset
> >
> >
> >
> > I am wondering how clustering will work on NT without some sort of
> > reset.
> >
> > On parallel SCSI, NT will issue a SCSI BUS RESET to break reservations
> > during a challenge for the quorum drive.
> >
> > On iSCSI, there is no full equivalent to the SCSI BUS RESET so I would
> > assume the NT driver would have to issue a TARGET RESET to each target
> > that it is supporting.
> >
> > How would you propose this would be done without a TARGET RESET?
> >
> > Eddy
> >
> > ----- Original Message -----
> > From: "John Hufferd" <hufferd@us.ibm.com>
> > To: "Dillard, David" <david_dillard@adaptec.com>
> > Cc: <ips@ece.cmu.edu>
> > Sent: Sunday, April 22, 2001 3:20 PM
> > Subject: RE: Target Reset
> >
> >
> > >
> > > This is at least better.  But I do not have the issue of it being vendor
> > > unique.  This is a shut down and restart of the complete Target, and
> > > will probably be part of the vendors' operator console or their own
> > > remote support functions; it is not clear that it needs to be a general
> > > management function that works the same on all iSCSI Storage
> > > Controllers.
> > >
> > > Many of the major Storage Controllers do not support this feature today.
> > >
> > > I do not believe that most SNMP implementations are very secure.  Most
> > > folks do not want to have a changeable MIB until they have secure SNMP,
> > > and even though there is a version of SNMP that has security features,
> > > this has not been well supported.
> > >
> > > I do NOT think that Target Reset should be in the base iSCSI protocol,
> > > but I think it is reasonable to hold this discussion apart from the base
> > > protocol document, and the question should be asked if this is a general
> > > management function or a vendor specific console or remote support
> > > function.
> > >
> > >
> > >
> > > .
> > > .
> > > .
> > > John L. Hufferd
> > > Senior Technical Staff Member (STSM)
> > > IBM/SSG San Jose Ca
> > > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > > Internet address: hufferd@us.ibm.com
> > >
> > >
> > > "Dillard, David" <david_dillard@adaptec.com>@ece.cmu.edu on 04/22/2001
> > > 08:13:49 AM
> > >
> > > Sent by:  owner-ips@ece.cmu.edu
> > >
> > >
> > > To:   ips@ece.cmu.edu
> > > cc:
> > > Subject:  RE: Target Reset
> > >
> > >
> > >
> > > John,
> > >
> > > I understand the danger of issuing a target reset and I agree that it
> > > should not be a part of an initiator's normal error recovery procedure.
> > > However, looking at this from a management perspective I'd like to see
> > > a standardized way of resetting a target.  I don't want to see a
> > > variety of vendor unique methods of resetting targets sprout up.
> > >
> > > If resetting a target using the protocol is not desirable from your
> > > perspective would incorporating this feature into the MIB be
> > > acceptable?  (MIBs are for management after all)
> > >
> > > ----------------------------------------------------------------
> > > David Dillard                          david_dillard@adaptec.com
> > > Management Software Group
> > > Adaptec, Inc.                          www.adaptec.com
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > > Sent: Sunday, April 22, 2001 4:59 AM
> > > To: ips@ece.cmu.edu
> > > Subject: Target Reset
> > >
> > >
> > > I thought we had a number of discussions previously about Target Reset
> > > (Warm or Cold).  I thought there was general feeling that this command
> > > is so dangerous that it should not be supported by iSCSI.  The long
> > > distance capability of iSCSI makes the risks involved unmanageable.
> > > There should only be an Admin way to do this.
> > >
> > > Some folks have said that we could permit it and have special
> > > authorization etc.  This would probably cause a separate section in
> > > the spec. to define the authorization approach, and whatever other
> > > security is needed to prevent this from inappropriately being used.
> > > All for what purpose?  This cannot be part of error recovery from a
> > > normal initiator.  The widespread effect is too great for that.
> > >
> > > I would like to hear from the list about their feeling on this item.
> > >
> > >
> > >
> > > .
> > > .
> > > .
> > > John L. Hufferd
> > > Senior Technical Staff Member (STSM)
> > > IBM/SSG San Jose Ca
> > > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > > Internet address: hufferd@us.ibm.com
> > >
> > >
> > >
> > >
> >
> >
> >
> >
>
>
>
>


From owner-ips@ece.cmu.edu  Tue Apr 24 03:46:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA04597
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 03:46:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3O5rIK22848
	for ips-outgoing; Tue, 24 Apr 2001 01:53:18 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3O5qxA22838
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 01:52:59 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id BAA35802;
	Tue, 24 Apr 2001 01:45:34 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id XAA70126;
	Mon, 23 Apr 2001 23:52:58 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: RE: iSCSI Target Reset
To: Black_David@emc.com
Cc: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OFB3E72E87.D226EAC3-ON88256A38.001F6682@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Mon, 23 Apr 2001 22:52:41 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/23/2001 11:52:54 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


David,
you described the action in FC terms.  In iSCSI terms, I think the Target
Device would be a complete Symmetrix.  Do you think that is correct?  What
does Symmetrix do today to support Target Reset?  And how do you think it
should act in the future?  Surely you do not want the complete controller
reset, do you?

What type of flexibility do you think T10 gives you in this situation?

.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


Black_David@emc.com@ece.cmu.edu on 04/23/2001 04:31:24 PM

Sent by:  owner-ips@ece.cmu.edu


To:   cmonia@NishanSystems.com, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI Target Reset



I agree with Charles that this is an implementation
issue.  If a Shark wants to reset all 32 adapters
when it receives a Target Reset on one of them, that's
a Shark implementation decision.  It's completely valid
to reset only the adapter that the Target Reset is
received on (common Fibre Channel behavior) or
only the iSCSI target to which the Target Reset is
addressed if there's more than one Target behind
the adapter.
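
As a rough illustration of the three reset scopes described above (whole
controller, receiving adapter only, addressed target only), the following
Python sketch computes which initiators would be disturbed under each
policy.  The topology and all names (ResetScope, affected_initiators,
adapter0, tgtA, host1, ...) are hypothetical and are not taken from any
Shark, Symmetrix, SAM2, or iSCSI document.

```python
from enum import Enum


class ResetScope(Enum):
    WHOLE_CONTROLLER = "controller"    # reset every adapter in the box
    RECEIVING_ADAPTER = "adapter"      # reset only the adapter that got the request
    ADDRESSED_TARGET = "target"        # reset only the target that was addressed


# Hypothetical topology: adapter -> {target name -> initiators logged in}
TOPOLOGY = {
    "adapter0": {"tgtA": ["host1", "host2"], "tgtB": ["host3"]},
    "adapter1": {"tgtC": ["host4"]},
}


def affected_initiators(scope, adapter, target, topology=TOPOLOGY):
    """Return the sorted list of initiators disturbed by a Target Reset
    received on `adapter` and addressed to `target`, under `scope`."""
    if scope is ResetScope.WHOLE_CONTROLLER:
        host_lists = [h for targets in topology.values() for h in targets.values()]
    elif scope is ResetScope.RECEIVING_ADAPTER:
        host_lists = list(topology[adapter].values())
    else:
        host_lists = [topology[adapter][target]]
    return sorted(host for hosts in host_lists for host in hosts)


for scope in ResetScope:
    print(scope.name, affected_initiators(scope, "adapter0", "tgtA"))
```

The narrower the scope, the fewer unrelated initiators lose state, which is
the trade-off being argued in this thread.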

As for leaving things out of iSCSI - the default modus
operandi should be to put in everything that's described
in SAM2 unless we can convince T10 to take the feature
out of SAM2.  Let's not go deciding to cast things out
of SCSI on T10's behalf.

Thanks,
--David

> -----Original Message-----
> From:   Charles Monia [SMTP:cmonia@nishansystems.com]
> Sent:   Monday, April 23, 2001 7:12 PM
> To:     ips@ece.cmu.edu
> Subject:     RE: iSCSI Target Reset
>
> Hi:
>
> These seem to be implementation decisions. I don't see how that justifies
> removing support from the protocol.
>
> Charles
>
> > -----Original Message-----
> > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > Sent: Monday, April 23, 2001 2:34 PM
> > To: Santosh Rao
> > Cc: ips@ece.cmu.edu
> > Subject: Re: iSCSI Target Reset
> >
> >
> >
> > Absolutely not.  Why would we think that impacting 32 different other
> > initiators is an OK thing to do?  By the way there are lots more
> > Initiators possible with FC on Shark, and I would hope that there would
> > be even more with iSCSI.
> >
> > I have been told that these large Storage Controllers do not support
> > Target Reset today.  So I see no loss in not supporting such an item in
> > iSCSI, especially since many Initiators will be beyond even the
> > distances and mischief that is possible with FC.
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> >
> >
> > Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001
> > 01:24:02 PM
> >
> > Sent by:  owner-ips@ece.cmu.edu
> >
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  Re: iSCSI Target Reset
> >
> >
> >
> > "Dillard, David" wrote:
> > >
> > > When will STORPORT be generally available?  The latest STORPORT
> > > document that I found on the MS web site is version 0.6a, dated
> > > March 18, 2001.  Given this it seems like STORPORT might not be
> > > available soon.  In that case do you know what happens with the
> > > current drivers?  Are we going to be telling customers that if they
> > > want to use iSCSI and NT clustering they have to update to Whistler?
> >
> >
> > [One would hope that this list does not turn into a Microsoft
> > release/product discussion mailing list (?) ]
> >
> > Without going into specifics of A certain O.S., does it suffice to
> > require that iSCSI not break existing legacy SCSI applications ?
> >
> > If the above is a valid requirement, then, knowing that legacy
> > applications continue to use SCSI-2 Reserve/Release and the target
> > reset as a mechanism of breaking SCSI-2 reservations, shouldn't iSCSI
> > continue to support the target reset ?
> >
> > - Santosh
> >
> >
> >
> >





From owner-ips@ece.cmu.edu  Tue Apr 24 04:55:25 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA05082
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 04:55:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NM93c01320
	for ips-outgoing; Mon, 23 Apr 2001 18:09:03 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NM88A01276
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 18:08:08 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3NNFx129488;
	Mon, 23 Apr 2001 16:15:59 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Mark Mokryn" <mark@sangate.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: iSCSI: Re: iSCSI & Linked Commands
Date: Mon, 23 Apr 2001 15:06:05 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJEELKCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <3AE463A2.2E994455@sangate.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Mark,

I have little doubt that what you are saying is true for Fibre-Channel.
Much of the confusion seems to stem from the choices in architectural
models.  FC found a solution selectively using CRN and reservation. iSCSI,
choosing to support multiple connections, has then required more extensive
use of serialization.  For devices that can only handle one task at a time,
linking is assured of providing the desired results within parallel SCSI.

One view of the general provisions for providing non-idempotent means of
commanding a device expands into a general discussion of ensuring
synchronization of server state.  This is an obligation of the transport
that is not met with connection allegiance alone.  There are many means of
accomplishing this goal.  With respect to linking, I was simply correcting
the language used to describe command allegiance to a connection.  Unless
linked commands are banned from SCSI, allegiance as now described for iSCSI
does not apply to the task.  As there is never more than one outstanding
command active within a linked command set, iSCSI should have no difficulty
in providing this feature.

What happens at the application and how that translates into the transport
is something the proposal is to provide.  Backup applications are behind the
paradigm shift so do not expect the application to conform to your
expectations.  From the requirement proposal, scanners and printers and
mechanical loaders will need workable solutions within iSCSI.  Should the
transport be able to understand the commands for all of these devices, some
of these devices, and what of devices using vendor unique commands or wholly
new commands?  In my view, the transport should not respond to or modify the
SCSI layer.  There are techniques that make no assumptions about the SCSI
layer content.

Doug

> Doug,
>
> Whether or not a device uses relative addressing has nothing to do with
> command linking. A LOCATE command moves the head, and a following READ
> reads the data. Using command linking, relative addressing, etc. does
> not protect the sender of the above commands from another command, sent
> by that initiator or any other, which may move the head, and thus cause
> the original READ to get the data from the wrong place. There is no
> protection between the first and any subsequent commands under command
> linking. Thus, the proper way to use LOCATE and READ is with
> RESERVE/RELEASE. I assume that it was in order to settle this confusion,
> that FC-TAPE decided to ban linked commands.
>
> -mark
>
> Douglas Otis wrote:
> >
> > Stephen,
> >
> > Unlike random access devices, sequential access devices operate with
> > relative addressing.  For random access devices, this is a seldom used
> > option.  There is a requirement to bind commands together to ensure
> > order of execution on these devices.  By popular, you mean not
> > sequential?
> >
> > Doug
> >
> > > Julian,
> > >
> > > > According to your logic no FCP implementation can use linked
> > > > commands?  Is this true for all OS's?  Is it a verified fact or
> > > > folklore?
> > >
> > > In my experience it's fact.  I have never used a SCSI stack which both
> > > supported AND used linked commands.  Like some others here, I always
> > > assumed AIX might :^) Ralph has pointed out that T10 is well aware
> > > that the feature is not popular.  There are other ways of
> > > accomplishing the same thing that are less likely to blow up in your
> > > face.
> > >
> > > > Is it so also for the new MS StorPort driver?
> > >
> > > I don't know, but I'd be really surprised if they did use linked
> > > commands.  You have to be pretty nuts to rely on a feature that's not
> > > even exercised by most SCSI implementations.
> > >
> > > Steph
> > >
>
> --
> Mark Mokryn      SANgate Systems Inc.      mark@SANgate.com
> Phone: +972-9-8919821                Mobile: +972-54-270030
> Fax: +972-9-8919449                  http://www.SANgate.com
> P.O. Box 1486 41 Hameyasdim St., Even Yehuda 40500 Israel
>
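
The LOCATE/READ hazard described in the quoted message above can be shown
with a toy model.  This is only a sketch under stated assumptions: ToyTape
and its methods are hypothetical names, the "tape" is a Python list, and
the reservation check is a simplification of SCSI-2 RESERVE/RELEASE, not
an implementation of it.

```python
class ToyTape:
    """Hypothetical sequential device with one shared head position."""

    def __init__(self, blocks):
        self.blocks = blocks
        self.position = 0
        self.reserved_by = None

    def _check(self, initiator):
        # Simplified stand-in for a RESERVATION CONFLICT status.
        if self.reserved_by not in (None, initiator):
            raise PermissionError("RESERVATION CONFLICT")

    def reserve(self, initiator):
        self._check(initiator)
        self.reserved_by = initiator

    def release(self, initiator):
        if self.reserved_by == initiator:
            self.reserved_by = None

    def locate(self, initiator, block):
        self._check(initiator)
        self.position = block

    def read(self, initiator):
        self._check(initiator)
        data = self.blocks[self.position]
        self.position += 1
        return data


tape = ToyTape(["b0", "b1", "b2", "b3"])

# Without a reservation, initiator B moves the head between A's LOCATE and READ.
tape.locate("A", 2)
tape.locate("B", 0)
unprotected = tape.read("A")      # A wanted block 2 but reads from B's position

# With a reservation, B's LOCATE is refused and A reads the intended block.
tape.reserve("A")
tape.locate("A", 2)
try:
    tape.locate("B", 0)
    blocked = False
except PermissionError:
    blocked = True
protected = tape.read("A")
tape.release("A")

print(unprotected, blocked, protected)
```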



From owner-ips@ece.cmu.edu  Tue Apr 24 05:16:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA05214
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 05:16:19 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3NGcqr05285
	for ips-outgoing; Mon, 23 Apr 2001 12:38:52 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3NGcDA05253
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 12:38:13 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3NHk1129234;
	Mon, 23 Apr 2001 10:46:01 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "John Hufferd" <hufferd@us.ibm.com>,
        "Dillard, David" <david_dillard@adaptec.com>
Cc: <ips@ece.cmu.edu>
Subject: RE: iSCSI Target Reset
Date: Mon, 23 Apr 2001 09:36:19 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJMELACGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <OFCAEBAA8A.97EC3912-ON88256A36.0068A802@LocalDomain>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

John,

This problem remains an architectural dilemma.  We have yet to provide a
means of establishing authority within the iSCSI domain.  Should this become
possible, then an error reported "Not Authorized" becomes a standard way of
excluding commands.

Doug


> This is at least better.  But I do not have the issue of it being vendor
> unique.  This is a shut down and restart of the complete Target, and will
> probably be part of the vendors' operator console or their own remote
> support functions.  It is not clear that it needs to be a general
> management function that works the same on all iSCSI Storage Controllers.
>
> Many of the major Storage Controllers do not support this feature today.
>
> I do not believe that most SNMP implementations are very secure.  Most
> folks do not want to have a changeable MIB until they have secure SNMP,
> and even though there is a version of SNMP that has security features,
> this has not been well supported.
>
> I do NOT think that Target Reset should be in the base iSCSI protocol, but
> I think it is reasonable to hold this discussion apart from the base
> protocol document, and the question should be asked if this is a general
> management function or a vendor specific console or remote support
> function.
>
>
>
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
>
>
> "Dillard, David" <david_dillard@adaptec.com>@ece.cmu.edu on 04/22/2001
> 08:13:49 AM
>
> Sent by:  owner-ips@ece.cmu.edu
>
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  RE: Target Reset
>
>
>
> John,
>
> I understand the danger of issuing a target reset and I agree that it
> should not be a part of an initiator's normal error recovery procedure.
> However, looking at this from a management perspective I'd like to see a
> standardized way of resetting a target.  I don't want to see a variety of
> vendor unique methods of resetting targets sprout up.
>
> If resetting a target using the protocol is not desirable from your
> perspective would incorporating this feature into the MIB be acceptable?
> (MIBs are for management after all)
>
> ----------------------------------------------------------------
> David Dillard                          david_dillard@adaptec.com
> Management Software Group
> Adaptec, Inc.                          www.adaptec.com
>
>
>
>
> -----Original Message-----
> From: John Hufferd [mailto:hufferd@us.ibm.com]
> Sent: Sunday, April 22, 2001 4:59 AM
> To: ips@ece.cmu.edu
> Subject: Target Reset
>
>
> I thought we had a number of discussions previously about Target Reset
> (Warm or Cold).  I thought there was general feeling that this command is
> so dangerous that it should not be supported by iSCSI.  The long distance
> capability of iSCSI makes the risks involved unmanageable.  There should
> only be an Admin way to do this.
>
> Some folks have said that we could permit it and have special
> authorization etc.  This would probably cause a separate section in the
> spec. to define the authorization approach, and whatever other security
> is needed to prevent this from inappropriately being used.  All for what
> purpose?  This cannot be part of error recovery from a normal initiator.
> The widespread effect is too great for that.
>
> I would like to hear from the list about their feeling on this item.
>
>
>
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
>
>
>
>



From owner-ips@ece.cmu.edu  Tue Apr 24 06:00:26 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id GAA05522
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 06:00:25 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3O3QCx16335
	for ips-outgoing; Mon, 23 Apr 2001 23:26:12 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3O3PBA16261
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 23:25:11 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3O4Wq129707;
	Mon, 23 Apr 2001 21:32:52 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Charles Monia" <cmonia@NishanSystems.com>,
        "Ips \(E-mail\)" <ips@ece.cmu.edu>
Subject: RE: iSCSI Reqts: In-Order Delivery
Date: Mon, 23 Apr 2001 20:22:58 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJMELPCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <B300BD9620BCD411A366009027C21D9B173455@ariel.nishansystems.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Charles,

To quote John,

"A target like an IBM Shark or EMC Symmetrix will have thousands of LUs and
10s to 100s of Hosts connected to it, and you want to reset the whole
Target?  I do not think that is a good idea.  Perhaps Task Reset or LU
reset etc. but not Target Reset."

There could be one pending Task Management request per Logical Unit not per
Target.  According to John, you would not have any hope of only sending one
such command.  With hundreds of hosts, and thousands of Logical Units, there
are more than a few potential commands to handle.  You MUST track each of
these potential commands via their Client Tag and LUN.  It is also clear you
will be expected to sort pending commands by their LUN value, look to see
which commands are potentially affected by the Task Management command,
bring the ExpCmdSN up to a place then enabled by this "cleanup", silently
discard related commands as if they had been delivered and assume such
operation of the Target.  These lost commands are provided placeholders so
they do not inject holes into your sequence not yet traversed although you
have not as yet acknowledged their delivery.  This implies there is a
significant amount of SCSI level processing to be done on behalf of the
Target and a bit of fudging as to what has and has not been delivered. (If I
understood the intent of the recovery comments.)

The technique that I presented does not need to track Client Tags sorted by
LUN, examine the Command content, decide which command is related to
the Task Management, etc.  If there is a connection disqualification during
this process, which may have actually caused the need for these Task
Management commands, then the unacknowledged retries of these commands will
again need to be sorted based on their Client Tag and LUN.  You will need to
check for duplicates in this case as the Client will be unaware which
commands had arrived.  In effect, the present technique assumes the
transport is able to handle storing and sorting a great deal of Tags and
LUNs during normal processing.

Conversely, one could return responses serialized server wide and reject all
bypassed commands.  Now there is no ACA problem, no Client Tags to track, no
loss of Server state, and the flow control still works without the need to
set aside enough room for 100,000 commands on John's IBM Shark controller,
the extreme of one Task Management request per Client, per Logical Unit.
You would be hoping the client doesn't try to get clever and send a single
Abort Task for each command en masse.  If so, you may have 1,000,000 extra
commands to handle, sort, and remember upon a response by the Target.
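
The figures in the paragraph above can be reproduced with simple
arithmetic.  The host and Logical Unit counts below are assumed round
numbers chosen to match "100s of Hosts" and "thousands of LUs"; the ten
outstanding commands per Client/LU pair is likewise an assumption, picked
only because it yields the 1,000,000 figure.  The function name is
hypothetical.

```python
def worst_case_requests(hosts, logical_units, commands_per_pair=1):
    """Upper bound on requests the target side may have to track."""
    return hosts * logical_units * commands_per_pair


# One Task Management request per Client, per Logical Unit.
per_lu_requests = worst_case_requests(hosts=100, logical_units=1000)

# One Abort Task per outstanding command, assuming 10 commands per pair.
per_command_aborts = worst_case_requests(100, 1000, commands_per_pair=10)

print(per_lu_requests)
print(per_command_aborts)
```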

Brutally clever or brutally simple, it is your choice.

(The bypass flag would be used to bypass all commands on all connections.)

Doug

> Hi:
>
> The point of my original posting was to suggest ways in which the
> semantics of all the task management functions could be preserved in
> multi-connection, command striping implementations without a lot of
> complicated bookkeeping.
>
> In that regard, the proposed solution imposes no additional tracking
> requirements on initiators aside from those that would be needed anyhow to
> issue the ABORT TASK request. For the most part, that amounts to keeping
> track of each pending I/O request including a handle by which the task can
> be referenced and a pointer to the connection the SCSI command was issued
> on.
>
> I did neglect one restriction however: Specifically, that the initiator
> have no more than one task management request pending at a time to a
> specific target.
>
> In other respects, as long as ordered delivery to the SCSI layer is
> preserved for individual connections, I don't see a problem.
>
> > ....Those commands executed out of
> > sequence by means of a bypass flag, those commands that are Task
> Management .....
>
> I apparently don't understand how the bypass flag is supposed to work.
> I'd assumed its function was to maximize the benefits of command striping
> by allowing commands on other connections in the session to be bypassed.
> I'd assumed that commands on the same connection are never bypassed
> (since there appears to be no benefit in doing so).
>
> Hence my statement:
>
> > > .....I've made the tacit assumption that commands on a given
> > > connection are presented to the SCSI layer in order they were sent,
> > > regardless of whether or not cmdSN was set to 0.  I assume the
> > > framing mechanisms that have been discussed for buffer offloading do
> > > not affect this behavior.  I.e., a fully formed PDU slated for
> > > immediate delivery won't be passed to the SCSI layer before a
> > > partially complete PDU that was received earlier.
>
> Is this assumption incorrect?
>
> Charles
>
> > -----Original Message-----
> > From: Douglas Otis [mailto:dotis@sanlight.net]
> > Sent: Monday, April 23, 2001 12:10 PM
> > To: Charles Monia; Santosh Rao (E-mail)
> > Cc: Ips (E-mail)
> > Subject: RE: iSCSI Reqts: In-Order Delivery
> >
> >
> > Charles,
> >
> > Your solution requires a fair amount of tracking of commands based
> > solely on their Client Tags.  These Tags are randomly generated but
> > will need to retain sequential order for your scheme.  The transport
> > must remember the type of command sent together with their relative
> > placement based only on the Client Tag.  In addition, these commands
> > will need to be placed into different categories: those commands
> > executed out of sequence by means of a bypass flag, those commands that
> > are Task Management commands, and commands affected by these other
> > types of commands.  It seems that in large part, these concerns can be
> > met with proper handling of the transport without such laborious
> > sorting of the Client Tags.  The out-of-sequence or bypass flag also
> > depends on the transport sorting the Client Tag.  In addition to
> > disabling flow-control, this technique of not incrementing the
> > serialization of these commands requires all commands with the same
> > serialization value to be sent on the same connection without
> > acknowledgment, if these commands are also to be kept in sequence.
> > This connection requirement is yet to be specified.
> >
> > Ver 6, Pg 12:
> >    "iSCSI may avoid delivering some command to the
> >    SCSI layer if so required by some prior SCSI or iSCSI action (e.g.,
> >    clear task set Task Management request received before all the
> >    commands it was supposed to act on)."
> >
> > Here, there seem to be expectations of the iSCSI transport
> > interpreting the content of the SCSI commands.  How this is done is
> > not obvious.  Is the transport expected to generate SCSI responses?
> >
> > In addition, although iSCSI presently relies on ACA, there are few
> > applications that implement ACA.  It would appear for iSCSI to work
> > with the present protocol, significant application changes are
> > required.  With the proposal I am suggesting, this is not a problem as
> > all bypassed commands are rejected back to the Initiator.  The drivers
> > that implement iSCSI will be required to provide handling for these
> > commands that bypass other commands.  The amount of information
> > contained in a rejected command list should be relatively small and
> > these occasions for such Management rare.  Without proper handling of
> > these events, there will be 2:00 AM alarm pagers going off.
> >
> > Here in the proposal, sorting CmdSN based on LUN values takes place
> > within a "Barrier List."  I cannot tell what is implied by these
> > recovery instructions.  What is meant by Remove, Release, Drop,
> > Cleanup, Placeholder, and ALL?  What is the intended feedback to the
> > initiator for this Clean-up?  It would appear the transport works on
> > behalf of the target.  In the proposal that I am suggesting, there are
> > no actions within the transport on behalf of the target.  All
> > decisions are done either by the Target or the Initiator.  None by the
> > transport.
> >
> > The concept is simple.  Keep the transport simple.  Do not expect the
> > transport to decipher SCSI commands.  Do not expect the transport to
> > respond on behalf of the Target.  Do not expect the transport to sort
> > pending commands based on LUN value.  Do not expect the transport to
> > require SCSI and iSCSI ACA.
> >
> > In the case of session wide serialization, what is good for the goose
> > is also good for the gander.  It is important from the perspective of
> > quickly detecting an error and knowing the server state to also use
> > session wide serialization from the server.  The technique of
> > replicating Management commands down each connection in addition to
> > changing global commands into specific commands already overburdens
> > the set-aside that must be made to handle these non-serialized
> > management commands.  My proposal eliminates the problem of set-aside
> > resources and loss of server state.  Rather than silently rejecting
> > commands out-of-sequence, these rejections are reported.  Once done,
> > this feature can be used to extract pending commands in a simple and
> > direct manner without burdening the transport.
> >
> > As attempts are made to support the SCSI architecture, rather than
> > increasing the intelligence of the transport, efforts should be made to
> > simplify the transport.  The number of fields that the transport must
> > manipulate will be met with complexity and non-uniform implementation.
> >
> > See:
> > http://www.ietf.org/internet-drafts/draft-otis-iscsi-fullack-00.txt
> >
> > Ver 6, Pg 92:
> >      "N.B. As an alternative to Logout and reissue commands, the
> >       initiator MAY instead reset the target and terminate all
> >       outstanding commands with a service response indicating
> >       Delivery Subsystem Failure. The initiator MUST perform one of
> >       the two actions."
> >
> > ...
> >
>
> > Ver 6, Pg 93:
> >    "The following general mechanism can be used to achieve the effect
> >    of ordered delivery for task management commands while enabling the
> >    "urgent" delivery that some of them imply and immediate execution of
> >    the task management commands without:
> >
> >       At Initiator when a relevant task management command is issued:
> >
> >          a) if ExpCmdSN is equal to CmdSN skip to step c
> >          b) mark all pending commands with a CmdSN field between
> >          ExpCmdSN and the current CmdSN and a relevant LUN as
> >          candidates for cleanup and retain CmdSN in a "barrier list".
> >          c) send the task management command for immediate delivery
> >          to the target
> >
> >       At initiator when updating ExpCmdSN:
> >
> >          a) if the "barrier list" is empty or ExpCmdSN is less than
> >          the first entry in the barrier list then skip to step d
> >          b) remove the barrier list entry and remove and drop all
> >          entries marked for cleanup having a CmdSN field less than
> >          ExpCmdSN
> >          c) go to step a
> >          d) release all queued entries between the old and new
> >          ExpCmdSN from the queue
> >
> >       At target when receiving a relevant task management command for
> >       immediate delivery:
> >
> >          a) if ExpCmdSN is equal to CmdSN skip to step c
> >          b) mark all pending entries (commands received and
> >          placeholders) with a CmdSN field between ExpCmdSN and the
> >          current CmdSN as candidates for cleanup and retain CmdSN in
> >          a "barrier list" including the referenced LUN (or an ALL
> >          marker)
> >          c) send the task management command to SCSI for immediate
> >          execution
> >
> >       At target when updating ExpCmdSN (releasing ordered commands to
> >       SCSI):
> >
> >          a) if the "barrier list" is empty or ExpCmdSN is less than
> >          the first entry in the barrier list then skip to step d
> >          b) remove the barrier list entry and remove and drop all
> >          entries marked for cleanup and having the same LUN as the
> >          barrier entry (any if the barrier is marked ALL) and a CmdSN
> >          field less than ExpCmdSN
> >          c) go to step a
> >          d) release all queued entries between the old and new
> >          ExpCmdSN from the queue
> >
> >    Note that this scheme will withstand connection recovery."
> >
> > Doug
>
> < remainder deleted>
>
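
The initiator-side steps quoted above from Ver 6, Pg 93 can be sketched
roughly as follows.  This is only an illustration under stated assumptions,
not the draft's definition: the class and method names are hypothetical,
CmdSN wraparound is ignored, "between ExpCmdSN and the current CmdSN" is
read as ExpCmdSN <= sn < CmdSN, and actually sending the task management
command is left out.

```python
from collections import deque


class InitiatorQueue:
    """Hypothetical initiator-side bookkeeping for the barrier list scheme."""

    def __init__(self):
        self.exp_cmd_sn = 0      # next CmdSN the target is expected to consume
        self.cmd_sn = 0          # next CmdSN the initiator will assign
        self.pending = {}        # CmdSN -> {"lun": int, "cleanup": bool}
        self.barriers = deque()  # retained CmdSN values, oldest first

    def queue_command(self, lun):
        sn = self.cmd_sn
        self.pending[sn] = {"lun": lun, "cleanup": False}
        self.cmd_sn += 1
        return sn

    def issue_task_management(self, lun):
        # Step a: if ExpCmdSN equals CmdSN there is nothing to mark.
        if self.exp_cmd_sn != self.cmd_sn:
            # Step b: mark pending commands for this LUN and retain CmdSN.
            for sn, entry in self.pending.items():
                if self.exp_cmd_sn <= sn < self.cmd_sn and entry["lun"] == lun:
                    entry["cleanup"] = True
            self.barriers.append(self.cmd_sn)
        # Step c: immediate delivery itself is outside this sketch.
        return ("task-management", lun)

    def update_exp_cmd_sn(self, new_exp):
        old = self.exp_cmd_sn
        self.exp_cmd_sn = new_exp
        dropped = []
        # Steps a-c: process barriers that ExpCmdSN has reached or passed.
        while self.barriers and self.exp_cmd_sn >= self.barriers[0]:
            self.barriers.popleft()
            for sn in sorted(self.pending):
                if sn < self.exp_cmd_sn and self.pending[sn]["cleanup"]:
                    del self.pending[sn]
                    dropped.append(sn)
        # Step d: release the remaining entries between old and new ExpCmdSN.
        released = [sn for sn in sorted(self.pending) if old <= sn < new_exp]
        for sn in released:
            del self.pending[sn]
        return dropped, released
```

In this reading, entries marked for cleanup are only dropped once ExpCmdSN
reaches the retained barrier value; until then step d releases them like any
other entry.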



From owner-ips@ece.cmu.edu  Tue Apr 24 06:43:50 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id GAA05798
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 06:43:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3O50I720616
	for ips-outgoing; Tue, 24 Apr 2001 01:00:18 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3O4xrA20586
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 00:59:57 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3O67P129777;
	Mon, 23 Apr 2001 23:07:25 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <ips@ece.cmu.edu>, <marjorie_krueger@hp.com>, <Black_David@emc.com>
Cc: <sob@harvard.edu>, <egrodriguez@lucent.com>, <mankin@east.isi.edu>
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Mon, 23 Apr 2001 21:57:30 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJOEMACGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <0F31E5C394DAD311B60C00E029101A070801549E@corpmx9.isus.emc.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

Thanks for the URL correction for the document.  Although I posted this
document 4 days ago, it still has not been published as an I-D.  Here is a
reference to it until then.
http://ips.pdl.cs.cmu.edu/mail/msg04238.html

I'll stand by the stated intent of implementing this protocol in hardware.
The same is also true for iFCP and FCIP.

Page 8 of the requirements document:
    "(2)  Development of Ethernet storage NICs and related driver and
         protocol software; [NOTE: high-speed applications of iSCSI are
         expected to require significant portions of the iSCSI/TCP/IP
         implementation in hardware to achieve the necessary
         throughput.]"

Page 11:
  "2.3. Framing

   Framing refers to the addition of information in a header, or the
   data stream to allow implementations to locate the boundaries of an
   iSCSI protocol data unit (PDU) within the TCP byte stream.  There
   are two technical requirements driving framing: interfacing needs,
   and accelerated processing needs.

   A framing solution that addresses the "interfacing needs" of the
   iSCSI protocol will facilitate the implementation of a message-based
   upper layer protocol (iSCSI) on top of an underlying byte streaming
   protocol (TCP).  Since TCP is a reliable transport, this can be
   accomplished by including a length field in the iSCSI header.
   Finding the protocol frame assumes that the receiver will parse from
   the beginning of the TCP data stream, and never make a mistake (lose
   alignment on packet headers).

   The other technical requirement for framing, "accelerated
   processing", stems from the need to handle increasingly higher data
   rates in the physical media interface.  Two needs arise from higher
   data rates:

   (1)  LAN environment - NIC vendors seek ways to provide "zero-copy"
        methods of moving data directly from the wire into application
        buffers.

   (2)  WAN environment - the emergence of high bandwidth, high latency,
        low bit error rate physical media places huge buffer
        requirements on the physical interface solutions.

   First, vendors are producing network processing hardware that
   offloads network protocols to hardware solutions to achieve higher
   data rates.  The concept of "zero-copy" seeks to store blocks of
   data in appropriate memory locations (aligned) directly off the
   wire, even when data is reordered due to packet loss.  This is
   necessary to drive actual data rates of 10 Gigabits and beyond.

   Secondly, in order for iSCSI to be successful in the WAN arena it
   MUST be possible to operate efficiently in high bandwidth, high
   delay networks.  The emergence of multi-gigabit IP networks with
   latencies in the tens to hundreds of milliseconds presents a
   challenge. To fill such large pipes, tens of megabytes of
   outstanding requests from the application are needed. In addition,
   some protocols potentially require tens of megabytes at the
   transport layer to deal with buffering for reassembly of data when
   packets are received out-of-order.

   Consider that a network pipe at 10 Gbps x 200 msec holds 250 MB.
   [Assume land-based communication with a spot half way around the
   world at the equator.  Ignore additional distance due to cable
   routing.  Ignore repeater and switching delays; consider only a
   speed-of-light delay of 5 microsec/km.  The circumference of the
   globe at the equator is approx. 40000 km (round-trip delay must be
   considered to keep the pipe full).  10 Gb/sec x 40000 km x 5
   microsec/km x B / 8b = 250 MB].  In a conventional TCP
   implementation, loss of a TCP segment means that stream processing
   MUST stop until that segment is recovered, which takes at least a
   time of <network round trip> to accomplish.  Following the example
   above, an implementation would be obliged to catch 250 MB of data
   into an anonymous buffer before resuming stream processing; later,
   this data would need to be moved to its proper location.  Some
   proponents of iSCSI seek some means of putting data directly where
   it belongs, and avoiding extra data movement in the case of segment
   drop.  This is a key concept in understanding the debate behind
   framing methodologies.

   The framing of the iSCSI protocol impacts both the "interfacing
   needs" and the "accelerated processing needs", however, while
   including a length in a header may suffice for the "interfacing
   needs", it will not serve the "accelerated processing needs". The
   framing mechanism developed should allow resynchronization of packet
   boundaries even in the case where a packet is temporarily missing in
   the incoming data stream."
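
The "interfacing needs" case quoted above, a length field in the header of a
message carried over a byte stream, can be sketched as below.  This is a
minimal illustration under assumptions: the 4-byte big-endian length prefix
and the function names are hypothetical and are not the iSCSI header layout.

```python
import struct

# Hypothetical header: a single 4-byte big-endian payload length.
HEADER = struct.Struct("!I")


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length."""
    return HEADER.pack(len(payload)) + payload


def deframe(stream: bytes):
    """Split a byte stream into complete PDUs plus any unconsumed tail.

    Parsing must start at a PDU boundary; the receiver has no way to
    recover alignment if it starts anywhere else.
    """
    pdus = []
    offset = 0
    while len(stream) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(stream, offset)
        start = offset + HEADER.size
        if len(stream) - start < length:
            break  # payload not fully received yet
        pdus.append(stream[start:start + length])
        offset = start + length
    return pdus, stream[offset:]
```

Because each boundary is found only by walking from the previous header,
a receiver holding data that follows a missing segment cannot locate the
PDUs in it, which is the limitation the quoted text describes.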

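The 250 MB figure in the quoted text is a bandwidth-delay product.  The
arithmetic can be checked with a short sketch; the function name and
parameter names are made up for illustration, and integer units are used to
keep the result exact.

```python
def pipe_capacity_bytes(rate_bits_per_s: int,
                        round_trip_km: int,
                        delay_us_per_km: int = 5) -> int:
    """Bytes in flight on a path, counting only propagation delay."""
    round_trip_us = round_trip_km * delay_us_per_km
    bits_in_flight = rate_bits_per_s * round_trip_us // 1_000_000
    return bits_in_flight // 8


# 10 Gb/s over a 40000 km round trip at 5 microsec/km
capacity = pipe_capacity_bytes(10_000_000_000, 40_000)
print(capacity)  # 250000000, i.e. 250 MB
```

The 40000 km here is the round-trip path (half way around the equator and
back), giving a 200 msec round trip as in the quoted example.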

Here IPS is developing a framing protocol that increases the level of error
detection.  The IPS has made explicit reference to its intention of having
this protocol supported directly in hardware.  I will be happy to show how
this protocol can be mapped into a common structure, which would make this
hardware acceleration available to more protocols and at the same time ease
the transition to SCTP.

Although I may upset some with this message, I feel this is a pivotal point
in time.  Yes, it is possible to support this protocol in software, but not
at competitive data rates.  You should be clear on this.  At this point in
time, with the iSCSI protocol in flux, it is impossible to make hardware for
this protocol.  With a minor amount of effort, this protocol could be placed
into a common format.  This becomes an architectural decision, and a
profound one, as it will determine the number of adapters needed by systems
to support these various protocols.  It is a decision that will impact the
next decade and likely influence networking profoundly.  These are not just
protocols.  You are attempting to redefine the nature of the network
adapter.

Presently you wish to see this done in a haphazard manner without any
coordination by the IETF.  This attitude does not reflect the magnitude of
this endeavor.  Frankly, had the IPS followed the efforts of the sigtran
group, considerable time could have been saved, and still can be.  As it is
now, we are re-adopting multi-connection protocols with unique SACK packet
schemes, error handling, etc.  In other words, re-inventing SCTP.  It took
them more than 2 years to get SCTP where it is now, and it is in a form
suitable for generic use.  If the effort is to use TCP, I can show how that
can be done as well by adopting SCTP structures.  iSCSI, RDMA, iFCP, and
FCIP should not need special devices made as a result of wanting improved
error detection, framing and data vectoring.  This is not rocket science,
but not using a common format makes it an adapter zoo.

Doug


David Black wrote:
> > This requirements document makes it clear there is expectation of
> > modifying Ethernet adapters to support this protocol.  Should this
> > required hardware support be made in a general fashion to allow
> > common use among other protocols?
>
> There are at least two announced iSCSI products and
> an open source driver that do not require any
> modifications to existing Ethernet adapters,
> so such modifications are clearly not a requirement.
>
> > This hardware requirement is primarily based on two
> > requirements, to increase the level of error detection and to allow
> > framing.
>
> Error detection (i.e., CRC or checksum) can be done in
> software as Doug has frequently pointed out on this list.
> I don't think the statement about IPS and TSVWG pursuing
> a common error detection algorithm is correct -- while
> I'll defer to the ADs (who are the chairs of TSVWG), I
> believe TSVWG has significant interest in an improved
> checksum (e.g., Adler-32 based on adding 16-bit quantities
> instead of 8-bit), whereas IPS intends to use CRCs.
>
> Framing is optional and being pursued in a layered fashion
> as called for by the WG charter.  The instructions in
> the WG charter should be sufficient - adding text to the
> iSCSI requirements document that introduces otherwise
> unneeded dependencies on other protocol specification
> efforts (i.e., iFCP, FCIP) is a bad idea.  Heaven help
> us if we have to submit all the protocol drafts for the
> IPS WG to the IESG in one big bundle - at the very least
> the IESG will be annoyed, and annoying the IESG has all
> sorts of bad side effects ;-).
>
> I don't see that any change to the iSCSI requirements draft
> is needed in either of these areas based on this set of
> comments.
>
> Doug also posted a pointer to the wrong draft - I suspect
> he meant to point to draft-otis-tcp-framing-00.txt.
>
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
>
> > -----Original Message-----
> > From:	Douglas Otis [SMTP:dotis@sanlight.net]
> > Sent:	Monday, April 23, 2001 4:15 PM
> > To:	KRUEGER,MARJORIE (HP-Roseville,ex1); Ips Reflector (E-mail)
> > Cc:	Allison Mankin; David Black; Elizabeth G Rodriguez (Elizabeth);
> > Scott Bradner
> > Subject:	RE: I-D ACTION:draft-ietf-ips-iscsi-reqmts-03.txt
> >
> > Marjorie,
> >
> > This requirements document makes it clear there is expectation of
> > modifying Ethernet adapters to support this protocol.  Should this
> > required hardware support be made in a general fashion to allow
> > common use among other protocols?  This hardware requirement is
> > primarily based on two requirements, to increase the level of error
> > detection and to allow framing.  Presently, IETF supports a framing
> > protocol that also increases the level of error detection.
> >
> > Presently TSVWG and IPS are working on a common error detection
> > algorithm.  In addition, there are two other protocols expecting
> > hardware for framing and error detection.  These are iFCP and FCIP.
> >
> > See:
> > http://www.ietf.org/internet-drafts/draft-ietf-ips-fcencapsulation-00.txt
> >
> > It is possible to have all these protocols use the same error detection
> > and framing.  If this MUST be done using TCP, as this requirement
> > document demands, then here is a possible general purpose header that
> > would allow common use of hardware and an easy transition into SCTP.
> >
> > I will be happy to define a mapping from the present protocols into this
> > generalized form.  The advantage should be obvious.  One Ethernet adapter
> > can handle these various protocols without specialized hardware for each.
> >
> > For those wishing to update and route based on encapsulated headers, a
> > fix-up field at the end of these headers will allow use of a common error
> > scheme using header fix-up.
> >
> > Here is an example of how TCP can be made to look like SCTP.
> > http://www.ietf.org/internet-drafts/draft-otis-iscsi-fullack-00.txt
> >
> > This header could become a TCP option field to allow for negotiation.
> >
> > P.S.
> > One additional question however.
> >
> > On page 18,
> >  "The iSCSI protocol document SHOULD NOT define the management
> >  architecture for iSCSI within the network infrastructure."
> >
> > What does this mean?
> >
> > Doug
> >
> >
> > > The IP Storage Working group is chartered with developing
> > > comprehensive technology to transport block storage data
> > > over IP protocols.  This effort includes a protocol to
> > > transport the Small Computer Systems Interface (SCSI)
> > > protocol over the internet (iSCSI).
> > >
> > > A URL for this Internet-Draft is:
> > > http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-reqmts-03.txt
> > >



From owner-ips@ece.cmu.edu  Tue Apr 24 08:34:03 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id IAA07189
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 08:34:02 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3O9gRv11779
	for ips-outgoing; Tue, 24 Apr 2001 05:42:27 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3O9fYA11755
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 05:41:34 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id FAA24018
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 05:34:09 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id DAA27624
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 03:41:33 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: RE: iSCSI Target Reset
To: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OFB5AEA26F.FCA1233A-ON88256A38.0034E5F8@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Tue, 24 Apr 2001 02:41:13 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/24/2001 03:41:29 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

I am posting Charles' comments to me onto the reflector; you all might find
them interesting.  Thank you Charles.

Any other comments?

.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com
---------------------- Forwarded by John Hufferd/San Jose/IBM on 04/24/2001
02:37 AM ---------------------------

Charles Monia <cmonia@NishanSystems.com> on 04/24/2001 12:02:36 AM

To:   John Hufferd/San Jose/IBM@IBMUS, Charles Monia
      <cmonia@NishanSystems.com>
cc:
Subject:  RE: iSCSI Target Reset



Hi John;

The following is my .02:

1.  Target reset must be supported (SAM says so at the moment).

2.  The interconnect's behavior is outside the scope of SAM, i.e., it is
up to the protocol spec.

3.  IMO: The only SAM requirement is the behavior at device (LUN) level as
seen by the  initiator issuing the request.  In that regard, for example,
it's sufficient to reset only the LUs that the initiator can see.  In a
virtual environment I assume that's a small subset of the LUs on a system.

Charles

> -----Original Message-----
> From: John Hufferd [mailto:hufferd@us.ibm.com]
> Sent: Monday, April 23, 2001 10:43 PM
> To: Charles Monia
> Subject: RE: iSCSI Target Reset
>
>
>
> Charles,
> You need to be more direct.  Is it a T10 requirement to support Target
> Reset?  How much flexibility does T10 give the implementations?
>
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
>
>
> Charles Monia <cmonia@NishanSystems.com>@ece.cmu.edu on 04/23/2001
> 08:36:20 PM
>
> Sent by:  owner-ips@ece.cmu.edu
>
>
> To:   ips@ece.cmu.edu
> cc:   Charles Monia <cmonia@NishanSystems.com>
> Subject:  RE: iSCSI Target Reset
>
>
>
> Hi:
>
> Sorry to reopen old issues, but unfortunately that seems necessary since
> something has been lost in translation.
>
> My concerns about TARGET RESET center on three areas:
>
> 1.  Adverse side effects at the transport layer that could affect other
>     users.
> 2.  Similar adverse side effects on the affected devices.
> 3.  The impact on legacy software.
>
> The gist of my opinion on the first issue, (as expressed on the T10
> reflector) is as follows:
>
> "> > > > The ... issue is whether mechanisms, such as target reset,
> > > > > are appropriate for a given transport.  In my view, the only
> > > > > immutable requirement is to preserve the transport-independent
> > > > > part of the semantics.  The definition of transport-specific
> > > > > side effects is best handled in the appropriate transport
> > > > > specification."
>
> The point of the above is that, in my view, a protocol specification has
> leeway to define the protocol-specific side effects in a rational and
> benign manner provided that the observable effects on the attached
> devices are preserved.
>
> Regarding adverse device-level side effects, I also stated that:
>
> "....restricting the operation [target reset] means providing hooks so
> that only a trusted class of initiators can perform the function.  It's a
> bit like controlling access to a file so that lots of users can read it
> but only a trusted few can perform a write or delete operation."
>
> Finally, with regard to legacy implementations, my main intent was to
> avoid the situation where an operation that was previously legal becomes
> illegal.  In that regard, I also recall suggesting that the function be
> treated as a LUN RESET broadcast to all the logical units to which the
> initiator had access privileges.  That would also support the notion of
> preventing adverse effects on other users.
>
> At any rate, since there is a proposal to make LUN RESET mandatory (see
> below), I believed this was a reasonable and consistent alternative. I
> therefore felt (and continue to believe) there is no justification for
> making the function optional.
>
> Anyhow, these views did not prevail at the meeting in question. For that
> reason, the so-called NOP proposal was made as a last ditch effort to
> accommodate legacy implementations by preserving a measure of backwards
> compatibility (albeit token compatibility). In that regard, it seemed the
> lesser evil.
>
> > > As for leaving things out of iSCSI - the default modus
> > > operandi should be to put in everything that's described
> > > in SAM2 unless we can convince T10 to take the feature
> > > out of SAM2.  Let's not go deciding to cast things out
> > > of SCSI on T10's behalf.
>
> So, I guess that anyone wishing to support a change in the spec is, of
> course, free to pursue it in that forum.
>
> Charles
>
> PS: The proposal referenced above can be found at:
> ftp://ftp.t10.org/t10/document.01/01-015r2.pdf
>
> > -----Original Message-----
> > From: Elliott, Robert [mailto:Robert.Elliott@COMPAQ.com]
> > Sent: Monday, April 23, 2001 5:20 PM
> > To: ips@ece.cmu.edu; 'Black_David@emc.com'; 'cmonia@nishansystems.com'
> > Subject: RE: iSCSI Target Reset
> >
> >
> > It's not a very good situation when each device chooses the
> > interpretation of TARGET RESET that it thinks is appropriate.
> > IBM's Shark might choose a different hack than Compaq's
> > RA8000.  How is software supposed to make sense of this?
> >
> > The rule in SAM-2 is that TARGET RESET resets every logical
> > unit (subject to access controls, if implemented).  The fact
> > is that in Fibre Channel, not many multi-LUN multi-port
> > targets followed that rule.  The result is that software
> > cannot tell what's going to happen and may have to handle
> > targets from each vendor differently.  This is not very
> > interoperable.
> >
> > Charles suggested in the T10 meeting that we allow it to
> > be implemented as a no-op rather than let protocols drop
> > support for it.  That doesn't help software like Windows that
> > does expect certain effects - a no-op implementation
> > would break clustering.  By removing it from the protocol,
> > software is forced to find a suitable replacement (e.g. use
> > LOGICAL UNIT RESET or switch to persistent reservations).
> > In Windows, this can be done at the port driver level
> > (STORPORT improving on SCSIPORT) or at the miniport level
> > (convert each target reset request into multiple LOGICAL UNIT
> > RESETs).
> >
> > Note that other protocols like NFS and HTTP over IP don't
> > seem to have "server resets."
> >
> > The recent SAM-2 change was expressly designed to encourage
> > iSCSI and SRP to drop support for TARGET RESET.  Please don't
> > keep it because you think T10 would be offended :-)
> >
> > ---
> > Rob Elliott, Compaq Server Storage
> > Robert.Elliott@compaq.com
> >
> >
> >
> >
> >
> >
> > > -----Original Message-----
> > > From: Black_David@emc.com [mailto:Black_David@emc.com]
> > > Sent: Monday, April 23, 2001 6:31 PM
> > > To: cmonia@NishanSystems.com; ips@ece.cmu.edu
> > > Subject: RE: iSCSI Target Reset
> > >
> > >
> > > I agree with Charles that this is an implementation
> > > issue.  If a Shark wants to reset all 32 adapters
> > > when it receives a Target Reset on one of them, that's
> > > a Shark implementation decision.  It's completely valid
> > > to reset only the adapter that the Target Reset is
> > > received on (common Fibre Channel behavior) or
> > > only the iSCSI target to which the Target Reset is
> > > addressed if there's more than one Target behind
> > > the adapter.
> > >
> > > As for leaving things out of iSCSI - the default modus
> > > operandi should be to put in everything that's described
> > > in SAM2 unless we can convince T10 to take the feature
> > > out of SAM2.  Let's not go deciding to cast things out
> > > of SCSI on T10's behalf.
> > >
> > > Thanks,
> > > --David
> > >
> > > > -----Original Message-----
> > > > From:    Charles Monia [SMTP:cmonia@nishansystems.com]
> > > > Sent:    Monday, April 23, 2001 7:12 PM
> > > > To: ips@ece.cmu.edu
> > > > Subject: RE: iSCSI Target Reset
> > > >
> > > > Hi:
> > > >
> > > > These seem to be implementation decisions. I don't see how that
> > > > justifies removing support from the protocol.
> > > >
> > > > Charles
> > > >
> > > > > -----Original Message-----
> > > > > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > > > > Sent: Monday, April 23, 2001 2:34 PM
> > > > > To: Santosh Rao
> > > > > Cc: ips@ece.cmu.edu
> > > > > Subject: Re: iSCSI Target Reset
> > > > >
> > > > >
> > > > >
> > > > > Absolutely not,  Why would we think that impacting 32 different
> > > > > other initiators is an OK thing to do.  By the way there are lots
> > > > > more Initiators possible with FC on Shark, and would hope that
> > > > > there would be even more with iSCSI.
> > > > >
> > > > > I have been told that these large Storage Controllers do not
> > > > > support Target Reset today.  So I see no loss in not supporting
> > > > > such an item in iSCSI especially since many Initiators will be
> > > > > beyond even the distances and mischief that is possible with FC.
> > > > >
> > > > > .
> > > > > .
> > > > > .
> > > > > John L. Hufferd
> > > > > Senior Technical Staff Member (STSM)
> > > > > IBM/SSG San Jose Ca
> > > > > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > > > > Internet address: hufferd@us.ibm.com
> > > > >
> > > > >
> > > > > Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001
> > > > > 01:24:02 PM
> > > > >
> > > > > Sent by:  owner-ips@ece.cmu.edu
> > > > >
> > > > >
> > > > > To:   ips@ece.cmu.edu
> > > > > cc:
> > > > > Subject:  Re: iSCSI Target Reset
> > > > >
> > > > >
> > > > >
> > > > > "Dillard, David" wrote:
> > > > > >
> > > > > > When will STORPORT be generally available?  The latest
> > > > > > STORPORT document that I found on the MS web site is version
> > > > > > 0.6a, dated March 18, 2001.  Given this it seems like STORPORT
> > > > > > might not be available soon.  In that case do you know what
> > > > > > happens with the current drivers?  Are we going to be telling
> > > > > > customers that if they want to use iSCSI and NT clustering they
> > > > > > have to update to Whistler?
> > > > >
> > > > >
> > > > > [One would hope that this list does not turn into a Microsoft
> > > > > release/product discussion mailing list (?) ]
> > > > >
> > > > > Without going into specifics of A certain O.S., does it suffice to
> > > > > require that iSCSI not break existing legacy SCSI applications ?
> > > > >
> > > > > If the above is a valid requirement, then, knowing that legacy
> > > > > applications continue to use SCSI-2 Reserve/Release and the
> > > > > target reset as a mechanism of breaking SCSI-2 reservations,
> > > > > shouldn't iSCSI continue to support the target reset ?
> > > > >
> > > > > - Santosh
> > > > >
> > > > >
> > > > >
> > > > >
> > >
> >
>
>
>





From owner-ips@ece.cmu.edu  Tue Apr 24 08:34:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id IAA07200
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 08:34:28 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3O9eRL11697
	for ips-outgoing; Tue, 24 Apr 2001 05:40:27 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c017.sfo.cp.net (c017-h020.c017.sfo.cp.net [209.228.12.234])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3O9ddA11621
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 05:39:39 -0400 (EDT)
Received: (cpmta 351 invoked from network); 24 Apr 2001 02:39:31 -0700
Received: from sangate-GW.ser.netvision.net.il (HELO sangate.com) (212.143.114.146)
  by smtp.sangate.com (209.228.12.234) with SMTP; 24 Apr 2001 02:39:31 -0700
X-Sent: 24 Apr 2001 09:39:31 GMT
Message-ID: <3AE53488.F99D55FD@sangate.com>
Date: Tue, 24 Apr 2001 11:08:40 +0300
From: Mark Mokryn <mark@sangate.com>
Organization: SANgate Systems
X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.2.16-22 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: John Hufferd <hufferd@us.ibm.com>
CC: Santosh Rao <santoshr@cup.hp.com>, ips@ece.cmu.edu
Subject: Re: iSCSI Target Reset
References: <OF1FDE19E2.A3D61BC6-ON88256A37.007604FE@LocalDomain>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Shark does indeed support Target Reset. According to the Shark reference
manual, a Target Reset has the effect of performing a LUN Reset on each
LUN surfaced on the port. Interestingly, for FC (and not for parallel
SCSI), Shark surfaces all LUNs on all ports (there are ACLs per LUN), so
I assume a Target Reset has the unfortunate effect of resetting the
entire Shark...
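
The interpretation discussed in this thread, a Target Reset carried out as a
LUN Reset on each logical unit the requesting initiator is allowed to
access, might be sketched as below.  This is an illustration only: the
class, its fields, and the reset behavior (clearing a reservation) are
assumptions made for the example and do not describe Shark or any other
product.

```python
class Target:
    """Hypothetical target with per-LUN access control lists."""

    def __init__(self, acl):
        self.acl = acl            # LUN -> set of initiator names allowed
        self.reservations = {}    # LUN -> initiator holding a reservation

    def logical_unit_reset(self, lun):
        # Assumed effect for this sketch: a reset clears any reservation.
        self.reservations.pop(lun, None)

    def target_reset(self, initiator):
        # Reset only the LUNs this initiator can see; leave the rest alone.
        visible = sorted(lun for lun, allowed in self.acl.items()
                         if initiator in allowed)
        for lun in visible:
            self.logical_unit_reset(lun)
        return visible
```

With ACLs that expose every LUN to the requesting initiator, this reduces to
resetting every LUN on the target, which is the case described above.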

John Hufferd wrote:
> 
> Absolutely not,  Why would we think that impacting 32 different other
> initiators is an OK thing to do.  By the way there are lots more Initiators
> possible with FC on Shark, and would hope that there would be even more
> with iSCSI.
> 
> I have been told that these large Storage Controllers do not support Target
> Reset today.  So I see no loss in not supporting such an item in iSCSI
> especially since many Initiators will be beyond even the distances and
> mischief that is possible with FC.
> 
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
> 
> Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001 01:24:02 PM
> 
> Sent by:  owner-ips@ece.cmu.edu
> 
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: iSCSI Target Reset
> 
> "Dillard, David" wrote:
> >
> > When will STORPORT be generally available?  The latest STORPORT document
> > that I found on the MS web site is version 0.6a, dated March 18, 2001.
> > Given this it seems like STORPORT might not be available soon.  In that
> > case do you know what happens with the current drivers?  Are we going to
> > be telling customers that if they want to use iSCSI and NT clustering
> > they have to update to Whistler?
> 
> [One would hope that this list does not turn into a Microsoft
> release/product discussion mailing list (?) ]
> 
> Without going into specifics of A certain O.S., does it suffice to
> require that iSCSI not break existing legacy SCSI applications ?
> 
> If the above is a valid requirement, then, knowing that legacy
> applications continue to use SCSI-2 Reserve/Release and the target reset
> as a mechanism of breaking SCSI-2 reservations, shouldn't iSCSI continue
> to support the target reset ?
> 
> - Santosh

-- 
Mark Mokryn      SANgate Systems Inc.      mark@SANgate.com
Phone: +972-9-8919821                Mobile: +972-54-270030
Fax: +972-9-8919449                  http://www.SANgate.com
P.O. Box 1486 41 Hameyasdim St., Even Yehuda 40500 Israel


From owner-ips@ece.cmu.edu  Tue Apr 24 08:40:18 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id IAA07279
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 08:40:18 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3O9DQQ04929
	for ips-outgoing; Tue, 24 Apr 2001 05:13:26 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3O9CcA02516
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 05:12:38 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id LAA145638
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 11:12:30 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id LAA83608;
	Tue, 24 Apr 2001 11:12:27 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A38.003291AF ; Tue, 24 Apr 2001 11:12:20 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: dfsmith@almaden.ibm.com
cc: ips@ece.cmu.edu
Message-ID: <C1256A38.00329036.00@d12mta02.de.ibm.com>
Date: Tue, 24 Apr 2001 12:17:46 +0300
Subject: Re: iSCSI-06 SCSI Cmd typo
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Thanks Daniel,

It's fixed. CmdSN is at 24 and  the following are shifted.

Julo

Daniel Smith <dfsmith@almaden.ibm.com> on 24/04/2001 03:16:31

Please respond to Daniel Smith <dfsmith@almaden.ibm.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:
Subject:  iSCSI-06 SCSI Cmd typo




In section 2.3 SCSI Command...

the table shows bytes 36--47 reserved for the CDB (12 bytes), but
the description in 2.3.6 says it's 16 bytes.

(I'd prefer 16 bytes---I'm not a big fan of Stat/DataSN.)

This document is getting big---but the latest version seems to be holding
up well as I read through it.  Good work.

Daniel
--
IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120-6099, USA
K65B/C2 Phone: +1(408)927-2072 Fax: +1(408)927-3010 Home: +1(408)227-5786





From owner-ips@ece.cmu.edu  Tue Apr 24 14:21:16 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA11916
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 14:21:15 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OFBfe26929
	for ips-outgoing; Tue, 24 Apr 2001 11:11:41 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mcmail.mcdata.com (mcgate.mcdata.com [144.49.1.5])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OFAeA26860
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 11:10:40 -0400 (EDT)
Received: from exchange5.mcdata.com (gw2exch [144.49.33.200])
	by mcmail.mcdata.com (8.8.6 (PHNE_14041)/8.8.6) with ESMTP id TAA08412
	for <ips@ece.cmu.edu>; Mon, 23 Apr 2001 19:12:18 -0600 (MDT)
Received: by exchange5.mcdata.com with Internet Mail Service (5.5.2653.19)
	id <J2K63KBV>; Tue, 24 Apr 2001 09:10:34 -0600
Message-ID: <F23E86F16912534DA9FA8937CFD7CA740262F934@exchange5.mcdata.com>
From: Anil Rijhsinghani <anil.rijhsinghani@mcdata.com>
To: "'ips@ece.cmu.edu'" <ips@ece.cmu.edu>
Subject: Internet draft: FCIP MIB 
Date: Tue, 24 Apr 2001 09:10:31 -0600
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Until the IETF folks put it up on their web site, here is the FCIP MIB
internet draft:

Word format: http://www.mcdata.com/fcip/draft-anil-fcip-mib-v01.doc
Text format: http://www.mcdata.com/fcip/draft-anil-fcip-mib-v01.txt

	Title		: FCIP MIB
	Author(s)	: S. Akkala, R. Natarajan, A. Rijhsinghani
	Filename	: draft-anil-fcip-mib-v01.txt
	Pages		: 17
	Date		: 20-Apr-01

Regards,
Anil



From owner-ips@ece.cmu.edu  Tue Apr 24 14:21:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA11927
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 14:21:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OFTBC27884
	for ips-outgoing; Tue, 24 Apr 2001 11:29:11 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from crufty.research.bell-labs.com (crufty.research.bell-labs.com [204.178.16.49])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3OFSNA27849
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 11:28:23 -0400 (EDT)
Received: from scummy.research.bell-labs.com ([135.104.2.10]) by crufty; Tue Apr 24 11:24:42 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by scummy; Tue Apr 24 11:27:07 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id LAA01558;
	Tue, 24 Apr 2001 11:27:00 -0400 (EDT)
Message-ID: <3AE59B44.8F2AC897@research.bell-labs.com>
Date: Tue, 24 Apr 2001 11:27:00 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: John Hufferd <hufferd@us.ibm.com>
CC: ips@ece.cmu.edu
Subject: Re: iSCSI Target Reset
References: <OFB5AEA26F.FCA1233A-ON88256A38.0034E5F8@LocalDomain>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

John Hufferd wrote:
> 
> I am posting Charles' comments to me on the reflector; you all might find
> them interesting.  Thank you Charles.
> 
> Any other comments?

It would be interesting to hear from Veritas (anyone on
the list?).  I think the Veritas Cluster Server also uses 
SCSI reservations to avoid split-brain and can support 
32-node clusters in a SAN.

-Sandeep

> 
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
> ---------------------- Forwarded by John Hufferd/San Jose/IBM on 04/24/2001
> 02:37 AM ---------------------------
> 
> Charles Monia <cmonia@NishanSystems.com> on 04/24/2001 12:02:36 AM
> 
> To:   John Hufferd/San Jose/IBM@IBMUS, Charles Monia
>       <cmonia@NishanSystems.com>
> cc:
> Subject:  RE: iSCSI Target Reset
> 
> Hi John;
> 
> The following is my .02:
> 
> 1.  Target reset must be supported (SAM says so at the moment).
> 
> 2.  The interconnect's behavior is outside the scope of SAM, i.e., it is
> up to the protocol spec.
> 
> 3.  IMO: The only SAM requirement is the behavior at device (LUN) level as
> seen by the  initiator issuing the request.  In that regard, for example,
> it's sufficient to reset only the LUs that the initiator can see.  In a
> virtual environment I assume that's a small subset of the LUs on a system.
> 
> Charles
> 
> > -----Original Message-----
> > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > Sent: Monday, April 23, 2001 10:43 PM
> > To: Charles Monia
> > Subject: RE: iSCSI Target Reset
> >
> >
> >
> > Charles,
> > You need to be more direct.  Is it a T10 requirement to support Target
> > Reset?  How much flexibility does T10 give the implementations?
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> >
> >
> > Charles Monia <cmonia@NishanSystems.com>@ece.cmu.edu on
> > 04/23/2001 08:36:20 PM
> >
> > Sent by:  owner-ips@ece.cmu.edu
> >
> >
> > To:   ips@ece.cmu.edu
> > cc:   Charles Monia <cmonia@NishanSystems.com>
> > Subject:  RE: iSCSI Target Reset
> >
> >
> >
> > Hi:
> >
> > Sorry to reopen old issues, but unfortunately that seems necessary since
> > something has been lost in translation.
> >
> > My concerns about TARGET RESET center on three areas:
> >
> > 1.  Adverse side effects at the transport layer that could affect other
> >     users.
> > 2.  Similar adverse side effects on the affected devices.
> > 3.  The impact on legacy software.
> >
> > The gist of my opinion on the first issue, (as expressed on the T10
> > reflector) is as follows:
> >
> > "> > > > The ... issue is whether mechanisms, such as target reset,
> > > > > > are appropriate for a given transport.  In my view, the only
> > > > > > immutable requirement is to preserve the transport-independent
> > > > > > part of the semantics.  The definition of transport-specific
> > > > > > side effects is best handled in the appropriate transport
> > > > > > specification."
> >
> > The point of the above is that, in my view, a protocol specification has
> > leeway to define the protocol-specific side effects in a rational and
> > benign manner provided that the observable effects on the attached
> > devices are preserved.
> >
> > Regarding adverse device-level side effects, I also stated that:
> >
> > "....restricting the operation [target reset] means providing hooks so
> > that only a trusted class of initiators can perform the function.  It's
> > a bit like controlling access to a file so that lots of users can read
> > it but only a trusted few can perform a write or delete operation."
> >
> > Finally, with regard to legacy implementations, my main intent was to
> > avoid the situation where an operation that was previously legal becomes
> > illegal.  In that regard, I also recall suggesting that the function be
> > treated as a LUN RESET broadcast to all the logical units to which the
> > initiator had access privileges.  That would also support the notion of
> > preventing adverse effects on other users.
> >
> > At any rate, since there is a proposal to make LUN RESET mandatory (see
> > below), I believed this was a reasonable and consistent alternative. I
> > therefore felt (and continue to believe) there is no justification for
> > making the function optional.
> >
> > Anyhow, these views did not prevail at the meeting in question. For that
> > reason, the so-called NOP proposal was made as a last-ditch effort to
> > accommodate legacy implementations by preserving a measure of backwards
> > compatibility (albeit token compatibility). In that regard, it seemed
> > the lesser evil.
> >
> > > > As for leaving things out of iSCSI - the default modus
> > > > operandi should be to put in everything that's described
> > > > in SAM2 unless we can convince T10 to take the feature
> > > > out of SAM2.  Let's not go deciding to cast things out
> > > > of SCSI on T10's behalf.
> >
> > So, I guess that anyone wishing to support a change in the spec is, of
> > course, free to pursue it in that forum.
> >
> > Charles
> >
> > PS: The proposal referenced above can be found at:
> > ftp://ftp.t10.org/t10/document.01/01-015r2.pdf
> >
> > > -----Original Message-----
> > > From: Elliott, Robert [mailto:Robert.Elliott@COMPAQ.com]
> > > Sent: Monday, April 23, 2001 5:20 PM
> > > To: ips@ece.cmu.edu; 'Black_David@emc.com'; 'cmonia@nishansystems.com'
> > > Subject: RE: iSCSI Target Reset
> > >
> > >
> > > It's not a very good situation when each device chooses the
> > > interpretation of TARGET RESET that it thinks is appropriate.
> > > IBM's Shark might choose a different hack than Compaq's
> > > RA8000.  How is software supposed to make sense of this?
> > >
> > > The rule in SAM-2 is that TARGET RESET resets every logical
> > > unit (subject to access controls, if implemented).  The fact
> > > is that in Fibre Channel, not many multi-LUN multi-port
> > > targets followed that rule.  The result is that software
> > > cannot tell what's going to happen and may have to handle
> > > targets from each vendor differently.  This is not very
> > > interoperable.
> > >
> > > Charles suggested in the T10 meeting that we allow it to
> > > be implemented as a no-op rather than let protocols drop
> > > support for it.  That doesn't help software like Windows that
> > > does expect certain effects - a no-op implementation
> > > would break clustering.  By removing it from the protocol,
> > > software is forced to find a suitable replacement (e.g. use
> > > LOGICAL UNIT RESET or switch to persistent reservations).
> > > In Windows, this can be done at the port driver level
> > > (STORPORT improving on SCSIPORT) or at the miniport level
> > > (convert each target reset request into multiple LOGICAL UNIT
> > > RESETs).
> > >
> > > Note that other protocols like NFS and HTTP over IP don't
> > > seem to have "server resets."
> > >
> > > The recent SAM-2 change was expressly designed to encourage
> > > iSCSI and SRP to drop support for TARGET RESET.  Please don't
> > > keep it because you think T10 would be offended :-)
> > >
> > > ---
> > > Rob Elliott, Compaq Server Storage
> > > Robert.Elliott@compaq.com
> > >
> > >
> > >
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Black_David@emc.com [mailto:Black_David@emc.com]
> > > > Sent: Monday, April 23, 2001 6:31 PM
> > > > To: cmonia@NishanSystems.com; ips@ece.cmu.edu
> > > > Subject: RE: iSCSI Target Reset
> > > >
> > > >
> > > > I agree with Charles that this is an implementation
> > > > issue.  If a Shark wants to reset all 32 adapters
> > > > when it receives a Target Reset on one of them, that's
> > > > a Shark implementation decision.  It's completely valid
> > > > to reset only the adapter that the Target Reset is
> > > > received on (common Fibre Channel behavior) or
> > > > only the iSCSI target to which the Target Reset is
> > > > addressed if there's more than one Target behind
> > > > the adapter.
> > > >
> > > > As for leaving things out of iSCSI - the default modus
> > > > operandi should be to put in everything that's described
> > > > in SAM2 unless we can convince T10 to take the feature
> > > > out of SAM2.  Let's not go deciding to cast things out
> > > > of SCSI on T10's behalf.
> > > >
> > > > Thanks,
> > > > --David
> > > >
> > > > > -----Original Message-----
> > > > > From:    Charles Monia [SMTP:cmonia@nishansystems.com]
> > > > > Sent:    Monday, April 23, 2001 7:12 PM
> > > > > To: ips@ece.cmu.edu
> > > > > Subject: RE: iSCSI Target Reset
> > > > >
> > > > > Hi:
> > > > >
> > > > > > These seem to be implementation decisions. I don't see how that
> > > > > > justifies removing support from the protocol.
> > > > >
> > > > > Charles
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > > > > > Sent: Monday, April 23, 2001 2:34 PM
> > > > > > To: Santosh Rao
> > > > > > Cc: ips@ece.cmu.edu
> > > > > > Subject: Re: iSCSI Target Reset
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Absolutely not.  Why would we think that impacting 32 different
> > > > > > > other initiators is an OK thing to do?  By the way, there are
> > > > > > > lots more Initiators possible with FC on Shark, and I would hope
> > > > > > > that there would be even more with iSCSI.
> > > > > > >
> > > > > > > I have been told that these large Storage Controllers do not
> > > > > > > support Target Reset today.  So I see no loss in not supporting
> > > > > > > such an item in iSCSI, especially since many Initiators will be
> > > > > > > beyond even the distances and mischief that is possible with FC.
> > > > > >
> > > > > > .
> > > > > > .
> > > > > > .
> > > > > > John L. Hufferd
> > > > > > Senior Technical Staff Member (STSM)
> > > > > > IBM/SSG San Jose Ca
> > > > > > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > > > > > Internet address: hufferd@us.ibm.com
> > > > > >
> > > > > >
> > > > > > Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001
> > > > > > 01:24:02 PM
> > > > > >
> > > > > > Sent by:  owner-ips@ece.cmu.edu
> > > > > >
> > > > > >
> > > > > > To:   ips@ece.cmu.edu
> > > > > > cc:
> > > > > > Subject:  Re: iSCSI Target Reset
> > > > > >
> > > > > >
> > > > > >
> > > > > > "Dillard, David" wrote:
> > > > > > >
> > > > > > > > When will STORPORT be generally available?  The latest
> > > > > > > > STORPORT document that I found on the MS web site is version
> > > > > > > > 0.6a, dated March 18, 2001.  Given this it seems like STORPORT
> > > > > > > > might not be available soon.  In that case do you know what
> > > > > > > > happens with the current drivers?  Are we going to be telling
> > > > > > > > customers that if they want to use iSCSI and NT clustering
> > > > > > > > they have to update to Whistler?
> > > > > >
> > > > > >
> > > > > > [One would hope that this list does not turn into a Microsoft
> > > > > > release/product discussion mailing list (?) ]
> > > > > >
> > > > > > > Without going into specifics of A certain O.S., does it suffice
> > > > > > > to require that iSCSI not break existing legacy SCSI
> > > > > > > applications ?
> > > > > > >
> > > > > > > If the above is a valid requirement, then, knowing that legacy
> > > > > > > applications continue to use SCSI-2 Reserve/Release and the
> > > > > > > target reset as a mechanism of breaking SCSI-2 reservations,
> > > > > > > shouldn't iSCSI continue to support the target reset ?
> > > > > >
> > > > > > - Santosh
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > >
> > >
> >
> >
> >
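
The interpretation quoted above, where TARGET RESET is treated as a LOGICAL UNIT RESET broadcast to only the logical units the requesting initiator may access, could be pictured roughly as follows. This is a minimal sketch and not anything taken from the iSCSI draft or SAM-2: the class names, the access map, and the simplification that a logical unit reset just drops pending tasks and releases any SCSI-2 reservation are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class LogicalUnit:
    lun: int
    reserved_by: Optional[str] = None            # holder of a SCSI-2 reservation, if any
    tasks: List[str] = field(default_factory=list)

    def reset(self) -> None:
        # Simplified LOGICAL UNIT RESET (assumption): abort tasks, release the reservation.
        self.tasks.clear()
        self.reserved_by = None


class Target:
    def __init__(self, lus: List[LogicalUnit], access: Dict[str, Set[int]]):
        self.lus = {lu.lun: lu for lu in lus}
        self.access = access                     # hypothetical map: initiator name -> visible LUNs

    def target_reset(self, initiator: str) -> List[int]:
        """Reset only the LUs this initiator can see; return the LUNs touched."""
        visible = sorted(self.access.get(initiator, set()))
        for lun in visible:
            self.lus[lun].reset()
        return visible


target = Target(
    [LogicalUnit(0), LogicalUnit(1, reserved_by="host-c"), LogicalUnit(2, reserved_by="host-b")],
    {"host-a": {0, 1}, "host-b": {2}, "host-c": {1}},
)
touched = target.target_reset("host-a")
print(touched, target.lus[1].reserved_by, target.lus[2].reserved_by)
```

In this toy setup host-a breaks host-c's reservation on LUN 1, which both can see, while LUN 2, which host-a cannot see, keeps its reservation.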


From owner-ips@ece.cmu.edu  Tue Apr 24 14:21:49 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA11938
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 14:21:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OEul726139
	for ips-outgoing; Tue, 24 Apr 2001 10:56:47 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e21.nc.us.ibm.com (e21.nc.us.ibm.com [32.97.136.227])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OEu6A26067
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 10:56:06 -0400 (EDT)
Received: from southrelay02.raleigh.ibm.com (southrelay02.raleigh.ibm.com [9.37.3.209])
	by e21.nc.us.ibm.com (8.9.3/8.9.3) with ESMTP id KAA88862;
	Tue, 24 Apr 2001 10:44:55 -0500
Received: from d04nms25nms26.raleigh.ibm.com (d04nms25.raleigh.ibm.com [9.67.228.6])
	by southrelay02.raleigh.ibm.com (8.11.1/NCO v4.96) with ESMTP id f3OEoa570522;
	Tue, 24 Apr 2001 10:50:36 -0400
Importance: Normal
Subject: iSCSI: handling of persistent reserves during initiator reboot
To: ips@ece.cmu.edu
Cc: iscsi-mib@external.cisco.com
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OFB34446FE.390D72C3-ON85256A38.004D52DB@raleigh.ibm.com>
From: "Thomas McSweeney" <rf42tpme@us.ibm.com>
Date: Tue, 24 Apr 2001 10:50:36 -0400
X-MIMETrack: Serialize by Router on D04NMS25/04/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/24/2001 10:50:36 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

I'd like to move this discussion from the iSCSI MIB exploder to the IPS
exploder, since it has implications on the iSCSI spec.

John Hufferd appended a question asking whether the target would reset its
session counters under the following conditions:

>And if the Host was rebooted but comes back to the Target
>with the same iSCSI name and ISID, of a broken connection --
>which has outstanding persistent reserves -- then the
>Target must respond with the corresponding TSID of the
>interrupted Session.   This means that the Session has
>continued, and the connection has also been continued.

My response was:

I disagree.  Certainly, it is a new connection, so at the target the old
connection object would disappear and be replaced.  I think it is a new
session also - spec section 1.2.3 states that Login with a null TSID is for
a new session.  Whether or not the old session is cleaned up is still a
matter of debate...  Section 6.7.3 mentions "Login with an implied Logout",
which seems to apply to this case.  However, that depends on how the
"multiple sessions between one initiator and one target" issue is resolved.

If the initiator retained enough info about the I-T nexus despite the
reboot to send the correct non-null TSID on the (re)Login, then it should
have saved all its counters too.  The target does not reset its session
object counters due to within-session connection recovery.

John replied:

>I did not say that the initiators saved the TSID, just their own ISID. The
>point is that it is a SAM-2 requirement that if Persistent Reserves are
>outstanding, the Initiator should be able to clean them up or continue to
>work with them after a boot.  This will mean that the target will have to
>detect that Persistent Reserves are outstanding, and keep appropriate
>information around for the reconnection (with the InitiatorName=xxxxxxx,
>and ISID of zzzzzzz).  The Target will then need to respond --- when a new
>session is being created for InitiatorName=xxxxxxx, and ISID of zzzzzzz
>--- with the TSID used with the broken connection.  In this way the
>initiator can clean up or continue to use the Persistent Reserves.

>Persistent Reserves are a bit different than any other State that is kept
>by the Target.

I believe the Login which the rebooted initiator sends has to be for a
"leading connection", so it can learn the value of negotiated iSCSI
parameters like MaxConnections and FMarker that are only exchanged during
initial session activation.  So to me it's a new session.  It shouldn't
matter to the underlying SCSI resources whether the TSID matches the
previous session or not.   I'm not sure how the target should handle the
outstanding persistent reserves - I assume they can be moved to the new
session.  Now it's time for the real experts to chime in...

I'd like to see a new section added to Chapter 6 of the spec to describe
the handling of outstanding persistent reserves, since it is a special
recovery case.

As for the SNMP objects, I think the counts will be reset (old session
instance disappears, new session instance initialized to zero).  This would
make the target's counts consistent with the initiator's.  We plan to
discuss the tracking of long-term (across multiple sessions) usage per
initiator in Nashua.
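
A rough sketch of the counter behaviour argued for above: a new session gets a fresh object with zeroed counters, while connection recovery inside a session leaves the counters alone. The class, field, and method names are hypothetical and are not taken from the iSCSI MIB draft.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SessionCounters:
    cmd_pdus: int = 0        # hypothetical counter names
    rsp_pdus: int = 0


class TargetStats:
    def __init__(self) -> None:
        self.sessions: Dict[int, SessionCounters] = {}   # keyed by TSID

    def new_session(self, tsid: int) -> SessionCounters:
        # Old instance (if any) disappears; the new one starts from zero.
        self.sessions[tsid] = SessionCounters()
        return self.sessions[tsid]

    def connection_recovered(self, tsid: int) -> SessionCounters:
        # Within-session recovery: same session object, counters untouched.
        return self.sessions[tsid]


stats = TargetStats()
stats.new_session(5).cmd_pdus += 3
after_recovery = stats.connection_recovered(5).cmd_pdus
after_relogin = stats.new_session(5).cmd_pdus
print(after_recovery, after_relogin)
```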

Tom McSweeney
iSCSI Development, Storage Systems Group, IBM
Email: rf42tpme@us.ibm.com
Phone: (USA) 919-254-5634  (tie line: 444-5634)
Fax:   (USA) 919-254-0391  (tie line: 444-0391)



From owner-ips@ece.cmu.edu  Tue Apr 24 14:24:06 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA11957
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 14:24:06 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OFEhc27041
	for ips-outgoing; Tue, 24 Apr 2001 11:14:43 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OFEZA27030
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 11:14:36 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <2G6LVTNB>; Tue, 24 Apr 2001 11:05:05 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080154A8@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net, ips@ece.cmu.edu
Cc: sob@harvard.edu, egrodriguez@lucent.com, mankin@east.isi.edu
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Tue, 24 Apr 2001 11:14:28 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> I'll stand by the stated intent of implementing this protocol in hardware.
> The same is also true for iFCP and FCIP.

I never questioned the fact that there will be hardware implementations.
It still appears to me that no change to the iSCSI requirements document
is needed to deal with this set of issues.  I am strongly opposed to linking
iSCSI specification development to FCIP and iFCP in a fashion that would
require all three to be submitted to the IESG as a single set of documents.

> Here IPS is developing a framing protocol that increases the level of
> error detection.

I'm sorry, but that's incorrect, because the IPS WG is not developing
any framing protocol.  draft-williams-tcpulpframe-01.txt and
draft-otis-tcp-framing-00.txt are both TSVWG drafts, not IPS drafts.
TSVWG is the right place to work on these sorts of common framing
mechanisms, and how that work is pursued is at the discretion and
judgement of TSVWG and its chairs.

> The IPS has made explicit reference to these intentions of
> having this protocol supported directly in hardware.  I will be happy to
> show how this protocol can be mapped into a common structure which would
> avail more protocols to this hardware acceleration at the same time ease
> the transition to SCTP.

The first step of submitting draft-otis-tcp-framing-00.txt has been taken
(thank you) and the IPS WG will observe and follow what is done in this
area by TSVWG.

Further discussion of framing belongs on the TSVWG list, not IPS.

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Tue Apr 24 14:27:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA12002
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 14:27:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OFui129405
	for ips-outgoing; Tue, 24 Apr 2001 11:56:44 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OFuHA29348
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 11:56:17 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id LAA67982
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 11:48:52 -0400
Received: from f3n42e (d03nm042h.boulder.ibm.com [9.99.140.42])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id JAA170590
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 09:56:10 -0600
Importance: Normal
Subject: Re: iSCSI: handling of persistent reserves during initiator reboot
To: "Thomas McSweeney" <rf42tpme@us.ibm.com>
Cc: ips@ece.cmu.edu
From: "Jim Hafner" <hafner@almaden.ibm.com>
Date: Tue, 24 Apr 2001 08:56:09 -0700
Message-ID: <OF66344A2F.2D475BD4-ON88256A38.0054B39C@LocalDomain>
X-MIMETrack: Serialize by Router on D03NM042/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/24/2001 08:56:10 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Tom,

Here's my two cents. Apologies for being long-winded.  In short, I think I
agree with most of your sentiment.

1) In order to accommodate Persistent Reservations (and a few other SCSI
things), I think it is important to have some mechanism for an initiator to
reestablish some nexus state information through logout/login.  The current
thinking in N&DT is that the initiator should reuse its old ISID and the
target should respond with the old TSID (thereby rebuilding the nexus).
Note, the initiator starts with TSID=0, as if it was a new session.  The
target, which remembers the old ISID/TSID pair, can reuse the old TSID. Keep
in mind that the target is already remembering some state information (the
reservation) about the ISID/TSID (and Names) relationship in the nexus, so
this isn't a big deal.

2) One way to view this SCSI requirement is that the SCSI nexus gets
rebuilt.  But that doesn't imply that the iSCSI session itself recovers.
Consequently, I would think that the iSCSI session is as "fresh" as it need
be.  Note that the nexus itself need not go back to the complete previous
state it was in.   Certain information needs to be restored for the nexus,
but not everything (there will be lost tasks and that's OK).  I think this
is a "restored" nexus, not a "continued" nexus, so the requirements are very
different.

3) There is a symmetrical problem here that was lost in the discussion
quoted.  Namely, persistent reserves are more precisely a requirement for
the target to maintain/restore through the target's reset events (like
power cycles).  It is less of a statement about initiator's logout/login.
FCP added something even more general by the requirement that persistent
reservation state for an initiator be restored after network address
reassignment.  (The initiator may have moved address but it is still the
same initiator by name, so it keeps some state through these underlying
transport events.) That's the problem that needs to be addressed.  We can
think of session crashes (connection drops), target recycles and host
rebooting as events that tear down the underlying transport; such events
should have minimal impact on SCSI stuff.

4) These "minimal impacts" may in fact be a lot (like aborting of all
current tasks and clearing session state).

5) I'm supportive of a section in the main document (perhaps an appendix or
annex, but I don't care) that does the following:
  (a) describes the mapping of SCSI terms to iSCSI terms, in particular,
the nexus, the initiator and target devices and ports and their names and
identifiers, et al.
  (b) defines the SCSI state of a nexus that must be restored when the
nexus gets rebuilt (this includes but is not exclusively persistent
reservations)
  (c) defines the procedures for how that nexus gets rebuilt and the side
effect that has on other state information at the SCSI and iSCSI levels
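
The login behaviour described in item 1 can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are invented here, TSIDs are handed out from a simple counter, and the target is assumed to remember an (InitiatorName, ISID) pair only while that nexus holds a persistent reservation.

```python
import itertools
from typing import Dict, Tuple


class NexusTable:
    def __init__(self) -> None:
        self._tsids = itertools.count(1)                     # 0 is what the initiator sends at login
        self._remembered: Dict[Tuple[str, int], int] = {}    # (InitiatorName, ISID) -> TSID

    def persistent_reserve(self, name: str, isid: int, tsid: int) -> None:
        # Reservation state is tied to the nexus, so remember its identifiers.
        self._remembered[(name, isid)] = tsid

    def login(self, name: str, isid: int, tsid: int = 0) -> Tuple[int, bool]:
        """Handle a login; return (TSID to use, whether the nexus was restored)."""
        if tsid != 0:
            raise ValueError("this sketch only covers logins with TSID=0")
        old = self._remembered.get((name, isid))
        if old is not None:
            return old, True                                 # rebuild the nexus under the old TSID
        return next(self._tsids), False                      # genuinely new nexus


table = NexusTable()
tsid, restored = table.login("host-a", isid=7)    # first login
table.persistent_reserve("host-a", 7, tsid)
again = table.login("host-a", isid=7)             # after a reboot, same name and ISID
other = table.login("host-a", isid=8)             # same name, different ISID
print((tsid, restored), again, other)
```

Lost tasks and other session-level state are deliberately not modeled; only the identifiers needed to reattach reservation state survive, which matches the "restored, not continued" view in item 2.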

Well that was probably more than 2 cents (or maybe it was more like 4
lire).

Jim Hafner


Thomas McSweeney/Raleigh/IBM@IBMUS@ece.cmu.edu on 04-24-2001 07:50:36 AM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:   iscsi-mib@external.cisco.com
Subject:  iSCSI: handling of persistent reserves during initiator reboot









From owner-ips@ece.cmu.edu  Tue Apr 24 15:07:33 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA12840
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 15:07:33 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OFiiM28698
	for ips-outgoing; Tue, 24 Apr 2001 11:44:44 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OFhfA28666
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 11:43:42 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NT3ZANQ>; Tue, 24 Apr 2001 11:45:12 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080154A9@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: hufferd@us.ibm.com
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI Target Reset
Date: Tue, 24 Apr 2001 11:43:34 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Starting with John's comments:

> described the action in FC terms.  In iSCSI terms, I think the Target
> Device would be a complete Symmetrix.  Do you think that is correct?

If existing cluster software continues to issue Target Resets, then
no, because such Target Resets must not reset the complete
Symmetrix for correct operation - they don't currently.

> What does Symmetrix do today to support Target Reset?

Reset the port on which it arrives (SCSI or FC), and generally
configure things so that anything else that could be negatively
impacted by side effects of that reset is not connected to that port.
IIRC, AIX clusters use Target Resets.

> And how do you think it should act in the future?

Whatever it takes to keep existing cluster software happy.

> Surely you do not want the complete controller reset, do you?

If "controller" means the entire array, then of course not -
don't be ridiculous.

On to Rob's most important comment:

> The recent SAM-2 change was expressly designed to encourage 
> iSCSI and SRP to drop support for TARGET RESET.  Please don't 
> keep it because you think T10 would be offended :-)

If dropping support for it breaks existing clustering software, that doesn't
seem like a good idea.  Such breakage will be viewed as an iSCSI
problem (it works fine on Fibre Channel ...), and will not contribute
to the adoption or acceptance of iSCSI :-(.

I appreciate T10's intention to obsolete Target Reset and would offer
two possible courses of action:

- If someone writes up a convincing case (Internet-Draft) for being
	able to map things that would have caused Target Resets to
	LUN Resets in iSCSI device drivers and the like in a fashion that
	will keep existing cluster software working (and unaware that
	this change has happened beneath it), then dropping
	Target Reset may be feasible.  This is akin to the discussion
	we went through about mandating Autosense and dropping
	Contingent Allegiance (it's easy to make a CA device do
	Autosense transparently), except that it'll have to consider what
	the various clustering software implementations actually do,
	as opposed to all the possible ways in which Target Reset
	could be used/abused.  Anyone want to volunteer?

- Put in some SHOULDs and SHOULD NOTs encouraging the use
	of LU Reset in place of Target Reset, and making Target Reset
	either OPTIONAL or NOT RECOMMENDED to implement.

The second bullet seems to match what T10 has done in practice
- this feature is not a good idea, but can't be removed from the
specification, at least not yet.
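
As a rough illustration of the first option, a driver-level shim could turn
each Target Reset request into LU Resets for only those logical units the
requesting initiator can see.  The Python sketch below is a toy under stated
assumptions: the names Target, LogicalUnit and visible are hypothetical, and
reset() only counts calls instead of issuing a real task management request.
Whether this keeps any particular cluster software working is exactly the
open question above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class LogicalUnit:
    lun: int
    reset_count: int = 0

    def reset(self) -> None:
        # Stand-in for a real LOGICAL UNIT RESET request (assumption).
        self.reset_count += 1


@dataclass
class Target:
    units: Dict[int, LogicalUnit] = field(default_factory=dict)
    # Hypothetical access-control map: initiator name -> LUNs it may see.
    visible: Dict[str, Set[int]] = field(default_factory=dict)


def target_reset_as_lu_resets(target: Target, initiator: str) -> List[int]:
    """Emulate Target Reset by resetting only the LUs visible to `initiator`.

    Returns the list of LUNs that were reset, in ascending order.
    """
    luns = sorted(target.visible.get(initiator, set()))
    for lun in luns:
        target.units[lun].reset()
    return luns


if __name__ == "__main__":
    tgt = Target(
        units={n: LogicalUnit(n) for n in range(4)},
        visible={"host-a": {0, 1}, "host-b": {2, 3}},
    )
    print(target_reset_as_lu_resets(tgt, "host-a"))
```

LUs belonging to other initiators are left untouched, which is the property
that matters for multi-initiator arrays.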

One more question for T10 - do the tape folks still need Target
Reset to whack an uncooperative robot/drive into submission
or is LU Reset good enough?

Comments?

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Tue Apr 24 15:17:41 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA13198
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 15:17:40 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OH2JU03010
	for ips-outgoing; Tue, 24 Apr 2001 13:02:19 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from lmoxch11.nsmg.veritas.com (london-bridge.east.veritas.com [207.30.27.2])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OH1CA02911
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 13:01:13 -0400 (EDT)
Received: by lmoxch11.nsmg.veritas.com with Internet Mail Service (5.5.2653.19)
	id <25X076WD>; Tue, 24 Apr 2001 13:02:51 -0400
Message-ID: <7E7FAEE92BB8D411A91D0008C7B1DC3D2186D8@lmoxch11.nsmg.veritas.com>
From: Roger Cummings <roger.cummings@veritas.com>
To: "'Sandeep Joshi'" <sandeepj@research.bell-labs.com>
Cc: "'ips@ece.cmu.edu'" <ips@ece.cmu.edu>
Subject: RE: iSCSI Target Reset
Date: Tue, 24 Apr 2001 13:02:50 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="ISO-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Sandeep,

Your statements about VERITAS Cluster Server (VCS) are correct. Today VCS
supports 32 node clusters and uses SCSI reservations in its storage
allocation and split brain detection algorithms. Target resets are also part
of the split brain algorithm, though they are not used in any major way. For
future configurations we're looking at using Persistent Reservations, as
defined in SPC-2 and beyond, for many of the reasons already stated in this
thread, i.e., scalability and data availability in large SANs with multiported
storage.

Resetting only the LUNs that a specific initiator can see, or providing a
LUN-specific reset, is an interesting idea, and I think there are a number
of storage controllers out there that provide such functionality in a vendor
unique way. But my personal feeling (not necessarily that of everyone @
VERITAS) is that using & extending the functionality provided by Persistent
Reservations is a better and cleaner solution going forward.

Regards,





Roger Cummings
Technology Group
VERITAS Software

roger.cummings@veritas.com

-----Original Message-----
From: Sandeep Joshi [mailto:sandeepj@research.bell-labs.com]
Sent: Tuesday, April 24, 2001 11:27 AM
To: John Hufferd
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI Target Reset


John Hufferd wrote:
> 
> I am posting Charles' comments to me onto the reflector; you all might find
> them interesting.  Thank you Charles.
> 
> Any other comments?

It would be interesting to hear from Veritas (anyone on
the list?).  I think the Veritas Cluster Server also uses 
SCSI reservations to avoid split-brain and can support 
32-node clusters in a SAN.

-Sandeep

> 
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
> ---------------------- Forwarded by John Hufferd/San Jose/IBM on 04/24/2001
> 02:37 AM ---------------------------
> 
> Charles Monia <cmonia@NishanSystems.com> on 04/24/2001 12:02:36 AM
> 
> To:   John Hufferd/San Jose/IBM@IBMUS, Charles Monia
>       <cmonia@NishanSystems.com>
> cc:
> Subject:  RE: iSCSI Target Reset
> 
> Hi John;
> 
> The following is my .02:
> 
> 1.  Target reset must be supported (SAM says so at the moment).
> 
> 2.  The interconnect's behavior is outside the scope of SAM, i.e., it is
> up to the protocol spec.
> 
> 3.  IMO: The only SAM requirement is the behavior at device (LUN) level as
> seen by the  initiator issuing the request.  In that regard, for example,
> it's sufficient to reset only the LUs that the initiator can see.  In a
> virtual environment I assume that's a small subset of the LUs on a system.
> 
> Charles
> 
> > -----Original Message-----
> > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > Sent: Monday, April 23, 2001 10:43 PM
> > To: Charles Monia
> > Subject: RE: iSCSI Target Reset
> >
> >
> >
> > Charles,
> > You need to be more direct.  Is it a T10 requirement to support Target
> > Reset?  How much flexibility does T10 give the implementations?
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> >
> >
> > Charles Monia <cmonia@NishanSystems.com>@ece.cmu.edu on
> > 04/23/2001 08:36:20
> > PM
> >
> > Sent by:  owner-ips@ece.cmu.edu
> >
> >
> > To:   ips@ece.cmu.edu
> > cc:   Charles Monia <cmonia@NishanSystems.com>
> > Subject:  RE: iSCSI Target Reset
> >
> >
> >
> > Hi:
> >
> > Sorry to reopen old issues, but unfortunately that seems necessary since
> > something has been lost in translation.
> >
> > My concerns about TARGET RESET center on three areas:
> >
> > 1.  Adverse side effects at the transport layer that could affect other
> >     users.
> > 2.  Similar adverse side effects on the affected devices.
> > 3.  The impact on legacy software.
> >
> > The gist of my opinion on the first issue, (as expressed on the T10
> > reflector) is as follows:
> >
> > "> > > > The ... issue is whether mechanisms, such as target reset,
> > > > > > are appropriate for a given transport.  In my view, the only
> > > > > > immutable requirement is to preserve the transport-independent
> > > > > > part of the semantics.  The definition of transport-specific
> > > > > > side effects is best handled in the appropriate transport
> > > > > > specification."
> >
> > The point of the above is that, in my view, a protocol specification has
> > leeway to define the protocol-specific side effects in a rational and
> > benign manner provided that the observable effects on the attached
> > devices are preserved.
> >
> > Regarding adverse device-level side effects, I also stated that:
> >
> > "....restricting the operation [target reset] means providing hooks so
> > that only a trusted class of initiators can perform the function.  It's
> > a bit like controlling access to a file so that lots of users can read
> > it but only a trusted few can perform a write or delete operation."
> >
> > Finally, with regard to legacy implementations, my main intent was to
> > avoid the situation where an operation that was previously legal becomes
> > illegal.  In that regard, I also recall suggesting that the function be
> > treated as a LUN RESET broadcast to all the logical units to which the
> > initiator had access privileges.  That would also support the notion of
> > preventing adverse effects on other users.
> >
> > At any rate, since there is a proposal to make LUN RESET mandatory (see
> > below), I believed this was a reasonable and consistent alternative. I
> > therefore felt (and continue to believe) there is no justification for
> > making the function optional.
> >
> > Anyhow, these views did not prevail at the meeting in question. For that
> > reason, the so-called NOP proposal was made as a last ditch effort to
> > accommodate legacy implementations by preserving a measure of backwards
> > compatibility (albeit token compatibility). In that regard, it seemed
> > the lesser evil.
> >
> > > > As for leaving things out of iSCSI - the default modus
> > > > operandi should be to put in everything that's described
> > > > in SAM2 unless we can convince T10 to take the feature
> > > > out of SAM2.  Let's not go deciding to cast things out
> > > > of SCSI on T10's behalf.
> >
> > So, I guess that anyone wishing to support a change in the spec is, of
> > course, free to pursue it in that forum.
> >
> > Charles
> >
> > PS: The proposal referenced above can be found at:
> > ftp://ftp.t10.org/t10/document.01/01-015r2.pdf
> >
> > > -----Original Message-----
> > > From: Elliott, Robert [mailto:Robert.Elliott@COMPAQ.com]
> > > Sent: Monday, April 23, 2001 5:20 PM
> > > To: ips@ece.cmu.edu; 'Black_David@emc.com'; 'cmonia@nishansystems.com'
> > > Subject: RE: iSCSI Target Reset
> > >
> > >
> > > It's not a very good situation when each device chooses the
> > > interpretation of TARGET RESET that it thinks is appropriate.
> > > IBM's Shark might choose a different hack than Compaq's
> > > RA8000.  How is software supposed to make sense of this?
> > >
> > > The rule in SAM-2 is that TARGET RESET resets every logical
> > > unit (subject to access controls, if implemented).  The fact
> > > is that in Fibre Channel, not many multi-LUN multi-port
> > > targets followed that rule.  The result is that software
> > > cannot tell what's going to happen and may have to handle
> > > targets from each vendor differently.  This is not very
> > > interoperable.
> > >
> > > Charles suggested in the T10 meeting that we allow it to
> > > be implemented as a no-op rather than let protocols drop
> > > support for it.  That doesn't help software like Windows that
> > > does expect certain effects - a no-op implementation
> > > would break clustering.  By removing it from the protocol,
> > > software is forced to find a suitable replacement (e.g. use
> > > LOGICAL UNIT RESET or switch to persistent reservations).
> > > In Windows, this can be done at the port driver level
> > > (STORPORT improving on SCSIPORT) or at the miniport level
> > > (convert each target reset request into multiple LOGICAL UNIT
> > > RESETs).
> > >
> > > Note that other protocols like NFS and HTTP over IP don't
> > > seem to have "server resets."
> > >
> > > The recent SAM-2 change was expressly designed to encourage
> > > iSCSI and SRP to drop support for TARGET RESET.  Please don't
> > > keep it because you think T10 would be offended :-)
> > >
> > > ---
> > > Rob Elliott, Compaq Server Storage
> > > Robert.Elliott@compaq.com
> > >
> > >
> > >
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Black_David@emc.com [mailto:Black_David@emc.com]
> > > > Sent: Monday, April 23, 2001 6:31 PM
> > > > To: cmonia@NishanSystems.com; ips@ece.cmu.edu
> > > > Subject: RE: iSCSI Target Reset
> > > >
> > > >
> > > > I agree with Charles that this is an implementation
> > > > issue.  If a Shark wants to reset all 32 adapters
> > > > when it receives a Target Reset on one of them, that's
> > > > a Shark implementation decision.  It's completely valid
> > > > to reset only the adapter that the Target Reset is
> > > > received on (common Fibre Channel behavior) or
> > > > only the iSCSI target to which the Target Reset is
> > > > addressed if there's more than one Target behind
> > > > the adapter.
> > > >
> > > > As for leaving things out of iSCSI - the default modus
> > > > operandi should be to put in everything that's described
> > > > in SAM2 unless we can convince T10 to take the feature
> > > > out of SAM2.  Let's not go deciding to cast things out
> > > > of SCSI on T10's behalf.
> > > >
> > > > Thanks,
> > > > --David
> > > >
> > > > > -----Original Message-----
> > > > > From:    Charles Monia [SMTP:cmonia@nishansystems.com]
> > > > > Sent:    Monday, April 23, 2001 7:12 PM
> > > > > To: ips@ece.cmu.edu
> > > > > Subject: RE: iSCSI Target Reset
> > > > >
> > > > > Hi:
> > > > >
> > > > > > These seem to be implementation decisions. I don't see how that
> > > > > > justifies removing support from the protocol.
> > > > >
> > > > > Charles
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > > > > > Sent: Monday, April 23, 2001 2:34 PM
> > > > > > To: Santosh Rao
> > > > > > Cc: ips@ece.cmu.edu
> > > > > > Subject: Re: iSCSI Target Reset
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Absolutely not.  Why would we think that impacting 32 different
> > > > > > > other initiators is an OK thing to do?  By the way, there are
> > > > > > > lots more Initiators possible with FC on Shark, and I would hope
> > > > > > > that there would be even more with iSCSI.
> > > > > > >
> > > > > > > I have been told that these large Storage Controllers do not
> > > > > > > support Target Reset today.  So I see no loss in not supporting
> > > > > > > such an item in iSCSI, especially since many Initiators will be
> > > > > > > beyond even the distances and mischief that is possible with FC.
> > > > > >
> > > > > > .
> > > > > > .
> > > > > > .
> > > > > > John L. Hufferd
> > > > > > Senior Technical Staff Member (STSM)
> > > > > > IBM/SSG San Jose Ca
> > > > > > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > > > > > Internet address: hufferd@us.ibm.com
> > > > > >
> > > > > >
> > > > > > Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001
> > > > > > 01:24:02 PM
> > > > > >
> > > > > > Sent by:  owner-ips@ece.cmu.edu
> > > > > >
> > > > > >
> > > > > > To:   ips@ece.cmu.edu
> > > > > > cc:
> > > > > > Subject:  Re: iSCSI Target Reset
> > > > > >
> > > > > >
> > > > > >
> > > > > > "Dillard, David" wrote:
> > > > > > >
> > > > > > > > When will STORPORT be generally available?  The latest STORPORT
> > > > > > > > document that I found on the MS web site is version 0.6a, dated
> > > > > > > > March 18, 2001.  Given this it seems like STORPORT might not be
> > > > > > > > available soon.  In that case do you know what happens with the
> > > > > > > > current drivers?  Are we going to be telling customers that if
> > > > > > > > they want to use iSCSI and NT clustering they have to update to
> > > > > > > > Whistler?
> > > > > >
> > > > > >
> > > > > > [One would hope that this list does not turn into a Microsoft
> > > > > > release/product discussion mailing list (?) ]
> > > > > >
> > > > > > > Without going into specifics of A certain O.S., does it suffice
> > > > > > > to require that iSCSI not break existing legacy SCSI
> > > > > > > applications?
> > > > > > >
> > > > > > > If the above is a valid requirement, then, knowing that legacy
> > > > > > > applications continue to use SCSI-2 Reserve/Release and the
> > > > > > > target reset as a mechanism of breaking SCSI-2 reservations,
> > > > > > > shouldn't iSCSI continue to support the target reset?
> > > > > >
> > > > > > - Santosh
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > >
> > >
> >
> >
> >


From owner-ips@ece.cmu.edu  Tue Apr 24 16:03:55 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA14153
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 16:03:55 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OGqjb02500
	for ips-outgoing; Tue, 24 Apr 2001 12:52:45 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dirty.research.bell-labs.com (dirty.research.bell-labs.com [204.178.16.6])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3OGpwA02468
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 12:51:58 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by dirty; Tue Apr 24 12:51:20 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Tue Apr 24 12:51:19 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id MAA05410;
	Tue, 24 Apr 2001 12:51:15 -0400 (EDT)
Message-ID: <3AE5AF03.D9CE93AA@research.bell-labs.com>
Date: Tue, 24 Apr 2001 12:51:15 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Jim Hafner <hafner@almaden.ibm.com>
CC: ips@ece.cmu.edu
Subject: Re: iSCSI: handling of persistent reserves during initiator reboot
References: <OF66344A2F.2D475BD4-ON88256A38.0054B39C@LocalDomain>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Jim Hafner wrote:
> 
> Tom,
> 
> Here's my two cents. Apologies for being long-winded.  In short, I think I
> agree with most of your sentiment.
> 
> 1) In order to accommodate Persistent Reservations (and a few other SCSI
> things), I think it is important to have some mechanism for an initiator to
> reestablish some nexus state information through logout/login.  The current
> thinking in N&DT is that the initiator should reuse its old ISID and the
> target should respond with the old TSID (thereby rebuilding the nexus).
> Note, the initiator starts with TSID=0, as if it was a new session.  The
> target that remembers the old ISID/TSID pair can reuse the old TSID. Keep
> in mind that the target is already remembering some state information (the
> reservation) about the ISID/TSID (and Names) relationship in the nexus, so
> this isn't a big deal.

If I am not mistaken, isn't this also adding to initiator state ?
Initiators now have to remember the ISIDs used with every target 
(or have only one!)

And doesn't this also add state to iSCSI-FC bridges...?
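
To make the state in question concrete, here is a minimal Python sketch of
the scheme quoted above, under stated assumptions.  The class
TargetNexusTable and its methods are hypothetical names, TSIDs are plain
integers, and the only behaviour modelled is: a login with TSID=0 gets the
old TSID back if the target retained the (InitiatorName, ISID) pair because
of outstanding persistent reservations, and a fresh TSID otherwise.  The
initiator side needs the matching per-target ISID map shown at the bottom.

```python
from typing import Dict, Tuple

NexusKey = Tuple[str, int]  # (InitiatorName, ISID)


class TargetNexusTable:
    """Toy model of the target-side state; not taken from the iSCSI draft."""

    def __init__(self) -> None:
        self._next_tsid = 1
        self._active: Dict[NexusKey, int] = {}
        # Pairs kept after session loss because reservations are outstanding.
        self._retained: Dict[NexusKey, int] = {}

    def login(self, name: str, isid: int) -> int:
        """Leading login sent with TSID=0; returns the TSID the target assigns."""
        key = (name, isid)
        tsid = self._retained.pop(key, None)
        if tsid is None:
            tsid = self._next_tsid
            self._next_tsid += 1
        self._active[key] = tsid
        return tsid

    def session_lost(self, name: str, isid: int,
                     has_persistent_reservation: bool) -> None:
        """Drop the session; remember its TSID only if reservations exist."""
        key = (name, isid)
        tsid = self._active.pop(key)
        if has_persistent_reservation:
            self._retained[key] = tsid


if __name__ == "__main__":
    # Initiator-side state: the ISID to reuse with each target (assumption).
    isid_for_target = {"target-1": 0x11}

    table = TargetNexusTable()
    first = table.login("host-a", isid_for_target["target-1"])
    table.session_lost("host-a", 0x11, has_persistent_reservation=True)
    again = table.login("host-a", isid_for_target["target-1"])
    print(first == again)
```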

Regards,
-Sandeep


From owner-ips@ece.cmu.edu  Tue Apr 24 16:09:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA14234
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 16:09:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OHGnE03805
	for ips-outgoing; Tue, 24 Apr 2001 13:16:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OHFiA03732
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 13:15:44 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP id 93B4468E
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 10:15:43 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA01889;
	Tue, 24 Apr 2001 10:15:37 -0700 (PDT)
Message-ID: <3AE5B486.DE3080D9@cup.hp.com>
Date: Tue, 24 Apr 2001 10:14:46 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: IPS Reflector <ips@ece.cmu.edu>
Subject: iSCSI : Clearing effects of Initiator Actions.
Content-Type: multipart/mixed;
 boundary="------------0C5CF7040FF0ECA4EC75524C"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------0C5CF7040FF0ECA4EC75524C
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


The iSCSI spec needs to explicitly document the clearing effects of
initiator actions on target objects, along the lines of Table-4 of
FCP-2.

A matrix of the following initiator actions :
1) session login (leading)
2) connection login (non-leading)
3) connection login with the restart bit
4) logout - close the session
5) logout - close the connection
6) logout - connection recovery
7) logout - remove the connection at tgt's request
8) power cycle
9) TCP connection reset
10) TCP connection close
11) Abort Task
12) Abort Task Set
13) Clear Task Set
14) LUN reset
15) Warm Target Reset
16) Cold Target Reset


against their corresponding effects on the following target objects :
1) session login parameters (leading)
2) connection login parameters (non-leading)
3) Mode Page parameters
4) (ExpCmdSN, MaxCmdSN) window size.
5) Active SCSI tasks 
6) Active non-SCSI tasks
7) ACA, UA and deferred errors.
8) SCSI-2 Reservations.
9) Persistent Reservations
10) CRN
11) CmdSN
12) StatSN
13) MIB counters

would be useful.

All of the clearing effects on target objects need to be described:
a) for the initiator originating the action
b) for all initiators.
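
One way to keep track of such a matrix while it is being filled in is
sketched below in Python.  This is only an illustrative skeleton: the
Effect and Scope names are assumptions, and every cell deliberately starts
as UNSPECIFIED because the actual clearing effects are what the spec still
has to define.

```python
from enum import Enum
from typing import Dict, List, Tuple

ACTIONS = [
    "session login (leading)", "connection login (non-leading)",
    "connection login with the restart bit", "logout - close the session",
    "logout - close the connection", "logout - connection recovery",
    "logout - remove the connection at tgt's request", "power cycle",
    "TCP connection reset", "TCP connection close", "Abort Task",
    "Abort Task Set", "Clear Task Set", "LUN reset",
    "Warm Target Reset", "Cold Target Reset",
]

OBJECTS = [
    "session login parameters (leading)",
    "connection login parameters (non-leading)", "Mode Page parameters",
    "(ExpCmdSN, MaxCmdSN) window size", "Active SCSI tasks",
    "Active non-SCSI tasks", "ACA, UA and deferred errors",
    "SCSI-2 Reservations", "Persistent Reservations", "CRN", "CmdSN",
    "StatSN", "MIB counters",
]


class Effect(Enum):
    UNSPECIFIED = "?"   # not yet defined by the spec
    CLEARED = "Y"
    PRESERVED = "N"


class Scope(Enum):
    ORIGINATOR = "initiator originating the action"
    ALL = "all initiators"


Cell = Tuple[str, str, Scope]


def empty_matrix() -> Dict[Cell, Effect]:
    """Build the full action x object x scope table with no entries decided."""
    return {(a, o, s): Effect.UNSPECIFIED
            for a in ACTIONS for o in OBJECTS for s in Scope}


def unresolved(matrix: Dict[Cell, Effect]) -> List[Cell]:
    """List the cells that still need a decision."""
    return [cell for cell, eff in matrix.items() if eff is Effect.UNSPECIFIED]


if __name__ == "__main__":
    m = empty_matrix()
    print(len(m), len(unresolved(m)))
```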

- Santosh
--------------0C5CF7040FF0ECA4EC75524C
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------0C5CF7040FF0ECA4EC75524C--



From owner-ips@ece.cmu.edu  Tue Apr 24 17:00:44 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA15240
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 17:00:43 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OHTVC04420
	for ips-outgoing; Tue, 24 Apr 2001 13:29:31 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OHSFA04380
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 13:28:15 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel2.hp.com (Postfix) with ESMTP id 0D7BC1410
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 10:28:05 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA03100;
	Tue, 24 Apr 2001 10:27:57 -0700 (PDT)
Message-ID: <3AE5B76A.45DA34C0@cup.hp.com>
Date: Tue, 24 Apr 2001 10:27:06 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: IPS Reflector <ips@ece.cmu.edu>
Subject: iSCSI : Aborting non-SCSI tasks.
Content-Type: multipart/mixed;
 boundary="------------7A06EC3A18E168B8B6AB4B88"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------7A06EC3A18E168B8B6AB4B88
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

All,

The iSCSI spec is missing a description of how non-SCSI tasks should be
aborted in order to flush stale PDUs of that task. Initiators will
typically time non-SCSI [& SCSI] tasks and will need to resort to some
form of abort and cleanup action on a timeout of the non-SCSI task.

This is required in order to safely re-use the task tag resources
without the danger of stale PDUs arriving from a previous incarnation of
that task tag.

The spec should provide some description on how this is to be done.
Perhaps, the semantics of Abort Task can be extended to non-SCSI tasks
as well, to avoid defining a second abort mechanism for non-SCSI tasks.
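
The tag re-use hazard can be sketched from the initiator's side as follows.
This Python fragment is a hypothetical illustration, not a proposal for the
spec text: TaskTagPool and its method names are invented here, and
abort_confirmed() stands for whatever response the chosen abort mechanism
ends up providing.  A tag whose task timed out stays quarantined, and cannot
be re-allocated, until the abort is confirmed.

```python
from typing import Set


class TaskTagPool:
    """Toy initiator task tag allocator with quarantine for timed-out tags."""

    def __init__(self, size: int) -> None:
        self._free: Set[int] = set(range(size))
        self._in_use: Set[int] = set()
        self._quarantined: Set[int] = set()

    def allocate(self) -> int:
        # min() raises ValueError when no tag is free.
        tag = min(self._free)
        self._free.remove(tag)
        self._in_use.add(tag)
        return tag

    def complete(self, tag: int) -> None:
        """Normal completion: the tag is immediately safe to re-use."""
        self._in_use.remove(tag)
        self._free.add(tag)

    def timed_out(self, tag: int) -> None:
        """Task timed out: stale PDUs may still arrive, so hold the tag back."""
        self._in_use.remove(tag)
        self._quarantined.add(tag)

    def abort_confirmed(self, tag: int) -> None:
        """Abort acknowledged: no stale PDUs expected, release the tag."""
        self._quarantined.remove(tag)
        self._free.add(tag)


if __name__ == "__main__":
    pool = TaskTagPool(2)
    t0 = pool.allocate()
    pool.timed_out(t0)
    t1 = pool.allocate()
    print(t0 != t1)
    pool.abort_confirmed(t0)
    print(pool.allocate() == t0)
```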

- Santosh
--------------7A06EC3A18E168B8B6AB4B88--



From owner-ips@ece.cmu.edu  Tue Apr 24 17:28:47 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA15740
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 17:28:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OJBuY10225
	for ips-outgoing; Tue, 24 Apr 2001 15:11:56 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxbh4.isus.emc.com (mxbh4.isus.emc.com [128.221.10.33])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OJBUA10203
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 15:11:31 -0400 (EDT)
Received: by mxbh4.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <JG35AMZ9>; Tue, 24 Apr 2001 15:11:25 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080154AE@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: dotis@sanlight.net, ips@ece.cmu.edu
Cc: mankin@east.isi.edu, egrodriguez@lucent.com, sob@harvard.edu
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Tue, 24 Apr 2001 15:11:23 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> The only aspect that I was requesting was the iSCSI requirements
> document adds a requirement to seek this common structure for hardware
> support.  You say it is not practical.  I differ with that opinion.

A sentence equivalent in strength to the following sentence
from the IPS WG charter is acceptable:

  The WG will consider whether a layered architecture providing
  common transport, security, and/or other functionality for its
  encapsulations is the best technical approach.

Anything stronger risks linking the protocol encapsulations
in a way that has the potential to cause enormous procedural
headaches.  Common framing work will take place in TSVWG and
can be incorporated by reference to a TSVWG document if
appropriate. 

As for iSCSI on SCTP -- "Send Draft".

--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Tue Apr 24 17:44:48 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA15953
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 17:44:48 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OJ2q509677
	for ips-outgoing; Tue, 24 Apr 2001 15:02:52 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from hotmail.com (oe65.law11.hotmail.com [64.4.16.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OJ27A09647
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 15:02:08 -0400 (EDT)
Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC;
	 Tue, 24 Apr 2001 12:01:53 -0700
X-Originating-IP: [66.31.72.237]
From: "Eddy Quicksall" <ESQuicksall@hotmail.com>
To: "Santosh Rao" <santoshr@cup.hp.com>, "IPS Reflector" <ips@ece.cmu.edu>
References: <3AE5B76A.45DA34C0@cup.hp.com>
Subject: Re: iSCSI : Aborting non-SCSI tasks.
Date: Tue, 24 Apr 2001 15:01:53 -0400
MIME-Version: 1.0
Content-Type: text/plain;	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.3018.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.3018.1300
Message-ID: <OE657RAluo3npvqnJNe00002eeb@hotmail.com>
X-OriginalArrivalTime: 24 Apr 2001 19:01:53.0333 (UTC) FILETIME=[066C7650:01C0CCF1]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

I started a thread on this once called "aborting an out of sequence cmdSN".
What you mention here is what I was talking about. There may be some
relevant answer in that thread.

My problem was that an ABORT TASK would have to pass to the TARGET layer in
order to abort the task (assuming one thinks the TARGET layer should handle
tasks ... I happen to think that way because you could have different
transport layers all feeding the TARGET layer).

I would also like some clean solution.

Eddy

----- Original Message -----
From: "Santosh Rao" <santoshr@cup.hp.com>
To: "IPS Reflector" <ips@ece.cmu.edu>
Sent: Tuesday, April 24, 2001 1:27 PM
Subject: iSCSI : Aborting non-SCSI tasks.


> All,
>
> The iSCSI spec is missing a description of how non-SCSI tasks should be
> aborted in order to flush stale PDUs of that task. Initiators will
> typically time non-SCSI [& SCSI] tasks and will need to resort to some
> form of abort and cleanup action on a timeout of the non-SCSI task.
>
> This is required in order to safely re-use the task tag resources
> without the danger of stale PDUs arriving from a previous incarnation of
> that task tag.
>
> The spec should provide some description on how this is to be done.
> Perhaps, the semantics of Abort Task can be extended to non-SCSI tasks
> as well, to avoid defining a second abort mechanism for non-SCSI tasks.
>
> - Santosh
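
The tag re-use concern in the quoted message can be illustrated with a minimal sketch. It assumes an initiator that keeps a timed-out task tag out of circulation until the abort has been confirmed. The class and method names are hypothetical and are not taken from the iSCSI draft; how the confirmation is actually obtained is exactly the open question raised above.

```python
class TaskTagPool:
    """Hands out initiator task tags and quarantines timed-out ones.

    Hypothetical illustration only; not an API from any iSCSI draft.
    """

    def __init__(self, size):
        self.free = list(range(size))   # tags available for new tasks
        self.active = set()             # tags with a task in flight
        self.quarantined = set()        # timed out, abort not yet confirmed

    def allocate(self):
        # Take the oldest free tag and mark it as in flight.
        tag = self.free.pop(0)
        self.active.add(tag)
        return tag

    def complete(self, tag):
        # Normal completion: the tag is immediately reusable.
        self.active.remove(tag)
        self.free.append(tag)

    def timeout(self, tag):
        # The task timed out; stale PDUs may still arrive, so hold the tag.
        self.active.remove(tag)
        self.quarantined.add(tag)

    def abort_confirmed(self, tag):
        # Assumed event: the target has acknowledged the abort.
        self.quarantined.remove(tag)
        self.free.append(tag)

    def accepts(self, tag):
        # A PDU is only matched to a task whose tag is currently active.
        return tag in self.active
```

With this arrangement a late PDU carrying a quarantined tag is rejected by `accepts()` instead of being matched to a new task that happens to use the same tag.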


From owner-ips@ece.cmu.edu  Tue Apr 24 17:47:18 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA15997
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 17:47:14 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OJGpi10498
	for ips-outgoing; Tue, 24 Apr 2001 15:16:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OJGYA10481
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 15:16:34 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3OKOH130840;
	Tue, 24 Apr 2001 13:24:22 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "KRUEGER,MARJORIE \(HP-Roseville,ex1\)" <marjorie_krueger@hp.com>,
        "Ips Reflector \(E-mail\)" <ips@ece.cmu.edu>
Subject: RE: I-D ACTION:draft-ietf-ips-iscsi-reqmts-03.txt
Date: Tue, 24 Apr 2001 12:14:25 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJEEMHCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <6BD67FFB937FD411A04F00D0B74FE87802A08FF0@xrose06.rose.hp.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Marjorie,

Thanks for the change.  One question however.

On page 18,
 "The iSCSI protocol document SHOULD NOT define the management
  architecture for iSCSI within the network infrastructure."

 What does this mean?

I think it means that the iSCSI protocol document SHOULD NOT define the
management architecture, and that the management architecture may be defined
in a separate document.

I do not understand the phrase 'within the network infrastructure.'

Doug



From owner-ips@ece.cmu.edu  Tue Apr 24 17:47:58 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA16021
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 17:47:58 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OIW0707947
	for ips-outgoing; Tue, 24 Apr 2001 14:32:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OIV9A07866
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 14:31:09 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel2.hp.com (Postfix) with ESMTP
	id A056213F0; Tue, 24 Apr 2001 11:31:08 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id LAA08552;
	Tue, 24 Apr 2001 11:31:02 -0700 (PDT)
Message-ID: <3AE5C631.EC5015FB@cup.hp.com>
Date: Tue, 24 Apr 2001 11:30:09 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Cc: David Black <Black_David@emc.com>
Subject: Re: iSCSI : digest error handling violates EMDP/InDataOrder
References: <FFD40DB4943CD411876500508BAD0279026B20F0@sj5-ex2.brocade.com>
Content-Type: multipart/mixed;
 boundary="------------90C909B9CA95992727122E82"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------90C909B9CA95992727122E82
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


What is the final resolution on this EMDP issue? IMHO, iSCSI must retain
the EMDP semantics as defined in FCP and SRP, i.e. it controls the order of
the data across the entire SCSI command (which includes sending R2T
requests in order, if EMDP was set to 1).

Some additional thoughts on this topic are :

1) Is it worth a finer granularity of control, wherein the initiator is
allowed to negotiate with the target that R2T requests be sent in order,
while not imposing any constraints on the Read Data PDU order?

2) Should control be provided over a "Random Relative Offset" feature,
as Bob describes it below, or is it to be assumed that iSCSI Data PDUs
will always be in-order within a sequence ?

3) Speaking of sequence, this terminology has often been used in this
thread. Where is the notion of a sequence defined in iSCSI? What is the
definition of an iSCSI sequence?

- Santosh

Robert Snively wrote:
> 
> Seems to me that there are some unclarities in this area as well.
> 
> There are really two pieces being discussed as one:
> 
>         EMDP (a SCSI functionality)
> 
>         Random relative offset (a transport functionality)
> 
> EMDP is used to allow a target to request or deliver its data
> out of order.  This is used for things like passing a stripe
> segment from a RAID data extent as soon as it has been accumulated,
> rather than waiting until all previous parts of the RAID data
> extent have also been accumulated and delivered.  It is also used
> for things like "start anywhere" reading of a disk track.
> 
> It says nothing about the ordering of data within a PDU or sequence
> which must be ordered according to the rules of the protocol.  Fibre
> Channel allows the data within a sequence to be transmitted in order
> or out of order by using the login parameter "random relative offset".
> Almost all devices choose to login and require "continuously increasing
> relative offset".
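
The distinction drawn in the quoted text can be sketched in a few lines. The check below assumes that "continuously increasing relative offset" means each data unit in a sequence starts exactly where the previous one ended. The function name and the (offset, length) representation are hypothetical and are not taken from FCP or the iSCSI draft.

```python
def is_continuously_increasing(units):
    """Return True if each (offset, length) unit follows the previous one.

    `units` lists the data units of ONE sequence in arrival order.
    """
    expected = None
    for offset, length in units:
        if expected is not None and offset != expected:
            return False
        expected = offset + length
    return True


# Two sequences of one command. Under the reading above, EMDP governs
# whether the second sequence may arrive before the first, while the
# check applies only inside each sequence.
first = [(0, 512), (512, 512)]
second = [(4096, 512), (4608, 512)]
```

In this sketch both `first` and `second` pass the check regardless of the order in which the two sequences themselves are delivered.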
--------------90C909B9CA95992727122E82
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------90C909B9CA95992727122E82--



From owner-ips@ece.cmu.edu  Tue Apr 24 17:51:20 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA16116
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 17:51:19 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OIgng08568
	for ips-outgoing; Tue, 24 Apr 2001 14:42:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from hotmail.com (oe49.law11.hotmail.com [64.4.16.21])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OIfjA08534
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 14:41:45 -0400 (EDT)
Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC;
	 Tue, 24 Apr 2001 11:41:39 -0700
X-Originating-IP: [66.31.72.237]
From: "Eddy Quicksall" <ESQuicksall@hotmail.com>
To: "Roger Cummings" <roger.cummings@veritas.com>,
        "'Sandeep Joshi'" <sandeepj@research.bell-labs.com>
Cc: <ips@ece.cmu.edu>
References: <7E7FAEE92BB8D411A91D0008C7B1DC3D2186D8@lmoxch11.nsmg.veritas.com>
Subject: Re: iSCSI Target Reset
Date: Tue, 24 Apr 2001 14:41:38 -0400
MIME-Version: 1.0
Content-Type: text/plain;	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.3018.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.3018.1300
Message-ID: <OE49yI8ftNH3NnRFrN900002a08@hotmail.com>
X-OriginalArrivalTime: 24 Apr 2001 18:41:39.0161 (UTC) FILETIME=[32B88490:01C0CCEE]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

I think I may have opened a can of worms here by accident ... in another
thread, I justified a TARGET RESET by saying that an NT driver may be doing
a TARGET RESET to each target it is handling to emulate the SCSI BUS RESET.
That is because I was mentally mapping a parallel SCSI-2 thing called a BUS
DEVICE RESET to a TARGET RESET and did not think about the LOGICAL UNIT
RESET.

The NT driver for iSCSI will be exposing LUs to the SCSI Port Driver (or
another method of supporting SCSI). So, the driver may not expose all
possible targets/LUNs. NT will issue a blind reset to the emulated BUS and
that may itself not even represent all LUs.

If that driver were to issue a LOGICAL UNIT RESET to the LUs it is exposing,
that should take care of it. Remember also, that for the case I mentioned
(NT), the number of LUs and Hosts is extremely limited compared to what you
guys are talking about.
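
The per-LU approach just described can be sketched as follows. This is only an illustration under assumptions: `exposed_luns` stands for whatever set of LUs the driver has exposed on the emulated bus, and `send_lu_reset` is a hypothetical callback that issues one LOGICAL UNIT RESET and reports success. Neither name comes from the spec or from any NT interface.

```python
def expand_target_reset(exposed_luns, send_lu_reset):
    """Replace one bus/target reset with a LOGICAL UNIT RESET per exposed LU.

    Returns a dict mapping each LUN to the result of its reset.
    """
    results = {}
    for lun in sorted(exposed_luns):
        # Only LUs the driver exposes are touched; other hosts' LUs are not.
        results[lun] = send_lu_reset(lun)
    return results
```

Because the loop is bounded by the LUs the driver itself exposes, LUs that belong to other initiators are left alone.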

Since an iSCSI driver has not even been fully written yet (since the spec is
not ready), there is no legacy issue ... just an implementation issue.
Perhaps a note in the spec would help.


Eddy

----- Original Message -----
From: "Roger Cummings" <roger.cummings@veritas.com>
To: "'Sandeep Joshi'" <sandeepj@research.bell-labs.com>
Cc: <ips@ece.cmu.edu>
Sent: Tuesday, April 24, 2001 1:02 PM
Subject: RE: iSCSI Target Reset


> Sandeep,
>
> Your statements about VERITAS Cluster Server (VCS) are correct. Today VCS
> supports 32 node clusters and uses SCSI reservations in its storage
> allocation and split brain detection algorithms. Target resets are also
> part of the split brain algorithm, though they are not used in any major
> way. For future configurations we're looking at using Persistent
> Reservations, as defined in SPC-2 and beyond, for many of the reasons
> already stated in this thread i.e. scalability and data availability in
> large SANs with multiported storage.
>
> Resetting only the LUNs that a specific initiator can see, or providing a
> LUN-specific reset, is an interesting idea, and I think there are a number
> of storage controllers out there that provide such functionality in a
> vendor unique way. But my personal feeling (not necessarily that of
> everyone @ VERITAS) is that using & extending the functionality provided
> by Persistent Reservations is a better and cleaner solution going forward.
>
> Regards,
>
>
>
>
>
> Roger Cummings
> Technology Group
> VERITAS Software
>
> roger.cummings@veritas.com
>
> -----Original Message-----
> From: Sandeep Joshi [mailto:sandeepj@research.bell-labs.com]
> Sent: Tuesday, April 24, 2001 11:27 AM
> To: John Hufferd
> Cc: ips@ece.cmu.edu
> Subject: Re: iSCSI Target Reset
>
>
> John Hufferd wrote:
> >
> > I am posting Charles comments to me onto the reflector, you all might
> > find it interesting.  Thank you Charles.
> >
> > Any other comments?
>
> It would be interesting to hear from Veritas (anyone on
> the list?).  I think the Veritas Cluster Server also uses
> SCSI reservations to avoid split-brain and can support
> 32-node clusters in a SAN.
>
> -Sandeep
>
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> > ---------------------- Forwarded by John Hufferd/San Jose/IBM on
> > 04/24/2001 02:37 AM ---------------------------
> >
> > Charles Monia <cmonia@NishanSystems.com> on 04/24/2001 12:02:36 AM
> >
> > To:   John Hufferd/San Jose/IBM@IBMUS, Charles Monia
> >       <cmonia@NishanSystems.com>
> > cc:
> > Subject:  RE: iSCSI Target Reset
> >
> > Hi John;
> >
> > The following is my .02:
> >
> > 1.  Target reset must be supported (SAM says so at the moment).
> >
> > 2.  The interconnect's behavior is outside the scope of SAM, i.e. it is
> > up to the protocol spec.
> >
> > 3.  IMO: The only SAM requirement is the behavior at device (LUN) level
> > as seen by the initiator issuing the request.  In that regard, for
> > example, it's sufficient to reset only the LUs that the initiator can
> > see.  In a virtual environment I assume that's a small subset of the LUs
> > on a system.
> >
> > Charles
> >
> > > -----Original Message-----
> > > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > > Sent: Monday, April 23, 2001 10:43 PM
> > > To: Charles Monia
> > > Subject: RE: iSCSI Target Reset
> > >
> > >
> > >
> > > Charles,
> > > You need to be more direct.  Is it a T10 requirement to support Target
> > > Reset?  How much flexibility does T10 give the implementations?
> > >
> > > .
> > > .
> > > .
> > > John L. Hufferd
> > > Senior Technical Staff Member (STSM)
> > > IBM/SSG San Jose Ca
> > > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > > Internet address: hufferd@us.ibm.com
> > >
> > >
> > > Charles Monia <cmonia@NishanSystems.com>@ece.cmu.edu on
> > > 04/23/2001 08:36:20
> > > PM
> > >
> > > Sent by:  owner-ips@ece.cmu.edu
> > >
> > >
> > > To:   ips@ece.cmu.edu
> > > cc:   Charles Monia <cmonia@NishanSystems.com>
> > > Subject:  RE: iSCSI Target Reset
> > >
> > >
> > >
> > > Hi:
> > >
> > > Sorry to reopen old issues, but unfortunately that seems
> > > necessary since
> > > something has been lost in translation.
> > >
> > > My concerns about TARGET RESET center on three areas:
> > >
> > > 1.  Adverse side effects at the transport layer that could
> > > affect other
> > > users.
> > > 2.  Similar adverse side effects on the affected devices,
> > > 3.  The impact on legacy software.
> > >
> > > The gist of my opinion on the first issue (as expressed on the T10
> > > reflector) is as follows:
> > >
> > > "> > > > The ... issue is whether mechanisms, such as target reset,
> > > > > > > are appropriate for a given transport.  In my view, the only
> > > > > > > immutable requirement is to preserve the transport-independent
> > > > > > > part of the semantics.  The definition of transport-specific
> > > > > > > side effects is best handled in the appropriate transport
> > > > > > > specification."
> > >
> > > The point of the above is that, in my view, a protocol
> > > specification has
> > > leeway to define the protocol-specific side effects in a rational and
> > > benign
> > > manner provided that the observable effects on the attached
> > > devices are
> > > preserved.
> > >
> > > Regarding adverse device-level side effects, I also stated that:
> > >
> > > "....restricting the operation [target reset] means providing
> > > hooks so that
> > > only a
> > > trusted class of initiators can perform the function.  It's a bit like
> > > controlling access to a file so that lots of users can read
> > > it but only a
> > > trusted few can perform a write or delete operation."
> > >
> > > Finally, with regard to legacy implementations, my main intent was to
> > > avoid the situation where an operation that was previously legal
> > > becomes illegal.  In that regard, I also recall suggesting that the
> > > function be treated as a LUN RESET broadcast to all the logical units
> > > to which the initiator had access privileges.  That would also support
> > > the notion of preventing adverse effects on other users.
> > >
> > > At any rate, since there is a proposal to make LUN RESET
> > > mandatory (see
> > > below), I believed this was a reasonable and consistent alternative. I
> > > therefore felt (and continue to believe) there is no justification for
> > > making the function optional.
> > >
> > > Anyhow, these views did not prevail at the meeting in question. For
> > > that reason, the so-called NOP proposal was made as a last-ditch effort
> > > to accommodate legacy implementations by preserving a measure of
> > > backwards compatibility (albeit token compatibility). In that regard,
> > > it seemed the lesser evil.
> > >
> > > > > As for leaving things out of iSCSI - the default modus
> > > > > operandi should be to put in everything that's described
> > > > > in SAM2 unless we can convince T10 to take the feature
> > > > > out of SAM2.  Let's not go deciding to cast things out
> > > > > of SCSI on T10's behalf.
> > >
> > > So, I guess that anyone wishing to support a change in the spec is, of
> > > course, free to pursue it in that forum.
> > >
> > > Charles
> > >
> > > PS: The proposal referenced above can be found at:
> > > ftp://ftp.t10.org/t10/document.01/01-015r2.pdf
> > >
> > > > -----Original Message-----
> > > > From: Elliott, Robert [mailto:Robert.Elliott@COMPAQ.com]
> > > > Sent: Monday, April 23, 2001 5:20 PM
> > > > To: ips@ece.cmu.edu; 'Black_David@emc.com';
> > > > 'cmonia@nishansystems.com'
> > > > Subject: RE: iSCSI Target Reset
> > > >
> > > >
> > > > It's not a very good situation when each device chooses the
> > > > interpretation of TARGET RESET that it thinks is appropriate.
> > > > IBM's Shark might choose a different hack than Compaq's
> > > > RA8000.  How is software supposed to make sense of this?
> > > >
> > > > The rule in SAM-2 is that TARGET RESET resets every logical
> > > > unit (subject to access controls, if implemented).  The fact
> > > > is that in Fibre Channel, not many multi-LUN multi-port
> > > > targets followed that rule.  The result is that software
> > > > cannot tell what's going to happen and may have to handle
> > > > targets from each vendor differently.  This is not very
> > > > interoperable.
> > > >
> > > > Charles suggested in the T10 meeting that we allow it to
> > > > be implemented as a no-op rather than let protocols drop
> > > > support for it.  That doesn't help software like Windows that
> > > > does expect certain effects - a no-op implementation
> > > > would break clustering.  By removing it from the protocol,
> > > > software is forced to find a suitable replacement (e.g. use
> > > > LOGICAL UNIT RESET or switch to persistent reservations).
> > > > In Windows, this can be done at the port driver level
> > > > (STORPORT improving on SCSIPORT) or at the miniport level
> > > > (convert each target reset request into multiple LOGICAL UNIT
> > > > RESETs).
> > > >
> > > > Note that other protocols like NFS and HTTP over IP don't
> > > > seem to have "server resets."
> > > >
> > > > The recent SAM-2 change was expressly designed to encourage
> > > > iSCSI and SRP to drop support for TARGET RESET.  Please don't
> > > > keep it because you think T10 would be offended :-)
> > > >
> > > > ---
> > > > Rob Elliott, Compaq Server Storage
> > > > Robert.Elliott@compaq.com
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Black_David@emc.com [mailto:Black_David@emc.com]
> > > > > Sent: Monday, April 23, 2001 6:31 PM
> > > > > To: cmonia@NishanSystems.com; ips@ece.cmu.edu
> > > > > Subject: RE: iSCSI Target Reset
> > > > >
> > > > >
> > > > > I agree with Charles that this is an implementation
> > > > > issue.  If a Shark wants to reset all 32 adapters
> > > > > when it receives a Target Reset on one of them, that's
> > > > > a Shark implementation decision.  It's completely valid
> > > > > to reset only the adapter that the Target Reset is
> > > > > received on (common Fibre Channel behavior) or
> > > > > only the iSCSI target to which the Target Reset is
> > > > > addressed if there's more than one Target behind
> > > > > the adapter.
> > > > >
> > > > > As for leaving things out of iSCSI - the default modus
> > > > > operandi should be to put in everything that's described
> > > > > in SAM2 unless we can convince T10 to take the feature
> > > > > out of SAM2.  Let's not go deciding to cast things out
> > > > > of SCSI on T10's behalf.
> > > > >
> > > > > Thanks,
> > > > > --David
> > > > >
> > > > > > -----Original Message-----
> > > > > > From:    Charles Monia [SMTP:cmonia@nishansystems.com]
> > > > > > Sent:    Monday, April 23, 2001 7:12 PM
> > > > > > To: ips@ece.cmu.edu
> > > > > > Subject: RE: iSCSI Target Reset
> > > > > >
> > > > > > Hi:
> > > > > >
> > > > > > > These seem to be implementation decisions. I don't see how
> > > > > > > that justifies removing support from the protocol.
> > > > > >
> > > > > > Charles
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > > > > > > Sent: Monday, April 23, 2001 2:34 PM
> > > > > > > To: Santosh Rao
> > > > > > > Cc: ips@ece.cmu.edu
> > > > > > > Subject: Re: iSCSI Target Reset
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Absolutely not.  Why would we think that impacting 32
> > > > > > > > different other initiators is an OK thing to do?  By the way,
> > > > > > > > there are lots more Initiators possible with FC on Shark, and
> > > > > > > > I would hope that there would be even more with iSCSI.
> > > > > > > >
> > > > > > > > I have been told that these large Storage Controllers do not
> > > > > > > > support Target Reset today.  So I see no loss in not supporting
> > > > > > > > such an item in iSCSI, especially since many Initiators will be
> > > > > > > > beyond even the distances and mischief that is possible with FC.
> > > > > > >
> > > > > > > .
> > > > > > > .
> > > > > > > .
> > > > > > > John L. Hufferd
> > > > > > > Senior Technical Staff Member (STSM)
> > > > > > > IBM/SSG San Jose Ca
> > > > > > > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > > > > > > Internet address: hufferd@us.ibm.com
> > > > > > >
> > > > > > >
> > > > > > > Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001
> > > > > > > 01:24:02 PM
> > > > > > >
> > > > > > > Sent by:  owner-ips@ece.cmu.edu
> > > > > > >
> > > > > > >
> > > > > > > To:   ips@ece.cmu.edu
> > > > > > > cc:
> > > > > > > Subject:  Re: iSCSI Target Reset
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > "Dillard, David" wrote:
> > > > > > > >
> > > > > > > > > When will STORPORT be generally available?  The latest
> > > > > > > > > STORPORT document that I found on the MS web site is version
> > > > > > > > > 0.6a, dated March 18, 2001.  Given this it seems like STORPORT
> > > > > > > > > might not be available soon.  In that case do you know what
> > > > > > > > > happens with the current drivers?  Are we going to be telling
> > > > > > > > > customers that if they want to use iSCSI and NT clustering
> > > > > > > > > they have to update to Whistler?
> > > > > > >
> > > > > > >
> > > > > > > [One would hope that this list does not turn into a Microsoft
> > > > > > > release/product discussion mailing list (?) ]
> > > > > > >
> > > > > > > > Without going into specifics of a certain O.S., does it suffice
> > > > > > > > to require that iSCSI not break existing legacy SCSI
> > > > > > > > applications?
> > > > > > > >
> > > > > > > > If the above is a valid requirement, then, knowing that legacy
> > > > > > > > applications continue to use SCSI-2 Reserve/Release and the
> > > > > > > > target reset as a mechanism of breaking SCSI-2 reservations,
> > > > > > > > shouldn't iSCSI continue to support the target reset?
> > > > > > >
> > > > > > > - Santosh
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> > >
> > >
>


From owner-ips@ece.cmu.edu  Tue Apr 24 17:52:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA16150
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 17:52:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OIw8H09391
	for ips-outgoing; Tue, 24 Apr 2001 14:58:08 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OIvCA09356
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 14:57:12 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3OK50130823;
	Tue, 24 Apr 2001 13:05:00 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: <ips@ece.cmu.edu>, <Black_David@emc.com>
Cc: <mankin@east.isi.edu>, <egrodriguez@lucent.com>, <sob@harvard.edu>
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Tue, 24 Apr 2001 11:55:07 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJGEMFCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <0F31E5C394DAD311B60C00E029101A07080154A8@corpmx9.isus.emc.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

David,

If you look at the "desired" and required features of iSCSI as described in
the requirements document, there is a general structure that could provide
these features.  iSCSI still defines the various commands and the means of
communicating with very little changed from the current proposal.  This
change in structure, however, will lend itself to a generalized method of
providing hardware features to many other protocols.

Let me give you an example from the perspective of the SCTP structures as
related to iSCSI:
 - Map the SCSI Tag into the SCTP Data Chunk Stream.
 - Establish hardware vectors associated with Data Chunk Payload Protocol
Identifiers indexed by Stream.
 - Define categories of Payload Protocol IDs to isolate Command, Status,
Data, and iSCSI PDUs.
 - These categories then provide direct placement of stream content
sensitive to iSCSI.
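
The four steps above can be sketched roughly as follows. Everything named here is an assumption made for illustration: the Payload Protocol ID category values are invented and are not assigned numbers, the modulo mapping from task tag to stream is one possible choice, and the "hardware vectors" are modelled as plain Python callables.

```python
from enum import IntEnum


class PayloadCategory(IntEnum):
    # Hypothetical Payload Protocol ID categories; not assigned values.
    COMMAND = 1
    STATUS = 2
    DATA = 3
    ISCSI_PDU = 4


def stream_for_tag(task_tag, num_streams):
    # Assumed mapping: fold the SCSI task tag onto the available streams.
    return task_tag % num_streams


class PlacementTable:
    """Vectors indexed by (stream, payload category)."""

    def __init__(self):
        self.vectors = {}

    def register(self, stream, category, handler):
        self.vectors[(stream, category)] = handler

    def deliver(self, stream, category, payload):
        # Look up the vector for this chunk and place the payload directly.
        handler = self.vectors.get((stream, category))
        if handler is None:
            raise KeyError("no vector for stream %d, category %s"
                           % (stream, category.name))
        return handler(payload)
```

A receiver built this way would register, for example, a buffer's `extend` method as the DATA vector for a task's stream, so that data chunks land in that buffer without the receiver parsing an iSCSI header first.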

This generalized form of data placement maps nicely into iSCSI, iFCP, FCIP,
RDMA, RPC, NFS, ...

The advantage comes from removing error detection, error recovery, flow
control, multi-homing, session recovery, anti-spoofing, anti-DoS, and
independent delivery from each protocol; these become common among all of
these protocols as well as many more to come.  The movement to SCTP then is
painless.  As it is now, iSCSI cannot meet the SAM-2 requirement and needs
renovation due to this shortcoming.  SCTP structures again provide a
solution here as well.

iSCSI, iFCP, FCIP, and RDMA could stop looking for a record offload scheme
for TCP.  This record based offload becomes SCTP structure based to create a
common structure.  The SCTP common header could become a TCP option to allow
for negotiation.

I do not see why this subject should not be discussed on IPS as it relates
directly to those protocols that are presently advocating modified network
adapters.  The only thing I was requesting was that the iSCSI requirements
document add a requirement to seek this common structure for hardware
support.  You say it is not practical.  I differ with that opinion.

Doug

> > I'll stand by the stated intent of implementing this protocol in
> > hardware.  The same is also true for iFCP and FCIP.
>
> I never questioned the fact that there will be hardware implementations.
> It still appears to me that no change to the iSCSI requirements document
> is needed to deal with this set of issues.  I am strongly opposed to
> linking iSCSI specification development to FCIP and iFCP in a fashion
> that would require all three to be submitted to the IESG as a single set
> of documents.
>
> > Here IPS is developing a framing protocol that increases the level of
> > error detection.
>
> I'm sorry, but that's incorrect, because the IPS WG is not developing
> any framing protocol.  draft-williams-tcpulpframe-01.txt and
> draft-otis-tcp-framing-00.txt are both TSVWG drafts, not IPS drafts.
> TSVWG is the right place to work on these sorts of common framing
> mechanisms, and how that work is pursued is at the discretion and
> judgement of TSVWG and its chairs.
>
> > The IPS has made explicit reference to these intentions of
> > having this protocol supported directly in hardware.  I will be happy to
> > show how this protocol can be mapped into a common structure which would
> > avail more protocols to this hardware acceleration at the same time ease
> > the transition to SCTP.
>
> The first step of submitting draft-otis-tcp-framing-00.txt has been taken
> (thank you) and the IPS WG will observe and follow what is done in this
> area by TSVWG.
>
> Further discussion of framing belongs on the TSVWG list, not IPS.
>
> Thanks,
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
>



From owner-ips@ece.cmu.edu  Tue Apr 24 18:24:07 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA16759
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 18:24:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OKWsm14797
	for ips-outgoing; Tue, 24 Apr 2001 16:32:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OKVwA14751
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 16:31:59 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel2.hp.com (Postfix) with ESMTP
	id 2C6DC1359; Tue, 24 Apr 2001 13:31:58 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id NAA20991;
	Tue, 24 Apr 2001 13:31:51 -0700 (PDT)
Message-ID: <3AE5E284.86632AD@cup.hp.com>
Date: Tue, 24 Apr 2001 13:31:00 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Charles Monia <cmonia@NishanSystems.com>
Cc: IPS Reflector <ips@ece.cmu.edu>
Subject: Re: iSCSI : Aborting non-SCSI tasks.
References: <B300BD9620BCD411A366009027C21D9B173460@ariel.nishansystems.com>
Content-Type: multipart/mixed;
 boundary="------------E99B7CA2CE2CA126F39CEFB9"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------E99B7CA2CE2CA126F39CEFB9
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Hi Charles,

"Non-SCSI tasks" refers to the Login, Logout, Text & NOP-OUT PDUs.

Initiators will need to time these tasks [exchanges ?]. How do they
deal with a timeout of such non-SCSI tasks [exchanges/PDUs/commands] ?


Regards,
Santosh


Charles Monia wrote:
> 
> Hi Santosh:
> 
> > The iSCSI spec is missing description on how non-SCSI tasks should be
> > aborted
> 
> What is a non-SCSI task?
> 
> Charles
> > -----Original Message-----
> > From: Santosh Rao [mailto:santoshr@cup.hp.com]
> > Sent: Tuesday, April 24, 2001 10:27 AM
> > To: IPS Reflector
> > Subject: iSCSI : Aborting non-SCSI tasks.
> >
> >
> > All,
> >
> > The iSCSI spec is missing description on how non-SCSI tasks should be
> > aborted in order to flush stale PDUs of that task. Initiators will
> > typically time non-SCSI [& SCSI] tasks and will need to resort to some
> > form of abort and cleanup action on a timeout of the non-scsi task.
> >
> > This is required in order to safely re-use the task tag resources
> > without the danger of stale PDUs arriving from a previous
> > incarnation of
> > that task tag.
> >
> > The spec should provide some description on how this is to be done.
> > Perhaps, the semantics of Abort Task can be extended to non-SCSI tasks
> > as well, to avoid defining a second abort mechanism for
> > non-SCSI tasks.
> >
> > - Santosh
> >
--------------E99B7CA2CE2CA126F39CEFB9
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------E99B7CA2CE2CA126F39CEFB9--



From owner-ips@ece.cmu.edu  Tue Apr 24 18:26:08 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA16831
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 18:26:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OKJpZ14037
	for ips-outgoing; Tue, 24 Apr 2001 16:19:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OKJJA14018
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 16:19:20 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRRTD>; Tue, 24 Apr 2001 13:19:12 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173460@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "'Santosh Rao'" <santoshr@cup.hp.com>
Cc: IPS Reflector <ips@ece.cmu.edu>
Subject: RE: iSCSI : Aborting non-SCSI tasks.
Date: Tue, 24 Apr 2001 13:19:11 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi Santosh:

> The iSCSI spec is missing description on how non-SCSI tasks should be
> aborted 

What is a non-SCSI task?

Charles
> -----Original Message-----
> From: Santosh Rao [mailto:santoshr@cup.hp.com]
> Sent: Tuesday, April 24, 2001 10:27 AM
> To: IPS Reflector
> Subject: iSCSI : Aborting non-SCSI tasks.
> 
> 
> All,
> 
> The iSCSI spec is missing description on how non-SCSI tasks should be
> aborted in order to flush stale PDUs of that task. Initiators will
> typically time non-SCSI [& SCSI] tasks and will need to resort to some
> form of abort and cleanup action on a timeout of the non-scsi task.
> 
> This is required in order to safely re-use the task tag resources
> without the danger of stale PDUs arriving from a previous 
> incarnation of
> that task tag.
> 
> The spec should provide some description on how this is to be done.
> Perhaps, the semantics of Abort Task can be extended to non-SCSI tasks
> as well, to avoid defining a second abort mechanism for 
> non-SCSI tasks.
> 
> - Santosh
> 


From owner-ips@ece.cmu.edu  Tue Apr 24 21:06:51 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA19200
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 21:06:51 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OLXt618038
	for ips-outgoing; Tue, 24 Apr 2001 17:33:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OLXQA18017
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 17:33:26 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRRX3>; Tue, 24 Apr 2001 14:33:10 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173462@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "'Santosh Rao'" <santoshr@cup.hp.com>
Cc: IPS Reflector <ips@ece.cmu.edu>
Subject: RE: iSCSI : Aborting non-SCSI tasks.
Date: Tue, 24 Apr 2001 14:33:09 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi Santosh:

My .02 (again):

Assuming these transactions are handled in the iSCSI layer:

a) No abort mechanism is required.  Like a 'ping' operation, these functions
are expected to complete in some fixed time or not at all.

b) A timeout indicates the iSCSI layer is broken -- i.e., timeouts are
fatal.

Charles

> -----Original Message-----
> From: Santosh Rao [mailto:santoshr@cup.hp.com]
> Sent: Tuesday, April 24, 2001 1:31 PM
> To: Charles Monia
> Cc: IPS Reflector
> Subject: Re: iSCSI : Aborting non-SCSI tasks.
> 
> 
> Hi Charles,
> 
> non-scsi tasks refers to Login, Logout, Text & NOP-OUT PDUs.
> 
> Initiators will need to time these tasks [exchanges ?] and how do they
> deal with a timeout of such non-scsi tasks [exchanges/PDUs/commands] ?
> 
> 
> Regards,
> Santosh
> 
> 
> Charles Monia wrote:
> > 
> > Hi Santosh:
> > 
> > > The iSCSI spec is missing description on how non-SCSI tasks should be
> > > aborted
> > 
> > What is a non-SCSI task?
> > 
> > Charles
> > > -----Original Message-----
> > > From: Santosh Rao [mailto:santoshr@cup.hp.com]
> > > Sent: Tuesday, April 24, 2001 10:27 AM
> > > To: IPS Reflector
> > > Subject: iSCSI : Aborting non-SCSI tasks.
> > >
> > >
> > > All,
> > >
> > > The iSCSI spec is missing description on how non-SCSI tasks should be
> > > aborted in order to flush stale PDUs of that task. Initiators will
> > > typically time non-SCSI [& SCSI] tasks and will need to resort to some
> > > form of abort and cleanup action on a timeout of the non-scsi task.
> > >
> > > This is required in order to safely re-use the task tag resources
> > > without the danger of stale PDUs arriving from a previous
> > > incarnation of
> > > that task tag.
> > >
> > > The spec should provide some description on how this is to be done.
> > > Perhaps, the semantics of Abort Task can be extended to non-SCSI tasks
> > > as well, to avoid defining a second abort mechanism for
> > > non-SCSI tasks.
> > >
> > > - Santosh
> > >
> 


From owner-ips@ece.cmu.edu  Tue Apr 24 21:07:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA19221
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 21:07:28 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OM2xl19498
	for ips-outgoing; Tue, 24 Apr 2001 18:02:59 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OM2XA19485
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 18:02:33 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3ON7l130974;
	Tue, 24 Apr 2001 16:07:51 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Santosh Rao" <santoshr@cup.hp.com>,
        "Charles Monia" <cmonia@NishanSystems.com>
Cc: "IPS Reflector" <ips@ece.cmu.edu>
Subject: RE: iSCSI : Aborting non-SCSI tasks.
Date: Tue, 24 Apr 2001 14:57:54 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJMEMLCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <3AE5E284.86632AD@cup.hp.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Santosh,

Should the Service Delivery Subsystem (iSCSI) be responding to Task
Management requests?  Unless you wish to burden the transport with the
prospect of sorting and tracking LUN:TAG values, the transport should not
attempt to interpret SCSI commands.  There should be a simple scheme for
handling these situations without having the transport take on the
characteristics of a SCSI target.  Philosophically, this does not isolate
the functions in a layered manner.  It gets complex very quickly otherwise.

Doug



> Hi Charles,
>
> non-scsi tasks refers to Login, Logout, Text & NOP-OUT PDUs.
>
> Initiators will need to time these tasks [exchanges ?] and how do they
> deal with a timeout of such non-scsi tasks [exchanges/PDUs/commands] ?
>
>
> Regards,
> Santosh
>
>
> Charles Monia wrote:
> >
> > Hi Santosh:
> >
> > > The iSCSI spec is missing description on how non-SCSI tasks should be
> > > aborted
> >
> > What is a non-SCSI task?
> >
> > Charles
> > > -----Original Message-----
> > > From: Santosh Rao [mailto:santoshr@cup.hp.com]
> > > Sent: Tuesday, April 24, 2001 10:27 AM
> > > To: IPS Reflector
> > > Subject: iSCSI : Aborting non-SCSI tasks.
> > >
> > >
> > > All,
> > >
> > > The iSCSI spec is missing description on how non-SCSI tasks should be
> > > aborted in order to flush stale PDUs of that task. Initiators will
> > > typically time non-SCSI [& SCSI] tasks and will need to resort to some
> > > form of abort and cleanup action on a timeout of the non-scsi task.
> > >
> > > This is required in order to safely re-use the task tag resources
> > > without the danger of stale PDUs arriving from a previous
> > > incarnation of
> > > that task tag.
> > >
> > > The spec should provide some description on how this is to be done.
> > > Perhaps, the semantics of Abort Task can be extended to non-SCSI tasks
> > > as well, to avoid defining a second abort mechanism for
> > > non-SCSI tasks.
> > >
> > > - Santosh
> > >



From owner-ips@ece.cmu.edu  Tue Apr 24 21:07:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA19232
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 21:07:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OMjxh21605
	for ips-outgoing; Tue, 24 Apr 2001 18:45:59 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OMj3A21526
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 18:45:03 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id SAA79268;
	Tue, 24 Apr 2001 18:37:32 -0400
Received: from f3n42e (d03nm042h.boulder.ibm.com [9.99.140.42])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96) with ESMTP id QAA39274;
	Tue, 24 Apr 2001 16:44:56 -0600
Importance: Normal
Subject: Re: iSCSI: handling of persistent reserves during initiator reboot
To: Sandeep Joshi <sandeepj@research.bell-labs.com>
Cc: ips@ece.cmu.edu
From: "Jim Hafner" <hafner@almaden.ibm.com>
Date: Tue, 24 Apr 2001 15:44:54 -0700
Message-ID: <OF2A159F66.1E593B6A-ON88256A38.007A485A@LocalDomain>
X-MIMETrack: Serialize by Router on D03NM042/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/24/2001 03:44:55 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Sandeep,

Suppose an initiator has a persistent reservation for a logical unit on
some target device.  This initiator has two choices:
a) it remembers, as you suggest, some initiator state (the ISID it used
for the particular target device and the reservation key that it was
using, the latter in any case) OR
b) it doesn't recover the reservation "for free" but must use a heavier
hammer (like preempt and abort) to recover the reservation.
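
A minimal sketch of the two choices above, in Python. The SavedState
record and the action names are hypothetical labels for illustration, not
anything defined by the draft:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SavedState:
    """Hypothetical per-target state an initiator holds after a restart."""
    reservation_key: int        # needed under either choice
    isid: Optional[int] = None  # only preserved under choice (a)


def plan_recovery(state: SavedState) -> str:
    """Pick choice (a) or (b) for one target device."""
    if state.isid is not None:
        # (a) Log in again with the old ISID so the nexus can be rebuilt.
        return "relogin-with-saved-isid"
    # (b) The ISID was lost: fall back to the heavier hammer.
    return "preempt-and-abort"
```

The only difference between the two paths is whether the ISID survived;
the reservation key is assumed to be available in both.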

As in another part of my note, a host rebooting is only one scenario where
this reservation auto-recovery should occur (as seen from the SCSI layers).
For all the other cases, the host doesn't lose any state (it never went
down) so things are OK.

For a host rebooting, a lot has to do with why it shut down in the first
place and whether it expects to get back its reservations with minimal
effort.   One can argue that in the case of reboot, the host will start
from scratch for everything and use the bigger hammer.  Or one can argue
that if it really cared, it can preserve this state information.  I don't
think this is a major requirement either way.

As for iSCSI-FC bridges, I've not thought about the requirements.  A great
deal depends on the model that the bridge implements:
a) how much does the bridge attempt to be a "front" for an FC device and
tunnel initiator state through it
b) how much does the bridge act as an independent iSCSI target and just
agglomerate all (some of) the logical units on the FC network behind it.

For case (b), the bridge is an iSCSI target device and therefore needs to
support all target requirements.
When I look at case (a), I seem to always come to the conclusion that such
bridges cannot be very stateless and that this extra burden is relatively
minimal.  For example, the bridge will need to do something non-trivial to
restore initiator state through itself when (a) changes occur on the FC
side, (b) the target resets completely, (c) the IP side resets (sessions
crash and burn for IP or link layer errors).  Adding an ISID to the "saved"
state along with the iSCSI Initiator Name doesn't sound like a whole lot
of extra stuff.

Jim Hafner


Sandeep Joshi <sandeepj@research.bell-labs.com>@research.bell-labs.com on
04-24-2001 09:51:15 AM

Sent by:  sandeepj@research.bell-labs.com


To:   Jim Hafner/Almaden/IBM@IBMUS
cc:   ips@ece.cmu.edu
Subject:  Re: iSCSI: handling of persistent reserves during initiator
      reboot



Jim Hafner wrote:
>
> Tom,
>
> Here's my two cents. Apologies for being long-winded.  In short, I think I
> agree with most of your sentiment.
>
> 1) In order to accommodate Persistent Reservations (and a few other SCSI
> things), I think it is important to have some mechanism for an initiator to
> reestablish some nexus state information through logout/login.  The current
> thinking in N&DT is that the initiator should reuse its old ISID and the
> target should respond with the old TSID (thereby rebuilding the nexus).
> Note, the initiator starts with TSID=0, as if it was a new session.  The
> target that remembers the old ISID/TSID pair (can reuse the old TSID).  Keep
> in mind that that target is already remembering some state information (the
> reservation) about the ISID/TSID (and Names) relationship in the nexus, so
> this isn't a big deal.

If I am not mistaken, isn't this also adding to initiator state ?
Initiators now have to remember the ISIDs used with every target
(or have only one!)

And doesn't this also add state to iSCSI-FC bridges...?

Regards,
-Sandeep





From owner-ips@ece.cmu.edu  Tue Apr 24 21:09:00 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA19252
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 21:09:00 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3OMNuu20616
	for ips-outgoing; Tue, 24 Apr 2001 18:23:56 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3OMNNA20599
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 18:23:24 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRR52>; Tue, 24 Apr 2001 15:23:17 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173463@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "Ips (E-mail)" <ips@ece.cmu.edu>
Cc: Charles Monia <cmonia@NishanSystems.com>
Subject: iSCSI: Immediate Delivery Behavior
Date: Tue, 24 Apr 2001 15:23:17 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

The behavior for immediate commands seems ambiguous and possibly needlessly
complex.

Rev 06 says the following regarding ordered delivery to the SCSI layer:

   "Except for the commands marked for immediate delivery the iSCSI 
   target layer MUST deliver the commands to the SCSI target layer in 
   the order specified by CmdSN. Commands marked for immediate delivery 
   may be handed over by the iSCSI target layer to the SCSI target layer 
   as soon as detected. iSCSI may avoid delivering some command to the 
   SCSI layer if so required by some prior SCSI or iSCSI action (e.g., 
   clear task set Task Management request received before all the 
   commands it was supposed to act on)."

In a non-striped session consisting of one TCP/IP connection, the above
could be interpreted to allow the delivery of an immediate command before
other partly received commands that were previously issued. As a result, an
operation, such as an abort task, might bypass the command to be aborted --
even if both were sent on the same connection.

Assuming that's true, I believe a useful simplification is to require that
all traffic flowing over a given TCP/IP connection be delivered to the SCSI
layer in the order received over that connection.  In a striped session, an
immediate command might therefore leapfrog commands on other connections but
would never bypass commands on the same connection.  In my opinion, that
simplifies the problem of properly purging commands and stale PDUs in the
wake of a task management operation. 
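
To make the proposed rule concrete, here is a small Python sketch under my
own assumptions: each connection is modelled as a FIFO of received PDUs,
only the head of a FIFO may be handed to the SCSI layer, and the names
Pdu and deliver are made up for illustration. It is not the behavior
Rev 06 specifies.

```python
from collections import deque
from typing import Deque, Dict, List, NamedTuple


class Pdu(NamedTuple):
    name: str
    cmd_sn: int              # ordering number; ignored for immediate commands
    immediate: bool = False


def deliver(connections: Dict[str, Deque[Pdu]], next_cmd_sn: int) -> List[str]:
    """Hand PDUs to the SCSI layer without reordering within a connection."""
    delivered: List[str] = []
    progress = True
    while progress:
        progress = False
        for queue in connections.values():
            # Only the head of each per-connection FIFO is eligible, so an
            # immediate command cannot bypass PDUs it arrived behind.
            while queue and (queue[0].immediate or queue[0].cmd_sn == next_cmd_sn):
                pdu = queue.popleft()
                if not pdu.immediate:
                    next_cmd_sn += 1
                delivered.append(pdu.name)
                progress = True
    return delivered
```

With a write (CmdSN 2) followed by an immediate abort on one connection
and a read (CmdSN 1) on another, the abort is delivered after the write.
If the abort is alone on its connection, it can still leapfrog commands
waiting on other connections.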

Charles
Charles Monia
Senior Technology Consultant
Nishan Systems
email: cmonia@nishansystems.com
voice: (408) 519-3986
fax:   (408) 435-8385


From owner-ips@ece.cmu.edu  Tue Apr 24 22:45:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA22270
	for <ips-archive@odin.ietf.org>; Tue, 24 Apr 2001 22:45:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3ONM6123254
	for ips-outgoing; Tue, 24 Apr 2001 19:22:06 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3ONKwA23194
	for <ips@ece.cmu.edu>; Tue, 24 Apr 2001 19:20:58 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id B818215A0; Tue, 24 Apr 2001 16:20:56 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id QAA15316;
	Tue, 24 Apr 2001 16:20:51 -0700 (PDT)
Message-ID: <3AE60A21.5E0F5872@cup.hp.com>
Date: Tue, 24 Apr 2001 16:20:01 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Charles Monia <cmonia@NishanSystems.com>
Cc: IPS Reflector <ips@ece.cmu.edu>
Subject: Re: iSCSI : Aborting non-SCSI tasks.
References: <B300BD9620BCD411A366009027C21D9B173462@ariel.nishansystems.com>
Content-Type: multipart/mixed;
 boundary="------------1C9F982636E39BE1347B6C84"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------1C9F982636E39BE1347B6C84
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Hi Charles,

Yes, this would be sufficient and simple enough for such an event. The
point here being that the iSCSI spec needs to explicitly address this
issue of deterministic termination of non-SCSI command PDUs like text,
nop-out, login, logout, etc.

The solution could be based on a timer with gross recovery on its
expiry. However, such a timer needs to be specified by the protocol and
tunable through an iSCSI login key (timers equivalent to RR_TOV, RA_TOV
in FC).
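
A rough Python sketch of such a timer, assuming a single timeout value
that would come from a login key. The key, the NonScsiTimer class and the
"drop-session" action are all hypothetical names, not part of the spec:

```python
from typing import Dict, List


class NonScsiTimer:
    """Tracks outstanding Login/Logout/Text/NOP-Out exchanges by task tag."""

    def __init__(self, timeout_s: float) -> None:
        # timeout_s stands in for a value negotiated at login (assumed key).
        self.timeout_s = timeout_s
        self.deadlines: Dict[int, float] = {}

    def sent(self, task_tag: int, now: float) -> None:
        self.deadlines[task_tag] = now + self.timeout_s

    def completed(self, task_tag: int) -> None:
        self.deadlines.pop(task_tag, None)

    def expired(self, now: float) -> List[int]:
        """Return the task tags whose timer has run out."""
        return sorted(tag for tag, due in self.deadlines.items() if now >= due)


def on_tick(timer: NonScsiTimer, now: float) -> str:
    if timer.expired(now):
        # A timeout is treated as fatal: drop the session rather than abort
        # the single exchange, so no stale PDU can match a reused task tag.
        timer.deadlines.clear()
        return "drop-session"
    return "ok"
```

Passing the clock in as an argument keeps the sketch deterministic; a real
initiator would read a monotonic clock instead.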

Regards,
Santosh


Charles Monia wrote:
> 
> Hi Santosh:
> 
> My .02 (again):
> 
> Assuming these transactions are handled in the iSCSI layer:
> 
> a) No abort mechanism is required.  Like a 'ping' operation, these functions
> are expected to complete in some fixed time or not at all.
> 
> b) A timeout indicates the iSCSI layer is broken -- i.e., timeouts are
> fatal.
--------------1C9F982636E39BE1347B6C84
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------1C9F982636E39BE1347B6C84--



From owner-ips@ece.cmu.edu  Wed Apr 25 13:37:09 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA23814
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 13:37:08 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PEXdc17967
	for ips-outgoing; Wed, 25 Apr 2001 10:33:39 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic2.us.dg.com (mxic2.us.dg.com [128.221.31.40])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PEWxA17917
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 10:33:00 -0400 (EDT)
Received: by mxic2.us.dg.com with Internet Mail Service (5.5.2650.21)
	id <J4194JMC>; Wed, 25 Apr 2001 10:17:21 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080154AD@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: santoshr@cup.hp.com, ips@ece.cmu.edu
Subject: RE: iSCSI : digest error handling violates EMDP/InDataOrder
Date: Tue, 24 Apr 2001 15:00:45 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Santosh's original issue was that R2Ts to request
retransmission of data (e.g., due to data CRC failure)
result in an initiator seeing what appear to be out
of order R2Ts due to the need to go back and get
the failed data retransmitted.

Santosh's original email said (in part):

> Section 6.2 (pg 80). Digest Errors
> -----------------------------------
> "If the error is a Data-Digest-Error in a Data-PDU, the target MUST
> either request retransmission with a R2T or answer with a Reject iSCSI
> PDU and abort the task."

> Problem :
> ---------
> On a Data digest error detected by a target, it MUST NOT request
> re-transmission of the data PDU thru an R2T if the session login key
> InDataOrder is set to yes. 

This key has been renamed to DataOrder in -06, and if
it's set to "yes" as currently defined, then (IMHO)
Santosh appears to be correct.  The Initiator is not
going to be expecting the R2T offset to step back to
pick up the missing data, and hence the Target MUST
Reject and Abort.
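
As a sketch of that reading (Python; the function names and action labels
are mine, not the draft's), the target's options on a Data-Digest-Error
depend on DataOrder, and a retransmission R2T shows up as an offset that
steps backwards:

```python
from typing import List, Sequence


def allowed_actions(data_order: bool) -> List[str]:
    """Actions open to a target after a Data-Digest-Error in a Data-PDU."""
    if data_order:
        # In-order data was negotiated, so an R2T that goes back to an
        # earlier offset is ruled out; only Reject and abort remains.
        return ["reject-and-abort"]
    return ["r2t-retransmit", "reject-and-abort"]


def r2ts_in_order(offsets: Sequence[int]) -> bool:
    """True if each R2T buffer offset is strictly greater than the last."""
    return all(a < b for a, b in zip(offsets, offsets[1:]))
```

For example, the offsets 0, 8192, 16384, 8192 fail the in-order check
because the last R2T returns to pick up the data that failed its digest.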

Beyond this, I take Bob Snively's mail as a suggestion
that we ought to split iSCSI DataOrder from SCSI's
EMDP, as FCP considers those to be separate concepts.
That seems like a reasonable approach.

Comments?

--David

> -----Original Message-----
> From:	Santosh Rao [SMTP:santoshr@cup.hp.com]
> Sent:	Tuesday, April 24, 2001 2:30 PM
> To:	ips@ece.cmu.edu
> Cc:	David Black
> Subject:	Re: iSCSI : digest error handling violates EMDP/InDataOrder
> 
> 
> What is the final resolution on this EMDP issue? IMHO, iSCSI must retain
> the EMDP semantics as defined in FCP, SRP. i.e. It controls the order of
> the data across the entire SCSI command. (which includes sending R2T
> requests in order, if EMDP was set to 1).
> 
> Some additional thoughts on this topic are :
> 
> 1) Is it worth a finer granularity of control wherein the initiator be
> allowed to negotiate with the target that R2T requests be sent in-order
> , while not imposing any constraints on the Read Data PDU order.
> 
> 2) Should control be provided over a "Random Relative Offset" feature,
> as Bob describes it below, or is it to be assumed that iSCSI Data PDUs
> will always be in-order within a sequence ?
> 
> 3) Speaking of sequence, this terminology has been often used in this
> thread. Where is the notion of a sequence defined in iSCSI ? What is the
> definition of an iSCSI sequence?
> 
> - Santosh
> 
> Robert Snively wrote:
> > 
> > Seems to me that there are some unclarities in this area as well.
> > 
> > There are really two pieces being discussed as one:
> > 
> >         EMDP (a SCSI functionality)
> > 
> >         Random relative offset (a transport functionality)
> > 
> > EMDP is used to allow a target to request or deliver its data
> > out of order.  This is used for things like passing a stripe
> > segment from a RAID data extent as soon as it has been accumulated,
> > rather than waiting until all previous parts of the RAID data
> > extent have also been accumulated and delivered.  It is also used
> > for things like "start anywhere" reading of a disk track.
> > 
> > It says nothing about the ordering of data within a PDU or sequence
> > which must be ordered according to the rules of the protocol.  Fibre
> > Channel allows the data within a sequence to be transmitted in order
> > or out of order by using the login parameter "random relative offset".
> > Almost all devices choose to login and require "continuously increasing
> > relative offset".


From owner-ips@ece.cmu.edu  Wed Apr 25 18:22:53 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA02822
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 18:21:50 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PKQnD10055
	for ips-outgoing; Wed, 25 Apr 2001 16:26:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PKPtA09978
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 16:25:55 -0400 (EDT)
Received: from amrelay1.boi.hp.com (amrelay1.boi.hp.com [15.56.8.24])
	by palrel1.hp.com (Postfix) with ESMTP
	id 135C51DB0; Wed, 25 Apr 2001 13:25:54 -0700 (PDT)
Received: from xatlbh1.atl.hp.com (xatlbh1.atl.hp.com [15.45.89.186])
	by amrelay1.boi.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id OAA23382;
	Wed, 25 Apr 2001 14:25:51 -0600 (MDT)
Received: by xatlbh1.atl.hp.com with Internet Mail Service (5.5.2653.19)
	id <JQX9ZJGK>; Wed, 25 Apr 2001 16:25:50 -0400
Message-ID: <499DC368E25AD411B3F100902740AD65BC5AC0@xrose03.rose.hp.com>
From: "WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com>
To: "'julian_satran@il.ibm.com'" <julian_satran@il.ibm.com>,
        Chip Sharp <chsharp@cisco.com>
Cc: vince_cavanna@agilent.com, steph@cs.uchicago.edu, ips@ece.cmu.edu,
        tsvwg@ietf.org, craig@aland.bbn.com, Jonathan.Wood@sun.com,
        xieqb@cig.mot.com, jonathan@dsg.stanford.edu, rrs@cisco.com
Subject: RE: [Tsvwg] [SCTP checksum problems]
Date: Wed, 25 Apr 2001 14:16:44 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

I think this "SCTP checksum" thread spanning IPS and TSVWG was for
discussion around whether or not iSCSI (running over SCTP) could forgo data
integrity checking and transport-like functionality (retransmission, ack,
etc) should SCTP provide a sufficiently strong check-code.
If iSCSI were willing to completely trust SCTP end-to-end across a network
fabric (including "middleboxes"), then that provides one reason for SCTP to
adopt a stronger checksum or CRC.
If iSCSI will still implement its own data integrity check-code above SCTP,
then SCTP needs to make an independent decision on whether its current
check-code is sufficiently strong for its target uses.
Currently, iSCSI contains a data integrity check "digest" that can be
negotiated end-to-end to be disabled on a per-connection basis.

This discussion raises a few questions:
- Are there clearly different classes of applications (with regard to their
end-to-end data integrity strength needs)?
- How are these application classes' end-to-end data integrity needs met in
the future?  Is it SCTP, IPSec, application-specific protocol, a new
protocol?
- Is there a general need for strong end-to-end data integrity that could be
provided for in a recommended generic manner?
- Is iSCSI unique in being an "ultra-low error rate application" and should
iSCSI then handle its own data integrity?
- Should SCTP strengthen its checksum to meet the needs of a general class
of data-critical applications, and/or provide a means for negotiating an
optional stronger checksum?
- What is the role of network infrastructure (router/middlebox hardware and
software) in strengthening end-to-end data integrity?

Data integrity for iSCSI over TCP is a separate issue. It is unlikely that
we will be able to evolve TCP in a timely manner to utilize a stronger
check-code given TCP's current wide scale deployment (although adding a
stronger checksum/CRC to TCP would seem to be the best solution). So,
something else has to be done either above or below TCP to provide the
required level of iSCSI data integrity. Of course, if TCP's data integrity
deficiency is impacting other data-critical applications, then it seems
prudent to at least consider solving the problem generically.
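
One concrete illustration of the gap, as a Python sketch: the 16-bit
one's-complement sum used by TCP does not change when 16-bit-aligned
chunks of the payload are reordered, while a CRC does. Here zlib.crc32 is
only a stand-in for "a stronger check-code", and the payload and the
swap are invented for the example.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style used by TCP."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold the carries back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"blockA--blockB--"
# Swap the two 8-byte halves, as a faulty box in the path might.
swapped = original[8:] + original[:8]

# The sum is order-independent, so the corruption goes unnoticed.
same_checksum = internet_checksum(original) == internet_checksum(swapped)
# The CRC depends on position, so the two payloads differ.
same_crc = zlib.crc32(original) == zlib.crc32(swapped)
```

Here same_checksum comes out True and same_crc comes out False. This shows
only one class of error; it says nothing about how often such errors occur
in practice.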

Jim


> -----Original Message-----
> From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> Sent: Friday, April 20, 2001 1:02 AM
> To: Chip Sharp
> Cc: vince_cavanna@agilent.com; steph@cs.uchicago.edu; WENDT,JIM
> (HP-Roseville,ex1); ips@ece.cmu.edu; tsvwg@ietf.org;
> craig@aland.bbn.com; Jonathan.Wood@sun.com; xieqb@cig.mot.com;
> jonathan@dsg.stanford.edu; rrs@cisco.com
> Subject: RE: [Tsvwg] [SCTP checksum problems]
> 
> 
> 
> 
> Chip,
> 
> CRCs are not meant to protect against malicious middle boxes - rather on
> boxes that strip the strong link CRCs and
> let the end-system rely on the weak TCP checksum.
> 
> NAT boxes have good reason to recompute TCP checksums, but unless they
> are malicious no reason to recompute iSCSI CRCs.
> 
> And against malicious boxes iSCSI has cryptographic digests as options.
> 
> And I was not aware that we are discussing - in this forum - iSCSI data
> integrity options.
> 
> Julo
> 
> Chip Sharp <chsharp@cisco.com> on 19/04/2001 18:53:53
> 
> Please respond to Chip Sharp <chsharp@cisco.com>
> 
> To:   vince_cavanna@agilent.com
> cc:   steph@cs.uchicago.edu, vince_cavanna@agilent.com, 
> jim_wendt@hp.com,
>       Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu, tsvwg@ietf.org,
>       craig@aland.bbn.com, Jonathan.Wood@sun.com, xieqb@cig.mot.com,
>       jonathan@dsg.stanford.edu, rrs@cisco.com
> Subject:  RE: [Tsvwg] [SCTP checksum problems]
> 
> 
> 
> 
> As was pointed out previously, middle box operations (such as NATs) tend
> to creep up the protocol stack and into applications.
> 
> Take SIP for example.  It includes IP addresses in its INVITE.  In order
> to work across a NAT, the IP addresses it exchanges have to be replaced
> with the NATed address.  One way is for the NAT to reach up into the SIP
> INVITE and change the address.  This modifies the TCP or UDP checksum.
> Now SIP could have included its own integrity check to protect against
> corrupted or modified TCP checksums, but all that would have happened is
> that NATs would have changed the SIP checksum in addition to the TCP/UDP
> checksum.
> 
> Therefore, even if iSCSI included its own integrity check, if a middle
> box is going to futz with iSCSI packets it will just strip the check, do
> whatever it does and then recalculate the check.
> 
> If this is what you want to protect against you will have to go to some
> type of digital signature.
> 
> At 12:22 PM 4/19/2001, vince_cavanna@agilent.com wrote:
> >Stephen,
> >
> >I have to admit that I do not have much direct experience with middle
> >boxes, BUT I did have fairly direct and recent experience with a popular
> >NAT router from a popular vendor that was corrupting data in a network of
> >Macintoshes.
> >
> >Apple's TCP was unaware of any problem as was Apple's Filing Protocol and
> >most applications. The only applications that detected the corruption
> >were those that performed an integrity check of their own. Those
> >applications that assumed a reliable transport (and file system) were
> >doomed to experiencing the indirect effects of the corruption at some
> >later time. The corruption only happened when large amounts of data were
> >transferred quickly.  The router vendor fixed the problem once; then
> >fixed it again; then fixed it one last time before the data corruption
> >finally "disappeared". After several weeks of continuous operation the
> >router appeared to get into a mode where it was once again corrupting
> >data. Power cycling the router "fixed it". The story apparently has not
> >yet ended.
> >
> >I admit I may have given too much significance to this single incident
> >that I have personally experienced but on the other hand I don't see the
> >mechanisms in place to prevent this type of problem in the future other
> >than the end to end integrity checks.
> >
> >Incidentally this incident changed my behavior when transferring data
> >over a network. I will always use a compression utility; not only for
> >reducing the data to be transmitted but to ensure the integrity of my
> >data is protected end to end by the utility's CRC mechanism.
> >
> >I believe quite firmly that we DO need a mechanism to allow us to
> >tolerate poor implementations of middle boxes and cannot simply hope that
> >eventually such poor implementations will vanish, nor that we will have
> >the luxury of being able to select only good implementations for every
> >component of our storage network.
> >
> >Vince
> >
> >|-----Original Message-----
> >|From: Stephen Bailey [mailto:steph@cs.uchicago.edu]
> >|Sent: Wednesday, April 18, 2001 3:09 PM
> >|To: CAVANNA,VICENTE V (A-Roseville,ex1)
> >|Cc: 'WENDT,JIM (HP-Roseville,ex1)'; 'julian_satran@il.ibm.com';
> >|ips@ece.cmu.edu; tsvwg@ietf.org; 'Craig Partridge'; Jonathan Wood;
> >|xieqb@cig.mot.com; Jonathan Stone; Randall Stewart
> >|Subject: Re: [Tsvwg] [SCTP checksum problems]
> >|
> >|
> >|Vince,
> >|
> >|> I don't think iSCSI can be completely relieved of performing some data
> >|> integrity checking as long as there exists the possibility of "middle
> >|> boxes" opening up the transport protocol's packet and thus potentially
> >|> invalidating any reliability guarantees the transport protocol makes.
> >|
> >|Any protection provided against this failure mode will only be
> >|transient, so we must temper the desire to introduce such a
> >|requirement with reality.
> >|
> >|Middleboxes can just as easily open up to the iSCSI layer and tinker
> >|with the payload, as they do with other ULPs running on TCP (e.g. HTTP)
> >|today.  Short of securing the connection, there is ALWAYS a
> >|possibility of a middlebox terminating and reoriginating an integrity
> >|check.  In case you think this is a farfetched scenario, I do get the
> >|impression that there is a high level of interest in `actively
> >|middling' iSCSI once the specs crystallize.  Who shaves the barber?
> >|
> >|An integrity check is not necessary as long as some lower layer
> >|provides adequate integrity guarantees.
> >|
> >|Adding an integrity check above the transport layer is based upon
> >|documentation of the presence of a lot of crappy network hardware and
> >|software and analyses of the transport integrity check (TCP checksum)
> >|which suggest it might not be adequately strong against some such
> >|observed errors.
> >|
> >|I claim that the high incidence of `broken' (corruption introducing)
> >|components is a result of a variety of factors which have shaped the
> >|development of network components thus far.  The fact that integrity
> >|checks are assumed to be performed in a network context substantially
> >|lowers the bar for implementation correctness.
> >|
> >|In a storage (or CPU) context, these types of implementation errors
> >|are a) more easily detectable (more fatal) b) more carefully avoided
> >|during implementation (because of the cost of a potential fatal
> >|error).  If network components magically reached the same `quality
> >|level' as storage and CPU components, there might be no justification
> >|for additional integrity checks above the transport.  Similarly if the
> >|transport (or whatever lower layer) integrity checks are very strong
> >|(e.g. IPSec), there is, again, no need for a higher level integrity
> >|check.
> >|
> >|I am not disagreeing that we need an additional integrity check over
> >|TCP in the present target environment, but I do disagree that iSCSI
> >|will always need such a check, independently of what is running
> >|beneath it.
> >|
> >|Steph
> >|
> 
> 
> -------------------------------------------------------------------
> Chip Sharp                       Consulting Engineering
> Cisco Systems
> -------------------------------------------------------------------
> 
> 
> 
> 


From owner-ips@ece.cmu.edu  Wed Apr 25 18:25:35 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA02871
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 18:25:34 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PHvgd29462
	for ips-outgoing; Wed, 25 Apr 2001 13:57:42 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PHvMA29449
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 13:57:22 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id TAA187364
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 19:57:15 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta05_cs0 [9.165.222.239])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id TAA131860
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 19:57:14 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A39.00629E90 ; Wed, 25 Apr 2001 19:57:11 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Black_David@emc.com
cc: hufferd@us.ibm.com, ips@ece.cmu.edu
Message-ID: <C1256A39.00629C67.00@d12mta05.de.ibm.com>
Date: Wed, 25 Apr 2001 21:02:32 +0300
Subject: RE: iSCSI:Target Reset
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



It looks close to a decent solution. As we are still trying to minimize
options we may want to specify a response code telling that the result was
"good" but no reset was really done (except for authorized users).
This will satisfy legacy software and will avoid the harm John is so
concerned about and avoid adding an option.
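
A minimal sketch of that behaviour, assuming a hypothetical target-side handler: the names ResetResponse, handle_target_reset, and the authorized set are illustrative only and do not come from the draft.

```python
from enum import Enum

class ResetResponse(Enum):
    # Hypothetical response codes; the draft defines no such values yet.
    RESET_DONE = "good, reset performed"
    GOOD_NO_RESET = "good, but no reset performed"

def handle_target_reset(initiator, authorized, do_reset):
    """Reset only for authorized initiators; report 'good' to everyone else."""
    if initiator in authorized:
        do_reset()
        return ResetResponse.RESET_DONE
    # Legacy software sees success, but the target state is untouched.
    return ResetResponse.GOOD_NO_RESET

resets = []
admin = handle_target_reset("admin-host", {"admin-host"}, lambda: resets.append("reset"))
other = handle_target_reset("other-host", {"admin-host"}, lambda: resets.append("reset"))
print(admin.value, "/", other.value, "/ resets performed:", len(resets))
```

How an initiator becomes authorized is left open here, as it is in the discussion.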

I have already said on this list that removing it is really a bad idea, as
any management program will probably need full access to the SCSI command set.

Julo

Black_David@emc.com on 23/04/2001 18:19:29

Please respond to Black_David@emc.com

To:   John Hufferd/San Jose/IBM@IBMUS, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI:Target Reset




With my co-chair hat off, my inclination would be to
specify it (since it's in SAM) but include words about
the risks, make support for it OPTIONAL, and point out
that implementations may want to have access controls
on which initiators are permitted to do this.  I'm
assuming that at least two implementations will do
this so that we don't get into the issue of potentially
needing to remove this in order to go from Proposed
Standard to Draft Standard in the future.

--David

> -----Original Message-----
> From:   John Hufferd [SMTP:hufferd@us.ibm.com]
> Sent:   Sunday, April 22, 2001 5:52 PM
> To:     ips@ece.cmu.edu
> Subject:     iSCSI:Target Reset
>
> (resend of message with iSCSI in Subject)
> I thought we had a number of discussions previously about Target Reset
> (Warm or Cold).  I thought there was general feeling that this command is
> so dangerous that it should not be supported by iSCSI.  The long distance
> capability of iSCSI makes the risks involved unmanageable.  There should
> only be an Admin way to do this.
>
> Some folks have said that we could permit it and have special
> authorization etc.  This would probably cause a separate section in the
> spec. to define the authorization approach, and whatever other security is
> needed to prevent this from inappropriately being used.  All for what
> purpose?  This can not be part of error recovery from a normal initiator.
> The widespread effect is too great for that.
>
> I would like to hear from the list about their feeling on this item.
>
>
>
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com





From owner-ips@ece.cmu.edu  Wed Apr 25 18:26:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA02885
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 18:26:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PK2lG08691
	for ips-outgoing; Wed, 25 Apr 2001 16:02:47 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from iol.unh.edu (mars.iol.unh.edu [132.177.121.222])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PK2ZA08674
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 16:02:35 -0400 (EDT)
Received: from localhost (rdr@localhost)
	by iol.unh.edu (8.9.3/8.9.3) with ESMTP id QAA27960;
	Wed, 25 Apr 2001 16:02:19 -0400 (EDT)
Date: Wed, 25 Apr 2001 16:02:19 -0400
From: "Robert D. Russell" <rdr@mars.iol.unh.edu>
To: julian_satran@il.ibm.com, ips@ece.cmu.edu
Subject: iSCSI Parameter Negotiation
In-Reply-To: <C1256A39.00629C67.00@d12mta05.de.ibm.com>
Message-ID: <Pine.SGI.4.20.0104251600080.27817-100000@mars.iol.unh.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julo:

A simple question:
  During the Login Phase, or during an exchange of text and text-response
  messages during the Full Feature Phase, can the target introduce a
  key=value or key=list pair that has not previously been offered by the
  initiator?

Example:
  Suppose the initiator is happy with the default value of 8 for
  MaxConnections, and therefore does not offer this key in any of the
  login or text messages it sends to the target during the leading
  connection for the session.  Further suppose that the target cannot
  support 8 connections but only 1 connection and wants to inform the
  initiator of this fact.  The only way it can do this is to send a
  login-response or text-response that includes "MaxConnections=1", even
  though this is not a "response" to anything offered by the initiator.

Reference:
  It seems possible to use parts of the current standard (v6) to justify
  either of 2 answers to this question: YES, or NO.  (And therefore, the
  real answer is currently a non-interoperable "maybe"!)

YES, the target can introduce a key=value or key=list pair in a
  login-response or a text-response even though the initiator did not
  previously offer this key in a login or text message.
  
  Justification:
  On page 16, the last paragraph of Section 1.2.4 on Text Mode Negotiation
  talks in terms of "the offering party" and "the responding party",
  presumably implying that either the initiator or the target can be the
  "offering party".

  On page 51, the last paragraph of section 2.9.1 about the F (Final) Bit
  says "if (the Final Bit is) set to 0 in a response to a text command with
  the Final Bit set to 1 it indicates that the target has more work to do
  (invites a follow-on text command)".  It is unclear what this "more work"
  might be, and it is also unclear whether that "follow-on text command"
  from the initiator could include the initiator's "response" to a key=value
  or key=list pair introduced by the target in this response.

NO, the target cannot introduce a key-value pair in a login-response or a
  text-response -- it can ONLY respond to keys explicitly offered by the
  initiator in the login or text message being responded to.

  Justification:
  On page 84, the second paragraph of Section 4.3 on Operational Parameter
  Negotiation During the Login Phase says "Operational parameter negotiation
  MAY involve several request-response exchanges (login and/or text) always
  driven by the initiator."  Further, on page 85, the second paragraph of
  Section 5 on Operational Parameter Negotiation Outside the Login Phase
  again reiterates "Operational parameter negotiation MAY involve several
  text request-response exchanges always driven by the initiator."
  Depending on what you understand by "driven", this could mean that only the
  initiator can offer keys, and the target can only respond to offered keys.
  (It could be that "driven" on both these pages refers only to the setting
  of the F bit, in which case it implies nothing about key=value pairs).

  Furthermore, on page 51, the first paragraph of Section 2.9.3 on Text
  Response Data says "The Text Response Data Segment contains responses in
  the same key=value format as the Text Command and with the same length and
  coding constraints.  Appendix C lists some basic Text Commands and their
  Responses."  This clearly says that a Text Response from the target
  can only contain responses.  The rest of section 2.9.3 also gives the
  impression that the target cannot introduce any new key=value pairs
  in a response, because it says what to do "if the Text Response does not
  contain a key that was requested", and that "Text response key=value
  pairs MUST be delivered in the same order as the command key=value pairs
  whenever applicable".  This section gives no indication that new key=value
  pairs are allowed in the response, and if they are, where they could be
  inserted in the ordering of key=value pairs in the response.

In general, except for the use of the terms "the offering party" and "the
  responding party" in Section 1.2.4, the whole tone of the standard reads
  as if only the initiator can offer key=value or key=list pairs, and the
  target can only respond with values for offered keys.  I say this because
  when you read section 2.8 on the Text Command, nowhere does it say anything
  about what to do if the target sends a response that offers a key that
  had not previously been requested by the initiator.  Likewise, section
  2.9 on the Text Response includes nothing to indicate that the keys in
  this response can be different from those offered in the previous Text
  Command.  Likewise, section 4.2 starting on page 82 describes primarily
  the Security and Integrity Negotiation, but frequently mentions iSCSI
  non-security parameters.  The last paragraph on page 82 explicitly
  says "The initiator sends a text command with an ordered list of the
  options it supports for each subject (authentication algorithm,
  iSCSI parameters and so on)", which implies that operational parameters
  can be offered in this way by the initiator.  The description of the
  target response on page 83 implies that the target can only select
  the appropriate choice to keys offered by the initiator.  There is no
  hint of the target offering the initiator any new keys.

  Even section 1.2.4 on page 15, which generally talks about "offering party"
  and "responding party", does not do so in paragraph 5:
  "If a target is not supporting, or not allowed to use with a specific
  initiator, any of the offered options, it may use the value "reject"."
  This clearly says that only targets can do this, not initiators, and
  would therefore seem to imply that targets cannot offer options.

Resolution:
  The standard should unambiguously state the answer to this question
  someplace.  I would suggest in section 1.2.4, but it would not hurt
  to reiterate it in other places as well, such as in section 4 on the
  Login Phase.  In addition:

  if the answer is "YES", then add some statements in sections 2.8 and 2.9
  to describe how to handle these offers from the target at the same level
  of detail as is now done in those sections for handling offers from the
  initiator.

  if the answer is "NO", then get rid of the terms "offering party" and
  "responding party" in section 1.2.4 (this is the only place in the
  standard where those terms are used), and add statements in sections 2.8
  and 2.9 to explicitly state that targets cannot offer new keys to an
  initiator.

  Furthermore, if the answer is "NO", then what should the target do in
  the example I gave at the start of this e-mail?
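
The two readings can be made concrete with a small sketch of the MaxConnections example. Everything here is an assumption for illustration: the function names, the use of integers for values, and the rule that a numeric key is settled by taking the minimum are not taken from the draft.

```python
# Default from the example above; the initiator assumes it when a key is absent.
DEFAULTS = {"MaxConnections": 8}

def target_response(offered, target_limits, target_may_offer):
    """Build the key=value pairs a target would return."""
    response = {}
    # Answer offered keys in the order the initiator sent them.
    for key, value in offered.items():
        response[key] = min(value, target_limits.get(key, value))
    if target_may_offer:
        # "YES" reading: introduce keys whose default the target cannot honor.
        for key, limit in target_limits.items():
            if key not in offered and limit < DEFAULTS.get(key, limit):
                response[key] = limit
    return response

def initiator_view(response):
    """Values the initiator ends up using: defaults overridden by the response."""
    return {**DEFAULTS, **response}

limits = {"MaxConnections": 1}
yes_view = initiator_view(target_response({}, limits, target_may_offer=True))
no_view = initiator_view(target_response({}, limits, target_may_offer=False))
print("YES reading:", yes_view)
print("NO reading: ", no_view)
```

Under the "NO" reading the initiator is left believing 8 connections are available while the target supports only 1, which is the mismatch the example describes.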


Bob Russell
InterOperability Lab
University of New Hampshire
rdr@iol.unh.edu
603-862-3774



From owner-ips@ece.cmu.edu  Wed Apr 25 18:27:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA02943
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 18:27:17 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PHJe727317
	for ips-outgoing; Wed, 25 Apr 2001 13:19:40 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PHJDA27303
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 13:19:13 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 09AAA94009
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 13:18:58 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI : digest error handling violates EMDP/InDataOrder 
In-Reply-To: Message from Black_David@emc.com 
   of "Tue, 24 Apr 2001 15:00:45 EDT." <0F31E5C394DAD311B60C00E029101A07080154AD@corpmx9.isus.emc.com> 
References: <0F31E5C394DAD311B60C00E029101A07080154AD@corpmx9.isus.emc.com> 
Date: Wed, 25 Apr 2001 13:17:18 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010425171858.09AAA94009@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

David,

> Beyond this, I take Bob Snively's mail as a suggestion
> that we ought to split iSCSI DataOrder from SCSI's
> EMDP, as FCP considers those to be separate concepts.
> That seems like a reasonable approach.
> 
> Comments?

Assuming DataOrder controls the equivalent of random relative offset
(data within an R2T block, or an entire read return), you took the
words out of my mouth.

Steph



From owner-ips@ece.cmu.edu  Wed Apr 25 20:29:13 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA04710
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 20:29:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PM9rR16430
	for ips-outgoing; Wed, 25 Apr 2001 18:09:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel1.hp.com (atlrel1.hp.com [156.153.255.210])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PM9lA16421
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 18:09:47 -0400 (EDT)
Received: from xatlrelay1.atl.hp.com (xatlrelay1.atl.hp.com [15.45.89.190])
	by atlrel1.hp.com (Postfix) with ESMTP
	id B20D5A43; Wed, 25 Apr 2001 18:09:46 -0400 (EDT)
Received: from xpabh4.corp.hp.com (xpabh4.corp.hp.com [15.58.136.1])
	by xatlrelay1.atl.hp.com (Postfix) with ESMTP
	id 709411F50E; Wed, 25 Apr 2001 18:07:51 -0400 (EDT)
Received: by xpabh4.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <JK4C95R0>; Wed, 25 Apr 2001 15:09:44 -0700
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A09006@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "'Douglas Otis'" <dotis@sanlight.net>, ips@ece.cmu.edu,
        Black_David@emc.com
Cc: sob@harvard.edu, egrodriguez@lucent.com, mankin@east.isi.edu
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Wed, 25 Apr 2001 15:09:42 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

iSCSI will not compete with other SCSI transport speeds unless

- significant portions are implemented in hardware

AND
- there is some method of regaining framing in the TCP byte stream to enable
direct iSCSI data placement.

I believe this is sufficiently stated in the iSCSI requirements doc.

Although it is admittedly hard to extract from Doug's long emails, he does
bring up one painful point - even though there are several factual errors
sprinkled throughout that last email.  

The point I chose to extract is that given the first two facts, it is
extremely important to have coordination and attention between the two
efforts (iSCSI definition and framing solution definition).  

Doug states:
> Presently you wish to see this done in a haphazard manner without any
> coordination by the IETF.  This attitude does not reflect the 
> magnitude of this endeavor.

I sympathize with Doug's frustration.  It's a chicken and egg problem that
may cripple iSCSI implementations.  On the one hand, iSCSI definition must
proceed today in order to deliver the specification in a reasonable
timeframe, and a framing solution must proceed in parallel.  On the other
hand, without proper architecture of current iSCSI for a framing solution,
iSCSI may have to be reinvented to perform at 10Gig speeds.  This keeps me
up at night :-)

Another valid point Doug made is that iSCSI, FCIP, and iFCP all have the
same framing needs and should all use the framing solution.  That
recommendation certainly seems sane and within the scope of this WG to
oversee.  Last time I paid attention, FCIP and iFCP were trying to reinvent
this wheel.

Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com 



From owner-ips@ece.cmu.edu  Wed Apr 25 21:17:23 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05348
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 21:17:22 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PNvs523332
	for ips-outgoing; Wed, 25 Apr 2001 19:57:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from magic.adaptec.com (magic.adaptec.com [208.236.45.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PNvSA23314
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 19:57:28 -0400 (EDT)
Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11])
	by magic.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id QAA18525;
	Wed, 25 Apr 2001 16:57:22 -0700 (PDT)
Received: from aimexc03.corp.adaptec.com (aimexc03.corp.adaptec.com [162.62.62.43])
	by redfish.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id QAA29854;
	Wed, 25 Apr 2001 16:48:06 -0700 (PDT)
Received: by aimexc03.corp.adaptec.com with Internet Mail Service (5.5.2650.21)
	id <JGP3VC1L>; Wed, 25 Apr 2001 16:57:21 -0700
Message-ID: <268DBFF7D2A3D411A37400D0B72E345FE71B3B@aimexc03.corp.adaptec.com>
From: "Mudaliar, Hari" <Hari_Mudaliar@adaptec.com>
To: "'Santosh Rao'" <santoshr@cup.hp.com>,
        "Mudaliar, Hari"
	 <Hari_Mudaliar@adaptec.com>
Cc: IPS Reflector <ips@ece.cmu.edu>
Subject: RE: iSCSI : target session login behaviour
Date: Wed, 25 Apr 2001 16:57:17 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Santosh,
	I get your point. But what if there is more than one iSCSI Host bus
adapter in a system? The Initiator Name will be the same and ISID MAY turn
out to be the same (unless the ISIDs are apportioned between the initiators
through some configuration method). This assumes that multiple sessions can
exist between one initiator system (containing multiple iSCSI off-load
engines/HBAs) and a target.

- Hari

-----Original Message-----
From: Santosh Rao [mailto:santoshr@cup.hp.com]
Sent: Wednesday, April 25, 2001 4:18 PM
To: Mudaliar, Hari
Cc: IPS Reflector
Subject: Re: iSCSI : target session login behaviour


"Mudaliar, Hari" wrote:

>         I am assuming that you are referring to the creation of a new
> session with TSID=0 in your example below. Take the case of an initiator
> I1 who has established a session with a target with an ISID=ISID1. What if
> a second initiator I2 tries to login to the same target with ISID1? The
> target cannot decide to logout the first initiator (who already has a
> session established with ISID1) as suggested by you.

Hari,

You may want to take a second look at my mail. It specifically refers to
the problem in the context of a given (Initiator Name, ISID). Your
example above does not fall under that category. A 2nd initiator using
the same ISID would have a different Initiator Name (a.k.a. initiator
WWUI).

The problem raised is in the context of an existing session for a given
(Initiator Name, ISID). How does a target deal with a second session
login received for the same (Initiator Name, ISID) with a NULL TSID ?

>         Also, depending on implementation, the target may realize that the
> TCP connections for a session were lost (using Keep-Alives or iSCSI NOPs
> etc.) when the initiator rebooted thus terminating the session. By the
> time a new login from the same initiator is received, the old session info
> may have been cleared.

Then again, it may not. There are two aspects to this issue:
1) Successful session re-logins from the rebooted host.
2) Garbage collection and cleanup of the old session resources.

(1) is a more serious issue, since the target MUST NOT reject the login
based on a pre-existing active session for a given (Initiator Name,
ISID).

(2) is handled through garbage collection algorithms, but implementation
of the proposal would help accelerate the release of stale session
resources.
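
A rough sketch of the proposed target behaviour, covering only new-session logins (NULL TSID). The Target class, its session table, and the TSID counter are hypothetical and not taken from the draft.

```python
import itertools

class Target:
    """Toy target that keys sessions by (Initiator Name, ISID)."""

    def __init__(self):
        self.sessions = {}                    # (initiator_name, isid) -> tsid
        self._next_tsid = itertools.count(1)  # hypothetical TSID allocator

    def login(self, initiator_name, isid):
        """Handle a session login carrying a NULL TSID.

        Returns (new_tsid, released_tsid); released_tsid is None when no
        stale session existed for this (Initiator Name, ISID).
        """
        key = (initiator_name, isid)
        # Implicit logout: drop any session the rebooted initiator left behind.
        released = self.sessions.pop(key, None)
        new_tsid = next(self._next_tsid)
        self.sessions[key] = new_tsid
        return new_tsid, released

target = Target()
print(target.login("host-a", 1))  # first login for this pair
print(target.login("host-a", 1))  # re-login after reboot replaces the old session
print(target.login("host-b", 1))  # same ISID, different Initiator Name
```

Because the key includes the Initiator Name, a second initiator reusing the same ISID never disturbs the first initiator's session.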

- Santosh


> 
> -----Original Message-----
> From: Santosh Rao [mailto:santoshr@cup.hp.com]
> Sent: Wednesday, April 25, 2001 11:19 AM
> To: IPS Reflector
> Subject: iSCSI : target session login behaviour
> 
> All,
> 
> How should a target respond when it receives a session login  [on a new
> TCP connection] with the same (ISID, Initiator Name) as a session
> already active at the target?
> 
> Does such a login request imply :
> 
> 1) the target should perform implicit logout and re-login of the session
> identified by (ISID, initiator name) ?
> 
> 2) Or does this result in the target responding to the session login
> with :
> a login response with status class of non-zero indicating target is
> rejecting the login ?
> 
> [The draft does not describe target behaviour for this scenario.]
> 
> iSCSI session login semantics should explicitly state that the above
> scenario will result in case (1) above. i.e. when a target sees a
> session login for a given (ISID, initiator name), it MUST treat this as
> an implicit logout of any previous session active at the target for that
> (ISID, initiator name) and then, establish a new session.
> 
> This is required because the above scenario can typically occur when an
> initiator reboots without having performed a session logout on all
> active sessions (i.e. the system did not perform an orderly shutdown).
> 
> As a side note, the iSCSI draft Status Class/Codes could do with a misc
> error category along the lines of the FC "No additional Explanation"
> reason explanation. This would help deal with error conditions that don't
> come under the listed category.
> 
> - Santosh


From owner-ips@ece.cmu.edu  Wed Apr 25 21:17:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05359
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 21:17:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PNqss23012
	for ips-outgoing; Wed, 25 Apr 2001 19:52:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PNpvA22959
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 19:51:57 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRT5X>; Wed, 25 Apr 2001 16:51:49 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B17346E@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "'KRUEGER,MARJORIE (HP-Roseville,ex1)'" <marjorie_krueger@hp.com>,
        "'Douglas Otis'" <dotis@sanlight.net>, ips@ece.cmu.edu,
        Black_David@emc.com
Cc: sob@harvard.edu, egrodriguez@lucent.com, mankin@east.isi.edu
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Wed, 25 Apr 2001 16:51:40 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

> Another valid point Doug made is that iSCSI, FCIP, and iFCP 
> all have the
> same framing needs and should all use the framing solution.  That
> recommendation certainly seems sane and within the scope of this WG to
> oversee.  Last time I paid attention, FCIP and iFCP were 
> trying to reinvent
> this wheel.

To bring you up to date, there is no "framing solution" in the common
encapsulation proposal.

See
http://search.ietf.org/internet-drafts/draft-ietf-ips-fcencapsulation-00.txt


Charles



From owner-ips@ece.cmu.edu  Wed Apr 25 21:17:40 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05370
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 21:17:35 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PNZtR22009
	for ips-outgoing; Wed, 25 Apr 2001 19:35:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PNZWA21991
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 19:35:32 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 554CA75E; Wed, 25 Apr 2001 16:35:31 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id QAA28188;
	Wed, 25 Apr 2001 16:35:26 -0700 (PDT)
Message-ID: <3AE75F0F.3FFB34BB@cup.hp.com>
Date: Wed, 25 Apr 2001 16:34:39 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Julian Satran <julian_satran@il.ibm.com>, IPS Reflector <ips@ece.cmu.edu>
Subject: iSCSI: multiple sessions b/n a pair of WWUIs.
Content-Type: multipart/mixed;
 boundary="------------A1DA71583AFAA91849CB06EA"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------A1DA71583AFAA91849CB06EA
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

> To: ips@ece.cmu.edu 
> Subject: Re: iSCSI: session login and ISID 
> From: julian_satran@il.ibm.com 
> Date: Tue, 10 Apr 2001 14:21:49 +0300 
          

> WWUI can be presented during login phase (2.10.9 is correct and in-line
> with 1.2.7). Two sessions can have the same ISID but will have different
> TSID. The question of whether more than one session should be allowed
> between a pair of WWUIs is under debate.

> Julo


Julian,

There seems to be some disconnect between your comments above and the
name-disc draft. As per the name-disc draft Section 2(d) :

"There can be only one iSCSI  session with a given ISID between an iSCSI
Intiator Node and an iSCSI Target Node."

The iSCSI [and name-disc] drafts should explicitly state that the ISID is
uniquely assigned for a given initiator. Similarly, the TSID is uniquely
assigned for a given target.

On the subject of multiple sessions for a given pair of WWUIs, this MUST
be a requirement. iSCSI must allow multiple sessions for a given pair of
WWUIs.

This is required because implementations that use single-connection
sessions need to set up multiple sessions b/n initiator hosts and
multi-ported targets, and export the multiple paths to LUs to upper-layer
wedge drivers like EMC PowerPath, Veritas VxVM, etc.

Inability to establish multiple sessions b/n a pair of WWUIs implies that
the iSCSI layer will only export one path to the upper-layer wedge
drivers, thereby breaking such applications.

This also implies that iSCSI would then take on all the responsibilities
of providing load balancing and fail-over capabilities, and would require
the use of multi-connection sessions for that purpose.

By allowing multiple sessions for a given WWUI pair, the iSCSI layer could
achieve equivalent functionality using single-connection sessions and
would also not break existing wedge drivers.
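
To illustrate, here is a minimal sketch (in Python) of a session table
that allows several sessions between one WWUI pair, as long as each uses
a distinct ISID. The class, the method names, the example WWUI strings
and the sequential TSID counter are all assumptions made for
illustration; none of them comes from the iSCSI or name-disc drafts. How
a target should really respond to a duplicate ISID is a separate
question; this sketch simply refuses it.

```python
class SessionTable:
    """Tracks sessions keyed by (initiator WWUI, ISID, target WWUI)."""

    def __init__(self):
        self._sessions = {}   # (initiator_wwui, isid, target_wwui) -> tsid
        self._next_tsid = 1   # hypothetical: TSIDs are handed out in order

    def open_session(self, initiator_wwui, isid, target_wwui):
        """Open a session; at most one per ISID between a node pair."""
        key = (initiator_wwui, isid, target_wwui)
        if key in self._sessions:
            raise ValueError("ISID already in use between this node pair")
        tsid = self._next_tsid
        self._next_tsid += 1
        self._sessions[key] = tsid
        return tsid

    def paths(self, initiator_wwui, target_wwui):
        """Return every (ISID, TSID) pair between the two WWUIs."""
        return sorted(
            (isid, tsid)
            for (ini, isid, tgt), tsid in self._sessions.items()
            if ini == initiator_wwui and tgt == target_wwui
        )


table = SessionTable()
table.open_session("wwui-host", isid=1, target_wwui="wwui-array")
table.open_session("wwui-host", isid=2, target_wwui="wwui-array")
# Two independent sessions, i.e. two paths a wedge driver could use.
print(table.paths("wwui-host", "wwui-array"))
```

Each entry returned by paths() is what would be exported upward as a
separate path to the LUs.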

Regards,
Santosh
--------------A1DA71583AFAA91849CB06EA
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------A1DA71583AFAA91849CB06EA--



From owner-ips@ece.cmu.edu  Wed Apr 25 21:17:56 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05397
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 21:17:55 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3Q02rJ23596
	for ips-outgoing; Wed, 25 Apr 2001 20:02:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q02bA23589
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 20:02:38 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel2.hp.com (Postfix) with ESMTP
	id 30EFB1949; Wed, 25 Apr 2001 17:02:37 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id RAA00729;
	Wed, 25 Apr 2001 17:02:32 -0700 (PDT)
Message-ID: <3AE76569.B0C73EB8@cup.hp.com>
Date: Wed, 25 Apr 2001 17:01:45 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: hafner@almaden.ibm.com, IPS Reflector <ips@ece.cmu.edu>
Subject: RE: iSCSI: session login and ISID
Content-Type: multipart/mixed;
 boundary="------------61415D5DC9632F9DD6665951"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------61415D5DC9632F9DD6665951
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

> To: ips@ece.cmu.edu, <someshg@yahoo.com> 
> Subject: RE: iSCSI: session login and ISID 
> From: "Jim Hafner" <hafner@almaden.ibm.com> 
> Date: Wed, 11 Apr 2001 13:27:28 -0700 
                            
> Now we come to the crux of the issue. In my opinion, there
> is a fundamental assumption (not explicitly stated) in the
> SCSI architecture that there never exist more than one  
> nexus between the same two (named or addressed) SCSI Ports. 

Jim,

Is it not the case that the Service Delivery Port (a.k.a. SDP) comprises
a (WWUI + session id)? If this is correct, then multiple sessions
for a given pair of WWUIs is not an attempt to establish 2 I-T nexus b/n
the same 2 named or addressed SCSI ports.

It would be 2 I-T nexus b/n 2 initiator SDPs and 2 target SDPs.

As indicated in my earlier mail on this subject, multiple sessions b/n a
given pair of WWUIs is required to deal with the following type of
configurations :

- host with 2 iSCSI initiator SDPs, init1 & init2.
- target with 2 iSCSI SDPs, tgt1 & tgt2.
- single connection sessions established as follows :
 init1 - tgt1
 init2 - tgt2

All the LUs of a target are seen through 2 different paths and these are
exported to upper layer wedge drivers that handle path failover and load
balancing. 

Failure to allow the above configuration would imply that iSCSI breaks
wedge driver functionality and also requires the deployment of
multi-connection sessions in order to achieve path failover and load
balancing.
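
As a rough sketch of what the wedge driver does with the two paths
above, the following Python fragment round-robins over the live paths
and skips a failed one. The WedgeDriver class and its methods are
hypothetical and are not modelled on PowerPath, VxVM or any real driver;
the path names are the init1/tgt1 and init2/tgt2 pairs from the
configuration above.

```python
class WedgeDriver:
    """Toy multipath layer: round-robin over paths that are still alive."""

    def __init__(self, paths):
        self.paths = list(paths)   # each path is an (initiator SDP, target SDP) pair
        self.failed = set()
        self._next = 0             # round-robin counter

    def mark_failed(self, path):
        """Record that a path (one single-connection session) went down."""
        self.failed.add(path)

    def pick_path(self):
        """Choose the path for the next command, skipping failed paths."""
        live = [p for p in self.paths if p not in self.failed]
        if not live:
            raise RuntimeError("no path to the logical unit")
        path = live[self._next % len(live)]
        self._next += 1
        return path


driver = WedgeDriver([("init1", "tgt1"), ("init2", "tgt2")])
print(driver.pick_path())   # load balancing: alternates between both paths
print(driver.pick_path())
driver.mark_failed(("init1", "tgt1"))
print(driver.pick_path())   # failover: only init2 - tgt2 remains
```

If iSCSI exported only one session per WWUI pair, the list handed to such
a driver would have a single entry and neither behaviour would be
possible at this layer.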

Regards,
Santosh


> Parallel and FCP get this for free because of protocol layer
> constructs/limitations. In my opinion, iSCSI needs to make a 
> similar restriction to meet this requirement.  (Why SAM-x needs 
> it at all is rooted in the nexus state; I could go into that, 
> but I won't unless pressed, preferably offline.)
--------------61415D5DC9632F9DD6665951
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------61415D5DC9632F9DD6665951--



From owner-ips@ece.cmu.edu  Wed Apr 25 21:18:34 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA05408
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 21:18:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PNIsd20913
	for ips-outgoing; Wed, 25 Apr 2001 19:18:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PNIaA20901
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 19:18:36 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 046791CF5; Wed, 25 Apr 2001 16:18:36 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id QAA25978;
	Wed, 25 Apr 2001 16:18:31 -0700 (PDT)
Message-ID: <3AE75B18.721597C6@cup.hp.com>
Date: Wed, 25 Apr 2001 16:17:44 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: "Mudaliar, Hari" <Hari_Mudaliar@adaptec.com>
Cc: IPS Reflector <ips@ece.cmu.edu>
Subject: Re: iSCSI : target session login behaviour
References: <268DBFF7D2A3D411A37400D0B72E345FE71B39@aimexc03.corp.adaptec.com>
Content-Type: multipart/mixed;
 boundary="------------5F855F46B1753AAB3F02F165"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------5F855F46B1753AAB3F02F165
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

"Mudaliar, Hari" wrote:

>         I am assuming that you are referring to the creation of a new
> session with TSID=0 in your example below. Take the case of an initiator I1
> who has established a session with a target with an ISID=ISID1. What if a
> second initator I2 tries to login to the same target with ISID1? The target
> cannot decide to logout the first initiator (who already has a session
> established with ISID1) as suggested by you. 

Hari,

You may want to take a second look at my mail. It specifically refers to
the problem in the context of a given (Initiator Name, ISID). Your
example above does not fall under that category. A 2nd initiator using
the same ISID would have a different Initiator Name (a.k.a. initiator
WWUI).

The problem raised is in the context of an existing session for a given
(Initiator Name, ISID). How does a target deal with a second session
login received for the same (Initiator Name, ISID) with a NULL TSID ?

>         Also, depending on implementation, the target may realize that the
> TCP connections for a session were lost (using Keep-Alives or iSCSI NOPs
> etc.) when the initiator rebooted thus terminating the session. By the time
> a new login from the same initiator is received, the old session info may
> have been cleared.

Then again, it may not. There are 2 aspects to this issue :
1) Successful session re-logins from the rebooted host.
2) Garbage collection and cleanup of the old session resources.

(1) is a more serious issue, since the target MUST NOT reject the login
based on a pre-existing active session for a given (Initiator Name,
ISID).

(2) is handled through garbage collection algorithms, but implementation
of the proposal would help accelerate the release of stale session
resources.

- Santosh


> 
> -----Original Message-----
> From: Santosh Rao [mailto:santoshr@cup.hp.com]
> Sent: Wednesday, April 25, 2001 11:19 AM
> To: IPS Reflector
> Subject: iSCSI : target session login behaviour
> 
> All,
> 
> How should a target respond when it receives a session login  [on a new
> TCP connection] with the same (ISID, Initiator Name) as a session
> already active at the target.
> 
> Does such a login request imply :
> 
> 1) the target should perform implicit logout and re-login of the session
> identified by (ISID, initiator name) ?
> 
> 2) Or does this result in the target responding to the session login
> with :
> a login response with status class of non-zero indicating target is
> rejecting the login ?
> 
> [The draft does not describe target behaviour for this scenario.]
> 
> iSCSI session login semantics should explicitly state that the above
> scenario will result in case (1) above. i.e. when a target sees a
> session login for a given (ISID, initiator name), it MUST treat this as
> an implicit logout of any previous session active at the target for that
> (ISID, initiator name) and then, establish a new session.
> 
> This is required because the above scenario can typically occur when an
> initiator reboots without having performed a session logout on all
> active sessions.(system did not perform an orderly shutdown).
> 
> As a side note, the iSCSI draft Status Class/Codes could do with a misc
> error category along the lines of the FC "No additional Explantion"
> reason explantion. This would help deal with error conditions that don't
> come under the listed category.
> 
> - Santosh
--------------5F855F46B1753AAB3F02F165
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------5F855F46B1753AAB3F02F165--



From owner-ips@ece.cmu.edu  Wed Apr 25 21:47:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA06619
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 21:47:28 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PIKnR00906
	for ips-outgoing; Wed, 25 Apr 2001 14:20:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PIJtA00800
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 14:19:59 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP id 127721A1D
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 11:19:54 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id LAA22420;
	Wed, 25 Apr 2001 11:19:49 -0700 (PDT)
Message-ID: <3AE71515.D8D13B88@cup.hp.com>
Date: Wed, 25 Apr 2001 11:19:01 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: IPS Reflector <ips@ece.cmu.edu>
Subject: iSCSI : target session login behaviour
Content-Type: multipart/mixed;
 boundary="------------180AA6B17B9024D95DF2BD12"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------180AA6B17B9024D95DF2BD12
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

All,

How should a target respond when it receives a session login [on a new
TCP connection] with the same (ISID, Initiator Name) as a session
already active at the target?

Does such a login request imply :

1) the target should perform implicit logout and re-login of the session
identified by (ISID, initiator name) ? 

2) Or does this result in the target responding to the session login
with :
a login response with status class of non-zero indicating target is
rejecting the login ?

[The draft does not describe target behaviour for this scenario.]

iSCSI session login semantics should explicitly state that the above
scenario will result in case (1) above. i.e. when a target sees a
session login for a given (ISID, initiator name), it MUST treat this as
an implicit logout of any previous session active at the target for that
(ISID, initiator name) and then, establish a new session.

This is required because the above scenario can typically occur when an
initiator reboots without having performed a session logout on all
active sessions (i.e. the system did not perform an orderly shutdown).

As a side note, the iSCSI draft Status Class/Codes could do with a misc
error category along the lines of the FC "No additional explanation"
reason explanation. This would help deal with error conditions that don't
come under the listed category.
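
The proposed case (1) behaviour can be sketched as follows. This is only
an illustration in Python, under stated assumptions: the Target class,
the sequential TSID source, the "released" list and the handling of a
non-zero TSID are all hypothetical and are not text from the draft.

```python
import itertools


class Target:
    """Toy target that applies the proposed implicit-logout rule."""

    def __init__(self):
        self._sessions = {}               # (initiator_name, isid) -> tsid
        self._tsids = itertools.count(1)  # assumed TSID source; 0 means "no TSID yet"
        self.released = []                # TSIDs of sessions logged out implicitly

    def login(self, initiator_name, isid, tsid=0):
        key = (initiator_name, isid)
        if tsid != 0:
            # Assumption: a non-zero TSID refers to an existing session.
            if self._sessions.get(key) != tsid:
                raise LookupError("no such session")
            return tsid
        # Null TSID: a new session is requested. Any session still held
        # for the same (Initiator Name, ISID) is treated as stale and is
        # logged out implicitly instead of causing a reject.
        old = self._sessions.pop(key, None)
        if old is not None:
            self.released.append(old)
        new_tsid = next(self._tsids)
        self._sessions[key] = new_tsid
        return new_tsid


target = Target()
before_reboot = target.login("host-a", isid=1)
after_reboot = target.login("host-a", isid=1)   # host rebooted, no logout sent
print(before_reboot, after_reboot, target.released)
```

Note that the key is the pair (Initiator Name, ISID), so a different
initiator that happens to use the same ISID gets its own session and
does not disturb the first one.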

- Santosh
--------------180AA6B17B9024D95DF2BD12
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------180AA6B17B9024D95DF2BD12--



From owner-ips@ece.cmu.edu  Wed Apr 25 22:04:15 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA06835
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 22:04:08 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PLWo313899
	for ips-outgoing; Wed, 25 Apr 2001 17:32:50 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from magic.adaptec.com (magic.adaptec.com [208.236.45.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PLVqA13830
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 17:31:53 -0400 (EDT)
Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11])
	by magic.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id OAA02465;
	Wed, 25 Apr 2001 14:31:42 -0700 (PDT)
Received: from aimexc03.corp.adaptec.com (aimexc03.corp.adaptec.com [162.62.62.43])
	by redfish.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id OAA14157;
	Wed, 25 Apr 2001 14:22:25 -0700 (PDT)
Received: by aimexc03.corp.adaptec.com with Internet Mail Service (5.5.2650.21)
	id <JGP3VBZS>; Wed, 25 Apr 2001 14:31:41 -0700
Message-ID: <268DBFF7D2A3D411A37400D0B72E345FE71B39@aimexc03.corp.adaptec.com>
From: "Mudaliar, Hari" <Hari_Mudaliar@adaptec.com>
To: "'Santosh Rao'" <santoshr@cup.hp.com>, IPS Reflector <ips@ece.cmu.edu>
Subject: RE: iSCSI : target session login behaviour
Date: Wed, 25 Apr 2001 14:31:38 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Santosh,
	I am assuming that you are referring to the creation of a new
session with TSID=0 in your example below. Take the case of an initiator I1
who has established a session with a target with an ISID=ISID1. What if a
second initiator I2 tries to log in to the same target with ISID1? The target
cannot decide to logout the first initiator (who already has a session
established with ISID1) as suggested by you. It must reject the login.
Initiator I2 perhaps needs to retry with a different ISID.
	Also, depending on implementation, the target may realize that the
TCP connections for a session were lost (using Keep-Alives or iSCSI NOPs
etc.) when the initiator rebooted thus terminating the session. By the time
a new login from the same initiator is received, the old session info may
have been cleared.

- Hari
 

-----Original Message-----
From: Santosh Rao [mailto:santoshr@cup.hp.com]
Sent: Wednesday, April 25, 2001 11:19 AM
To: IPS Reflector
Subject: iSCSI : target session login behaviour


All,

How should a target respond when it receives a session login  [on a new
TCP connection] with the same (ISID, Initiator Name) as a session
already active at the target.

Does such a login request imply :

1) the target should perform implicit logout and re-login of the session
identified by (ISID, initiator name) ? 

2) Or does this result in the target responding to the session login
with :
a login response with status class of non-zero indicating target is
rejecting the login ?

[The draft does not describe target behaviour for this scenario.]

iSCSI session login semantics should explicitly state that the above
scenario will result in case (1) above. i.e. when a target sees a
session login for a given (ISID, initiator name), it MUST treat this as
an implicit logout of any previous session active at the target for that
(ISID, initiator name) and then, establish a new session.

This is required because the above scenario can typically occur when an
initiator reboots without having performed a session logout on all
active sessions (i.e. the system did not perform an orderly shutdown).

As a side note, the iSCSI draft Status Class/Codes could do with a misc
error category along the lines of the FC "No additional explanation"
reason explanation. This would help deal with error conditions that don't
come under the listed category.

- Santosh


From owner-ips@ece.cmu.edu  Wed Apr 25 22:17:50 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA07021
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 22:17:45 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3Q0HtT24474
	for ips-outgoing; Wed, 25 Apr 2001 20:17:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q0H1A24437
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 20:17:01 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel2.hp.com (Postfix) with ESMTP
	id 707631913; Wed, 25 Apr 2001 17:17:00 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id RAA01686;
	Wed, 25 Apr 2001 17:16:55 -0700 (PDT)
Message-ID: <3AE768C8.3B925B7@cup.hp.com>
Date: Wed, 25 Apr 2001 17:16:08 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: "Mudaliar, Hari" <Hari_Mudaliar@adaptec.com>
Cc: IPS Reflector <ips@ece.cmu.edu>
Subject: Re: iSCSI : target session login behaviour
References: <268DBFF7D2A3D411A37400D0B72E345FE71B3B@aimexc03.corp.adaptec.com>
Content-Type: multipart/mixed;
 boundary="------------0461B497A83331BA60B16103"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------0461B497A83331BA60B16103
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

"Mudaliar, Hari" wrote:
> 
> Santosh,
>         I get your point. But what if there is more than one iSCSI Host bus
> adapter in a system? The Initiator Name will be the same and ISID MAY turn
> out to be the same (unless the ISIDs are apportioned between the initiators
> through some configuration method). 

HBAs that offload session mgmt are required to accept the ISID from the
node (in this context, the host O.S. driver). This allows the node to
maintain uniqueness of the ISID within its space.
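
A minimal sketch of that arrangement, in Python: one allocator owned by
the node (the host O.S. driver) hands ISIDs to every HBA, so two HBAs
under the same Initiator Name can never pick the same value. The
IsidAllocator class, the HBA labels and the 16-bit upper bound are
assumptions for illustration only.

```python
class IsidAllocator:
    """Node-wide ISID allocator shared by all HBAs under one Initiator Name."""

    def __init__(self, max_isid=0xFFFF):   # assumed ISID range, for illustration
        self._max = max_isid
        self._in_use = {}                  # isid -> label of the HBA holding it

    def allocate(self, hba_id):
        """Give the lowest free ISID to the HBA that is opening a session."""
        for isid in range(1, self._max + 1):
            if isid not in self._in_use:
                self._in_use[isid] = hba_id
                return isid
        raise RuntimeError("ISID space exhausted")

    def release(self, isid):
        """Return an ISID to the pool once its session is logged out."""
        self._in_use.pop(isid, None)


node = IsidAllocator()
isid_a = node.allocate("hba0")
isid_b = node.allocate("hba1")
print(isid_a, isid_b)   # distinct values, although both HBAs share one node name
```

The HBA never invents an ISID of its own; it only uses the value passed
down by the node.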

The name-disc draft explicitly states in Section    :

"The SCSI Initiator Port Name and SCSI Initiator Port  Identifier are
the iSCSI Initiator Node name together with the ISID  of the session
identifier."

<snip snip>

"There can be only one iSCSI session with a given ISID between an iSCSI
Intiator Node and an iSCSI Target Node."

<snip snip>

"There can be multiple SCSI Port objects present in an  iSCSI Storage
Node object (one for each session created). In software iSCSI
representations, the iSCSI Storage Node object creates a session process
which, in turn, represents the SCSI Port. By making the SCSI Port be a
separate object from the iSCSI Node object, it allows one to use
different combinations of software and hardware iSCSI implementations
within the same iSCSI node umbrella. Moreover, this also allows the
iSCSI Node name at the initiators to be associated with the operating
system footprint and not with any network card hardware (such as the
iSCSI offload network card). In hardware iSCSI offload card
implementations, the session process is present in the iSCSI network
card. The iSCSI Node object passes the unique iSCSI Node name and the
ISID or the TSID to the session process."

- Santosh




> This assumes that multiple sessions can
> exist between one initiator system (containing multiple iSCSI off-load
> engines/HBAs) and a target.
> 
> - Hari
> 
> -----Original Message-----
> From: Santosh Rao [mailto:santoshr@cup.hp.com]
> Sent: Wednesday, April 25, 2001 4:18 PM
> To: Mudaliar, Hari
> Cc: IPS Reflector
> Subject: Re: iSCSI : target session login behaviour
> 
> "Mudaliar, Hari" wrote:
> 
> >         I am assuming that you are referring to the creation of a new
> > session with TSID=0 in your example below. Take the case of an initiator
> I1
> > who has established a session with a target with an ISID=ISID1. What if a
> > second initator I2 tries to login to the same target with ISID1? The
> target
> > cannot decide to logout the first initiator (who already has a session
> > established with ISID1) as suggested by you.
> 
> Hari,
> 
> You may want to take a second look at my mail. It specifically refers to
> the problem in the context of a given (Initiator Name, ISID). Your
> example above does not fall under that category. A 2nd initiator using
> the same ISID would have a different Initiator Name. (a.k.a initiator
> WWUI).
> 
> The problem raised is in the context of an existing session for a given
> (Initiator Name, ISID). How does a target deal with a second session
> login received for the same (Initiator Name, ISID) with a NULL TSID ?
> 
> >         Also, depending on implementation, the target may realize that the
> > TCP connections for a session were lost (using Keep-Alives or iSCSI NOPs
> > etc.) when the initiator rebooted thus terminating the session. By the
> time
> > a new login from the same initiator is received, the old session info may
> > have been cleared.
> 
> Then again, it may not. There's 2 aspects to this issue :
> 1) Successful session re-logins from the rebooted host.
> 2) Garbage collection and cleanup of the old session resources.
> 
> (1) is a more serious issue, since the target MUST NOT reject the login
> based on a pre-existing active session for a given (Initiator Name,
> ISID).
> 
> (2) is handled through garbage collection algorithms, but implementation
> of the proposal would help accelerate the release of stale session
> resources.
> 
> - Santosh
> 
> >
> > -----Original Message-----
> > From: Santosh Rao [mailto:santoshr@cup.hp.com]
> > Sent: Wednesday, April 25, 2001 11:19 AM
> > To: IPS Reflector
> > Subject: iSCSI : target session login behaviour
> >
> > All,
> >
> > How should a target respond when it receives a session login  [on a new
> > TCP connection] with the same (ISID, Initiator Name) as a session
> > already active at the target.
> >
> > Does such a login request imply :
> >
> > 1) the target should perform implicit logout and re-login of the session
> > identified by (ISID, initiator name) ?
> >
> > 2) Or does this result in the target responding to the session login
> > with :
> > a login response with status class of non-zero indicating target is
> > rejecting the login ?
> >
> > [The draft does not describe target behaviour for this scenario.]
> >
> > iSCSI session login semantics should explicitly state that the above
> > scenario will result in case (1) above. i.e. when a target sees a
> > session login for a given (ISID, initiator name), it MUST treat this as
> > an implicit logout of any previous session active at the target for that
> > (ISID, initiator name) and then, establish a new session.
> >
> > This is required because the above scenario can typically occur when an
> > initiator reboots without having performed a session logout on all
> > active sessions.(system did not perform an orderly shutdown).
> >
> > As a side note, the iSCSI draft Status Class/Codes could do with a misc
> > error category along the lines of the FC "No additional Explantion"
> > reason explantion. This would help deal with error conditions that don't
> > come under the listed category.
> >
> > - Santosh
--------------0461B497A83331BA60B16103
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------0461B497A83331BA60B16103--



From owner-ips@ece.cmu.edu  Wed Apr 25 22:25:40 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA07089
	for <ips-archive@odin.ietf.org>; Wed, 25 Apr 2001 22:25:39 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PLVnc13828
	for ips-outgoing; Wed, 25 Apr 2001 17:31:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel2.hp.com (atlrel2.hp.com [156.153.255.202])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PLVKA13761
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 17:31:20 -0400 (EDT)
Received: from xatlrelay1.atl.hp.com (xatlrelay1.atl.hp.com [15.45.89.190])
	by atlrel2.hp.com (Postfix) with ESMTP
	id E54DC33F; Wed, 25 Apr 2001 17:31:19 -0400 (EDT)
Received: from xpabh3.corp.hp.com (xpabh3.corp.hp.com [15.58.136.223])
	by xatlrelay1.atl.hp.com (Postfix) with ESMTP
	id 821761F50E; Wed, 25 Apr 2001 17:29:19 -0400 (EDT)
Received: by xpabh3.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <JKQY9V40>; Wed, 25 Apr 2001 14:31:08 -0700
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A09005@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "'Douglas Otis'" <dotis@sanlight.net>,
        "Ips Reflector (E-mail)" <ips@ece.cmu.edu>
Subject: RE: I-D ACTION:draft-ietf-ips-iscsi-reqmts-03.txt
Date: Wed, 25 Apr 2001 14:31:04 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> On page 18,
>  "The iSCSI protocol document SHOULD NOT define the management
>   architecture for iSCSI within the network infrastructure."
> 
>  What does this mean?
> 
> I think it means that the iSCSI protocol document SHOULD NOT 
> define the
> management architecture.  The management architecture may be 
> defined within
> in a separate document.
> 
> I do not understand the phrase 'within network infrastructure.'

My tendency to want to distill a concept into the most concise, least wordy
expression has perhaps left *this* concept too meagerly defined. :-)

"management architecture within a network infrastructure" refers to the
concept that iSCSI will be yet another resource service within a complex
environment of network resources (printers, file servers, NAS, application
servers, etc).  There will certainly be efforts to design how the "block
storage service" that iSCSI devices provide will be integrated into a
comprehensive, shared model, network management environment.  A "network
administrator" (or "storage administrator") will desire to have integrated
applications for assigning user names, resource names, etc. and indicating
access rights.  iSCSI devices presumably will want to interact with these
integrated network management applications.  The iSCSI protocol document
will not attempt to solve that set of problems, or specify means for devices
to provide management agents.  In fact, there should be no mention of MIBs
or any other means of managing iSCSI devices as explicit references in the
iSCSI protocol document, because MIBs and management descriptions and
protocols change with the needs of the environment and the business models
of the management applications.

Marj 



From owner-ips@ece.cmu.edu  Thu Apr 26 00:59:41 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA10157
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 00:59:40 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3PLXp813943
	for ips-outgoing; Wed, 25 Apr 2001 17:33:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mail.brocade.com (asbestos.brocade.com [63.121.140.244])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3PL9wA12466
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 17:09:58 -0400 (EDT)
Received: from thor.brocade.com (thor [192.168.126.45])
	by mail.brocade.com (8.8.8+Sun/8.8.8) with ESMTP id OAA15368
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 14:09:49 -0700 (PDT)
Received: by thor.brocade.com with Internet Mail Service (5.5.2653.19)
	id <FXLBA0H5>; Wed, 25 Apr 2001 14:09:49 -0700
Message-ID: <FFD40DB4943CD411876500508BAD02797D46D1@sj5-ex2.brocade.com>
From: Robert Snively <rsnively@brocade.com>
To: ips@ece.cmu.edu
Subject: Re:  iSCSI Requirements Draft - Informal WG Last Call  -  A few c
	oncerns about the document.
Date: Wed, 25 Apr 2001 14:09:47 -0700
X-Mailer: Internet Mail Service (5.5.2653.19)
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Having reviewed the iSCSI requirements draft, I have identified the
following issues.  Where possible, I have proposed corrections.

1)  MISSING REQUIREMENT FOR AVOIDING LUN BLOCKING

	At present, there does not appear to be any text
	referencing the problems of hanging up all logical units
	on a target because one command to one logical unit
	was stalled.  This should be addressed explicitly, 
	probably in section 4.2.

2)  ORDERING OF COMMANDS

	In section 2.4, the following text is given:
    
   MUST provide a FIFO transport of SCSI commands, even when commands 
   are sent along different paths. This command ordering mechanism 
   SHOULD seek to minimize the amount of communication necessary across 
   multiple adapters doing transport off-load. 
    
	This was heavily discussed already.  Related to point 1, the
	issue should actually be the FIFO transport of SCSI commands
	for a particular I_T_L nexus.  If the commands are going to
	different logical units or crossing different I_T nexi, the
	requirement should not be present.  I would propose changing
	the MUST to apply only to an I_T_L nexus, allowing other
	relationships to have an un-ordered relationship.  This will
	be especially useful for recovery on systems that install
	a lower level framing capability.
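
	A minimal sketch of the narrower requirement proposed here, assuming
	a nexus is identified by an (initiator, target, LUN) tuple.  The
	class and method names are hypothetical and come from neither draft.
	FIFO order is kept inside each I_T_L nexus only, so a stalled command
	on one logical unit does not hold back another.

```python
from collections import defaultdict, deque


class NexusQueues:
    """Keeps FIFO order per I_T_L nexus; no ordering across nexuses."""

    def __init__(self):
        # One independent FIFO per (initiator, target, lun) tuple.
        self._queues = defaultdict(deque)

    def submit(self, initiator, target, lun, command):
        """Append a command to the queue of its own nexus."""
        self._queues[(initiator, target, lun)].append(command)

    def next_for(self, initiator, target, lun):
        """Return the oldest pending command for this nexus, or None."""
        queue = self._queues[(initiator, target, lun)]
        return queue.popleft() if queue else None


queues = NexusQueues()
queues.submit("init-a", "tgt-1", 0, "WRITE-1")
queues.submit("init-a", "tgt-1", 1, "READ-1")
queues.submit("init-a", "tgt-1", 0, "WRITE-2")

# LUN 1 can be served even though LUN 0 still has work queued ahead of it.
first_on_lun1 = queues.next_for("init-a", "tgt-1", 1)
```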

3)  CLARIFY AN UNCLEAR SECURITY REQUIREMENT  

Section 6.2 provides a requirement that uses the oxymoron 
"passive attack".  If it is an attack, there is an intent and
it is active.  I would propose deleting the word "passive" in
the following requirement:
    
   iSCSI authenticated login MUST be resilient against >passive< attacks. 
  
4)  REMOVE SALES/MARKETING FLUFF

	Certain sections are filled with opinions that have no
	relevance to the standards activities upon which these
	requirements are being placed.  The wording may or may
	not be correct and is often improvable.  Such wording
	should be deleted from this document.  The more
	egregious examples, but not the only examples, of text
	that should be deleted include:

	In section 2.1: 

   "The IP infrastructure offers compelling advantages for volume/block-
   oriented storage attachment compared to current approaches.  It 
   offers the opportunity to take advantage of the performance/cost 
   benefits provided by competition in the internet marketplace. This 
   reduces the cost of storage infrastructure by: 
    
    -- Increasing performance (market driven by networking demand) 
    -- Offers richer array of management, security and QoS solutions 
    -- Economies arising from the need to install and operate only 
       single type of network 
    
   In addition, mapping SCSI over IP provides: 
    
    -- Extended distance ranges 
    -- Connectivity to "carrier class" services that support IP"

	While an intriguing marketing statement, there is absolutely
	no objective information that indicates what is the standard of
	comparison and whether or not any of this is true.  This text
	should be deleted.

	On page 7:

   "Products could initially be offered for Gigabit Ethernet attachment, 
   with rapid migration to 10 GbE.  For performance competitive with 
   alternative SCSI transports, it will be necessary to implement the 
   performance path of the full protocol stack in hardware.  These new 
   storage NICs might perform full-stack processing of a complete SCSI 
   task, analogous to today's SCSI and Fibre Channel HBAs, and might 
   also support all host protocols that use TCP (NFS, CIFS, HTTP, etc)."

	All these statements depend on what market is being approached
	by the implementation.  For any particular market, any or all
	of this information may be incorrect.  This text should be deleted.

	Other similar text is sprinkled throughout the document and 
	should be deleted.

5)  CORRECT TEXT ASSOCIATED WITH DIRECT DATA PLACEMENT

	The text associated with direct data placement in section 2.2
	is largely associated with routing buffering and framing, not
	the requirements for zero-copy.  The text at present is:

   Direct data placement (zero-copy iSCSI): 
    
   This is an important implementation goal.  In an iSCSI system, each 
   of the end nodes (for example host computer and storage controller) 
   has ample memory; but the intervening nodes (NIC, switches) do not.  
   Assume a WAN-scale retransmission requirement of 25 MB (1 Gbps) or 
   250 MB (10 Gbps, see Framing discussion).  Therefore, intervening 
   nodes MUST NOT be required to buffer data. 

	It should be rewritten to say:

   Direct data placement (zero-copy iSCSI): 
    
   Direct data placement allows iSCSI data to be moved directly
   to the required memory locations with no requirement
   to recopy the incoming information.  Direct data placement 
   significantly reduces the memory bus and I/O bus loading in
   the end-point systems, allowing improved performance.
   
6)  ALTERNATE CONNECTION BINDING

	The section in 2.4 discussing an alternate mechanism for
	connection binding merely serves to weaken the stand
	in favor of the selected binding relationship.  The following
	text should be deleted:

   "An alternate approach that was extensively discussed involved 
   sending all commands on a single connection and the associated data 
   and status on a different connection (asymmetric approach). In this 
   scheme, the transport ensures the commands arrive in order. The 
   protocol on the data and status connections is simpler, perhaps 
   lending itself to a simpler realization in hardware.  One 
   disadvantage of this approach is that the recovery procedure is 
   different if a command connection fails vs. a data connection. Some 
   argued that this approach would require greater inter-processor 
   communication when connections are spread across processors.   
   The reader may reference the mail archives of the IPS mailing list 
   between June and September of 2000 for extensive discussions on 
   symmetric vs asymmetric connection models." 

7)  OPTIONAL BEHAVIOR

	In clause 2.4, wording about the desirability of minimizing
	optional features is discussed.  However, it reaches the 
	mistaken conclusion that there is only one time at which
	options may be negotiated and that rejection is required if
	the options are not supported.  The following text should 
	be changed from:

   "In the interest of simplicity, iSCSI SHOULD minimize optional 
   features.  When features are deemed necessary, the protocol SHOULD 
   allow for feature negotiation at session establishment (login) and 
   provide for rejection when an implementation does not support a 
   requested feature."

	to:

   "iSCSI SHOULD minimize optional features.  When features are
   deemed necessary, the protocol SHALL provide for negotiation of
   the use of those features.  iSCSI SHALL operate correctly whether
   an optional feature is negotiated to be used or is 
   negotiated not to be used."

8)  REMOVE OPTIONAL EXTENSIONS

	In section 4.1, the text suggests that various digest
	implementations may be used.  This is an option that has
	no reason to be allowed, since we will choose the proper
	digest calculation method after due study and no other
	calculation method should be allowed.  The following
	text should be deleted.
 
   "The iSCSI header format SHOULD be extensible to include other digest 
   calculation methods."

9)  SOFTEN REQUIREMENT TO IMPLEMENT STRANGE SAM-2 FUNCTIONS

	In section 5.2, the following text suggests that any
	feature in SAM-2 requires a valid transport mapping.  However,
	it further suggests making such functions recommended or
	required to implement, even if they are rarely used or 
	used only in contexts different from iSCSI.  The following
	text:    

   "In order to be considered a SCSI transport, the iSCSI standard must 
   comply with the requirements of the SCSI Architecture Model [SAM2] 
   for a SCSI transport.  Any feature SAM2 requires in a valid 
   transport mapping MUST be specified by iSCSI and the specification 
   SHOULD make such a feature either RECOMMENDED or REQUIRED in 
   implementations."

	should be changed to read:

  "In order to be considered a SCSI transport, the iSCSI standard SHALL 
   comply with the requirements of the SCSI Architecture Model [SAM2] 
   for a SCSI transport.  Any feature SAM2 requires in a valid 
   transport mapping SHALL be specified by iSCSI.  The iSCSI document 
   SHALL specify for each feature whether it is OPTIONAL, RECOMMENDED,
   or REQUIRED to implement and/or use."

10)  INCORRECT REQUIREMENT FOR BRIDGES/ROUTERS  

	In section 5.2, there is a paragraph treating gateways.
	I contend that all present SCSI transports are easily bridged
	BECAUSE they have chosen a very similar encapsulation format.
	The similar encapsulation format is that used by FCP, FCP-2,
	SBP-2, Packetized Parallel, and SSA.  The structure of 
	iSCSI packets and the protocol for transmitting them should
	be similar to the encapsulation formats used by those
	protocols.  Using this as a guideline, the following
	requirement is incorrect:

   "The iSCSI protocol MUST allow for the construction of gateways to 
   other SCSI transports, including parallel SCSI [SPI-X] and to SCSI-
   FCP[FCP, FCP-2].  It MUST be possible to construct "translating" 
   gateways so that iSCSI hosts can interoperate with SCSI-X devices; 
   so that SCSI-X devices can communicate over an iSCSI network; and so 
   that SCSI-X hosts can use iSCSI targets (where SCSI-X refers to 
   parallel SCSI, SCSI-FCP, or SCSI over any other transport).  This 
   requirement is implied by support for SAM-2, but is worthy of 
   emphasis. These are true application protocol gateways, and not just 
   bridge/routers.  The different standards have only the SCSI-3 
   command set layer in common.  These gateways are not mere packet 
   forwarders."

	That paragraph should be reworded as follows:

   "The iSCSI protocol MUST allow for the construction of simple
   gateways to other SCSI transports, including parallel SCSI and
   packetized parallel SCSI as specified by SPI-4, and Fibre
   Channel Protocol for SCSI as specified by FCP and FCP-2.
   It MUST be possible to construct  
   gateways so that iSCSI devices can use SCSI commands to communicate with
   devices using other protocols.  This 
   requirement is implied by support for SAM-2, but is worthy of 
   emphasis. iSCSI SHALL use packet formats similar to the common
   packet formats used by other packetized SCSI protocols where
   possible to allow both simple bridging gateways and more
   sophisticated translating gateways."

11) CLARIFY CONGESTION QUESTION

	Section 8.3 considers congestion in a rather strange way.
	It was my impression that a well-behaved TCP/IP connection
	provided appropriate congestion management, regardless of
	the information passed across it.  As a result, the following
	text in 8.3 should be removed:

   "The iSCSI protocol MUST be a good network citizen with proven 
   congestion control (as defined in RFC 2309). In addition, iSCSI 
   implementations MUST NOT use multiple connections as a means to 
   avoid transport-layer congestion control."

	and replaced with:

   "iSCSI implementations MUST NOT use multiple connections as a means to 
   avoid transport-layer congestion control.  Standard TCP/IP 
   congestion management mechanisms operate normally while transporting
   iSCSI information."


      


From owner-ips@ece.cmu.edu  Thu Apr 26 01:48:33 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA15112
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 01:48:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3Q3d1E06004
	for ips-outgoing; Wed, 25 Apr 2001 23:39:01 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e22.nc.us.ibm.com (e22.nc.us.ibm.com [32.97.136.228])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q3ckA05994
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 23:38:46 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e22.nc.us.ibm.com (8.9.3/8.9.3) with ESMTP id XAA37038;
	Wed, 25 Apr 2001 23:30:35 -0500
Received: from d03nmx42.almaden.ibm.com (d03nmx42.almaden.ibm.com [9.1.24.146])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96.1.0) with ESMTP id VAA55114;
	Wed, 25 Apr 2001 21:33:36 -0600
Importance: Normal
Subject: Re: iSCSI: multiple sessions b/n a pair of WWUIs.
To: Santosh Rao <santoshr@cup.hp.com>
Cc: ips@ece.cmu.edu
From: "Jim Hafner" <hafner@almaden.ibm.com>
Date: Wed, 25 Apr 2001 20:33:34 -0700
Message-ID: <OF7A4FB579.2EDCC70B-ON88256A3A.00111074@almaden.ibm.com>
X-MIMETrack: Serialize by Router on D03NMX42/03/M/IBM(Beta 1 M8_03292001|March 29, 2001) at
 04/25/2001 08:33:35 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Santosh,

There may be a lack of synchronization between the two drafts (not unexpected
since they are being worked on in parallel).

The requirement in name+disc that a given initiator name cannot reuse an
ISID for two different sessions comes as a consequence of a number of things
(which are described in the draft).  The gist is that this is needed to
provide the correct context for restoration of reservations state (and
other nexus state) to a particular nexus after logout/login. In other
words, if the session goes down for some reason, the target needs clear
context to restore nexus state to a rebuilt session. The only tool it has
is uniqueness of Name+ISID combination within its name space.
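
A rough sketch of that bookkeeping on the target side, assuming the target
keys everything on the (initiator name, ISID) pair.  The class, its methods
and the shape of the saved state are hypothetical illustrations, not text
from either draft.

```python
class TargetSessionTable:
    """Tracks sessions at one target, keyed on (initiator name, ISID)."""

    def __init__(self):
        self._active = {}       # (name, isid) -> tsid of the live session
        self._saved_state = {}  # (name, isid) -> nexus state kept after logout
        self._next_tsid = 1

    def login(self, initiator_name, isid):
        """Open a session; refuse a second live session on the same key."""
        key = (initiator_name, isid)
        if key in self._active:
            raise ValueError("ISID already in use for this initiator name")
        tsid = self._next_tsid
        self._next_tsid += 1
        self._active[key] = tsid
        # Hand back whatever nexus state (e.g. reservations) was kept.
        return tsid, self._saved_state.get(key)

    def logout(self, initiator_name, isid, nexus_state):
        """Close the session and remember its nexus state for a later login."""
        key = (initiator_name, isid)
        del self._active[key]
        self._saved_state[key] = nexus_state


table = TargetSessionTable()
first_tsid, restored = table.login("host-a", 1)
table.logout("host-a", 1, {"reservations": ["lun0"]})
second_tsid, restored_again = table.login("host-a", 1)
```

Because the key is unique, the rebuilt session picks up the state of the one
it replaces, which is the context the target needs after logout/login.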

There was a similar requirement in draft...05.txt in 2.11.5 (though that
was ambiguous about whether ISIDs are unique within a target across all
initiator names or just with respect to a given initiator name).
Apparently that requirement is now gone from draft...06.txt!

The requirement forces carving up ISID namespaces between iSCSI adapters
(session managers) in a given node.  They each get their name and a piece
of the ISID space from configuration information.  You mentioned this
yourself in a related reply in this thread to Hari.

So, we get multiple sessions between nodes (named things), single sessions
between ports (one per ISID+TSID pair) and a framework for restoring
necessary nexus state (uniqueness of Name+ISID at the target -- no reuse of
ISID).   (I think Julo's comment is that he has not had time to keep up with
the formalization done in N&DT within the last week.)

I think/hope this has everything needed.

[But I have to admit, this is sort of a hack to get the iSCSI constructs to
shoe-horn into the SAM constructs.  All of this would have been a lot
easier if we hadn't gone to multiple connections per session. iSCSI is the
first protocol to allow for such constructs.]

Jim Hafner


Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/25/2001 04:34:39 PM

Sent by:  owner-ips@ece.cmu.edu


To:   Julian Satran/Haifa/IBM@IBMIL, IPS Reflector <ips@ece.cmu.edu>
cc:
Subject:  iSCSI: multiple sessions b/n a pair of WWUIs.



> To: ips@ece.cmu.edu
> Subject: Re: iSCSI: session login and ISID
> From: julian_satran@il.ibm.com
> Date: Tue, 10 Apr 2001 14:21:49 +0300


> WWUI can be presented during login phase (2.10.9 is correct and in-line
> with 1.2.7) Two sessions can have the same ISID but will have different
> TSID. The question of whether
> more than one session should be allowed between a pair of WWUIs is under
> debate.

> Julo


Julian,

There seems to be some disconnect between your comments above and the
name-disc draft. As per the name-disc draft Section 2(d) :

"There can be only one iSCSI  session with a given ISID between an iSCSI
Initiator Node and an iSCSI Target Node."

The iSCSI [&name-disc] drafts should explicitly state that ISID is
uniquely assigned for a given initiator. Similarly, the TSID is uniquely
assigned for a given target.

On the subject of multiple sessions for a given pair of WWUIs, this MUST
be a requirement. iSCSI must allow multiple sessions for a given pair of
WWUIs.

This is required because single-connection session models would like to
setup multiple sessions b/n initiator hosts and multi-ported targets and
export the multiple paths to LUs to upper layer wedge drivers like EMC
Powerpath, Veritas VxVm, etc.

Inability to establish multiple sessions b/n a pair of WWUIs implies
iSCSI layer will only export one path to the upper layer wedge drivers,
thereby, breaking such applications.

This also implies iSCSI would then take on all the responsibilities of
providing load balancing and fail-over capabilities and would require
the use of multi-connection sessions for that purpose.

By allowing multiple sessions for a given WWUI pair, iSCSI layer could
achieve equivalent functionality using single connection sessions and
would also not break existing wedge drivers.

Regards,
Santosh






From owner-ips@ece.cmu.edu  Thu Apr 26 01:49:08 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id BAA15198
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 01:49:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3Q0fsG25800
	for ips-outgoing; Wed, 25 Apr 2001 20:41:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q0fBA25745
	for <ips@ece.cmu.edu>; Wed, 25 Apr 2001 20:41:12 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3Q1lo132551;
	Wed, 25 Apr 2001 18:47:51 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Charles Monia" <cmonia@NishanSystems.com>,
        "'KRUEGER,MARJORIE \(HP-Roseville,ex1\)'" <marjorie_krueger@hp.com>,
        <ips@ece.cmu.edu>
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Wed, 25 Apr 2001 17:37:55 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJMENICGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <B300BD9620BCD411A366009027C21D9B17346E@ariel.nishansystems.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Charles,

The encapsulation proposal is devoid of techniques for framing.  How do you
expect to see this lack of framing resolved?  Do you expect to use this
adapter and not have a means for framing?

Doug

> Hi:
>
> > Another valid point Doug made is that iSCSI, FCIP, and iFCP all have the
> > same framing needs and should all use the framing solution.  That
> > recommendation certainly seems sane and within the scope of this WG to
> > oversee.  Last time I paid attention, FCIP and iFCP were trying to reinvent
> > this wheel.
>
> To bring you up to date, there is no "framing solution" in the common
> encapsulation proposal.
>
> See
> http://search.ietf.org/internet-drafts/draft-ietf-ips-fcencapsulation-00.txt
>
>
> Charles
>
>



From owner-ips@ece.cmu.edu  Thu Apr 26 04:18:23 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA26020
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 04:18:23 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3Q6m8415905
	for ips-outgoing; Thu, 26 Apr 2001 02:48:08 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q6l6A15868
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 02:47:06 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id IAA138370
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 08:46:58 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id IAA104976
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 08:46:58 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3A.00253FDA ; Thu, 26 Apr 2001 08:46:51 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3A.00253F82.00@d12mta02.de.ibm.com>
Date: Thu, 26 Apr 2001 09:52:18 +0300
Subject: Re: iSCSI : LUN field in NOP-OUT & NOP-IN PDUs.
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

In NOPs the LUNs are used only if the exchange is originated by the target
and then the LUN
has to be valid.

Otherwise the LUN is 0 (when the target task tag is ffffffff).
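
Read as a check on a received NOP, the rule above might look like the
following sketch.  The function name and the valid_luns argument are
hypothetical, and the reserved tag value is the ffffffff mentioned above.

```python
RESERVED_TARGET_TASK_TAG = 0xFFFFFFFF


def nop_lun_is_acceptable(target_task_tag, lun, valid_luns):
    """Apply the NOP LUN rule as described in this message.

    A reserved tag means the exchange was not originated by the target,
    so the LUN field carries no meaning and must be 0.  Any other tag
    means the target originated the exchange and the LUN must be valid.
    """
    if target_task_tag == RESERVED_TARGET_TASK_TAG:
        return lun == 0
    return lun in valid_luns


known_luns = {0, 1, 5}
initiator_ping_ok = nop_lun_is_acceptable(0xFFFFFFFF, 0, known_luns)
target_ping_ok = nop_lun_is_acceptable(0x00000012, 5, known_luns)
```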

I've added to the NOP-out figure the "or Reserved (0)".

Otherwise the text explains exactly what I said here.

Julo



Santosh Rao <santoshr@cup.hp.com> on 23/04/2001 22:14:54

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   IPS Reflector <ips@ece.cmu.edu>
cc:
Subject:  iSCSI : LUN field in NOP-OUT & NOP-IN PDUs.





At the risk of raising a subject that has been beaten to death [and
still not resolved ?], could someone please clarify on what was the
outcome of the prior thread discussing the usage of the LUN field in
NOP-OUT & NOP-IN PDUs ?

As has been discussed in the past, the NOP-OUT and NOP-IN PDUs are
transport specific and are used without any LUN context. Hence, it is
not clear why a LUN field is required in either the NOP-OUT or NOP-IN
PDUs.

Some side notes on this subject :
a) the LUN field description is missing in the NOP-OUT description and
it is not clear from either the figure or the text whether this field
can be reserved. If so, what would be the reserved value for LUN in
NOP-OUT ?

b) The NOP-IN PDU description shows a value of LUN 0 as reserved in the
NOP-IN PDU diagram. Is not LUN 0 a valid LU number for a LU (?)

- Santosh





From owner-ips@ece.cmu.edu  Thu Apr 26 04:21:20 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA26052
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 04:21:19 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3Q6I8e14470
	for ips-outgoing; Thu, 26 Apr 2001 02:18:08 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q6HHA14425
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 02:17:17 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id IAA10876
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 08:17:09 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id IAA119688
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 08:17:09 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3A.00228600 ; Thu, 26 Apr 2001 08:17:05 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3A.00228489.00@d12mta02.de.ibm.com>
Date: Thu, 26 Apr 2001 09:22:29 +0300
Subject: Re: iSCSI : EnableACA
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

How about a third confusing response?

We have introduced the EnableACA (per LU) to enable an initiator unwilling
to use it to disable it at the target for this specific Initiator.

NormACA - the INQUIRY bit - indicates support for ACA by the target device
server but is a "read-only" bit.

What you are suggesting is that in all cases in which iSCSI would require
the target to enter ACA (after a LU reset or a target reset) it should look
forward in the queue and enter ACA only if it finds a CDB marked NACA? (and
that includes commands in flight).

Or to enter ACA only when it finds such a command (sort of "soft ACA")?

Both of them sound wrong and complex.

Julo

Santosh Rao <santoshr@cup.hp.com> on 20/04/2001 22:11:12

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   ips@ece.cmu.edu
cc:
Subject:  iSCSI : EnableACA




Julian,

Would the following not satisfy the requirements for dealing with this
ACA issue :

1) Initiators determine the target support for ACA through the NACA bit
in the INQUIRY response. (Assuming iSCSI targets have implemented ACA in
good faith, this would be supported.)

2) Initiators set the NACA bit in the CDBs of commands that need strong
ordering. (This could be a small subset of the I/O traffic to one or
more LUNs within the session and not required for all the I/Os in that
session.)

3) Any exception condition on a SCSI I/O, for which the NACA bit was set
results in ACA being established.
Thus, ACA would only be applied if some I/O traffic that required strong
ordering was affected by the exception condition.

4) Since the initiator is ACA capable based on its usage of the NACA
bit, it should also be capable of performing the desired Clear ACA to
recover from this condition.

Such an approach would only apply ACA and its corresponding recovery
when some strongly ordered I/O encountered an exception condition,
rather than applying ACA on a session granularity.
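
As a rough illustration of steps 1) to 4), assuming ACA state is tracked
per logical unit.  The class and method names are hypothetical and are not
taken from the iSCSI draft or from SAM-2.

```python
class LogicalUnit:
    """Per-LU ACA state driven only by the NACA bit of the failing command."""

    def __init__(self):
        self.aca_active = False

    def on_exception(self, naca_bit_set):
        """Handle an exception condition on one command.

        ACA is established only when the failing command had NACA set
        in its CDB; other failures leave the ACA state untouched.
        """
        if naca_bit_set:
            self.aca_active = True
        return self.aca_active

    def clear_aca(self):
        """Initiator-driven recovery (Clear ACA)."""
        self.aca_active = False


lu = LogicalUnit()
after_plain_failure = lu.on_exception(naca_bit_set=False)
after_naca_failure = lu.on_exception(naca_bit_set=True)
lu.clear_aca()
after_clear = lu.aca_active
```

Nothing in this sketch looks at the session, which is the point of the
proposal: the decision depends only on the failing command.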

To summarize, the above approach allows :
- ACA to be turned on/off for a subset of I/Os headed to a LUN
- ACA based recovery only used where needed.
- Keeps iSCSI ACA un-aware and rightly so, since this is a property of
the SCSI ULP.
- Avoids applying ACA recovery on a session granularity.

What am I missing here (?). Why is an EnableACA needed ?

- Santosh


julian_satran@il.ibm.com wrote:

> All references to
> EnableACA are redundant and should be removed for the following reasons
> :
>
> a) An initiator knows whether a target supports ACA from the NACA bit in
> the INQUIRY response. When a target indicates support for ACA, the
> initiator can use it by setting the NACA bit in the CDBs it sends. There
> is NO need for any sort of negotiation of this behaviour above and
> beyond what is already provided thru SCSI mechanisms.
>
> b) The ACA is a SCSI ULP concept and iSCSI should not be negotiating its
> use or lack thereof. This is done thru the NACA bit in CDBs.
>
> c) (As a side note, the description of EnableACA on pg 127 refers to its
> presence in the lun control mode page, but it is actually present in the
> protocol specific port page.)
>
> d) ACA is a LUN-level (more an I/O level) control option. It MUST NOT be
> negotiated on a per-session basis. SCSI allows initiators to request ACA
> behaviour on a per I/O basis through the use of NACA bit in the CDBs.
>

 +++ We have required ACA to be supported by all new iSCSI targets and
 several actions require the target to enter ACA state.
 It was brought to our attention that many initiators will not react
 properly to a target entering ACA state (not do the reset).
 The EnableACA bit and key are meant to enable an initiator to control
 this iSCSI specific ACA behaviour.  This behaviour is related to
 asynchronous events and is not controlled by the NACA CDB bit.

 ++++





From owner-ips@ece.cmu.edu  Thu Apr 26 05:39:01 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA26861
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 05:39:00 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3Q7aBg18229
	for ips-outgoing; Thu, 26 Apr 2001 03:36:11 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q7ZuA18211
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 03:35:56 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id JAA345186
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 09:35:49 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id JAA44160
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 09:35:48 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3A.0029B6A4 ; Thu, 26 Apr 2001 09:35:37 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3A.0029B65F.00@d12mta02.de.ibm.com>
Date: Thu, 26 Apr 2001 10:41:04 +0300
Subject: RE: iSCSI Target Reset
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



I agree that this is an implementation decision.  However, I wonder if it
wouldn't be fair to tell the guy trying to do reset that although everything
is fine the reset was not performed - and do this in a way that does not
harm legacy initiators.

Julo

Black_David@emc.com on 24/04/2001 02:31:24

Please respond to Black_David@emc.com

To:    ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI Target Reset




I agree with Charles that this is an implementation
issue.  If a Shark wants to reset all 32 adapters
when it receives a Target Reset on one of them, that's
a Shark implementation decision.  It's completely valid
to reset only the adapter that the Target Reset is
received on (common Fibre Channel behavior) or
only the iSCSI target to which the Target Reset is
addressed if there's more than one Target behind
the adapter.

As for leaving things out of iSCSI - the default modus
operandi should be to put in everything that's described
in SAM2 unless we can convince T10 to take the feature
out of SAM2.  Let's not go deciding to cast things out
of SCSI on T10's behalf.

Thanks,
--David

> -----Original Message-----
> From:   Charles Monia [SMTP:cmonia@nishansystems.com]
> Sent:   Monday, April 23, 2001 7:12 PM
> To:     ips@ece.cmu.edu
> Subject:     RE: iSCSI Target Reset
>
> Hi:
>
> These seem to be implementation decisions. I don't see how that justifies
> removing support from the protocol.
>
> Charles
>
> > -----Original Message-----
> > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > Sent: Monday, April 23, 2001 2:34 PM
> > To: Santosh Rao
> > Cc: ips@ece.cmu.edu
> > Subject: Re: iSCSI Target Reset
> >
> >
> >
> > Absolutely not,  Why would we think that impacting 32 different other
> > initiators is an OK thing to do.  By the way there are lots more Initiators
> > possible with FC on Shark, and would hope that there would be even more
> > with iSCSI.
> >
> > I have been told that these large Storage Controllers do not support Target
> > Reset today.  So I see no loss in not supporting such an item in iSCSI
> > especially since many Initiators will be beyond even the distances and
> > mischief that is possible with FC.
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> >
> >
> > Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001
> > 01:24:02 PM
> >
> > Sent by:  owner-ips@ece.cmu.edu
> >
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  Re: iSCSI Target Reset
> >
> >
> >
> > "Dillard, David" wrote:
> > >
> > > When will STORPORT be generally available?  The latest STORPORT document
> > > that I found on the MS web site is version 0.6a, dated March 18, 2001.
> > > Given this it seems like STORPORT might not be available soon.  In that case
> > > do you know what happens with the current drivers?  Are we going to be
> > > telling customers that if they want to use iSCSI and NT clustering they have
> > > to update to Whistler?
> >
> >
> > [One would hope that this list does not turn into a Microsoft
> > release/product discussion mailing list (?) ]
> >
> > Without going into specifics of A certain O.S., does it suffice to
> > require that iSCSI not break existing legacy SCSI applications ?
> >
> > If the above is a valid requirement, then, knowing that legacy
> > applications continue to use SCSI-2 Reserve/Release and the target reset
> > as a mechanism of breaking SCSI-2 reservations, shouldn't iSCSI continue
> > to support the target reset ?
> >
> > - Santosh
> >
> >
> >
> >





From owner-ips@ece.cmu.edu  Thu Apr 26 05:39:45 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA26873
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 05:39:44 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3Q7VA017981
	for ips-outgoing; Thu, 26 Apr 2001 03:31:10 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from maho3msx2.isus.emc.com (maho3msx2.isus.emc.com [128.221.11.32])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q7UlA17962
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 03:30:47 -0400 (EDT)
Received: by maho3msx2.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <28S7ZPVQ>; Thu, 26 Apr 2001 03:30:41 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080154BD@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: julian_satran@il.ibm.com, ips@ece.cmu.edu
Subject: RE: iSCSI : EnableACA
Date: Thu, 26 Apr 2001 03:30:40 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> What you are suggesting is that in all cases in which iSCSI would require
> the target to enter ACA (after a LU reset or a target reset) look forward
> in the queue and enter ACA only if it finds a CDB marked NACA? (and that
> includes commands in flight).

Ok, so why is iSCSI requiring ACA in addition to Unit
Attention for Clear Task Set, LU Reset and Target Reset?
In all cases, we're dealing with Initiators whose tasks
are cleared as a consequence of another Initiator
issuing the appropriate task management command.
SAM2 only requires Unit Attention.
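
A rough sketch of the Unit-Attention-only behaviour argued for above.
This assumes a hypothetical in-memory model of one LU; the class and
method names are invented for illustration and come from neither SAM2
nor the iSCSI draft.  When one initiator clears the task set, the other
affected initiators get a pending Unit Attention and nothing more (no
ACA is established):

```python
from collections import defaultdict


class LogicalUnit:
    """Hypothetical model of one LU's task set; names are illustrative only."""

    def __init__(self):
        self.tasks = defaultdict(set)   # initiator name -> outstanding task tags
        self.unit_attention = set()     # initiators with a pending Unit Attention

    def queue(self, initiator, tag):
        self.tasks[initiator].add(tag)

    def clear_task_set(self, issuer):
        """Clear every task; flag Unit Attention for the other affected initiators."""
        for initiator, tags in self.tasks.items():
            if tags and initiator != issuer:
                self.unit_attention.add(initiator)
        self.tasks.clear()


lu = LogicalUnit()
lu.queue("init-A", 1)
lu.queue("init-B", 7)
lu.clear_task_set(issuer="init-A")
print(sorted(lu.unit_attention))   # prints ['init-B']
```

The same shape would apply to LU Reset and Target Reset; only the set of
task sets being cleared changes.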

--David

> -----Original Message-----
> From:	julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> Sent:	Thursday, April 26, 2001 2:22 AM
> To:	ips@ece.cmu.edu
> Subject:	Re: iSCSI : EnableACA
> 
> 
> 
> Santosh,
> 
> How about a third confusing response?
> 
> We have introduced the EnableACA (per LU) to enable an initiator unwilling
> to use it to disable it at the target for this specific Initiator.
> 
> NormACA - the enquiry bit - indicates support for ACA by the target device
> server but is a "read-only" bit.
> 
> What you are suggesting is that in all cases in which iSCSI would require
> the target to enter ACA (after a LU reset or a target reset) look forward
> in the queue and enter ACA only if it finds a CDB marked NACA? (and that
> includes commands in flight).
> 
> Or to enter ACA only when it finds such a command (sort of "soft ACA")?
> 
> Both of them sound wrong and complex.
> 
> Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 20/04/2001 22:11:12
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   ips@ece.cmu.edu
> cc:
> Subject:  iSCSI : EnableACA
> 
> 
> 
> 
> Julian,
> 
> Would the following not satisfy the requirements for dealing with this
> ACA issue :
> 
> 1) Initiators determine the target support for ACA through the NACA bit
> in the INQUIRY response. (Assuming iSCSI targets have implemented ACA in
> good faith, this would be supported.)
> 
> 2) Initiators set the NACA bit in the CDBs of commands that need strong
> ordering. (This could be a small subset of the I/O traffic to one or
> more LUNs within the session and not required for all the I/Os in that
> session.)
> 
> 3) Any exception condition on a SCSI I/O, for which the NACA bit was set
> results in ACA being established.
> Thus, ACA would only be applied if some I/O traffic that required strong
> ordering was affected by the exception condition.
> 
> 4) Since the initiator is ACA capable based on its usage of the NACA
> bit, it should also be capable of performing the desired Clear ACA to
> recover from this condition.
> 
> Such an approach would only apply ACA and its corresponding recovery
> when some strongly ordered I/O encountered an exception condition,
> rather than applying ACA on a session granularity.
> 
> To summarize, the above approach allows :
> - ACA to be turned on/off for a subset of I/Os headed to a LUN
> - ACA based recovery only used where needed.
> - Keeps iSCSI ACA un-aware and rightly so, since this is a property of
> the SCSI ULP.
> - Avoids applying ACA recovery on a session granularity.
> 
> What am I missing here (?). Why is an EnableACA needed ?
> 
> - Santosh
> 
> 
> julian_satran@il.ibm.com wrote:
> 
> > All references to
> > EnableACA are redundant and should be removed for the following reasons
> > :
> >
> > a) An initiator knows whether a target supports ACA from the NACA bit in
> > the INQUIRY response. When a target indicates support for ACA, the
> > initiator can use it by setting the NACA bit in the CDBs it sends. There
> > is NO need for any sort of negotiation of this behaviour above and
> > beyond what is already provided thru SCSI mechanisms.
> >
> > b) The ACA is a SCSI ULP concept and iSCSI should not be negotiating its
> > use or lack thereof. This is done thru the NACA bit in CDBs.
> >
> > c) (As a side note, the description of EnableACA on pg 127 refers to its
> > presence in the lun control mode page, but it is actually present in the
> > protocol specific port page.)
> >
> > d) ACA is a LUN-level (more an I/O level) control option. It MUST NOT be
> > negotiated on a per-session basis. SCSI allows initiators to request ACA
> > behaviour on a per I/O basis through the use of NACA bit in the CDBs.
> >
> 
>  +++ We have required ACA to be supported by all new iSCSI targets and
>  several actions require the target to enter ACA state.
>  It was brought to our attention that many initiators will not react
>  properly to a target entering ACA state (not do the reset).
>  The EnableACA bit and key are meant to enable an initiator to control
>  this iSCSI specific ACA behaviour.  This behaviour is related to
>  asynchronous events and is not controlled by the NACA CDB bit.
> 
>  ++++
>  - santoshr.vcf
> 
> 


From owner-ips@ece.cmu.edu  Thu Apr 26 05:41:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA26899
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 05:41:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3Q8CCC19981
	for ips-outgoing; Thu, 26 Apr 2001 04:12:12 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q8ACA19834
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 04:10:12 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id KAA31440
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 10:10:07 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id KAA81428
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 10:10:07 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3A.002CDDF6 ; Thu, 26 Apr 2001 10:10:04 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3A.002CDD25.00@d12mta02.de.ibm.com>
Date: Thu, 26 Apr 2001 11:15:29 +0300
Subject: Re: iSCSI : Aborting non-SCSI tasks.
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



OK, I'll mention this explicitly, although it is implied. Julo

Santosh Rao <santoshr@cup.hp.com> on 24/04/2001 20:27:06

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   IPS Reflector <ips@ece.cmu.edu>
cc:
Subject:  iSCSI : Aborting non-SCSI tasks.




All,

The iSCSI spec is missing a description of how non-SCSI tasks should be
aborted in order to flush stale PDUs of that task. Initiators will
typically time non-SCSI [& SCSI] tasks and will need to resort to some
form of abort and cleanup action on a timeout of the non-SCSI task.

This is required in order to safely re-use the task tag resources
without the danger of stale PDUs arriving from a previous incarnation of
that task tag.

The spec should provide some description on how this is to be done.
Perhaps, the semantics of Abort Task can be extended to non-SCSI tasks
as well, to avoid defining a second abort mechanism for non-SCSI tasks.
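
To make the tag-reuse concern concrete, here is a minimal sketch of an
initiator-side tag allocator.  Everything in it (the TaskTagPool class
and its method names) is hypothetical and not taken from the draft; it
only illustrates the idea that a timed-out tag stays quarantined until
the abort has been confirmed, so stale PDUs cannot be mistaken for a new
task:

```python
class TaskTagPool:
    """Hypothetical initiator-side tag allocator; not part of the iSCSI draft."""

    def __init__(self, size):
        self.free = list(range(size))
        self.active = set()      # tags with a task in progress
        self.aborting = set()    # timed-out tags awaiting abort confirmation

    def allocate(self):
        tag = self.free.pop(0)
        self.active.add(tag)
        return tag

    def complete(self, tag):
        self.active.remove(tag)
        self.free.append(tag)

    def timeout(self, tag):
        # Do not recycle yet: stale PDUs for this tag may still arrive.
        self.active.remove(tag)
        self.aborting.add(tag)

    def abort_confirmed(self, tag):
        # The target has acknowledged the abort, so the tag is safe to reuse.
        self.aborting.remove(tag)
        self.free.append(tag)


pool = TaskTagPool(size=2)
nop = pool.allocate()        # stands in for some non-SCSI task
pool.timeout(nop)
next_tag = pool.allocate()   # gets the other tag, never the one being aborted
pool.abort_confirmed(nop)
```

Whether the confirmation comes from an extended Abort Task or from some
other mechanism is exactly the open question raised above.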

- Santosh




From owner-ips@ece.cmu.edu  Thu Apr 26 05:42:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA26947
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 05:42:14 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3Q8VDn20863
	for ips-outgoing; Thu, 26 Apr 2001 04:31:13 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e2.ny.us.ibm.com ([32.97.182.102])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q8TwA20768
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 04:29:58 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e2.ny.us.ibm.com (8.9.3/8.9.3) with ESMTP id EAA185924
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 04:28:13 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96.1.0) with ESMTP id CAA78270
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 02:24:36 -0600
Importance: Normal
Subject: RE: iSCSI Target Reset
To: "Julian Satran" <Julian_Satran@il.ibm.com>
Cc: ips@ece.cmu.edu
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OF6C1541B2.F0D079D0-ON88256A3A.002CDC1F@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Thu, 26 Apr 2001 01:24:17 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/26/2001 02:24:28 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


I received the following statement from the IBM Shark development team.

" For Fibre Channel, Shark supports Target Reset and it is used by almost
all the hosts as well.

Shark supports Target Reset by resetting all the LUNS that are configured
to the host that issues
the Target Reset.   The Target Reset, therefore, will not affect the LUNs
seen by other initiators,
unless they are sharing the same LUNs. "

Therefore, it probably makes some sense to state, in the draft, that this
kind of approach should be considered by iSCSI implementers.
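
As an illustration only, the scoping rule in the Shark statement could be
sketched as follows.  The lun_map structure and both function names are
assumptions made up for this example; they do not describe the actual
Shark implementation:

```python
def luns_to_reset(lun_map, issuer):
    """Return the LUNs configured to the initiator that issued the Target Reset.

    lun_map is a hypothetical mapping of LUN -> set of initiator names.
    """
    return sorted(lun for lun, hosts in lun_map.items() if issuer in hosts)


def affected_initiators(lun_map, issuer):
    """Other initiators touched by the reset, i.e. those sharing a reset LUN."""
    hit = set()
    for lun in luns_to_reset(lun_map, issuer):
        hit |= lun_map[lun]
    hit.discard(issuer)
    return sorted(hit)


lun_map = {0: {"host-a"}, 1: {"host-a", "host-b"}, 2: {"host-c"}}
print(luns_to_reset(lun_map, "host-a"))        # prints [0, 1]
print(affected_initiators(lun_map, "host-a"))  # prints ['host-b']
```

Here host-c is untouched by host-a's Target Reset, and host-b is affected
only because it shares LUN 1.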

I do not think this is a large-controller-only problem, since with iSCSI,
lots of different desktops and laptops will be getting at RAID arrays, which
might have previously been attached to only 2 hosts via SCSI buses or a few
more hosts with Fibre Channel.

IOW, Target Reset has a larger impact than before, even on Storage
controllers that are in the "Mid Range" (and lower), which previously may
not have even worried about the issue.

So the mention of an approach such as the above might be a useful note in
the Spec.

.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


Julian Satran/Haifa/IBM@IBMIL@ece.cmu.edu on 04/26/2001 12:41:04 AM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI Target Reset





I agree that this is an implementation decision.  However, I wonder if it
wouldn't be fair to tell the guy trying to do the reset that, although
everything is fine, the reset was not performed - and to do this in a way
that does not harm legacy initiators.

Julo

Black_David@emc.com on 24/04/2001 02:31:24

Please respond to Black_David@emc.com

To:    ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI Target Reset




I agree with Charles that this is an implementation
issue.  If a Shark wants to reset all 32 adapters
when it receives a Target Reset on one of them, that's
a Shark implementation decision.  It's completely valid
to reset only the adapter that the Target Reset is
received on (common Fibre Channel behavior) or
only the iSCSI target to which the Target Reset is
addressed if there's more than one Target behind
the adapter.

As for leaving things out of iSCSI - the default modus
operandi should be to put in everything that's described
in SAM2 unless we can convince T10 to take the feature
out of SAM2.  Let's not go deciding to cast things out
of SCSI on T10's behalf.

Thanks,
--David

> -----Original Message-----
> From:   Charles Monia [SMTP:cmonia@nishansystems.com]
> Sent:   Monday, April 23, 2001 7:12 PM
> To:     ips@ece.cmu.edu
> Subject:     RE: iSCSI Target Reset
>
> Hi:
>
> These seem to be implementation decisions. I don't see how that justifies
> removing support from the protocol.
>
> Charles
>
> > -----Original Message-----
> > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > Sent: Monday, April 23, 2001 2:34 PM
> > To: Santosh Rao
> > Cc: ips@ece.cmu.edu
> > Subject: Re: iSCSI Target Reset
> >
> >
> >
> > Absolutely not,  Why would we think that impacting 32 different other
> > initiators is an OK thing to do.  By the way there are lots more Initiators
> > possible with FC on Shark, and would hope that there would be even more
> > with iSCSI.
> >
> > I have been told that these large Storage Controllers do not support Target
> > Reset today.  So I see no loss in not supporting such an item in iSCSI
> > especially since many Initiators will be beyond even the distances and
> > mischief that is possible with FC.
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> >
> >
> > Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001
> > 01:24:02 PM
> >
> > Sent by:  owner-ips@ece.cmu.edu
> >
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  Re: iSCSI Target Reset
> >
> >
> >
> > "Dillard, David" wrote:
> > >
> > > When will STORPORT be generally available?  The latest STORPORT document
> > > that I found on the MS web site is version 0.6a, dated March 18, 2001.
> > > Given this it seems like STORPORT might not be available soon.  In that case
> > > do you know what happens with the current drivers?  Are we going to be
> > > telling customers that if they want to use iSCSI and NT clustering they have
> > > to update to Whistler?
> >
> >
> > [One would hope that this list does not turn into a Microsoft
> > release/product discussion mailing list (?) ]
> >
> > Without going into specifics of A certain O.S., does it suffice to
> > require that iSCSI not break existing legacy SCSI applications ?
> >
> > If the above is a valid requirement, then, knowing that legacy
> > applications continue to use SCSI-2 Reserve/Release and the target reset
> > as a mechanism of breaking SCSI-2 reservations, shouldn't iSCSI continue
> > to support the target reset ?
> >
> > - Santosh
> >
> >
> >
> >








From owner-ips@ece.cmu.edu  Thu Apr 26 05:42:39 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id FAA26961
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 05:42:38 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3Q8WDL20934
	for ips-outgoing; Thu, 26 Apr 2001 04:32:13 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q8W5A20925
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 04:32:06 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id KAA233424
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 10:31:59 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id KAA74656
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 10:31:58 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3A.002EDE61 ; Thu, 26 Apr 2001 10:31:55 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3A.002EDD27.00@d12mta02.de.ibm.com>
Date: Thu, 26 Apr 2001 11:37:20 +0300
Subject: Re: iSCSI: Immediate Delivery Behavior
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Charles,

There is an explicit statement that iSCSI uses TCP and this implies that on
any given connection nothing can be delivered out of order.

However if there is a hole in the iSCSI queue (e.g., due to a digest error)
immediate commands can still be delivered out of order.

In 7.3 there is a description of how to handle task management to cover for
those cases.
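
A small sketch of the hole case described above, under simplifying
assumptions: one connection, a hypothetical deliver() helper, and
immediate commands modelled as bypassing the CmdSN check entirely.  None
of these names come from the draft.  Non-immediate commands are handed to
the SCSI layer strictly in CmdSN order, so a missing CmdSN holds back
everything behind it while an immediate command still goes through:

```python
def deliver(expected_cmdsn, arrivals):
    """Return (names handed to the SCSI layer in order, next expected CmdSN).

    arrivals is a list of (cmdsn, immediate, name) in arrival order.
    """
    pending = {}      # CmdSN -> command held back behind a hole
    delivered = []
    for cmdsn, immediate, name in arrivals:
        if immediate:
            # Immediate commands may be handed over as soon as detected.
            delivered.append(name)
            continue
        pending[cmdsn] = name
        # Drain every command that is now in sequence.
        while expected_cmdsn in pending:
            delivered.append(pending.pop(expected_cmdsn))
            expected_cmdsn += 1
    return delivered, expected_cmdsn


# CmdSN 2 never arrives (e.g. dropped after a digest error).
arrivals = [
    (1, False, "write-1"),
    (3, False, "read-3"),
    (3, True, "abort-task"),
]
result = deliver(1, arrivals)
print(result)   # prints (['write-1', 'abort-task'], 2)
```

In this run "abort-task" reaches the SCSI layer while "read-3", which
arrived earlier on the same connection, is still waiting for CmdSN 2.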

Regards,
Julo

Charles Monia <cmonia@NishanSystems.com> on 25/04/2001 01:23:17

Please respond to Charles Monia <cmonia@NishanSystems.com>

To:   "Ips (E-mail)" <ips@ece.cmu.edu>
cc:   Charles Monia <cmonia@NishanSystems.com>
Subject:  iSCSI: Immediate Delivery Behavior




Hi:

The behavior for immediate commands seems ambiguous and possibly needlessly
complex.

Rev 06 says the following regarding ordered delivery to the SCSI layer:

   "Except for the commands marked for immediate delivery the iSCSI
   target layer MUST deliver the commands to the SCSI target layer in
   the order specified by CmdSN. Commands marked for immediate delivery
   may be handed over by the iSCSI target layer to the SCSI target layer
   as soon as detected. iSCSI may avoid delivering some command to the
   SCSI layer if so required by some prior SCSI or iSCSI action (e.g.,
   clear task set Task Management request received before all the
   commands it was supposed to act on)."

In a non-striped session consisting of one TCP/IP connection, the above
could be interpreted to allow the delivery of an immediate command before
other partly received commands that were previously issued. As a result, an
operation, such as an abort task, might bypass the command to be aborted --
even if both were sent on the same connection.

Assuming that's true, I believe a useful simplification is to require that
all traffic flowing over a given TCP/IP connection be delivered to the SCSI
layer in the order received over that connection.  In a striped session, an
immediate command might therefore leapfrog commands on other connections
but would never bypass commands on the same connection.  In my opinion, that
simplifies the problem of properly purging commands and stale PDUs in the
wake of a task management operation.

Charles
Charles Monia
Senior Technology Consultant
Nishan Systems
email: cmonia@nishansystems.com
voice: (408) 519-3986
fax:   (408) 435-8385





From owner-ips@ece.cmu.edu  Thu Apr 26 08:16:28 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id IAA00283
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 08:16:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3Q9mIw05908
	for ips-outgoing; Thu, 26 Apr 2001 05:48:18 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from bramg1.net.external.hp.com (bramg1.net.external.hp.com [192.6.126.73])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q9lLA05873
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 05:47:21 -0400 (EDT)
Received: from quasit.br.itc.hp.com (quasit.br.itc.hp.com [15.145.8.135])
	by bramg1.net.external.hp.com (Postfix) with ESMTP id 7E8BB10E
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 11:47:18 +0200 (METDST)
Received: from loddon.br.itc.hp.com (loddon.br.itc.hp.com [15.145.8.166])
	by quasit.br.itc.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit6.0.6 OpenMail) with SMTP id KAA09274
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 10:47:18 +0100 (BST)
Received: from 15.145.8.166 by loddon.br.itc.hp.com (InterScan E-Mail VirusWall NT); Thu, 26 Apr 2001 10:47:18 +0100 (GMT Daylight Time)
Received: by loddon.br.itc.hp.com with Internet Mail Service (5.5.2653.19)
	id <JCJ27D4T>; Thu, 26 Apr 2001 10:47:18 +0100
Message-ID: <0B9A57FF1D57D411B47500D0B73E5CC101E7A699@dickens.bri.hp.com>
From: "BURBRIDGE,MATTHEW (HP-UnitedKingdom,ex2)" <matthew_burbridge@hp.com>
To: "'ips@ece.cmu.edu'" <ips@ece.cmu.edu>
Subject: RE: iSCSI : target session login behaviour
Date: Thu, 26 Apr 2001 10:47:17 +0100
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="ISO-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hari,

In response to your email

Either:

The two HBAs are operating independently (i.e. sessions do not span HBAs)
and therefore should have some form of differentiation: e.g. a different
iSCSI Initiator name, or they do need to have some form of co-operation at
the iSCSI layer(s)/configuration to ensure uniqueness of the ISID as you
have suggested.

Or:

The two HBAs are operating together (sessions can span HBAs) in which case
there is effectively only one iSCSI Layer and so the ISID will be different
when the initiator creates a new independent session, or the ISID and TSID
are the same and the initiator is creating a new connection within the same
session albeit on a different HBA.
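
For the independent-HBA case, one possible form of configuration-level
co-operation is to hand each HBA a disjoint slice of the ISID space.  The
sketch below is only an illustration: the isid_ranges() helper is
hypothetical and the 16-bit ISID width is an assumption, not something
stated in this thread:

```python
def isid_ranges(hba_count, isid_bits=16):
    """Split the ISID space into disjoint ranges, one per HBA.

    isid_bits=16 is an assumed field width used only for this example.
    """
    span = (1 << isid_bits) // hba_count
    return [range(i * span, (i + 1) * span) for i in range(hba_count)]


ranges = isid_ranges(2)
print(ranges)   # prints [range(0, 32768), range(32768, 65536)]
```

With the same Initiator Name on both HBAs, any ISID picked from one range
cannot collide with an ISID picked from the other.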

Matthew Burbridge
NIS-Bristol
Hewlett Packard
Telnet: 312 7010
E-mail: matthewb@bri.hp.com


> -----Original Message-----
> From: Mudaliar, Hari [mailto:Hari_Mudaliar@adaptec.com]
> Sent: 26 April 2001 00:57
> To: 'Santosh Rao'; Mudaliar, Hari
> Cc: IPS Reflector
> Subject: RE: iSCSI : target session login behaviour
> 
> 
> Santosh,
> 	I get your point. But what if there is more than one iSCSI Host bus
> adapter in a system? The Initiator Name will be the same and ISID MAY turn
> out to be the same (unless the ISIDs are apportioned between the initiators
> through some configuration method). This assumes that multiple sessions can
> exist between one initiator system (containing multiple iSCSI off-load
> engines/HBAs) and a target.
> 
> - Hari
> 
> -----Original Message-----
> From: Santosh Rao [mailto:santoshr@cup.hp.com]
> Sent: Wednesday, April 25, 2001 4:18 PM
> To: Mudaliar, Hari
> Cc: IPS Reflector
> Subject: Re: iSCSI : target session login behaviour
> 
> 
> "Mudaliar, Hari" wrote:
> 
> >         I am assuming that you are referring to the creation of a new
> > session with TSID=0 in your example below. Take the case of an initiator I1
> > who has established a session with a target with an ISID=ISID1. What if a
> > second initiator I2 tries to login to the same target with ISID1? The target
> > cannot decide to logout the first initiator (who already has a session
> > established with ISID1) as suggested by you.
> 
> Hari,
> 
> You may want to take a second look at my mail. It specifically refers to
> the problem in the context of a given (Initiator Name, ISID). Your
> example above does not fall under that category. A 2nd initiator using
> the same ISID would have a different Initiator Name. (a.k.a initiator
> WWUI).
> 
> The problem raised is in the context of an existing session for a given
> (Initiator Name, ISID). How does a target deal with a second session
> login received for the same (Initiator Name, ISID) with a NULL TSID ?
> 
> >         Also, depending on implementation, the target may realize that the
> > TCP connections for a session were lost (using Keep-Alives or iSCSI NOPs
> > etc.) when the initiator rebooted thus terminating the session. By the time
> > a new login from the same initiator is received, the old session info may
> > have been cleared.
> 
> Then again, it may not. There's 2 aspects to this issue :
> 1) Successful session re-logins from the rebooted host.
> 2) Garbage collection and cleanup of the old session resources.
> 
> (1) is a more serious issue, since the target MUST NOT reject the login
> based on a pre-existing active session for a given (Initiator Name,
> ISID).
> 
> (2) is handled through garbage collection algorithms, but implementation
> of the proposal would help accelerate the release of stale session
> resources.
> 
> - Santosh
> 
> 
> > 
> > -----Original Message-----
> > From: Santosh Rao [mailto:santoshr@cup.hp.com]
> > Sent: Wednesday, April 25, 2001 11:19 AM
> > To: IPS Reflector
> > Subject: iSCSI : target session login behaviour
> > 
> > All,
> > 
> > How should a target respond when it receives a session login  [on a new
> > TCP connection] with the same (ISID, Initiator Name) as a session
> > already active at the target.
> > 
> > Does such a login request imply :
> > 
> > 1) the target should perform implicit logout and re-login of the session
> > identified by (ISID, initiator name) ?
> > 
> > 2) Or does this result in the target responding to the session login
> > with :
> > a login response with status class of non-zero indicating target is
> > rejecting the login ?
> > 
> > [The draft does not describe target behaviour for this scenario.]
> > 
> > iSCSI session login semantics should explicitly state that the above
> > scenario will result in case (1) above. i.e. when a target sees a
> > session login for a given (ISID, initiator name), it MUST treat this as
> > an implicit logout of any previous session active at the target for that
> > (ISID, initiator name) and then, establish a new session.
> > 
> > This is required because the above scenario can typically occur when an
> > initiator reboots without having performed a session logout on all
> > active sessions.(system did not perform an orderly shutdown).
> > 
> > As a side note, the iSCSI draft Status Class/Codes could do with a misc
> > error category along the lines of the FC "No additional Explanation"
> > reason explanation. This would help deal with error conditions that don't
> > come under the listed category.
> > 
> > - Santosh
> 


From owner-ips@ece.cmu.edu  Thu Apr 26 11:13:08 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA07498
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 11:13:07 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QDKPk16486
	for ips-outgoing; Thu, 26 Apr 2001 09:20:25 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QDJOA16416
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 09:19:24 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 3DDCE94006
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 09:19:24 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI : target session login behaviour 
In-Reply-To: Message from Santosh Rao <santoshr@cup.hp.com> 
   of "Wed, 25 Apr 2001 11:19:01 PDT." <3AE71515.D8D13B88@cup.hp.com> 
References: <3AE71515.D8D13B88@cup.hp.com> 
Date: Thu, 26 Apr 2001 09:17:44 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010426131924.3DDCE94006@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Santosh,

> How should a target respond when it receives a session login  [on a new
> TCP connection] with the same (ISID, Initiator Name) as a session
> already active at the target.

Originally, I thought rejecting the login was the correct behavior,
and that's what I specified in the error handling pseudocode.  I
construed this as a consistency (logic) error in the target.  However,
on further thought, I've changed my mind.  I believe the correct
behavior is to perform an implicit logout (of all outstanding
connections in the session).  The target and initiator may have
different ideas about whether the connections are still live, and like
in FCP, performing an implicit logout solves this problem.  It also
provides a mechanism for rapid, proactive recovery of session
resources when possible, which is the right thing.
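
A minimal sketch of the implicit-logout behaviour, assuming a
hypothetical target-side session table keyed by (Initiator Name, ISID).
The Target class, its fields, and the TSID numbering are invented for
illustration and are not the error handling pseudocode mentioned above:

```python
class Target:
    """Hypothetical session table keyed by (initiator name, ISID)."""

    def __init__(self):
        self.sessions = {}     # (initiator_name, isid) -> TSID
        self.next_tsid = 1
        self.logged_out = []   # TSIDs released by implicit logout

    def login(self, initiator_name, isid, tsid=0):
        key = (initiator_name, isid)
        if tsid != 0:
            # Adding a connection to an existing session: the TSID must match.
            if self.sessions.get(key) != tsid:
                raise ValueError("no such session")
            return tsid
        if key in self.sessions:
            # New-session login for a key that is still active at the target:
            # treat it as an implicit logout of the old session.
            self.logged_out.append(self.sessions.pop(key))
        tsid = self.next_tsid
        self.next_tsid += 1
        self.sessions[key] = tsid
        return tsid


target = Target()
first = target.login("host-1", isid=5)    # fresh session
second = target.login("host-1", isid=5)   # e.g. after an unclean reboot
other = target.login("host-2", isid=5)    # same ISID, different initiator
print(first, second, other, target.logged_out)   # prints 1 2 3 [1]
```

The login from host-2 reuses the same ISID but does not disturb host-1's
session, since the key includes the Initiator Name.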

> iSCSI session login semantics should explicitly state that the above
> scenario will result in case (1) above.

I agree.  This and all other possible cases.  

> As a side note, the iSCSI draft Status Class/Codes could do with a misc
> error category along the lines of the FC "No additional Explanation"
> reason explanation. This would help deal with error conditions that don't
> come under the listed category.

Personally, I think we should add categories for reasons we obviously
see now, AND have a "no additional explanation" code.

One peculiarity with what you're talking about above is that it should
be a login response status code which expresses this rejection.  The
login response set does not seem to have an `invalid parameter'
response for cases when the request is somehow inconsistent.

Steph



From owner-ips@ece.cmu.edu  Thu Apr 26 11:13:25 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA07520
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 11:13:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QCROP13600
	for ips-outgoing; Thu, 26 Apr 2001 08:27:24 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QCR0A13564
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 08:27:00 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id IAA14449 for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 08:26:53 -0400 (EDT)
Message-ID: <3AE813A2.DE59B6E7@cisco.com>
Date: Thu, 26 Apr 2001 07:25:06 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: IPS <ips@ece.cmu.edu>
Subject: iSCSI MIB drawings
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian has placed the iSCSI MIB object model and table structure
drawings on his web site.  They are at:

http://www.haifa.il.ibm.com/satran/ips/Visio-ietf-iscsi-mib-structure-00.pdf

http://www.haifa.il.ibm.com/satran/ips/Visio-ietf-iscsi-uml-model-00.pdf

I just noticed that my pdf writer messed up the fonts on the
second page of the UML model.  Please ignore this page if it does
not show up correctly; the first page is the actual model.  I will
fix it in the meantime.

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Thu Apr 26 13:32:26 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA13493
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 13:32:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QElXo22120
	for ips-outgoing; Thu, 26 Apr 2001 10:47:33 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QEkkA22058
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 10:46:46 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id KAA53570
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 10:38:58 -0400
Received: from f3n42e (d03nm042h.boulder.ibm.com [9.99.140.42])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96.1.0) with ESMTP id IAA166416
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 08:46:13 -0600
Importance: Normal
Subject: RE: iSCSI Target Reset
To: "John Hufferd" <hufferd@us.ibm.com>
Cc: ips@ece.cmu.edu
From: "Jim Hafner" <hafner@almaden.ibm.com>
Date: Thu, 26 Apr 2001 07:46:12 -0700
Message-ID: <OF94ABC8C5.7C78BE1E-ON88256A3A.004F3248@LocalDomain>
X-MIMETrack: Serialize by Router on D03NM042/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/26/2001 07:46:12 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


John,

I'm really getting lost in this thread, but let me throw in my 2 lire.

Has anybody read the latest draft of SAM-2 rev 16? It says:
(a) Target Reset is no more than a set of Logical Unit Resets (see 6.6)
[Earlier drafts use the term "hard reset"-- this is no longer there.]
(b) Target Reset shall only affect the logical units to which the initiator
has access control rights (see 6.0) [this is exactly what Shark does,
though their "access controls" are pre-standard]

In short, I think we've spent a lot of time debating an issue that is no
longer a problem: the undesirable side effects of Target Reset just aren't
in the spec any more!

There *is* language that Logical Unit Reset "shall perform any additional
functions required by the applicable standards" (namely, the protocol
standards).

So, we
(a) can allow Target Reset in iSCSI without major concerns (but it would be
nice to "discourage its use" as mentioned in SAM-2, rev 16, 6.6)
(b) should state explicitly that Logical Unit Reset has no additional
effects in iSCSI beyond those specified in SAM-2, rev 16, 5.7.7.
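
To make the two points from rev 16 concrete, here is a minimal sketch
in Python.  The Target and LogicalUnit classes, the access map and the
initiator names are all hypothetical illustrations, not anything taken
from SAM-2 or the iSCSI draft.  It models Target Reset as nothing more
than a Logical Unit Reset of each LU the issuing initiator has access
rights to:

```python
from dataclasses import dataclass, field


@dataclass
class LogicalUnit:
    lun: int
    reset_count: int = 0
    tasks: list = field(default_factory=list)

    def reset(self):
        # Simplified Logical Unit Reset: drop outstanding tasks.
        self.tasks.clear()
        self.reset_count += 1


class Target:
    def __init__(self, luns, access):
        self.lus = {lun: LogicalUnit(lun) for lun in luns}
        # Hypothetical access-control map: initiator name -> set of LUNs.
        self.access = access

    def logical_unit_reset(self, lun):
        self.lus[lun].reset()

    def target_reset(self, initiator):
        # Target Reset == one Logical Unit Reset per LU this initiator
        # has access rights to; every other LU is left untouched.
        affected = sorted(self.access.get(initiator, set()))
        for lun in affected:
            self.logical_unit_reset(lun)
        return affected


if __name__ == "__main__":
    target = Target([0, 1, 2, 3], {"host-a": {0, 1}, "host-b": {1, 2}})
    print(target.target_reset("host-a"))
```

Note that an LU shared by two initiators (LUN 1 in the demo) is still
reset, which is the same caveat the Shark team gives for shared LUNs.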


Jim Hafner


John Hufferd/San Jose/IBM@IBMUS@ece.cmu.edu on 04-26-2001 01:24:17 AM

Sent by:  owner-ips@ece.cmu.edu


To:   Julian Satran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu
Subject:  RE: iSCSI Target Reset




I received the following statement from the IBM Shark development team.

" For Fibre Channel, Shark supports Target Reset and it is used by almost
all the hosts as well.

Shark supports Target Reset by resetting all the LUNs that are configured
to the host that issues the Target Reset.   The Target Reset, therefore,
will not affect the LUNs seen by other initiators, unless they are sharing
the same LUNs. "

Therefore, it probably makes some sense to state, in the draft, that this
kind of approach should be considered by iSCSI implementers.

I do not think this is a large-controller-only problem, since with iSCSI,
lots of different desktops and laptops will be getting at RAID arrays, which
might have previously been attached to only 2 hosts via SCSI buses or a few
more hosts with Fibre Channel.

IOW, Target Reset has a larger impact than before, even on Storage
controllers that are in the "Mid Range" (and lower), which previously may
not have even worried about the issue.

So the mention of an approach such as the above might be a useful note in
the Spec.

.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


Julian Satran/Haifa/IBM@IBMIL@ece.cmu.edu on 04/26/2001 12:41:04 AM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI Target Reset





I agree that this is an implementation decision.  However I wonder if it
wouldn't be fair to tell the guy trying to do the reset that, although
everything is fine, the reset was not performed - and do this in a way that
does not harm legacy initiators.

Julo

Black_David@emc.com on 24/04/2001 02:31:24

Please respond to Black_David@emc.com

To:    ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI Target Reset




I agree with Charles that this is an implementation
issue.  If a Shark wants to reset all 32 adapters
when it receives a Target Reset on one of them, that's
a Shark implementation decision.  It's completely valid
to reset only the adapter that the Target Reset is
received on (common Fibre Channel behavior) or
only the iSCSI target to which the Target Reset is
addressed if there's more than one Target behind
the adapter.

As for leaving things out of iSCSI - the default modus
operandi should be to put in everything that's described
in SAM2 unless we can convince T10 to take the feature
out of SAM2.  Let's not go deciding to cast things out
of SCSI on T10's behalf.

Thanks,
--David

> -----Original Message-----
> From:   Charles Monia [SMTP:cmonia@nishansystems.com]
> Sent:   Monday, April 23, 2001 7:12 PM
> To:     ips@ece.cmu.edu
> Subject:     RE: iSCSI Target Reset
>
> Hi:
>
> These seem to be implementation decisions. I don't see how that justifies
> removing support from the protocol.
>
> Charles
>
> > -----Original Message-----
> > From: John Hufferd [mailto:hufferd@us.ibm.com]
> > Sent: Monday, April 23, 2001 2:34 PM
> > To: Santosh Rao
> > Cc: ips@ece.cmu.edu
> > Subject: Re: iSCSI Target Reset
> >
> >
> >
> > Absolutely not,  Why would we think that impacting 32 different other
> > initiators is an OK thing to do.  By the way there are lots
> > more Initiators
> > possible with FC on Shark, and would hope that there would be
> > even more
> > with iSCSI.
> >
> > I have been told that these large Storage Controllers do not
> > support Target
> > Reset today.  So I see no loss in not supporting such an item in iSCSI
> > especially since many Initiators will be beyond even the distances and
> > mischief that is possible with FC.
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Internet address: hufferd@us.ibm.com
> >
> >
> > Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/23/2001
> > 01:24:02 PM
> >
> > Sent by:  owner-ips@ece.cmu.edu
> >
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  Re: iSCSI Target Reset
> >
> >
> >
> > "Dillard, David" wrote:
> > >
> > > When will STORPORT be generally available?  The latest
> > STORPORT document
> > > that I found on the MS web site is version 0.6a, dated
> > March 18, 2001.
> > > Given this it seems like STORPORT might not be available
> > soon.  In that
> > case
> > > do you know what happens with the current drivers?  Are we
> > going to be
> > > telling customers that if they want to use iSCSI and NT
> > clustering they
> > have
> > > to update to Whistler?
> >
> >
> > [One would hope that this list does not turn into a Microsoft
> > release/product discussion mailing list (?) ]
> >
> > Without going into specifics of A certain O.S., does it suffice to
> > require that iSCSI not break existing legacy SCSI applications ?
> >
> > If the above is a valid requirement, then, knowing that legacy
> > applications continue to use SCSI-2 Reserve/Release and the target
> > reset as a mechanism of breaking SCSI-2 reservations, shouldn't iSCSI
> > continue to support the target reset ?
> >
> > - Santosh
> >
> >
> >
> >

From owner-ips@ece.cmu.edu  Thu Apr 26 14:13:31 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id OAA14952
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 14:13:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QGHsY28120
	for ips-outgoing; Thu, 26 Apr 2001 12:17:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c017.sfo.cp.net (c017-h015.c017.sfo.cp.net [209.228.12.229])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3QGGnA28063
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 12:16:49 -0400 (EDT)
Received: (cpmta 1716 invoked from network); 26 Apr 2001 09:16:34 -0700
Received: from ras4-p29.rvt.netvision.net.il (HELO sangate.com) (62.0.182.158)
  by smtp.sangate.com (209.228.12.229) with SMTP; 26 Apr 2001 09:16:34 -0700
X-Sent: 26 Apr 2001 16:16:34 GMT
Message-ID: <3AE84A69.5E14BB85@sangate.com>
Date: Thu, 26 Apr 2001 18:18:49 +0200
From: Mark Mokryn <mark@sangate.com>
X-Mailer: Mozilla 4.75 [en] (Win95; U)
X-Accept-Language: en
MIME-Version: 1.0
To: John Hufferd <hufferd@us.ibm.com>
CC: ips@ece.cmu.edu
Subject: Re: iSCSI Target Reset
References: <OF6C1541B2.F0D079D0-ON88256A3A.002CDC1F@LocalDomain>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

The Shark approach cannot be easily adopted by many arrays, since only
some manufacturers implement ACLs at the array level. Others rely on
host software, intelligent switches, etc. for LUN masking and remapping.
For these types of arrays, a Target Reset is a headache.

John Hufferd wrote:
> 
> I received the following statement from the IBM Shark development team.
> 
> " For Fibre Channel, Shark supports Target Reset and it is used by almost
> all the hosts as well.
> 
> Shark supports Target Reset by resetting all the LUNs that are configured
> to the host that issues the Target Reset.   The Target Reset, therefore,
> will not affect the LUNs seen by other initiators, unless they are sharing
> the same LUNs. "
> 
> Therefore, it probably makes some sense to state, in the draft, that this
> kind of approach should be considered by iSCSI implementers.
> 
> I do not think this is a large-controller-only problem, since with iSCSI,
> lots of different desktops and laptops will be getting at RAID arrays, which
> might have previously been attached to only 2 hosts via SCSI buses or a few
> more hosts with Fibre Channel.
> 
> IOW, Target Reset has a larger impact than before, even on Storage
> controllers that are in the "Mid Range" (and lower), which previously may
> not have even worried about the issue.
> 
> So the mention of an approach such as the above might be a useful note in
> the Spec.
> 
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com


From owner-ips@ece.cmu.edu  Thu Apr 26 15:01:42 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16168
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 15:01:38 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QHuel04760
	for ips-outgoing; Thu, 26 Apr 2001 13:56:40 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QHttA04688
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 13:55:56 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 2A8F92035; Thu, 26 Apr 2001 10:55:55 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA00549;
	Thu, 26 Apr 2001 10:55:49 -0700 (PDT)
Message-ID: <3AE860F7.8763EE49@cup.hp.com>
Date: Thu, 26 Apr 2001 10:55:03 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: "'ips@ece.cmu.edu'" <ips@ece.cmu.edu>
Cc: Jim Hafner <hafner@almaden.ibm.com>
Subject: Re: iSCSI : target session login behaviour
References: <0B9A57FF1D57D411B47500D0B73E5CC101E7A699@dickens.bri.hp.com>
Content-Type: multipart/mixed;
 boundary="------------6D17BE3DE9531ACF775553BC"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------6D17BE3DE9531ACF775553BC
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

"BURBRIDGE,MATTHEW (HP-UnitedKingdom,ex2)" wrote:
> 
> Hari,
> 
> In response to your email
> 
> Either:
> 
> The two HBAs are operating independently (i.e. sessions do not span HBAs)
> and therefore should have some form of differentiation: e.g. a different
> iSCSI Initiator name, 


One would hope that the misuse seen with the FC Node WWN is not repeated
with the Initiator Name (WWUI). The Initiator & Target Names (WWUI) are
explicitly defined in the name-disc draft as :

"The terms "initiator name" and "target name", when used in this
document, refer to iSCSI Node Names."

<snip snip>

"The Initiator Name corresponds to the logical operating system on which
the initiator is running, and the Target Name corresponds to the target
Storage Node entity."

<snip snip>

"An iSCSI Name really names a logical software entity, and is not tied
to a port or other hardware that can be changed.  For instance, an
Initiator Name should name the iSCSI initiator driver, and not a
particular NIC or HBA card.  When multiple NICs are used, they should
generally all present the same iSCSI Initiator Name to the targets,
since they are really to the same entity.  In most operating systems,
the named entity is the operating system image.  Most hosts will have a
single OS running; some of the really big ones could have multiples. 

A target name should similarly not be tied to hardware interfaces that
can be changed.  A Target Name should identify the logical target, and
must be the same for the target regardless of the physical port on which
it is addressed.  This gives iSCSI initiators an easy way to determine
that two targets it has discovered are really two paths to the same
target." 

The wording above should be strengthened from "should" to "MUST".
Failure to do so could result in the same misuse as has occurred with
FC Node WWNs.

Ex : 
"For instance, an Initiator Name MUST name the iSCSI initiator driver,
and not a particular NIC or HBA card."

"When multiple NICs are used, they MUST present the same iSCSI Initiator
Name to the targets, since they are really to the same entity."
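
As a rough illustration of the MUST wording, here is a small Python
sketch.  The InitiatorNode class, the node name string and the NIC
labels are hypothetical, and the simple ISID counter is only one
assumed way of apportioning ISIDs; nothing here comes from the
name-disc draft.  One Initiator Name belongs to the OS image, every
NIC/HBA presents that same name, and sessions are told apart by an
ISID handed out centrally by the node rather than by each HBA:

```python
class InitiatorNode:
    """One iSCSI Node Name per OS image, shared by every NIC/HBA."""

    def __init__(self, node_name, nics):
        self.node_name = node_name
        self.nics = list(nics)
        # Hypothetical central allocator, so two HBAs never pick the
        # same ISID on their own.
        self._next_isid = 0

    def open_session(self, nic):
        if nic not in self.nics:
            raise ValueError(f"unknown NIC: {nic}")
        isid = self._next_isid
        self._next_isid += 1
        # The name comes from the node, never from the NIC/HBA.
        return {"initiator_name": self.node_name, "isid": isid, "nic": nic}


if __name__ == "__main__":
    node = InitiatorNode("example-host-os-image", ["nic0", "nic1"])
    for nic in node.nics:
        print(node.open_session(nic))
```

With this arrangement two sessions from different HBAs carry the same
Initiator Name but different ISIDs, so the target can still tell the
sessions apart.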

- Santosh

> > -----Original Message-----
> > From: Mudaliar, Hari [mailto:Hari_Mudaliar@adaptec.com]
> > Sent: 26 April 2001 00:57
> > To: 'Santosh Rao'; Mudaliar, Hari
> > Cc: IPS Reflector
> > Subject: RE: iSCSI : target session login behaviour
> >
> >
> > Santosh,
> >       I get your point. But what if there is more than one
> > iSCSI Host bus
> > adapter in a system? The Initiator Name will be the same and
> > ISID MAY turn
> > out to be the same (unless the ISIDs are apportioned between
> > the initiators
> > through some configuration method). This assumes that
> > multiple sessions can
> > exist between one initiator system (containing multiple iSCSI off-load
> > engines/HBAs) and a target.
> >
> > - Hari
--------------6D17BE3DE9531ACF775553BC
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------6D17BE3DE9531ACF775553BC--



From owner-ips@ece.cmu.edu  Thu Apr 26 15:02:16 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16197
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 15:02:15 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QHU8C02971
	for ips-outgoing; Thu, 26 Apr 2001 13:30:08 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QHSZA02906
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 13:28:35 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel2.hp.com (Postfix) with ESMTP
	id 4A86F1AB8; Thu, 26 Apr 2001 10:28:34 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA26695;
	Thu, 26 Apr 2001 10:28:19 -0700 (PDT)
Message-ID: <3AE85A86.95BA2615@cup.hp.com>
Date: Thu, 26 Apr 2001 10:27:34 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
Cc: Black_David@emc.com, ips@ece.cmu.edu
Subject: Re: iSCSI : EnableACA
References: <0F31E5C394DAD311B60C00E029101A07080154BD@corpmx9.isus.emc.com>
Content-Type: multipart/mixed;
 boundary="------------095B46DC4D8BA5CC52FF61BC"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------095B46DC4D8BA5CC52FF61BC
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Black_David@emc.com wrote:
> 
> > What you are suggesting is that in all cases in which iSCSI would require
> > the target to enter ACA (after a LU reset or a target reset) look forward
> > in the queue and enter ACA only if it finds a CDB marked NACA? (and that
> > includes commands in flight).

Julian,

I am not suggesting any such thing. ACA is only established when the
logical unit completes a command [that had the NACA bit set in its
control byte in the CDB] with a CHECK CONDITION status.

ACA is not established after a LU Reset or Target Reset. On the
contrary, a target reset or LU Reset would clear any existing CAC or
ACA.
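
A minimal Python sketch of the rule as I read it follows.  The
LogicalUnitState class is hypothetical, and the sketch assumes a
fixed-length CDB, where the control byte is the last byte; NACA is
bit 2 of the control byte and CHECK CONDITION is status 02h:

```python
NACA_BIT = 0x04         # bit 2 of the CDB control byte
GOOD = 0x00
CHECK_CONDITION = 0x02


def naca_set(cdb: bytes) -> bool:
    # Assumes a fixed-length CDB: the control byte is the last byte.
    return bool(cdb[-1] & NACA_BIT)


class LogicalUnitState:
    def __init__(self):
        self.aca_active = False

    def complete(self, cdb: bytes, status: int) -> bool:
        # ACA is established only when a command whose CDB had NACA set
        # completes with CHECK CONDITION status.
        if status == CHECK_CONDITION and naca_set(cdb):
            self.aca_active = True
        return self.aca_active

    def reset(self):
        # LU Reset or Target Reset clears any existing ACA; it never
        # establishes one.
        self.aca_active = False


if __name__ == "__main__":
    lu = LogicalUnitState()
    read10_naca = bytes([0x28] + [0] * 8 + [NACA_BIT])
    print(lu.complete(read10_naca, CHECK_CONDITION))
    lu.reset()
    print(lu.aca_active)
```

The point of the sketch is that the only path into ACA runs through a
NACA-marked command failing; no reset path sets it.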

- Santosh


> 
> Ok, so why is iSCSI requiring ACA in addition to Unit
> Attention for Clear Task Set, LU Reset and Target Reset?
> In all cases, we're dealing with Initiators whose tasks
> are cleared as a consequence of another Initiator
> issuing the appropriate task management command.
> SAM2 only requires Unit Attention.
> 
> --David
> 
> > -----Original Message-----
> > From: julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> > Sent: Thursday, April 26, 2001 2:22 AM
> > To:   ips@ece.cmu.edu
> > Subject:      Re: iSCSI : EnableACA
> >
> >
> >
> > Santosh,
> >
> > How about a third confusing response?
> >
> > We have introduced the EnableACA (per LU) to enable an initiator unwilling
> > to use it to disable it at the target for this specific Initiator.
> >
> > NormACA - the INQUIRY bit - indicates support for ACA by the target device
> > server but is a "read-only" bit.
> >
> > What you are suggesting is that in all cases in which iSCSI would require
> > the target to enter ACA (after a LU reset or a target reset) look forward
> > in the queue and enter ACA only if it finds a CDB marked NACA? (and that
> > includes commands in flight).
> >
> > Or to enter ACA only when it finds such a command (sort of "soft ACA")?
> >
> > Both of them sound wrong and complex.
> >
> > Julo
> >
> > Santosh Rao <santoshr@cup.hp.com> on 20/04/2001 22:11:12
> >
> > Please respond to Santosh Rao <santoshr@cup.hp.com>
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  iSCSI : EnableACA
> >
> >
> >
> >
> > Julian,
> >
> > Would the following not satisfy the requirements for dealing with this
> > ACA issue :
> >
> > 1) Initiators determine the target support for ACA through the NACA bit
> > in the INQUIRY response. (Assuming iSCSI targets have implemented ACA in
> > good faith, this would be supported.)
> >
> > 2) Initiators set the NACA bit in the CDBs of commands that need strong
> > ordering. (This could be a small subset of the I/O traffic to one or
> > more LUNs within the session and not required for all the I/Os in that
> > session.)
> >
> > 3) Any exception condition on a SCSI I/O, for which the NACA bit was set
> > results in ACA being established.
> > Thus, ACA would only be applied if some I/O traffic that required strong
> > ordering was affected by the exception condition.
> >
> > 4) Since the initiator is ACA capable based on its usage of the NACA
> > bit, it should also be capable of performing the desired Clear ACA to
> > recover from this condition.
> >
> > Such an approach would only apply ACA and its corresponding recovery
> > when some strongly ordered I/O encountered an exception condition,
> > rather than applying ACA on a session granularity.
> >
> > To summarize, the above approach allows :
> > - ACA to be turned on/off for a subset of I/Os headed to a LUN
> > - ACA based recovery only used where needed.
> > - Keeps iSCSI ACA un-aware and rightly so, since this is a property of
> > the SCSI ULP.
> > - Avoids applying ACA recovery on a session granularity.
> >
> > What am I missing here (?). Why is an EnableACA needed ?
> >
> > - Santosh
> >
> >
> > julian_satran@il.ibm.com wrote:
> >
> > > All references to
> > > EnableACA are redundant and should be removed for the following reasons
> > > :
> > >
> > > a) An initiator knows whether a target supports ACA from the NACA bit in
> > > the INQUIRY response. When a target indicates support for ACA, the
> > > initiator can use it by setting the NACA bit in the CDBs it sends. There
> > > is NO need for any sort of negotiation of this behaviour above and
> > > beyond what is already provided thru SCSI mechanisms.
> > >
> > > b) The ACA is a SCSI ULP concept and iSCSI should not be negotiating its
> > > use or lack thereof. This is done thru the NACA bit in CDBs.
> > >
> > > c) (As a side note, the description of EnableACA on pg 127 refers to its
> > > presence in the lun control mode page, but it is actually present in the
> > > protocol specific port page.)
> > >
> > > d) ACA is a LUN-level (more an I/O level) control option. It MUST NOT be
> > > negotiated on a per-session basis. SCSI allows initiators to request ACA
> > > behaviour on a per I/O basis through the use of NACA bit in the CDBs.
> > >
> >
> >  +++ We have required ACA to be supported by all new iSCSI targets and
> >  several actions require the target to enter ACA state.
> >  It was brought to our attention that many initiators will not react
> >  properly to a target entering ACA state (not do the reset).
> >  The EnableACA bit and key are meant to enable an initiator to control
> >  this iSCSI specific ACA behaviour.  This behaviour is related to
> >  asynchronous events and is not controlled by the NACA CDB bit.
> >
> >  ++++
> >  - santoshr.vcf
> >
> >
--------------095B46DC4D8BA5CC52FF61BC--



From owner-ips@ece.cmu.edu  Thu Apr 26 15:02:24 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16216
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 15:02:23 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QHjcN04030
	for ips-outgoing; Thu, 26 Apr 2001 13:45:38 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QHigA03926
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 13:44:42 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel2.hp.com (Postfix) with ESMTP id 9E28B1CA3
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 10:44:41 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA28495
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 10:44:37 -0700 (PDT)
Message-ID: <3AE85E57.6AECB1DA@cup.hp.com>
Date: Thu, 26 Apr 2001 10:43:51 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI: Re: iSCSI & Linked Commands
References: <NEBBJGDMMLHHCIKHGBEJAEOCCGAA.dotis@sanlight.net>
Content-Type: multipart/mixed;
 boundary="------------2F3016215788DE2316D5D6BC"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------2F3016215788DE2316D5D6BC
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


This thread on linked commands has no relevance to iSCSI any longer, as
has been mentioned several times by now.  

- Santosh


Douglas Otis wrote:
> 
> Robert,
> 
> Relative addressing is not defined because that is the only means of
> addressing.  Relative to the last block.
> 
> Doug
> 
> > Doug,
> >
> > Relative addressing is not defined in the SSC command set nor
> > in the SPC command set for tapes.
> >
> > Bob
> >
> > >  -----Original Message-----
> > >  From: Douglas Otis [mailto:dotis@sanlight.net]
> > >  Sent: Monday, April 23, 2001 9:36 AM
> > >  To: Stephen Bailey; ips@ece.cmu.edu
> > >  Subject: RE: iSCSI: Re: iSCSI & Linked Commands
> > >
> > >
> > >  Stephen,
> > >
> > >  Unlike random access devices, sequential access devices operate with
> > >  relative addressing.  For random access devices, this is a
> > >  seldom used
> > >  option.  There is a requirement to bind commands together to
> > >  ensure order of
> > >  execution on these devices.  By popular, you mean not sequential?
> > >
> > >  Doug
> > >
> > >
> > >  > Julian,
> > >  >
> > >  > > According to your logic no FCP implementation can use linked
> > >  > > commands?  Is this true for all OS's?  Is it a verified fact or
> > >  > > folklore?
> > >  >
> > >  > In my experience it's fact.  I have never used a SCSI
> > >  stack which both
> > >  > supported AND used linked commands.  Like some others
> > >  here, I always
> > >  > assumed AIX might :^) Ralph has pointed out that T10 is well aware
> > >  > that the feature is not popular.  There are other ways of
> > >  > accomplishing the same thing that are less likely to blow
> > >  up in your
> > >  > face.
> > >  >
> > >  > > Is it so also for the new MS StorPort driver?
> > >  >
> > >  > I don't know, but I'd be really surprised if they did use linked
> > >  > commands.  You have to be pretty nuts to rely on a feature
> > >  that's not
> > >  > even exercised by most SCSI implementations.
> > >  >
> > >  > Steph
> > >  >
> > >
> > >
> >
--------------2F3016215788DE2316D5D6BC
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------2F3016215788DE2316D5D6BC--



From owner-ips@ece.cmu.edu  Thu Apr 26 15:06:07 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id PAA16286
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 15:06:02 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QHgcG03830
	for ips-outgoing; Thu, 26 Apr 2001 13:42:38 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QHflA03784
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 13:41:47 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 63874207A; Thu, 26 Apr 2001 10:41:46 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA28302;
	Thu, 26 Apr 2001 10:41:37 -0700 (PDT)
Message-ID: <3AE85DA4.570F800D@cup.hp.com>
Date: Thu, 26 Apr 2001 10:40:52 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Jim Hafner <hafner@almaden.ibm.com>
Cc: ips@ece.cmu.edu, Julian Satran <julian_satran@il.ibm.com>
Subject: Re: iSCSI: multiple sessions b/n a pair of WWUIs.
References: <OF7A4FB579.2EDCC70B-ON88256A3A.00111074@almaden.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------1B2E82145CD8AAE33C8D7022"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------1B2E82145CD8AAE33C8D7022
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Jim,

Thanks for the clarification. Both the iSCSI and the name-disc drafts
need to explicitly state that ISID is uniquely assigned for all sessions
within a given initiator. Similarly, TSID is uniquely assigned for all
sessions within a given target.
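
A minimal sketch of the uniqueness rule, assuming the target keys each
session on the (initiator name, ISID) pair and hands out its own TSIDs.
The class and method names are hypothetical and appear in neither draft,
and rejecting a duplicate login outright is an assumption of this sketch.

```python
class SessionTable:
    """Target-side view: at most one live session per (initiator name, ISID)."""

    def __init__(self):
        self._sessions = {}   # (initiator_name, isid) -> tsid
        self._next_tsid = 1   # TSIDs are assigned uniquely by the target

    def login(self, initiator_name, isid):
        key = (initiator_name, isid)
        if key in self._sessions:
            # Assumption: a second login reusing a live Name+ISID is refused.
            raise ValueError("ISID already in use by this initiator")
        tsid = self._next_tsid
        self._next_tsid += 1
        self._sessions[key] = tsid
        return tsid

    def logout(self, initiator_name, isid):
        # After logout the same Name+ISID may log in again, which is the
        # context a target would need to restore nexus state.
        self._sessions.pop((initiator_name, isid), None)
```

Under this model an initiator opens several sessions to the same target
by using a different ISID for each, and each session gets a distinct TSID.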

Regards,
Santosh

Jim Hafner wrote:
> 
> Santosh,
> 
> There may be lack of synchronization between the two drafts (not unexpected
> since they are being worked on in parallel).
> 
> The requirement in name+disc that a given initiator name cannot reuse an
> ISID for two different sessions comes as a consequence of a number of things
> (which are described in the draft).  The gist is that this is needed to
> provide the correct context for restoration of reservations state (and
> other nexus state) to a particular nexus after logout/login. In other
> words, if the session goes down for some reason, the target needs clear
> context to restore nexus state to a rebuilt session. The only tool it has
> is uniqueness of Name+ISID combination within its name space.
> 
> There was a similar requirement in draft...05.txt in 2.11.5 (though that
> was ambiguous about whether ISIDs are unique within a target across all
> initiator names or just with respect to a given initiator name).
> Apparently that requirement is now gone from draft...06.txt!
> 
> The requirement forces carving up ISID namespaces between iSCSI adapters
> (session managers) in a given node.  They each get their name and a piece
> of the ISID space from configuration information.  You mentioned this
> yourself in a related reply in this thread to Hari.
> 
> So, we get multiple sessions between nodes (named things), single sessions
> between ports (one per ISID+TSID pair) and a framework for restoring
> necessary nexus state (uniqueness of Name+ISID at the target -- no reuse of
> ISID).   (I think Julo's comment is that he has had time to keep up with
> the formalization done in N&DT within the last week.)
> 
> I think/hope this has everything needed.
> 
> [But I have to admit, this is sort of a hack to get the iSCSI constructs to
> shoe-horn into the SAM constructs.  All of this would have been a lot
> easier if we hadn't gone to multiple connections per session. iSCSI is the
> first protocol to allow for such constructs.]
> 
> Jim Hafner
> 
> Santosh Rao <santoshr@cup.hp.com>@ece.cmu.edu on 04/25/2001 04:34:39 PM
> 
> Sent by:  owner-ips@ece.cmu.edu
> 
> To:   Julian Satran/Haifa/IBM@IBMIL, IPS Reflector <ips@ece.cmu.edu>
> cc:
> Subject:  iSCSI: multiple sessions b/n a pair of WWUIs.
> 
> > To: ips@ece.cmu.edu
> > Subject: Re: iSCSI: session login and ISID
> > From: julian_satran@il.ibm.com
> > Date: Tue, 10 Apr 2001 14:21:49 +0300
> 
> > WWUI can be presented during login phase (2.10.9 is correct and in-line
> > with 1.2.7). Two sessions can have the same ISID but will have different
> > TSID. The question of whether more than one session should be allowed
> > between a pair of WWUIs is under debate.
> 
> > Julo
> 
> Julian,
> 
> There seems to be some disconnect between your comments above and the
> name-disc draft. As per the name-disc draft Section 2(d) :
> 
> "There can be only one iSCSI  session with a given ISID between an iSCSI
> Initiator Node and an iSCSI Target Node."
> 
> The iSCSI [&name-disc] drafts should explicitly state that ISID is
> uniquely assigned for a given initiator. Similarly, the TSID is uniquely
> assigned for a given target.
> 
> On the subject of multiple sessions for a given pair of WWUIs, this MUST
> be a requirement. iSCSI must allow multiple sessions for a given pair of
> WWUIs.
> 
> This is required because single-connection session models would like to
> setup multiple sessions b/n initiator hosts and multi-ported targets and
> export the multiple paths to LUs to upper layer wedge drivers like EMC
> Powerpath, Veritas VxVm, etc.
> 
> Inability to establish multiple sessions b/n a pair of WWUIs implies
> iSCSI layer will only export one path to the upper layer wedge drivers,
> thereby, breaking such applications.
> 
> This also implies iSCSI would then take on all the responsibilities of
> providing load balancing and fail-over capabilities and would require
> the use of multi-connection sessions for that purpose.
> 
> By allowing multiple sessions for a given WWUI pair, iSCSI layer could
> achieve equivalent functionality using single connection sessions and
> would also not break existing wedge drivers.
> 
> Regards,
> Santosh
--------------1B2E82145CD8AAE33C8D7022--



From owner-ips@ece.cmu.edu  Thu Apr 26 16:08:49 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA18021
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 16:08:48 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QHxbR04895
	for ips-outgoing; Thu, 26 Apr 2001 13:59:37 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from ntmail.qlc.com (216-231-2-8.qlc.com [216.231.2.8])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QHwaA04834
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 13:58:44 -0400 (EDT)
Received: by ntmail.qlc.com with Internet Mail Service (5.5.2653.19)
	id <27568LKS>; Thu, 26 Apr 2001 10:57:55 -0700
Message-ID: <408738A76826D5119E7D0003470D72E437E9A9@ntmail.qlc.com>
From: Chuck Micalizzi <chuck.micalizzi@qlogic.com>
To: "'julian_satran@il.ibm.com'" <julian_satran@il.ibm.com>
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI-06 SCSI Cmd typo
Date: Thu, 26 Apr 2001 10:57:55 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julian,

	I'm confused as to what field was removed in order to make
	the CDB field 16 bytes and locate the CmdSN at offset 24.
	Can you send the correct layout of the SCSI Command PDU
	when you get time?

Thank You

chuck micalizzi

-----Original Message-----
From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
Sent: Tuesday, April 24, 2001 2:18 AM
To: dfsmith@almaden.ibm.com
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI-06 SCSI Cmd typo




Thanks Daniel,

It's fixed. CmdSN is at 24 and  the following are shifted.

Julo

Daniel Smith <dfsmith@almaden.ibm.com> on 24/04/2001 03:16:31

Please respond to Daniel Smith <dfsmith@almaden.ibm.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:
Subject:  iSCSI-06 SCSI Cmd typo




In section 2.3 SCSI Command...

the table shows bytes 36--47 reserved for the CDB (12 bytes).
But
the description 2.3.6 says it's 16 bytes.

(I'd prefer 16 bytes---I'm not a big fan of Stat/DataSN.)

This document is getting big---but the latest version seems to be holding
up
well as I read through it.  Good work.

Daniel
--
IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120-6099, USA
K65B/C2 Phone: +1(408)927-2072 Fax: +1(408)927-3010 Home: +1(408)227-5786




From owner-ips@ece.cmu.edu  Thu Apr 26 16:15:43 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA18395
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 16:15:43 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QJ2d908913
	for ips-outgoing; Thu, 26 Apr 2001 15:02:39 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dirty.research.bell-labs.com (dirty.research.bell-labs.com [204.178.16.6])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3QJ1uA08892
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 15:01:56 -0400 (EDT)
Received: from scummy.research.bell-labs.com ([135.104.2.10]) by dirty; Thu Apr 26 15:01:14 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by scummy; Thu Apr 26 15:01:15 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id PAA03256
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 15:01:07 -0400 (EDT)
Message-ID: <3AE87073.9D5F91A2@research.bell-labs.com>
Date: Thu, 26 Apr 2001 15:01:07 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: RE: iSCSI-06 SCSI Cmd typo
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


I don't believe any field was removed.  The format is fine and
can accommodate all fields.  The byte position for Initiator task
tag was mentioned as 24 instead of 20 in 05 & 06.  That caused
the initial comment.

Read as =>
"initiator task tag" at 16.
"expected data length" at 20.
"cmdSN" at 24.
"expStatSN" at 28.
"CDB" at 32 and up to 48.

-Sandeep
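
The offsets listed above can be checked mechanically. The following is a
minimal sketch, assuming network (big-endian) byte order and treating
bytes 0-15, which the message does not describe, as an opaque prefix.
The field names and the helper are illustrative only and are not taken
from the draft.

```python
import struct

# Layout as listed in the message. The 16-byte "prefix" stands in for
# bytes 0-15, whose contents are not given there (an assumption).
FIELDS = [
    ("prefix", "16s"),
    ("initiator_task_tag", "I"),
    ("expected_data_length", "I"),
    ("cmd_sn", "I"),
    ("exp_stat_sn", "I"),
    ("cdb", "16s"),
]

# ">" selects big-endian with no alignment padding between fields.
HEADER = struct.Struct(">" + "".join(fmt for _, fmt in FIELDS))


def field_offsets():
    """Return the byte offset of each field by summing the sizes before it."""
    offsets, pos = {}, 0
    for name, fmt in FIELDS:
        offsets[name] = pos
        pos += struct.calcsize(">" + fmt)
    return offsets
```

With a 16-byte CDB starting at offset 32, the header ends at byte 48, so
no field has to be removed to make room.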

> Julian,
>
>   I'm confused as to what field was removed in order to make
>   the CDB field 16 bytes and locate the CmdSN at offset 24.
>   Can you send the correct layout of the SCSI Command PDU
>   when you get time?
>
> Thank You
>
> chuck micalizzi
>
> -----Original Message-----
> From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> Sent: Tuesday, April 24, 2001 2:18 AM
> To: dfsmith@almaden.ibm.com
> Cc: ips@ece.cmu.edu
> Subject: Re: iSCSI-06 SCSI Cmd typo
>
> Thanks Daniel,
>
> It's fixed. CmdSN is at 24 and  the following are shifted.
>
> Daniel Smith <dfsmith@almaden.ibm.com> on 24/04/2001 03:16:31
>
> Please respond to Daniel Smith <dfsmith@almaden.ibm.com>
>
> To:	Julian Satran/Haifa/IBM@IBMIL
> cc:
> Subject:  iSCSI-06 SCSI Cmd typo
>
>
> In section 2.3 SCSI Command...
>
> the table shows bytes 36--47 reserved for the CDB (12 bytes).
> But
> the description 2.3.6 says it's 16 bytes.
>
> (I'd prefer 16 bytes---I'm not a big fan of Stat/DataSN.)
>
> This document is getting big---but the latest version seems to be holding
> up
> well as I read through it.  Good work.
>
> Daniel
> --
> IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120-6099, USA
> K65B/C2 Phone: +1(408)927-2072 Fax: +1(408)927-3010 Home: +1(408)227-5786


From owner-ips@ece.cmu.edu  Thu Apr 26 16:51:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA21189
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 16:51:35 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QJHll09869
	for ips-outgoing; Thu, 26 Apr 2001 15:17:47 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sj-msg-core-1.cisco.com (sj-msg-core-1.cisco.com [171.71.163.11])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3Q8SoA20734
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 04:28:50 -0400 (EDT)
Received: from mira-sjc5-2.cisco.com (mira-sjc5-2.cisco.com [171.71.163.16])
	by sj-msg-core-1.cisco.com (8.11.3/8.9.1) with ESMTP id f3Q8OFL29483;
	Thu, 26 Apr 2001 01:24:15 -0700 (PDT)
Received: from cisco.com (ssh-sj1.cisco.com [171.68.225.134])
	by mira-sjc5-2.cisco.com (Mirapoint)
	with ESMTP id AED16976 (AUTH rrs);
	Thu, 26 Apr 2001 01:24:06 -0700 (PDT)
Message-ID: <3AE7DB24.7AECE222@cisco.com>
Date: Thu, 26 Apr 2001 03:24:04 -0500
From: Randall Stewart <rrs@cisco.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.12 i386)
X-Accept-Language: en
MIME-Version: 1.0
To: tsvwg@ietf.org
CC: Chip Sharp <chsharp@cisco.com>, ips@ece.cmu.edu,
        Craig Partridge <craig@aland.bbn.com>,
        Jonathan Stone <jonathan@dsg.stanford.edu>
Subject: Re: [Tsvwg] [SCTP checksum problems]
References: <499DC368E25AD411B3F100902740AD65BC5AC0@xrose03.rose.hp.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Jim:

I am glad you copied me on this. Since I am at the bakeoff, and with the
recent email/site problems I have had, I will not get IETF mail until
next week :0

Now, some comments...


My major concern is that SCTP's checksum is weaker than TCP's. What the
upper layers do to defend against middle boxes etc. they will need to
do anyway. I have just run a few numbers using the sctp_test_app in
the reference implementation.

I did the following:

Compiled the normal ref-imp with -O -pg

Started two endpoints on the same machine (FreeBSD, Intel, Sony Vaio
PCG-Z505JS); this of course has my SCTP kernel patches applied ...

Now I setup an association between two endpoints and did

bulk:100:0:1000000

This transfers 1 Million 100 byte packets holding ascii data from one
endpoint to the other. 

I then captured the gprof information for this run.

I did the same exact test after changing the checksum to crc32 and the
modified adler32 (16 bit sums).

My results (not meant to show strength in catching errors but instead
performance of a software version of the sum) are as follows:

Adler 32...

Sender side and receiver side: average time spent in the checksum
routine per call was 121 nanoseconds.

Adler 32 Modified

Sender side and receiver side: average time spent in the checksum
routine per call was 90 nanoseconds.

CRC32

Sender side: average time spent in the checksum calculation per packet
was 5.3 microseconds.

Receiver side: average time spent in the checksum calculation per packet
was 5.9 microseconds.

I believe the differences seen in the CRC32 can be attributed to how
lucky the respective application is in finding the index table (the
ssh_crc32 found in FreeBSD) in the processor cache. If I run the CRC
comparison with just random data in a stand-alone program (very
unrealistic), crc32 outperforms all others, but this is because the
table gets completely pre-fetched into cache and is never pulled from
main memory... (I started here and then realized the only way to get
good information is to do it in a real implementation where a lot of
other code would run between crc calls).

So on that note... I vote STRONGLY for Modified Adler32... i.e. same as
regular adler but make the quantities added be 16 bit sums...

I think this will take care of the critical problem i.e. the weakness in
the SCTP checksum for small packets... Jonathan/Craig any comments or
questions...
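
A minimal sketch of the small-packet weakness and of one possible reading
of the modification. The standard Adler-32 routine below follows the
usual definition (two running sums modulo 65521). The 16-bit-word variant
is only an interpretation of "make the quantities added be 16 bit sums";
its byte order and zero padding are assumptions, and it is not the code
from the reference implementation.

```python
MOD = 65521  # largest prime below 2**16, the Adler-32 modulus


def adler32(data: bytes) -> int:
    """Standard Adler-32: add one byte at a time."""
    s1, s2 = 1, 0
    for byte in data:
        s1 = (s1 + byte) % MOD
        s2 = (s2 + s1) % MOD
    return (s2 << 16) | s1


def adler32_16bit_words(data: bytes) -> int:
    """Adler-style sum over big-endian 16-bit words (hypothetical variant)."""
    if len(data) % 2:
        data += b"\x00"          # assumption: pad odd lengths with a zero byte
    s1, s2 = 1, 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]
        s1 = (s1 + word) % MOD
        s2 = (s2 + s1) % MOD
    return (s2 << 16) | s1


def low_half_ceiling(length: int, max_byte: int = 127) -> int:
    """Largest value the low 16-bit sum of standard Adler-32 can reach
    for `length` bytes that are each at most `max_byte`."""
    return 1 + length * max_byte
```

For a 100 byte ASCII payload the low sum of the byte-wise version cannot
exceed 1 + 100 * 127 = 12701, so it never wraps modulo 65521 and a large
part of the 16-bit range goes unused. Adding 16-bit words lets the sum
wrap after only a few words.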

And yes... I am working on getting things run on a sparc box but the
only one we have here is a sparc20... so the numbers definitely can NOT
be compared to anything else...

R




"WENDT,JIM (HP-Roseville,ex1)" wrote:
> 
> I think this "SCTP checksum" thread spanning IPS and TSVWG was for
> discussion around whether or not iSCSI (running over SCTP) could forgo data
> integrity checking and transport-like functionality (retransmission, ack,
> etc) should SCTP provide a sufficiently strong check-code.
> If iSCSI were willing to completely trust SCTP end-to-end across a network
> fabric (including "middleboxes"), then that provides one reason for SCTP to
> adopt a stronger checksum or CRC.
> If iSCSI will still implement its own data integrity check-code above SCTP,
> then SCTP needs to make an independent decision on whether its current
> check-code is sufficiently strong for its target uses.
> Currently, iSCSI contains a data integrity check "digest" that can be
> negotiated end-to-end to be disabled on a per-connection basis.
> 
> This discussion begs a few questions:
> - Are there clearly different classes of applications (in regards to their
> end-to-end data integrity strength needs)?
> - How are these application classes' end-to-end data integrity needs met in
> the future?  Is it SCTP, IPSec, application-specific protocol, a new
> protocol?
> - Is there a general need for strong end-to-end data integrity that could be
> provided for in a recommended generic manner?
> - Is iSCSI unique in being an "ultra-low error rate application" and should
> iSCSI then handle its own data integrity?
> - Should SCTP strengthen its checksum to meet the needs of a general class
> of data-critical applications, and/or provide a means for negotiating an
> optional stronger checksum?
> - What is the role of network infrastructure (router/middlebox hardware and
> software) in strengthening end-to-end data integrity?
> 
> Data integrity for iSCSI over TCP is a separate issue. It is unlikely that
> we will be able to evolve TCP in a timely manner to utilize a stronger
> check-code given TCP's current wide scale deployment (although adding a
> stronger checksum/CRC to TCP would seem to be the best solution). So,
> something else has to be done either above or below TCP to provide the
> required level of iSCSI data integrity. Of course, if TCP's data integrity
> deficiency is impacting other data-critical applications, then it seems
> prudent to at least consider solving the problem generically.
> 
> Jim
> 
> > -----Original Message-----
> > From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> > Sent: Friday, April 20, 2001 1:02 AM
> > To: Chip Sharp
> > Cc: vince_cavanna@agilent.com; steph@cs.uchicago.edu; WENDT,JIM
> > (HP-Roseville,ex1); ips@ece.cmu.edu; tsvwg@ietf.org;
> > craig@aland.bbn.com; Jonathan.Wood@sun.com; xieqb@cig.mot.com;
> > jonathan@dsg.stanford.edu; rrs@cisco.com
> > Subject: RE: [Tsvwg] [SCTP checksum problems]
> >
> >
> >
> >
> > Chip,
> >
> > CRCs are not meant to protect against malicious middle boxes,
> > but rather against boxes that strip the strong link CRCs and
> > let the end-system rely on the weak TCP checksum.
> >
> > NAT boxes have good reason to recompute TCP checksums, but
> > unless they are
> > malicious no reason to recompute iSCSI CRCs.
> >
> > And against malicious boxes iSCSI has cryptographic digests
> > as options.
> >
> > And I was not aware that we are discussing - in this forum -
> > iSCSI data
> > integrity options.
> >
> > Julo
> >
> > Chip Sharp <chsharp@cisco.com> on 19/04/2001 18:53:53
> >
> > Please respond to Chip Sharp <chsharp@cisco.com>
> >
> > To:   vince_cavanna@agilent.com
> > cc:   steph@cs.uchicago.edu, vince_cavanna@agilent.com,
> > jim_wendt@hp.com,
> >       Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu, tsvwg@ietf.org,
> >       craig@aland.bbn.com, Jonathan.Wood@sun.com, xieqb@cig.mot.com,
> >       jonathan@dsg.stanford.edu, rrs@cisco.com
> > Subject:  RE: [Tsvwg] [SCTP checksum problems]
> >
> >
> >
> >
> > As was pointed out previously, middle box operations (such as
> > NATs) tend to
> > creep up the protocol stack and into applications.
> >
> > Take SIP for example.  It includes IP addresses in its
> > INVITE.  In order to
> > work across a NAT, the IP addresses it exchanges have to be
> > replaced with
> > the NATed address.  One way is for the NAT to reach up into
> > the SIP INVITE
> > and change the address.  This modifies the TCP or UDP
> > checksum.  Now SIP
> > could have included its own integrity check to protect
> > against corrupted or
> > modified TCP checksums, but all that would have happened is
> > that NATs would
> > have changed the SIP checksum in addition to the TCP/UDP checksum.
> >
> > Therefore, even if iSCSI included its own integrity check, if
> > a middle box
> > is going to futz with iSCSI packets it will just strip the check, do
> > whatever it does and then recalculate the check.
> >
> > If this is what you want to protect against you will have to
> > go to some
> > type of digital signature.
> >
> > At 12:22 PM 4/19/2001, vince_cavanna@agilent.com wrote:
> > >Stephen,
> > >
> > >I have to admit that I do not have much direct experience with middle
> > boxes,
> > >BUT I did have fairly direct and recent experience with a popular NAT
> > router
> > >from a popular vendor that was corrupting data in a network of
> > Macintoshes.
> > >
> > >Apple's TCP was unaware of any problem as was Apple's Filing
> > Protocol and
> > >most applications. The only applications that detected the
> > corruption were
> > >those that performed an integrity check of their own. Those
> > applications
> > >that assumed a reliable transport (and file system) were doomed to
> > >experiencing the indirect effects of the corruption at some
> > later time.
> > The
> > >corruption only happened when large amounts of data were transferred
> > >quickly.  The router vendor fixed the problem once; then
> > fixed it again;
> > >then fixed it one last time before the data corruption finally
> > >"disappeared". After several weeks of continuous operation the router
> > >appeared to get into a mode where it was once again
> > corrupting data. Power
> > >cycling the router "fixed it". The story apparently has not
> > yet ended.
> > >
> > >I admit I may have given too much significance to this
> > single incident
> > that
> > >I have personally experienced but on the other hand I don't see the
> > >mechanisms in place to prevent this type of problem in the
> > future other
> > than
> > >the end to end integrity checks.
> > >
> > >Incidentally this incident changed my behavior when
> > transferring data over
> > a
> > >network. I will always use a compression utility; not only
> > for reducing
> > the
> > >data to be transmitted but to ensure the integrity of my
> > data is protected
> > >end to end by the utility's CRC mechanism.
> > >
> > >I believe quite firmly that we DO need a mechanism to allow
> > us to tolerate
> > >poor implementations of middle boxes and cannot simply hope that
> > eventually
> > >such poor implementations will vanish, nor that we will have
> > the luxury of
> > >being able to select only good implementations for every
> > component of our
> > >storage network.
> > >
> > >Vince
> > >
> > >|-----Original Message-----
> > >|From: Stephen Bailey [mailto:steph@cs.uchicago.edu]
> > >|Sent: Wednesday, April 18, 2001 3:09 PM
> > >|To: CAVANNA,VICENTE V (A-Roseville,ex1)
> > >|Cc: 'WENDT,JIM (HP-Roseville,ex1)'; 'julian_satran@il.ibm.com';
> > >|ips@ece.cmu.edu; tsvwg@ietf.org; 'Craig Partridge'; Jonathan Wood;
> > >|xieqb@cig.mot.com; Jonathan Stone; Randall Stewart
> > >|Subject: Re: [Tsvwg] [SCTP checksum problems]
> > >|
> > >|
> > >|Vince,
> > >|
> > >|> I don't think iSCSI can be completely relieved of performing
> > >|some data
> > >|> integrity checking as long as there exists the possibility
> > >|of "middle boxes"
> > >|> opening up the transport protocol's packet and thus
> > >|potentially invalidating
> > >|> any reliability guarantees the transport protocol makes.
> > >|
> > >|Any protection provided against this failure mode will only be
> > >|transient, so we must temper the desire to introduce such a
> > >|requirement with reality.
> > >|
> > >|Middleboxes can just as easily open up to the iSCSI layer and tinker
> > >|with the payload, as they do with other ULPs running on TCP
> > (e.g HTTP)
> > >|today.  Short of securing the connection, there is ALWAYS a
> > >|possibility of a middlebox terminating and reoriginating an
> > integrity
> > >|check.  In case you think this is a farfetched scenario, I
> > do get the
> > >|impression that there is a high level of interest in `actively
> > >|middling' iSCSI once the specs crystalize.  Who shaves the barber?
> > >|
> > >|An integrity check is not necessary as long as some lower layer
> > >|provides adequate integrity guarantees.
> > >|
> > >|Adding an integrity check above the transport layer is based upon
> > >|documentation of the presence of a lot of crappy network
> > hardware and
> > >|software and analyses of the transport integrity check (TCP
> > checksum)
> > >|which suggests it might not be adequately strong against some such
> > >|observed errors.
> > >|
> > >|I claim that the high incidence of `broken' (corruption introducing)
> > >|components is a result of a variety of factors which have shaped the
> > >|development of network components thus far.  The fact that integrity
> > >|checks are assumed to be performed in a network context
> > substantially
> > >|lowers the bar for implementation correctness.
> > >|
> > >|In a storage (or CPU) context, these types of implementation errors
> > >|are a) more easily detectable (more fatal) b) more carefully avoided
> > >|during implementation (because of the cost of a potential fatal
> > >|error).  If network components magically reached the same `quality
> > >|level' as storage and CPU components, there might be no
> > justification
> > >|for additional integrity checks above the transport.
> > Similarly if the
> > >|transport (or whatever lower layer) integrity checks are very strong
> > >|(e.g. IPSec), there is, again, no need for a higher level integrity
> > >|check.
> > >|
> > >|I am not disagreeing that we need an additional integrity check over
> > >|TCP in the present target environment, but I do disagree that iSCSI
> > >|will always need such a check, independently of what is running
> > >|beneath it.
> > >|
> > >|Steph
> > >|
> >
> >
> > -------------------------------------------------------------------
> > Chip Sharp                       Consulting Engineering
> > Cisco Systems
> > -------------------------------------------------------------------
> >
> >
> >
> >

-- 
Randall R. Stewart
Systems & Solutions Engineering
Cisco Systems Inc.
rrs@cisco.com 815-342-5222 or 815-477-2127


From owner-ips@ece.cmu.edu  Thu Apr 26 16:51:47 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA21230
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 16:51:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QIggL07794
	for ips-outgoing; Thu, 26 Apr 2001 14:42:42 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QIgFA07767
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 14:42:15 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 9D65F1F26; Thu, 26 Apr 2001 11:42:10 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id LAA05950;
	Thu, 26 Apr 2001 11:42:05 -0700 (PDT)
Message-ID: <3AE86BD0.C04EE2D1@cup.hp.com>
Date: Thu, 26 Apr 2001 11:41:20 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: Stephen Bailey <steph@cs.uchicago.edu>, ips@ece.cmu.edu
Cc: David Black <Black_David@emc.com>
Subject: Re: iSCSI : target session login behaviour
References: <3AE71515.D8D13B88@cup.hp.com> <20010426131924.3DDCE94006@sandmail.sandburst.com>
Content-Type: multipart/mixed;
 boundary="------------210273BB1FCB8EB182E38CCD"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------210273BB1FCB8EB182E38CCD
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Stephen Bailey wrote:

> > As a side note, the iSCSI draft Status Class/Codes could do with a misc
> > error category along the lines of the FC "No additional Explanation"
> > reason explanation. This would help deal with error conditions that don't
> > come under the listed category.
> 
> Personally, I think we should add categories for reasons we obviously
> see now, AND have a no additional reason.
> 
> One peculiarity with what you're talking about above is that it should
> be a login response status code which expresses this rejection.  The
> login response set does not seem to have an `invalid parameter'
> response for cases when the request is somehow inconsistent.

Steph,

The iSCSI draft is unclear today about the exact mechanism through which
the target indicates "invalid parameters" in response to a received
command.

1) Should it use a REJECT PDU, or respond with the appropriate response
for that PDU, indicating a response code of "Invalid Parameters" and a
"first bad byte" offset that indicates which parameter the target
disliked?

IMO, an "Invalid Parameters" response in the response codes is
appropriate for SCSI Command and SCSI Task Mgmt commands. [coupled with
a "first bad byte" offset.]

This is missing today. 

2) Also, as discussed above, a general "No Additional Explanation" type of
status code in the login response would cover the "misc" category.

3) There are cases of ambiguity in the usage of REJECT or SCSI Response.
Take the case of a "SNACK Reject". It is present in both the SCSI
Response (SNACK Rejected) and REJECT PDU reason code (Data SNACK
Reject). Which mechanism is to be used in this case ?

4) There is no "Status SNACK Rejected" in the REJECT PDU. 
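
The "Invalid Parameters" plus "first bad byte" idea in item 1 could look
roughly like the following. This is a minimal sketch only: the response
code values, the Response type and the check_pdu helper are all
hypothetical and are not defined in the iSCSI draft.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

RESPONSE_OK = 0x00                   # hypothetical code value
RESPONSE_INVALID_PARAMETERS = 0x01   # hypothetical code value


@dataclass(frozen=True)
class Response:
    code: int
    first_bad_byte: Optional[int] = None  # offset of the rejected field


# Each check is (offset, length, validator for the raw field bytes).
FieldCheck = Tuple[int, int, Callable[[bytes], bool]]


def check_pdu(pdu: bytes, checks: Sequence[FieldCheck]) -> Response:
    """Validate fields in offset order and report the first one that fails."""
    for offset, length, is_valid in sorted(checks, key=lambda c: c[0]):
        field = pdu[offset:offset + length]
        if len(field) != length or not is_valid(field):
            return Response(RESPONSE_INVALID_PARAMETERS, offset)
    return Response(RESPONSE_OK)
```

Returning the offset in the normal response for that PDU type, rather
than in a separate REJECT, lets the initiator see which parameter the
target disliked without correlating two PDUs.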

Regards,
Santosh
--------------210273BB1FCB8EB182E38CCD
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------210273BB1FCB8EB182E38CCD--



From owner-ips@ece.cmu.edu  Thu Apr 26 16:54:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA21302
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 16:54:29 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QJJft09954
	for ips-outgoing; Thu, 26 Apr 2001 15:19:41 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mail.brocade.com (asbestos.brocade.com [63.121.140.244])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QF4vA23240
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 11:04:57 -0400 (EDT)
Received: from thor.brocade.com (thor [192.168.126.45])
	by mail.brocade.com (8.8.8+Sun/8.8.8) with ESMTP id IAA18499;
	Thu, 26 Apr 2001 08:04:39 -0700 (PDT)
Received: by thor.brocade.com with Internet Mail Service (5.5.2653.19)
	id <JVG2NL7A>; Thu, 26 Apr 2001 08:04:39 -0700
Message-ID: <FFD40DB4943CD411876500508BAD02797D46DA@sj5-ex2.brocade.com>
From: Robert Snively <rsnively@Brocade.COM>
To: "'Douglas Otis'" <dotis@sanlight.net>,
        Stephen Bailey
	 <steph@cs.uchicago.edu>, ips@ece.cmu.edu
Subject: RE: iSCSI: Re: iSCSI & Linked Commands
Date: Thu, 26 Apr 2001 08:04:38 -0700
X-Mailer: Internet Mail Service (5.5.2653.19)
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Doug,

Relative addressing is not defined in the SSC command set nor
in the SPC command set for tapes.

Bob

>  -----Original Message-----
>  From: Douglas Otis [mailto:dotis@sanlight.net]
>  Sent: Monday, April 23, 2001 9:36 AM
>  To: Stephen Bailey; ips@ece.cmu.edu
>  Subject: RE: iSCSI: Re: iSCSI & Linked Commands
>  
>  
>  Stephen,
>  
>  Unlike random access devices, sequential access devices operate with
>  relative addressing.  For random access devices, this is a 
>  seldom used
>  option.  There is a requirement to bind commands together to 
>  ensure order of
>  execution on these devices.  By popular, you mean not sequential?
>  
>  Doug
>  
>  
>  > Julian,
>  >
>  > > According to your logic no FCP implementation can use 
>  linked commands?
>  > > Is this true for all OS's?  Is it a verified fact or folklore?
>  >
>  > In my experience it's fact.  I have never used a SCSI 
>  stack which both
>  > supported AND used linked commands.  Like some others 
>  here, I always
>  > assumed AIX might :^) Ralph has pointed out that T10 is well aware
>  > that the feature is not popular.  There are other ways of
>  > accomplishing the same thing that are less likely to blow 
>  up in your
>  > face.
>  >
>  > > Is it so also for the new MS StorPort driver?
>  >
>  > I don't know, but I'd be really surprised if they did use linked
>  > commands.  You have to be pretty nuts to rely on a feature 
>  that's not
>  > even exercised by most SCSI implementations.
>  >
>  > Steph
>  >
>  
>  


From owner-ips@ece.cmu.edu  Thu Apr 26 16:55:18 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA21315
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 16:55:18 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QImcY08121
	for ips-outgoing; Thu, 26 Apr 2001 14:48:38 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QIm4A08102
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 14:48:04 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel2.hp.com (Postfix) with ESMTP id D08F11E30
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 11:48:03 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id LAA06372
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 11:47:56 -0700 (PDT)
Message-ID: <3AE86D2F.6C007849@cup.hp.com>
Date: Thu, 26 Apr 2001 11:47:11 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI : LUN field in NOP-OUT & NOP-IN PDUs.
References: <C1256A3A.00253F82.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------AA16F940FE271088931FCCDE"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------AA16F940FE271088931FCCDE
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

julian_satran@il.ibm.com wrote:
> 
> Santosh,
> 
> In NOPs the LUNs are used only if the exchange is originated by the target
> and then the LUN has to be valid.
> 
> Otherwise the LUN is 0 (when the target task tag is ffffffff).

Julian,

If the target originates a NOP-IN, are you saying that the LUN field
must be valid and contain a non-zero LUN value ? FYI, LUN 0 is a valid
LUN. Why cannot the target use LUN 0 in this case ?

Also, what if the target is assigning target transfer tags on a per
session basis ? What should such a target initialize the LUN field to ? 

IMO, the NOP-OUT and NOP-IN PDUs should allow a LUN field and let it be
set to LUN 0 if the targets are assigning tags on a per session basis.
(LUN 0 implying target SDP).

- Santosh




> As has been discussed in the past, the NOP-OUT and NOP-IN PDUs are
> transport specific and are used without any LUN context. Hence, it is
> not clear why a LUN field is required in either the NOP-OUT or NOP-IN
> PDUs.
> 
> Some side notes on this subject :
> 
> b) The NOP-IN PDU description shows a value of LUN 0 as reserved in the
> NOP-IN PDU diagram. Is not LUN 0 a valid LU number for a LU (?)
--------------AA16F940FE271088931FCCDE
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------AA16F940FE271088931FCCDE--



From owner-ips@ece.cmu.edu  Thu Apr 26 17:42:26 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA23300
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 17:42:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QHIcs02214
	for ips-outgoing; Thu, 26 Apr 2001 13:18:38 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QHIMA02197
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 13:18:22 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3QIPJ133694;
	Thu, 26 Apr 2001 11:25:20 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Robert Snively" <rsnively@brocade.com>,
        "Stephen Bailey" <steph@cs.uchicago.edu>, <ips@ece.cmu.edu>
Subject: RE: iSCSI: Re: iSCSI & Linked Commands
Date: Thu, 26 Apr 2001 10:15:24 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJAEOCCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <FFD40DB4943CD411876500508BAD02797D46DA@sj5-ex2.brocade.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Robert,

Relative addressing is not defined because that is the only means of
addressing.  Relative to the last block.

Doug


> Doug,
>
> Relative addressing is not defined in the SSC command set nor
> in the SPC command set for tapes.
>
> Bob
>
> >  -----Original Message-----
> >  From: Douglas Otis [mailto:dotis@sanlight.net]
> >  Sent: Monday, April 23, 2001 9:36 AM
> >  To: Stephen Bailey; ips@ece.cmu.edu
> >  Subject: RE: iSCSI: Re: iSCSI & Linked Commands
> >
> >
> >  Stephen,
> >
> >  Unlike random access devices, sequential access devices operate with
> >  relative addressing.  For random access devices, this is a
> >  seldom used
> >  option.  There is a requirement to bind commands together to
> >  ensure order of
> >  execution on these devices.  By popular, you mean not sequential?
> >
> >  Doug
> >
> >
> >  > Julian,
> >  >
> >  > > According to your logic no FCP implementation can use
> >  linked commands?
> >  > > Is this true for all OS's?  Is it a verified fact or folklore?
> >  >
> >  > In my experience it's fact.  I have never used a SCSI
> >  stack which both
> >  > supported AND used linked commands.  Like some others
> >  here, I always
> >  > assumed AIX might :^) Ralph has pointed out that T10 is well aware
> >  > that the feature is not popular.  There are other ways of
> >  > accomplishing the same thing that are less likely to blow
> >  up in your
> >  > face.
> >  >
> >  > > Is it so also for the new MS StorPort driver?
> >  >
> >  > I don't know, but I'd be really surprised if they did use linked
> >  > commands.  You have to be pretty nuts to rely on a feature
> >  that's not
> >  > even exercised by most SCSI implementations.
> >  >
> >  > Steph
> >  >
> >
> >
>



From owner-ips@ece.cmu.edu  Thu Apr 26 18:04:36 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA23622
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 18:04:36 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QJbg411108
	for ips-outgoing; Thu, 26 Apr 2001 15:37:42 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QJbMA11091
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 15:37:22 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 2732794009
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 15:37:21 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI Target Reset 
In-Reply-To: Message from julian_satran@il.ibm.com 
   of "Thu, 26 Apr 2001 10:41:04 +0300." <C1256A3A.0029B65F.00@d12mta02.de.ibm.com> 
References: <C1256A3A.0029B65F.00@d12mta02.de.ibm.com> 
Date: Thu, 26 Apr 2001 15:35:40 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010426193721.2732794009@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> However I wonder if it won't be fair to tell the guy trying to do
> reset that although everything is fine the reset was not performed -
> and do this in a way that does not harm legacy initiators.

Legacy initiators already have to deal with this in FC and ||SCSI.
Target reset has an implementation specific behavior which varies in
strength from target to target.  Doing the same thing with iSCSI
(target reset behavior is silently implementation dependent), seems
the right approach to me.

No matter how vigorously we wheeze about standards compliance, target
vendors are going to lay an iSCSI front end on their existing products
(just like they did with FCP), and (initially) do the minimum amount
of additional work.  Changing reset semantics is a MASSIVE
undertaking, and for no incremental gain.  There's a really small set
of required (by applications, I dunno about the standards) semantics
for reset, like breaking non-persistent reservations (which never
really worked well on ANY transport), and beyond that, you're just
offering a tool to cover your target's ass (it's wedged?  How 'bout
after you hit it with a reset?).

Steph


From owner-ips@ece.cmu.edu  Thu Apr 26 18:11:16 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA23705
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 18:11:16 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QK6fE12898
	for ips-outgoing; Thu, 26 Apr 2001 16:06:41 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QK5gA12824
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 16:05:42 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 09BF494009
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 16:05:42 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI Parameter Negotiation 
In-Reply-To: Message from "Robert D. Russell" <rdr@mars.iol.unh.edu> 
   of "Wed, 25 Apr 2001 16:02:19 EDT." <Pine.SGI.4.20.0104251600080.27817-100000@mars.iol.unh.edu> 
References: <Pine.SGI.4.20.0104251600080.27817-100000@mars.iol.unh.edu> 
Date: Thu, 26 Apr 2001 16:04:01 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010426200542.09BF494009@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Bob,

Excellent question.  I agree that it's unclear too, and I'm not even
sure what's `right' independently of what the spec says.  I think the
problem goes deeper than a simple yes/no.

My reading of the spec is that it (the spec) is 90% convinced that the
target can not `talk out of turn' (once a channel guy....).  However,
you have pointed out one (MaxConnections) of several examples where
this assumption doesn't work right.  Another example is marker
parameters---the target might want to ask for them even if the
initiator's not into it.  Furthermore, even if the initiator does
enable markers with an FMarker parameter, the target may respond with
FMarker AND MarkInts, even though the initiator is satisfied with the
default values, and so, did not send the parameters.

I have proposed that we precisely specify the exchange characteristics
for each parameter in the operational set (security parameters already
seem to be well specified by merit of the fact that security exchanges
are themselves precisely defined).

Beyond the exchange characteristics of the parameters themselves comes
how they should appear in PDUs.  If the parameter exchanges are very
free-formed, the implementation complexity increases massively for no
corresponding increase in capability.  I.e. who really cares if
parameters must be sent in a specification-defined order versus any
order you feel like?  Who cares if the target can return responses
in a single PDU or multiple?

I have a well formed opinion about how to specify this to cut the
implementation complexity, but I want to get the requirements on the
table first.

Steph


From owner-ips@ece.cmu.edu  Thu Apr 26 19:28:21 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA24708
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 19:28:21 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QJ9md09343
	for ips-outgoing; Thu, 26 Apr 2001 15:09:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QJ9FA09319
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 15:09:20 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3QKFp133786;
	Thu, 26 Apr 2001 13:15:51 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Randall Stewart" <rrs@cisco.com>, <tsvwg@ietf.org>
Cc: "Chip Sharp" <chsharp@cisco.com>, <ips@ece.cmu.edu>,
        "Craig Partridge" <craig@aland.bbn.com>,
        "Jonathan Stone" <jonathan@dsg.stanford.edu>
Subject: RE: [Tsvwg] [SCTP checksum problems]
Date: Thu, 26 Apr 2001 12:05:54 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJMEODCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <3AE7DB24.7AECE222@cisco.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 8bit

Randall,

I suspected the dismal results of your CRC algorithm were due to the use of a
16 bit lookup rather than an 8 bit lookup, but upon checking the code, I
found they had used an 8 bit lookup.  Based on estimates that I made,
conversion to an 8 bit with an immediate table should give results in the
neighborhood of 400 nanoseconds versus the 90 nanoseconds for the modified
Adler.  Your 5900 nanoseconds result, being 15 times greater, is likely due to
cache misses as you suspected.  Obviously there is more to consider with
respect to table driven algorithms, as it seems you are running 120
instructions per byte.
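
For reference, here is a minimal sketch of what an 8 bit table driven CRC
looks like next to Adler-32.  It assumes the common reflected CRC-32
polynomial 0xEDB88320 with the zlib initial and final XOR conventions; the
ssh_crc32 routine in FreeBSD may differ in those details.  The function names
are mine, and the "modified" Adler variant is not shown because its exact
definition is not given in this thread.

```python
def make_crc32_table(poly: int = 0xEDB88320) -> list:
    """Build the 256-entry table used by the 8 bit (byte at a time) lookup."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table

# 256 entries of 32 bits each: 1 KB that must be in cache for the loop to be fast.
CRC32_TABLE = make_crc32_table()

def crc32_bytewise(data: bytes) -> int:
    """One table lookup, one shift and two XORs per input byte."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = CRC32_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF

def adler32(data: bytes) -> int:
    """Standard Adler-32: two running sums modulo 65521, no table at all."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % 65521
        b = (b + a) % 65521
    return (b << 16) | a
```

The point of the comparison is the memory footprint: the CRC loop touches a
1 KB table on every byte, while Adler-32 keeps its whole state in two
registers, which is consistent with the cache sensitivity seen in the
measurements.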

Doug

> Jim:
>
> I am glad you copied me on this.. since being at the bakeoff and with
> the recent email/site problems I have.. I do not get IETF mail until
> next week :0
>
> Now, some comments...
>
>
> My major concern is that SCTP's checksum is weaker than TCP's. What the
> upper layers do to defend against middle boxes etc. they will need to
> do anyway. I have just run a few numbers using the sctp_test_app in
> the reference implementation.
>
> I did the following:
>
> Compiled the normal ref-imp with -O -pg
>
> started two endpoints on the same machine (FreeBSD, Intel, Sony Vaio
> PCG-Z505JS); this of course has my SCTP kernel patches applied ...
>
> Now I setup an association between two endpoints and did
>
> bulk:100:0:1000000
>
> This transfers 1 Million 100 byte packets holding ascii data from one
> endpoint to the other.
>
> I then captured the gprof information for this run.
>
> I did the same exact test after changing the checksum to crc32 and the
> modified adler32 (16 bit sums).
>
> My results (not meant to show strength in catching errors but instead
> performance of a software version of the sum) are as follows:
>
> Adler 32...
>
> Sender side and receiver side average time spent in the checksum
> routine per call was 121 nanoseconds.
>
> Adler 32 Modified
>
> Sender side and receiver side average time spent in the checksum
> routine per call was 90 nanoseconds.
>
> CRC32
>
> Sender side average time spent in the checksum calculation per packet
> was 5.3 microseconds.
>
> Receiver side average time spent in the checksum calculation per packet
> was 5.9 microseconds.
>
> I believe the differences seen in the CRC32 can be attributed to how
> lucky the respective application is in finding the index table (the
> ssh_crc32 found in FreeBSD) in the processor cache. If I run the CRC
> comparison with just random data in a stand alone program (very
> unrealistic) crc32 outperforms all others, but this is because the
> table gets completely pre-fetched into cache and is never pulled from
> main memory... (I started here and then realized the only way to get
> good information is to do it in a real implementation where a lot of
> other code would run between crc calls).
>
> So on that note... I vote STRONGLY for Modified Adler32... i.e. same as
> regular adler but make the quantities added be 16 bit sums...
>
> I think this will take care of the critical problem i.e. the weakness in
> the SCTP checksum for small packets... Jonathan/Craig any comments or
> questions...
>
> And yes... I am working on getting things run on a sparc box but the
> only one we have here is a sparc20... so the numbers definitely can NOT
> be compared to anything else...
>
> R
>
>
>
>
> "WENDT,JIM (HP-Roseville,ex1)" wrote:
> >
> > I think this "SCTP checksum" thread spanning IPS and TSVWG was for
> > discussion around whether or not iSCSI (running over SCTP) could forgo
> > data integrity checking and transport-like functionality
> > (retransmission, ack, etc) should SCTP provide a sufficiently strong
> > check-code.
> > If iSCSI were willing to completely trust SCTP end-to-end across a
> > network fabric (including "middleboxes"), then that provides one reason
> > for SCTP to adopt a stronger checksum or CRC.
> > If iSCSI will still implement its own data integrity check-code above
> > SCTP, then SCTP needs to make an independent decision on whether its
> > current check-code is sufficiently strong for its target uses.
> > Currently, iSCSI contains a data integrity check "digest" that can be
> > negotiated end-to-end to be disabled on a per-connection basis.
> >
> > This discussion begs a few questions:
> > - Are there clearly different classes of applications (in regards to
> > their end-to-end data integrity strength needs)?
> > - How are these application classes' end-to-end data integrity needs
> > met in the future?  Is it SCTP, IPSec, application-specific protocol, a
> > new protocol?
> > - Is there a general need for strong end-to-end data integrity that
> > could be provided for in a recommended generic manner?
> > - Is iSCSI unique in being an "ultra-low error rate application" and
> > should iSCSI then handle its own data integrity?
> > - Should SCTP strengthen its checksum to meet the needs of a general
> > class of data-critical applications, and/or provide a means for
> > negotiating an optional stronger checksum?
> > - What is the role of network infrastructure (router/middlebox hardware
> > and software) in strengthening end-to-end data integrity?
> >
> > Data integrity for iSCSI over TCP is a separate issue. It is unlikely
> > that we will be able to evolve TCP in a timely manner to utilize a
> > stronger check-code given TCP's current wide scale deployment (although
> > adding a stronger checksum/CRC to TCP would seem to be the best
> > solution). So, something else has to be done either above or below TCP
> > to provide the required level of iSCSI data integrity. Of course, if
> > TCP's data integrity deficiency is impacting other data-critical
> > applications, then it seems prudent to at least consider solving the
> > problem generically.
> >
> > Jim
> >
> > > -----Original Message-----
> > > From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> > > Sent: Friday, April 20, 2001 1:02 AM
> > > To: Chip Sharp
> > > Cc: vince_cavanna@agilent.com; steph@cs.uchicago.edu; WENDT,JIM
> > > (HP-Roseville,ex1); ips@ece.cmu.edu; tsvwg@ietf.org;
> > > craig@aland.bbn.com; Jonathan.Wood@sun.com; xieqb@cig.mot.com;
> > > jonathan@dsg.stanford.edu; rrs@cisco.com
> > > Subject: RE: [Tsvwg] [SCTP checksum problems]
> > >
> > >
> > >
> > >
> > > Chip,
> > >
> > > CRCs are not meant to protect against malicious middle boxes -
> > > rather against boxes that strip the strong link CRCs and
> > > let the end-system rely on the weak TCP checksum.
> > >
> > > NAT boxes have good reason to recompute TCP checksums, but unless
> > > they are malicious no reason to recompute iSCSI CRCs.
> > >
> > > And against malicious boxes iSCSI has cryptographic digests as
> > > options.
> > >
> > > And I was not aware that we are discussing - in this forum -
> > > iSCSI data
> > > integrity options.
> > >
> > > Julo
> > >
> > > Chip Sharp <chsharp@cisco.com> on 19/04/2001 18:53:53
> > >
> > > Please respond to Chip Sharp <chsharp@cisco.com>
> > >
> > > To:   vince_cavanna@agilent.com
> > > cc:   steph@cs.uchicago.edu, vince_cavanna@agilent.com,
> > > jim_wendt@hp.com,
> > >       Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu, tsvwg@ietf.org,
> > >       craig@aland.bbn.com, Jonathan.Wood@sun.com, xieqb@cig.mot.com,
> > >       jonathan@dsg.stanford.edu, rrs@cisco.com
> > > Subject:  RE: [Tsvwg] [SCTP checksum problems]
> > >
> > >
> > >
> > >
> > > As was pointed out previously, middle box operations (such as NATs)
> > > tend to creep up the protocol stack and into applications.
> > >
> > > Take SIP for example.  It includes IP addresses in its INVITE.  In
> > > order to work across a NAT, the IP addresses it exchanges have to be
> > > replaced with the NATed address.  One way is for the NAT to reach up
> > > into the SIP INVITE and change the address.  This modifies the TCP or
> > > UDP checksum.  Now SIP could have included its own integrity check to
> > > protect against corrupted or modified TCP checksums, but all that
> > > would have happened is that NATs would have changed the SIP checksum
> > > in addition to the TCP/UDP checksum.
> > >
> > > Therefore, even if iSCSI included its own integrity check, if a
> > > middle box is going to futz with iSCSI packets it will just strip the
> > > check, do whatever it does and then recalculate the check.
> > >
> > > If this is what you want to protect against you will have to go to
> > > some type of digital signature.
> > >
> > > At 12:22 PM 4/19/2001, vince_cavanna@agilent.com wrote:
> > > >Stephen,
> > > >
> > > >I have to admit that I do not have much direct experience with middle
> > > >boxes, BUT I did have fairly direct and recent experience with a
> > > >popular NAT router from a popular vendor that was corrupting data in a
> > > >network of Macintoshes.
> > > >
> > > >Apple's TCP was unaware of any problem as was Apple's Filing Protocol
> > > >and most applications. The only applications that detected the
> > > >corruption were those that performed an integrity check of their own.
> > > >Those applications that assumed a reliable transport (and file system)
> > > >were doomed to experiencing the indirect effects of the corruption at
> > > >some later time. The corruption only happened when large amounts of
> > > >data were transferred quickly.  The router vendor fixed the problem
> > > >once; then fixed it again; then fixed it one last time before the data
> > > >corruption finally "disappeared". After several weeks of continuous
> > > >operation the router appeared to get into a mode where it was once
> > > >again corrupting data. Power cycling the router "fixed it". The story
> > > >apparently has not yet ended.
> > > >
> > > >I admit I may have given too much significance to this single incident
> > > >that I have personally experienced but on the other hand I don't see
> > > >the mechanisms in place to prevent this type of problem in the future
> > > >other than the end to end integrity checks.
> > > >
> > > >Incidentally this incident changed my behavior when transferring data
> > > >over a network. I will always use a compression utility; not only for
> > > >reducing the data to be transmitted but to ensure the integrity of my
> > > >data is protected end to end by the utility's CRC mechanism.
> > > >
> > > >I believe quite firmly that we DO need a mechanism to allow us to
> > > >tolerate poor implementations of middle boxes and cannot simply hope
> > > >that eventually such poor implementations will vanish, nor that we
> > > >will have the luxury of being able to select only good implementations
> > > >for every component of our storage network.
> > > >
> > > >Vince
> > > >
> > > >|-----Original Message-----
> > > >|From: Stephen Bailey [mailto:steph@cs.uchicago.edu]
> > > >|Sent: Wednesday, April 18, 2001 3:09 PM
> > > >|To: CAVANNA,VICENTE V (A-Roseville,ex1)
> > > >|Cc: 'WENDT,JIM (HP-Roseville,ex1)'; 'julian_satran@il.ibm.com';
> > > >|ips@ece.cmu.edu; tsvwg@ietf.org; 'Craig Partridge'; Jonathan Wood;
> > > >|xieqb@cig.mot.com; Jonathan Stone; Randall Stewart
> > > >|Subject: Re: [Tsvwg] [SCTP checksum problems]
> > > >|
> > > >|
> > > >|Vince,
> > > >|
> > > >|> I don't think iSCSI can be completely relieved of performing some
> > > >|> data integrity checking as long as there exists the possibility of
> > > >|> "middle boxes" opening up the transport protocol's packet and thus
> > > >|> potentially invalidating any reliability guarantees the transport
> > > >|> protocol makes.
> > > >|
> > > >|Any protection provided against this failure mode will only be
> > > >|transient, so we must temper the desire to introduce such a
> > > >|requirement with reality.
> > > >|
> > > >|Middleboxes can just as easily open up to the iSCSI layer and tinker
> > > >|with the payload, as they do with other ULPs running on TCP (e.g.
> > > >|HTTP) today.  Short of securing the connection, there is ALWAYS a
> > > >|possibility of a middlebox terminating and reoriginating an integrity
> > > >|check.  In case you think this is a farfetched scenario, I do get the
> > > >|impression that there is a high level of interest in `actively
> > > >|middling' iSCSI once the specs crystallize.  Who shaves the barber?
> > > >|
> > > >|An integrity check is not necessary as long as some lower layer
> > > >|provides adequate integrity guarantees.
> > > >|
> > > >|Adding an integrity check above the transport layer is based upon
> > > >|documentation of the presence of a lot of crappy network hardware and
> > > >|software and analyses of the transport integrity check (TCP checksum)
> > > >|which suggests it might not be adequately strong against some such
> > > >|observed errors.
> > > >|
> > > >|I claim that the high incidence of `broken' (corruption introducing)
> > > >|components is a result of a variety of factors which have shaped the
> > > >|development of network components thus far.  The fact that integrity
> > > >|checks are assumed to be performed in a network context substantially
> > > >|lowers the bar for implementation correctness.
> > > >|
> > > >|In a storage (or CPU) context, these types of implementation errors
> > > >|are a) more easily detectable (more fatal) b) more carefully avoided
> > > >|during implementation (because of the cost of a potential fatal
> > > >|error).  If network components magically reached the same `quality
> > > >|level' as storage and CPU components, there might be no justification
> > > >|for additional integrity checks above the transport.  Similarly if the
> > > >|transport (or whatever lower layer) integrity checks are very strong
> > > >|(e.g. IPSec), there is, again, no need for a higher level integrity
> > > >|check.
> > > >|
> > > >|I am not disagreeing that we need an additional integrity check over
> > > >|TCP in the present target environment, but I do disagree that iSCSI
> > > >|will always need such a check, independently of what is running
> > > >|beneath it.
> > > >|
> > > >|Steph
> > > >|
> > >
> > >
> > > -------------------------------------------------------------------
> > > Chip Sharp                       Consulting Engineering
> > > Cisco Systems
> > > -------------------------------------------------------------------
> > >
> > >
> > >
> > >
>
> --
> Randall R. Stewart
> Systems & Solutions Engineering
> Cisco Systems Inc.
> rrs@cisco.com 815-342-5222 or 815-477-2127
>
> _______________________________________________
> tsvwg mailing list
> tsvwg@ietf.org
> http://www1.ietf.org/mailman/listinfo/tsvwg
>
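
The quoted concern about the strength of the TCP checksum can be made
concrete with one well-known blind spot: the 16-bit one's-complement
Internet checksum is a plain sum, so it cannot tell when two 16-bit words
trade places.  A minimal sketch follows; the function name and the sample
payloads are illustrative and do not come from the thread.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\x56\x78\x9a\xbc"
# Same 16-bit words, with the first two swapped (hypothetical corruption).
reordered = b"\x56\x78\x12\x34\x9a\xbc"

# Addition is commutative, so both payloads produce the same checksum.
same_sum = internet_checksum(original) == internet_checksum(reordered)
```

A CRC-style digest above the transport would normally catch this kind of
reordering, which is the sort of gap the additional integrity check is
meant to cover.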



From owner-ips@ece.cmu.edu  Thu Apr 26 21:29:24 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA26382
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 21:29:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QNJl624670
	for ips-outgoing; Thu, 26 Apr 2001 19:19:47 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3QNJFA24653
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 19:19:15 -0400 (EDT)
Received: from hpbs5002.boi.hp.com (hpbs5002.boi.hp.com [15.2.216.28])
	by palrel1.hp.com (Postfix) with ESMTP id 951001E43
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 16:19:14 -0700 (PDT)
Received: from xpabh4.corp.hp.com (xpabh4.corp.hp.com [15.58.136.1]) by hpbs5002.boi.hp.com with ESMTP (8.7.1/8.7.3 SMKit7.02) id RAA20083 for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 17:19:14 -0600 (MDT)
Received: by xpabh4.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <JK4DB9JS>; Thu, 26 Apr 2001 16:18:24 -0700
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A09014@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "Ips Reflector (E-mail)" <ips@ece.cmu.edu>
Subject: iSCSI: Require iSCSI to use packet formats similar to FC, etc??
Date: Thu, 26 Apr 2001 16:18:21 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Robert Snively proposes that the iSCSI Requirements document include the
following requirement WRT gateway devices:

   iSCSI MUST use packet formats similar to the common
   packet formats used by other packetized SCSI protocols where
   possible to allow both simple bridging gateways and more
   sophisticated translating gateways.

Comments?

Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com 


From owner-ips@ece.cmu.edu  Thu Apr 26 21:33:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA26506
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 21:33:37 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3QLgmo18878
	for ips-outgoing; Thu, 26 Apr 2001 17:42:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dirty.research.bell-labs.com (dirty.research.bell-labs.com [204.178.16.6])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3QLfuA18827
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 17:41:56 -0400 (EDT)
Received: from grubby.research.bell-labs.com ([135.104.2.9]) by dirty; Thu Apr 26 17:40:33 EDT 2001
Received: from aura.research.bell-labs.com ([135.104.46.10]) by grubby; Thu Apr 26 17:40:32 EDT 2001
Received: from research.bell-labs.com (IDENT:sandeepj@sandeepj-pcmh.research.bell-labs.com [135.104.47.90])
	by aura.research.bell-labs.com (8.9.1/8.9.1) with ESMTP id RAA26249;
	Thu, 26 Apr 2001 17:40:26 -0400 (EDT)
Message-ID: <3AE895CA.4C2F391@research.bell-labs.com>
Date: Thu, 26 Apr 2001 17:40:26 -0400
From: Sandeep Joshi <sandeepj@research.bell-labs.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.16-3 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Stephen Bailey <steph@cs.uchicago.edu>
CC: ips@ece.cmu.edu
Subject: Re: iSCSI Parameter Negotiation
References: <Pine.SGI.4.20.0104251600080.27817-100000@mars.iol.unh.edu> <20010426200542.09BF494009@sandmail.sandburst.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Steph,

It seems that the target is allowed to propose parameter values.
See http://ips.pdl.cs.cmu.edu/mail/msg03273.html

It seems that the statement which was added to resolve this 
is in Sec 2.8.3
 "All these keys except the X- extension formatted MUST be 
  supported by iSCSI initiators and targets"

I agree that the legal permutations need to be more clearly stated.
  
And your efforts to cut any implementation complexity will undoubtedly 
earn much goodwill :-)

regards,
-Sandeep 

Stephen Bailey wrote:
> 
> Bob,
> 
> Excellent question.  I agree that it's unclear too, and I'm not even
> sure what's `right' independently of what the spec says.  I think the
> problem goes deeper than a simple yes/no.
> 
> My reading of the spec is that it (the spec) is 90% convinced that the
> target can not `talk out of turn' (once a channel guy....).  However,
> you have pointed out one (MaxConnections) of several examples where
> this assumption doesn't work right.  Another example is marker
> parameters---the target might want to ask for them even if the
> initiator's not into it.  Furthermore, even if the initiator does
> enable markers with an FMarker parameter, the target may respond with
> FMarker AND MarkInts, even though the initiator is satisfied with the
> default values, and so, did not send the parameters.
> 
> I have proposed that we precisely specify the exchange characteristics
> for each parameter in the operational set (security parameters already
> seem to be well specified by merit of the fact that security exchanges
> are themselves precisely defined).
> 
> Beyond the exchange characteristics of the parameters themselves comes
> how they should appear in PDUs.  If the parameter exchanges are very
> free-formed, the implementation complexity increases massively for no
> corresponding increase in capability.  I.e. who really cares if
> parameters must be sent in a specification-defined order versus any
> order you feel like?  Who cares if the target can return responses
> in a single PDU or multiple?
> 
> I have a well formed opinion about how to specify to cut the
> implementation complexity, but I want to get the requirements on the
> table first.
> 
> Steph


From owner-ips@ece.cmu.edu  Thu Apr 26 22:23:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA28134
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 22:23:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3R03n727223
	for ips-outgoing; Thu, 26 Apr 2001 20:03:49 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3R034A27198
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 20:03:04 -0400 (EDT)
Received: from amrelay2.boi.hp.com (amrelay2.boi.hp.com [15.56.8.41])
	by palrel2.hp.com (Postfix) with ESMTP
	id DC3931E5C; Thu, 26 Apr 2001 17:03:03 -0700 (PDT)
Received: from xpabh4.corp.hp.com (xpabh4.corp.hp.com [15.58.136.1])
	by amrelay2.boi.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id SAA02323;
	Thu, 26 Apr 2001 18:03:02 -0600 (MDT)
Received: by xpabh4.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <JK4DCBW6>; Thu, 26 Apr 2001 17:03:02 -0700
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A09015@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "'Robert Snively'" <rsnively@Brocade.COM>, ips@ece.cmu.edu
Subject: RE: iSCSI Requirements Draft - Informal WG Last Call  -  A few concerns about the document.
Date: Thu, 26 Apr 2001 17:03:01 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> 1)  MISSING REQUIREMENT FOR AVOIDING LUN BLOCKING
> 
> 	At present, there does not appear to be any text
> 	referencing the problems of hanging up all logical units
> 	on a target because one command to one logical unit
> 	was stalled.  This should be addressed explicitly, 
> 	probably in section 4.2.

Seems reasonable, but I'm having a hard time seeing where this fits and what
to say, and I have this nagging notion that this is already obliquely
mentioned in one of the sections.  I'll work on this.

> 
> 2)  ORDERING OF COMMANDS
> 
> 	In section 2.4, the following text is given:
>     
>    MUST provide a FIFO transport of SCSI commands, even when commands 
>    are sent along different paths. This command ordering mechanism 
>    SHOULD seek to minimize the amount of communication necessary across
>    multiple adapters doing transport off-load.
>     
> 	This was heavily discussed already.  Related to point 1, the
> 	issue should actually be the FIFO transport of SCSI commands
> 	for a particular I_T_L nexus.  If the commands are going to
> 	different logical units or crossing different I_T nexi, the
> 	requirement should not be present.  I would propose changing
> 	the MUST to apply only to an I_T_L nexus, allowing other
> 	relationships to have an un-ordered relationship.  This will
> 	be especially useful for recovery on systems that install
> 	a lower level framing capability.

I don't see how these comments require any change to the text already
included.  The text doesn't mention sessions, but when it was written, the
details of session and connection relationships weren't fully explored.  The
details of the relationship between session, connection, command allegiance
and the mapping of an I_T_L nexus are covered in the other docs.
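
To make the difference between the two positions concrete, here is a rough
sketch of the ordering scope Robert proposes in point 2: FIFO within each
I_T_L nexus and no ordering across them, so a stalled logical unit (point 1)
does not hold up the others.  The class, its method names, and the nexus
tuples are hypothetical; nothing here is taken from the requirements draft.

```python
from collections import defaultdict, deque


class PerNexusQueues:
    """Hypothetical dispatcher: FIFO within each I_T_L nexus, none across them."""

    def __init__(self):
        # (initiator, target, lun) -> commands in arrival order
        self.queues = defaultdict(deque)
        self.stalled = set()

    def submit(self, nexus, command):
        self.queues[nexus].append(command)

    def stall(self, nexus):
        """Mark one nexus as blocked, e.g. a command that is not completing."""
        self.stalled.add(nexus)

    def dispatchable(self):
        """Head-of-line command for every nexus that is not stalled."""
        return {nexus: q[0] for nexus, q in self.queues.items()
                if q and nexus not in self.stalled}


dispatcher = PerNexusQueues()
lun0 = ("init-A", "tgt-1", 0)
lun1 = ("init-A", "tgt-1", 1)
dispatcher.submit(lun0, "WRITE a")
dispatcher.submit(lun0, "WRITE b")
dispatcher.submit(lun1, "READ c")
dispatcher.stall(lun0)

# LUN 1 can still make progress while LUN 0 is stalled.
ready = dispatcher.dispatchable()
```

Under a single session-wide FIFO, by contrast, the stalled command on LUN 0
would sit ahead of "READ c" as well.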

> 
> 3)  CLARIFY AN UNCLEAR SECURITY REQUIREMENT  
> 
> Section 6.2 provides a requirement that uses the oxymoron 
> "passive attack".  If it is an attack, there is an intent and
> it is active.  I would propose deleting the word "passive" in
> the following requirement:
>     
>    iSCSI authenticated login MUST be resilient against >passive< attacks.
>  

The terms "active attack" and "passive attack" are commonly used in discussions of
Denial of Service attacks.  I didn't feel it was appropriate to provide
background information since there are plenty of other sources of
information about DOS available.  The requirement specifically mentions
passive attacks since it is nearly impossible to imagine/protect against
active attacks.
 
> 4)  REMOVE SALES/MARKETING FLUFF
> 
..snip..
> 	Other similar text is sprinkled throughout the document and 
> 	should be deleted.

I disagree.  This is an informational document - part of the intent of this
document is to discuss the implications and possibilities that iSCSI
technology development provides.  There are countless examples of this type
of discussion in other informational RFCs.

> 
> 5)  CORRECT TEXT ASSOCIATED WITH DIRECT DATA PLACEMENT
> 
> 	The text associated with direct data placement in section 2.2
> 	is largely associated with routing buffering and framing, not
> 	the requirements for zero-copy.  The text at present is:
> 
>    Direct data placement (zero-copy iSCSI): 
>     
>    This is an important implementation goal.  In an iSCSI system, each
>    of the end nodes (for example host computer and storage controller)
>    has ample memory; but the intervening nodes (NIC, switches) do not.
>    Assume a WAN-scale retransmission requirement of 25 MB (1 Gbps) or 
>    250 MB (10 Gbps, see Framing discussion).  Therefore, intervening 
>    nodes MUST NOT be required to buffer data. 
> 
> 	It should be rewritten to say:
> 
>    Direct data placement (zero-copy iSCSI): 
>     
>    Direct data placement allows iSCSI data to be moved directly
>    to the required memory locations in memory with no requirement
>    to recopy the incoming information.  Direct data placement 
>    significantly reduces the memory bus and I/O bus loading in
>    the end-point systems, allowing improved performance.

I see your point, I'll work on that wording.
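
For reference, the 25 MB and 250 MB figures in the quoted draft text look
like bandwidth-delay products.  The sketch below reproduces them under the
assumption of a 200 ms round trip; that value is inferred from the numbers
and is not stated in the quoted text.

```python
def bytes_in_flight(link_bits_per_second: float, round_trip_seconds: float) -> float:
    """Bandwidth-delay product: bytes a sender may need to hold for retransmission."""
    return link_bits_per_second / 8 * round_trip_seconds


ASSUMED_RTT = 0.2  # seconds; an assumption chosen to match the quoted figures

mb_at_1g = bytes_in_flight(1e9, ASSUMED_RTT) / 1e6    # 1 Gbps link
mb_at_10g = bytes_in_flight(10e9, ASSUMED_RTT) / 1e6  # 10 Gbps link
```

Buffers of that size are plausible in end nodes but not in NICs or switches,
which is the draft's stated reason for not requiring intervening nodes to
buffer data.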

>    
> 6)  ALTERNATE CONNECTION BINDING
> 
> 	The section in 2.4 discussing an alternate mechanism for
> 	connection binding merely serves to weaken the stand
> 	in favor of the selected binding relationship.  The following
> 	text should be deleted:
> 
>    "An alternate approach that was extensively discussed involved 
>    sending all commands on a single connection and the associated data
>    and status on a different connection (asymmetric approach). In this
>    scheme, the transport ensures the commands arrive in order. The
>    protocol on the data and status connections is simpler, perhaps
>    lending itself to a simpler realization in hardware.  One
>    disadvantage of this approach is that the recovery procedure is
>    different if a command connection fails vs. a data connection. Some
>    argued that this approach would require greater inter-processor 
>    communication when connections are spread across processors.   
>    The reader may reference the mail archives of the IPS mailing list 
>    between June and September of 2000 for extensive discussions on 
>    symmetric vs asymmetric connection models." 
> 

This text was added to satisfy a specific request.  The intent was to
summarize an approach that was already considered and discarded.  The hope
is that newcomers to the iSCSI discussion could review this document and we
might avoid revisiting this idea.

> 7)  OPTIONAL BEHAVIOR
> 
> 	In clause 2.4, wording about the desirability of minimizing
> 	optional features is discussed.  However, it reaches the 
> 	mistaken conclusion that there is only one time at which
> 	options may be negotiated and that rejection is required if
> 	the options are not supported.  The following text should 
> 	be changed from:
> 
>    "In the interest of simplicity, iSCSI SHOULD minimize optional 
>    features.  When features are deemed necessary, the protocol SHOULD 
>    allow for feature negotiation at session establishment (login) and 
>    provide for rejection when an implementation does not support a 
>    requested feature."
> 
> 	to:
> 
>    "iSCSI SHOULD minimize optional features.  When features are
>    deemed necessary, the protocol SHALL provide for negotiation of
>    the use of those features.  iSCSI SHALL operate correctly whether
>    an optional feature is negotiated to be used or is 
>    negotiated not to be used."

How about  
"In the interest of simplicity, iSCSI SHOULD minimize optional features.
When optional features are deemed necessary, the protocol MUST allow for
feature negotiation at session establishment (login).  This also implies
iSCSI MUST operate correctly when no optional features are negotiated, and
when any individual option negotiation is unsuccessful."

> 
> 8)  REMOVE OPTIONAL EXTENSIONS
> 
> 	In section 4.1, the text suggests that various digest
> 	implementations may be used.  This is an option that has
> 	no reason to be allowed, since we will choose the proper
> 	digest calculation method after due study and no other
> 	calculation method should be allowed.  The following
> 	text should be deleted.
>
>    "The iSCSI header format SHOULD be extensible to include other digest
>    calculation methods."

You must have missed an early discussion that future technology developments
may make it necessary to re-evaluate our digest calculation (witness the TCP
checksum debate).  The intent isn't to allow negotiation of digest formats,
but to recognize that in the future it may be necessary to change the
protocol to update the digest algorithm.

> 
> 9)  SOFTEN REQUIREMENT TO IMPLEMENT STRANGE SAM-2 FUNCTIONS
> 
> 	In section 5.2, the following text suggests that any
> 	feature in SAM-2 requires a valid transport mapping.  However,
> 	it further suggests making such functions recommended or
> 	required to implement, even if they are rarely used or 
> 	used only in contexts different from iSCSI.  The following
> 	text:    
>
>    "In order to be considered a SCSI transport, the iSCSI standard must
>    comply with the requirements of the SCSI Architecture Model [SAM2] 
>    for a SCSI transport.  Any feature SAM2 requires in a valid 
>    transport mapping MUST be specified by iSCSI and the specification 
>    SHOULD make such a feature either RECOMMENDED or REQUIRED in 
>    implementations."
> 
> 	should be changed to read:
>
>   "In order to be considered a SCSI transport, the iSCSI standard SHALL
>    comply with the requirements of the SCSI Architecture Model [SAM2] 
>    for a SCSI transport.  Any feature SAM2 requires in a valid 
>    transport mapping SHALL be specified by iSCSI.  The iSCSI document 
>    SHALL specify for each feature whether it is OPTIONAL, RECOMMENDED,
>    or REQUIRED to implement and/or use."

I like it.  Done.

> 
> 10)  INCORRECT REQUIREMENT FOR BRIDGES/ROUTERS  
> 
> 	In section 5.2, there is a paragraph treating gateways.
> 	I contend that all present SCSI transports are easily bridged
> 	BECAUSE they have chosen a very similar encapsulation format.
> 	The similar encapsulation format is that used by FCP, FCP-2,
> 	SBP-2, Packetized Parallel, and SSA.  The structure of 
> 	iSCSI packets and the protocol for transmitting them should
> 	be similar to the encapsulation formats used by those
> 	protocols.  Using this as a guideline, the following
> 	requirement is incorrect:
> 
>    "The iSCSI protocol MUST allow for the construction of gateways to 
>    other SCSI transports, including parallel SCSI [SPI-X] and to SCSI-
>    FCP[FCP, FCP-2].  It MUST be possible to construct "translating" 
>    gateways so that iSCSI hosts can interoperate with SCSI-X devices; 
>    so that SCSI-X devices can communicate over an iSCSI network; and so
>    that SCSI-X hosts can use iSCSI targets (where SCSI-X refers to
>    parallel SCSI, SCSI-FCP, or SCSI over any other transport).  This
>    requirement is implied by support for SAM-2, but is worthy of
>    emphasis. These are true application protocol gateways, and not just
>    bridge/routers.  The different standards have only the SCSI-3 
>    command set layer in common.  These gateways are not mere packet 
>    forwarders."
> 
> 	That paragraph should be reworded as follows:
> 
>    "The iSCSI protocol MUST allow for the construction of simple
>    gateways to other SCSI transports, including parallel SCSI and
>    packetized parallel SCSI as specified by SPI-4, and Fibre
>    Channel Protocol for SCSI as specified by FCP and FCP-2.
>    It MUST be possible to construct  
>    gateways so that iSCSI devices can use SCSI commands to communicate with
>    devices using other protocols.  This 
>    requirement is implied by support for SAM-2, but is worthy of 
>    emphasis. iSCSI SHALL use packet formats similar to the common
>    packet formats used by other packetized SCSI protocols where
>    possible to allow both simple bridging gateways and more
>    sophisticated translating gateways."

See separate email to group.

> 
> 11) CLARIFY CONGESTION QUESTION
> 
> 	Section 8.3 considers congestion in a rather strange way.
> 	It was my impression that a well-behaved TCP/IP connection
> 	provided appropriate congestion management, regardless of
> 	the information passed across it.  As a result, the following
> 	text in 8.3 should be removed:
> 
>    "The iSCSI protocol MUST be a good network citizen with proven 
>    congestion control (as defined in RFC 2309). In addition, iSCSI 
>    implementations MUST NOT use multiple connections as a means to 
>    avoid transport-layer congestion control."
> 
> 	and replaced with:
>
>    "iSCSI implementations MUST NOT use multiple connections as a means to
>    avoid transport-layer congestion control.  Standard TCP/IP
>    congestion management mechanisms operate normally while transporting
>    iSCSI information."

The first requirement is mandated by the IETF for new IP protocols and is
satisfied by our selection of TCP.  I think that's said in several
paragraphs in the document.

Thanks for your review!

Marj


From owner-ips@ece.cmu.edu  Thu Apr 26 23:16:59 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA00135
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 23:16:58 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3R1Oqe01726
	for ips-outgoing; Thu, 26 Apr 2001 21:24:52 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3R1NtA01694
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 21:23:56 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3R2Uw134062;
	Thu, 26 Apr 2001 19:30:58 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "KRUEGER,MARJORIE \(HP-Roseville,ex1\)" <marjorie_krueger@hp.com>,
        "Ips Reflector \(E-mail\)" <ips@ece.cmu.edu>
Subject: RE: iSCSI: Require iSCSI to use packet formats similar to FC, etc??
Date: Thu, 26 Apr 2001 18:21:02 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJKEOICGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
In-Reply-To: <6BD67FFB937FD411A04F00D0B74FE87802A09014@xrose06.rose.hp.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Marjorie,

As the rules change from technology to technology, there are issues involved
in this endeavor that will place into focus some potential problems.  I tend
to think that an independent delivery protocol could be developed.  So in
general, in my long winded way, I agree with Robert.

Doug


> Robert Snively proposes that the iSCSI Requirements document include the
> following requirement WRT gateway devices:
>
>    iSCSI MUST use packet formats similar to the common
>    packet formats used by other packetized SCSI protocols where
>    possible to allow both simple bridging gateways and more
>    sophisticated translating gateways.
>
> Comments?
>
> Marjorie Krueger
> Networked Storage Architecture
> Networked Storage Solutions Org.
> Hewlett-Packard
> tel: +1 916 785 2656
> fax: +1 916 785 0391
> email: marjorie_krueger@hp.com
>



From owner-ips@ece.cmu.edu  Thu Apr 26 23:17:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA00148
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 23:17:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3R1vvU03430
	for ips-outgoing; Thu, 26 Apr 2001 21:57:57 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3R1vGA03404
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 21:57:16 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel2.hp.com (Postfix) with ESMTP
	id 77251EC4; Thu, 26 Apr 2001 18:57:15 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id SAA21175;
	Thu, 26 Apr 2001 18:57:11 -0700 (PDT)
Message-ID: <3AE8D1CA.AE93CF4D@cup.hp.com>
Date: Thu, 26 Apr 2001 18:56:26 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
References: <C12569E0.00584FF0.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------B62388DE3A383F93F9A4A1E7"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------B62388DE3A383F93F9A4A1E7
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Julian,

Could you please clarify if the below issue is going to be addressed in
the iSCSI draft, as was discussed earlier.
(http://ips.pdl.cs.cmu.edu/mail/msg03155.html).

Specifically, is the spec going to address the issue of how initiators
can plug a hole in CmdSN sequence when they detect a ULP timeout and/or
choose not to use "command retry".
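
As a rough illustration of what "plugging a hole" could mean on the target
side, here is a sketch of an in-order delivery queue in which an abandoned
CmdSN is replaced by a placeholder.  The class and method names are
hypothetical, serial-number wraparound is ignored, and this is not the
mechanism defined by the draft.

```python
class OrderedCommandQueue:
    """Hypothetical target-side queue that delivers commands in CmdSN order."""

    def __init__(self, first_cmd_sn: int = 1):
        self.expected = first_cmd_sn
        self.pending = {}    # CmdSN -> command, or None for a placeholder
        self.delivered = []

    def receive(self, cmd_sn: int, command: str) -> None:
        self.pending[cmd_sn] = command
        self._drain()

    def plug(self, cmd_sn: int) -> None:
        """Record that cmd_sn will never arrive (e.g. the initiator aborted it)."""
        self.pending[cmd_sn] = None
        self._drain()

    def _drain(self) -> None:
        # Deliver in order; a placeholder advances the counter without delivery.
        while self.expected in self.pending:
            command = self.pending.pop(self.expected)
            if command is not None:
                self.delivered.append(command)
            self.expected += 1


queue = OrderedCommandQueue()
queue.receive(1, "READ")
queue.receive(3, "WRITE")            # CmdSN 2 is missing, so this one waits
before_plug = list(queue.delivered)
queue.plug(2)                        # initiator gives up on CmdSN 2
after_plug = list(queue.delivered)
```

Without the plug() step the command with CmdSN 3 would wait indefinitely,
which is the stall described in the original message below.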

Regards,
Santosh

julian_satran@il.ibm.com wrote:
> 
> Santosh,
> 
> You had a possible answer from Matt.  However I agree that we might want to
> address this in text although
> a solution similar to that suggested by Matt should be by now obvious to
> every implementer - the target should leave a placeholder in the input
> queue until the command after gets delivered.
> 
> Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 25/01/2001 21:38:04
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   IPS Reflector <ips@ece.cmu.edu>
> cc:
> Subject:  iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
> 
> Julian & All,
> 
> The draft is currently lacking a section that addresses abort I/O error
> recovery. Specifically, how are CmdSN bridging issues to be handled in
> the case where an initiator chooses not to retry an I/O [that failed on
> a connection failure that affects the delivery of the command to the
> target or a digest error at the target] because its ULP timer may have
> expired.
> 
> In such cases, the initiator can send an Abort Task to inform the target
> that the I.T.T is being aborted and its corresponding CmdSN can be
> bridged, instead of having the target stall infinitely in its attempt to
> enforce ordering and await the missing CmdSN [which isn't going to
> arrive, because the initiator did not retry the command].
> 
> Regards,
> Santosh
> 
>  - santoshr.vcf
--------------B62388DE3A383F93F9A4A1E7
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------B62388DE3A383F93F9A4A1E7--



From owner-ips@ece.cmu.edu  Thu Apr 26 23:23:44 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA00259
	for <ips-archive@odin.ietf.org>; Thu, 26 Apr 2001 23:23:44 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3R1Mqm01648
	for ips-outgoing; Thu, 26 Apr 2001 21:22:52 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3R1MGA01619
	for <ips@ece.cmu.edu>; Thu, 26 Apr 2001 21:22:17 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRVZN>; Thu, 26 Apr 2001 18:22:10 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B17347B@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "Ips Reflector (E-mail)" <ips@ece.cmu.edu>
Subject: RE: iSCSI: Require iSCSI to use packet formats similar to FC, etc
	??
Date: Thu, 26 Apr 2001 18:22:10 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> -----Original Message-----
> From: KRUEGER,MARJORIE (HP-Roseville,ex1)
> [mailto:marjorie_krueger@hp.com]
> Sent: Thursday, April 26, 2001 4:18 PM
> To: Ips Reflector (E-mail)
> Subject: iSCSI: Require iSCSI to use packet formats similar to FC, etc??
> 
> 
> Robert Snively proposes that the iSCSI Requirements document include the
> following requirement WRT gateway devices:
> 
>    iSCSI MUST use packet formats similar to the common
>    packet formats used by other packetized SCSI protocols where
>    possible to allow both simple bridging gateways and more
>    sophisticated translating gateways.
> 
> Comments?
> 

What is meant by 'similar'? Assuming the requested wording was added to the
requirements document, how would that impact the existing iSCSI spec?

Charles


From owner-ips@ece.cmu.edu  Fri Apr 27 02:28:05 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA17078
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 02:28:05 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3R4X8u11699
	for ips-outgoing; Fri, 27 Apr 2001 00:33:08 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3R4WSA11673
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 00:32:29 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id GAA207826
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 06:32:20 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id GAA194076
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 06:32:20 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.0018ECCA ; Fri, 27 Apr 2001 06:32:14 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3B.0018EA60.00@d12mta02.de.ibm.com>
Date: Fri, 27 Apr 2001 06:55:47 +0300
Subject: Re: iSCSI Parameter Negotiation
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Robert,

Yes - the target can "initiate" a negotiation (be an offering party).
"Driven" was meant to underline that a sequence should end with an initiator
setting the F bit and then the target
doing the same (except for errors).  I will try to fix the text to make it
clear.
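
One way to read the rule above is as a check on the last two PDUs of an
exchange.  The sketch below assumes each PDU is reduced to a (sender, f_bit)
pair; that representation and the function name are illustrative only, and
error cases are left out.

```python
def negotiation_complete(exchange) -> bool:
    """True if the exchange ends with initiator F=1 followed by target F=1.

    exchange: list of (sender, f_bit) tuples in the order the PDUs were sent.
    """
    if len(exchange) < 2:
        return False
    (prev_sender, prev_f), (last_sender, last_f) = exchange[-2], exchange[-1]
    return bool(prev_sender == "initiator" and prev_f
                and last_sender == "target" and last_f)


# Target answers F=0: it has more to say, so the initiator must send again.
still_open = negotiation_complete([("initiator", True), ("target", False)])

# The follow-on round closes the sequence.
closed = negotiation_complete([
    ("initiator", True), ("target", False),
    ("initiator", True), ("target", True),
])
```

This says nothing about which side may offer keys; it only captures who gets
to end the sequence.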

Julo

"Robert D. Russell" <rdr@mars.iol.unh.edu> on 25/04/2001 23:02:19

Please respond to "Robert D. Russell" <rdr@mars.iol.unh.edu>

To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:
Subject:  iSCSI Parameter Negotiation




Julo:

A simple question:
  During the Login Phase, or during an exchange of text and text-response
  messages during the Full Feature Phase, can the target introduce a
  key=value or key=list pair that has not previously been offered by the
  initiator?

Example:
  Suppose the initiator is happy with the default value of 8 for
  MaxConnections, and therefore does not offer this key in any of the
  login or text messages it sends to the target during the leading
  connection for the session.  Further suppose that the target cannot
  support 8 connections but only 1 connection and wants to inform the
  initiator of this fact.  The only way it can do this is to send a
  login-response or text-response that includes "MaxConnections=1", even
  though this is not a "response" to anything offered by the initiator.
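
  The example can be sketched in code, assuming the target is allowed to
  introduce a key the initiator never offered.  The function names, the
  dictionaries, and the take-the-smaller-value rule are assumptions made
  for illustration; only the key name and its default of 8 come from the
  example above.

```python
DEFAULTS = {"MaxConnections": 8}  # default quoted in the example above


def target_response(offered: dict, target_limits: dict) -> dict:
    """Hypothetical target logic that may introduce keys on its own.

    Assumes numeric keys are resolved by taking the smaller value.
    """
    response = {}
    for key, limit in target_limits.items():
        if key in offered:
            response[key] = min(offered[key], limit)  # ordinary reply
        elif limit < DEFAULTS[key]:
            response[key] = limit                     # unsolicited key
    return response


def to_text(pairs: dict) -> str:
    """Render key=value pairs, one per line."""
    return "\n".join(f"{key}={value}" for key, value in pairs.items())


# The initiator offers nothing; the target can only support one connection.
reply = target_response(offered={}, target_limits={"MaxConnections": 1})
wire = to_text(reply)
```

  If the target were forbidden from doing this, the reply would be empty and
  the initiator would go on assuming the default of 8.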

Reference:
  It seems possible to use parts of the current standard (v6) to justify
  either of 2 answers to this question: YES, or NO.  (And therefore, the
  real answer is currently a non-interoperable "maybe"!)

YES, the target can introduce a key=value or key=list pair in a
  login-response or a text-response even though the initiator did not
  previously offer this key in a login or text message.

  Justification:
  On page 16, the last paragraph of Section 1.2.4 on Text Mode Negotiation
  talks in terms of "the offering party" and "the responding party",
  presumably implying that either the initiator or the target can be the
  "offering party".

  On page 51, the last paragraph of section 2.9.1 about the F (Final) Bit
  says "if (the Final Bit is) set to 0 in a response to a text command with
  the Final Bit set to 1 it indicates that the target has more work to do
  (invites a follow-on text command)".  It is unclear what this "more work"
  might be, and it is also unclear whether that "follow-on text command"
  from the initiator could include the initiator's "response" to a key=value
  or key=list pair introduced by the target in this response.

NO, the target cannot introduce a key-value pair in a login-response or a
  text-response -- it can ONLY respond to keys explicitly offered by the
  initiator in the login or text message being responded to.

  Justification:
  On page 84, the second paragraph of Section 4.3 on Operational Parameter
  Negotiation During the Login Phase says "Operational parameter negotiation
  MAY involve several request-response exchanges (login and/or text) always
  driven by the initiator."  Further, on page 85, the second paragraph of
  Section 5 on Operational Parameter Negotiation Outside the Login Phase
  again reiterates "Operational parameter negotiation MAY involve several
  text request-response exchanges always driven by the initiator."
  Depending on what you understand by "driven", this could mean that only
  the initiator can offer keys, and the target can only respond to offered
  keys.
  (It could be that "driven" on both these pages refers only to the setting
  of the F bit, in which case it implies nothing about key=value pairs).

  Furthermore, on page 51, the first paragraph of Section 2.9.3 on Text
  Response Data says "The Text Response Data Segment contains responses in
  the same key=value format as the Text Command and with the same length and
  coding constraints.  Appendix C lists some basic Text Commands and their
  Responses."  This clearly says that a Text Response from the target
  can only contain responses.  The rest of section 2.9.3 also gives the
  impression that the target cannot introduce any new key=value pairs
  in a response, because it says what to do "if the Text Response does not
  contain a key that was requested", and that "Text response key=value
  pairs MUST be delivered in the same order as the command key=value pairs
  whenever applicable".  This section gives no indication that new key=value
  pairs are allowed in the response, and if they are, where they could be
  inserted in the ordering of key=value pairs in the response.

In general, except for the use of the terms "the offering party" and "the
  responding party" in Section 1.2.4, the whole tone of the standard reads
  as if only the initiator can offer key=value or key=list pairs, and the
  target can only respond with values for offered keys.  I say this because
  when you read section 2.8 on the Text Command, nowhere does it say anything
  about what to do if the target sends a response that offers a key that
  had not previously been requested by the initiator.  Likewise, section
  2.9 on the Text Response includes nothing to indicate that the keys in
  this response can be different from those offered in the previous Text
  Command.  Likewise, section 4.2 starting on page 82 describes primarily
  the Security and Integrity Negotiation, but frequently mentions iSCSI
  non-security parameters.  The last paragraph on page 82 explicitly
  says "The initiator sends a text command with an ordered list of the
  options it supports for each subject (authentication algorithm,
  iSCSI parameters and so on)", which implies that operational parameters
  can be offered in this way by the initiator.  The description of the
  target response on page 83 implies that the target can only select
  the appropriate choice to keys offered by the initiator.  There is no
  hint of the target offering the initiator any new keys.

  Even section 1.2.4 on page 15, which generally talks about "offering party"
  and "responding party", does not do so in paragraph 5:
  "If a target is not supporting, or not allowed to use with a specific
  initiator, any of the offered options, it may use the value "reject"."
  This clearly says that only targets can do this, not initiators, and
  would therefore seem to imply that targets cannot offer options.

Resolution:
  The standard should unambiguously state the answer to this question
  someplace.  I would suggest in section 1.2.4, but it would not hurt
  to reiterate it in other places as well, such as in section 4 on the
  Login Phase.  In addition:

  if the answer is "YES", then add some statements in sections 2.8 and 2.9
  to describe how to handle these offers from the target at the same level
  of detail as is now done in those sections for handling offers from the
  initiator.

  if the answer is "NO", then get rid of the terms "offering party" and
  "responding party" in section 1.2.4 (this is the only place in the
  standard where those terms are used), and add statements in sections 2.8
  and 2.9 to explicitly state that targets cannot offer new keys to an
  initiator.

  Furthermore, if the answer is "NO", then what should the target do in
  the example I gave at the start of this e-mail?


Bob Russell
InterOperability Lab
University of New Hampshire
rdr@iol.unh.edu
603-862-3774






From owner-ips@ece.cmu.edu  Fri Apr 27 02:31:33 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA17145
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 02:31:33 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3R4X5w11694
	for ips-outgoing; Fri, 27 Apr 2001 00:33:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3R4WJA11666
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 00:32:19 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id GAA263540
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 06:32:12 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta05_cs0 [9.165.222.239])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id GAA113282
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 06:32:12 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.0018E8AB ; Fri, 27 Apr 2001 06:32:04 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Black_David@emc.com
cc: ips@ece.cmu.edu
Message-ID: <C1256A3B.0018E62A.00@d12mta05.de.ibm.com>
Date: Thu, 26 Apr 2001 17:06:00 +0300
Subject: RE: iSCSI : EnableACA
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



The reason to ask for ACA is to reestablish the order intended by the
issuing initiator, including commands in flight at the time of the reset.
As the issuing initiators are not aware of the reset, the order in which
things will be executed for them may be important, and they may want to
reissue some commands ahead of the commands in flight.

Julo

Black_David@emc.com on 26/04/2001 10:30:40

Please respond to Black_David@emc.com

To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI : EnableACA




> What you are suggesting is that in all cases in which iSCSI would require
> the target to enter ACA (after a LU reset or a target reset) look forward
> in the queue and enter ACA only if it finds a CDB marked NACA? (and that
> includes commands in flight).

Ok, so why is iSCSI requiring ACA in addition to Unit
Attention for Clear Task Set, LU Reset and Target Reset?
In all cases, we're dealing with Initiators whose tasks
are cleared as a consequence of another Initiator
issuing the appropriate task management command.
SAM2 only requires Unit Attention.

--David

> -----Original Message-----
> From:   julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> Sent:   Thursday, April 26, 2001 2:22 AM
> To:     ips@ece.cmu.edu
> Subject:     Re: iSCSI : EnableACA
>
>
>
> Santosh,
>
> How about a third confusing response?
>
> We have introduced the EnableACA (per LU) to enable an initiator unwilling
> to use it to disable it at the target for this specific Initiator.
>
> NormACA - the enquiry bit - indicates support for ACA by the target device
> server but is a "read-only" bit.
>
> What you are suggesting is that in all cases in which iSCSI would require
> the target to enter ACA (after a LU reset or a target reset) look forward
> in the queue and enter ACA only if it finds a CDB marked NACA? (and that
> includes commands in flight).
>
> Or to enter ACA only when it finds such a command (sort of "soft ACA")?
>
> Both of them sound wrong and complex.
>
> Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 20/04/2001 22:11:12
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  iSCSI : EnableACA
>
>
>
>
> Julian,
>
> Would the following not satisfy the requirements for dealing with this
> ACA issue :
>
> 1) Initiators determine the target support for ACA through the NACA bit
> in the INQUIRY response. (Assuming iSCSI targets have implemented ACA in
> good faith, this would be supported.)
>
> 2) Initiators set the NACA bit in the CDBs of commands that need strong
> ordering. (This could be a small subset of the I/O traffic to one or
> more LUNs within the session and not required for all the I/Os in that
> session.)
>
> 3) Any exception condition on a SCSI I/O, for which the NACA bit was set
> results in ACA being established.
> Thus, ACA would only be applied if some I/O traffic that required strong
> ordering was affected by the exception condition.
>
> 4) Since the initiator is ACA capable based on its usage of the NACA
> bit, it should also be capable of performing the desired Clear ACA to
> recover from this condition.
>
> Such an approach would only apply ACA and its corresponding recovery
> when some strongly ordered I/O encountered an exception condition,
> rather than applying ACA on a session granularity.
>
> To summarize, the above approach allows :
> - ACA to be turned on/off for a subset of I/Os headed to a LUN
> - ACA based recovery only used where needed.
> - Keeps iSCSI ACA un-aware and rightly so, since this is a property of
> the SCSI ULP.
> - Avoids applying ACA recovery on a session granularity.
>
> What am I missing here (?). Why is an EnableACA needed ?
>
> - Santosh
>
>
> julian_satran@il.ibm.com wrote:
>
> > All references to
> > EnableACA are redundant and should be removed for the following reasons
> > :
> >
> > a) An initiator knows whether a target supports ACA from the NACA bit in
> > the INQUIRY response. When a target indicates support for ACA, the
> > initiator can use it by setting the NACA bit in the CDBs it sends. There
> > is NO need for any sort of negotiation of this behaviour above and
> > beyond what is already provided thru SCSI mechanisms.
> >
> > b) The ACA is a SCSI ULP concept and iSCSI should not be negotiating its
> > use or lack thereof. This is done thru the NACA bit in CDBs.
> >
> > c) (As a side note, the description of EnableACA on pg 127 refers to its
> > presence in the lun control mode page, but it is actually present in the
> > protocol specific port page.)
> >
> > d) ACA is a LUN-level (more an I/O level) control option. It MUST NOT be
> > negotiated on a per-session basis. SCSI allows initiators to request ACA
> > behaviour on a per I/O basis through the use of NACA bit in the CDBs.
> >
>
>  +++ We have required ACA to be supported by all new iSCSI targets and
>  several
>  actions require the target to enter ACA state.
>  It was brought to our attention that many initiators will not react
>  properly to a
>  target entering ACA state (not do the reset).
>  The EnableACA bit and key are meant to enable an initiator to control
> this
>  iSCSI specific ACA behaviour.  This behaviour is related to
> asynchronous
>  events and is not controlled by the NACA CDB bit.
>
>  ++++
>  - santoshr.vcf
>
>





From owner-ips@ece.cmu.edu  Fri Apr 27 02:31:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA17156
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 02:31:37 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3R4X2s11690
	for ips-outgoing; Fri, 27 Apr 2001 00:33:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3R4WRA11672
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 00:32:27 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id GAA132426;
	Fri, 27 Apr 2001 06:32:12 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta05_cs0 [9.165.222.239])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id GAA113280;
	Fri, 27 Apr 2001 06:32:11 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.0018E8F5 ; Fri, 27 Apr 2001 06:32:05 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Black_David@emc.com
cc: santoshr@cup.hp.com, ips@ece.cmu.edu
Message-ID: <C1256A3B.0018E62A.00@d12mta05.de.ibm.com>
Date: Thu, 26 Apr 2001 18:00:27 +0300
Subject: RE: iSCSI : digest error handling violates EMDP/InDataOrder
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



David,

I read Bob's mail and my interpretation is similar to his. However I think
that SPC explicitly states that different transports are free to interpret
and make use of this page as they find appropriate.

I have a hard time understanding Santosh's objection as it does not refer
to the reason the EMDP is there but to the way it is written in FCP (not
iSCSI).

I have trouble understanding why a target that is reissuing an R2T would
violate any rule. The ordering rules were meant to assure that targets can
do their work in a fashion that is optimal for them and they were not
meant to affect initiator operation - except perhaps to enable exception
checking.

The whole issue looks overblown to me.

Regards,
Julo

Black_David@emc.com on 24/04/2001 22:00:45

Please respond to Black_David@emc.com

To:   santoshr@cup.hp.com, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI : digest error handling violates EMDP/InDataOrder




Santosh's original issue was that R2Ts to request
retransmission of data (e.g., due to data CRC failure)
result in an initiator seeing what appear to be out
of order R2Ts due to the need to go back and get
the failed data retransmitted.

Santosh's original email said (in part):

> Section 6.2 (pg 80). Digest Errors
> -----------------------------------
> "If the error is a Data-Digest-Error in a Data-PDU, the target MUST
> either request retransmission with a R2T or answer with a Reject iSCSI
> PDU and abort the task."

> Problem :
> ---------
> On a Data digest error detected by a target, it MUST NOT request
> re-transmission of the data PDU thru an R2T if the session login key
> InDataOrder is set to yes.

This key has been renamed to DataOrder in -06, and if
it's set to "yes" as currently defined, then (IMHO)
Santosh appears to be correct.  The Initiator is not
going to be expecting the R2T offset to step back to
pick up the missing data, and hence the Target MUST
Reject and Abort.
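
A minimal sketch of this reading, under stated assumptions: the function
names and the action labels are hypothetical, and the rule encoded is only
the interpretation given above (with DataOrder set to "yes" the target may
not step the R2T offset back, so it can only Reject and Abort).

```python
def r2t_offsets_in_order(offsets):
    """True if every R2T offset is strictly greater than the one before it."""
    return all(later > earlier for earlier, later in zip(offsets, offsets[1:]))


def allowed_actions(data_order_yes: bool):
    """Actions open to a target after a Data-Digest-Error in a Data-PDU.

    Assumption: labels are illustrative, not protocol identifiers.
    """
    if data_order_yes:
        # A retransmission R2T would step the offset back, so it is excluded.
        return ("reject-and-abort",)
    return ("r2t-retransmit", "reject-and-abort")


# Hypothetical offsets: the data at 8192 failed its digest after the target
# had already asked for offset 16384, so a recovery R2T steps back.
print(r2t_offsets_in_order([0, 8192, 16384, 8192]))  # False
print(allowed_actions(True))                         # ('reject-and-abort',)
```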

Beyond this, I take Bob Snively's mail as a suggestion
that we ought to split iSCSI DataOrder from SCSI's
EMDP, as FCP considers those to be separate concepts.
That seems like a reasonable approach.

Comments?

--David

> -----Original Message-----
> From:   Santosh Rao [SMTP:santoshr@cup.hp.com]
> Sent:   Tuesday, April 24, 2001 2:30 PM
> To:     ips@ece.cmu.edu
> Cc:     David Black
> Subject:     Re: iSCSI : digest error handling violates EMDP/InDataOrder
>
>
> What is the final resolution on this EMDP issue. IMHO, iSCSI must retain
> the EMDP semantics as defined in FCP, SRP. i.e. It controls the order of
> the data across the entire SCSI command. (which includes sending R2T
> requests in order, if EMDP was set to 1).
>
> Some additional thoughts on this topic are :
>
> 1) Is it worth a finer granularity of control wherein the initiator be
> allowed to negotiate with the target that R2T requests be sent in-order
> , while not imposing any constraints on the Read Data PDU order.
>
> 2) Should control be provided over a "Random Relative Offset" feature,
> as Bob describes it below, or is it to be assumed that iSCSI Data PDUs
> will always be in-order within a sequence ?
>
> 3) Speaking of sequence, this terminology has been often used in this
> thread. Where is the notion of a sequence defined in iSCSI ? What is the
> definition of an iSCSI sequence.
>
> - Santosh
>
> Robert Snively wrote:
> >
> > Seems to me that there are some unclarities in this area as well.
> >
> > There are really two pieces being discussed as one:
> >
> >         EMDP (a SCSI functionality)
> >
> >         Random relative offset (a transport functionality)
> >
> > EMDP is used to allow a target to request or deliver its data
> > out of order.  This is used for things like passing a stripe
> > segment from a RAID data extent as soon as it has been accumulated,
> > rather than waiting until all previous parts of the RAID data
> > extent have also been accumulated and delivered.  It is also used
> > for things like "start anywhere" reading of a disk track.
> >
> > It says nothing about the ordering of data within a PDU or sequence
> > which must be ordered according to the rules of the protocol.  Fibre
> > Channel allows the data within a sequence to be transmitted in order
> > or out of order by using the login parameter "random relative offset".
> > Almost all devices choose to login and require "continuously increasing
> > relative offset".





From owner-ips@ece.cmu.edu  Fri Apr 27 02:32:08 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id CAA17171
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 02:32:08 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3R4X0S11686
	for ips-outgoing; Fri, 27 Apr 2001 00:33:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3R4WSA11674
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 00:32:29 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id GAA87690
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 06:32:21 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id GAA194078
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 06:32:20 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.0018EC8F ; Fri, 27 Apr 2001 06:32:14 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3B.0018EA60.00@d12mta02.de.ibm.com>
Date: Fri, 27 Apr 2001 07:28:35 +0300
Subject: Re: iSCSI: multiple sessions b/n a pair of WWUIs.
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



N&D is work in progress. We will get in sync. soon.  Julo

Santosh Rao <santoshr@cup.hp.com> on 26/04/2001 02:34:39

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL, IPS Reflector <ips@ece.cmu.edu>
cc:
Subject:  iSCSI: multiple sessions b/n a pair of WWUIs.




> To: ips@ece.cmu.edu
> Subject: Re: iSCSI: session login and ISID
> From: julian_satran@il.ibm.com
> Date: Tue, 10 Apr 2001 14:21:49 +0300


> WWUI can be presented during login phase (2.10.9 is correct and in-line
> with 1.2.7). Two sessions can have the same ISID but will have different
> TSID. The question of whether more than one session should be allowed
> between a pair of WWUIs is under debate.

> Julo


Julian,

There seems to be some disconnect between your comments above and the
name-disc draft. As per the name-disc draft Section 2(d) :

"There can be only one iSCSI  session with a given ISID between an iSCSI
Intiator Node and an iSCSI Target Node."

The iSCSI [&name-disc] drafts should explicitly state that ISID is
uniquely assigned for a given initiator. Similarly, the TSID is uniquely
assigned for a given target.

On the subject of multiple sessions for a given pair of WWUIs, this MUST
be a requirement. iSCSI must allow multiple sessions for a given pair of
WWUIs.

This is required because single-connection session models would like to
setup multiple sessions b/n initiator hosts and multi-ported targets and
export the multiple paths to LUs to upper layer wedge drivers like EMC
Powerpath, Veritas VxVm, etc.

Inability to establish multiple sessions b/n a pair of WWUIs implies
iSCSI layer will only export one path to the upper layer wedge drivers,
thereby, breaking such applications.

This also implies iSCSI would then take on all the responsibilities of
providing load balancing and fail-over capabilities and would require
the use of multi-connection sessions for that purpose.

By allowing multiple sessions for a given WWUI pair, iSCSI layer could
achieve equivalent functionality using single connection sessions and
would also not break existing wedge drivers.

Regards,
Santosh





From owner-ips@ece.cmu.edu  Fri Apr 27 03:13:46 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA17729
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 03:13:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3R5X3a14323
	for ips-outgoing; Fri, 27 Apr 2001 01:33:03 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3R5WOA14292
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 01:32:24 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id HAA22316;
	Fri, 27 Apr 2001 07:32:15 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id HAA236590;
	Fri, 27 Apr 2001 07:32:15 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.001E699A ; Fri, 27 Apr 2001 07:32:11 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
cc: matt_wakeley@agilent.com
Message-ID: <C1256A3B.001E693A.00@d12mta02.de.ibm.com>
Date: Fri, 27 Apr 2001 08:37:36 +0300
Subject: iSCSI - opcodes
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Fellow iSCSI fans,

Some were incensed by the lack of a "direction bit" in the opcodes in draft 06.
Here is an attempt at a new list (with a bit for direction back, as the LSB).
To gain some reserved space I've curtailed the vendor-specific codes to 4 in
each direction.


Please comment,
Julo

1.1.1.1   Opcode

   The Opcode indicates what type of iSCSI PDU the header encapsulates.

   The Opcodes are divided into two categories: initiator opcodes and
   target opcodes. Initiator opcodes are in PDUs sent by the initiators
   (request PDUs), and target opcodes are in PDUs sent by the target
   (response PDUs).

   Initiators MUST NOT use target opcodes and targets MUST NOT use
   initiator opcodes.

   Valid initiator opcodes defined in this specification are:


      0x00 NOP-Out (from initiator to target)
      0x02 SCSI Command (encapsulates a SCSI Command Descriptor Block)
      0x04 SCSI Task Management Command
      0x06 Login Command
      0x08 Text Command
      0x0a SCSI Data-out (for WRITE operations)
      0x0c Logout Command
      0x10 SNACK Request

   Valid target opcodes are:


      0x01 NOP-In (from target to initiator)
      0x03 SCSI Response (contains SCSI status and possibly sense
      information or other response information)
      0x05 SCSI Task Management Response
      0x07 Login Response
      0x09 Text Response
      0x0b SCSI Data-in (for READ operations)
      0x0d Logout Response
      0x11 Ready To Transfer (R2T - sent by target to initiator when it is
      ready to receive data from initiator)
      0x13 Asynchronous Message (sent by target to initiator to indicate
      certain special conditions)
      0x2f Reject

   Initiator opcodes 0x38, 0x3a, 0x3c and 0x3e and target opcodes 0x39,
   0x3b, 0x3d and 0x3f are vendor specific codes.
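
   A minimal sketch of how the proposed list could be used, assuming
   nothing beyond the values given above. The table and function names are
   hypothetical. It reads the direction from the least significant bit
   (0 = sent by initiator, 1 = sent by target) and looks up the name.

```python
INITIATOR_OPCODES = {
    0x00: "NOP-Out",
    0x02: "SCSI Command",
    0x04: "SCSI Task Management Command",
    0x06: "Login Command",
    0x08: "Text Command",
    0x0a: "SCSI Data-out",
    0x0c: "Logout Command",
    0x10: "SNACK Request",
}

TARGET_OPCODES = {
    0x01: "NOP-In",
    0x03: "SCSI Response",
    0x05: "SCSI Task Management Response",
    0x07: "Login Response",
    0x09: "Text Response",
    0x0b: "SCSI Data-in",
    0x0d: "Logout Response",
    0x11: "Ready To Transfer",
    0x13: "Asynchronous Message",
    0x2f: "Reject",
}

VENDOR_INITIATOR = {0x38, 0x3a, 0x3c, 0x3e}
VENDOR_TARGET = {0x39, 0x3b, 0x3d, 0x3f}


def sent_by_target(opcode: int) -> bool:
    """Direction bit: the LSB is 1 for target opcodes, 0 for initiator ones."""
    return bool(opcode & 0x01)


def describe(opcode: int) -> str:
    """Name an opcode and its direction according to the list above."""
    if opcode in VENDOR_INITIATOR | VENDOR_TARGET:
        name = "vendor specific"
    else:
        name = (INITIATOR_OPCODES.get(opcode)
                or TARGET_OPCODES.get(opcode)
                or "not defined in this list")
    sender = "target" if sent_by_target(opcode) else "initiator"
    return f"0x{opcode:02x} {name} (sent by {sender})"


print(describe(0x08))  # 0x08 Text Command (sent by initiator)
print(describe(0x2f))  # 0x2f Reject (sent by target)
print(describe(0x3e))  # 0x3e vendor specific (sent by initiator)
```

   With a direction bit, a receiver can reject a PDU travelling the wrong
   way with a single mask test before it consults the opcode table.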




From owner-ips@ece.cmu.edu  Fri Apr 27 03:17:32 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA17787
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 03:17:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3R5J2M13617
	for ips-outgoing; Fri, 27 Apr 2001 01:19:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3R5IpA13607
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 01:18:52 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id HAA332632
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 07:18:45 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id HAA201802
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 07:18:45 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.001D2BE1 ; Fri, 27 Apr 2001 07:18:37 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3B.001D2BBE.00@d12mta02.de.ibm.com>
Date: Fri, 27 Apr 2001 08:24:03 +0300
Subject: Re: iSCSI : EnableACA
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

The task management functions for iSCSI contain a description of the
behaviour I am talking about.
This was also subject to a long discussion on the list that included the
need for ACA behaviour for things like task-set full, busy etc. - not
considered for ACA. For the latter, T10 is handling making ACA behaviour
available.

Julo

Santosh Rao <santoshr@cup.hp.com> on 26/04/2001 20:27:34

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   Black_David@emc.com, ips@ece.cmu.edu
Subject:  Re: iSCSI : EnableACA




Black_David@emc.com wrote:
>
> > What you are suggesting is that in all cases in which iSCSI would require
> > the target to enter ACA (after a LU reset or a target reset) look forward
> > in the queue and enter ACA only if it finds a CDB marked NACA? (and that
> > includes commands in flight).

Julian,

I am not suggesting any such thing. ACA is only established when the
logical unit completes a command [that had the NACA bit set in its
control byte in the CDB] with a CHECK CONDITION status.

ACA is not established after a LU Reset or Target Reset. On the
contrary, a target reset or LU Reset would clear any existing CAC or
ACA.
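
A minimal sketch of the rule stated above, under stated assumptions: it
handles only fixed-length CDBs (6, 10, 12 or 16 bytes), where the CONTROL
byte is the last byte, and takes NACA as bit 2 of that byte. The function
names are hypothetical.

```python
NACA_MASK = 0x04        # NACA is bit 2 of the CDB CONTROL byte
CHECK_CONDITION = 0x02  # SCSI status code for CHECK CONDITION
GOOD = 0x00             # SCSI status code for GOOD


def naca_set(cdb: bytes) -> bool:
    """Report whether the NACA bit is set in a fixed-length CDB.

    Assumption: CONTROL is the last byte, which holds for 6-, 10-, 12-
    and 16-byte CDBs but not for variable-length CDBs.
    """
    if len(cdb) not in (6, 10, 12, 16):
        raise ValueError("only fixed-length CDBs are handled in this sketch")
    return bool(cdb[-1] & NACA_MASK)


def establishes_aca(cdb: bytes, status: int) -> bool:
    """ACA is established only when a NACA command ends in CHECK CONDITION."""
    return status == CHECK_CONDITION and naca_set(cdb)


# A 10-byte READ(10) CDB (opcode 0x28) with NACA set in the CONTROL byte.
read10_naca = bytes([0x28, 0, 0, 0, 0, 0, 0, 0, 1, NACA_MASK])
read10_plain = bytes([0x28, 0, 0, 0, 0, 0, 0, 0, 1, 0x00])

print(establishes_aca(read10_naca, CHECK_CONDITION))   # True
print(establishes_aca(read10_plain, CHECK_CONDITION))  # False
print(establishes_aca(read10_naca, GOOD))              # False
```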

- Santosh


>
> Ok, so why is iSCSI requiring ACA in addition to Unit
> Attention for Clear Task Set, LU Reset and Target Reset?
> In all cases, we're dealing with Initiators whose tasks
> are cleared as a consequence of another Initiator
> issuing the appropriate task management command.
> SAM2 only requires Unit Attention.
>
> --David
>
> > -----Original Message-----
> > From: julian_satran@il.ibm.com [SMTP:julian_satran@il.ibm.com]
> > Sent: Thursday, April 26, 2001 2:22 AM
> > To:   ips@ece.cmu.edu
> > Subject:      Re: iSCSI : EnableACA
> >
> >
> >
> > Santosh,
> >
> > How about a third confusing response?
> >
> > We have introduced the EnableACA (per LU) to enable an initiator unwilling
> > to use it to disable it at the target for this specific Initiator.
> >
> > NormACA - the enquiry bit - indicates support for ACA by the target device
> > server but is a "read-only" bit.
> >
> > What you are suggesting is that in all cases in which iSCSI would require
> > the target to enter ACA (after a LU reset or a target reset) look forward
> > in the queue and enter ACA only if it finds a CDB marked NACA? (and that
> > includes commands in flight).
> >
> > Or to enter ACA only when it finds such a command (sort of "soft ACA")?
> >
> > Both of them sound wrong and complex.
> >
> > Julo
> >
> > Santosh Rao <santoshr@cup.hp.com> on 20/04/2001 22:11:12
> >
> > Please respond to Santosh Rao <santoshr@cup.hp.com>
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  iSCSI : EnableACA
> >
> >
> >
> >
> > Julian,
> >
> > Would the following not satisfy the requirements for dealing with this
> > ACA issue :
> >
> > 1) Initiators determine the target support for ACA through the NACA bit
> > in the INQUIRY response. (Assuming iSCSI targets have implemented ACA
in
> > good faith, this would be supported.)
> >
> > 2) Initiators set the NACA bit in the CDBs of commands that need strong
> > ordering. (This could be a small subset of the I/O traffic to one or
> > more LUNs within the session and not required for all the I/Os in that
> > session.)
> >
> > 3) Any exception condition on a SCSI I/O, for which the NACA bit was
set
> > results in ACA being established.
> > Thus, ACA would only be applied if some I/O traffic that required
strong
> > ordering was affected by the exception condition.
> >
> > 4) Since the initiator is ACA capable based on its usage of the NACA
> > bit, it should also be capable of performing the desired Clear ACA to
> > recover from this condition.
> >
> > Such an approach would only apply ACA and its corresponding recovery
> > when some strongly ordered I/O encountered an exception condition,
> > rather than applying ACA on a session granularity.
> >
> > To summarize, the above approach allows :
> > - ACA to be turned on/off for a subset of I/Os headed to a LUN
> > - ACA based recovery only used where needed.
> > - Keeps iSCSI ACA un-aware and rightly so, since this is a property of
> > the SCSI ULP.
> > - Avoids applying ACA recovery on a session granularity.
> >
> > What am I missing here (?). Why is an EnableACA needed ?
> >
> > - Santosh
> >
> >
> > julian_satran@il.ibm.com wrote:
> >
> > > All references to
> > > EnableACA are redundant and should be removed for the following
> > > reasons:
> > >
> > > a) An initiator knows whether a target supports ACA from the NACA bit
> > > in the INQUIRY response. When a target indicates support for ACA, the
> > > initiator can use it by setting the NACA bit in the CDBs it sends.
> > > There is NO need for any sort of negotiation of this behaviour above
> > > and beyond what is already provided thru SCSI mechanisms.
> > >
> > > b) The ACA is a SCSI ULP concept and iSCSI should not be negotiating
> > > its use or lack thereof. This is done thru the NACA bit in CDBs.
> > >
> > > c) (As a side note, the description of EnableACA on pg 127 refers to
> > > its presence in the lun control mode page, but it is actually present
> > > in the protocol specific port page.)
> > >
> > > d) ACA is a LUN-level (more an I/O level) control option. It MUST NOT
> > > be negotiated on a per-session basis. SCSI allows initiators to
> > > request ACA behaviour on a per I/O basis through the use of NACA bit
> > > in the CDBs.
> > >
> >
> >  +++ We have required ACA to be supported by all new iSCSI targets and
> >  several actions require the target to enter ACA state.
> >  It was brought to our attention that many initiators will not react
> >  properly to a target entering ACA state (not do the reset).
> >  The EnableACA bit and key are meant to enable an initiator to control
> >  this iSCSI specific ACA behaviour.  This behaviour is related to
> >  asynchronous events and is not controlled by the NACA CDB bit.
> >
> >  ++++
> >  - santoshr.vcf
> >
> >
 - santoshr.vcf
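
[Editorial sketch, not part of the original thread.] The per-I/O control
discussed above lives in the CONTROL byte of each CDB, where SAM defines
NACA as bit 2. The target advertises support through the NormACA bit
(byte 3, bit 5) of its standard INQUIRY data, which is what the thread
calls "the NACA bit in the INQUIRY response". The helper names below are
hypothetical, and the sketch assumes fixed-length 6/10/12/16-byte CDBs,
whose CONTROL byte is the last byte.

```python
NACA_BIT = 0x04      # CONTROL byte, bit 2
NORMACA_BIT = 0x20   # standard INQUIRY data, byte 3, bit 5

def target_supports_aca(inquiry_data: bytes) -> bool:
    """True if the standard INQUIRY data has NormACA set."""
    return len(inquiry_data) > 3 and bool(inquiry_data[3] & NORMACA_BIT)

def set_naca(cdb: bytes, enable: bool = True) -> bytes:
    """Return a copy of a fixed-length CDB with the NACA bit set or cleared."""
    if len(cdb) not in (6, 10, 12, 16):
        raise ValueError("only fixed-length CDBs are handled in this sketch")
    out = bytearray(cdb)
    if enable:
        out[-1] |= NACA_BIT
    else:
        out[-1] &= ~NACA_BIT & 0xFF
    return bytes(out)

# READ(10) for one block at LBA 0, CONTROL byte initially zero.
read10 = bytes([0x28, 0, 0, 0, 0, 0, 0, 0, 1, 0])
strongly_ordered = set_naca(read10)          # this I/O requests ACA on error
ordinary = set_naca(strongly_ordered, False) # this I/O does not
```

Because the bit travels with each command, an initiator can request ACA
for only the subset of I/Os that need strong ordering, which is the point
made in the list above.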





From owner-ips@ece.cmu.edu  Fri Apr 27 08:17:09 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id IAA25457
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 08:17:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RB0Ih07028
	for ips-outgoing; Fri, 27 Apr 2001 07:00:18 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RAxoA06990
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 06:59:50 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id MAA70188;
	Fri, 27 Apr 2001 12:57:38 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta05_cs0 [9.165.222.239])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id MAA171370;
	Fri, 27 Apr 2001 12:57:39 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.003C338A ; Fri, 27 Apr 2001 12:57:33 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Randall Stewart <rrs@cisco.com>
cc: Chip Sharp <chsharp@cisco.com>, ips@ece.cmu.edu,
        Craig Partridge <craig@aland.bbn.com>,
        Jonathan Stone <jonathan@dsg.stanford.edu>
Message-ID: <C1256A3B.003C3260.00@d12mta05.de.ibm.com>
Date: Fri, 27 Apr 2001 14:02:53 +0300
Subject: Re: [Tsvwg] [SCTP checksum problems]
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Those ratios correspond to our experience and make Adler unattractive.

Julo




From owner-ips@ece.cmu.edu  Fri Apr 27 08:17:42 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id IAA25487
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 08:17:42 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RAgGs06325
	for ips-outgoing; Fri, 27 Apr 2001 06:42:16 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RAflA06316
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 06:41:48 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id MAA88700;
	Fri, 27 Apr 2001 12:41:40 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id MAA99742;
	Fri, 27 Apr 2001 12:41:40 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.003ABD77 ; Fri, 27 Apr 2001 12:41:35 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Santosh Rao <santoshr@cup.hp.com>
cc: matt_wakeley@agilent.com, ips@ece.cmu.edu
Message-ID: <C1256A3B.003ABC37.00@d12mta02.de.ibm.com>
Date: Fri, 27 Apr 2001 13:46:59 +0300
Subject: Re: iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



If the hole is in the command queue and the task is just aborted, the
response to the abort task will reveal that it did not reach its
destination.

The initiator can recover from there in several ways: clear the task set,
fill the hole with an iSCSI NOP, etc.
The latter, I recall, was suggested to you by Matt Wakeley a while ago.

None of them require any changes in the spec.

Julo


Santosh Rao <santoshr@cup.hp.com> on 27/04/2001 04:56:26

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu
Subject:  Re: iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery




Julian,

Could you please clarify if the below issue is going to be addressed in
the iSCSI draft, as was discussed earlier.
(http://ips.pdl.cs.cmu.edu/mail/msg03155.html).

Specifically, is the spec going to address the issue of how initiators
can plug a hole in CmdSN sequence when they detect a ULP timeout and/or
choose not to use "command retry".

Regards,
Santosh

julian_satran@il.ibm.com wrote:
>
> Santosh,
>
> You had a possible answer from Matt.  However I agree that we might want
> to address this in text, although
> a solution similar to that suggested by Matt should be by now obvious to
> every implementer - the target should leave a placeholder in the input
> queue until the command after gets delivered.
>
> Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 25/01/2001 21:38:04
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   IPS Reflector <ips@ece.cmu.edu>
> cc:
> Subject:  iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
>
> Julian & All,
>
> The draft is currently lacking a section that addresses abort I/O error
> recovery. Specifically, how are CmdSN bridging issues to be handled in
> the case where an initiator chooses not to retry an I/O [that failed on
> a connection failure that affects the delivery of the command to the
> target or a digest error at the target] because its ULP timer may have
> expired.
>
> In such cases, the initiator can send an Abort Task to inform the target
> that the I.T.T is being aborted and its corresponding CmdSN can be
> bridged, instead of having the target stall infinitely in its attempt to
> enforce ordering and await the missing CmdSN [which isn't going to
> arrive, because the initiator did not retry the command].
>
> Regards,
> Santosh
>
>  - santoshr.vcf
 - santoshr.vcf
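
[Editorial sketch, not part of the original thread.] The recovery
described in the reply at the top of this message can be pictured as a
target-side queue that hands commands to the SCSI layer strictly in CmdSN
order and stalls at a gap until something consumes the missing number.
The class, the NOP marker and the string commands are hypothetical names
for illustration; a real target would also handle immediate commands and
the command window, which are left out here.

```python
NOP = object()  # hypothetical marker: a NOP that only consumes a CmdSN

class CmdSNQueue:
    """Deliver commands in CmdSN order, holding back anything past a hole."""

    def __init__(self, exp_cmd_sn):
        self.exp_cmd_sn = exp_cmd_sn  # next CmdSN the target expects
        self.pending = {}             # out-of-order arrivals, keyed by CmdSN
        self.delivered = []           # what reached the SCSI layer, in order

    def receive(self, cmd_sn, command):
        self.pending[cmd_sn] = command
        # Drain every command that is now contiguous with exp_cmd_sn.
        while self.exp_cmd_sn in self.pending:
            cmd = self.pending.pop(self.exp_cmd_sn)
            if cmd is not NOP:
                self.delivered.append(cmd)
            self.exp_cmd_sn = (self.exp_cmd_sn + 1) % 2**32  # 32-bit wrap

queue = CmdSNQueue(exp_cmd_sn=1)
queue.receive(1, "READ A")
queue.receive(3, "WRITE C")        # CmdSN 2 was lost, so this one waits
stalled = list(queue.delivered)    # only "READ A" has been delivered
queue.receive(2, NOP)              # initiator plugs the hole
```

After the NOP arrives, the command held behind the hole is released and
the expected CmdSN moves past it, with no change to the ordering rule
itself.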





From owner-ips@ece.cmu.edu  Fri Apr 27 08:18:27 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id IAA25513
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 08:18:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RAJHJ05363
	for ips-outgoing; Fri, 27 Apr 2001 06:19:17 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RAIRA05344
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 06:18:27 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id MAA67360
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 12:18:18 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id MAA63954
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 12:18:18 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.00389775 ; Fri, 27 Apr 2001 12:18:07 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3B.003895B2.00@d12mta02.de.ibm.com>
Date: Fri, 27 Apr 2001 12:59:29 +0300
Subject: Re: iSCSI : target session login behaviour
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



SNACK rejected has been removed from the SCSI Response - Julo

Santosh Rao <santoshr@cup.hp.com> on 26/04/2001 21:41:20

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Stephen Bailey <steph@cs.uchicago.edu>, ips@ece.cmu.edu
cc:   David Black <Black_David@emc.com>
Subject:  Re: iSCSI : target session login behaviour




Stephen Bailey wrote:

> > As a side note, the iSCSI draft Status Class/Codes could do with a misc
> > error category along the lines of the FC "No additional Explanation"
> > reason explanation. This would help deal with error conditions that
> > don't come under the listed category.
>
> Personally, I think we should add categories for reasons we obviously
> see now, AND have a no additional reason.
>
> One peculiarity with what you're talking about above is that it should
> be a login response status code which expresses this rejection.  The
> login response set does not seem to have an `invalid parameter'
> response for cases when the request is somehow inconsistent.

Steph,

The iSCSI draft is unclear today about the exact mechanism through which
the target indicates "invalid parameters" in response to a received
command.

1) Should it use a REJECT PDU or respond with the appropriate response
for that PDU indicating a response code of "Invalid Parameters" and a
"first bad byte" offset that indicates which parameter the target
disliked?

IMO, an "Invalid Parameters" response in the response codes is
appropriate for SCSI Command and SCSI Task Mgmt commands. [coupled with
a "first bad byte" offset.]

This is missing today.

2) Also, as discussed above, a general "No Additional Explanation" type of
status code in the login response would cover the "misc" category.

3) There are cases of ambiguity in the usage of REJECT or SCSI Response.
Take the case of a "SNACK Reject". It is present in both the SCSI
Response (SNACK Rejected) and REJECT PDU reason code (Data SNACK
Reject). Which mechanism is to be used in this case ?

4) There is no "Status SNACK Rejected" in the REJECT PDU.

Regards,
Santosh
 - santoshr.vcf





From owner-ips@ece.cmu.edu  Fri Apr 27 08:19:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id IAA25557
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 08:19:28 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RAgKE06346
	for ips-outgoing; Fri, 27 Apr 2001 06:42:20 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sj-msg-core-4.cisco.com (sj-msg-core-4.cisco.com [171.71.163.10])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3R7m0A19681
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 03:48:00 -0400 (EDT)
Received: from mira-sjc5-2.cisco.com (mira-sjc5-2.cisco.com [171.71.163.16])
	by sj-msg-core-4.cisco.com (8.9.3/8.9.1) with ESMTP id AAA29856;
	Fri, 27 Apr 2001 00:43:38 -0700 (PDT)
Received: from cisco.com (ssh-sj1.cisco.com [171.68.225.134])
	by mira-sjc5-2.cisco.com (Mirapoint)
	with ESMTP id AEH07869 (AUTH rrs);
	Fri, 27 Apr 2001 00:43:29 -0700 (PDT)
Message-ID: <3AE9231F.59BF2BA4@cisco.com>
Date: Fri, 27 Apr 2001 02:43:27 -0500
From: Randall Stewart <rrs@cisco.com>
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.12 i386)
X-Accept-Language: en
MIME-Version: 1.0
To: tsvwg@ietf.org
CC: Chip Sharp <chsharp@cisco.com>, ips@ece.cmu.edu,
        Craig Partridge <craig@aland.bbn.com>,
        Jonathan Stone <jonathan@dsg.stanford.edu>
Subject: Re: [Tsvwg] [SCTP checksum problems]
References: <NEBBJGDMMLHHCIKHGBEJMEODCGAA.dotis@sanlight.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

All...

After a review of the numbers I realize I made a very
large error :-< (egg on face) in my analysis of the gprof numbers....
yikes!!

Below are the updated real numbers...


I will also be re-running these on a Pentium 100, and with Jonathan's
help we will tune up both the CRC32 and the Adler algorithms and add
some more, such as Fletcher.

Please consider these preliminary numbers until Jonathan and I can do
a more detailed analysis.


New numbers inserted below :)

Thanks

R
> 
> > Jim:
> >
> > I am glad you copied me on this.. since being at the bakeoff and with
> > the recent email/site problems I have.. I do not get IETF mail until
> > next week :0
> >
> > Now, some comments...
> >
> >
> > My major concern is that SCTP's checksum is weaker than TCP's. What the
> > upper layers do to defend against middle boxes etc they will need to
> > do anyway. I have just run a few numbers using the sctp_test_app in
> > the reference implementation.
> >
> > I did the following:
> >
> > Compiled the normal ref-imp with -O -pg
> >
> > started two endpoints on the same machine (FreeBSD, Intel/Sony Vaio
> > PCG-Z505JS); this of course has my SCTP kernel patches applied ...
> >
> > Now I set up an association between two endpoints and did
> >
> > bulk:100:0:1000000
> >
> > This transfers 1 Million 100 byte packets holding ascii data from one
> > endpoint to the other.
> >
> > I then captured the gprof information for this run.
> >
> > I did the same exact test after changing the checksum to crc32 and the
> > modified adler32 (16 bit sums).
> >
> > My results (not meant to show strength in catching errors but instead
> > performance of a software version of the sum) are as follows:
> >
> > Adler 32...
> >
> > Sender side and Receiver side average time spent in the checksum
> > routine per call was 121 nanoseconds.
> >
******* corrected number******
5.1 microseconds 

> > Adler 32 Modified
> >
> > Sender side and Receiver side average time spent in the checksum routine
> > per call was 90 nanoseconds.

******** corrected number******
3.9 microseconds.


> >
> > CRC32
> >
> > Sender side average time spent in the checksum calculation per packet
> > was 5.3 microseconds.
> >
> > Receiver side average time spent in the checksum calculation per packet
> > was 5.9 microseconds.
> >
> > I believe the differences seen in the CRC32 can be attributed to how
> > lucky the respective application is in finding the index table (the
> > ssh_crc32 found in FreeBSD) in the processor cache. If I run the CRC
> > comparison with just random data in a stand-alone program (very
> > unrealistic) crc32 outperforms all others, but this is because the
> > table gets completely pre-fetched into cache and is never pulled from
> > main memory... (I started here and then realized the only way to get
> > good information is to do it in a real implementation where a lot of
> > other code would run between crc calls).
> >
> > So on that note... I vote STRONGLY for Modified Adler32... i.e. same as
> > regular adler but make the quantities added be 16 bit sums...
> >
> > I think this will take care of the critical problem i.e. the weakness in
> > the SCTP checksum for small packets... Jonathan/Craig any comments or
> > questions...
> >
> > And yes... I am working on getting things run on a sparc box but the
> > only one we have here is a sparc20... so the numbers definitely can NOT
> > be compared to anything else...
> >
> > R
> >
> >
> >
> >
> > "WENDT,JIM (HP-Roseville,ex1)" wrote:
> > >
> > > I think this "SCTP checksum" thread spanning IPS and TSVWG was for
> > > discussion around whether or not iSCSI (running over SCTP) could
> > > forgo data integrity checking and transport-like functionality
> > > (retransmission, ack, etc) should SCTP provide a sufficiently strong
> > > check-code.
> > > If iSCSI were willing to completely trust SCTP end-to-end across a
> > > network fabric (including "middleboxes"), then that provides one
> > > reason for SCTP to adopt a stronger checksum or CRC.
> > > If iSCSI will still implement its own data integrity check-code above
> > > SCTP, then SCTP needs to make an independent decision on whether its
> > > current check-code is sufficiently strong for its target uses.
> > > Currently, iSCSI contains a data integrity check "digest" that can be
> > > negotiated end-to-end to be disabled on a per-connection basis.
> > >
> > > This discussion begs a few questions:
> > > - Are there clearly different classes of applications (in regards to
> > > their end-to-end data integrity strength needs)?
> > > - How are these application classes' end-to-end data integrity needs
> > > met in the future?  Is it SCTP, IPSec, application-specific protocol,
> > > a new protocol?
> > > - Is there a general need for strong end-to-end data integrity that
> > > could be provided for in a recommended generic manner?
> > > - Is iSCSI unique in being an "ultra-low error rate application" and
> > > should iSCSI then handle its own data integrity?
> > > - Should SCTP strengthen its checksum to meet the needs of a general
> > > class of data-critical applications, and/or provide a means for
> > > negotiating an optional stronger checksum?
> > > - What is the role of network infrastructure (router/middlebox
> > > hardware and software) in strengthening end-to-end data integrity?
> > >
> > > Data integrity for iSCSI over TCP is a separate issue. It is unlikely
> > > that we will be able to evolve TCP in a timely manner to utilize a
> > > stronger check-code given TCP's current wide scale deployment
> > > (although adding a stronger checksum/CRC to TCP would seem to be the
> > > best solution). So, something else has to be done either above or
> > > below TCP to provide the required level of iSCSI data integrity. Of
> > > course, if TCP's data integrity deficiency is impacting other
> > > data-critical applications, then it seems prudent to at least
> > > consider solving the problem generically.
> > >
> > > Jim
> > >
> > > > -----Original Message-----
> > > > From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> > > > Sent: Friday, April 20, 2001 1:02 AM
> > > > To: Chip Sharp
> > > > Cc: vince_cavanna@agilent.com; steph@cs.uchicago.edu; WENDT,JIM
> > > > (HP-Roseville,ex1); ips@ece.cmu.edu; tsvwg@ietf.org;
> > > > craig@aland.bbn.com; Jonathan.Wood@sun.com; xieqb@cig.mot.com;
> > > > jonathan@dsg.stanford.edu; rrs@cisco.com
> > > > Subject: RE: [Tsvwg] [SCTP checksum problems]
> > > >
> > > >
> > > >
> > > >
> > > > Chip,
> > > >
> > > > CRCs are not meant to protect against malicious middle boxes, but
> > > > rather against boxes that strip the strong link CRCs and
> > > > let the end-system rely on the weak TCP checksum.
> > > >
> > > > NAT boxes have good reason to recompute TCP checksums, but unless
> > > > they are malicious no reason to recompute iSCSI CRCs.
> > > >
> > > > And against malicious boxes iSCSI has cryptographic digests as
> > > > options.
> > > >
> > > > And I was not aware that we are discussing - in this forum - iSCSI
> > > > data integrity options.
> > > >
> > > > Julo
> > > >
> > > > Chip Sharp <chsharp@cisco.com> on 19/04/2001 18:53:53
> > > >
> > > > Please respond to Chip Sharp <chsharp@cisco.com>
> > > >
> > > > To:   vince_cavanna@agilent.com
> > > > cc:   steph@cs.uchicago.edu, vince_cavanna@agilent.com,
> > > > jim_wendt@hp.com,
> > > >       Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu, tsvwg@ietf.org,
> > > >       craig@aland.bbn.com, Jonathan.Wood@sun.com, xieqb@cig.mot.com,
> > > >       jonathan@dsg.stanford.edu, rrs@cisco.com
> > > > Subject:  RE: [Tsvwg] [SCTP checksum problems]
> > > >
> > > >
> > > >
> > > >
> > > > As was pointed out previously, middle box operations (such as NATs)
> > > > tend to creep up the protocol stack and into applications.
> > > >
> > > > Take SIP for example.  It includes IP addresses in its INVITE.  In
> > > > order to work across a NAT, the IP addresses it exchanges have to be
> > > > replaced with the NATed address.  One way is for the NAT to reach up
> > > > into the SIP INVITE and change the address.  This modifies the TCP
> > > > or UDP checksum.  Now SIP could have included its own integrity
> > > > check to protect against corrupted or modified TCP checksums, but
> > > > all that would have happened is that NATs would have changed the SIP
> > > > checksum in addition to the TCP/UDP checksum.
> > > >
> > > > Therefore, even if iSCSI included its own integrity check, if a
> > > > middle box is going to futz with iSCSI packets it will just strip
> > > > the check, do whatever it does and then recalculate the check.
> > > >
> > > > If this is what you want to protect against you will have to go to
> > > > some type of digital signature.
> > > >
> > > > At 12:22 PM 4/19/2001, vince_cavanna@agilent.com wrote:
> > > > >Stephen,
> > > > >
> > > > >I have to admit that I do not have much direct experience with middle
> > > > >boxes, BUT I did have fairly direct and recent experience with a
> > > > >popular NAT router from a popular vendor that was corrupting data in
> > > > >a network of Macintoshes.
> > > > >
> > > > >Apple's TCP was unaware of any problem as was Apple's Filing Protocol
> > > > >and most applications. The only applications that detected the
> > > > >corruption were those that performed an integrity check of their own.
> > > > >Those applications that assumed a reliable transport (and file
> > > > >system) were doomed to experiencing the indirect effects of the
> > > > >corruption at some later time. The corruption only happened when
> > > > >large amounts of data were transferred quickly.  The router vendor
> > > > >fixed the problem once; then fixed it again; then fixed it one last
> > > > >time before the data corruption finally "disappeared". After several
> > > > >weeks of continuous operation the router appeared to get into a mode
> > > > >where it was once again corrupting data. Power cycling the router
> > > > >"fixed it". The story apparently has not yet ended.
> > > > >
> > > > >I admit I may have given too much significance to this single
> > > > >incident that I have personally experienced but on the other hand I
> > > > >don't see the mechanisms in place to prevent this type of problem in
> > > > >the future other than the end to end integrity checks.
> > > > >
> > > > >Incidentally this incident changed my behavior when transferring data
> > > > >over a network. I will always use a compression utility; not only for
> > > > >reducing the data to be transmitted but to ensure the integrity of my
> > > > >data is protected end to end by the utility's CRC mechanism.
> > > > >
> > > > >I believe quite firmly that we DO need a mechanism to allow us to
> > > > >tolerate poor implementations of middle boxes and cannot simply hope
> > > > >that eventually such poor implementations will vanish, nor that we
> > > > >will have the luxury of being able to select only good
> > > > >implementations for every component of our storage network.
> > > > >
> > > > >Vince
> > > > >
> > > > >|-----Original Message-----
> > > > >|From: Stephen Bailey [mailto:steph@cs.uchicago.edu]
> > > > >|Sent: Wednesday, April 18, 2001 3:09 PM
> > > > >|To: CAVANNA,VICENTE V (A-Roseville,ex1)
> > > > >|Cc: 'WENDT,JIM (HP-Roseville,ex1)'; 'julian_satran@il.ibm.com';
> > > > >|ips@ece.cmu.edu; tsvwg@ietf.org; 'Craig Partridge'; Jonathan Wood;
> > > > >|xieqb@cig.mot.com; Jonathan Stone; Randall Stewart
> > > > >|Subject: Re: [Tsvwg] [SCTP checksum problems]
> > > > >|
> > > > >|
> > > > >|Vince,
> > > > >|
> > > > >|> I don't think iSCSI can be completely relieved of performing some
> > > > >|> data integrity checking as long as there exists the possibility of
> > > > >|> "middle boxes" opening up the transport protocol's packet and thus
> > > > >|> potentially invalidating any reliability guarantees the transport
> > > > >|> protocol makes.
> > > > >|
> > > > >|Any protection provided against this failure mode will only be
> > > > >|transient, so we must temper the desire to introduce such a
> > > > >|requirement with reality.
> > > > >|
> > > > >|Middleboxes can just as easily open up to the iSCSI layer and tinker
> > > > >|with the payload, as they do with other ULPs running on TCP (e.g.
> > > > >|HTTP) today.  Short of securing the connection, there is ALWAYS a
> > > > >|possibility of a middlebox terminating and reoriginating an integrity
> > > > >|check.  In case you think this is a farfetched scenario, I do get the
> > > > >|impression that there is a high level of interest in `actively
> > > > >|middling' iSCSI once the specs crystallize.  Who shaves the barber?
> > > > >|
> > > > >|An integrity check is not necessary as long as some lower layer
> > > > >|provides adequate integrity guarantees.
> > > > >|
> > > > >|Adding an integrity check above the transport layer is based upon
> > > > >|documentation of the presence of a lot of crappy network hardware and
> > > > >|software and analyses of the transport integrity check (TCP checksum)
> > > > >|which suggest it might not be adequately strong against some such
> > > > >|observed errors.
> > > > >|
> > > > >|I claim that the high incidence of `broken' (corruption introducing)
> > > > >|components is a result of a variety of factors which have shaped the
> > > > >|development of network components thus far.  The fact that integrity
> > > > >|checks are assumed to be performed in a network context substantially
> > > > >|lowers the bar for implementation correctness.
> > > > >|
> > > > >|In a storage (or CPU) context, these types of implementation errors
> > > > >|are a) more easily detectable (more fatal) b) more carefully avoided
> > > > >|during implementation (because of the cost of a potential fatal
> > > > >|error).  If network components magically reached the same `quality
> > > > >|level' as storage and CPU components, there might be no justification
> > > > >|for additional integrity checks above the transport.  Similarly if
> > > > >|the transport (or whatever lower layer) integrity checks are very
> > > > >|strong (e.g. IPSec), there is, again, no need for a higher level
> > > > >|integrity check.
> > > > >|
> > > > >|I am not disagreeing that we need an additional integrity check over
> > > > >|TCP in the present target environment, but I do disagree that iSCSI
> > > > >|will always need such a check, independently of what is running
> > > > >|beneath it.
> > > > >|
> > > > >|Steph
> > > > >|
> > > >
> > > >
> > > > -------------------------------------------------------------------
> > > > Chip Sharp                       Consulting Engineering
> > > > Cisco Systems
> > > > -------------------------------------------------------------------
> > > >
> > > >
> > > >
> > > >
> >
> > --
> > Randall R. Stewart
> > Systems & Solutions Engineering
> > Cisco Systems Inc.
> > rrs@cisco.com 815-342-5222 or 815-477-2127
> >
> > _______________________________________________
> > tsvwg mailing list
> > tsvwg@ietf.org
> > http://www1.ietf.org/mailman/listinfo/tsvwg
> >

-- 
Randall R. Stewart
Systems & Solutions Engineering
Cisco Systems Inc.
rrs@cisco.com 815-342-5222 or 815-477-2127
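
[Editorial sketch, not part of the original thread.] The "weakness for
small packets" mentioned in the quoted message can be seen directly from
the definition of Adler-32: the low 16-bit sum only adds bytes, so for a
100-byte ASCII packet it is at most 1 + 100*127 = 12701 and never reaches
the modulus 65521. The byte-wise function below follows the standard
definition and is checked against zlib.adler32. The word-wise variant is
only one possible reading of "make the quantities added be 16 bit sums";
its name, padding rule and byte order are assumptions, not the algorithm
that was actually measured.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16

def adler32_parts(data: bytes) -> tuple:
    """Return the two running sums (a, b) of standard Adler-32."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    return a, b

def adler32(data: bytes) -> int:
    """Standard Adler-32: b in the high half, a in the low half."""
    a, b = adler32_parts(data)
    return (b << 16) | a

def adler32_words(data: bytes) -> int:
    """Hypothetical variant: add big-endian 16-bit words instead of bytes."""
    if len(data) % 2:
        data += b"\x00"  # assumed zero padding for odd lengths
    a, b = 1, 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]
        a = (a + word) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    return (b << 16) | a

# A 100-byte packet of printable ASCII, like the test traffic described.
packet = bytes(range(32, 127)) + b"hello"
a, b = adler32_parts(packet)
print(f"byte-wise low sum uses {a.bit_length()} of 16 bits")
print(f"adler32 = {adler32(packet):#010x}, crc32 = {zlib.crc32(packet):#010x}")
print(f"word-wise variant = {adler32_words(packet):#010x}")
```

With words as the summed quantity, the low sum wraps around the modulus
even for short packets, which is the effect the proposal is after. This
sketch says nothing about relative speed; the timings in the message come
from gprof runs of the reference implementation.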


From owner-ips@ece.cmu.edu  Fri Apr 27 08:21:19 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id IAA25624
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 08:21:19 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RAfG406261
	for ips-outgoing; Fri, 27 Apr 2001 06:41:16 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from hotmail.com (oe42.law10.hotmail.com [64.4.14.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3R4BnA10569
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 00:11:49 -0400 (EDT)
Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC;
	 Thu, 26 Apr 2001 21:11:42 -0700
X-Originating-IP: [198.133.22.68]
From: "Julian Satran" <Julian_Satran@hotmail.com>
To: <ips@ece.cmu.edu>
Subject: Re: iSCSI immediate Delivery Behaviour
Date: Fri, 27 Apr 2001 07:16:43 +0300
MIME-Version: 1.0
Content-Type: multipart/alternative;	boundary="----=_NextPart_000_0050_01C0CEEA.031375D0"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4133.2400
Disposition-Notification-To: "Julian Satran" <Julian_Satran@hotmail.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Message-ID: <OE42yWVLEFN03a5sqMa000013e1@hotmail.com>
X-OriginalArrivalTime: 27 Apr 2001 04:11:42.0608 (UTC) FILETIME=[2A63D900:01C0CED0]
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.

------=_NextPart_000_0050_01C0CEEA.031375D0
Content-Type: text/plain;
	charset="windows-1255"
Content-Transfer-Encoding: quoted-printable

Charles,

We have a mail blackout - they are moving some boxes to a new building.
Here is what I wrote you.

Julo

To: ips@ece.cmu.edu

cc:=20

From: Julian Satran/Haifa/IBM@IBMIL

Subject: Re: iSCSI: Immediate Delivery Behavior=20

Charles,

There is an explicit statement that iSCSI uses TCP and this implies that =
on any given connection nothing can be delivered out of order.

However if there is a hole in the iSCSI queue (e.g., due to a digest =
error) immediate commands can still be delivered out of order.

In 7.3 there is a description of how to handle task management to cover =
for those cases.

Regards,

Julo

Please respond to Charles Monia <cmonia@NishanSystems.com>=20

To: "Ips (E-mail)" <ips@ece.cmu.edu>

cc: Charles Monia <cmonia@NishanSystems.com>=20

Subject: iSCSI: Immediate Delivery Behavior






Hi:

The behavior for immediate commands seems ambiguous and possibly =
needlessly

complex.

Rev 06 says the following regarding ordered delivery to the SCSI layer:

"Except for the commands marked for immediate delivery the iSCSI

target layer MUST deliver the commands to the SCSI target layer in

the order specified by CmdSN. Commands marked for immediate delivery

may be handed over by the iSCSI target layer to the SCSI target layer

as soon as detected. iSCSI may avoid delivering some command to the

SCSI layer if so required by some prior SCSI or iSCSI action (e.g.,

clear task set Task Management request received before all the

commands it was supposed to act on)."

In a non-striped session consisting of one TCP/IP connection, the above

could be interpreted to allow the delivery of an immediate command =
before

other partly received commands that were previously issued. As a result, =
an

operation, such as an abort task, might bypass the command to be aborted =
--

even if both were sent on the same connection.

Assuming that's true, I believe a useful simplification is to require =
that

all traffic flowing over a given TCP/IP connection be delivered to the =
SCSI

layer in the order received over that connection. In a striped session, =
an

immediate command might therefore leapfrog commands on other connections =
but

would never bypass commands on the same connection. In my opinion, that

simplifies the problem of properly purging commands and stale PDUs in =
the

wake of a task management operation.

Charles

Charles Monia

Senior Technology Consultant

Nishan Systems

email: cmonia@nishansystems.com

voice: (408) 519-3986

fax: (408) 435-8385



------=_NextPart_000_0050_01C0CEEA.031375D0--


From owner-ips@ece.cmu.edu  Fri Apr 27 08:21:51 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id IAA25658
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 08:21:51 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RAJFU05359
	for ips-outgoing; Fri, 27 Apr 2001 06:19:15 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RAISA05346
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 06:18:28 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id MAA240356
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 12:18:18 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id MAA63956
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 12:18:18 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.00389739 ; Fri, 27 Apr 2001 12:18:07 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3B.0038959C.00@d12mta02.de.ibm.com>
Date: Fri, 27 Apr 2001 13:21:47 +0300
Subject: Re: iSCSI: Require iSCSI to use packet formats similar to FC, etc
	??
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



It doesn't add too much (as Robert had so many misgivings about the
performance statements).
I could not find any other two SCSI transports that have similar formats
(even on serial buses - SBP, SST and FCP are very dissimilar).  As for
gateways - remapping is the least of their issues; state is the major one,
and we went a long way to minimize this for the only other relevant
serial-SCSI protocol - FCP.  I don't see why we should make this a
requirement.



Julo

"KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com> on
27/04/2001 02:18:21

Please respond to "KRUEGER,MARJORIE (HP-Roseville,ex1)"
      <marjorie_krueger@hp.com>

To:   "Ips Reflector (E-mail)" <ips@ece.cmu.edu>
cc:
Subject:  iSCSI: Require iSCSI to use packet formats similar to FC, etc??




Robert Snively proposes that the iSCSI Requirements document include the
following requirement WRT gateway devices:

   iSCSI MUST use packet formats similar to the common
   packet formats used by other packetized SCSI protocols where
   possible to allow both simple bridging gateways and more
   sophisticated translating gateways.

Comments?

Marjorie Krueger
Networked Storage Architecture
Networked Storage Solutions Org.
Hewlett-Packard
tel: +1 916 785 2656
fax: +1 916 785 0391
email: marjorie_krueger@hp.com





From owner-ips@ece.cmu.edu  Fri Apr 27 08:27:20 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id IAA25910
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 08:27:20 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RAJKB05366
	for ips-outgoing; Fri, 27 Apr 2001 06:19:20 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RAIQA05343
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 06:18:26 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id MAA113976
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 12:18:17 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id MAA63952
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 12:18:17 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.003897B5 ; Fri, 27 Apr 2001 12:18:08 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3B.00389630.00@d12mta02.de.ibm.com>
Date: Fri, 27 Apr 2001 12:39:57 +0300
Subject: RE: iSCSI-06 SCSI Cmd typo
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Chuck - here it is:
1.1  SCSI Command

   Byte /    0       |       1       |       2       |       3       |
      /              |               |               |               |
     |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
     +---------------+---------------+---------------+---------------+
    0|X|I| 0x01      |F|R|W|0 0|ATTR | Reserved      | CRN or Rsvd   |
     +---------------+---------------+---------------+---------------+
    4|TotalAHSLength | DataSegmentLength                             |
     +---------------+---------------+---------------+---------------+
    8| Logical Unit Number (LUN)                                     |
     +                                                               +
   12|                                                               |
     +---------------+---------------+---------------+---------------+
   16| Initiator Task Tag                                            |
     +---------------+---------------+---------------+---------------+
   20| Expected Data Transfer Length                                 |
     +---------------+---------------+---------------+---------------+
   24| CmdSN                                                         |
     +---------------+---------------+---------------+---------------+
   28| ExpStatSN or ExpDataSN                                        |
     +---------------+---------------+---------------+---------------+
   32/ SCSI Command Descriptor Block (CDB)                           /
    +/                                                               /
     +---------------+---------------+---------------+---------------+
   48| Digests if any...                                             |
     +---------------+---------------+---------------+---------------+
     / DataSegment - Command Data (optional)                         /
    +/                                                               /
     +---------------+---------------+---------------+---------------+
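
As a reading aid, the layout above can be decoded as sketched below. This is
a minimal sketch that assumes only what the diagram shows: a 48-byte header in
network byte order with a 16-byte CDB at offset 32. The ScsiCommandHeader type,
its field names and parse_scsi_command are hypothetical; the flag bits keep
the diagram's letters because their meanings are not defined in this message.

```python
import struct
from typing import NamedTuple

# B=byte 0, B=byte 1, x=reserved byte 2, B=CRN, B=TotalAHSLength,
# 3s=DataSegmentLength, 8s=LUN, four 32-bit words, 16s=CDB.
BHS = struct.Struct(">BBxBB3s8sIIII16s")


class ScsiCommandHeader(NamedTuple):
    x_bit: bool
    i_bit: bool
    opcode: int
    f_bit: bool
    r_bit: bool
    w_bit: bool
    attr: int
    crn: int
    total_ahs_length: int
    data_segment_length: int
    lun: bytes
    initiator_task_tag: int
    expected_data_transfer_length: int
    cmd_sn: int
    exp_stat_or_data_sn: int
    cdb: bytes


def parse_scsi_command(pdu: bytes) -> ScsiCommandHeader:
    """Decode the first 48 bytes of a SCSI Command PDU per the diagram."""
    (b0, b1, crn, ahs_len, dsl, lun, itt, edtl,
     cmd_sn, exp_sn, cdb) = BHS.unpack(pdu[:BHS.size])
    return ScsiCommandHeader(
        x_bit=bool(b0 & 0x80),
        i_bit=bool(b0 & 0x40),
        opcode=b0 & 0x3F,
        f_bit=bool(b1 & 0x80),
        r_bit=bool(b1 & 0x40),
        w_bit=bool(b1 & 0x20),
        attr=b1 & 0x07,
        crn=crn,
        total_ahs_length=ahs_len,
        data_segment_length=int.from_bytes(dsl, "big"),  # 24-bit field
        lun=lun,
        initiator_task_tag=itt,
        expected_data_transfer_length=edtl,
        cmd_sn=cmd_sn,
        exp_stat_or_data_sn=exp_sn,
        cdb=cdb,
    )


# Build a sample header: I bit set, opcode 0x01, F and W set, ATTR = 1.
sample = BHS.pack(0x41, 0xA1, 7, 0, (512).to_bytes(3, "big"), bytes(8),
                  0x1234, 512, 10, 3, bytes([0x2A]) + bytes(15))
hdr = parse_scsi_command(sample)
print(BHS.size, hdr.opcode, hdr.cmd_sn, hdr.data_segment_length)
```

With the CDB widened to 16 bytes, the fixed part of the header ends at byte 47,
which matches the "Digests if any" row starting at offset 48.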




Chuck Micalizzi <chuck.micalizzi@qlogic.com> on 26/04/2001 20:57:55

Please respond to Chuck Micalizzi <chuck.micalizzi@qlogic.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu
Subject:  RE: iSCSI-06 SCSI Cmd typo




Julian,

     I'm confused as to what field was removed in order to make
     the CDB field 16 bytes and locate the CmdSN at offset 24.
     Can you send the correct layout of the SCSI Command PDU
     when you get time?

Thank You

chuck micalizzi

-----Original Message-----
From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
Sent: Tuesday, April 24, 2001 2:18 AM
To: dfsmith@almaden.ibm.com
Cc: ips@ece.cmu.edu
Subject: Re: iSCSI-06 SCSI Cmd typo




Thanks Daniel,

It's fixed. CmdSN is at 24 and  the following are shifted.

Julo

Daniel Smith <dfsmith@almaden.ibm.com> on 24/04/2001 03:16:31

Please respond to Daniel Smith <dfsmith@almaden.ibm.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:
Subject:  iSCSI-06 SCSI Cmd typo




In section 2.3 SCSI Command...

the table shows bytes 36--47 reserved for the CDB (12 bytes).  But
the description 2.3.6 says it's 16 bytes.

(I'd prefer 16 bytes---I'm not a big fan of Stat/DataSN.)

This document is getting big---but the latest version seems to be holding
up well as I read through it.  Good work.

Daniel
--
IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120-6099, USA
K65B/C2 Phone: +1(408)927-2072 Fax: +1(408)927-3010 Home: +1(408)227-5786







From owner-ips@ece.cmu.edu  Fri Apr 27 10:00:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id KAA29877
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 10:00:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RBuLx09316
	for ips-outgoing; Fri, 27 Apr 2001 07:56:21 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from spdmraab.compuserve.com (ds-img-rel-2.compuserve.com [149.174.206.155])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RBthA09297
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 07:55:43 -0400 (EDT)
Received: (from mailgate@localhost)
	by spdmraab.compuserve.com (8.9.3/8.9.3/SUN-REL-1.3) id HAA13942
	for ips@ece.cmu.edu; Fri, 27 Apr 2001 07:55:38 -0400 (EDT)
Received: from compuserve.com (dal-tgn-tlv-vty27.as.wcom.net [216.192.236.27])
	by spdmraab.compuserve.com (8.9.3/8.9.3/SUN-REL-1.3) with ESMTP id HAA13900;
	Fri, 27 Apr 2001 07:55:33 -0400 (EDT)
Message-ID: <3AE95F1C.D5950596@compuserve.com>
Date: Fri, 27 Apr 2001 06:59:24 -0500
From: Ralph Weber <ralphoweber@compuserve.com>
Reply-To: ENDL_TX@computer.org
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD NSCPCD47  (Win95; I)
X-Accept-Language: en,pdf
MIME-Version: 1.0
To: julian_satran@il.ibm.com
CC: ips@ece.cmu.edu
Subject: Re: iSCSI : EnableACA
References: <C1256A3B.001D2BBE.00@d12mta02.de.ibm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,

Regarding your discussion with Santosh...

> The task management functions for iSCSI contain a
> description the behaviour I am talking about. This
> was also subject to a long discussion on the list
> that included the need for ACA behaviour for things
> like task-set full, busy etc. - not considered for
> ACA. For the later T10 is handling making ACA
> behaviour available.
>
My question is... Exactly what are the cases where
T10 is NOT considering making ACA behavior available so
that you believe the EnableACA function is necessary in
iSCSI?

My belief is that every behavior that EnableACA covers
but T10 does not needs to be taken to T10 to see if
they can find a better way to extend ACA coverage to
that area.

Thanks.

Ralph...




From owner-ips@ece.cmu.edu  Fri Apr 27 11:55:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA06450
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 11:55:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RDXOT13635
	for ips-outgoing; Fri, 27 Apr 2001 09:33:24 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RDX9A13624
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 09:33:09 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id JAA04064; Fri, 27 Apr 2001 09:29:31 -0400 (EDT)
Message-ID: <3AE973CD.3121BEDE@cisco.com>
Date: Fri, 27 Apr 2001 08:27:41 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
CC: "Ips Reflector (E-mail)" <ips@ece.cmu.edu>
Subject: Re: iSCSI: Require iSCSI to use packet formats similar to FC, etc??
References: <6BD67FFB937FD411A04F00D0B74FE87802A09014@xrose06.rose.hp.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Marjorie-

The word "similar" is pretty vague.  If it means that iSCSI should
include the same types of information that the other transports
provide, that's probably required just to get it working.  If it
means that iSCSI messages have to look like Fibre Channel frames
(I don't really think that's what you mean), that would be a severe
and inappropriate limitation.

I would object to the use of the word "packet", since iSCSI is
built on top of a stream, and does not have packets.

As far as gateways are concerned, I don't see any reason to
be overly restrictive on formats; they will have to be ripped
apart and rebuilt between various SCSI transports anyway.  As
long as the right information is available in any form, there
should be no problem.  Having cobbled up a gateway myself, I can say
message formats are the least of our challenges.

I think that the other requirements already specify (indirectly,
perhaps) that the right kind of information has to be present, so
I would not want to see this requirement added.

--
Mark

"KRUEGER,MARJORIE (HP-Roseville,ex1)" wrote:
> 
> Robert Snively proposes that the iSCSI Requirements document include the
> following requirement WRT gateway devices:
> 
>    iSCSI MUST use packet formats similar to the common
>    packet formats used by other packetized SCSI protocols where
>    possible to allow both simple bridging gateways and more
>    sophisticated translating gateways.
> 
> Comments?
> 
> Marjorie Krueger
> Networked Storage Architecture
> Networked Storage Solutions Org.
> Hewlett-Packard
> tel: +1 916 785 2656
> fax: +1 916 785 0391
> email: marjorie_krueger@hp.com

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Fri Apr 27 13:02:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA08572
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 13:02:06 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3REgWA17335
	for ips-outgoing; Fri, 27 Apr 2001 10:42:32 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3REfhA17286
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 10:41:44 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id QAA157704
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 16:41:32 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id QAA69270
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 16:41:33 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.0050B2D5 ; Fri, 27 Apr 2001 16:41:26 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3B.0050B16D.00@d12mta02.de.ibm.com>
Date: Fri, 27 Apr 2001 17:46:50 +0300
Subject: Re: iSCSI : EnableACA
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Ralph,

I am not sure that I understand the question.

After the reset task management functions, iSCSI requires the target to enter
Unit Attention for all other initiators connected to the same target (or LU)
and, after reporting the Unit Attention, enter ACA.

You and Ed pointed out that many initiators will be unable to handle this
(will not reset the condition).

EnableACA is a hack intended to allow an initiator to control the target
behavior.

NormACA and the NACA CDB bit have similar purposes but NormACA is a
"read-only" flag and the ACA condition is not created in this case as a
result of a command execution error but as a result of an action by an
external factor (another initiator).

This situation is probably more general - and some other transports dealing
with commands in flight might experience a similar problem, but up to now
SPC has no way of handling this situation.

If Ed is going to cover this within the "extended use of ACA" we discussed
for SCSI commands rejected with busy and task set full and for Unit
Attention we might have to coordinate.

Regards,
Julo



Ralph Weber <ralphoweber@compuserve.com> on 27/04/2001 14:59:24

Please respond to ENDL_TX@computer.org

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu
Subject:  Re: iSCSI : EnableACA




Julian,

Regarding your discussion with Santosh...

> The task management functions for iSCSI contain a
> description the behaviour I am talking about. This
> was also subject to a long discussion on the list
> that included the need for ACA behaviour for things
> like task-set full, busy etc. - not considered for
> ACA. For the later T10 is handling making ACA
> behaviour available.
>
My question is... Exactly what are the cases where
T10 is NOT considering making ACA behavior available so
that you believe the EnableACA function is necessary in
iSCSI?

My belief is that every behavior that EnableACA covers
but T10 does not needs to be taken to T10 to see if
they can find a better way to extend ACA coverage to
that area.

Thanks.

Ralph...







From owner-ips@ece.cmu.edu  Fri Apr 27 13:05:25 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA08645
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 13:05:20 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RFKVg19621
	for ips-outgoing; Fri, 27 Apr 2001 11:20:31 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RFK8A19590
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 11:20:08 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id LAA17993; Fri, 27 Apr 2001 11:20:01 -0400 (EDT)
Message-ID: <3AE98DB0.89AABC31@cisco.com>
Date: Fri, 27 Apr 2001 10:18:08 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: "BURBRIDGE,MATTHEW (HP-UnitedKingdom,ex2)" <matthew_burbridge@hp.com>
CC: "'ips@ece.cmu.edu'" <ips@ece.cmu.edu>
Subject: Re: iSCSI : target session login behaviour
References: <0B9A57FF1D57D411B47500D0B73E5CC101E7A699@dickens.bri.hp.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Correct.  We had intended this behavior in the name-disc draft,
and will add it next time.

--
Mark

"BURBRIDGE,MATTHEW (HP-UnitedKingdom,ex2)" wrote:
> 
> Hari,
> 
> In response to your email
> 
> Either:
> 
> The two HBAs are operating independently (i.e. sessions do not span HBAs)
> and therefore should have some form of differentiation: e.g. a different
> iSCSI Initiator name, or they do need to have some form of co-operation at
> the iSCSI layer(s)/configuration to ensure uniqueness of the ISID as you
> have suggested.
> 
> Or:
> 
> The two HBAs are operating together (sessions can span HBAs) in which case
> there is effectively only one iSCSI Layer and so the ISID will be different
> when the initiator creates a new independent session, or the ISID and TSID
> is the same and the initiator is creating a new connection within the same
> session albeit on a different HBA.
> 
> Matthew Burbridge
> NIS-Bristol
> Hewlett Packard
> Telnet: 312 7010
> E-mail: matthewb@bri.hp.com
> 
> > -----Original Message-----
> > From: Mudaliar, Hari [mailto:Hari_Mudaliar@adaptec.com]
> > Sent: 26 April 2001 00:57
> > To: 'Santosh Rao'; Mudaliar, Hari
> > Cc: IPS Reflector
> > Subject: RE: iSCSI : target session login behaviour
> >
> >
> > Santosh,
> >       I get your point. But what if there is more than one iSCSI Host bus
> > adapter in a system? The Initiator Name will be the same and ISID MAY turn
> > out to be the same (unless the ISIDs are apportioned between the initiators
> > through some configuration method). This assumes that multiple sessions can
> > exist between one initiator system (containing multiple iSCSI off-load
> > engines/HBAs) and a target.
> >
> > - Hari
> >
> > -----Original Message-----
> > From: Santosh Rao [mailto:santoshr@cup.hp.com]
> > Sent: Wednesday, April 25, 2001 4:18 PM
> > To: Mudaliar, Hari
> > Cc: IPS Reflector
> > Subject: Re: iSCSI : target session login behaviour
> >
> >
> > "Mudaliar, Hari" wrote:
> >
> > >         I am assuming that you are referring to the creation of a new
> > > session with TSID=0 in your example below. Take the case of an initiator I1
> > > who has established a session with a target with an ISID=ISID1. What if a
> > > second initiator I2 tries to login to the same target with ISID1? The target
> > > cannot decide to logout the first initiator (who already has a session
> > > established with ISID1) as suggested by you.
> >
> > Hari,
> >
> > You may want to take a second look at my mail. It specifically refers to
> > the problem in the context of a given (Initiator Name, ISID). Your
> > example above does not fall under that category. A 2nd initiator using
> > the same ISID would have a different Initiator Name. (a.k.a initiator
> > WWUI).
> >
> > The problem raised is in the context of an existing session for a given
> > (Initiator Name, ISID). How does a target deal with a second session
> > login received for the same (Initiator Name, ISID) with a NULL TSID ?
> >
> > >         Also, depending on implementation, the target may realize that the
> > > TCP connections for a session were lost (using Keep-Alives or iSCSI NOPs
> > > etc.) when the initiator rebooted thus terminating the session. By the time
> > > a new login from the same initiator is received, the old session info may
> > > have been cleared.
> >
> > Then again, it may not. There's 2 aspects to this issue :
> > 1) Successful session re-logins from the rebooted host.
> > 2) Garbage collection and cleanup of the old session resources.
> >
> > (1) is a more serious issue, since the target MUST NOT reject the login
> > based on a pre-existing active session for a given (Initiator Name,
> > ISID).
> >
> > (2) is handled through garbage collection algorithms, but implementation
> > of the proposal would help accelerate the release of stale session
> > resources.
> >
> > - Santosh
> >
> >
> > >
> > > -----Original Message-----
> > > From: Santosh Rao [mailto:santoshr@cup.hp.com]
> > > Sent: Wednesday, April 25, 2001 11:19 AM
> > > To: IPS Reflector
> > > Subject: iSCSI : target session login behaviour
> > >
> > > All,
> > >
> > > How should a target respond when it receives a session login  [on a new
> > > TCP connection] with the same (ISID, Initiator Name) as a session
> > > already active at the target.
> > >
> > > Does such a login request imply :
> > >
> > > 1) the target should perform implicit logout and re-login of the session
> > > identified by (ISID, initiator name) ?
> > >
> > > 2) Or does this result in the target responding to the session login
> > > with :
> > > a login response with status class of non-zero indicating target is
> > > rejecting the login ?
> > >
> > > [The draft does not describe target behaviour for this scenario.]
> > >
> > > iSCSI session login semantics should explicitly state that the above
> > > scenario will result in case (1) above. i.e. when a target sees a
> > > session login for a given (ISID, initiator name), it MUST treat this as
> > > an implicit logout of any previous session active at the target for that
> > > (ISID, initiator name) and then, establish a new session.
> > >
> > > This is required because the above scenario can typically occur when an
> > > initiator reboots without having performed a session logout on all
> > > active sessions.(system did not perform an orderly shutdown).
> > >
> > > As a side note, the iSCSI draft Status Class/Codes could do with a misc
> > > error category along the lines of the FC "No additional Explanation"
> > > reason explanation. This would help deal with error conditions that don't
> > > come under the listed category.
> > >
> > > - Santosh
> >
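
The login handling proposed in the quoted thread can be sketched as a session
table keyed by (Initiator Name, ISID). This is a hedged illustration only: the
Target class, its login method and the returned tuples are hypothetical, TSID
allocation by a simple counter is an assumption, and rejecting a non-zero TSID
that does not match the current session is an assumption not stated in the
thread.

```python
from typing import Dict, Optional, Tuple


class Target:
    """Toy model of a target's session table."""

    def __init__(self) -> None:
        # (initiator_name, isid) -> tsid of the active session
        self.sessions: Dict[Tuple[str, int], int] = {}
        self._next_tsid = 1

    def login(self, initiator_name: str, isid: int,
              tsid: int = 0) -> Tuple[str, Optional[int], Optional[int]]:
        """Return (action, tsid, tsid of the implicitly logged-out session)."""
        key = (initiator_name, isid)
        if tsid == 0:
            # New session requested. Any session already active for the same
            # (Initiator Name, ISID) is treated as implicitly logged out.
            old = self.sessions.pop(key, None)
            new = self._next_tsid
            self._next_tsid += 1
            self.sessions[key] = new
            return ("new-session", new, old)
        if self.sessions.get(key) == tsid:
            # Same ISID and TSID: another connection within the session.
            return ("add-connection", tsid, None)
        return ("reject", None, None)   # assumed handling, see lead-in


target = Target()
first = target.login("initiator-A", isid=1)
relogin = target.login("initiator-A", isid=1)   # e.g. after a reboot
other = target.login("initiator-B", isid=1)     # same ISID, different name
extra = target.login("initiator-A", isid=1, tsid=relogin[1])
stale = target.login("initiator-A", isid=1, tsid=first[1])
print(first, relogin, other, extra, stale)
```

Keying on the pair rather than on the ISID alone is what lets two initiators
reuse an ISID value without disturbing each other's sessions.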

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Fri Apr 27 13:10:29 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA08796
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 13:10:24 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3REBRO15709
	for ips-outgoing; Fri, 27 Apr 2001 10:11:27 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3REAPA15645
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 10:10:25 -0400 (EDT)
Received: from ahganemw2k (dhcp-161-44-68-139.cisco.com [161.44.68.139]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with SMTP id KAA14492 for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 10:10:20 -0400 (EDT)
From: "Ayman Ghanem" <aghanem@cisco.com>
To: <ips@ece.cmu.edu>
Subject: RE: iSCSI : target session login behaviour
Date: Fri, 27 Apr 2001 09:09:11 -0500
Message-ID: <LOEPJENHBHAHEABBNDAJOEDKCBAA.aghanem@cisco.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
In-Reply-To: <C1256A3B.003895B2.00@d12mta02.de.ibm.com>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Is Status SNACK reject going to be added as a reason for a reject PDU? If
not, what is the appropriate behavior on a target receiving a status SNACK
that it cannot fulfill? Should it terminate the connection?

-Ayman

> -----Original Message-----
> From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> julian_satran@il.ibm.com
> Sent: Friday, April 27, 2001 4:59 AM
> To: ips@ece.cmu.edu
> Subject: Re: iSCSI : target session login behaviour
>
>
>
>
> SNACK rejected has been removed from the SCSI Response - Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 26/04/2001 21:41:20
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   Stephen Bailey <steph@cs.uchicago.edu>, ips@ece.cmu.edu
> cc:   David Black <Black_David@emc.com>
> Subject:  Re: iSCSI : target session login behaviour
>
>
>
>
> Stephen Bailey wrote:
>
> > > As a side note, the iSCSI draft Status Class/Codes could do with a misc
> > > error category along the lines of the FC "No additional Explanation"
> > > reason explanation. This would help deal with error conditions that don't
> > > come under the listed category.
> >
> > Personally, I think we should add categories for reasons we obviously
> > see now, AND have a no additional reason.
> >
> > One peculiarity with what you're talking about above is that it should
> > be a login response status code which expresses this rejection.  The
> > login response set does not seem to have an `invalid parameter'
> > response for cases when the request is somehow inconsistent.
>
> Steph,
>
> The iSCSI draft is unclear today about the exact mechanism through which
> the target indicates "invalid parameters" in response to a received
> command.
>
> 1) Should it use a REJECT PDU or respond with the appropriate response
> for that PDU indicating a response code of "Invalid Parameters" and a
> "first bad byte" offset that indicates which parameter the target
> disliked.
>
> IMO, an "Invalid Parameters" response in the response codes is
> appropriate for SCSI Command and SCSI Task Mgmt commands. [coupled with
> a "first bad byte" offset.]
>
> This is missing today.
>
> 2) Also, as discussed above, a general "No Additional Explanation" type of
> status code in the login response would cover the "misc" category.
>
> 3) There are cases of ambiguity in the usage of REJECT or SCSI Response.
> Take the case of a "SNACK Reject". It is present in both the SCSI
> Response (SNACK Rejected) and REJECT PDU reason code (Data SNACK
> Reject). Which mechanism is to be used in this case ?
>
> 4) There is no "Status SNACK Rejected" in the REJECT PDU.
>
> Regards,
> Santosh
>  - santoshr.vcf
>
>
>
>



From owner-ips@ece.cmu.edu  Fri Apr 27 13:28:46 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA09273
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 13:28:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RFMTQ19721
	for ips-outgoing; Fri, 27 Apr 2001 11:22:29 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from dogwood.cisco.com (dogwood.cisco.com [161.44.11.19])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RFMCA19705
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 11:22:12 -0400 (EDT)
Received: from cisco.com (mbakke@mbakke-lnx.cisco.com [161.44.68.87]) by dogwood.cisco.com (8.8.6 (PHNE_14041)/CISCO.SERVER.1.2) with ESMTP id LAA13206; Fri, 27 Apr 2001 11:18:28 -0400 (EDT)
Message-ID: <3AE98D56.3CAB3E6A@cisco.com>
Date: Fri, 27 Apr 2001 10:16:38 -0500
From: Mark Bakke <mbakke@cisco.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-3.uid32 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: "Mudaliar, Hari" <Hari_Mudaliar@adaptec.com>
CC: "'Santosh Rao'" <santoshr@cup.hp.com>, IPS Reflector <ips@ece.cmu.edu>
Subject: Re: iSCSI : target session login behaviour
References: <268DBFF7D2A3D411A37400D0B72E345FE71B3B@aimexc03.corp.adaptec.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hari-

Although it looks like we didn't explicitly state this in the latest
name-disc draft, we did discuss the splitting of the ISID identifier
space amongst iSCSI HBAs.  We should add a statement like:

All entities (iSCSI HBAs, drivers, hosts in a cluster, whatever) that
share the same Initiator Name MUST coordinate the use of ISIDs amongst
themselves to avoid ISID re-use collisions.

I don't think that's quite the right wording, but anyway, I think that
this helps.
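
As a rough illustration of what "coordinate" could mean in practice (this is
not from the name-disc draft): a host could hand each HBA a disjoint slice of
the ISID space at configuration time. The sketch below assumes a 16-bit ISID
and equal-sized slices; the function name, the HBA labels, and the field width
are all hypothetical, and any reserved ISID values are ignored.

```python
def partition_isids(hba_names, isid_bits=16):
    """Split the ISID space into equal, disjoint ranges, one per HBA.

    isid_bits is an assumption for illustration; use the width the
    draft actually defines.
    """
    if not hba_names:
        raise ValueError("need at least one HBA")
    space = 1 << isid_bits
    size = space // len(hba_names)
    if size == 0:
        raise ValueError("more HBAs than available ISIDs")
    # Each HBA may only pick ISIDs from its own range, so two HBAs that
    # share an Initiator Name can never collide on (Initiator Name, ISID).
    return {
        name: range(i * size, (i + 1) * size)
        for i, name in enumerate(hba_names)
    }


# Usage: two HBAs in one host sharing a single Initiator Name.
ranges = partition_isids(["hba0", "hba1"])
```

Static slicing is the simplest scheme; a host-wide allocator that hands out
ISIDs on demand would also satisfy the proposed wording.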

--
Mark

"Mudaliar, Hari" wrote:
> 
> Santosh,
>         I get your point. But what if there is more than one iSCSI Host bus
> adapter in a system? The Initiator Name will be the same and ISID MAY turn
> out to be the same (unless the ISIDs are apportioned between the initiators
> through some configuration method). This assumes that multiple sessions can
> exist between one initiator system (containing multiple iSCSI off-load
> engines/HBAs) and a target.
> 
> - Hari
> 
> -----Original Message-----
> From: Santosh Rao [mailto:santoshr@cup.hp.com]
> Sent: Wednesday, April 25, 2001 4:18 PM
> To: Mudaliar, Hari
> Cc: IPS Reflector
> Subject: Re: iSCSI : target session login behaviour
> 
> "Mudaliar, Hari" wrote:
> 
> >         I am assuming that you are referring to the creation of a new
> > session with TSID=0 in your example below. Take the case of an initiator I1
> > who has established a session with a target with an ISID=ISID1. What if a
> > second initiator I2 tries to login to the same target with ISID1? The target
> > cannot decide to logout the first initiator (who already has a session
> > established with ISID1) as suggested by you.
> 
> Hari,
> 
> You may want to take a second look at my mail. It specifically refers to
> the problem in the context of a given (Initiator Name, ISID). Your
> example above does not fall under that category. A 2nd initiator using
> the same ISID would have a different Initiator Name. (a.k.a initiator
> WWUI).
> 
> The problem raised is in the context of an existing session for a given
> (Initiator Name, ISID). How does a target deal with a second session
> login received for the same (Initiator Name, ISID) with a NULL TSID ?
> 
> >         Also, depending on implementation, the target may realize that the
> > TCP connections for a session were lost (using Keep-Alives or iSCSI NOPs
> > etc.) when the initiator rebooted thus terminating the session. By the time
> > a new login from the same initiator is received, the old session info may
> > have been cleared.
> 
> Then again, it may not. There are two aspects to this issue:
> 1) Successful session re-logins from the rebooted host.
> 2) Garbage collection and cleanup of the old session resources.
> 
> (1) is a more serious issue, since the target MUST NOT reject the login
> based on a pre-existing active session for a given (Initiator Name,
> ISID).
> 
> (2) is handled through garbage collection algorithms, but implementation
> of the proposal would help accelerate the release of stale session
> resources.
> 
> - Santosh
> 
> >
> > -----Original Message-----
> > From: Santosh Rao [mailto:santoshr@cup.hp.com]
> > Sent: Wednesday, April 25, 2001 11:19 AM
> > To: IPS Reflector
> > Subject: iSCSI : target session login behaviour
> >
> > All,
> >
> > How should a target respond when it receives a session login  [on a new
> > TCP connection] with the same (ISID, Initiator Name) as a session
> > already active at the target.
> >
> > Does such a login request imply :
> >
> > 1) the target should perform implicit logout and re-login of the session
> > identified by (ISID, initiator name) ?
> >
> > 2) Or does this result in the target responding to the session login
> > with :
> > a login response with status class of non-zero indicating target is
> > rejecting the login ?
> >
> > [The draft does not describe target behaviour for this scenario.]
> >
> > iSCSI session login semantics should explicitly state that the above
> > scenario will result in case (1) above. i.e. when a target sees a
> > session login for a given (ISID, initiator name), it MUST treat this as
> > an implicit logout of any previous session active at the target for that
> > (ISID, initiator name) and then, establish a new session.
> >
> > This is required because the above scenario can typically occur when an
> > initiator reboots without having performed a session logout on all
> > active sessions.(system did not perform an orderly shutdown).
> >
> > As a side note, the iSCSI draft Status Class/Codes could do with a misc
> > error category along the lines of the FC "No additional Explanation"
> > reason explanation. This would help deal with error conditions that don't
> > come under the listed category.
> >
> > - Santosh

-- 
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054


From owner-ips@ece.cmu.edu  Fri Apr 27 16:05:01 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA14479
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 16:05:01 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RIOkj29595
	for ips-outgoing; Fri, 27 Apr 2001 14:24:46 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from iol.unh.edu (mars.iol.unh.edu [132.177.121.222])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RINoA29551
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 14:23:50 -0400 (EDT)
Received: from localhost (rdr@localhost)
	by iol.unh.edu (8.9.3/8.9.3) with ESMTP id OAA26654;
	Fri, 27 Apr 2001 14:23:37 -0400 (EDT)
Date: Fri, 27 Apr 2001 14:23:37 -0400
From: "Robert D. Russell" <rdr@mars.iol.unh.edu>
To: julian_satran@il.ibm.com
cc: ips@ece.cmu.edu
Subject: Re: iSCSI Parameter Negotiation
In-Reply-To: <C1256A3B.0018EA60.00@d12mta02.de.ibm.com>
Message-ID: <Pine.SGI.4.20.0104271422080.26606-100000@mars.iol.unh.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Julo:

Two small points:

1) I believe there is a minor formatting error in draft 6 -- the title
   page for Appendix A is out of place -- it appears on page 96,
   before section 8.  Consequently, Appendix A on page 96 is empty,
   and then all the other Appendix labels are shifted, so the label
   Appendix B on page 104 really should be labeled Appendix A,
   the label Appendix C on page 118 really should be labeled Appendix B,
   etc.

2) The term "operational parameters" is used in many places in draft 6,
   but is never actually defined anywhere.  To me, it appears that there
   are only two types of parameters: security parameters, which are all
   listed in Appendix A, and operational parameters, which are all
   listed in Appendix D.  Therefore, I would suggest a change in the title
   of Appendix D from: "Login/Text Miscellaneous Keys" to
   "Login/Text Operational Keys" or just "Operational Keys".

   The definition of "security parameters" is clearly stated at the end
   of the first paragraph on page 83: "For a list of security parameters
   see Appendix A."  A similar statement should be added somewhere
   in section 4.3 on page 84: "For a list of operational parameters
   see Appendix D."  Alternatively (or additionally), the statement just
   quoted on page 83 could be modified to read: "All security parameters
   are listed in Appendix A, and all operational parameters in Appendix D."

Bob Russell
InterOperability Lab
University of New Hampshire
rdr@iol.unh.edu
603-862-3774
On Fri, 27 Apr 2001 julian_satran@il.ibm.com wrote:


> Robert,
> 
> Yes - the target can "initiate" a negotiation (be an offering party).
> Driven was meant to underline that a sequence should end with an initiator
> setting the F bit and then the target doing the same (except for errors).
> I will try to fix the text to make it clear.
> 
> Julo
> 



From owner-ips@ece.cmu.edu  Fri Apr 27 16:08:40 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA14702
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 16:08:40 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RIJaF29319
	for ips-outgoing; Fri, 27 Apr 2001 14:19:36 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RIJWA29309
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 14:19:32 -0400 (EDT)
Received: from xparelay2.corp.hp.com (xparelay2.corp.hp.com [15.58.137.112])
	by palrel1.hp.com (Postfix) with ESMTP
	id 5B5FE2120; Fri, 27 Apr 2001 11:19:31 -0700 (PDT)
Received: from xatlbh2.atl.hp.com (xatlbh2.atl.hp.com [15.45.89.187])
	by xparelay2.corp.hp.com (Postfix) with ESMTP
	id B7F351F54A; Fri, 27 Apr 2001 14:17:38 -0400 (EDT)
Received: by xatlbh2.atl.hp.com with Internet Mail Service (5.5.2653.19)
	id <JKNPRW43>; Fri, 27 Apr 2001 14:19:29 -0400
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A0901C@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "'Douglas Otis'" <dotis@sanlight.net>,
        "Ips Reflector (E-mail)" <ips@ece.cmu.edu>
Subject: RE: iSCSI: Require iSCSI to use packet formats similar to FC, etc
	??
Date: Fri, 27 Apr 2001 14:19:23 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> As the rules change from technology to technology, there are 
> issues involved
> in this endeavor that will place into focus some potential 
> problems.  I tend
> to think that an independent delivery protocol could be 
> developed.

An independent protocol that is agnostic to the transport medium would
probably be too general to be optimal in any specific transport environment.
I think the solution is to make SCSI truly independent of the transport
(strictly layered on top of the transport).  It seems that T10 is realizing
this and some are working towards that goal.

IMHO, requiring that iSCSI match other transport formats is going at the
problem from the wrong end.

Marj 


From owner-ips@ece.cmu.edu  Fri Apr 27 16:09:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA14731
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 16:09:22 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RILdQ29482
	for ips-outgoing; Fri, 27 Apr 2001 14:21:39 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sphmraaa.compuserve.com (hs-img-rel-1.compuserve.com [149.174.177.156])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RILQA29470
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 14:21:26 -0400 (EDT)
Received: (from mailgate@localhost)
	by sphmraaa.compuserve.com (8.9.3/8.9.3/SUN-REL-1.3) id OAA12013
	for ips@ece.cmu.edu; Fri, 27 Apr 2001 14:21:21 -0400 (EDT)
Received: from compuserve.com (dal-tgn-tvo-vty2.as.wcom.net [206.175.229.2])
	by sphmraaa.compuserve.com (8.9.3/8.9.3/SUN-REL-1.3) with ESMTP id OAA11985;
	Fri, 27 Apr 2001 14:21:10 -0400 (EDT)
Message-ID: <3AE9B959.2F1DF414@compuserve.com>
Date: Fri, 27 Apr 2001 13:24:26 -0500
From: Ralph Weber <ralphoweber@compuserve.com>
Reply-To: ENDL_TX@computer.org
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD NSCPCD47  (Win95; I)
X-Accept-Language: en,pdf
MIME-Version: 1.0
To: julian_satran@il.ibm.com, ips@ece.cmu.edu
CC: "Gardner, Ed" <Gardner@acm.org>
Subject: Re: iSCSI : EnableACA
References: <C1256A3B.0050B16D.00@d12mta02.de.ibm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

} Ralph,
}
} I am not sure that I understand the question.

Looks like you have done a good job to me.

} After the reset task management functions iSCSI requires
} the target enter Unit Attention for all other initiators
} connected to the same target (or LU) and, after reporting
} the Unit Attention enter ACA.
}
} You and Ed pointed out that many initiators will be unable
} to handle this (will not reset the condition).
}
} EnableACA is a hack intended to allow an initiator to control
} the target behavior.

First off, unless Asynchronous Event Reporting is enabled,
Unit Attention conditions get reported with a CHECK CONDITION
status, and bingo: ACA is controlled in the usual way with the
NACA bit.

Second, Ed is going to cover Unit Attention conditions in his
proposal, in fact that was Ed's hot button that got him writing
a proposal.  Ed's proposal does not exactly make Unit Attention
engage an ACA, instead he is proposing to give the Unit Attention
condition an ACA-like longevity (i.e., the Unit Attention
condition stays active until the initiator acknowledges it).

So, I am wondering why the EnableACA is needed.  From my point of
view, putting something like EnableACA in the protocol definition
is bad layering (not that SCSI has always observed good layering,
but why continue to promote bad habits).

} NormACA and the NACA CDB bit have similar purposes but
} NormACA is a "read-only" flag and the ACA condition is not
} created in this case as a result of a command execution error
} but as a result of an action by an external factor (another
} initiator).
}
} This situation is probably more general - and some other
} transports dealing with commands in flight might experience
} a similar problem but up to now SPC has no way of handling
} this situation.
}
} If Ed is going to cover this within the "extended use of ACA"
} we discussed for SCSI commands rejected with busy and task set
} full and for Unit Attention we might have to coordinate.

I think we need to coordinate.  It also may be necessary to
give Ed a gentle nudge.

Thanks.

Ralph...





From owner-ips@ece.cmu.edu  Fri Apr 27 16:11:22 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA14766
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 16:11:22 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RHmcN27699
	for ips-outgoing; Fri, 27 Apr 2001 13:48:38 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RHltA27680
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 13:47:56 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id BB31CAD6; Fri, 27 Apr 2001 10:47:54 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id KAA16638;
	Fri, 27 Apr 2001 10:47:35 -0700 (PDT)
Message-ID: <3AE9B08D.966A27FF@cup.hp.com>
Date: Fri, 27 Apr 2001 10:46:53 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com, ips@ece.cmu.edu
Cc: matt_wakeley@agilent.com
Subject: Re: iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
References: <C1256A3B.003ABC37.00@d12mta02.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------940B0CDAA09F7009597E247D"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------940B0CDAA09F7009597E247D
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Julian,

The conclusion on this thread was that some text was to be added to the
spec to address this issue. The rev 06 does not have any text to this
effect. It would help to explicitly describe how initiators should plug
the hole in CmdSN when they do not intend to use "command retry".

Also, regarding the use of NOP-OUT to fill the hole, why not just use
Abort Task for the same purpose ? Do we need a 2nd outbound PDU from the
initiator just to fill the hole ? When the initiator encounters a ULP
timeout, it would use an Abort Task to error the I/O. If the Abort Task
can contain the CmdSN of the original command being aborted, targets can
fill the hole based on that information, without requiring a second
outbound PDU from the initiator for this purpose.

Some text in the draft on this subject would be helpful to implementors.
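
To make the idea concrete, here is a minimal sketch of a target-side queue
that delivers commands strictly in CmdSN order and lets an Abort Task that
names the original command's CmdSN plug the hole. It is a toy model, not
draft text: the class and method names are hypothetical, the Abort Task is
assumed to carry the referenced CmdSN (which is the proposal above, not
current rev 06 behaviour), and CmdSN wrap-around is ignored.

```python
class CmdQueue:
    """Toy target-side queue that delivers commands in CmdSN order."""

    def __init__(self, first_cmdsn=1):
        self.expected = first_cmdsn  # next CmdSN the target will deliver
        self.pending = {}            # CmdSN -> command, or None for a plugged hole
        self.delivered = []          # commands handed to the SCSI layer, in order

    def receive(self, cmdsn, command):
        """A command PDU arrived, possibly out of order."""
        self.pending[cmdsn] = command
        self._drain()

    def abort_task(self, ref_cmdsn):
        """Abort Task carrying the CmdSN of the command being aborted.

        If that command never arrived, this plugs the hole; if it is
        still queued, it is dropped instead of being delivered.
        """
        if ref_cmdsn >= self.expected:
            self.pending[ref_cmdsn] = None
        self._drain()

    def _drain(self):
        # Deliver every consecutive CmdSN starting from the expected one.
        while self.expected in self.pending:
            command = self.pending.pop(self.expected)
            if command is not None:
                self.delivered.append(command)
            self.expected += 1


# Usage: CmdSN 2 is lost and never retried; the abort unblocks CmdSN 3.
q = CmdQueue()
q.receive(1, "A")
q.receive(3, "C")   # stalls behind the missing CmdSN 2
q.abort_task(2)     # hole is plugged, "C" is delivered
```

With this approach no separate NOP-OUT is needed just to advance the queue.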

- Santosh


julian_satran@il.ibm.com wrote:
>
> Santosh,
>
> You had a possible answer from Matt.  However I agree that we might want
> to address this in text



julian_satran@il.ibm.com wrote:
> 
> If the hole is in the command queue and the task is just aborted, the
> response to the abort task will unveil the fact that it did not reach
> its destination.
> 
> The initiator can recover from there in several ways - clear the task set,
> fill the hole with an iSCSI noop etc.
> The latter, I recall, was suggested to you by Matt Wakeley a while ago.
> 
> None of them require any changes in the spec.
> 
> Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 27/04/2001 04:56:26
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   Julian Satran/Haifa/IBM@IBMIL
> cc:   ips@ece.cmu.edu
> Subject:  Re: iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
> 
> Julian,
> 
> Could you please clarify if the below issue is going to be addressed in
> the iSCSI draft, as was discussed earlier.
> (http://ips.pdl.cs.cmu.edu/mail/msg03155.html).
> 
> Specifically, is the spec going to address the issue of how initiators
> can plug a hole in CmdSN sequence when they detect a ULP timeout and/or
> choose not to use "command retry".
> 
> Regards,
> Santosh
> 
> julian_satran@il.ibm.com wrote:
> >
> > Santosh,
> >
> > You had a possible answer from Matt.  However I agree that we might want to
> > address this in text although
> > a solution similar to that suggested by Matt should be by now obvious to
> > every implementer - the target should leave a placeholder in the input
> > queue until the command after gets delivered.
> >
> > Julo
> >
> > Santosh Rao <santoshr@cup.hp.com> on 25/01/2001 21:38:04
> >
> > Please respond to Santosh Rao <santoshr@cup.hp.com>
> >
> > To:   IPS Reflector <ips@ece.cmu.edu>
> > cc:
> > Subject:  iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
> >
> > Julian & All,
> >
> > The draft is currently lacking a section that addresses abort I/O error
> > recovery. Specifically, how are CmdSN bridging issues to be handled in
> > the case where an initiator chooses not to retry an I/O [that failed on
> > a connection failure that affects the delivery of the command to the
> > target or a digest error at the target] because its ULP timer may have
> > expired.
> >
> > In such cases, the initiator can send an Abort Task to inform the target
> > that the I.T.T is being aborted and its corresponding CmdSN can be
> > bridged, instead of having the target stall infinitely in its attempt to
> > enforce ordering and await the missing CmdSN [which isn't going to
> > arrive, because the initiator did not retry the command].
> >
> > Regards,
> > Santosh
> >
> >  - santoshr.vcf
>  - santoshr.vcf
--------------940B0CDAA09F7009597E247D
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------940B0CDAA09F7009597E247D--



From owner-ips@ece.cmu.edu  Fri Apr 27 16:57:36 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA15809
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 16:57:36 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RIudg01647
	for ips-outgoing; Fri, 27 Apr 2001 14:56:39 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RIu8A01595
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 14:56:09 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 155ACE3E; Fri, 27 Apr 2001 11:56:08 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id LAA24188;
	Fri, 27 Apr 2001 11:56:03 -0700 (PDT)
Message-ID: <3AE9C099.1FF33815@cup.hp.com>
Date: Fri, 27 Apr 2001 11:55:21 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu, Julian Satran <julian_satran@il.ibm.com>
Cc: someshg@yahoo.com
Subject: iSCSI : Status SNACK mandatory ?
References: <LOEPJENHBHAHEABBNDAJOEDKCBAA.aghanem@cisco.com>
Content-Type: multipart/mixed;
 boundary="------------8A1BFCE67627AA47BE0B8421"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------8A1BFCE67627AA47BE0B8421
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


Rev 06 seems to indicate that Status SNACK is still mandatory (?). 

I thought the group had consensus that Status SNACK recovery must not be
mandated and Julian had agreed to make it optional. (See
http://ips.pdl.cs.cmu.edu/mail/msg04008.html).

The following text in rev 06 seems to indicate that target support for
Status SNACK is still mandatory :

Section 2.16
"iSCSI targets MUST support Status SNACK and MAY support Data SNACK."

Section 6.7.2 :
"The initiator MAY request the missing responses through SNACK, in which
case the target MUST reissue them."

Some clarifications on whether SNACK recovery schemes are going to be
optional would be helpful.

- Santosh



Ayman Ghanem wrote:
> 
> Is Status SNACK reject going to be added as a reason for a reject PDU? If
> not, what is the appropriate behavior on a target receiving a status SNACK
> that it cannot fulfill? Should it terminate the connection?
> 
> -Ayman
> 
> > -----Original Message-----
> > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > julian_satran@il.ibm.com
> > Sent: Friday, April 27, 2001 4:59 AM
> > To: ips@ece.cmu.edu
> > Subject: Re: iSCSI : target session login behaviour
> >
> >
> >
> >
> > SNACK rejected has been removed from the SCSI Response - Julo

> > Steph,
> >
> > The iSCSI draft is unclear today about the exact mechanism through which
> > the target indicates "invalid parameters" in response to a received
> > command.
> >
> > 1) Should it use a REJECT PDU or respond with the appropriate response
> > for that PDU indicating a response code of "Invalid Parameters" and a
> > "first bad byte" offset that indicates which parameter the target
> > disliked.
> >
> > IMO, an "Invalid Parameters" response in the response codes is
> > appropriate for SCSI Command and SCSI Task Mgmt commands. [coupled with
> > a "first bad byte" offset.]
> >
> > This is missing today.
> >
> > 2) Also, as discussed above, a general "No Additional Explanation" type of
> > status code in the login response would cover the "misc" category.
> >
> > 3) There are cases of ambiguity in the usage of REJECT or SCSI Response.
> > Take the case of a "SNACK Reject". It is present in both the SCSI
> > Response (SNACK Rejected) and REJECT PDU reason code (Data SNACK
> > Reject). Which mechanism is to be used in this case ?
> >
> > 4) There is no "Status SNACK Rejected" in the REJECT PDU.
> >
> > Regards,
> > Santosh
> >  - santoshr.vcf
> >
> >
> >
> >
--------------8A1BFCE67627AA47BE0B8421
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------8A1BFCE67627AA47BE0B8421--



From owner-ips@ece.cmu.edu  Fri Apr 27 17:04:31 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA16038
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 17:04:30 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RJAjk02575
	for ips-outgoing; Fri, 27 Apr 2001 15:10:45 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RJAVA02526
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 15:10:32 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id VAA263662
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 21:10:21 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id VAA81646
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 21:10:21 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3B.00695038 ; Fri, 27 Apr 2001 21:10:18 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3B.00694FA7.00@d12mta02.de.ibm.com>
Date: Fri, 27 Apr 2001 22:15:43 +0300
Subject: Re: iSCSI Parameter Negotiation
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Bob,

 - 1 - I've seen it (too late for 6)
I'll see what I can do about 2.

Thanks,
Julo

"Robert D. Russell" <rdr@mars.iol.unh.edu> on 27/04/2001 21:23:37

Please respond to "Robert D. Russell" <rdr@mars.iol.unh.edu>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu
Subject:  Re: iSCSI Parameter Negotiation




Julo:

Two small points:

1) I believe there is a minor formatting error in draft 6 -- the title
   page for Appendix A is out of place -- it appears on page 96,
   before section 8.  Consequently, Appendix A on page 96 is empty,
   and then all the other Appendix labels are shifted, so the label
   Appendix B on page 104 really should be labeled Appendix A,
   the label Appendix C on page 118 really should be labeled Appendix B,
   etc.

2) The term "operational parameters" is used in many places in draft 6,
   but is never actually defined anywhere.  To me, it appears that there
   are only two types of parameters: security parameters, which are all
   listed in Appendix A, and operational parameters, which are all
   listed in Appendix D.  Therefore, I would suggest a change in the title
   of Appendix D from: "Login/Text Miscellaneous Keys" to
   "Login/Text Operational Keys" or just "Operational Keys".

   The definition of "security parameters" is clearly stated at the end
   of the first paragraph on page 83: "For a list of security parameters
   see Appendix A."  A similar statement should be added somewhere
   in section 4.3 on page 84: "For a list of operational parameters
   see Appendix D."  Alternatively (or additionally), the statement just
   quoted on page 83 could be modified to read: "All security parameters
   are listed in Appendix A, and all operational parameters in Appendix D."

Bob Russell
InterOperability Lab
University of New Hampshire
rdr@iol.unh.edu
603-862-3774
On Fri, 27 Apr 2001 julian_satran@il.ibm.com wrote:


> Robert,
>
> Yes - the target can "initiate" a negotiation (be an offering party).
> Driven was meant to underline that a sequence should end with an initiator
> setting the F bit and then the target doing the same (except for errors).
> I will try to fix the text to make it clear.
>
> Julo
>






From owner-ips@ece.cmu.edu  Fri Apr 27 17:57:53 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA17425
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 17:57:53 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RJrdS04952
	for ips-outgoing; Fri, 27 Apr 2001 15:53:39 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RJraA04948
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 15:53:36 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id B8C5E94009
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 15:53:35 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI: Require iSCSI to use packet formats similar to FC, etc ?? 
In-Reply-To: Message from julian_satran@il.ibm.com 
   of "Fri, 27 Apr 2001 13:21:47 +0300." <C1256A3B.0038959C.00@d12mta02.de.ibm.com> 
References: <C1256A3B.0038959C.00@d12mta02.de.ibm.com> 
Date: Fri, 27 Apr 2001 15:51:55 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010427195335.B8C5E94009@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

> SST and FCP are very dissimilar

Actually, SST and FCP are deliberately virtually identical.  The SST
status PDU is identical to FCP's.  The SST command PDU is FCP's with
the transfer length removed (it's already in another piece of the ST
PDU which carries the SST command PDU).

I did that so that I wouldn't get trapped in arguments about how this,
that, or the other SAM feature is offered by SST.  I could just say,
whatever FCP does, SST does, in basically the same way.

However, with all the clout iSCSI has, I'm sure we won't have any
trouble convincing anybody that not only does iSCSI do everything the
other widely deployed network SCSI does, we fix all their bugs, and do
twice as much beyond that, too :^)

The most important thing is having a simple, one-to-one correspondence
between components of iSCSI and FCP.  The PDU layouts don't have to be
identical, but that is one way to inarguably achieve the
correspondence.  I don't see any reason why the iSCSI/FCP
correspondence shouldn't be very close.  Put another way, if the
correspondence is WORSE (more algorithmically difficult to make) than
FCP/||SCSI (which is a somewhat poor but well known correspondence)
we've totally blown it.

Steph


From owner-ips@ece.cmu.edu  Fri Apr 27 17:58:25 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA17469
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 17:58:25 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RKTfj06878
	for ips-outgoing; Fri, 27 Apr 2001 16:29:41 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RKTFA06861
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 16:29:15 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRW8P>; Fri, 27 Apr 2001 13:29:05 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173484@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "Ips Reflector (E-mail)" <ips@ece.cmu.edu>
Cc: "'KRUEGER,MARJORIE (HP-Roseville,ex1)'" <marjorie_krueger@hp.com>
Subject: RE: iSCSI: Require iSCSI to use packet formats similar to FC, etc
	??
Date: Fri, 27 Apr 2001 13:29:05 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

In my opinion, the proposed requirement is too vague as written and
therefore hard to discuss productively.  I'm not sure what the criteria for
compliance would be.

Charles
> -----Original Message-----
> From: KRUEGER,MARJORIE (HP-Roseville,ex1)
> [mailto:marjorie_krueger@hp.com]
> Sent: Thursday, April 26, 2001 4:18 PM
> To: Ips Reflector (E-mail)
> Subject: iSCSI: Require iSCSI to use packet formats similar 
> to FC, etc??
> 
> 
> Robert Snively proposes that the iSCSI Requirements document 
> include the
> following requirement WRT gateway devices:
> 
>    iSCSI MUST use packet formats similar to the common
>    packet formats used by other packetized SCSI protocols where
>    possible to allow both simple bridging gateways and more
>    sophisticated translating gateways.
> 
> Comments?
> 
> Marjorie Krueger
> Networked Storage Architecture
> Networked Storage Solutions Org.
> Hewlett-Packard
> tel: +1 916 785 2656
> fax: +1 916 785 0391
> email: marjorie_krueger@hp.com 
> 


From owner-ips@ece.cmu.edu  Fri Apr 27 18:03:12 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA17572
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 18:03:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RK4gr05586
	for ips-outgoing; Fri, 27 Apr 2001 16:04:42 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from pluto.he.net (pluto.he.net [216.218.167.2])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RK43A05561
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 16:04:04 -0400 (EDT)
Received: from DSLmodem (adsl-64-174-149-250.dsl.sntc01.pacbell.net [64.174.149.250]) by pluto.he.net (8.8.6/8.8.2) with SMTP id NAA19494; Fri, 27 Apr 2001 13:03:50 -0700
From: "Sriram Rupanagunta" <sriramr@aarohi-inc.com>
To: "Mark Bakke" <mbakke@cisco.com>,
        "KRUEGER,MARJORIE \(HP-Roseville,ex1\)" <marjorie_krueger@hp.com>
Cc: "Ips Reflector \(E-mail\)" <ips@ece.cmu.edu>
Subject: RE: iSCSI: Require iSCSI to use packet formats similar to FC, etc??
Date: Fri, 27 Apr 2001 15:08:27 -0500
Message-ID: <AHECJANLDNBAICCKGPIPAEPMCBAA.sriramr@aarohi-inc.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
In-Reply-To: <3AE973CD.3121BEDE@cisco.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

> The word "similar" is pretty vague.  If it means that iSCSI should
> include the same types of information that the other transports
> provide, that's probably required just to get it working.  If it
> means that iSCSI messages have to look like Fibre Channel frames
> (I don't really think that's what you mean), that would be a severe
> and inappropriate limitation.

In concept, this requirement looks interesting, though I would
have the same concern as Mark, in that the similarity needs to
be elaborated more, before it can be further discussed here, IMO.


> "KRUEGER,MARJORIE (HP-Roseville,ex1)" wrote:
> >
> > Robert Snively proposes that the iSCSI Requirements document include the
> > following requirement WRT gateway devices:
> >
> >    iSCSI MUST use packet formats similar to the common
> >    packet formats used by other packetized SCSI protocols where
> >    possible to allow both simple bridging gateways and more
> >    sophisticated translating gateways.
> >
>



From owner-ips@ece.cmu.edu  Fri Apr 27 19:13:36 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA19607
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 19:13:35 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RLGh809555
	for ips-outgoing; Fri, 27 Apr 2001 17:16:43 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RLFlA09496
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 17:15:47 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 4234E4B6; Fri, 27 Apr 2001 14:15:46 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id OAA06427;
	Fri, 27 Apr 2001 14:15:41 -0700 (PDT)
Message-ID: <3AE9E153.F1886E58@cup.hp.com>
Date: Fri, 27 Apr 2001 14:14:59 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com, ips@ece.cmu.edu
Cc: ENDL_TX@computer.org
Subject: Re: iSCSI : EnableACA
References: <C1256A3B.001D2BBE.00@d12mta02.de.ibm.com> <3AE95F1C.D5950596@compuserve.com>
Content-Type: multipart/mixed;
 boundary="------------7A8E72127B2A14E77727FA09"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------7A8E72127B2A14E77727FA09
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Ralph Weber wrote:
> 
> Julian,
> 
> Regarding your discussion with Santosh...
> 
> > The task management functions for iSCSI contain a
> > description of the behaviour I am talking about. This
> > was also subject to a long discussion on the list
> > that included the need for ACA behaviour for things
> > like task-set full, busy etc. - not considered for
> > ACA. For the latter, T10 is handling making ACA
> > behaviour available.
> >
> My question is... Exactly what are the cases where
> T10 is NOT considering making ACA behavior available so
> that you believe the EnableACA function is necessary in
> iSCSI?
> 
> My belief is that every behavior that EnableACA covers
> but T10 does not needs to be taken to T10 to see if
> they can find a better way to extend ACA coverage to
> that area.

Julian,

My original proposal for this issue stated that ANY exception condition
on an I/O that had the NACA bit set in its control byte of the CDB must
result in ACA being established.

"Any exception condition" would include :
- I/O being aborted by the target due to a LU Reset, Target Reset or
Clear Task Set issued by another initiator.
- I/O being aborted by the initiator through the use of Abort Task.
- QUEUE FULL, BUSY, RESERVATION CONFLICT, etc status returned on the
I/O.

The caveat to this rule is that while ACA is already active, a second
one would not be established.
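
The rule and its caveat can be sketched in a few lines of Python. This is
only an illustration of the proposal as worded above, not text from any
draft; the names LogicalUnit, aca_active and on_exception are hypothetical,
and ACA state is kept per LU, as SAM defines it.

```python
from dataclasses import dataclass


@dataclass
class LogicalUnit:
    """Hypothetical per-LU state; ACA is tracked at LU granularity."""
    aca_active: bool = False


def on_exception(lu: LogicalUnit, naca_bit_set: bool) -> bool:
    """Apply the proposed rule to one failed I/O.

    Returns True only if this exception newly establishes ACA on the LU.
    """
    if not naca_bit_set:
        # Initiator did not set NACA in the CDB control byte: no ACA.
        return False
    if lu.aca_active:
        # Caveat: while ACA is already active, a second one is not established.
        return False
    lu.aca_active = True
    return True
```

Under this sketch an initiator that never sets NACA never sees ACA, which
is the basis of point (1) below.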

The point I'm trying to make is :

1) ACA is being used in this context to aid in the preservation of
strict command ordering. Any attempt to enforce strict command ordering
would require initiators to set the NACA bit in the cdb for their I/Os.
Such initiators would be ACA aware and Clear ACA capable.

The argument that initiators may not be able to perform "Clear ACA" and
so need an additional control [through EnableACA] to prevent ACA from being
established is not applicable, because such initiators would not set
NACA in their CDBs, and in that case, ACA would not be established.

2) ACA is a LU level construct and iSCSI is changing this granularity to
be a session level construct. For example, initiators could turn on ACA
to only 1 LU on which they had strong ordering requirements.

3) ACA is a ULP construct and any changes, if found necessary, should be
made in the ULP mode pages and not within iSCSI.

- Santosh
--------------7A8E72127B2A14E77727FA09
Content-Type: text/x-vcard; charset=us-ascii;
 name="santoshr.vcf"
Content-Description: Card for Santosh Rao
Content-Disposition: attachment;
 filename="santoshr.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard 
n:Rao;Santosh 
tel;work:408-447-3751
x-mozilla-html:FALSE
org:Hewlett Packard, Cupertino.;SISL
adr:;;19420, Homestead Road, M\S 43LN,	;Cupertino.;CA.;95014.;USA.
version:2.1
email;internet:santoshr@cup.hp.com
title:Software Design Engineer
x-mozilla-cpt:;21088
fn:Santosh Rao
end:vcard

--------------7A8E72127B2A14E77727FA09--



From owner-ips@ece.cmu.edu  Fri Apr 27 19:17:06 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA19641
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 19:17:05 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RL4hT08857
	for ips-outgoing; Fri, 27 Apr 2001 17:04:43 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RL4eA08851
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 17:04:40 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 1EDA1CDF; Fri, 27 Apr 2001 14:04:39 -0700 (PDT)
Received: from cup.hp.com (santoshr@hpindhhm.cup.hp.com [15.8.80.197])
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) with ESMTP id OAA05253;
	Fri, 27 Apr 2001 14:04:24 -0700 (PDT)
Message-ID: <3AE9DE9E.CA594EB8@cup.hp.com>
Date: Fri, 27 Apr 2001 14:03:26 -0700
From: Santosh Rao <santoshr@cup.hp.com>
Organization: Hewlett Packard, Cupertino.
X-Mailer: Mozilla 4.7 [en] (X11; U; HP-UX B.11.00 9000/778)
X-Accept-Language: en
MIME-Version: 1.0
To: julian_satran@il.ibm.com
Cc: Black_David@emc.com, ips@ece.cmu.edu
Subject: Re: iSCSI : digest error handling violates EMDP/InDataOrder
References: <C1256A3B.0018E62A.00@d12mta05.de.ibm.com>
Content-Type: multipart/mixed;
 boundary="------------CFC42F267EE7F7DED024608A"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

This is a multi-part message in MIME format.
--------------CFC42F267EE7F7DED024608A
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

julian_satran@il.ibm.com wrote:
> 
> David,
> 
> I read Bob's mail and my interpretation is similar to his. However I think
> that SPC explicitly states that different transports are free to interpret
> and make use of this page as they find appropriate.
> 
> I have a hard time understanding Santosh's objection as it does not refer
> to the reason the EMDP is there but to the way it is written in FCP (not
> iSCSI).

Julian,

As has been stated earlier, EMDP allows control over the order in which
the target requests outbound data or sends inbound data. EMDP can be
used by initiators to control this order and turn off out-of-order R2T
requests [as well as turn off out of order read data pdus].

This is a useful control option and is already provided by other SCSI
transports. What good reason exists to deny this provision in iSCSI ?

Also, I have some concerns about the ambiguous definition of DataOrder.

Per the spec :
"DataOrder=<yes|no> 
    
The default is yes but targets MAY support no. No is used by iSCSI to
indicate that the data PDUs can be in any order (EMDP = 1). Yes is used
to indicate that incoming data PDUs have to be at continuously
increasing addresses (EMDP = 0)."

Based on the above definition wording :

a) How is DataOrder interpreted for WRITE I/Os ?
b) Is the ordering across the entire SCSI command or a subset of the I/O
? If so, what constitutes this subset ?

Different implementors can arrive at different interpretations reading
the above definition !
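
To make the ambiguity concrete, here is a small Python sketch of two
readings of "continuously increasing addresses" for the data PDUs of a
single command. The function name in_data_order, the (offset, length)
representation and the contiguous flag are assumptions made for
illustration; neither reading is claimed to be what the draft intends.

```python
from typing import Iterable, Tuple


def in_data_order(pdus: Iterable[Tuple[int, int]], contiguous: bool = False) -> bool:
    """Check one command's data PDUs, given as (buffer_offset, length) pairs.

    contiguous=False: each PDU must start at or after the end of the previous one.
    contiguous=True:  each PDU must start exactly where the previous one ended.
    """
    prev_end = None
    for offset, length in pdus:
        if prev_end is not None:
            if contiguous and offset != prev_end:
                return False
            if not contiguous and offset < prev_end:
                return False
        prev_end = offset + length
    return True
```

A sequence such as (0, 512) followed by (1024, 512) passes the first
reading and fails the second, so two implementations could disagree on
whether the same transfer honours DataOrder=yes.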

- Santosh
--------------CFC42F267EE7F7DED024608A--



From owner-ips@ece.cmu.edu  Fri Apr 27 19:52:15 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA20061
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 19:52:15 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RMVkj13399
	for ips-outgoing; Fri, 27 Apr 2001 18:31:46 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RMVcA13393
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 18:31:39 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRXF0>; Fri, 27 Apr 2001 15:31:22 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173487@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "'Douglas Otis'" <dotis@sanlight.net>,
        Charles Monia
	 <cmonia@NishanSystems.com>,
        "'KRUEGER,MARJORIE (HP-Roseville,ex1)'"
	 <marjorie_krueger@hp.com>,
        ips@ece.cmu.edu
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Fri, 27 Apr 2001 15:31:21 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

Please see below.


> -----Original Message-----
> From: Douglas Otis [mailto:dotis@sanlight.net]
> Sent: Wednesday, April 25, 2001 5:38 PM
> To: Charles Monia; 'KRUEGER,MARJORIE (HP-Roseville,ex1)';
> ips@ece.cmu.edu
> Subject: RE: iSCSI reqmts and Ethernet adapters
> 
> 
> Charles,
> 
> The encapsulation proposal is devoid of techniques for 
> framing.  How do you
> expect to see this lack of framing resolved?  Do you expect 
> to use this
> adapter and not have a means for framing?
> 

No, I don't expect to use any adapter.

Charles








From owner-ips@ece.cmu.edu  Fri Apr 27 19:52:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA20080
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 19:52:36 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RM0jY11827
	for ips-outgoing; Fri, 27 Apr 2001 18:00:45 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RLxhA11741
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 17:59:44 -0400 (EDT)
Received: from xparelay2.corp.hp.com (xparelay2.corp.hp.com [15.58.137.112])
	by palrel1.hp.com (Postfix) with ESMTP
	id DB93522B4; Fri, 27 Apr 2001 14:59:42 -0700 (PDT)
Received: from xatlbh1.atl.hp.com (xatlbh1.atl.hp.com [15.45.89.186])
	by xparelay2.corp.hp.com (Postfix) with ESMTP
	id 980C81F549; Fri, 27 Apr 2001 17:57:50 -0400 (EDT)
Received: by xatlbh1.atl.hp.com with Internet Mail Service (5.5.2653.19)
	id <J4Z0T5JR>; Fri, 27 Apr 2001 17:59:41 -0400
Message-ID: <499DC368E25AD411B3F100902740AD6502D4F183@xrose03.rose.hp.com>
From: "CONGDON,PAUL (HP-Roseville,ex1)" <paul_congdon@hp.com>
To: "'Stephen Bailey'" <steph@cs.uchicago.edu>, ips@ece.cmu.edu
Subject: RE: iSCSI : target session login behavior 
Date: Fri, 27 Apr 2001 17:54:17 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


Seems like there may be a small security hole in the direction suggested
below.  Before performing the implicit logout, shouldn't you authenticate
the initiator first?  If not, it would be possible for anyone to cause a
session to get dropped simply by knowing their WWUI and ISID.

Paul

+--------------------------+----------------------------+
+ Paul Congdon             + Email: paul_congdon@hp.com +
+ Hewlett Packard Company  + Mail Stop:  5662           +
+ HP ProCurve Networking   + Phone:  (916) 785-5753     +
+ 8000 Foothills Blvd      + Fax:    (916) 785-5949     +
+ Roseville, CA   95747    + Mobile: (916) 765-4056     +
+--------------------------+----------------------------+

> -----Original Message-----
> From: Stephen Bailey [mailto:steph@cs.uchicago.edu]
> Sent: Thursday, April 26, 2001 6:18 AM
> To: ips@ece.cmu.edu
> Subject: Re: iSCSI : target session login behaviour 
> 
> 
> Santosh,
> 
> > How should a target respond when it receives a session 
> login  [on a new
> > TCP connection] with the same (ISID, Initiator Name) as a session
> > already active at the target.
> 
> Originally, I thought rejecting the login was the correct behavior,
> and that's what I specified in the error handling pseudocode.  I
> construed this as a consistency (logic) error in the target.  However,
> on further thought, I've changed my mind.  I believe the correct
> behavior is to perform an implicit logout (of all outstanding
> connections in the session).  The target and initiator may have
> different ideas about whether the connections are still live, and like
> in FCP, performing an implicit logout solves this problem.  It also
> provides a mechanism for rapid, proactive recovery of session
> resources when possible, which is the right thing.
> 
> > iSCSI session login semantics should explicitly state that the above
> > scenario will result in case (1) above.
> 
> I agree.  This and all other possible cases.  
> 
> > As a side note, the iSCSI draft Status Class/Codes could do with a misc
> > error category along the lines of the FC "No additional Explanation"
> > reason explanation. This would help deal with error conditions that don't
> > come under the listed category.
> 
> Personally, I think we should add categories for reasons we obviously
> see now, AND have a no additional reason.
> 
> One peculiarity with what you're talking about above is that it should
> be a login response status code which expresses this rejection.  The
> login response set does not seem to have an `invalid parameter'
> response for cases when the request is somehow inconsistent.
> 
> Steph
> 


From owner-ips@ece.cmu.edu  Fri Apr 27 19:52:37 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA20090
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 19:52:37 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RM4iH12005
	for ips-outgoing; Fri, 27 Apr 2001 18:04:44 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sandmail.sandburst.com (sandburst-gw.bstn-gw02.ma.us.intelilink.net [216.57.129.34])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RM47A11986
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 18:04:07 -0400 (EDT)
Received: from cs.uchicago.edu (dynamite-38.sandburst.com [172.16.5.38])
	by sandmail.sandburst.com (Postfix) with ESMTP id 0A9A994006
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 18:04:07 -0400 (EDT)
To: ips@ece.cmu.edu
Subject: Re: iSCSI : target session login behavior 
In-Reply-To: Message from "CONGDON,PAUL (HP-Roseville,ex1)" <paul_congdon@hp.com> 
   of "Fri, 27 Apr 2001 17:54:17 EDT." <499DC368E25AD411B3F100902740AD6502D4F183@xrose03.rose.hp.com> 
References: <499DC368E25AD411B3F100902740AD6502D4F183@xrose03.rose.hp.com> 
Date: Fri, 27 Apr 2001 18:02:25 -0400
From: Stephen Bailey <steph@cs.uchicago.edu>
Message-Id: <20010427220407.0A9A994006@sandmail.sandburst.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Paul,

> Before performing the implicit logout, shouldn't you authenticate
> the initiator first?

Absolutely.  I didn't really detail the behavior, but I assume the
target won't do anything interesting until required authentication
(maybe none) has been performed.
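
A rough Python sketch of that ordering, authentication first and the
implicit logout only afterwards, might look like this. The names
handle_login, sessions and authenticate are hypothetical, and the session
table keyed by (initiator name, ISID) is an assumption for illustration,
not the draft's pseudocode.

```python
from typing import Callable, Dict, List, Tuple

SessionKey = Tuple[str, str]  # (initiator name, ISID)


def handle_login(
    sessions: Dict[SessionKey, List[str]],
    initiator_name: str,
    isid: str,
    authenticate: Callable[[], bool],
    new_connection: str,
) -> str:
    """Process a session login on a new connection.

    Nothing is torn down until the initiator has passed whatever
    authentication is required (which may be none).
    """
    if not authenticate():
        # An unauthenticated login must not disturb the existing session.
        return "rejected"
    key = (initiator_name, isid)
    outcome = "new-session"
    if key in sessions:
        # Implicit logout: drop all connections of the old session.
        sessions.pop(key)
        outcome = "implicit-logout"
    sessions[key] = [new_connection]
    return outcome
```

With this ordering, knowing another initiator's name and ISID is not
enough to get its session dropped.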

Steph


From owner-ips@ece.cmu.edu  Fri Apr 27 19:52:49 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id TAA20107
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 19:52:49 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RMUmM13339
	for ips-outgoing; Fri, 27 Apr 2001 18:30:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RMU8A13262
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 18:30:08 -0400 (EDT)
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id SAA46912;
	Fri, 27 Apr 2001 18:22:39 -0400
Received: from f4n49e (d03nm065h.boulder.ibm.com [9.99.140.49])
	by westrelay02.boulder.ibm.com (8.8.8m3/NCO v4.96.1.0) with ESMTP id QAA131924;
	Fri, 27 Apr 2001 16:30:05 -0600
X-Priority: 1 (High)
Importance: Normal
Subject: RE: iSCSI: Require iSCSI to use packet formats similar to FC, etc	??
To: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
Cc: <ips@ece.cmu.edu>
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OFEF8AB64C.CC82CDB5-ON88256A3B.00797051@LocalDomain>
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Fri, 27 Apr 2001 15:29:34 -0700
X-MIMETrack: Serialize by Router on D03NM065/03/M/IBM(Release 5.0.6 |December 14, 2000) at
 04/27/2001 04:29:55 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu
Precedence: bulk


There are a number of things we have done with iSCSI, such as Multiple
Connections per Session, which are important and do not map to Fibre
Channel.  (Most of you can also call out things in iSCSI which are
important to you, and different from FC).  The important thing is to carry
the semantics of SCSI, and to require as little state as possible in the
Gateways.  There are probably a number of folks on the reflector that are
building Gateways;  the most famous/infamous is Cisco.
So when Mark said it was not an issue, and that other items were much more
important, that locked the answer for me.  We are near the end of this PDU
format journey, and now that the OP Code issue is solved, we should be on
the tail end of the process.  Changing formats of PDUs should not be
acceptable now, unless something is broken.  So I do not think it is
important to add requirement words, which have not been needed up to now
and could distract us from finishing.

We have bigger focus items now, like Naming, Discovery, Security, etc.  We
should focus on these items, and not on items which are non-problems.

.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Internet address: hufferd@us.ibm.com


"KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>@ece.cmu.edu
on 04/27/2001 11:19:23 AM

Sent by:  owner-ips@ece.cmu.edu


To:   "'Douglas Otis'" <dotis@sanlight.net>, "Ips Reflector (E-mail)"
      <ips@ece.cmu.edu>
cc:
Subject:  RE: iSCSI: Require iSCSI to use packet formats similar to FC, etc
      ??



> As the rules change from technology to technology, there are
> issues involved
> in this endeavor that will place into focus some potential
> problems.  I tend
> to think that an independent delivery protocol could be
> developed.

An independent protocol that is agnostic to the transport medium would
probably be too general to be optimal in any specific transport
environment.
I think the solution is to make SCSI truly independent of the transport
(strictly layered on top of the transport).  It seems that T10 is realizing
this and some are working towards that goal.

IMHO, requiring that iSCSI match other transport formats is going at the
problem from the wrong end.

Marj





From owner-ips@ece.cmu.edu  Fri Apr 27 20:00:24 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA20215
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 20:00:23 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RMpkL14344
	for ips-outgoing; Fri, 27 Apr 2001 18:51:46 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from atlrel1.hp.com (atlrel1.hp.com [156.153.255.210])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RMomA14289
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 18:50:48 -0400 (EDT)
Received: from xatlrelay1.atl.hp.com (xatlrelay1.atl.hp.com [15.45.89.190])
	by atlrel1.hp.com (Postfix) with ESMTP
	id F0B42D54; Fri, 27 Apr 2001 18:50:47 -0400 (EDT)
Received: from xpabh4.corp.hp.com (xpabh4.corp.hp.com [15.58.136.1])
	by xatlrelay1.atl.hp.com (Postfix) with ESMTP
	id E50091F53E; Fri, 27 Apr 2001 18:48:51 -0400 (EDT)
Received: by xpabh4.corp.hp.com with Internet Mail Service (5.5.2653.19)
	id <JK4D1SYS>; Fri, 27 Apr 2001 15:50:42 -0700
Message-ID: <6BD67FFB937FD411A04F00D0B74FE87802A09023@xrose06.rose.hp.com>
From: "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
To: "'Charles Monia'" <cmonia@NishanSystems.com>,
        "'Douglas Otis'" <dotis@sanlight.net>, ips@ece.cmu.edu
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Fri, 27 Apr 2001 15:50:38 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

I'm assuming Charles is implying that he doesn't expect to have the memory
and memory bandwidth limitations of an adapter?  FCIP and iFCP are
switch-sized applications, not NIC card-sized?

Marj 

> > The encapsulation proposal is devoid of techniques for 
> > framing.  How do you
> > expect to see this lack of framing resolved?  Do you expect 
> > to use this
> > adapter and not have a means for framing?
> > 
> 
> No, I don't expect to use any adapter.
> 
> Charles


From owner-ips@ece.cmu.edu  Fri Apr 27 20:02:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA20266
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 20:02:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RN1ka14820
	for ips-outgoing; Fri, 27 Apr 2001 19:01:46 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RN1eA14814
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 19:01:40 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRX2B>; Fri, 27 Apr 2001 16:01:34 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173488@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "'KRUEGER,MARJORIE (HP-Roseville,ex1)'" <marjorie_krueger@hp.com>,
        Charles Monia <cmonia@NishanSystems.com>,
        "'Douglas Otis'"
	 <dotis@sanlight.net>, ips@ece.cmu.edu
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Fri, 27 Apr 2001 16:01:33 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi Marj:

See below.

> -----Original Message-----
> From: KRUEGER,MARJORIE (HP-Roseville,ex1)
> [mailto:marjorie_krueger@hp.com]
> Sent: Friday, April 27, 2001 3:51 PM
> To: 'Charles Monia'; 'Douglas Otis'; ips@ece.cmu.edu
> Subject: RE: iSCSI reqmts and Ethernet adapters
> 
> 
> I'm assuming Charles is implying that he doesn't expect to 
> have the memory
> and memory bandwidth limitations of an adapter?  FCIP and iFCP are
> switch-sized applications, not NIC card-sized?
> 

Yes, that's true. We also don't plan to use a separate adapter as the GbE
interface for our box, so we wouldn't even indirectly benefit from the
adapter's being able to do framing.  I'm pretty sure that's true for the
FCIP folks as well.  I assume that's what Doug was getting at.

Charles


From owner-ips@ece.cmu.edu  Fri Apr 27 21:48:00 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA22729
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 21:48:00 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3S0NwJ18517
	for ips-outgoing; Fri, 27 Apr 2001 20:23:58 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway.sanlight.org (adsl-63-202-160-80.dsl.snfc21.pacbell.net [63.202.160.80])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3S0NYA18508
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 20:23:34 -0400 (EDT)
Received: from ljoy (10.0.0.18.lan.sanlight.net [10.0.0.18])
	by gateway.sanlight.org (8.11.0/8.11.0) with SMTP id f3S1V9135493;
	Fri, 27 Apr 2001 18:31:09 -0700 (PDT)
	(envelope-from dotis@sanlight.net)
From: "Douglas Otis" <dotis@sanlight.net>
To: "Charles Monia" <cmonia@NishanSystems.com>,
        "'KRUEGER,MARJORIE \(HP-Roseville,ex1\)'" <marjorie_krueger@hp.com>,
        <ips@ece.cmu.edu>
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Fri, 27 Apr 2001 17:21:13 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJGEPECGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
In-Reply-To: <B300BD9620BCD411A366009027C21D9B173487@ariel.nishansystems.com>
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Charles,

I suspect that I made too great a generalization with respect to networking
in describing hardware as adapters.  If you feel your efforts are in such a
niche market as not to warrant the consideration this effort has on other
networking, then perhaps your application is not of a significant concern.
You would make a good politician.  I suspect however we would make better
progress openly discussing this framing issue as it relates to IPS.

As this was once a hot issue, I don't think it was solved by not doing
anything in this area.  As you are running high data rates, you will be
sensitive to packet loss on longer links as will the next.  The need to do
framing is important.  This need is no less important for your application.
Having a proprietary solution for a specific application was my concern.  If
we do not make headway in this area quickly, each solution will be
proprietary.

Doug

> Hi:
>
> Please see below.
>
>
> > -----Original Message-----
> > From: Douglas Otis [mailto:dotis@sanlight.net]
> > Sent: Wednesday, April 25, 2001 5:38 PM
> > To: Charles Monia; 'KRUEGER,MARJORIE (HP-Roseville,ex1)';
> > ips@ece.cmu.edu
> > Subject: RE: iSCSI reqmts and Ethernet adapters
> >
> >
> > Charles,
> >
> > The encapsulation proposal is devoid of techniques for
> > framing.  How do you
> > expect to see this lack of framing resolved?  Do you expect
> > to use this
> > adapter and not have a means for framing?
> >
>
> No, I don't expect to use any adapter.
>
> Charles
>
>
>
>
>
>
>



From owner-ips@ece.cmu.edu  Fri Apr 27 21:49:26 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA22748
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 21:49:26 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3RNUmi16261
	for ips-outgoing; Fri, 27 Apr 2001 19:30:48 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RNUZA16255
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 19:30:35 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRXK6>; Fri, 27 Apr 2001 16:30:27 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B173489@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "'John Hufferd'" <hufferd@us.ibm.com>,
        "KRUEGER,MARJORIE (HP-Roseville,ex1)" <marjorie_krueger@hp.com>
Cc: ips@ece.cmu.edu
Subject: RE: iSCSI: Require iSCSI to use packet formats similar to FC, etc
		??
Date: Fri, 27 Apr 2001 16:30:26 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

From an implementation standpoint, I'd much rather see the requirements
document reflect a commitment to freezing the PDU formats.  I don't know if
that's possible. In any event, I don't think the document should provide a
licence to destabilize the PDUs.

Charles
> -----Original Message-----
> From: John Hufferd [mailto:hufferd@us.ibm.com]
> Sent: Friday, April 27, 2001 3:30 PM
> To: KRUEGER,MARJORIE (HP-Roseville,ex1)
> Cc: ips@ece.cmu.edu
> Subject: RE: iSCSI: Require iSCSI to use packet formats similar to FC,
> etc ??
> 
> 
> 
> There are a number of things we have done with iSCSI, such as Multiple
> Connections per Session which are important that do not map to Fibre
> Channel.  (Most of you can also call out things in iSCSI which are
> important to you, and different from FC).  The important 
> thing is to carry
> the semantics of SCSI, and cause as small a state as possible to be
> required in the Gateways.  There are probably a number of folks on the
> reflector that are building Gateways,  the most 
> famous/infamous is CISCO.
> So when Mark said it was not an issue, that other items were lots more
> important, that locked the answer for me.  We are near the 
> end of this PDU
> format journey, and now that the OP Code issue is solved, we 
> should be on
> the tail end of the process.  Changing formats of PDUs should not  be
> acceptable, now, unless something is broken.  So I do not think it is
> important to add requirement words, that could distract us 
> from finishing,
> which have not been needed up to now.
> 
> We have bigger focus items now, like Naming, Discovery, 
> Security, etc. we
> should focus on these items, and not on items which are non-problems.
> 
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Internet address: hufferd@us.ibm.com
> 
> 
> "KRUEGER,MARJORIE (HP-Roseville,ex1)" 
> <marjorie_krueger@hp.com>@ece.cmu.edu
> on 04/27/2001 11:19:23 AM
> 
> Sent by:  owner-ips@ece.cmu.edu
> 
> 
> To:   "'Douglas Otis'" <dotis@sanlight.net>, "Ips Reflector (E-mail)"
>       <ips@ece.cmu.edu>
> cc:
> Subject:  RE: iSCSI: Require iSCSI to use packet formats 
> similar to FC, etc
>       ??
> 
> 
> 
> > As the rules change from technology to technology, there are
> > issues involved
> > in this endeavor that will place into focus some potential
> > problems.  I tend
> > to think that an independent delivery protocol could be
> > developed.
> 
> An independent protocol that is agnostic to the transport medium would
> probably be too general to be optimal in any specific transport
> environment.
> I think the solution is to make SCSI truly independent of 
> the transport
> (strictly layered on top of the transport).  It seems that 
> T10 is realizing
> this and some are working towards that goal.
> 
> IMHO, requiring that iSCSI match other transport formats is 
> going at the
> problem from the wrong end.
> 
> Marj
> 
> 
> 


From owner-ips@ece.cmu.edu  Fri Apr 27 22:57:04 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA24576
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 22:57:04 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3S17pA20519
	for ips-outgoing; Fri, 27 Apr 2001 21:07:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from gateway1.readyhosting.com ([63.119.175.29])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3RDsWA14752
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 09:54:32 -0400 (EDT)
Received: from mailserver16 [63.119.175.16] by gateway1.readyhosting.com with ESMTP
  (SMTPD32-6.06) id A89B9DF00D8; Fri, 27 Apr 2001 08:48:11 -0500
Received: from eddylaptop [66.31.72.237] by mailserver16 with ESMTP
  (SMTPD32-6.06) id A87010001C8; Fri, 27 Apr 2001 08:47:28 -0500
Reply-To: <Eddy@Quicksall.com>
From: "Eddy Quicksall" <Eddy@Quicksall.com>
To: <ips@ece.cmu.edu>
Subject: meeting in Nashua
Date: Fri, 27 Apr 2001 09:53:31 -0400
Message-ID: <003101c0cf21$7248fbf0$0100a8c0@eddylaptop>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.3018.1300
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Can someone tell me where the meeting in Nashua will be held and how to get
there?

mailto:Eddy@Quicksall.com



From owner-ips@ece.cmu.edu  Fri Apr 27 22:58:09 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id WAA24590
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 22:58:09 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3S1Hps20926
	for ips-outgoing; Fri, 27 Apr 2001 21:17:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxbh4.isus.emc.com (mxbh4.isus.emc.com [128.221.10.33])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3S1HMA20912
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 21:17:22 -0400 (EDT)
Received: by mxbh4.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <JG35CDCF>; Fri, 27 Apr 2001 21:17:16 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080154D9@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: Eddy@quicksall.com, ips@ece.cmu.edu
Subject: RE: meeting in Nashua
Date: Fri, 27 Apr 2001 21:17:14 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

http://www.t10.org/ftp/t10/announce/ann-m043.pdf

> -----Original Message-----
> From:	Eddy Quicksall [SMTP:Eddy@quicksall.com]
> Sent:	Friday, April 27, 2001 9:54 AM
> To:	ips@ece.cmu.edu
> Subject:	meeting in Nashua
> 
> Can someone tell me where the meeting in Nashua will be held and how to
> get
> there?
> 
> mailto:Eddy@Quicksall.com


From owner-ips@ece.cmu.edu  Fri Apr 27 23:00:47 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA24643
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 23:00:46 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3S22s722798
	for ips-outgoing; Fri, 27 Apr 2001 22:02:54 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from mxic1.isus.emc.com ([168.159.129.100])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3S22iA22789
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 22:02:44 -0400 (EDT)
Received: by MXIC1 with Internet Mail Service (5.5.2650.21)
	id <2NTPA5P9>; Fri, 27 Apr 2001 22:04:13 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080154DB@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: iSCSI Requirements -03 comments
Date: Fri, 27 Apr 2001 22:02:36 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

One more set of comments from me.  I wrote these on
a plane yesterday -- with luck I didn't miss anything
crucial, but plead effects of altitude for oversights
and mistakes herein.

-- Conventions used in this document

Add a sentence indicating that use
of RFC 2119 terms in this informational document reflects requirements
for the protocol specification rather than the usual use of RFC 2119
terms to indicate requirements on implementations.

-- Section 1 - Summary

>From 2.2

"improve on the state of the art for SCSI interconnects" - not clear what
	this means for the protocol, Gbit Ethernet has performance limits by
	comparison to the forthcoming 2 Gbit Fibre Channel.

"cost competitive with alternative storage networking technologies" is also
	hard to do in a protocol specification alone.

--> For these two, and any other requirements over which we really don't
	have complete control (in contrast to technical requirements), use
	lower case "must", etc. to avoid invoking RFC-2119.

>From 2.4

Modify FIFO transport to allow error conditions to violate FIFO - if
violations are possible, then MUST be able to configure enforcement of
ordering in presence of error conditions.  Take text for this from previous
discussion on list to resolve Santosh Rao's comment on this issue.

>From 4.1

I agree with Bob Snively's comment to delete extensibility to other data
integrity digest formats.  In practice, this is going to require a revision
to the basic protocol specification document and is not to be done lightly.
It'd be ok to require a protocol version number to help support this sort of
future change.

>From 5.2

Leave SAM2 completeness text as is.  Features that T10 wants deleted are
cases in which the "SHOULD" in "SHOULD make such a feature either
RECOMMENDED or REQUIRED in implementations" is overruled (i.e., we have
carefully weighed the implications and done something different).

Delete "SHOULD track changes to SCSI and SAM".  Once the RFC is done, it's
done - it can be subsequently revised.  Replace it with something about
specification development - the IPS WG SHOULD track such changes to make
sure that the RFC is based on the most current state of SCSI and SAM.

>From 6.3

"SHOULD NOT preclude ..." is backwards.  The requirement to pass muster with
the IESG is "MUST specify and REQUIRE the implementation of ..."

>From 7.1

Make it clear that "(URL)" is an example.

"existing naming authorities" - "authorities" is the wrong word,
"approaches" is
	one alternate possibility.

"SHOULD deal with the complications of the new SCSI security architecture."
- Rephrase/rewrite to avoid "deal with the complications".  I think this
is about access controls, and hence might be phrased in terms of ensuring
that support is provided for them.

-- Section 2.1

   In the 
   local area, TCP's adaptive retransmission timers provide for 
   automatic and rapid error detection and recovery.

Delete the words "adaptive" and "timers".

In step (6) of deployment, change "SAN" to "storage".

   might also support all host protocols that use TCP (NFS, CIFS, HTTP, etc).

Replace "all" with "other".

(IPSWG) --> (IPS WG)

-- Section 2.2

See above comment on removing upper case from requirements not completely
within our control.

    Conventional storage access is of a stop-and-wait 

Replace "of" with "often".

Direct data placement - pick up Bob Snively's rewrite.

-- Section 2.3

Framing discussion needs to indicate that framing MUST be OPTIONAL to
implement.  Add a sentence indicating "SHOULD specify a default framing
mechanism" (i.e., specify the first framing mechanism that MUST be
implemented *if* any mechanism is implemented).  This should be responsive
to some of Doug Otis's concerns about framing mechanism consistency, but I
absolutely, positively will not allow any document to REQUIRE a common
framing mechanism between iSCSI and other protocols for procedural reasons
discussed previously on the list.

-- Section 2.4

Delete the paragraph starting with "In the presence of connection binding,
there are two ways to assign features to connections" and the paragraph
starting with "An alternate approach that was discussed extensively ..."
Recording of WG discussions and rationale for design decisions is better
placed in protocol specification documents rather than requirements
documents.

-- Section 4.1

Discussion of separate header and data digests is internally inconsistent.
Fix this by changing the "MAY" and "MUST" to two "SHOULD"s (i.e., SHOULD
specify separate header and data digests).

Delete "SHOULD" sentence on extensibility to other digest methods.

-- Section 4.2

Add a phrase or sentence to the end of the first paragraph indicating that
recovery is frequently infeasible for tape due to the absence of block
addresses in the SCSI command set used for tape devices.  This'll help
set up the non-idempotent recovery requirement.

Change ""storage servers"" to "network ports".

-- Section 5.2

See above comments on leaving "comply with SAM" text as is and
on tracking changes to SAM and SCSI.

In discussing the requirement to support all command sets and
device types, call out long CDB formats and bi-directional
commands as examples.

Delete the "There is considerable discussion on the mailing list
archives" sentence at the end of the ACA paragraph because
I believe AIX is known to use ACA.

Pick up resolution of list discussion of command ordering in the face of
errors.

-- Section 6.1

Add a MUST for dynamic security mechanism being secure against
man-in-the-middle attacks that would cause weak or no security to be
negotiated when that was not the outcome intended by the negotiating parties.

-- Section 6.2 and 6.3

See earlier comment on "MUST NOT preclude" - turns up in both sections.  Add
connection hijacking as a specific case of source spoofing because it
imposes additional requirements on the mechanism used for this.

-- Section 6.4

I'd prefer to call this "confidentiality" to avoid the fact that "privacy"
has more than one meaning.

-- Section 7.1

Make it clear that URL (RFC 1738) format is RECOMMENDED, not REQUIRED.

"path on which it is found" --> "path by which it is accessed".

LU WWN - iSCSI MAY REQUIRE this to be implemented.

Discussion of SCSI security is fine here, BUT delete the T10 document
reference - we can't make a normative reference to that sort of thing.
Provide a general description of what T10 is up to.  The first sentence
of this paragraph should be used in Section 1 instead of the "deal with
the complications" language that's currently there.

-- Section 7.2

Qualify the "provide some means of determining whether an iSCSI
service is available through an IP address" requirement to either
apply to an <IP address, TCP port> pair or apply to the <IP address>
and the default iSCSI TCP port.  The underlying issue is that the
straightforward thing to do is contact one TCP port and see if
iSCSI responds - if it doesn't, one shouldn't be REQUIRED to
do a port scan to find the
unauthorized iSCSI server running at port 12835, and doing that
scan will actually cause discovery and zoning problems in some
configurations.
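
To illustrate the point above, here is a minimal Python sketch of checking
exactly one <IP address, TCP port> pair rather than scanning.  The names
probe_endpoint and tcp_port_answers are hypothetical, and the default port
value 3260 is an assumption (it is the port IANA later registered for iSCSI,
not something settled in this message).  A successful TCP connect only shows
that something is listening; a real check would need an iSCSI login exchange.

```python
import socket
from typing import Callable, Optional, Tuple

# Assumption: 3260 is used here as the default iSCSI TCP port.
DEFAULT_ISCSI_PORT = 3260


def probe_endpoint(ip: str, port: Optional[int] = None) -> Tuple[str, int]:
    """Return the single endpoint to contact: the given port, else the default."""
    return (ip, DEFAULT_ISCSI_PORT if port is None else port)


def tcp_port_answers(
    endpoint: Tuple[str, int],
    timeout: float = 2.0,
    connect: Callable = socket.create_connection,
) -> bool:
    """Try one TCP connection to one endpoint; never iterate over ports."""
    try:
        conn = connect(endpoint, timeout)
    except OSError:
        return False
    conn.close()
    return True
```

The connect parameter is injectable only so the sketch can be exercised
without touching the network.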

    SCSI protocol-dependent techniques SHOULD be used for further 
    discovery beyond the iSCSI layer.

Change this to:

    Standard SCSI discovery techniques (e.g., REPORT_LUNS), as specified
    in the appropriate SCSI standards, MUST be used ...

On target discovery:
	given an IP end point on its well-known port
Generalize this to an IP endpoint and arbitrary TCP port (e.g., the default
TCP port for iSCSI as allocated by IANA)

-- References

We need Ralph Weber's advice about how to write stable references to T10
work in progress.  References 3-5 and 9 (SAM, SPC, CAM, FCP-2) have this
issue.  I think Reference 6 to the Hafner draft has to be deleted because
I don't see how to readily make that an archival/durable reference.
Reference 7 is incomplete, and reference 8 is fine.

------ DLB's comments on Bob Snively's comments --------

(1) Ok, add this requirement that one LU can't block another.
(2) Ok to rewrite ordered command delivery requirement to apply
	to I_T_L nexii, but make sure not to preclude a design
	(e.g., the one iSCSI is currently using) that satisfies
	this by doing command sequencing on I_T nexii.
(3) Change the phrase "passive attack" to "eavesdropping" or
	"traffic monitoring" and rephrase appropriately.
(4) Ok to delete the marketing-oriented text, as I don't think
	it's crucial to the document.  Ask Bob to supply a
	complete list of what he wants deleted.
(5) I like Bob's rewrite of the direct data placement text.
(6) I agree - delete all mention of the alternate connection binding
	model.  This sort of design decision rationale belongs in
	the protocol specification, not the requirements document.
(7) Bob's clarification of negotiation requirements for optional
	features looks reasonable to me.
(8) I agree - there should be one iSCSI CRC, and that should be it, period.
	It can be revised in a subsequent RFC which'll get a new document
	number.
(9) Actually, the text is almost correct as written.  Here's how I
	think it should read:

   "In order to be considered a SCSI transport, the iSCSI standard MUST 
   comply with the requirements of the SCSI Architecture Model [SAM2] 
   for a SCSI transport.  Any feature SAM2 requires in a valid 
   transport mapping MUST be specified by iSCSI.  The iSCSI document 
   SHOULD specify that each such feature is RECOMMENDED, 
   or REQUIRED to implement, although they may be OPTIONAL to use."

	When T10 says iSCSI should not specify something that's in SAM,
	that winds up being an exception to the "SHOULD" - the WG must
	carefully consider all of the consequences and implications
	(cf. RFC 2119 definition of SHOULD) before doing something
	different.  Bob and I are in violent agreement about the goal
	and intent and disagreeing only on wording to achieve it,
	perhaps this explanation ought to be added?

(10) Bob's proposed new text for gateway requirements looks good.
(11) We can't delete the congestion control requirement, but adding
	a sentence indicating that TCP satisfies this requirement is
	appropriate.

FWIW, my taste in RFC-2119 terms is to use MUST instead of SHALL
to avoid any possible SHALL vs. SHOULD confusion, but this is my personal
taste and not something that I will enforce as a WG co-chair.

From a procedural standpoint, I think we're going to need a -04 version
of the document for those who have submitted comments to check to make
sure that the changes are satisfactory.  Many thanks to Marjorie for her
patience with this.

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Fri Apr 27 23:57:58 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA25749
	for <ips-archive@odin.ietf.org>; Fri, 27 Apr 2001 23:57:57 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3S1hrb22021
	for ips-outgoing; Fri, 27 Apr 2001 21:43:53 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from server1.NishanSystems.COM (smtp.nishansystems.com [216.217.36.162])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3S1hOA22007
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 21:43:24 -0400 (EDT)
Received: by smtp.nishansystems.com with Internet Mail Service (5.5.2653.19)
	id <HPJTRXQA>; Fri, 27 Apr 2001 18:43:14 -0700
Message-ID: <B300BD9620BCD411A366009027C21D9B17348A@ariel.nishansystems.com>
From: Charles Monia <cmonia@NishanSystems.com>
To: "'Douglas Otis'" <dotis@sanlight.net>,
        Charles Monia
	 <cmonia@NishanSystems.com>,
        "'KRUEGER,MARJORIE (HP-Roseville,ex1)'"
	 <marjorie_krueger@hp.com>,
        ips@ece.cmu.edu
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Fri, 27 Apr 2001 18:43:13 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

Hi:

Are you doing any product development?  If so, it might be even more
productive to share your first hand insights instead of lecturing others.

Charles
> -----Original Message-----
> From: Douglas Otis [mailto:dotis@sanlight.net]
> Sent: Friday, April 27, 2001 5:21 PM
> To: Charles Monia; 'KRUEGER,MARJORIE (HP-Roseville,ex1)';
> ips@ece.cmu.edu
> Subject: RE: iSCSI reqmts and Ethernet adapters
> 
> 
> Charles,
> 
> I suspect that I made too great a generalization with respect 
> to networking
> in describing hardware as adapters.  If you feel your efforts 
> are in such a
> niche market as not to warrant the consideration this effort 
> has on other
> networking, then perhaps your application is not of a 
> significant concern.
> You would make a good politician.  I suspect however we would 
> make better
> progress openly discussing this framing issue as it relates to IPS.
> 
> As this was once a hot issue, I don't think it was solved by not doing
> anything in this area.  As you are running high data rates, 
> you will be
> sensitive to packet loss on longer links as will the next.  
> The need to do
> framing is important.  This need is no less important for 
> your application.
> Having a proprietary solution for a specific application was 
> my concern.  If
> we do not make headway in this area quickly, each solution will be
> proprietary.
> 
> Doug
> 
> > Hi:
> >
> > Please see below.
> >
> >
> > > -----Original Message-----
> > > From: Douglas Otis [mailto:dotis@sanlight.net]
> > > Sent: Wednesday, April 25, 2001 5:38 PM
> > > To: Charles Monia; 'KRUEGER,MARJORIE (HP-Roseville,ex1)';
> > > ips@ece.cmu.edu
> > > Subject: RE: iSCSI reqmts and Ethernet adapters
> > >
> > >
> > > Charles,
> > >
> > > The encapsulation proposal is devoid of techniques for
> > > framing.  How do you
> > > expect to see this lack of framing resolved?  Do you expect
> > > to use this
> > > adapter and not have a means for framing?
> > >
> >
> > No, I don't expect to use any adapter.
> >
> > Charles
> >
> >
> >
> >
> >
> >
> >
> 


From owner-ips@ece.cmu.edu  Sat Apr 28 00:52:54 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id AAA26623
	for <ips-archive@odin.ietf.org>; Sat, 28 Apr 2001 00:52:54 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3S2et424431
	for ips-outgoing; Fri, 27 Apr 2001 22:40:55 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from maho3msx2.isus.emc.com (maho3msx2.isus.emc.com [128.221.11.32])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3S2eQA24415
	for <ips@ece.cmu.edu>; Fri, 27 Apr 2001 22:40:26 -0400 (EDT)
Received: by maho3msx2.isus.emc.com with Internet Mail Service (5.5.2650.21)
	id <28S76L2R>; Fri, 27 Apr 2001 22:40:21 -0400
Message-ID: <0F31E5C394DAD311B60C00E029101A07080154DD@corpmx9.isus.emc.com>
From: Black_David@emc.com
To: ips@ece.cmu.edu
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Fri, 27 Apr 2001 22:40:19 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

I think this thread needs to end here.  As noted earlier,
there are a couple of framing mechanism drafts pending
before TSVWG.  Work on TCP framing applicable to any protocol
using TCP is in TSVWG's domain, *not* IPS's.  I would suggest
that further use of list bandwidth to complain about that
assignment of responsibility is counterproductive - I
would suggest contacting the Area Directors directly.
In practice, if TSVWG progresses a suitable framing draft
in a timely fashion, all the IPS protocols can pick it
up by reference.  I've shown up in TSVWG in the past to
help explain why IPS believe this to be important, and
will continue to do so.

In addition to directing further complaints about the assignment
of responsibility for framing to TSVWG and/or the Area Directors,
I would ask Doug Otis to refrain from commenting on implementations
with which he's not familiar or assuming that his implementation
approach is the only feasible/viable one.  Both of these have
caused unnecessary irritation and problems on the list in the
recent past, and have contributed to the large volume of recent
messages.

Any replies to this message should be sent DIRECTLY to me -
DO NOT REPLY TO THE LIST!

Thanks,
--David
---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------



From owner-ips@ece.cmu.edu  Sat Apr 28 04:25:39 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA11719
	for <ips-archive@odin.ietf.org>; Sat, 28 Apr 2001 04:25:38 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3S6N5h03065
	for ips-outgoing; Sat, 28 Apr 2001 02:23:05 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3S6M2A03047
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 02:22:03 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id IAA194998
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 08:21:59 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id IAA116920
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 08:21:58 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3C.0022F494 ; Sat, 28 Apr 2001 08:21:48 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3C.0022F327.00@d12mta02.de.ibm.com>
Date: Sat, 28 Apr 2001 09:27:06 +0300
Subject: Re: iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



We could add the RefCmdSN and it may help plug the hole, but unless the
Abort is issued on the same connection as the original command we can't be
sure that the old one will not pop up (as we enable ExpCmdSN to move on we
don't even have the 2**31-1 protection bracket :-)). Thus sending
another nop/abort on the same connection is still required.
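
The 2**31-1 bracket refers to comparing 32-bit sequence numbers modulo 2**32.
A minimal sketch of that comparison, assuming RFC 1982 style serial number
arithmetic is what applies to CmdSN; sn_lt is a hypothetical helper name:

```python
SERIAL_BITS = 32
MODULUS = 1 << SERIAL_BITS          # 2**32
HALF = 1 << (SERIAL_BITS - 1)       # 2**31


def sn_lt(a: int, b: int) -> bool:
    """True if serial number a precedes b, allowing for wraparound.

    b counts as "after" a only when it is 1 .. 2**31-1 steps ahead of a
    modulo 2**32.  A distance of exactly 2**31 is treated as unordered.
    """
    distance = (b - a) % MODULUS
    return 0 < distance < HALF
```

A stale command is only recognizable as old while it stays inside that
window relative to ExpCmdSN, which is the protection the paragraph above
says is lost once ExpCmdSN is allowed to move past the hole.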

To simplify the whole process I will:

 a - add RefCmdSN to the Task Management
 b - add a "Command Not Received Yet" response to the answers

 c - add a part to 7 reading:

1.1  How to Abort Safely a Command that Was Not Received

   To abort safely a task for which the task abort answer is "Command Not
   Received Yet" the initiator must issue another abort command on the same
   connection as the original command unless this connection was logged out
   in which case it may send it on any connection. The expected response
   for the second abort is Function Complete (if the command did not
   arrive) or "Task was not in task set".


   This convoluted scheme is necessary as the target does not know to what
   connection the hole is related and we don't want to force abort task to
   the same connection as the original command.
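
The initiator side of the procedure above could be sketched roughly as
follows.  This is an illustration only: Task, abort_safely, send_abort and
is_logged_out are hypothetical names, and the response strings are simply the
ones quoted in the proposed text.

```python
from dataclasses import dataclass
from typing import Callable

NOT_RECEIVED_YET = "Command Not Received Yet"
FUNCTION_COMPLETE = "Function Complete"
TASK_NOT_IN_TASK_SET = "Task was not in task set"


@dataclass
class Task:
    itt: int             # Initiator Task Tag of the command being aborted
    cmd_sn: int          # CmdSN of the original command (RefCmdSN in the abort)
    connection_id: int   # connection the original command was sent on


def abort_safely(
    task: Task,
    send_abort: Callable[[int, Task], str],
    any_connection: int,
    is_logged_out: Callable[[int], bool],
) -> str:
    """Issue an abort; repeat it on the original connection if required."""
    # The first abort is not forced onto the original connection.
    response = send_abort(any_connection, task)
    if response != NOT_RECEIVED_YET:
        return response
    # The target has not seen the command yet: send a second abort on the
    # original connection, unless that connection was logged out.
    if is_logged_out(task.connection_id):
        second = any_connection
    else:
        second = task.connection_id
    return send_abort(second, task)
```

The second response is expected to be either FUNCTION_COMPLETE or
TASK_NOT_IN_TASK_SET; the sketch returns it without interpreting it.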


   Julo

Santosh Rao <santoshr@cup.hp.com> on 27/04/2001 20:46:53

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:   matt_wakeley@agilent.com
Subject:  Re: iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery




Julian,

The conclusion on this thread was that some text was to be added to the
spec to address this issue. The rev 06 does not have any text to this
effect. It would help to explicitly describe how initiators should plug
the hole in CmdSN when they do not intend to use "command retry".

Also, regarding the use of NOP-OUT to fill the hole, why not just use
Abort Task for the same purpose ? Do we need a 2nd outbound PDU from the
initiator just to fill the hole ? When the initiator encounters a ULP
timeout, it would use an Abort Task to error the I/O. If the Abort Task
can contain the CmdSN of the original command being aborted, targets can
fill the hole based on that information, without requiring a second
outbound PDU from the initiator for this purpose.
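
A rough target-side sketch of this idea, for illustration only.  The class
and method names are hypothetical, CmdSN wraparound is ignored for brevity,
and the sketch does not handle the case where the original command shows up
later on another connection.

```python
class OrderedCommandQueue:
    """Delivers commands in CmdSN order; an abort can plug a missing CmdSN."""

    def __init__(self, exp_cmd_sn: int = 0):
        self.exp_cmd_sn = exp_cmd_sn
        # CmdSN -> command, or None for a hole plugged by an abort
        self.pending = {}

    def _drain(self):
        """Deliver every command that is now in order."""
        delivered = []
        while self.exp_cmd_sn in self.pending:
            cmd = self.pending.pop(self.exp_cmd_sn)
            if cmd is not None:
                delivered.append(cmd)
            self.exp_cmd_sn += 1
        return delivered

    def receive(self, cmd_sn: int, cmd):
        """Queue a command and return whatever becomes deliverable."""
        self.pending[cmd_sn] = cmd
        return self._drain()

    def abort_task(self, ref_cmd_sn: int):
        """Abort carrying the original CmdSN: fill the hole, do not deliver."""
        if ref_cmd_sn < self.exp_cmd_sn:
            return []  # already delivered; nothing to plug
        self.pending[ref_cmd_sn] = None
        return self._drain()
```

With ExpCmdSN at 5 and commands 6 and 7 already queued, abort_task(5)
releases both queued commands and advances ExpCmdSN to 8.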

Some text in the draft on this subject would be helpful to implementors.

- Santosh


julian_satran@il.ibm.com wrote:
>
> Santosh,
>
> You had a possible answer from Matt.  However I agree that we might want
> to address this in text



julian_satran@il.ibm.com wrote:
>
> If the hole is in the command queue and the task is just aborted the
> response to the abort task
> will unveil the fact that it did not reach destination.
>
> Initiator can recover from there in several ways - clear the task set,
> fill the hole with an iSCSI noop etc.
> The latter, I recall, was suggested to you by Matt Wakeley a while ago.
>
> None of them require any changes in the spec.
>
> Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 27/04/2001 04:56:26
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   Julian Satran/Haifa/IBM@IBMIL
> cc:   ips@ece.cmu.edu
> Subject:  Re: iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
>
> Julian,
>
> Could you please clarify if the below issue is going to be addressed in
> the iSCSI draft, as was discussed earlier.
> (http://ips.pdl.cs.cmu.edu/mail/msg03155.html).
>
> Specifically, is the spec going to address the issue of how initiators
> can plug a hole in CmdSN sequence when they detect a ULP timeout and/or
> choose not to use "command retry".
>
> Regards,
> Santosh
>
> julian_satran@il.ibm.com wrote:
> >
> > Santosh,
> >
> > You had a possible answer from Matt.  However I agree that we might want
> > to address this in text although a solution similar to that suggested by
> > Matt should be by now obvious to every implementer - the target should
> > leave a placeholder in the input queue until the command after gets
> > delivered.
> >
> > Julo
> >
> > Santosh Rao <santoshr@cup.hp.com> on 25/01/2001 21:38:04
> >
> > Please respond to Santosh Rao <santoshr@cup.hp.com>
> >
> > To:   IPS Reflector <ips@ece.cmu.edu>
> > cc:
> > Subject:  iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
> >
> > Julian & All,
> >
> > The draft is currently lacking a section that addresses abort I/O error
> > recovery. Specifically, how are CmdSN bridging issues to be handled in
> > the case where an initiator chooses not to retry an I/O [that failed on
> > a connection failure that affects the delivery of the command to the
> > target or a digest error at the target] because its ULP timer may have
> > expired.
> >
> > In such cases, the initiator can send an Abort Task to inform the
> > target that the I.T.T is being aborted and its corresponding CmdSN can
> > be bridged, instead of having the target stall infinitely in its attempt
> > to enforce ordering and await the missing CmdSN [which isn't going to
> > arrive, because the initiator did not retry the command].
> >
> > Regards,
> > Santosh
> >
> >  - santoshr.vcf
>  - santoshr.vcf
 - santoshr.vcf





From owner-ips@ece.cmu.edu  Sat Apr 28 11:09:45 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA14400
	for <ips-archive@odin.ietf.org>; Sat, 28 Apr 2001 11:09:44 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3SE5Qi29898
	for ips-outgoing; Sat, 28 Apr 2001 10:05:26 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3SE4gA29824
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 10:04:42 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id QAA232258
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 16:04:35 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id QAA60958
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 16:04:35 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3C.004D4ECC ; Sat, 28 Apr 2001 16:04:24 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3C.004D4D9D.00@d12mta02.de.ibm.com>
Date: Sat, 28 Apr 2001 17:09:44 +0300
Subject: Re: iSCSI : Logout Flavors.
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



it is a typo - Julo

The text reads now:

   Indicate the reason for Logout:

      0 - closes the session - the session is closed
      1 - closes the connection - the connection is closed - all commands
      associated with connection (if any) are aborted
      2 - removes the connection for recovery  - connection is closed and
      all commands associated with it (if any) are to be prepared for a new
      allegiance
      3 - removes the connection at target's request (requested through an
      Asynchronous Message) - will result in a logout only if the target
      issued the message
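
As an illustration only, the four reason codes above could be carried in
an implementation roughly as follows. The enum member names and the
command_fate helper are hypothetical; only the numeric values and the
described effects come from the text above.

```python
from enum import IntEnum


class LogoutReason(IntEnum):
    """Logout reason codes as listed above (member names are illustrative)."""
    CLOSE_SESSION = 0
    CLOSE_CONNECTION = 1
    REMOVE_CONNECTION_FOR_RECOVERY = 2
    REMOVE_CONNECTION_AT_TARGET_REQUEST = 3


def command_fate(reason):
    """What the text says happens to commands tied to the connection."""
    if reason == LogoutReason.CLOSE_CONNECTION:
        return "aborted"
    if reason == LogoutReason.REMOVE_CONNECTION_FOR_RECOVERY:
        return "prepared for new allegiance"
    # Codes 0 and 3: the text above does not spell out the command fate.
    return "not stated"


print(command_fate(LogoutReason(2)))
```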

Santosh Rao <santoshr@cup.hp.com> on 28/04/2001 02:52:47

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   santoshr@cup.hp.com
Subject:  iSCSI : Logout Flavors.




Julian,

The Rev 06 draft has a new flavor of logout in the reason code which is:
"closes the connections"

Could you please clarify if the above is a typo intended to read :
"closes the connection"

or is it indeed meant to clean up all the connections in the session. If
so, how is it different from "closes the session" ?

It would be helpful if a short description is provided for each of the
logout reason codes that describes the error recovery actions the
initiator and target perform on that reason code. For ex : which flavors
result in a cleanup of all outstanding tasks on the connection ?

Lastly, any thoughts on http://ips.pdl.cs.cmu.edu/mail/msg04371.html ?
Inclusion of such a table will provide clear directions for target
implementors on which properties are to be cleared on initiator actions.
Similarly, initiators will be clearly aware of the effects of their
actions on target's objects.

Regards,
Santosh
 - santoshr.vcf





From owner-ips@ece.cmu.edu  Sat Apr 28 11:09:48 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA14411
	for <ips-archive@odin.ietf.org>; Sat, 28 Apr 2001 11:09:47 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3SDkQb29018
	for ips-outgoing; Sat, 28 Apr 2001 09:46:26 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3SDjjA28972
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 09:45:45 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id PAA232380
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 15:45:37 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id PAA167840
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 15:45:38 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3C.004B94F4 ; Sat, 28 Apr 2001 15:45:33 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3C.004B93D5.00@d12mta02.de.ibm.com>
Date: Sat, 28 Apr 2001 16:50:53 +0300
Subject: Re: iSCSI : EnableACA
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



What you are suggesting is that during reset the target examines all CDBs
from all initiators.
Doesn't seem very practical to me.  What about a command in flight (that
was the first that had a NACA bit)?
It looks like ED might solve the problem for us making UA behave ACA like.

And EnableACA is not per session (although at a point I intended it to be).

Julo

Santosh Rao <santoshr@cup.hp.com> on 28/04/2001 00:14:59

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:   ENDL_TX@computer.org
Subject:  Re: iSCSI : EnableACA




Ralph Weber wrote:
>
> Julian,
>
> Regarding your discussion with Santosh...
>
> > The task management functions for iSCSI contain a
> > description of the behaviour I am talking about. This
> > was also subject to a long discussion on the list
> > that included the need for ACA behaviour for things
> > like task-set full, busy etc. - not considered for
> > ACA. For the latter T10 is handling making ACA
> > behaviour available.
> >
> My question is... Exactly what are the cases where
> T10 is NOT considering making ACA behavior available so
> that you believe the EnableACA function is necessary in
> iSCSI?
>
> My belief is that every behavior that EnableACA covers
> but T10 does not needs to be taken to T10 to see if
> they can find a better way to extend ACA coverage to
> that area.

Julian,

My original proposal for this issue stated that ANY exception condition
on an I/O that had the NACA bit set in its control byte of the CDB must
result in ACA being established.

"Any exception condition" would include :
- I/O being aborted by the target due to a LU Reset, Target Reset or
Clear Task Set issued by another initiator.
- I/O being aborted by the initiator through the use of Abort Task.
- QUEUE FULL, BUSY, RESERVATION CONFLICT, etc status returned on the
I/O.

The caveat to this rule is that while ACA is already active, a second
one would not be established.

The point I'm trying to make is :

1) ACA is being used in this context to aid in the preservation of
strict command ordering. Any attempt to enforce strict command ordering
would require initiators to set the NACA bit in the cdb for their I/Os.
Such initiators would be ACA aware and Clear ACA capable.

The argument that initiators may not be able to perform "Clear ACA" and
so need an additional control [through EnableACA] to prevent ACA from
being established is not applicable, because such initiators would not set
NACA in their CDBs, and in that case ACA would not be established.

2) ACA is a LU level construct and iSCSI is changing this granularity to
be a session level construct. For example, initiators could turn on ACA
to only 1 LU on which they had strong ordering requirements.

3) ACA is a ULP construct and any changes, if found necessary, should be
made in the ULP mode pages and not within iSCSI.

- Santosh
 - santoshr.vcf





From owner-ips@ece.cmu.edu  Sat Apr 28 11:10:01 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA14424
	for <ips-archive@odin.ietf.org>; Sat, 28 Apr 2001 11:10:00 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3SDvPs29571
	for ips-outgoing; Sat, 28 Apr 2001 09:57:25 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3SDujA29548
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 09:56:45 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id PAA237606
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 15:56:37 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id PAA50104
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 15:56:38 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3C.004C9432 ; Sat, 28 Apr 2001 15:56:26 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3C.004C9386.00@d12mta02.de.ibm.com>
Date: Sat, 28 Apr 2001 17:01:49 +0300
Subject: Re: iSCSI : Logout Flavors.
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



it is a typo - julo

Santosh Rao <santoshr@cup.hp.com> on 28/04/2001 02:52:47

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   santoshr@cup.hp.com
Subject:  iSCSI : Logout Flavors.




Julian,

The Rev 06 draft has a new flavor of logout in the reason code which is:
"closes the connections"

Could you please clarify if the above is a typo intended to read :
"closes the connection"

or is it indeed meant to clean up all the connections in the session. If
so, how is it different from "closes the session" ?

It would be helpful if a short description is provided for each of the
logout reason codes that describes the error recovery actions the
initiator and target perform on that reason code. For ex : which flavors
result in a cleanup of all outstanding tasks on the connection ?

Lastly, any thoughts on http://ips.pdl.cs.cmu.edu/mail/msg04371.html ?
Inclusion of such a table will provide clear directions for target
implementors on which properties are to be cleared on initiator actions.
Similarly, initiators will be clearly aware of the effects of their
actions on target's objects.

Regards,
Santosh
 - santoshr.vcf





From owner-ips@ece.cmu.edu  Sat Apr 28 11:13:49 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA14441
	for <ips-archive@odin.ietf.org>; Sat, 28 Apr 2001 11:13:48 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3SD9b927601
	for ips-outgoing; Sat, 28 Apr 2001 09:09:37 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3SD8wA27590
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 09:08:58 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id PAA166510
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 15:08:50 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id PAA56734
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 15:08:51 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3C.00483771 ; Sat, 28 Apr 2001 15:08:47 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3C.0048374B.00@d12mta02.de.ibm.com>
Date: Sat, 28 Apr 2001 16:14:10 +0300
Subject: Re: iSCSI : digest error handling violates EMDP/InDataOrder
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh,

Let's take a systematic approach to it.

Restrictions on data ordering are required if the source or the destination
of the data is unable to deliver or take data in any order other than
sequential.

Semiconductor or other direct access memories don't have this restriction.

Tapes and other sequential media do have this type of restriction and so
do some streaming devices.

If the restricted device is a target of a SCSI operation with an
unrestricted initiator then:

a. on reads the target can always ship its data in sequential order
b. on writes the target can always request the data in sequential order

However if the restricted device is an initiator then:

a. on reads the initiator will request the target to send the data in order
b. on writes the restricted initiator will have to get the R2Ts in order
from the target and will be able to support data recovery through an R2T
only if it has enough buffered data.

A restricted device will act as an initiator only if it becomes a
third-party copy manager (CM) in a third-party operation and copies from
one of its devices to another device.

Introducing a new mode bit (as Robert Snively seems to suggest) will not
change the fact that the restriction can't be upheld while also doing
recovery unless the restricted initiator has enough memory.

The spec should only specify a way to terminate a command in those
conditions and leave it at that.

I will change the wording of the DataOrder to make it clearer but I
consider the whole issue entirely academic and overblown.

Recall also that a CM implemented with such severe buffering restrictions
violates the basic SCSI assumption that a target is the data master.

Regards,
Julo



Santosh Rao <santoshr@cup.hp.com> on 28/04/2001 00:03:26

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   Black_David@emc.com, ips@ece.cmu.edu
Subject:  Re: iSCSI : digest error handling violates EMDP/InDataOrder




julian_satran@il.ibm.com wrote:
>
> David,
>
> I read Bob's mail and my interpretation is similar to his. However I
> think that SPC explicitly states that different transports are free to
> interpret and make use of this page as they find appropriate.
>
> I have a hard time understanding Santosh's objection as it does not refer
> to the reason the EMDP is there but to the way it is written in FCP (not
> iSCSI).

Julian,

As has been stated earlier, EMDP allows control over the order in which
the target requests outbound data or sends inbound data. EMDP can be
used by initiators to control this order and turn off out-of-order R2T
requests [as well as turn off out of order read data pdus].

This is a useful control option and is already provided by other SCSI
transports. What good reason exists to deny this provision in iSCSI ?

Also, I have some concerns about the ambiguous definition of DataOrder.

Per the spec :
"DataOrder=<yes|no>

The default is yes but targets MAY support no. No is used by iSCSI to
indicate that the data PDUs can be in any order (EMDP = 1). Yes is used
to indicate that incoming data PDUs have to be at continuously
increasing addresses (EMDP = 0)."

Based on the above definition wording :

a) How is DataOrder interpreted for WRITE I/Os ?
b) Is the ordering across the entire SCSI command or a subset of the I/O
? If so, what constitutes this subset ?

Different implementors can arrive at different interpretations reading
the above definition !
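
To make the ambiguity concrete, here is a small sketch of two possible
readings of "continuously increasing addresses". Both function names and
the (offset, length) tuple layout are my own assumptions for illustration;
neither reading is confirmed by the draft.

```python
def offsets_increase(pdus):
    """Reading 1: each data PDU starts at a higher offset than the last."""
    offsets = [offset for offset, _length in pdus]
    return all(a < b for a, b in zip(offsets, offsets[1:]))


def offsets_contiguous(pdus):
    """Reading 2: each data PDU starts exactly where the previous one ended."""
    return all(
        offset + length == next_offset
        for (offset, length), (next_offset, _next_len) in zip(pdus, pdus[1:])
    )


# Two data PDUs with a gap between them: (buffer offset, length in bytes).
pdus = [(0, 512), (1024, 512)]
print(offsets_increase(pdus), offsets_contiguous(pdus))
```

The same PDU sequence passes one reading and fails the other, which is the
kind of divergence between implementations I am concerned about.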

- Santosh
 - santoshr.vcf





From owner-ips@ece.cmu.edu  Sat Apr 28 18:41:45 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA17651
	for <ips-archive@odin.ietf.org>; Sat, 28 Apr 2001 18:41:45 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3SKkga16147
	for ips-outgoing; Sat, 28 Apr 2001 16:46:42 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel2.hp.com (palrel2.hp.com [156.153.255.234])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3SKk8A16103
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 16:46:08 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel2.hp.com (Postfix) with ESMTP
	id B1534D79; Sat, 28 Apr 2001 13:46:07 -0700 (PDT)
Received: (from santoshr@localhost)
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) id NAA22166;
	Sat, 28 Apr 2001 13:46:03 -0700 (PDT)
From: Santosh Rao <santoshr@cup.hp.com>
Message-Id: <200104282046.NAA22166@hpcuhe.cup.hp.com>
Subject: Re: iSCSI : EnableACA
To: julian_satran@il.ibm.com
Date: Sat, 28 Apr 2001 13:46:02 -0700 (PDT)
Cc: ips@ece.cmu.edu
In-Reply-To: <C1256A3C.004B93D5.00@d12mta02.de.ibm.com> from "julian_satran@il.ibm.com" at Apr 28, 2001 04:50:53 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

> What you are suggesting is that during reset the target examines all CDBs
> from all initiators.
> Doesn't seem very practical to me.  

Julian,

The target needs to walk through its task set to abort each I/O while
processing one of Clear Task Set, LU Reset, Target Reset, Abort Task Set.
On finding an I/O in its task set which has the NACA bit set, it could
establish ACA.
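
A rough sketch of that walk, assuming the task set is simply a list of
fixed-length CDBs whose last byte is the CONTROL byte (NACA is bit 2 of
that byte). The function names are hypothetical, and variable-length CDBs
are not handled here.

```python
NACA_BIT = 0x04  # bit 2 of the CDB CONTROL byte


def has_naca(cdb):
    # Assumes a fixed-length CDB, where CONTROL is the last byte.
    return bool(cdb[-1] & NACA_BIT)


def abort_task_set(task_set):
    """Abort every task; report whether ACA should be established."""
    establish_aca = False
    for cdb in task_set:
        if has_naca(cdb):
            establish_aca = True   # at least one aborted I/O had NACA set
    task_set.clear()               # all tasks are aborted either way
    return establish_aca


read6_plain = bytes([0x08, 0x00, 0x00, 0x00, 0x01, 0x00])
read6_naca = bytes([0x08, 0x00, 0x00, 0x00, 0x01, 0x04])
print(abort_task_set([read6_plain, read6_naca]))
```

The check rides on a walk the target already has to do for the abort, so
it adds only one bit test per task.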

> What about a command in flight (that
> was the first that had a NACA bit)?

Such I/Os would be errored back with "ACA Active" SCSI Status.

> It looks like ED might solve the problem for us making UA behave ACA 
> like.

I had some questions on this (which Ralph might help us with).
If the UA is going to be a persistent condition [like ACA], will it not
require a second mechanism similar to Clear ACA to clear the UA ? If this
is correct, do we need to invent 2 mechanisms to deal with a similar
requirement ? Could we apply the proposal made in my earlier mail to solve
all exception conditions on a strongly ordered command [that sets NACA in
its CDB] by generating ACA ?
[Note that we also need to solve this problem when an I/O active in a task
set receives an Abort Task.]

Any comments from the T10 experts would be appreciated.

> 

> And EnableACA is not per session (although at a point I intended it to be).

This may no longer be relevant if we are going to allow T10 to solve this
for us. Per my understanding though, all login keys governed session
control options [and not LU options]. Hence, my comment regarding session
granularity. One way to solve this would have been to move EnableACA to 
a mode page only control option.

Thanks,
Santosh

> Santosh Rao <santoshr@cup.hp.com> on 28/04/2001 00:14:59
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> cc:   ENDL_TX@computer.org
> Subject:  Re: iSCSI : EnableACA
> 
> 
> 
> 
> Ralph Weber wrote:
> >
> > Julian,
> >
> > Regarding your discussion with Santosh...
> >
> > > The task management functions for iSCSI contain a
> > > description of the behaviour I am talking about. This
> > > was also subject to a long discussion on the list
> > > that included the need for ACA behaviour for things
> > > like task-set full, busy etc. - not considered for
> > > ACA. For the latter T10 is handling making ACA
> > > behaviour available.
> > >
> > My question is... Exactly what are the cases where
> > T10 is NOT considering making ACA behavior available so
> > that you believe the EnableACA function is necessary in
> > iSCSI?
> >
> > My belief is that every behavior that EnableACA covers
> > but T10 does not needs to be taken to T10 to see if
> > they can find a better way to extend ACA coverage to
> > that area.
> 
> Julian,
> 
> My original proposal for this issue stated that ANY exception condition
> on an I/O that had the NACA bit set in its control byte of the CDB must
> result in ACA being established.
> 
> "Any exception condition" would include :
> - I/O being aborted by the target due to a LU Reset, Target Reset or
> Clear Task Set issued by another initiator.
> - I/O being aborted by the initiator through the use of Abort Task.
> - QUEUE FULL, BUSY, RESERVATION CONFLICT, etc status returned on the
> I/O.
> 
> The caveat to this rule is that while ACA is already active, a second
> one would not be established.
> 
> The point I'm trying to make is :
> 
> 1) ACA is being used in this context to aid in the preservation of
> strict command ordering. Any attempt to enforce strict command ordering
> would require initiators to set the NACA bit in the cdb for their I/Os.
> Such initiators would be ACA aware and Clear ACA capable.
> 
> The argument that initiators may not be able to perform "Clear ACA" and
> so need an additional control [thru EnableACA] to prevent ACA from being
> established is not applicable, because, such initiators would not set
> NACA in their cdb's, and in that case, ACA would not be established.
> 
> 2) ACA is a LU level construct and iSCSI is changing this granularity to
> be a session level construct. For example, initiators could turn on ACA
> to only 1 LU on which they had strong ordering requirements.
> 
> 3) ACA is a ULP construct and any changes, if found necessary, should be
> made in the ULP mode pages and not within iSCSI.


-- 
#################################
Santosh Rao
Software Design Engineer,
HP, Cupertino.
email : santoshr@cup.hp.com
Phone : 408-447-3751
#################################


From owner-ips@ece.cmu.edu  Sat Apr 28 18:44:21 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id SAA17663
	for <ips-archive@odin.ietf.org>; Sat, 28 Apr 2001 18:44:20 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3SKrGh16380
	for ips-outgoing; Sat, 28 Apr 2001 16:53:16 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3SKpwA16353
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 16:51:58 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id AA143EBA; Sat, 28 Apr 2001 13:51:40 -0700 (PDT)
Received: (from santoshr@localhost)
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) id NAA22347;
	Sat, 28 Apr 2001 13:51:00 -0700 (PDT)
From: Santosh Rao <santoshr@cup.hp.com>
Message-Id: <200104282051.NAA22347@hpcuhe.cup.hp.com>
Subject: Re: iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
To: julian_satran@il.ibm.com
Date: Sat, 28 Apr 2001 13:51:00 -0700 (PDT)
Cc: ips@ece.cmu.edu
In-Reply-To: <C1256A3C.0022F327.00@d12mta02.de.ibm.com> from "julian_satran@il.ibm.com" at Apr 28, 2001 09:27:06 AM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Julian,

Thanks for the clarification. One basic question. Why don't we want to
force Abort Task on the same connection as the command ? Extending
connection allegiance to include the Abort Task would simplify the
solutions to some of these issues and IMHO, it is not too stringent a
requirement that the Abort Task be sent on the same connection.

After all, a portion of the I/O state would be resident in adapter
firmware/hardware and sending Abort Task on the same connection would ease
the abort process at the target by avoiding communicating across
connections [NICs] for the clean up.

Thanks,
Santosh


> We could add the RefCmdSN and it may help plug the hole, but unless the
> Abort is issued on the same connection as the original command we can't be
> sure that the old one will not pop up (as we enable ExpCmdSN to move on we
> don't have even the 2**31-1 protection bracket :-)). Thus sending
> another nop/abort on the same connection is still required.
> 
> To simplify the whole process I will:
> 
>  a - add RefCmdSN to the Task Management
>  b - add a command not received yet to the answers
> 
>  c - add a part to 7 reading:
> 
> 1.1  How to Abort Safely a Command that Was Not Received
> 
>    To abort safely a task for which the task abort answer is "Command Not
>    Received Yet" the initiator must issue another abort command on the same
>    connection as the original command unless this connection was logged out
>    in which case it may send it on any connection. The expected response
>    for the second abort is Function Complete (if the command did not
>    arrive) or "Task was not in task set".
> 
> 
>    This convoluted scheme is necessary as the target does not know to what
>    connection the hole is related and we don't want to force abort task to
>    the same connection as the original command.
> 
> 
>    Julo
> 
> Santosh Rao <santoshr@cup.hp.com> on 27/04/2001 20:46:53
> 
> Please respond to Santosh Rao <santoshr@cup.hp.com>
> 
> To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> cc:   matt_wakeley@agilent.com
> Subject:  Re: iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
> 
> 
> 
> 
> Julian,
> 
> The conclusion on this thread was that some text was to be added to the
> spec to address this issue. The rev 06 does not have any text to this
> effect. It would help to explicitly describe how initiators should plug
> the hole in CmdSN when they do not intend to use "command retry".
> 
> Also, regarding the use of NOP-OUT to fill the hole, why not just use
> Abort Task for the same purpose ? Do we need a 2nd outbound PDU from the
> initiator just to fill the hole ? When the initiator encounters a ULP
> timeout, it would use an Abort Task to error the I/O. If the Abort Task
> can contain the CmdSN of the original command being aborted, targets can
> fill the hole based on that information, without requiring a second
> outbound PDU from the initiator for this purpose.
> 
> Some text in the draft on this subject would be helpful to implementors.
> 
> - Santosh
> 
> 
> julian_satran@il.ibm.com wrote:
> >
> > Santosh,
> >
> > You had a possible answer from Matt.  However I agree that we might want
> > to address this in text
> 
> 
> 
> julian_satran@il.ibm.com wrote:
> >
> > If the hole is in the command queue and the task is just aborted, the
> > response to the abort task
> > will unveil the fact that it did not reach its destination.
> >
> > Initiator can recover from there in several ways - clear the task set,
> > fill the hole with an iSCSI noop etc.
> > The latter, I recall, was suggested to you by Matt Wakeley a while ago.
> >
> > None of them require any changes in the spec.
> >
> > Julo
> >
> > Santosh Rao <santoshr@cup.hp.com> on 27/04/2001 04:56:26
> >
> > Please respond to Santosh Rao <santoshr@cup.hp.com>
> >
> > To:   Julian Satran/Haifa/IBM@IBMIL
> > cc:   ips@ece.cmu.edu
> > Subject:  Re: iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
> >
> > Julian,
> >
> > Could you please clarify if the below issue is going to be addressed in
> > the iSCSI draft, as was discussed earlier.
> > (http://ips.pdl.cs.cmu.edu/mail/msg03155.html).
> >
> > Specifically, is the spec going to address the issue of how initiators
> > can plug a hole in CmdSN sequence when they detect a ULP timeout and/or
> > choose not to use "command retry".
> >
> > Regards,
> > Santosh
> >
> > julian_satran@il.ibm.com wrote:
> > >
> > > Santosh,
> > >
> > > You had a possible answer from Matt.  However I agree that we might
> > > want to address this in text although
> > > a solution similar to that suggested by Matt should be by now obvious
> > > to every implementer - the target should leave a placeholder in the
> > > input queue until the command after gets delivered.
> > >
> > > Julo
> > >
> > > Santosh Rao <santoshr@cup.hp.com> on 25/01/2001 21:38:04
> > >
> > > Please respond to Santosh Rao <santoshr@cup.hp.com>
> > >
> > > To:   IPS Reflector <ips@ece.cmu.edu>
> > > cc:
> > > Subject:  iSCSI : Bridging missing CmdSNs and Abort I/O Error recovery
> > >
> > > Julian & All,
> > >
> > > The draft is currently lacking a section that addresses abort I/O error
> > > recovery. Specifically, how are CmdSN bridging issues to be handled in
> > > the case where an initiator chooses not to retry an I/O [that failed on
> > > a connection failure that affects the delivery of the command to the
> > > target or a digest error at the target] because its ULP timer may have
> > > expired.
> > >
> > > In such cases, the initiator can send an Abort Task to inform the
> > > target that the I.T.T is being aborted and its corresponding CmdSN can
> > > be bridged, instead of having the target stall infinitely in its
> > > attempt to enforce ordering and await the missing CmdSN [which isn't
> > > going to arrive, because the initiator did not retry the command].
> > >
> > > Regards,
> > > Santosh
> > >
> > >  - santoshr.vcf
> >  - santoshr.vcf
>  - santoshr.vcf
> 
> 
> 
> 


-- 
#################################
Santosh Rao
Software Design Engineer,
HP, Cupertino.
email : santoshr@cup.hp.com
Phone : 408-447-3751
#################################


From owner-ips@ece.cmu.edu  Sat Apr 28 20:02:27 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA18155
	for <ips-archive@odin.ietf.org>; Sat, 28 Apr 2001 20:02:22 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3SMfkx21157
	for ips-outgoing; Sat, 28 Apr 2001 18:41:46 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c007.snv.cp.net (c007-h008.c007.snv.cp.net [209.228.33.214])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3SMfZA21153
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 18:41:35 -0400 (EDT)
Received: (cpmta 19199 invoked from network); 28 Apr 2001 15:41:29 -0700
Received: from dsl-64-130-130-105.telocity.com (HELO ljoy) (64.130.130.105)
  by smtp.telocity.com (209.228.33.214) with SMTP; 28 Apr 2001 15:41:29 -0700
X-Sent: 28 Apr 2001 22:41:29 GMT
From: "Douglas Otis" <dotis@sanlight.net>
To: "Ips" <ips@ece.cmu.edu>
Subject: iscsi: a generalized version compatible with SCTP
Date: Sat, 28 Apr 2001 15:39:17 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJIEPJCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

http://www.sanlight.net/draft-otis-record-delivery-00.txt

Doug


From owner-ips@ece.cmu.edu  Sat Apr 28 20:03:11 2001
Received: from ece.cmu.edu ([128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA18171
	for <ips-archive@odin.ietf.org>; Sat, 28 Apr 2001 20:03:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3SM0jV19306
	for ips-outgoing; Sat, 28 Apr 2001 18:00:45 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from palrel1.hp.com (palrel1.hp.com [156.153.255.242])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3SLxmA19226
	for <ips@ece.cmu.edu>; Sat, 28 Apr 2001 17:59:48 -0400 (EDT)
Received: from hpcuhe.cup.hp.com (hpcuhe.cup.hp.com [15.0.80.203])
	by palrel1.hp.com (Postfix) with ESMTP
	id 82D2D1A5E; Sat, 28 Apr 2001 14:59:48 -0700 (PDT)
Received: (from santoshr@localhost)
	by hpcuhe.cup.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.02) id OAA25022;
	Sat, 28 Apr 2001 14:59:43 -0700 (PDT)
From: Santosh Rao <santoshr@cup.hp.com>
Message-Id: <200104282159.OAA25022@hpcuhe.cup.hp.com>
Subject: Re: iSCSI : EnableACA
To: ips@ece.cmu.edu (ips)
Date: Sat, 28 Apr 2001 14:59:42 -0700 (PDT)
Cc: julian_satran@il.ibm.com (Julian Satran)
In-Reply-To: <200104282046.NAA22166@hpcuhe.cup.hp.com> from "Santosh Rao" at Apr 28, 2001 01:46:02 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

> > What about a command in flight (that
> > was the first that had a NACA bit)?
> 
> Such I/Os would be errored back with "ACA Active" SCSI Status.

Oops. I believe your question was regarding the first NACA set CDB
in-flight and not yet in the task set at the time of the task mgmt cmd.

Such a scenario implies the I/Os prior to that first NACA set I/O would
have had no ordering constraints. In this case, no ACA is established for
the task mgmt cmd prior to the first NACA I/O. [CAC is established.]

The CAC is cleared on the next I/O [other than INQUIRY and REPORT LUNS] 
that arrives after CAC was established. Subsequent arrival of a NACA set
I/O is processed normally and no loss of ordering occurs in this case.
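
As a rough illustration of the CAC lifecycle described above (not taken from any draft), the sketch below models a logical unit that records a CAC when a task management function finds no NACA I/O in the task set, and clears it on the next command other than INQUIRY or REPORT LUNS. The class and method names are hypothetical; only the two opcode values are standard SCSI.

```python
from dataclasses import dataclass

# Commands that, per the scenario above, leave a pending CAC in place.
INQUIRY = 0x12
REPORT_LUNS = 0xA0
CAC_EXEMPT_OPCODES = {INQUIRY, REPORT_LUNS}


@dataclass
class LogicalUnit:
    """Hypothetical per-LU state; only the no-NACA case is modelled."""
    cac_pending: bool = False

    def on_task_management(self, task_set_has_naca: bool) -> None:
        # No NACA I/O in the task set: no ordering was expected, so CAC only.
        if not task_set_has_naca:
            self.cac_pending = True

    def on_command(self, opcode: int) -> bool:
        """Return True if this command cleared a pending CAC."""
        if self.cac_pending and opcode not in CAC_EXEMPT_OPCODES:
            self.cac_pending = False
            return True
        return False


lu = LogicalUnit()
lu.on_task_management(task_set_has_naca=False)
lu.on_command(INQUIRY)   # CAC stays pending
lu.on_command(0x28)      # READ(10) clears it; a later NACA I/O sees no CAC
```

A NACA I/O that arrives after the clearing command is then handled like any other command, which is the "no loss of ordering" point made above.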

Did I miss something about this scenario (?)

Regards,
Santosh

-- 
#################################
Santosh Rao
Software Design Engineer,
HP, Cupertino.
email : santoshr@cup.hp.com
Phone : 408-447-3751
#################################


From owner-ips@ece.cmu.edu  Sun Apr 29 03:37:40 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA04281
	for <ips-archive@odin.ietf.org>; Sun, 29 Apr 2001 03:37:40 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3T5L2807427
	for ips-outgoing; Sun, 29 Apr 2001 01:21:02 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3T5K1A07348
	for <ips@ece.cmu.edu>; Sun, 29 Apr 2001 01:20:01 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id HAA50398
	for <ips@ece.cmu.edu>; Sun, 29 Apr 2001 07:19:57 +0200
From: julian_satran@il.ibm.com
Received: from d12mta05.de.ibm.com (d12mta05_cs0 [9.165.222.239])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id HAA133042
	for <ips@ece.cmu.edu>; Sun, 29 Apr 2001 07:19:57 +0200
Received: by d12mta05.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3D.001D49E8 ; Sun, 29 Apr 2001 07:19:54 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: Santosh Rao <santoshr@cup.hp.com>
cc: ips@ece.cmu.edu
Message-ID: <C1256A3D.001D47F1.00@d12mta05.de.ibm.com>
Date: Sun, 29 Apr 2001 08:25:18 +0300
Subject: Re: iSCSI : EnableACA
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



comments in text - Julo

Santosh Rao <santoshr@cup.hp.com> on 28/04/2001 23:46:02

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   ips@ece.cmu.edu
Subject:  Re: iSCSI : EnableACA




> What you are suggesting is that during reset the target examines all CDBs
> from all initiators.
> Doesn't seem very practical to me.

Julian,

The target needs to walk through its task set to abort each I/O while
processing one of Clear Task Set, LU Reset, Target Reset, Abort Task Set.
On finding an I/O in its task set which has the NACA bit set, it could
establish ACA.

> What about a command in flight (that
> was the first that had a NACA bit)?

Such I/Os would be errored back with "ACA Active" SCSI Status.

+++ what if none of the preceding had a NACA bit? +++

> It looks like ED might solve the problem for us making UA behave ACA
> like.

I had some questions on this (which Ralph might help us with).
If the UA is going to be a persistent condition [like ACA], will it not
require a second mechanism similar to Clear ACA to clear the UA ? If this
is correct, do we need to invent 2 mechanisms to deal with a similar
requirement ? Could we apply the proposal made in my earlier mail to solve
all exception conditions on a strongly ordered command [that sets NACA in
its CDB] by generating ACA.
[Note that we also need to solve this problem when an I/O active in a task
set receives an Abort Task.]

Any comments from the T10 experts would be appreciated.

>

> And EnableACA is not per session (although at a point I intended it to
> be).

This may be no longer relevant if we are going to allow T10 to solve this
for us. Per my understanding though, all login keys governed session
control options [and not LU options]. Hence, my comment regarding session
granularity. One way to solve this would have been to move EnableACA to
a mode page only control option.

Thanks,
Santosh

> Santosh Rao <santoshr@cup.hp.com> on 28/04/2001 00:14:59
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
> cc:   ENDL_TX@computer.org
> Subject:  Re: iSCSI : EnableACA
>
>
>
>
> Ralph Weber wrote:
> >
> > Julian,
> >
> > Regarding your discussion with Santosh...
> >
> > > The task management functions for iSCSI contain a
> > > description the behaviour I am talking about. This
> > > was also subject to a long discussion on the list
> > > that included the need for ACA behaviour for things
> > > like task-set full, busy etc. - not considered for
> > > ACA. For the latter, T10 is handling making ACA
> > > behaviour available.
> > >
> > My question is... Exactly what are the cases where
> > T10 is NOT considering making ACA behavior available so
> > that you believe the EnableACA function is necessary in
> > iSCSI?
> >
> > My belief is that every behavior that EnableACA covers
> > but T10 does not needs to be taken to T10 to see if
> > they can find a better way to extend ACA coverage to
> > that area.
>
> Julian,
>
> My original proposal for this issue stated that ANY exception condition
> on an I/O that had the NACA bit set in its control byte of the CDB must
> result in ACA being established.
>
> "Any exception condition" would include :
> - I/O being aborted by the target due to a LU Reset, Target Reset or
> Clear Task Set issued by another initiator.
> - I/O being aborted by the initiator through the use of Abort Task.
> - QUEUE FULL, BUSY, RESERVATION CONFLICT, etc status returned on the
> I/O.
>
> The caveat to this rule is that while ACA is already active, a second
> one would not be established.
>
> The point I'm trying to make is :
>
> 1) ACA is being used in this context to aid in the preservation of
> strict command ordering. Any attempt to enforce strict command ordering
> would require initiators to set the NACA bit in the cdb for their I/Os.
> Such initiators would be ACA aware and Clear ACA capable.
>
> The argument that initiators may not be able to perform "Clear ACA" and
> so need an additional control [thru EnableACA] to prevent ACA from being
> established is not applicable, because, such initiators would not set
> NACA in their cdb's, and in that case, ACA would not be established.
>
> 2) ACA is a LU level construct and iSCSI is changing this granularity to
> be a session level construct. For example, initiators could turn on ACA
> to only 1 LU on which they had strong ordering requirements.
>
> 3) ACA is a ULP construct and any changes, if found necessary, should be
> made in the ULP mode pages and not within iSCSI.


--
#################################
Santosh Rao
Software Design Engineer,
HP, Cupertino.
email : santoshr@cup.hp.com
Phone : 408-447-3751
#################################





From owner-ips@ece.cmu.edu  Sun Apr 29 03:39:18 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id DAA04511
	for <ips-archive@odin.ietf.org>; Sun, 29 Apr 2001 03:39:17 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3T6I4j09618
	for ips-outgoing; Sun, 29 Apr 2001 02:18:04 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3T6HHA09603
	for <ips@ece.cmu.edu>; Sun, 29 Apr 2001 02:17:17 -0400 (EDT)
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id IAA310892
	for <ips@ece.cmu.edu>; Sun, 29 Apr 2001 08:17:09 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay01.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id IAA34254
	for <ips@ece.cmu.edu>; Sun, 29 Apr 2001 08:17:09 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3D.0022860A ; Sun, 29 Apr 2001 08:17:05 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
Message-ID: <C1256A3D.00228483.00@d12mta02.de.ibm.com>
Date: Sun, 29 Apr 2001 09:22:30 +0300
Subject: Re: iSCSI : EnableACA
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Santosh - going through every CDB at a reset is not very practical - and I am
afraid it will also be considered bad engineering - you want to clean
everything up as it might be in a mess.

Also the behaviour is inconsistent for commands in flight and those queued
already :

- for those queued already it is ACA if any in the queue is ACA (at least
from your description)
- for those in flight only if the first is ACA

Julo

Santosh Rao <santoshr@cup.hp.com> on 29/04/2001 08:40:44

Please respond to Santosh Rao <santoshr@cup.hp.com>

To:   Julian Satran/Haifa/IBM@IBMIL
cc:   santoshr@hpcuhe.cup.hp.com (Santosh Rao)
Subject:  Re: iSCSI : EnableACA




Julian,

The point I'm trying to make is that if there were no prior NACA I/Os
in the task set, those I/Os had no ordering dependencies. In that case,
ACA is not required. CAC is sufficient.

In the case where ordering of I/Os is required, such I/Os expect ACA to be
established on an error and they have their NACA bit set.

IOW,
1) no NACA in task set => no ordering required => use CAC on error

2) NACA in task set => some I/Os require ordering => use ACA on error

3) First NACA I/Os arrive after CAC => process normally (I/Os prior to CAC
did not expect ordering).

The above is all based on the fundamental model that I/Os that expect
ordering MUST set the NACA bit in their CDBs. Hence, ACA is not required
to be established when the task set does not contain a NACA I/O.
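
The three cases above can be sketched as a single decision made while the target walks its task set for a task management function. This is only an illustration of the proposal as stated here: the `Task` type and `condition_after_abort` function are hypothetical names, and the CONTROL byte is assumed to be the last byte of a fixed-length CDB (variable-length CDBs place it elsewhere and are not handled).

```python
from dataclasses import dataclass
from typing import Iterable

NACA_MASK = 0x04  # NACA is bit 2 of the CDB CONTROL byte


@dataclass
class Task:
    cdb: bytes

    @property
    def naca(self) -> bool:
        # Assumption: fixed-length CDB (6/10/12/16 bytes), CONTROL is last.
        return bool(self.cdb[-1] & NACA_MASK)


def condition_after_abort(task_set: Iterable[Task]) -> str:
    """Return "ACA" if any task being aborted had NACA set, else "CAC"."""
    return "ACA" if any(task.naca for task in task_set) else "CAC"


read10_plain = Task(bytes([0x28, 0, 0, 0, 0, 0, 0, 0, 1, 0x00]))
read10_naca = Task(bytes([0x28, 0, 0, 0, 0, 0, 0, 0, 1, NACA_MASK]))

condition_after_abort([read10_plain])               # case 1: "CAC"
condition_after_abort([read10_plain, read10_naca])  # case 2: "ACA"
```

Case 3 needs no code of its own: a NACA I/O that arrives after the CAC was established never appears in the task set being walked, so it does not influence the decision.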

Regards,
Santosh

>
> Sure - the unit attention is cleared by the first and all the others get
> through - for non-ACA it is OK
> for ACA we wanted all to be dropped up to explicit reset.
>
> Julo
>
> Santosh Rao <santoshr@cup.hp.com> on 29/04/2001 00:59:42
>
> Please respond to Santosh Rao <santoshr@cup.hp.com>
>
> To:   ips@ece.cmu.edu (ips)
> cc:   Julian Satran/Haifa/IBM@IBMIL
> Subject:  Re: iSCSI : EnableACA
>
>
>
>
> > > What about a command in flight (that
> > > was the first that had a NACA bit)?
> >
> > Such I/Os would be errored back with "ACA Active" SCSI Status.
>
> Oops. I believe your question was regarding the first NACA set CDB
> in-flight and not yet in the task set at the time of the task mgmt cmd.
>
> Such a scenario implies the I/Os prior to that first NACA set I/O would
> have had no ordering constraints. In this case, no ACA is established for
> the task mgmt cmd prior to the first NACA I/O. [CAC is established.]
>
> The CAC is cleared on the next I/O [other than INQUIRY and REPORT LUNS]
> that arrives after CAC was established. Subsequent arrival of a NACA set
> I/O is processed normally and no loss of ordering occurs in this case.
>
> Did I miss something about this scenario (?)
>
> Regards,
> Santosh
>
> --
> #################################
> Santosh Rao
> Software Design Engineer,
> HP, Cupertino.
> email : santoshr@cup.hp.com
> Phone : 408-447-3751
> #################################
>
>
>
>


--
#################################
Santosh Rao
Software Design Engineer,
HP, Cupertino.
email : santoshr@cup.hp.com
Phone : 408-447-3751
#################################





From owner-ips@ece.cmu.edu  Sun Apr 29 16:22:32 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA11425
	for <ips-archive@odin.ietf.org>; Sun, 29 Apr 2001 16:22:31 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3TIJcR19533
	for ips-outgoing; Sun, 29 Apr 2001 14:19:38 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c007.snv.cp.net (c007-h011.c007.snv.cp.net [209.228.33.217])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3TIIeA19493
	for <ips@ece.cmu.edu>; Sun, 29 Apr 2001 14:18:40 -0400 (EDT)
Received: (cpmta 27117 invoked from network); 29 Apr 2001 11:18:33 -0700
Received: from unknown (HELO ljoy) (64.130.130.105)
  by smtp.telocity.com (209.228.33.217) with SMTP; 29 Apr 2001 11:18:33 -0700
X-Sent: 29 Apr 2001 18:18:33 GMT
From: "Douglas Otis" <dotis@sanlight.net>
To: "KRUEGER,MARJORIE \(HP-Roseville,ex1\)" <marjorie_krueger@hp.com>
Cc: "Ips" <ips@ece.cmu.edu>
Subject: RE: iSCSI: Require iSCSI to use packet formats similar to FC, etc??
Date: Sun, 29 Apr 2001 11:16:20 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJIEPNCGAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <6BD67FFB937FD411A04F00D0B74FE87802A0901C@xrose06.rose.hp.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Importance: Normal
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Marjorie,

I presented a draft which represents an effort to translate iSCSI into a
simple device transport layer.  You will notice there is no information
related to the technology of the device. The command field could be suitable
for any device or application well beyond the constraints of SCSI.  You will
also notice that I have not expected the transport to understand the content
of the device layer information.  I have included some queue management but
have not defined how this management relates to the device.  iSCSI could be
placed upon such a defined transport but you notice that iSCSI defines the
device in addition to the transport layer.  These definitions of the device
should be T10 efforts as a separate specification and perhaps that is where
iSCSI should be defined once a suitable transport is selected.

It should not be difficult to split the transport information from the
device information.  I am concerned that things like ACA involvement
require the transport to know too much.  Mode page definitions are an
example of that over-reach.  The symmetric stream use in this presented
draft is compatible with the iSCSI connection allegiance.  If these
connections represent separate paths for reliability to the same device,
then being able to maintain coherency over multiple connections is important
and presently, only SCTP provides that level of coherency.

I have also attempted to make this generic device transport layer
independent of the IP transport protocol selection.  You will notice that
the resulting structures are relatively stark.  Add the record tracking of
SCTP to ??P or use directly and the result is a robust transport.  The minor
innovation is a vector placement mode for SCTP as well as integrity digests.
I assume that SCTP will be using suitable error checking.  Those concerned
about data, will likely include integrity and the jury is still out on what
the basic native error checking will be.  If it is not CRC, then a CRC
digest seems like a reasonable alternative.

As a caveat, sequential devices should restrict the number of Associations
to one.  This restriction is less important for random devices, but there
should be some means to resolve the coherency problem of multiple
associations for management result resolution.  A solution would prevent
information arriving from a command that was considered aborted or prevent
conflicted use of command channel.  A resolution to this problem seems to be
related to the device interface.  One possible technique might be to have
the device layer send some type of asynchronous signal on those affected
requests over all active connections but this still seems to be a difficult
and unresolved problem.  Keeping the Association to one is one method but
not likely desired.

My thought was to have a command channel on each connection dedicated to
resolving management command coherency.  This dedicated channel would
respond to all management commands.  All dedicated channels responding would
be required to validate the management response.  These one to many
management command responses could be done by or at the device layer.

Doug


> > As the rules change from technology to technology, there are
> > issues involved
> > in this endeavor that will place into focus some potential
> > problems.  I tend
> > to think that an independent delivery protocol could be
> > developed.
>
> An independent protocol that is agnostic to the transport medium would
> probably be too general to be optimal in any specific transport
> environment.
> I think the solution is to make SCSI truly independent of the transport
> (strictly layered on top of the transport).  It seems that T10 is
> realizing
> this and some are working towards that goal.
>
> IMHO, requiring that iSCSI match other transport formats is going at the
> problem from the wrong end.
>
> Marj
>



From owner-ips@ece.cmu.edu  Sun Apr 29 16:26:00 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id QAA11478
	for <ips-archive@odin.ietf.org>; Sun, 29 Apr 2001 16:25:59 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3TIYYP20248
	for ips-outgoing; Sun, 29 Apr 2001 14:34:34 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c017.sfo.cp.net (c017-h007.c017.sfo.cp.net [209.228.12.221])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3THJHA16745
	for <ips@ece.cmu.edu>; Sun, 29 Apr 2001 13:19:17 -0400 (EDT)
Received: (cpmta 22572 invoked from network); 29 Apr 2001 10:19:10 -0700
Date: 29 Apr 2001 10:19:10 -0700
Message-ID: <20010429171910.22570.cpmta@c017.sfo.cp.net>
X-Sent: 29 Apr 2001 17:19:10 GMT
Received: from [132.64.1.34] by mail.sangate.com with HTTP; 29 Apr 2001
    10:19:10 PDT
Content-Type: text/plain
Content-Disposition: inline
Mime-Version: 1.0
To: ips@ece.cmu.edu
From: gabriel@sangate.com
X-Mailer: Web Mail 3.7.1.9
Subject: iSCSI command numbering
Sender: owner-ips@ece.cmu.edu
Precedence: bulk

hi

when an iSCSI target receives a task with CmdSN=X and decides to reject it
because of internal queueing issues (e.g. queue is full...),
what should the target do when it sees the next task with CmdSN=X+1 ?
    a) accept it (it saw CmdSN=X, so there are no flows in the network layer) OR
    b) buffer it, waiting for the task with CmdSN=X to be resent (ensure total ordering on the tasks)
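
To make the difference between the two options concrete, here is a minimal sketch of a target-side sequencer. It is not drawn from the iSCSI draft: the `Target` class, its `buffer_after_reject` switch and the `queue_full` flag are hypothetical, and CmdSN wrap-around and duplicate detection are ignored.

```python
from typing import Dict, List


class Target:
    def __init__(self, exp_cmd_sn: int, buffer_after_reject: bool) -> None:
        self.exp_cmd_sn = exp_cmd_sn
        self.buffer_after_reject = buffer_after_reject  # False: (a), True: (b)
        self.held: Dict[int, str] = {}    # commands waiting for a gap to fill
        self.executed: List[str] = []     # commands passed on, in order

    def receive(self, cmd_sn: int, name: str, queue_full: bool = False) -> None:
        if cmd_sn != self.exp_cmd_sn:
            self.held[cmd_sn] = name      # ahead of sequence: hold it
            return
        if queue_full:
            if self.buffer_after_reject:
                return                    # (b): keep expecting the same CmdSN
            # (a): the number was seen, so the sequence simply advances
        else:
            self.executed.append(name)
        self.exp_cmd_sn += 1
        self._release_held()

    def _release_held(self) -> None:
        while self.exp_cmd_sn in self.held:
            self.executed.append(self.held.pop(self.exp_cmd_sn))
            self.exp_cmd_sn += 1


a = Target(exp_cmd_sn=10, buffer_after_reject=False)
a.receive(10, "A", queue_full=True)
a.receive(11, "B")                 # a.executed == ["B"]

b = Target(exp_cmd_sn=10, buffer_after_reject=True)
b.receive(10, "A", queue_full=True)
b.receive(11, "B")                 # held; b.executed == []
b.receive(10, "A")                 # resend; b.executed == ["A", "B"]
```

Under (a) the rejected command, if resent, would execute after X+1; under (b) the target pays with buffer space to keep the original order.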

/gabi




From owner-ips@ece.cmu.edu  Mon Apr 30 04:41:34 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id EAA02955
	for <ips-archive@odin.ietf.org>; Mon, 30 Apr 2001 04:41:33 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3U6L0n20960
	for ips-outgoing; Mon, 30 Apr 2001 02:21:00 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3U6KgA20946
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 02:20:43 -0400 (EDT)
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id IAA332592;
	Mon, 30 Apr 2001 08:20:32 +0200
From: julian_satran@il.ibm.com
Received: from d12mta02.de.ibm.com (d12mta01_cs0 [9.165.222.237])
	by d12relay02.de.ibm.com (8.8.8m3/NCO v4.96) with SMTP id IAA145012;
	Mon, 30 Apr 2001 08:20:31 +0200
Received: by d12mta02.de.ibm.com(Lotus SMTP MTA v4.6.5  (863.2 5-20-1999))  id C1256A3E.0022D604 ; Mon, 30 Apr 2001 08:20:30 +0200
X-Lotus-FromDomain: IBMIL@IBMDE
To: ips@ece.cmu.edu
cc: ralphoweber@compuserve.com
Message-ID: <C1256A3E.0022D5C3.00@d12mta02.de.ibm.com>
Date: Mon, 30 Apr 2001 09:19:36 +0200
Subject: Re: iSCSI command numbering
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-ips@ece.cmu.edu
Precedence: bulk



Gabriel,

We are working diligently with T10 to have exceptions like task-set-full,
busy etc. cause an ACA or ACA-like condition and thus help maintain order.

Julo

gabriel@sangate.com on 29-04-2001 19:19:10

Please respond to gabriel@sangate.com

To:   ips@ece.cmu.edu
cc:
Subject:  iSCSI command numbering




hi

when an iSCSI target receives a task with CmdSN=X and decides to reject it
because of internal queueing issues( e.g. queue is full...)
what should the  target do when it sees the next task with CmdSN=X+1 ?
    a) accept it ( it saw CmdSN=X  so there are no flows in the network
layer ) OR
    b) buffer it, waiting for task with CmdSN=X  to be resent( ensure total
ordering on the tasks )

/gabi







From owner-ips@ece.cmu.edu  Mon Apr 30 17:14:03 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id RAA26177
	for <ips-archive@odin.ietf.org>; Mon, 30 Apr 2001 17:14:02 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3UHIff01460
	for ips-outgoing; Mon, 30 Apr 2001 13:18:41 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from msgbas1.cos.agilent.com (msgbas1x.cos.agilent.com [192.6.9.33])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3UHHqA01427
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 13:17:52 -0400 (EDT)
Received: from msgrel1.and.agilent.com (msgrel1.and.agilent.com [130.30.33.104])
	by msgbas1.cos.agilent.com (Postfix) with ESMTP id C049AD51
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 11:17:51 -0600 (MDT)
Received: from rtl.rose.agilent.com (rtl.rose.agilent.com [156.140.232.231])
	by msgrel1.and.agilent.com (Postfix) with ESMTP id 94B7DC1
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 13:17:50 -0400 (EDT)
Received: from mail.rose.agilent.com (mailsrv@bellhop [156.140.233.51])
	by rtl.rose.agilent.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.1.0) with ESMTP id KAA01528
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 10:17:49 -0700 (PDT)
Received: from agilent.com
          (cos1nai134152.cos.agilent.com [141.184.134.152])
          by mail.rose.agilent.com (Netscape Messaging Server 3.6)
           with ESMTP id AAA2DE0 for <ips@ece.cmu.edu>;
          Mon, 30 Apr 2001 10:17:45 -0700
Message-ID: <3AED9F4E.F7A0FE98@agilent.com>
Date: Mon, 30 Apr 2001 10:22:22 -0700
From: "Matt Wakeley" <matt_wakeley@agilent.com>
Reply-To: Matt Wakeley <matt_wakeley@agilent.com>
Organization: Agilent Technologies
X-Mailer: Mozilla 4.77 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI Requirements -03 comments
References: <0F31E5C394DAD311B60C00E029101A07080154DB@corpmx9.isus.emc.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Black_David@emc.com wrote:

> -- Section 2.3
> 
> Framing discussion needs to indicate that framing MUST be OPTIONAL to
> implement.
> Add a sentence indicating "SHOULD specify a default framing mechanism"
> (i.e.,
> specify the first framing mechanism that MUST be implemented *if* any
> mechanism
> is implemented).

Dave,

I disagree that, if any framing mechanism is implemented, there "MUST" be
a required mechanism.  Certainly, if the transport WG comes up with a viable
framing mechanism, we don't want to *require* something else like "markers".

-Matt Wakeley
Agilent Technologies


From owner-ips@ece.cmu.edu  Mon Apr 30 20:10:32 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA28752
	for <ips-archive@odin.ietf.org>; Mon, 30 Apr 2001 20:10:32 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3UH6YK00826
	for ips-outgoing; Mon, 30 Apr 2001 13:06:34 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from clyde.stargateip.com ([209.237.5.66])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3UH5dA00753;
	Mon, 30 Apr 2001 13:05:39 -0400 (EDT)
Received: from fixpc ([10.10.5.76])
	by clyde.stargateip.com (8.9.1/8.9.1) with SMTP id KAA19080;
	Mon, 30 Apr 2001 10:00:35 -0700 (PDT)
From: "Meena Ramamoorthi" <meena@stargateip.com>
To: "Dave Nagle" <bassoon@ece.cmu.edu>, <ips@ece.cmu.edu>
Subject: RE: iSCSI IETF annoucement
Date: Mon, 30 Apr 2001 10:12:06 -0700
Message-ID: <NEBBJELJBMCNBHIPPOHFCEDOCCAA.meena@stargateip.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Importance: Normal
In-Reply-To: <200104232028.QAA20657@opus.ece.cmu.edu>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6600
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
X-MIME-Autoconverted: from 8bit to quoted-printable by ece.cmu.edu id f3UH6YL00826
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by ietf.org id UAA28752


 It may be good to add and highlight the details regarding the management
of the FCoverTCPIP devices/switches (SNMP, MIBs, etc.) in this
draft.

 Thanks
 Meenakshi


-----Original Message-----
From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
Dave Nagle
Sent: Monday, April 23, 2001 1:29 PM
To: ips@ece.cmu.edu
Subject: iSCSI IETF annoucement




A New Internet-Draft is available from the on-line Internet-Drafts
directories.
This draft is a work item of the IP Storage Working Group of the IETF.

	Title		: Fibre Channel Over TCP/IP (FCIP)
	Author(s)	: M. Rajagopal, R. Bhagwat
	Filename	: draft-ietf-ips-fcovertcpip-02.txt
	Pages		: 26
	Date		: 20-Apr-01

Fibre Channel (FC) is a dominant technology used in Storage Area
Networks (SAN). The purpose of this draft is to specify a standard
way of encapsulating FC frames over TCP/IP and to describe mechanisms
that allow islands of FC SANs to be interconnected  over IP-based
networks. FC over TCP/IP relies on IP-based network services to
provide the connectivity between the SAN islands over LANs, MANs, or
WANs.  The FC over TCP/IP specification relies upon TCP for
congestion control and management and upon both TCP and FC for data
error and data loss recovery.  FC over TCP/IP treats all classes of
FC frames the same -- as datagrams.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ips-fcovertcpip-02.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-ips-fcovertcpip-02.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-ips-fcovertcpip-02.txt".

NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.


Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

- --NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

- --OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<20010420154253.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-ips-fcovertcpip-02.txt

- --OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-ips-fcovertcpip-02.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<20010420154253.I-D@ietf.org>

- --OtherAccess--

- --NextPart--



------- End of Forwarded Message




From owner-ips@ece.cmu.edu  Mon Apr 30 20:19:58 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA28885
	for <ips-archive@odin.ietf.org>; Mon, 30 Apr 2001 20:19:58 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3UM9pK21004
	for ips-outgoing; Mon, 30 Apr 2001 18:09:51 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from sj-msg-core-2.cisco.com (sj-msg-core-2.cisco.com [171.69.43.88])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3UM9nJ20996
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 18:09:49 -0400 (EDT)
Received: from sponge.cisco.com (sponge.cisco.com [171.71.61.25])
	by sj-msg-core-2.cisco.com (8.11.3/8.9.1) with ESMTP id f3UMAAE13489
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 15:10:10 -0700 (PDT)
Received: from dap02w2k (rtp-dial-2-223.cisco.com [10.83.96.223])
	by sponge.cisco.com (Mirapoint)
	with SMTP id ACS02188;
	Mon, 30 Apr 2001 17:09:41 -0500 (CDT)
From: "Dave Peterson" <dap@cisco.com>
To: "Ips@Ece. Cmu. Edu" <ips@ece.cmu.edu>
Subject: Proposal to use SLPv2 for FCIP device discovery 
Date: Mon, 30 Apr 2001 17:07:08 -0500
Message-ID: <EDEKKDKNBFCABNBAAOBBGENDCEAA.dap@cisco.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
X-MIME-Autoconverted: from 8bit to quoted-printable by ece.cmu.edu id f3UM9pL21004
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by ietf.org id UAA28885

As requested, here is the proposal I presented at Monday's FCIP meeting:

FCIP Draft Proposal
For Clause 4.2, item 3.

Each FCIP device may be statically or dynamically configured with a list of
IP addresses and port numbers corresponding to all the participating FCIP
devices. If dynamic discovery of participating FCIP devices is supported,
this function shall be performed using the Service Location Protocol
(SLPv2). For additional FCIP device management capability, the Internet
Storage Name Service (iSNS) may be implemented. It is outside the scope of
this specification to describe any static configuration method for
participating FCIP device discovery. Refer to clause XXX for a detailed
description of dynamic discovery of participating FCIP devices using SLPv2.

Notes:
1. DHCP:
	a. Allows an entity to discover information about itself, not discover
information about all other entities.
	b. Uses a broadcast mechanism that may not work via routers without
additional configuration. However, most current router implementations will
support the forwarding of DHCP requests across routed subnets.
	c. May be used to find SLP Directory Agents and Scope Lists, allowing for a
more scalable solution.
2. DNS Services are too limiting. This is the reason why SLP was started.
3. LDAP is simply a database interface. SLP and iSNS may use LDAP as a
back-end store.
4. SLP FCIP Device Type Specifics
	a. IP address(es)
	b. Port number(s)
	c. Scope
	d. Attributes
		i. Discovery domain (i.e. a group name or zone name, don’t want to use
zone name in this context though)
		ii. need to start listing these (if we can think of any more)
	e. Lifetime
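
To make note 4 a little more concrete, the items (a) through (e) could be
carried in a single registration record. What follows is only a rough sketch
under stated assumptions: the class name FcipRegistration, the service type
string "fcip", the attribute name "discovery-domain", and the example address
and port are all made up for illustration and are not defined by this
proposal or by any SLP template. "DEFAULT" is the SLPv2 default scope name.

```python
from dataclasses import dataclass, field


@dataclass
class FcipRegistration:
    """Hypothetical record holding the items from note 4 (a-e)."""
    addresses: list                                  # a. IP address(es)
    ports: list                                      # b. port number(s)
    scope: str = "DEFAULT"                           # c. SLP scope
    attributes: dict = field(default_factory=dict)   # d. e.g. discovery domain
    lifetime: int = 3600                             # e. lifetime in seconds (assumed value)

    def service_urls(self):
        """Build one "service:" URL per address/port pair.

        The "fcip" service type is an assumption, not a registered name.
        """
        return [f"service:fcip://{addr}:{port}"
                for addr in self.addresses
                for port in self.ports]


reg = FcipRegistration(["192.0.2.10"], [5000],
                       attributes={"discovery-domain": "site-a"})
print(reg.service_urls())
```

Keeping the discovery domain as an ordinary attribute leaves room for the
other attributes item (d)(ii) says still need to be listed.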

Work in progress:
1.	Putting together a model for FCIP device discovery using SLPv2.
2.	Implementing a “default/suggested” SLPv2 template.
3.	Reviewing RFC 3082 – Notification and Subscription for SLP for enhanced
device notification.

Requirements                                            SLPv2   iSNS
Discovery of FCIP targets                               Y       Y
Discovery of FCIP device IP address(es)                 Y       Y
Discovery of FCIP port number(s)                        Y       Y
Attribute support                                       Y       Y
Semi-timely notification of devices coming and going    Y       Y
Authentication                                          Y       Y (uses SLPv2 constructs)
Minimal/no configuration                                Y       Y (?)
Provisioning                                            Y       Y
Multicast support                                       Y       N
Publicly available source                               Y       N
Standardized and mature                                 Y       N
Lightweight implementation                              Y       N

Non-Requirements
Zoning                                                  N       Y
State Change Notification                               N       Y
Naming Services                                         N       Y


David Peterson
Lead Architect - Standards Development
Cisco Systems - SRBU
6450 Wedgwood Road
Maple Grove, MN 55311
Office: 763-398-1007
Cell: 612-802-3299
Email: dap@cisco.com



From owner-ips@ece.cmu.edu  Mon Apr 30 20:22:11 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id UAA28956
	for <ips-archive@odin.ietf.org>; Mon, 30 Apr 2001 20:22:11 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3UMWUb22626
	for ips-outgoing; Mon, 30 Apr 2001 18:32:30 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from msgbas1.cos.agilent.com (msgbas1x.cos.agilent.com [192.6.9.33])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f3UMWT122622
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 18:32:29 -0400 (EDT)
Received: from msgrel1.and.agilent.com (msgrel1.and.agilent.com [130.30.33.104])
	by msgbas1.cos.agilent.com (Postfix) with ESMTP id 5E99EA58
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 16:32:28 -0600 (MDT)
Received: from rtl.rose.agilent.com (rtl.rose.agilent.com [156.140.232.231])
	by msgrel1.and.agilent.com (Postfix) with ESMTP id 6F0D81A
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 18:32:27 -0400 (EDT)
Received: from mail.rose.agilent.com (mailsrv@bellhop [156.140.233.51])
	by rtl.rose.agilent.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.1.0) with ESMTP id PAA11200
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 15:32:26 -0700 (PDT)
Received: from agilent.com (cos1nai130053.cos.agilent.com [141.184.130.53])
          by mail.rose.agilent.com (Netscape Messaging Server 3.6)
           with ESMTP id AAA5D9B for <ips@ece.cmu.edu>;
          Mon, 30 Apr 2001 15:32:22 -0700
Message-ID: <3AEDE7F1.54CFF887@agilent.com>
Date: Mon, 30 Apr 2001 15:32:17 -0700
From: "Matt Wakeley" <matt_wakeley@agilent.com>
Reply-To: Matt Wakeley <matt_wakeley@agilent.com>
Organization: Agilent Technologies
X-Mailer: Mozilla 4.77 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: ips@ece.cmu.edu
Subject: Re: iSCSI: multiple sessions b/n a pair of WWUIs.
References: <OF7A4FB579.2EDCC70B-ON88256A3A.00111074@almaden.ibm.com> <3AE85DA4.570F800D@cup.hp.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

One clarification: these are unique between any I-T *pair*.
(you make it sound like an initiator must have a unique ISID for all sessions
with all targets)
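
To illustrate the scoping I mean, here is a rough sketch of a table that
enforces ISID uniqueness per I-T pair rather than per initiator. The class
and method names, and the initiator/target strings, are hypothetical and not
taken from either draft.

```python
class SessionTable:
    """Hypothetical table enforcing ISID uniqueness per initiator-target pair."""

    def __init__(self):
        # Maps (initiator, target) -> set of ISIDs currently in use.
        self._isids = {}

    def open_session(self, initiator, target, isid):
        in_use = self._isids.setdefault((initiator, target), set())
        if isid in in_use:
            raise ValueError("ISID already in use for this I-T pair")
        in_use.add(isid)


table = SessionTable()
table.open_session("init-a", "target-1", 1)
table.open_session("init-a", "target-2", 1)      # same ISID, different target: allowed
try:
    table.open_session("init-a", "target-1", 1)  # same pair, same ISID: rejected
except ValueError as err:
    print(err)
```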

-Matt

Santosh Rao wrote:
> 
> Jim,
> 
> Thanks for the clarification. Both the iSCSI and the name-disc drafts
> need to explicitly state that ISID is uniquely assigned for all sessions
> within a given initiator. Similarly, TSID is uniquely assigned for all
> sessions within a given target.
> 
> Regards,
> Santosh


From owner-ips@ece.cmu.edu  Mon Apr 30 21:21:27 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id VAA29727
	for <ips-archive@odin.ietf.org>; Mon, 30 Apr 2001 21:21:27 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f3UNE3r25074
	for ips-outgoing; Mon, 30 Apr 2001 19:14:03 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from c007.snv.cp.net (c007-h000.c007.snv.cp.net [209.228.33.206])
	by ece.cmu.edu (8.11.0/8.10.2) with SMTP id f3UNE1125069
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 19:14:01 -0400 (EDT)
Received: (cpmta 12022 invoked from network); 30 Apr 2001 16:13:54 -0700
Received: from dialup-64-34-3-43.telocity.com (HELO ljoy) (64.34.3.43)
  by smtp.telocity.com (209.228.33.206) with SMTP; 30 Apr 2001 16:13:54 -0700
X-Sent: 30 Apr 2001 23:13:54 GMT
From: "Douglas Otis" <dotis@sanlight.net>
To: "Robert Snively" <rsnively@Brocade.COM>,
        "Stephen Bailey" <steph@cs.uchicago.edu>, <ips@ece.cmu.edu>
Subject: RE: iSCSI: Re: iSCSI & Linked Commands
Date: Mon, 30 Apr 2001 16:11:47 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJGEAGCHAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <FFD40DB4943CD411876500508BAD0279026B2669@sj5-ex2.brocade.com>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Robert,

You are correct in the explicit definitions of the SAMS dictionary.  I was
simply attempting to explain the reason for a concern about sequential
ordering as it relates to these two devices.  Perhaps they could add
nextitive addressing.  :)

Doug

> -----Original Message-----
> From: Robert Snively [mailto:rsnively@brocade.com]
> Sent: Monday, April 30, 2001 4:13 AM
> To: Douglas Otis; Robert Snively; Stephen Bailey; ips@ece.cmu.edu
> Subject: RE: iSCSI: Re: iSCSI & Linked Commands
>
>
> Doug,
>
> Sorry, that is not how the SCSI standards define relative
> addressing.  Relative addressing is a displacement from the
> last logical block transferred in an SBC read or write command
> and applies to the read or write command in which the RELADR
> bit is set.
>
> Tape devices have the property of sequential access.  Except
> for Locate (which is an absolute address), all tape addressing
> is "next".  To get to "next + n", you have to do a separate
> explicit command to step across the intervening blocks, which
> are not logical blocks and cannot be directly addressed.
>
> It may be that you are applying generic English definitions to
> words which SCSI has assigned a special meaning.
>
> Bob
>
> >  -----Original Message-----
> >  From: Douglas Otis [mailto:dotis@sanlight.net]
> >  Sent: Thursday, April 26, 2001 10:15 AM
> >  To: Robert Snively; Stephen Bailey; ips@ece.cmu.edu
> >  Subject: RE: iSCSI: Re: iSCSI & Linked Commands
> >
> >
> >  Robert,
> >
> >  Relative addressing is not defined because that is the only means of
> >  addressing.  Relative to the last block.
> >
> >  Doug
> >
> >
> >  > Doug,
> >  >
> >  > Relative addressing is not defined in the SSC command set nor
> >  > in the SPC command set for tapes.
> >  >
> >  > Bob
> >  >
> >  > >  -----Original Message-----
> >  > >  From: Douglas Otis [mailto:dotis@sanlight.net]
> >  > >  Sent: Monday, April 23, 2001 9:36 AM
> >  > >  To: Stephen Bailey; ips@ece.cmu.edu
> >  > >  Subject: RE: iSCSI: Re: iSCSI & Linked Commands
> >  > >
> >  > >
> >  > >  Stephen,
> >  > >
> >  > >  Unlike random access devices, sequential access devices
> >  operate with
> >  > >  relative addressing.  For random access devices, this is a
> >  > >  seldom used
> >  > >  option.  There is a requirement to bind commands together to
> >  > >  ensure order of
> >  > >  execution on these devices.  By popular, you mean not
> >  sequential?
> >  > >
> >  > >  Doug
> >  > >
> >  > >
> >  > >  > Julian,
> >  > >  >
> >  > >  > > According to your logic no FCP implementation can use
> >  > >  linked commands?
> >  > >  > > Is this true for all OS's?  Is it a verified fact
> >  or folklore?
> >  > >  >
> >  > >  > In my experience it's fact.  I have never used a SCSI
> >  > >  stack which both
> >  > >  > supported AND used linked commands.  Like some others
> >  > >  here, I always
> >  > >  > assumed AIX might :^) Ralph has pointed out that T10
> >  is well aware
> >  > >  > that the feature is not popular.  There are other ways of
> >  > >  > accomplishing the same thing that are less likely to blow
> >  > >  up in your
> >  > >  > face.
> >  > >  >
> >  > >  > > Is it so also for the new MS StorPort driver?
> >  > >  >
> >  > >  > I don't know, but I'd be really surprised if they did
> >  use linked
> >  > >  > commands.  You have to be pretty nuts to rely on a feature
> >  > >  that's not
> >  > >  > even exercised by most SCSI implementations.
> >  > >  >
> >  > >  > Steph
> >  > >  >
> >  > >
> >  > >
> >  >
> >
> >
>



From owner-ips@ece.cmu.edu  Mon Apr 30 23:13:57 2001
Received: from ece.cmu.edu (ECE.CMU.EDU [128.2.236.200])
	by ietf.org (8.9.1a/8.9.1a) with SMTP id XAA03895
	for <ips-archive@odin.ietf.org>; Mon, 30 Apr 2001 23:13:57 -0400 (EDT)
Received: (from majordom@localhost)
	by ece.cmu.edu (8.11.0/8.10.2) id f410vOX00300
	for ips-outgoing; Mon, 30 Apr 2001 20:57:24 -0400 (EDT)
X-Authentication-Warning: ece.cmu.edu: majordom set sender to owner-ips@ece.cmu.edu using -f
Received: from msgbas1.cos.agilent.com (msgbas1x.cos.agilent.com [192.6.9.33])
	by ece.cmu.edu (8.11.0/8.10.2) with ESMTP id f410vN100295
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 20:57:23 -0400 (EDT)
Received: from msgrel1.and.agilent.com (msgrel1.and.agilent.com [130.30.33.104])
	by msgbas1.cos.agilent.com (Postfix) with ESMTP id 3DC95A02
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 18:57:22 -0600 (MDT)
Received: from rtl.rose.agilent.com (rtl.rose.agilent.com [156.140.232.231])
	by msgrel1.and.agilent.com (Postfix) with ESMTP id 44BF921
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 20:57:21 -0400 (EDT)
Received: from mail.rose.agilent.com (bellhop [156.140.233.51])
	by rtl.rose.agilent.com (8.9.3 (PHNE_18979)/8.9.3 SMKit7.1.0) with ESMTP id RAA21518
	for <ips@ece.cmu.edu>; Mon, 30 Apr 2001 17:57:20 -0700 (PDT)
Received: from agilent.com (cos1nai128088.cos.agilent.com [141.184.128.88])
          by mail.rose.agilent.com (Netscape Messaging Server 3.6)
           with ESMTP id AAA485 for <ips@ece.cmu.edu>;
          Mon, 30 Apr 2001 17:56:59 -0700
Message-ID: <3AEE0903.2E46EF43@agilent.com>
Date: Mon, 30 Apr 2001 17:53:24 -0700
From: "Matt Wakeley" <matt_wakeley@agilent.com>
Reply-To: Matt Wakeley <matt_wakeley@agilent.com>
Organization: Agilent Technologies
X-Mailer: Mozilla 4.77 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: IPS Reflector <ips@ece.cmu.edu>
Subject: iscsi: comments to iSCSI rev 6
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ips@ece.cmu.edu
Precedence: bulk
Content-Transfer-Encoding: 7bit

Section 2.2.3:

The AHSLength field, placed where it is, requires RISC-type processors to
shift the length left by 8 bits.

Section 2.3.5, 3rd paragraph: contradicts itself.  First it states that an AHS
header "MUST" be present, then goes on to define what the bi-di read length is
if the header is not present.  If it's not present, it's a protocol error.

Section 2.4.1, last sentence: "b0-b3 MUST be 0" should be "b1-b4 MUST be 0".

Section 2.7.7: why have residual bits/fields in a data PDU?  If there is
residual, then send a status PDU indicating the residual value.

Section 2.8.3: Keys not understood by the target should be explicitly
indicated as not being understood.  Silence is not a good way to indicate that
one does not understand something.  Also, something more ASCII-friendly, such
as a semicolon (;), should be used to separate key-value pairs instead of a
null (0x00) character.  This would allow generic text manipulation libraries
to be used.
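
As a rough sketch of the difference, the same parser can handle either
separator, but only the semicolon form stays printable text end to end. The
function name parse_pairs and the keys KeyA and KeyB are made up for
illustration and are not keys from the draft.

```python
def parse_pairs(data: bytes, sep: bytes) -> dict:
    """Split key=value pairs on the given separator, skipping empty pieces."""
    pairs = {}
    for item in data.split(sep):
        if not item:
            continue  # e.g. the empty piece after a trailing separator
        key, _, value = item.partition(b"=")
        pairs[key.decode("ascii")] = value.decode("ascii")
    return pairs


null_form = b"KeyA=1\x00KeyB=yes\x00"   # null-separated pairs
semi_form = b"KeyA=1;KeyB=yes"          # the semicolon form suggested above
print(parse_pairs(null_form, b"\x00") == parse_pairs(semi_form, b";"))
```

The catch with any printable separator is that it can no longer appear inside
a value without some escaping rule, which the null character avoids.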

Section 2.9.3: why "MUST" key-value pairs be returned in the same order they
were issued?  Seems like a rather strong requirement for no apparent reason.

Section 2.12.3: indicate that the LUN is copied from the NOP-IN.  This is much
more clear than "the correct value for the task".

Section 2.14: Why is there no CmdSN for the logout command?  Also, 2 ways of
performing the same operation (cleaning up) are stated in the 3rd paragraph. 
In the interest of reducing "options", I suggest that one be picked as the
only method.

Section 2.17.4: "The target assigns" should be "The target may assign".

Section 6.3 & 6.7.2: why "MUST" a target reissue missing responses?  What if
it is not able to?  There should be the option to reject the SNACK.

Appendix: "Initial Marker-less Interval" - I say again, there should not be a
"minimum markerless interval".  This should be whatever is negotiated.


-Matt Wakeley
Agilent Technologies




