
From nobody Sat Dec  2 15:41:07 2017
Return-Path: <ietf@augustcellars.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 649C71287A5 for <cbor@ietfa.amsl.com>; Sat,  2 Dec 2017 15:41:05 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.899
X-Spam-Level: 
X-Spam-Status: No, score=-1.899 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id EyDlhO8bANWG for <cbor@ietfa.amsl.com>; Sat,  2 Dec 2017 15:41:00 -0800 (PST)
Received: from mail2.augustcellars.com (augustcellars.com [50.45.239.150]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 057E91205D3 for <cbor@ietf.org>; Sat,  2 Dec 2017 15:41:00 -0800 (PST)
Received: from Jude (192.168.0.11) by mail2.augustcellars.com (192.168.0.56) with Microsoft SMTP Server (TLS) id 15.0.1347.2; Sat, 2 Dec 2017 15:39:41 -0800
From: Jim Schaad <ietf@augustcellars.com>
To: 'Jeffrey Yasskin' <jyasskin@chromium.org>, 'Francesca Palombini' <francesca.palombini@ericsson.com>
CC: <cbor@ietf.org>, <aamelnikov@fastmail.fm>, <hildjj@cursive.net>
References: <012801d32f2e$a95aaf10$fc100d30$@augustcellars.com> <7C19E4CE-32E2-44B2-BD44-1BAA48190674@tzi.org> <013a01d32fcb$ac8cede0$05a6c9a0$@augustcellars.com> <C55850CF-C510-4D2E-8298-3A40E3623CDB@tzi.org> <HE1PR0701MB2539219033904FD2A45771BA98700@HE1PR0701MB2539.eurprd07.prod.outlook.com> <CANh-dX=UGDNX1CCQCL_-9T5kjp4i5vwqrTnQ8D6V7qkLX2PotA@mail.gmail.com>
In-Reply-To: <CANh-dX=UGDNX1CCQCL_-9T5kjp4i5vwqrTnQ8D6V7qkLX2PotA@mail.gmail.com>
Date: Sat, 2 Dec 2017 15:40:50 -0800
Message-ID: <040d01d36bc6$fe6d7800$fb486800$@augustcellars.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_040E_01D36B83.F04D9360"
X-Mailer: Microsoft Outlook 16.0
Thread-Index: AQLkcV5/cuYwYNvq7zdAWu7g9fdkoQDl1w1UAkkwjzoCc2JLLgGyH3OJAnAGOCigwITooA==
Content-Language: en-us
X-Originating-IP: [192.168.0.11]
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/wQWLc6kkhJpZMk3LCTFP39NN5IQ>
Subject: Re: [Cbor] [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 02 Dec 2017 23:41:05 -0000

------=_NextPart_000_040E_01D36B83.F04D9360
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Looking at the CTAP description, it appears that it does not use the
same order as the RFC, and it differs in a really strange way.  It says
that the 5/3 is to be ignored, but then does an ordering on the major
type first.  I am not sure that this is actually self-consistent.

I agree that I do not want to have to do an encoding in order to decide
things.  However, using the CTAP description I think that you might be
able to do the sorting without needing to do the encoding, because
major types are all sorted together.  In fact it may be really close to
the memcmp version that I have suggested.  (I would need to do some
really detailed futzing about how each of the different types is
encoded to be sure one way or the other.  It might be that things
nested inside tags mess that up.  Is a tagged uint sorted before a
tagged sint if the second one has a shorter encoding?)

Jim

From: CBOR [mailto:cbor-bounces@ietf.org] On Behalf Of Jeffrey Yasskin
Sent: Thursday, November 30, 2017 2:14 PM
To: Francesca Palombini <francesca.palombini@ericsson.com>
Cc: cbor@ietf.org; aamelnikov@fastmail.fm; hildjj@cursive.net
Subject: Re: [Cbor] [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization

Belatedly, I've discovered a user of "canonical" CBOR who's proposing a
different map order than the RFC suggests:
https://fidoalliance.org/specs/fido-v2.0-rd-20170927/fido-client-to-authenticator-protocol-v2.0-rd-20170927.html#message-encoding
(Note that this isn't a final standard yet and may change.)

This is justified by the RFC saying that "Those protocols are free to
define what they mean by a canonical format and what encoders and
decoders are expected to do.  This section lists some suggestions for
such protocols." That is (as Jim said), the RFC doesn't specify
"canonical" CBOR: it just provides an option for higher-level protocols
to do so.

The use of a different order in CTAP is going to either force its
implementers to write custom CBOR encoders and decoders or require the
generic encoders to take a configuration option for the map order. If
the generic encoders take an option, then it stops being an issue for
CBORbis to suggest a different order.

---

On the question of *which* order, it may be instructive to look at the
current Chromium implementation:
https://cs.chromium.org/chromium/src/content/browser/webauth/cbor/cbor_values.h?l=27&rcl=9c334c65915ec312ae1e13c4da9285ee8c58735b

We currently have serializations and deserializations create a C++
data structure representing the CBOR value, and maps are represented by
a std::map<> whose comparator implements the canonical order. This is
not as high-performance as it could be, but it does allow the encoder to
guarantee that its output is well-formed without relying on its caller
to maintain any invariants. However, once we add support for more than
just string keys, having the comparison check the lengths of the keys'
serializations first looks like it's going to make us allocate space for
the serializations inside the comparator, i.e. O(log N) allocations for
a tree lookup instead of just O(log N) comparisons. There are ways
around this, but they appear to involve having each CBORValue object
cache its serialization, which isn't pretty either.

An ordering that doesn't compare the length first (e.g. Jim's memcmp
suggestion) seems like it'll be more amenable to optimizations.
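
As a rough illustration of that point, here is a minimal sketch (in Ruby, to match the code quoted further down the thread) of a comparator that orders keys without building any serialization. It is not the Chromium code and not taken from any draft: the names major_type and compare_keys are made up for this example, and it only handles unsigned integers and text strings. For those two types, assuming shortest-form encodings, the order it produces should coincide with a byte-wise comparison of the encoded keys.

```ruby
# Hypothetical helper: CBOR major type for the two key kinds this sketch
# supports (0 = unsigned integer, 3 = text string).
def major_type(key)
  case key
  when Integer
    raise ArgumentError, "only non-negative integers in this sketch" if key.negative?
    0
  when String
    3
  else
    raise ArgumentError, "unsupported key type: #{key.class}"
  end
end

# Compare two keys without serializing them.
# Order: major type, then the argument carried in the CBOR head (the value
# for integers, the byte length for strings), then the string bytes.
# Assumes String keys are UTF-8, so that String#<=> compares their bytes.
def compare_keys(a, b)
  by_major = major_type(a) <=> major_type(b)
  return by_major unless by_major.zero?
  return a <=> b if a.is_a?(Integer)

  by_length = a.bytesize <=> b.bytesize
  return by_length unless by_length.zero?
  a <=> b
end

keys = ["aa", 1000, "b", 10, "a", 24]
sorted = keys.sort { |x, y| compare_keys(x, y) }
p sorted  # => [10, 24, 1000, "a", "b", "aa"]
```

A comparator of this shape could be handed to an ordered container directly. Nested types (arrays, maps, tags) would need more care than this sketch takes.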

Jeffrey

On Thu, Oct 5, 2017 at 4:20 AM Francesca Palombini
<francesca.palombini@ericsson.com> wrote:

Hi,

(chair-hat on) I would really like to hear the opinion of the working
group on this one.
Please make sure you read through the mail thread and express your
thoughts.

Thanks,
Francesca

> -----Original Message-----
> From: core [mailto:core-bounces@ietf.org] On Behalf Of Carsten Bormann
> Sent: 17 September 2017 20:08
> To: Jim Schaad <ietf@augustcellars.com>
> Cc: cbor@ietf.org; core@ietf.org WG <core@ietf.org>
> Subject: Re: [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization
>
> Hi Jim,
>
> I do share your frustration a bit, because I do believe the canonical
> map key sorting order is one of the very few things we didn’t quite get
> right about RFC 7049.  The question about fixing this is really
> procedural: can we fix it, and what is the impact on people who are now
> using the old order?
>
> > I complained that this was not a good sorting order before RFC 7049
> > was published, so this is not a new issue.  I wish that I could find
> > the mails from the time, as my memory was that you said we could
> > discuss it on the next version, which is what I am trying to do.
>
> I must admit I don’t remember that discussion (it would have been more
> than four years in the past) and my mail program apparently doesn’t
> either.
>
> (My mail program does remember errata report 4409 from July 2015, where
> you did propose another sorting order.)
>
> >> Why do you think the values need to be in there, too?
> >> Map keys must be different, so the value never can have an effect,
> >
> > That is from below - where you could have multiple duplicate keys.  It
> > is very possible people will do this even if it is not a valid encoding.
>
> The reason why 7049 does not outright make emitting duplicate keys a
> conformance issue is that, in a streaming implementation, it is hard to
> check for duplicate keys.  That argument does not apply to an
> implementation that has to sort map member keys to achieve a canonical
> encoding.  So, while creating the canonical encoding, the implementation
> can also check for key duplication and raise an error if that happens.
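
To make the duplicate check concrete, here is a minimal Ruby sketch, assuming the map members are already available as pairs of encoded key bytes and value. The helper name sort_and_check_duplicates is invented for this example and does not come from RFC 7049 or from any gem. Once the members are sorted, equal keys are adjacent, so one pass over neighbouring pairs is enough. That holds for the length-first order as well as for a purely byte-wise one.

```ruby
# Hypothetical helper: sorts [encoded_key, value] pairs byte-wise on the
# encoded key and raises if two members carry the same key.
def sort_and_check_duplicates(pairs)
  sorted = pairs.sort_by { |encoded_key, _value| encoded_key }
  # After sorting, duplicate keys can only appear next to each other.
  sorted.each_cons(2) do |(key_a, _value_a), (key_b, _value_b)|
    if key_a == key_b
      raise ArgumentError, "duplicate map key: #{key_a.unpack1('H*')}"
    end
  end
  sorted
end

# 0x01 is the unsigned integer 1; 0x61 0x62 is the text string "b".
members = [["\x61b".b, "second"], ["\x01".b, "first"]]
p sort_and_check_duplicates(members).map(&:last)  # => ["first", "second"]
```

Raising an error is only one possible policy. A protocol could just as well decide to reject the whole item some other way.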
>
> >> Now, if it were mid-2013, I would just agree with you, change the
> >> draft, and ask the WG (at the time appsawg) whether there are any
> >> further comments.
> >
> > Given that you did not agree with me at the time when I raised this
> > issue, I would disagree with this statement.
>
> Well, I probably should have said “knowing what I know now” (i.e., the
> degree of POLS violation that the current rules pose) I would agree with
> you.  (At the time, the definition we now have in RFC 7049 appeared to be
> expedient.)
>
> >> Unfortunately, CBOR has been out there, we are on the way to an
> >> Internet STD, and the current canonicalization is what’s implemented
> >> (at least in a couple of places).  If we change this, we have to take
> >> canonicalization out from the STD and put it into a separate
> >> specification.  We could decide to do this, but I’m not sure that this
> >> helps.  (We’ll also have the CER vs. DER situation again.)
> >
> > I do not believe that changing this would in any way stop the
> > advancement to STD.
>
> Now that (i.e., just going ahead and changing things here) is an
> interesting thought.
>
> One problem with that is that in the IETF, the IESG is the gating
> function, and it is sufficient for a single member of the IESG to
> disagree with this thought to hold up the process.
>
> > This section is completely non-normative; there is not a single
> > normative statement in the entire section.  Nor is the set of rules
> > complete, given that there are two paragraphs at the end which have
> > additional rules that "might" need to be added.  Making this one change
> > would not alter the fact that it is non-normative and incomplete.
>
> Filling in those blanks might be another argument for going ahead and
> writing a separate document about canonicalized CBOR instead.
>
> > There is a possible multiple encoding issue that may come, but that is
> > already implicit in the text of this section.  The current text reads
> > "Those protocols are free to define what they mean by a canonical
> > format and what encoders and decoders are expected to do."  Which means
> > that multiple encodings are already not only possible but probable.  I
> > think that we can get this fixed now and go forward with an obsoleted
> > suggested canonicalization and be fine.
> >
> > The suggested algorithm is far easier to understand, easier to get
> > right, and also has some advantages where one can do the
> > canonicalization without having to do the encoding first.
> >
> > int CompareNodes(node1, node2)
> >   if node1.majorMode != node2.majorMode then return node1.majorMode - node2.majorMode;
> >   if node1.majorMode has a length field and node1.length != node2.length then return node1.length - node2.length;
> >   return compare node1 and node2 values - this is major mode dependent.
> >
> > With the current method, you need to do a lot more work to try and get
> > things in the correct order if you have any mixing of major modes,
> > which is now very commonplace despite the statement that this is
> > probably a bad practice.
>
> Indeed, this is one thing we have learned since 2013: There are quite
> good reasons for mixing major types in the keys of a single map.
>
> > If you have to emit all of the keys, sort them and then remember the
> > original order to emit the values, it is harder than concatenating the
> > entire thing and then emitting the values after doing the sort.  You
> > are going to need to keep some type of more complex structure - and
> > thus more code - to do the emission in the generic case.  Yes, for a
> > small fixed set of keys this is not necessary, but I do not believe
> > that is where a good canonicalization routine is going to be needed.
>
> Here is my implementation of a pre-canonicalizer for maps from the
> cbor-canonical gem (the type “Hash” in Ruby is an order-preserving map,
> so we can do all this entirely at the data model level):
>
>       def cbor_pre_canonicalize
>         Hash[collect {|k, v|
>                       k = k.cbor_pre_canonicalize
>                       v = v.cbor_pre_canonicalize
>                       cc = k.to_cbor # already canonical
>                       [cc.size, cc, k, v]}.sort.collect{|s, cc, k, v| [k, v]}]
>       end
>
> A sorting rule that is entirely based on (byte-wise) lexical ordering
> would enable pre-sorting on types: there often would be no need to
> generate the entire key encoding for sorting.  But then, you have to
> generate them anyway, so the additional overhead is mostly a matter of
> memory allocation/copying.
>
> Here is an (untested) implementation of what I think you are proposing:
>
>       def cbor_pre_canonicalize
>         Hash[collect {|k, v|
>                       k = k.cbor_pre_canonicalize
>                       v = v.cbor_pre_canonicalize
>                       cc = k.to_cbor # already canonical
>                       [cc, k, v]}.sort.collect{|cc, k, v| [k, v]}]
>       end
>
> I.e., the size of the key encoding is removed from the list of things to
> sort on.
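
To see where the two rules actually disagree, here is a small self-contained Ruby sketch that applies both sort keys to the same set of map keys. It does not use the cbor gem: encode_head and encode_key are hand-written stand-ins for to_cbor, invented for this example, and they only cover unsigned integers and text strings in shortest form. Treat it as an illustration under those assumptions rather than as either author's implementation.

```ruby
# Hypothetical stand-in for to_cbor: encodes the CBOR head (major type plus
# argument) in its shortest form.
def encode_head(major, n)
  if n < 24
    [(major << 5) | n].pack("C")
  elsif n < 0x100
    [(major << 5) | 24, n].pack("CC")
  elsif n < 0x10000
    [(major << 5) | 25, n].pack("Cn")
  elsif n < 0x100000000
    [(major << 5) | 26, n].pack("CN")
  else
    [(major << 5) | 27, n].pack("CQ>")
  end
end

# Encodes an unsigned integer (major type 0) or a text string (major type 3).
def encode_key(key)
  case key
  when Integer
    raise ArgumentError, "only non-negative integers in this sketch" if key.negative?
    encode_head(0, key)
  when String
    bytes = key.encode("UTF-8").b
    encode_head(3, bytes.bytesize) + bytes
  else
    raise ArgumentError, "unsupported key type: #{key.class}"
  end
end

# Length of the encoded key first, then the bytes (the first variant above).
def length_first_order(keys)
  keys.sort_by { |k| e = encode_key(k); [e.bytesize, e] }
end

# Bytes of the encoded key only (the second variant above).
def bytewise_order(keys)
  keys.sort_by { |k| encode_key(k) }
end

keys = [1000, "a", 10, "aa", 24]
p length_first_order(keys)  # => [10, 24, "a", 1000, "aa"]
p bytewise_order(keys)      # => [10, 24, 1000, "a", "aa"]
```

With these keys the length-first rule interleaves integers and strings, because 1000 takes three bytes while "a" takes two. The byte-wise rule keeps each major type together.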
>
> > I strongly urge that this change be done even if it might hurt
> > backwards compatibility, and the sooner we make the decision to do it
> > the better, as it reduces the number of people who will do it wrong.
>
> Now whether we should do this or not is a good question for the CBOR WG
> to look at.
>
> (I’ve added the WG back to the recipient list.)
>
> Grüße, Carsten
>
>
> _______________________________________________
> core mailing list
> core@ietf.org
> https://www.ietf.org/mailman/listinfo/core
_______________________________________________
CBOR mailing list
CBOR@ietf.org
https://www.ietf.org/mailman/listinfo/cbor


------=_NextPart_000_040E_01D36B83.F04D9360--


From nobody Sat Dec  2 18:15:41 2017
Return-Path: <brian.e.carpenter@gmail.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id BCEE3124E15 for <cbor@ietfa.amsl.com>; Sat,  2 Dec 2017 18:15:39 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2
X-Spam-Level: 
X-Spam-Status: No, score=-2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9,  DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id WCRiUdAp6aep for <cbor@ietfa.amsl.com>; Sat,  2 Dec 2017 18:15:38 -0800 (PST)
Received: from mail-pl0-x22a.google.com (mail-pl0-x22a.google.com [IPv6:2607:f8b0:400e:c01::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 26CDC12420B for <cbor@ietf.org>; Sat,  2 Dec 2017 18:15:38 -0800 (PST)
Received: by mail-pl0-x22a.google.com with SMTP id q7so8457958plk.0 for <cbor@ietf.org>; Sat, 02 Dec 2017 18:15:38 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;  h=to:from:subject:organization:message-id:date:user-agent :mime-version:content-language:content-transfer-encoding; bh=HeiZV6Im2XYCV5GLN1dkmGUv6g/RubYRaRmyci/V5RA=; b=vMEGr0VCAzpPnms3V2LgXLP8T783wJXQgz/dFv6WJBsk6zFKUKgS2rhpuzLx5hHHkC 3j8VxnEwH+kDqll/dD+UlHK3g9RvBGT613E/1JmTqzghYWHuHj+0Fmwxorcvz7ibIMKi S0OA2l69ci8+ETFlR2MHmSum/HP1t+uGw6HrptNrUKtUItAGNM+p4X1Svk4u6cO2DzMY nF1NlAcCdVpow4PSudEm0qgeNd1XO63GspI43V9l8eJQUaX+zuo0sUlRdnKPHnfSZNM/ /biV8HQxkx2z20ASPbamOTJa7GqPtKf81HpVic0Vju2CisywtkzziaPu8Y7WBq5DHNfB Wwqw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:to:from:subject:organization:message-id:date :user-agent:mime-version:content-language:content-transfer-encoding; bh=HeiZV6Im2XYCV5GLN1dkmGUv6g/RubYRaRmyci/V5RA=; b=a9FZAgG6jRx0Vz+7/9JXmz8h5INYDl5FNRoDNUEkqDTY+cFex8jyimmZalVT5cWyxS IBrATC6zjcL9flAjA6r+STlbW5bowROwuvKSOua7hKBpFQmiVtzk0JR914nwDllAP6bc RZ03v5GZ2lq38sVTI/jSP93p0xSbsY2PClZmUTKPL+YERvw8D069cQdON8zivnBBif60 Rx0N36bcMEmOo2TUq5ngFQPHFyTHThwFXeEgknJ8ZuZKpLrGmMsuW1nw+JztKoDTrM3/ d/DV+xw+DLG6XUvCMW/CnSbp1F7p+iZn6KVwEzYAVPz8xuU8Ejw5XieQpvd7W9PbGoaz 11iA==
X-Gm-Message-State: AJaThX7l3rLcsN7pxkJ0WtRm5cBgsPjnlJNRpyRToA15UUYKqBxpcMaX 0Fwtg8NytQl0Do4VTCJTQjobeQ==
X-Google-Smtp-Source: AGs4zMYMijqB4rE239af6FHhKAOXTZAHeG0A5/e2b0rgkDvhBHQp7THl3N40g4rpLUH08DjU0YSfHg==
X-Received: by 10.159.194.20 with SMTP id x20mr8560347pln.77.1512267337244; Sat, 02 Dec 2017 18:15:37 -0800 (PST)
Received: from ?IPv6:2406:e007:6f17:1:28cc:dc4c:9703:6781? ([2406:e007:6f17:1:28cc:dc4c:9703:6781]) by smtp.gmail.com with ESMTPSA id l4sm3332908pff.90.2017.12.02.18.15.34 for <cbor@ietf.org> (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 02 Dec 2017 18:15:35 -0800 (PST)
To: cbor@ietf.org
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
Organization: University of Auckland
Message-ID: <da387def-6f9a-774e-13b1-f861bfe6d833@gmail.com>
Date: Sun, 3 Dec 2017 15:15:31 +1300
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/BjLftWl4tWh9T7tw_jAqO7Ejm3w>
Subject: [Cbor] Maps used as keys?
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sun, 03 Dec 2017 02:15:40 -0000

Hi,

I have a question about this CDDL fragment:

   objective-value  /= { 1*elements }
   elements        //= ( @rfcXXXX: { 1*relement } )

   relement  = ( relement-codepoint => relement-value )
   relement-codepoint = uint 
   relement-value     = any
   relement-codepoint /= {
        ?( &(sender-loop-count:1) => 1..255 ),
        ?( &(srv-element:2) => tstr ),
       }

This validates with the Ruby tool and it can generate:

 {"@rfcXXXX": {4224: "toe",
               {1: 225}: "tic",
               2: "tac",
               {1: 36, 2: "toe"}: "tic"
              }
 }

I suspect that the CDDL ABNF allows this via
  memberkey = type1 S "=>"
since type1 covers anything.

Does CBOR allow a map as a key in another map?

However, that is a completely unacceptable construct in Python
("TypeError: unhashable type: 'dict'"); in other words Python
cannot use a map (dictionary) as a key in another map, and I
completely understand that.

What's the ground truth here?

    Brian


From nobody Sun Dec  3 02:30:35 2017
Return-Path: <cabo@tzi.org>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D0223120713 for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 02:30:33 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.2
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id jMTbXgv_PE0L for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 02:30:32 -0800 (PST)
Received: from mailhost.informatik.uni-bremen.de (mailhost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id C40731241FC for <cbor@ietf.org>; Sun,  3 Dec 2017 02:30:31 -0800 (PST)
X-Virus-Scanned: amavisd-new at informatik.uni-bremen.de
Received: from submithost.informatik.uni-bremen.de (submithost.informatik.uni-bremen.de [134.102.201.11]) by mailhost.informatik.uni-bremen.de (8.14.5/8.14.5) with ESMTP id vB3AUR2V000830; Sun, 3 Dec 2017 11:30:27 +0100 (CET)
Received: from client-0253.vpn.uni-bremen.de (client-0253.vpn.uni-bremen.de [134.102.107.253]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by submithost.informatik.uni-bremen.de (Postfix) with ESMTPSA id 3yqPR73Z7FzDXsH; Sun,  3 Dec 2017 11:30:27 +0100 (CET)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <da387def-6f9a-774e-13b1-f861bfe6d833@gmail.com>
Date: Sun, 3 Dec 2017 11:30:26 +0100
Cc: cbor@ietf.org
X-Mao-Original-Outgoing-Id: 533989826.73544-56186b9854c6b80e71a450fae08582f8
Content-Transfer-Encoding: quoted-printable
Message-Id: <42C5D0EA-8986-4994-A078-7D0BC085EE9A@tzi.org>
References: <da387def-6f9a-774e-13b1-f861bfe6d833@gmail.com>
To: Brian E Carpenter <brian.e.carpenter@gmail.com>
X-Mailer: Apple Mail (2.3273)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/6tn2JqtqEel_5LcgbJHY9WvfsxA>
Subject: Re: [Cbor] Maps used as keys?
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sun, 03 Dec 2017 10:30:34 -0000

Hi Brian,

> Does CBOR allow a map as a key in another map?

Yes.
Any data item can be used as a map key.

> However, that is a completely unacceptable construct in Python
> ("TypeError: unhashable type: 'dict'"); in other words Python
> cannot use a map (dictionary) as a key in another map, and I
> completely understand that.

Using mutable data structures as map keys is somewhat icky in a hashed
map implementation.
Strictly speaking, you would need to re-hash a key each time it is
changed (mutated) internally; but how does the map know that this has been
done?
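
To make the re-hash problem concrete, here is a minimal Python sketch. ListKey and TinyHashMap are names invented for this illustration; they are a toy, not how any real dictionary or CBOR library is implemented. The key's hash depends on its current contents, so changing the key after insertion leaves the entry stranded in the old bucket.

```python
class ListKey:
    """Hypothetical mutable key whose hash depends on its current contents."""

    def __init__(self, items):
        self.items = list(items)

    def __hash__(self):
        return sum(self.items)

    def __eq__(self, other):
        return isinstance(other, ListKey) and self.items == other.items


class TinyHashMap:
    """Toy hashed map: 8 buckets, selected by hash(key) % 8."""

    def __init__(self):
        self.buckets = [[] for _ in range(8)]

    def put(self, key, value):
        self.buckets[hash(key) % 8].append((key, value))

    def get(self, key, default=None):
        for stored_key, value in self.buckets[hash(key) % 8]:
            if stored_key == key:
                return value
        return default


key = ListKey([1, 2])        # hash is 3, so the entry goes into bucket 3
table = TinyHashMap()
table.put(key, "tic")
before = table.get(key)      # found in bucket 3

key.items.append(3)          # hash is now 6, but the entry stays in bucket 3
after = table.get(key)       # looks in bucket 6 and finds nothing
```

The map is never told that the key changed, which is exactly the question raised above.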

JavaScript only allows strings (which are immutable in JavaScript) as
keys in “Object”s.
Since that is limiting, it now also has “Map”s.
However, indexing JavaScript “Map”s is by === (identity,
not equality, with the weird exception that +0 and -0 are the same map
key), so this punts on the problem that Python is apparently trying to
avoid.

Grüße, Carsten


From nobody Sun Dec  3 06:15:36 2017
Return-Path: <cabo@tzi.org>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 37817124234 for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 06:15:34 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.2
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Qn6A9nmNQpcX for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 06:15:31 -0800 (PST)
Received: from mailhost.informatik.uni-bremen.de (mailhost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 7CA381205F1 for <cbor@ietf.org>; Sun,  3 Dec 2017 06:15:31 -0800 (PST)
X-Virus-Scanned: amavisd-new at informatik.uni-bremen.de
Received: from submithost.informatik.uni-bremen.de (submithost.informatik.uni-bremen.de [134.102.201.11]) by mailhost.informatik.uni-bremen.de (8.14.5/8.14.5) with ESMTP id vB3EFQWF003780; Sun, 3 Dec 2017 15:15:26 +0100 (CET)
Received: from client-0253.vpn.uni-bremen.de (client-0253.vpn.uni-bremen.de [134.102.107.253]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by submithost.informatik.uni-bremen.de (Postfix) with ESMTPSA id 3yqVQk58W0zDXtX; Sun,  3 Dec 2017 15:15:26 +0100 (CET)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <CANh-dX=UGDNX1CCQCL_-9T5kjp4i5vwqrTnQ8D6V7qkLX2PotA@mail.gmail.com>
Date: Sun, 3 Dec 2017 15:15:22 +0100
Cc: cbor@ietf.org
X-Mao-Original-Outgoing-Id: 534003322.79303-a8a313b9fd22b515336289409e12c812
Content-Transfer-Encoding: quoted-printable
Message-Id: <1FED1F56-93BA-410F-B7C4-E83D31E7CC4E@tzi.org>
References: <012801d32f2e$a95aaf10$fc100d30$@augustcellars.com> <7C19E4CE-32E2-44B2-BD44-1BAA48190674@tzi.org> <013a01d32fcb$ac8cede0$05a6c9a0$@augustcellars.com> <C55850CF-C510-4D2E-8298-3A40E3623CDB@tzi.org> <HE1PR0701MB2539219033904FD2A45771BA98700@HE1PR0701MB2539.eurprd07.prod.outlook.com> <CANh-dX=UGDNX1CCQCL_-9T5kjp4i5vwqrTnQ8D6V7qkLX2PotA@mail.gmail.com>
To: Jeffrey Yasskin <jyasskin@chromium.org>
X-Mailer: Apple Mail (2.3273)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/fLFmFbltzsty3ZKU8Ca1ZYJ7iBk>
Subject: Re: [Cbor] [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sun, 03 Dec 2017 14:15:34 -0000

On Nov 30, 2017, at 23:14, Jeffrey Yasskin <jyasskin@chromium.org>
wrote:
>
> Belatedly, I've discovered a user of "canonical" CBOR who's proposing
> a different map order than the RFC suggests:
> https://fidoalliance.org/specs/fido-v2.0-rd-20170927/fido-client-to-authenticator-protocol-v2.0-rd-20170927.html#message-encoding.
> (Note that this isn't a final standard yet and may change.)

I looked at the spec referenced.

So they essentially add

		• If the major types are different, the one with
		  the lower value in numerical order sorts earlier.

as a major sorting rule before the existing RFC 7049 canonicalization
rules:

		• If two keys have different lengths, the
		  shorter one sorts earlier;
		• If two keys have the same length, the one with
		  the lower value in (byte-wise) lexical order sorts earlier.

This is different from simply going for byte-wise lexicographic order (memcmp
order(*)), which would effectively get us the first rule as the major
sorting order already, but get rid of the length-based second rule
(the first rule in Section 3.9 of RFC 7049).
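
The three orderings can be compared side by side with a small Python sketch. The sample keys and their hand-written encodings are my own choice for illustration, and ctap_key only follows the rule as summarized in this message; it has not been checked against the CTAP text itself.

```python
# Canonical encodings of the sample keys, written out by hand so that
# no CBOR library is needed.
KEYS = [
    bytes.fromhex("0a"),      # 10
    bytes.fromhex("1864"),    # 100
    bytes.fromhex("20"),      # -1
    bytes.fromhex("617a"),    # "z"
    bytes.fromhex("626161"),  # "aa"
    bytes.fromhex("811864"),  # [100]
    bytes.fromhex("8120"),    # [-1]
    bytes.fromhex("f4"),      # false
]


def rfc7049_key(enc: bytes):
    """Shorter encoding first, then byte-wise (RFC 7049, Section 3.9)."""
    return (len(enc), enc)


def ctap_key(enc: bytes):
    """Major type first (top 3 bits of the initial byte), then the RFC 7049 rules."""
    return (enc[0] >> 5, len(enc), enc)


def bytewise_key(enc: bytes):
    """Plain byte-wise lexicographic order of the encodings."""
    return enc


def order(sort_key):
    return [enc.hex() for enc in sorted(KEYS, key=sort_key)]


rfc7049_order = order(rfc7049_key)
ctap_order = order(ctap_key)
bytewise_order = order(bytewise_key)
```

With these keys, the CTAP-style and byte-wise orders differ only in where [100] and [-1] land relative to each other, while the RFC 7049 order moves -1 and false ahead of 100.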

I’ve come to see putting the length comparison rule early in
3.9 as a major regression.
One of the objectives when designing the CBOR serialization was not to
repeat one big mistake that ASN.1 BER makes: to make overall lengths of
complex composite items visible/important in the encoding of the next
higher composite.
Here, we are doing just that.  D’oh.

> This is justified by the RFC saying that "Those protocols are free to
> define what they mean by a canonical format and what encoders and
> decoders are expected to do.  This section lists some suggestions for
> such protocols." That is (as Jim said), the RFC doesn't specify
> "canonical" CBOR: it just provides an option for higher-level protocols
> to do so.

Right.  So the change would be to mention two options for this, the old
canonical, and the saner (memcmp order) canonical.  Now the next step is
finding names for legacy canonical/saner canonical.  We then have to
decide whether we turn this into a separate document, at Proposed
Standard level, or believe that adding another suggestion to 3.9 is
essentially a bug fix and can be done in the Standard level document.

> The use of a different order in CTAP is going to either force its
> implementers to write custom CBOR encoders and decoders or require the
> generic encoders to take a configuration option for the map order. If
> the generic encoders take an option, then it stops being an issue for
> CBORbis to suggest a different order.

Right.  So I think you are saying we get to fix this.

I’d like to understand why CTAP went for major type first, then
legacy order.  Do you know when that was decided/who decided that?  Can
we maybe even influence them to go for the simpler “saner”
ordering?

Grüße, Carsten

(*) well, we wouldn’t define it as “memcmp order”
because that would require a normative reference to some C standard plus
defining the third parameter of memcmp as the lower of the two
lengths, which again exposes total lengths.
Instead, we would spell out “bytewise lexicographic order of the
canonical encodings of the keys” in a few more words.
Note that this can be defined simply based on the first byte that
differs in the byte sequence: there are no two different CBOR
data items where one is a prefix of the other (CBOR is self-delimiting).
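
A comparison defined purely on the first differing byte could be sketched as follows in Python. The function name compare_encoded_keys and the sample encodings are assumptions made for this illustration. The length check is only a sanity check on the inputs; it never decides the order.

```python
from functools import cmp_to_key


def compare_encoded_keys(a: bytes, b: bytes) -> int:
    """Return -1, 0 or 1 by looking only at the first byte that differs."""
    for x, y in zip(a, b):
        if x != y:
            return -1 if x < y else 1
    if len(a) != len(b):
        # One input is a proper prefix of the other, which cannot happen for
        # two complete, well-formed CBOR data items.
        raise ValueError("inputs are not both complete CBOR data items")
    return 0


# Hand-written encodings of [-1], false, [100] and 10.
sample = [bytes.fromhex(h) for h in ("8120", "f4", "811864", "0a")]
ordered = [enc.hex() for enc in sorted(sample, key=cmp_to_key(compare_encoded_keys))]
```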


From nobody Sun Dec  3 11:06:31 2017
Return-Path: <brian.e.carpenter@gmail.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 70974126C22 for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 11:06:30 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2
X-Spam-Level: 
X-Spam-Status: No, score=-2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9,  DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 3BhvItZVAsvG for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 11:06:29 -0800 (PST)
Received: from mail-pf0-x234.google.com (mail-pf0-x234.google.com [IPv6:2607:f8b0:400e:c00::234]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id DFAC21201F8 for <cbor@ietf.org>; Sun,  3 Dec 2017 11:06:28 -0800 (PST)
Received: by mail-pf0-x234.google.com with SMTP id j124so6898193pfc.2 for <cbor@ietf.org>; Sun, 03 Dec 2017 11:06:28 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;  h=subject:to:cc:references:from:organization:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=6h/LgQx36D3pMDs1U6+clrUkS2U0g/sU4bVgRX8dgTE=; b=Fbd3gYf5NDQJczOMk1YMqA6Nu6DgnwxZJRfchO6mch8FyKQzlsKiLj8ifvQjLP9ZpR QM8aCbQYEskabWNZGyhv2LmFMJs9wX2+fP5zvm4zapuNUItHe3H0GJMRLVmFjekR0sZ8 zF0yZkpIs2ye2r8rSJEdQ/mB1T81CVb/gCEgpX+oPimB+fOFHrkBqv7XdPymWOHBulmQ FLmtdoIUlwRqARkB+JI0W258Vsql4pw0Nk30d+xWsZwqQohDVBVdLWrgfC7teEx3+vtf FAPDbSEbi6jZ+CdvNCWq8LNMgaz2Z8PQ1KDpkSXejln2y9y+beWXdjOVz4B/yHnIKKVe 1B1w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:organization :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=6h/LgQx36D3pMDs1U6+clrUkS2U0g/sU4bVgRX8dgTE=; b=D7BzHIv+TUc4lYsJ4Zr6cJg4aUEXqo9aZzMv3Uu/wTffAXBorfwhSVlvxnb8qHkpR5 fmZl7GH9sldOTy7KfZSmFU3iYwgc05tduh4jIwDokohERe2HMwA8fRdeKrqJSNmX+VyF KU+Q6GeYitP82g45S+TM+V00XZq0ZSr5i2huFgYLer0swHspybxq8hR0VGFVxL2hweEK CtnsC1kG4FZ4deW6hU0zfPOkmLaAA289A1Hn0Ltwltx/iP0PGKXDbUOwiFiIe3Ahwh/9 rsG2zi1YOgxacO3cRJL6vkmQsMGT2ec6/UNr4dStVjiWR3CpNQxfVMIFzM/AHnRBkCcM M1MQ==
X-Gm-Message-State: AJaThX6gLYUsshIAgKLE1vbrKBOFB7SkuvD3jxG7s7tJWHjoY95/ZA3I K8UXxokpMxHqo63KlM6iSdxhFA==
X-Google-Smtp-Source: AGs4zMamqnRqTTbIJhTdcIvz5Bg5iR+T8NqBOLjLZIRVHzEgG/suBzfSquPIi6i6zyMYjYVtZCPQ9g==
X-Received: by 10.98.46.7 with SMTP id u7mr17059662pfu.37.1512327988003; Sun, 03 Dec 2017 11:06:28 -0800 (PST)
Received: from ?IPv6:2406:e007:6f17:1:28cc:dc4c:9703:6781? ([2406:e007:6f17:1:28cc:dc4c:9703:6781]) by smtp.gmail.com with ESMTPSA id f4sm17086198pgs.30.2017.12.03.11.06.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 03 Dec 2017 11:06:26 -0800 (PST)
To: Carsten Bormann <cabo@tzi.org>
Cc: cbor@ietf.org
References: <da387def-6f9a-774e-13b1-f861bfe6d833@gmail.com> <42C5D0EA-8986-4994-A078-7D0BC085EE9A@tzi.org>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
Organization: University of Auckland
Message-ID: <3123c29e-0b9a-3f96-a92d-08f643229bc2@gmail.com>
Date: Mon, 4 Dec 2017 08:06:23 +1300
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
MIME-Version: 1.0
In-Reply-To: <42C5D0EA-8986-4994-A078-7D0BC085EE9A@tzi.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/jCd9L0WH7HNAf5ZvbjTPQIW6dj8>
Subject: Re: [Cbor] Maps used as keys?
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sun, 03 Dec 2017 19:06:30 -0000

On 03/12/2017 23:30, Carsten Bormann wrote:
> Hi Brian,
>
>> Does CBOR allow a map as a key in another map?
>
> Yes.
> Any data item can be used as a map key.

Thanks. That was what the ABNF seemed to say.

>> However, that is a completely unacceptable construct in Python
>> ("TypeError: unhashable type: 'dict'"); in other words Python
>> cannot use a map (dictionary) as a key in another map, and I
>> completely understand that.
>
> Using mutable data structures as map keys is somewhat icky in a hashed
> map implementation.
> Strictly speaking, you would need to re-hash a key each time it is
> changed (mutated) internally; but how does the map know that this has been
> done?

Exactly. The hash needs to be invariant, but it can't be.

> JavaScript only allows strings (which are immutable in JavaScript) as
> keys in “Object”s.
> Since that is limiting, it now also has “Map”s.
> However, indexing JavaScript “Map”s is by === (identity,
> not equality, with the weird exception that +0 and -0 are the same map
> key), so this punts on the problem that Python is apparently trying to
> avoid.

So this suggests to me that while legal, this is a CBOR feature that has
serious issues if you try to use it in interoperable code. I wonder what
Python cbor.loads() would do with such a CBOR fragment?

<pause for testing**>

It crashes with "TypeError: unhashable type: 'dict'" of course.

** I converted my fragment to valid CBOR using diag2cbor, then read it
into Python.

How does Ruby avoid this problem? It seems pretty much unfixable for
Python.

Regards
    Brian


From nobody Sun Dec  3 12:16:46 2017
Return-Path: <cabo@tzi.org>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 104EC127011 for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 12:16:45 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.2
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 1sk4WUjQ0KiH for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 12:16:43 -0800 (PST)
Received: from mailhost.informatik.uni-bremen.de (mailhost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id F0721126D85 for <cbor@ietf.org>; Sun,  3 Dec 2017 12:16:42 -0800 (PST)
X-Virus-Scanned: amavisd-new at informatik.uni-bremen.de
Received: from submithost.informatik.uni-bremen.de (submithost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::b]) by mailhost.informatik.uni-bremen.de (8.14.5/8.14.5) with ESMTP id vB3KGc7A003580; Sun, 3 Dec 2017 21:16:38 +0100 (CET)
Received: from client-0253.vpn.uni-bremen.de (client-0253.vpn.uni-bremen.de [134.102.107.253]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by submithost.informatik.uni-bremen.de (Postfix) with ESMTPSA id 3yqfRT5DkLzDWMy; Sun,  3 Dec 2017 21:16:37 +0100 (CET)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <3123c29e-0b9a-3f96-a92d-08f643229bc2@gmail.com>
Date: Sun, 3 Dec 2017 21:16:36 +0100
Cc: cbor@ietf.org
X-Mao-Original-Outgoing-Id: 534024996.614453-fae786d44bdda2d4519a2eececec1a8d
Content-Transfer-Encoding: quoted-printable
Message-Id: <CCBAE961-76ED-403D-BD35-8E6312ADC29E@tzi.org>
References: <da387def-6f9a-774e-13b1-f861bfe6d833@gmail.com> <42C5D0EA-8986-4994-A078-7D0BC085EE9A@tzi.org> <3123c29e-0b9a-3f96-a92d-08f643229bc2@gmail.com>
To: Brian E Carpenter <brian.e.carpenter@gmail.com>
X-Mailer: Apple Mail (2.3273)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/uMhF4UVHxh8kWGeN8Z59lA9owcY>
Subject: Re: [Cbor] Maps used as keys?
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sun, 03 Dec 2017 20:16:45 -0000

On Dec 3, 2017, at 20:06, Brian E Carpenter
<brian.e.carpenter@gmail.com> wrote:
>
> On 03/12/2017 23:30, Carsten Bormann wrote:
>> Hi Brian,
>>
>>> Does CBOR allow a map as a key in another map?
>>
>> Yes.
>> Any data item can be used as a map key.
>
> Thanks. That was what the ABNF seemed to say.
>
>>> However, that is a completely unacceptable construct in Python
>>> ("TypeError: unhashable type: 'dict'"); in other words Python
>>> cannot use a map (dictionary) as a key in another map, and I
>>> completely understand that.
>>
>> Using mutable data structures as map keys is somewhat icky in a
>> hashed map implementation.
>> Strictly speaking, you would need to re-hash a key each time it is
>> changed (mutated) internally; but how does the map know that this has been
>> done?
>
> Exactly. The hash needs to be invariant, but it can't be.
>
>> JavaScript only allows strings (which are immutable in JavaScript) as
>> keys in “Object”s.
>> Since that is limiting, it now also has “Map”s.
>> However, indexing JavaScript “Map”s is by ===
>> (identity, not equality, with the weird exception that +0 and -0 are the
>> same map key), so this punts on the problem that Python is apparently
>> trying to avoid.
>
> So this suggests to me that while legal, this is a CBOR feature that has
> serious issues if you try to use it in interoperable code.

This sounds more like a problem with the Python implementation of CBOR
you are using.

> I wonder what
> Python cbor.loads() would do with such a CBOR fragment?
>
> <pause for testing**>
>
> It crashes with "TypeError: unhashable type: 'dict'" of course.

That’s not what it should be doing.
It should be creating some frozen (immutable) data structure that *can*
be used as a key, or define its own map type that can use more general
keys.
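
As a rough idea of what the first option might look like in Python, here is a minimal sketch of a frozen map type. FrozenMap and freeze are names invented for this illustration and are not part of any existing Python CBOR library; a real decoder would also have to decide how to treat values that Python considers equal but CBOR does not (for example 1 and 1.0).

```python
from collections.abc import Mapping


class FrozenMap(Mapping):
    """Hypothetical immutable, hashable mapping for decoded CBOR map keys."""

    def __init__(self, pairs=()):
        self._items = dict(pairs)
        # Computed once; this is safe because the contents never change.
        # Every key and value must itself be hashable.
        self._hash = hash(frozenset(self._items.items()))

    def __getitem__(self, key):
        return self._items[key]

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def __hash__(self):
        return self._hash

    def __repr__(self):
        return f"FrozenMap({self._items!r})"


def freeze(item):
    """Recursively convert mappings and lists into hashable equivalents."""
    if isinstance(item, Mapping):
        return FrozenMap((freeze(k), freeze(v)) for k, v in item.items())
    if isinstance(item, (list, tuple)):
        return tuple(freeze(x) for x in item)
    return item


# The inner map from the example earlier in this thread, with its map keys frozen.
decoded = {
    4224: "toe",
    freeze({1: 225}): "tic",
    2: "tac",
    freeze({1: 36, 2: "toe"}): "tic",
}
lookup = decoded[freeze({1: 225})]
```

Equality comes from the Mapping base class (it compares contents), and the hash is order-independent, so two frozen maps with the same pairs find the same entry.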

(If the generic data model of CBOR has corners that a generic CBOR
decoder can’t represent, a generic implementation should be
defining appropriate data structures.  E.g., cbor-ruby defines its own
CBOR::Tagged and CBOR::Simple for Tags and Simple values not already
supported by the language.  Of course, there are some weird cases:
Ruby does not distinguish +0.0 and -0.0 as map keys, so strictly
speaking cbor-ruby should define a CBOR::MinusZero or some such for
this, but it doesn't.)

> ** I converted my fragment to valid CBOR using diag2cbor, then read it
> into Python.
>
> How does Ruby avoid this problem? It seems pretty much unfixable for
> Python.

Mutable map keys are not disallowed by Ruby; you just have to
live with the (inconsistent) consequences if you do change them in
place.
(One of the many places where the philosophical differences between Ruby
and Python are coming out in plain view.)

Grüße, Carsten


From nobody Sun Dec  3 19:03:43 2017
Return-Path: <brian.e.carpenter@gmail.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 5845B126C26 for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 19:03:41 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2
X-Spam-Level: 
X-Spam-Status: No, score=-2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9,  DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id eNeztFEZw9mC for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 19:03:39 -0800 (PST)
Received: from mail-pg0-x230.google.com (mail-pg0-x230.google.com [IPv6:2607:f8b0:400e:c05::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 917EB126BF6 for <cbor@ietf.org>; Sun,  3 Dec 2017 19:03:39 -0800 (PST)
Received: by mail-pg0-x230.google.com with SMTP id j9so7219656pgc.11 for <cbor@ietf.org>; Sun, 03 Dec 2017 19:03:39 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;  h=subject:to:cc:references:from:organization:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=jMpBQSb39Ze2nc/n+6tzQS78548+7KBRJvUbCaLegJU=; b=lhSaXx/SJkp3RZMRHcKF/OOJapF9Yy4mr1vJeMZC4jjxI3EFqXLxYqwxg/Mm82tCo0 KhOIh8KNUPU6pN8IoKl5i+ioj10SISkOB/NAj6vgD6iDMDRCHulTrkJ7H3T2hzXSZXfr ffrFSXc23OssrIcSWnXFqYWmfzpzRAyyLaEXCIsdpV6sxQK7bQD6ICRJH++UhynOIqSw KPUrX1Rv0tLSkcIPAIOYA+B1UJfzG+OjN5qAzgv+LGEHgyGgG+0U4Om+nKTw20MCG4O5 l/kShQsJzkcJDZuCGoTvLkC70isoCYLRDh8mvBlrKN9cC1rWsai8u6zAkiO9tit/MzoN y+IA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:organization :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=jMpBQSb39Ze2nc/n+6tzQS78548+7KBRJvUbCaLegJU=; b=BVzMYTHcOQr2rIZakmhzn32PCoNtSolRBR1pgj3ke5bJ04CxPGMjN7uxSlB6spdtnw UGFpd9J9NPSSd1DuI73l40Sqkb7fND20dz3L5FE93XIaMV4O+X1OgJV2BfyviaWQNVbE 7cyASPrJfC9yBkAEj1W0G8/xj4cSFElexWvFauvQm2/tgqpGwS43epMvFFV39QBMhQxk ALzy4lTnxI7x37+y+Pw+zVT4DO5uWIlkCgxVnOsq15b1ogGMNGRexzfrRTUuBL0kAu+b Iff7wxewiBuTWrmI+dX9sUtsaULhERIr7D0c7QdXfW7+GwX2OI0Co2DE6ZirZ6sjFwky oDng==
X-Gm-Message-State: AJaThX5QIrFfswJ3pcUDPYLlIyuBDLqH6kv5vErkUAkM7mGjW7egL8Zt +dLi0jG1VHIMUSLposT1W0ippQ==
X-Google-Smtp-Source: AGs4zMae5jBpFyqbZhUZ2fSEGuGID+N/PCl7grkP/2/+BEcNFoNruRKgNsKAGUNHA1cZ5I0nVjPg4Q==
X-Received: by 10.84.160.197 with SMTP id v5mr12940313plg.206.1512356618552; Sun, 03 Dec 2017 19:03:38 -0800 (PST)
Received: from ?IPv6:2406:e007:6f17:1:28cc:dc4c:9703:6781? ([2406:e007:6f17:1:28cc:dc4c:9703:6781]) by smtp.gmail.com with ESMTPSA id p85sm20795449pfk.147.2017.12.03.19.03.35 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 03 Dec 2017 19:03:37 -0800 (PST)
To: Carsten Bormann <cabo@tzi.org>
Cc: cbor@ietf.org
References: <da387def-6f9a-774e-13b1-f861bfe6d833@gmail.com> <42C5D0EA-8986-4994-A078-7D0BC085EE9A@tzi.org> <3123c29e-0b9a-3f96-a92d-08f643229bc2@gmail.com> <CCBAE961-76ED-403D-BD35-8E6312ADC29E@tzi.org>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
Organization: University of Auckland
Message-ID: <6ab27e32-91f2-803b-8337-40ff5ee22643@gmail.com>
Date: Mon, 4 Dec 2017 16:03:33 +1300
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
MIME-Version: 1.0
In-Reply-To: <CCBAE961-76ED-403D-BD35-8E6312ADC29E@tzi.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/qTmm2xS3dY09oQ8Hkqt346idSIc>
Subject: Re: [Cbor] Maps used as keys?
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Dec 2017 03:03:41 -0000

On 04/12/2017 09:16, Carsten Bormann wrote:
> On Dec 3, 2017, at 20:06, Brian E Carpenter
> <brian.e.carpenter@gmail.com> wrote:
>>
>> On 03/12/2017 23:30, Carsten Bormann wrote:
>>> Hi Brian,
>>>
>>>> Does CBOR allow a map as a key in another map?
>>>
>>> Yes.
>>> Any data item can be used as a map key.
>>
>> Thanks. That was what the ABNF seemed to say.
>>
>>>> However, that is a completely unacceptable construct in Python
>>>> ("TypeError: unhashable type: 'dict'"); in other words Python
>>>> cannot use a map (dictionary) as a key in another map, and I
>>>> completely understand that.
>>>
>>> Using mutable data structures as map keys is somewhat icky in a
>>> hashed map implementation.
>>> Strictly speaking, you would need to re-hash a key each time it is
>>> changed (mutated) internally; but how does the map know that this
>>> has been done?
>>
>> Exactly. The hash needs to be invariant, but it can't be.
>>
>>> JavaScript only allows strings (which are immutable in JavaScript)
>>> as keys in “Object”s.
>>> Since that is limiting, it now also has “Map”s.
>>> However, indexing JavaScript “Map”s is by === (identity, not
>>> equality, with the weird exception that +0 and -0 are the same map
>>> key), so this punts on the problem that Python is apparently trying
>>> to avoid.
>>
>> So this suggests to me that while legal, this is a CBOR feature that
>> has serious issues if you try to use it in interoperable code.
>
> This sounds more like a problem with the Python implementation of CBOR
> you are using.
>
>> I wonder what
>> Python cbor.loads() would do with such a CBOR fragment?
>>
>> <pause for testing**>
>>
>> It crashes with "TypeError: unhashable type: 'dict'" of course.
>
> That’s not what it should be doing.
> It should be creating some frozen (immutable) data structure that *can*
> be used as a key, or define its own map type that can use more general
> keys.
>
> (If the generic data model of CBOR has corners that a generic CBOR
> decoder can’t represent, a generic implementation should be defining
> appropriate data structures.  E.g., cbor-ruby defines its own
> CBOR::Tagged and CBOR::Simple for Tags and Simple values not already
> supported by the language.  Of course, there are some weird cases: Ruby
> does not distinguish +0.0 and -0.0 as map keys, so strictly speaking
> cbor-ruby should define a CBOR::MinusZero or some such for this, but it
> doesn't.)
>

Both the cbor packages I've found for Python have this problem.  I'm not
even sure I know how to explain it in Pythonic language in a bug report;
you're asking Pythonistas to think the unthinkable :-).

>> ** I converted my fragment to valid CBOR using diag2cbor, then read it
>> into Python.
>>
>> How does Ruby avoid this problem? It seems pretty much unfixable for
>> Python.
>
> Mutable map keys are not disallowed by Ruby; you just have to live with
> the (inconsistent) consequences if you do change them in place.

Right. And that's why I'm going to try hard to insist that we avoid this
construct in the CBOR application in question
(draft-eckert-anima-grasp-dnssd-00). I think the inconsistency makes this
a poor design, regardless of the Python defect.

> (One of the many places where the philosophical differences between
> Ruby and Python are coming out in plain view.)

You can of course make a hash out of a Python dictionary, as
hash(str(my_dict)), but it isn't invariant when the dictionary is
updated, so it doesn't strike me as very useful. (And this issue has
consequences for the discussion of canonical ordering of maps.)
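
To make the point concrete, here is a minimal Python sketch. It is an
illustration only, not code from either cbor package, and `snapshot` is a
hypothetical helper name. It shows that a dict is rejected as a key, that
str() of two equal dicts can differ with insertion order, and that an
immutable snapshot (a frozenset of the items) does work as a key, assuming
the values themselves are hashable.

```python
# Illustrative only: plain dicts are unhashable, and hashing a string
# rendering depends on insertion order and on later updates.
inner = {"a": 1, "b": 2}

try:
    outer = {inner: "value"}
    dict_key_error = None
except TypeError as exc:
    dict_key_error = str(exc)   # message mentions an unhashable type

# str() reflects insertion order, so equal dicts can render differently.
same_content = {"b": 2, "a": 1}
renderings_differ = str(inner) != str(same_content)


def snapshot(d):
    """Return a hashable, order-independent snapshot of a flat dict.

    Assumes every value in `d` is itself hashable.
    """
    return frozenset(d.items())


# Equal dicts give equal snapshots, so either one finds the entry.
outer = {snapshot(inner): "value"}
found = outer[snapshot(same_content)]

# Later changes to `inner` do not disturb the key already stored.
inner["c"] = 3
still_found = outer[snapshot(same_content)]
```

The snapshot is taken once, so the stored key no longer tracks the
original dictionary; that is the price of an invariant hash.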

Thanks again
    Brian


From nobody Sun Dec  3 21:32:37 2017
Return-Path: <jyasskin@google.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 24DE4126B72 for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 21:32:36 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.699
X-Spam-Level: 
X-Spam-Status: No, score=-2.699 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=chromium.org
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Ui6lPhMSZKXr for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 21:32:33 -0800 (PST)
Received: from mail-it0-x236.google.com (mail-it0-x236.google.com [IPv6:2607:f8b0:4001:c0b::236]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id A1E06124D37 for <cbor@ietf.org>; Sun,  3 Dec 2017 21:32:33 -0800 (PST)
Received: by mail-it0-x236.google.com with SMTP id d16so3750371itj.1 for <cbor@ietf.org>; Sun, 03 Dec 2017 21:32:33 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=fDi+WVYRJ99y6S29oor6tssRnGLR9hyjBN5xcKAKJSk=; b=Y8A/2Xxo01o+81ZDQFtidKF+BVCCxPQEpXyv876PQGz4EIdzA3qcIyBlH/851NOYLY PufzHP+UkhfiZjGV6YrtRf1MHysjQtkdXpCWPmyD1HFD2kNbQWpjoXOID4C5rKvKMUNy Acc35a/TJ79ZZ9bG2odaizD9iOE+tbGJHfYBs=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=fDi+WVYRJ99y6S29oor6tssRnGLR9hyjBN5xcKAKJSk=; b=GIFJOaJnVGucVvCvjBkedQ33QbkjpWytaomukVtm7FYqxnFZAvA2aZEV+CfFBiscfi xArpukApyE2whQPdnLEiu8vwCj/YdJXi7Op5AVmjl37fzHEksEzrnKM3qyNb3dBofpnr uvsRTqOy5o7+YpNO0w0VmhClziFmmX5yF1+xRksCKZvTkyvGm9G7MFVkcH1PYVE7a7Fl i7shdIDLpsIXMjcRAlDAOJvvvaiZaip4DgpAGYLSn/FrywJa2pecO1gKiJ8mbG5RIciq avyuaxQXVBzuCdJ1vWk5VGPZcuZDEsc/w/g9YvWlO8vmdzu7sJvK1Dkj/eIN1DQ2Cdbp KAcg==
X-Gm-Message-State: AKGB3mKOXc6nX6NNt8/KYAUNf25Xyac2qaEsf1QtheCmW3wK1WVtFLAT fgBo1yxyeZS6427X/KU4OoXUWXrJIbeIm2fzkFmBt7QsLgw=
X-Google-Smtp-Source: AGs4zMY11K8l67FV0AebptVChjK9YzXiSYRAp0CV4zmZ8wEDhJ7K17du/6MtOB3sALKTs2RIsQo0T8m+y8nHsjBvSug=
X-Received: by 10.36.60.212 with SMTP id m203mr9358453ita.96.1512365552564; Sun, 03 Dec 2017 21:32:32 -0800 (PST)
MIME-Version: 1.0
References: <da387def-6f9a-774e-13b1-f861bfe6d833@gmail.com> <42C5D0EA-8986-4994-A078-7D0BC085EE9A@tzi.org> <3123c29e-0b9a-3f96-a92d-08f643229bc2@gmail.com> <CCBAE961-76ED-403D-BD35-8E6312ADC29E@tzi.org> <6ab27e32-91f2-803b-8337-40ff5ee22643@gmail.com>
In-Reply-To: <6ab27e32-91f2-803b-8337-40ff5ee22643@gmail.com>
From: Jeffrey Yasskin <jyasskin@chromium.org>
Date: Mon, 04 Dec 2017 05:32:19 +0000
Message-ID: <CANh-dXmd2AMPWcvimOvxSWqwdCm0_Lg_MROwRePiW4x7SPBf8A@mail.gmail.com>
To: brian.e.carpenter@gmail.com
Cc: Carsten Bormann <cabo@tzi.org>, cbor@ietf.org
Content-Type: multipart/alternative; boundary="001a114849408866f2055f7d0b26"
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/vhGfynXfH7nACqOrKwZ02u9o_Hs>
Subject: Re: [Cbor] Maps used as keys?
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Dec 2017 05:32:36 -0000

--001a114849408866f2055f7d0b26
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Sun, Dec 3, 2017 at 7:03 PM Brian E Carpenter <
brian.e.carpenter@gmail.com> wrote:

> On 04/12/2017 09:16, Carsten Bormann wrote:
> > On Dec 3, 2017, at 20:06, Brian E Carpenter
> > <brian.e.carpenter@gmail.com> wrote:
> >>
> >> On 03/12/2017 23:30, Carsten Bormann wrote:
> >>> Hi Brian,
> >>>
> >>>> Does CBOR allow a map as a key in another map?
> >>>
> >>> Yes.
> >>> Any data item can be used as a map key.
> >>
> >> Thanks. That was what the ABNF seemed to say.
> >>
> >>>> However, that is a completely unacceptable construct in Python
> >>>> ("TypeError: unhashable type: 'dict'"); in other words Python
> >>>> cannot use a map (dictionary) as a key in another map, and I
> >>>> completely understand that.
> >>>
> >>> Using mutable data structures as map keys is somewhat icky in a
> >>> hashed map implementation.
> >>> Strictly speaking, you would need to re-hash a key each time it is
> >>> changed (mutated) internally; but how does the map know that this
> >>> has been done?
> >>
> >> Exactly. The hash needs to be invariant, but it can't be.
> >>
> >>> JavaScript only allows strings (which are immutable in JavaScript)
> >>> as keys in “Object”s.
> >>> Since that is limiting, it now also has “Map”s.
> >>> However, indexing JavaScript “Map”s is by === (identity, not
> >>> equality, with the weird exception that +0 and -0 are the same map
> >>> key), so this punts on the problem that Python is apparently trying
> >>> to avoid.
> >>
> >> So this suggests to me that while legal, this is a CBOR feature that
> >> has serious issues if you try to use it in interoperable code.
> >
> > This sounds more like a problem with the Python implementation of
> > CBOR you are using.
> >
> >> I wonder what
> >> Python cbor.loads() would do with such a CBOR fragment?
> >>
> >> <pause for testing**>
> >>
> >> It crashes with "TypeError: unhashable type: 'dict'" of course.
> >
> > That’s not what it should be doing.
> > It should be creating some frozen (immutable) data structure that
> > *can* be used as a key, or define its own map type that can use more
> > general keys.
> >
> > (If the generic data model of CBOR has corners that a generic CBOR
> > decoder can’t represent, a generic implementation should be defining
> > appropriate data structures.  E.g., cbor-ruby defines its own
> > CBOR::Tagged and CBOR::Simple for Tags and Simple values not already
> > supported by the language.  Of course, there are some weird cases:
> > Ruby does not distinguish +0.0 and -0.0 as map keys, so strictly
> > speaking cbor-ruby should define a CBOR::MinusZero or some such for
> > this, but it doesn't.)
> >
>
> Both the cbor packages I've found for Python have this problem.  I'm not
> even sure I know how to explain it in Pythonic language in a bug report;
> you're asking Pythonistas to think the unthinkable :-).
>

Here are three possible ways to explain the idea of an immutable dictionary
to the theoretical Pythonista who doesn't understand the concept:

1. The language already has sets and frozensets, and multiple customized
dictionary types in the collections module. A frozendict is to a dict as a
frozenset is to a set.
2. namedtuples are basically frozen dictionaries plus an ordering on the
fields.
3. Any Python object is basically a dict with some custom methods, so any
immutable object is an immutable dict.
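
A small sketch of analogies 1 and 2, using only the standard library. The
`Point` type and its field names are made up for illustration; nothing
here is taken from a CBOR package.

```python
from collections import namedtuple

# 1. frozenset is the hashable counterpart of set, so it can be a key.
tags = frozenset({"x", "y"})
by_set = {tags: "keyed by a frozenset"}

# 2. A namedtuple behaves like a frozen record with ordered fields.
Point = namedtuple("Point", ["lat", "lon"])   # hypothetical fields
by_record = {Point(lat=1, lon=2): "keyed by a namedtuple"}

# Lookups go by value, not by object identity.
lookup_set = by_set[frozenset({"y", "x"})]
lookup_record = by_record[Point(1, 2)]

# Unlike a dict, a namedtuple cannot be changed in place.
try:
    Point(1, 2).lat = 5
    mutated = True
except AttributeError:
    mutated = False
```

Both keys are compared by content, which is the behaviour a decoder would
want from a frozen map type as well.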

Jeffrey

--001a114849408866f2055f7d0b26--


From nobody Sun Dec  3 23:42:58 2017
Return-Path: <cabo@tzi.org>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A4110126B7E for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 23:42:56 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.2
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id LMLVuxD7WDbw for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 23:42:54 -0800 (PST)
Received: from mailhost.informatik.uni-bremen.de (mailhost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 7422F126D73 for <cbor@ietf.org>; Sun,  3 Dec 2017 23:42:54 -0800 (PST)
X-Virus-Scanned: amavisd-new at informatik.uni-bremen.de
Received: from submithost.informatik.uni-bremen.de (submithost.informatik.uni-bremen.de [134.102.201.11]) by mailhost.informatik.uni-bremen.de (8.14.5/8.14.5) with ESMTP id vB47govV006648; Mon, 4 Dec 2017 08:42:50 +0100 (CET)
Received: from client-0007.vpn.uni-bremen.de (client-0007.vpn.uni-bremen.de [134.102.107.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by submithost.informatik.uni-bremen.de (Postfix) with ESMTPSA id 3yqxgG1Lc3zDWX0; Mon,  4 Dec 2017 08:42:50 +0100 (CET)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <6ab27e32-91f2-803b-8337-40ff5ee22643@gmail.com>
Date: Mon, 4 Dec 2017 08:42:49 +0100
Cc: cbor@ietf.org
X-Mao-Original-Outgoing-Id: 534066169.154964-af277a10106606d5d1162b8adf48fddd
Content-Transfer-Encoding: quoted-printable
Message-Id: <52016718-0051-4EE9-8B7E-B0CCAC88D219@tzi.org>
References: <da387def-6f9a-774e-13b1-f861bfe6d833@gmail.com> <42C5D0EA-8986-4994-A078-7D0BC085EE9A@tzi.org> <3123c29e-0b9a-3f96-a92d-08f643229bc2@gmail.com> <CCBAE961-76ED-403D-BD35-8E6312ADC29E@tzi.org> <6ab27e32-91f2-803b-8337-40ff5ee22643@gmail.com>
To: Brian E Carpenter <brian.e.carpenter@gmail.com>
X-Mailer: Apple Mail (2.3273)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/lqT9PvC2K3SqZN_X5MuHVEZfLCA>
Subject: Re: [Cbor] Maps used as keys?
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Dec 2017 07:42:56 -0000

On Dec 4, 2017, at 04:03, Brian E Carpenter
<brian.e.carpenter@gmail.com> wrote:
>
>> Mutable map keys are not disallowed by Ruby; you just have to live
>> with the (inconsistent) consequences if you do change them in place.
>
> Right. And that's why I'm going to try hard to insist that we avoid
> this construct in the CBOR application in question
> (draft-eckert-anima-grasp-dnssd-00). I think the inconsistency makes
> this a poor design, regardless of the Python defect.
>
>> (One of the many places where the philosophical differences between
>> Ruby and Python are coming out in plain view.)
>
> You can of course make a hash out of a Python dictionary, as
> hash(str(my_dict)), but it isn't invariant when the dictionary is
> updated, so it doesn't strike me as very useful. (And this issue has
> consequences for the discussion of canonical ordering of maps.)

Hmm.  CBOR is for the serialization of data.  Those data are immutable
in nature while being exchanged.

The fact that the most suggestive mapping of the CBOR data types to
Python data types uses mutable data structures, and that Python chooses
to expose some limitations in the implementation of these data
structures as limitations in their allowable structure, doesn’t strike
me as a very justifiable influence on the design of the CBOR data types.
(Note that the limitations exposed pertain to any use of non-trivial
types as map keys, not just maps as map keys; a rather severe
restriction.)

Python indeed happens to be a likely implementation language for
protocols, many of which might be using data items as map keys that
happen to be mapped to mutable data structures (maps, arrays) in Python.
Maybe it is worth exploring ways to fill in that gap in the Python CBOR
implementations.  Since I’m not very familiar with the preferred styles
of Python programming, I can’t help a lot with that; is there something
like an alist style data structure in Python that could be used for
representing maps with those keys that cannot be put in a Python dict?
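
For illustration, an alist-style structure could look roughly like the
following in Python.  This is only a sketch under my own assumptions:
`AList` and its methods are hypothetical names, not part of any existing
CBOR package, and lookup is a linear scan by equality rather than by hash.

```python
class AList:
    """Hypothetical association list: (key, value) pairs, linear lookup."""

    def __init__(self, pairs=()):
        self.pairs = list(pairs)

    def get(self, key, default=None):
        # Keys are compared with ==, so unhashable keys such as dicts
        # and lists are acceptable.
        for k, v in self.pairs:
            if k == key:
                return v
        return default

    def set(self, key, value):
        # Replace the value of an equal key, or append a new pair.
        for i, (k, _) in enumerate(self.pairs):
            if k == key:
                self.pairs[i] = (key, value)
                return
        self.pairs.append((key, value))


m = AList()
m.set({"a": 1}, "dict as key")
m.set([1, 2], "list as key")
m.set({"a": 1}, "replaced")     # equal key: the value is overwritten
```

Two caveats: lookup cost grows with the number of entries, and Python's
== treats 1, 1.0 and True as equal, so keys that differ only in that way
would collide in this sketch.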

Grüße, Carsten


From nobody Sun Dec  3 23:59:28 2017
Return-Path: <cabo@tzi.org>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id EDF82126E64 for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 23:59:26 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.2
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id bxzo1HW3yVa3 for <cbor@ietfa.amsl.com>; Sun,  3 Dec 2017 23:59:25 -0800 (PST)
Received: from mailhost.informatik.uni-bremen.de (mailhost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 51D84126B7E for <cbor@ietf.org>; Sun,  3 Dec 2017 23:59:25 -0800 (PST)
X-Virus-Scanned: amavisd-new at informatik.uni-bremen.de
Received: from submithost.informatik.uni-bremen.de (submithost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::b]) by mailhost.informatik.uni-bremen.de (8.14.5/8.14.5) with ESMTP id vB47xKYH020325; Mon, 4 Dec 2017 08:59:20 +0100 (CET)
Received: from client-0007.vpn.uni-bremen.de (client-0007.vpn.uni-bremen.de [134.102.107.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by submithost.informatik.uni-bremen.de (Postfix) with ESMTPSA id 3yqy2H6S8bzDWXX; Mon,  4 Dec 2017 08:59:19 +0100 (CET)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <CANh-dXmd2AMPWcvimOvxSWqwdCm0_Lg_MROwRePiW4x7SPBf8A@mail.gmail.com>
Date: Mon, 4 Dec 2017 08:59:19 +0100
Cc: brian.e.carpenter@gmail.com, cbor@ietf.org
X-Mao-Original-Outgoing-Id: 534067158.885291-59dd787af066facf56300617542e2559
Content-Transfer-Encoding: quoted-printable
Message-Id: <18F7238D-6F8E-4BDE-9160-3C5F71FF6882@tzi.org>
References: <da387def-6f9a-774e-13b1-f861bfe6d833@gmail.com> <42C5D0EA-8986-4994-A078-7D0BC085EE9A@tzi.org> <3123c29e-0b9a-3f96-a92d-08f643229bc2@gmail.com> <CCBAE961-76ED-403D-BD35-8E6312ADC29E@tzi.org> <6ab27e32-91f2-803b-8337-40ff5ee22643@gmail.com> <CANh-dXmd2AMPWcvimOvxSWqwdCm0_Lg_MROwRePiW4x7SPBf8A@mail.gmail.com>
To: Jeffrey Yasskin <jyasskin@chromium.org>
X-Mailer: Apple Mail (2.3273)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/gFY0XqjEkC232uafkjbKRGyHXmk>
Subject: Re: [Cbor] Maps used as keys?
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Dec 2017 07:59:27 -0000

On Dec 4, 2017, at 06:32, Jeffrey Yasskin <jyasskin@chromium.org> wrote:
>
> Here are three possible ways to explain the idea of an immutable
> dictionary to the theoretical Pythonista who doesn't understand the
> concept:
>
> 1. The language already has sets and frozensets, and multiple
> customized dictionary types in the collections module. A frozendict is
> to a dict as a frozenset is to a set.
> 2. namedtuples are basically frozen dictionaries plus an ordering on
> the fields.
> 3. Any Python object is basically a dict with some custom methods, so
> any immutable object is an immutable dict.

Right, but that approach may lead to a somewhat weird way of putting
together CBOR data structures.
If non-trivial data need to be transformed into a less natural data
structure in order to be usable as map keys, the limitation on the map
turns into a transformation of the map key: when you extract that map
key and use it somewhere else, it may have a different shape than
expected.
I’d prefer to do this in a way that doesn’t have non-local influences
beyond overcoming the limitations of the map representation itself.

(In Ruby you can simply “freeze” a data structure before putting that in
as a map key, preventing any surprises that you would get when that data
structure is mutated while serving as a map key.  A frozen structure
retains all the interfaces for data access, including proper equality.
Not sure it’s as simple as that in Python.)
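
One way this might be approximated in Python is a read-only mapping that
keeps the usual access interface.  The sketch below is an assumption on
my part, not an existing library type: `FrozenMap` is a hypothetical
name, and it requires the stored values to be hashable.

```python
from collections.abc import Mapping


class FrozenMap(Mapping):
    """Hypothetical read-only, hashable mapping (not a library type)."""

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)
        self._hash = None

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __hash__(self):
        # Order-independent hash; cached because the contents are not
        # changed after construction.  Values must be hashable.
        if self._hash is None:
            self._hash = hash(frozenset(self._data.items()))
        return self._hash

    def __repr__(self):
        return "FrozenMap(%r)" % (self._data,)


key = FrozenMap({"a": 1, "b": 2})
table = {key: "value"}

# Lookup by an equal key, and ordinary mapping access on the key itself.
value = table[FrozenMap({"b": 2, "a": 1})]
field = key["a"]
```

Equality comes from the Mapping base class, so a key extracted from the
map still compares equal to a plain dict with the same content and can be
read like one.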

Grüße, Carsten


From nobody Mon Dec  4 08:39:21 2017
Return-Path: <jyasskin@google.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 98FC7126BF7 for <cbor@ietfa.amsl.com>; Mon,  4 Dec 2017 08:39:20 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.999
X-Spam-Level: 
X-Spam-Status: No, score=-1.999 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=chromium.org
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id iszC01cer39e for <cbor@ietfa.amsl.com>; Mon,  4 Dec 2017 08:39:19 -0800 (PST)
Received: from mail-it0-x22b.google.com (mail-it0-x22b.google.com [IPv6:2607:f8b0:4001:c0b::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id D2CC4124239 for <cbor@ietf.org>; Mon,  4 Dec 2017 08:39:18 -0800 (PST)
Received: by mail-it0-x22b.google.com with SMTP id d16so8096809itj.1 for <cbor@ietf.org>; Mon, 04 Dec 2017 08:39:18 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=XBPjPt3Uws32TZ8gtUhLwEt9s97Ly0ryNRt4iBHf6kk=; b=VQ+Uv2vww3OMcsMBVWSHHfGpW7VbglYgKgxZHMEEvbDTkU4ej0X01lqpPgcyVAU8NI FZraBlb+fC/w087b8UtqkE/FwbxDsQV9rdafrayNqETKuaDGGVi5MbxR01SntPFkvqW/ 3Eu61yI69FDcxHnVatb3S0gtXdw8e/dJeF744=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=XBPjPt3Uws32TZ8gtUhLwEt9s97Ly0ryNRt4iBHf6kk=; b=i+5JLP67xF9SgVwNDKVWFOika0tQ9u3ZyZE1BJKZKw/socxECvZu20pcACKxO0nz/R Vd3osDKwBiqIPX38AzHLOG0Lg9ZlW5aGqEQiRcoUD5hCnZHEaiCC8LHCEiWfVbQje/4e d7jpPg+8uGCoqPQsTU+maRM0cI2P48hJvX2f1pvbN4zopOZVwx2OdyB+VpxECCmVvf0e 3Vi/S7QoECh8EtumOGiOO2n/P13F+vCy0JxOpPf87FpPxbQivIsRQDjbJQF0SqAVlEWj ocN2snHIzQBJMFwKYkzSzJqGVITJyM1Ao11PMZ85DC3K7Dg4RAwlvAHGm/LyaO8k78Sa jwww==
X-Gm-Message-State: AJaThX7O6iGVImpxz8D9U1Al/lSjXOayVXnPPLXMDUf4zaGQMjj44mLR SPFdnFqRFGlqeKYWxrVJjqcdpVdOk07oTyCexbqQhg==
X-Google-Smtp-Source: AGs4zMackkVPGVzs6nO18LlmBG0tfp17+MHH/WcuCf81InnFg2hcecOwWuqm29O7ZYFz5H8y/dhJ4DNftBM5Q/Esryk=
X-Received: by 10.107.16.206 with SMTP id 75mr25057239ioq.83.1512405557674; Mon, 04 Dec 2017 08:39:17 -0800 (PST)
MIME-Version: 1.0
References: <da387def-6f9a-774e-13b1-f861bfe6d833@gmail.com> <42C5D0EA-8986-4994-A078-7D0BC085EE9A@tzi.org> <3123c29e-0b9a-3f96-a92d-08f643229bc2@gmail.com> <CCBAE961-76ED-403D-BD35-8E6312ADC29E@tzi.org> <6ab27e32-91f2-803b-8337-40ff5ee22643@gmail.com> <CANh-dXmd2AMPWcvimOvxSWqwdCm0_Lg_MROwRePiW4x7SPBf8A@mail.gmail.com> <18F7238D-6F8E-4BDE-9160-3C5F71FF6882@tzi.org>
In-Reply-To: <18F7238D-6F8E-4BDE-9160-3C5F71FF6882@tzi.org>
From: Jeffrey Yasskin <jyasskin@chromium.org>
Date: Mon, 04 Dec 2017 16:39:03 +0000
Message-ID: <CANh-dX=aXKo2Xhkbk9m3MD8hOo--vcQ1D5hQsFoDzHQoM5-5nA@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: Jeffrey Yasskin <jyasskin@chromium.org>, brian.e.carpenter@gmail.com, cbor@ietf.org
Content-Type: multipart/alternative; boundary="001a113eda96065707055f865c5f"
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/1r4k_FE3NqVMRw5PoBixjha_4xU>
Subject: Re: [Cbor] Maps used as keys?
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Dec 2017 16:39:20 -0000

--001a113eda96065707055f865c5f
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Sun, Dec 3, 2017 at 11:59 PM Carsten Bormann <cabo@tzi.org> wrote:

> On Dec 4, 2017, at 06:32, Jeffrey Yasskin <jyasskin@chromium.org> wrote:
> >
> > Here are three possible ways to explain the idea of an immutable
> dictionary to the theoretical Pythonista who doesn't understand the conce=
pt:
> >
> > 1. The language already has sets and frozensets, and multiple customize=
d
> dictionary types in the collections module. A frozendict is to a dict as =
a
> frozenset is to a set.
> > 2. namedtuples are basically frozen dictionaries plus an ordering on th=
e
> fields.
> > 3. Any Python object is basically a dict with some custom methods, so
> any immutable object is an immutable dict.
>
> Right, but that approach may lead to a somewhat weird way of putting
> together CBOR data structures.
> If non-trivial data need to be transformed into a less natural data
> structure in order to be usable as map keys, the limitation on the map
> turns into a transformation of the map key =E2=80=94 when you extract tha=
t map key
> and use it somewhere else, it has a different shape than maybe is expecte=
d.
> I=E2=80=99d prefer to do this in a way that doesn=E2=80=99t have non-loca=
l influences
> beyond overcoming the limitations of the map representation itself.
>
> (In Ruby you can simply =E2=80=9Cfreeze=E2=80=9D a data structure before =
putting that in
> as a map key, preventing any surprises that you would get when that data
> structure is mutated while serving as a map key.  A frozen structure
> retains all the interfaces for data access, including proper equality.  N=
ot
> sure it=E2=80=99s as simple as than in Python.)
>

I haven't used it, but https://pypi.python.org/pypi/frozendict/1.2 looks
like the right type to return from CBOR deserialization. For serialization,
it probably makes sense to use the interface in
https://docs.python.org/3/library/pickle.html#pickling-class-instances to
pull fields out of an object, rather than forcing folks to transform those
objects into a CBOR type first. Then immutable objects used as the keys in
a dict naturally work.

Jeffrey


--001a113eda96065707055f865c5f--


From nobody Mon Dec  4 11:30:16 2017
Return-Path: <brian.e.carpenter@gmail.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C715F126DFF for <cbor@ietfa.amsl.com>; Mon,  4 Dec 2017 11:30:15 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2
X-Spam-Level: 
X-Spam-Status: No, score=-2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9,  DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 5b3SDGId_4vo for <cbor@ietfa.amsl.com>; Mon,  4 Dec 2017 11:30:14 -0800 (PST)
Received: from mail-pg0-x22a.google.com (mail-pg0-x22a.google.com [IPv6:2607:f8b0:400e:c05::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 56E79128799 for <cbor@ietf.org>; Mon,  4 Dec 2017 11:30:13 -0800 (PST)
Received: by mail-pg0-x22a.google.com with SMTP id o2so8889766pgc.8 for <cbor@ietf.org>; Mon, 04 Dec 2017 11:30:13 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;  h=subject:to:cc:references:from:organization:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=2nVwzIzOpqpohdo2ri93IADxNNKy3/as5B3lzOeVFCY=; b=dvpW9o9htkrZUNa11/Czm9C3K3Vb+zYHvc+GSzhjXjDYLUkQKUQJ4s533RJZH2dr/M j4alcgnJFMF5K02cB1o9xfeJozkrMG6RoOhIakwMTuafrH22JY5NzLcCV9gMBmShAGD1 Bwiz25ABGZ/RVdavBSP0WlQ5wKMOywR9VA+XbcsX+HFmom6iaYVmgZ86R+pV2CXrpmGb XgOj1TmfcN0chsw5wLTw176ApUGGrMiF/cGSm0ai5RipbvVuqF0mNtvZyaxJTSNFawFD l1/KXGysDP4VYeVyeRBhDaTbpe/5mFU8pqHd1TGCxphEuKdaTyof3DyZQTb0KBaxQYC5 eB2w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:organization :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=2nVwzIzOpqpohdo2ri93IADxNNKy3/as5B3lzOeVFCY=; b=Kc/YvePUXA/qQhbzRHyKI0wHlka9yM/0YcRczNcKgc96hPYSKjiDSBBwx5aga02X6q gB+IX6JvWbiRpvd1np2LDrdKngLg7uTM9C+Zf8viRRxdn/MiHaHxveo2sYGEhfaRDOnm myp2FXYdw2qNNsY8QR+V5O4QHw7slERmdmL2kdbuf2Q42NCTSQCPqSIbGcPI83ffjUYs KdQxay3+C7+Xg0gBH35pI8KdPj382++Q8fAr9Qcu0jut3zNH6NACYE9wtM5tKvRyzfxq hgNkJ6CFW6ht3KH386/iEGUJc1pPEuJsZAGWPBJ4cKEMxuhoqzoEAL3EhXmEmPvPDD82 CEDA==
X-Gm-Message-State: AJaThX6UxgyoZtEZfmHnC4Xu4rWbmmEoPTGoJzdAWv/kNbHnIIqJ+9yF bakf34We8UtT7pSw9sVLvKIF0Q==
X-Google-Smtp-Source: AGs4zMZK1LsThfa/K434Yo/TnLTpy4qYPXGHKHUgvNSlkNsKAsTZ9DL6c18ds7LgA30c5dO4oLltpw==
X-Received: by 10.84.165.171 with SMTP id y40mr16021765pla.362.1512415812284;  Mon, 04 Dec 2017 11:30:12 -0800 (PST)
Received: from ?IPv6:2406:e007:6f17:1:28cc:dc4c:9703:6781? ([2406:e007:6f17:1:28cc:dc4c:9703:6781]) by smtp.gmail.com with ESMTPSA id f4sm22449362pgo.1.2017.12.04.11.30.09 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 04 Dec 2017 11:30:11 -0800 (PST)
To: Carsten Bormann <cabo@tzi.org>
Cc: cbor@ietf.org
References: <da387def-6f9a-774e-13b1-f861bfe6d833@gmail.com> <42C5D0EA-8986-4994-A078-7D0BC085EE9A@tzi.org> <3123c29e-0b9a-3f96-a92d-08f643229bc2@gmail.com> <CCBAE961-76ED-403D-BD35-8E6312ADC29E@tzi.org> <6ab27e32-91f2-803b-8337-40ff5ee22643@gmail.com> <52016718-0051-4EE9-8B7E-B0CCAC88D219@tzi.org>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
Organization: University of Auckland
Message-ID: <5cb58b94-3e44-9c8b-4e2c-6f554f5d0d65@gmail.com>
Date: Tue, 5 Dec 2017 08:30:09 +1300
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
MIME-Version: 1.0
In-Reply-To: <52016718-0051-4EE9-8B7E-B0CCAC88D219@tzi.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/XhpyYGXHIu4i00mha590Kbggnf8>
Subject: Re: [Cbor] Maps used as keys?
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Dec 2017 19:30:15 -0000

On 04/12/2017 20:42, Carsten Bormann wrote:
> On Dec 4, 2017, at 04:03, Brian E Carpenter <brian.e.carpenter@gmail.co=
m> wrote:
>>
>>> Mutable map keys are not disallowed by Ruby =E2=80=94 you just have t=
o live with the (inconsistent) consequences if you do change them in plac=
e.
>>
>> Right. And that's why I'm going to try hard to insist that we avoid th=
is construct in the CBOR application in question (draft-eckert-anima-gras=
p-dnssd-00). I think the inconsistency makes this a poor design, regardle=
ss of the Python defect.
>>
>>> (One of the many places where the philosophical differences between R=
uby and Python are coming out in plain view.)=20
>>
>> You can of course make a hash out of a Python dictionary, as hash(str(=
my_dict)), but it isn't invariant when the dictionary is updated, so it d=
oesn't strike me as very useful. (And this issue has consequences for the=
 discussion of canonical ordering of maps.)
>=20
> Hmm.  CBOR is for the serialization of data.  Those data are immutable =
in nature while being exchanged.

Yes. That's why using frozen or 'pickled' versions of a Python data struc=
ture makes sense. You can even do this:

class mydict(dict):
    def __hash__(self):
        return hash(str(self))

and then any object of type mydict is a hashable dictionary. But as far a=
s I can tell, you can't add a __hash__() method to a built-in type.=20
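
One caveat with hash(str(self)): two dictionaries that compare equal can
print their items in different orders, and then they hash differently.
A sketch that avoids this, assuming all values are hashable (the class
name is made up, and mutating a key after insertion is still unsafe):

```python
class HashableDict(dict):
    """A dict whose hash ignores item order (values must be hashable)."""

    def __hash__(self):
        # frozenset is itself hashable and does not depend on item order.
        return hash(frozenset(self.items()))
```

An instance can then be used as a key, e.g.
{HashableDict({"a": 1}): "value"}.
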
> The fact that the most suggestive mapping of the CBOR data types to Pyt=
hon data types is using mutable data structures, and that Python chooses =
to expose some limitations in the implementation of these data structures=
 as limitations in their allowable structure, doesn=E2=80=99t strike me a=
s a very justifiable influence on the design of the CBOR data types.  (No=
te that the limitations exposed pertain to any use of non-trivial types a=
s map keys, not just maps as map keys; a rather severe restriction.)

Well, it's a matter of taste, I guess. As you pointed out, there are cons=
equences when the key changes, anyway, regardless of programming language=
=2E

> Python indeed happens to be a likely implementation language for protoc=
ols, many of which might be using data items as map keys that happen to b=
e mapped to mutable dats structures (maps, arrays) in Python.  Maybe it i=
s worth exploring ways to fill in that gap in the Python CBOR implementat=
ions.  Since I=E2=80=99m not very familiar with the preferred styles of P=
ython programming, I can=E2=80=99t help a lot with that; is there somethi=
ng like an alist style data structure in Python that could be used for re=
presenting maps with those keys that cannot be put in a Python dict?

I will mull over what Jeffrey has said, although it's stretching my own P=
ython skills a bit. I also need to look at another couple of CBOR impleme=
ntations that I've never tried, to see if they have the same problem.

Ideally we'd have the Python implementers in this discussion.

Regards
   Brian


From nobody Mon Dec  4 14:19:08 2017
Return-Path: <eckert@i4.informatik.uni-erlangen.de>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id EAF6C126BF3 for <cbor@ietfa.amsl.com>; Mon,  4 Dec 2017 14:19:06 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.199
X-Spam-Level: 
X-Spam-Status: No, score=-4.199 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.001, RCVD_IN_DNSWL_MED=-2.3] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cHnKXsTiH6zI for <cbor@ietfa.amsl.com>; Mon,  4 Dec 2017 14:19:04 -0800 (PST)
Received: from faui40.informatik.uni-erlangen.de (faui40.informatik.uni-erlangen.de [IPv6:2001:638:a000:4134::ffff:40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 9D6DA128D0F for <cbor@ietf.org>; Mon,  4 Dec 2017 14:18:49 -0800 (PST)
Received: from faui40p.informatik.uni-erlangen.de (faui40p.informatik.uni-erlangen.de [IPv6:2001:638:a000:4134::ffff:77]) by faui40.informatik.uni-erlangen.de (Postfix) with ESMTP id 5F94D58C5B3; Mon,  4 Dec 2017 23:18:45 +0100 (CET)
Received: by faui40p.informatik.uni-erlangen.de (Postfix, from userid 10463) id 4A0ACB0D37B; Mon,  4 Dec 2017 23:18:45 +0100 (CET)
Date: Mon, 4 Dec 2017 23:18:45 +0100
From: Toerless Eckert <tte@cs.fau.de>
To: Carsten Bormann <cabo@tzi.org>
Cc: Brian E Carpenter <brian.e.carpenter@gmail.com>, cbor@ietf.org
Message-ID: <20171204221845.GA1942@faui40p.informatik.uni-erlangen.de>
References: <da387def-6f9a-774e-13b1-f861bfe6d833@gmail.com> <42C5D0EA-8986-4994-A078-7D0BC085EE9A@tzi.org> <3123c29e-0b9a-3f96-a92d-08f643229bc2@gmail.com> <CCBAE961-76ED-403D-BD35-8E6312ADC29E@tzi.org> <6ab27e32-91f2-803b-8337-40ff5ee22643@gmail.com> <52016718-0051-4EE9-8B7E-B0CCAC88D219@tzi.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <52016718-0051-4EE9-8B7E-B0CCAC88D219@tzi.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/a_-A6ik_nMIyfRH-K_PTEqNNIcM>
Subject: Re: [Cbor] Maps used as keys?
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Dec 2017 22:19:07 -0000

When I dynamically process data structures and need a map as a key, I would
use an (object) reference to it as the key to make it invariant.  But there are
no references in CBOR AFAIK, so if I used a reference to an object in multiple
parts of my compound data structure, the CBOR encoding would likely end
up serializing the referenced structure (like a map) separately every time (waste),
unless I come up with application workarounds (references = names, create a map of
names to objects, ...).
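
A minimal Python sketch of that "references = names" workaround, under some
assumptions: the function and field names are invented for illustration, and
the standard json module stands in for a CBOR encoder only to make the
duplication visible.

```python
import json


def externalize(records, field):
    """Replace the object under `field` by a name; return (table, records).

    `table` maps each generated name to the shared object, so every
    distinct object is emitted once however often it is referenced.
    """
    table = {}   # name -> object
    names = {}   # id(object) -> name already assigned
    slim = []
    for record in records:
        target = record[field]
        name = names.get(id(target))
        if name is None:
            name = "ref%d" % len(table)
            names[id(target)] = name
            table[name] = target
        slim.append({**record, field: name})
    return table, slim


shared = {"service": "printer", "domain": "example.org"}
records = [{"id": 1, "ctx": shared}, {"id": 2, "ctx": shared}]

# Naive serialization writes the shared map once per reference.
naive = json.dumps(records)

# With a name table, the shared map is written once.
table, slim = externalize(records, "ctx")
compact = json.dumps({"objects": table, "records": slim})
```

The receiver has to resolve the names again, so the reference semantics live
in the application rather than in the encoding.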

I cannot remember whether I have seen serialization libraries that solved this problem
well, but I have really only used Perl with some XML serialization libraries in the distant
past. I think they all ended up unable to produce serialization formats that
include references, and instead serialized referenced structures/objects
every time they encountered a reference to them. But I haven't looked into that
for a long time (which is obvious from my mentioning "Perl" instead of "Python").

Toerless

On Mon, Dec 04, 2017 at 08:42:49AM +0100, Carsten Bormann wrote:
> On Dec 4, 2017, at 04:03, Brian E Carpenter <brian.e.carpenter@gmail.com> wrote:
> >
> >> Mutable map keys are not disallowed by Ruby -- you just have to live with the (inconsistent) consequences if you do change them in place.
> >
> > Right. And that's why I'm going to try hard to insist that we avoid this construct in the CBOR application in question (draft-eckert-anima-grasp-dnssd-00). I think the inconsistency makes this a poor design, regardless of the Python defect.
> >
> >> (One of the many places where the philosophical differences between Ruby and Python are coming out in plain view.)
> >
> > You can of course make a hash out of a Python dictionary, as hash(str(my_dict)), but it isn't invariant when the dictionary is updated, so it doesn't strike me as very useful. (And this issue has consequences for the discussion of canonical ordering of maps.)
>
> Hmm.  CBOR is for the serialization of data.  Those data are immutable in nature while being exchanged.
>
> The fact that the most suggestive mapping of the CBOR data types to Python data types is using mutable data structures, and that Python chooses to expose some limitations in the implementation of these data structures as limitations in their allowable structure, doesn't strike me as a very justifiable influence on the design of the CBOR data types.  (Note that the limitations exposed pertain to any use of non-trivial types as map keys, not just maps as map keys; a rather severe restriction.)
>
> Python indeed happens to be a likely implementation language for protocols, many of which might be using data items as map keys that happen to be mapped to mutable data structures (maps, arrays) in Python.  Maybe it is worth exploring ways to fill in that gap in the Python CBOR implementations.  Since I'm not very familiar with the preferred styles of Python programming, I can't help a lot with that; is there something like an alist style data structure in Python that could be used for representing maps with those keys that cannot be put in a Python dict?
> 
> Grüße, Carsten
> 
> _______________________________________________
> CBOR mailing list
> CBOR@ietf.org
> https://www.ietf.org/mailman/listinfo/cbor


From nobody Mon Dec  4 14:46:05 2017
Return-Path: <eckert@i4.informatik.uni-erlangen.de>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 0EDE7126BF3 for <cbor@ietfa.amsl.com>; Mon,  4 Dec 2017 14:46:04 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.199
X-Spam-Level: 
X-Spam-Status: No, score=-4.199 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.001, RCVD_IN_DNSWL_MED=-2.3] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 50SNjHCNCkEC for <cbor@ietfa.amsl.com>; Mon,  4 Dec 2017 14:46:02 -0800 (PST)
Received: from faui40.informatik.uni-erlangen.de (faui40.informatik.uni-erlangen.de [IPv6:2001:638:a000:4134::ffff:40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 58420128D0F for <cbor@ietf.org>; Mon,  4 Dec 2017 14:46:02 -0800 (PST)
Received: from faui40p.informatik.uni-erlangen.de (faui40p.informatik.uni-erlangen.de [IPv6:2001:638:a000:4134::ffff:77]) by faui40.informatik.uni-erlangen.de (Postfix) with ESMTP id 1201B58C5B9 for <cbor@ietf.org>; Mon,  4 Dec 2017 23:45:58 +0100 (CET)
Received: by faui40p.informatik.uni-erlangen.de (Postfix, from userid 10463) id ECC3CB0D37B; Mon,  4 Dec 2017 23:45:57 +0100 (CET)
Date: Mon, 4 Dec 2017 23:45:57 +0100
From: Toerless Eckert <tte@cs.fau.de>
To: cbor@ietf.org
Message-ID: <20171204224557.GB1942@faui40p.informatik.uni-erlangen.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/cUO1Ahjt6xBrIQknx7Uq8zrdhfk>
Subject: [Cbor] CDDL "type definition" vs instance definition (was: Re: Maps used as keys?)
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Dec 2017 22:46:04 -0000

The discussion from Brian started with me trying to do something which I am not
sure CDDL supports well - or how.  So I am looking for suggestions on how best to do it:

We want to standardize for GRASP protocols a map where keys are uint, every
assigned key gets registered in an IANA registry, and the type and definition of
the key then of course depend on the spec of that key, which could be in multiple RFCs.

Aka: Good ol' TLV registry with CBOR/CDDL.

So the first definition was trying to specify this, see below.

The [2] lines are actual key definitions, for key 1 (named sender-loop-count) and key 2 
(named srv-element). The lines [1] are meant to express the intended typing, aka:
map with uint key and per-key value. 

But... I have problems with [1]: if I include it as specified, then this means
that every possible uint is a valid key, with every possible value for every possible key, which
is not true.

Now, if I just removed the [1] lines, I would no longer have a definition saying that
relement needs to be a "registered" uint...
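
To make the problem concrete, here is a rough Python sketch of the three
readings. It is not a CDDL implementation: REGISTRY is a hypothetical stand-in
for the IANA registry, the helper names are invented, and the check for
srv-element is simplified to "is a map".

```python
def is_uint(x):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(x, int) and not isinstance(x, bool) and x >= 0


# Hypothetical stand-in for the IANA registry: code point -> value check.
REGISTRY = {
    1: lambda v: is_uint(v) and 1 <= v <= 255,   # sender-loop-count
    2: lambda v: isinstance(v, dict),            # srv-element (simplified)
}


def entry_generic(k, v):
    """The [1] lines alone: any uint key, any value."""
    return is_uint(k)


def entry_registered(k, v):
    """The [2] lines alone: registered key, value fits that key's rule."""
    return is_uint(k) and k in REGISTRY and REGISTRY[k](v)


def entry_as_written(k, v):
    """[1] and [2] as alternatives of one rule: the generic choice wins."""
    return entry_generic(k, v) or entry_registered(k, v)


def matches(m, entry_ok):
    """A map matches if it has at least one entry and all entries fit."""
    return len(m) >= 1 and all(entry_ok(k, v) for k, v in m.items())
```

For example, matches({99: "x"}, entry_as_written) and
matches({1: 300}, entry_as_written) both succeed, while the same maps fail
with entry_registered.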

So... how to best fix this ?

Thanks
    Toerless


objective-value  /= { 1*elements }
elements        //= ( @rfcXXXX: { 1*relement } )

relement  = ( relement-codepoint => relement-value ) ; [1]
relement-codepoint = uint                            ; [1]
relement-value     = any                             ; [1]

relement //= ( &(sender-loop-count:1) => 1..255 )    ; [2]
relement //= ( &(srv-element:2) => context-element ) ; [2]

context-element  =  {
     ?( &(private:0)      => any),
     ?( &(msg-type:1)     => msg-type),
     ?( &(service:2)      => tstr),
     *( &(instance:3)     => tstr),
     ?( &(domain:4)       => tstr),
     ?( &(priority:5)     => 0..65535 ),
     ?( &(weight:6)       => 0..65535 ),
     *( &(kvpairs:7)      => { *(tstr: any) }),
    }


From nobody Mon Dec  4 15:09:14 2017
Return-Path: <cabo@tzi.org>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 3EDBD128D44 for <cbor@ietfa.amsl.com>; Mon,  4 Dec 2017 15:09:13 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.2
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id xPaEs68rA8eL for <cbor@ietfa.amsl.com>; Mon,  4 Dec 2017 15:09:11 -0800 (PST)
Received: from mailhost.informatik.uni-bremen.de (mailhost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 905BB128D40 for <cbor@ietf.org>; Mon,  4 Dec 2017 15:09:11 -0800 (PST)
X-Virus-Scanned: amavisd-new at informatik.uni-bremen.de
Received: from submithost.informatik.uni-bremen.de (submithost.informatik.uni-bremen.de [134.102.201.11]) by mailhost.informatik.uni-bremen.de (8.14.5/8.14.5) with ESMTP id vB4N96dg010831; Tue, 5 Dec 2017 00:09:06 +0100 (CET)
Received: from [192.168.217.124] (p5DC7E827.dip0.t-ipconnect.de [93.199.232.39]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by submithost.informatik.uni-bremen.de (Postfix) with ESMTPSA id 3yrLD1671nzDWkx; Tue,  5 Dec 2017 00:09:05 +0100 (CET)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <20171204224557.GB1942@faui40p.informatik.uni-erlangen.de>
Date: Tue, 5 Dec 2017 00:09:04 +0100
Cc: cbor@ietf.org
X-Mao-Original-Outgoing-Id: 534121744.709309-600fc9850513f9156b366057e492b25a
Content-Transfer-Encoding: quoted-printable
Message-Id: <A833BFCF-6EA1-496F-B5F1-8C5C126CFE67@tzi.org>
References: <20171204224557.GB1942@faui40p.informatik.uni-erlangen.de>
To: Toerless Eckert <tte@cs.fau.de>
X-Mailer: Apple Mail (2.3273)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/V6I_d_SCapL5Aty1OFxruertpO0>
Subject: Re: [Cbor] CDDL "type definition" vs instance definition (was: Re: Maps used as keys?)
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Dec 2017 23:09:13 -0000

On Dec 4, 2017, at 23:45, Toerless Eckert <tte@cs.fau.de> wrote:
>=20
> So... how to best fix this ?

This is what .within was invented for.

I=E2=80=99d write something like


foo =3D { 1*relement }
relement =3D $$relement .within relement-generic

relement-generic =3D ( relement-codepoint =3D> relement-value ) ; [1]
relement-codepoint =3D uint                            ; [1]
relement-value     =3D any                             ; [1]

$$relement //=3D ( &(sender-loop-count:1) =3D> 1..255 )    ; [2]
$$relement //=3D ( &(srv-element:2) =3D> context-element ) ; [2]
context-element =3D "and so on"


=E2=80=A6 except that .within is for types, not for groups. =20
So we have to do this in a bit more circuitous way:


foo =3D foo-specific .within foo-generic
foo-specific =3D { 1*$$relement }
foo-generic =3D { 1*relement-generic }

relement-generic =3D ( relement-codepoint =3D> relement-value ) ; [1]
relement-codepoint =3D uint                            ; [1]
relement-value     =3D any                             ; [1]

$$relement //=3D ( &(sender-loop-count:1) =3D> 1..255 )    ; [2]
$$relement //=3D ( &(srv-element:2) =3D> context-element ) ; [2]
context-element =3D "and so on"


Gr=C3=BC=C3=9Fe, Carsten


From nobody Fri Dec 15 17:05:06 2017
Return-Path: <jyasskin@google.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 902B11201FA for <cbor@ietfa.amsl.com>; Fri, 15 Dec 2017 17:05:04 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.698
X-Spam-Level: 
X-Spam-Status: No, score=-2.698 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=chromium.org
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id phMisd7cL8fT for <cbor@ietfa.amsl.com>; Fri, 15 Dec 2017 17:05:01 -0800 (PST)
Received: from mail-io0-x22e.google.com (mail-io0-x22e.google.com [IPv6:2607:f8b0:4001:c06::22e]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 75930128D0F for <cbor@ietf.org>; Fri, 15 Dec 2017 17:05:01 -0800 (PST)
Received: by mail-io0-x22e.google.com with SMTP id n41so4427920ioe.1 for <cbor@ietf.org>; Fri, 15 Dec 2017 17:05:01 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=XoXsRVKYFVt0+brzKHU5tx+RUmSAeFqN0hPu6bKei3E=; b=hOR49cDycTvSGa9EkUDLzZLzuTg69B6s+EBAKISDRhzQcx5N4m00bkBDsl74zl6LE6 9nYNK92sUJlrFtBp4rIAN1mESgsZVgLkvVE9f1H+sk7f0WzLKnIFUVIi0B+MHkCbtzYQ JbwNoPMt/Ad6BWvNC4e/XWtn3d1Ym3/yRKPSM=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=XoXsRVKYFVt0+brzKHU5tx+RUmSAeFqN0hPu6bKei3E=; b=XPJs0+vO8JxLL5TZNrVYlW2bQlcHc2gBaYA58X33wDhF7ZOYsLGNUk88iQG2KCi13l fbwUWn2E/p7hxNj2k1B3dfWNmnQdHFDj12TkjtIjDT2n9G0iTOUR7Qfd6qX3iFCcWt6z DQZEK66K7K/HCvmAMRLCeKwEL4c+yofb43MiUmaRmMhUADzUvmlySSTbRRz5DJX0sL9g DqmgDGx/wt9ZwgYcAYEZC36nbd+gwJvvudIdNk/VPtHn77WUfFQInYGEU15QHmhzxeYO nbw33CUI0Vk1K4cQzZ5wnHXZ+ZCre+GmOlbN2rUXAtbVeARNnX+2rrh8/ohW/chN+J+E ngIw==
X-Gm-Message-State: AKGB3mK0wp89yEpBUAx2yAvsWQ/MmikF9w8gcE3BgeJLo1oXkBrORU9J ejGOjR8zeS4Xv5W7CX6jO6ybO/jFuSndWWv7DtGWww==
X-Google-Smtp-Source: ACJfBosYLoQLtAt/bbLTC+zf4fkMOZ8ONfg1itjL3L7/N9tuQKl57NS4jiVLVbqd6idWoz/SSTE8YffpeUWHYXGLJzM=
X-Received: by 10.107.35.140 with SMTP id j134mr11647438ioj.166.1513386300162;  Fri, 15 Dec 2017 17:05:00 -0800 (PST)
MIME-Version: 1.0
References: <012801d32f2e$a95aaf10$fc100d30$@augustcellars.com> <7C19E4CE-32E2-44B2-BD44-1BAA48190674@tzi.org> <013a01d32fcb$ac8cede0$05a6c9a0$@augustcellars.com> <C55850CF-C510-4D2E-8298-3A40E3623CDB@tzi.org> <HE1PR0701MB2539219033904FD2A45771BA98700@HE1PR0701MB2539.eurprd07.prod.outlook.com> <CANh-dX=UGDNX1CCQCL_-9T5kjp4i5vwqrTnQ8D6V7qkLX2PotA@mail.gmail.com> <1FED1F56-93BA-410F-B7C4-E83D31E7CC4E@tzi.org>
In-Reply-To: <1FED1F56-93BA-410F-B7C4-E83D31E7CC4E@tzi.org>
From: Jeffrey Yasskin <jyasskin@chromium.org>
Date: Sat, 16 Dec 2017 01:04:47 +0000
Message-ID: <CANh-dX==+LTA9s4b56kM_0v7WOwmZyX=FmDaxeY0BJbMFM=kvg@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: Jeffrey Yasskin <jyasskin@chromium.org>, cbor@ietf.org
Content-Type: multipart/alternative; boundary="001a114028e0d5143b05606ab4d1"
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/sjMoXa1dbJ6JkT5UwoVt8lCKd7w>
Subject: Re: [Cbor] [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 16 Dec 2017 01:05:04 -0000

--001a114028e0d5143b05606ab4d1
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Sorry for the belated response. I asked the owners of the CTAP spec your
questions in https://github.com/fido-alliance/fido-2-specs/issues/389,
which most of you can't read because it's a private repository, but I've
gotten permission to paste the answer here.

They felt they had to specify something because section 3.9 isn't normative=
.

They wanted to "allow for a simple comparison using unbounded memcmp", and
they wanted to guarantee that "an overflow would never occur with this
encoding."

They hadn't realized that CBOR is self-delimiting, which answers the second
concern.

CTAP happens to only use integers and text strings as map keys, which the
author convinced me makes the CTAP rule equivalent to the pure
lexicographic ordering. That'll make it a backward-compatible change to
switch to the lexicographic ordering if they ever do add more complex keys,
but it seems unlikely that they'll need more complex keys.
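
A minimal sketch of that equivalence, assuming shortest-form encodings
and only integer and text-string keys. The helper names (encode_head,
encode_key, ctap_key) and the sample keys are illustrative; they are
not taken from the CTAP spec or from RFC 7049:

```python
def encode_head(major_type: int, argument: int) -> bytes:
    """Shortest-form CBOR head: initial byte plus big-endian argument."""
    initial = major_type << 5
    if argument < 24:
        return bytes([initial | argument])
    # Additional information 24..27 selects a 1-, 2-, 4- or 8-byte argument.
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            head = bytes([initial | info])
            return head + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_key(key) -> bytes:
    """Encode an int or str map key (hypothetical helper, sketch only)."""
    if isinstance(key, int):
        if key >= 0:
            return encode_head(0, key)      # major type 0: unsigned
        return encode_head(1, -1 - key)     # major type 1: negative
    data = key.encode("utf-8")
    return encode_head(3, len(data)) + data  # major type 3: text string


def ctap_key(encoded: bytes):
    """Major type first, then length, then bytewise (the CTAP draft rule)."""
    return (encoded[0] >> 5, len(encoded), encoded)


# Assumed sample keys, chosen to cross the head-size boundaries.
sample = [0, 23, 24, 255, 256, 65536, -1, -24, -25, -257,
          "", "a", "b", "aa", "z" * 23, "a" * 24, "b" * 300]
encoded = [encode_key(k) for k in sample]

# Plain sorted() on bytes is the pure bytewise lexicographic order.
same_order = sorted(encoded) == sorted(encoded, key=ctap_key)
print(same_order)
```

The two orders agree here because the head of an integer or text
string already encodes its size in the leading bytes, so a bytewise
comparison sees the length before it sees any content.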

For Jim, yes, [30] (81 18 1E) comes before [-1] (81 20) in the pure
lexicographic order, but after in the major-then-length-then-lexicographic
order.
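
As a rough check of that example, the sketch below sorts the two
encodings quoted above under each rule. The function names are
hypothetical, and the sort keys are my reading of the three rules
discussed in this thread:

```python
# Hex encodings as quoted in the message: [30] and [-1].
KEY_30 = bytes.fromhex("81181e")     # array(1) holding unsigned 30
KEY_MINUS_1 = bytes.fromhex("8120")  # array(1) holding negative 1


def bytewise_key(encoded: bytes):
    """Pure bytewise lexicographic order of the encoded key."""
    return encoded


def rfc7049_key(encoded: bytes):
    """RFC 7049 section 3.9: shorter encoding first, then bytewise."""
    return (len(encoded), encoded)


def ctap_key(encoded: bytes):
    """CTAP draft rule: major type, then length, then bytewise."""
    major_type = encoded[0] >> 5  # top three bits of the initial byte
    return (major_type, len(encoded), encoded)


keys = [KEY_MINUS_1, KEY_30]
print([k.hex() for k in sorted(keys, key=bytewise_key)])
print([k.hex() for k in sorted(keys, key=rfc7049_key)])
print([k.hex() for k in sorted(keys, key=ctap_key)])
```

Both keys are arrays (major type 4), so the CTAP rule falls through to
the length comparison and puts the two-byte [-1] first, while the
bytewise order decides at the second byte (0x18 < 0x20).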

Jeffrey


On Sun, Dec 3, 2017 at 6:15 AM Carsten Bormann <cabo@tzi.org> wrote:

> On Nov 30, 2017, at 23:14, Jeffrey Yasskin <jyasskin@chromium.org> wrote:
> >
> > Belatedly, I've discovered a user of "canonical" CBOR who's proposing a
> different map order than the RFC suggests:
> https://fidoalliance.org/specs/fido-v2.0-rd-20170927/fido-client-to-authe=
nticator-protocol-v2.0-rd-20170927.html#message-encoding.
> (Note that this isn't a final standard yet and may change.)
>
> I looked at the spec referenced.
>
> So they essentially add
>
>                 =E2=80=A2 If the major types are different, the one with =
the lower
> value in numerical order sorts earlier.
>
> as a major sorting rule before the existing RFC 7049 canonicalization
> rules:
>
>                 =E2=80=A2 If two keys have different lengths, the shorter=
 one
> sorts earlier;
>                 =E2=80=A2 If two keys have the same length, the one with =
the lower
> value in (byte-wise) lexical order sorts earlier.
>
> This is different from simply going for byte-wise lexicographic (memcmp
> order(*)), which effectively would get us the first rule as the major
> sorting order already, but get rid of the length-based second rule (first
> rule in Section 3.9 of RFC 7049).
>
> I=E2=80=99ve come to see putting the length comparison rule early in =
3.9 as a
> major regression.
> One of the objectives when designing the CBOR serialization was not to
> repeat one big mistake that ASN.1 BER makes: to make overall lengths of
> complex composite items visible/important in the encoding of the next
> higher composite.
> Here, we are doing just that.  D=E2=80=99oh.
>
> > This is justified by the RFC saying that "Those protocols are free to
> define what they mean by a canonical format and what encoders and decoder=
s
> are expected to do.  This section lists some suggestions for such
> protocols." That is (as Jim said), the RFC doesn't specify "canonical"
> CBOR: it just provides an option for higher-level protocols to do so.
>
> Right.  So the change would be to mention two options for this, the old
> canonical, and the saner (memcmp order) canonical.  Now the next step is
> finding names for legacy canonical/saner canonical.  We then have to deci=
de
> whether we turn this into a separate document, at Proposed Standard level=
,
> or believe that adding another suggestion to 3.9 is essentially a bug fix
> and can be done in the Standard level document.
>
> > The use of a different order in CTAP is going to either force its
> implementers to write custom CBOR encoders and decoders or require the
> generic encoders to take a configuration option for the map order. If the
> generic encoders take an option, then it stops being an issue for CBORbis
> to suggest a different order.
>
> Right.  So I think you are saying we get to fix this.
>
> I=E2=80=99d like to understand why CTAP went for major type first, then l=
egacy
> order.  Do you know when that was decided/who decided that?  Can we maybe
> even influence them to go for the simpler =E2=80=9Csaner=E2=80=9D orderin=
g?
>
> Gr=C3=BC=C3=9Fe, Carsten
>
> (*) well, we wouldn=E2=80=99t define it as =E2=80=9Cmemcmp order=E2=80=9D=
 because that would
> require a normative reference to some C standard plus defining the third
> parameter of the memcmp as the lower of the two lengths, which again
> exposes total lengths.
> Instead, we would spell out =E2=80=9Cbytewise lexicographic order of the =
canonical
> encodings of the keys=E2=80=9D in a few more words.
> Note that this can be defined simply based on the first byte that differs
> in the byte sequence =E2=80=94 there are no two different CBOR data items=
 where one
> is a prefix of the other (CBOR is self-delimiting).
>
>


--001a114028e0d5143b05606ab4d1--


From nobody Fri Dec 15 18:26:03 2017
Return-Path: <ietf@augustcellars.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id B6381126E7A for <cbor@ietfa.amsl.com>; Fri, 15 Dec 2017 18:26:02 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.899
X-Spam-Level: 
X-Spam-Status: No, score=-1.899 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id qF6uaEXMk6GR for <cbor@ietfa.amsl.com>; Fri, 15 Dec 2017 18:26:00 -0800 (PST)
Received: from mail2.augustcellars.com (augustcellars.com [50.45.239.150]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 6CAAB124239 for <cbor@ietf.org>; Fri, 15 Dec 2017 18:25:59 -0800 (PST)
Received: from Jude (73.180.8.170) by mail2.augustcellars.com (192.168.0.56) with Microsoft SMTP Server (TLS) id 15.0.1347.2; Fri, 15 Dec 2017 18:24:15 -0800
From: Jim Schaad <ietf@augustcellars.com>
To: 'Jeffrey Yasskin' <jyasskin@chromium.org>, 'Carsten Bormann' <cabo@tzi.org>
CC: <cbor@ietf.org>
References: <012801d32f2e$a95aaf10$fc100d30$@augustcellars.com> <7C19E4CE-32E2-44B2-BD44-1BAA48190674@tzi.org> <013a01d32fcb$ac8cede0$05a6c9a0$@augustcellars.com> <C55850CF-C510-4D2E-8298-3A40E3623CDB@tzi.org> <HE1PR0701MB2539219033904FD2A45771BA98700@HE1PR0701MB2539.eurprd07.prod.outlook.com> <CANh-dX=UGDNX1CCQCL_-9T5kjp4i5vwqrTnQ8D6V7qkLX2PotA@mail.gmail.com> <1FED1F56-93BA-410F-B7C4-E83D31E7CC4E@tzi.org> <CANh-dX==+LTA9s4b56kM_0v7WOwmZyX=FmDaxeY0BJbMFM=kvg@mail.gmail.com>
In-Reply-To: <CANh-dX==+LTA9s4b56kM_0v7WOwmZyX=FmDaxeY0BJbMFM=kvg@mail.gmail.com>
Date: Fri, 15 Dec 2017 18:25:31 -0800
Message-ID: <015501d37615$26e1a830$74a4f890$@augustcellars.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_0156_01D375D2.18C06400"
X-Mailer: Microsoft Outlook 16.0
Thread-Index: AQLkcV5/cuYwYNvq7zdAWu7g9fdkoQDl1w1UAkkwjzoCc2JLLgGyH3OJAnAGOCgB7/DF0QLqEmWwoK5SjmA=
Content-Language: en-us
X-Originating-IP: [73.180.8.170]
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/zXrPC8ZJiYHrfBZAkiO4UkK-rfw>
Subject: Re: [Cbor] [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 16 Dec 2017 02:26:02 -0000

------=_NextPart_000_0156_01D375D2.18C06400
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

=20

=20

From: CBOR [mailto:cbor-bounces@ietf.org] On Behalf Of Jeffrey Yasskin
Sent: Friday, December 15, 2017 5:05 PM
To: Carsten Bormann <cabo@tzi.org>
Cc: Jeffrey Yasskin <jyasskin@chromium.org>; cbor@ietf.org
Subject: Re: [Cbor] [core] draft-ietf-cbor-7049bis - Change suggested =
Canonicalization

=20

Sorry for the belated response. I asked the owners of the CTAP spec your =
questions in https://github.com/fido-alliance/fido-2-specs/issues/389, =
which most of you can't read because it's a private repository, but I've =
gotten permission to paste the answer here.

=20

They felt they had to specify something because section 3.9 isn't =
normative.

=20

They wanted to "allow for a simple comparison using unbounded memcmp", =
and they wanted to guarantee that "an overflow would never occur with =
this encoding."

=20

They hadn't realized that CBOR is self-delimiting, which answers the =
second concern.

=20

CTAP happens to only use integers and text strings as map keys, which =
the author convinced me makes the CTAP rule equivalent to the pure =
lexicographic ordering. That'll make it a backward-compatible change to =
switch to the lexicographic ordering if they ever do add more complex =
keys, but it seems unlikely that they'll need more complex keys.

=20

For Jim, yes, [30] (81 18 1E) comes before [-1] (81 20) in the pure =
lexicographic order, but after in the =
major-then-length-then-lexicographic order.

=20

[JLS] Yes, but the major-length-lexicographic order does match what the =
byte-by-byte ordering would be for the keys CTAP uses today.  So if we =
change to byte-by-byte ordering, they would not need a heavy rewrite =
if they add new items.

=20

Jim

=20

=20

Jeffrey



On Sun, Dec 3, 2017 at 6:15 AM Carsten Bormann <cabo@tzi.org =
<mailto:cabo@tzi.org> > wrote:

On Nov 30, 2017, at 23:14, Jeffrey Yasskin <jyasskin@chromium.org =
<mailto:jyasskin@chromium.org> > wrote:
>
> Belatedly, I've discovered a user of "canonical" CBOR who's proposing =
a different map order than the RFC suggests: =
https://fidoalliance.org/specs/fido-v2.0-rd-20170927/fido-client-to-authe=
nticator-protocol-v2.0-rd-20170927.html#message-encoding. (Note that =
this isn't a final standard yet and may change.)

I looked at the spec referenced.

So they essentially add

                =E2=80=A2 If the major types are different, the one with =
the lower value in numerical order sorts earlier.

as a major sorting rule before the existing RFC 7049 canonicalization =
rules:

                =E2=80=A2 If two keys have different lengths, the =
shorter one sorts earlier;
                =E2=80=A2 If two keys have the same length, the one with =
the lower value in (byte-wise) lexical order sorts earlier.

This is different from simply going for byte-wise lexicographic (memcmp =
order(*)), which effectively would get us the first rule as the major =
sorting order already, but get rid of the length-based second rule =
(first rule in Section 3.9 of RFC 7049).

I=E2=80=99ve come to see putting the length comparison rule early in =
3.9 as a major regression.
One of the objectives when designing the CBOR serialization was not to =
repeat one big mistake that ASN.1 BER makes: to make overall lengths of =
complex composite items visible/important in the encoding of the next =
higher composite.
Here, we are doing just that.  D=E2=80=99oh.

> This is justified by the RFC saying that "Those protocols are free to =
define what they mean by a canonical format and what encoders and =
decoders are expected to do.  This section lists some suggestions for =
such protocols." That is (as Jim said), the RFC doesn't specify =
"canonical" CBOR: it just provides an option for higher-level protocols =
to do so.

Right.  So the change would be to mention two options for this, the old =
canonical, and the saner (memcmp order) canonical.  Now the next step is =
finding names for legacy canonical/saner canonical.  We then have to =
decide whether we turn this into a separate document, at Proposed =
Standard level, or believe that adding another suggestion to 3.9 is =
essentially a bug fix and can be done in the Standard level document.

> The use of a different order in CTAP is going to either force its =
implementers to write custom CBOR encoders and decoders or require the =
generic encoders to take a configuration option for the map order. If =
the generic encoders take an option, then it stops being an issue for =
CBORbis to suggest a different order.

Right.  So I think you are saying we get to fix this.

I=E2=80=99d like to understand why CTAP went for major type first, then =
legacy order.  Do you know when that was decided/who decided that?  Can =
we maybe even influence them to go for the simpler =
=E2=80=9Csaner=E2=80=9D ordering?

Gr=C3=BC=C3=9Fe, Carsten

(*) well, we wouldn=E2=80=99t define it as =E2=80=9Cmemcmp =
order=E2=80=9D because that would require a normative reference to some =
C standard plus defining the third parameter of the memcmp as the lower =
of the two lengths, which again exposes total lengths.
Instead, we would spell out =E2=80=9Cbytewise lexicographic order of the =
canonical encodings of the keys=E2=80=9D in a few more words.
Note that this can be defined simply based on the first byte that =
differs in the byte sequence =E2=80=94 there are no two different CBOR =
data items where one is a prefix of the other (CBOR is self-delimiting).


------=_NextPart_000_0156_01D375D2.18C06400--


From nobody Tue Dec 19 11:35:26 2017
Return-Path: <jyasskin@google.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 0DCAB12751F for <cbor@ietfa.amsl.com>; Tue, 19 Dec 2017 11:35:24 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.709
X-Spam-Level: 
X-Spam-Status: No, score=-2.709 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=google.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id PgchAtgkwk_t for <cbor@ietfa.amsl.com>; Tue, 19 Dec 2017 11:35:22 -0800 (PST)
Received: from mail-it0-x22f.google.com (mail-it0-x22f.google.com [IPv6:2607:f8b0:4001:c0b::22f]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 52EE11200F1 for <cbor@ietf.org>; Tue, 19 Dec 2017 11:35:22 -0800 (PST)
Received: by mail-it0-x22f.google.com with SMTP id z6so4041538iti.4 for <cbor@ietf.org>; Tue, 19 Dec 2017 11:35:22 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=X9Cie4sZ8BxQWGEAZ/ejTjZZlvJ8iWNvfytxmygCJlE=; b=MSXe7nDr9qdFl/+gI3qXyWaLkH6U0/RGY/lXvPm24eShHqRxSmqTSbcwJt/vGxEl59 3nHBVLBTamx3Z9Z0/4/Ui7Ey4o43ryiCJqXQKEGcs4LCfmdt7r6N2dKiYqbwGT7hR5GA 8tZBeYXdB/RmwTAXmujn7QirtrBQwiB3s30XwJLKQmfciNMwdxY2G+w2BBpX91gTpX+e vXRF2jG/D2TQy6lJn0s08lbkMLuGlZImalDWcRQDkl9fmFfB8Z+c1YBhTJNqgdLgmQK4 rwCvW2tgWcqKik/uxLeWzg99C0L9MXMwlAMs7EiV6HRTFgtCjaeEwB4uU8m+smP19kBC t48g==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=X9Cie4sZ8BxQWGEAZ/ejTjZZlvJ8iWNvfytxmygCJlE=; b=SvClN+Q8DdlaO7O+xywLtzW0W6gLyklnNUXbS0OFh1gCDMDx27V27mVssrd2I1r7AI pdjW2sV4Q+hV5bu+T9oyEqdcLlWp3TKzJnAaBfHzDz2VzzaQIxq3AEdm89waGlhOKk5I zvR3kXKy6dYncBZw6E1dGUvV+yjrAuhYRh6lauQxjhpyL24XjYE+qLOcclgAubQ+pUuf ABlQH1+SQvTkasVWbCW9SPeO44gYU1etzVoDT+GwGQcs0YGnvq31IwmKKYMDX1Z/jYJt 9kiirGo3n9hppirr43lH3sDUohEwOR1WBcvVxexESvtV0zyk8akN8ekinFvqlWoFTeuh CroA==
X-Gm-Message-State: AKGB3mIAbAzeTJFZciVCq0HPF/kiW/mVE2Ef7XVL8h9z6klzfN3ONdd1 T+Prt57Wlh24k1O4XZDJuE+ogwNfoxb9GAOvksiMSv2vDTE=
X-Google-Smtp-Source: ACJfBouqv9OcJmLUP3sCncx5vUBwuRyinVLbsih8vSmn9df0SuYfVyGbmuFeFs5XJwLhUv7c5xoDV+5bp1NbNjE4jyI=
X-Received: by 10.36.217.208 with SMTP id p199mr4818694itg.106.1513712120738;  Tue, 19 Dec 2017 11:35:20 -0800 (PST)
MIME-Version: 1.0
From: Jeffrey Yasskin <jyasskin@google.com>
Date: Tue, 19 Dec 2017 19:35:07 +0000
Message-ID: <CANh-dXmQLfvovGak0oEkbjVx3_7zmZc-Ohp=XpCXgx7LWE-=Wg@mail.gmail.com>
To: cbor@ietf.org
Content-Type: multipart/alternative; boundary="001a11448e784067130560b691d9"
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/DOBFm4fBr5mepaN7AQPPUgOYG_Q>
Subject: [Cbor] CDDL: Pick between "text" vs "tstr" and "bytes" vs "bstr".
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 19 Dec 2017 19:35:24 -0000

--001a11448e784067130560b691d9
Content-Type: text/plain; charset="UTF-8"

I don't think the CDDL prelude
<https://tools.ietf.org/html/draft-ietf-cbor-cddl-00#appendix-E> should
standardize two English ways to spell the same thing. I don't have a real
preference between "text" vs "tstr", but I'd like to pick one so that
people don't have to think about which one to use in our own specs.

In case nobody has a preference, I'll suggest using "tstr" and "bstr", on
the theory that the types have more in common than they differ, so it's
nice to include the similarity in the name. However, if others prefer
"text" and "bytes", I'll happily vote for that pair instead.

Jeffrey
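
The point that byte strings and text strings have more in common than
they differ can be made concrete at the encoding level. What follows is
a minimal Python sketch, assuming the RFC 7049 head encoding (major type
in the top three bits of the initial byte, followed by the length). The
names encode_head, encode_bstr and encode_tstr are hypothetical and do
not belong to any existing CBOR library.

```python
def encode_head(major_type: int, length: int) -> bytes:
    """Encode a CBOR initial byte plus its length argument."""
    if not 0 <= major_type <= 7:
        raise ValueError("major type must be in 0..7")
    if length < 0:
        raise ValueError("length must be non-negative")
    if length < 24:
        # Small lengths fit directly in the low five bits.
        return bytes([(major_type << 5) | length])
    # Otherwise the low five bits select a 1-, 2-, 4- or 8-byte length.
    for additional, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if length < 1 << (8 * size):
            head = bytes([(major_type << 5) | additional])
            return head + length.to_bytes(size, "big")
    raise ValueError("length does not fit in 8 bytes")


def encode_bstr(data: bytes) -> bytes:
    """Major type 2: a byte string, i.e. arbitrary octets."""
    return encode_head(2, len(data)) + data


def encode_tstr(text: str) -> bytes:
    """Major type 3: a text string, always carried as UTF-8."""
    payload = text.encode("utf-8")
    return encode_head(3, len(payload)) + payload
```

With these definitions, encode_bstr(b"abc") and encode_tstr("abc")
produce the same four bytes except for the first one (0x43 versus 0x63):
the framing is shared, and only the major type and the UTF-8 requirement
distinguish the two.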

--001a11448e784067130560b691d9--


From nobody Tue Dec 19 12:31:23 2017
Return-Path: <cabo@tzi.org>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 62FFF1201F2 for <cbor@ietfa.amsl.com>; Tue, 19 Dec 2017 12:31:22 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.2
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id PhavDfmyP6ep for <cbor@ietfa.amsl.com>; Tue, 19 Dec 2017 12:31:20 -0800 (PST)
Received: from mailhost.informatik.uni-bremen.de (mailhost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id AC4401277BB for <cbor@ietf.org>; Tue, 19 Dec 2017 12:31:16 -0800 (PST)
X-Virus-Scanned: amavisd-new at informatik.uni-bremen.de
Received: from submithost.informatik.uni-bremen.de (submithost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::b]) by mailhost.informatik.uni-bremen.de (8.14.5/8.14.5) with ESMTP id vBJKVD6t007893; Tue, 19 Dec 2017 21:31:13 +0100 (CET)
Received: from [192.168.217.114] (p5DC7E04D.dip0.t-ipconnect.de [93.199.224.77]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by submithost.informatik.uni-bremen.de (Postfix) with ESMTPSA id 3z1V0x0XLMzDXLL; Tue, 19 Dec 2017 21:31:13 +0100 (CET)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <CANh-dXmQLfvovGak0oEkbjVx3_7zmZc-Ohp=XpCXgx7LWE-=Wg@mail.gmail.com>
Date: Tue, 19 Dec 2017 21:31:12 +0100
Cc: cbor@ietf.org
X-Mao-Original-Outgoing-Id: 535408272.223489-7a8250de7151ffb06b0aeaee951e1347
Content-Transfer-Encoding: quoted-printable
Message-Id: <8274AE9A-F926-4F9A-A863-15DCC7572FB2@tzi.org>
References: <CANh-dXmQLfvovGak0oEkbjVx3_7zmZc-Ohp=XpCXgx7LWE-=Wg@mail.gmail.com>
To: Jeffrey Yasskin <jyasskin@google.com>
X-Mailer: Apple Mail (2.3273)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/zIQrE-aewaSOgKT1FZEsM5RSSgE>
Subject: Re: [Cbor] CDDL: Pick between "text" vs "tstr" and "bytes" vs "bstr".
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 19 Dec 2017 20:31:22 -0000

Hi Jeffrey,

On Dec 19, 2017, at 20:35, Jeffrey Yasskin <jyasskin@google.com> wrote:
>=20
> I don't think the CDDL prelude should standardize two English ways to =
spell the same thing. I don't have a real preference between "text" vs =
"tstr", but I'd like to pick one so that people don't have to think =
about which one to use in our own specs.

I agree that a single name for each of text strings and byte strings =
would have been good.

One observation we made with the initial choice of `tstr` and `bstr` is =
that these look too similar; it is too easy to mistake one for the =
other.  So we came up with the less cryptic `text` and `bytes`.  (A lot =
of people think that the text one should be called =E2=80=9Cstring=E2=80=9D=
; well=E2=80=A6 If you really need this to get rid of some dissonance, =
adding a line =E2=80=9Cstring =3D text=E2=80=9D to your spec will give =
you that.)

> In case nobody has a preference, I'll suggest using "tstr" and "bstr", =
on the theory that the types have more in common than they differ, so =
it's nice to include the similarity in the name. However, if others =
prefer "text" and "bytes", I'll happily vote for that pair instead.

I don=E2=80=99t think we should take away one of the aliases at this =
stage.  (But, of course, if we do that, a simple =E2=80=9Ctstr =3D =
text=E2=80=9D or v.v. will =E2=80=9Crepair=E2=80=9D a spec.)

We *could* express a preference, as in =E2=80=9Ctstr and bstr are =
aliases for backwards compatibility; text and bytes are the preferred =
names=E2=80=9D or v.v.

Gr=C3=BC=C3=9Fe, Carsten


From nobody Tue Dec 19 12:44:52 2017
Return-Path: <jyasskin@google.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 707E212D853 for <cbor@ietfa.amsl.com>; Tue, 19 Dec 2017 12:44:50 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.009
X-Spam-Level: 
X-Spam-Status: No, score=-2.009 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=google.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id er6f2rzC6RYC for <cbor@ietfa.amsl.com>; Tue, 19 Dec 2017 12:44:44 -0800 (PST)
Received: from mail-io0-x230.google.com (mail-io0-x230.google.com [IPv6:2607:f8b0:4001:c06::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id D47A71201F2 for <cbor@ietf.org>; Tue, 19 Dec 2017 12:44:43 -0800 (PST)
Received: by mail-io0-x230.google.com with SMTP id d16so14990402iob.4 for <cbor@ietf.org>; Tue, 19 Dec 2017 12:44:43 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=1vNInu3l4oDSC+2iBvQEQg3VLqsJgpKGLXn5RBXM+sI=; b=GsAcQHRs3fOo7mv6i4FE7ldnneVSX+SKn6+L2140n5TEsA0C7wPBB7GJrhAT5Yz6I7 H9Zm77z2AgdvyORUTbjR5mYu1TmrfUf97LY550da2sArArouC0nLCI7S2nTlxppLWOsL HI/94R2FxOW/CbYyWm4+mYQnbAkYI/LZ/akKEFavHqflL1p8lvV5yJZr4q9MXhtn02zK W5KkzKP3Bh1lg+lGd726nyPc8+4S6Pc77/8t8oG4zPlC+s8vvZs7boPV+xuRegrW3PzN KfIqj3SObAGTXlRGxzbi3MH3vnsDeRV4leOQQc/HD5EjOeVMDP6R5rpRRbkz/JQCTOOY o7ug==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=1vNInu3l4oDSC+2iBvQEQg3VLqsJgpKGLXn5RBXM+sI=; b=WapD4kJceSLHjDx6sfB/3PL7V28XOwFad/RhpvYuMD+9fp1WuBcmYI+K+HvOtKXwnu HQEASYiVSgogbX/BMuvIEJ8y1YkynnAoAyiITSmi3Bzibs3KQaOi/FlvCkAsWDbIXoUn WniYi1o8JBS8/m+Ib4bDwPRdNd6Z5HAoc7EW86QLm5GFl9STOAOT7KaruiJSFvnzVcYi Z/Sl/rz9oiDntKv3pBBOZsRoONzgMdvEPtfoh1BfJ37ngS+mA8672Cwq92Q2ivtFIODI Z0r1hjV04/ClJzS83SIaJTGtDyMRyWuQWNmRcC/Cf5D1FwYknyRavlaLCZzffyfNdNWg Se6w==
X-Gm-Message-State: AKGB3mIhJGxDhxMaeTW40tmuQ7OsfbyAqPCn7p+gRKxnE+brAyM0x6sJ TfI1mZdTEvBdmMHpj8j7KDSu30qoVuPLj6TDfr4Sz5Rr
X-Google-Smtp-Source: ACJfBosKe0BT+9+H0Roe5mdsHm75Zli0JDDci96wTZZLSafGGw+CM27oHPAAGf2/Z68U8GAKmAE8kibNAj2rcMj7nkE=
X-Received: by 10.107.201.1 with SMTP id z1mr5467037iof.83.1513716282696; Tue, 19 Dec 2017 12:44:42 -0800 (PST)
MIME-Version: 1.0
References: <CANh-dXmQLfvovGak0oEkbjVx3_7zmZc-Ohp=XpCXgx7LWE-=Wg@mail.gmail.com> <8274AE9A-F926-4F9A-A863-15DCC7572FB2@tzi.org>
In-Reply-To: <8274AE9A-F926-4F9A-A863-15DCC7572FB2@tzi.org>
From: Jeffrey Yasskin <jyasskin@google.com>
Date: Tue, 19 Dec 2017 20:44:29 +0000
Message-ID: <CANh-dX=54CVzUcY7NMdEZprxYkW_HqBGi4+r485pnZG_KSoM2Q@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: cbor@ietf.org
Content-Type: multipart/alternative; boundary="94eb2c0b77c0530df20560b78948"
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/_7ffI0DmOkb64JeTdI2OSd-1oSM>
Subject: Re: [Cbor] CDDL: Pick between "text" vs "tstr" and "bytes" vs "bstr".
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 19 Dec 2017 20:44:50 -0000

--94eb2c0b77c0530df20560b78948
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Tue, Dec 19, 2017 at 12:31 PM Carsten Bormann <cabo@tzi.org> wrote:

> Hi Jeffrey,
>
> On Dec 19, 2017, at 20:35, Jeffrey Yasskin <jyasskin@google.com> wrote:
> >
> > I don't think the CDDL prelude should standardize two English ways to
> spell the same thing. I don't have a real preference between "text" vs
> "tstr", but I'd like to pick one so that people don't have to think about
> which one to use in our own specs.
>
> I agree that a single name for each of text strings and byte strings woul=
d
> have been good.
>
> One observation we made with the initial choice of `tstr` and `bstr` is
> that these look too similar; it is too easy to mistake one for the other.
> So we came up with the less cryptic `text` and `bytes`.  (A lot of people
> think that the text one should be called =E2=80=9Cstring=E2=80=9D; well=
=E2=80=A6 If you really need
> this to get rid of some dissonance, adding a line =E2=80=9Cstring =3D tex=
t=E2=80=9D to your
> spec will give you that.)
>
> > In case nobody has a preference, I'll suggest using "tstr" and "bstr",
> on the theory that the types have more in common than they differ, so it'=
s
> nice to include the similarity in the name. However, if others prefer
> "text" and "bytes", I'll happily vote for that pair instead.
>
> I don=E2=80=99t think we should take away one of the aliases at this stag=
e.  (But,
> of course, if we do that, a simple =E2=80=9Ctstr =3D text=E2=80=9D or v.v=
. will =E2=80=9Crepair=E2=80=9D a
> spec.)
>

This stage, before CDDL is an RFC, seems like the right time to bite the
bullet and pick one. If you prefer "text" and "bytes", let's do that. We
should try to avoid Stu Feldman's famous mistake with `make`. ;)


> We *could* express a preference, as in =E2=80=9Ctstr and bstr are aliases=
 for
> backwards compatibility; text and bytes are the preferred names=E2=80=9D =
or v.v.
>

Jeffrey

--94eb2c0b77c0530df20560b78948--


From nobody Tue Dec 19 16:13:59 2017
Return-Path: <jyasskin@google.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1501512D94B for <cbor@ietfa.amsl.com>; Tue, 19 Dec 2017 16:13:57 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.71
X-Spam-Level: 
X-Spam-Status: No, score=-2.71 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=google.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id LSu6RTOmuXkJ for <cbor@ietfa.amsl.com>; Tue, 19 Dec 2017 16:13:55 -0800 (PST)
Received: from mail-it0-x236.google.com (mail-it0-x236.google.com [IPv6:2607:f8b0:4001:c0b::236]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 5ABED12D948 for <cbor@ietf.org>; Tue, 19 Dec 2017 16:13:55 -0800 (PST)
Received: by mail-it0-x236.google.com with SMTP id p139so4781808itb.1 for <cbor@ietf.org>; Tue, 19 Dec 2017 16:13:55 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=F8sltyJ5ELTGzvs1BBDNMjSXADWgnTNtpaytTpJXNzo=; b=ba5XQS0EsAZhjmDdMvE6VkvfI85Go1VqRCNXfHOtVfJJzdJAXGGyIvE8v00jogQxmI iHIjpS+v/gc+I94TGyjH4mPMR/5T4Qt7KOKcnKPyFTt4r2jUD8xZZN9dPwJqkQUz/4/O n3eikXkgQ/3jZzxrF9pbzqmWIczwg2Wj1Pv8Yql2AHXemhHAvFKZrWqjMQq4Va0f15AM +RH6GKgzjaEd5MchgHK+9r14+7PBaySzOIeSYJ4/T3jkjVvX33L5YO6uStlp84Fl5a9R ItYXm9qtR96sUj33sUX5qAoqE4cdYARZFCiXPx+AbDNjyCTV5Yd8y49zV3uG9+GkLmey eVgg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=F8sltyJ5ELTGzvs1BBDNMjSXADWgnTNtpaytTpJXNzo=; b=g9pdO0zS0yw2QZ9KNcreXYK9rJ2QF9JCkU21ddIc11FKMCKcQ9uOFgaTBnNCvigGd+ f0G08tFx7PScumZ48iQ+fwbGfRv6EQmEBfdzoutUcSja+/GINmNpTgDbi6rFba4GayTF hzor4091rnjgliCxUeDDxNMYk84lXMPr1W4G3U3x/RlnJdPS3rWsKEkli0L3wS7ysExN +ekVJSnBo7mha6Mw6EJymDi1Y3u9wHd+4I/3UNNcPjjado0OVIQ2+BBKFTT5jC8yxMs7 JRlSv69oXZKUVxPkz/HnW1YSH3eQ/jFvIsFhMwcUdtJ7VHb+ZRCFc9Tp3x4wcObCVPPW 1Q5g==
X-Gm-Message-State: AKGB3mLMS0wdOygH8WqeYiDAjO/X8HSZxKWCUm49n9U21gDyDxkY4DCU 5VeVCobrigRtEPnQZbugL03W8DyySGgrxe4VMmurrRoVh84=
X-Google-Smtp-Source: ACJfBotfz0fnRdPTAUhoMxgsC8AmPxwh5J7p1bjfaetkVb/jXivVgHYZMbRwuodhg1s9z379WZKdjgXdQGzUz7yea6M=
X-Received: by 10.36.93.5 with SMTP id w5mr5607138ita.124.1513728834131; Tue, 19 Dec 2017 16:13:54 -0800 (PST)
MIME-Version: 1.0
References: <CANh-dXmQLfvovGak0oEkbjVx3_7zmZc-Ohp=XpCXgx7LWE-=Wg@mail.gmail.com> <8274AE9A-F926-4F9A-A863-15DCC7572FB2@tzi.org> <CANh-dX=54CVzUcY7NMdEZprxYkW_HqBGi4+r485pnZG_KSoM2Q@mail.gmail.com>
In-Reply-To: <CANh-dX=54CVzUcY7NMdEZprxYkW_HqBGi4+r485pnZG_KSoM2Q@mail.gmail.com>
From: Jeffrey Yasskin <jyasskin@google.com>
Date: Wed, 20 Dec 2017 00:13:40 +0000
Message-ID: <CANh-dXk9UL04owE81ceRtnCEKy-nb2cdwV3g3PTUi3a5Nw59UQ@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: cbor@ietf.org
Content-Type: multipart/alternative; boundary="001a1143f14c728e3e0560ba75ff"
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/LFTJ0GKm1CZgLikqOR-kSB8lIk0>
Subject: Re: [Cbor] CDDL: Pick between "text" vs "tstr" and "bytes" vs "bstr".
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 20 Dec 2017 00:13:57 -0000

--001a1143f14c728e3e0560ba75ff
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Tue, Dec 19, 2017 at 12:44 PM Jeffrey Yasskin <jyasskin@google.com>
wrote:

> On Tue, Dec 19, 2017 at 12:31 PM Carsten Bormann <cabo@tzi.org> wrote:
>
>> Hi Jeffrey,
>>
>> On Dec 19, 2017, at 20:35, Jeffrey Yasskin <jyasskin@google.com> wrote:
>> >
>> > I don't think the CDDL prelude should standardize two English ways to
>> spell the same thing. I don't have a real preference between "text" vs
>> "tstr", but I'd like to pick one so that people don't have to think abou=
t
>> which one to use in our own specs.
>>
>> I agree that a single name for each of text strings and byte strings
>> would have been good.
>>
>> One observation we made with the initial choice of `tstr` and `bstr` is
>> that these look too similar; it is too easy to mistake one for the other=
.
>> So we came up with the less cryptic `text` and `bytes`.  (A lot of peopl=
e
>> think that the text one should be called =E2=80=9Cstring=E2=80=9D; well=
=E2=80=A6 If you really need
>> this to get rid of some dissonance, adding a line =E2=80=9Cstring =3D te=
xt=E2=80=9D to your
>> spec will give you that.)
>>
>> > In case nobody has a preference, I'll suggest using "tstr" and "bstr",
>> on the theory that the types have more in common than they differ, so it=
's
>> nice to include the similarity in the name. However, if others prefer
>> "text" and "bytes", I'll happily vote for that pair instead.
>>
>> I don=E2=80=99t think we should take away one of the aliases at this sta=
ge.
>> (But, of course, if we do that, a simple =E2=80=9Ctstr =3D text=E2=80=9D=
 or v.v. will
>> =E2=80=9Crepair=E2=80=9D a spec.)
>>
>
> This stage, before CDDL is an RFC, seems like the right time to bite the
> bullet and pick one. If you prefer "text" and "bytes", let's do that. We
> should try to avoid Stu Feldman's famous mistake with `make`. ;)
>

Sorry for being obscure here. Stu is quoted as saying

"=E2=80=A6 I just did something simple with the pattern newline-tab. It wor=
ked, it
stayed. And then a few weeks later I had a user population of about a
dozen, most of them friends, and I didn't want to screw up my embedded
base. The rest, sadly, is history."

Jeffrey

--001a1143f14c728e3e0560ba75ff--


From nobody Wed Dec 20 07:46:45 2017
Return-Path: <dev+ietf@seantek.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 7A04A12426E for <cbor@ietfa.amsl.com>; Wed, 20 Dec 2017 07:46:44 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.9
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 tagged_above=-999 required=5 tests=[BAYES_00=-1.9] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 9YBFIpCTkubV for <cbor@ietfa.amsl.com>; Wed, 20 Dec 2017 07:46:42 -0800 (PST)
Received: from smtp-out-1.mxes.net (smtp-out-1.mxes.net [67.222.241.250]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id A65B81200FC for <cbor@ietf.org>; Wed, 20 Dec 2017 07:46:42 -0800 (PST)
Received: from [192.168.123.7] (cpe-76-90-60-238.socal.res.rr.com [76.90.60.238]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.mxes.net (Postfix) with ESMTPSA id B279D27502 for <cbor@ietf.org>; Wed, 20 Dec 2017 10:46:41 -0500 (EST)
To: cbor@ietf.org
References: <CANh-dXmQLfvovGak0oEkbjVx3_7zmZc-Ohp=XpCXgx7LWE-=Wg@mail.gmail.com>
From: Sean Leonard <dev+ietf@seantek.com>
Message-ID: <ed312533-62d2-1fc8-c945-b652e35f4467@seantek.com>
Date: Wed, 20 Dec 2017 07:44:25 -0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0
MIME-Version: 1.0
In-Reply-To: <CANh-dXmQLfvovGak0oEkbjVx3_7zmZc-Ohp=XpCXgx7LWE-=Wg@mail.gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: quoted-printable
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/CIonrhXELXupjjMZAn6UouYaP-c>
Subject: Re: [Cbor] CDDL: Pick between "text" vs "tstr" and "bytes" vs "bstr".
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 20 Dec 2017 15:46:44 -0000

On 12/19/2017 11:35 AM, Jeffrey Yasskin wrote:
> I don't think the CDDL prelude=20
> <https://tools.ietf.org/html/draft-ietf-cbor-cddl-00#appendix-E>=20
> should standardize two English ways to spell the same thing. I don't=20
> have a real preference between "text" vs "tstr", but I'd like to pick=20
> one so that people don't have to think about which one to use in our=20
> own specs.
>
> In case nobody has a preference, I'll suggest using "tstr" and "bstr", =

> on the theory that the types have more in common than they differ, so=20
> it's nice to include the similarity in the name. However, if others=20
> prefer "text" and "bytes", I'll happily vote for that pair instead.

I am for "tstr" and "bstr".

Rationale: #2 and #3 are data types for strings (sequences of symbols)=20
and therefore are the same. A very cursory survey of RFCs and=20
Internet-Drafts to-date:
https://www.google.com/search?q=3Dbstr+tstr+site%3Atools.ietf.org

suggests that "tstr" is unique to CBOR-using documents in the IETF;=20
"bstr" is (somewhat surprisingly) also pretty unique to CBOR-using=20
documents in the IETF. BSTR (always capitalized) is also a data type in=20
Microsoft products, being a binary string that can contain octets, or=20
UTF-16-encoded hexadecitets [heh]. Thus bstr and tstr are, or will be,=20
part of the vernacular, so folks should just learn them and move on.

"bytes" and "text" are clearly not unique to CBOR/CDDL. Furthermore,=20
major type 3 can store "bytes" (e.g., base64-encoded), and major type 2=20
can store "text" (e.g., text encoded in anything other than UTF-8, such=20
as ISO-8859-1, UTF-16, etc.). Major type 0 could be used to store a=20
single byte (1-octet) or a single textual code point (1-32-bit=20
UTF-32-encoded character). Finally, RFC 7049 itself uses the term=20
"bytes" (as it should) to describe all CBOR encodings, so "bytes" could=20
be confused with the contents of a major type 2 item, and some complete=20
CBOR encoded item, therefore being more ambiguous. So, I find the terms=20
"bytes" and "text" to be imprecise, compared to tstr and bstr.

I suppose that tstr *could* be renamed to ustr for Unicode or UTF-8=20
string (and a quick search suggests nobody has used ustr in any RFC or=20
posted Internet-Draft), but tstr is already unique and gets the point=20
across.

Regards,

Sean


From nobody Fri Dec 22 14:14:12 2017
Return-Path: <jyasskin@google.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 3D4D5127136 for <cbor@ietfa.amsl.com>; Fri, 22 Dec 2017 14:14:10 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.46
X-Spam-Level: 
X-Spam-Status: No, score=-2.46 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=chromium.org
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 3nI1vDT0SUKC for <cbor@ietfa.amsl.com>; Fri, 22 Dec 2017 14:14:07 -0800 (PST)
Received: from mail-io0-x236.google.com (mail-io0-x236.google.com [IPv6:2607:f8b0:4001:c06::236]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 8B60E124207 for <cbor@ietf.org>; Fri, 22 Dec 2017 14:14:07 -0800 (PST)
Received: by mail-io0-x236.google.com with SMTP id 87so21786119ior.5 for <cbor@ietf.org>; Fri, 22 Dec 2017 14:14:07 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=eEyotSlpsMkQQrGzKwFqA9tdhO2yrifXUsA5/ypBoKk=; b=Zw1Nq4xUWJCstNHLtDWecSihconwnxxdvfMEtZj3zYYsdeOJXtx3f1TVNLfJ2QtrmU i7LrDU9vR3EEWBVnhYpzcCzHeIL/qygUzD0EhBu4488oaanNykdS230mWoL0++K9x2Nq je8CKnOvyySaTpfU1I7axxl0CYx1lKXJ765yI=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=eEyotSlpsMkQQrGzKwFqA9tdhO2yrifXUsA5/ypBoKk=; b=C2ir8WMO7n6KUe4ux1OqEPDZ5u+jj0qajk4l6AOYZQmdhttYKDplFE3OLRB2JHgN/a 1rviScm38p4JZBljwt7CvrMrBnbtC/JzxJXcSJKMY2IyH3oPmBwyhN6jwwYksuK+qVhA +CIdEDZuZfOe9MyIl7ZZhH7kPLv1iXzPhCEEuCjOrWFRsl2IhxxFErOw9aZICM7hKgZ4 aXUbTirJFnVRddVFOPo9JDKsUDeLBwrA8dL1f/pIQu1D2bxrdZNBZCT52kOouFUDUe0V I3uhoGQxdGBE4kwl+8u12OQZ/cirxSph/+SqRmtdfK8GSqYKefIf03Yg2e+HwZn1+zKc DEJw==
X-Gm-Message-State: AKGB3mJLQIOEnBH33xP7ac+UL2nkeODXDOV9hEpDXI4YfeLstugx3c0O plGTYyOYYyCnEgNUKS1pfaJAiTSV8Gg9/Cwfkac1nQ==
X-Google-Smtp-Source: ACJfBov4DIC9MzlQK+P51zILAURkyu4qjmOqO8zlwH+PEAdgr1m8PjkEnKOObSe+UfHQDLP5lp54m5qOjeXyTFRS6ms=
X-Received: by 10.107.35.140 with SMTP id j134mr21260598ioj.166.1513980846408;  Fri, 22 Dec 2017 14:14:06 -0800 (PST)
MIME-Version: 1.0
References: <012801d32f2e$a95aaf10$fc100d30$@augustcellars.com> <7C19E4CE-32E2-44B2-BD44-1BAA48190674@tzi.org> <013a01d32fcb$ac8cede0$05a6c9a0$@augustcellars.com> <C55850CF-C510-4D2E-8298-3A40E3623CDB@tzi.org> <HE1PR0701MB2539219033904FD2A45771BA98700@HE1PR0701MB2539.eurprd07.prod.outlook.com> <CANh-dX=UGDNX1CCQCL_-9T5kjp4i5vwqrTnQ8D6V7qkLX2PotA@mail.gmail.com> <1FED1F56-93BA-410F-B7C4-E83D31E7CC4E@tzi.org>
In-Reply-To: <1FED1F56-93BA-410F-B7C4-E83D31E7CC4E@tzi.org>
From: Jeffrey Yasskin <jyasskin@chromium.org>
Date: Fri, 22 Dec 2017 22:13:52 +0000
Message-ID: <CANh-dXkOko=_Om1uQeA1NBCAkeVnY3r2itVg=f6Pj0_H57K0Zw@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: Jeffrey Yasskin <jyasskin@chromium.org>, cbor@ietf.org
Content-Type: multipart/alternative; boundary="001a114028e08ccdf40560f522d2"
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/KoplznKxiCPfp129Yx2v_2pmnhk>
Subject: Re: [Cbor] [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 22 Dec 2017 22:14:10 -0000

--001a114028e08ccdf40560f522d2
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Sun, Dec 3, 2017 at 6:15 AM Carsten Bormann <cabo@tzi.org> wrote:

> On Nov 30, 2017, at 23:14, Jeffrey Yasskin <jyasskin@chromium.org> wrote:
> >
> > Belatedly, I've discovered a user of "canonical" CBOR who's proposing a
> > different map order than the RFC suggests:
> > https://fidoalliance.org/specs/fido-v2.0-rd-20170927/fido-client-to-authenticator-protocol-v2.0-rd-20170927.html#message-encoding.
> > (Note that this isn't a final standard yet and may change.)
>
> I looked at the spec referenced.
>
> So they essentially add
>
>     • If the major types are different, the one with the lower value in
>       numerical order sorts earlier.
>
> as a major sorting rule before the existing RFC 7049 canonicalization
> rules:
>
>     • If two keys have different lengths, the shorter one sorts earlier;
>     • If two keys have the same length, the one with the lower value in
>       (byte-wise) lexical order sorts earlier.
>
> This is different from simply going for byte-wise lexicographic (memcmp
> order(*)), which effectively would get us the first rule as the major
> sorting order already, but get rid of the length-based second rule (first
> rule in Section 3.9 of RFC 7049).
>
> I’ve come to see putting the length comparison rule early in 3.9 as a
> major regression.
> One of the objectives when designing the CBOR serialization was not to
> repeat one big mistake that ASN.1 BER makes: to make overall lengths of
> complex composite items visible/important in the encoding of the next
> higher composite.
> Here, we are doing just that.  D’oh.
>
> > This is justified by the RFC saying that "Those protocols are free to
> > define what they mean by a canonical format and what encoders and
> > decoders are expected to do.  This section lists some suggestions for
> > such protocols." That is (as Jim said), the RFC doesn't specify
> > "canonical" CBOR: it just provides an option for higher-level protocols
> > to do so.
>
> Right.  So the change would be to mention two options for this, the old
> canonical, and the saner (memcmp order) canonical.  Now the next step is
> finding names for legacy canonical/saner canonical.  We then have to
> decide whether we turn this into a separate document, at Proposed
> Standard level, or believe that adding another suggestion to 3.9 is
> essentially a bug fix and can be done in the Standard level document.
>
> > The use of a different order in CTAP is going to either force its
> > implementers to write custom CBOR encoders and decoders or require the
> > generic encoders to take a configuration option for the map order. If
> > the generic encoders take an option, then it stops being an issue for
> > CBORbis to suggest a different order.
>
> Right.  So I think you are saying we get to fix this.
>

I've tried to implement this in https://github.com/cbor-wg/CBORbis/pull/9.
I defined a core set of rules so that other specs can use them by
reference, gave several examples of protocols that will need to extend the
rules, and defined a second set of rules that match the canonical order
that RFC7049 suggested.
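
As a rough illustration (not taken from the PR or from any spec text),
the three map key orders discussed in this thread could be sketched in
Python as sort keys over the encoded key bytes. The function names and
the sample keys are made up for this example, and the "major type
first" rule is only the reading of the CTAP draft given in the quoted
text above.

```python
def length_first_order(key: bytes):
    # RFC 7049 Section 3.9 suggestion: shorter encoding first,
    # then bytewise comparison among keys of equal length.
    return (len(key), key)


def bytewise_order(key: bytes):
    # Plain bytewise lexicographic ("memcmp") order of the encoding.
    return key


def major_type_first_order(key: bytes):
    # Order attributed to the CTAP draft in this thread: the major type
    # (top three bits of the initial byte) first, then the two
    # RFC 7049 rules.
    return (key[0] >> 5, len(key), key)


# Encoded sample keys: 10, 100, -1, "z", [0, 0], ["aaa"].
SAMPLE_KEYS = [bytes.fromhex(h) for h in
               ("0a", "1864", "20", "617a", "820000", "8163616161")]


def sort_keys(order):
    """Return the sample keys, hex-encoded, sorted by the given order."""
    return [k.hex() for k in sorted(SAMPLE_KEYS, key=order)]


if __name__ == "__main__":
    for order in (length_first_order, bytewise_order,
                  major_type_first_order):
        print(f"{order.__name__}: {sort_keys(order)}")
```

On these keys the length-first order puts -1 (0x20) ahead of 100
(0x1864), and the two array keys swap places between the bytewise order
and the other two orders.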

Do folks like this direction?

I don't have a strong opinion about which document should hold the
canonicalization rules. Can we ask the IESG which they'd prefer?

Jeffrey

--001a114028e08ccdf40560f522d2--


From nobody Fri Dec 22 14:23:43 2017
Return-Path: <cabo@tzi.org>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 341C6127369 for <cbor@ietfa.amsl.com>; Fri, 22 Dec 2017 14:23:40 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.2
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id bhxAib-0ce15 for <cbor@ietfa.amsl.com>; Fri, 22 Dec 2017 14:23:37 -0800 (PST)
Received: from mailhost.informatik.uni-bremen.de (mailhost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id E56A2126BF0 for <cbor@ietf.org>; Fri, 22 Dec 2017 14:23:36 -0800 (PST)
X-Virus-Scanned: amavisd-new at informatik.uni-bremen.de
Received: from submithost.informatik.uni-bremen.de (submithost.informatik.uni-bremen.de [134.102.201.11]) by mailhost.informatik.uni-bremen.de (8.14.5/8.14.5) with ESMTP id vBMMNXdc006898; Fri, 22 Dec 2017 23:23:33 +0100 (CET)
Received: from client-0064.vpn.uni-bremen.de (client-0064.vpn.uni-bremen.de [134.102.107.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by submithost.informatik.uni-bremen.de (Postfix) with ESMTPSA id 3z3NM85cHWzDWZ6; Fri, 22 Dec 2017 23:23:32 +0100 (CET)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <CANh-dXkOko=_Om1uQeA1NBCAkeVnY3r2itVg=f6Pj0_H57K0Zw@mail.gmail.com>
Date: Fri, 22 Dec 2017 23:23:32 +0100
Cc: cbor@ietf.org
X-Mao-Original-Outgoing-Id: 535674211.927005-2a2efdd6a72848aa78709dcbfb5761c7
Content-Transfer-Encoding: quoted-printable
Message-Id: <5D1F5ECC-C7DA-427F-B8A1-2040EA75FDE6@tzi.org>
References: <012801d32f2e$a95aaf10$fc100d30$@augustcellars.com> <7C19E4CE-32E2-44B2-BD44-1BAA48190674@tzi.org> <013a01d32fcb$ac8cede0$05a6c9a0$@augustcellars.com> <C55850CF-C510-4D2E-8298-3A40E3623CDB@tzi.org> <HE1PR0701MB2539219033904FD2A45771BA98700@HE1PR0701MB2539.eurprd07.prod.outlook.com> <CANh-dX=UGDNX1CCQCL_-9T5kjp4i5vwqrTnQ8D6V7qkLX2PotA@mail.gmail.com> <1FED1F56-93BA-410F-B7C4-E83D31E7CC4E@tzi.org> <CANh-dXkOko=_Om1uQeA1NBCAkeVnY3r2itVg=f6Pj0_H57K0Zw@mail.gmail.com>
To: Jeffrey Yasskin <jyasskin@chromium.org>
X-Mailer: Apple Mail (2.3273)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/nv5Atlkp4NzEy23brnmyuPORG1w>
Subject: Re: [Cbor] [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 22 Dec 2017 22:23:40 -0000

Hi Jeffrey,

quick reactions after a first skim:

I’m not sure the direction should be to open this up; I think the
recommendations should become narrower as we learn about the practical
issues.

We could write a separate document on the preferred c14n so we can keep
this out of the main document.

I don’t think we necessarily want to encourage cross-over between int
and float, so I think the “shortest float” rule should be applied
independently of whether that cross-over is desired.
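
A minimal Python sketch of one possible "shortest float" rule follows:
try the 16-bit and 32-bit encodings and keep the first one that
round-trips to the same value, otherwise fall back to 64 bits. The
function name is hypothetical, and the choice of 0xf97e00 for NaN
follows the suggestion in RFC 7049 Section 3.9; a protocol could pick
something else.

```python
import math
import struct


def shortest_float(value: float) -> bytes:
    """Encode value as the shortest CBOR float that preserves it."""
    if math.isnan(value):
        # Assumption: a single canonical NaN, as RFC 7049 suggests.
        return bytes.fromhex("f97e00")
    # 0xf9 = half precision, 0xfa = single precision.
    for initial_byte, fmt in ((0xF9, ">e"), (0xFA, ">f")):
        try:
            packed = struct.pack(fmt, value)
        except OverflowError:
            continue  # magnitude too large for this width
        if struct.unpack(fmt, packed)[0] == value:
            return bytes([initial_byte]) + packed
    # 0xfb = double precision, always exact for a Python float.
    return b"\xfb" + struct.pack(">d", value)


if __name__ == "__main__":
    for sample in (1.5, 100000.0, 1.1, float("inf")):
        print(sample, shortest_float(sample).hex())
```

Note that this never produces an integer encoding, so it stays
independent of any int/float cross-over decision.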

Why keep the “shortest int” rule less well defined for protocols that
use bignums?  It should apply there as well, i.e., use bignums only for
integers too large for the major type 0/1 formats.
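
One way to read that rule, as a hedged Python sketch: integers whose
magnitude fits in 64 bits use the shortest major type 0/1 head, and
only larger ones become tag 2/3 bignums, with no leading zero bytes in
the payload. The helper names are made up for this example.

```python
def encode_head(major_type: int, argument: int) -> bytes:
    """Encode a CBOR head with the shortest form for the argument."""
    if argument < 24:
        return bytes([(major_type << 5) | argument])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < (1 << (8 * size)):
            return (bytes([(major_type << 5) | additional_info])
                    + argument.to_bytes(size, "big"))
    raise ValueError("argument does not fit in 64 bits")


def encode_integer(n: int) -> bytes:
    """Shortest integer encoding, using a bignum only when required."""
    # Major type 1 carries -1 - n as its argument.
    major_type, magnitude = (0, n) if n >= 0 else (1, -1 - n)
    if magnitude < (1 << 64):
        return encode_head(major_type, magnitude)
    # Bignum: tag 2 (non-negative) or tag 3 (negative) wrapping a byte
    # string that holds the magnitude without leading zeros.
    payload = magnitude.to_bytes((magnitude.bit_length() + 7) // 8, "big")
    tag = 2 if n >= 0 else 3
    return encode_head(6, tag) + encode_head(2, len(payload)) + payload


if __name__ == "__main__":
    for sample in (23, 24, 1000000, 2**64 - 1, 2**64, -(2**64) - 1):
        print(sample, encode_integer(sample).hex())
```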

Grüße, Carsten




From nobody Fri Dec 22 15:56:49 2017
Return-Path: <jyasskin@google.com>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 4AAF5120454 for <cbor@ietfa.amsl.com>; Fri, 22 Dec 2017 15:56:48 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.46
X-Spam-Level: 
X-Spam-Status: No, score=-2.46 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=chromium.org
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id trbdjOuwLiIC for <cbor@ietfa.amsl.com>; Fri, 22 Dec 2017 15:56:45 -0800 (PST)
Received: from mail-it0-x22d.google.com (mail-it0-x22d.google.com [IPv6:2607:f8b0:4001:c0b::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 10FC21201F2 for <cbor@ietf.org>; Fri, 22 Dec 2017 15:56:45 -0800 (PST)
Received: by mail-it0-x22d.google.com with SMTP id b5so15946757itc.3 for <cbor@ietf.org>; Fri, 22 Dec 2017 15:56:44 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=fpy12kK5AaN/GNHI7m17DI95pEKnomY2TQGvTiE7gUw=; b=Pd92+E9ShHNRzDwr0707cnNQGmhz+cHpT9RqWbGyWcEcQ45eeEciKyDRZ+SDtCtKxh OxarwbBX3+66DnqwaEedFebzi9CcbVbLiaApVL21vC0tiJGRtuSYr5PStJ1T/jlfvDbY /MsPbCeChxd4R6IkK8giSW24bqXwqe8gW2tO8=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=fpy12kK5AaN/GNHI7m17DI95pEKnomY2TQGvTiE7gUw=; b=f+JMAQQYweBZrLhBCAxtU9nV/zm7z6BF9VRWpCvYzG8j67wcJ5WazblhKeTVhmEmU1 3CDW4gYQWGi9AOGQ7A+vQ8ZbpnxKayj1cyH7mAnxCktXEjjdXs5boQ099nfLKr+ATJKA g3d7WRSTD1pwjElk9J7fC+0/bNCIU0l1fnR1ot2zFqBkFfc3ttGZDoGg1LvaB2g4ErXM cycMyomtt/B8oA75w23FefUeN3RYNDHzFxzaDoJIwvryh7WLIjzHWdRU863tKpSzSpiF lOM1TXq3F/vbQy9QqYMdIGiW9eRxmWOuSQ3lQe+M0WiXNE2MX/GpNeLwy0uOacrExZ/v cdug==
X-Gm-Message-State: AKGB3mLGUW0yP35FwVhbmqmsUX4JaCEAz1LMVk1Oks+xIVjxsGrwKDe9 l597/4iPpeX57brGyIFdAKwsgbHETSqU2HHKRV3eVA==
X-Google-Smtp-Source: ACJfBovyfLB1vaNBHZC3DGoV3AXDf13pqC7PlP44pJSQOFKz96dofLRV1JywuoHTGDxOFzQmJZbT04QdJHOfSFnqz/Q=
X-Received: by 10.36.60.212 with SMTP id m203mr19116354ita.96.1513987003910; Fri, 22 Dec 2017 15:56:43 -0800 (PST)
MIME-Version: 1.0
References: <012801d32f2e$a95aaf10$fc100d30$@augustcellars.com> <7C19E4CE-32E2-44B2-BD44-1BAA48190674@tzi.org> <013a01d32fcb$ac8cede0$05a6c9a0$@augustcellars.com> <C55850CF-C510-4D2E-8298-3A40E3623CDB@tzi.org> <HE1PR0701MB2539219033904FD2A45771BA98700@HE1PR0701MB2539.eurprd07.prod.outlook.com> <CANh-dX=UGDNX1CCQCL_-9T5kjp4i5vwqrTnQ8D6V7qkLX2PotA@mail.gmail.com> <1FED1F56-93BA-410F-B7C4-E83D31E7CC4E@tzi.org> <CANh-dXkOko=_Om1uQeA1NBCAkeVnY3r2itVg=f6Pj0_H57K0Zw@mail.gmail.com> <5D1F5ECC-C7DA-427F-B8A1-2040EA75FDE6@tzi.org>
In-Reply-To: <5D1F5ECC-C7DA-427F-B8A1-2040EA75FDE6@tzi.org>
From: Jeffrey Yasskin <jyasskin@chromium.org>
Date: Fri, 22 Dec 2017 23:56:29 +0000
Message-ID: <CANh-dXmjPHM+gHQDqgHknHU8a2ShvuwsmZrgAM+HQhTEAk_iMQ@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: Jeffrey Yasskin <jyasskin@chromium.org>, cbor@ietf.org
Content-Type: multipart/alternative; boundary="001a11484940906fd30560f6916c"
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/6DvzsfTqSHiam8KIacHxSXwyAYY>
Subject: Re: [Cbor] [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 22 Dec 2017 23:56:48 -0000

--001a11484940906fd30560f6916c
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

I don't think https://github.com/cbor-wg/CBORbis/pull/9 allows anything
that RFC7049 didn't, which is the meaning I get from "open this up".

We *could* move this (all of section 3?) to a separate document, but I
haven't seen anyone say that we need to. A downside of moving section 3 to
another RFC is that it'll make it harder to find. Someone authoritative
(Francesca?) should just make this call so that we can stop angsting about
it.

I'm generally happy to change exactly which examples demonstrate that
protocol designers need to think about canonicalization even if they start
from the core canonicalization requirements. I'd appreciate a concrete
statement of which examples to use though.

Once we add floating values, the type of a field matters in defining its
canonicalization. A float64 field only needs to canonicalize NaN. A
float16/float32/float64 field needs to canonicalize to either the smallest
or largest type. An int/float field needs to again canonicalize toward or
away from int. A bigint field needs to prefer either a fixed-length (useful
for cryptographic signatures) or the shortest representation. A
decfrac/bigfloat/number field has an even more complex problem. I don't
personally have enough examples of existing canonicalized CBOR-based
protocols to make any confident recommendations here. If the list gives me
some, along with the field experience that justifies them, I'm happy to
write them down in my patch.
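
To illustrate how the field type changes the rule, here is a small
Python sketch of two of the cases above: a float64-only field that only
has to collapse NaNs, and a bignum field that prefers a fixed length.
The function names, the chosen NaN bit pattern, and the 255-byte width
limit are assumptions made for this example, not recommendations.

```python
import math
import struct

# Assumption: the protocol picks the quiet NaN with a zero payload.
CANONICAL_NAN_BITS = 0x7FF8000000000000


def encode_float64_field(value: float) -> bytes:
    """Always emit a 64-bit float; map every NaN to one bit pattern."""
    if math.isnan(value):
        return b"\xfb" + CANONICAL_NAN_BITS.to_bytes(8, "big")
    return b"\xfb" + struct.pack(">d", value)


def encode_fixed_length_bignum(n: int, width: int) -> bytes:
    """Emit tag 2 around a byte string of exactly `width` bytes."""
    if not 0 < width < 256:
        raise ValueError("this sketch only handles widths of 1..255")
    # to_bytes raises OverflowError if n is negative or does not fit.
    payload = n.to_bytes(width, "big")
    if width < 24:
        head = bytes([0x40 | width])
    else:
        head = bytes([0x58, width])
    return b"\xc2" + head + payload


if __name__ == "__main__":
    print(encode_float64_field(float("nan")).hex())
    print(encode_fixed_length_bignum(1, 4).hex())
```

A field that prefers the shortest bignum instead would strip the
leading zero bytes, so the two policies give different encodings for
the same value.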

Jeffrey



--001a11484940906fd30560f6916c
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div>I don&#39;t think <a href=3D"https://github.com/cbor-=
wg/CBORbis/pull/9">https://github.com/cbor-wg/CBORbis/pull/9</a>=C2=A0allow=
s anything that RFC7049 didn&#39;t, which is the meaning I get from &quot;o=
pen this up&quot;.=C2=A0</div><div><br></div><div>We *could* move this (all=
 of section 3?) to a separate document, but I haven&#39;t seen anyone say t=
hat we need to. A downside of moving section 3 to another RFC is that it&#3=
9;ll make it harder to find. Someone authoritative (Francesca?)=C2=A0should=
 just make this call so that we can stop angsting about it.</div><div><br><=
/div><div>I&#39;m generally happy to change exactly which examples demonstr=
ate that protocol designers need to think about canonicalization even if th=
ey start from the core canonicalization requirements. I&#39;d appreciate a =
concrete statement of which examples to use though.</div><div><br></div><di=
v>Once we add floating values, the type of a field matters in defining its =
canonicalization. A float64 field only needs to canonicalize NaN. A float16=
/float32/float64 field needs to canonicalize to either the smallest or larg=
est type. An int/float field needs to again canonicalize toward or away fro=
m int. A bigint field needs to prefer either a <span style=3D"color:rgb(34,=
34,34);font-family:arial,sans-serif;font-size:small;font-style:normal;font-=
variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-sp=
acing:normal;text-align:start;text-indent:0px;text-transform:none;white-spa=
ce:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoratio=
n-style:initial;text-decoration-color:initial;float:none;display:inline">fi=
xed-length (useful for cryptographic signatures) or the=C2=A0</span>shortes=
t representation. A decfrac/bigfloat/number field has an even more complex =
problem. I don&#39;t personally have enough examples of existing canonicali=
zed CBOR-based protocols to make any confident recommendations here. If the=
 list gives me some, along with the field experience that justifies them, I=
&#39;m happy to write them down in my patch.</div><div><br></div><div>Jeffr=
ey</div><div><br><br><div class=3D"gmail_quote"><div dir=3D"ltr">On Fri, De=
c 22, 2017 at 2:23 PM Carsten Bormann &lt;<a href=3D"mailto:cabo@tzi.org">c=
abo@tzi.org</a>&gt; wrote:<br></div><blockquote class=3D"gmail_quote" style=
=3D"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding=
-left:1ex">Hi Jeffrey,<br>
<br>
quick reactions after a first skim:<br>
<br>
I=E2=80=99m not sure the direction should be to open this up; I think the r=
ecommendations should become more narrow as we learn about the practical is=
sues.<br>
<br>
We could write a separate document on the preferred c14n so we can keep thi=
s out of the main document.<br>
<br>
I don=E2=80=99t think we necessarily want to encourage cross-over between i=
nt and float, so I think the =E2=80=9Cshortest float=E2=80=9D rule should b=
e applied independent of whether that cross-over is desired.<br>
<br>
Why keep the =E2=80=9Cshortest int=E2=80=9D rule less well defined for prot=
ocols that use bignums?=C2=A0 It should apply there as well, i.e., use bign=
ums only for integers too large for the major type 0/1 formats.<br>
<br>
Gr=C3=BC=C3=9Fe, Carsten<br>
<br>
<br>
&gt; On Dec 22, 2017, at 23:13, Jeffrey Yasskin &lt;<a href=3D"mailto:jyass=
kin@chromium.org" target=3D"_blank">jyasskin@chromium.org</a>&gt; wrote:<br=
>
&gt;<br>
&gt; On Sun, Dec 3, 2017 at 6:15 AM Carsten Bormann &lt;<a href=3D"mailto:c=
abo@tzi.org" target=3D"_blank">cabo@tzi.org</a>&gt; wrote:<br>
&gt; On Nov 30, 2017, at 23:14, Jeffrey Yasskin &lt;<a href=3D"mailto:jyass=
kin@chromium.org" target=3D"_blank">jyasskin@chromium.org</a>&gt; wrote:<br=
>
&gt; &gt;<br>
&gt; &gt; Belatedly, I&#39;ve discovered a user of &quot;canonical&quot; CB=
OR who&#39;s proposing a different map order than the RFC suggests: <a href=
=3D"https://fidoalliance.org/specs/fido-v2.0-rd-20170927/fido-client-to-aut=
henticator-protocol-v2.0-rd-20170927.html#message-encoding" rel=3D"noreferr=
er" target=3D"_blank">https://fidoalliance.org/specs/fido-v2.0-rd-20170927/=
fido-client-to-authenticator-protocol-v2.0-rd-20170927.html#message-encodin=
g</a>. (Note that this isn&#39;t a final standard yet and may change.)<br>
&gt;<br>
&gt; I looked at the spec referenced.<br>
&gt;<br>
&gt; So they essentially add<br>
&gt;<br>
&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=E2=80=A2=
 If the major types are different, the one with the lower value in numerica=
l order sorts earlier.<br>
&gt;<br>
&gt; as a major sorting rule before the existing RFC 7049 canonicalization =
rules:<br>
&gt;<br>
&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=E2=80=A2=
 If two keys have different lengths, the shorter one sorts earlier;<br>
&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=E2=80=A2=
 If two keys have the same length, the one with the lower value in (byte-wi=
se) lexical order sorts earlier.<br>
&gt;<br>
&gt; This is different from simply going for byte-wise lexicographic (memcm=
p order(*)), which effectively would get us the first rule as the major sor=
ting order already, but get rid of the length-based second rule (first rule=
 in Section 3.9 of RFC 7049).<br>
&gt;<br>
&gt; I=E2=80=99ve come to see putting the length comparison rule early =
in 3.9 as a major regression.<br>
&gt; One of the objectives when designing the CBOR serialization was not to=
 repeat one big mistake that ASN.1 BER makes: to make overall lengths of co=
mplex composite items visible/important in the encoding of the next higher =
composite.<br>
&gt; Here, we are doing just that.=C2=A0 D=E2=80=99oh.<br>
&gt;<br>
&gt; &gt; This is justified by the RFC saying that &quot;Those protocols ar=
e free to define what they mean by a canonical format and what encoders and=
 decoders are expected to do.=C2=A0 This section lists some suggestions for=
 such protocols.&quot; That is (as Jim said), the RFC doesn&#39;t specify &=
quot;canonical&quot; CBOR: it just provides an option for higher-level prot=
ocols to do so.<br>
&gt;<br>
&gt; Right.=C2=A0 So the change would be to mention two options for this, t=
he old canonical, and the saner (memcmp order) canonical.=C2=A0 Now the nex=
t step is finding names for legacy canonical/saner canonical.=C2=A0 We then=
 have to decide whether we turn this into a separate document, at Proposed =
Standard level, or believe that adding another suggestion to 3.9 is essenti=
ally a bug fix and can be done in the Standard level document.<br>
&gt;<br>
&gt; &gt; The use of a different order in CTAP is going to either force its=
 implementers to write custom CBOR encoders and decoders or require the gen=
eric encoders to take a configuration option for the map order. If the gene=
ric encoders take an option, then it stops being an issue for CBORbis to su=
ggest a different order.<br>
&gt;<br>
&gt; Right.=C2=A0 So I think you are saying we get to fix this.<br>
&gt;<br>
&gt; I&#39;ve tried to implement this in <a href=3D"https://github.com/cbor=
-wg/CBORbis/pull/9" rel=3D"noreferrer" target=3D"_blank">https://github.com=
/cbor-wg/CBORbis/pull/9</a>. I defined a core set of rules so that other sp=
ecs can use them by reference, gave several examples of protocols that will=
 need to extend the rules, and defined a second set of rules that match the=
 canonical order that RFC7049 suggested.<br>
&gt;<br>
&gt; Do folks like this direction?<br>
&gt;<br>
&gt; I don&#39;t have a strong opinion about which document should hold the=
 canonicalization rules. Can we ask the IESG which they&#39;d prefer?<br>
&gt;<br>
&gt; Jeffrey<br>
<br>
</blockquote></div></div></div>

--001a11484940906fd30560f6916c--

