
From nobody Tue Apr 14 02:48:35 2015
Return-Path: <wwwrun@rfc-editor.org>
X-Original-To: json@ietfa.amsl.com
Delivered-To: json@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 688211A6FDB for <json@ietfa.amsl.com>; Tue, 14 Apr 2015 02:48:34 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -101.912
X-Spam-Level: 
X-Spam-Status: No, score=-101.912 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01, USER_IN_WHITELIST=-100] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id JZr_E9awHrZ3 for <json@ietfa.amsl.com>; Tue, 14 Apr 2015 02:48:33 -0700 (PDT)
Received: from rfc-editor.org (rfc-editor.org [IPv6:2001:1900:3001:11::31]) by ietfa.amsl.com (Postfix) with ESMTP id 10BCA1A6EDB for <json@ietf.org>; Tue, 14 Apr 2015 02:48:33 -0700 (PDT)
Received: by rfc-editor.org (Postfix, from userid 30) id 2CF0D180206; Tue, 14 Apr 2015 02:47:59 -0700 (PDT)
To: tbray@textuality.com, barryleiba@computer.org, mamille2@cisco.com, paul.hoffman@vpnc.org
X-PHP-Originating-Script: 6000:errata_mail_lib.php
From: RFC Errata System <rfc-editor@rfc-editor.org>
Message-Id: <20150414094759.2CF0D180206@rfc-editor.org>
Date: Tue, 14 Apr 2015 02:47:59 -0700 (PDT)
Archived-At: <http://mailarchive.ietf.org/arch/msg/json/VpEChE5udMgC6CBhg5M86shRMYA>
Cc: rfc-editor@rfc-editor.org, martinpain@uk.ibm.com, json@ietf.org
Subject: [Json] [Editorial Errata Reported] RFC7159 (4336)
X-BeenThere: json@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "JavaScript Object Notation \(JSON\) WG mailing list" <json.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/json>, <mailto:json-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/json/>
List-Post: <mailto:json@ietf.org>
List-Help: <mailto:json-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/json>, <mailto:json-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 14 Apr 2015 09:48:34 -0000

The following errata report has been submitted for RFC7159,
"The JavaScript Object Notation (JSON) Data Interchange Format".

--------------------------------------
You may review the report below and at:
http://www.rfc-editor.org/errata_search.php?rfc=7159&eid=4336

--------------------------------------
Type: Editorial
Reported by: Martin Pain <martinpain@uk.ibm.com>

Section: Appendix A

Original Text
-------------
[NO MENTION OF SECTION 3 OF RFC 4627]

Corrected Text
--------------
   o  Removed method of detection of character encoding from
      section 3 "Encoding" of RFC 4627.

       

Notes
-----
Appendix A (listing changes between RFC 4627 and RFC 7159) does not include any comment on the removal of this text from RFC 4627 section 3:

[START QUOTE]
   Since the first two characters of a JSON text will always be ASCII
   characters [RFC0020], it is possible to determine whether an octet
   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
   at the pattern of nulls in the first four octets.

           00 00 00 xx  UTF-32BE
           00 xx 00 xx  UTF-16BE
           xx 00 00 00  UTF-32LE
           xx 00 xx 00  UTF-16LE
           xx xx xx xx  UTF-8
[END QUOTE]
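
As an illustration of the heuristic quoted above, here is a minimal Ruby sketch. The method name detect_json_encoding is hypothetical and does not appear in either RFC. It assumes the input has no byte order mark and is at least four octets long, and it relies on the RFC 4627 premise that the first two characters are ASCII.

```ruby
# Hypothetical helper (not from RFC 4627 or RFC 7159): guess the Unicode
# encoding of a JSON octet stream from the pattern of NUL octets in its
# first four bytes. Assumes no byte order mark is present.
def detect_json_encoding(octets)
  head = octets.byteslice(0, 4).bytes
  return nil if head.size < 4 # too short for this heuristic

  # Map each octet to true when it is NUL (00) and false otherwise (xx),
  # then compare against the five patterns in the quoted table.
  case head.map(&:zero?)
  when [true,  true,  true,  false] then Encoding::UTF_32BE
  when [true,  false, true,  false] then Encoding::UTF_16BE
  when [false, true,  true,  true]  then Encoding::UTF_32LE
  when [false, true,  false, true]  then Encoding::UTF_16LE
  when [false, false, false, false] then Encoding::UTF_8
  end # any other pattern falls through and returns nil
end

p detect_json_encoding('{"a": 1}'.encode("UTF-16LE")) # => #<Encoding:UTF-16LE>
```

A caller could pass the result to String#force_encoding before parsing. Returning nil for unrecognized patterns leaves the fallback decision to the caller.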


The new section 8.1 "Character encoding" states that:

[START QUOTE]
JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32
[END QUOTE]

but, unlike RFC 4627 section 3, it does not say anything about how to distinguish which has been used when parsing a byte string as JSON.


RFC 7159 section 8.1 also says:

[START QUOTE]
   Implementations MUST NOT add a byte order mark to the beginning of a
   JSON text.
[END QUOTE]

which rules out using a byte order mark for this purpose.


Additionally, RFC 7159 section 11 says:

[START QUOTE]
   Note:  No "charset" parameter is defined for this registration.
      Adding one really has no effect on compliant recipients.
[END QUOTE]

which rules out one means of communicating which character encoding is in use when communicating JSON over HTTP (namely a charset parameter on the media type), and implies that there is another means of detecting the character encoding, but does not say what it is.


I've reported this as an erratum on the appendix because I expect there is an existing means of detecting which of the Unicode character encodings is in use. I was expecting the appendix to reference it as part of an explanation of the removal of the text I quoted from RFC 4627 section 3, but no such explanation is present. It may be the case that the erratum ought to be against section 8.1 to provide a reference there.

Instructions:
-------------
This erratum is currently posted as "Reported". If necessary, please
use "Reply All" to discuss whether it should be verified or
rejected. When a decision is reached, the verifying party (IESG)
can log in to change the status and edit the report, if necessary. 

--------------------------------------
RFC7159 (draft-ietf-json-rfc4627bis-rfc7159bis)
--------------------------------------
Title               : The JavaScript Object Notation (JSON) Data Interchange Format
Publication Date    : March 2014
Author(s)           : T. Bray, Ed.
Category            : PROPOSED STANDARD
Source              : JavaScript Object Notation
Area                : Applications
Stream              : IETF
Verifying Party     : IESG


From nobody Tue Apr 14 18:57:58 2015
Return-Path: <wwwrun@rfc-editor.org>
X-Original-To: json@ietfa.amsl.com
Delivered-To: json@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id F3B721B309B; Tue, 14 Apr 2015 18:57:53 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -106.912
X-Spam-Level: 
X-Spam-Status: No, score=-106.912 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_HI=-5, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01, USER_IN_WHITELIST=-100] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id aYD2eHxsc3ex; Tue, 14 Apr 2015 18:57:52 -0700 (PDT)
Received: from rfc-editor.org (rfc-editor.org [4.31.198.49]) by ietfa.amsl.com (Postfix) with ESMTP id 87C2B1B309F; Tue, 14 Apr 2015 18:57:52 -0700 (PDT)
Received: by rfc-editor.org (Postfix, from userid 30) id 71675180207; Tue, 14 Apr 2015 18:57:16 -0700 (PDT)
To: martinpain@uk.ibm.com, tbray@textuality.com
X-PHP-Originating-Script: 1005:errata_mail_lib.php
From: RFC Errata System <rfc-editor@rfc-editor.org>
Message-Id: <20150415015716.71675180207@rfc-editor.org>
Date: Tue, 14 Apr 2015 18:57:16 -0700 (PDT)
Archived-At: <http://mailarchive.ietf.org/arch/msg/json/RG9kyWnrnkzb6vBRqSBsjSS05_s>
Cc: rfc-editor@rfc-editor.org, barryleiba@computer.org, iesg@ietf.org, json@ietf.org
Subject: [Json] [Errata Verified] RFC7159 (4336)
X-BeenThere: json@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "JavaScript Object Notation \(JSON\) WG mailing list" <json.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/json>, <mailto:json-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/json/>
List-Post: <mailto:json@ietf.org>
List-Help: <mailto:json-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/json>, <mailto:json-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 15 Apr 2015 01:57:54 -0000

The following errata report has been verified for RFC7159,
"The JavaScript Object Notation (JSON) Data Interchange Format". 

--------------------------------------
You may review the report below and at:
http://www.rfc-editor.org/errata_search.php?rfc=7159&eid=4336

--------------------------------------
Status: Verified
Type: Editorial

Reported by: Martin Pain <martinpain@uk.ibm.com>
Date Reported: 2015-04-14
Verified by: Barry Leiba (IESG)

Section: Appendix A

Original Text
-------------
[NO MENTION OF SECTION 3 OF RFC 4627]

Corrected Text
--------------
   o  Removed method of detection of character encoding from
      section 3 "Encoding" of RFC 4627.

       

Notes
-----
Appendix A (listing changes between RFC 4627 and RFC 7159) does not include any comment on the removal of this text from RFC 4627 section 3:

[START QUOTE]
   Since the first two characters of a JSON text will always be ASCII
   characters [RFC0020], it is possible to determine whether an octet
   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
   at the pattern of nulls in the first four octets.

           00 00 00 xx  UTF-32BE
           00 xx 00 xx  UTF-16BE
           xx 00 00 00  UTF-32LE
           xx 00 xx 00  UTF-16LE
           xx xx xx xx  UTF-8
[END QUOTE]


The new section 8.1 "Character encoding" states that:

[START QUOTE]
JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32
[END QUOTE]

but, unlike RFC 4627 section 3, it does not say anything about how to distinguish which has been used when parsing a byte string as JSON.


RFC 7159 section 8.1 also says:

[START QUOTE]
   Implementations MUST NOT add a byte order mark to the beginning of a
   JSON text.
[END QUOTE]

which rules out using a byte order mark for this purpose.


Additionally, RFC 7159 section 11 says:

[START QUOTE]
   Note:  No "charset" parameter is defined for this registration.
      Adding one really has no effect on compliant recipients.
[END QUOTE]

which rules out one means of communicating which character encoding is in use when communicating JSON over HTTP (namely a charset parameter on the media type), and implies that there is another means of detecting the character encoding, but does not say what it is.


I've reported this as an erratum on the appendix because I expect there is an existing means of detecting which of the Unicode character encodings is in use. I was expecting the appendix to reference it as part of an explanation of the removal of the text I quoted from RFC 4627 section 3, but no such explanation is present. It may be the case that the erratum ought to be against section 8.1 to provide a reference there.

--------------------------------------
RFC7159 (draft-ietf-json-rfc4627bis-rfc7159bis)
--------------------------------------
Title               : The JavaScript Object Notation (JSON) Data Interchange Format
Publication Date    : March 2014
Author(s)           : T. Bray, Ed.
Category            : PROPOSED STANDARD
Source              : JavaScript Object Notation
Area                : Applications
Stream              : IETF
Verifying Party     : IESG


From nobody Thu Apr 23 09:26:48 2015
Return-Path: <tbray@textuality.com>
X-Original-To: json@ietfa.amsl.com
Delivered-To: json@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1CFD91A6F29 for <json@ietfa.amsl.com>; Thu, 23 Apr 2015 09:26:47 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 0.522
X-Spam-Level: 
X-Spam-Status: No, score=0.522 tagged_above=-999 required=5 tests=[BAYES_20=-0.001, FM_FORGED_GMAIL=0.622, HTML_MESSAGE=0.001, J_CHICKENPOX_45=0.6, RCVD_IN_DNSWL_LOW=-0.7] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id dbeqFjcBQAsC for <json@ietfa.amsl.com>; Thu, 23 Apr 2015 09:26:43 -0700 (PDT)
Received: from mail-yh0-f49.google.com (mail-yh0-f49.google.com [209.85.213.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 0246F1A1EF5 for <json@ietf.org>; Thu, 23 Apr 2015 09:26:15 -0700 (PDT)
Received: by yhcb70 with SMTP id b70so3230092yhc.0 for <json@ietf.org>; Thu, 23 Apr 2015 09:26:14 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:date:message-id:subject:from:to :content-type; bh=ZlJWtKJ3xpDbyUjfSPe4/smWbETC2ZfYBYLWlebCo1Q=; b=KFlopEXGiSTZfGDDImbSJmR3bXiX2boJNn9EqL3NKtlI6mNpShXHrhRIigoYzgfZCg 0PDw00QZypfxzu5WoYrQVaikzEewJgqzcF9eEzG77kli+jBhyV1laUwSBqagCs+tvaOY LI+vM2YIho4bxK78Y5ex5VcIWBUcRVaXvxkbAsrcnqBmdnisQcXr5DVVlN/oCmOtBTMX WtTLwTJFwouoUAsW+UuRQRG3R8EWZq21LrPvdgsMT258g3s+qtXDgbDUBtFf8f1TkXXe augGEhQh7H2N9by3laH4dFV7Dw7s24/dTj2ZhAYUtvsKeslkMmBHCA7xGrS8TTgjC8Jz ypZg==
X-Gm-Message-State: ALoCoQn78O3Y6+Otd9ktQ3U3aIbmBSpI6kqK2f+DSxTMEp3g3tT0YZH57JMdnGDo8T/+dyK3Uvbx
MIME-Version: 1.0
X-Received: by 10.236.19.71 with SMTP id m47mr3032007yhm.106.1429806374325; Thu, 23 Apr 2015 09:26:14 -0700 (PDT)
Received: by 10.129.137.69 with HTTP; Thu, 23 Apr 2015 09:26:13 -0700 (PDT)
X-Originating-IP: [67.132.130.174]
Received: by 10.129.137.69 with HTTP; Thu, 23 Apr 2015 09:26:13 -0700 (PDT)
Date: Thu, 23 Apr 2015 09:26:13 -0700
Message-ID: <CAHBU6iu1ndbw9V_D3yyxY_FiaBgtD9=94_Rgcra_RJ_WTLVqRA@mail.gmail.com>
From: Tim Bray <tbray@textuality.com>
To: json@ietf.org
Content-Type: multipart/alternative; boundary=089e0160c7040a28dc051466bd73
Archived-At: <http://mailarchive.ietf.org/arch/msg/json/h-k_hIHi2NBtX0igyYb7YNgiKug>
Subject: [Json] Another json interop soft spot
X-BeenThere: json@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "JavaScript Object Notation \(JSON\) WG mailing list" <json.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/json>, <mailto:json-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/json/>
List-Post: <mailto:json@ietf.org>
List-Help: <mailto:json-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/json>, <mailto:json-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 23 Apr 2015 16:26:47 -0000

--089e0160c7040a28dc051466bd73
Content-Type: text/plain; charset=UTF-8

irb(main):004:0> JSON.parse ' {"a\b": "b"}'

=> {"a\b"=>"b"}

irb(main):005:0> JSON.parse ' {"a\\b": "b"}'

=> {"a\b"=>"b"}

I think the first example is probably illegal per the grammar, but I bet
lots of parsers accept it.

--089e0160c7040a28dc051466bd73--


From nobody Thu Apr 23 09:43:27 2015
Return-Path: <cabo@tzi.org>
X-Original-To: json@ietfa.amsl.com
Delivered-To: json@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 5ACCC1AC42B for <json@ietfa.amsl.com>; Thu, 23 Apr 2015 09:43:25 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 0.949
X-Spam-Level: 
X-Spam-Status: No, score=0.949 tagged_above=-999 required=5 tests=[BAYES_40=-0.001, HELO_EQ_DE=0.35, J_CHICKENPOX_45=0.6] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id gIckMdMU5ejf for <json@ietfa.amsl.com>; Thu, 23 Apr 2015 09:43:24 -0700 (PDT)
Received: from mailhost.informatik.uni-bremen.de (mailhost.informatik.uni-bremen.de [IPv6:2001:638:708:30c9::12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id B60871A8AB8 for <json@ietf.org>; Thu, 23 Apr 2015 09:43:14 -0700 (PDT)
X-Virus-Scanned: amavisd-new at informatik.uni-bremen.de
Received: from submithost.informatik.uni-bremen.de (submithost.informatik.uni-bremen.de [134.102.201.11]) by mailhost.informatik.uni-bremen.de (8.14.5/8.14.5) with ESMTP id t3NGhBR9009408; Thu, 23 Apr 2015 18:43:11 +0200 (CEST)
Received: from alma.local (p5DCCC91B.dip0.t-ipconnect.de [93.204.201.27]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by submithost.informatik.uni-bremen.de (Postfix) with ESMTPSA id 3lXkvz52ytz2tnH; Thu, 23 Apr 2015 18:43:11 +0200 (CEST)
Message-ID: <55392120.4050601@tzi.org>
Date: Thu, 23 Apr 2015 18:43:12 +0200
From: Carsten Bormann <cabo@tzi.org>
User-Agent: Postbox 3.0.11 (Macintosh/20140602)
MIME-Version: 1.0
To: Tim Bray <tbray@textuality.com>
References: <CAHBU6iu1ndbw9V_D3yyxY_FiaBgtD9=94_Rgcra_RJ_WTLVqRA@mail.gmail.com>
In-Reply-To: <CAHBU6iu1ndbw9V_D3yyxY_FiaBgtD9=94_Rgcra_RJ_WTLVqRA@mail.gmail.com>
X-Enigmail-Version: 1.2.3
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Archived-At: <http://mailarchive.ietf.org/arch/msg/json/5TddlHmxtL_SKQgCxSHzc_YTfGs>
Cc: json@ietf.org
Subject: Re: [Json] Another json interop soft spot
X-BeenThere: json@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "JavaScript Object Notation \(JSON\) WG mailing list" <json.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/json>, <mailto:json-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/json/>
List-Post: <mailto:json@ietf.org>
List-Help: <mailto:json-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/json>, <mailto:json-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 23 Apr 2015 16:43:25 -0000

Tim Bray wrote:
> JSON.parse ' {"a\b": "b"}'

You fell into a nasty Ruby trap here:
In a single-quoted string, a single backslash remains a single backslash
unless it is followed by either another backslash or a single quote.

$ irb -rjson
>> RUBY_VERSION
=> "2.2.2"
>> JSON.parse ' {"a\b": "b"}'
=> {"a\b"=>"b"}
>> JSON.parse ' {"a\\b": "b"}'
=> {"a\b"=>"b"}
>> (JSON.parse ' {"a\\b": "b"}').keys[0].size
=> 2
>> (JSON.parse ' {"a\b": "b"}').keys[0].size
=> 2
>> (JSON.parse ' {"a\b": "b"}').keys[0].hexi
=> "6108"
>> (JSON.parse ' {"a\\b": "b"}').keys[0].hexi
=> "6108"
>> ('{"a\b": "b"}').size
=> 12
>> ('{"a\\b": "b"}').size
=> 12
>> ('{"a\\b": "b"}').hexi
=> "7b22615c62223a202262227d"
>> ('{"a\b": "b"}').hexi
=> "7b22615c62223a202262227d"
>> 'a\b'.hexi
=> "615c62"
>> 'a\\b'.hexi
=> "615c62"

(String#hexi is in my .irbrc and does the obvious bytes.map{|x| "%02x" %
x}.join thing.)
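
For anyone who wants to reproduce the session, a minimal sketch of such a helper follows. The name hexi and the one-line body come from the note above. Defining it by reopening String is an assumption about how the .irbrc does it.

```ruby
# Assumed definition of the String#hexi helper described in the message:
# render each byte of the string as two lowercase hex digits.
class String
  def hexi
    bytes.map { |x| "%02x" % x }.join
  end
end

puts 'a\b'.hexi   # 615c62  (single quotes: the backslash stays a backslash)
puts 'a\\b'.hexi  # 615c62  (\\ collapses to one backslash, so same bytes)
puts "a\b".hexi   # 6108    (double quotes: \b is the backspace character)
```

The first two lines show why both irb inputs hand the same 12 bytes to the parser. Only the double-quoted form lets Ruby itself interpret \b.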

Textbook example of quoting hell...

Now, the bug in the default Ruby JSON parser you really are addressing
is illustrated by this:

>> (JSON.parse ' {"a\\c": "b"}').keys[0].hexi
=> "6163"

Of course, \c is not allowed in JSON strings.
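
For reference, RFC 7159 section 7 allows a backslash to be followed only by one of " \ / b f n r t, or by u and four hex digits. A minimal sketch of a check for that rule is below. The names VALID_ESCAPE and invalid_json_escapes are hypothetical, and the input is assumed to be the raw, still-escaped contents of one JSON string.

```ruby
# One complete, valid JSON escape sequence (RFC 7159, section 7).
VALID_ESCAPE = /\A\\(?:["\\\/bfnrt]|u\h{4})\z/

# Hypothetical helper: list the escape sequences in the raw contents of a
# JSON string that the grammar does not allow.
def invalid_json_escapes(raw)
  # scan consumes each backslash together with the character after it, so an
  # escaped backslash (\\) is never mistaken for the start of a new escape.
  raw.scan(/\\(?:u\h{4}|.)?/m).reject { |esc| esc.match?(VALID_ESCAPE) }
end

p invalid_json_escapes('a\c')     # => ["\\c"]
p invalid_json_escapes('a\b')     # => []
p invalid_json_escapes('a\u00e9') # => []
```

This only inspects escapes. It does not check the other string rules, such as the ban on unescaped control characters.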

Fix:

$ irb -roj
>> (Oj.load ' {"a\\c": "b"}').keys[0].hexi
Oj::ParseError: invalid escaped character at line 1, column 5 [parse.c:280]
	from (irb):3:in `load'
	from (irb):3
	from /Users/cabo/bin/irb:11:in `<main>'
>> (Oj.load ' {"a\\b": "b"}').keys[0].hexi
=> "6108"

Grüße, Carsten

