
Received: from above.proper.com (localhost.vpnc.org [127.0.0.1]) by above.proper.com (8.12.11/8.12.9) with ESMTP id i981nddx047495; Thu, 7 Oct 2004 18:49:39 -0700 (PDT) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.11/8.12.9/Submit) id i981ndCo047493; Thu, 7 Oct 2004 18:49:39 -0700 (PDT)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from mail.asahi-net.or.jp (mail1.asahi-net.or.jp [202.224.39.197]) by above.proper.com (8.12.11/8.12.9) with ESMTP id i981ncV2047480 for <ietf-xml-mime@imc.org>; Thu, 7 Oct 2004 18:49:38 -0700 (PDT) (envelope-from murata@hokkaido.email.ne.jp)
Received: from [127.0.0.1] (j101074.ppp.asahi-net.or.jp [61.213.101.74]) by mail.asahi-net.or.jp (Postfix) with ESMTP id 93FF161A9 for <ietf-xml-mime@imc.org>; Fri,  8 Oct 2004 10:49:43 +0900 (JST)
Date: Fri, 08 Oct 2004 10:49:24 +0900
From: murata@hokkaido.email.ne.jp
To: ietf-xml-mime@imc.org
Subject: Fw: Re: XML media types, charset, TAG findings
X-Mailer-Plugin: AntiSpam for Becky!2 Ver.2.008
Message-Id: <20041008104044.3CDD.MURATA@hokkaido.email.ne.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.11.02 [ja]
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

Forwarded by MURATA Makoto (FAMILY Given) <EB2M-MRT@asahi-net.or.jp>
----------------------- Original Message -----------------------
From:    Bjoern Hoehrmann <derhoermi@gmx.net>
To:      chris@w3.org
Date:    Thu, 07 Oct 2004 17:27:53 +0200
Subject: Re: XML media types, charset, TAG findings
----

* Chris Lilley wrote:
>Coupled with the deprecation of the text/xml and
>text/xml-external-parsed-entity types (and thus insulation from the
>particular encoding restrictions of text/*) we are now, in this revision
>of the document, in a position to be a little stronger:
>
>  The encoding declaration in an XML document and the charset (if
>  provided) MUST be consistent.

That is insufficient as it does not define what it means for these to be
consistent, how implementations are required to determine whether this
requirement has been met and what processors are required to do when
these are determined to be inconsistent. Without a complete proposal it
is most difficult to cite any reactions on this matter. I would
generally support removing the often ignored complexity that the charset
parameter introduces; with your proposal, however, even if completely
specified, I would worry that this increases the complexity rather than
removing it, in which case this would seem counter-productive.
--------------------- Original Message Ends --------------------




Received: from above.proper.com (localhost.vpnc.org [127.0.0.1]) by above.proper.com (8.12.11/8.12.9) with ESMTP id i981nd0o047494; Thu, 7 Oct 2004 18:49:39 -0700 (PDT) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.11/8.12.9/Submit) id i981ndqO047492; Thu, 7 Oct 2004 18:49:39 -0700 (PDT)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from mail.asahi-net.or.jp (mail1.asahi-net.or.jp [202.224.39.197]) by above.proper.com (8.12.11/8.12.9) with ESMTP id i981ncrc047481 for <ietf-xml-mime@imc.org>; Thu, 7 Oct 2004 18:49:38 -0700 (PDT) (envelope-from murata@hokkaido.email.ne.jp)
Received: from [127.0.0.1] (j101074.ppp.asahi-net.or.jp [61.213.101.74]) by mail.asahi-net.or.jp (Postfix) with ESMTP id CB1E0E9AF for <ietf-xml-mime@imc.org>; Fri,  8 Oct 2004 10:49:43 +0900 (JST)
Date: Fri, 08 Oct 2004 10:49:24 +0900
From: murata@hokkaido.email.ne.jp
To: ietf-xml-mime@imc.org
Subject: Fw: Re: XML media types, charset, TAG findings
X-Mailer-Plugin: AntiSpam for Becky!2 Ver.2.008
Message-Id: <20041008104052.3CDE.MURATA@hokkaido.email.ne.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.11.02 [ja]
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

Forwarded by MURATA Makoto (FAMILY Given) <EB2M-MRT@asahi-net.or.jp>
----------------------- Original Message -----------------------
From:    Chris Lilley <chris@w3.org>
To:      Bjoern Hoehrmann <derhoermi@gmx.net>
Date:    Thu, 7 Oct 2004 17:56:03 +0200
Subject: Re: XML media types, charset, TAG findings
----

On Thursday, October 7, 2004, 5:27:53 PM, Bjoern wrote:


BH> * Chris Lilley wrote:
>>Coupled with the deprecation of the text/xml and
>>text/xml-external-parsed-entity types (and thus insulation from the
>>particular encoding restrictions of text/*) we are now, in this revision
>>of the document, in a position to be a little stronger:
>>
>>  The encoding declaration in an XML document and the charset (if
>>  provided) MUST be consistent.

BH> That is insufficient as it does not define what it means for these to be
BH> consistent, how implementations are required to determine whether this
BH> requirement has been met and what processors are required to do when
BH> these are determined to be inconsistent. Without a complete proposal it
BH> is most difficult to cite any reactions on this matter. I would
BH> generally support removing the often ignored complexity that the charset
BH> parameter introduces; with your proposal, however, even if completely
BH> specified, I would worry that this increases the complexity rather than
BH> removing it, in which case this would seem counter-productive.

This is a reasonable worry.

My preference would be to not have a redundant charset parameter, since
that would remove the ambiguity. However, I realize that people are
uncomfortable with that and thus propose this solution.

Consistent means that the encoding determined by
F Autodetection of Character Encodings (Non-Normative)
http://www.w3.org/TR/REC-xml/#sec-guessing

is either the same as the value of the charset parameter, or the charset
parameter is not provided.

There are examples of this in the current internet draft, for various
cases including a specified encoding declaration and an absent encoding
declaration, with or without assorted BOMs.
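
As a rough illustration of the consistency rule described above, here is
a minimal Python sketch. It is an assumption-laden simplification, not
the algorithm of Appendix F: it only looks at a BOM and at an encoding
declaration readable as ASCII, and the function names
(detect_xml_encoding, is_consistent) are hypothetical. It also compares
labels literally apart from case, so it does not resolve aliases.

```python
import codecs
import re
from typing import Optional

# BOMs, with the 4-byte UTF-32 marks first so that UTF-32 LE (FF FE 00 00)
# is not mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16"),
    (codecs.BOM_UTF16_LE, "utf-16"),
]

# Encoding declaration at the very start of an ASCII-compatible document.
_DECL = re.compile(
    rb"""<\?xml\s[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def detect_xml_encoding(data: bytes) -> str:
    """Simplified detection: BOM first, then declaration, else UTF-8."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii")
    # No BOM and no declaration: XML defaults to UTF-8.
    return "utf-8"


def is_consistent(data: bytes, charset: Optional[str]) -> bool:
    """True when charset is absent or equals the detected encoding label."""
    if charset is None:
        return True
    # Literal, case-insensitive comparison only (an assumption of this sketch).
    return detect_xml_encoding(data).lower() == charset.lower()
```

For instance, is_consistent(doc, None) holds for any document, while a
document declaring iso-8859-1 would be inconsistent with charset=utf-8.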



-- 
 Chris Lilley                    mailto:chris@w3.org
 Chair, W3C SVG Working Group
 Member, W3C Technical Architecture Group

--------------------- Original Message Ends --------------------




Received: from above.proper.com (localhost.vpnc.org [127.0.0.1]) by above.proper.com (8.12.11/8.12.9) with ESMTP id i981mTAx047396; Thu, 7 Oct 2004 18:48:33 -0700 (PDT) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.11/8.12.9/Submit) id i981mTxS047393; Thu, 7 Oct 2004 18:48:29 -0700 (PDT)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from mail.asahi-net.or.jp (mail2.asahi-net.or.jp [202.224.39.198]) by above.proper.com (8.12.11/8.12.9) with ESMTP id i981mRVn047371 for <ietf-xml-mime@imc.org>; Thu, 7 Oct 2004 18:48:28 -0700 (PDT) (envelope-from EB2M-MRT@asahi-net.or.jp)
Received: from [127.0.0.1] (j101074.ppp.asahi-net.or.jp [61.213.101.74]) by mail.asahi-net.or.jp (Postfix) with ESMTP id BCA091082A4 for <ietf-xml-mime@imc.org>; Fri,  8 Oct 2004 10:48:21 +0900 (JST)
Date: Fri, 08 Oct 2004 10:48:03 +0900
From: murata@hokkaido.email.ne.jp
To: ietf-xml-mime@imc.org
Subject: Fw: Re: XML media types, charset, TAG findings
X-Mailer-Plugin: AntiSpam for Becky!2 Ver.2.008
Message-Id: <20041008104059.3CDF.MURATA@hokkaido.email.ne.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.11.02 [ja]
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

Forwarded by MURATA Makoto (FAMILY Given) <EB2M-MRT@asahi-net.or.jp>
----------------------- Original Message -----------------------
From:    Bjoern Hoehrmann <derhoermi@gmx.net>
To:      Chris Lilley <chris@w3.org>
Date:    Thu, 07 Oct 2004 19:49:32 +0200
Subject: Re: XML media types, charset, TAG findings
----

* Chris Lilley wrote:
>My preference would be to not have a redundant charset parameter, since
>that would remove the ambiguity. However, I realize that people are
>uncomfortable with that and thus propose this solution.

It seems to me that the only difference between requiring applications
to ignore the charset parameter (regardless of whether it is allowed or
not) and your proposal is that with your proposal applications would be
required to report conflicts in some yet unknown way. I am not sure why
anyone would be more comfortable with your proposal as I do not think
that they are uncomfortable with that due to lack of error reporting.

>Consistent means that the encoding determined by
>F Autodetection of Character Encodings (Non-Normative)
>http://www.w3.org/TR/REC-xml/#sec-guessing
>
>is either the same as the value of the charset parameter, or the charset
>parameter is not provided.

That's still not clear to me; are e.g. "l1" and "iso-8859-1" "the same"?
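
To make the alias question concrete, here is a small Python sketch of
one possible reading of "the same": two labels match when they resolve
to the same codec. This relies on Python's own codec registry, which is
only a stand-in here; nothing in the thread says which registry or
matching rule should apply, and same_charset is a hypothetical name.

```python
import codecs


def same_charset(a: str, b: str) -> bool:
    """True when both labels resolve to the same codec in Python's registry."""
    try:
        return codecs.lookup(a).name == codecs.lookup(b).name
    except LookupError:
        # Unknown label: fall back to a case-insensitive literal comparison.
        return a.lower() == b.lower()
```

Under this reading, same_charset("l1", "iso-8859-1") is true, whereas a
literal string comparison of the two labels would say they differ.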
--------------------- Original Message Ends --------------------




Received: from above.proper.com (localhost.vpnc.org [127.0.0.1]) by above.proper.com (8.12.11/8.12.9) with ESMTP id i981mTjn047395; Thu, 7 Oct 2004 18:48:33 -0700 (PDT) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.11/8.12.9/Submit) id i981mToX047394; Thu, 7 Oct 2004 18:48:29 -0700 (PDT)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from mail.asahi-net.or.jp (mail2.asahi-net.or.jp [202.224.39.198]) by above.proper.com (8.12.11/8.12.9) with ESMTP id i981mRQk047366 for <ietf-xml-mime@imc.org>; Thu, 7 Oct 2004 18:48:28 -0700 (PDT) (envelope-from EB2M-MRT@asahi-net.or.jp)
Received: from [127.0.0.1] (j101074.ppp.asahi-net.or.jp [61.213.101.74]) by mail.asahi-net.or.jp (Postfix) with ESMTP id 1E5C4617C for <ietf-xml-mime@imc.org>; Fri,  8 Oct 2004 10:48:19 +0900 (JST)
Date: Fri, 08 Oct 2004 10:48:00 +0900
From: murata@hokkaido.email.ne.jp
To: ietf-xml-mime@imc.org
Subject: Fw: XML media types, charset, TAG findings
X-Mailer-Plugin: AntiSpam for Becky!2 Ver.2.008
Message-Id: <20041008104035.3CDC.MURATA@hokkaido.email.ne.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.11.02 [ja]
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

Forwarded by MURATA Makoto (FAMILY Given) <EB2M-MRT@asahi-net.or.jp>
----------------------- Original Message -----------------------
From:    Chris Lilley <chris@w3.org>
To:      MURATA Makoto (FAMILY Given) <EB2M-MRT@asahi-net.or.jp>, Dan Kohn <dan@dankohn.com>
Date:    Thu, 7 Oct 2004 16:44:24 +0200
Subject: XML media types, charset, TAG findings
----

Hello all,

In the approved TAG finding

Internet Media Type registration, consistency of use
TAG Finding 3 June 2002 (Revised 4 September 2002)

a specific criticism of RFC 3023 is raised:
3. Consistency in Communicating Character Encoding
http://www.w3.org/2001/tag/2002/0129-mime#char-encoding

and the conclusion is

>> Thus there is no ambiguity when the charset is omitted, and the
>> STRONGLY RECOMMENDED injunction to use the charset is misplaced for
>> application/xml and for non-text "+xml" types. Consequently, for XML
>> representations, server-side applications SHOULD only supply a
>> charset header when there is complete certainty as to the encoding in
>> use. Otherwise, an error will cause a perfectly usable representation
>> to be rejected by an architecturally sound client.

>> We recommend that section 7.1 of [RFC3023] be amended to something
>> like the following:

>> The use of the charset parameter, when the charset is reliably known
>> and agrees with the encoding declaration, is RECOMMENDED, since this
>> information can be used by non-XML processors to determine
>> authoritatively the charset of the XML MIME entity.

This is further backed up by another approved TAG finding

Authoritative Metadata
TAG Finding 25 February 2004

4.2 Self-describing data and Risk of Inconsistency
http://www.w3.org/2001/tag/doc/mime-respect.html#self-describing

>> Representation providers SHOULD NOT in general specify the character
>> encoding for XML data in protocol headers since the data is
>> self-describing.

However, the registration for application/xml still says

> Although listed as an optional parameter, the use of the charset
> parameter is STRONGLY RECOMMENDED, since this information can be used
> by XML processors to determine authoritatively the charset of the XML
> MIME entity. The charset parameter can also be used to provide
> protocol-specific operations, such as charset-based content
> negotiation in HTTP.

Since RFC 3023 was published, it has become clear that the +xml
convention has taken off. One consequence is that a transcoding proxy
can reliably distinguish xml from non-xml media types, when meeting an
unknown media type.
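
The distinction described above can be sketched in a few lines of
Python. This is an illustration only: the function name is hypothetical,
and the list of non-suffixed types is limited to the four registered in
RFC 3023.

```python
# The four XML media types registered in RFC 3023 that carry no +xml suffix.
_XML_TYPES = {
    "text/xml",
    "application/xml",
    "text/xml-external-parsed-entity",
    "application/xml-external-parsed-entity",
}


def is_xml_media_type(content_type: str) -> bool:
    """Return True for the RFC 3023 types and for any type/subtype+xml."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in _XML_TYPES:
        return True
    return "/" in media_type and media_type.endswith("+xml")
```

A proxy meeting image/svg+xml for the first time would get True here and
could treat it as XML, while image/gif yields False.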

Thus, it can know to either
a) leave it alone, or
b) transcode to another charset, at the same time fixing up the XML
encoding declaration

in the same way that it knows to not transcode, say, an image/gif from
Latin-1 to Shift-JIS.
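
Option (b) above might look roughly like the following Python sketch.
It is a simplification under stated assumptions: the source encoding is
already known, the target label is valid both as a Python codec name
and as an XML encoding name, and every character is representable in
the target (otherwise encode() raises UnicodeEncodeError). The function
name transcode_xml is hypothetical.

```python
import re

# XML declaration at the very start of the decoded document.
_XML_DECL = re.compile(r"^<\?xml\s[^>]*\?>")
# Existing encoding pseudo-attribute, with its leading whitespace.
_ENCODING = re.compile(r"""(\sencoding\s*=\s*)(["'])[^"']*\2""")
# Version pseudo-attribute; the encoding is inserted right after it.
_VERSION = re.compile(r"""(\sversion\s*=\s*(["'])[^"']*\2)""")


def transcode_xml(data: bytes, source: str, target: str) -> bytes:
    """Re-encode an XML document and fix up its encoding declaration."""
    text = data.decode(source)
    if text.startswith("\ufeff"):
        text = text[1:]  # drop a decoded BOM; the target codec adds its own if any
    new_attr = f' encoding="{target}"'
    match = _XML_DECL.match(text)
    if match is None:
        # No XML declaration at all: add one naming the target encoding.
        return (f'<?xml version="1.0"{new_attr}?>' + text).encode(target)
    decl = match.group(0)
    if _ENCODING.search(decl):
        # Replace the declared encoding with the target label.
        decl = _ENCODING.sub(lambda m: new_attr, decl, count=1)
    else:
        # Declaration without an encoding: insert one after the version.
        decl = _VERSION.sub(lambda m: m.group(1) + new_attr, decl, count=1)
    return (decl + text[match.end():]).encode(target)
```

The point of the fix-up is that the transcoded bytes and the
declaration they carry never disagree, so the result stays well formed
without any help from a charset parameter.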

Thus the generality argument (we want all encoding handled in the same
way) can be applied to all the +xml types.

Coupled with the deprecation of the text/xml and
text/xml-external-parsed-entity types (and thus insulation from the
particular encoding restrictions of text/*) we are now, in this revision
of the document, in a position to be a little stronger:

  The encoding declaration in an XML document and the charset (if
  provided) MUST be consistent.

This removes the requirement on all XML tools, from wget on up, to
rewrite XML instances when saving them to a local filestore so that they
remain well formed. Instead, no rewriting is required.

In consequence, the wording on the optional charset parameter should be
changed from STRONGLY RECOMMENDED.

The main value of a charset parameter is as a duplicate copy of the
encoding in use; for use by non-XML processors (full text search
engines? content management systems?) and for use in content
negotiation.

Thus, I would like to see language in the specification that removes the
idea of charset as an override to the XML encoding declaration, and
instead talks of charset as an optional parameter that may have certain
uses and, if provided, MUST be consistent with the encoding declared by
the instance (BOM, encoding declaration, or absence thereof).

I am of course happy to propose specific text, but wanted reactions to
this first, to ensure all the editors are in agreement as to how to
proceed.

Due to the interplay between the draft and the two TAG findings, I have
copied this to www-tag.

-- 
 Chris Lilley                    mailto:chris@w3.org
 Chair, W3C SVG Working Group
 Member, W3C Technical Architecture Group

--------------------- Original Message Ends --------------------



