
Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBKEuJib093018 for <ietf-xml-mime-bks@above.proper.com>; Sat, 20 Dec 2003 06:56:21 -0800 (PST) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.10/8.12.9/Submit) id hBKEuJD2093017 for ietf-xml-mime-bks; Sat, 20 Dec 2003 06:56:19 -0800 (PST)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from mail.asahi-net.or.jp (mail1.asahi-net.or.jp [202.224.39.197]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBKEuGib093012 for <ietf-xml-mime@imc.org>; Sat, 20 Dec 2003 06:56:16 -0800 (PST) (envelope-from murata@hokkaido.email.ne.jp)
Received: from [127.0.0.1] (j125245.ppp.asahi-net.or.jp [61.213.125.245]) by mail.asahi-net.or.jp (Postfix) with ESMTP id 632DD3568A; Sat, 20 Dec 2003 23:56:16 +0900 (JST)
Date: Sat, 20 Dec 2003 23:52:56 +0900
From: MURATA Makoto <murata@hokkaido.email.ne.jp>
To: Martin Duerst <duerst@w3.org>
Subject: Re: proposed media type registration: application/voicexml+xml
Cc: Chris Lilley <chris@w3.org>, ietf-types@iana.org, w3c-archive@w3.org, Linus Walleij <triad@df.lth.se>, ietf-xml-mime@imc.org, Joachim.Strombergson@informasic.com, paf@cisco.com
In-Reply-To: <4.2.0.58.J.20031219133801.06ffa768@localhost>
References: <20031220000621.1347.MURATA@hokkaido.email.ne.jp> <4.2.0.58.J.20031219133801.06ffa768@localhost>
Message-Id: <20031220230150.134C.MURATA@hokkaido.email.ne.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.06.02
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

> There are two separate issues:
> 1) Does a registration allow the 'charset' parameter or not.
> 2) Does an actual entity have the 'charset' parameter or not.
> 
> It is not totally clear to me which one of these you are talking about.

1)

> In my opinion, it is highly preferable that all registrations
> allow the 'charset' parameter, to avoid a patchwork. The draft
> should contain some justification for this.

RFC 3023 already provides a registration template that includes the 
charset parameter.  However, the recommendation to include it is a SHOULD 
rather than a MUST, so people who have read RFC 3023 have still 
intentionally dropped the charset parameter.

I studied existing +xml media types registered at IANA.  
Here is the result.

1. IETF tree

With the exception of application/cnrp+xml, all +xml media types 
have the charset parameter.

1) beep+xml  	[RFC3080]
This keeps the charset parameter.

2) cnrp+xml  	[RFC3367]
This omits the charset parameter.  Not restricted to UTF-8.

3) cpl+xml 	[RFCXXXX]	
http://www.ietf.org/internet-drafts/draft-ietf-iptel-cpl-08.txt
This keeps the charset parameter.

4) pidf+xml 	[RFC-ietf-impp-cpim-pidf-08.txt]
http://www.ietf.org/internet-drafts/draft-ietf-impp-cpim-pidf-08.txt
This keeps the charset parameter.

5) reginfo+xml 	[RFC-ietf-sipping-reg-event-00.txt]
http://www.ietf.org/internet-drafts/draft-ietf-sipping-reg-event-00.txt
This keeps the charset parameter.

6) watcherinfo+xml 	[RFC-ietf-simple-winfo-format-04.txt]
http://www.ietf.org/internet-drafts/draft-ietf-simple-winfo-format-04.txt
This keeps the charset parameter.

7) xhtml+xml 	[RFC3236]
http://www.rfc-editor.org/rfc/rfc3236.txt
This keeps the charset parameter.


2. Vendor tree

Only two media types have the charset parameter.  Since media types in
the vendor tree do not always have accompanying documents, we do not
know if there are good reasons to omit the charset.

1) vnd.criticaltools.wbs+xml 	[Spiller]

http://www.iana.org/assignments/media-types/application/vnd.criticaltools.wbs+xml
This omits the charset parameter.
The details of this structure are proprietary to Critical Tools, Inc.

2) vnd.irepository.package+xml 	[Knowles]

http://www.iana.org/assignments/media-types/application/vnd.irepository.package+xml
This omits the charset parameter.
Use of this MIME type is limited to users of Lucidoc and associated
document management tools published by iRepository.net, Inc.

3) vnd.liberty-request+xml 	[McDowell]

http://www.iana.org/assignments/media-types/application/vnd.liberty-request+xml
This omits the charset parameter.

4) vnd.llamagraphics.life-balance.exchange+xml 	[White]

http://www.iana.org/assignments/media-types/application/vnd.llamagraphics.life-balance.exchange+xml
This keeps the charset parameter.

5) vnd.mozilla.xul+xml 	[McDaniel]

http://www.iana.org/assignments/media-types/application/vnd.mozilla.xul+xml
This keeps the charset parameter.

6) vnd.pwg-xhtml-print+xml 	[Wright]

http://www.iana.org/assignments/media-types/application/vnd.pwg-xhtml-print+xml
This omits the charset parameter.

7) vnd.wv.csp+xml 	[Ingimundarson]

http://www.iana.org/assignments/media-types/application/vnd.wv.csp+xml
This omits the charset parameter.  
They wrote "No parameters are required - covered by client-server capability
negotiation."

8) vnd.wv.csp+wbxml 	[Salmi]

http://www.iana.org/assignments/media-types/application/vnd.wv.csp+wbxml
This omits the charset parameter.  
They wrote "No parameters are required since WV capability negotiation
covers this."

9) vnd.wv.ssp+xml 	[Ingimundarson]

http://www.iana.org/assignments/media-types/application/vnd.wv.ssp+xml
This omits the charset parameter.  
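
The survey above can be tallied mechanically.  The following is only a
sketch: the data is transcribed from the two lists in this message, and
the names IETF_TREE, VENDOR_TREE and tally are invented for illustration.

```python
# Survey data transcribed from the lists above.
# True  = the registration keeps the charset parameter.
# False = the registration omits it.
IETF_TREE = {
    "beep+xml": True,
    "cnrp+xml": False,
    "cpl+xml": True,
    "pidf+xml": True,
    "reginfo+xml": True,
    "watcherinfo+xml": True,
    "xhtml+xml": True,
}

VENDOR_TREE = {
    "vnd.criticaltools.wbs+xml": False,
    "vnd.irepository.package+xml": False,
    "vnd.liberty-request+xml": False,
    "vnd.llamagraphics.life-balance.exchange+xml": True,
    "vnd.mozilla.xul+xml": True,
    "vnd.pwg-xhtml-print+xml": False,
    "vnd.wv.csp+xml": False,
    "vnd.wv.csp+wbxml": False,
    "vnd.wv.ssp+xml": False,
}


def tally(tree):
    """Return (names that keep charset, names that omit it), both sorted."""
    keeps = sorted(name for name, has_charset in tree.items() if has_charset)
    omits = sorted(name for name, has_charset in tree.items() if not has_charset)
    return keeps, omits


for label, tree in (("IETF tree", IETF_TREE), ("Vendor tree", VENDOR_TREE)):
    keeps, omits = tally(tree)
    print(f"{label}: {len(keeps)} keep charset, {len(omits)} omit it")
```

Run against the lists as given, this counts one omission among the seven
IETF tree entries and seven omissions among the nine vendor tree entries.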


> As for whether the actual entity should come with a 'charset'
> parameter, we should also have a discussion of the various
> issues in the draft.
> 
> 
> >By the way, "The Standard Hex Format" uses UTF-8 only.
> >
> >http://www.ietf.org/internet-drafts/draft-strombergson-shf-00.txt
> 
> I have not found 'UTF-8' anywhere in this document, so I'm not
> sure where you saw this restriction.

I am afraid that I made a mistake.  We have discussed this issue on 
the mailing list, but nothing has been written down yet.

> Another one is
> http://www.ietf.org/internet-drafts/draft-sbml-media-type-02.txt.
> 
> This was approved by the IESG yesterday, so I guess it's too late
> to try to change it. In practice, I very much hope that no implementation
> will reject sbml data that comes with a redundant charset=utf-8.

Now we have two exceptions in the IETF tree.

Practically, what can we do?

Cheers,

-- 
MURATA Makoto <murata@hokkaido.email.ne.jp>




Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBJIthib063588 for <ietf-xml-mime-bks@above.proper.com>; Fri, 19 Dec 2003 10:55:43 -0800 (PST) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.10/8.12.9/Submit) id hBJIthAY063587 for ietf-xml-mime-bks; Fri, 19 Dec 2003 10:55:43 -0800 (PST)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from dr-nick.w3.org (dr-nick.w3.org [18.29.1.73]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBJIteib063582 for <ietf-xml-mime@imc.org>; Fri, 19 Dec 2003 10:55:40 -0800 (PST) (envelope-from duerst@w3.org)
Received: from enoshima (homer.w3.org [18.29.0.30]) by dr-nick.w3.org (Postfix) with ESMTP id 5628813914; Fri, 19 Dec 2003 13:55:39 -0500 (EST)
Message-Id: <4.2.0.58.J.20031219133801.06ffa768@localhost>
X-Sender: duerst@localhost
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J 
Date: Fri, 19 Dec 2003 13:55:31 -0500
To: MURATA Makoto <murata@hokkaido.email.ne.jp>
From: Martin Duerst <duerst@w3.org>
Subject: Re: proposed media type registration: application/voicexml+xml
Cc: Chris Lilley <chris@w3.org>, ietf-types@iana.org, w3c-archive@w3.org, Linus Walleij <triad@df.lth.se>, ietf-xml-mime@imc.org, Joachim.Strombergson@InformAsic.com, paf@cisco.com
In-Reply-To: <20031220000621.1347.MURATA@hokkaido.email.ne.jp>
References: <4.2.0.58.J.20031218112549.00a93ad0@localhost> <1071764448.22957.11.camel@felicia> <4.2.0.58.J.20031218112549.00a93ad0@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

Hello Makoto,

At 00:20 03/12/20 +0900, MURATA Makoto wrote:

>On Thu, 18 Dec 2003 11:30:44 -0500
>Martin Duerst <duerst@w3.org> wrote:
>
> > There is some work starting on updating RFC 3023, and we hope that
> > we can use it to document the issues and questions that have
> > come up on the list. Chris has volunteered to coauthor, and I'll
> > help from the sidelines. Probably having all the information together
> > in one place, is better than having some of the information separated out.
>
>As a co-author, I also agree to add some guidelines for the omission of the
>charset parameter.  Chris and I will start this work shortly.

We are looking forward to this!


>When a media type uses UTF-8 only, I can agree to omit the charset
>parameter. "UTF-8" is already a recommended policy
>(http://www.w3.org/TR/charmod/#sec-UniqueEncoding).  I do not see strong
>reasons to specify charset="utf-8" always.  However, even in this case, I
>am sympathetic to Martin's worry ("we start to get into a patchwork").
>In all other cases, I do not want to allow omission of the charset
>parameter.  How do others feel?

There are two separate issues:
1) Does a registration allow the 'charset' parameter or not.
2) Does an actual entity have the 'charset' parameter or not.

It is not totally clear to me which one of these you are talking about.
In my opinion, it is highly preferable that all registrations
allow the 'charset' parameter, to avoid a patchwork. The draft
should contain some justification for this.

As for whether the actual entity should come with a 'charset'
parameter, we should also have a discussion of the various
issues in the draft.


>By the way, "The Standard Hex Format" uses UTF-8 only.
>
>http://www.ietf.org/internet-drafts/draft-strombergson-shf-00.txt

I have not found 'UTF-8' anywhere in this document, so I'm not
sure where you saw this restriction. The document contains:

 >>>>>>>>
9.1 Optional parameters

    none.

    There is no charset parameter. Character handling has identical
    semantics to the case where the charset parameter of the
    "application/xml" media type is omitted, as described in RFC3023 [4].
 >>>>>>>>

I think this should be fixed to reinstate the charset parameter
as discussed on this mailing list.


Another one is
http://www.ietf.org/internet-drafts/draft-sbml-media-type-02.txt.

This was approved by the IESG yesterday, so I guess it's too late
to try to change it. In practice, I very much hope that no implementation
will reject sbml data that comes with a redundant charset=utf-8.
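
To make that hope concrete, here is a minimal sketch of a receiver that
tolerates a redundant charset=utf-8.  It uses Python's standard
email.message module to parse the header.  The function name accepts, the
type name application/sbml+xml, and the policy of refusing any other
charset value are assumptions for illustration, not taken from the draft.

```python
from email.message import Message


def accepts(content_type, expected_type="application/sbml+xml"):
    """Accept the entity if charset is absent or redundantly says utf-8.

    Hypothetical policy: any other charset value is refused.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != expected_type:
        return False
    charset = msg.get_param("charset")  # None when the parameter is absent
    if charset is None:
        return True
    return isinstance(charset, str) and charset.lower() == "utf-8"


print(accepts("application/sbml+xml"))
print(accepts('application/sbml+xml; charset="UTF-8"'))
print(accepts("application/sbml+xml; charset=iso-8859-1"))
```

The point of the sketch is only that the first two cases are treated
alike, so a sender that always emits the parameter does no harm.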


Regards,    Martin.


Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBJHQaib059649 for <ietf-xml-mime-bks@above.proper.com>; Fri, 19 Dec 2003 09:26:36 -0800 (PST) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.10/8.12.9/Submit) id hBJHQaNM059648 for ietf-xml-mime-bks; Fri, 19 Dec 2003 09:26:36 -0800 (PST)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from www.markbaker.ca (static-80-155.dsl.cuic.ca [216.126.80.155] (may be forged)) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBJHQXib059642 for <ietf-xml-mime@imc.org>; Fri, 19 Dec 2003 09:26:34 -0800 (PST) (envelope-from mbaker@markbaker.ca)
Received: (from mbaker@localhost) by www.markbaker.ca (8.11.6/8.11.6) id hBJHOTZ31604; Fri, 19 Dec 2003 12:24:29 -0500
Date: Fri, 19 Dec 2003 12:24:29 -0500
From: Mark Baker <distobj@acm.org>
To: MURATA Makoto <murata@hokkaido.email.ne.jp>
Cc: Martin Duerst <duerst@w3.org>, Linus Walleij <triad@df.lth.se>, Chris Lilley <chris@w3.org>, ietf-types@iana.org, w3c-archive@w3.org, ietf-xml-mime@imc.org
Subject: Re: proposed media type registration: application/voicexml+xml
Message-ID: <20031219122429.L7952@www.markbaker.ca>
References: <1071764448.22957.11.camel@felicia> <4.2.0.58.J.20031218112549.00a93ad0@localhost> <20031220000621.1347.MURATA@hokkaido.email.ne.jp>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <20031220000621.1347.MURATA@hokkaido.email.ne.jp>; from murata@hokkaido.email.ne.jp on Sat, Dec 20, 2003 at 12:20:01AM +0900

On Sat, Dec 20, 2003 at 12:20:01AM +0900, MURATA Makoto wrote:
> However, even in this case, I
> am sympathetic to Martin's worry ("we start to get into a patchwork").
> In all other cases, I do not want to allow omission of the charset
> parameter.  How do others feel?

Do we know that all existing application/*+xml types define the charset
param by reference to 3023?  If so, then I agree;  I believe that the
benefit of this simplification outweighs the cost of (potential)
redundancy.

Mark.
-- 
Mark Baker.   Ottawa, Ontario, CANADA.        http://www.markbaker.ca


Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBJFNTib054858 for <ietf-xml-mime-bks@above.proper.com>; Fri, 19 Dec 2003 07:23:29 -0800 (PST) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.10/8.12.9/Submit) id hBJFNTN9054857 for ietf-xml-mime-bks; Fri, 19 Dec 2003 07:23:29 -0800 (PST)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from mail.asahi-net.or.jp (mail1.asahi-net.or.jp [202.224.39.197]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBJFNRib054841 for <ietf-xml-mime@imc.org>; Fri, 19 Dec 2003 07:23:27 -0800 (PST) (envelope-from murata@hokkaido.email.ne.jp)
Received: from [127.0.0.1] (j125245.ppp.asahi-net.or.jp [61.213.125.245]) by mail.asahi-net.or.jp (Postfix) with ESMTP id 049CA7861; Sat, 20 Dec 2003 00:23:21 +0900 (JST)
Date: Sat, 20 Dec 2003 00:20:01 +0900
From: MURATA Makoto <murata@hokkaido.email.ne.jp>
To: Martin Duerst <duerst@w3.org>
Subject: Re: proposed media type registration: application/voicexml+xml
Cc: Linus Walleij <triad@df.lth.se>, Chris Lilley <chris@w3.org>, ietf-types@iana.org, w3c-archive@w3.org, ietf-xml-mime@imc.org
In-Reply-To: <4.2.0.58.J.20031218112549.00a93ad0@localhost>
References: <1071764448.22957.11.camel@felicia> <4.2.0.58.J.20031218112549.00a93ad0@localhost>
Message-Id: <20031220000621.1347.MURATA@hokkaido.email.ne.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.06.02

On Thu, 18 Dec 2003 11:30:44 -0500
Martin Duerst <duerst@w3.org> wrote:

> There is some work starting on updating RFC 3023, and we hope that
> we can use it to document the issues and questions that have
> come up on the list. Chris has volunteered to coauthor, and I'll
> help from the sidelines. Probably having all the information together
> in one place, is better than having some of the information separated out.

As a co-author, I also agree to add some guidelines for the omission of the 
charset parameter.  Chris and I will start this work shortly.

When a media type uses UTF-8 only, I can agree to omit the charset
parameter. "UTF-8" is already a recommended policy
(http://www.w3.org/TR/charmod/#sec-UniqueEncoding).  I do not see strong 
reasons to specify charset="utf-8" always.  However, even in this case, I
am sympathetic to Martin's worry ("we start to get into a patchwork").
In all other cases, I do not want to allow omission of the charset
parameter.  How do others feel?

By the way, "The Standard Hex Format" uses UTF-8 only.

http://www.ietf.org/internet-drafts/draft-strombergson-shf-00.txt

Cheers,

-- 
MURATA Makoto <murata@hokkaido.email.ne.jp>




Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBIKurib036251 for <ietf-xml-mime-bks@above.proper.com>; Thu, 18 Dec 2003 12:56:53 -0800 (PST) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.10/8.12.9/Submit) id hBIKurk9036250 for ietf-xml-mime-bks; Thu, 18 Dec 2003 12:56:53 -0800 (PST)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from dr-nick.w3.org (dr-nick.w3.org [18.29.1.73]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBIKuqib036245 for <ietf-xml-mime@imc.org>; Thu, 18 Dec 2003 12:56:52 -0800 (PST) (envelope-from duerst@w3.org)
Received: from enoshima (homer.w3.org [18.29.0.30]) by dr-nick.w3.org (Postfix) with ESMTP id B5CE913642; Thu, 18 Dec 2003 15:56:53 -0500 (EST)
Message-Id: <4.2.0.58.J.20031218132824.07129d38@localhost>
X-Sender: duerst@localhost
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J 
Date: Thu, 18 Dec 2003 15:56:29 -0500
To: Max Froumentin <mf@w3.org>, Linus Walleij <triad@df.lth.se>
From: Martin Duerst <duerst@w3.org>
Subject: Re: proposed media type registration: application/voicexml+xml
Cc: Chris Lilley <chris@w3.org>, ietf-types@iana.org, w3c-archive@w3.org, ietf-xml-mime@imc.org
In-Reply-To: <878yladmbb.fsf@w3.org>
References: <1071764448.22957.11.camel@felicia> <4.2.0.58.J.20031216160942.077942a8@localhost> <4.2.0.58.J.20031216160942.077942a8@localhost> <4.2.0.58.J.20031216180134.06e36e20@localhost> <1071764448.22957.11.camel@felicia>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

At 19:15 03/12/18 +0100, Max Froumentin wrote:
>Linus Walleij <triad@df.lth.se> wrote:

> > OK so then I regard this as official W3.org policy on transport type.
>
>I expect that that policy would rather be set by the TAG. See:
>http://www.w3.org/2001/tag/2002/0129-mime
>
>"Thus there is no ambiguity when the charset is omitted, and the
>STRONGLY RECOMMENDED injunction [in RFC3023] to use the charset is
>misplaced for application/xml and for non-text "+xml"
>types.

I agree that we should change the 'strongly recommended' to a
more balanced wording and a more detailed discussion in
the upcoming RFC 3023 update.

I also just found out that there are apparently two findings
dealing with the above issue, the other one (still at draft
stage) at http://www.w3.org/2001/tag/doc/mime-respect-20031210.html
(diff at
http://www.w3.org/2001/tag/doc/mime-respect-20031210-diff.html).


>Consequently, for XML representations, server-side applications
>SHOULD only supply a charset header when there is complete certainty
>as to the encoding in use. Otherwise, an error will cause a perfectly
>usable representation to be rejected by an architecturally sound
>client."

I think that this is basically right, but highly overstated.
There are two points:

1) It implies that leaving off the charset parameter will always
    lead to perfectly correct documents.
    While there are certain cases where indeed leaving out the
    charset parameter will improve things, when the charset
    parameter is wrong, there are also cases where things will
    get worse, and there are cases where things are not affected.

2) The language used seems to be inappropriate for a specification,
    because specifications in general assume that people do what
    the spec says. If we filled our specs with notes saying
    that you shouldn't do this if you don't get it right, our specs
    would all be much longer and much more difficult to read.


In addition, note that the finding you cited continues as follows:

 >>>>>>>>
We recommend that section 7.1 of [RFC3023] be amended to something like the 
following:

     The use of the charset parameter, when the charset is reliably known 
and agrees with the encoding declaration, is RECOMMENDED, since this 
information can be used by non-XML processors to determine authoritatively 
the charset of the XML MIME entity.
 >>>>>>>>

So it does not look to me as if the TAG is recommending removing
the charset parameter from application/foo+xml registrations.
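
The wording quoted from the finding can be read as a simple sender-side
rule.  The sketch below is one possible reading, not text from the finding:
the function name content_type_for and its arguments are hypothetical, and
treating a missing encoding declaration as "no disagreement" is an
assumption.

```python
def content_type_for(media_type, known_charset=None, declared_encoding=None):
    """Build a Content-Type value, adding charset only when it is safe.

    known_charset     -- the charset the sender reliably knows, or None
    declared_encoding -- the encoding pseudo-attribute of the XML
                         declaration, or None if there is none
    """
    if known_charset is None:
        # Not reliably known: say nothing rather than risk a wrong label.
        return media_type
    if (declared_encoding is not None
            and declared_encoding.lower() != known_charset.lower()):
        # Header and document would disagree: omit the parameter.
        return media_type
    return f"{media_type}; charset={known_charset}"


print(content_type_for("application/voicexml+xml"))
print(content_type_for("application/voicexml+xml", "utf-8", "UTF-8"))
print(content_type_for("application/voicexml+xml", "utf-8", "iso-8859-1"))
```

Under this reading the parameter stays in the registration and is simply
left off individual entities where it cannot be trusted.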


Regards,     Martin.


Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBIGiLib023571 for <ietf-xml-mime-bks@above.proper.com>; Thu, 18 Dec 2003 08:44:21 -0800 (PST) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.10/8.12.9/Submit) id hBIGiL0R023570 for ietf-xml-mime-bks; Thu, 18 Dec 2003 08:44:21 -0800 (PST)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from dr-nick.w3.org (dr-nick.w3.org [18.29.1.73]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBIGiKib023565 for <ietf-xml-mime@imc.org>; Thu, 18 Dec 2003 08:44:20 -0800 (PST) (envelope-from duerst@w3.org)
Received: from enoshima (homer.w3.org [18.29.0.30]) by dr-nick.w3.org (Postfix) with ESMTP id DF13F13893; Thu, 18 Dec 2003 11:44:20 -0500 (EST)
Message-Id: <4.2.0.58.J.20031218112549.00a93ad0@localhost>
X-Sender: duerst@localhost
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J 
Date: Thu, 18 Dec 2003 11:30:44 -0500
To: Linus Walleij <triad@df.lth.se>
From: Martin Duerst <duerst@w3.org>
Subject: Re: proposed media type registration: application/voicexml+xml
Cc: Chris Lilley <chris@w3.org>, ietf-types@iana.org, w3c-archive@w3.org, ietf-xml-mime@imc.org
In-Reply-To: <1071764448.22957.11.camel@felicia>
References: <4.2.0.58.J.20031216180134.06e36e20@localhost> <4.2.0.58.J.20031216160942.077942a8@localhost> <4.2.0.58.J.20031216160942.077942a8@localhost> <4.2.0.58.J.20031216180134.06e36e20@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

At 17:20 03/12/18 +0100, Linus Walleij wrote:
>On Wed 2003-12-17 at 00:12, Martin Duerst wrote:
>
> > Sorry I forgot to mention it, but Chris and me actually had such a
> > 'meeting' last week, and the email I wrote was the result of this
> > (and a phone call with Max).
>
>OK so then I regard this as official W3.org policy on transport type.

You can regard it as whatever you want. As far as I understand, it
represents our best knowledge. But it is in no way 'official' as
far as W3C is concerned.


>If you three (Martin, Chris and Max) could write up an informational
>RFC on how you believe foo/bar+xml transport types should be handled
>we could sort out this issue once and for all. I think no one would
>object to a transport type policy for XML from three W3.org members.

There is some work starting on updating RFC 3023, and we hope that
we can use it to document the issues and questions that have
come up on the list. Chris has volunteered to coauthor, and I'll
help from the sidelines. Having all the information together in one
place is probably better than having some of the information separated out.

Regards,   Martin.


Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBHKiPib095399 for <ietf-xml-mime-bks@above.proper.com>; Wed, 17 Dec 2003 12:44:25 -0800 (PST) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.10/8.12.9/Submit) id hBHKiPsj095398 for ietf-xml-mime-bks; Wed, 17 Dec 2003 12:44:25 -0800 (PST)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from dr-nick.w3.org (dr-nick.w3.org [18.29.1.73]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBHKiNib095377 for <ietf-xml-mime@imc.org>; Wed, 17 Dec 2003 12:44:24 -0800 (PST) (envelope-from duerst@w3.org)
Received: from enoshima (homer.w3.org [18.29.0.30]) by dr-nick.w3.org (Postfix) with ESMTP id 3138D13495; Wed, 17 Dec 2003 15:44:25 -0500 (EST)
Message-Id: <4.2.0.58.J.20031217152847.077c2078@localhost>
X-Sender: duerst@localhost
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J 
Date: Wed, 17 Dec 2003 15:40:39 -0500
To: ben@morrow.me.uk, ietf-types@iana.org
From: Martin Duerst <duerst@w3.org>
Subject: Re: proposed media type registration: application/voicexml+xml
Cc: ietf-xml-mime@imc.org
In-Reply-To: <20031217181719.GA11430@mauzo.mauzo.dyndns.org>
References: <4.2.0.58.J.20031216160942.077942a8@localhost> <87ptewu3gc.fsf@w3.org> <4.2.0.58.J.20031216160942.077942a8@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

Hello Ben,

[Putting ietf-xml-mime@imc.org back on the cc list, because
I think quite a bit of this discussion may make its way into
the next version of RFC 3023 in one way or another.]

At 18:17 03/12/17 +0000, ben@morrow.me.uk wrote:
>At  4pm on 16/12/03 Martin Duerst wrote:
> > I just by chance realized that you had removed the 'charset'
> > parameter from the registration for application/voicexml+xml,
> > and also for application/ssml+xml. I have found something similar
> > in other recent registration proposals.

>The usual intent when omitting the 'charset' parameter from the
>registration is that the XML *must* be encoded in UTF8 or UTF16.

This is an interesting idea. Can you point to any actual
registrations where this is the case?


>If this is done, then the entity will not need to include a charset
>declaration in the body either, and will be universally understood
>everywhere.

Yes, this is true for application/foo+xml. For text/foo+xml,
it is not true, but then I hope nobody is talking about that anyway.


>I would suggest that those registrations which do not
>specify a charset be updated to state that encodings other than UTF8
>and UTF16 may not be used.

I'm not really sure that this helps. It would not work together
with any of the points I have brought up:

- Generic xml processors would still accept 'charset' parameters,
   even if the registration forbade it. They also would still
   accept content in other encodings (with quite some variation,
   of course), whether declared with a 'charset' parameter or
   with the encoding pseudo-attribute on an XML declaration.
- Technology such as JSP and databases would still produce
   'charset' parameters, even if the registration didn't allow it.

There may be cases where one wants a certain media type to be
restricted to UTF-8 and UTF-16 (or even in some cases only
one of them), but just saying
"no charset parameter on media type == UTF-8 or UTF-16 only"
doesn't really cut it.
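
To illustrate the first point, here is a rough sketch of how a generic
processor might pick the encoding of an application/foo+xml entity without
consulting the registration at all: a charset parameter wins if present,
otherwise the processor falls back to the byte order mark, then the
encoding declaration, then the XML default.  The function name
effective_encoding is hypothetical, and the detection is deliberately
simplified (no UTF-32, no EBCDIC, no UTF-16 without a byte order mark).

```python
import codecs
import re
from email.message import Message

# Matches the encoding pseudo-attribute in an ASCII-compatible XML declaration.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def effective_encoding(content_type, body):
    """Return (encoding, source) for an XML entity received as bytes."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    if isinstance(charset, str):
        # Honoured whether or not the registration defines the parameter.
        return charset.lower(), "charset parameter"
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8", "byte order mark"
    if body.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16", "byte order mark"
    match = _DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower(), "encoding declaration"
    return "utf-8", "XML default"


doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><vxml/>'
print(effective_encoding("application/voicexml+xml", doc))
print(effective_encoding("application/voicexml+xml; charset=utf-8", doc))
```

Nothing in this logic depends on the specific media type, which is why a
per-type rule about the charset parameter is hard for such code to honour.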


Regards,    Martin.




Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBGNJCib059331 for <ietf-xml-mime-bks@above.proper.com>; Tue, 16 Dec 2003 15:19:12 -0800 (PST) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.10/8.12.9/Submit) id hBGNJCw2059330 for ietf-xml-mime-bks; Tue, 16 Dec 2003 15:19:12 -0800 (PST)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from dr-nick.w3.org (dr-nick.w3.org [18.29.1.73]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBGNJCib059325 for <ietf-xml-mime@imc.org>; Tue, 16 Dec 2003 15:19:12 -0800 (PST) (envelope-from duerst@w3.org)
Received: from enoshima (homer.w3.org [18.29.0.30]) by dr-nick.w3.org (Postfix) with ESMTP id C5F0B138C0; Tue, 16 Dec 2003 18:19:06 -0500 (EST)
Message-Id: <4.2.0.58.J.20031216180134.06e36e20@localhost>
X-Sender: duerst@localhost
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J 
Date: Tue, 16 Dec 2003 18:12:56 -0500
To: Linus Walleij <triad@df.lth.se>, ietf-types@iana.org, ietf-xml-mime@imc.org
From: Martin Duerst <duerst@w3.org>
Subject: Re: proposed media type registration: application/voicexml+xml
Cc: Chris Lilley <chris@w3.org>, Max Froumentin <mf@w3.org>, w3c-archive@w3.org
In-Reply-To: <1071614605.15819.19.camel@felicia>
References: <4.2.0.58.J.20031216160942.077942a8@localhost> <4.2.0.58.J.20031216160942.077942a8@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

Hello Linus,

At 23:43 03/12/16 +0100, Linus Walleij wrote:
>On Tue 2003-12-16 at 22:32, Martin Duerst wrote:
>
> > I just by chance realized that you had removed the 'charset'
> > parameter from the registration for application/voicexml+xml,
> > and also for application/ssml+xml. I have found something similar
> > in other recent registration proposals.
>
>Now, wait a second here.
>
>Now there is Martin Duerst from W3.org telling us that we must
>indeed include a "charset" parameter with our foo/bar+xml
>MIME transport types.
>
>Then there is Chris Lilley of the same W3.org telling us *repeatedly*
>to remove it:

First, we are on an IETF list, which means that we are all acting
as concerned individuals, and that we are mainly interested in
getting the best possible solution.


>Now, I've been trusting Chris up til now, simply because he
>represents W3.org.
>
>You are of course welcome to run your internal disputes on the
>issue on these lists if you like, but when you're already at the
>same organizational body, could you *PLEASE* book an internal meeting
>and try to formulate a consensus from the W3.org side on how you
>think this thing should be handled? After all, you invented XML.

Sorry I forgot to mention it, but Chris and I actually had such a
'meeting' last week, and the email I wrote was the result of this
(and a phone call with Max).


Regards,      Martin.


Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBGLfWib055988 for <ietf-xml-mime-bks@above.proper.com>; Tue, 16 Dec 2003 13:41:32 -0800 (PST) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.10/8.12.9/Submit) id hBGLfW2q055987 for ietf-xml-mime-bks; Tue, 16 Dec 2003 13:41:32 -0800 (PST)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from mtl.alis.com (mtl.alis.com [199.84.165.71]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBGLfUib055981 for <ietf-xml-mime@imc.org>; Tue, 16 Dec 2003 13:41:31 -0800 (PST) (envelope-from FYergeau@alis.com)
Received: from alis-2k.alis.domain (alis-2k.alis.com [199.84.165.130]) by mtl.alis.com (8.12.8p2/8.12.8) with ESMTP id hBGLfQ1Z036188; Tue, 16 Dec 2003 16:41:26 -0500 (EST) (envelope-from FYergeau@alis.com)
Received: by alis-2k.alis.domain with Internet Mail Service (5.5.2653.19) id <WY59V946>; Tue, 16 Dec 2003 16:41:25 -0500
Message-ID: <F7D4BDA0E5A1D14B99D32C022AEB73660EB466@alis-2k.alis.domain>
From: Francois Yergeau <FYergeau@alis.com>
To: "'Martin Duerst'" <duerst@w3.org>, Max Froumentin <mf@w3.org>, ietf-types@iana.org
Cc: w3c-archive@w3.org, ietf-xml-mime@imc.org, Ben Kovitz <bkovitz@caltech.edu>, Linus Walleij <triad@df.lth.se>
Subject: RE: proposed media type registration: application/voicexml+xml
Date: Tue, 16 Dec 2003 16:41:19 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain; charset="ISO-8859-1"
X-Spam-Checker-Version: SpamAssassin 2.53 (1.174.2.15-2003-03-30-exp)
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by above.proper.com id hBGLfVib055983
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

Seconded, for all the reasons given by Martin. Please restore the charset
parameter.

Regards,

-- 
François

> -----Original Message-----
> From: Martin Duerst [mailto:duerst@w3.org]
> Sent: 16 December 2003 16:33
> To: Max Froumentin; ietf-types@iana.org
> Cc: w3c-archive@w3.org; ietf-xml-mime@imc.org; Ben Kovitz; Linus
> Walleij
> Subject: Re: proposed media type registration: application/voicexml+xml
> 
> 
> 
> Hello Max, others,
> 
> I just by chance realized that you had removed the 'charset'
> parameter from the registration for application/voicexml+xml,
> and also for application/ssml+xml. I have found something similar
> in other recent registration proposals.
> 
> Here is why I think this is really not a good idea:
> 
> First, we start to get into a patchwork where some types
> allow the 'charset', and others don't. Assuming that there
> is something like generic XML processors (e.g. parsers)
> (which is the whole point of XML), how should such a parser
> know whether the 'charset' parameter is allowed or not,
> and keep up with new registrations?
> 
> Second, there are various scenarios where a charset parameter
> in the header is automatically generated, or where it's much
> easier to generate it than to avoid it. The following are
> examples:
> - The classical example of transcoding (converting from one
>    encoding to another). Not very frequent these days on the
>    Web in general, but still used for Russian/Cyrillic encodings
>    and in some mobile phone scenarios.
> - Scripting technologies, for example JSP. With JSP, it is much
>    more straightforward to produce output with the right encoding
>    with a 'charset' parameter in the header than without. The
>    reason for this is that JSP can produce any kind of
>    output, not limited to XML, and has to know how to convert
>    from the Java-internal encoding to whatever is used on the
>    wire. Putting that information in the 'charset' parameter
>    in the header then is straightforward; anything else has
>    to be done by hand. Hopefully, excluding certain classes
>    of content production technologies is not what you want.
> - Databases that store content as characters rather than bytes,
>    in a single encoding (in many cases e.g. uniformly UTF-8),
>    and transcode on output. Again, if they use generic technology,
>    getting the 'charset' into the header is much more straightforward
>    than putting it into the body.
> 
> Given all these cases, I think it's not at all appropriate to
> remove the 'charset' parameter from the registration, because
> it would severely limit the use of technology that in good
> faith, and with good reasons, is using it.
> 
> In the long run, I think that an update to RFC 3023 should address
> these issues in more detail to help content producers understand the
> advantages and problems related to charset/encoding information.
> 
> Regards,    Martin.
> 
> 
> At 17:15 03/08/21 +0200, Max Froumentin wrote:
> 
> >Hi,
> >
> >Please consider the attached Internet Draft submission: "The
> >application/voicexml+xml Media Type" (originating from the Voice
> >Browser Working Group of the W3C), for review.
> >
> >Cheers,
> >
> >Max Froumentin, W3C
> 



Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBGLYKib055740 for <ietf-xml-mime-bks@above.proper.com>; Tue, 16 Dec 2003 13:34:20 -0800 (PST) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.10/8.12.9/Submit) id hBGLYKt3055739 for ietf-xml-mime-bks; Tue, 16 Dec 2003 13:34:20 -0800 (PST)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from dr-nick.w3.org (dr-nick.w3.org [18.29.1.73]) by above.proper.com (8.12.10/8.12.8) with ESMTP id hBGLYIib055732 for <ietf-xml-mime@imc.org>; Tue, 16 Dec 2003 13:34:18 -0800 (PST) (envelope-from duerst@w3.org)
Received: from enoshima (homer.w3.org [18.29.0.30]) by dr-nick.w3.org (Postfix) with ESMTP id 91BF613957; Tue, 16 Dec 2003 16:33:21 -0500 (EST)
Message-Id: <4.2.0.58.J.20031216160942.077942a8@localhost>
X-Sender: duerst@localhost
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58.J 
Date: Tue, 16 Dec 2003 16:32:59 -0500
To: Max Froumentin <mf@w3.org>, ietf-types@iana.org
From: Martin Duerst <duerst@w3.org>
Subject: Re: proposed media type registration: application/voicexml+xml
Cc: w3c-archive@w3.org, ietf-xml-mime@imc.org, Ben Kovitz <bkovitz@caltech.edu>, Linus Walleij <triad@df.lth.se>
In-Reply-To: <87ptewu3gc.fsf@w3.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

Hello Max, others,

I just by chance realized that you had removed the 'charset'
parameter from the registration for application/voicexml+xml,
and also for application/ssml+xml. I have found something similar
in other recent registration proposals.

Here is why I think this is really not a good idea:

First, we start to get into a patchwork where some types
allow the 'charset', and others don't. Assuming that there
is something like generic XML processors (e.g. parsers)
(which is the whole point of XML), how should such a parser
know whether the 'charset' parameter is allowed or not,
and keep up with new registrations?
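
As an illustration of this point (a minimal sketch, not part of the
original message): a generic processor would presumably recognize
XML types by the "+xml" suffix and read 'charset' the same way for
all of them, with no per-type table of which registrations allow
the parameter. The function name xml_charset is hypothetical; the
header parsing uses Python's standard email.message.Message.

```python
from email.message import Message


def xml_charset(content_type):
    """Return (is_xml, charset) for a Content-Type header value.

    Hypothetical helper: any type ending in "+xml", plus text/xml and
    application/xml, is treated as XML, and 'charset' is read uniformly.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    media_type = msg.get_content_type()  # lowercased "type/subtype"
    is_xml = (
        media_type in ("text/xml", "application/xml")
        or media_type.endswith("+xml")
    )
    # No lookup of individual registrations: the parameter is honoured
    # for every XML type, which only works if all of them allow it.
    charset = msg.get_param("charset") if is_xml else None
    return is_xml, charset
```

Under these assumptions the same code path handles
application/voicexml+xml, application/ssml+xml and any type
registered later.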

Second, there are various scenarios where a charset parameter
in the header is automatically generated, or where it's much
easier to generate it than to avoid it. The following are
examples:
- The classical example of transcoding (converting from one
   encoding to another). Not very frequent these days on the
   Web in general, but still used for Russian/Cyrillic encodings
   and in some mobile phone scenarios.
- Scripting technologies, for example JSP. With JSP, it is much
   more straightforward to produce output with the right encoding
   with a 'charset' parameter in the header than without. The
   reason for this is that JSP can produce any kind of
   output, not limited to XML, and has to know how to convert
   from the Java-internal encoding to whatever is used on the
   wire. Putting that information in the 'charset' parameter
   in the header then is straightforward; anything else has
   to be done by hand. Hopefully, excluding certain classes
   of content production technologies is not what you want.
- Databases that store content as characters rather than bytes,
   in a single encoding (in many cases e.g. uniformly UTF-8),
   and transcode on output. Again, if they use generic technology,
   getting the 'charset' into the header is much more straightforward
   than putting it into the body.
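
The scenarios above can be sketched roughly as follows (an
illustrative sketch only, not taken from any of the technologies
mentioned). A generic output layer holds content as characters,
encodes it for the wire, and labels the result in the header. The
function name serve and its parameters are hypothetical.

```python
def serve(text, media_type, wire_encoding):
    """Encode character content for the wire and label it in the header.

    Hypothetical generic output layer: it does not know or care whether
    the content is XML, so the header is the only place it can record
    the encoding it chose.
    """
    body = text.encode(wire_encoding)
    headers = {"Content-Type": f"{media_type}; charset={wire_encoding}"}
    return headers, body


# Usage: the same stored document sent in two different encodings.
doc = "<vxml><prompt>Привет</prompt></vxml>"
utf8_headers, utf8_body = serve(doc, "application/voicexml+xml", "utf-8")
koi8_headers, koi8_body = serve(doc, "application/voicexml+xml", "koi8-r")
```

The layer never inspects or rewrites the body, so any encoding
declaration inside the XML itself would have to be adjusted by hand,
which is the extra work described above.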

Given all these cases, I think it's not at all appropriate to
remove the 'charset' parameter from the registration, because
it would severely limit the use of technology that in good
faith, and with good reasons, is using it.

In the long run, I think that an update to RFC 3023 should address
these issues in more detail to help content producers understand the
advantages and problems related to charset/encoding information.

Regards,    Martin.


At 17:15 03/08/21 +0200, Max Froumentin wrote:

>Hi,
>
>Please consider the attached Internet Draft submission: "The
>application/voicexml+xml Media Type" (originating from the Voice
>Browser Working Group of the W3C), for review.
>
>Cheers,
>
>Max Froumentin, W3C


