
Received: by above.proper.com (8.11.6/8.11.3) id g1SJidE15600 for ietf-xml-mime-bks; Thu, 28 Feb 2002 11:44:39 -0800 (PST)
Received: from serrano.hesketh.net (serrano.hesketh.net [66.45.6.210]) by above.proper.com (8.11.6/8.11.3) with ESMTP id g1SJibi15595 for <ietf-xml-mime@imc.org>; Thu, 28 Feb 2002 11:44:38 -0800 (PST)
Received: from [192.168.124.14] (syr-24-24-11-230.twcny.rr.com [24.24.11.230]) by serrano.hesketh.net (8.11.6/8.11.3) with ESMTP id g1SJiY632256 for <ietf-xml-mime@imc.org>; Thu, 28 Feb 2002 14:44:34 -0500
X-Originating-IP: [66.45.6.210]
X-Spam-Filter: check_local@serrano.hesketh.net by digitalanswers.org
X-More-Information: http://spamfighter.hesketh.net
Subject: updated xmlns media feature
From: "Simon St.Laurent" <simonstl@simonstl.com>
To: ietf-xml-mime@imc.org
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
X-Mailer: Evolution/1.0.2 
Date: 28 Feb 2002 15:48:53 -0500
Message-Id: <1014929364.826.6.camel@localhost.localdomain>
Mime-Version: 1.0
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

A new Internet-Draft of Registration of xmlns Media Feature Tag is
available:
http://www.ietf.org/internet-drafts/draft-stlaurent-feature-xmlns-02.txt

This document specifies an xmlns Media Feature per RFC 2506 for
identifying some or all of the URIs defining XML namespaces in a
given XML resource, and the relative importance of these namespaces.
This feature is designed primarily for use with the XML Media Types
defined in RFC 3023, to provide additional hints as to the
processing requirements of a given XML resource.

This draft includes clarifications in its introduction, more explicit
notice that the order in which features are listed is unimportant, and a
new author (Ian Graham).

-- 
Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com



Received: from localhost (localhost [[UNIX: localhost]]) by above.proper.com (8.11.6/8.11.3) id g1L394426402 for ietf-xml-mime-bks; Wed, 20 Feb 2002 19:09:04 -0800 (PST)
Received: from ic-unix.ic.utoronto.ca (ic-unix.ic.utoronto.ca [142.150.64.2]) by above.proper.com (8.11.6/8.11.3) with ESMTP id g1L393326398 for <ietf-xml-mime@imc.org>; Wed, 20 Feb 2002 19:09:03 -0800 (PST)
Received: from localhost (igraham@localhost) by ic-unix.ic.utoronto.ca (8.9.3+Sun/8.9.1) with ESMTP id VAA02413; Wed, 20 Feb 2002 21:59:07 -0500 (EST)
Date: Wed, 20 Feb 2002 21:59:07 -0500 (EST)
From: Ian Graham <igraham@ic-unix.ic.utoronto.ca>
Reply-To: Ian Graham <ian.graham@utoronto.ca>
To: "Simon St.Laurent" <simonstl@simonstl.com>
cc: Mark Nottingham <mnot@mnot.net>, www-tag@w3.org, ietf-xml-mime@imc.org
Subject: Re: Revised Internet-Draft: Media Feature - xmlns
In-Reply-To: <1014172119.943.428.camel@localhost.localdomain>
Message-ID: <Pine.SOL.4.21.0202201851170.24607-100000@ic-unix.ic.utoronto.ca>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

I concur with Simon here.  As has been noted before, the root namespace is
often not the one you want to 'dispatch' based on.

I think it makes more sense to think of the namespace 'set' specified in a
MIME header as representing a 'processing signature hint' for the
data. But note that this hint is not unique for a document -- it can
vary depending on how you want the data to be processed. 

Looked at this way, the root namespace is no longer that important -- it's
just one potential namespace part of the overall signature. 
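
As a rough illustration of the difference between a root namespace and
a namespace 'set', here is a minimal Python sketch. The function names
and the example.org namespace URIs are invented for illustration; they
do not come from the draft, and the sketch says nothing about how the
set would be serialized in a MIME header.

```python
import xml.etree.ElementTree as ET

def namespace_of(name):
    """Return the URI from an ElementTree '{uri}local' name, or None."""
    if isinstance(name, str) and name.startswith("{"):
        return name[1:].split("}", 1)[0]
    return None

def namespace_signature(xml_text):
    """Return (root namespace, set of all element/attribute namespaces)."""
    root = ET.fromstring(xml_text)
    found = set()
    for elem in root.iter():
        # Check the element name and every attribute name.
        for name in [elem.tag, *elem.attrib]:
            uri = namespace_of(name)
            if uri:
                found.add(uri)
    return namespace_of(root.tag), found

# Hypothetical document: an envelope wrapping a purchase order.
DOC = """<env:Envelope xmlns:env="http://example.org/envelope">
  <env:Body>
    <po:order xmlns:po="http://example.org/purchase-order" po:id="42"/>
  </env:Body>
</env:Envelope>"""

root_ns, signature = namespace_signature(DOC)
print(root_ns)
print(sorted(signature))
```

In this example the root namespace identifies only the envelope, while
a processor that wants to dispatch on the purchase order needs the
second URI, which appears only in the full set.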

Hope this makes sense --

Ian

On 19 Feb 2002, Simon St.Laurent wrote:

> 
> On Tue, 2002-02-19 at 19:20, Mark Nottingham wrote:
> > Might I suggest that any revision of RFC3023 include a new parameter
> > for application/xml and text/xml (say, 'rootNS') that contains the
> > root element's namespace URI, to allow HTTP content negotiation with
> > current implementations? Yes, this won't address cases where there is
> > a need to negotiate on more than one namespace in the document, but
> > it will certainly help with the simple cases, where dispatch is based
> > upon the root element's namespace (which seems to be the direction
> > things are going in).
> 
> You're welcome to register a rootNS content feature.  
> 
> I have to admit that I don't understand or sympathize with the
> fascination with root element namespaces, and don't believe that RFC
> 3023 needs to add any XML-specific parameters, but a content feature
> should take care of what you're looking for.
>  
> -- 
> Simon St.Laurent
> Ring around the content, a pocket full of brackets
> Errors, errors, all fall down!
> http://simonstl.com
> 
> 




Received: by above.proper.com (8.11.6/8.11.3) id g1K3PZb16498 for ietf-xml-mime-bks; Tue, 19 Feb 2002 19:25:35 -0800 (PST)
Received: from mercury.ccil.org (mail@mercury.ccil.org [192.190.237.100]) by above.proper.com (8.11.6/8.11.3) with ESMTP id g1K3PY316493 for <ietf-xml-mime@imc.org>; Tue, 19 Feb 2002 19:25:34 -0800 (PST)
Received: from cowan by mercury.ccil.org with local (Exim 3.12 #1 (Debian)) id 16dNNL-0001z9-00; Tue, 19 Feb 2002 22:24:59 -0500
Subject: Re: Revised Internet-Draft: Media Feature - xmlns
In-Reply-To: <20020219170620.D27739@mnot.net> from Mark Nottingham at "Feb 19, 2002 05:06:28 pm"
To: Mark Nottingham <mnot@mnot.net>
Date: Tue, 19 Feb 2002 22:24:59 -0500 (EST)
CC: ned.freed@mrochek.com, "Simon St.Laurent" <simonstl@simonstl.com>, www-tag@w3.org, ietf-xml-mime@imc.org
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-Id: <E16dNNL-0001z9-00@mercury.ccil.org>
From: John Cowan <cowan@mercury.ccil.org>
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

Mark Nottingham scripsit:

> True. However, there are many formats (especially XML-based) which
> are neither vendor-specific nor 'personal'; instead, they represent
> loose consensus among a number of partners or other interested
> parties.

There is no reason why lightweight "vendors" cannot be created for
such situations.  A media type emerging on the xml-dev list,
for example, might very well have a name like
application/vnd.xml-dev.fubar+xml.
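
A small Python sketch of how such a name breaks down into top-level
type, registration tree and "+xml" suffix. The helper name
split_media_type is hypothetical, and the sketch only recognizes the
"vnd" and "prs" facets discussed in this thread; anything else is
reported as having no tree facet.

```python
def split_media_type(media_type):
    """Split 'type/subtype' into (type, tree facet, structured suffix)."""
    top, _, subtype = media_type.partition("/")
    base, plus, suffix = subtype.rpartition("+")
    if not plus:
        # No '+' in the subtype, so there is no structured suffix.
        base, suffix = subtype, None
    facet = base.split(".", 1)[0] if "." in base else None
    tree = facet if facet in ("vnd", "prs") else None
    return top, tree, suffix

print(split_media_type("application/vnd.xml-dev.fubar+xml"))
# ('application', 'vnd', 'xml')
print(split_media_type("text/xml"))
# ('text', None, None)
```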

-- 
John Cowan           http://www.ccil.org/~cowan              cowan@ccil.org
To say that Bilbo's breath was taken away is no description at all.  There
are no words left to express his staggerment, since Men changed the language
that they learned of elves in the days when all the world was wonderful.
        --_The Hobbit_


Received: by above.proper.com (8.11.6/8.11.3) id g1K1Ts614718 for ietf-xml-mime-bks; Tue, 19 Feb 2002 17:29:54 -0800 (PST)
Received: from mail.mnot.net (adsl-64-170-196-242.dsl.snfc21.pacbell.net [64.170.196.242]) by above.proper.com (8.11.6/8.11.3) with ESMTP id g1K1Tr314714 for <ietf-xml-mime@imc.org>; Tue, 19 Feb 2002 17:29:53 -0800 (PST)
Received: by mail.mnot.net (Postfix, from userid 500) id EA50F976D; Tue, 19 Feb 2002 17:29:56 -0800 (PST)
Date: Tue, 19 Feb 2002 17:29:56 -0800
From: Mark Nottingham <mnot@mnot.net>
To: "Simon St.Laurent" <simonstl@simonstl.com>
Cc: www-tag@w3.org, ietf-xml-mime@imc.org
Subject: Re: Revised Internet-Draft: Media Feature - xmlns
Message-ID: <20020219172949.E27739@mnot.net>
References: <1011891409.5317.341.camel@localhost.localdomain> <20020219162007.C27739@mnot.net> <1014172119.943.428.camel@localhost.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1014172119.943.428.camel@localhost.localdomain>
User-Agent: Mutt/1.3.23i
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

Is it possible to negotiate a content feature with current HTTP
implementations? AFAIK it isn't, but it would be possible to do so on
a media type attribute, using server-driven negotiation.

If I'm missing something, a reference would be very much appreciated.

Cheers,


On Tue, Feb 19, 2002 at 09:28:05PM -0500, Simon St.Laurent wrote:
> On Tue, 2002-02-19 at 19:20, Mark Nottingham wrote:
> > Might I suggest that any revision of RFC3023 include a new parameter
> > for application/xml and text/xml (say, 'rootNS') that contains the
> > root element's namespace URI, to allow HTTP content negotiation with
> > current implementations? Yes, this won't address cases where there is
> > a need to negotiate on more than one namespace in the document, but
> > it will certainly help with the simple cases, where dispatch is based
> > upon the root element's namespace (which seems to be the direction
> > things are going in).
> 
> You're welcome to register a rootNS content feature.  
> 
> I have to admit that I don't understand or sympathize with the
> fascination with root element namespaces, and don't believe that RFC
> 3023 needs to add any XML-specific parameters, but a content feature
> should take care of what you're looking for.
>  
> -- 
> Simon St.Laurent
> Ring around the content, a pocket full of brackets
> Errors, errors, all fall down!
> http://simonstl.com
> 

-- 
Mark Nottingham
http://www.mnot.net/
 


Received: from localhost (localhost [[UNIX: localhost]]) by above.proper.com (8.11.6/8.11.3) id g1K1O4T14636 for ietf-xml-mime-bks; Tue, 19 Feb 2002 17:24:04 -0800 (PST)
Received: from serrano.hesketh.net (serrano.hesketh.net [66.45.6.210]) by above.proper.com (8.11.6/8.11.3) with ESMTP id g1K1O2314632 for <ietf-xml-mime@imc.org>; Tue, 19 Feb 2002 17:24:02 -0800 (PST)
Received: from [192.168.124.14] (syr-24-24-11-230.twcny.rr.com [24.24.11.230]) by serrano.hesketh.net (8.11.6/8.11.3) with ESMTP id g1K1NtQ16160; Tue, 19 Feb 2002 20:23:55 -0500
X-Spam-Filter: check_local@serrano.hesketh.net by digitalanswers.org
X-More-Information: http://spamfighter.hesketh.net
Subject: Re: Revised Internet-Draft: Media Feature - xmlns
From: "Simon St.Laurent" <simonstl@simonstl.com>
To: Mark Nottingham <mnot@mnot.net>
Cc: www-tag@w3.org, ietf-xml-mime@imc.org
In-Reply-To: <20020219162007.C27739@mnot.net>
References: <1011891409.5317.341.camel@localhost.localdomain>  <20020219162007.C27739@mnot.net>
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
X-Mailer: Evolution/1.0.1 
Date: 19 Feb 2002 21:28:05 -0500
Message-Id: <1014172119.943.428.camel@localhost.localdomain>
Mime-Version: 1.0
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

On Tue, 2002-02-19 at 19:20, Mark Nottingham wrote:
> Might I suggest that any revision of RFC3023 include a new parameter
> for application/xml and text/xml (say, 'rootNS') that contains the
> root element's namespace URI, to allow HTTP content negotiation with
> current implementations? Yes, this won't address cases where there is
> a need to negotiate on more than one namespace in the document, but
> it will certainly help with the simple cases, where dispatch is based
> upon the root element's namespace (which seems to be the direction
> things are going in).

You're welcome to register a rootNS content feature.  

I have to admit that I don't understand or sympathize with the
fascination with root element namespaces, and don't believe that RFC
3023 needs to add any XML-specific parameters, but a content feature
should take care of what you're looking for.
 
-- 
Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com



Received: from localhost (localhost [[UNIX: localhost]]) by above.proper.com (8.11.6/8.11.3) id g1K1K7I14574 for ietf-xml-mime-bks; Tue, 19 Feb 2002 17:20:07 -0800 (PST)
Received: from mauve.mrochek.com (mauve.mrochek.com [209.55.107.55]) by above.proper.com (8.11.6/8.11.3) with ESMTP id g1K1K6314570 for <ietf-xml-mime@imc.org>; Tue, 19 Feb 2002 17:20:06 -0800 (PST)
Received: from mauve.mrochek.com by mauve.mrochek.com (PMDF V6.1-1 #35243) id <01KEGS39PZ4G003WI0@mauve.mrochek.com> (original mail from NED@mauve.mrochek.com) for ietf-xml-mime@imc.org; Tue, 19 Feb 2002 17:20:07 -0800 (PST)
Date: Tue, 19 Feb 2002 17:07:48 -0800 (PST)
From: ned+xml-mime@mrochek.com
Subject: Re: Revised Internet-Draft: Media Feature - xmlns
In-reply-to: "Your message dated Tue, 19 Feb 2002 17:06:28 -0800" <20020219170620.D27739@mnot.net>
To: Mark Nottingham <mnot@mnot.net>
Cc: ned.freed@mrochek.com, "Simon St.Laurent" <simonstl@simonstl.com>, www-tag@w3.org, ietf-xml-mime@imc.org
Message-id: <01KEGSIZZCL2003WI0@mauve.mrochek.com>
MIME-version: 1.0
Content-type: TEXT/PLAIN; CHARSET=us-ascii
Content-transfer-encoding: 7BIT
References: <1011891409.5317.341.camel@localhost.localdomain> <1011891409.5317.341.camel@localhost.localdomain> <01KEGRBV6QUA003WI0@mauve.mrochek.com> <01KEGRBV6QUA003WI0@mauve.mrochek.com>
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

> True. However, there are many formats (especially XML-based) which
> are neither vendor-specific nor 'personal'; instead, they represent
> loose consensus among a number of partners or other interested
> parties.

So? This sort of stuff is registered under vnd and prs all the time.

> These probably fall most accurately under prs, but there are some
> (human) implications to 'personal' that seem to make people avoid it
> for these uses. Yes, this is largely psychological.

I'd have to say "delusional" is closer to the mark. I would be the first to
agree that the names "vendor" and "personal" are far from perfect names. But we
had to pick something, this is what we came up with, and at the time nobody
was able to suggest anything better.

Regardless, the bottom line is all the facet name does is tie the collection to
a particular set of registration rules. Fretting about name perception is a
case of the perfect being the enemy of the good enough.

> Additionally, there is still a need to have a one-to-one mapping
> between media types and namespaces when so desired by the
> application. The most effective way to do this IMHO is to include the
> namespace URI in the media type as a parameter, rather than forcing
> translation between a registered type and a URI (Yes, I'm aware that
> there is reasoning behind the cost of registering a media type).

Perhaps. However, I was merely pointing out that your assertion that RFC
publication is needed to register a type is just not true.

				Ned


Received: from localhost (localhost [[UNIX: localhost]]) by above.proper.com (8.11.6/8.11.3) id g1K16QS14360 for ietf-xml-mime-bks; Tue, 19 Feb 2002 17:06:26 -0800 (PST)
Received: from mail.mnot.net (adsl-64-170-196-242.dsl.snfc21.pacbell.net [64.170.196.242]) by above.proper.com (8.11.6/8.11.3) with ESMTP id g1K16O314356 for <ietf-xml-mime@imc.org>; Tue, 19 Feb 2002 17:06:24 -0800 (PST)
Received: by mail.mnot.net (Postfix, from userid 500) id 3A1B4976D; Tue, 19 Feb 2002 17:06:28 -0800 (PST)
Date: Tue, 19 Feb 2002 17:06:28 -0800
From: Mark Nottingham <mnot@mnot.net>
To: ned.freed@mrochek.com
Cc: "Simon St.Laurent" <simonstl@simonstl.com>, www-tag@w3.org, ietf-xml-mime@imc.org
Subject: Re: Revised Internet-Draft: Media Feature - xmlns
Message-ID: <20020219170620.D27739@mnot.net>
References: <1011891409.5317.341.camel@localhost.localdomain> <1011891409.5317.341.camel@localhost.localdomain> <01KEGRBV6QUA003WI0@mauve.mrochek.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <01KEGRBV6QUA003WI0@mauve.mrochek.com>
User-Agent: Mutt/1.3.23i
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

True. However, there are many formats (especially XML-based) which
are neither vendor-specific nor 'personal'; instead, they represent
loose consensus among a number of partners or other interested
parties.

These probably fall most accurately under prs, but there are some
(human) implications to 'personal' that seem to make people avoid it
for these uses. Yes, this is largely psychological.

Additionally, there is still a need to have a one-to-one mapping
between media types and namespaces when so desired by the
application. The most effective way to do this IMHO is to include the
namespace URI in the media type as a parameter, rather than forcing
translation between a registered type and a URI (Yes, I'm aware that
there is reasoning behind the cost of registering a media type).

Cheers,


On Tue, Feb 19, 2002 at 04:43:33PM -0800, ned.freed@mrochek.com wrote:
> > This seems at odds with XML's easy extensibility and the low
> > overhead of using XML namespaces; to do conneg on any format that I
> > create, I have to write an I-D, get it approved as Informational, and
> > then register it with IANA.
> 
> You need do nothing of the sort. RFC publication and IESG approval are only
> required for things in the IETF tree. Vendor and personal tree entries are as
> simple as filling out a Web form at IANA:
> 
>    http://www.iana.org/cgi-bin/mediatypes.pl
> 
> The recent shift from isi.edu to iana.org has caused a lot of delay in
> registration approvals, but that should be done by now.
> 
> 				Ned

-- 
Mark Nottingham
http://www.mnot.net/
 


Received: by above.proper.com (8.11.6/8.11.3) id g1K0k7Q14147 for ietf-xml-mime-bks; Tue, 19 Feb 2002 16:46:07 -0800 (PST)
Received: from mauve.mrochek.com (mauve.mrochek.com [209.55.107.55]) by above.proper.com (8.11.6/8.11.3) with ESMTP id g1K0k6314143 for <ietf-xml-mime@imc.org>; Tue, 19 Feb 2002 16:46:06 -0800 (PST)
Received: from mauve.mrochek.com by mauve.mrochek.com (PMDF V6.1-1 #35243) id <01KEGMXTHAAO003WI0@mauve.mrochek.com> (original mail from NED@mauve.mrochek.com) for ietf-xml-mime@imc.org; Tue, 19 Feb 2002 16:46:07 -0800 (PST)
Date: Tue, 19 Feb 2002 16:43:33 -0800 (PST)
From: ned+xml-mime@mrochek.com
Subject: Re: Revised Internet-Draft: Media Feature - xmlns
In-reply-to: "Your message dated Tue, 19 Feb 2002 16:20:07 -0800" <20020219162007.C27739@mnot.net>
To: Mark Nottingham <mnot@mnot.net>
Cc: "Simon St.Laurent" <simonstl@simonstl.com>, www-tag@w3.org, ietf-xml-mime@imc.org
Message-id: <01KEGRBV6QUA003WI0@mauve.mrochek.com>
MIME-version: 1.0
Content-type: TEXT/PLAIN; CHARSET=us-ascii
Content-transfer-encoding: 7BIT
References: <1011891409.5317.341.camel@localhost.localdomain> <1011891409.5317.341.camel@localhost.localdomain>
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

> This seems at odds with XML's easy extensibility and the low
> overhead of using XML namespaces; to do conneg on any format that I
> create, I have to write an I-D, get it approved as Informational, and
> then register it with IANA.

You need do nothing of the sort. RFC publication and IESG approval are only
required for things in the IETF tree. Vendor and personal tree entries are as
simple as filling out a Web form at IANA:

   http://www.iana.org/cgi-bin/mediatypes.pl

The recent shift from isi.edu to iana.org has caused a lot of delay in
registration approvals, but that should be done by now.

				Ned


Received: from localhost (localhost [[UNIX: localhost]]) by above.proper.com (8.11.6/8.11.3) id g1K0K6Q13722 for ietf-xml-mime-bks; Tue, 19 Feb 2002 16:20:06 -0800 (PST)
Received: from mail.mnot.net (adsl-64-170-196-242.dsl.snfc21.pacbell.net [64.170.196.242]) by above.proper.com (8.11.6/8.11.3) with ESMTP id g1K0K4313718 for <ietf-xml-mime@imc.org>; Tue, 19 Feb 2002 16:20:04 -0800 (PST)
Received: by mail.mnot.net (Postfix, from userid 500) id B3A49976D; Tue, 19 Feb 2002 16:20:07 -0800 (PST)
Date: Tue, 19 Feb 2002 16:20:07 -0800
From: Mark Nottingham <mnot@mnot.net>
To: "Simon St.Laurent" <simonstl@simonstl.com>
Cc: www-tag@w3.org, ietf-xml-mime@imc.org
Subject: Re: Revised Internet-Draft: Media Feature - xmlns
Message-ID: <20020219162007.C27739@mnot.net>
References: <1011891409.5317.341.camel@localhost.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1011891409.5317.341.camel@localhost.localdomain>
User-Agent: Mutt/1.3.23i
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

I like Simon's proposal [1], with one important caveat:

  It's currently difficult/impossible to do HTTP content negotiation
  on an XML-based format unless you define a media type for it.

One could arrange to do negotiation on Content-Features, but AFAIK
this isn't defined for HTTP, and certainly not implemented.

This seems at odds with XML's easy extensibility and the low
overhead of using XML namespaces; to do conneg on any format that I
create, I have to write an I-D, get it approved as Informational, and
then register it with IANA.

Might I suggest that any revision of RFC3023 include a new parameter
for application/xml and text/xml (say, 'rootNS') that contains the
root element's namespace URI, to allow HTTP content negotiation with
current implementations? Yes, this won't address cases where there is
a need to negotiate on more than one namespace in the document, but
it will certainly help with the simple cases, where dispatch is based
upon the root element's namespace (which seems to be the direction
things are going in).
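
To make the suggestion concrete, here is a minimal Python sketch of a
sender deriving such a parameter. Note that 'rootNS' is only the name
proposed above; it is not a registered parameter of application/xml or
text/xml, and the function name and example URI are hypothetical.

```python
import xml.etree.ElementTree as ET

def content_type_with_root_ns(xml_text, base_type="application/xml"):
    """Build a Content-Type value carrying the root element's namespace."""
    root = ET.fromstring(xml_text)
    if root.tag.startswith("{"):
        # ElementTree spells qualified names as '{uri}local'.
        uri = root.tag[1:].split("}", 1)[0]
        # The URI contains ':' and '/', so the value must be quoted.
        return f'{base_type}; rootNS="{uri}"'
    return base_type

print(content_type_with_root_ns('<doc xmlns="http://example.org/ns"/>'))
# application/xml; rootNS="http://example.org/ns"
```

A document whose root element is in no namespace simply gets the bare
media type back.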

Cheers,


[1] http://www.ietf.org/internet-drafts/draft-stlaurent-feature-xmlns-01.txt


-- 
Mark Nottingham
http://www.mnot.net/
 


Received: from localhost (localhost [[UNIX: localhost]]) by above.proper.com (8.11.6/8.11.3) id g119jfA15458 for ietf-xml-mime-bks; Fri, 1 Feb 2002 01:45:41 -0800 (PST)
Received: from virginia.yamato.ibm.co.jp (virginia.yamato.ibm.co.jp [203.141.89.165]) by above.proper.com (8.11.6/8.11.3) with ESMTP id g119jc315454 for <ietf-xml-mime@imc.org>; Fri, 1 Feb 2002 01:45:39 -0800 (PST)
Received: from ns.trl.ibm.com (ns.trl.ibm.com [9.116.48.18]) by virginia.yamato.ibm.co.jp (8.11.6/3.7W/GW3.3) with ESMTP id g119is119172; Fri, 1 Feb 2002 18:44:54 +0900
Received: from localhost by ns.trl.ibm.com (AIX4.3/8.9.3/TRL4.5SRV) id SAA18334; Fri, 1 Feb 2002 18:44:54 +0900
Date: Fri, 01 Feb 2002 18:40:15 +0900 (JST)
Message-Id: <20020201.184015.27296549.mmurata@trl.ibm.com>
To: ietf-xml-mime@imc.org, www-i18n-comments@w3.org, www-tag@w3.org
Subject: Re: [nsMediaType-3] Principles and corner cases
From: MURATA Makoto <mmurata@trl.ibm.co.jp>
X-Mailer: Mew version 2.1 on Emacs 20.4 / Mule 4.1 (AOI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

In [1], Tim wrote:
> - there is the whole issue of the charset header.

0. Introduction

There has been a lot of discussion about the encoding issue.  Rather
than repeating it, we have to begin with a better understanding of the
current situation.  Here is my attempt to clarify the current (messy)
status.  Although this memo is still incomplete, I hope this has some
points.


1. Textual resources

Many types of WWW resources are textual.  Since many charsets are 
in use, we have to determine the charset of each textual resource 
so as to handle it correctly.

1.1 Documents

XML, HTML, and CSS of W3C have textual representations.  Plain text 
is certainly textual.

1.2 Programs

Source programs are textual and written in some encoding.  When we
transmit source programs or compile/execute them on the fly, encoding
issues will arise.

- VBScript received from the server, 
- Javascript received from the server, 
- JSP source files at the server,
- perl programs at the server, etc.

1.3 Generation of documents by programs

On the WWW server, programs generate documents on the fly.  These
programs have to specify the encoding for such documents.  The APIs of
most programming languages allow charset specification.

Furthermore, programs have to embed an in-band signature when one is
necessary.  Most APIs and programming languages do not provide any
support for this.  Rather, programmers have to use "print" carefully so
as to create in-band signatures (e.g., meta tags).

- CGI programs,
- Servlets,
- Applets,
- XSLT stylesheets


1.4 Form data

Finally, text typed in forms of HTML and sent as multipart/form-data
via HTTP also requires encoding information.

- text typed in <textarea> of HTML,
- text typed in <input type="text"> of HTML, and
- file uploaded by <input type="file"> of HTML.


2. Current situation

There are already too many methods for determining the encoding.
The list below enumerates these methods; the subsections that follow
show which methods are used for which type of resource.

A: the charset parameter of MIME entities

B: in-band declaration (META tags of HTML, @charset of CSS, 
   encoding declarations of XML),

C: the charset attribute of the referring element (XML or HTML)
  in the referring resource,

D: the charset of the referring resource (typically HTML), 

E: the charset of the HTML document containing the <input> or 
   <textarea> element,

F: guessing based on bit patterns

G: configuration files

H: Manual intervention


2.1 XML documents received from the HTTP server 

A: the charset parameter of media types such as 
   text/xml and application/xml

B: the encoding declaration in XML documents

Note: RFC 3023 certainly says A > B.
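
The precedence of A over B can be sketched in Python as follows. This
is a simplified illustration, not a conforming implementation: the
function name is hypothetical, the regular expression only recognizes
an encoding declaration in ASCII-compatible encodings, and no attempt
is made to model the defaults that apply when neither A nor B is
present.

```python
import re

# Matches the encoding pseudo-attribute of a leading XML declaration.
ENCODING_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def xml_charset(content_type, body):
    """Return (charset, method) where method is 'A' or 'B' as listed above."""
    # A: the charset parameter of the media type wins when present.
    for param in content_type.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            return value.strip().strip('"'), "A"
    # B: otherwise fall back to the in-band encoding declaration.
    match = ENCODING_DECL.match(body)
    if match:
        return match.group(1).decode("ascii"), "B"
    return None, None

doc = b'<?xml version="1.0" encoding="Shift_JIS"?><a/>'
print(xml_charset("text/xml; charset=utf-8", doc))   # ('utf-8', 'A')
print(xml_charset("application/xml", doc))           # ('Shift_JIS', 'B')
```

The first call shows the awkward case: the header and the declaration
disagree, and the header is taken as authoritative.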


2.2 HTML documents received from the HTTP server 

A: the charset parameter of the media type
   text/html

B: Meta tags

F: Some browsers sniff the charset.

H: the menu for choosing the encoding

Note: The HTML 4.01 recommendation blesses both A and B, but 
RFC 2854 (text/html) strongly recommends A only.  However, 
RFC 2854 itself refers to HTML 4.


2.3 CSS stylesheets received from the HTTP server 

A: the charset parameter of media types such as 
   text/css

B: @charset in CSS stylesheets

C: the attribute "charset" of LINK elements of HTML 4.01; 
   the charset attribute of the stylesheet-linking PI.

F: Some browsers sniff the charset.

Note: The CSS recommendation blesses both A and B,
but RFC 2318 (text/css) merely mentions A.


2.4 XSLT stylesheets received from the HTTP server 

A: the charset parameter of media types such as 
   text/xml and application/xml

B: the encoding declaration in XML documents

C: the charset attribute of the stylesheet-linking PI

Note: Use of C is incorrect.


2.5 plain text received from the HTTP server 

A: the charset parameter of the media type text/plain

F: Some browsers sniff the charset.

H: the menu for choosing the encoding


2.6 XML documents which are stored at the server but have 
    not been transmitted to the client yet

B: the encoding declaration in XML documents

G: Apache provides the directive AddCharset for configuration 
   files.


2.7 HTML documents which are stored at the server but have 
    not been transmitted to the client yet

B: META tags in this document

C: This document may be referenced by some anchor elements 
   of HTML 4.01, which specify the charset attribute.

G: Apache provides the directive AddCharset for configuration 
   files.


2.8 An HTML document that is generated at the server on the fly 
    but has not been transmitted to the client yet

B: META tags specified in this document

Note: Generating programs typically specify the charset 
     *TWICE*: once for the encoding of the output 
     stream and once for generating meta tags.
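
One way a generating program can keep the two specifications from
drifting apart is to derive both from a single setting. The following
Python sketch assumes a hypothetical render_html helper; it does no
HTML escaping and is only meant to show the charset being used in both
places.

```python
def render_html(title, body_text, charset="Shift_JIS"):
    """Return (Content-Type value, encoded body) from one charset setting."""
    html = (
        "<html><head>"
        # In-band signature: the META tag names the charset ...
        f'<meta http-equiv="Content-Type" '
        f'content="text/html; charset={charset}">'
        f"<title>{title}</title></head>"
        f"<body><p>{body_text}</p></body></html>"
    )
    # ... and the same value drives the header and the output encoding.
    return f"text/html; charset={charset}", html.encode(charset)

header, body = render_html("Test", "Hello", charset="Shift_JIS")
print(header)
```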


2.9 An HTML document temporarily created at the client by XSLT

B: META tags specified in this document

Note: The encoding parameter of xsl:output can specify the charset.
      Moreover, when the output method is HTML, this parameter 
      also generates an appropriate META tag.


2.10 CSS stylesheets which are stored at the server but have 
     not been transmitted to the client yet

B: @charset in CSS stylesheets

C: the attribute "charset" of LINK elements of HTML 4.01; 
   the charset attribute of the stylesheet-linking PI.

G: Apache provides the directive AddCharset for configuration 
   files.

2.11 XSLT stylesheets which are stored at the server but have 
     not been transmitted to the client yet

B: the encoding declaration in XSLT stylesheets

C: the charset attribute of the stylesheet-linking PI.

G: Apache provides the directive AddCharset for configuration 
   files.

Note: Use of C is incorrect.


2.12 plain text which is stored at the server but has not been
     transmitted to the client yet

G: Apache provides the directive AddCharset for configuration 
   files.


2.13 text typed in <textarea> or <input type="text"> of HTML and 
   transmitted via HTTP

A: Each part of a multipart/form-data should have the charset parameter.

E: As the charset of such text, browsers typically use the charset 
   of the HTML page.

Note: Unfortunately, the charset parameter for parts of multipart/form-data 
      is not widely implemented.
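
For reference, a part that does carry the charset parameter would look
roughly like what the following Python sketch produces. The helper
name and the field name are hypothetical, and the sketch builds only
one part; the boundary lines and the enclosing multipart/form-data
header are omitted.

```python
def form_data_part(name, text, charset="utf-8"):
    """Build the headers and body of one text part, labelled with its charset."""
    headers = (
        f'Content-Disposition: form-data; name="{name}"\r\n'
        f"Content-Type: text/plain; charset={charset}\r\n"
        "\r\n"
    )
    # Headers are ASCII; the body is encoded in the declared charset.
    return headers.encode("ascii") + text.encode(charset)

part = form_data_part("comment", "hello", charset="Shift_JIS")
print(part.decode("ascii"))
```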


2.14 file uploaded by <input type="file"> of HTML

A: Each part of a multipart/form-data should have the charset parameter.

Note: Unfortunately, the charset parameter for parts of multipart/form-data 
      is not widely implemented.


2.15 Javascript, VBScript, etc. received from the HTTP server 

B: Script elements of HTML 4.01 provide the charset attribute.

D: the charset of the referring resource (typically HTML) 

F: Some browsers sniff the charset.

Note 1: Since there are no media types for such programming languages, 
        the charset parameter is not available.

Note 2: Since scripts in such programming languages contain 
        many ASCII characters and a small number of non-ASCII 
        characters, guessing almost always fails.

Note 3: The referring resource may be an HTML document 
        temporarily created by XSLT at the client side.  
        Even when users create everything in Shift_JIS, the
        client-side XSLT transformation creates UTF-16 HTML
        documents, and the referenced Javascript is then assumed
        to be UTF-16.


2.16 E-mail sent via SMTP 

A: the charset parameter of MIME entities,

F: content sniffing

Note: The encoding of E-mail received by and stored at the 
      SMTP client is up to the mail program.

2.17 JSP pages

G: The pageEncoding attribute of the page directive of JSP 1.2.


3. Misc

3.1 Database

Typically, web servers are front ends for database systems.  
Encoding issues will arise especially because legacy data 
are in legacy encodings.


3.2 Content negotiation

We also have to consider content negotiation issues.  If 
configuration of the charset parameter is difficult, the 
same thing applies to configuration for negotiation.

- charset negotiation, 
- language negotiation,
- media type negotiation,
- CONNEG


4. Concluding remarks

Unfortunately, the encoding issue is complicated, inconsistent, and
incomprehensible.  Furthermore, different patch levels of WWW browsers
behave slightly differently.  As a result, it is extremely difficult
to internationalize Web applications.  Many WWW developers in Japan
suffer.

I agree that we have to change the current situation.  However, I also
think that we can easily make the situation worse with shortsighted
"improvements".  I believe that we strongly need a long-term plan.

In my understanding, I18N people at IETF and the I18N WG have always
believed in authoritative use of the charset parameter.  I believe that a
long-term solution is to design an XML-based language for WWW server
configuration and to reference such configuration files from all
WWW technologies, and that we should avoid ad-hoc solutions wherever
possible.  I feel that further promotion of meta tags, @charset,
and encoding declarations merely makes the situation worse.

P.S.  I don't know which mailing list or working group is best 
for this discussion.  Probably, the I18N WG of W3C?

Cheers,

IBM Tokyo Research Lab / International University of Japan, Research Institute

MURATA Makoto (FAMILY Given)
-----------------------------------------------------------------------
[1] http://lists.w3.org/Archives/Public/www-tag/2002Jan/0177.html

