
From: rousskov@measurement-factory.com (Alex Rousskov)
Date: Wed, 31 Dec 2003 08:40:21 -0700 (MST)
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <3FF28BC7.9060405@gmx.de>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FF0A5C5.6090800@gmx.de> <20031230105949.GH76219@finch-staff-1.thus.net> <20031231005501.31892d3b.henrik@levkowetz.com> <3FF28BC7.9060405@gmx.de>
Message-ID: <Pine.BSF.4.53.0312310835480.77494@measurement-factory.com>

On Wed, 31 Dec 2003, Julian Reschke wrote:

> Henrik Levkowetz wrote:
>
> > Good exposition.
> >
> > However, the special character option is starting to look like a rathole
> > to me. What if somebody actually needs to use this character without
> > implying the additional semantics we've added to it?
>
> In IETF documents, this can't happen by definition (because they must
> consist solely of ASCII characters).

True. However, IETF requirements might change and there are people
using xml2rfc for non-IETF documents. While we should not spend many
cycles accommodating these two exceptional scenarios, we also should not
alienate them if there are good alternatives. It's just "wrong", IMO,
to overload character meaning when we have the power of XML to express
what we want.

Alex.


From: rousskov@measurement-factory.com (Alex Rousskov)
Date: Wed, 31 Dec 2003 08:35:21 -0700 (MST)
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <3FF28BFE.4040807@gmx.de>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FF0A5C5.6090800@gmx.de> <20031230105949.GH76219@finch-staff-1.thus.net> <20031231005501.31892d3b.henrik@levkowetz.com> <Pine.BSF.4.53.0312301721390.20917@measurement-factory.com> <3FF28BFE.4040807@gmx.de>
Message-ID: <Pine.BSF.4.53.0312310832400.77494@measurement-factory.com>

On Wed, 31 Dec 2003, Julian Reschke wrote:

> Alex Rousskov wrote:
> > I agree that "overloading" special characters is a bad idea.
> >
> > If "i.e.<nul />" looks too ugly, we can have an xml2rfc-specific
> > entity to encode non-sentence-ending-period or non-sentence-ending
> > space. For example,
> >
> > 	i.e.&nse;
> > or
> > 	i.e&nsep;
> > instead of
> > 	i.e.<nul />
> >
> > The first option probably looks better than others (IMO), but they are
> > all ugly.
>
> And that entity would expand to what...?

It does not really matter from the user's point of view, but I am
guessing it would expand to an element or PI with some meaningful name
(not an xml2rfc-special character).

Alex.


From: GK@ninebynine.org (Graham Klyne)
Date: Wed, 31 Dec 2003 11:26:22 +0000
Subject: [xml2rfc] Appendix subsections
In-Reply-To: <3FF14EE2.1060000@gmx.de>
References: <p0610070ebc162f837404@[10.0.2.4]> <p0610070ebc162f837404@[10.0.2.4]>
Message-ID: <5.1.0.14.2.20031231112456.0256fe98@127.0.0.1>

At 11:09 30/12/03 +0100, Julian Reschke wrote:
>>Right now, 2629bis says that appendix is identical to section, but that 
>>it can contain any number of appendix elements. Doing so (using the 
>>current web tool) produces subsections named "Appendix A.1", "Appendix 
>>A.2.3", etc.
> > ...
>
>As far as I understand the appendix element is work-in-progress (waiting 
>for the RFC Editor to clarify the desired numbering of appendices and 
>special sections such as references, authors...). For stable results, just 
>use section elements inside the back section.

Aha, that clarifies the question for me.  I just use <section> in the 
<back> part of the document.  The results look fine to me.  I hope that 
doesn't change (or at least, not too dramatically).

#g


------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact



From: julian.reschke@gmx.de (Julian Reschke)
Date: Wed, 31 Dec 2003 09:42:38 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <Pine.BSF.4.53.0312301721390.20917@measurement-factory.com>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FF0A5C5.6090800@gmx.de> <20031230105949.GH76219@finch-staff-1.thus.net> <20031231005501.31892d3b.henrik@levkowetz.com> <Pine.BSF.4.53.0312301721390.20917@measurement-factory.com>
Message-ID: <3FF28BFE.4040807@gmx.de>

Alex Rousskov wrote:
> I agree that "overloading" special characters is a bad idea.
> 
> If "i.e.<nul />" looks too ugly, we can have an xml2rfc-specific
> entity to encode non-sentence-ending-period or non-sentence-ending
> space. For example,
> 
> 	i.e.&nse;
> or
> 	i.e&nsep;
> instead of
> 	i.e.<nul />
> 
> The first option probably looks better than others (IMO), but they are
> all ugly.

And that entity would expand to what...?

-- 
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760


From: julian.reschke@gmx.de (Julian Reschke)
Date: Wed, 31 Dec 2003 09:41:43 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031231005501.31892d3b.henrik@levkowetz.com>
References: <3FEEDA07.3070900@gmx.de>	<20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us>	<3FEFFFD7.6020001@gmx.de>	<20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us>	<3FF0A5C5.6090800@gmx.de>	<20031230105949.GH76219@finch-staff-1.thus.net> <20031231005501.31892d3b.henrik@levkowetz.com>
Message-ID: <3FF28BC7.9060405@gmx.de>

Henrik Levkowetz wrote:

> Good exposition.  
> 
> However, the special character option is starting to look like a rathole
> to me. What if somebody actually needs to use this character without
> implying the additional semantics we've added to it?

In IETF documents, this can't happen by definition (because they must 
consist solely of ASCII characters).

> ...

Julian

-- 
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760


From: rousskov@measurement-factory.com (Alex Rousskov)
Date: Tue, 30 Dec 2003 17:28:36 -0700 (MST)
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031231005501.31892d3b.henrik@levkowetz.com>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FF0A5C5.6090800@gmx.de> <20031230105949.GH76219@finch-staff-1.thus.net> <20031231005501.31892d3b.henrik@levkowetz.com>
Message-ID: <Pine.BSF.4.53.0312301721390.20917@measurement-factory.com>

On Wed, 31 Dec 2003, Henrik Levkowetz wrote:

> However, the special character option is starting to look like a
> rathole to me. What if somebody actually needs to use this character
> without implying the additional semantics we've added to it?
>
> I go back to stating a preference for markup, something like <nul />,
> e.g. "i.e.<nul /> " to avoid triggering the rendering of "." WSP as
> "." SP SP

I agree that "overloading" special characters is a bad idea.

If "i.e.<nul />" looks too ugly, we can have an xml2rfc-specific
entity to encode non-sentence-ending-period or non-sentence-ending
space. For example,

	i.e.&nse;
or
	i.e&nsep;
instead of
	i.e.<nul />

The first option probably looks better than others (IMO), but they are
all ugly.

Thanks,

Alex.


From: henrik@levkowetz.com (Henrik Levkowetz)
Date: Wed, 31 Dec 2003 00:55:01 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031230105949.GH76219@finch-staff-1.thus.net>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FF0A5C5.6090800@gmx.de> <20031230105949.GH76219@finch-staff-1.thus.net>
Message-ID: <20031231005501.31892d3b.henrik@levkowetz.com>

Tuesday 30 December 2003, Clive D.W. Feather wrote:
> > b) use "&#160;" (Unicode Non-Breaking-Space) instead of a regular space 
> > character.
> 
> Wrong character for the purpose.
> 
> Yes, we should also have a character or directive meaning "this isn't the
> end of a sentence", but U+00A0 isn't the right one for the job - its
> semantics are "don't break to a new line at this space".
> 
> Unicode has several special characters, so it's a question of picking the
> right one.
> 
> I would argue that things like "i.e." need to be written as *either*
>     i.e.&nbsp;no line break is permitted, but justification space is.
>     i.e.&zwj; the zero width joiner shows a closer association.
> 
> &nbsp; is &#x00A0; or &#160;   NO-BREAK SPACE
> &zwj;  is &#x200D; or &#8205;  ZERO WIDTH JOINER

Good exposition.  

However, the special character option is starting to look like a rathole
to me. What if somebody actually needs to use this character without
implying the additional semantics we've added to it?

I go back to stating a preference for markup, something like <nul />,
e.g. "i.e.<nul /> " to avoid triggering the rendering of "." WSP as
"." SP SP

	Henrik


From: clive@demon.net (Clive D.W. Feather)
Date: Tue, 30 Dec 2003 10:59:49 +0000
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <3FF0A5C5.6090800@gmx.de>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FF0A5C5.6090800@gmx.de>
Message-ID: <20031230105949.GH76219@finch-staff-1.thus.net>

Julian Reschke said:
> In general we could say that any "." followed by whitespace and an 
> uppercase character is a sentence ending, unless
> a) the character sequence in front of the dot appears in an exception 
> list or is
> b) otherwise marked as not being a sentence ending.
> Proposals:
> a) add a PI specifying an exception list, such as:
> <?rfc nosentenceending='e.g. I.D.'?>

This is a good idea.

Personally, I'd like there to be a default list built-in, but I could live
without that.

> b) use "&#160;" (Unicode Non-Breaking-Space) instead of a regular space 
> character.

Wrong character for the purpose.

Yes, we should also have a character or directive meaning "this isn't the
end of a sentence", but U+00A0 isn't the right one for the job - its
semantics are "don't break to a new line at this space".

Unicode has several special characters, so it's a question of picking the
right one.

I would argue that things like "i.e." need to be written as *either*
    i.e.&nbsp;no line break is permitted, but justification space is.
    i.e.&zwj; the zero width joiner shows a closer association.

&nbsp; is &#x00A0; or &#160;   NO-BREAK SPACE
&zwj;  is &#x200D; or &#8205;  ZERO WIDTH JOINER

-- 
Clive D.W. Feather  | Work:  <clive@demon.net>   | Tel:    +44 20 8495 6138
Internet Expert     | Home:  <clive@davros.org>  | *** NOTE CHANGE ***
Demon Internet      | WWW: http://www.davros.org | Fax:    +44 870 051 9937
Thus plc            |                            | Mobile: +44 7973 377646
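
A small Python illustration of the two characters named in this message
(not part of the original thread). It looks up the code points behind
the names nbsp and zwj and prints them in the same hex/decimal form used
above. The helper name describe_entity is made up for this example, and
the lookup uses Python's table of HTML entity names; in XML these names
are not predefined and would have to be declared in the DTD.

```python
import unicodedata
from html.entities import name2codepoint


def describe_entity(name):
    """Return the hex and decimal references and the Unicode name for an entity name."""
    cp = name2codepoint[name]  # e.g. 'nbsp' -> 160
    return f"&{name}; is &#x{cp:04X}; or &#{cp}; {unicodedata.name(chr(cp))}"


for name in ("nbsp", "zwj"):
    print(describe_entity(name))
```

Neither character says "this is not the end of a sentence" by itself;
that extra meaning would be something the formatter adds on top.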


From: julian.reschke@gmx.de (Julian Reschke)
Date: Tue, 30 Dec 2003 11:09:38 +0100
Subject: [xml2rfc] Appendix subsections
In-Reply-To: <p0610070ebc162f837404@[10.0.2.4]>
References: <p0610070ebc162f837404@[10.0.2.4]>
Message-ID: <3FF14EE2.1060000@gmx.de>

Pete Resnick wrote:

> Right now, 2629bis says that appendix is identical to section, but that 
> it can contain any number of appendix elements. Doing so (using the 
> current web tool) produces subsections named "Appendix A.1", "Appendix 
> A.2.3", etc.
 > ...

As far as I understand the appendix element is work-in-progress (waiting 
for the RFC Editor to clarify the desired numbering of appendices and 
special sections such as references, authors...). For stable results, 
just use section elements inside the back section.

Julian

-- 
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760


From: julian.reschke@gmx.de (Julian Reschke)
Date: Tue, 30 Dec 2003 11:03:14 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <200312300121.hBU1LA2c019557@bulk.resource.org>
References: <200312300121.hBU1LA2c019557@bulk.resource.org>
Message-ID: <3FF14D62.50902@gmx.de>

Carl Malamud wrote:
> How about &#160;  ?  That's the utf-8 space character.

That's what I suggested. BTW: the UTF-8 space character is U+0020
(decimal 32), just like in ASCII. U+00A0 (decimal 160) is "non-breaking
space", a space character where no line break is allowed. So we would be
bending the Unicode semantics a bit.

Julian


-- 
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760


From: clive@demon.net (Clive D.W. Feather)
Date: Tue, 30 Dec 2003 08:44:46 +0000
Subject: [xml2rfc] Appendix subsections
In-Reply-To: <p0610070ebc162f837404@[10.0.2.4]>
References: <p0610070ebc162f837404@[10.0.2.4]>
Message-ID: <20031230084446.GA76219@finch-staff-1.thus.net>

Pete Resnick said:
> 2. In reality, I don't think of sub-sections of an appendix as 
> themselves appendices, but instead sections. I think it would be best 
> to change appendix to contain any number of section elements instead 
> of appendix elements.

That would be my view.

> (The current web tool deals with this, but it 
> makes the numbering of those sections as if they were top-level 
> sections instead of being embedded.)

I would say that Appendix A contains sections A.1, A.2, A.3, etc.
A.1 contains A.1.1, A.1.2, etc.

-- 
Clive D.W. Feather  | Work:  <clive@demon.net>   | Tel:    +44 20 8495 6138
Internet Expert     | Home:  <clive@davros.org>  | *** NOTE CHANGE ***
Demon Internet      | WWW: http://www.davros.org | Fax:    +44 870 051 9937
Thus plc            |                            | Mobile: +44 7973 377646
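
The numbering scheme described in this message can be sketched in a few
lines of Python (an illustration only, not xml2rfc's code). The input
shape, a list of (title, children) tuples, and the function name
number_appendices are assumptions made for the example.

```python
from string import ascii_uppercase


def number_appendices(appendices):
    """Return (label, title) pairs: 'Appendix A', then 'A.1', 'A.1.1', ...

    `appendices` is a list of (title, children) tuples, where `children`
    has the same shape. Only the top level carries the word "Appendix".
    """
    labels = []

    def walk(nodes, prefix):
        # Sections nested inside an appendix get counter-only labels.
        for i, (title, children) in enumerate(nodes, start=1):
            label = f"{prefix}.{i}"
            labels.append((label, title))
            walk(children, label)

    # zip() stops after 'Z', so this sketch handles at most 26 appendices.
    for letter, (title, children) in zip(ascii_uppercase, appendices):
        labels.append((f"Appendix {letter}", title))
        walk(children, letter)
    return labels


doc = [
    ("Examples", [("Basic", [("Setup", []), ("Run", [])]), ("Advanced", [])]),
    ("Changes", []),
]
for label, title in number_appendices(doc):
    print(label, title)
```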


From: carl@media.org (Carl Malamud)
Date: Mon, 29 Dec 2003 17:21:10 -0800 (PST)
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031229235612.6b582b7f.henrik@levkowetz.com>
Message-ID: <200312300121.hBU1LA2c019557@bulk.resource.org>

How about &#160;  ?  That's the utf-8 space character.

That way, xml2rfc does not see ". WSP CR".

This comes from Tim Bray via Google, who apparently is an xml2rfc user:

http://tbray.org/tag/utf-8+names.html


> Monday 29 December 2003, Julian Reschke wrote:
> > BTW: an additional Unicode character can be declared as entity in the 
> > DTD and thus would be as writeable as anything else (such as "&nop;" :-).
> 
> Right. &nop; or &skip; or &nul; or whatever of that form would work fine
> for me.
> 
> 	Henrik


From: henrik@levkowetz.com (Henrik Levkowetz)
Date: Mon, 29 Dec 2003 23:56:12 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <3FF0AF70.2030403@gmx.de>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us> <20031229221247.1945fa64.henrik@levkowetz.com> <20031229135703.0552016e.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FF0A671.1070805@gmx.de> <20031229233954.764a00aa.henrik@levkowetz.com> <3FF0AF70.2030403@gmx.de>
Message-ID: <20031229235612.6b582b7f.henrik@levkowetz.com>

Monday 29 December 2003, Julian Reschke wrote:
> BTW: an additional Unicode character can be declared as entity in the 
> DTD and thus would be as writeable as anything else (such as "&nop;" :-).

Right. &nop; or &skip; or &nul; or whatever of that form would work fine
for me.

	Henrik


From: julian.reschke@gmx.de (Julian Reschke)
Date: Mon, 29 Dec 2003 23:49:20 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031229233954.764a00aa.henrik@levkowetz.com>
References: <3FEEDA07.3070900@gmx.de>	<20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us>	<3FEFFFD7.6020001@gmx.de>	<20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us>	<20031229221247.1945fa64.henrik@levkowetz.com>	<20031229135703.0552016e.mrose+internet.xml2rfc@dbc.mtview.ca.us>	<3FF0A671.1070805@gmx.de> <20031229233954.764a00aa.henrik@levkowetz.com>
Message-ID: <3FF0AF70.2030403@gmx.de>

Henrik Levkowetz wrote:

> Monday 29 December 2003, Julian Reschke wrote:
> 
>>Marshall Rose wrote:
>>
>>>>An alternative would be to *always* render "." WSP as "." SP SP, and add
>>>>a tag such as <nop />, whose only effect would be to make ".<nop /> "
>>>>not be recognized as "." WSP.  With such a tag, one could write
>>>>"i.e.<nop /> blah blah" and have it come out right, thereby avoiding
>>>>exception rules.
>>>
>>>
>>>interesting idea. comments?
>>
>>Yep. As I just wrote, we can add the condition of uppercase-ness of the 
>>following character (otherwise it'll never be a sentence start anyway).
> 
> 
> No, I used an editor macro with a regexp to do this some time ago, and
> it broke pretty swiftly on something like "e.g. EAP".  So this is not
> reliable.

That's right. However, the suggestion was meant as additional condition, 
that is: if the following word starts with a lowercase character, this 
is not the end of a sentence.

>>I'd probably use a PI instead of <nop/> (as this clearly *is* a 
>>processing instruction), but that's just a matter of taste. As I wrote, 
>>we can also hijack just another whitespace Unicode character.
> 
> 
> Don't know that I see clearly that this is a PI.  To me, it's markup.
> Call it <empty />, <zilch />, <nul />, <skip /> or something else if
> <nop /> makes one think of processing.  Another whitespace Unicode
> character would work, but I think it might be both less readable, less
> writeable, and less obvious.

If we make it explicit markup (a new element), it would make a lot of 
sense to assign a meaning to that element, so "nop" doesn't seem to be a 
good choice.

As far as I understand we want to express the condition of "this may 
look like a sentence end, but it isn't", and this really seems to be a 
processing instruction to the TXT formatter, nothing else (as the 
RFC2629 DTD currently doesn't go below the level of paragraphs).

BTW: an additional Unicode character can be declared as entity in the 
DTD and thus would be as writeable as anything else (such as "&nop;" :-).


Julian

-- 
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760


From: henrik@levkowetz.com (Henrik Levkowetz)
Date: Mon, 29 Dec 2003 23:39:54 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <3FF0A671.1070805@gmx.de>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us> <20031229221247.1945fa64.henrik@levkowetz.com> <20031229135703.0552016e.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FF0A671.1070805@gmx.de>
Message-ID: <20031229233954.764a00aa.henrik@levkowetz.com>

Monday 29 December 2003, Julian Reschke wrote:
> Marshall Rose wrote:
> >>An alternative would be to *always* render "." WSP as "." SP SP, and add
> >>a tag such as <nop />, whose only effect would be to make ".<nop /> "
> >>not be recognized as "." WSP.  With such a tag, one could write
> >>"i.e.<nop /> blah blah" and have it come out right, thereby avoiding
> >>exception rules.
> > 
> > 
> > interesting idea. comments?
> 
> Yep. As I just wrote, we can add the condition of uppercase-ness of the 
> following character (otherwise it'll never be a sentence start anyway).

No, I used an editor macro with a regexp to do this some time ago, and
it broke pretty swiftly on something like "e.g. EAP".  So this is not
reliable.

> I'd probably use a PI instead of <nop/> (as this clearly *is* a 
> processing instruction), but that's just a matter of taste. As I wrote, 
> we can also hijack just another whitespace Unicode character.

Don't know that I see clearly that this is a PI.  To me, it's markup.
Call it <empty />, <zilch />, <nul />, <skip /> or something else if
<nop /> makes one think of processing.  Another whitespace Unicode
character would work, but I think it might be both less readable, less
writeable, and less obvious.

	Henrik


From: rousskov@measurement-factory.com (Alex Rousskov)
Date: Mon, 29 Dec 2003 15:13:55 -0700 (MST)
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031229135703.0552016e.mrose+internet.xml2rfc@dbc.mtview.ca.us>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us> <20031229221247.1945fa64.henrik@levkowetz.com> <20031229135703.0552016e.mrose+internet.xml2rfc@dbc.mtview.ca.us>
Message-ID: <Pine.BSF.4.53.0312291510030.64996@measurement-factory.com>

On Mon, 29 Dec 2003, Marshall Rose wrote:

> > An alternative would be to *always* render "." WSP as "." SP SP, and add
> > a tag such as <nop />, whose only effect would be to make ".<nop /> "
> > not be recognized as "." WSP.  With such a tag, one could write
> > "i.e.<nop /> blah blah" and have it come out right, thereby avoiding
> > exception rules.
>
> interesting idea. comments?

IMHO, this approach is the Right Thing to do because it is general
(covers all cases) rather than exceptions-based (covers known common
cases). FWIW, that's how LaTeX solves essentially the same problem.

Just be a little bit more careful not to render "." WSP "</t>" as
"." SP SP </p>. (I.e. there has to be something after WSP to justify
SP SP).

Alex.
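
A minimal Python sketch of the markup-based approach discussed in this
message (an illustration, not xml2rfc's code). It assumes the marker is
spelled <nop />, which is only one of the names floated in the thread,
and that the paragraph text is available as a plain string; the function
name render_paragraph is likewise invented here.

```python
import re

NOP = "<nop />"  # assumed spelling; the thread has not settled on a name


def render_paragraph(text):
    """Render '.' WSP as '.' SP SP, with two exceptions.

    - '.' directly followed by the marker is left alone, because the
      character after the dot is then '<' rather than whitespace.
    - '.' WSP at the very end of the paragraph is left alone, because the
      lookahead requires a non-whitespace character after the whitespace.
    """
    text = re.sub(r"\.\s+(?=\S)", ".  ", text)
    # The marker has done its job; drop it from the output.
    return text.replace(NOP, "")


print(render_paragraph("We use a tag, i.e.<nop /> markup. It is general. "))
```

Because the rule never inspects the word before the dot, no exception
list is needed; the author marks the corner cases explicitly.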


From: julian.reschke@gmx.de (Julian Reschke)
Date: Mon, 29 Dec 2003 23:10:57 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031229135703.0552016e.mrose+internet.xml2rfc@dbc.mtview.ca.us>
References: <3FEEDA07.3070900@gmx.de>	<20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us>	<3FEFFFD7.6020001@gmx.de>	<20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us>	<20031229221247.1945fa64.henrik@levkowetz.com> <20031229135703.0552016e.mrose+internet.xml2rfc@dbc.mtview.ca.us>
Message-ID: <3FF0A671.1070805@gmx.de>

Marshall Rose wrote:
>>An alternative would be to *always* render "." WSP as "." SP SP, and add
>>a tag such as <nop />, whose only effect would be to make ".<nop /> "
>>not be recognized as "." WSP.  With such a tag, one could write
>>"i.e.<nop /> blah blah" and have it come out right, thereby avoiding
>>exception rules.
> 
> 
> interesting idea. comments?

Yep. As I just wrote, we can add the condition of uppercase-ness of the 
following character (otherwise it'll never be a sentence start anyway).

I'd probably use a PI instead of <nop/> (as this clearly *is* a 
processing instruction), but that's just a matter of taste. As I wrote, 
we can also hijack just another whitespace Unicode character.

Regards, Julian


-- 
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760


From: julian.reschke@gmx.de (Julian Reschke)
Date: Mon, 29 Dec 2003 23:08:05 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us>
References: <3FEEDA07.3070900@gmx.de>	<20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us>	<3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us>
Message-ID: <3FF0A5C5.6090800@gmx.de>

Marshall Rose wrote:

>>Why don't I get multiple spaces between "the" and "second" then? The two 
>>words are separated by one EOL and two spaces...?
> 
> 
> as i noted earlier: xml2rfc treats
> 
> 	*WSP EOL
> 
> as
> 
> 	EOL
> 
> and then treats
> 
> 	EOL
> 
> as exactly one
> 
> 	SP

Hm. I guess I'm a bit slow during the holidays...

In my example, I have -- replacing blanks by "_" -- :


<t>
___This_is_the_first_sentence_of_the_paragraph.__This_is_the
___second_sentence_of_the_paragraph_(with_two_leading_blanks).
___Here's_another_sentence_that_was_started_on_a_separate_line_in
___the_input_file.
</t>

So between "the" and "second" there's one EOL and three SPs. Why do I 
only get one space in the output? Maybe because the last character in 
front of the whitespace isn't a dot?

Anyway, I think if the TXT output needs these additional space 
characters, it would be wise to let xml2rfc figure that out on its own 
instead of having the author worry with that issue. It would also make 
the content model for <t> simpler, and here simpler seems to be better.

In general we could say that any "." followed by whitespace and an 
uppercase character is a sentence ending, unless

a) the character sequence in front of the dot appears in an exception 
list or is

b) otherwise marked as not being a sentence ending.

Proposals:

a) add a PI specifying an exception list, such as:

<?rfc nosentenceending='e.g. I.D.'?>

b) use "&#160;" (Unicode Non-Breaking-Space) instead of a regular space 
character.


Regards, Julian

-- 
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
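
A rough Python sketch of the rule proposed in this message (an
illustration, not xml2rfc's code): a "." followed by whitespace and an
uppercase character is treated as a sentence ending unless the word
ending in the dot is on an exception list. The function name, the
default list, and the regular expression are assumptions; only "e.g."
and "I.D." come from the PI example above.

```python
import re

# Hypothetical default list; the PI example above names only 'e.g.' and 'I.D.'
NO_SENTENCE_ENDING = frozenset({"e.g.", "i.e.", "I.D."})


def double_space_sentences(text, exceptions=NO_SENTENCE_ENDING):
    """Put two spaces after a sentence-ending '.'.

    A '.' ends a sentence here when it is followed by whitespace and an
    uppercase ASCII letter, and the word ending in that '.' is not listed
    in `exceptions`. Any whitespace run after a matched dot, including a
    line break, is replaced by one or two spaces.
    """
    def fix(match):
        word = match.group(1)
        gap = " " if word in exceptions else "  "
        return word + gap

    # (\S*\.) : the run of non-whitespace ending in a dot
    # \s+     : the whitespace after it
    # (?=...) : the next character must be uppercase, but is not consumed
    return re.sub(r"(\S*\.)\s+(?=[A-Z])", fix, text)


print(double_space_sentences("Pick a method, e.g. EAP. Then configure it."))
```

The exception list is what keeps "e.g. EAP" from being double-spaced; a
word written as "(e.g." would not match the list in this simple form.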


From: mrose+internet.xml2rfc@dbc.mtview.ca.us (Marshall Rose)
Date: Mon, 29 Dec 2003 13:57:03 -0800
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031229221247.1945fa64.henrik@levkowetz.com>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us> <20031229221247.1945fa64.henrik@levkowetz.com>
Message-ID: <20031229135703.0552016e.mrose+internet.xml2rfc@dbc.mtview.ca.us>

> An alternative would be to *always* render "." WSP as "." SP SP, and add
> a tag such as <nop />, whose only effect would be to make ".<nop /> "
> not be recognized as "." WSP.  With such a tag, one could write
> "i.e.<nop /> blah blah" and have it come out right, thereby avoiding
> exception rules.

interesting idea. comments?

/mtr


From: henrik@levkowetz.com (Henrik Levkowetz)
Date: Mon, 29 Dec 2003 22:12:47 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us>
Message-ID: <20031229221247.1945fa64.henrik@levkowetz.com>

Monday 29 December 2003, Marshall Rose wrote:

> it sounds as if everyone would be happy if there was one other rule:
> 
> 	"." EOL
> 
> is treated as
> 
> 	"." SP SP

This would go a long way; I'd not have to always go over the text and
make sure that I never had "." EOL  - which I currently do.

However, I'd be even happier if "." WSP  was rendered as "." SP SP,
with an exception list for the non-whitespace preceding "." (avoiding
double SP after "e.g.", "i.e.", etc. )

An alternative would be to *always* render "." WSP as "." SP SP, and add
a tag such as <nop />, whose only effect would be to make ".<nop /> "
not be recognized as "." WSP.  With such a tag, one could write
"i.e.<nop /> blah blah" and have it come out right, thereby avoiding
exception rules.

	Henrik




From: presnick@qualcomm.com (Pete Resnick)
Date: Mon, 29 Dec 2003 13:29:54 -0600
Subject: [xml2rfc] Appendix subsections
Message-ID: <p0610070ebc162f837404@[10.0.2.4]>

Right now, 2629bis says that appendix is identical to section, but 
that it can contain any number of appendix elements. Doing so (using 
the current web tool) produces subsections named "Appendix A.1", 
"Appendix A.2.3", etc.

I don't know if I want a formatting change or a syntax change:

1. It would suffice if the formatting of sub-appendices left out the 
word "Appendix" and instead contained only the counter.

2. In reality, I don't think of sub-sections of an appendix as 
themselves appendices, but instead sections. I think it would be best 
to change appendix to contain any number of section elements instead 
of appendix elements. (The current web tool deals with this, but it 
makes the numbering of those sections as if they were top-level 
sections instead of being embedded.)

Your call folks.
-- 
Pete Resnick <http://www.qualcomm.com/~presnick/>
QUALCOMM Incorporated - Direct phone: (858)651-4478, Fax: (858)651-1102


From: falk@ISI.EDU (Aaron Falk)
Date: Mon, 29 Dec 2003 10:41:52 -0800
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <7D5D48D2CAA3D84C813F5B154F43B155033D2E55@nl0006exch001u.nl.lucent.com>
References: <7D5D48D2CAA3D84C813F5B154F43B155033D2E55@nl0006exch001u.nl.lucent.com>
Message-ID: <AB9E99F0-3A2E-11D8-90FB-000A95DBDB84@isi.edu>

On Dec 29, 2003, at 3:15 AM, Wijnen, Bert (Bert) wrote:

> All I was saying is that it seems wasting cycles if (RFC-)editor
> (or any person for that matter) goes through a document in a very
> detailed way and just add extra space in between sentences.

Bert-

Checking for single spaces is an automated process.  Changing single 
spaces to double spaces is a manual one.

--aaron



From: rousskov@measurement-factory.com (Alex Rousskov)
Date: Mon, 29 Dec 2003 11:38:26 -0700 (MST)
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031229095450.56370c3f.mrose+internet.xml2rfc@dbc.mtview.ca.us>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us> <20031229152758.GE2448@sbrim-w2k01> <20031229095450.56370c3f.mrose+internet.xml2rfc@dbc.mtview.ca.us>
Message-ID: <Pine.BSF.4.53.0312291134530.64996@measurement-factory.com>

On Mon, 29 Dec 2003, Marshall Rose wrote:

> > I promise not to use "i.e.", "e.g.", or "viz." in RFCs.
>
> or rather, just at EOL. and therein lies the problem of having a
> program figure out what's what...

When the editor wraps paragraphs by inserting new lines, the "do not
put 'i.e.' at EOL" rule does not work, unfortunately. What was not at
the end of the line may get there with time. Not to mention that
patching and pre-processing may have similar effects.

Having several kinds of "." is ugly but unavoidable in a complete
solution.

Alex.


From: mrose+internet.xml2rfc@dbc.mtview.ca.us (Marshall Rose)
Date: Mon, 29 Dec 2003 09:54:50 -0800
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031229152758.GE2448@sbrim-w2k01>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us> <20031229152758.GE2448@sbrim-w2k01>
Message-ID: <20031229095450.56370c3f.mrose+internet.xml2rfc@dbc.mtview.ca.us>

> Actually, "." *WSP EOL (assuming * means 0 or more)

true.


> I promise not to use "i.e.", "e.g.", or "viz." in RFCs.

or rather, just at EOL. and therein lies the problem of having a program
figure out what's what...

/mtr


From: rousskov@measurement-factory.com (Alex Rousskov)
Date: Mon, 29 Dec 2003 09:19:45 -0700 (MST)
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <7D5D48D2CAA3D84C813F5B154F43B155033D2E4A@nl0006exch001u.nl.lucent.com>
References: <7D5D48D2CAA3D84C813F5B154F43B155033D2E4A@nl0006exch001u.nl.lucent.com>
Message-ID: <Pine.BSF.4.53.0312290905270.64996@measurement-factory.com>

On Mon, 29 Dec 2003, Wijnen, Bert (Bert) wrote:

> but are we not wasting cycles if an editor is doing manual work to
> change one space to two spaces in between sentences?

IMHO, it is the Editor who is wasting cycles due to the lack of better
tools. In a volunteer organization, it should be the submitter's job to
get the formatting right. The Editor (and others) should be able to
validate the format, which can be 99% automated.
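
As a rough illustration of how much of that validation could be
automated, here is a sketch of a checker that reports sentence
boundaries separated by a single space. The regular expression and the
function name are assumptions made for this example, not part of any
RFC Editor tool, and the heuristic also flags abbreviations followed by
a capitalized word:

```python
import re

# Heuristic: sentence-ending punctuation, exactly one space, then a capital.
# This reports candidates for review; it cannot tell "e.g. Foo" from a
# real sentence boundary.
SINGLE_SPACE_AFTER_STOP = re.compile(r"[.!?] (?=[A-Z])")


def find_single_space_boundaries(text):
    """Return (line_number, column) pairs, both 1-based, for each hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SINGLE_SPACE_AFTER_STOP.finditer(line):
            hits.append((lineno, match.start() + 1))
    return hits


sample = (
    "   This is the first sentence of the paragraph.  This is the second\n"
    "   sentence of the paragraph (with two leading blanks). Here's another\n"
    "   sentence that was started on a separate line in the input file.\n"
)
for lineno, column in find_single_space_boundaries(sample):
    print(f"line {lineno}, column {column}: single space after sentence end")
```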

The goal of xml2rfc should be to produce output that is acceptable "as
is". This would save a lot of resources and even prevent many bugs in
the long run. Any step Marshall takes in that direction is not a waste
of cycles, especially in the absence of a prioritized issues list from
the RFC Editor.

Unfortunately, since not all dots end sentences, and whitespace is too
fragile to rely on, we would certainly need additional optional markup
for xml2rfc to do the right thing _in corner cases_. LaTeX and other
established formatters offer examples of how this additional markup can
be implemented. Alternatively, we can simplify RFC Editor formatting
rules, but I am guessing that would be a lot harder and possibly
undesirable.
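
A sketch of what such optional markup could look like, by analogy with
LaTeX, where an author writes ".\ " after an abbreviation to get a
normal interword space. The <noeos/> marker below is purely
hypothetical (it is not in the xml2rfc DTD), as is the function name:

```python
import re

# Hypothetical marker an author would place right after a period that
# does NOT end a sentence. It is not part of the xml2rfc DTD.
NOT_EOS = "<noeos/>"


def space_sentences(text):
    """Flatten a paragraph and put two spaces after sentence-ending dots."""
    # Collapse every run of whitespace (including EOLs) to one space.
    flat = " ".join(text.split())
    # A dot directly followed by a space is taken as a sentence end.
    # A dot followed by the marker is not matched and keeps one space.
    flat = re.sub(r"\. ", ".  ", flat)
    # Drop the markers once spacing has been decided.
    return flat.replace(NOT_EOS, "")


paragraph = "Use a marker, i.e.<noeos/> an\nempty element. It costs little."
print(space_sentences(paragraph))
```

Unmarked text keeps working as before; only the corner cases need the
extra element.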

Alex.


From: swb@employees.org (Scott W Brim)
Date: Mon, 29 Dec 2003 10:27:58 -0500
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de> <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us>
Message-ID: <20031229152758.GE2448@sbrim-w2k01>

On Mon, Dec 29, 2003 07:11:58AM -0800, Marshall Rose allegedly wrote:
> it sounds as if everyone would be happy if there was one other rule:
> 
> 	"." EOL
> 
> is treated as
> 
> 	"." SP SP
> 
> /mtr

Actually, "." *WSP EOL (assuming * means 0 or more)

I promise not to use "i.e.", "e.g.", or "viz." in RFCs.


From: mrose+internet.xml2rfc@dbc.mtview.ca.us (Marshall Rose)
Date: Mon, 29 Dec 2003 07:11:58 -0800
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <3FEFFFD7.6020001@gmx.de>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us> <3FEFFFD7.6020001@gmx.de>
Message-ID: <20031229071158.6985f569.mrose+internet.xml2rfc@dbc.mtview.ca.us>

> 
> Why don't I get multiple spaces between "the" and "second" then? The two 
> words are separated by one EOL and two spaces...?

as i noted earlier: xml2rfc treats

	*WSP EOL

as

	EOL

and then treats

	EOL

as exactly one

	SP

it sounds as if everyone would be happy if there was one other rule:

	"." EOL

is treated as

	"." SP SP

/mtr
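
The rules above, including the proposed '"." EOL' rule, can be sketched
as follows. This is an illustration in Python, not xml2rfc's code; the
function name and the dot_eol_rule flag are invented for the example,
and leading indentation on continuation lines is deliberately left out
because the rules as stated do not mention it:

```python
import re


def join_lines(text, dot_eol_rule=False):
    """Flatten paragraph source using the rules described above."""
    # Rule 1: "*WSP EOL" is treated as "EOL" (drop blanks before the EOL).
    text = re.sub(r"[ \t]+\n", "\n", text)
    if dot_eol_rule:
        # Proposed rule: '"." EOL' is treated as '"." SP SP'. Because
        # rule 1 has already run, this also covers '"." *WSP EOL'.
        text = re.sub(r"\.\n", ".  ", text)
    # Rule 2: every remaining "EOL" is treated as exactly one SP.
    return text.replace("\n", " ")


source = (
    "First sentence.  Second sentence starts here \n"
    "and ends here.\n"
    "Third sentence."
)
print(join_lines(source))
print(join_lines(source, dot_eol_rule=True))
```

With the flag off, the second and third sentences end up separated by
one space; with it on, they get two. An abbreviation that happens to
sit at EOL would get two spaces as well.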


From: julian.reschke@gmx.de (Julian Reschke)
Date: Mon, 29 Dec 2003 12:34:03 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <7D5D48D2CAA3D84C813F5B154F43B155033D2E55@nl0006exch001u.nl.lucent.com>
References: <7D5D48D2CAA3D84C813F5B154F43B155033D2E55@nl0006exch001u.nl.lucent.com>
Message-ID: <3FF0112B.5080303@gmx.de>

Wijnen, Bert (Bert) wrote:

>>Wijnen, Bert (Bert) wrote:
>>
>>
>>>Maybe this is just me... but are we not wasting cycles if an
>>>editor is doing manual work to change one space to two
>>>spaces in between sentences? If that were all the problems we 
>>>found in RFCs, I would say that we are in Really Good Shape!
>>
>>Yes and no. Of course there are more important problems.
>>
>>I was raising this issue because I recently finished an RFC, and I'm now 
>>in the process of finishing a second one. IMHO it would be extremely 
>>nice if xml2rfc would be able to match *exactly* the output of the RFC 
>>Editor's changes, for the following simple reasons:
>>
> 
> Of COURSE I do agree with the desire/need to have xml2rfc do the RIGHT 
> and CORRECT thing. 
> 
> All I was saying is that it seems like wasting cycles if the (RFC-)editor 
> (or any person for that matter) goes through a document in a very 
> detailed way and just adds extra space in between sentences.

Oh, I see.

Actually, that was part of my question (answered by Aaron): is this 
still desired? It seems so.

Julian

-- 
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760


From: bwijnen@lucent.com (Wijnen, Bert (Bert))
Date: Mon, 29 Dec 2003 12:15:49 +0100
Subject: [xml2rfc] end of sentence: two spaces?
Message-ID: <7D5D48D2CAA3D84C813F5B154F43B155033D2E55@nl0006exch001u.nl.lucent.com>

> Wijnen, Bert (Bert) wrote:
> 
> > Maybe this is just me... but are we not wasting cycles if an
> > editor is doing manual work to change one space to two
> > spaces in between sentences? If that were all the problems we 
> > found in RFCs, I would say that we are in Really Good Shape!
> 
> Yes and no. Of course there are more important problems.
> 
> I was raising this issue because I recently finished an RFC, and I'm now 
> in the process of finishing a second one. IMHO it would be extremely 
> nice if xml2rfc would be able to match *exactly* the output of the RFC 
> Editor's changes, for the following simple reasons:
> 
Of COURSE I do agree with the desire/need to have xml2rfc do the RIGHT 
and CORRECT thing. 

All I was saying is that it seems like wasting cycles if the (RFC-)editor 
(or any person for that matter) goes through a document in a very 
detailed way and just adds extra space in between sentences.

Bert


From: julian.reschke@gmx.de (Julian Reschke)
Date: Mon, 29 Dec 2003 11:50:02 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <7D5D48D2CAA3D84C813F5B154F43B155033D2E4A@nl0006exch001u.nl.lucent.com>
References: <7D5D48D2CAA3D84C813F5B154F43B155033D2E4A@nl0006exch001u.nl.lucent.com>
Message-ID: <3FF006DA.7020300@gmx.de>

Wijnen, Bert (Bert) wrote:

> Maybe this is just me... but are we not wasting cycles if an
> editor is doing manual work to change one space to two
> spaces in between sentences? If that were all the problems we 
> found in RFCs, I would say that we are in Really Good Shape!

Yes and no. Of course there are more important problems.

I was raising this issue because I recently finished an RFC, and I'm now 
in the process of finishing a second one. IMHO it would be extremely 
nice if xml2rfc would be able to match *exactly* the output of the RFC 
Editor's changes, for the following simple reasons:

- the less boring work the RFC Editor needs to do, the more she/he can 
focus on content and language

- reducing the amount of editorial "fixes" (optimally to zero) is good 
for avoiding accidental changes in the content that weren't intended

- having xml2rfc produce exactly the "right" amount of whitespace would 
ensure that there's no need for a manual fixup of page references (both 
in the TOC and in the index)

Optimally, xml2rfc's TXT output should be acceptable as-is. Of course 
that requires that the RFC Editor comes up with guidelines we can rely 
on (hint, hint). For instance I note that 
<http://www.ietf.org/internet-drafts/draft-rfc-editor-rfc2223bis-07.txt> 
uses the same section numbering style as xml2rfc, while in the RFCs that 
are currently getting published, a trailing dot is added. This makes 
diffing of the text files harder than it needs to be. And every time 
diffing gets harder, it also gets more likely that the author 
misses an unintended change made by the RFC Editor when reviewing the 
changes.
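
To illustrate the diffing point, here is a small sketch using Python's
difflib. The section headings and body text are invented sample data;
the idea is that a numbering-style change touches every heading, so a
real content change shows up as one more pair of +/- lines among the
noise:

```python
import difflib


def changed_lines(before, after):
    """Count the lines that ndiff marks as removed ("- ") or added ("+ ")."""
    return sum(
        1 for line in difflib.ndiff(before, after) if line[:2] in ("- ", "+ ")
    )


# Invented sample: formatter output without a trailing dot in headings.
author_txt = [
    "2.1 Overview",
    "",
    "   Some text.",
    "",
    "2.2 Details",
    "",
    "   The protocol uses port 80.",
]

# Same text with only the heading style changed.
restyled_txt = [line.replace("2.1 ", "2.1. ").replace("2.2 ", "2.2. ")
                for line in author_txt]

# Heading style changed AND one unintended content change.
edited_txt = restyled_txt[:-1] + ["   The protocol uses port 8080."]

print(changed_lines(author_txt, restyled_txt))  # 4
print(changed_lines(author_txt, edited_txt))    # 6
```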

Julian

-- 
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760


From: bwijnen@lucent.com (Wijnen, Bert (Bert))
Date: Mon, 29 Dec 2003 11:38:31 +0100
Subject: [xml2rfc] end of sentence: two spaces?
Message-ID: <7D5D48D2CAA3D84C813F5B154F43B155033D2E4A@nl0006exch001u.nl.lucent.com>

Maybe this is just me... but are we not wasting cycles if an
editor is doing manual work to change one space to two
spaces in between sentences? If that were all the problems we 
found in RFCs, I would say that we are in Really Good Shape!

Thanks,
Bert 

> -----Original Message-----
> From: Aaron Falk [mailto:falk@ISI.EDU]
> Sent: maandag 29 december 2003 0:07
> To: Julian Reschke
> Cc: xml2rfc; RFC Editor
> Subject: Re: [xml2rfc] end of sentence: two spaces?
> 
> 
> Julian, et al-
> 
> RFCs should be formatted with two spaces between sentences.  It would
> be a Good Thing if the formatter preserved/inserted the correct
> spacing.  (Just so you know, submissions won't be rejected if they
> don't have the correct spacing but if it is lacking an editor will have
> to add it by hand.)
> 
> --aaron (for the RFC Editor)
> 
> On Dec 28, 2003, at 5:26 AM, Julian Reschke wrote:
> 
> > Hi all,
> >
> > I'm not sure that I understand the logic how xml2rfc decides when to
> > preserve whitespace inside <t> elements and when it doesn't.
> >
> > Given the input:
> >
> > <section title="Paragraph formatting">
> > <t>
> >   This is the first sentence of the paragraph.  This is the
> >   second sentence of the paragraph (with two leading blanks).
> >   Here's another sentence that was started on a separate line in
> >   the input file.
> > </t>
> > <t>
> >   This is the second paragraph.
> > </t>
> > </section>
> >
> > xml2rfc will produce:
> >
> > 6. Paragraph formatting
> >
> >    This is the first sentence of the paragraph.  This is the second
> >    sentence of the paragraph (with two leading blanks). Here's another
> >    sentence that was started on a separate line in the input file.
> >
> >    This is the second paragraph.
> >
> > So in the first case, the whitespace was preserved, while in the  
> > second case (between sentence 2 and 3) it wasn't.
> >
> > We should clarify:
> >
> > a) Whether or not the TXT output is supposed to have two spaces
> > between sentences. According to
> > <http://www.ietf.org/internet-drafts/draft-rfc-editor-rfc2223bis-07.txt>
> > it's supposed to, but will this still be the case when the
> > new format spec is finally published (feedback from the RFC Editor
> > appreciated here).
> >
> > b) If the answer to a) is "yes", who is supposed to take care of this?
> > The document author or the formatter?
> >
> > c) If the answer to b) is "the author", then there's a problem because
> > this would require the value of xml:space for <t> to be "preserve"
> > which we of course don't want it to be.
> >
> > d) OTOH, if the answer to b) is "the formatter", we should clarify how
> > it's supposed to detect an end-of-sentence. In general this may
> > require additional markup (at least in some edge cases).
> >
> > Slightly related: I just notice that <spanx> is declared to have
> > xml:space='preserve', however xml2rfc ignores whitespace inside the
> > element (and I think this is correct and intended). Thus the DTD
> > should be fixed.
> >
> >
> > Regards, Julian
> >
> >
> >
> >
> >
> > _______________________________________________
> > xml2rfc mailing list
> > xml2rfc@lists.xml.resource.org
> > http://lists.xml.resource.org/mailman/listinfo/xml2rfc


From: julian.reschke@gmx.de (Julian Reschke)
Date: Mon, 29 Dec 2003 11:20:07 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us>
Message-ID: <3FEFFFD7.6020001@gmx.de>

Marshall Rose wrote:
>>I'm not sure that I understand the logic how xml2rfc decides when to 
>>preserve whitespace inside <t> elements and when it doesn't.
> 
>     
> right now, it preserves whitespace and leaves the choice of ". " v. ".  "
> to the author, although whitespace EOL is treated as plain EOL, and
> plain EOL is treated as one space.

Hm. Looking again at the example:


<section title="Paragraph formatting">
<t>
   This is the first sentence of the paragraph.  This is the
   second sentence of the paragraph (with two leading blanks).
   Here's another sentence that was started on a separate line in
   the input file.
</t>
<t>
   This is the second paragraph.
</t>
</section>

xml2rfc will produce:

6. Paragraph formatting

    This is the first sentence of the paragraph.  This is the second
    sentence of the paragraph (with two leading blanks). Here's another
    sentence that was started on a separate line in the input file.

    This is the second paragraph.

Why don't I get multiple spaces between "the" and "second" then? The two 
words are separated by one EOL and two spaces...?


-- 
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760


From: fw@deneb.enyo.de (Florian Weimer)
Date: Mon, 29 Dec 2003 09:46:03 +0100
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us>
References: <3FEEDA07.3070900@gmx.de> <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us>
Message-ID: <20031229084603.GA18975@deneb.enyo.de>

Marshall Rose wrote:

> right now, it preserves whitespace and leaves the choice of ". " v. ".  "
> to the author, although whitespace EOL is treated as plain EOL, and
> plain EOL is treated as one space.

This doesn't agree with Emacs -- Emacs treats "." at the end of a line
as a sentence-ending ".".  Word-wrapping might change it into ".  ".
(Emacs never breaks a line at ". ".)


From: mrose+internet.xml2rfc@dbc.mtview.ca.us (Marshall Rose)
Date: Sun, 28 Dec 2003 17:04:47 -0800
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <3FEEDA07.3070900@gmx.de>
References: <3FEEDA07.3070900@gmx.de>
Message-ID: <20031228170447.5f6f7254.mrose+internet.xml2rfc@dbc.mtview.ca.us>

> I'm not sure that I understand the logic how xml2rfc decides when to 
> preserve whitespace inside <t> elements and when it doesn't.
    
right now, it preserves whitespace and leaves the choice of ". " v. ".  "
to the author, although whitespace EOL is treated as plain EOL, and
plain EOL is treated as one space.
    
    
> ...
> d) OTOH, if the answer to b) is "the formatter", we should clarify how 
> it's supposed to detect an end-of-sentence. In general this may require 
> additional markup (at least in some edge cases).
    
i hope that markup wouldn't be needed...
    
    
> Slightly related: I just notice that <spanx> is declared to have 
> xml:space='preserve', however xml2rfc ignores whitespace inside the 
> element (and I think this is correct and intended). Thus the DTD should 
> be fixed.

ok.
    
/mtr
    
    


From: falk@ISI.EDU (Aaron Falk)
Date: Sun, 28 Dec 2003 15:06:59 -0800
Subject: [xml2rfc] end of sentence: two spaces?
In-Reply-To: <3FEEDA07.3070900@gmx.de>
References: <3FEEDA07.3070900@gmx.de>
Message-ID: <8A615D94-398A-11D8-90FB-000A95DBDB84@isi.edu>

Julian, et al-

RFCs should be formatted with two spaces between sentences.  It would  
be a Good Thing if the formatter preserved/inserted the correct  
spacing.  (Just so you know, submissions won't be rejected if they  
don't have the correct spacing but if it is lacking an editor will have  
to add it by hand.)

--aaron (for the RFC Editor)

On Dec 28, 2003, at 5:26 AM, Julian Reschke wrote:

> Hi all,
>
> I'm not sure that I understand the logic how xml2rfc decides when to  
> preserve whitespace inside <t> elements and when it doesn't.
>
> Given the input:
>
> <section title="Paragraph formatting">
> <t>
>   This is the first sentence of the paragraph.  This is the
>   second sentence of the paragraph (with two leading blanks).
>   Here's another sentence that was started on a separate line in
>   the input file.
> </t>
> <t>
>   This is the second paragraph.
> </t>
> </section>
>
> xml2rfc will produce:
>
> 6. Paragraph formatting
>
>    This is the first sentence of the paragraph.  This is the second
>    sentence of the paragraph (with two leading blanks). Here's another
>    sentence that was started on a separate line in the input file.
>
>    This is the second paragraph.
>
> So in the first case, the whitespace was preserved, while in the  
> second case (between sentence 2 and 3) it wasn't.
>
> We should clarify:
>
> a) Whether or not the TXT output is supposed to have two spaces  
> between sentences. According to  
> <http://www.ietf.org/internet-drafts/draft-rfc-editor-rfc2223bis-07.txt>
> it's supposed to, but will this still be the case when the  
> new format spec is finally published (feedback from the RFC Editor  
> appreciated here).
>
> b) If the answer to a) is "yes", who is supposed to take care of this?  
> The document author or the formatter?
>
> c) If the answer to b) is "the author", then there's a problem because  
> this would require the value of xml:space for <t> to be "preserve"  
> which we of course don't want it to be.
>
> d) OTOH, if the answer to b) is "the formatter", we should clarify how  
> it's supposed to detect an end-of-sentence. In general this may  
> require additional markup (at least in some edge cases).
>
> Slightly related: I just notice that <spanx> is declared to have  
> xml:space='preserve', however xml2rfc ignores whitespace inside the  
> element (and I think this is correct and intended). Thus the DTD  
> should be fixed.
>
>
> Regards, Julian



From: julian.reschke@gmx.de (Julian Reschke)
Date: Sun, 28 Dec 2003 14:26:31 +0100
Subject: [xml2rfc] end of sentence: two spaces?
Message-ID: <3FEEDA07.3070900@gmx.de>

Hi all,

I'm not sure that I understand the logic how xml2rfc decides when to 
preserve whitespace inside <t> elements and when it doesn't.

Given the input:

<section title="Paragraph formatting">
<t>
   This is the first sentence of the paragraph.  This is the
   second sentence of the paragraph (with two leading blanks).
   Here's another sentence that was started on a separate line in
   the input file.
</t>
<t>
   This is the second paragraph.
</t>
</section>

xml2rfc will produce:

6. Paragraph formatting

    This is the first sentence of the paragraph.  This is the second
    sentence of the paragraph (with two leading blanks). Here's another
    sentence that was started on a separate line in the input file.

    This is the second paragraph.

So in the first case, the whitespace was preserved, while in the second 
case (between sentence 2 and 3) it wasn't.
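
One way to see what the formatter is handed is to parse the example
with a stock XML parser. The sketch below uses Python's
xml.etree.ElementTree, not xml2rfc itself, and gap_after is a helper
invented for this illustration. Both gaps survive parsing intact, so
any difference in the output comes from how the formatter treats the
EOL, not from XML whitespace handling:

```python
import xml.etree.ElementTree as ET

doc = """<section title="Paragraph formatting">
<t>
   This is the first sentence of the paragraph.  This is the
   second sentence of the paragraph (with two leading blanks).
   Here's another sentence that was started on a separate line in
   the input file.
</t>
</section>"""

# Character data of the first <t>, exactly as the parser delivers it.
raw = ET.fromstring(doc).find("t").text


def gap_after(text, fragment):
    """Return the run of whitespace that follows the given fragment."""
    start = text.index(fragment) + len(fragment)
    end = start
    while end < len(text) and text[end].isspace():
        end += 1
    return text[start:end]


gap_1_2 = gap_after(raw, "first sentence of the paragraph.")
gap_2_3 = gap_after(raw, "(with two leading blanks).")

print(repr(gap_1_2))  # '  '      two spaces typed by the author
print(repr(gap_2_3))  # '\n   '   an EOL plus the indentation
```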

We should clarify:

a) Whether or not the TXT output is supposed to have two spaces between 
sentences. According to 
<http://www.ietf.org/internet-drafts/draft-rfc-editor-rfc2223bis-07.txt> 
it's supposed to, but will this still be the case when the new format 
spec is finally published (feedback from the RFC Editor appreciated here).

b) If the answer to a) is "yes", who is supposed to take care of this? 
The document author or the formatter?

c) If the answer to b) is "the author", then there's a problem because 
this would require the value of xml:space for <t> to be "preserve" which 
we of course don't want it to be.

d) OTOH, if the answer to b) is "the formatter", we should clarify how 
it's supposed to detect an end-of-sentence. In general this may require 
additional markup (at least in some edge cases).

Slightly related: I just notice that <spanx> is declared to have 
xml:space='preserve', however xml2rfc ignores whitespace inside the 
element (and I think this is correct and intended). Thus the DTD should 
be fixed.


Regards, Julian







From: mcr@sandelman.ottawa.on.ca (Michael Richardson)
Date: Tue, 16 Dec 2003 15:37:42 -0500
Subject: [xml2rfc] rsync access to XML ID/RFCs
In-Reply-To: Your message of "Tue, 16 Dec 2003 09:25:24 PST." <20031216092524.5cb8cfac.mrose+internet.xml2rfc@dbc.mtview.ca.us>
Message-ID: <24860.1071607062@marajade.sandelman.ottawa.on.ca>

-----BEGIN PGP SIGNED MESSAGE-----


>>>>> "Marshall" == Marshall Rose <mrose+internet.xml2rfc@dbc.mtview.ca.us> writes:
    >> Marshall, if you'd like to list this somewhere, I'm happy with that.
    >> If you want me to switch to rsync, let me know.

    Marshall> thanks. i'm a little concerned about running rsync on that
    Marshall> server because there was a recent rsync exploit...

  Yeah.... I am running the latest.
  I already had rsync alive, so I had to upgrade anyway.

]       ON HUMILITY: to err is human. To moo, bovine.           |  firewalls  [
]   Michael Richardson,    Xelerance Corporation, Ottawa, ON    |net architect[
] mcr@xelerance.com      http://www.sandelman.ottawa.on.ca/mcr/ |device driver[
] panic("Just another Debian GNU/Linux using, kernel hacking, security guy"); [


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Finger me for keys

iQCVAwUBP99tFIqHRg3pndX9AQGBngQA3WMJu6KoTMHVtEboKOptyPfqp2ikAbHf
t1t9xT4Iy63aTuJAyM8i1aOyWQmt6bTbQtnzSh1KV2LlmOOoq/v+fTajdTZbkRUm
I1r3+5V161jKpnLYx30wkVv+s+VAh8ogeuad7lceP1lw3D+G8oOaWq/7MtQpzW/F
2nAQI6F07pw=
=93K5
-----END PGP SIGNATURE-----


From: mrose+internet.xml2rfc@dbc.mtview.ca.us (Marshall Rose)
Date: Tue, 16 Dec 2003 09:25:24 -0800
Subject: [xml2rfc] rsync access to XML ID/RFCs
In-Reply-To: <20152.1071518378@marajade.sandelman.ottawa.on.ca>
References: <20152.1071518378@marajade.sandelman.ottawa.on.ca>
Message-ID: <20031216092524.5cb8cfac.mrose+internet.xml2rfc@dbc.mtview.ca.us>

> Marshall, if you'd like to list this somewhere, I'm happy with that.
> If you want me to switch to rsync, let me know.

thanks. i'm a little concerned about running rsync on that server because there
was a recent rsync exploit...

/mtr


From: mcr@sandelman.ottawa.on.ca (Michael Richardson)
Date: Mon, 15 Dec 2003 14:59:38 -0500
Subject: [xml2rfc] rsync access to XML ID/RFCs
Message-ID: <20152.1071518378@marajade.sandelman.ottawa.on.ca>

-----BEGIN PGP SIGNED MESSAGE-----


Hi, I'm running

wget -r -l 1 -A .xml -nv -np -nd -nc http://xml.resource.org/public/rfc/bibxml/
wget -r -l 1 -A .xml -nv -np -nd -nc http://xml.resource.org/public/id/bibxml/

to a local directory, since xml.resource.org was offline last time I
wanted to update. It also seems kind of slow at times. So, if you have rsync,

lox-[/m/ietf/xml] mcr 764 %rsync -l rsync://lox.sandelman.ca/
rfcxml          RFC biblography in XML
idxml           ID biblography in XML

Marshall, if you'd like to list this somewhere, I'm happy with that.
If you want me to switch to rsync, let me know.

]       ON HUMILITY: to err is human. To moo, bovine.           |  firewalls  [
]   Michael Richardson,    Xelerance Corporation, Ottawa, ON    |net architect[
] mcr@xelerance.com      http://www.sandelman.ottawa.on.ca/mcr/ |device driver[
] panic("Just another Debian GNU/Linux using, kernel hacking, security guy"); [
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Finger me for keys

iQCVAwUBP94So4qHRg3pndX9AQH8dwP+JlZSSdx0vtOhCeWODGiQDqoQNFBj8XG7
01yf+Zc6Q4kaHJZj7/aSLNWuGCqWo1EIccSq98kmCHLXJdzzeFH5V9J0b/Bar7q4
Zwc2PDb7qxiTTwie8qY0CGLS+8nXC9ZfhJ0Kislw8Mw+6Llj+dcvQFo6VCRl2kU3
d8aYznwa2OM=
=6VqL
-----END PGP SIGNATURE-----


From: mrose+internet.xml2rfc@dbc.mtview.ca.us (Marshall Rose)
Date: Wed, 10 Dec 2003 16:24:44 -0800
Subject: [xml2rfc] maintaining citation libraries
In-Reply-To: <A3587490-2B56-11D8-B8B2-000A95DBDB84@isi.edu>
References: <200312102012.hBAKCDFh004677@bulk.resource.org> <A3587490-2B56-11D8-B8B2-000A95DBDB84@isi.edu>
Message-ID: <20031210162444.1ae536a5.mrose+internet.xml2rfc@dbc.mtview.ca.us>

> > - why does rfc 2246 reference have so many authors?  our index entry
> >   only lists 2
    
oops!

    
> > - extra author added in reference for RFC 2616 (Nielsen, H.)
    
i believe that the fullname is "Henrik Frystyk Nielsen", hence the
author reference is correct... (the official rfc-index lists him as
"H. Frystyk").
    
> > - missing author for reference 2779 (Mohr, G.)

oops!
    
    
the index should auto-generate in about an hour or so.

/mtr


From: falk@ISI.EDU (Aaron Falk)
Date: Wed, 10 Dec 2003 11:13:46 -0800
Subject: [xml2rfc] maintaining citation libraries
Message-ID: <FA713C91-2B44-11D8-B8B2-000A95DBDB84@isi.edu>

Hi-

Can someone tell me how the citation libraries are maintained?  The RFC 
Editor has noticed that some entries are incorrect and I'm wondering 
how, e.g., the ID citation library gets created and is maintained.

--aaron  (for the RFC Editor)	


