
From dburnett@voxeo.com  Mon Dec 28 02:46:19 2009
Message-Id: <4F3214C8-3210-4C45-94FA-56D96D7951BE@voxeo.com>
From: Dan Burnett <dburnett@voxeo.com>
To: Robert Sparks <rjsparks@nostrum.com>
In-Reply-To: <862ADFEF-C942-4945-8252-48BE7A7D420F@nostrum.com>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v936)
Date: Mon, 28 Dec 2009 05:45:48 -0500
References: <862ADFEF-C942-4945-8252-48BE7A7D420F@nostrum.com>
X-Mailer: Apple Mail (2.936)
Cc: draft-ietf-speechsc-mrcpv2@tools.ietf.org, speechsc@ietf.org, speechsc-chairs@tools.ietf.org
Subject: Re: [Speechsc] AD review of draft-ietf-speechsc-mrcpv2-20

Hi Robert,

I have replied below to a number of your comments with what I plan to  
do to address them.  If I do not reply to a specific comment below, it  
is because I do not yet have an answer (or a complete answer).

-- dan

On Sep 29, 2009, at 11:06 AM, Robert Sparks wrote:

> Hi Folks -
>
> I'm working on moving MRCPv2 along. I've found several things so far
> that I'd like to discuss and/or have the document address before we  
> take
> the document into IETF last call.
>
> This is a large and complex document. Apologies that my review has  
> taken so long.
>
> After talking with Eric and Dave, I'm sending these all in one  
> message instead
> of splitting them into several threads at the beginning. When you  
> reply to a particular
> point, it would be useful to me if you adjusted the subject line to  
> indicate which point
> you are replying to.
>
> These are not listed in any particular order. Nits are grouped at  
> the end.
>
> Thanks!
>
> RjS
> ----------------------------------------------------------------------------------------------
> (The following apply to revision -20)
>
> 1 The Introduction points to RFC4313 for a discussion on why MRCPv2
>  does not use RTSP and details on alternatives, but I don't find that
>  discussion in 4313. Was that discussion captured somewhere? If so,
>  please point to that. Otherwise, modify this text.

This wording is incorrect.  I will replace the third paragraph of  
section 1 with the following:

The proprietary version of MRCP ran over RTSP [RFC2326].  At the time  
work on MRCPv2 began, the consensus was that this use of RTSP  
would break the RTSP protocol or cause backward-compatibility  
problems, something forbidden by Section 3.2 of the above-mentioned  
requirements document, RFC 4313.  This is why MRCPv2 does  
not run over RTSP.

>
> 2 The SIP examples throughout the draft need to be adjusted to reflect
>  correct syntax and intended use. There are several aspects of the SIP
>  messages, in particular, that are currently in error. Consider
>  showing partial SIP headers focusing only on what's important to the
>  example as an alternative to showing full messages (that will have to
>  be carefully reviewed). Some examples of issues that need to be
>  corrected (this is not exhaustive)
>    2.1 Several responses are missing "received=" in their Via header
>      fields
>    2.2 The o= line in answers (as in offer/answer) must be different
>      from the o= line in the offer.
>    2.3 The branch parameter values need to be reviewed very  
> carefully -
>      the first example incorrectly reuses the branch from the INVITE
>      in an ACK to a 200 OK. Then the _next transaction_ also reuses
>      the branch.
>    2.4 There is a to-tag in the OPTIONS request on page 44
>
> 3 The MRCP examples need to be similarly reviewed
>    3.1 are all the content-lengths correct? (I think the 2nd message's
>      on page 59 isn't)

There is only one Content-Length on that page, and it has a value of  
"...", which was done on purpose to avoid minor discrepancies in the  
calculations of content-lengths for the examples when an example is  
updated.
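
To illustrate why hand-maintained values drift, here is a minimal sketch in Python that computes a Content-Length, assuming the value is the number of octets in the encoded message body. The function name and the sample bodies are hypothetical and are not taken from the draft.

```python
def content_length(body: str, charset: str = "utf-8") -> int:
    """Return the number of octets in the encoded body (assumed definition)."""
    return len(body.encode(charset))


# Hypothetical bodies: each CRLF counts as two octets, and a non-ASCII
# character may occupy more than one octet depending on the charset.
original = "<result/>\r\n"
edited = "<result> </result>\r\n"

# Any edit to an example body changes the value that would have to be shown.
print(content_length(original), content_length(edited))
```

Even a one-character change to an example body changes the result, which is the kind of discrepancy that the "..." placeholder avoids.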

>    3.2 It's ok, probably even a good idea, to elide values that are  
> not
>      important to understanding an example, but please be consistent -
>      the first message on page 65, for example, has an explicit
>      (probably incorrect) MRCP length, but elided the mime-body length

Thanks.  I will replace all MRCP message-length values with "...".

>    3.3 The example in section 9.14 shows two RECOGNITION-COMPLETE
>      messages to the same RECOGNIZE. Were these intended to show two
>      alternate possible responses? If so, the document should make
>      that more clear.

I will clarify this.

>
> 4 Returning a SIP 501 at the end of section 4.3 is not the right thing
>  to do. 501 means the responding element does not implement the
>  method. You are probably looking for 488 Not Acceptable Here.
>
> 5 This needs to be run through an ABNF checker. There are production
>  rules and terminals missing - they either need to be defined or
>  pointers to where they are defined need to be added.

Okay.  There will be references or definitions for all rules and  
terminals not currently defined in the document.

>
> 6 The document occasionally mentions an MRCP proxy (there is a 503
>  Proxy Timeout code even), but I can't find where such proxies are
>  defined? Page 32 also talks about intermediaries.
>
> 7 Some additional discussion around connection establishment and
>  sharing/reuse is probably needed
>    7.1 Where does an element look in a peer's certificate to determine
>      it's reached the peer it has intended to reach?
>    7.2 What happens if a connection gets closed?
>    7.3 Must events come over the connection the request was sent on?
>    7.4 There should be some guidance on only reusing connections when
>      the identity of the peer matches what was confirmed when the
>      connection was opened. (Specifically, if it's possible for an
>      MRCPv2 server to host services for more than one domain, you
>      don't want to blindly reuse the connection you made to talk to A
>      to talk to B just because DNS aimed you to the same address/port
>      to reach them.)
>
> 8 Section 6.1.2 should be explicit about what it means by "empty  
> header
>  field"

Will clarify that this is a header field without a value.

>
> 9 With respect to the URI indirection mechanisms defined in the draft:
>    9.1 Much of the text assumes these URIs will be HTTP/HTTPS. But  
> other
>      parts of the text, and the syntax goes out of the way to allow
>      arbitrary URI types. Please help look for places where the
>      recommendations and requirements stated only make sense when the
>      URI is HTTP or HTTPS.
>    9.2 There's currently no discussion about authenticating the
>      requester seeking access to the resource pointed to by one of
>      these URIs. Security considerations should call out that if the
>      URI leaks, the content leaks. There should probably also be more
>      explicit discussion of how long a server should be expected to
>      hold onto the state indicated by such a URI (how long can a
>      client expect it to be there, and when does a server decide a
>      client or set of clients is mounting a state exhaustion attack?),
>      whether it should allow multiple accesses from a single client,
>      whether it should allow accesses from multiple clients, and what
>      it means to a client if the attempt to access the resource fails.
>
> 10 Why is there both a "Fetch Hint" and a "Audio Fetch Hint". Why does
>  the syntax allow for extensibility in the values for those fields?

This is parallel to parameters in VoiceXML, the primary language MRCP  
is used to implement.  "Fetch Hint" is generic across all external  
resources to be fetched (grammars, audio, etc.), while "Audio Fetch  
Hint" is specific to audio.  You are right that the ABNF can be made  
more precise.  I will restrict the ABNF to the allowed values.

>
> 11 On page 111, the document talks about timing between audio flows  
> and
>  RECOGNIZE methods. It claims there are "a number of mechanisms" for
>  dealing with the race conditions. Would it be possible to list a few
>  of these as informative examples? You might also consider pointing
>  out that the delta between the start of an audio flow (or the point
>  in an ongoing stream that you intended to start RECOGNIZEing) and the
>  receipt of a RECOGNIZE command could be quite large if TCP is
>  reacting to congestion. The prohibition at the end of the paragraph
>  ("MUST NOT buffer anything it receives beforehand.") seems odd.
>  What's the rationale for it? Finally - did the group consider
>  indicating RTP timestamps in the RECOGNIZE request to indicate where
>  to start recognition as one of the mechanisms pointed to above?

I will reword this section to point out that mechanisms to resolve  
this race condition are outside the scope of this specification.   
Regarding the no-buffering prohibition, I will clarify that this is in  
order to preserve the semantics that application authors expect with  
respect to the input timers.

>
> 12 Why is the record semantic defined in 10.4.7 different from the one
>  in 9.4.8/9.4.22 (specifically, by providing a way to request a server
>  store something somewhere other than on that server)? Why does this
>  section allow an arbitrary URI scheme to be passed in here? What is
>  an implementation supposed to do if it doesn't know the scheme? What
>  does it do if attempts to use a URI with a scheme it recognizes
>  results in failure? The security considerations section should
>  discuss how this might be abused by providing a URI that points at a
>  victim.
>
> 13 What should an element do if it receives a status code that it
>  doesn't recognize? If that's not already specified in the document,
>  it should be added.
>
> 14 Consider additional clarification around "Note that "GET-PARAMS"
>  returns header values that apply to the whole session and not values
>  that have a request level scope."

Thank you.  I will add an example.

>
> 15 How are parameters like "Confidence Threshold" and "Sensitivity
>  Level" interoperable? Would you expect .5 to mean the same thing to
>  two different implementations? I'm guessing that the intent is that
>  the server gets to interpret these values in an
>  implementation-specific way, and the utility of these knobs is that
>  you tune them over time to a given server. If that's right, the text
>  should explicitly point that out.

For both, I will replace 'The default value for this header is  
implementation specific.' with
'The default value for this header is implementation specific, as is  
the interpretation of any specific value for this header.  It is  
expected that clients will tune this value over time for a given  
server.'

>
> 16 Something I'm still trying to think through and would like other
>  folks to comment on - apologies if I've missed where this is treated
>  already: Can a server ever issue a reINVITE affecting an MRCPv2
>  session (to change codecs for example)? If so, are there any places
>  in the text that need to call that out?
>
> 17 On page 15, there's a requirement that "There MUST be one SDP m- 
> line
>  for each MRCPv2 resource to be used in the session. " This looks like
>  it would prevent offering things like alternates, v4 and v6, etc. Is
>  this what's intended?
>
> 18 Nits
>    18.1 Section 3 paragraph 2 sentence 1: SIP is not the "session
>      management protocol"

Will fix.

>    18.2 The word "pipe" is used ("control pipe", "audio pipe" for
>      example) with no definition and there are well-defined terms that
>      could be used instead.
>    18.3 Paragraph spanning pages 15 and 16 - I suggest explicitly  
> noting
>      that the reINVITE receives an error response.

Will do.

>    18.4 There is an unnatural break in the flow of the prose on page  
> 16
>      when the text shifts from an overview of the protocol to giving
>      an example. Suggest breaking the example into a subsection to
>      make it clear what you're intending.

Will promote this example to a section of its own.

>    18.5 Please use the terms "header" and "header
>      field" consistently and align the use of those phrases with
>      the definitions in section 2.1 of RFC 5322.

Will do.

>    18.6 The conditional language in 6.1.1 is hard to follow. In
>      particular, the paragraph starting "If both error 404 and
>      another" is awkward. Please consider clarifying these clauses.

Thanks.  We have rewritten this section several times, and this  
wording has been the best we could come up with so far.

>    18.7 typo on page 43: "veriifcation"

Will fix.

>    18.8 The string "ECMAScript" is used once with no definition.

It was not necessary in the sentence, so I will remove it.

>    18.9 The term "kill-on-barge-in" is used without any definition.
>      Please add a reference or a definition.

Actually, this is defined in section 8.4.2.  I will add more  
references to this section.

>    18.10 Page 121 says: "The Personal-Grammar-URI,"..."is  
> created"... . I
>      think you meant to say the resource indicated by that URI is
>      created.

Thanks.  I will reword this.

>    18.11 Consider using "octet" for "byte". In places where you are
>      describing lengths, consider talking about whether leading 0s
>      have meaning (it would probably be good to explicitly call out
>      that you don't want such a string to be interpreted base-8).
>
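
As an illustration of the base-8 concern in 18.11 (not a reply to it), the following Python sketch shows how the same digit string yields different values depending on the base chosen by the parser. The function name parse_length is hypothetical, and the rule it applies (always decimal, leading zeros carry no meaning) is an assumption about what the text might specify.

```python
def parse_length(field: str) -> int:
    """Parse a length field as decimal, ignoring leading zeros (assumed rule)."""
    if not (field.isascii() and field.isdigit()):
        raise ValueError(f"not a decimal length: {field!r}")
    return int(field, 10)


# The same string read as decimal and, incorrectly, as octal.
as_decimal = parse_length("0100")
as_octal = int("0100", 8)
print(as_decimal, as_octal)
```

A parser that treated a leading 0 as an octal prefix would compute a different message length from one that reads the field as decimal, so the two ends would disagree about where the message ends.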


From dburnett@voxeo.com  Tue Dec 29 03:29:23 2009
Message-Id: <E1BF1CBD-FBA6-4221-8DB0-C86BC9AB7E08@voxeo.com>
From: Dan Burnett <dburnett@voxeo.com>
To: Corby Anderson <corbya@microsoft.com>
In-Reply-To: <EF149B22CD1213419BF4DFE038422CAC6D6C23@TK5EX14MBXC116.redmond.corp.microsoft.com>
Content-Type: multipart/alternative; boundary=Apple-Mail-41-311098442
Mime-Version: 1.0 (Apple Message framework v936)
Date: Tue, 29 Dec 2009 06:28:59 -0500
References: <EF149B22CD1213419BF4DFE038422CAC6D6C23@TK5EX14MBXC116.redmond.corp.microsoft.com>
X-Mailer: Apple Mail (2.936)
Cc: "speechsc@ietf.org" <speechsc@ietf.org>
Subject: Re: [Speechsc] Confusuion with INTERPRET

--Apple-Mail-41-311098442
Content-Type: text/plain;
	charset=WINDOWS-1252;
	format=flowed;
	delsp=yes
Content-Transfer-Encoding: quoted-printable

Corby,

Thanks for catching these typos.  You are correct, and I will update
as you suggest.  See below.

-- dan

On Aug 21, 2009, at 9:13 PM, Corby Anderson wrote:

> Does section 9.20 INTERPRET need some clarification? 9.20 states
> that INTERPRET should return an INTERPRETATION-COMPLETE event
> (as described in 9.21), but the example in section 9.20 shows the
> following response:
>
>    S->C:    MRCP/2.0 49 543267 200 COMPLETE
>            Channel-Identifier:32AECB23433801@speechrecog
>            Completion-Cause:000 success
>            Content-Type:application/nlsml+xml
>            Content-Length:...
>
> That S->C format is for responses (5.3), not events (5.5).  Contrast
> this with the RECOGNITION-COMPLETE event to RECOGNIZE:
>
>    S->C:MRCP/2.0 486 RECOGNITION-COMPLETE 543260 COMPLETE
>    Channel-Identifier:32AECB23433801@speechrecog
>    Completion-Cause:000 success
>    Waveform-URI:<http://web.media.com/session123/audio.wav>;
>                 size=124535;duration=2340
>    Content-Type:application/x-nlsml
>    Content-Length:...
>
> Shouldn't the first line of the INTERPRETATION-COMPLETE event be
> something like the following?
>    S->C:    MRCP/2.0 49 INTERPRETATION-COMPLETE 543267 COMPLETE

Yes.  I will correct this in 9.20 and 9.21 in the next draft.
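
The difference between the two start-lines quoted above can be sketched in Python as follows. This is only an illustration based on the two examples in this thread: a response carries a numeric request-id followed by a numeric status code, while an event carries an event name followed by the request-id. The function name and the returned dictionary layout are hypothetical.

```python
def classify_start_line(line: str) -> dict:
    """Classify an MRCPv2 start-line as a response or an event (illustrative)."""
    tokens = line.split()
    if len(tokens) != 5 or tokens[0] != "MRCP/2.0":
        raise ValueError(f"not a response or event start-line: {line!r}")
    _version, _length, third, fourth, state = tokens
    if third.isdigit() and fourth.isdigit():
        # Response: version length request-id status-code request-state
        return {"kind": "response", "request_id": int(third),
                "status_code": int(fourth), "request_state": state}
    if fourth.isdigit():
        # Event: version length event-name request-id request-state
        return {"kind": "event", "event_name": third,
                "request_id": int(fourth), "request_state": state}
    raise ValueError(f"unrecognized start-line: {line!r}")


# The line currently shown in 9.20, and the corrected event form.
print(classify_start_line("MRCP/2.0 49 543267 200 COMPLETE"))
print(classify_start_line("MRCP/2.0 49 INTERPRETATION-COMPLETE 543267 COMPLETE"))
```

With this check, the line currently in the 9.20 example is classified as a response, whereas the corrected line is classified as an event.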

>
> The only mentions of INTERPRETATION-COMPLETE in the spec are
> * table of contents
> * 9.3 Recognizer events
> * 9.21 where it's described
> * 13.1.2 MRCPv2 methods and events
> * 15 Normative definition
>
> I found no usage examples for INTERPRETATION-COMPLETE; most notably
> not in 9.20
>
>
>
> Also, section 9.9 states
>    For the recognizer resource, RECOGNIZE is the only request that
>    returns a request-state of IN-PROGRESS, meaning that recognition is
>    in progress.
>
> But the example in 9.20 for INTERPRET shows
>    S->C:    MRCP/2.0 49 543266 200 IN-PROGRESS
>            Channel-Identifier:32AECB23433801@speechrecog
>
> Is the recognizer resource the resource that performs
> interpretation?  If so, then the text in 9.9 should be changed to
> say the following:
>    For the recognizer resource, RECOGNIZE and INTERPRET are the only
>    requests that return a request-state of IN-PROGRESS, meaning that
>    recognition or interpretation is in progress.

I will also make this change in the next draft.

>
>
> Corby Anderson
>


--Apple-Mail-41-311098442
Content-Type: text/html;
	charset=WINDOWS-1252
Content-Transfer-Encoding: quoted-printable

<html><body style=3D"word-wrap: break-word; -webkit-nbsp-mode: space; =
-webkit-line-break: after-white-space; =
">Corby,<div><br></div><div>Thanks for catching these typos. &nbsp;You =
are correct, and I will update as you suggest. &nbsp;See =
below.</div><div><br></div><div>-- dan</div><div><br><div><div>On Aug =
21, 2009, at 9:13 PM, Corby Anderson wrote:</div><br =
class=3D"Apple-interchange-newline"><blockquote type=3D"cite"><span =
class=3D"Apple-style-span" style=3D"border-collapse: separate; color: =
rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: =
normal; font-variant: normal; font-weight: normal; letter-spacing: =
normal; line-height: normal; orphans: 2; text-align: auto; text-indent: =
0px; text-transform: none; white-space: normal; widows: 2; word-spacing: =
0px; -webkit-border-horizontal-spacing: 0px; =
-webkit-border-vertical-spacing: 0px; =
-webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: =
auto; -webkit-text-stroke-width: 0px; "><div lang=3D"EN-US" link=3D"blue" =
vlink=3D"purple"><div class=3D"Section1"><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
11pt; font-family: Calibri, sans-serif; ">Does section 9.20 INTERPRET =
need some clarification? 9.20 states that INTERPRETATION should return =
an INTERPRETATION-COMPLETE event (as described in 9.21), but the example =
in section 9.20 shows the following response:<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
"><o:p>&nbsp;</o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; ">&nbsp;&nbsp; =
S-&gt;C:&nbsp;&nbsp;&nbsp; MRCP/2.0 49 543267 200 =
COMPLETE<o:p></o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
Channel-Identifier:32AECB23433801@speechrecog<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
Completion-Cause:000 success<o:p></o:p></div><div style=3D"margin-top: =
0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
Content-Type:application/nlsml+xml<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
Content-Length:...<o:p></o:p></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
11pt; font-family: Calibri, sans-serif; "><o:p>&nbsp;</o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">That S-&gt;C format is for responses (5.3), not events (5.5).&nbsp; =
Contrast this with the RECOGNITION-RESPONSE event to =
RECOGNIZE:<o:p></o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; "><o:p>&nbsp;</o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp;&nbsp; S-&gt;C:MRCP/2.0 486 RECOGNITION-COMPLETE 543260 =
COMPLETE<o:p></o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; ">&nbsp;&nbsp; =
Channel-Identifier:32AECB23433801@speechrecog<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp;&nbsp; Completion-Cause:000 success<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp;&nbsp; Waveform-URI:&lt;<a =
href=3D"http://web.media.com/session123/audio.wav" style=3D"color: blue; =
text-decoration: underline; =
">http://web.media.com/session123/audio.wav</a>&gt;;<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;=
&nbsp;&nbsp;&nbsp; size=3D124535;duration=3D2340<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp;&nbsp; Content-Type:applicationt/x-nlsml<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp;&nbsp; Content-Length:...<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
"><o:p>&nbsp;</o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; ">Shouldn=92t the first line of the =
INTERPRETATION-COMPLETE event be something like the =
following?<o:p></o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; ">&nbsp;&nbsp; =
S-&gt;C:&nbsp;&nbsp;&nbsp; MRCP/2.0 49 INTERPRETATION-COMPLETE 543267 =
COMPLETE</div></div></div></span></blockquote><div><br></div>Yes. =
&nbsp;I will correct this in 9.20 and 9.21 in the next =
draft.</div><div><br><blockquote type=3D"cite"><span =
class=3D"Apple-style-span" style=3D"border-collapse: separate; color: =
rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: =
normal; font-variant: normal; font-weight: normal; letter-spacing: =
normal; line-height: normal; orphans: 2; text-align: auto; text-indent: =
0px; text-transform: none; white-space: normal; widows: 2; word-spacing: =
0px; -webkit-border-horizontal-spacing: 0px; =
-webkit-border-vertical-spacing: 0px; =
-webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: =
auto; -webkit-text-stroke-width: 0px; "><div lang=3D"EN-US" link=3D"blue" =
vlink=3D"purple"><div class=3D"Section1"><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
11pt; font-family: Calibri, sans-serif; "><o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
"><o:p>&nbsp;</o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; ">The only mention of =
INTERPRETATION-COMPLETE in the spec are<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; ">* =
table of contents<o:p></o:p></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
11pt; font-family: Calibri, sans-serif; ">* 9.3 Recognizer =
events<o:p></o:p></div><div style=3D"margin-top: 0in; margin-right: 0in; =
margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; font-family: =
Calibri, sans-serif; ">* 9.21 where it=92s =
described<o:p></o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; ">* 13.1.2 MRCPv2 methods and =
events<o:p></o:p></div><div style=3D"margin-top: 0in; margin-right: 0in; =
margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; font-family: =
Calibri, sans-serif; ">* 15 Normative definition<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
"><o:p>&nbsp;</o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; ">I found no usage examples for =
INTERPRETATION-COMPLETE; most notably not in 9.20<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
"><o:p>&nbsp;</o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; "><o:p>&nbsp;</o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
"><o:p>&nbsp;</o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; ">Also, section 9.9 =
states<o:p></o:p></div><div style=3D"margin-top: 0in; margin-right: 0in; =
margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; font-family: =
Calibri, sans-serif; ">&nbsp;&nbsp; For the recognizer resource, =
RECOGNIZE is the only request that<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp;&nbsp; returns a request-state of IN-PROGRESS, meaning that =
recognition is<o:p></o:p></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
11pt; font-family: Calibri, sans-serif; ">&nbsp;&nbsp; in =
progress.<o:p></o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; "><o:p>&nbsp;</o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">But the example in 9.20 for INTERPRET shows<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp;&nbsp; S-&gt;C:&nbsp;&nbsp;&nbsp; MRCP/2.0 49 543266 200 =
IN-PROGRESS<o:p></o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
Channel-Identifier:32AECB23433801@speechrecog<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
"><o:p>&nbsp;</o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; ">Is the recognizer resource the =
resource that performs interpretation?&nbsp; If so, then the text in 9.9 =
should be changed to say the following:<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp;&nbsp; For the recognizer resource, RECOGNIZE and INTERPRET are =
the only<o:p></o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; ">&nbsp; &nbsp;requests that return a =
request-state of IN-PROGRESS, meaning that<o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">&nbsp; &nbsp;recognition or interpretation is in =
progress.</div></div></div></span></blockquote><div><br></div>I will =
also make this change in the next draft.</div><div><br><blockquote =
type=3D"cite"><span class=3D"Apple-style-span" style=3D"border-collapse: =
separate; color: rgb(0, 0, 0); font-family: Helvetica; font-size: =
medium; font-style: normal; font-variant: normal; font-weight: normal; =
letter-spacing: normal; line-height: normal; orphans: 2; text-align: =
auto; text-indent: 0px; text-transform: none; white-space: normal; =
widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; =
-webkit-border-vertical-spacing: 0px; =
-webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: =
auto; -webkit-text-stroke-width: 0px; "><div lang=3D"EN-US" link=3D"blue" =
vlink=3D"purple"><div class=3D"Section1"><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
11pt; font-family: Calibri, sans-serif; "><o:p></o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
"><o:p>&nbsp;</o:p></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 11pt; =
font-family: Calibri, sans-serif; "><o:p>&nbsp;</o:p></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 11pt; font-family: Calibri, sans-serif; =
">Corby Anderson<o:p></o:p></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
11pt; font-family: Calibri, sans-serif; =
"><o:p>&nbsp;</o:p></div></div>___________________________________________=
____<br>Speechsc mailing list<br><a href=3D"mailto:Speechsc@ietf.org" =
style=3D"color: blue; text-decoration: underline; =
">Speechsc@ietf.org</a><br><a =
href=3D"https://www.ietf.org/mailman/listinfo/speechsc" style=3D"color: =
blue; text-decoration: underline; =
">https://www.ietf.org/mailman/listinfo/speechsc</a><br>Supplemental web =
site:<br>&amp;lt;<a =
href=3D"http://www.standardstrack.com/ietf/speechsc&amp;gt" =
style=3D"color: blue; text-decoration: underline; =
">http://www.standardstrack.com/ietf/speechsc&amp;gt</a>;</div></span></bl=
ockquote></div><br></div></body></html>=

--Apple-Mail-41-311098442--

From dburnett@voxeo.com  Tue Dec 29 03:01:25 2009
Return-Path: <dburnett@voxeo.com>
X-Original-To: speechsc@core3.amsl.com
Delivered-To: speechsc@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id C68B23A680B; Tue, 29 Dec 2009 03:01:25 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 0.301
X-Spam-Level: 
X-Spam-Status: No, score=0.301 tagged_above=-999 required=5 tests=[AWL=-0.300,  BAYES_50=0.001, HTML_MESSAGE=0.001, J_CHICKENPOX_16=0.6]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zo95H47v-TAY; Tue, 29 Dec 2009 03:01:20 -0800 (PST)
Received: from voxeo.com (mmail.voxeo.com [66.193.54.208]) by core3.amsl.com (Postfix) with ESMTP id D3D5A3A6844; Tue, 29 Dec 2009 03:01:19 -0800 (PST)
Received: from [71.204.33.81] (account dburnett HELO [192.168.15.111]) by voxeo.com (CommuniGate Pro SMTP 5.2.3) with ESMTPSA id 55101526; Tue, 29 Dec 2009 11:00:52 +0000
Message-Id: <C46B7F31-9989-442C-B2F1-CA77E79F04F8@voxeo.com>
From: Dan Burnett <dburnett@voxeo.com>
To: Roni Even <Even.roni@huawei.com>
In-Reply-To: <027801ca1b1c$c2e8ee80$48bacb80$%roni@huawei.com>
Content-Type: multipart/alternative; boundary=Apple-Mail-40-309409471
Mime-Version: 1.0 (Apple Message framework v936)
Date: Tue, 29 Dec 2009 06:00:50 -0500
References: <033101c9ff3a$cbe33160$63a99420$%roni@huawei.com> <E2C626B8-8CA1-4A1D-A2CE-B6AB4B269DEE@voxeo.com> <027801ca1b1c$c2e8ee80$48bacb80$%roni@huawei.com>
X-Mailer: Apple Mail (2.936)
X-Mailman-Approved-At: Tue, 29 Dec 2009 03:37:58 -0800
Cc: speechsc@ietf.org, sarvi@cisco.com, oran@cisco.com, rai@ietf.org
Subject: Re: [Speechsc] RAI review of draft-ietf-speechsc-mrcpv2-19
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/speechsc>, <mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/speechsc>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/speechsc>, <mailto:speechsc-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 29 Dec 2009 11:01:25 -0000

--Apple-Mail-40-309409471
Content-Type: text/plain;
	charset=WINDOWS-1252;
	format=flowed;
	delsp=yes
Content-Transfer-Encoding: quoted-printable

Hi Roni,

Just to finish up on your last comments . . .

-- dan

On Aug 12, 2009, at 3:15 AM, Roni Even wrote:

> Hi Dan,
> I understand your explanation about all these "vendor specific" =20
> parameters. I think that since this is a standards track document, =20
> there should be some text explaining the usage of these parameters, =20
> as well as a note that since this is vendor-specific information, =20
> you cannot compare the values coming from different vendors.

Thank you.  I will note this in the next draft and suggest how these =20
parameters may be used in light of their vendor dependence.

>
>
> As for my comment number 5 on payload type 96. My comment was that =20
> if the m-line has a payload type number of 96 you must have a =20
> a=3Drtpmap line mapping 96 to a specific subtype name while for pcmu =20=

> it is not mandatory to have a=3Drtpmap like you have in your examples =20=

> since payload type number 0 is a static payload type number assigned =20=

> to pcmu
>

I'm sorry, I did not explain this very well.  I understood your =20
comment.  My reply was that of the three examples, example 2 did =20
actually provide the a=3Drtpmap line for 96.  Since the payload type of =20=

96 should not even have been included in the first and third examples, =20=

once I removed it from those two examples all three contained the =20
proper a=3Drtpmap lines.
Although it is not necessary to have an a=3Drtpmap line for payload =20
type 0, others in the past had requested it, so I left it in.
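
For anyone who wants to check their own SDP examples against the rule
discussed above, here is a minimal sketch, not taken from the draft.
It assumes RTP/AVP-style m-lines whose formats are numeric payload
types, and it treats 96-127 as the dynamic range. The function name
missing_rtpmaps and the sample offer are hypothetical, chosen only for
illustration.

```python
def missing_rtpmaps(sdp):
    """Return dynamic payload types (96 and up) that an m= line lists
    without an a=rtpmap line in the same media section."""
    sections = []  # one (offered, mapped) pair per m= line
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            fmts = line.split()[3:]
            offered = [int(f) for f in fmts if f.isdigit()]
            sections.append((offered, set()))
        elif line.startswith("a=rtpmap:") and sections:
            parts = line[len("a=rtpmap:"):].split()
            if parts and parts[0].isdigit():
                sections[-1][1].add(int(parts[0]))
    # Static types such as 0 (PCMU) need no rtpmap, so only 96+ count.
    return [pt for offered, mapped in sections
            for pt in offered if pt >= 96 and pt not in mapped]


# Illustrative offer: 96 is listed on the m-line but never mapped.
offer = "m=audio 49170 RTP/AVP 0 96\na=rtpmap:0 PCMU/8000\n"
print(missing_rtpmaps(offer))  # [96]
```

An empty list means every dynamic payload type on each m-line has its
own a=rtpmap line; an a=rtpmap line for payload type 0 is accepted but
never required by this check.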

>
> Roni Even
>
> From: Dan Burnett [mailto:dburnett@voxeo.com]
> Sent: Tuesday, August 11, 2009 9:22 PM
> To: Roni Even
> Cc: sarvi@cisco.com; oran@cisco.com; 'Eric Burger'; =20
> speechsc@ietf.org; rai@ietf.org
> Subject: Re: RAI review of draft-ietf-speechsc-mrcpv2-19
>
>
> On Jul 7, 2009, at 3:40 PM, Roni Even wrote:
>
>
> Hi,
>
> I was assigned to do a RAI review of the draft.  The draft looks =20
> ready for publication to me. I have some comments mostly editorial.
>
> The only issue I see that is not pure editorial is the issue of the =20=

> different parameters like confidence threshold, sensitivity level =20
> (see comments 11, 13, 15, 16 and 17). I think that some =20
> clarification on the semantics and the scale (for example are the =20
> values linearly spaced) as well as when they are useful will be =20
> helpful to implementers.
>
> 1.       In figure 1 Expand the abbreviations TTS, ASR, SV , SI and =20=

> how they are related to the media resource types in 3.1
>
>
> Done.  Added some text explaining Figure 1 and enhanced Figure 1 =20
> slightly for clarification.
>
> 2.       In figure 1 there is a SIP dialog between the MRCPv2 client =20=

> and the media source/sink, what is this dialog, I only saw in =20
> section 4 a dialog between the client and server.
>
> Clarified in the first example of section 4.2 that the SIP dialog =20
> with the media source/sink is not shown.
> 3.       In section 3.2 you have =93For example: =20
> sip:mrcpv2@example.net=94 twice one after the other.
>
> Fixed.
>
>
> 4.       In the example in section 4.2 you have =93a=3Dcmid:1=94; =20
> cmid is specified later in the document, so maybe you can add a =20
> reference to where it is specified.
>
> Done.
>
>
>
> 5.       In the example in section 4.2 and in following examples you =20
> have =93m=3Daudio 49170 RTP/AVP 0 96=94 but do not have an rtpmap =20
> parameter for mapping 96 (dynamic payload type number) to a media =20
> encoding name.
>
> It is not in the first or third examples (Synthesizer only), but it =20=

> is in the second example (Recognizer).  I have removed 96 as an =20
> option for the Synthesizer-only examples but let it remain as an =20
> addition for the Recognizer example.
>
>
>
> 6.       In section 4.3 =93Also note that more that one media session =20=

> can be associated with a single resource if need be, but this =20
> scenario is not useful for the current set of resources=94. There is a =
=20
> typo the second =93that=94 should be =93than=94. I am also not sure if =
the =20
> current syntax in this document can support the mode.
>
> Fixed the typo.
>
>
>
> 7.       In section 4.3 =93The formatting of the"cmid" attribute in =20=

> SDP RFC3388 [RFC4566]=94. I think you meant SDP grouping and need the =20=

> reference to RFC 3388.
>
> I removed the reference altogether because it already exists =20
> (correctly) earlier in the paragraph.
>
>
>
> 8.       In section 5.1 =93The message-length field specifies the =20
> length of the message, including the start-line=94 is the length in =20=

> Bytes, there is no unit specified.
>
> Changed "length of the message" to "length of the message in bytes".
>
>
>
> 9.       In section 6.3.1, typo you have =93Verfication =93 instead of =
=20
> verification. It appears twice in the section.
>
> Fixed.
>
>
>
> 10.   In the example in section 7 you have =93m=3Daudio 0 RTP/AVP 0 1 =
3=94 =20
> payload type 1 was deleted from the IANA registry, maybe have =20
> another payload type number.
>
> I just removed that payload type.  It is not germane to the example.
>
>
>
> 11.   In section 9.4.1, 9.4.2 and 9.4.3 you specify confidence =20
> threshold, sensitivity level and speed vs accuracy. What is the =20
> scale here; is it linear between 0 and 1. What is the absolute value =20=

> of the number, if you receive the same confidence level from two =20
> recognizers are they the same (e.g. when using context block to =20
> switch servers).  For the speed vs accuracy, how does the client =20
> know what is the relation between the value and the number of =20
> available sessions, since this seems to be the reason for using this =20=

> parameter.
>
> The interpretation of all of these parameters is implementation-=20
> specific because the underlying technologies used to implement them =20=

> vary and can even be proprietary.  In practice the speech =20
> recognition and synthesis and speaker authentication communities =20
> have lived with this state of affairs for many years, and users of =20
> other APIs for this technology are well aware of and have built =20
> applications that accommodate this variability in interpretation.  =20
> It is outside the scope of this specification to attempt to =20
> standardize interpretations of these values.
>
>
> 12.   In 9.4.9 and in 10.4.8, 11.4.11 what are the values for media-=20=

> type-value, you also mention audio and video but it looks to me that =20=

> this document only discusses voice.
>
> Yes.  Although the original intent was to record speech, application =20=

> authors today are beginning to look at ways to incorporate other =20
> audio or video.  The intent of the sentences in these sections is to =20=

> clarify that the specification itself imposes no restriction on the =20=

> types of media that are allowed.
>
>
>
> 13.   In 9.4.35 and 9.4.36 what is the scale for the consistency =20
> here. How does one know what close means. What is the consistency =20
> between different recognizers.
>
> The answer to question 11, above, applies here as well.
>
>
>
> 14.   In section 9.6.3.3 in the example (figure 2) confidence should =20=

> be 0.75 and not 75
>
> Fixed.
>
>
>
> 15.   In section 10.4.1 it is not clear how you measure the =20
> sensitivity in order to specify, is it based on some SNR translated =20=

> to 0 to 1 scale?
>
> The answer to question 11, above, applies here as well.
>
>
>
> 16.   In 11.4.6 the same issue with the scale, how does the client =20
> know how to set a value when working with different speaker =20
> verification servers.
>
> Ditto.  I should point out that in all of these cases the parameters =20=

> are typically passed directly to the engine, and their =20
> interpretations are defined (and described) in the vendors' =20
> documentation.  The most common MRCPv2 server implementations are by =20=

> the technology vendors themselves (the providers of the synthesis, =20
> recognition, and verification engines).  This is commonly understood =20=

> in this technology industry (meaning those who use this technology =20
> regularly).
>
>
>
> 17.   In 11.5.2.9 you state that the verification-score is not a =20
> probability, so what is it. How can the client decide if, for =20
> example, 0 is a good score for specifying the threshold.  I also =20
> noticed that the values in the example in section 11.5.2.10 are very =20=

> precise like 0.98514 is this the expected precision. The examples =20
> here and in section 11.11 do not show the threshold, if the =20
> threshold is required for this flow why not show it in the example?
>
> This parameter, as others mentioned above, has only a vendor-=20
> specific interpretation.  In practice authors interpret these values =20=

> based both on guidance from the technology vendors and via =20
> experimentation on large sets of recorded data.
>
> The Min-Verification-Score threshold is not required to be set.  In =20=

> many cases the technology vendor has a fairly good understanding of =20=

> what the default threshold should be.  The verification-score is =20
> returned, however, in case the application author determines =20
> (through experimentation, as described above) that the default =20
> threshold is not producing optimal results for the application.  In =20=

> that case the author can set the threshold to a different value or =20
> can set it to -1 and make the determination within the application =20
> itself based on the verification-score values.
>
>
>
> 18.   In section 12.3 the suggestion is to use SRTP as the mandatory =20=

> interoperability mode. If the reason for mandating SRTP is for a =20
> common mode you should also decide on a key exchange mechanism. I =20
> suggest you look at =
http://tools.ietf.org/html/draft-ietf-avt-srtp-not-mandatory-02=20
>  for discussion on media security.
>
> Based on the discussion between you and Dan York on the list, I will =20=

> change this:
>
> 12.3. Media session protection
> Sensitive data is also carried on media sessions terminating on =20
> MRCPv2 servers (the other end of a media channel may or may not be =20
> on the MRCPv2 client). This data includes the user's spoken =20
> utterances and the output of text-to-speech operations. MRCPv2 =20
> servers MUST support SRTP for protection of audio media sessions. =20
> MRCPv2 clients that originate or consume audio similarly MUST =20
> support SRTP. Alternative media channel protection MAY be used if =20
> desired (e.g. IPSEC).
>
> to this:
>
> 12.3. Media session protection
> Sensitive data is also carried on media sessions terminating on =20
> MRCPv2 servers (the other end of a media channel may or may not be =20
> on the MRCPv2 client). This data includes the user's spoken =20
> utterances and the output of text-to-speech operations. MRCPv2 =20
> servers MUST support a security mechanism for protection of audio =20
> media sessions. MRCPv2 clients that originate or consume audio =20
> similarly MUST support a security mechanism for protection of the =20
> audio. If appropriate, usage of the Secure Real-time Transport =20
> Protocol (SRTP) [RFC3711] is recommended.
>
> 19.   In section 13.7.2 you specify the attribute resource as =20
> session level, yet in the example in section 4.2 it is a media level =20
> attribute. The same goes for the channel attribute.
>
> I have corrected both in section 13.7.2 to be media-level.
>
>
>
> Thanks
>
> Roni Even
>
>
>


--Apple-Mail-40-309409471
Content-Type: text/html;
	charset=WINDOWS-1252
Content-Transfer-Encoding: quoted-printable

<html><body style=3D"word-wrap: break-word; -webkit-nbsp-mode: space; =
-webkit-line-break: after-white-space; ">Hi =
Roni,<div><br></div><div>Just to finish up on your last comments . . =
.</div><div><br></div><div>-- dan</div><div><br><div><div><div>On Aug =
12, 2009, at 3:15 AM, Roni Even wrote:</div><br =
class=3D"Apple-interchange-newline"><blockquote type=3D"cite"><span =
class=3D"Apple-style-span" style=3D"border-collapse: separate; color: =
rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: =
normal; font-variant: normal; font-weight: normal; letter-spacing: =
normal; line-height: normal; orphans: 2; text-align: auto; text-indent: =
0px; text-transform: none; white-space: normal; widows: 2; word-spacing: =
0px; -webkit-border-horizontal-spacing: 0px; =
-webkit-border-vertical-spacing: 0px; =
-webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: =
auto; -webkit-text-stroke-width: 0px; "><div lang=3D"EN-US" link=3D"blue" =
vlink=3D"purple" style=3D"word-wrap: break-word; -webkit-nbsp-mode: =
space; -webkit-line-break: after-white-space; "><div =
class=3D"Section1"><div style=3D"margin-top: 0in; margin-right: 0in; =
margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: =
'Times New Roman', serif; "><span style=3D"font-size: 11pt; font-family: =
Calibri, sans-serif; color: rgb(31, 73, 125); ">Hi =
Dan,<o:p></o:p></span></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; =
font-family: 'Times New Roman', serif; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: rgb(31, 73, 125); ">I =
understand your explanation about all these "vendor specific" =
parameters. I think that since this is a standards track document, =
there should be some text explaining the usage of these parameters, as =
well as a note that since this is vendor-specific information, you =
cannot compare the values coming from different =
vendors.</span></div></div></div></span></blockquote><div><br></div>Thank =
you. &nbsp;I will note this in the next draft and suggest how these =
parameters may be used in light of their vendor =
dependence.</div><div><br><blockquote type=3D"cite"><span =
class=3D"Apple-style-span" style=3D"border-collapse: separate; color: =
rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: =
normal; font-variant: normal; font-weight: normal; letter-spacing: =
normal; line-height: normal; orphans: 2; text-align: auto; text-indent: =
0px; text-transform: none; white-space: normal; widows: 2; word-spacing: =
0px; -webkit-border-horizontal-spacing: 0px; =
-webkit-border-vertical-spacing: 0px; =
-webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: =
auto; -webkit-text-stroke-width: 0px; "><div lang=3D"EN-US" link=3D"blue" =
vlink=3D"purple" style=3D"word-wrap: break-word; -webkit-nbsp-mode: =
space; -webkit-line-break: after-white-space; "><div =
class=3D"Section1"><div style=3D"margin-top: 0in; margin-right: 0in; =
margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: =
'Times New Roman', serif; "><span style=3D"font-size: 11pt; font-family: =
Calibri, sans-serif; color: rgb(31, 73, 125); =
"><o:p></o:p></span></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; =
font-family: 'Times New Roman', serif; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: rgb(31, 73, 125); =
"><o:p>&nbsp;</o:p></span></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125); =
"><o:p>&nbsp;</o:p></span></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125); ">As =
for my comment number 5 on payload type 96. My comment was that if the =
m-line has a payload type number of 96 you must have a a=3Drtpmap line =
mapping 96 to a specific subtype name while for pcmu it is not mandatory =
to have a=3Drtpmap like you have in your examples since payload type =
number 0 is a static payload type number assigned to =
pcmu<o:p></o:p></span></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; =
font-family: 'Times New Roman', serif; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: rgb(31, 73, 125); =
"><o:p>&nbsp;</o:p></span></div></div></div></span></blockquote><div><br><=
/div>I'm sorry, I did not explain this very well. &nbsp;I understood =
your comment. &nbsp;My reply was that of the three examples, example 2 =
did actually provide the a=3Drtpmap line for 96. &nbsp;Since the payload =
type of 96 should not even have been included in the first and third =
examples, once I removed it from those two examples all three contained =
the proper a=3Drtpmap lines.</div><div>Although it is not necessary to =
have an a=3Drtpmap line for payload type 0, others in the past had =
requested it, so I left it in.</div><div><br><blockquote =
type=3D"cite"><span =
class=3D"Apple-style-span" style=3D"border-collapse: separate; color: =
rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: =
normal; font-variant: normal; font-weight: normal; letter-spacing: =
normal; line-height: normal; orphans: 2; text-align: auto; text-indent: =
0px; text-transform: none; white-space: normal; widows: 2; word-spacing: =
0px; -webkit-border-horizontal-spacing: 0px; =
-webkit-border-vertical-spacing: 0px; =
-webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: =
auto; -webkit-text-stroke-width: 0px; "><div lang=3D"EN-US" link=3D"blue" =
vlink=3D"purple" style=3D"word-wrap: break-word; -webkit-nbsp-mode: =
space; -webkit-line-break: after-white-space; "><div =
class=3D"Section1"><div style=3D"margin-top: 0in; margin-right: 0in; =
margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: =
'Times New Roman', serif; "><span style=3D"font-size: 11pt; font-family: =
Calibri, sans-serif; color: rgb(31, 73, 125); =
"><o:p>&nbsp;</o:p></span></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125); ">Roni =
Even<o:p></o:p></span></div><div style=3D"margin-top: 0in; margin-right: =
0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; =
font-family: 'Times New Roman', serif; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: rgb(31, 73, 125); =
"><o:p>&nbsp;</o:p></span></div><div style=3D"border-top-style: none; =
border-right-style: none; border-bottom-style: none; border-width: =
initial; border-color: initial; border-left-style: solid; =
border-left-color: blue; border-left-width: 1.5pt; padding-top: 0in; =
padding-right: 0in; padding-bottom: 0in; padding-left: 4pt; "><div><div =
style=3D"border-right-style: none; border-bottom-style: none; =
border-left-style: none; border-width: initial; border-color: initial; =
border-top-style: solid; border-top-color: rgb(181, 196, 223); =
border-top-width: 1pt; padding-top: 3pt; padding-right: 0in; =
padding-bottom: 0in; padding-left: 0in; position: static; z-index: auto; =
"><div style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: =
0.0001pt; margin-left: 0in; font-size: 12pt; font-family: 'Times New =
Roman', serif; "><b><span style=3D"font-size: 10pt; font-family: Tahoma, =
sans-serif; ">From:</span></b><span style=3D"font-size: 10pt; =
font-family: Tahoma, sans-serif; "><span =
class=3D"Apple-converted-space">&nbsp;</span>Dan Burnett [<a =
href=3D"mailto:dburnett@voxeo.com" style=3D"color: blue; =
text-decoration: underline; ">mailto:dburnett@voxeo.com</a>]<span =
class=3D"Apple-converted-space">&nbsp;</span><br><b>Sent:</b><span =
class=3D"Apple-converted-space">&nbsp;</span>Tuesday, August 11, 2009 =
9:22 PM<br><b>To:</b><span =
class=3D"Apple-converted-space">&nbsp;</span>Roni =
Even<br><b>Cc:</b><span class=3D"Apple-converted-space">&nbsp;</span><a =
href=3D"mailto:sarvi@cisco.com" style=3D"color: blue; text-decoration: =
underline; ">sarvi@cisco.com</a>;<span =
class=3D"Apple-converted-space">&nbsp;</span><a =
href=3D"mailto:oran@cisco.com" style=3D"color: blue; text-decoration: =
underline; ">oran@cisco.com</a>; 'Eric Burger';<span =
class=3D"Apple-converted-space">&nbsp;</span><a =
href=3D"mailto:speechsc@ietf.org" style=3D"color: blue; text-decoration: =
underline; ">speechsc@ietf.org</a>;<span =
class=3D"Apple-converted-space">&nbsp;</span><a =
href=3D"mailto:rai@ietf.org" style=3D"color: blue; text-decoration: =
underline; ">rai@ietf.org</a><br><b>Subject:</b><span =
class=3D"Apple-converted-space">&nbsp;</span>Re: RAI review of =
draft-ietf-speechsc-mrcpv2-19<o:p></o:p></span></div></div></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><o:p>&nbsp;</o:p></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div><div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">On Jul 7, 2009, at 3:40 =
PM, Roni Even wrote:<o:p></o:p></div></div><div style=3D"margin-top: =
0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><br><br><o:p></o:p></div><div><div><p class=3D"MsoCommentText" =
style=3D"margin-right: 0in; margin-left: 0in; font-size: 12pt; =
font-family: 'Times New Roman', serif; margin-bottom: 10pt; line-height: =
18px; "><span style=3D"font-size: 11pt; line-height: 17px; font-family: =
Calibri, sans-serif; color: black; ">Hi,</span><span style=3D"font-size: =
10pt; line-height: 14px; font-family: Calibri, sans-serif; color: black; =
"><o:p></o:p></span></p><p class=3D"MsoCommentText" style=3D"margin-right:=
 0in; margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; margin-bottom: 10pt; line-height: 18px; "><span style=3D"font-size:=
 11pt; line-height: 17px; font-family: Calibri, sans-serif; color: =
black; ">I was assigned to do a RAI review of the draft. &nbsp;The draft =
looks ready for publication to me. I have some comments mostly =
editorial.</span><span style=3D"font-size: 10pt; line-height: 14px; =
font-family: Calibri, sans-serif; color: black; =
"><o:p></o:p></span></p><p class=3D"MsoCommentText" style=3D"margin-right:=
 0in; margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; margin-bottom: 10pt; line-height: 18px; "><span style=3D"font-size:=
 11pt; line-height: 17px; font-family: Calibri, sans-serif; color: =
black; ">The only issue I see that is not pure editorial is the issue of =
the different parameters like confidence threshold, sensitivity level =
(see comments 11, 13, 15, 16 and 17). I think that some clarification on =
the semantics and the scale (for example are the values linearly spaced) =
as well as when they are useful will be helpful to =
implementers.</span><span style=3D"font-size: 10pt; line-height: 14px; =
font-family: Calibri, sans-serif; color: black; =
"><o:p></o:p></span></p><p class=3D"MsoCommentText" style=3D"margin-right:=
 0in; margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; margin-bottom: 10pt; text-indent: -0.25in; line-height: 18px; =
"><span style=3D"font-size: 11pt; line-height: 17px; font-family: =
Calibri, sans-serif; color: black; ">1.</span><span style=3D"font-size: =
7pt; line-height: 10px; color: black; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; line-height: 17px; font-family: Calibri, =
sans-serif; color: black; ">In figure 1 Expand the abbreviations TTS, =
ASR, SV , SI and how they are related to the media resource types in =
3.1</span><span style=3D"font-size: 10pt; line-height: 14px; =
font-family: Calibri, sans-serif; color: black; =
"><o:p></o:p></span></p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">Done. &nbsp;Added some =
text explaining Figure 1 and enhanced Figure 1 slightly for =
clarification.<br><br><o:p></o:p></div><div><div><p =
class=3D"MsoCommentText" style=3D"margin-right: 0in; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; margin-bottom: =
10pt; text-indent: -0.25in; line-height: 18px; "><span style=3D"font-size:=
 11pt; line-height: 17px; font-family: Calibri, sans-serif; color: =
black; ">2.</span><span style=3D"font-size: 7pt; line-height: 10px; =
color: black; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; line-height: 17px; font-family: Calibri, =
sans-serif; color: black; ">In figure 1 there is a SIP dialog between =
the MRCPv2 client and the media source/sink, what is this dialog, I only =
saw in section 4 a dialog between the client and server.</span><span =
style=3D"font-size: 10pt; line-height: 14px; font-family: Calibri, =
sans-serif; color: black; "><o:p></o:p></span></p></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; ">Clarified in&nbsp;the first example of section 4.2 that the SIP =
dialog with the media source/sink is not =
shown.<o:p></o:p></div></div><blockquote style=3D"margin-top: 5pt; =
margin-bottom: 5pt; "><div><div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; text-indent: -0.25in; =
"><span style=3D"font-size: 11pt; font-family: Calibri, sans-serif; =
color: black; ">3.</span><span style=3D"font-size: 7pt; color: black; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In section 3.2 you have =93For example:<span =
class=3D"apple-converted-space">&nbsp;</span><a =
href=3D"sip:mrcpv2@example.net" style=3D"color: blue; text-decoration: =
underline; "><span style=3D"color: windowtext; text-decoration: none; =
">sip:mrcpv2@example.net</span></a>=94 twice one after the =
other.</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><span style=3D"font-size: 11pt; font-family: Calibri, =
sans-serif; color: black; ">&nbsp;</span><span style=3D"font-size: =
10.5pt; font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div></div></div></blockquote><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; ">Fixed.<o:p></o:p></div></div><div><div style=3D"margin-top: =
0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; text-indent: -0.25in; =
"><span style=3D"font-size: 11pt; font-family: Calibri, sans-serif; =
color: black; ">4.</span><span style=3D"font-size: 7pt; color: black; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In the example in section 4.2 you have =93a=3Dcmid:1=94. cmid =
is specified later in the document, so maybe you can add a reference to =
where it is specified.</span><span style=3D"font-size: 10.5pt; =
font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div></div></div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
">Done.<o:p></o:p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
"><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; =
">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">5.</span><span =
style=3D"font-size: 7pt; color: black; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In the example in section 4.2 and in following examples you =
have =93m=3Daudio 49170 RTP/AVP 0 96=94 but do not have an rtpmap =
parameter for mapping 96 (dynamic payload type number) to a media =
encoding name.</span><span style=3D"font-size: 10.5pt; font-family: =
Consolas; color: black; =
"><o:p></o:p></span></div></div></div></div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">The rtpmap attribute is =
missing from the first and third examples (Synthesizer only), but it is =
present in the second example (Recognizer). &nbsp;I have removed 96 as =
an option for the Synthesizer-only examples but left it in the =
Recognizer example.<o:p></o:p></div></div><div><div style=3D"margin-top: =
0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; =
">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">6.</span><span =
style=3D"font-size: 7pt; color: black; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In section 4.3 =93Also note that more that one media session =
can be associated with a single resource if need be, but this scenario =
is not useful for the current set of resources=94. There is a typo: the =
second =93that=94 should be =93than=94. I am also not sure if the =
current syntax in this document can support the mode.</span><span =
style=3D"font-size: 10.5pt; font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; =
">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div></div></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; ">Fixed the typo.<o:p></o:p></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; "><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: =
Consolas; color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">7.</span><span =
style=3D"font-size: 7pt; color: black; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In section 4.3 =93The formatting of the"cmid" attribute in SDP =
RFC3388 [RFC4566]=94. I think you meant SDP grouping and need the =
reference to RFC 3388.</span><span style=3D"font-size: 10.5pt; =
font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; =
">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div></div></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; ">I removed the reference altogether because it already exists =
(correctly) earlier in the paragraph.<o:p></o:p></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; "><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: =
Consolas; color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">8.</span><span =
style=3D"font-size: 7pt; color: black; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In section 5.1 =93The message-length field specifies the length =
of the message, including the start-line=94. Is the length in bytes? =
There is no unit specified.</span><span style=3D"font-size: 10.5pt; =
font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div></div></div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">Changed "length of the =
message" to "length of the message in =
bytes".<o:p></o:p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
"><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; =
">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">9.</span><span =
style=3D"font-size: 7pt; color: black; =
">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In section 6.3.1 there is a typo: you have =93Verfication=94 =
instead of =93Verification=94. It appears twice in the =
section.</span><span =
style=3D"font-size: 10.5pt; font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div></div></div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
">Fixed.<o:p></o:p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
"><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; =
">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">10.</span><span =
style=3D"font-size: 7pt; color: black; ">&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In the example in section 7 you have =93m=3Daudio 0 RTP/AVP 0 1 =
3=94. Payload type 1 was deleted from the IANA registry, so maybe use =
another payload type number.</span><span style=3D"font-size: 10.5pt; =
font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div></div></div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">I just removed that =
payload type. &nbsp;It is not germane to the =
example.<o:p></o:p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
"><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; =
">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">11.</span><span =
style=3D"font-size: 7pt; color: black; ">&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In sections 9.4.1, 9.4.2 and 9.4.3 you specify confidence =
threshold, sensitivity level and speed vs accuracy. What is the scale =
here? Is it linear between 0 and 1? What is the absolute value of the =
number: if you receive the same confidence level from two recognizers, =
are they the same (e.g. when using context block to switch =
servers)?&nbsp; For the speed vs accuracy, how does the client know the =
relation between the value and the number of available sessions, =
since this seems to be the reason for using this parameter?</span><span =
style=3D"font-size: 10.5pt; font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; =
">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div></div></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; ">The interpretation of all of these parameters is =
implementation-specific because the underlying technologies used to =
implement them vary and can even be proprietary. &nbsp;In practice the =
speech recognition and synthesis and speaker authentication communities =
have lived with this state of affairs for many years, and users of other =
APIs for this technology are well aware of and have built applications =
that accommodate this variability in interpretation. &nbsp;It is outside =
the scope of this specification to attempt to standardize =
interpretations of these values.<o:p></o:p></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; text-indent: =
-0.25in; "><span style=3D"font-size: 11pt; font-family: Calibri, =
sans-serif; color: black; ">12.</span><span style=3D"font-size: 7pt; =
color: black; ">&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In 9.4.9, 10.4.8 and 11.4.11, what are the values for =
media-type-value? You also mention audio and video, but it looks to me =
that this document only discusses voice.</span><span style=3D"font-size: =
10.5pt; font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div></div></div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">Yes. &nbsp;Although the =
original intent was to record speech, application authors today are =
beginning to look at ways to incorporate other audio or video. &nbsp;The =
intent of the sentences in these sections is to clarify that the =
specification itself imposes no restriction on the types of media that =
are allowed.<o:p></o:p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
"><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; =
">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">13.</span><span =
style=3D"font-size: 7pt; color: black; ">&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In 9.4.35 and 9.4.36, what is the scale for the consistency =
here? How does one know what close means? What is the consistency =
between different recognizers?</span><span style=3D"font-size: 10.5pt; =
font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div></div></div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">The answer to question =
11, above, applies here as well.<o:p></o:p></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; "><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: =
Consolas; color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">14.</span><span =
style=3D"font-size: 7pt; color: black; ">&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In section 9.6.3.3 in the example (figure 2) confidence should =
be 0.75 and not 75</span><span style=3D"font-size: 10.5pt; font-family: =
Consolas; color: black; =
"><o:p></o:p></span></div></div></div></div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
">Fixed.<o:p></o:p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
"><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; =
">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">15.</span><span =
style=3D"font-size: 7pt; color: black; ">&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In section 10.4.1 it is not clear how you measure the =
sensitivity in order to specify it. Is it based on some SNR translated =
to a 0 to 1 scale?</span><span style=3D"font-size: 10.5pt; font-family: =
Consolas; color: black; =
"><o:p></o:p></span></div></div></div></div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">The answer to question =
11, above, applies here as well.<o:p></o:p></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; "><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: =
Consolas; color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">16.</span><span =
style=3D"font-size: 7pt; color: black; ">&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In 11.4.6 there is the same issue with the scale: how does the =
client know how to set a value when working with different speaker =
verification servers?</span><span style=3D"font-size: 10.5pt; =
font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">Ditto. &nbsp;I should =
point out that in all of these cases the parameters are typically passed =
directly to the engine, and their interpretations are defined (and =
described) in the vendors' documentation. &nbsp;The most common MRCPv2 =
server implementations are by the technology vendors themselves (the =
providers of the synthesis, recognition, and verification engines). =
&nbsp;This is commonly understood in this technology industry (meaning =
those who use this technology =
regularly).<o:p></o:p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
"><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; =
">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">17.</span><span =
style=3D"font-size: 7pt; color: black; ">&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In 11.5.2.9 you state that the verification-score is not a =
probability, so what is it? How can the client decide if, for example, 0 =
is a good score for specifying the threshold?&nbsp; I also noticed that =
the values in the example in section 11.5.2.10 are very precise, like =
0.98514. Is this the expected precision? The examples here and in =
section 11.11 do not show the threshold. If the threshold is required =
for this flow, why not show it in the example?</span><span =
style=3D"font-size: =
10.5pt; font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div></div></div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">This parameter, as others =
mentioned above, has only a vendor-specific interpretation. &nbsp;In =
practice authors interpret these values based both on guidance from the =
technology vendors and via experimentation on large sets of recorded =
data.<o:p></o:p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">The =
Min-Verification-Score threshold is not required to be set. &nbsp;In =
many cases the technology vendor has a fairly good understanding of what =
the default threshold should be. &nbsp;The verification-score is =
returned, however, in case the application author determines (through =
experimentation, as described above) that the default threshold is not =
producing optimal results for the application. &nbsp;In that case the =
author can set the threshold to a different value or can set it to -1 =
and make the determination within the application itself based on the =
verification-score values.<o:p></o:p></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><br><br><o:p></o:p></div><div><div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; "><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">&nbsp;</span><span style=3D"font-size: 10.5pt; font-family: =
Consolas; color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; text-indent: -0.25in; "><span style=3D"font-size: 11pt; =
font-family: Calibri, sans-serif; color: black; ">18.</span><span =
style=3D"font-size: 7pt; color: black; ">&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In section 12.3 the suggestion is to use SRTP as the mandatory =
interoperability mode. If the reason for mandating SRTP is for a common =
mode you should also decide on a key exchange mechanism. I suggest you =
look at<span class=3D"apple-converted-space">&nbsp;</span><a =
href=3D"http://tools.ietf.org/html/draft-ietf-avt-srtp-not-mandatory-02" =
style=3D"color: blue; text-decoration: underline; =
">http://tools.ietf.org/html/draft-ietf-avt-srtp-not-mandatory-02</a><span=
 class=3D"apple-converted-space">&nbsp;</span>for discussion on media =
security.</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">Based on the discussion =
between you and Dan York on the list, I will change =
this:<o:p></o:p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div><pre style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
10pt; font-family: 'Courier New'; "><span class=3D"apple-style-span"><span=
 style=3D"font-size: 12pt; font-family: Helvetica, sans-serif; ">12.3. =
Media session protection&nbsp;</span></span><o:p></o:p></pre><pre =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 10pt; font-family: 'Courier New'; "><span =
class=3D"apple-style-span"><span style=3D"font-size: 9pt; font-family: =
Helvetica, sans-serif; ">Sensitive data is also carried on media =
sessions terminating on MRCPv2 servers (the other end of a media channel =
may or may not be on the MRCPv2 client). This data includes the user's =
spoken utterances and the output of text-to-speech operations. MRCPv2 =
servers MUST support SRTP for protection of audio media sessions. MRCPv2 =
clients that originate or consume audio similarly MUST support SRTP. =
Alternative media channel protection MAY be used if desired (e.g. =
IPSEC).</span></span><o:p></o:p></pre></div><div><div style=3D"margin-top:=
 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">to =
this:<o:p></o:p></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; =
"><o:p>&nbsp;</o:p></div></div><div><pre style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
10pt; font-family: 'Courier New'; "><span class=3D"apple-style-span"><span=
 style=3D"font-size: 9pt; font-family: Helvetica, sans-serif; ">12.3. =
Media session protection&nbsp;</span></span><o:p></o:p></pre><pre =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 10pt; font-family: 'Courier New'; "><span =
class=3D"apple-style-span"><span style=3D"font-size: 9pt; font-family: =
Helvetica, sans-serif; ">Sensitive data is also carried on media =
sessions terminating on MRCPv2 servers (the other end of a media channel =
may or may not be on the MRCPv2 client). This data includes the user's =
spoken utterances and the output of text-to-speech operations. MRCPv2 =
servers MUST support a security mechanism for protection of audio media =
sessions. MRCPv2 clients that originate or consume audio similarly MUST =
support a security mechanism for protection of the audio. If =
appropriate,&nbsp;usage of the Secure Real-time Transport Protocol =
(SRTP)&nbsp;[RFC3711] is =
recommended.</span></span><o:p></o:p></pre></div><div><blockquote =
style=3D"margin-top: 5pt; margin-bottom: 5pt; "><div><div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><span style=3D"font-size: 11pt; font-family: Calibri, =
sans-serif; color: black; ">&nbsp;</span><span style=3D"font-size: =
10.5pt; font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; text-indent: -0.25in; =
"><span style=3D"font-size: 11pt; font-family: Calibri, sans-serif; =
color: black; ">19.</span><span style=3D"font-size: 7pt; color: black; =
">&nbsp;&nbsp;<span =
class=3D"apple-converted-space">&nbsp;</span></span><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">In section 13.7.2 you specify the attribute resource as session =
level yet in the example in section 4.2 it is a media level attribute. =
The same goes for the channel attribute.</span><span style=3D"font-size: =
10.5pt; font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div></div></div></blockquote><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><o:p>&nbsp;</o:p></div></div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; ">I have corrected both in =
section 13.7.2 to be media-level.<o:p></o:p></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><br><br><o:p></o:p></div><div><div><div style=3D"margin-left: =
0.5in; "><div style=3D"margin-top: 0in; margin-right: 0in; =
margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: =
'Times New Roman', serif; "><span style=3D"font-size: 11pt; font-family: =
Calibri, sans-serif; color: black; =
">&nbsp;<o:p></o:p></span></div></div><div><div style=3D"margin-top: =
0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; =
font-size: 12pt; font-family: 'Times New Roman', serif; "><span =
style=3D"font-size: 11pt; font-family: Calibri, sans-serif; color: =
black; ">Thanks</span><span style=3D"font-size: 10.5pt; font-family: =
Consolas; color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><span style=3D"font-size: 11pt; font-family: Calibri, =
sans-serif; color: black; ">&nbsp;</span><span style=3D"font-size: =
10.5pt; font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; ">Roni =
Even</span><span style=3D"font-size: 10.5pt; font-family: Consolas; =
color: black; "><o:p></o:p></span></div></div><div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; "><span style=3D"font-size: 11pt; font-family: Calibri, =
sans-serif; color: black; ">&nbsp;</span><span style=3D"font-size: =
10.5pt; font-family: Consolas; color: black; =
"><o:p></o:p></span></div></div><div><div style=3D"margin-top: 0in; =
margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: =
12pt; font-family: 'Times New Roman', serif; "><span style=3D"font-size: =
11pt; font-family: Calibri, sans-serif; color: black; =
">&nbsp;<o:p></o:p></span></div></div></div></div></div><div =
style=3D"margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; =
margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', =
serif; =
"><o:p>&nbsp;</o:p></div></div></div></div></span></blockquote></div><br><=
/div></div></body></html>=

--Apple-Mail-40-309409471--

From ron.even.tlv@gmail.com  Tue Dec 29 07:37:00 2009
Return-Path: <ron.even.tlv@gmail.com>
X-Original-To: speechsc@core3.amsl.com
Delivered-To: speechsc@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 0FDEE3A6891; Tue, 29 Dec 2009 07:37:00 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.998
X-Spam-Level: 
X-Spam-Status: No, score=-1.998 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, HTML_MESSAGE=0.001, J_CHICKENPOX_16=0.6]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 2nFGziLDF48Y; Tue, 29 Dec 2009 07:36:46 -0800 (PST)
Received: from mail-fx0-f215.google.com (mail-fx0-f215.google.com [209.85.220.215]) by core3.amsl.com (Postfix) with ESMTP id 2A5CD3A6820; Tue, 29 Dec 2009 07:36:45 -0800 (PST)
Received: by fxm7 with SMTP id 7so10397626fxm.29 for <multiple recipients>; Tue, 29 Dec 2009 07:36:22 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:to:cc:references :in-reply-to:subject:date:message-id:mime-version:content-type :x-mailer:thread-index:content-language; bh=oeVVv7+aw9S6v3L/MO5rnDH65uEnAeHXJL0d2He8/gg=; b=pFEO8QlCnV0IqQPt5S07tE7oUt+2epfErq30ElhwVRJtD44O6Z+J8pGG0MYADyCi/4 0l0VfZKCkgm8AlyynErxm6Z84baKHzi5IZHEZ7ka9VPeAdYGnMiyop+JwNDD33BI7Y9Z zMXeUNW2qwT7A0vjIRMo/CDwiump8xIRJPcpI=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:cc:references:in-reply-to:subject:date:message-id :mime-version:content-type:x-mailer:thread-index:content-language; b=aF+OsL+OjsNSDbdmd/LY6u8lzv7x5DrQrFqT1yCRq7Q+XqhfhTuNRaHj81nBWwLay0 pmok/9rmyOua+6+jJfjMF8ncHJAS9TgYSDtqOkYwBYD7WtZZ6586FfkA/qTwnDAUDFzh Ftyn7OAMaaCgq3AmmB2hr8LI71X1CrdPLHM6I=
Received: by 10.223.19.200 with SMTP id c8mr10024907fab.55.1262100982726; Tue, 29 Dec 2009 07:36:22 -0800 (PST)
Received: from windows8d787f9 (bzq-79-183-125-218.red.bezeqint.net [79.183.125.218]) by mx.google.com with ESMTPS id 35sm18549275fkt.40.2009.12.29.07.36.16 (version=TLSv1/SSLv3 cipher=RC4-MD5); Tue, 29 Dec 2009 07:36:20 -0800 (PST)
From: "Roni Even" <ron.even.tlv@gmail.com>
To: "'Dan Burnett'" <dburnett@voxeo.com>, "'Roni Even'" <Even.roni@huawei.com>
References: <033101c9ff3a$cbe33160$63a99420$%roni@huawei.com>	<E2C626B8-8CA1-4A1D-A2CE-B6AB4B269DEE@voxeo.com>	<027801ca1b1c$c2e8ee80$48bacb80$%roni@huawei.com> <C46B7F31-9989-442C-B2F1-CA77E79F04F8@voxeo.com>
In-Reply-To: <C46B7F31-9989-442C-B2F1-CA77E79F04F8@voxeo.com>
Date: Tue, 29 Dec 2009 17:35:29 +0200
Message-ID: <4b3a21f4.23145e0a.5a8e.ffff8c91@mx.google.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_0801_01CA88AD.53323700"
X-Mailer: Microsoft Office Outlook 12.0
Thread-index: AcqIdkGBq48YoaG7TeC/YMOV6kSpFgAJg4eQ
Content-language: en-us
Cc: speechsc@ietf.org, sarvi@cisco.com, oran@cisco.com, rai@ietf.org
Subject: Re: [Speechsc] [RAI] RAI review of draft-ietf-speechsc-mrcpv2-19
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/speechsc>, <mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/speechsc>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/speechsc>, <mailto:speechsc-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 29 Dec 2009 15:37:00 -0000

This is a multi-part message in MIME format.

------=_NextPart_000_0801_01CA88AD.53323700
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit

Looks OK

Roni

 

From: rai-bounces@ietf.org [mailto:rai-bounces@ietf.org] On Behalf Of Dan
Burnett
Sent: Tuesday, December 29, 2009 1:01 PM
To: Roni Even
Cc: speechsc@ietf.org; sarvi@cisco.com; oran@cisco.com; rai@ietf.org
Subject: Re: [RAI] RAI review of draft-ietf-speechsc-mrcpv2-19

 

Hi Roni,

 

Just to finish up on your last comments . . .

 

-- dan

 

On Aug 12, 2009, at 3:15 AM, Roni Even wrote:





Hi Dan,

I understand your explanation about all these "vendor specific" parameters. I
think that since this is a standards track document there should be some text
explaining the usage of these parameters, as well as a note that since these
are vendor-specific values you cannot compare the values coming from
different vendors.

 

Thank you.  I will note this in the next draft and suggest how these
parameters may be used in light of their vendor dependence.





 

 

As for my comment number 5 on payload type 96: my comment was that if the
m-line has a payload type number of 96 you must have an a=rtpmap line mapping
96 to a specific subtype name, while for PCMU it is not mandatory to have
a=rtpmap as you have in your examples, since payload type number 0 is a
static payload type number assigned to PCMU.

 

 

I'm sorry, I did not explain this very well.  I understood your comment.  My
reply was that of the three examples, example 2 did actually provide the
a=rtpmap line for 96.  Since the payload type of 96 should not even have
been included in the first and third examples, once I removed it from those
two examples all three contained the proper a=rtpmap lines.

Although it is not necessary to have an a=rtpmap line for payload type 0,
others in the past had requested it, so I left it in.
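
The rule discussed above (a dynamic payload type such as 96 needs an a=rtpmap
line, while a static type such as 0 does not) can be checked mechanically.
The following is only a minimal sketch: the function name and both SDP
fragments are hypothetical and are not taken from the draft, and mappings are
collected across the whole SDP body instead of per media section.

```python
import re
from typing import List

# Dynamic RTP payload types occupy 96-127; lower numbers such as 0 (PCMU) are static.
DYNAMIC_PAYLOAD_TYPES = range(96, 128)


def missing_rtpmaps(sdp: str) -> List[int]:
    """Return dynamic payload types offered on m-lines that lack an a=rtpmap line."""
    offered = set()
    mapped = set()
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            formats = line[2:].split()[3:]
            offered.update(int(f) for f in formats if f.isdigit())
        else:
            match = re.match(r"a=rtpmap:(\d+)\s", line)
            if match:
                mapped.add(int(match.group(1)))
    return sorted(
        pt for pt in offered if pt in DYNAMIC_PAYLOAD_TYPES and pt not in mapped
    )


# Hypothetical SDP fragments used only for illustration.
WITHOUT_MAPPING = "m=audio 49170 RTP/AVP 0 96\r\na=rtpmap:0 PCMU/8000\r\n"
WITH_MAPPING = WITHOUT_MAPPING + "a=rtpmap:96 telephone-event/8000\r\n"

print(missing_rtpmaps(WITHOUT_MAPPING))  # [96]
print(missing_rtpmaps(WITH_MAPPING))     # []
```

Payload type 0 never shows up in the result, whether or not it has an
a=rtpmap line, which matches the point that the mapping for PCMU is optional.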





 

Roni Even

 

From: Dan Burnett [mailto:dburnett@voxeo.com] 
Sent: Tuesday, August 11, 2009 9:22 PM
To: Roni Even
Cc: sarvi@cisco.com; oran@cisco.com; 'Eric Burger'; speechsc@ietf.org;
rai@ietf.org
Subject: Re: RAI review of draft-ietf-speechsc-mrcpv2-19

 

 

On Jul 7, 2009, at 3:40 PM, Roni Even wrote:






Hi,

I was assigned to do a RAI review of the draft.  The draft looks ready for
publication to me. I have some comments, mostly editorial.

The only issue I see that is not purely editorial is the issue of the
different parameters like confidence threshold and sensitivity level (see
comments 11, 13, 15, 16 and 17). I think that some clarification of the
semantics and the scale (for example, are the values linearly spaced?) as
well as of when they are useful would be helpful to implementers.

1.       In figure 1, expand the abbreviations TTS, ASR, SV and SI and explain
how they are related to the media resource types in 3.1.

 

Done.  Added some text explaining Figure 1 and enhanced Figure 1 slightly
for clarification.




2.       In figure 1 there is a SIP dialog between the MRCPv2 client and the
media source/sink. What is this dialog? In section 4 I only saw a dialog
between the client and server.

Clarified in the first example of section 4.2 that the SIP dialog with the
media source/sink is not shown.

3.       In section 3.2 you have "For example: sip:mrcpv2@example.net"
twice, one after the other.

 

Fixed.






4.       In the example in section 4.2 you have "a=cmid:1". cmid is specified
later in the document, so maybe you can add a reference to where it is
specified.

 

Done.






 

5.       In the example in section 4.2 and in the following examples you have
"m=audio 49170 RTP/AVP 0 96" but do not have an rtpmap attribute mapping 96
(a dynamic payload type number) to a media encoding name.

 

It is not in the first or third examples (Synthesizer only), but it is in
the second example (Recognizer).  I have removed 96 as an option for the
Synthesizer-only examples but let it remain as an addition for the
Recognizer example.






 

6.       In section 4.3 "Also note that more that one media session can be
associated with a single resource if need be, but this scenario is not
useful for the current set of resources". There is a typo: the second "that"
should be "than". I am also not sure whether the current syntax in this
document can support this mode.

 

Fixed the typo.






 

7.       In section 4.3 "The formatting of the"cmid" attribute in SDP
RFC3388 [RFC4566]". I think you meant SDP grouping and need the reference to
RFC 3388.

 

I removed the reference altogether because it already exists (correctly)
earlier in the paragraph.






 

8.       In section 5.1 "The message-length field specifies the length of
the message, including the start-line": is the length in bytes? There is no
unit specified.

 

Changed "length of the message" to "length of the message in bytes".
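
Because the message-length counts bytes and includes the start-line that
carries it, the value depends on its own number of digits. A minimal sketch of
one way to compute it follows. The start-line layout ("MRCP/2.0", length,
method, request-id), the function name, and the sample header and body are
assumptions made for illustration only and should be checked against the
draft.

```python
from typing import Dict


def build_request(method: str, request_id: int,
                  headers: Dict[str, str], body: bytes = b"") -> bytes:
    """Assemble a request whose declared length equals its total size in bytes."""
    header_block = "".join(
        f"{name}: {value}\r\n" for name, value in headers.items()
    ).encode("utf-8") + b"\r\n"
    length = 0
    while True:
        # Assumed start-line layout: version, message-length, method, request-id.
        start_line = f"MRCP/2.0 {length} {method} {request_id}\r\n".encode("ascii")
        total = len(start_line) + len(header_block) + len(body)
        if total == length:
            # The declared length now matches the real size, digits included.
            return start_line + header_block + body
        length = total


# Hypothetical message; the body holds a non-ASCII character, so its byte
# count (5) differs from its character count (4).
message = build_request(
    "SPEAK", 543257,
    {"Channel-Identifier": "32AECB23433802@speechsynth"},
    "caf\u00e9".encode("utf-8"),
)
declared_length = int(message.split(b" ")[1])
print(declared_length == len(message))  # True
```

The loop repeats only until the digit count of the length stops changing,
which in practice takes two or three passes.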






 

9.       In section 6.3.1 there is a typo: you have "Verfication" instead of
"Verification". It appears twice in the section.

 

Fixed.






 

10.   In the example in section 7 you have "m=audio 0 RTP/AVP 0 1 3". Payload
type 1 was deleted from the IANA registry, so maybe use another payload type
number.

 

I just removed that payload type.  It is not germane to the example.






 

11.   In sections 9.4.1, 9.4.2 and 9.4.3 you specify confidence threshold,
sensitivity level and speed vs accuracy. What is the scale here? Is it
linear between 0 and 1? What is the absolute value of the number: if you
receive the same confidence level from two recognizers, do they mean the same
thing (e.g. when using context block to switch servers)?  For speed vs
accuracy, how does the client know the relation between the value and the
number of available sessions, since this seems to be the reason for using
this parameter?

 

The interpretation of all of these parameters is implementation-specific
because the underlying technologies used to implement them vary and can even
be proprietary.  In practice the speech recognition and synthesis and
speaker authentication communities have lived with this state of affairs for
many years, and users of other APIs for this technology are well aware of
and have built applications that accommodate this variability in
interpretation.  It is outside the scope of this specification to attempt to
standardize interpretations of these values.






12.   In 9.4.9, 10.4.8 and 11.4.11, what are the values for
media-type-value? You also mention audio and video, but it looks to me that
this document only discusses voice.

 

Yes.  Although the original intent was to record speech, application authors
today are beginning to look at ways to incorporate other audio or video.
The intent of the sentences in these sections is to clarify that the
specification itself imposes no restriction on the types of media that are
allowed.






 

13.   In 9.4.35 and 9.4.36, what is the scale for the consistency? How
does one know what "close" means? What is the consistency between different
recognizers?

 

The answer to question 11, above, applies here as well.






 

14.   In section 9.6.3.3 in the example (figure 2) confidence should be 0.75
and not 75

 

Fixed.






 

15.   In section 10.4.1 it is not clear how you measure the sensitivity in
order to specify it. Is it based on some SNR translated to a 0 to 1 scale?

 

The answer to question 11, above, applies here as well.






 

16.   In 11.4.6 there is the same issue with the scale: how does the client
know how to set a value when working with different speaker verification
servers?

 

Ditto.  I should point out that in all of these cases the parameters are
typically passed directly to the engine, and their interpretations are
defined (and described) in the vendors' documentation.  The most common
MRCPv2 server implementations are by the technology vendors themselves (the
providers of the synthesis, recognition, and verification engines).  This is
commonly understood in this technology industry (meaning those who use this
technology regularly).






 

17.   In 11.5.2.9 you state that the verification-score is not a
probability, so what is it? How can the client decide if, for example, 0 is
a good score for specifying the threshold?  I also noticed that the values
in the example in section 11.5.2.10 are very precise, like 0.98514. Is this
the expected precision? The examples here and in section 11.11 do not show
the threshold. If the threshold is required for this flow, why not show it in
the example?

 

This parameter, as others mentioned above, has only a vendor-specific
interpretation.  In practice authors interpret these values based both on
guidance from the technology vendors and via experimentation on large sets
of recorded data.

 

The Min-Verification-Score threshold is not required to be set.  In many
cases the technology vendor has a fairly good understanding of what the
default threshold should be.  The verification-score is returned, however,
in case the application author determines (through experimentation, as
described above) that the default threshold is not producing optimal results
for the application.  In that case the author can set the threshold to a
different value or can set it to -1 and make the determination within the
application itself based on the verification-score values.
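
As a rough illustration of making the determination within the application,
the sketch below picks a cut-off from labelled recordings and then applies it
to a returned verification-score. All names and score values are invented for
the example; real scores are vendor-specific and a real tuning set would be
far larger.

```python
from typing import List, Tuple


def pick_threshold(samples: List[Tuple[float, bool]]) -> float:
    """Pick the observed score with the fewest accept/reject errors.

    Each sample is (verification_score, is_true_speaker).
    """
    candidates = sorted({score for score, _ in samples})

    def errors(threshold: float) -> int:
        # A sample is misjudged when "accepted" disagrees with its label.
        return sum((score >= threshold) != genuine for score, genuine in samples)

    return min(candidates, key=errors)


def accept_speaker(verification_score: float, threshold: float) -> bool:
    """Application-side decision, used when the server threshold is set to -1."""
    return verification_score >= threshold


# Invented tuning data: two genuine attempts and two impostor attempts.
recorded = [(0.9, True), (0.8, True), (0.3, False), (-0.2, False)]
threshold = pick_threshold(recorded)
print(threshold)                        # 0.8
print(accept_speaker(0.85, threshold))  # True
print(accept_speaker(0.10, threshold))  # False
```

Counting false accepts and false rejects equally is a design choice made here
for brevity; an application may weight the two kinds of error differently.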






 

18.   In section 12.3 the suggestion is to use SRTP as the mandatory
interoperability mode. If the reason for mandating SRTP is for a common mode
you should also decide on a key exchange mechanism. I suggest you look at
http://tools.ietf.org/html/draft-ietf-avt-srtp-not-mandatory-02 for
discussion on media security.

 

Based on the discussion between you and Dan York on the list, I will change
this:

 

12.3. Media session protection 
Sensitive data is also carried on media sessions terminating on MRCPv2
servers (the other end of a media channel may or may not be on the MRCPv2
client). This data includes the user's spoken utterances and the output of
text-to-speech operations. MRCPv2 servers MUST support SRTP for protection
of audio media sessions. MRCPv2 clients that originate or consume audio
similarly MUST support SRTP. Alternative media channel protection MAY be
used if desired (e.g. IPSEC).

 

to this:

 

12.3. Media session protection 
Sensitive data is also carried on media sessions terminating on MRCPv2
servers (the other end of a media channel may or may not be on the MRCPv2
client). This data includes the user's spoken utterances and the output of
text-to-speech operations. MRCPv2 servers MUST support a security mechanism
for protection of audio media sessions. MRCPv2 clients that originate or
consume audio similarly MUST support a security mechanism for protection of
the audio. If appropriate, usage of the Secure Real-time Transport Protocol
(SRTP) [RFC3711] is recommended.

 

19.   In section 13.7.2 you specify the attribute resource as session level
yet in the example in section 4.2 it is a media level attribute. The same
goes for the channel attribute.

 

I have corrected both in section 13.7.2 to be media-level.






 

Thanks

 

Roni Even

 

 

 

 


------=_NextPart_000_0801_01CA88AD.53323700
Content-Type: text/html;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

<html xmlns:v=3D"urn:schemas-microsoft-com:vml" =
xmlns:o=3D"urn:schemas-microsoft-com:office:office" =
xmlns:w=3D"urn:schemas-microsoft-com:office:word" =
xmlns:x=3D"urn:schemas-microsoft-com:office:excel" =
xmlns:p=3D"urn:schemas-microsoft-com:office:powerpoint" =
xmlns:a=3D"urn:schemas-microsoft-com:office:access" =
xmlns:dt=3D"uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" =
xmlns:s=3D"uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" =
xmlns:rs=3D"urn:schemas-microsoft-com:rowset" xmlns:z=3D"#RowsetSchema" =
xmlns:b=3D"urn:schemas-microsoft-com:office:publisher" =
xmlns:ss=3D"urn:schemas-microsoft-com:office:spreadsheet" =
xmlns:c=3D"urn:schemas-microsoft-com:office:component:spreadsheet" =
xmlns:odc=3D"urn:schemas-microsoft-com:office:odc" =
xmlns:oa=3D"urn:schemas-microsoft-com:office:activation" =
xmlns:html=3D"http://www.w3.org/TR/REC-html40" =
xmlns:q=3D"http://schemas.xmlsoap.org/soap/envelope/" =
xmlns:rtc=3D"http://microsoft.com/officenet/conferencing" =
xmlns:D=3D"DAV:" xmlns:Repl=3D"http://schemas.microsoft.com/repl/" =
xmlns:mt=3D"http://schemas.microsoft.com/sharepoint/soap/meetings/" =
xmlns:x2=3D"http://schemas.microsoft.com/office/excel/2003/xml" =
xmlns:ppda=3D"http://www.passport.com/NameSpace.xsd" =
xmlns:ois=3D"http://schemas.microsoft.com/sharepoint/soap/ois/" =
xmlns:dir=3D"http://schemas.microsoft.com/sharepoint/soap/directory/" =
xmlns:ds=3D"http://www.w3.org/2000/09/xmldsig#" =
xmlns:dsp=3D"http://schemas.microsoft.com/sharepoint/dsp" =
xmlns:udc=3D"http://schemas.microsoft.com/data/udc" =
xmlns:xsd=3D"http://www.w3.org/2001/XMLSchema" =
xmlns:sub=3D"http://schemas.microsoft.com/sharepoint/soap/2002/1/alerts/"=
 xmlns:ec=3D"http://www.w3.org/2001/04/xmlenc#" =
xmlns:sp=3D"http://schemas.microsoft.com/sharepoint/" =
xmlns:sps=3D"http://schemas.microsoft.com/sharepoint/soap/" =
xmlns:xsi=3D"http://www.w3.org/2001/XMLSchema-instance" =
xmlns:udcs=3D"http://schemas.microsoft.com/data/udc/soap" =
xmlns:udcxf=3D"http://schemas.microsoft.com/data/udc/xmlfile" =
xmlns:udcp2p=3D"http://schemas.microsoft.com/data/udc/parttopart" =
xmlns:wf=3D"http://schemas.microsoft.com/sharepoint/soap/workflow/" =
xmlns:dsss=3D"http://schemas.microsoft.com/office/2006/digsig-setup" =
xmlns:dssi=3D"http://schemas.microsoft.com/office/2006/digsig" =
xmlns:mdssi=3D"http://schemas.openxmlformats.org/package/2006/digital-sig=
nature" =
xmlns:mver=3D"http://schemas.openxmlformats.org/markup-compatibility/2006=
" xmlns:m=3D"http://schemas.microsoft.com/office/2004/12/omml" =
xmlns:mrels=3D"http://schemas.openxmlformats.org/package/2006/relationshi=
ps" xmlns:spwp=3D"http://microsoft.com/sharepoint/webpartpages" =
xmlns:ex12t=3D"http://schemas.microsoft.com/exchange/services/2006/types"=
 =
xmlns:ex12m=3D"http://schemas.microsoft.com/exchange/services/2006/messag=
es" =
xmlns:pptsl=3D"http://schemas.microsoft.com/sharepoint/soap/SlideLibrary/=
" =
xmlns:spsl=3D"http://microsoft.com/webservices/SharePointPortalServer/Pub=
lishedLinksService" xmlns:Z=3D"urn:schemas-microsoft-com:" =
xmlns:st=3D"&#1;" xmlns=3D"http://www.w3.org/TR/REC-html40">

<head>
<META HTTP-EQUIV=3D"Content-Type" CONTENT=3D"text/html; =
charset=3Dus-ascii">
<meta name=3DGenerator content=3D"Microsoft Word 12 (filtered medium)">
<style>
<!--
 /* Font Definitions */
 @font-face
	{font-family:Helvetica;
	panose-1:2 11 6 4 2 2 2 2 2 4;}
@font-face
	{font-family:"Cambria Math";
	panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
	{font-family:Calibri;
	panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
	{font-family:Tahoma;
	panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
	{font-family:Consolas;
	panose-1:2 11 6 9 2 2 4 3 2 4;}
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0in;
	margin-bottom:.0001pt;
	font-size:12.0pt;
	font-family:"Times New Roman","serif";}
p.MsoCommentText, li.MsoCommentText, div.MsoCommentText
	{mso-style-priority:99;
	mso-style-link:"Comment Text Char";
	mso-margin-top-alt:auto;
	margin-right:0in;
	mso-margin-bottom-alt:auto;
	margin-left:0in;
	font-size:12.0pt;
	font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
	{mso-style-priority:99;
	color:blue;
	text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
	{mso-style-priority:99;
	color:purple;
	text-decoration:underline;}
pre
	{mso-style-priority:99;
	mso-style-link:"HTML Preformatted Char";
	margin:0in;
	margin-bottom:.0001pt;
	font-size:10.0pt;
	font-family:"Courier New";}
span.apple-style-span
	{mso-style-name:apple-style-span;}
span.apple-converted-space
	{mso-style-name:apple-converted-space;}
span.CommentTextChar
	{mso-style-name:"Comment Text Char";
	mso-style-priority:99;
	mso-style-link:"Comment Text";
	font-family:"Calibri","sans-serif";}
span.HTMLPreformattedChar
	{mso-style-name:"HTML Preformatted Char";
	mso-style-priority:99;
	mso-style-link:"HTML Preformatted";
	font-family:Consolas;}
span.EmailStyle23
	{mso-style-type:personal-reply;
	font-family:"Calibri","sans-serif";
	color:#1F497D;}
.MsoChpDefault
	{mso-style-type:export-only;
	font-size:10.0pt;}
@page Section1
	{size:8.5in 11.0in;
	margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
	{page:Section1;}
-->
</style>
<!--[if gte mso 9]><xml>
 <o:shapedefaults v:ext=3D"edit" spidmax=3D"1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
 <o:shapelayout v:ext=3D"edit">
  <o:idmap v:ext=3D"edit" data=3D"1" />
 </o:shapelayout></xml><![endif]-->
</head>

<body lang=3DEN-US link=3Dblue vlink=3Dpurple style=3D'word-wrap: =
break-word;
-webkit-nbsp-mode: space;-webkit-line-break: after-white-space'>

<div class=3DSection1>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>Looks OK<o:p></o:p></span></p>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>Roni<o:p></o:p></span></p>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'><o:p>&nbsp;</o:p></span></p>

<div style=3D'border:none;border-left:solid blue 1.5pt;padding:0in 0in =
0in 4.0pt'>

<div>

<div style=3D'border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt =
0in 0in 0in'>

<p class=3DMsoNormal><b><span =
style=3D'font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span>=
</b><span
style=3D'font-size:10.0pt;font-family:"Tahoma","sans-serif"'>
rai-bounces@ietf.org [mailto:rai-bounces@ietf.org] <b>On Behalf Of =
</b>Dan Burnett<br>
<b>Sent:</b> Tuesday, December 29, 2009 1:01 PM<br>
<b>To:</b> Roni Even<br>
<b>Cc:</b> speechsc@ietf.org; sarvi@cisco.com; oran@cisco.com; =
rai@ietf.org<br>
<b>Subject:</b> Re: [RAI] RAI review of =
draft-ietf-speechsc-mrcpv2-19<o:p></o:p></span></p>

</div>

</div>

<p class=3DMsoNormal><o:p>&nbsp;</o:p></p>

<p class=3DMsoNormal>Hi Roni,<o:p></o:p></p>

<div>

<p class=3DMsoNormal><o:p>&nbsp;</o:p></p>

</div>

<div>

<p class=3DMsoNormal>Just to finish up on your last comments . . =
.<o:p></o:p></p>

</div>

<div>

<p class=3DMsoNormal><o:p>&nbsp;</o:p></p>

</div>

<div>

<p class=3DMsoNormal>-- dan<o:p></o:p></p>

</div>

<div>

<p class=3DMsoNormal><o:p>&nbsp;</o:p></p>

<div>

<div>

<div>

<p class=3DMsoNormal>On Aug 12, 2009, at 3:15 AM, Roni Even =
wrote:<o:p></o:p></p>

</div>

<p class=3DMsoNormal><br>
<br>
<o:p></o:p></p>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>Hi Dan,</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>I understand your explanation about all these =
&quot;vendor
specific&quot; parameters. I think that since this is a standards track =
document
there should be some text explaining the usage of these parameters, as =
well as
a note that since these are vendor-specific values you =
cannot
compare the values coming from different vendors.</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

<div>

<p class=3DMsoNormal><o:p>&nbsp;</o:p></p>

</div>

<p class=3DMsoNormal>Thank you. &nbsp;I will note this in the next draft =
and
suggest how these parameters may be used in light of their vendor =
dependence.<o:p></o:p></p>

</div>

<div>

<p class=3DMsoNormal><br>
<br>
<o:p></o:p></p>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>As for my comment number 5 on payload type 96: my comment =
was
that if the m-line has a payload type number of 96 you must have an =
a=3Drtpmap
line mapping 96 to a specific subtype name, while for PCMU it is not =
mandatory
to have a=3Drtpmap as you have in your examples, since payload type =
number 0 is
a static payload type number assigned to PCMU.</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

<div>

<p class=3DMsoNormal><o:p>&nbsp;</o:p></p>

</div>

<p class=3DMsoNormal>I'm sorry, I did not explain this very well. =
&nbsp;I
understood your comment. &nbsp;My reply was that of the three examples, =
example
2 did actually provide the a=3Drtpmap line for 96. &nbsp;Since the =
payload type
of 96 should not even have been included in the first and third =
examples, once
I removed it from those two examples all three contained the proper =
a=3Drtpmap
lines.<o:p></o:p></p>

</div>

<div>

<p class=3DMsoNormal>Although it is not necessary to have an a=3Drtpmap =
line for payload
type 0, others in the past had requested it, so I left it =
in.<o:p></o:p></p>

</div>

<div>

<p class=3DMsoNormal><br>
<br>
<o:p></o:p></p>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>Roni Even</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

<div style=3D'border:none;border-left:solid blue 1.5pt;padding:0in 0in =
0in 4.0pt;
border-width:initial;border-color:initial'>

<div>

<div style=3D'border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt =
0in 0in 0in;
border-width:initial;border-color:initial;z-index:auto'>

<div>

<p class=3DMsoNormal><b><span =
style=3D'font-size:10.0pt;font-family:"Tahoma","sans-serif";
color:black'>From:</span></b><span class=3Dapple-converted-space><span
style=3D'font-size:10.0pt;font-family:"Tahoma","sans-serif";color:black'>=
&nbsp;</span></span><span
style=3D'font-size:10.0pt;font-family:"Tahoma","sans-serif";color:black'>=
Dan
Burnett [<a =
href=3D"mailto:dburnett@voxeo.com">mailto:dburnett@voxeo.com</a>]<span
class=3Dapple-converted-space>&nbsp;</span><br>
<b>Sent:</b><span class=3Dapple-converted-space>&nbsp;</span>Tuesday, =
August 11,
2009 9:22 PM<br>
<b>To:</b><span class=3Dapple-converted-space>&nbsp;</span>Roni Even<br>
<b>Cc:</b><span class=3Dapple-converted-space>&nbsp;</span><a
href=3D"mailto:sarvi@cisco.com">sarvi@cisco.com</a>;<span
class=3Dapple-converted-space>&nbsp;</span><a =
href=3D"mailto:oran@cisco.com">oran@cisco.com</a>;
'Eric Burger';<span class=3Dapple-converted-space>&nbsp;</span><a
href=3D"mailto:speechsc@ietf.org">speechsc@ietf.org</a>;<span
class=3Dapple-converted-space>&nbsp;</span><a =
href=3D"mailto:rai@ietf.org">rai@ietf.org</a><br>
<b>Subject:</b><span class=3Dapple-converted-space>&nbsp;</span>Re: RAI =
review of
draft-ietf-speechsc-mrcpv2-19</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>On Jul 7, 2009, at 3:40 =
PM, Roni
Even wrote:<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<p class=3DMsoCommentText =
style=3D'margin-bottom:10.0pt;line-height:13.5pt'><span
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";color:black'=
>Hi,</span><span
style=3D'color:black'><o:p></o:p></span></p>

<p class=3DMsoCommentText =
style=3D'margin-bottom:10.0pt;line-height:13.5pt'><span
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";color:black'=
>I was
assigned to do a RAI review of the draft. &nbsp;The draft looks ready =
for
publication to me. I have some comments mostly editorial.</span><span
style=3D'color:black'><o:p></o:p></span></p>

<p class=3DMsoCommentText =
style=3D'margin-bottom:10.0pt;line-height:13.5pt'><span
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";color:black'=
>The
only issue I see that is not purely editorial is the issue of the =
different
parameters like confidence threshold, sensitivity level (see comments =
11, 13,
15, 16 and 17). I think that some clarification on the semantics and the =
scale
(for example are the values linearly spaced) as well as when they are =
useful
will be helpful to implementers.</span><span =
style=3D'color:black'><o:p></o:p></span></p>

<p class=3DMsoCommentText =
style=3D'margin-bottom:10.0pt;text-indent:-.25in;
line-height:13.5pt'><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>1.</span><span =
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In figure 1, expand the
abbreviations TTS, ASR, SV, and SI, and explain how they relate to the =
media resource
types in 3.1.</span><span style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>Done. &nbsp;Added some =
text
explaining Figure 1 and enhanced Figure 1 slightly for =
clarification.<br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<p class=3DMsoCommentText =
style=3D'margin-bottom:10.0pt;text-indent:-.25in;
line-height:13.5pt'><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>2.</span><span =
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In figure 1 there is a =
SIP
dialog between the MRCPv2 client and the media source/sink. What is this
dialog? In section 4 I only saw a dialog between the client and =
server.</span><span
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>Clarified in&nbsp;the =
first
example of section 4.2 that the SIP dialog with the media source/sink is =
not
shown.<o:p></o:p></span></p>

</div>

</div>

<blockquote style=3D'margin-top:5.0pt;margin-bottom:5.0pt'>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>3.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In section 3.2 you have =
&#8220;For
example:<span class=3Dapple-converted-space>&nbsp;</span><a
href=3D"sip:mrcpv2@example.net"><span =
style=3D'color:windowtext;text-decoration:
none'>sip:mrcpv2@example.net</span></a>&#8221; twice in a =
row.</span><span
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

</blockquote>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>Fixed.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>4.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In the example in =
section 4.2
you have &#8220;a=3Dcmid:1&#8221;. cmid is specified later in the =
document, so maybe you can add
a reference to where it is specified.</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>Done.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>5.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In the example in =
section 4.2
and in following examples you have &#8220;m=3Daudio 49170 RTP/AVP 0 =
96&#8221; but do not have
an rtpmap parameter for mapping 96 (dynamic payload type number) to a =
media
encoding name.</span><span style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>It is not in the first =
or third
examples (Synthesizer only), but it is in the second example =
(Recognizer).
&nbsp;I have removed 96 as an option for the Synthesizer-only examples =
but let
it remain as an addition for the Recognizer =
example.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>6.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In section 4.3 =
&#8220;Also note that
more that one media session can be associated with a single resource if =
need
be, but this scenario is not useful for the current set of =
resources&#8221;. There is
a typo: the second &#8220;that&#8221; should be &#8220;than&#8221;. I am =
also not sure if the current
syntax in this document can support the mode.</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>Fixed the =
typo.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>7.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In section 4.3 =
&#8220;The formatting
of the&quot;cmid&quot; attribute in SDP RFC3388 [RFC4566]&#8221;. I =
think you meant
SDP grouping and need the reference to RFC 3388.</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>I removed the reference =
altogether
because it already exists (correctly) earlier in the =
paragraph.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>8.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In section 5.1 =
&#8220;The
message-length field specifies the length of the message, including the
start-line&#8221;. Is the length in bytes? No unit is =
specified.</span><span
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>Changed &quot;length of =
the
message&quot; to &quot;length of the message in =
bytes&quot;.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>9.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In section 6.3.1, typo =
you have
&#8220;Verfication &#8220; instead of verification. It appears twice in =
the section.</span><span
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>Fixed.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>10.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In the example in =
section 7 you
have &#8220;m=3Daudio 0 RTP/AVP 0 1 3&#8221;. Payload type 1 was deleted =
from the IANA
registry, so maybe use another payload type number.</span><span =
style=3D'color:
black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>I just removed that =
payload type.
&nbsp;It is not germane to the example.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>11.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In section 9.4.1, 9.4.2 =
and
9.4.3 you specify confidence threshold, sensitivity level and speed vs
accuracy. What is the scale here? Is it linear between 0 and 1? What is =
the
absolute value of the number? If you receive the same confidence level =
from two
recognizers, are they the same (e.g. when using context block to switch
servers)?&nbsp; For speed vs accuracy, how does the client know what =
is the
relation between the value and the number of available sessions, since =
this
seems to be the reason for using this parameter?</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>The interpretation of =
all of these
parameters is implementation-specific because the underlying =
technologies used
to implement them vary and can even be proprietary. &nbsp;In practice =
the
speech recognition and synthesis and speaker authentication communities =
have lived
with this state of affairs for many years, and users of other APIs for =
this
technology are well aware of and have built applications that =
accommodate this
variability in interpretation. &nbsp;It is outside the scope of this
specification to attempt to standardize interpretations of these =
values.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>12.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In 9.4.9 and in 10.4.8, =
11.4.11
what are the values for media-type-value? You also mention audio and =
video but
it looks to me that this document only discusses voice.</span><span
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>Yes. &nbsp;Although the =
original
intent was to record speech, application authors today are beginning to =
look at
ways to incorporate other audio or video. &nbsp;The intent of the =
sentences in
these sections is to clarify that the specification itself imposes no
restriction on the types of media that are =
allowed.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>13.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In 9.4.35 and 9.4.36 =
what is
the scale for the consistency here? How does one know what close means? =
What is
the consistency between different recognizers?</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>The answer to question =
11, above,
applies here as well.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>14.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In section 9.6.3.3 in =
the
example (figure 2) confidence should be 0.75 and not 75</span><span
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>Fixed.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>15.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In section 10.4.1 it is =
not
clear how you measure the sensitivity in order to specify it. Is it =
based on some
SNR translated to a 0 to 1 scale?</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>The answer to question =
11, above,
applies here as well.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>16.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In 11.4.6 the same issue =
with
the scale: how does the client know how to set a value when working with
different speaker verification servers?</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>Ditto. &nbsp;I should =
point out
that in all of these cases the parameters are typically passed directly =
to the
engine, and their interpretations are defined (and described) in the =
vendors'
documentation. &nbsp;The most common MRCPv2 server implementations are =
by the technology
vendors themselves (the providers of the synthesis, recognition, and
verification engines). &nbsp;This is commonly understood in this =
technology
industry (meaning those who use this technology =
regularly).<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>17.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In 11.5.2.9 you state =
that the
verification-score is not a probability, so what is it? How can the =
client
decide if, for example, 0 is a good score for specifying the =
threshold?&nbsp; I
also noticed that the values in the example in section 11.5.2.10 are =
very
precise, like 0.98514. Is this the expected precision? The examples here =
and in
section 11.11 do not show the threshold. If the threshold is required =
for this
flow, why not show it in the example?</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>This parameter, as =
others
mentioned above, has only a vendor-specific interpretation. &nbsp;In =
practice
authors interpret these values based both on guidance from the =
technology
vendors and via experimentation on large sets of recorded =
data.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>The =
Min-Verification-Score
threshold is not required to be set. &nbsp;In many cases the technology =
vendor
has a fairly good understanding of what the default threshold should be.
&nbsp;The verification-score is returned, however, in case the =
application
author determines (through experimentation, as described above) that the
default threshold is not producing optimal results for the application.
&nbsp;In that case the author can set the threshold to a different value =
or can
set it to -1 and make the determination within the application itself =
based on
the verification-score values.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>18.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In section 12.3 the =
suggestion
is to use SRTP as the mandatory interoperability mode. If the reason for
mandating SRTP is for a common mode you should also decide on a key =
exchange
mechanism. I suggest you look at<span =
class=3Dapple-converted-space>&nbsp;</span><a
href=3D"http://tools.ietf.org/html/draft-ietf-avt-srtp-not-mandatory-02">=
http://tools.ietf.org/html/draft-ietf-avt-srtp-not-mandatory-02</a><span
class=3Dapple-converted-space>&nbsp;</span>for discussion on media =
security.</span><span
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>Based on the discussion =
between
you and Dan York on the list, I will change this:<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div><pre><span class=3Dapple-style-span><span =
style=3D'font-size:12.0pt;
font-family:"Helvetica","sans-serif";color:black'>12.3. Media session =
protection&nbsp;</span></span><span
style=3D'color:black'><o:p></o:p></span></pre><pre><span =
class=3Dapple-style-span><span
style=3D'font-size:9.0pt;font-family:"Helvetica","sans-serif";color:black=
'>Sensitive data is also carried on media sessions terminating on MRCPv2 =
servers (the other end of a media channel may or may not be on the =
MRCPv2 client). This data includes the user's spoken utterances and the =
output of text-to-speech operations. MRCPv2 servers MUST support SRTP =
for protection of audio media sessions. MRCPv2 clients that originate or =
consume audio similarly MUST support SRTP. Alternative media channel =
protection MAY be used if desired (e.g. IPSEC).</span></span><span
style=3D'color:black'><o:p></o:p></span></pre></div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>to =
this:<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div><pre><span class=3Dapple-style-span><span =
style=3D'font-size:9.0pt;font-family:
"Helvetica","sans-serif";color:black'>12.3. Media session =
protection&nbsp;</span></span><span
style=3D'color:black'><o:p></o:p></span></pre><pre><span =
class=3Dapple-style-span><span
style=3D'font-size:9.0pt;font-family:"Helvetica","sans-serif";color:black=
'>Sensitive data is also carried on media sessions terminating on MRCPv2 =
servers (the other end of a media channel may or may not be on the =
MRCPv2 client). This data includes the user's spoken utterances and the =
output of text-to-speech operations. MRCPv2 servers MUST support a =
security mechanism for protection of audio media sessions. MRCPv2 =
clients that originate or consume audio similarly MUST support a =
security mechanism for protection of the audio. If =
appropriate,&nbsp;usage of the Secure Real-time Transport Protocol =
(SRTP)&nbsp;[RFC3711] is recommended.</span></span><span
style=3D'color:black'><o:p></o:p></span></pre></div>

<div>

<blockquote style=3D'margin-top:5.0pt;margin-bottom:5.0pt'>

<div>

<div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal style=3D'text-indent:-.25in'><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>19.</span><span
style=3D'font-size:7.0pt;color:black'>&nbsp;&nbsp;<span
class=3Dapple-converted-space>&nbsp;</span></span><span =
style=3D'font-size:11.0pt;
font-family:"Calibri","sans-serif";color:black'>In section 13.7.2 you =
specify
the attribute resource as session level yet in the example in section =
4.2 it is
a media level attribute. The same goes for the channel =
attribute.</span><span
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

</blockquote>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'>I have corrected both =
in section
13.7.2 to be media-level.<o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span style=3D'color:black'><br>
<br>
<br>
<o:p></o:p></span></p>

</div>

<div>

<div>

<div style=3D'margin-left:.5in'>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>Thanks</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>Roni Even</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

<div>

<div>

<p class=3DMsoNormal><span =
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif";
color:black'>&nbsp;</span><span =
style=3D'color:black'><o:p></o:p></span></p>

</div>

</div>

</div>

</div>

</div>

<div>

<p class=3DMsoNormal><span =
style=3D'color:black'>&nbsp;<o:p></o:p></span></p>

</div>

</div>

</div>

</div>

</div>

<p class=3DMsoNormal><o:p>&nbsp;</o:p></p>

</div>

</div>

</div>

</div>

</body>

</html>

------=_NextPart_000_0801_01CA88AD.53323700--


From slawomir.testowy@gmail.com  Wed Dec 30 07:01:04 2009
Return-Path: <slawomir.testowy@gmail.com>
X-Original-To: speechsc@core3.amsl.com
Delivered-To: speechsc@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id ACE103A67A6 for <speechsc@core3.amsl.com>; Wed, 30 Dec 2009 07:01:04 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 0.001
X-Spam-Level: 
X-Spam-Status: No, score=0.001 tagged_above=-999 required=5 tests=[BAYES_50=0.001]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id gXC45MQ2XDnT for <speechsc@core3.amsl.com>; Wed, 30 Dec 2009 07:01:03 -0800 (PST)
Received: from mail-bw0-f223.google.com (mail-bw0-f223.google.com [209.85.218.223]) by core3.amsl.com (Postfix) with ESMTP id 9C9933A67A2 for <speechsc@ietf.org>; Wed, 30 Dec 2009 07:01:03 -0800 (PST)
Received: by bwz23 with SMTP id 23so7776508bwz.29 for <speechsc@ietf.org>; Wed, 30 Dec 2009 07:00:40 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type; bh=dcKu4G2/0JzCL5lvH4SwBJcTHuL3cG/YRtnvmGIreVE=; b=c5jHQZ2/H3d2hYw6yNgHmv+Oejsq4jZQ/Rsat9JOb0k0TaVKyY3eQ84zMOJ6tN13hc /2k2ePTEUm7AjjGVkeRHajN5+O0PIFZR5K15bNkgmsN6rWeRskmEJokmgeogEZuujQEP vm40U8cDS32f+vHZGqN9RWnbFsshCVCAt/xPw=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=xt9VWeMuhOx/hhwS3IoX4YUYz01IrPjDdKXIeCBgyXLtz2xiYktWTWLQwvyIIwXIpi UMaEaXl2HGt0K+3uCtZY766wIaf0Zggyu3SXxJfo25rfrnTXJDH6o1ZIaN+0LbcxzYI0 L4qpmpvU/fp960CQqUVeLdfjEZznJji07TDBs=
MIME-Version: 1.0
Received: by 10.204.2.211 with SMTP id 19mr6666756bkk.6.1262185240197; Wed, 30  Dec 2009 07:00:40 -0800 (PST)
Date: Wed, 30 Dec 2009 16:00:40 +0100
Message-ID: <8681d1580912300700p4ae3bb77hea0cebe875da4fb7@mail.gmail.com>
From: Slawomir Testowy <slawomir.testowy@gmail.com>
To: speechsc@ietf.org
Content-Type: text/plain; charset=ISO-8859-1
Subject: [Speechsc]  Question about DEFINE-LEXICON
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/speechsc>, <mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/speechsc>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/speechsc>, <mailto:speechsc-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 30 Dec 2009 15:01:04 -0000

Hi.

The description of DEFINE-LEXICON seems imprecise.

> 8.14. DEFINE-LEXICON
>
>
>   The DEFINE-LEXICON method, from the client to the server, provides a
>   lexicon and tells the server to load, unload, activate or deactivate
>   the lexicon.

How does it tell the server? OK, I found it in the description of the
Load-Lexicon header in "8.4.16. Load-Lexicon". Maybe this should be moved here?

Moreover, while loading and unloading are defined, activation and deactivation
are not specified anywhere.

>
>    If the server resource is in the speaking or paused state, the server
>    MUST respond 402 (Method not valid in this state) failure status.
>
>    If the resource is in the idle state and is able to successfully
>    load/unload/activate/deactivate the lexicon the status MUST return a
>    success code and the request-state MUST be COMPLETE.
>
>    If the synthesizer could not define the lexicon for some reason, for
>    example because the download failed or the lexicon was in an
>    unsupported form, the server MUST respond with a failure status code
>    of 407, and a Completion-Cause header describing the failure reason.

There is no definition of "lexicon". There is only this:

> 8.5.2. Lexicon Data
>
>    Synthesizer lexicon data from the client to the server can be
>    provided inline or by reference.  Either way they are carried as
>    typed media in the message body of the MRCPv2 request message.
>    ...

Is "lexicon" the same as http://www.w3.org/TR/pronunciation-lexicon/
as used by SSML 1.1? If yes, there should be a reference to PLS. If not,
there should be some clarification of what media type the lexicon is, for
example a reference to http://www.w3.org/TR/speech-synthesis/#S3.1.4,
which describes lexicons in SSML.

Maybe there should be a note that the type of the lexicon is given by the
Content-Type header of the DEFINE-LEXICON message.
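
To illustrate what such a note would imply for an implementation, here is a
minimal sketch of a server-side check that reads the lexicon type from the
Content-Type header. The function names are hypothetical, and treating
application/pls+xml (the PLS media type registered in RFC 4267) as the
expected lexicon type is my assumption, not something the draft states.

```python
# Assumption: PLS documents are labelled with the media type from RFC 4267.
PLS_MEDIA_TYPE = "application/pls+xml"


def lexicon_media_type(headers):
    """Return the bare, lowercased media type from Content-Type, or None.

    `headers` is a plain dict of header name to value (hypothetical
    representation of a parsed DEFINE-LEXICON request).
    """
    for name, value in headers.items():
        # Header names are compared case-insensitively.
        if name.lower() == "content-type":
            # Drop parameters such as "; charset=utf-8".
            return value.split(";", 1)[0].strip().lower()
    return None


def is_pls_lexicon(headers):
    """True if the message body is declared to be a PLS document."""
    return lexicon_media_type(headers) == PLS_MEDIA_TYPE
```

With something like this, a lexicon supplied by reference (for example as
text/uri-list) would be distinguished from an inline PLS document purely by
the header, which is the behaviour the note would need to spell out.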

From slawomir.testowy@gmail.com  Wed Dec 30 07:07:38 2009
Return-Path: <slawomir.testowy@gmail.com>
X-Original-To: speechsc@core3.amsl.com
Delivered-To: speechsc@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id E6A403A6844 for <speechsc@core3.amsl.com>; Wed, 30 Dec 2009 07:07:38 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.299
X-Spam-Level: 
X-Spam-Status: No, score=-1.299 tagged_above=-999 required=5 tests=[AWL=1.300,  BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CIYED50Cgsmc for <speechsc@core3.amsl.com>; Wed, 30 Dec 2009 07:07:37 -0800 (PST)
Received: from mail-bw0-f223.google.com (mail-bw0-f223.google.com [209.85.218.223]) by core3.amsl.com (Postfix) with ESMTP id 820CF3A67EF for <speechsc@ietf.org>; Wed, 30 Dec 2009 07:07:37 -0800 (PST)
Received: by bwz23 with SMTP id 23so7780263bwz.29 for <speechsc@ietf.org>; Wed, 30 Dec 2009 07:07:13 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type; bh=dQlVS2SNZ7RXXB+MQMDyPI/yQnp205/2Xtf8LyWZ9h4=; b=uPSVZyP4AkyXxK8CALljVWH1/VSsV1L5qgvF3q1KrOpzTuy7DPuGWR/NhUR/yext+o pUkv9cVE5GDzDWCDwxa/OFFdHLdNGP373pY22EBiNivUvl9oitgVesHAEyprVCwLxVOz qbQrmU8Sbf47TT67jbe5i/gQ7rRk/bewOgipU=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=kv1EK3AG0YGBb+ng+dzra9X3AszxAav77C5s2MTnAz8/hj1qGUXMzzlZp1F/aGMeS+ lD9eiwl9MQkA4m01IOjBziEjpsqdBqWY3u5XrJFH/6+/V1h2jcFBZf0YWot7cJv0NCnE Kz3P1s/rjLaWSIJSortd/PF4QIC5z0HHM9omM=
MIME-Version: 1.0
Received: by 10.204.155.86 with SMTP id r22mr3723097bkw.165.1262185633497;  Wed, 30 Dec 2009 07:07:13 -0800 (PST)
Date: Wed, 30 Dec 2009 16:07:13 +0100
Message-ID: <8681d1580912300707w31e1bb1et7174d0f01c7df83e@mail.gmail.com>
From: Slawomir Testowy <slawomir.testowy@gmail.com>
To: speechsc@ietf.org
Content-Type: text/plain; charset=ISO-8859-1
Subject: [Speechsc]  Possible typo in Content-Encoding description
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/speechsc>, <mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/speechsc>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/speechsc>, <mailto:speechsc-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 30 Dec 2009 15:07:39 -0000

> 6.2.9. Content-Encoding
>
>
>    The content-encoding entity-header is used as a modifier to the
>    media-type.  When present, its value indicates what additional
>    content encoding has been applied to the entity-body, and thus what
>    decoding mechanisms must be applied in order to obtain the media-type
>    referenced by the content-type header.  Content-encoding is primarily
>    used to allow a document to be compressed without losing the identity
>    of its underlying media type.  Note that the SDP session can be used
>    to determine accepted encodings (see Section 7).  This header MAY
>    occur on all messages.

Section 7 describes the use of the SIP OPTIONS method, and the Accept-Encoding
header is returned in the SIP response, not in the SDP answer, so I guess "Note
that the SDP session can be used" should be changed to "Note that the SIP
session can be used".

From slawomir.testowy@gmail.com  Wed Dec 30 07:49:15 2009
Return-Path: <slawomir.testowy@gmail.com>
X-Original-To: speechsc@core3.amsl.com
Delivered-To: speechsc@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id C667E3A6A21 for <speechsc@core3.amsl.com>; Wed, 30 Dec 2009 07:49:15 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.949
X-Spam-Level: 
X-Spam-Status: No, score=-1.949 tagged_above=-999 required=5 tests=[AWL=0.650,  BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id wtFXbybu4D4i for <speechsc@core3.amsl.com>; Wed, 30 Dec 2009 07:49:15 -0800 (PST)
Received: from mail-bw0-f223.google.com (mail-bw0-f223.google.com [209.85.218.223]) by core3.amsl.com (Postfix) with ESMTP id BD1783A6909 for <speechsc@ietf.org>; Wed, 30 Dec 2009 07:49:14 -0800 (PST)
Received: by bwz23 with SMTP id 23so7802630bwz.29 for <speechsc@ietf.org>; Wed, 30 Dec 2009 07:48:50 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type; bh=SUeS5RbDK+XPuC9AlADcLl8bm1v5Wvey/MBivlcNFrE=; b=lvoOjK1/qwb+5QYWktpR5bnCAipEZ28JpV/jDjokl3RyGMWzZCaRpfpUUGdMug5cIA 81DCFyzKq+HzsNk/F96o0DrqlS1AthW3MYQV2ikdMLwlEYQaZmd29i2hYiPfDNDkvwuJ mLTw3bd2NT7gmmCBRQyL4EpPq2SjyeuLoVtYI=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=ERlO/LhevfjxJngCKso72wxKXydYOqjvOCZCA8Z2481fxbChMaX668iSinng6lwict n3BpsCdazsV4RklxH+HBE8wOXD2ew+gHK+NU5peSJlvsZ77M5WtZbQjyrCWxOxGXYVY4 GOhlkhtJ9j1hxUX/K4LXx7ILS2qObm3SKlxrk=
MIME-Version: 1.0
Received: by 10.204.175.81 with SMTP id w17mr366201bkz.125.1262188130103; Wed,  30 Dec 2009 07:48:50 -0800 (PST)
Date: Wed, 30 Dec 2009 16:48:50 +0100
Message-ID: <8681d1580912300748l7c37b7a2gad296df002bbe399@mail.gmail.com>
From: Slawomir Testowy <slawomir.testowy@gmail.com>
To: speechsc@ietf.org
Content-Type: text/plain; charset=ISO-8859-1
Subject: [Speechsc]  Random comments on mrcpv2-draft-20
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/speechsc>, <mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/speechsc>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/speechsc>, <mailto:speechsc-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 30 Dec 2009 15:49:15 -0000

> The "SPEAK" Request provides the synthesizer

> The SPEAK method implementation MUST do

> If the current SPEAK fails, all SPEAK methods in the pending queue are
> cancelled and each generates a SPEAK-COMPLETE event with a Completion-Cause of
> "cancelled".

> For the synthesizer resource, "SPEAK" is the only method that can return a
> request-state of IN-PROGRESS or PENDING.  When the text has been synthesized
> and played into the media stream, the resource issues a "SPEAK-COMPLETE"
> event with the request-id of the "SPEAK" request and a request-state of
> COMPLETE.

Why are methods and events sometimes surrounded by quotes and sometimes not?

> 6.1.1. SET-PARAMS

Is SET-PARAMS atomic? That is, if SET-PARAMS fails, MUST it leave everything
unmodified, as if the request had never been received?
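
To make the question concrete, this is roughly the behaviour I would expect
if the answer is "yes". It is only a sketch: the function, the dict-based
session state and the validator table are all hypothetical, and the draft
does not say that servers work this way.

```python
def set_params_atomic(current, requested, validators):
    """Apply all requested parameters, or none of them.

    current    -- dict of parameter name to value (hypothetical session state)
    requested  -- dict of parameters carried by one SET-PARAMS request
    validators -- dict of parameter name to a function returning True/False

    Returns (ok, params). On failure `params` is the untouched `current`.
    """
    staged = dict(current)  # work on a copy so `current` is never half-updated
    for name, value in requested.items():
        is_valid = validators.get(name)
        if is_valid is None or not is_valid(value):
            # Unknown header or bad value: reject the whole request.
            return False, current
        staged[name] = value
    return True, staged
```

For example, a request carrying one valid and one invalid header would leave
the session exactly as it was, rather than applying the valid one and
reporting a failure for the other.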

> 8.4.7. Prosody-Parameters

>  The prosody parameter headers in the "SET-PARAMS" or "SPEAK" request
>  only apply if the speech data is of type text/plain and does not use
>  a speech markup format.

Why is this so? Why is it not also true for Voice-Parameters?
Is it true for CONTROL (i.e. must the current SPEAK be text/plain)?

The specification does not say anything about it.

>
>  These prosody parameter headers MAY also be sent in a CONTROL method
>  to affect a "SPEAK" request in progress and change its behavior on
>  the fly.  If the synthesizer resource does not support this
>  operation, it MUST respond back to the client with a status of 403
>  "Unsupported Header".

If the last sentence applies only to the CONTROL method (which is not clear IMO),
maybe it should be moved to the description of CONTROL, because there is no
information about error handling in 8.11.

The same applies to the Voice-Parameters headers.

-- 
regards
Slawomir Testowy
