From exim@www1.ietf.org  Wed Nov  5 17:14:36 2003
Received: from optimus.ietf.org (ietf.org [132.151.1.19] (may be forged))
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA26883
	for <speechsc-archive@odin.ietf.org>; Wed, 5 Nov 2003 17:14:36 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHVut-0002Eb-SF
	for speechsc-archive@odin.ietf.org; Wed, 05 Nov 2003 17:14:19 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hA5MEJes008583
	for speechsc-archive@odin.ietf.org; Wed, 5 Nov 2003 17:14:19 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHVut-0002EM-Oo
	for speechsc-web-archive@optimus.ietf.org; Wed, 05 Nov 2003 17:14:19 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA26875
	for <speechsc-web-archive@ietf.org>; Wed, 5 Nov 2003 17:14:06 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHVur-0004YH-00
	for speechsc-web-archive@ietf.org; Wed, 05 Nov 2003 17:14:17 -0500
Received: from manatick.foretec.com ([4.17.168.5] helo=manatick)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHVuo-0004YD-00
	for speechsc-web-archive@ietf.org; Wed, 05 Nov 2003 17:14:16 -0500
Received: from [132.151.6.22] (helo=optimus.ietf.org)
	by manatick with esmtp (Exim 4.24)
	id 1AHVun-0005E3-QX
	for speechsc-web-archive@ietf.org; Wed, 05 Nov 2003 17:14:14 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHVua-0002Aj-R4; Wed, 05 Nov 2003 17:14:00 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHVth-00029Z-Kh
	for speechsc@optimus.ietf.org; Wed, 05 Nov 2003 17:13:05 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA26828
	for <speechsc@ietf.org>; Wed, 5 Nov 2003 17:12:52 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHVtf-0004Wv-00
	for speechsc@ietf.org; Wed, 05 Nov 2003 17:13:03 -0500
Received: from [205.150.90.87] (helo=voicegenie.com)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHVtb-0004Wn-00
	for speechsc@ietf.org; Wed, 05 Nov 2003 17:12:59 -0500
Received: from voicegenie.com (dilbert.voicegenie.com [205.150.90.110])
	by voicegenie.com (8.11.6+Sun/8.9.3) with ESMTP id hA5MCRY28708
	for <speechsc@ietf.org>; Wed, 5 Nov 2003 17:12:27 -0500 (EST)
Message-ID: <3FA97740.3CCD5D0D@voicegenie.com>
Date: Wed, 05 Nov 2003 17:18:40 -0500
From: Alex Lee <alee@voicegenie.com>
Reply-To: alee@voicegenie.com
Organization: VoiceGenie Inc.
X-Mailer: Mozilla 4.76 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: speechsc@ietf.org
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Subject: [Speechsc] Question on Proxy, and media redirect.
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>

Hi,

    We have looked at how sessions are set up in MRCPv2 and find that
the procedure isn't well defined for some proxy applications.  The
situation we have in mind is a proxy fronting multiple ASR servers at
the back end, where each ASR server may have different capabilities
(e.g. Server 1 supports English, Server 2 supports French).  Sometimes
it may be desirable to use different servers for different recognition
sessions on the same call.

    When different servers are to be used for different recognition
sessions, there are two ways this may be done.  First, if the client is
very smart, it can be aware of all the servers it has access to and set
up independent sessions to these servers based on what it needs.  The
other possibility is to build a client that isn't very smart but that
accesses an MRCP proxy capable of routing its recognition requests
based on the parameters of each recognition session.  For an efficient
implementation, the Proxy should not proxy the audio data; it should
proxy only the SIP and MRCP control messages.

    In order to do that, however, the spec itself must allow for some
kind of media re-routing (similar to a SIP re-INVITE) initiated from the
server when the MRCP client calls the "RECOGNIZE" method.  Right now,
MRCPv2 assumes that the same media channel will be used throughout a
session.

    One way to work around this is to allow a SIP INVITE to be sent
from the Server to the Client in the middle of a "RECOGNIZE" method, to
change some properties of the media channel.  In the use case with the
proxy switching back-end servers, the client would initiate a session
with the proxy.  Then, based on the parameters of the MRCP RECOGNIZE
command (e.g. which grammars to use for recognition, which languages to
use, etc.), the proxy would choose the back-end server to use for the
recognition.  After the real back-end server has been chosen, the proxy
would go back to the client to modify the media properties, so that the
client can send the audio directly to the real server.
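
    As a rough illustration of the selection step described above, the
proxy's choice of back-end server could look like the Python sketch
below.  Everything in it is an assumption made for illustration: the
Backend record, the example.com SIP URIs, and the idea that the language
is the only routing key.  Nothing here is defined by the MRCPv2 draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """Hypothetical description of one ASR server behind the proxy."""
    name: str
    sip_uri: str
    languages: frozenset


# Assumed pool, mirroring the "Server 1 English, Server 2 French" example.
BACKENDS = (
    Backend("server1", "sip:asr1@example.com", frozenset({"en-US"})),
    Backend("server2", "sip:asr2@example.com", frozenset({"fr-FR"})),
)


def choose_backend(language, backends=BACKENDS):
    """Return the first back end that supports `language`, or None."""
    for backend in backends:
        if language in backend.languages:
            return backend
    return None
```

    A real proxy would presumably also look at the grammars named in the
RECOGNIZE request, and would need a policy for the case where no back
end matches.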

    The following diagram illustrates the interaction of such a
situation.  The single arrows are messages sent via the SIP channels,
while the double arrows are messages sent via the control channels.

Client                           Proxy                           Server
   |                               |                               |
   | -------- SIP INVITE --------> |                               |
   |                               |                               |
   | <------- SIP INVITE --------- |                               |
   |     Port# for both control    |                               |
   |            and media          |                               |
   |                               |                               |
   | ---------- SIP ACK ---------> |                               |
   |                               |                               |
   |                               |                               |
   | ======= TCP connect() ======> |                               |
   |                               |                               |
   | ==== MRCP DEFINE-GRAMMAR ===> |                               |
   |                               |                               |
   | <=== MRCP DEFINE-GRAMMAR ==== |                               |
   |                               |                               |
   | ====== MRCP SET-PARAM ======> |                               |
   |                               |                               |
   | <===== MRCP SET-PARAM ======= |                               |
   |                               |                               |
   | <=== More MRCP SET-PARAM ===> |                               |
   |       and DEFINE-GRAMMAR      |                               |
   |                               |                               |
   | ==== MRCP RECOGNIZE (1) ====> |                               |
   |                               |                               |
   |                               | -------- SIP INVITE --------> |
   |                               |                               |
   |                               | <------- SIP INVITE --------- |
   |                               |     Port# for both control    |
   |                               |            and media          |
   |                               |                               |
   |                               | ---------- SIP ACK ---------> |
   |                               |                               |
   | <------- SIP INVITE --------- |                               |
   |      New Port# of media       |                               |
   |                               |                               |
   | -------- SIP INVITE --------> |                               |
   |                               |                               |
   | <--------- SIP ACK ---------- |                               |
   |                               | ======= TCP connect() ======> |
   |                               |                               |
   |                               | <==== All MRCP SET-PARAM ===> |
   |                               |       and DEFINE-GRAMMAR      |
   |                               |        sent by client         |
   |                               |                               |
   |                               | ====== MRCP RECOGNIZE ======> |
   |                               |                               |
   |                               | <===== MRCP RECOGNIZE ======= |
   |                               |                               |
   | <==== MRCP RECOGNIZE (2) ==== |                               |
   |                               |                               |
   |                               | <=== MRCP START-OF-SPEECH === |
   |                               |                               |
   | <=== MRCP START-OF-SPEECH === |                               |
   |                               |                               |
   |                               | < MRCP RECOGNITION-COMPLETE = |
   |                               |                               |
   | < MRCP RECOGNITION-COMPLETE = |                               |
   |                               |                               |
   |                               |                               |
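
    The proxy-side bookkeeping implied by the diagram can be sketched in
Python as follows.  This is only a toy model under stated assumptions:
the ProxySession class, its method names, and the (destination, message)
tuples are all hypothetical, and the MRCP method names are spelled as in
the diagram.  It shows one thing only: setup requests are held until
RECOGNIZE arrives, then replayed to the chosen server after the media
has been redirected.

```python
class ProxySession:
    """Toy model of the proxy's per-session state (all names hypothetical)."""

    # Method names are spelled as in the diagram above.
    SETUP_METHODS = ("SET-PARAM", "DEFINE-GRAMMAR")

    def __init__(self):
        self.buffered = []  # setup requests seen before a server is chosen

    def on_client_request(self, method):
        """Return the (destination, message) pairs the proxy would send."""
        if method in self.SETUP_METHODS:
            # No back end chosen yet: remember the request, answer locally.
            self.buffered.append(method)
            return [("client", method + " response")]
        if method == "RECOGNIZE":
            # Open a session to the server, then redirect the client's media.
            out = [("server", "SIP INVITE"),
                   ("client", "SIP INVITE (new media port)")]
            # Replay everything the client configured, then the request itself.
            out += [("server", m) for m in self.buffered]
            out.append(("server", "RECOGNIZE"))
            return out
        raise ValueError("unhandled method: " + method)
```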

    Is there any possibility of explicitly including this mechanism, or
an alternative, in the MRCPv2 spec, so that a proxy can be used where it
needs to change the back-end server media properties in the middle of
the session?  I have seen that "RE-DIRECT" is mentioned in section 8.5.3
of the MRCPv2 spec, but there is no example, explicit description, or
set of rules for how this is to be done.  Any extension or clarification
on this topic would be much appreciated.  Thanks.

Alex...


_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc



From exim@www1.ietf.org  Thu Nov  6 08:26:22 2003
Received: from optimus.ietf.org (ietf.org [132.151.1.19] (may be forged))
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id IAA07897
	for <speechsc-archive@odin.ietf.org>; Thu, 6 Nov 2003 08:26:22 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHk9D-0006M3-F6
	for speechsc-archive@odin.ietf.org; Thu, 06 Nov 2003 08:26:03 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hA6DQ3QM024423
	for speechsc-archive@odin.ietf.org; Thu, 6 Nov 2003 08:26:03 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHk9D-0006Lq-9D
	for speechsc-web-archive@optimus.ietf.org; Thu, 06 Nov 2003 08:26:03 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id IAA07857
	for <speechsc-web-archive@ietf.org>; Thu, 6 Nov 2003 08:25:51 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHk9C-0000wA-00
	for speechsc-web-archive@ietf.org; Thu, 06 Nov 2003 08:26:02 -0500
Received: from ietf.org ([132.151.1.19] helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHk9B-0000w7-00
	for speechsc-web-archive@ietf.org; Thu, 06 Nov 2003 08:26:01 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHk9B-0006Kz-Eu; Thu, 06 Nov 2003 08:26:01 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHk3P-0005vV-Ut
	for speechsc@optimus.ietf.org; Thu, 06 Nov 2003 08:20:03 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id IAA07635
	for <speechsc@ietf.org>; Thu, 6 Nov 2003 08:19:52 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHk3O-0000sU-00
	for speechsc@ietf.org; Thu, 06 Nov 2003 08:20:02 -0500
Received: from pb-relay1.scansoft.com ([198.71.64.23] helo=smtp-relay1.scansoft.com)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHk3O-0000s2-00
	for speechsc@ietf.org; Thu, 06 Nov 2003 08:20:02 -0500
Received: from pb-exchcon.pb.scansoft.com ([10.1.4.73] unverified) by smtp-relay1.scansoft.com with Microsoft SMTPSVC(5.0.2195.6713);
	 Thu, 6 Nov 2003 08:19:35 -0500
Received: by pb-exchcon.pb.scansoft.com with Internet Mail Service (5.5.2653.19)
	id <WA7KL3LS>; Thu, 6 Nov 2003 08:19:31 -0500
Message-ID: <9E452E1028D9664E89CA1BBCF166805768EBE0@bn-exch1>
From: "Eberman, Brian" <Brian.Eberman@scansoft.com>
To: speechsc@ietf.org
Cc: "Eberman, Brian" <Brian.Eberman@scansoft.com>
Subject: RE: [Speechsc] Question on Proxy, and media redirect.
Date: Thu, 6 Nov 2003 08:19:29 -0500 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain
X-OriginalArrivalTime: 06 Nov 2003 13:19:35.0945 (UTC) FILETIME=[9FB43F90:01C3A468]
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>



I really think Alex has brought up an important point here.  We need to be
able to cleanly separate the RTP/audio streaming from the control stream to
get a cleaner specification.  Proxy scaling should really be a key issue
for MRCPv2, and I thought it was in the base requirements.  This seems like
a significant hole.

I've seen two other places in implementations of VoiceXML browsers that need
to be clarified for v2:

1) There isn't any way to cleanly implement a pure recording function under
MRCP, as needed for the record function of VoiceXML, without having the
box that runs the VoiceXML implementation route the audio.  I think we need
to resolve this issue.  VoiceXML recording can be terminated by speech
recognition, so it is actually more like a recognition than a recording.

2) The SPEECH-MARKER event for TTS doesn't indicate the time of the event.
So if a system is distributing the RTP stream and the TTS stream, and
potentially mixing audio using the MARKER, it doesn't know the time in the
RTP stream at which the audio mark was hit.
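
To make point 2 concrete: if the event carried the media time at which the
mark was reached, a receiver could map it onto the RTP stream with the
usual RTP timestamp arithmetic.  The Python sketch below assumes such an
elapsed-time value is available, which the current spec does not provide;
the function name and its parameters are hypothetical, and the 8000 Hz
default is just the common narrowband telephony clock rate.

```python
def rtp_timestamp_at(base_timestamp, elapsed_seconds, clock_rate=8000):
    """RTP timestamp reached after `elapsed_seconds` of media.

    `base_timestamp` is the RTP timestamp of the first packet of the
    utterance; RTP timestamps are 32-bit and wrap around.
    """
    ticks = round(elapsed_seconds * clock_rate)
    return (base_timestamp + ticks) % 2**32
```

For example, with a base timestamp of 1000 and a mark hit 1.5 seconds into
the audio, the mark falls at RTP timestamp 13000.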

The stream control semantics seem to need a little work here.
-Brian




From exim@www1.ietf.org  Thu Nov  6 15:07:22 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA26615
	for <speechsc-archive@odin.ietf.org>; Thu, 6 Nov 2003 15:07:21 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHqPH-00078F-L2
	for speechsc-archive@odin.ietf.org; Thu, 06 Nov 2003 15:07:04 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hA6K73QD027406
	for speechsc-archive@odin.ietf.org; Thu, 6 Nov 2003 15:07:03 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHqPH-00077x-57
	for speechsc-web-archive@optimus.ietf.org; Thu, 06 Nov 2003 15:07:03 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA26560
	for <speechsc-web-archive@ietf.org>; Thu, 6 Nov 2003 15:06:49 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHqPD-00072z-00
	for speechsc-web-archive@ietf.org; Thu, 06 Nov 2003 15:06:59 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHqPD-00072w-00
	for speechsc-web-archive@ietf.org; Thu, 06 Nov 2003 15:06:59 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHqPE-000768-Mf; Thu, 06 Nov 2003 15:07:00 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHqOw-00074T-LC
	for speechsc@optimus.ietf.org; Thu, 06 Nov 2003 15:06:42 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA26520
	for <speechsc@ietf.org>; Thu, 6 Nov 2003 15:06:29 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHqOt-00072X-00
	for speechsc@ietf.org; Thu, 06 Nov 2003 15:06:39 -0500
Received: from [63.163.229.65] (helo=[208.236.204.65])
	by ietf-mx with smtp (Exim 4.12)
	id 1AHqOs-00072T-00
	for speechsc@ietf.org; Thu, 06 Nov 2003 15:06:38 -0500
Received: from namasmtp02.nmss.com by [208.236.204.65]
          via smtpd (for ietf-mx.ietf.org [132.151.6.1]) with SMTP; Thu, 6 Nov 2003 15:07:24 -0500
To: speechsc@ietf.org
Cc: "Eberman, Brian" <Brian.Eberman@scansoft.com>
X-Mailer: Lotus Notes Release 5.0.12   February 13, 2003
Message-ID: <OF4DA4C910.D1BAD914-ON85256DD6.005F8E18-85256DD6.006E774C@nmss.com>
From: "John Potemri" <John_Potemri@nmss.com>
Date: Thu, 6 Nov 2003 15:06:11 -0500
X-MIMETrack: Serialize by Router on NAMASMTP02/NMS Communications(Release 5.0.12  |February
 13, 2003) at 11/06/2003 03:06:39 PM
MIME-Version: 1.0
Content-type: text/plain; charset=iso-8859-1
Content-transfer-encoding: quoted-printable
Subject: [Speechsc] Re: Speechsc digest, Vol 1 #153 - 2 msgs
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>






Brian,

I did wonder when someone would raise "record" on the reflector. I know
many people are talking about it.  I would caution against pushing "record"
into the recognize package (a.k.a. resource type). Someone may want to
offer record capabilities via MRCP and not have recognition.

I would prefer to see record defined as a separate resource. MRCP does
allow multiple resource types to be on the same control channel and to
share the same source stream, so you could record while recognizing, too.
This keeps the two state machines and protocol interface clean.

I don't want to open a can of worms, but I suppose similar arguments for
more granular resource types could be made around:
   - "speak" supporting both TTS and audio - what if I only supported
   audio? At least in this case you wouldn't have both resources "active".
   - "recognize" resources that only do DTMF grammars.

But then again, these are "types" within the "resource" and one could argue
that you support the functional interface specified by the resource
definition, but may not support certain types. This could have been
resolved out-of-band via some resource management framework. But record in
recognition is different. I understand that this may not be a speech
vendor's preferred way to look at things.

This does raise the question of MRCP as the protocol for media engine
("media server") functions. You'll note that while not defined, the spec
does suggest fax as a type, too. Perhaps someone has already had this
dialog. I have excluded addressing conferencing at this point because I
know who will jump all over me.

-John

P.S. And while we comment against MRCPv2, is it out-of-scope to mention the
effects relative to MRCP(v1) for which people might make parallel
suggestions?


John Potemri
NMS Communications
100 Crossing Blvd
Framingham, MA  01702
+1-508-271-1369
john_potemri@nmss.com



----- Message from Alex Lee <alee@voicegenie.com> on Wed, 05 Nov 2003
17:18:40 -0500 -----
                                                 =20
      To: speechsc@ietf.org                      =20
                                                 =20
 Subject: [Speechsc] Question on Proxy, and media=20
          redirect.                              =20
                                                 =20

Hi,

    We have looked at the way the sessions are set up with MRCPv2, and
find that it isn't well-defined for some proxy applications.  The
situation we're thinking about is that there's a proxy fronting multipl=
e
ASR servers at the back end, where each ASR server may have different
capabilities (e.g. Server 1 supports English, Server 2 supports
French).  Sometimes, it may be desirable to use different servers for
different recognition sessions on the same call.

    When it is desired that a different server is to be used for
different recognition sessions, there are two possibilities for how thi=
s
may be done.  First, if the client is very smart, then it can be aware
of many different servers it has access to, and setup independent
sessions to these servers based on what it needs.  The other possibilit=
y
is to build a client that isn't very smart, but it will access an MRCP
proxy that is capable of routing its recognition requests, based on the=

parameters of each recognition session.  For efficient implementation o=
f
the Proxy, the Proxy should not proxy the audio data, but rather it
should only proxy the SIP and MRCP control messages.

    In order to do that, however, the spec itself must allow for some
kind of media re-routing (similar to a SIP re-invite) initiated from the
server when a "RECOGNIZE" method is called from the MRCP client.  Right
now, MRCPv2 assumes that the same media channel will be used within one
session.

    One way to work around this is to allow for a SIP INVITE to be sent
from the Server to the Client in the middle of a "recognize" method, to
change some properties of the media channel.  In the use case with the
proxy switching back-end servers, the client would initiate a session
with the proxy.  Then, based on the parameters of the MRCP RECOGNIZE
command (e.g. which grammars to use for recognition, which languages to
use, etc.), the proxy would choose the back-end server to use for the
recognition.  After the real back-end server has been chosen, the proxy
would go back to the client to modify the media properties, so that the
client can send the audio directly to the real server.
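
The server choice described above could be sketched roughly as follows.
This is only an illustration, assuming the proxy keeps a table of
back-end servers and the languages each one supports.  The BackendServer
type, the choose_backend helper, and the server names are hypothetical
and do not come from the MRCPv2 draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendServer:
    """Hypothetical description of one ASR server behind the proxy."""
    name: str
    languages: frozenset


def choose_backend(servers, language):
    """Return the first server that supports the requested language.

    `language` stands in for whatever the proxy extracts from the
    RECOGNIZE request (grammars, languages, and so on).
    """
    for server in servers:
        if language in server.languages:
            return server
    raise LookupError(f"no back-end server supports {language!r}")


# Example table mirroring "Server 1 supports English, Server 2 French".
SERVERS = [
    BackendServer("server1", frozenset({"en"})),
    BackendServer("server2", frozenset({"fr"})),
]
```

A real proxy would probably match on more than language (grammar type,
load, and so on); first-match on language is just the simplest case.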

    The following diagram illustrates the interaction of such a
situation.  The single arrows are messages sent via the SIP channels,
while the double arrows are messages sent via the control channels.

Client                           Proxy                           Server
   |                               |                               |
   | -------- SIP INVITE --------> |                               |
   |                               |                               |
   | <------- SIP INVITE --------- |                               |
   |     Port# for both control    |                               |
   |            and media          |                               |
   |                               |                               |
   | ---------- SIP ACK ---------> |                               |
   |                               |                               |
   |                               |                               |
   | ======= TCP connect() ======> |                               |
   |                               |                               |
   | ==== MRCP DEFINE-GRAMMAR ===> |                               |
   |                               |                               |
   | <=== MRCP DEFINE-GRAMMAR ==== |                               |
   |                               |                               |
   | ====== MRCP SET-PARAM ======> |                               |
   |                               |                               |
   | <===== MRCP SET-PARAM ======= |                               |
   |                               |                               |
   | <=== More MRCP SET-PARAM ===> |                               |
   |       and DEFINE-GRAMMAR      |                               |
   |                               |                               |
   | ==== MRCP RECOGNIZE (1) ====> |                               |
   |                               |                               |
   |                               | -------- SIP INVITE --------> |
   |                               |                               |
   |                               | <------- SIP INVITE --------- |
   |                               |     Port# for both control    |
   |                               |            and media          |
   |                               |                               |
   |                               | ---------- SIP ACK ---------> |
   |                               |                               |
   | <------- SIP INVITE --------- |                               |
   |      New Port# of media       |                               |
   |                               |                               |
   | -------- SIP INVITE --------> |                               |
   |                               |                               |
   | <--------- SIP ACK ---------- |                               |
   |                               | ======= TCP connect() ======> |
   |                               |                               |
   |                               | <==== All MRCP SET-PARAM ===> |
   |                               |       and DEFINE-GRAMMAR      |
   |                               |        sent by client         |
   |                               |                               |
   |                               | ====== MRCP RECOGNIZE ======> |
   |                               |                               |
   |                               | <===== MRCP RECOGNIZE ======= |
   |                               |                               |
   | <==== MRCP RECOGNIZE (2) ==== |                               |
   |                               |                               |
   |                               | <=== MRCP START-OF-SPEECH === |
   |                               |                               |
   | <=== MRCP START-OF-SPEECH === |                               |
   |                               |                               |
   |                               | < MRCP RECOGNITION-COMPLETE = |
   |                               |                               |
   | < MRCP RECOGNITION-COMPLETE = |                               |
   |                               |                               |
   |                               |                               |
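
The proxy-side bookkeeping implied by this flow (hold on to the client's
SET-PARAM and DEFINE-GRAMMAR requests, then replay them to whichever
server is chosen when RECOGNIZE arrives) might look something like the
sketch below.  The ProxySession class and its method names are
hypothetical, and the requests are reduced to (method, body) pairs rather
than real MRCP messages.

```python
class ProxySession:
    """Hypothetical per-client state held by the proxy."""

    # Methods whose effect must be reproduced on the chosen server.
    REPLAYABLE = {"SET-PARAM", "DEFINE-GRAMMAR"}

    def __init__(self):
        # Every replayable request seen so far, in arrival order.  It is
        # kept after a replay because a later RECOGNIZE may be routed to
        # a different server that has seen none of it.
        self.history = []

    def on_client_request(self, method, body):
        """Return the (method, body) pairs to send to the server now."""
        if method in self.REPLAYABLE:
            # The proxy answers these itself and only records them.
            self.history.append((method, body))
            return []
        if method == "RECOGNIZE":
            # Replay the recorded setup, then forward the RECOGNIZE.
            return self.history + [(method, body)]
        return [(method, body)]
```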

    Is there any possibility to explicitly include this mechanism, or an
alternative, in the MRCPv2 spec that would allow for the usage of a
proxy where it needs to change the back-end server media properties in
the middle of the session?  I have seen that "RE-DIRECT" has been
mentioned in the MRCPv2 spec in section 8.5.3, but there hasn't been any
example, explicit description, or rules on how this is to be done.  Any
extensions or clarification on this topic would be much appreciated.
Thanks.

Alex...



----- Message from "Eberman, Brian" <Brian.Eberman@scansoft.com> on Thu, 6
Nov 2003 08:19:29 -0500 -----

      To: speechsc@ietf.org
      cc: "Eberman, Brian" <Brian.Eberman@scansoft.com>
 Subject: RE: [Speechsc] Question on Proxy, and media redirect.



I really think Alex has brought up an important point here.  We need to be
able to cleanly separate the RTP/audio streaming from the control stream
to get a cleaner specification.  Proxy scaling should really be a key
issue for MRCPv2, and I thought it was in the base requirements.  This
seems like a significant hole.

I've seen two other places in implementations of VoiceXML browsers that
need to be clarified for v2:

1) There isn't any way to cleanly implement a pure recording function
under MRCP to implement the record function of VoiceXML without having the
VoiceXML implementation box route the audio.  I think we need to resolve
this issue.  VoiceXML recording can be terminated by speech recognition,
so it is actually more like a recognition than a recording.

2) The SPEECH-MARKER event for TTS doesn't indicate the time of the event.
So if a system is distributing the RTP stream and the TTS stream and
potentially mixing audio using the MARKER, it doesn't know the time in the
RTP stream when the audio mark was hit.
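
To illustrate what a time reference would allow: suppose, purely as an
assumption, that the event carried the RTP timestamp at which the mark
was reached.  A receiver could then place the mark in the audio as
sketched below.  The function name and the 8000 Hz default clock rate
are illustrative choices, not anything defined by MRCP.

```python
def marker_offset_seconds(marker_ts, first_ts, clock_rate=8000):
    """Seconds from the first RTP packet of the stream to the marker.

    RTP timestamps are 32-bit counters that wrap around, so the
    difference is taken modulo 2**32.
    """
    ticks = (marker_ts - first_ts) % (1 << 32)
    return ticks / clock_rate
```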

The stream control semantics seem to need a little work here.
-Brian

_______________________________________________
Speechsc mailing list
Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc





From exim@www1.ietf.org  Thu Nov  6 16:09:22 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA00982
	for <speechsc-archive@odin.ietf.org>; Thu, 6 Nov 2003 16:09:22 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHrNI-00048h-BJ
	for speechsc-archive@odin.ietf.org; Thu, 06 Nov 2003 16:09:04 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hA6L9419015907
	for speechsc-archive@odin.ietf.org; Thu, 6 Nov 2003 16:09:04 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHrNI-00048U-6b
	for speechsc-web-archive@optimus.ietf.org; Thu, 06 Nov 2003 16:09:04 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA00970
	for <speechsc-web-archive@ietf.org>; Thu, 6 Nov 2003 16:08:51 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHrNG-0000Wi-00
	for speechsc-web-archive@ietf.org; Thu, 06 Nov 2003 16:09:02 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHrNG-0000Wf-00
	for speechsc-web-archive@ietf.org; Thu, 06 Nov 2003 16:09:02 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHrNF-00047k-7A; Thu, 06 Nov 2003 16:09:01 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHrMQ-00043n-7Y
	for speechsc@optimus.ietf.org; Thu, 06 Nov 2003 16:08:10 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA00952
	for <speechsc@ietf.org>; Thu, 6 Nov 2003 16:07:58 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHrMO-0000WJ-00
	for speechsc@ietf.org; Thu, 06 Nov 2003 16:08:08 -0500
Received: from vtg-um-e2k1.cisco.com ([171.70.93.55] helo=vtg-um-e2k1.sj21ad.cisco.com)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHrMN-0000WG-00
	for speechsc@ietf.org; Thu, 06 Nov 2003 16:08:07 -0500
Received: from cisco.com ([128.107.139.4]) by vtg-um-e2k1.sj21ad.cisco.com with Microsoft SMTPSVC(5.0.2195.5329);
	 Thu, 6 Nov 2003 13:06:20 -0800
Message-ID: <3FAAB817.7090209@cisco.com>
Date: Thu, 06 Nov 2003 13:07:35 -0800
From: Sarvi Shanmugham <sarvi@cisco.com>
Organization: Cisco Systems Inc.
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: alee@voicegenie.com
CC: speechsc@ietf.org
Subject: Re: [Speechsc] Question on Proxy, and media redirect.
References: <3FA97740.3CCD5D0D@voicegenie.com>
In-Reply-To: <3FA97740.3CCD5D0D@voicegenie.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 06 Nov 2003 21:06:20.0984 (UTC) FILETIME=[D4021380:01C3A4A9]
Content-Transfer-Encoding: 7bit
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Content-Transfer-Encoding: 7bit
Content-Transfer-Encoding: 7bit

Hi Alex,
    I agree with you very much that the redirect/proxy capabilities of
MRCPv2 need to be upgraded.
The current MRCPv2 specification is derived from the current MRCPv1
spec, which is RTSP based and hence was limited in its Proxy/Redirect
capabilities.
   MRCPv2 fits in the SIP framework, and the SIP framework is
expected to set up media lines and control lines. I agree with your
call flow too. I was hoping we could achieve redirects in a similar
fashion.

     I agree we should add such call flows to the spec to explain how
this is achieved. It hasn't been addressed in the current spec yet.

Sarvi


Alex Lee wrote:

>Hi,
>
>    We have looked at the way the sessions are set up with MRCPv2, and
>find that it isn't well defined for some proxy applications.  The
>situation we're thinking about is that there's a proxy fronting multiple
>ASR servers at the back end, where each ASR server may have different
>capabilities (e.g. Server 1 supports English, Server 2 supports
>French).  Sometimes, it may be desirable to use different servers for
>different recognition sessions on the same call.
>
>    When it is desired that a different server is to be used for
>different recognition sessions, there are two possibilities for how this
>may be done.  First, if the client is very smart, then it can be aware
>of many different servers it has access to, and setup independent
>sessions to these servers based on what it needs.  The other possibility
>is to build a client that isn't very smart, but it will access an MRCP
>proxy that is capable of routing its recognition requests, based on the
>parameters of each recognition session.  For efficient implementation of
>the Proxy, the Proxy should not proxy the audio data, but rather it
>should only proxy the SIP and MRCP control messages.
>
>    In order to do that, however, the spec itself must allow for some
>kind of media re-routing (similar to a SIP re-invite) initiated from the
>server when a "RECOGNIZE" method is called from the MRCP client.  Right
>now, MRCPv2 assumes that the same media channel will be used within one
>session.
>
>    One way to work around this is to allow for a SIP INVITE to be sent
>from the Server to the Client in the middle of a "recognize" method, to
>change some properties of the media channel.  In the use case with the
>proxy switching back-end servers, the client would initiate a session
>with the proxy.  Then, based on the parameters of the MRCP RECOGNIZE
>command (e.g. which grammars to use for recognition, which languages to
>use, etc.), the proxy would choose the back-end server to use for the
>recognition.  After the real back-end server has been chosen, the proxy
>would go back to the client to modify the media properties, so that the
>client can send the audio directly to the real server.
>
>    The following diagram illustrates the interaction of such a
>situation.  The single arrows are messages sent via the SIP channels,
>while the double arrows are messages sent via the control channels.
>
>Client                           Proxy                           Server
>   |                               |                               |
>   | -------- SIP INVITE --------> |                               |
>   |                               |                               |
>   | <------- SIP INVITE --------- |                               |
>   |     Port# for both control    |                               |
>   |            and media          |                               |
>   |                               |                               |
>   | ---------- SIP ACK ---------> |                               |
>   |                               |                               |
>   |                               |                               |
>   | ======= TCP connect() ======> |                               |
>   |                               |                               |
>   | ==== MRCP DEFINE-GRAMMAR ===> |                               |
>   |                               |                               |
>   | <=== MRCP DEFINE-GRAMMAR ==== |                               |
>   |                               |                               |
>   | ====== MRCP SET-PARAM ======> |                               |
>   |                               |                               |
>   | <===== MRCP SET-PARAM ======= |                               |
>   |                               |                               |
>   | <=== More MRCP SET-PARAM ===> |                               |
>   |       and DEFINE-GRAMMAR      |                               |
>   |                               |                               |
>   | ==== MRCP RECOGNIZE (1) ====> |                               |
>   |                               |                               |
>   |                               | -------- SIP INVITE --------> |
>   |                               |                               |
>   |                               | <------- SIP INVITE --------- |
>   |                               |     Port# for both control    |
>   |                               |            and media          |
>   |                               |                               |
>   |                               | ---------- SIP ACK ---------> |
>   |                               |                               |
>   | <------- SIP INVITE --------- |                               |
>   |      New Port# of media       |                               |
>   |                               |                               |
>   | -------- SIP INVITE --------> |                               |
>   |                               |                               |
>   | <--------- SIP ACK ---------- |                               |
>   |                               | ======= TCP connect() ======> |
>   |                               |                               |
>   |                               | <==== All MRCP SET-PARAM ===> |
>   |                               |       and DEFINE-GRAMMAR      |
>   |                               |        sent by client         |
>   |                               |                               |
>   |                               | ====== MRCP RECOGNIZE ======> |
>   |                               |                               |
>   |                               | <===== MRCP RECOGNIZE ======= |
>   |                               |                               |
>   | <==== MRCP RECOGNIZE (2) ==== |                               |
>   |                               |                               |
>   |                               | <=== MRCP START-OF-SPEECH === |
>   |                               |                               |
>   | <=== MRCP START-OF-SPEECH === |                               |
>   |                               |                               |
>   |                               | < MRCP RECOGNITION-COMPLETE = |
>   |                               |                               |
>   | < MRCP RECOGNITION-COMPLETE = |                               |
>   |                               |                               |
>   |                               |                               |
>
>    Is there any possibility to explicitly include this mechanism, or an
>alternative, in the MRCPv2 spec that would allow for the usage of a
>proxy where it needs change the back-end server media properties in the
>middle of the session?  I have seen that "RE-DIRECT" has been mentioned
>in the MRCPv2 spec in section 8.5.3, but there hasn't been any example,
>explicit description, or rules on how this is to be done.  Any
>extensions or clarification on this topic is much appreciated.  Thanks.
>
>Alex...
>
>
>_______________________________________________
>Speechsc mailing list
>Speechsc@ietf.org
>https://www1.ietf.org/mailman/listinfo/speechsc
>
>  
>


_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc



From exim@www1.ietf.org  Thu Nov  6 16:37:22 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA02387
	for <speechsc-archive@odin.ietf.org>; Thu, 6 Nov 2003 16:37:22 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHroO-0005Gc-SD
	for speechsc-archive@odin.ietf.org; Thu, 06 Nov 2003 16:37:04 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hA6Lb4j7020232
	for speechsc-archive@odin.ietf.org; Thu, 6 Nov 2003 16:37:04 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHroO-0005GF-Cl
	for speechsc-web-archive@optimus.ietf.org; Thu, 06 Nov 2003 16:37:04 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA02368
	for <speechsc-web-archive@ietf.org>; Thu, 6 Nov 2003 16:36:51 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHroK-0000zk-00
	for speechsc-web-archive@ietf.org; Thu, 06 Nov 2003 16:37:00 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHroK-0000zg-00
	for speechsc-web-archive@ietf.org; Thu, 06 Nov 2003 16:37:00 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHroK-0005EW-Gz; Thu, 06 Nov 2003 16:37:00 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHrny-0005E2-Uv
	for speechsc@optimus.ietf.org; Thu, 06 Nov 2003 16:36:38 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA02310
	for <speechsc@ietf.org>; Thu, 6 Nov 2003 16:36:21 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHrns-0000zB-00
	for speechsc@ietf.org; Thu, 06 Nov 2003 16:36:32 -0500
Received: from vtg-um-e2k1.cisco.com ([171.70.93.55] helo=vtg-um-e2k1.sj21ad.cisco.com)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHrnr-0000yP-00
	for speechsc@ietf.org; Thu, 06 Nov 2003 16:36:31 -0500
Received: from cisco.com ([128.107.139.4]) by vtg-um-e2k1.sj21ad.cisco.com with Microsoft SMTPSVC(5.0.2195.5329);
	 Thu, 6 Nov 2003 13:34:45 -0800
Message-ID: <3FAABEBF.4090302@cisco.com>
Date: Thu, 06 Nov 2003 13:35:59 -0800
From: Sarvi Shanmugham <sarvi@cisco.com>
Organization: Cisco Systems Inc.
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: John Potemri <John_Potemri@nmss.com>
CC: speechsc@ietf.org, "Eberman, Brian" <Brian.Eberman@scansoft.com>
Subject: Re: [Speechsc] Re: Speechsc digest, Vol 1 #153 - 2 msgs
References: <OF4DA4C910.D1BAD914-ON85256DD6.005F8E18-85256DD6.006E774C@nmss.com>
In-Reply-To: <OF4DA4C910.D1BAD914-ON85256DD6.005F8E18-85256DD6.006E774C@nmss.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 06 Nov 2003 21:34:45.0250 (UTC) FILETIME=[CBD46E20:01C3A4AD]
Content-Transfer-Encoding: 7bit
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Content-Transfer-Encoding: 7bit
Content-Transfer-Encoding: 7bit

Hi John,
       I agree with all your comments. These are things that I was
planning to propose too, as additions to the specification, but I haven't
been able to get to them due to some other priority work items.
       Like you, I am also interested in seeing Record as a separate
resource, for purposes very similar to the ones stated by you and Brian.
       But I still see a need for recording capabilities associated with
the recognition resources, to be able to record and manage utterances. I
have no problem looking at the Recording engine as a separately
controllable standalone resource, as long as, for simplicity's sake, we
allow the Recognize operation to trigger the start and stop of the
recording engine if needed.
      Another place I suspect this Recording, or the Record utterances
operation, is going to be useful is Speaker Verification. This
requires the user to record/buffer user utterances, spoken separately or
during a recognize operation, and then apply the recorded/buffered audio
to the Speaker Verification Engine.

    I also agree with John that there is a need to be able to play only
audio from a synthesizer.
The way I was hoping we could achieve this is by using the same messages
with different limitations based on the specific resource profiles.
       For example, with the Synthesizer we could have 3 different
profiles: Audio Player, poor man's TTS (concatenating audio bits), and
Advanced TTS.  The major difference between the 3 is the synthesizer
data sent in the SPEAK method.
         1. The Audio Player may support only the SSML <audio> and
<mark> tags.
         2. Poor man's TTS can support a few more SSML tags, such as
<say-as>, <phoneme>, and <voice>, plus the xml:lang attribute, etc.
         3. The Advanced TTS could support everything else.

I agree this same approach applies to Recognizers that are capable of 
DTMF only or both DTMF and ASR.
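
As a rough illustration of the profile idea, here is a minimal sketch of a
server checking the SSML in a SPEAK body against the tag set its profile
allows. The profile names, the PROFILES table, and the unsupported_tags
helper are all hypothetical; nothing like this is defined in the MRCPv2
draft, and treating <speak> as always allowed is my assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile table. The tag sets follow the three profiles
# listed above; the <speak> root element is assumed to be allowed in all.
AUDIO_PLAYER = {"speak", "audio", "mark"}
POOR_MANS_TTS = AUDIO_PLAYER | {"say-as", "phoneme", "voice"}

PROFILES = {
    "audio-player": AUDIO_PLAYER,
    "poor-mans-tts": POOR_MANS_TTS,
    "advanced-tts": None,  # None means "no restriction on tags"
}


def local_name(tag):
    """Reduce a namespaced tag such as '{uri}audio' to 'audio'."""
    return tag.rsplit("}", 1)[-1]


def unsupported_tags(profile, ssml):
    """Return the sorted SSML element names that the profile does not allow.

    Only element names are checked; attributes and text content are ignored.
    """
    allowed = PROFILES[profile]
    if allowed is None:
        return []
    used = {local_name(el.tag) for el in ET.fromstring(ssml).iter()}
    return sorted(used - allowed)
```

A server built to the Audio Player profile could reject a SPEAK request
whenever unsupported_tags() returns a non-empty list, while still speaking
the same protocol messages as a full TTS engine.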

Sarvi


John Potemri wrote:

>
>
>
>Brian,
>
>I did wonder when someone would raise "record" on the reflector. I know
>many people are talking about it.  I would caution against pushing "record"
>into the recognize package (a.k.a. resource type). Someone may want to offer
>record capabilities via MRCP and not have recognition.
>
>I would prefer to see record defined as a separate resource. MRCP does
>allow multiple resource types on the same control channel and share the
>same source stream, so you could record while recognizing, too. This keeps
>the two state machines and protocol interface clean.
>
>I don't want to open a can of worms, but I suppose similar arguments for
>more granular resource types could be made around:
>   - "speak" supporting both TTS and audio - what if I only supported
>   audio? At least in this case you wouldn't have both resources "active".
>   - "recognize" resources that only do DTMF grammars.
>
>But then again, these are "types" within the "resource" and one could argue
>that you support the functional interface specified by the resource
>definition, but may not support certain types. This could have been
>resolved out-of-band via some resource management framework. But record in
>recognition is different. I understand that this may not be a speech
>vendor's preferred way to look at things.
>
>This does beg the question of MRCP as the protocol for media engine ("media
>server") functions. You'll note that while not defined, the spec does
>suggest fax as a type, too. Perhaps someone has already had this dialog. I
>have excluded addressing conferencing at this point because I know who will
>jump all over me.
>
>-John
>
>P.S. And while we comment against MRCPv2, is it out-of-scope to mention the
>effects relative to MRCP(v1) for which people might make parallel
>suggestions?
>
>
>John Potemri
>NMS Communications
>100 Crossing Blvd
>Framingham, MA  01702
>+1-508-271-1369
>john_potemri@nmss.com
>
>
>
>|---------+---------------------------->
>|         |           speechsc-request@|
>|         |           ietf.org         |
>|         |           Sent by:         |
>|         |           speechsc-admin@ie|
>|         |           tf.org           |
>|         |                            |
>|         |                            |
>|         |           11/06/2003 12:00 |
>|         |           PM               |
>|         |           Please respond to|
>|         |           speechsc         |
>|         |                            |
>|---------+---------------------------->
>  >--------------------------------------------------------------------------------------------------------------|
>  |                                                                                                              |
>  |       To:       speechsc@ietf.org                                                                            |
>  |       cc:                                                                                                    |
>  |       Subject:  Speechsc digest, Vol 1 #153 - 2 msgs                                                         |
>  >--------------------------------------------------------------------------------------------------------------|
>
>
>
>
>
>Today's Topics:
>
>   1. Question on Proxy, and media redirect. (Alex Lee)
>   2. RE: Question on Proxy, and media redirect. (Eberman, Brian)
>
>----- Message from Alex Lee <alee@voicegenie.com> on Wed, 05 Nov 2003
>17:18:40 -0500 -----
>                                                  
>      To: speechsc@ietf.org                       
>                                                  
> Subject: [Speechsc] Question on Proxy, and media 
>          redirect.                               
>                                                  
>
>Hi,
>
>    We have looked at the way the sessions are set up with MRCPv2, and
>find that it isn't well-defined for some proxy applications.  The
>situation we're thinking about is that there's a proxy fronting multiple
>ASR servers at the back end, where each ASR server may have different
>capabilities (e.g. Server 1 supports English, Server 2 supports
>French).  Sometimes, it may be desirable to use different servers for
>different recognition sessions on the same call.
>
>    When it is desired that a different server is to be used for
>different recognition sessions, there are two possibilities for how this
>may be done.  First, if the client is very smart, then it can be aware
>of many different servers it has access to, and setup independent
>sessions to these servers based on what it needs.  The other possibility
>is to build a client that isn't very smart, but it will access an MRCP
>proxy that is capable of routing its recognition requests, based on the
>parameters of each recognition session.  For efficient implementation of
>the Proxy, the Proxy should not proxy the audio data, but rather it
>should only proxy the SIP and MRCP control messages.
>
>    In order to do that, however, the spec itself must allow for some
>kind of media re-routing (similar to a SIP re-invite) initiated from the
>server when a "RECOGNIZE" method is called from the MRCP client.  Right
>now, MRCPv2 assumes that the same media channel will be used within one
>session.
>
>    One way to work around this is to allow for a SIP INVITE to be sent
>from the Server to the Client in the middle of a "recognize" method, to
>change some properties of the media channel.  In the use case with the
>proxy switching back-end servers, the client would initiate a session
>with the proxy.  Then, based on the parameters of the MRCP RECOGNIZE
>command (e.g. which grammars to use for recognition, which languages to
>use, etc.), the proxy would choose the back-end server to use for the
>recognition.  After the real back-end server has been chosen, the proxy
>would go back to the client to modify the media properties, so that the
>client can send the audio directly to the real server.
>
>    The following diagram illustrates the interaction of such a
>situation.  The single arrows are messages sent via the SIP channels,
>while the double arrows are messages sent via the control channels.
>
>Client                           Proxy                           Server
>   |                               |                               |
>   | -------- SIP INVITE --------> |                               |
>   |                               |                               |
>   | <------- SIP INVITE --------- |                               |
>   |     Port# for both control    |                               |
>   |            and media          |                               |
>   |                               |                               |
>   | ---------- SIP ACK ---------> |                               |
>   |                               |                               |
>   |                               |                               |
>   | ======= TCP connect() ======> |                               |
>   |                               |                               |
>   | ==== MRCP DEFINE-GRAMMAR ===> |                               |
>   |                               |                               |
>   | <=== MRCP DEFINE-GRAMMAR ==== |                               |
>   |                               |                               |
>   | ====== MRCP SET-PARAM ======> |                               |
>   |                               |                               |
>   | <===== MRCP SET-PARAM ======= |                               |
>   |                               |                               |
>   | <=== More MRCP SET-PARAM ===> |                               |
>   |       and DEFINE-GRAMMAR      |                               |
>   |                               |                               |
>   | ==== MRCP RECOGNIZE (1) ====> |                               |
>   |                               |                               |
>   |                               | -------- SIP INVITE --------> |
>   |                               |                               |
>   |                               | <------- SIP INVITE --------- |
>   |                               |     Port# for both control    |
>   |                               |            and media          |
>   |                               |                               |
>   |                               | ---------- SIP ACK ---------> |
>   |                               |                               |
>   | <------- SIP INVITE --------- |                               |
>   |      New Port# of media       |                               |
>   |                               |                               |
>   | -------- SIP INVITE --------> |                               |
>   |                               |                               |
>   | <--------- SIP ACK ---------- |                               |
>   |                               | ======= TCP connect() ======> |
>   |                               |                               |
>   |                               | <==== All MRCP SET-PARAM ===> |
>   |                               |       and DEFINE-GRAMMAR      |
>   |                               |        sent by client         |
>   |                               |                               |
>   |                               | ====== MRCP RECOGNIZE ======> |
>   |                               |                               |
>   |                               | <===== MRCP RECOGNIZE ======= |
>   |                               |                               |
>   | <==== MRCP RECOGNIZE (2) ==== |                               |
>   |                               |                               |
>   |                               | <=== MRCP START-OF-SPEECH === |
>   |                               |                               |
>   | <=== MRCP START-OF-SPEECH === |                               |
>   |                               |                               |
>   |                               | < MRCP RECOGNITION-COMPLETE = |
>   |                               |                               |
>   | < MRCP RECOGNITION-COMPLETE = |                               |
>   |                               |                               |
>   |                               |                               |
>
>    Is there any possibility to explicitly include this mechanism, or an
>alternative, in the MRCPv2 spec that would allow for the usage of a
>proxy where it needs to change the back-end server media properties in the
>middle of the session?  I have seen that "RE-DIRECT" has been mentioned
>in the MRCPv2 spec in section 8.5.3, but there hasn't been any example,
>explicit description, or rules on how this is to be done.  Any
>extensions or clarifications on this topic would be much appreciated.  Thanks.
>
>Alex...
>
>
>
>----- Message from "Eberman, Brian" <Brian.Eberman@scansoft.com> on Thu, 6
>Nov 2003 08:19:29 -0500 -----
>                                                     
>      To: speechsc@ietf.org                          
>                                                     
>      cc: "Eberman, Brian"                           
>          <Brian.Eberman@scansoft.com>               
>                                                     
> Subject: RE: [Speechsc] Question on Proxy, and      
>          media redirect.                            
>                                                     
>
>
>
>I really think Alex has brought up an important point here.  We need to be
>able to cleanly separate the RTP/audio streaming from the control stream to
>get a cleaner specification.  Proxy scaling should really be a key issue
>for MRCPv2, and I thought it was in the base requirements. This seems like a
>significant hole.
>
>I've seen two other places in implementations of VoiceXML browsers that need
>to be clarified for v2:
>
>1) There isn't any way to cleanly implement a pure recording function under
>MRCP to implement the record function of VoiceXML without having the
>VoiceXML implementation box route the audio. I think we need to resolve
>this issue.  VoiceXML recording can be terminated by speech recognition, so
>it is actually more like a recognition than a recording.
>
>2) The SPEECH-MARKER event for TTS doesn't indicate the time of the event.
>So if a system is distributing the RTP stream and the TTS stream and
>potentially mixing audio using the MARKER, it doesn't know the time in the
>RTP stream when the audio mark was hit.
>
>The stream control semantics seem to need a little work here.
>-Brian
>
>
>
>_______________________________________________
>Speechsc mailing list
>Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc
>
>


_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc



From exim@www1.ietf.org  Thu Nov  6 16:47:27 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA02804
	for <speechsc-archive@odin.ietf.org>; Thu, 6 Nov 2003 16:47:27 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHry9-0006LS-Ok
	for speechsc-archive@odin.ietf.org; Thu, 06 Nov 2003 16:47:09 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hA6Ll9bP024384
	for speechsc-archive@odin.ietf.org; Thu, 6 Nov 2003 16:47:09 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHry9-0006LD-K1
	for speechsc-web-archive@optimus.ietf.org; Thu, 06 Nov 2003 16:47:09 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA02777
	for <speechsc-web-archive@ietf.org>; Thu, 6 Nov 2003 16:46:56 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHry7-00018v-00
	for speechsc-web-archive@ietf.org; Thu, 06 Nov 2003 16:47:07 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHry6-00018k-00
	for speechsc-web-archive@ietf.org; Thu, 06 Nov 2003 16:47:06 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHry0-0006IO-Go; Thu, 06 Nov 2003 16:47:00 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHrxJ-0006Gi-Uy
	for speechsc@optimus.ietf.org; Thu, 06 Nov 2003 16:46:17 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA02738
	for <speechsc@ietf.org>; Thu, 6 Nov 2003 16:46:05 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHrxH-00017m-00
	for speechsc@ietf.org; Thu, 06 Nov 2003 16:46:15 -0500
Received: from [205.150.90.87] (helo=voicegenie.com)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHrxG-000170-00
	for speechsc@ietf.org; Thu, 06 Nov 2003 16:46:15 -0500
Received: from voicegenie.com (dilbert.voicegenie.com [205.150.90.110])
	by voicegenie.com (8.11.6+Sun/8.9.3) with ESMTP id hA6LjSG29303;
	Thu, 6 Nov 2003 16:45:28 -0500 (EST)
Message-ID: <3FAAC26E.48D90C25@voicegenie.com>
Date: Thu, 06 Nov 2003 16:51:42 -0500
From: Alex Lee <alee@voicegenie.com>
Reply-To: alee@voicegenie.com
Organization: VoiceGenie Inc.
X-Mailer: Mozilla 4.76 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
CC: Sarvi Shanmugham <sarvi@cisco.com>, speechsc@ietf.org
Subject: Re: [Speechsc] Question on Proxy, and media redirect.
References: <3FA97740.3CCD5D0D@voicegenie.com> <3FAAB817.7090209@cisco.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Transfer-Encoding: 7bit
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Content-Transfer-Encoding: 7bit
Content-Transfer-Encoding: 7bit

Thanks for your response, Sarvi.  Do you think the client should be *required*
to support this kind of "redirect" instruction from the server, or do you
think this may be optional?  Thanks.

Alex...

Sarvi Shanmugham wrote:

> Hi Alex,
>     I agree with you very much that the redirect/proxy capabilities of
> MRCPv2 need to be upgraded.
> The current MRCPv2 specification is derived from the current MRCPv1
> spec, which is RTSP based and hence was limited in its Proxy/Redirect
> capabilities.
>    With MRCPv2, it fits in the SIP framework. The SIP framework is
> expected to set up media lines and control lines. And I agree with your
> call flow too. I was hoping we could achieve redirects in a similar
> fashion.
>
>      I agree we should add such call flows to the spec to explain how
> this is achieved. It hasn't been addressed in the current spec yet.
>
> Sarvi


_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc



From exim@www1.ietf.org  Thu Nov  6 17:16:26 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA04398
	for <speechsc-archive@odin.ietf.org>; Thu, 6 Nov 2003 17:16:25 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHsQB-0000TB-LZ
	for speechsc-archive@odin.ietf.org; Thu, 06 Nov 2003 17:16:07 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hA6MG7L6001799
	for speechsc-archive@odin.ietf.org; Thu, 6 Nov 2003 17:16:07 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHsQB-0000Sw-HQ
	for speechsc-web-archive@optimus.ietf.org; Thu, 06 Nov 2003 17:16:07 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA04362
	for <speechsc-web-archive@ietf.org>; Thu, 6 Nov 2003 17:15:53 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHsQ7-0001kP-00
	for speechsc-web-archive@ietf.org; Thu, 06 Nov 2003 17:16:03 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHsQ7-0001kL-00
	for speechsc-web-archive@ietf.org; Thu, 06 Nov 2003 17:16:03 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHsQ6-0000S4-9e; Thu, 06 Nov 2003 17:16:02 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AHsP9-0000Pe-Ti
	for speechsc@optimus.ietf.org; Thu, 06 Nov 2003 17:15:03 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA04296
	for <speechsc@ietf.org>; Thu, 6 Nov 2003 17:14:50 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHsP7-0001h7-00
	for speechsc@ietf.org; Thu, 06 Nov 2003 17:15:01 -0500
Received: from vtg-um-e2k1.cisco.com ([171.70.93.55] helo=vtg-um-e2k1.sj21ad.cisco.com)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AHsP7-0001gH-00
	for speechsc@ietf.org; Thu, 06 Nov 2003 17:15:01 -0500
Received: from cisco.com ([128.107.139.4]) by vtg-um-e2k1.sj21ad.cisco.com with Microsoft SMTPSVC(5.0.2195.5329);
	 Thu, 6 Nov 2003 14:13:14 -0800
Message-ID: <3FAAC7C5.8010706@cisco.com>
Date: Thu, 06 Nov 2003 14:14:29 -0800
From: Sarvi Shanmugham <sarvi@cisco.com>
Organization: Cisco Systems Inc.
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: alee@voicegenie.com
CC: speechsc@ietf.org
Subject: Re: [Speechsc] Question on Proxy, and media redirect.
References: <3FA97740.3CCD5D0D@voicegenie.com> <3FAAB817.7090209@cisco.com> <3FAAC26E.48D90C25@voicegenie.com>
In-Reply-To: <3FAAC26E.48D90C25@voicegenie.com>
Content-Type: multipart/alternative;
 boundary="------------040806070300050205070704"
X-OriginalArrivalTime: 06 Nov 2003 22:13:14.0859 (UTC) FILETIME=[2C76CBB0:01C3A4B3]
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>

This is a multi-part message in MIME format.
--------------040806070300050205070704
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

I would think the client SHOULD be required to respond to SIP
re-INVITEs.  My understanding was that, as a SIP UA, the client would
already support re-INVITEs. If not, we make it a SHOULD-level requirement
for the SIP UA acting as the MRCPv2 client to respond appropriately to
re-INVITEs.

As of now, the spec does not say anything about when audio transmission
should start or stop on a media pipe for a Recognition resource. I
suppose the best time to have the client start media transmission to the
server would be after it gets the MRCPv2 200 OK for the RECOGNIZE command.
The media transmission from the client can stop after the
RECOGNITION-COMPLETE event is received by the client.

This would allow the proxy to delay sending back the MRCPv2 200 OK until
the media pipe has been redirected and the response to the RECOGNIZE
command has been received from the back-end server.
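
To make the timing concrete, here is a minimal sketch of the client-side
gating I have in mind. The class and method names are hypothetical and are
not taken from the spec; the only behaviour assumed is what is described
above: audio flows only between the 200 OK for RECOGNIZE and the
RECOGNITION-COMPLETE event, and a re-INVITE may change the media
destination in the meantime.

```python
from enum import Enum


class MediaState(Enum):
    IDLE = "idle"            # no RECOGNIZE outstanding, no audio sent
    WAITING = "waiting"      # RECOGNIZE sent, 200 OK not yet received
    STREAMING = "streaming"  # 200 OK received, audio may flow


class RecognizeMediaGate:
    """Tracks when a client may send audio for one RECOGNIZE request."""

    def __init__(self, host, port):
        # Media destination negotiated by the initial SIP INVITE.
        self.media_target = (host, port)
        self.state = MediaState.IDLE

    def on_recognize_sent(self):
        self.state = MediaState.WAITING

    def on_reinvite(self, host, port):
        # A re-INVITE (for example from a proxy) redirects the media pipe.
        self.media_target = (host, port)

    def on_response(self, status_code):
        if self.state is not MediaState.WAITING:
            return
        # Start streaming only on success; otherwise drop back to idle.
        ok = status_code == 200
        self.state = MediaState.STREAMING if ok else MediaState.IDLE

    def on_event(self, name):
        if name == "RECOGNITION-COMPLETE":
            self.state = MediaState.IDLE

    def may_send_audio(self):
        return self.state is MediaState.STREAMING
```

Because the client holds its audio until the 200 OK arrives, a proxy that
delays that response until after its re-INVITE guarantees the first RTP
packet already goes to the back-end server.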

Sarvi

Alex Lee wrote:

>Thanks for your response Sarvi.  Do you think the client should be *required*
>to support this kind of "redirect" instruction from the server, or do you
>think this may be optional?  Thanks.
>
>Alex...
>
>Sarvi Shanmugham wrote:
>
>  
>
>>Hi Alex,
>>    I agree with you very much that the redirect/proxy capabilities of
>>MRCPv2 need to be upgraded.
>>The current MRCPv2 specification is derived from the current MRCPv1
>>spec, which is RTSP based and hence was limited in its Proxy/Redirect
>>capabilities.
>>   With MRCPv2, it fits in the SIP framework. The SIP framework is
>>expected to set up media lines and control lines. And I agree with your
>>call flow too. I was hoping we could achieve redirects in a similar
>>fashion.
>>
>>     I agree we should add such call flows to the spec to explain how
>>this is achieved. It hasn't been addressed in the current spec yet.
>>
>>Sarvi
>>    
>>
>
>
>  
>

--------------040806070300050205070704--


_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc



From exim@www1.ietf.org  Mon Nov 10 10:46:23 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA25483
	for <speechsc-archive@odin.ietf.org>; Mon, 10 Nov 2003 10:46:23 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AJEEx-0003a3-KX
	for speechsc-archive@odin.ietf.org; Mon, 10 Nov 2003 10:46:07 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hAAFk7Jm013758
	for speechsc-archive@odin.ietf.org; Mon, 10 Nov 2003 10:46:07 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AJEEx-0003Yf-Fx
	for speechsc-web-archive@optimus.ietf.org; Mon, 10 Nov 2003 10:46:07 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA25466
	for <speechsc-web-archive@ietf.org>; Mon, 10 Nov 2003 10:45:51 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AJEEt-0006Nx-00
	for speechsc-web-archive@ietf.org; Mon, 10 Nov 2003 10:46:03 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AJEEt-0006Nt-00
	for speechsc-web-archive@ietf.org; Mon, 10 Nov 2003 10:46:03 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AJEEq-0003Y0-SW; Mon, 10 Nov 2003 10:46:00 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AJEEQ-0003XU-NZ
	for speechsc@optimus.ietf.org; Mon, 10 Nov 2003 10:45:34 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA25430;
	Mon, 10 Nov 2003 10:45:19 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AJEEJ-0006KH-00; Mon, 10 Nov 2003 10:45:27 -0500
Received: from sj-iport-3-in.cisco.com ([171.71.176.72] helo=sj-iport-3.cisco.com)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AJEEJ-0006Jr-00; Mon, 10 Nov 2003 10:45:27 -0500
Received: from cisco.com (171.71.177.254)
  by sj-iport-3.cisco.com with ESMTP; 10 Nov 2003 07:51:06 -0800
Received: from mira-sjc5-e.cisco.com (IDENT:mirapoint@mira-sjc5-e.cisco.com [171.71.163.15])
	by sj-core-2.cisco.com (8.12.9/8.12.6) with ESMTP id hAAFiqw5014379;
	Mon, 10 Nov 2003 07:44:52 -0800 (PST)
Received: from ORANLT.ietf58.ietf.org (sjc-vpn4-151.cisco.com [10.21.80.151])
	by mira-sjc5-e.cisco.com (Mirapoint Messaging Server MOS 3.3.6-GR)
	with ESMTP id AJU66545;
	Mon, 10 Nov 2003 07:44:51 -0800 (PST)
Date: Mon, 10 Nov 2003 10:44:46 -0500
From: "David R. Oran" <oran@cisco.com>
To: agenda@ietf.org
cc: speechsc@ietf.org
Message-ID: <1660948.1068461086@ORANLT.ietf58.ietf.org>
X-Mailer: Mulberry/3.1.0b9 (Win32)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Subject: [Speechsc] Updated Speechsc WGT Agenda
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>

5 mins 		Agenda Bashing

5 mins 		Status of speechsc Requirements document
			draft-ietf-speechsc-reqts-06.txt

45 mins		MRCPv2 document
			draft-ietf-speechsc-mrcpv2-00.txt

30 mins 	SI/SV/Enrollment/Hotword
			draft-burnett-mrcpext-00.txt

30 mins 	Open discussion on media routing





From exim@www1.ietf.org  Mon Nov 10 22:54:19 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id WAA03414
	for <speechsc-archive@odin.ietf.org>; Mon, 10 Nov 2003 22:54:19 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AJPbP-00059e-SJ
	for speechsc-archive@odin.ietf.org; Mon, 10 Nov 2003 22:54:04 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hAB3s31v019810
	for speechsc-archive@odin.ietf.org; Mon, 10 Nov 2003 22:54:03 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AJPbP-00059R-Ok
	for speechsc-web-archive@optimus.ietf.org; Mon, 10 Nov 2003 22:54:03 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id WAA03395
	for <speechsc-web-archive@ietf.org>; Mon, 10 Nov 2003 22:53:48 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AJPbM-00040K-00
	for speechsc-web-archive@ietf.org; Mon, 10 Nov 2003 22:54:00 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AJPbL-00040H-00
	for speechsc-web-archive@ietf.org; Mon, 10 Nov 2003 22:53:59 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AJPbN-000592-05; Mon, 10 Nov 2003 22:54:01 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AJPb3-00058P-2n
	for speechsc@optimus.ietf.org; Mon, 10 Nov 2003 22:53:41 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id WAA03344
	for <speechsc@ietf.org>; Mon, 10 Nov 2003 22:53:26 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AJPaz-0003zZ-00
	for speechsc@ietf.org; Mon, 10 Nov 2003 22:53:37 -0500
Received: from vtg-um-e2k1.cisco.com ([171.70.93.55] helo=vtg-um-e2k1.sj21ad.cisco.com)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AJPay-0003yk-00
	for speechsc@ietf.org; Mon, 10 Nov 2003 22:53:36 -0500
Received: from cisco.com ([128.107.139.4]) by vtg-um-e2k1.sj21ad.cisco.com with Microsoft SMTPSVC(5.0.2195.5329);
	 Mon, 10 Nov 2003 19:51:50 -0800
Message-ID: <3FB05D22.9050809@cisco.com>
Date: Mon, 10 Nov 2003 19:53:06 -0800
From: Sarvi Shanmugham <sarvi@cisco.com>
Organization: Cisco Systems Inc.
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Sailesh.Sathish@nokia.com
CC: speechsc@ietf.org, ramalingam.hariharan@nokia.com
Subject: Re: [Speechsc] MRCPv2  comments
References: <D338C3A6DFB6BE4EA06F1A7494CEBD4601ED64B7@trebe004.europe.nokia.com>
In-Reply-To: <D338C3A6DFB6BE4EA06F1A7494CEBD4601ED64B7@trebe004.europe.nokia.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 11 Nov 2003 03:51:51.0031 (UTC) FILETIME=[237FA070:01C3A807]
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>

Some of the features that you are requesting are not available in the 
current spec because the current specification is leveraging the 
existing MRCPv1 specification. That said,
I do plan to add some of the functionality you have requested. Look for 
individual responses below.

Sailesh.Sathish@nokia.com wrote:

>Hi,
>
>These are some things that I noticed on a quick inspection of MRCP v2.0  spec.
>
>Fetching, Storing and deleting TTS content
>-------------------------------------------------------------
>
>1. When the TTS receives inline content, it has a content-id associated with it. If it has a cache-control field, then that determines how long this content is kept on the server.  However, consider a scenario where there MIGHT be a need to keep TTS content for an entire page browse (so no cache control is activated): if the user then navigates to a new page and the client decides to delete that particular TTS content, it cannot do so. Basically, there is no DELETE_CONTENT directive that can take the "content-id" as a parameter. If there are other ways to do this, they are not clear.
>
With TTS there is no explicit content create or delete.  There is inline 
content, which is stored on the server for the life of the SPEAK request. 
And there is URI-referenced content (either specified as a URI, or 
referenced by a URI from inside inline content), which may be held in 
a local cache on the media server. The client uses the Cache-Directive 
fields to tell the media server what caching directive values to use in 
the media server's document cache for this session or request.

None of these cache directives applies to the inline content itself; 
they apply only to content that is URI-referenced, which the media 
server would have to fetch and possibly cache.

>
>
>2. For TTS content, the spec doesn't specify whether a content-id is needed for both inline content and URIs. For ASR, it specifies that a content-id is needed only for inline recognition grammars and not for external ones.
>
Currently the Content-ID is used in recognizers because you could 
load/compile grammars with a DEFINE-GRAMMAR command and then use them 
later in a RECOGNIZE command. Here we need the Content-ID to relate the two.
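
As a rough sketch of that relationship, assuming nothing about the real
server internals: the GrammarStore class and its method names are
hypothetical stand-ins for the DEFINE-GRAMMAR and RECOGNIZE handling,
and the Content-ID is treated as an opaque key.

```python
class GrammarStore:
    """Hypothetical per-session store of grammars keyed by Content-ID."""

    def __init__(self):
        self._grammars = {}

    def define_grammar(self, content_id, grammar_text):
        # Stand-in for DEFINE-GRAMMAR: keep the grammar under its Content-ID.
        self._grammars[content_id] = grammar_text

    def recognize(self, content_id):
        # Stand-in for RECOGNIZE: find the grammar that was defined earlier.
        if content_id not in self._grammars:
            raise KeyError(f"no grammar defined for Content-ID {content_id!r}")
        return self._grammars[content_id]
```

Without the Content-ID there would be no key for the later RECOGNIZE to
name the grammar by.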

There doesn't seem to be a similar need in TTS. Or is this question 
related to your requirement above?

>
>3. The spec does not specify which TTS methods can carry content as a payload. I would assume from the spec that only the SPEAK request would have the content as a payload. However, the spec defines message headers like "audio fetch hint" and "fetch hint" that are used with SET-PARAMS, GET-PARAMS and SPEAK. They have values of "prefetch", indicating that the content may be prefetched, and "safe", which says the content needs to be downloaded only when needed. If (according to my understanding) prefetch is enabled with a SET-PARAMS request, and the content can be specified ONLY through a SPEAK request, I don't see a prefetch capability here. There should be some method like LOAD-CONTENT that can take either inline content or a URI to fetch (with the content addressed through a content-id).
>
The fetch hints are to be used when the server has to go fetch data from 
outside. This may be the case if the SSML contains either <audio> tags 
pointing to external audio files or possibly other SSML content. The fetch 
hints are to be used in these cases to decide whether the SPEAK request 
should start speaking only when all the content has been loaded or whether 
it can start speaking while more content is being fetched in the 
background (streaming mode). Audio-fetch-hint is specifically for the 
<audio> tag and hence supports streaming as one of its options.  The 
regular fetch hint applies to XML documents that the synthesizer or the 
recognizer may have to fetch from outside before speaking or recognizing. 
The difference between sending them in a SET-PARAMS method as opposed to a 
RECOGNIZE or SPEAK request is whether the values apply for the session or 
only for that particular request.
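
The session-versus-request scoping could be sketched as follows. This is
an assumption-laden illustration only: effective_params is a hypothetical
helper, and the header spellings and values in the usage note are
examples rather than quotations from the spec.

```python
def effective_params(session_params, request_headers):
    """Return the values in force for one request.

    session_params models values set earlier with SET-PARAMS;
    request_headers models headers carried on a single SPEAK or
    RECOGNIZE. Per-request values win, and the session values are
    left untouched for later requests.
    """
    merged = dict(session_params)   # copy so the session state is not mutated
    merged.update(request_headers)  # request-level values override
    return merged
```

For example, with a session value of {"Audio-Fetch-Hint": "prefetch"} and
a SPEAK carrying {"Audio-Fetch-Hint": "stream"}, that one SPEAK would use
"stream" while the next request falls back to "prefetch".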

>
>Queueing
>-------------
>
>MRCPv2 does not specify a QUEUE method. If there is preloaded content (for TTS) on the server that can be referenced using a content-id, the client should be able to choose multiple content items through their ids and create a queue to be played back. It can be argued that multiple SPEAK requests can be sent and they naturally form a queue to be played back. However, consider the scenario where you have sent multiple SPEAK requests and the TTS is playing them back. Then the client decides to PAUSE this and play another TTS content. With the current spec, if you pause this, the next SPEAK will be appended to this queue rather than spoken immediately. If a STOP gets sent, the whole queue gets cleared.
>
>If a QUEUE is implemented, then this queue can be paused (through a PAUSE_QUEUE or something) and a new TTS can be played. The TTS would send back a PLAY_COMPLETE after which the client can resume playback of the queue.  
>
I'd like to gauge how important the need is to be able to pause a SPEAK 
request, play something else and then resume the previously paused 
request.
If there is a strong need for it, my suggestion is to address this using 
the existing mechanisms instead of adding a separate QUEUE 
implementation and operation.

I suspect we could address your problem by fixing the current 
SPEAK/PAUSE/RESUME mechanism and using the Request-Identifier and the 
Request state (ACTIVE/PENDING/COMPLETE) to provide a fix. I will get 
back to you with a clearer proposal on this.
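
For reference, the queue behaviour described in the quoted text above can
be modelled roughly like this. It is a sketch of the behaviour being
discussed, not of any proposed fix: the SpeakQueue class is hypothetical,
and the state names are simply the ones used in this message.

```python
from collections import deque


class SpeakQueue:
    """Hypothetical model of the SPEAK queue behaviour under discussion."""

    def __init__(self):
        self._queue = deque()  # request identifiers in arrival order
        self.paused = False

    def speak(self, request_id):
        # A new SPEAK is always appended, even while the queue is paused.
        self._queue.append(request_id)
        return self.state_of(request_id)

    def state_of(self, request_id):
        # Simplification: anything not in the queue is reported as COMPLETE.
        if request_id not in self._queue:
            return "COMPLETE"
        return "ACTIVE" if self._queue[0] == request_id else "PENDING"

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

    def stop(self):
        # STOP clears the whole queue and reports which requests were dropped.
        dropped = list(self._queue)
        self._queue.clear()
        return dropped
```

The model shows the gap: after pause(), a new speak() still lands behind
the paused request, and stop() is the only way to get it out of the way.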

>
>Audio recorder
>---------------------
>
>MRCPv2 does not specify the use of an audio recorder with RECORD, and FETCH capability.
>There is provision in the ASR methods for recording only those waveforms that were used for recognition with fetch capability.
>
This has been addressed on a recent thread. The summary is that I 
agree there is a need for a separate RECORD resource, and I will add a 
proposed mechanism in the next revision of the specification.

>
>ASR
>-------
>
>1. MRCPv2 does not have provision to delete pre-loaded grammars if the client decides they are no longer needed.
>
The current specification doesn't support server level or global 
preloaded grammars at all. This is one of the requirements per the MRCP 
requirements document and I will be adding support for it as well.

>
>2. There is no INTERMEDIATE_RECOGNITION_RESULT event sent from the recognizer, and no provision for enabling it through the message fields (e.g. SET-PARAMS).
>
Can you elaborate on the need for this? Are you referring to recognition 
in stages (recognizer, NL, etc.) or are you referring to partial 
recognition? If so, could you provide an example of its usage?

Thx,
Sarvi

>
>best regards,
>
>Sailesh.
>
>Sailesh Kumar Sathish      GSM +358 50 4835679
>Nokia Research Center      Desk +358 3 272 5679
>P.O.Box 100, Tampere      sailesh.sathish@nokia.com
>Finland -33721
>





From exim@www1.ietf.org  Tue Nov 11 11:00:24 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id LAA07192
	for <speechsc-archive@odin.ietf.org>; Tue, 11 Nov 2003 11:00:24 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AJaw3-0001m6-1U
	for speechsc-archive@odin.ietf.org; Tue, 11 Nov 2003 11:00:07 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hABG06bm006814
	for speechsc-archive@odin.ietf.org; Tue, 11 Nov 2003 11:00:06 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AJaw2-0001lj-Ey
	for speechsc-web-archive@optimus.ietf.org; Tue, 11 Nov 2003 11:00:06 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA07141
	for <speechsc-web-archive@ietf.org>; Tue, 11 Nov 2003 10:59:52 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AJavz-00054s-00
	for speechsc-web-archive@ietf.org; Tue, 11 Nov 2003 11:00:03 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AJavz-00054o-00
	for speechsc-web-archive@ietf.org; Tue, 11 Nov 2003 11:00:03 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AJavy-0001hn-NE; Tue, 11 Nov 2003 11:00:02 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AJavu-0001gJ-Ln
	for speechsc@optimus.ietf.org; Tue, 11 Nov 2003 10:59:58 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA07135
	for <speechsc@ietf.org>; Tue, 11 Nov 2003 10:59:45 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AJavs-00054i-00
	for speechsc@ietf.org; Tue, 11 Nov 2003 10:59:56 -0500
Received: from sj-iport-1-in.cisco.com ([171.71.176.70] helo=sj-iport-1.cisco.com)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AJavr-00054V-00
	for speechsc@ietf.org; Tue, 11 Nov 2003 10:59:55 -0500
Received: from mira-sjc5-e.cisco.com (IDENT:mirapoint@mira-sjc5-e.cisco.com [171.71.163.15])
	by sj-core-2.cisco.com (8.12.9/8.12.6) with ESMTP id hABFxNw5017786
	for <speechsc@ietf.org>; Tue, 11 Nov 2003 07:59:23 -0800 (PST)
Received: from ORANLT.ietf58.ietf.org (sjc-vpn2-684.cisco.com [10.21.114.172])
	by mira-sjc5-e.cisco.com (Mirapoint Messaging Server MOS 3.3.6-GR)
	with ESMTP id AJV65672;
	Tue, 11 Nov 2003 07:59:20 -0800 (PST)
Date: Tue, 11 Nov 2003 10:59:13 -0500
From: "David R. Oran" <oran@cisco.com>
To: speechsc@ietf.org
Message-ID: <88928402.1068548353@ORANLT.ietf58.ietf.org>
X-Mailer: Mulberry/3.1.0 (Win32)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Subject: [Speechsc] Speaker slides for Speechsc WG meeting
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>

If you are presenting at the speechsc meeting Thursday, please send your 
slides to me and Eric as soon as you have them ready. We'll get them put up 
somewhere and send a pointer out before the meeting.

Thanks, Dave.

------------------------
David R. Oran
Cisco Systems
7 Ladyslipper Lane
Acton, MA 01720
Office: +1 978 264 2048
VoIP: +1 408 571 4576
Email: oran@cisco.com




From exim@www1.ietf.org  Wed Nov 12 17:46:15 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA14043
	for <speechsc-archive@odin.ietf.org>; Wed, 12 Nov 2003 17:46:14 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AK3kL-0006At-Vs
	for speechsc-archive@odin.ietf.org; Wed, 12 Nov 2003 17:45:58 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hACMjvh2023729
	for speechsc-archive@odin.ietf.org; Wed, 12 Nov 2003 17:45:57 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AK3kL-0006Ae-RT
	for speechsc-web-archive@optimus.ietf.org; Wed, 12 Nov 2003 17:45:57 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA13978
	for <speechsc-web-archive@ietf.org>; Wed, 12 Nov 2003 17:45:44 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AK3kJ-0000Fw-00
	for speechsc-web-archive@ietf.org; Wed, 12 Nov 2003 17:45:55 -0500
Received: from manatick.foretec.com ([4.17.168.5] helo=manatick)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AK3ir-00009c-00
	for speechsc-web-archive@ietf.org; Wed, 12 Nov 2003 17:44:25 -0500
Received: from [132.151.6.22] (helo=optimus.ietf.org)
	by manatick with esmtp (Exim 4.24)
	id 1AK3Ww-0006iq-Pc
	for speechsc-web-archive@ietf.org; Wed, 12 Nov 2003 17:32:06 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AK3Wr-0004vl-11; Wed, 12 Nov 2003 17:32:01 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AK3W2-0004vH-7H
	for speechsc@optimus.ietf.org; Wed, 12 Nov 2003 17:31:10 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA13512
	for <speechsc@ietf.org>; Wed, 12 Nov 2003 17:30:56 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AK3Vz-00004g-00
	for speechsc@ietf.org; Wed, 12 Nov 2003 17:31:07 -0500
Received: from goalie.snowshore.com ([216.57.133.4] helo=webshield.office.snowshore.com)
	by ietf-mx with smtp (Exim 4.12)
	id 1AK3Vz-00001X-00
	for speechsc@ietf.org; Wed, 12 Nov 2003 17:31:07 -0500
Received: from zoe.office.snowshore.com(192.168.1.172) by webshield.office.snowshore.com via csmap 
	 id 30174; Wed, 12 Nov 2003 17:37:25 -0500 (EST)
X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Date: Wed, 12 Nov 2003 17:30:36 -0500
Message-ID: <4A3384433CE2AB46A63468CB207E209D4F69D9@zoe.office.snowshore.com>
Thread-Topic: Call for Scribes
Thread-Index: AcOpbF6LFWF2mpXYTG6DIcg6sNsV/g==
From: "Eric Burger" <eburger@snowshore.com>
To: "IETF SPEECHSC (E-mail)" <speechsc@ietf.org>
Subject: [Speechsc] Call for Scribes
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>

Please send me or Dave a note if you will be in Minneapolis and are
willing to take notes (easy - just need to take down decisions) or jabber
(relatively easy - "he said, she said").

The Jabber information is at http://www.xmpp.org/ietf-chat.html .
Our server is ietf.xmpp.org; the room is speechsc.
We will be in the "room" starting at 1pm CT.





From exim@www1.ietf.org  Thu Nov 13 15:45:25 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA19476
	for <speechsc-archive@odin.ietf.org>; Thu, 13 Nov 2003 15:45:25 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AKOKw-0004qs-1Y
	for speechsc-archive@odin.ietf.org; Thu, 13 Nov 2003 15:45:06 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hADKj62s018644
	for speechsc-archive@odin.ietf.org; Thu, 13 Nov 2003 15:45:06 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AKOKv-0004qI-Nv
	for speechsc-web-archive@optimus.ietf.org; Thu, 13 Nov 2003 15:45:05 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA19437
	for <speechsc-web-archive@ietf.org>; Thu, 13 Nov 2003 15:44:54 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AKOKu-0004qm-00
	for speechsc-web-archive@ietf.org; Thu, 13 Nov 2003 15:45:04 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AKOKt-0004qj-00
	for speechsc-web-archive@ietf.org; Thu, 13 Nov 2003 15:45:03 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AKOKt-0004oA-Dn; Thu, 13 Nov 2003 15:45:03 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AKOKY-0004n1-N3
	for speechsc@optimus.ietf.org; Thu, 13 Nov 2003 15:44:42 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA19391
	for <speechsc@ietf.org>; Thu, 13 Nov 2003 15:44:30 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AKOKW-0004pq-00
	for speechsc@ietf.org; Thu, 13 Nov 2003 15:44:40 -0500
Received: from albatross-ext.wise.edt.ericsson.se ([193.180.251.49])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AKOKU-0004pM-00
	for speechsc@ietf.org; Thu, 13 Nov 2003 15:44:39 -0500
Received: from esealnt610.al.sw.ericsson.se ([153.88.254.120])
	by albatross-ext.wise.edt.ericsson.se (8.12.10/8.12.10/WIREfire-1.8) with ESMTP id hADKhvI2022054
	for <speechsc@ietf.org>; Thu, 13 Nov 2003 21:43:57 +0100 (MET)
Received: from ericsson.com (permit153.er.ki.sw.ericsson.se [147.214.97.153]) by esealnt610.al.sw.ericsson.se with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2657.72)
	id WHXDYWRT; Thu, 13 Nov 2003 21:44:32 +0100
Message-ID: <3FB3ECE8.8070203@ericsson.com>
Date: Thu, 13 Nov 2003 21:43:20 +0100
X-Sybari-Space: 00000000 00000000 00000000 00000000
From: Magnus Westerlund <magnus.westerlund@ericsson.com>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: speechsc@ietf.org
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Speechsc] ABNF in draft-ietf-speechsc-mrcpv2-00.txt
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>

Hi,

Regarding the usage of ABNF in the spec: I would strongly encourage that 
all message definition ABNF be collected in a syntax chapter. I think 
both SIP and RTSP have come to the conclusion that this is best. 
Collecting things together improves the readability of the formal 
syntax, ensures consistency, and eases implementation. However, it is 
recommended that one reference the ABNF in the appropriate places when 
one talks about rules regarding a header or other elements.

Cheers

Magnus Westerlund





From exim@www1.ietf.org  Mon Nov 17 17:51:22 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA11087
	for <speechsc-archive@odin.ietf.org>; Mon, 17 Nov 2003 17:51:22 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1ALsD3-0004wW-R0
	for speechsc-archive@odin.ietf.org; Mon, 17 Nov 2003 17:51:05 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hAHMp5O6018996
	for speechsc-archive@odin.ietf.org; Mon, 17 Nov 2003 17:51:05 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1ALsD3-0004wJ-O1
	for speechsc-web-archive@optimus.ietf.org; Mon, 17 Nov 2003 17:51:05 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA11072
	for <speechsc-web-archive@ietf.org>; Mon, 17 Nov 2003 17:50:52 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1ALsD1-0000gg-00
	for speechsc-web-archive@ietf.org; Mon, 17 Nov 2003 17:51:03 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1ALsD0-0000gc-00
	for speechsc-web-archive@ietf.org; Mon, 17 Nov 2003 17:51:02 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1ALsCz-0004vx-CW; Mon, 17 Nov 2003 17:51:01 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1ALsCv-0004vU-1c
	for speechsc@optimus.ietf.org; Mon, 17 Nov 2003 17:50:57 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA11064
	for <speechsc@ietf.org>; Mon, 17 Nov 2003 17:50:43 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1ALsCs-0000gT-00
	for speechsc@ietf.org; Mon, 17 Nov 2003 17:50:54 -0500
Received: from goalie.snowshore.com ([216.57.133.4] helo=webshield.office.snowshore.com)
	by ietf-mx with smtp (Exim 4.12)
	id 1ALsCr-0000gH-00
	for speechsc@ietf.org; Mon, 17 Nov 2003 17:50:53 -0500
Received: from zoe.office.snowshore.com(192.168.1.172) by webshield.office.snowshore.com via csmap 
	 id 20359; Mon, 17 Nov 2003 17:56:53 -0500 (EST)
X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Date: Mon, 17 Nov 2003 17:50:24 -0500
Message-ID: <4A3384433CE2AB46A63468CB207E209D59E8D4@zoe.office.snowshore.com>
Thread-Topic: Draft Minutes
Thread-Index: AcOtXS/p9IApJVunSTihkZr3MWuFxA==
From: "Eric Burger" <eburger@snowshore.com>
To: "IETF SPEECHSC (E-mail)" <speechsc@ietf.org>
Subject: [Speechsc] Draft Minutes
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>

Thanks to Jeff, our trusty scribe!

Please post comments by 12/1.






Dave: The requirements document passed review; it's in editorial review
now.

Sarvi: Dan Burnett's speaker identification/verification draft is out
now.  It's geared toward MRCP v1, and will be evolved into MRCP v2.

Sarvi: Open Issues:

 * Proxy support: Call flows are needed.  Currently, we're looking at
   using a relay to front-end requests.

 * When to start/stop media: The recognizer should expect the media to
   start flowing when it receives the recognize request, and shouldn't
   buffer anything it receives beforehand.

 * Recording audio:
    Two types (definitions from Dan Burnett):
      resource-related: everything the recognizer hears and/or
        everything it thinks is speech
      time-based: record some period of the conversation, independent of
        the recognition.

    It was agreed that it is outside the scope of MRCP to record the
    conversation. It is, however, desirable to have a "record" resource
    which takes audio input from the client, "puts a handle on it" and
    makes it available to the client, possibly applying some "speechish"
    operations (end pointing, etc.).

 * Resource types: there is potentially a need to identify/classify
   resources for allocation (e.g. this "recognizer" can only recognize
   DTMF input, not speech, or this "TTS engine" can only play audio and
   doesn't do synthesis).  SIP Callee capabilities will be
   investigated/discussed on the mailing list to determine whether they
   are sufficient.

 * NLSML versus EMMA: As we won't know the status of the EMMA
   specification until the time we publish, we'll leave a placeholder in
   our document until it's time to publish and make a decision then.

 * Multiple media streams: There's a need for only one media line.

 * Multiple speak requests: Is it desirable to be able to pause an
   active speak request, execute a new speak request, and then resume
   the original request?  Yes, it's potentially useful, but it can be
   accomplished on the client side by allocating two separate TTS
   resources, pausing one, starting the other, and then resuming the
   first one when the second one finishes.
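
The client-side workaround in the last item (two TTS resources, pause
one, run the other, resume the first) can be sketched roughly as follows.
This is a toy model in Python, not an MRCP client: the class
TtsResource, the function interrupt_with, and the resource names are all
made up for illustration, and finish() merely stands in for waiting on
the second prompt's completion.

```python
class TtsResource:
    """Toy stand-in for one allocated TTS resource (not a real MRCP API)."""

    def __init__(self, name, events):
        self.name = name
        self.state = "idle"
        self.events = events  # shared list recording the order of operations

    def _record(self, what):
        self.events.append(f"{self.name} {what}")

    def speak(self, text):
        self.state = "speaking"
        self._record(f"SPEAK {text!r}")

    def pause(self):
        if self.state == "speaking":
            self.state = "paused"
            self._record("PAUSE")

    def resume(self):
        if self.state == "paused":
            self.state = "speaking"
            self._record("RESUME")

    def finish(self):
        self.state = "idle"
        self._record("SPEAK-COMPLETE")


def interrupt_with(primary, secondary, text):
    """Pause the primary prompt, play `text` on the secondary, then resume."""
    primary.pause()
    secondary.speak(text)
    secondary.finish()  # stands in for waiting until the second prompt ends
    primary.resume()


events = []
main_tts = TtsResource("tts-1", events)
alert_tts = TtsResource("tts-2", events)
main_tts.speak("a long announcement")
interrupt_with(main_tts, alert_tts, "a short alert")
```

After this runs, the first resource is speaking again and the shared
events list shows the pause, the interleaved prompt, and the resume in
that order.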


Dan Burnett on Speaker Identification and Verification:

 - joint proposal from Nuance and Intervoice submitted recently.  In
   addition to SI/SV, the document covers:

    - speaker-enrolled grammars: use recorded audio to make a grammar;
      well-suited for voice dialing applications
    - hotword recognition: recognizer listens for hotword(s) in a
      conversation, doing nothing until it actually recognizes something
      (as opposed to timing out, throwing a "nomatch", etc.)

  SI/SV discussion:

    Two questions so far:

      1. Why buffering?  Can the audio from a captured recognizer
         session (when recognition is done with save-waveform=true) be
         used for verification, by passing the verification engine
         handle(s) to the recorded audio?  We should be able to
         eliminate the pause/resume methods.

      2. Is there a need for some sort of registry for returned info?
         Some verifier/identifier might return gender information, or
         language information; common categories would be beneficial.


Milestones:

  Slightly behind schedule currently.  A draft will be submitted
  sometime after the next IETF meeting (March 2004?)


_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc



From exim@www1.ietf.org  Mon Nov 17 17:54:47 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA11200
	for <speechsc-archive@odin.ietf.org>; Mon, 17 Nov 2003 17:54:47 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1ALsGM-00056d-Rs
	for speechsc-archive@odin.ietf.org; Mon, 17 Nov 2003 17:54:30 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hAHMsUwU019621
	for speechsc-archive@odin.ietf.org; Mon, 17 Nov 2003 17:54:30 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1ALsGM-00056O-Nh
	for speechsc-web-archive@optimus.ietf.org; Mon, 17 Nov 2003 17:54:30 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA11151
	for <speechsc-web-archive@ietf.org>; Mon, 17 Nov 2003 17:54:16 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1ALsGJ-0000iT-00
	for speechsc-web-archive@ietf.org; Mon, 17 Nov 2003 17:54:27 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1ALsGJ-0000ht-00
	for speechsc-web-archive@ietf.org; Mon, 17 Nov 2003 17:54:27 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1ALsEv-00053j-1B; Mon, 17 Nov 2003 17:53:01 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1ALsEX-00053G-Ug
	for speechsc@optimus.ietf.org; Mon, 17 Nov 2003 17:52:37 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA11135
	for <speechsc@ietf.org>; Mon, 17 Nov 2003 17:52:24 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1ALsEV-0000hj-00
	for speechsc@ietf.org; Mon, 17 Nov 2003 17:52:35 -0500
Received: from goalie.snowshore.com ([216.57.133.4] helo=webshield.office.snowshore.com)
	by ietf-mx with smtp (Exim 4.12)
	id 1ALsEU-0000hM-00
	for speechsc@ietf.org; Mon, 17 Nov 2003 17:52:34 -0500
Received: from zoe.office.snowshore.com(192.168.1.172) by webshield.office.snowshore.com via csmap 
	 id 20375; Mon, 17 Nov 2003 17:58:34 -0500 (EST)
X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Date: Mon, 17 Nov 2003 17:52:05 -0500
Message-ID: <4A3384433CE2AB46A63468CB207E209D59E8D5@zoe.office.snowshore.com>
Thread-Topic: Presentations and other Useful Stuff
Thread-Index: AcOtXWwqvgNLo/zTScGw3752kVYGnA==
From: "Eric Burger" <eburger@snowshore.com>
To: "IETF SPEECHSC (E-mail)" <speechsc@ietf.org>
Subject: [Speechsc] Presentations and other Useful Stuff
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>

The Supplemental SPEECHSC web site is up-to-date.

http://flyingfox.snowshore.com/i-d/speechsc/


_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc



From exim@www1.ietf.org  Mon Nov 24 11:24:25 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id LAA14814
	for <speechsc-archive@odin.ietf.org>; Mon, 24 Nov 2003 11:24:25 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AOJVS-0005yC-5i
	for speechsc-archive@odin.ietf.org; Mon, 24 Nov 2003 11:24:10 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hAOGOAmE022942
	for speechsc-archive@odin.ietf.org; Mon, 24 Nov 2003 11:24:10 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AOJVS-0005xx-12
	for speechsc-web-archive@optimus.ietf.org; Mon, 24 Nov 2003 11:24:10 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id LAA14801
	for <speechsc-web-archive@ietf.org>; Mon, 24 Nov 2003 11:23:55 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AOJVR-0007fV-00
	for speechsc-web-archive@ietf.org; Mon, 24 Nov 2003 11:24:09 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AOJVQ-0007fP-00
	for speechsc-web-archive@ietf.org; Mon, 24 Nov 2003 11:24:08 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AOJVJ-0005wl-63; Mon, 24 Nov 2003 11:24:01 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AOJUU-0005vj-9D
	for speechsc@optimus.ietf.org; Mon, 24 Nov 2003 11:23:10 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id LAA14773
	for <speechsc@ietf.org>; Mon, 24 Nov 2003 11:22:55 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AOJUT-0007fC-00
	for speechsc@ietf.org; Mon, 24 Nov 2003 11:23:09 -0500
Received: from goalie.snowshore.com ([216.57.133.4] helo=webshield.office.snowshore.com)
	by ietf-mx with smtp (Exim 4.12)
	id 1AOJUS-0007ey-00
	for speechsc@ietf.org; Mon, 24 Nov 2003 11:23:09 -0500
Received: from zoe.office.snowshore.com(192.168.1.172) by webshield.office.snowshore.com via csmap 
	 id 31132; Mon, 24 Nov 2003 11:28:43 -0500 (EST)
X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Date: Mon, 24 Nov 2003 11:22:49 -0500
Message-ID: <4A3384433CE2AB46A63468CB207E209D59E90B@zoe.office.snowshore.com>
Thread-Topic: Any changes to minutes?
Thread-Index: AcOypzOL5Ma+LktMSx6JyOcY1pq7gw==
From: "Eric Burger" <eburger@snowshore.com>
To: "IETF SPEECHSC (E-mail)" <speechsc@ietf.org>
Subject: [Speechsc] Any changes to minutes?
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>

If so, speak up now!


_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc



From exim@www1.ietf.org  Tue Nov 25 08:28:20 2003
Received: from optimus.ietf.org ([132.151.1.19])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id IAA16821
	for <speechsc-archive@odin.ietf.org>; Tue, 25 Nov 2003 08:28:20 -0500 (EST)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AOdEZ-00088T-RV
	for speechsc-archive@odin.ietf.org; Tue, 25 Nov 2003 08:28:03 -0500
Received: (from exim@localhost)
	by www1.ietf.org (8.12.8/8.12.8/Submit) id hAPDS3Ki031269
	for speechsc-archive@odin.ietf.org; Tue, 25 Nov 2003 08:28:03 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AOdEZ-00088G-Nz
	for speechsc-web-archive@optimus.ietf.org; Tue, 25 Nov 2003 08:28:03 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id IAA16818
	for <speechsc-web-archive@ietf.org>; Tue, 25 Nov 2003 08:27:50 -0500 (EST)
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AOdEY-0004rg-00
	for speechsc-web-archive@ietf.org; Tue, 25 Nov 2003 08:28:02 -0500
Received: from [132.151.1.19] (helo=optimus.ietf.org)
	by ietf-mx with esmtp (Exim 4.12)
	id 1AOdEY-0004rd-00
	for speechsc-web-archive@ietf.org; Tue, 25 Nov 2003 08:28:02 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AOdEX-00087u-Bf; Tue, 25 Nov 2003 08:28:01 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by optimus.ietf.org with esmtp (Exim 4.20)
	id 1AOdBu-000845-Lk
	for speechsc@optimus.ietf.org; Tue, 25 Nov 2003 08:25:18 -0500
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id IAA16735
	for <speechsc@ietf.org>; Tue, 25 Nov 2003 08:25:05 -0500 (EST)
From: Sailesh.Sathish@nokia.com
Received: from ietf-mx ([132.151.6.1])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AOdBt-0004oF-00
	for speechsc@ietf.org; Tue, 25 Nov 2003 08:25:17 -0500
Received: from mgw-x1.nokia.com ([131.228.20.21])
	by ietf-mx with esmtp (Exim 4.12)
	id 1AOdBs-0004oB-00
	for speechsc@ietf.org; Tue, 25 Nov 2003 08:25:16 -0500
Received: from esvir05nok.ntc.nokia.com (esvir05nokt.ntc.nokia.com [172.21.143.37])
	by mgw-x1.nokia.com (Switch-2.2.8/Switch-2.2.8) with ESMTP id hAPDPFA07276
	for <speechsc@ietf.org>; Tue, 25 Nov 2003 15:25:15 +0200 (EET)
Received: from esebh001.NOE.Nokia.com (unverified) by esvir05nok.ntc.nokia.com
 (Content Technologies SMTPRS 4.2.5) with ESMTP id <T661fe74297ac158f25423@esvir05nok.ntc.nokia.com> for <speechsc@ietf.org>;
 Tue, 25 Nov 2003 15:25:13 +0200
Received: from esebe003.NOE.Nokia.com ([172.21.138.39]) by esebh001.NOE.Nokia.com with Microsoft SMTPSVC(5.0.2195.6747);
	 Tue, 25 Nov 2003 15:25:13 +0200
Received: from trebe004.NOE.Nokia.com ([172.22.232.177]) by esebe003.NOE.Nokia.com with Microsoft SMTPSVC(5.0.2195.6747);
	 Tue, 25 Nov 2003 15:25:13 +0200
X-MimeOLE: Produced By Microsoft Exchange V6.0.6487.1
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Speechsc] MRCPv2 Comments
Date: Tue, 25 Nov 2003 15:25:12 +0200
Message-ID: <D338C3A6DFB6BE4EA06F1A7494CEBD4601ED64D5@trebe004.europe.nokia.com>
Thread-Topic: Re: [Speechsc] MRCPv2 Comments
Thread-Index: AcOzV45hCa11CZIMQhaELkEPvfGPLQ==
To: <speechsc@ietf.org>
Cc: <ramalingam.hariharan@nokia.com>
X-OriginalArrivalTime: 25 Nov 2003 13:25:13.0419 (UTC) FILETIME=[8EB3F1B0:01C3B357]
Sender: speechsc-admin@ietf.org
Errors-To: speechsc-admin@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>

Hi Mr Sarvi,

Thank you for the reply. Please find my comments below.

> With TTS there is no explicit content create or delete. There is inline
> content, which is stored on the server for the life of the SPEAK request.
> And there is URI-referenced content (either specified as a URI, or
> specified as a URI from inside inline content), which may be held in
> a local cache on the media server. The client uses the Cache-Directive
> fields to tell the media server what caching directive values to use in
> the media server's document cache for this session or request.
>
> None of these cache directives applies to the inline content itself;
> they apply only to content that is URI-referenced and hence that
> the media server would have to go and fetch and possibly cache.


Exactly. Wouldn't there be cases when the TTS inline content would have
to be stored beyond the SPEAK request timeline? As a crude example, say
the ASR couldn't recognize the speech data and the TTS would send back
a text saying "I didn't understand, please say again". This text might
have to be repeated multiple times within a page (or throughout the
session). The text could be inline and have a content-id associated
with it, through which subsequent SPEAK requests can invoke it. Also,
there could be other application-specific TTS text that the author
could write to give feedback to the user based on ASR response
metadata. The author can set multiple TTS texts (each with its own id)
at the server, and the client can later invoke a particular TTS text
based on an interaction condition (or based on metadata from the
recognition result).
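
To make the idea concrete, here is a rough sketch in Python of what such
a per-session store might look like. Everything in it is hypothetical:
PromptStore, choose_prompt, the ids "retry" and "searching", and the
status string "nomatch" are made up for illustration and are not taken
from the draft.

```python
class PromptStore:
    """Hypothetical server-side store of inline TTS text for one session."""

    def __init__(self):
        self._texts = {}

    def define(self, content_id, text):
        # Keep the inline text beyond the lifetime of a single SPEAK request.
        self._texts[content_id] = text

    def lookup(self, content_id):
        # A later SPEAK request would carry only the content-id.
        return self._texts[content_id]


def choose_prompt(recognition_status):
    """Pick a stored prompt id from (assumed) recognition result metadata."""
    return "retry" if recognition_status == "nomatch" else "searching"


store = PromptStore()
store.define("retry", "I didn't understand, please say again")
store.define("searching", "Please wait while the system searches for best quotes")

text_to_speak = store.lookup(choose_prompt("nomatch"))
```

The point of the sketch is only that the text is sent once and then
referenced by id; how the ids are scoped and expired would be up to the
protocol.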

> 2. No INTERMEDIATE_RECOGNITION_RESULT event sent from the recognizer,
> and no provision for setting this with the message fields (e.g.
> SET-PARAMS)
>
> Can you elaborate on the need for this? Are you referring to recognition
> in stages (recognizer, NL, etc.) or are you referring to partial
> recognition? If so, could you provide an example of its usage?

I am referring to partial results when they become available. For
example, here is an interaction for a car sales application:

user: get me the best quotes for minivans in Manhattan area
system: what models are you interested in?
user: Dodge, Ford, Mitsubishi, Audi, Volkswagen, BMW
system: finding quotes for minivan
.. Dodge
.. Ford
.. Mitsubishi
.. Audi
.. Volkswagen
.. BMW

Please wait while the system searches for best quotes

In the interaction above, the ASR grammar would be written in such a way
as to allow the user to make multiple selections (even multiple
selections in mixed initiative). It would then be desirable to give
feedback to the user as each model gets recognized. This result would
be sent back to the client when a match in an active grammar is found.
This could be coupled with a Speech_Timeout_Value after which the
recognizer would decide that the user has made all the selections.
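
A minimal sketch of this behaviour in Python, under stated assumptions:
the recognizer is reduced to a generator over already-timed words, the
event labels "INTERMEDIATE" and "FINAL" are placeholders rather than
protocol event names, and speech_timeout plays the role of the
Speech_Timeout_Value mentioned above.

```python
def recognize(timed_words, grammar, speech_timeout):
    """Yield ("INTERMEDIATE", word) per grammar match, then ("FINAL", matches).

    timed_words: iterable of (timestamp_seconds, word) pairs in time order.
    Recognition ends at the first silence gap longer than speech_timeout.
    """
    matches = []
    last_time = None
    for when, word in timed_words:
        if last_time is not None and when - last_time > speech_timeout:
            break  # the user is assumed to have finished selecting
        last_time = when
        if word in grammar:
            matches.append(word)
            yield ("INTERMEDIATE", word)  # feedback as soon as a model matches
    yield ("FINAL", list(matches))


grammar = {"Dodge", "Ford", "BMW"}
heard = [(0.0, "Dodge"), (1.0, "um"), (1.5, "Ford"), (6.0, "BMW")]
results = list(recognize(heard, grammar, speech_timeout=2.0))
```

Here the client would get one intermediate event each for Dodge and
Ford; the 4.5-second gap before "BMW" exceeds the timeout, so the final
result contains only the first two models.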

Thank you.

Regards,

Sailesh.
Sailesh Kumar Sathish      GSM +358 50 4835679
Nokia Research Center      Desk +358 3 272 5679
P.O.Box 100, Tampere      sailesh.sathish@nokia.com
Finland -33721


_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc



