From speechsc-bounces@ietf.org Tue May 01 16:51:24 2007
Return-path: <speechsc-bounces@ietf.org>
Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com)
	by megatron.ietf.org with esmtp (Exim 4.43)
	id 1HizJi-0002h2-M8; Tue, 01 May 2007 16:51:22 -0400
Received: from speechsc by megatron.ietf.org with local (Exim 4.43)
	id 1HizJi-0002gx-05 for speechsc-confirm+ok@megatron.ietf.org;
	Tue, 01 May 2007 16:51:22 -0400
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.43) id 1HizJh-0002gp-MX
	for speechsc@ietf.org; Tue, 01 May 2007 16:51:21 -0400
Received: from smtp02.lnh.mail.rcn.net ([207.172.157.102])
	by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1HizJh-0000Dg-5a
	for speechsc@ietf.org; Tue, 01 May 2007 16:51:21 -0400
Received: from mr08.lnh.mail.rcn.net ([207.172.157.28])
	by smtp02.lnh.mail.rcn.net with ESMTP; 01 May 2007 16:51:21 -0400
Received: from smtp01.lnh.mail.rcn.net (smtp01.lnh.mail.rcn.net [207.172.4.11])
	by mr08.lnh.mail.rcn.net (MOS 3.8.3-GA) with ESMTP id IPM02929;
	Tue, 1 May 2007 16:51:20 -0400 (EDT)
Received: from 24-148-43-202.grn-bsr1.chi-grn.il.cable.rcn.com (HELO
	JMarkowitz) ([24.148.43.202])
	by smtp01.lnh.mail.rcn.net with ESMTP; 01 May 2007 16:51:15 -0400
Message-Id: <5u1lfe$c484a0@smtp01.lnh.mail.rcn.net>
From: "Judith Markowitz" <Judith@Jmarkowitz.com>
To: <speechsc@ietf.org>
Date: Tue, 1 May 2007 15:51:05 -0500
Organization: J. Markowitz, Consultants
MIME-Version: 1.0
X-Mailer: Microsoft Office Outlook, Build 11.0.6353
Thread-Index: AceMMm/pqAB190NcQSyGEhoI2X7j5w==
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028
X-Junkmail-Status: score=10/50, host=mr08.lnh.mail.rcn.net
X-Junkmail-SD-Raw: score=unknown,
	refid=str=0001.0A090209.4637A848.005E,ss=1,fgs=0,
	ip=207.172.4.11, so=2006-12-09 10:45:40,
	dmn=5.1.5/2006-01-31
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 9e1884bf469cda400682762c3e8d96c9
Subject: [Speechsc] proposed change to MRCP V2 draft 12 
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Judith@JMarkowitz.com
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Content-Type: multipart/mixed; boundary="===============1619187282=="
Errors-To: speechsc-bounces@ietf.org

This is a multi-part message in MIME format.

--===============1619187282==
Content-Type: multipart/alternative;
	boundary="----=_NextPart_000_0014_01C78C08.879DEF10"

This is a multi-part message in MIME format.

------=_NextPart_000_0014_01C78C08.879DEF10
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit

The introduction of Section 11 says

 

Speaker identification identifies the speaker among a set of users by
   matching against a set of voiceprints.  (This function is also called
   Multi-Verification).  Speaker identification can be performed on a
   small set of users or for a large population.  This capability is
   useful for applications where multiple users share the same access
   privileges to some data or application, but where the individual
   speaker must be uniquely identified from the group.  It is also
   useful for real time or post processing of recorded content to
   ascertain who was speaking when.  Speaker identification is also done
   in two phases, a designation phase and an execution phase.

 

This is incorrect. 

 

The definition of speaker identification should be 

 

Speaker identification is a biometric modality that uses an individual's
speech for identification without a claimed identity (see identity claim).
Identification is a task where the biometric system searches a database for
a reference model matching a submitted biometric sample (processed voice
sample) and if found, returns a corresponding identity.  A processed voice
sample is collected and compared to all the reference models in a database.
See Open Set and Closed Set definitions for more specifics.  Identification
is also referred to as one-to-many matching.  

 

Speaker identification is NOT a kind of multi-verification. It never
includes a claim of identity. Only speaker verification and its variants,
such as group authentication, involve a claim of identity. This is a
significant difference that is fundamental to the operation of all
biometrics.

 

The definition of speaker verification should be

 

Speaker verification is a biometric modality that uses an individual's
speech for verification of a claimed identity.  Verification is a task where
the biometric system uses an identity claim for reference matching against a
submitted biometric sample (processed voice sample) and if matched returns a
positive verification.  A processed voice sample is collected and compared
to one reference model or a small group of reference models in a database.
See identity claim for examples.  Verification is also referred to as
one-to-one matching.

 

The definition of group authentication/multi-verification should be 

 

A special case of verification where an identity claim is made which
pertains to a group. A biometric system uses a group of records in a
database for reference matching against a submitted biometric sample
(processed voice sample) and if matched returns a positive verification.
Also referred to as multi-verification.  The term small set or small group
identification has been used improperly as a synonym for group
authentication in the past.  See closed set identification for a definition
of small set identification.
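
As a rough illustration of how the three definitions differ (not part of
the glossary or of MRCP), here is a minimal Python sketch. It assumes
voiceprints are toy feature vectors compared by cosine similarity against
a fixed threshold; the names VOICEPRINTS, THRESHOLD, verify, group_verify
and identify are all hypothetical and chosen only for this example.

```python
import math

# Hypothetical enrolled voiceprints: speaker ID -> toy feature vector.
VOICEPRINTS = {
    "alice": [0.9, 0.1, 0.0],
    "bob":   [0.1, 0.9, 0.1],
    "carol": [0.0, 0.2, 0.9],
}
THRESHOLD = 0.8  # assumed acceptance threshold, for illustration only


def similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def verify(sample, claimed_id):
    """Verification (one-to-one): score only the claimed identity's model."""
    return similarity(sample, VOICEPRINTS[claimed_id]) >= THRESHOLD


def group_verify(sample, claimed_group):
    """Group authentication: the claim names a group; any member may match."""
    return any(verify(sample, member) for member in claimed_group)


def identify(sample):
    """Identification (one-to-many): no claim, search every enrolled model.

    Returns None when no model reaches the threshold (open-set behaviour).
    """
    best_id = max(VOICEPRINTS, key=lambda sid: similarity(sample, VOICEPRINTS[sid]))
    if similarity(sample, VOICEPRINTS[best_id]) >= THRESHOLD:
        return best_id
    return None
```

Note that only verify and group_verify take a claim as input; identify
takes the sample alone, which is the distinction the definitions above
turn on.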

 

These definitions come from the soon-to-be-published SIV Glossary of the
VoiceXML Forum's Speaker Biometrics Committee and correspond to definitions
in biometric sources.

 

I don't think these changes will affect the functionality of MRCP - mostly
because the specification focuses primarily on verification. That is
reasonable since most applications are verification ones as well. 

 

Judith Markowitz, Co-chair, VoiceXML Speaker Biometrics Group

 

J. Markowitz, Consultants

5801 N. Sheridan Rd, #19A

Chicago, IL 60660

773-769-9243

judith@jmarkowitz.com

 


------=_NextPart_000_0014_01C78C08.879DEF10--



--===============1619187282==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc
Supplemental web site:
<http://www.standardstrack.com/ietf/speechsc>
--===============1619187282==--





From speechsc-bounces@ietf.org Tue May 01 18:14:29 2007
Return-path: <speechsc-bounces@ietf.org>
Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com)
	by megatron.ietf.org with esmtp (Exim 4.43)
	id 1Hj0c9-00018q-9D; Tue, 01 May 2007 18:14:29 -0400
Received: from speechsc by megatron.ietf.org with local (Exim 4.43)
	id 1Hj0c8-00013B-7J for speechsc-confirm+ok@megatron.ietf.org;
	Tue, 01 May 2007 18:14:28 -0400
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.43) id 1Hj0c7-00010O-RM
	for speechsc@ietf.org; Tue, 01 May 2007 18:14:27 -0400
Received: from hoemail1.lucent.com ([192.11.226.161])
	by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1Hj0c6-0005yh-13
	for speechsc@ietf.org; Tue, 01 May 2007 18:14:27 -0400
Received: from homail.ho.lucent.com (h135-17-192-10.lucent.com [135.17.192.10])
	by hoemail1.lucent.com (8.13.8/IER-o) with ESMTP id l41MEIXJ008459;
	Tue, 1 May 2007 17:14:21 -0500 (CDT)
Received: from [135.104.37.57] (qzhou-c1.research.bell-labs.com
	[135.104.37.57])
	by homail.ho.lucent.com (8.11.7p1+Sun/8.12.11) with ESMTP id
	l41MEIn08800; Tue, 1 May 2007 18:14:18 -0400 (EDT)
Message-ID: <4637BBB6.4090705@alcatel-lucent.com>
Date: Tue, 01 May 2007 18:14:14 -0400
From: Qiru Zhou <qzhou@alcatel-lucent.com>
Organization: Bell Labs, Alcatel-Lucent
User-Agent: Thunderbird 1.5.0.10 (Windows/20070221)
MIME-Version: 1.0
To: Judith@Jmarkowitz.com
Subject: Re: [Speechsc] proposed change to MRCP V2 draft 12
References: <5u1lfe$c484a0@smtp01.lnh.mail.rcn.net>
In-Reply-To: <5u1lfe$c484a0@smtp01.lnh.mail.rcn.net>
Content-Type: multipart/mixed; boundary="------------000509070607020103060104"
X-Scanned-By: MIMEDefang 2.57 on 192.11.226.41
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 789c141a303c09204b537a4078e2a63f
Cc: speechsc@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: qzhou@research.bell-labs.com
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Errors-To: speechsc-bounces@ietf.org

This is a multi-part message in MIME format.
--------------000509070607020103060104
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: quoted-printable
X-MIME-Autoconverted: from 8bit to quoted-printable by hoemail1.lucent.com id
	l41MEIXJ008459

Scientifically, the definitions of SI and SV are very clear and simple:

SV is the process of verifying whether an unknown speaker is the person
as claimed; i.e., a yes-no hypothesis testing problem.

SI is the process of associating an unknown speaker with a member in
a population; i.e., a multiple-choice classification problem.

The above definitions come from our publication and previous work cited
in the references of the article:
"Recent advancements in automatic speaker authentication"
Qi Li, Biing-Hwang Juang, Chin-Hui Lee, Qiru Zhou, F. K. Soong
Lucent Technol., AT&T Bell Labs., Murray Hill, NJ, USA; Mar. 1999

SV technologies may be used for SI, for example by running multiple speaker
verification processes to determine whether a speaker is in scope. I guess
this is where the phrase multi-verification came from (I am not familiar
with it). But this is an implementation issue.

Since MRCP is aimed at communication engineering rather than science, I
think the document needs some tuning of its definitions to address
audiences other than researchers.

-- Qiru Zhou

Judith Markowitz wrote:
>
> The introduction of Section 11 says
>
> Speaker identification identifies the speaker among a set of users by
>    matching against a set of voiceprints.  (This function is also called
>    Multi-Verification).  Speaker identification can be performed on a
>    small set of users or for a large population.  This capability is
>    useful for applications where multiple users share the same access
>    privileges to some data or application, but where the individual
>    speaker must be uniquely identified from the group.  It is also
>    useful for real time or post processing of recorded content to
>    ascertain who was speaking when.  Speaker identification is also done
>    in two phases, a designation phase and an execution phase.
>
> This is incorrect.
>
> The definition of speaker identification should be
>
> Speaker identification is a biometric modality that uses an
> individual's speech for identification *without* a /claimed identity
> (see identity claim)/. Identification is a task where the biometric
> system searches a database for a reference model /matching/ a
> submitted biometric sample (processed voice sample) and if found,
> returns a corresponding identity. A processed voice sample is
> collected and compared to all the reference models in a database. See
> Open Set and Closed Set definitions for more specifics. Identification
> is also referred to as one-to-many matching.
>
> Speaker identification is NOT a kind of multi-verification. It never
> includes a claim of identity. Only speaker verification and variants
> of speaker verifications, such as group authentication involve a claim
> of identity. This is a significant difference that is fundamental to
> the operation of all biometrics..
>
> The definition of speaker verification should be
>
> Speaker verification is a biometric modality that uses an individual's
> speech for verification of a claimed identity. Verification is a task
> where the biometric system uses an /identity claim/ for reference
> /matching/ against a submitted biometric sample (processed voice
> sample) and if matched returns a positive verification. A processed
> voice sample is collected and compared to one reference model or a
> small group of reference models in a database. See /identity claim/
> for examples. Verification is also referred to as one-to-one matching.
>
> The definition of group authentication/multi-verification should be
>
> A special case of verification where an identity claim is made which
> pertains to a group. A biometric system uses a group of records in a
> database for reference matching against a submitted biometric sample
> (processed voice sample) and if matched returns a positive
> verification. Also referred to as multi-verification. The term small
> set or small group identification has been used improperly as a
> synonym for group authentication in the past. See closed set
> identification for a definition of small set identification
>
> These definitions come from the soon-to-be-published SIV Glossary that
> the VoiceXML Forum's Speaker Biometrics Committee and correspond to
> definitions in biometric sources.
>
> I don't think these changes will affect the functionality of MRCP -
> mostly because the specification focuses primarily on verification.
> That is reasonable since most applications are verification ones as well.
>
> Judith Markowitz, Co-chair, VoiceXML Speaker Biometrics Group
>
> J. Markowitz, Consultants
>
> 5801 N. Sheridan Rd, #19A
>
> Chicago, IL 60660
>
> 773-769-9243
>
> judith@jmarkowitz.com
>

--------------000509070607020103060104
Content-Type: text/x-vcard; charset=utf-8;
 name="qzhou.vcf"
Content-Disposition: attachment;
 filename="qzhou.vcf"
Content-Transfer-Encoding: 7bit

begin:vcard
fn:Qiru
n:Zhou;Qiru
org:Bell Labs, Alcatel-Lucent;Multimedia Processing Technology Research Dpt.
adr:2D518;;600 Mountain Avenue;Murray Hill;NJ;07974;USA
email;internet:qzhou@research.bell-labs.com
tel;work:+1 908 582 4562
tel;fax:+1 908 582 7038
x-mozilla-html:FALSE
url:http://www.bell-labs.com/org/1133/Research/
version:2.1
end:vcard


--------------000509070607020103060104
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc
Supplemental web site:
<http://www.standardstrack.com/ietf/speechsc>
--------------000509070607020103060104--





From speechsc-bounces@ietf.org Wed May 02 17:15:56 2007
Return-path: <speechsc-bounces@ietf.org>
Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com)
	by megatron.ietf.org with esmtp (Exim 4.43)
	id 1HjMB2-0006LF-13; Wed, 02 May 2007 17:15:56 -0400
Received: from speechsc by megatron.ietf.org with local (Exim 4.43)
	id 1HjMB1-0006HE-0z for speechsc-confirm+ok@megatron.ietf.org;
	Wed, 02 May 2007 17:15:55 -0400
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.43) id 1HjMB0-0006H6-Nb
	for speechsc@ietf.org; Wed, 02 May 2007 17:15:54 -0400
Received: from smtp02.lnh.mail.rcn.net ([207.172.157.102])
	by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1HjMAz-0007eM-86
	for speechsc@ietf.org; Wed, 02 May 2007 17:15:54 -0400
Received: from mr08.lnh.mail.rcn.net ([207.172.157.28])
	by smtp02.lnh.mail.rcn.net with ESMTP; 02 May 2007 17:15:53 -0400
Received: from smtp01.lnh.mail.rcn.net (smtp01.lnh.mail.rcn.net [207.172.4.11])
	by mr08.lnh.mail.rcn.net (MOS 3.8.3-GA) with ESMTP id IPP30937;
	Wed, 2 May 2007 17:15:48 -0400 (EDT)
Received: from 24-148-43-202.grn-bsr1.chi-grn.il.cable.rcn.com (HELO
	JMarkowitz) ([24.148.43.202])
	by smtp01.lnh.mail.rcn.net with ESMTP; 02 May 2007 17:15:42 -0400
Message-Id: <5u1lfe$c4pjbh@smtp01.lnh.mail.rcn.net>
From: "Judith Markowitz" <Judith@Jmarkowitz.com>
To: <speechsc@ietf.org>
Date: Wed, 2 May 2007 16:15:28 -0500
Organization: J. Markowitz, Consultants
MIME-Version: 1.0
X-Mailer: Microsoft Office Outlook, Build 11.0.6353
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028
Thread-Index: AceM/wJVRWpyV8KvT+6fFHYLBJGB/A==
X-Junkmail-Status: score=10/50, host=mr08.lnh.mail.rcn.net
X-Junkmail-SD-Raw: score=unknown,
	refid=str=0001.0A090208.4638FF88.00B7,ss=1,fgs=0,
	ip=207.172.4.11, so=2006-12-09 10:45:40,
	dmn=5.1.5/2006-01-31
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 515708a075ffdf0a79d1c83b601e2afd
Subject: [Speechsc] proposed change to definition of "speaker identification"
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Judith@JMarkowitz.com
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Content-Type: multipart/mixed; boundary="===============0645299269=="
Errors-To: speechsc-bounces@ietf.org

This is a multi-part message in MIME format.

--===============0645299269==
Content-Type: multipart/alternative;
	boundary="----=_NextPart_000_0027_01C78CD5.1A25DC40"

This is a multi-part message in MIME format.

------=_NextPart_000_0027_01C78CD5.1A25DC40
Content-Type: text/plain;
	charset="US-ASCII"
Content-Transfer-Encoding: 7bit

The definition of "speaker identification" is incorrect because it portrays
speaker identification as a kind of verification. In fact, it is entirely
different. It is important that the speech-processing industry be in step
with both the research community and the biometrics industry. The
recommended change is an editorial one and does not affect the
specification in any way other than to provide a proper definition.

 

Here is the proposed change:

 

"Speaker identification is the process of associating an unknown speaker
with a member in a population. It does not employ a claim of identity.
Speaker verification is the process of verifying whether an unknown speaker
is the person as claimed. It is a yes/no process performed in response to a
claim of identity."
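
One hedged way to write the two definitions above as decision rules,
assuming a similarity score s(x, \lambda) between an utterance x and a
speaker model \lambda, an acceptance threshold \theta, a claimed identity
c, and N enrolled models (all of these symbols are introduced here for
illustration and do not appear in the MRCP draft):

```latex
% Speaker verification: a yes/no decision about the claimed identity c
\text{SV:}\quad \text{accept if } s(x, \lambda_c) \ge \theta,
  \text{ otherwise reject}

% Speaker identification: no claim, choose among all N enrolled models
\text{SI:}\quad \hat{k} = \arg\max_{1 \le k \le N} s(x, \lambda_k)
```

The verification rule takes the claim c as an input, while the
identification rule takes only the utterance, which is the difference
the proposed text is meant to state.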

 

If the committee wants to include the concept of making a claim of
membership (which is now erroneously included as a kind of speaker
identification), you can add:

 

"When an individual claims to belong to a group (e.g., one of the owners of
a joint bank account), a group authentication is performed. This is generally
implemented as a kind of verification involving comparison with more than
one voice model. It is sometimes called 'multi-verification.'"

 

Thank you for your kind attention to this request. 

 

Judith Markowitz

Co-chair Speaker Biometrics Committee, VoiceXML Forum 

Invited Expert in Speaker Recognition to the W3C Voice Browser Working Group

VoiceXML Forum liaison to ANSI/INCITS/M1 (biometrics)

Editor, ANSI/INCITS data exchange format project for speaker biometrics

 

 

 

 


------=_NextPart_000_0027_01C78CD5.1A25DC40--



--===============0645299269==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc
Supplemental web site:
<http://www.standardstrack.com/ietf/speechsc>
--===============0645299269==--





From speechsc-bounces@ietf.org Tue May 08 08:42:36 2007
Return-path: <speechsc-bounces@ietf.org>
Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com)
	by megatron.ietf.org with esmtp (Exim 4.43)
	id 1HlP1T-0004HF-Eb; Tue, 08 May 2007 08:42:31 -0400
Received: from speechsc by megatron.ietf.org with local (Exim 4.43)
	id 1HlP1S-0004H2-0I for speechsc-confirm+ok@megatron.ietf.org;
	Tue, 08 May 2007 08:42:30 -0400
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.43) id 1HlP1R-0004Gu-Mw
	for speechsc@ietf.org; Tue, 08 May 2007 08:42:29 -0400
Received: from szxga01-in.huawei.com ([61.144.161.53])
	by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1HlP1O-0005va-Rc
	for speechsc@ietf.org; Tue, 08 May 2007 08:42:29 -0400
Received: from huawei.com (szxga01-in [172.24.2.3])
	by szxga01-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 1.25
	(built Mar
	3 2004)) with ESMTP id <0JHQ00IMR39CHI@szxga01-in.huawei.com> for
	speechsc@ietf.org; Tue, 08 May 2007 20:41:37 +0800 (CST)
Received: from huawei.com ([172.24.1.24])
	by szxga01-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 1.25
	(built Mar
	3 2004)) with ESMTP id <0JHQ00HQH39CIV@szxga01-in.huawei.com> for
	speechsc@ietf.org; Tue, 08 May 2007 20:41:36 +0800 (CST)
Received: from prasad1528 ([10.18.4.236])
	by szxml04-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 1.25
	(built Mar
	3 2004)) with ESMTPA id <0JHQ001GU398MB@szxml04-in.huawei.com> for
	speechsc@ietf.org; Tue, 08 May 2007 20:41:36 +0800 (CST)
Date: Tue, 08 May 2007 18:11:32 +0530
From: prasadkumbala <prasadkumbala@huawei.com>
To: speechsc@ietf.org
Message-id: <001401c7916e$3534a0a0$ec04120a@china.huawei.com>
Organization: htipl
MIME-version: 1.0
X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.3028
X-Mailer: Microsoft Office Outlook 11
Thread-index: AceRbjSsEtP2Kv3RTIm0mynHGPArAw==
X-Spam-Score: 0.2 (/)
X-Scan-Signature: 65215b440f7ab00ca9514de4a7a89926
Cc: ranjit@huawei.com
Subject: [Speechsc] cache-directive ABNF - Inconsistency
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: prasadkumbala@huawei.com
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Content-Type: multipart/mixed; boundary="===============0265231248=="
Errors-To: speechsc-bounces@ietf.org

This is a multi-part message in MIME format.

--===============0265231248==
Content-type: multipart/alternative;
	boundary="Boundary_(ID_dG+LSfV13gHmObuhHdWFiA)"

This is a multi-part message in MIME format.

--Boundary_(ID_dG+LSfV13gHmObuhHdWFiA)
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7BIT

Hi,

There is an inconsistency in the ABNF of the cache-directive header for
"max-stale".

In Section 6.2.13, the ABNF makes '=' (together with delta-seconds) optional
after "max-stale":

cache-directive     = "max-age" "=" delta-seconds
                       / "max-stale" [ "=" delta-seconds ]
                       / "min-fresh" "=" delta-seconds

Section 15 (ABNF Normative Definition), however, gives the rule below,
according to which '=' is mandatory after "max-stale" and only delta-seconds
is optional:

cache-directive  =    "max-age" "=" delta-seconds
                 /    "max-stale" "=" [ delta-seconds ]
                 /    "min-fresh" "=" delta-seconds
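
As a rough illustration of the difference (a minimal Python sketch, not taken
from the draft), the two "max-stale" alternatives can be written as regular
expressions and tried against a few sample directives. The names
SECTION_6_2_13, SECTION_15 and accepted_by are hypothetical, and delta-seconds
is assumed here to be simply one or more decimal digits.

```python
import re

# Assumption: delta-seconds is one or more decimal digits.
DELTA = r"[0-9]+"

# Section 6.2.13 form:  "max-stale" [ "=" delta-seconds ]
# ABNF quoted strings are case-insensitive, hence re.IGNORECASE.
SECTION_6_2_13 = re.compile(rf"max-stale(?:={DELTA})?", re.IGNORECASE)

# Section 15 form:  "max-stale" "=" [ delta-seconds ]
SECTION_15 = re.compile(rf"max-stale=(?:{DELTA})?", re.IGNORECASE)


def accepted_by(directive):
    """Return (matches Section 6.2.13 form, matches Section 15 form)."""
    return (
        SECTION_6_2_13.fullmatch(directive) is not None,
        SECTION_15.fullmatch(directive) is not None,
    )


if __name__ == "__main__":
    for sample in ("max-stale", "max-stale=", "max-stale=30"):
        print(sample, accepted_by(sample))
```

Under these assumptions, only "max-stale=30" is accepted by both forms: a bare
"max-stale" matches only the Section 6.2.13 rule, and "max-stale=" matches only
the Section 15 rule.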

Please address this issue.

 

Regards,

Prasad K


--Boundary_(ID_dG+LSfV13gHmObuhHdWFiA)--



--===============0265231248==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc
Supplemental web site:
<http://www.standardstrack.com/ietf/speechsc>
--===============0265231248==--





From speechsc-bounces@ietf.org Wed May 09 15:29:43 2007
Return-path: <speechsc-bounces@ietf.org>
Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com)
	by megatron.ietf.org with esmtp (Exim 4.43)
	id 1Hlrqz-0006o9-OD; Wed, 09 May 2007 15:29:37 -0400
Received: from speechsc by megatron.ietf.org with local (Exim 4.43)
	id 1Hlrqy-0006nz-AK for speechsc-confirm+ok@megatron.ietf.org;
	Wed, 09 May 2007 15:29:36 -0400
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.43) id 1Hlrqy-0006nr-0Q
	for speechsc@ietf.org; Wed, 09 May 2007 15:29:36 -0400
Received: from mx2.nuance.com ([198.71.73.25])
	by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1Hlrqx-0003Mh-4Q
	for speechsc@ietf.org; Wed, 09 May 2007 15:29:35 -0400
X-IronPort-AV: E=Sophos;i="4.14,510,1170651600"; d="scan'208,217";a="62011853"
Received: from unknown (HELO bn-exchbh1.nuance.com) ([10.1.4.190])
	by mx2.nuance.com with ESMTP; 09 May 2007 15:29:34 -0400
Received: from BN-EXCH01.nuance.com ([10.1.4.214]) by bn-exchbh1.nuance.com
	with Microsoft SMTPSVC(6.0.3790.1830); 
	Wed, 9 May 2007 15:29:39 -0400
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Subject: RE: [Speechsc] proposed change to definition of "speaker
	identification"
Date: Wed, 9 May 2007 15:29:37 -0400
Message-ID: <2AB5541EB33172459EE430FFB66B1EE9055E7B8F@BN-EXCH01.nuance.com>
In-Reply-To: <5u1lfe$c4pjbh@smtp01.lnh.mail.rcn.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: [Speechsc] proposed change to definition of "speaker
	identification"
Thread-Index: AceM/wJVRWpyV8KvT+6fFHYLBJGB/AFYkXXg
From: "Daniel C. Burnett" <Daniel.Burnett@nuance.com>
To: <Judith@JMarkowitz.com>,
	<speechsc@ietf.org>
X-OriginalArrivalTime: 09 May 2007 19:29:39.0514 (UTC)
	FILETIME=[62D001A0:01C79270]
X-Spam-Score: 0.0 (/)
X-Scan-Signature: dfec89f65e469387666e36fc1e4e3b22
Cc: 
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Content-Type: multipart/mixed; boundary="===============1456258238=="
Errors-To: speechsc-bounces@ietf.org

This is a multi-part message in MIME format.

--===============1456258238==
Content-class: urn:content-classes:message
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C79270.62AA4113"

This is a multi-part message in MIME format.

------_=_NextPart_001_01C79270.62AA4113
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Judith,

=20

Thank you for your suggestion.  You are right that the current
definition in MRCPv2 draft 12 of speaker identification is incorrect.
I'd like to give a bit of history before I propose a modification of
your suggestion.

=20

When we first introduced speaker authentication (verification and
identification) features into MRCP, the definition of speaker
identification was close, if not identical, to the one you provide
below.  Identification was clearly described as a no-claim process.

During work on the specification, it became clear that some technology
vendors provided, instead of identification per se, an ability to
perform verification, individually, against multiple voiceprints, where
a verification result was returned for each voiceprint.  This is, of
course, merely a convenience loop placed around single-voiceprint
verification.  Further, they allowed this to be done by giving a single
"voiceprint group identifier" to represent this group of voiceprints.
Notice that in both of these cases results may be returned for multiple
voiceprints.

=20

Those of us participating in the development of MRCPv2 decided that it
was possible to combine the three notions (verification against a single
voiceprint, identification, and verification individually against
multiple voiceprints) into a single API.  That API is the one we have
today.

=20

I think we should incorporate your explanations that provide proper,
industry-accepted terminology.

Since it is also important to provide an explanation that assists in
understanding the API, I think we should also state more clearly that we
have taken these separate processes and combined them within our API.

=20

Here is my counter-proposal.  Replace the following:

=20

=3D=3D=3D=3D=3D=3D=3D=3D=3D

Speaker identification identifies the speaker among a set of users by
matching against a set of voiceprints.  (This function is also called
Multi-Verification).  Speaker identification can be performed on a
small set of users or for a large population.  This capability is
useful for applications where multiple users share the same access
privileges to some data or application, but where the individual
speaker must be uniquely identified from the group.  It is also
useful for real time or post processing of recorded content to
ascertain who was speaking when.  Speaker identification is also done
in two phases, a designation phase and an execution phase.

=3D=3D=3D=3D=3D=3D=3D=3D=3D

=20

with

=20

=3D=3D=3D=3D=3D=3D=3D=3D=3D

Speaker identification is the process of associating an unknown speaker
with a member in a population. It does not employ a claim of identity.

When an individual claims to belong to a group (e.g., one of the owners
of a joint bank account) a group authentication is performed. This is
generally implemented as a kind of verification involving comparison
with more than one voice model. It is sometimes called
'multi-verification.'  If the individual speaker can be identified from
the group, this may be useful for applications where multiple users
share the same access privileges to some data or application.

Speaker identification and group authentication are also done in two
phases, a designation phase and an execution phase.

=20

Note that from a functionality standpoint identification can be thought
of as a special case of group authentication (if the individual is
identified) where the group is the entire population, although the
implementation of speaker identification may be different from the way
group authentication is performed.

=20

To accommodate single-voiceprint verification, verification against
multiple voiceprints, group authentication, and identification, this
specification provides a single set of methods that can take a list of
identifiers, called "voiceprint identifiers", and return a list of
identifiers, with a score for each representing how well the input
speech matched each identifier.  The input and output lists of
identifiers do not have to match, allowing a vendor-specific group
identifier to be used as input to indicate that identification is to be
performed.

=20

In this specification, the terms "Identification" and
"Multi-verification" are used to indicate that the input represents a
group (potentially the entire population)  and that results for multiple
voiceprints may be returned.

=3D=3D=3D=3D=3D=3D=3D=3D=3D

=20

=20

What do you think?

=20

-- dan

________________________________

From: Judith Markowitz [mailto:Judith@Jmarkowitz.com]=20
Sent: Wednesday, May 02, 2007 5:15 PM
To: speechsc@ietf.org
Subject: [Speechsc] proposed change to definition of "speaker
identification"

=20

The definition of "speaker identification" is incorrect because it
portrays speaker identification as a kind of verification. In fact, it
is entirely different. It is important that the speech-processing
industry be in step with both the research community and the biometrics
industry. The recommended change is an editorial one and does not affect
the specification in any way other than to provide a proper definition.=20

=20

Here is the proposed change:

=20

"Speaker identification is the process of associating an unknown speaker
with a member in a population. It does not employ a claim of identity.
Speaker verification is the process of verifying whether an unknown
speaker is the person as claimed. It is a yes/no process performed in
response to a claim of identity."

=20

If the committee wants to include the concept of making a claim of
membership (which is now erroneously included as a kind of speaker
identification), you can add:

=20

"When an individual claims to belong to a group (e.g., one of the owners
of a joint bank account) a group authentication is performed. This is
generally implemented as a kind of verification involving comparison
with more than one voice model. It is sometimes called
'multi-verification.'"

=20

Thank you for your kind attention to this request.=20

=20

Judith Markowitz

Co-chair Speaker Biometrics Committee, VoiceXML Forum=20

Invited Expert in Speaker Recognition to the W3C Voice Browser Working
Group

VoiceXML Forum liaison to ANSI/INCITS/M1 (biometrics)=20

Editor, ANSI/INCITS data exchange format project for speaker biometrics

=20

=20

=20

=20


------_=_NextPart_001_01C79270.62AA4113
Content-Type: text/html;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

<html xmlns:v=3D"urn:schemas-microsoft-com:vml" =
xmlns:o=3D"urn:schemas-microsoft-com:office:office" =
xmlns:w=3D"urn:schemas-microsoft-com:office:word" =
xmlns=3D"http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=3DContent-Type content=3D"text/html; =
charset=3Dus-ascii">
<meta name=3DGenerator content=3D"Microsoft Word 11 (filtered medium)">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]-->
<style>
<!--
 /* Font Definitions */
 @font-face
	{font-family:Tahoma;
	panose-1:2 11 6 4 3 5 4 4 2 4;}
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0in;
	margin-bottom:.0001pt;
	font-size:12.0pt;
	font-family:"Times New Roman";}
a:link, span.MsoHyperlink
	{color:blue;
	text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
	{color:purple;
	text-decoration:underline;}
span.EmailStyle17
	{mso-style-type:personal;
	font-family:Arial;
	color:windowtext;
	font-weight:normal;
	font-style:normal;
	text-decoration:none none;}
span.EmailStyle18
	{mso-style-type:personal-reply;
	font-family:Arial;
	color:navy;}
@page Section1
	{size:8.5in 11.0in;
	margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
	{page:Section1;}
-->
</style>

</head>

<body lang=3DEN-US link=3Dblue vlink=3Dpurple>

<div class=3DSection1>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Judith,<o:p></o:p></span></font></p>=


<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Thank you for your =
suggestion.&nbsp; You
are right that the current definition in MRCPv2 draft 12 of speaker
identification is incorrect.&nbsp; I&#8217;d like to give a bit of =
history
before I propose a modification of your =
suggestion.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>When we first introduced speaker
authentication (verification and identification) features into MRCP, the
definition of speaker identification was close, if not identical, to the =
one
you provide below.&nbsp; Identification was clearly described as a =
no-claim
process.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>During work on the specification, =
it
became clear that some technology vendors provided, instead of =
identification
per se, an ability to perform verification, individually, against =
multiple
voiceprints, where a verification result was returned for each =
voiceprint.&nbsp;
This is, of course, merely a convenience loop placed around =
single-voiceprint
verification.&nbsp; Further, they allowed this to be done by giving a =
single &#8220;voiceprint
group identifier&#8221; to represent this group of voiceprints.&nbsp; =
Notice
that in both of these cases results may be returned for multiple =
voiceprints.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Those of us participating in the
development of MRCPv2 decided that it was possible to combine the three =
notions
(verification against a single voiceprint, identification, and =
verification
individually against multiple voiceprints) into a single API.&nbsp; That =
API is
the one we have today.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>I think we should incorporate your =
explanations
that provide proper, industry-accepted =
terminology.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Since it is also important to =
provide an
explanation that assists in understanding the API, in addition I think =
we should
state more clearly that we have taken these separate processes and =
combined
them within our API.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Here is my counter-proposal.&nbsp; =
Replace
the following:<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>=3D=3D=3D=3D=3D=3D=3D=3D=3D<o:p></o:=
p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Speaker identification identifies =
the
speaker among a set of users by<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; matching against a set =
of
voiceprints.&nbsp; (This function is also =
called<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; =
Multi-Verification).&nbsp;
Speaker identification can be performed on =
a<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; small set of users or =
for a
large population.&nbsp; This capability is<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; useful for =
applications where
multiple users share the same access<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; privileges to some =
data or
application, but where the individual<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; speaker must be =
uniquely
identified from the group.&nbsp; It is also<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; useful for real time =
or post
processing of recorded content to<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; ascertain who was =
speaking
when.&nbsp; Speaker identification is also =
done<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; in two phases, a =
designation
phase and an execution phase.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>=3D=3D=3D=3D=3D=3D=3D=3D=3D<o:p></o:=
p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>with<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>=3D=3D=3D=3D=3D=3D=3D=3D=3D<o:p></o:=
p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Speaker identification is the =
process of
associating an unknown speaker with a member in a population. It does =
not
employ a claim of identity.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>When an individual claims to belong =
to a
group (e.g., one of the owners of a joint bank account) a group =
authentication
is performed. This is generally implemented as a kind of verification =
involving
comparison with more than one voice model. It is sometimes called
'multi-verification.'&nbsp; If the individual speaker can be identified =
from
the group, this may be useful for applications where multiple users =
share the
same access privileges to some data or =
application.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Speaker identification and group =
authentication
are also done in two phases, a designation phase and an execution =
phase.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Note that from a functionality =
standpoint identification
can be thought of as a special case of group authentication (if the =
individual
is identified) where the group is the entire population, although the =
implementation
of speaker identification may be different from the way group =
authentication is
performed.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>To accommodate single-voiceprint =
verification,
verification against multiple voiceprints, group authentication, and =
identification,
this specification provides a single set of methods that can take a list =
of
identifiers, called &#8220;voiceprint identifiers&#8221;, and return a =
list of
identifiers, with a score for each representing how well the input =
speech
matched each identifier.&nbsp; The input and output lists of identifiers =
do not
have to match, allowing a vendor-specific group identifier to be used as =
input
to indicate that identification is to be =
performed.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>In this specification, the terms =
&#8220;Identification&#8221;
and &#8220;Multi-verification&#8221; are used to indicate that the input
represents a group (potentially the entire population) and that =
results
for multiple voiceprints may be returned.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>=3D=3D=3D=3D=3D=3D=3D=3D=3D<o:p></o:=
p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>What do you =
think?<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>-- dan<o:p></o:p></span></font></p>

<div>

<div class=3DMsoNormal align=3Dcenter style=3D'text-align:center'><font =
size=3D3
face=3D"Times New Roman"><span style=3D'font-size:12.0pt'>

<hr size=3D2 width=3D"100%" align=3Dcenter tabindex=3D-1>

</span></font></div>

<p class=3DMsoNormal><b><font size=3D2 face=3DTahoma><span =
style=3D'font-size:10.0pt;
font-family:Tahoma;font-weight:bold'>From:</span></font></b><font =
size=3D2
face=3DTahoma><span style=3D'font-size:10.0pt;font-family:Tahoma'> =
Judith Markowitz
[mailto:Judith@Jmarkowitz.com] <br>
<b><span style=3D'font-weight:bold'>Sent:</span></b> Wednesday, May 02, =
2007 5:15
PM<br>
<b><span style=3D'font-weight:bold'>To:</span></b> speechsc@ietf.org<br>
<b><span style=3D'font-weight:bold'>Subject:</span></b> [Speechsc] =
proposed
change to definition of &quot;speaker =
identification&quot;</span></font><o:p></o:p></p>

</div>

<p class=3DMsoNormal><font size=3D3 face=3D"Times New Roman"><span =
style=3D'font-size:
12.0pt'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier New"'>The definition of
&#8220;speaker identification&#8221; is incorrect because it portrays =
speaker identification
as a kind of verification. In fact, it is entirely different. It is =
important
that the speech-processing industry be in step with both the research =
community
and the biometrics industry. The recommended change is an editorial one =
and
does not affect the specification in any way other than to provide a =
proper
definition. <o:p></o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier =
New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier New"'>Here is the =
proposed change:<o:p></o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier =
New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier New"'>&quot;Speaker =
identification
is the process of associating an unknown speaker with a member in a =
population.
It does not employ a claim of identity. Speaker verification is the =
process of
verifying whether an unknown speaker is the person as claimed. It is a =
yes/no
process performed in response to a claim of =
identity.&quot;<o:p></o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier =
New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier New"'>If the committee =
wants to
include the concept of making a claim of membership (which is now =
erroneously
included as a kind of speaker identification) you can =
add<o:p></o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier =
New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier New"'>&quot;When an =
individual
claims to belong to a group (e.g., one of the owners of a joint bank =
account) a
group authentication is performed. This is generally implemented as a =
kind of
verification involving comparison with more than one voice model. It is
sometimes called =
'multi-verification.'&quot;<o:p></o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier =
New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier New"'>Thank you for your =
kind
attention to this request. <o:p></o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier =
New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier New"'>Judith =
Markowitz<o:p></o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier New"'>Co-chair Speaker =
Biometrics
Committee, VoiceXML Forum <o:p></o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier New"'>Invited Expert in =
Speaker Recognition
to the W3C Voice Browser Working Group<o:p></o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier New"'>VoiceXML Forum =
liaison to
ANSI/INCITS/M1 (biometrics) <o:p></o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier New"'>Editor, ANSI/INCITS =
data
exchange format project for speaker =
biometrics<o:p></o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier =
New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier =
New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal style=3D'text-autospace:none'><font size=3D3 =
face=3D"Courier New"><span
style=3D'font-size:12.0pt;font-family:"Courier =
New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

</div>

</body>

</html>

------_=_NextPart_001_01C79270.62AA4113--



--===============1456258238==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc
Supplemental web site:
<http://www.standardstrack.com/ietf/speechsc>
--===============1456258238==--





From speechsc-bounces@ietf.org Tue May 22 09:58:47 2007
Return-path: <speechsc-bounces@ietf.org>
Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com)
	by megatron.ietf.org with esmtp (Exim 4.43)
	id 1HqUsw-0004D2-JT; Tue, 22 May 2007 09:58:46 -0400
Received: from speechsc by megatron.ietf.org with local (Exim 4.43)
	id 1HqUsv-0004Cq-Ae for speechsc-confirm+ok@megatron.ietf.org;
	Tue, 22 May 2007 09:58:45 -0400
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.43)
	id 1HqUsv-0004Ci-0y; Tue, 22 May 2007 09:58:45 -0400
Received: from repmmg02.bea.com ([66.248.192.39])
	by ietf-mx.ietf.org with esmtp (Exim 4.43)
	id 1HqUst-0004Zh-JR; Tue, 22 May 2007 09:58:44 -0400
Received: from repmmr02.bea.com (repmmr02.bea.com [10.160.30.72])
	by repmmg02.bea.com (Switch-3.2.7/Switch-3.2.7) with ESMTP id
	l4MDwfIg007181; Tue, 22 May 2007 06:58:41 -0700
Received: from rcpbex01.amer.bea.com (rcpbex01.bea.com [10.168.26.17])
	by repmmr02.bea.com (Switch-3.2.7/Switch-3.2.5) with ESMTP id
	l4MDwdeO023458; Tue, 22 May 2007 06:58:40 -0700
Received: from 10.43.242.244 ([10.43.242.244]) by rcpbex01.amer.bea.com
	([10.168.26.17]) with Microsoft Exchange Server HTTP-DAV ; 
	Tue, 22 May 2007 13:58:39 +0000
User-Agent: Microsoft-Entourage/11.3.3.061214
Date: Tue, 22 May 2007 19:48:42 +0800
Subject: FW: [Speechsc] WGLC MRCPv2-12
From: Eric Burger <eburger@bea.com>
To: <mmusic@ietf.org>
Message-ID: <C278F99A.398F%eburger@bea.com>
Thread-Topic: [Speechsc] WGLC MRCPv2-12
Thread-Index: Acdy5kyPCIt7w37GTz+qURNeG5V+RgpgNh0F
In-Reply-To: <460D3708.90109@ericsson.com>
Mime-version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7bit
x-BEA-PMX-Instructions: AV
x-BEA-MM: Internal-To-External
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 39bd8f8cbb76cae18b7e23f7cf6b2b9f
Cc: speechsc@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Errors-To: speechsc-bounces@ietf.org

OK, MMUSIC folks, can you help us SPEECHSC folks out?

Magnus noted the following may be an issue:
> ------ Forwarded Message
> From: Magnus Westerlund <magnus.westerlund@ericsson.com>
> Date: Fri, 30 Mar 2007 09:12:56 -0700
[snip]
> In section 4.2 you define behavior for using the comedia a=connection
> attribute with the TCP/MRCPv2 protocol. Based on the text and example
> (add recognizer example starting on page 16) it seems that you try to
> allow multiple m= lines to reuse the same TCP connection. However, it
> is far from obvious that comedia (RFC 4145) supports this behavior. It
> doesn't talk about any relation between multiple m= lines and their TCP
> connections. I would recommend discussing this with MMUSIC.
------ End of Forwarded Message

The "section 4.2" referenced above is from MRCPv2, draft-ietf-speechsc-mrcpv2:

When using TCP the m-lines MUST conform to comedia [10], which describes the
usage of SDP for connection-oriented transport.
[snip]
The a=setup attribute, as described in comedia [10], MUST be "active" for
the offer from the client and MUST be "passive" for the answer from the
MRCPv2 server. The a=connection attribute MUST have a value of "new" on the
very first control m-line offer from the client to an MRCPv2 server.
Subsequent control m-line offers from the client to the MRCP server MAY
contain "new" or "existing", depending on whether the client wants to set up
a new connection or share an existing connection, respectively.
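
As a rough illustration of the rule quoted above, the following Python
sketch checks the a=setup and a=connection values on a sequence of control
m-line offers from a client. The function name and the simplified input (a
list of dicts, one per offer) are assumptions made for illustration; they
are not part of MRCPv2 or comedia (RFC 4145).

```python
def check_client_offers(offers):
    """Return a list of problems found in a client's control m-line offers.

    `offers` is a hypothetical, simplified representation: one dict per
    control m-line offer, in the order sent to a given MRCPv2 server,
    with the keys "setup" and "connection" holding the attribute values.
    """
    problems = []
    for index, offer in enumerate(offers):
        # The client's offer must carry a=setup:active.
        if offer.get("setup") != "active":
            problems.append(f"offer {index}: a=setup must be 'active'")
        connection = offer.get("connection")
        if index == 0:
            # The very first control m-line offer must open a new connection.
            if connection != "new":
                problems.append(f"offer {index}: first a=connection must be 'new'")
        elif connection not in ("new", "existing"):
            # Later offers may open a new connection or share an existing one.
            problems.append(f"offer {index}: a=connection must be 'new' or 'existing'")
    return problems
```

The sketch only encodes the draft text as quoted. Whether comedia itself
permits several m-lines to share one TCP connection is the open question
raised in this message.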



Do you think we are OK?  If not, why not, and how can we fix it?

Thanks.







From speechsc-bounces@ietf.org Tue May 22 13:11:00 2007
Return-path: <speechsc-bounces@ietf.org>
Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com)
	by megatron.ietf.org with esmtp (Exim 4.43)
	id 1HqXsx-0001QM-T3; Tue, 22 May 2007 13:10:59 -0400
Received: from speechsc by megatron.ietf.org with local (Exim 4.43)
	id 1HqXsw-0001QC-Pf for speechsc-confirm+ok@megatron.ietf.org;
	Tue, 22 May 2007 13:10:58 -0400
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.43) id 1HqXsw-0001Q3-Fb
	for speechsc@ietf.org; Tue, 22 May 2007 13:10:58 -0400
Received: from smtp02.lnh.mail.rcn.net ([207.172.157.102])
	by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1HqXsv-0002Nw-S6
	for speechsc@ietf.org; Tue, 22 May 2007 13:10:58 -0400
Received: from mr08.lnh.mail.rcn.net ([207.172.157.28])
	by smtp02.lnh.mail.rcn.net with ESMTP; 22 May 2007 13:10:58 -0400
Received: from smtp01.lnh.mail.rcn.net (smtp01.lnh.mail.rcn.net [207.172.4.11])
	by mr08.lnh.mail.rcn.net (MOS 3.8.3-GA) with ESMTP id IRM77593;
	Tue, 22 May 2007 13:10:57 -0400 (EDT)
Received: from 216-80-91-219.grn-bsr1.chi-grn.il.cable.rcn.com (HELO
	JMarkowitz) ([216.80.91.219])
	by smtp01.lnh.mail.rcn.net with ESMTP; 22 May 2007 13:10:49 -0400
Message-Id: <5u1lfe$cdg80f@smtp01.lnh.mail.rcn.net>
From: "Judith Markowitz" <Judith@Jmarkowitz.com>
To: "'Daniel C. Burnett'" <Daniel.Burnett@nuance.com>, <speechsc@ietf.org>
Date: Tue, 22 May 2007 12:10:32 -0500
Organization: J. Markowitz, Consultants
MIME-Version: 1.0
X-Mailer: Microsoft Office Outlook, Build 11.0.6353
Thread-Index: AceclBr0mWaMFnKQR1e4Lm+o5Opbyg==
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028
X-Junkmail-Status: score=10/50, host=mr08.lnh.mail.rcn.net
X-Junkmail-SD-Raw: score=unknown,
	refid=str=0001.0A090204.46532420.0117,ss=1,fgs=0,
	ip=207.172.4.11, so=2006-12-09 10:45:40,
	dmn=5.1.5/2006-01-31
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 819069d28e3cfe534e22b502261ce83f
Cc: 
Subject: [Speechsc] definition of "speaker identification"
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Judith@JMarkowitz.com
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Content-Type: multipart/mixed; boundary="===============1756248044=="
Errors-To: speechsc-bounces@ietf.org

This is a multi-part message in MIME format.

--===============1756248044==
Content-Type: multipart/alternative;
	boundary="----=_NextPart_000_000B_01C79C6A.32DFC260"

This is a multi-part message in MIME format.

------=_NextPart_000_000B_01C79C6A.32DFC260
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit

Dan, 

I am concerned that your revised definition - like the original definition -
does not highlight the elements that clearly distinguish verification from
identification. The difference between identification and verification is a
source of confusion in the speech community as a whole. For that reason, I
believe that it is critical that the definition that appears in MRCP specify
the following:

 

Verification 

1. requires a claim of identity. That claim can be verbal (e.g., stating name,
saying employee ID, saying account number), non-verbal (e.g., DTMF) or even
implied (e.g., ANI, caller ID, using a specific cell phone to call,
answering a specific telephone number called by the SV system)

2. is basically a one-to-one matching. The only "exception" to this is
multi-verification. In multi-verification the person claims to be a member
of a group, for example, one of the people holding a joint bank account or
one of the members of a project team. In multi-verification the system
compares the new spoken input to stored voice models for each of the members
of the group. 

 

Identification 

1. does not have a claim of identity. The goal is to assign one or more
probable identities to a speech segment produced by an unknown speaker. 

2. is basically one-to-many matching. The system goes through a database of
voice models looking for the best "match." 

 

That is why I propose that Qiru's definitions or my original definitions be
used. 

 

Qiru's definitions (from his publications)

Speaker identification is the process of associating an unknown speaker with
a member in a population. It does not employ a claim of identity. 

 

Speaker verification is the process of verifying whether an unknown speaker
is the person as claimed. It is a yes/no process performed in response to a
claim of identity.

 

[my addition] When an individual claims to belong to a group (e.g., one of
the owners of a joint bank account) a "multi-verification" is performed.
This is a kind of verification involving comparison with each member of the
group.

 

 

Or the following definitions that are adapted from the VoiceXML Forum's "SIV
Glossary"

 

Speaker identification is a basic biometric function that uses an
individual's speech to search a database for a reference model that
"matches" a sample submitted by an unknown speaker. If found, it returns a
corresponding identity. This process is performed without a claim of
identity and it involves one-to-many matching operations. 

 

Speaker verification is a biometric function that uses an individual's
speech to validate or invalidate a claim of identity made by that individual.

Verification retrieves the voice model for the claimed identity and compares
it with the sample submitted by the claimant. This process is performed with
a claim of identity and involves a one-to-one matching operation. The only
exception to the one-to-one matching is when the individual claims to be a
member of a group (e.g., owner of a joint bank account). Then, the
claimant's voice model is compared with models from each of the members of
the group. 
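
The distinction drawn in the definitions above can be sketched in Python.
Everything in this sketch is hypothetical: score() is a toy stand-in for a
real voice-model comparison, and the function names, the dict of models, and
the threshold are assumptions chosen for illustration, not part of MRCP or
of any vendor API.

```python
def score(sample, model):
    """Toy similarity in [0, 1]; a placeholder for a real voice-model match."""
    matches = sum(1 for a, b in zip(sample, model) if a == b)
    return matches / max(len(sample), len(model), 1)


def verify(sample, claimed_id, models, threshold=0.8):
    """One-to-one: a yes/no decision against the claimed identity's model."""
    return score(sample, models[claimed_id]) >= threshold


def multi_verify(sample, group_ids, models, threshold=0.8):
    """Group claim: compare with the model of each member of the claimed group."""
    return {member: score(sample, models[member]) >= threshold
            for member in group_ids}


def identify(sample, models):
    """One-to-many, no claim: return the best-matching identity, or None."""
    if not models:
        return None
    return max(models, key=lambda speaker: score(sample, models[speaker]))
```

Note that verify() and multi_verify() both take a claim (an identity or a
group) as input, while identify() takes none and searches every model. A
real identification system would probably also reject a best match whose
score is too low; the sketch omits that step.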

 

 

Thank you for your consideration of this issue. I hope this message
clarifies any confusion surrounding the issue so that it can be resolved
quickly. 

 

Judith  

 

J. Markowitz, Consultants

5801 N. Sheridan Rd, #19A

Chicago, IL 60660

773-769-9243

judith@jmarkowitz.com

 


------=_NextPart_000_000B_01C79C6A.32DFC260
Content-Type: text/html;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

<html xmlns:o=3D"urn:schemas-microsoft-com:office:office" =
xmlns:w=3D"urn:schemas-microsoft-com:office:word" =
xmlns=3D"http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=3DContent-Type content=3D"text/html; =
charset=3Dus-ascii">
<meta name=3DGenerator content=3D"Microsoft Word 11 (filtered medium)">
<style>
<!--
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0in;
	margin-bottom:.0001pt;
	font-size:12.0pt;
	font-family:"Times New Roman";}
a:link, span.MsoHyperlink
	{color:blue;
	text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
	{color:purple;
	text-decoration:underline;}
span.EmailStyle17
	{mso-style-type:personal-compose;
	font-family:Arial;
	color:windowtext;
	font-weight:normal;
	font-style:normal;
	text-decoration:none none;}
@page Section1
	{size:8.5in 11.0in;
	margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
	{page:Section1;}
-->
</style>

</head>

<body lang=3DEN-US link=3Dblue vlink=3Dpurple>

<div class=3DSection1>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>Dan, <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>I am concerned that your revised definition &#8211; =
like the
original definition &#8211; does not highlight the elements that clearly
distinguish verification from identification. The difference between
identification and verification is a source of confusion in the speech
community as a whole. For that reason, I believe that it is =
critical that
the definition that appears in MRCP specify the following: =
<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>Verification <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>1. requires a claim of identity. That claim can be =
verbal
(e.g., stating name, saying employee ID, saying account number), =
non-verbal
(e.g., DTMF) or even implied (e.g., ANI, caller ID, using a specific =
cell phone
to call, answering a specific telephone number called by the SV =
system)<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>2. is basically a one-to-one matching. The only =
&#8220;exception&#8221;
to this is multi-verification. In multi-verification the person claims =
to be a
member of a group, for example, one of the people holding a joint bank =
account
or one of the members of a project team. In multi-verification the =
system
compares the new spoken input to stored voice models for each of the =
members of
the group. <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>Identification <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>1. does not have a claim of identity. The goal is to =
assign
one or more probable identities to a speech segment produced by an =
unknown
speaker. <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>2. is basically one-to-many matching. The system goes
through a database of voice models looking for the best =
&#8220;match.&#8221; <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>That is why I propose that Qiru&#8217;s definitions =
or my
original definitions be used. <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>Qiru&#8217;s definitions (from his =
publications)<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>Speaker identification is the process of associating =
an
unknown speaker with a member in a population. It does not employ a =
claim of
identity. <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>Speaker verification is the process of verifying =
whether an
unknown speaker is the person as claimed. It is a yes/no process =
performed in
response to a claim of identity.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>[my addition] When an individual claims to belong to =
a group
(e.g., one of the owners of a joint bank account) a =
&#8220;multi-verification&#8221;
is performed. This is a kind of verification involving comparison with =
each
member of the group.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>Or the following definitions that are adapted from =
the VoiceXML
Forum&#8217;s &#8220;SIV Glossary&#8221;<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>Speaker identification is a basic biometric function =
that
uses an individual's speech to search a database for a reference model =
that &#8220;matches&#8221;
a sample submitted by an unknown speaker. If found, it returns a =
corresponding
identity. This process is performed without a claim of identity and it =
involves
one-to-many matching operations. <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>Speaker verification is a biometric function that =
uses an individual's
speech to validate or invalidate a claim of identity made by that =
individual<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>Verification retrieves the voice model for the =
claimed
identity and compares it with the sample submitted by the claimant. This
process is performed with a claim of identity and involves a one-to-one
matching operation. The only exception to the one-to-one matching is =
when the
individual claims to be a member of a group (e.g., owner of a joint bank
account). Then, the claimant&#8217;s voice model is compared with models =
from
each of the members of the group. <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>Thank you for your consideration of this issue. I =
hope this message
clarifies any confusion surrounding the issue so that it can be resolved =
quickly.
<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'>Judith&nbsp; <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:11.0pt;
font-family:Arial'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:10.0pt;
font-family:Arial'>J. Markowitz, =
Consultants</span></font><o:p></o:p></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:10.0pt;
font-family:Arial'>5801 N. Sheridan Rd, =
#19A</span></font><o:p></o:p></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:10.0pt;
font-family:Arial'>Chicago, IL 60660</span></font><o:p></o:p></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:10.0pt;
font-family:Arial'>773-769-9243</span></font><o:p></o:p></p>

<p class=3DMsoNormal><font size=3D2 face=3DArial><span =
style=3D'font-size:10.0pt;
font-family:Arial'>judith@jmarkowitz.com</span></font><o:p></o:p></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Times New Roman"><span =
style=3D'font-size:
12.0pt'><o:p>&nbsp;</o:p></span></font></p>

</div>

</body>

</html>

------=_NextPart_000_000B_01C79C6A.32DFC260--



--===============1756248044==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc
Supplemental web site:
<http://www.standardstrack.com/ietf/speechsc>
--===============1756248044==--





From speechsc-bounces@ietf.org Tue May 22 14:21:09 2007
Return-path: <speechsc-bounces@ietf.org>
Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com)
	by megatron.ietf.org with esmtp (Exim 4.43)
	id 1HqYyq-00030I-Ov; Tue, 22 May 2007 14:21:08 -0400
Received: from speechsc by megatron.ietf.org with local (Exim 4.43)
	id 1HqYyn-0002yI-H9 for speechsc-confirm+ok@megatron.ietf.org;
	Tue, 22 May 2007 14:21:05 -0400
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.43) id 1HqYym-0002vb-WA
	for speechsc@ietf.org; Tue, 22 May 2007 14:21:05 -0400
Received: from smtp02.lnh.mail.rcn.net ([207.172.157.102])
	by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1HqYyk-00059b-M0
	for speechsc@ietf.org; Tue, 22 May 2007 14:21:03 -0400
Received: from mr08.lnh.mail.rcn.net ([207.172.157.28])
	by smtp02.lnh.mail.rcn.net with ESMTP; 22 May 2007 14:21:02 -0400
Received: from smtp01.lnh.mail.rcn.net (smtp01.lnh.mail.rcn.net [207.172.4.11])
	by mr08.lnh.mail.rcn.net (MOS 3.8.3-GA) with ESMTP id IRM99206;
	Tue, 22 May 2007 14:21:02 -0400 (EDT)
Received: from 216-80-91-219.grn-bsr1.chi-grn.il.cable.rcn.com (HELO
	JMarkowitz) ([216.80.91.219])
	by smtp01.lnh.mail.rcn.net with ESMTP; 22 May 2007 14:20:58 -0400
Message-Id: <5u1lfe$cdhe2k@smtp01.lnh.mail.rcn.net>
From: "Judith Markowitz" <Judith@Jmarkowitz.com>
To: "'Daniel C. Burnett'" <Daniel.Burnett@nuance.com>, <speechsc@ietf.org>
Date: Tue, 22 May 2007 13:20:42 -0500
Organization: J. Markowitz, Consultants
MIME-Version: 1.0
X-Mailer: Microsoft Office Outlook, Build 11.0.6353
Thread-Index: Acecnef3g75czwFFRMqC+QE9iHk3tg==
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028
X-Junkmail-Status: score=10/50, host=mr08.lnh.mail.rcn.net
X-Junkmail-SD-Raw: score=unknown,
	refid=str=0001.0A090207.4653348D.0124,ss=1,fgs=0,
	ip=207.172.4.11, so=2006-12-09 10:45:40,
	dmn=5.1.5/2006-01-31
X-Spam-Score: 0.0 (/)
X-Scan-Signature: ad122f56a92d6ccd133117ee8a4b1ff3
Cc: 
Subject: [Speechsc] definition of speaker identification 
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Judith@JMarkowitz.com
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Content-Type: multipart/mixed; boundary="===============1701273105=="
Errors-To: speechsc-bounces@ietf.org

This is a multi-part message in MIME format.

--===============1701273105==
Content-Type: multipart/alternative;
	boundary="----=_NextPart_000_002B_01C79C73.FFE75030"

This is a multi-part message in MIME format.

------=_NextPart_000_002B_01C79C73.FFE75030
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit

Dan

I apologize for not referencing your email of May 9 (below) which extends
the core definitions of speaker identification and verification to the MRCP
implementation. Your proposed changes do contain the modifications I was
seeking. 

 

Thank you

 

Judith 

 

From: Daniel C. Burnett [Daniel.Burnett@nuance.com]
Sent: Wednesday, May 09, 2007 2:30 PM
To: Judith@jmarkowitz.com; speechsc@ietf.org
Subject: RE: [Speechsc] proposed change to definition of "speaker
identification"

Attachments: ATT00194.txt

Judith,

 

Thank you for your suggestion.  You are right that the current definition in
MRCPv2 draft 12 of speaker identification is incorrect.  I'd like to give a
bit of history before I propose a modification of your suggestion.

 

When we first introduced speaker authentication (verification and
identification) features into MRCP, the definition of speaker identification
was close, if not identical, to the one you provide below.  Identification
was clearly described as a no-claim process.

During work on the specification, it became clear that some technology
vendors provided, instead of identification per se, an ability to perform
verification, individually, against multiple voiceprints, where a
verification result was returned for each voiceprint.  This is, of course,
merely a convenience loop placed around single-voiceprint verification.
Further, they allowed this to be done by giving a single "voiceprint group
identifier" to represent this group of voiceprints.  Notice that in both of
these cases results may be returned for multiple voiceprints.

 

Those of us participating in the development of MRCPv2 decided that it was
possible to combine the three notions (verification against a single
voiceprint, identification, and verification individually against multiple
voiceprints) into a single API.  That API is the one we have today.

 

I think we should incorporate your explanations that provide proper,
industry-accepted terminology.

Since it is also important to provide an explanation that assists in
understanding the API, in addition I think we should state more clearly that
we have taken these separate processes and combined them within our API.

 

Here is my counter-proposal.  Replace the following:

 

=========

Speaker identification identifies the speaker among a set of users by

   matching against a set of voiceprints.  (This function is also called

   Multi-Verification).  Speaker identification can be performed on a

   small set of users or for a large population.  This capability is

   useful for applications where multiple users share the same access

   privileges to some data or application, but where the individual

   speaker must be uniquely identified from the group.  It is also

   useful for real time or post processing of recorded content to

   ascertain who was speaking when.  Speaker identification is also done

   in two phases, a designation phase and an execution phase.

=========

 

with

 

=========

Speaker identification is the process of associating an unknown speaker with
a member in a population. It does not employ a claim of identity.

When an individual claims to belong to a group (e.g., one of the owners of a
joint bank account) a group authentication is performed. This is generally
implemented as a kind of verification involving comparison with more than
one voice model. It is sometimes called 'multi-verification.'  If the
individual speaker can be identified from the group, this may be useful for
applications where multiple users share the same access privileges to some
data or application.

Speaker identification and group authentication are also done in two phases,
a designation phase and an execution phase.

 

Note that from a functionality standpoint identification can be thought of
as a special case of group authentication (if the individual is identified)
where the group is the entire population, although the implementation of
speaker identification may be different from the way group authentication is
performed.

 

To accommodate single-voiceprint verification, verification against multiple
voiceprints, group authentication, and identification, this specification
provides a single set of methods that can take a list of identifiers, called
"voiceprint identifiers", and return a list of identifiers, with a score for
each representing how well the input speech matched each identifier.  The
input and output lists of identifiers do not have to match, allowing a
vendor-specific group identifier to be used as input to indicate that
identification is to be performed.

 

In this specification, the terms "Identification" and "Multi-verification"
are used to indicate that the input represents a group (potentially the
entire population) and that results for multiple voiceprints may be
returned.

=========

 

 

What do you think?

 

-- dan

  _____  

From: Judith Markowitz [mailto:Judith@Jmarkowitz.com] 
Sent: Wednesday, May 02, 2007 5:15 PM
To: speechsc@ietf.org
Subject: [Speechsc] proposed change to definition of "speaker
identification"

 

The definition of "speaker identification" is incorrect because it portrays
speaker identification as a kind of verification. In fact, it is entirely
different. It is important that the speech-processing industry be in step
with both the research community and the biometrics industry. The
recommended change is an editorial one and does not affect the specification
in any way other than to provide a proper definition.

 

Here is the proposed change:

 

"Speaker identification is the process of associating an unknown speaker
with a member in a population. It does not employ a claim of identity.
Speaker verification is the process of verifying whether an unknown speaker
is the person as claimed. It is a yes/no process performed in response to a
claim of identity."

 

If the committee wants to include the concept of making a claim of
membership (which is now erroneously included as a kind of speaker
identification) you can add

 

"When an individual claims to belong to a group (e.g., one of the owners of
a joint bank account) a group authentication is performed. This is generally
implemented as a kind of verification involving comparison with more than
one voice model. It is sometimes called 'multi-verification.'"

 

Thank you for your kind attention to this request. 

 

Judith Markowitz

Co-chair Speaker Biometrics Committee, VoiceXML Forum 

Invited Expert in Speaker Recognition to the W3C Voice Browser Working Group

VoiceXML Forum liaison to ANSI/INCITS/Mi(biometrics) 

Editor, ANSI/INCITS data exchange format project for speaker biometrics
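
As a rough illustration of the single API described in the counter-proposal
above (a list of voiceprint identifiers in, a scored list of identifiers
out), here is a minimal Python sketch. Everything in it is an assumption
made for illustration: the names VOICEPRINTS, GROUPS and score, the toy
feature vectors, and the cosine comparison are hypothetical and are not
MRCPv2 methods. A real engine would compare trained voice models.

```python
import math

# Hypothetical toy data: voiceprint id -> feature vector (not real models).
VOICEPRINTS = {
    "alice": [1.0, 0.0],
    "bob": [0.0, 1.0],
    "carol": [1.0, 1.0],
}
# Hypothetical vendor-specific group identifier -> member voiceprint ids.
GROUPS = {"joint-account-17": ["alice", "bob"]}


def cosine(a, b):
    """Cosine similarity, standing in for a real voice-model comparison."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score(sample, identifiers):
    """Take a list of identifiers, return (voiceprint id, score) pairs.

    A group identifier expands to its members, so the output list need
    not match the input list.
    """
    expanded = []
    for ident in identifiers:
        for member in GROUPS.get(ident, [ident]):
            if member not in expanded:
                expanded.append(member)
    results = [(vp, cosine(sample, VOICEPRINTS[vp])) for vp in expanded]
    # Best match first.
    return sorted(results, key=lambda r: r[1], reverse=True)
```

Under these assumptions, score(sample, ["alice"]) corresponds to
single-voiceprint verification (one claimed identity, one score),
score(sample, ["joint-account-17"]) to multi-verification against a group,
and score(sample, list(VOICEPRINTS)) to identification over the whole
enrolled population with no claim of identity.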

 

 


------=_NextPart_000_002B_01C79C73.FFE75030
Content-Type: text/html;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

<html xmlns:v=3D"urn:schemas-microsoft-com:vml" =
xmlns:o=3D"urn:schemas-microsoft-com:office:office" =
xmlns:w=3D"urn:schemas-microsoft-com:office:word" =
xmlns=3D"http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=3DContent-Type content=3D"text/html; =
charset=3Dus-ascii">
<meta name=3DGenerator content=3D"Microsoft Word 11 (filtered medium)">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]-->
<style>
<!--
 /* Font Definitions */
 @font-face
	{font-family:Tahoma;
	panose-1:2 11 6 4 3 5 4 4 2 4;}
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0in;
	margin-bottom:.0001pt;
	font-size:12.0pt;
	font-family:"Times New Roman";}
a:link, span.MsoHyperlink
	{color:blue;
	text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
	{color:purple;
	text-decoration:underline;}
span.EmailStyle17
	{mso-style-type:personal-compose;
	font-family:Arial;
	color:windowtext;
	font-weight:normal;
	font-style:normal;
	text-decoration:none none;}
@page Section1
	{size:8.5in 11.0in;
	margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
	{page:Section1;}
-->
</style>

</head>

<body lang=3DEN-US link=3Dblue vlink=3Dpurple>

<div class=3DSection1>

<p class=3DMsoNormal><font size=3D3 color=3Dblue face=3D"Times New =
Roman"><span
style=3D'font-size:12.0pt;color:blue'>Dan<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 color=3Dblue face=3D"Times New =
Roman"><span
style=3D'font-size:12.0pt;color:blue'>I apologize for not referencing =
your email
of May 9 (below) which extends the core definitions of speaker =
identification and
verification to the MRCP implementation. Your proposed changes do =
contain the modifications
I was seeking. <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 color=3Dblue face=3D"Times New =
Roman"><span
style=3D'font-size:12.0pt;color:blue'><o:p>&nbsp;</o:p></span></font></p>=


<p class=3DMsoNormal><font size=3D3 color=3Dblue face=3D"Times New =
Roman"><span
style=3D'font-size:12.0pt;color:blue'>Thank =
you<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 color=3Dblue face=3D"Times New =
Roman"><span
style=3D'font-size:12.0pt;color:blue'><o:p>&nbsp;</o:p></span></font></p>=


<p class=3DMsoNormal><font size=3D3 color=3Dblue face=3D"Times New =
Roman"><span
style=3D'font-size:12.0pt;color:blue'>Judith =
<o:p></o:p></span></font></p>

<p class=3DMsoNormal><b><font size=3D3 face=3D"Times New Roman"><span
style=3D'font-size:12.0pt;font-weight:bold'><o:p>&nbsp;</o:p></span></fon=
t></b></p>

<p class=3DMsoNormal><b><font size=3D3 face=3D"Times New Roman"><span
style=3D'font-size:12.0pt;font-weight:bold'>From:</span></font></b> =
Daniel C.
Burnett [Daniel.Burnett@nuance.com]<br>
<b><span style=3D'font-weight:bold'>Sent:</span></b> Wednesday, May 09, =
2007 2:30
PM<br>
<b><span style=3D'font-weight:bold'>To:</span></b> =
Judith@jmarkowitz.com;
speechsc@ietf.org<br>
<b><span style=3D'font-weight:bold'>Subject:</span></b> RE: [Speechsc] =
proposed
change to definition of &quot;speaker identification&quot;<br>
<br>
<b><span style=3D'font-weight:bold'>Attachments:</span></b> =
ATT00194.txt<o:p></o:p></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Judith,<o:p></o:p></span></font></p>=


<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Thank you for your =
suggestion.&nbsp; You
are right that the current definition in MRCPv2 draft 12 of speaker
identification is incorrect.&nbsp; I&#8217;d like to give a bit of =
history
before I propose a modification of your =
suggestion.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>When we first introduced speaker
authentication (verification and identification) features into MRCP, the
definition of speaker identification was close, if not identical, to the =
one
you provide below.&nbsp; Identification was clearly described as a =
no-claim process.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>During work on the specification, =
it
became clear that some technology vendors provided, instead of =
identification
per se, an ability to perform verification, individually, against =
multiple
voiceprints, where a verification result was returned for each
voiceprint.&nbsp; This is, of course, merely a convenience loop placed =
around
single-voiceprint verification.&nbsp; Further, they allowed this to be =
done by
giving a single &#8220;voiceprint group identifier&#8221; to represent =
this
group of voiceprints.&nbsp; Notice that in both of these cases results =
may be
returned for multiple voiceprints.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Those of us participating in the
development of MRCPv2 decided that it was possible to combine the three =
notions
(verification against a single voiceprint, identification, and =
verification
individually against multiple voiceprints) into a single API.&nbsp; That =
API is
the one we have today.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>I think we should incorporate your
explanations that provide proper, industry-accepted =
terminology.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Since it is also important to =
provide an
explanation that assists in understanding the API, in addition I think =
we
should state more clearly that we have taken these separate processes =
and
combined them within our API.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Here is my counter-proposal.&nbsp; =
Replace
the following:<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>=3D=3D=3D=3D=3D=3D=3D=3D=3D<o:p></o:=
p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Speaker identification identifies =
the
speaker among a set of users by<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; matching against a set =
of
voiceprints.&nbsp; (This function is also =
called<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; =
Multi-Verification).&nbsp;
Speaker identification can be performed on =
a<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; small set of users or =
for a
large population.&nbsp; This capability is<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; useful for =
applications where
multiple users share the same access<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; privileges to some =
data or
application, but where the individual<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; speaker must be =
uniquely
identified from the group.&nbsp; It is also<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; useful for real time =
or post
processing of recorded content to<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; ascertain who was =
speaking
when.&nbsp; Speaker identification is also =
done<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>&nbsp;&nbsp; in two phases, a =
designation
phase and an execution phase.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>=3D=3D=3D=3D=3D=3D=3D=3D=3D<o:p></o:=
p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>with<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>=3D=3D=3D=3D=3D=3D=3D=3D=3D<o:p></o:=
p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Speaker identification is the =
process of
associating an unknown speaker with a member in a population. It does =
not
employ a claim of identity.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>When an individual claims to belong =
to a
group (e.g., one of the owners of a joint bank account) a group =
authentication
is performed. This is generally implemented as a kind of verification =
involving
comparison with more than one voice model. It is sometimes called
'multi-verification.'&nbsp; If the individual speaker can be identified =
from
the group, this may be useful for applications where multiple users =
share the
same access privileges to some data or =
application.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Speaker identification and group
authentication are also done in two phases, a designation phase and an
execution phase.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>Note that from a functionality =
standpoint
identification can be thought of as a special case of group =
authentication (if
the individual is identified) where the group is the entire population,
although the implementation of speaker identification may be different =
from the
way group authentication is performed.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>To accommodate single-voiceprint
verification, verification against multiple voiceprints, group =
authentication,
and identification, this specification provides a single set of methods =
that
can take a list of identifiers, called &#8220;voiceprint =
identifiers&#8221;,
and return a list of identifiers, with a score for each representing how =
well
the input speech matched each identifier.&nbsp; The input and output =
lists of
identifiers do not have to match, allowing a vendor-specific group =
identifier
to be used as input to indicate that identification is to be =
performed.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>In this specification, the terms
&#8220;Identification&#8221; and &#8220;Multi-verification&#8221; are =
used to
indicate that the input represents a group (potentially the entire
population)&nbsp; and that results for multiple voiceprints may be =
returned.<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>=3D=3D=3D=3D=3D=3D=3D=3D=3D<o:p></o:=
p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>What do you =
think?<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
style=3D'font-size:
10.0pt;font-family:Arial;color:navy'>-- dan<o:p></o:p></span></font></p>

<div class=3DMsoNormal align=3Dcenter style=3D'text-align:center'><font =
size=3D3
face=3D"Times New Roman"><span style=3D'font-size:12.0pt'>

<hr size=3D2 width=3D"100%" align=3Dcenter tabIndex=3D-1>

</span></font></div>

<p class=3DMsoNormal><b><font size=3D2 face=3DTahoma><span =
style=3D'font-size:10.0pt;
font-family:Tahoma;font-weight:bold'>From:</span></font></b><font =
size=3D2
face=3DTahoma><span style=3D'font-size:10.0pt;font-family:Tahoma'> =
Judith Markowitz
[mailto:Judith@Jmarkowitz.com] <br>
<b><span style=3D'font-weight:bold'>Sent:</span></b> Wednesday, May 02, =
2007 5:15
PM<br>
<b><span style=3D'font-weight:bold'>To:</span></b> speechsc@ietf.org<br>
<b><span style=3D'font-weight:bold'>Subject:</span></b> [Speechsc] =
proposed
change to definition of &quot;speaker =
identification&quot;</span></font><o:p></o:p></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Times New Roman"><span =
style=3D'font-size:
12.0pt'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'>The definition of &#8220;speaker
identification&#8221; is incorrect because it portrays speaker =
identification as a kind of
verification. In fact, it is entirely different. It is important that =
the
speech-processing industry be in step with both the research community =
and the
biometrics industry. The recommended change is an editorial one and does =
not
affect the specification in any way other than to provide a proper =
definition. <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'>Here is the proposed =
change:<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'>&quot;Speaker identification is the process =
of associating
an unknown speaker with a member in a population. It does not employ a =
claim of
identity. Speaker verification is the process of verifying whether an =
unknown
speaker is the person as claimed. It is a yes/no process performed in =
response
to a claim of identity.&quot;<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'>If the committee wants to include the concept =
of
making a claim of membership (which is now erroneously included as a =
kind of
speaker identification) you can add<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'>&quot;When an individual claims to belong to =
a group
(e.g., one of the owners of a joint bank account) a group authentication =
is
performed. This is generally implemented as a kind of verification =
involving
comparison with more than one voice model. It is sometimes called
'multi-verification.'&quot;<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'>Thank you for your kind attention to this =
request. <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'>Judith Markowitz<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'>Co-chair Speaker Biometrics Committee, =
VoiceXML
Forum <o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'>Invited Expert in Speaker Recognition to the =
W3C
Voice Browser Working Group<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'>VoiceXML Forum liaison to =
ANSI/INCITS/M1 (biometrics)
<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'>Editor, ANSI/INCITS data exchange format =
project for
speaker biometrics<o:p></o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Courier New"><span =
style=3D'font-size:12.0pt;
font-family:"Courier New"'><o:p>&nbsp;</o:p></span></font></p>

<p class=3DMsoNormal><font size=3D3 face=3D"Times New Roman"><span =
style=3D'font-size:
12.0pt'><o:p>&nbsp;</o:p></span></font></p>

</div>

</body>

</html>

------=_NextPart_000_002B_01C79C73.FFE75030--



--===============1701273105==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc
Supplemental web site:
<http://www.standardstrack.com/ietf/speechsc>
--===============1701273105==--





From speechsc-bounces@ietf.org Wed May 23 09:38:42 2007
Return-path: <speechsc-bounces@ietf.org>
Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com)
	by megatron.ietf.org with esmtp (Exim 4.43)
	id 1Hqr33-0005tO-ST; Wed, 23 May 2007 09:38:41 -0400
Received: from speechsc by megatron.ietf.org with local (Exim 4.43)
	id 1Hqp02-0002N3-OG for speechsc-confirm+ok@megatron.ietf.org;
	Wed, 23 May 2007 07:27:26 -0400
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.43)
	id 1Hqp02-0002Ly-8V; Wed, 23 May 2007 07:27:26 -0400
Received: from mailgw3.ericsson.se ([193.180.251.60])
	by ietf-mx.ietf.org with esmtp (Exim 4.43)
	id 1Hqp01-00055p-Eo; Wed, 23 May 2007 07:27:26 -0400
Received: from mailgw3.ericsson.se (unknown [127.0.0.1])
	by mailgw3.ericsson.se (Symantec Mail Security) with ESMTP id
	ACBE520A8B; Wed, 23 May 2007 13:27:24 +0200 (CEST)
X-AuditID: c1b4fb3c-ac51ebb0000073d5-8c-4654251c774c 
Received: from esealmw128.eemea.ericsson.se (unknown [153.88.254.121])
	by mailgw3.ericsson.se (Symantec Mail Security) with ESMTP id
	92F1E20947; Wed, 23 May 2007 13:27:24 +0200 (CEST)
Received: from esealmw128.eemea.ericsson.se ([153.88.254.176]) by
	esealmw128.eemea.ericsson.se with Microsoft SMTPSVC(6.0.3790.1830); 
	Wed, 23 May 2007 13:27:24 +0200
Received: from mail.lmf.ericsson.se ([131.160.11.50]) by
	esealmw128.eemea.ericsson.se with Microsoft SMTPSVC(6.0.3790.1830); 
	Wed, 23 May 2007 13:27:23 +0200
Received: from [131.160.36.58] (E000FB0F665DD.lmf.ericsson.se [131.160.36.58])
	by mail.lmf.ericsson.se (Postfix) with ESMTP id CF7AA2467;
	Wed, 23 May 2007 14:27:23 +0300 (EEST)
Message-ID: <4654251B.7000401@ericsson.com>
Date: Wed, 23 May 2007 14:27:23 +0300
From: Gonzalo Camarillo <Gonzalo.Camarillo@ericsson.com>
User-Agent: Thunderbird 1.5.0.10 (Windows/20070221)
MIME-Version: 1.0
To: Eric Burger <eburger@bea.com>
Subject: Re: [MMUSIC] FW: [Speechsc] WGLC MRCPv2-12
References: <C278F99A.398F%eburger@bea.com>
In-Reply-To: <C278F99A.398F%eburger@bea.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 23 May 2007 11:27:24.0069 (UTC)
	FILETIME=[55B9FD50:01C79D2D]
X-Brightmail-Tracker: AAAAAA==
X-Spam-Score: 0.0 (/)
X-Scan-Signature: d8ae4fd88fcaf47c1a71c804d04f413d
X-Mailman-Approved-At: Wed, 23 May 2007 09:38:40 -0400
Cc: speechsc@ietf.org, mmusic@ietf.org,
	"Magnus Westerlund \(KI/EAB\)" <magnus.westerlund@ericsson.com>
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Errors-To: speechsc-bounces@ietf.org

Hi Eric,

comedia was not designed to handle different m lines sharing a single 
TCP connection. Some time ago, the best practice was not to use the same 
port number in two m lines (assuming the same IP address). Nowadays that 
rule does not seem to apply any longer and some specs use the same port 
in different m lines.

For example, Section 8.1 of the draft below explains how to use a single 
TCP connection for different MSRP sessions. The trick there is that they 
fill the SDP parameters with dummy values and then define their own 
stuff in an MSRP-specific way:

http://www.ietf.org/internet-drafts/draft-ietf-simple-message-sessions-19.txt

If sharing a single TCP connection between different m lines is 
something we want to specify (and it should not be hard to specify), I 
would suggest that it is done in a way that can be reused by any 
upper-layer protocol so that we do not need to re-define the same stuff 
over and over again.

Something else to think about is TCP-ICE and its interactions with 
shared transport connections.

Cheers,

Gonzalo



Eric Burger wrote:
> OK, MMUSIC folks, can you help us SPEECHSC folks out?
> 
> Magnus noted the following may be an issue:
>> ------ Forwarded Message
>> From: Magnus Westerlund <magnus.westerlund@ericsson.com>
>> Date: Fri, 30 Mar 2007 09:12:56 -0700
> [snip]
>> In section 4.2 you define behavior for using the comedia a=connection
>> attribute with the TCP/MRCPv2 protocol. Based on the text and example
>> (add recognizer example starting on page 16) it seems that you try to
>> allow multiple m= lines to reuse the same TCP connection. However, it
>> is far from obvious that comedia (RFC 4145) supports this behavior. It
>> doesn't talk about any relation between multiple m= lines and their TCP
>> connections. I would recommend discussing this with MMUSIC.
> ------ End of Forwarded Message
> 
> The "section 4.2" referenced is to MRCPv2, draft-ietf-speechsc-mrcpv2:
> 
> When using TCP the m-lines MUST conform to comedia [10], which describes the
> usage of SDP for connection-oriented transport.
> [snip]
> The a=setup attribute, as described in comedia [10], MUST be "active" for
> the offer from the client and MUST be "passive" for the answer from the
> MRCPv2 server. The a=connection attribute MUST have a value of "new" on the
> very first control m-line offer from the client to an MRCPv2 server.
> Subsequent control m-line offers from the client to the MRCP server MAY
> contain "new" or "existing", depending on whether the client wants to set up
> a new connection or share an existing connection, respectively.
> 
> 
> 
> Do you think we are OK?  If not, why not, and how can we fix it?
> 
> Thanks.
> 
> 
> Notice:  This email message, together with any attachments, may contain information  of  BEA Systems,  Inc.,  its subsidiaries  and  affiliated entities,  that may be confidential,  proprietary,  copyrighted  and/or legally privileged, and is intended solely for the use of the individual or entity named in this message. If you are not the intended recipient, and have received this message in error, please immediately return this by email and then delete it.
> 
> _______________________________________________
> mmusic mailing list
> mmusic@ietf.org
> https://www1.ietf.org/mailman/listinfo/mmusic
> 
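
As a rough illustration of the Section 4.2 text quoted above, the
a=setup / a=connection rules for a client offer could be checked along
the following lines. This is only a sketch under stated assumptions: the
ControlMLine class, the check_client_offer function and the
has_existing_connection flag are hypothetical names, not anything
defined by the draft or by comedia, and the treatment of a second m-line
in the same offer is one possible reading of the very ambiguity raised
in this thread.

```python
from dataclasses import dataclass


@dataclass
class ControlMLine:
    """One TCP/MRCPv2 control m-line from an SDP offer (simplified, hypothetical)."""
    port: int
    setup: str       # value of the a=setup attribute
    connection: str  # value of the a=connection attribute


def check_client_offer(m_lines, has_existing_connection):
    """Return a list of rule violations for a client's SDP offer.

    m_lines: control m-lines in the order they appear in the offer.
    has_existing_connection: True if an earlier offer to this server
    already set up a control connection (an assumption of this sketch).
    """
    problems = []
    for index, m in enumerate(m_lines):
        # The quoted text says a=setup MUST be "active" in the client's offer.
        if m.setup != "active":
            problems.append(f"m-line {index}: a=setup must be 'active'")
        # Only "new" and "existing" are mentioned for control m-lines.
        if m.connection not in ("new", "existing"):
            problems.append(f"m-line {index}: unknown a=connection value")
        # The very first control m-line to a server MUST use "new".
        first_ever = index == 0 and not has_existing_connection
        if first_ever and m.connection != "new":
            problems.append(f"m-line {index}: first control m-line must be 'new'")
    return problems


if __name__ == "__main__":
    # Two control m-lines on the same port, the second sharing the connection.
    offer = [ControlMLine(9, "active", "new"),
             ControlMLine(9, "active", "existing")]
    print(check_client_offer(offer, has_existing_connection=False))
```

The sketch accepts a second m-line with a=connection:existing inside the
same offer, which is exactly the case where it is unclear whether
comedia supports the behavior.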






From speechsc-bounces@ietf.org Thu May 24 16:31:06 2007
Return-path: <speechsc-bounces@ietf.org>
Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com)
	by megatron.ietf.org with esmtp (Exim 4.43)
	id 1HrJxh-0006MH-EU; Thu, 24 May 2007 16:31:05 -0400
Received: from speechsc by megatron.ietf.org with local (Exim 4.43)
	id 1HrJxg-0006MC-AS for speechsc-confirm+ok@megatron.ietf.org;
	Thu, 24 May 2007 16:31:04 -0400
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.43) id 1HrJxg-0006M4-0V
	for speechsc@ietf.org; Thu, 24 May 2007 16:31:04 -0400
Received: from sj-iport-2-in.cisco.com ([171.71.176.71]
	helo=sj-iport-2.cisco.com) by ietf-mx.ietf.org with esmtp (Exim 4.43)
	id 1HrJxd-0004kz-DC
	for speechsc@ietf.org; Thu, 24 May 2007 16:31:03 -0400
Received: from sjc12-sbr-sw3-3f5.cisco.com (HELO imail.cisco.com)
	([172.19.96.182])
	by sj-iport-2.cisco.com with ESMTP; 24 May 2007 13:30:39 -0700
X-IronPort-AV: i="4.14,575,1170662400"; 
	d="scan'208"; a="376014738:sNHT62263519350"
Received: from [161.44.183.194] (dhcp-161-44-183-194.cisco.com
	[161.44.183.194])
	by imail.cisco.com (8.12.11/8.12.10) with ESMTP id l4OK59o2030837;
	Thu, 24 May 2007 13:05:10 -0700
In-Reply-To: <5u1lfe$cdg80f@smtp01.lnh.mail.rcn.net>
References: <5u1lfe$cdg80f@smtp01.lnh.mail.rcn.net>
Mime-Version: 1.0 (Apple Message framework v752.2)
Content-Type: text/plain; charset=WINDOWS-1252; delsp=yes; format=flowed
Message-Id: <E3FBDF95-1DA0-42E7-83EB-17D813BA7AD4@cisco.com>
Content-Transfer-Encoding: quoted-printable
From: David R Oran <oran@cisco.com>
Subject: Re: [Speechsc] definition of "speaker identification"
Date: Thu, 24 May 2007 16:30:36 -0400
To: Judith@Jmarkowitz.com
X-Mailer: Apple Mail (2.752.2)
DKIM-Signature: v=0.5; a=rsa-sha256; q=dns/txt; l=5093; t=1180037110;
	x=1180901110; c=relaxed/simple; s=oregon;
	h=To:Content-Type:From:Subject:Content-Transfer-Encoding:MIME-Version; 
	d=cisco.com; i=oran@cisco.com;
	z=From:=20David=20R=20Oran=20<oran@cisco.com>
	|Subject:=20Re=3A=20[Speechsc]=20definition=20of=20=22speaker=20identific
	ation=22 |Sender:=20 |To:=20Judith@Jmarkowitz.com;
	bh=+aqf8kak3ObtxIf6nwsNFNsCdLARxWw87OBeELxYA0o=;
	b=izmTw28E96Dq9QMN9XCH7ldjTBK+fcvSxtQ1/iqMq2+PMgFTeFoqkdE5gmMbkX+hMB/mJPrR
	zXKCnKsfEfjxSBuUYa9/iOyZttzPBtVsB1PjDywy8P9f77VcuA4Pl02T;
Authentication-Results: imail.cisco.com; header.From=oran@cisco.com;
	dkim=pass ( sig from cisco.com/oregon verified; ); 
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 6ba8aaf827dcb437101951262f69b3de
Cc: speechsc@ietf.org
X-BeenThere: speechsc@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=unsubscribe>
List-Post: <mailto:speechsc@ietf.org>
List-Help: <mailto:speechsc-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/speechsc>,
	<mailto:speechsc-request@ietf.org?subject=subscribe>
Errors-To: speechsc-bounces@ietf.org


On May 22, 2007, at 1:10 PM, Judith Markowitz wrote:

> Dan,
>
> I am concerned that your revised definition – like the original
> definition – does not highlight the elements that clearly
> distinguish verification from identification. The difference
> between identification and verification is a source of confusion in
> the speech community as a whole. For that reason, I believe that
> it is critical that the definition that appears in MRCP specify the
> following:
>
>
>
> Verification
>
> 1. requires a claim of identity. That claim can be verbal (e.g.,
> stating name, saying employee ID, saying account number),
> non-verbal (e.g., DTMF) or even implied (e.g., ANI, caller ID, using a
> specific cell phone to call, answering a specific telephone number
> called by the SV system)
>
> 2. is basically a one-to-one matching. The only “exception” to this
> is multi-verification. In multi-verification the person claims to
> be a member of a group. For example, one of the people holding a
> joint bank account or one of the members of a project team. In
> multi-verification the system compares the new spoken input to
> stored voice models for each of the members of the group.
>
>
> Identification
>
> 1. does not have a claim of identity. The goal is to assign one or
> more probable identities to a speech segment produced by an unknown
> speaker.
>
> 2. is basically one-to-many matching. The system goes through a
> database of voice models looking for the best “match.”
>
>
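
The distinction drawn in the quoted text can be sketched roughly as
follows. This is illustrative only: the function names, the toy
feature vectors, the cosine-similarity score and the 0.8 threshold are
all assumptions made for the example, not anything specified by MRCPv2
or by a real speaker-recognition engine. Identification here returns
only the single best match, whereas the quoted text allows for more
than one probable identity.

```python
import math


def similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verify(sample, claimed_id, models, threshold=0.8):
    """One-to-one: compare only against the model for the claimed identity."""
    return similarity(sample, models[claimed_id]) >= threshold


def multi_verify(sample, group_ids, models, threshold=0.8):
    """Claim of group membership: compare against each member's model."""
    return any(verify(sample, member, models, threshold) for member in group_ids)


def identify(sample, models, threshold=0.8):
    """One-to-many, no claim: return the best-matching identity, or None."""
    best_id = max(models, key=lambda k: similarity(sample, models[k]), default=None)
    if best_id is None or similarity(sample, models[best_id]) < threshold:
        return None
    return best_id


if __name__ == "__main__":
    # Hypothetical enrolled voice models and an incoming sample.
    models = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
    sample = [0.9, 0.1]
    print(verify(sample, "alice", models))
    print(multi_verify(sample, ["alice", "bob"], models))
    print(identify(sample, models))
```

Note that multi_verify and identify both loop over several models; what
differs is whether a claim (of identity or of group membership) selects
the set of models to compare against.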

I don't find these two to be nearly as disjoint as Judith seems to.
Furthermore, the distinctions that are made fall mostly in the domain
of the application using the capabilities of the MRCP protocol.
Therefore, in the document describing the protocol it seems less
important (and possibly counterproductive) to use elements of the
definitions which do not affect the protocol semantics.

> That is why I propose that Qiru’s definitions or my original
> definitions be used.
>
>
If we go this way, I still lean toward the core of Dan's suggestion,
which is to make clear what elements of these definitions matter to
the MRCPv2 protocol and which do not.

Also, keep in mind that the prime audience for the document is
protocol implementers and not speech application developers. If the
protocol blurs the distinctions (which it does, IMO quite properly)
then that should be reflected somehow.

[chair hat off/ individual hat on]

> Qiru’s definitions (from his publications)
>
> Speaker identification is the process of associating an unknown
> speaker with a member in a population. It does not employ a claim
> of identity.
>
> Speaker verification is the process of verifying whether an unknown
> speaker is the person as claimed. It is a yes/no process performed
> in response to a claim of identity.
>
> [my addition] When an individual claims to belong to a group (e.g.,
> one of the owners of a joint bank account) a “multi-verification”
> is performed. This is a kind of verification involving comparison
> with each member of the group.
>
>
> Or the following definitions that are adapted from the VoiceXML
> Forum’s “SIV Glossary”
>
> Speaker identification is a basic biometric function that uses an
> individual's speech to search a database for a reference model that
> “matches” a sample submitted by an unknown speaker. If found, it
> returns a corresponding identity. This process is performed without
> a claim of identity and it involves one-to-many matching operations.
>
> Speaker verification is a biometric function that uses an
> individual's speech to validate or invalidate a claim of identity
> made by that individual.
>
> Verification retrieves the voice model for the claimed identity and
> compares it with the sample submitted by the claimant. This process
> is performed with a claim of identity and involves a one-to-one
> matching operation. The only exception to the one-to-one matching
> is when the individual claims to be a member of a group (e.g.,
> owner of a joint bank account). Then, the claimant’s voice model is
> compared with models from each of the members of the group.
>
>
> Thank you for your consideration of this issue. I hope this message
> clarifies any confusion surrounding the issue so that it can be
> resolved quickly.
>
>
>
> Judith
>
>
>
> J. Markowitz, Consultants
>
> 5801 N. Sheridan Rd, #19A
>
> Chicago, IL 60660
>
> 773-769-9243
>
> judith@jmarkowitz.com
>
>
>





