<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">

<rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="std"
     ipr="trust200902" submissionType="IETF" number="8853" consensus="true"
     obsoletes="" updates="" xml:lang="en" tocInclude="true" symRefs="true"
     sortRefs="true" version="3" docName="draft-ietf-mmusic-sdp-simulcast-14">

  <!-- xml2rfc v2v3 conversion 2.31.0 -->
  <front>
    <title abbrev="Simulcast">Using Simulcast in Session Description Protocol (SDP) and RTP Sessions</title>

    <seriesInfo name="RFC" value="8853"/>
    <author fullname="Bo Burman" initials="B." surname="Burman">
      <organization>Ericsson</organization>
      <address>
        <postal>
          <street>Gronlandsgatan 31</street>
          <city>SE-164 60 Stockholm</city>
          <region/>
          <code/>
          <country>Sweden</country>
        </postal>
        <phone/>
        <email>bo.burman@ericsson.com</email>
        <uri/>
      </address>
    </author>
    <author fullname="Magnus Westerlund" initials="M." surname="Westerlund">
      <organization>Ericsson</organization>
      <address>
        <postal>
          <street>Torshamnsgatan 23</street>
          <city>SE-164 83 Stockholm</city>
          <country>Sweden</country>
        </postal>
        <email>magnus.westerlund@ericsson.com</email>
      </address>
    </author>
    <author fullname="Suhas Nandakumar" initials="S." surname="Nandakumar">
      <organization>Cisco</organization>
      <address>
        <postal>
          <street>170 West Tasman Drive</street>
          <city>San Jose</city>
          <region>CA</region>
          <code>95134</code>
          <country>United States of America</country>
        </postal>
        <phone/>
        <email>snandaku@cisco.com</email>
        <uri/>
      </address>
    </author>
    <author fullname="Mo Zanaty" initials="M." surname="Zanaty">
      <organization>Cisco</organization>
      <address>
        <postal>
          <street>170 West Tasman Drive</street>
          <city>San Jose</city>
          <region>CA</region>
          <code>95134</code>
          <country>United States of America</country>
        </postal>
        <phone/>
        <email>mzanaty@cisco.com</email>
        <uri/>
      </address>
    </author>
    <date month="January" year="2021"/>

<keyword>Conference</keyword>
<keyword>multi-party</keyword>
<keyword>middlebox</keyword>
<keyword>MCU</keyword>
<keyword>SFU</keyword>
<keyword>media</keyword>
<keyword>video</keyword>
<keyword>restrictions</keyword>
<keyword>RTCP</keyword>
<keyword>RID</keyword>
<keyword>RtpStreamId</keyword>

    <abstract>

      <t>In some application scenarios, it may be desirable to send multiple
      differently encoded versions of the same media source in different RTP
      streams. This is called simulcast. This document describes how to
      accomplish simulcast in RTP and how to signal it in the Session
      Description Protocol (SDP).  The described solution uses an RTP/RTCP
      identification method to identify RTP streams
      belonging to the same media source and makes an extension to SDP to
      indicate that those RTP streams are different simulcast formats of that
      media source. The SDP extension consists of a new media-level SDP
      attribute that expresses capability to send and/or receive simulcast RTP
      streams.</t>
    </abstract>
  </front>
  <middle>
    <section anchor="sec-intro" numbered="true" toc="default">
      <name>Introduction</name>
      <t>Most of today's multiparty video-conference solutions make use of
      centralized servers to reduce the bandwidth and CPU consumption in the
      endpoints. Those servers receive RTP streams from each participant and
      send some suitable set of possibly modified RTP streams to the rest of
      the participants, which usually have heterogeneous capabilities (screen
      size, CPU, bandwidth, codec, etc.). One of the biggest issues is how to
      perform RTP stream adaptation to different participants' constraints
      with the minimum possible impact on both video quality and server
      performance.</t>
      <t>Simulcast is defined in this memo as the act of simultaneously
      sending multiple different encoded streams of the same media source --
      e.g., the same video source encoded with different video-encoder types or
      image resolutions. This can be done in several ways and for different
      purposes. This document focuses on the case where it is desirable to
      provide a media source as multiple encoded streams over <xref target="RFC3550" format="default">RTP</xref> towards an intermediary so that the
      intermediary can provide the wanted functionality by selecting which RTP
      stream(s) to forward to other participants in the session, and more
      specifically how the identification and grouping of the involved RTP
      streams are done.</t>
      <t>The intended scope of the defined mechanism is to support negotiation
      and usage of simulcast when using SDP offer/answer and media transport
      over RTP. The media transport topologies considered are point-to-point
      RTP sessions, as well as centralized multiparty RTP sessions, where a
      media sender will provide the simulcasted streams to an RTP middlebox or
      endpoint, and middleboxes may further distribute the simulcast streams
      to other middleboxes or endpoints. Simulcast could be used point to point between
      middleboxes as part of a distributed multiparty scenario. Usage of
      multicast or broadcast transport is out of scope
      and left for future extensions.</t>
      <t>This document describes a few scenarios that motivate the use of
      simulcast and also defines the needed RTP/RTCP and SDP signaling for
      it.</t>
    </section>
    <section anchor="sec-definitions" numbered="true" toc="default">
      <name>Definitions</name>
      <section numbered="true" toc="default">
        <name>Terminology</name>
        <t>This document makes use of the terminology defined in <xref
	target="RFC7656" format="default">"A Taxonomy of Semantics and
	Mechanisms for Real-Time
	Transport Protocol (RTP) Sources"</xref> and <xref target="RFC7667"
	format="default">"RTP Topologies"</xref>. The following terms are
	especially noted or here defined:</t>
        <dl newline="false" spacing="normal">
          <dt>RTP mixer:</dt>
          <dd>An RTP middlebox, in the wide sense of the term, encompassing
          Sections <xref target="RFC7667" section="3.6" sectionFormat="bare"/>
          to <xref target="RFC7667" section="3.9" sectionFormat="bare"/> of
          <xref target="RFC7667" format="default"/>.</dd>
          <dt>RTP session:</dt>
          <dd>An association among a group of
            participants communicating with RTP, as defined in <xref target="RFC3550" format="default"/> and amended by <xref target="RFC7656" format="default"/>.</dd>
          <dt>RTP stream:</dt>
          <dd>A stream of RTP packets containing media
            data, as defined in <xref target="RFC7656" format="default"/>.</dd>
          <dt>RTP switch:</dt>
          <dd>A common short term for the terms
            "switching RTP mixer", "source projecting middlebox", and "video
            switching Multipoint Control Unit (MCU)", as discussed in <xref target="RFC7667" format="default"/>.</dd>
          <dt>Simulcast stream:</dt>
          <dd>One encoded stream or dependent
            stream from a set of concurrently transmitted encoded streams and
            optional dependent streams, all sharing a common media source, as
            defined in <xref target="RFC7656" format="default"/>. An example is HD and
            thumbnail video versions of a single media source, sent
            concurrently as separate RTP streams.</dd>
          <dt>Simulcast format:</dt>
          <dd>Different formats of a simulcast
            stream serve the same purpose as alternative RTP payload types in
            nonsimulcast SDP: to allow multiple alternative media formats for
            a given RTP stream. As for multiple RTP payload types on the
            "m=" line in <xref target="RFC3264" format="default">offer/answer</xref>, any one of
            the negotiated alternative formats can be used in a single RTP
            stream at a given point in time, but not more than one (based on
            RTP timestamp). What format is used can change dynamically from
            one RTP packet to another.</dd>
        </dl>
      </section>
      <section numbered="true" toc="default">
        <name>Requirements Language</name>
        <t>
    The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>",
    "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL
    NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>",
    "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>",
    "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are
    to be interpreted as
    described in BCP&nbsp;14 <xref target="RFC2119" format="default"/> <xref
    target="RFC8174" format="default"/> 
    when, and only when, they appear in all capitals, as shown here.
        </t>
      </section>
    </section>
    <section anchor="sec-use-cases" numbered="true" toc="default">
      <name>Use Cases</name>
      <t>The use cases of simulcast described in this document relate to a
      multiparty communication session where one or more central nodes are
      used to adapt the view of the communication session towards individual
      participants and facilitate the media transport between participants.
      Thus, these cases target the RTP mixer type of topology.</t>
      <t>There are two principal approaches for an RTP mixer to provide this
      adapted view of the communication session to each receiving
      participant:</t>
      <ul spacing="normal">
        <li>Transcoding (decoding and re-encoding) received RTP streams with
          characteristics adapted to each receiving participant. This often
          includes mixing or composition of media sources from multiple
          participants into a mixed media source originated by the RTP mixer.
          The main advantage of this approach is that it achieves
	  close-to-optimal adaptation to individual receiving
	  participants. The main
          disadvantages are that it can be very computationally expensive for
          the RTP mixer; that it typically degrades media Quality of Experience (QoE),
          for example by adding end-to-end delay for the receiving participants; and
          that it requires the RTP mixer to have access to media content.</li>
        <li>Switching a subset of all received RTP streams or substreams to
          each receiving participant, where the used subset is typically
          specific to each receiving participant. The main advantages of this
          approach are that it is computationally cheap to the RTP mixer, has
          very limited impact on media QoE, and does not require the RTP mixer
	  to have (full) access to media content. The main disadvantage is
	  that it can be difficult to combine a subset of received RTP streams into a
          perfect fit for the resource situation of a receiving participant. It
          is also a disadvantage that sending multiple RTP streams consumes
          more network resources from the sending participant to the RTP
          mixer.</li>
      </ul>
      <t>The use of simulcast relates to the latter approach, where it is more
      important to reduce the load on the RTP mixer and/or minimize QoE impact
      than to achieve an optimal adaptation of resource usage.</t>
      <section anchor="sec-diverse-receivers" numbered="true" toc="default">
        <name>Reaching a Diverse Set of Receivers</name>
        <t>The media sources provided by a sending participant potentially
        need to reach several receiving participants that differ in terms of
        available resources. The receiver resources that typically differ
        include, but are not limited to:</t>
        <dl newline="false" spacing="normal">
          <dt>Codec:</dt>
          <dd>This includes codec type (such as RTP payload
            format MIME type) and can include codec configuration. Two
            codec resources that differ only in codec configuration are
            "different" if they are not "compatible", such as if they
            differ in video codec profile or the transport packetization
            configuration.</dd>
          <dt>Sampling:</dt>
          <dd>This relates to how the media source is
            sampled, in spatial as well as temporal domain. For video
            streams, spatial sampling affects image resolution, and temporal
            sampling affects video frame rate. For audio, spatial sampling
            relates to the number of audio channels, and temporal sampling
            affects audio bandwidth. This may be used to suit different
            rendering capabilities or needs at the receiving endpoints.</dd>
          <dt>Bitrate:</dt>
          <dd>This relates to the number of bits sent per
            second to transmit the media source as an RTP stream, which
            typically also affects the QoE for the receiving user.</dd>
        </dl>
        <t>Letting the sending participant create a simulcast of a few
        differently configured RTP streams per media source can be a good
        trade-off when using an RTP switch as middlebox, instead of sending a
        single RTP stream and using an RTP mixer to create individual
        transcodings to each receiving participant.</t>
        <t>This requires that the receiving participants can be categorized in
        terms of available resources and that the sending participant can
        choose a matching configuration for a single RTP stream per category
        and media source. For example, a set of receiving participants differ
        only in screen resolution; some are able to display video with at most
        360p resolution, and some support 720p resolution. A sending
        participant can then reach all receivers with best possible resolution
        by creating a simulcast of RTP streams with 360p and 720p resolution
        for each sent video media source.</t>
        <t>The maximum number of simulcasted RTP streams that can be sent is
        mainly limited by the amount of processing and uplink network
        resources available to the sending participant.</t>
      </section>
      <section anchor="sec-application-specific" numbered="true" toc="default">
        <name>Application-Specific Media Source Handling</name>
        <t>The application logic that controls the communication session may
        include special handling of some media sources. It is, for example,
        commonly the case that the media from a sending participant is not
        sent back to itself.</t>
        <t>It is also common that a currently active speaker participant is
        shown in larger size or higher quality than other participants (the
        sampling or bitrate aspects of <xref target="sec-diverse-receivers"
	format="default"/>)
        in a receiving client. Many conferencing systems do not send the
        active speaker's media back to the sender itself, which means there is
        some other participant's media that instead is forwarded to the active
        speaker -- typically the previous active speaker. This way, the
        previously active speaker is needed both in larger size (to current
        active speaker) and in small size (to the rest of the participants),
        which can be solved with a simulcast from the previously active
        speaker to the RTP switch.</t>
      </section>
      <section anchor="sec-receiver-preferences" numbered="true" toc="default">
        <name>Receiver Media-Source Preferences</name>
        <t>The application logic that controls the communication session may
        allow receiving participants to state preferences on the
        characteristics of the RTP stream they would like to receive, for example, in
        terms of the aspects listed in <xref target="sec-diverse-receivers" format="default"/>.
        Sending a simulcast of RTP streams is one way of accommodating
        receivers with conflicting or otherwise incompatible preferences.</t>
      </section>
    </section>
    <section anchor="sec-overview" numbered="true" toc="default">
      <name>Overview</name>
      <t>This memo defines <xref target="RFC4566" format="default">SDP</xref> signaling that
      covers the above described simulcast use cases and functionalities. A
      number of requirements for such signaling are elaborated in <xref target="sec-requirements" format="default"/>.</t>

      <t>The Restriction Identifier (RID) mechanism, as defined in <xref target="RFC8851" format="default"/>, enables an SDP offerer or answerer to
      specify a number of different RTP stream restrictions for a rid-id by
      using the "a=rid" line. Examples of such restrictions are maximum
      bitrate, maximum spatial video resolution (width and height), maximum
      video frame rate, etc. Each rid-id may also be restricted to use only a
      subset of the RTP payload types in the associated SDP media description.
      Those RTP payload types can have their own configurations and parameters
      affecting what can be sent or received, using the "a=fmtp" line as well
      as other SDP attributes.</t>
      <t>A new SDP media-level attribute, "a=simulcast", is defined. The
      attribute describes, independently for "send" and "receive" directions, the
      number of simulcast RTP streams as well as potential alternative formats
      for each simulcast RTP stream. Each simulcast RTP stream, including
      alternatives, is identified using the RID identifier (rid-id), defined
      in <xref target="RFC8851" format="default"/>.</t>

<sourcecode type="sdp">
a=simulcast:send 1;2,3 recv 4
</sourcecode>

      <t>If this line is included in an SDP offer, the "send" part
      indicates the offerer's capability and proposal to send two simulcast
      RTP streams. Each simulcast stream is described by one or more RTP
      stream identifiers (rid-ids), and each group of rid-ids for a simulcast
      stream is separated by a semicolon (";"). When a simulcast stream has
      multiple rid-ids that are separated by a comma (","), they describe
      alternative representations for that particular simulcast RTP stream.
      Thus, the "send" part shown above is interpreted as an intention to send two
      simulcast RTP streams. The first simulcast RTP stream is identified and
      restricted according to rid-id 1. The second simulcast RTP stream can be
      sent as two alternatives, identified and restricted according to rid-ids
      2 and 3. The "recv" part of the line shown here indicates that the offerer
      desires to receive a single RTP stream (no simulcast) according to
      rid-id 4.</t>
      <t>A more complete example SDP-offer media description is provided
      in <xref target="fig-md-offer" format="default"/>.</t>

      <figure anchor="fig-md-offer">
        <name>Example Simulcast Media Description in Offer</name>
<sourcecode type="sdp">
m=video 49300 RTP/AVP 97 98 99
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=rtpmap:99 VP8/90000
a=fmtp:97 profile-level-id=42c01f;max-fs=3600;max-mbps=108000
a=fmtp:98 profile-level-id=42c00b;max-fs=240;max-mbps=3600
a=fmtp:99 max-fs=240; max-fr=30
a=rid:1 send pt=97;max-width=1280;max-height=720
a=rid:2 send pt=98;max-width=320;max-height=180
a=rid:3 send pt=99;max-width=320;max-height=180
a=rid:4 recv pt=97
a=simulcast:send 1;2,3 recv 4
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
</sourcecode>
      </figure>

      <t>The SDP media description in <xref target="fig-md-offer" format="default"/> can be
      interpreted at a high level to
      say that the offerer is capable of sending two simulcast RTP streams:
      one H.264 encoded stream in up to 720p resolution, and one additional
      stream encoded as either H.264 or VP8 with a maximum resolution of
      320x180 pixels. The offerer can receive one H.264 stream with maximum
      720p resolution.</t>
      <t>The receiver of this SDP offer can generate an SDP answer that
      indicates what it accepts. It uses the "a=simulcast" attribute to
      indicate simulcast capability and specify what simulcast RTP streams and
      alternatives to receive and/or send. An example of such an answering
      "a=simulcast" attribute, corresponding to the above offer, is:</t>

<sourcecode type="sdp">
a=simulcast:recv 1;2 send 4
</sourcecode>

      <t>With this SDP answer, the answerer indicates in the "recv" part that
      it wants to receive the two simulcast RTP streams. It has removed an
      alternative that it doesn't support (rid-id 3). The "send" part confirms
      to the offerer that the answerer will send one stream for this media source
      according to rid-id 4. The corresponding, more complete example SDP
      answer media description could look like <xref target="fig-md-answer" format="default"/>.</t>

      <figure anchor="fig-md-answer">
        <name>Example Simulcast Media Description in Answer</name>
<sourcecode type="sdp">
m=video 49674 RTP/AVP 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=fmtp:97 profile-level-id=42c01f;max-fs=3600;max-mbps=108000
a=fmtp:98 profile-level-id=42c00b;max-fs=240;max-mbps=3600
a=rid:1 recv pt=97;max-width=1280;max-height=720
a=rid:2 recv pt=98;max-width=320;max-height=180
a=rid:4 send pt=97
a=simulcast:recv 1;2 send 4
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
</sourcecode>
      </figure>

      <t>It is assumed that a single SDP media description is used to describe
      a single media source. This is aligned with the concepts defined in
      <xref target="RFC7656" format="default"/> and will work in a WebRTC context, both with
      and without BUNDLE grouping of media descriptions <xref target="RFC8843" format="default"/>.</t>
      <t>To summarize, the "a=simulcast" line describes "send"- and
      "receive"-direction simulcast streams separately. Each direction can in
      turn describe one or more simulcast streams, separated by semicolons. The
      identifiers describing simulcast streams on the "a=simulcast" line are
      rid-ids, as defined by "a=rid" lines in <xref target="RFC8851" format="default"/>. Each simulcast stream can be offered as
      a list of alternative rid-ids, with each alternative separated by a comma
      as shown in the example offer in <xref target="fig-md-offer"/>. A detailed specification can be found in
      <xref target="sec-details" format="default"/>, and more detailed examples are outlined in
      <xref target="sec-ex" format="default"/>.</t>
    </section>
    <section anchor="sec-details" numbered="true" toc="default">
      <name>Detailed Description</name>
      <t>This section provides further details to the overview in <xref target="sec-overview" format="default"/>. First, formal syntax is <xref target="sec-attr" format="default">provided</xref>, followed by the rest of the SDP
      attribute definition in <xref target="sec-cap" format="default"/>. <xref target="sec-relating" format="default">"Relating Simulcast Streams"</xref> provides the
      definition of the RTP/RTCP mechanisms used. The section concludes
      with a number of examples.</t>
      <section anchor="sec-attr" numbered="true" toc="default">
        <name>Simulcast Attribute</name>
        <t>This document defines a new SDP media-level "a=simulcast"
        attribute, with value according to the syntax in <xref target="fig-abnf" format="default"/>, which uses <xref target="RFC5234" format="default">ABNF</xref> and its update, <xref target="RFC7405" format="default">"Case-Sensitive String Support in ABNF"</xref>:</t>

        <figure anchor="fig-abnf">
          <name>ABNF for Simulcast Value</name>
<sourcecode type="abnf">
sc-value     = ( sc-send [SP sc-recv] ) / ( sc-recv [SP sc-send] )
sc-send      = %s"send" SP sc-str-list
sc-recv      = %s"recv" SP sc-str-list
sc-str-list  = sc-alt-list *( ";" sc-alt-list )
sc-alt-list  = sc-id *( "," sc-id )
sc-id-paused = "~"
sc-id        = [sc-id-paused] rid-id
; SP defined in [RFC5234]
; rid-id defined in [RFC8851]
</sourcecode>
        </figure>

        <t>The "a=simulcast" attribute has a parameter in the form of one or
        two simulcast stream descriptions, each consisting of a direction
        ("send" or "recv"), followed by a list of one or more simulcast
        streams. Each simulcast stream consists of one or more alternative
        simulcast formats. Each simulcast format is identified by a simulcast
        stream identifier (rid-id). The rid-id <bcp14>MUST</bcp14> have the form of an RTP
        stream identifier, as described by <xref target="RFC8851" format="default">"RTP Payload Format Restrictions"</xref>.</t>
        <t>In the list of simulcast streams, each simulcast stream is
        separated by a semicolon (";"). Each simulcast stream can, in turn, be
        offered in one or more alternative formats, represented by rid-ids,
        separated by commas (","). Each rid-id can also be specified as
        initially <xref target="RFC7728" format="default">paused</xref>, indicated by
        prepending a "~" to the rid-id. The reason to allow separate initial
        pause states for each rid-id is that pause capability can be specified
        individually for each RTP payload type referenced by a rid-id. Since
        pause capability specified via the "a=rtcp-fb" attribute applies only
        to specified payload types, and a rid-id specified by "a=rid" can refer
        to multiple different payload types, it is infeasible to pause streams
        with a rid-id where any of the related RTP payload types does not have
        pause capability.</t>
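
        <t>As an illustrative sketch of this syntax, each of the following
        lines is a well-formed "a=simulcast" attribute on its own. The rid-id
        values are arbitrary and are assumed to be defined by "a=rid" lines
        in the same media description. The lines are alternatives to each
        other, not a set to be used together in one media description.</t>

<sourcecode type="sdp">
a=simulcast:send 1;2
a=simulcast:send 1;2,3 recv 4
a=simulcast:recv 4 send ~1;2,~3
</sourcecode>

        <t>The first line lists two simulcast streams in the "send" direction.
        The second offers rid-id 3 as an alternative format to rid-id 2 and
        adds a single stream in the "recv" direction. The third lists the
        directions in the opposite order and marks rid-ids 1 and 3 as
        initially paused.</t>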
      </section>
      <section anchor="sec-cap" numbered="true" toc="default">
        <name>Simulcast Capability</name>
        <t>Simulcast capability is expressed through a new media-level <xref target="sec-attr" format="default">SDP attribute, "a=simulcast"</xref>. The use of this
        attribute at the session level is undefined. Implementations of this
        specification <bcp14>MUST NOT</bcp14> use it at the session level and <bcp14>MUST</bcp14> ignore it
        if received at the session level. Extensions to this specification may
        define such session-level usage. Each SDP media description <bcp14>MUST</bcp14>
        contain at most one "a=simulcast" line.</t>
        <t>There are separate and independent sets of simulcast streams in the
        "send" and "receive" directions. When listing multiple directions, each
        direction <bcp14>MUST NOT</bcp14> occur more than once on the same line.</t>
        <t>Simulcast streams using undefined rid-ids <bcp14>MUST NOT</bcp14> be used as valid
        simulcast streams by an RTP stream receiver. The direction for a
        rid-id <bcp14>MUST</bcp14> be aligned with the direction specified for the
        corresponding RTP stream identifier on the "a=rid" line.</t>
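
        <t>A minimal sketch of aligned directions follows. The port, payload
        types, and rid-id values are assumptions chosen for illustration.
        Here, rid-ids 1 and 2 are declared with direction "send" and appear
        only in the "send" part of the "a=simulcast" line, while rid-id 3 is
        declared with direction "recv" and appears only in the "recv"
        part.</t>

<sourcecode type="sdp">
m=video 49300 RTP/AVP 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=rid:1 send pt=97
a=rid:2 send pt=98
a=rid:3 recv pt=97
a=simulcast:send 1;2 recv 3
</sourcecode>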
        <t>The listed number of simulcast streams for a direction sets a limit
        to the number of supported simulcast streams in that direction. The
        order of the listed simulcast streams in the "send" direction suggests
        a proposed order of preference, in decreasing order: the rid-id listed
        first is the most preferred, and subsequent streams have progressively
        lower preference. The order of the listed rid-ids in the "recv"
        direction expresses which simulcast streams are preferred, with
        the leftmost being most preferred. This can be of importance if the
        number of actually sent simulcast streams has to be reduced for some
        reason.</t>
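
        <t>The ordering can be sketched with the fragment below, in which the
        rid-id values and resolution restrictions are assumptions chosen for
        illustration. The 1280x720 stream (rid-id 1) is listed first and is
        therefore the most preferred, followed by rid-id 2 and then rid-id 3.
        One plausible reading is that, if fewer streams can actually be sent,
        the streams at the end of the list are the first candidates to be
        left out; this document does not mandate a specific selection.</t>

<sourcecode type="sdp">
a=rid:1 send max-width=1280;max-height=720
a=rid:2 send max-width=640;max-height=360
a=rid:3 send max-width=320;max-height=180
a=simulcast:send 1;2;3
</sourcecode>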

        <t>rid-ids that have explicit <xref target="RFC5583"
        format="default">dependencies</xref> <xref target="RFC8851"
        format="default"/> on other rid-ids (even in the same media
        description) <bcp14>MAY</bcp14> be used.</t>
        
	<t>Use of more than a single, alternative simulcast format for a
	simulcast stream <bcp14>MAY</bcp14> be specified as part of the
	attribute parameters by expressing the simulcast stream as a
	comma-separated list of alternative rid-ids. The order of the rid-id
	alternatives within a simulcast stream is significant; the rid-id
	alternatives are listed from (left) most preferred to (right) least
	preferred. For the use of simulcast, this overrides the normal codec
	preference as expressed by format-type ordering on the "m=" line,
	using regular SDP rules. This is to enable a separation of general
	codec preferences and simulcast-stream configuration
	preferences. However, the choice of which alternative to use per
	simulcast stream is independent, and there is currently no mechanism
	for the offerer to force the answerer to choose the same alternative
	for multiple simulcast streams.
	</t>
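
        <t>As a sketch of how alternative ordering can differ from the "m="
        line ordering, consider the fragment below. The port, payload types,
        and rid-id values are assumptions chosen for illustration. The "m="
        line lists the H.264 payload types (97 and 98) before VP8 (99), but
        for the first simulcast stream, the VP8-based rid-id 2 is listed
        before the H.264-based rid-id 1 and is therefore the preferred
        alternative for that stream. The second simulcast stream (rid-id 3)
        has no alternatives.</t>

<sourcecode type="sdp">
m=video 49300 RTP/AVP 97 98 99
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=rtpmap:99 VP8/90000
a=rid:1 send pt=97
a=rid:2 send pt=99
a=rid:3 send pt=98
a=simulcast:send 2,1;3
</sourcecode>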

        <t>A simulcast stream can use a codec defined such that the same RTP
        synchronization source (SSRC) can change RTP payload type multiple
        times during a session, possibly even on a per-packet basis. A typical
        example is a speech codec that makes use of formats for <xref
        target="RFC3389" format="default">Comfort Noise</xref> and/or <xref
        target="RFC4733" format="default">dual-tone multifrequency
        (DTMF)</xref>.</t>

        <t>If <xref target="RFC7728" format="default">RTP stream
        pause/resume</xref> is supported, any rid-id <bcp14>MAY</bcp14> be
        prefixed by a "~" character to indicate that the corresponding
        simulcast stream is paused already from the start of the RTP
        session. In this case, support for RTP stream pause/resume
        <bcp14>MUST</bcp14> also be included under the same "m=" line where
        "a=simulcast" is included. All RTP payload types related to such an
        initially paused simulcast stream <bcp14>MUST</bcp14> be listed in the
        SDP as pause/resume capable as specified by <xref target="RFC7728"
        format="default"/> -- e.g., by using the "*" wildcard format for
        "a=rtcp-fb".</t>
        <t>An initially paused simulcast stream in the "send" direction for the
        endpoint sending the SDP <bcp14>MUST</bcp14> be considered equivalent to an
        unsolicited locally paused stream and handled accordingly.
        Initially paused simulcast streams are resumed as described by the RTP
        pause/resume specification. An RTP stream receiver that wishes to
        resume an unsolicited locally paused stream needs to know the SSRC of
        that stream. 

	The SSRC of an initially paused simulcast stream can be obtained from
	an RTCP Sender Report (SR) or Receiver Report (RR) sent by the RTP
	stream sender that includes the desired SSRC as initial SSRC in the source
	description (SDES) chunk, optionally a <xref target="RFC8843"
	format="default">MID SDES item</xref> (if used and if rid-ids are not
	unique across "m=" lines), and the rid-id value in an <xref
	target="RFC8852" format="default">RtpStreamId RTCP SDES
	item</xref>.</t>
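
        <t>A minimal sketch of an initially paused "send" stream follows,
        assuming the RTP/AVPF profile and pause/resume support as specified
        in <xref target="RFC7728" format="default"/>. The port, payload
        types, and rid-id values are assumptions chosen for illustration. The
        wildcard "a=rtcp-fb" line marks all payload types in the media
        description as pause/resume capable, and the "~" prefix marks the
        simulcast stream identified by rid-id 2 as paused from the start of
        the RTP session.</t>

<sourcecode type="sdp">
m=video 49300 RTP/AVPF 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=rtcp-fb:* ccm pause
a=rid:1 send pt=97
a=rid:2 send pt=98
a=simulcast:send 1;~2
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
</sourcecode>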


        <t>If the endpoint sending the SDP includes a "recv"-direction
        simulcast stream that is initially paused, then the remote RTP sender
        receiving the SDP <bcp14>SHOULD</bcp14> put its RTP stream in an unsolicited locally
        paused state. The simulcast stream sender does not put the stream in
        the locally paused state if there are other RTP stream receivers in
        the session that do not mark the simulcast stream as initially paused.
        However, in centralized conferencing, the RTP sender usually does not
        see the SDP signaling from RTP receivers and cannot make this
        determination. The reason for requiring that an initially paused "recv" stream
        be considered locally paused by the remote RTP sender instead of
        making it equivalent to implicitly sending a pause request is that
        the pausing RTP sender cannot know which receiving SSRC owns the
        restriction when Temporary Maximum Media Stream Bit Rate Request
        (TMMBR) and Temporary Maximum Media Stream Bit Rate Notification
        (TMMBN) are used for pause/resume signaling (<xref target="RFC7728"
	sectionFormat="of" section="5.6" />); this is because the RTP
	receiver's SSRC
        in the "send" direction is sometimes not yet known.</t>
        <t>Use of the redundant audio data format <xref target="RFC2198" format="default"/>
	could be seen as a form of simulcast for loss-protection
        purposes, but it is not considered conflicting with the mechanisms
        described in this memo and <bcp14>MAY</bcp14> therefore be used like any other format.
        In this case, the "red" format, rather than the carried formats, <bcp14>SHOULD</bcp14>
        be the one listed as a simulcast stream on the "a=simulcast"
        line.</t>
        <t>The media formats and corresponding characteristics of simulcast
        streams <bcp14>SHOULD</bcp14> be chosen such that they are different -- e.g., as
        different SDP formats with differing "a=rtpmap" and/or "a=fmtp" lines,
        or as differently defined RTP payload format restrictions. If this
        difference is not required, it is <bcp14>RECOMMENDED</bcp14> to use RTP duplication
	procedures <xref target="RFC7104" format="default"/> instead of simulcast. To avoid
	complications in implementations, a single rid-id
        <bcp14>MUST NOT</bcp14> occur more than once per "a=simulcast" line. Note that this
        does not eliminate use of simulcast as an RTP duplication mechanism,
        since it is possible to define multiple different rid-ids that are
        effectively equivalent.</t>
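        <t>As a minimal sketch of that last point (payload type 97, the port,
        and the rid-id values are arbitrary assumptions for this example),
        two distinct rid-ids with identical restrictions can be listed as
        separate simulcast streams, whereas listing rid-id 1 twice on the
        "a=simulcast" line would not be allowed:</t>
<sourcecode type="sdp">
m=video 49300 RTP/AVP 97
a=rtpmap:97 H264/90000
a=rid:1 send pt=97
a=rid:2 send pt=97
a=simulcast:send 1;2
</sourcecode>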
      </section>
      <section anchor="sec-offer-answer" numbered="true" toc="default">
        <name>Offer/Answer Use</name>
<dl>
<dt>Note:</dt> <dd>The inclusion of "a=simulcast" or the use of simulcast
            does not change any of the interpretation or Offer/Answer
            procedures for other SDP attributes, such as "a=fmtp" or "a=rid".</dd>
</dl>
        <section numbered="true" toc="default">
          <name>Generating the Initial SDP Offer</name>
          <t>An offerer wanting to use simulcast for a media description <bcp14>SHALL</bcp14>
          include one "a=simulcast" attribute in that media description in the
          offer. An offerer listing a set of receive simulcast streams and/or
          alternative formats as rid-ids in the offer <bcp14>MUST</bcp14> be prepared to
          receive RTP streams for any of those simulcast streams and/or
          alternative formats from the answerer.</t>
        </section>
        <section numbered="true" toc="default">
          <name>Creating the SDP Answer</name>
          <t>An answerer that does not understand the concept of simulcast
          will also not know the attribute and will remove it in the SDP
          answer, as defined in existing SDP offer/answer procedures <xref target="RFC3264" format="default"/>. Since SDP session-level simulcast is
          undefined in this memo, an answerer that receives an offer with the
          "a=simulcast" attribute on the SDP session level <bcp14>SHALL</bcp14> remove it in the
          answer. An answerer that understands the attribute but receives
          multiple "a=simulcast" attributes in the same media description
          <bcp14>SHALL</bcp14> disable use of simulcast by removing all "a=simulcast" lines
          for that media description in the answer.</t>
          <t>An answerer that does understand the attribute and wants to
          support simulcast in an indicated direction <bcp14>SHALL</bcp14> reverse
          directionality of the unidirectional direction parameters -- "send"
          becomes "recv" and vice versa -- and include it in the answer.</t>
          <t>An answerer that receives an offer with simulcast containing an
          "a=simulcast" attribute listing alternative rid-ids <bcp14>MAY</bcp14> keep all the
          alternative rid-ids in the answer, but it <bcp14>MAY</bcp14> also choose to remove
          any nondesirable alternative rid-ids in the answer. The answerer
          <bcp14>MUST NOT</bcp14> add any alternative rid-ids in the "send" direction in the answer
          that were not present in the offer receive direction. The answerer
          <bcp14>MUST</bcp14> be prepared to receive any of the receive-direction rid-id
          alternatives and <bcp14>MAY</bcp14> send any of the "send"-direction alternatives
          that are part of the answer.</t>
          <t>An answerer that receives an offer with simulcast that lists a
          number of simulcast streams <bcp14>MAY</bcp14> reduce the number of simulcast
          streams in the answer, but it <bcp14>MUST NOT</bcp14> add simulcast streams.</t>
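          <t>The two preceding rules can be sketched with a hypothetical
          exchange; all ports, payload types, and rid-id values below are
          assumptions made for illustration. The offer lists three simulcast
          streams, the first of which has two alternative rid-ids:</t>
<sourcecode type="sdp">
m=video 49300 RTP/AVP 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 VP8/90000
a=rid:1 send pt=97;max-width=1280;max-height=720
a=rid:2 send pt=98;max-width=1280;max-height=720
a=rid:3 send pt=97;max-width=640;max-height=360
a=rid:4 send pt=97;max-width=320;max-height=180
a=simulcast:send 1,2;3;4
</sourcecode>
          <t>One possible answer removes the undesired alternative (rid-id 2)
          and reduces the number of simulcast streams by dropping rid-id 4,
          without adding anything that was not in the offer:</t>
<sourcecode type="sdp">
m=video 49674 RTP/AVP 97
a=rtpmap:97 H264/90000
a=rid:1 recv pt=97;max-width=1280;max-height=720
a=rid:3 recv pt=97;max-width=640;max-height=360
a=simulcast:recv 1;3
</sourcecode>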
          <t>An answerer that receives an offer without RTP stream
          pause/resume capability <bcp14>MUST NOT</bcp14> mark any simulcast streams as
          initially paused in the answer.</t>
          <t>An RTP stream answerer capable of pause/resume that receives an
          offer with RTP stream pause/resume capability <bcp14>MAY</bcp14> mark any rid-ids
          that refer to pause/resume capable formats as initially paused in
          the answer.</t>
          <t>An answerer that receives indication in an offer of a rid-id
          being initially paused <bcp14>SHOULD</bcp14> mark that rid-id as initially paused
          also in the answer, regardless of direction, unless it has good
          reason for the rid-id not being initially paused. One reason to
          remove an initial pause in the answer compared to the offer could be,
          for example, that all "receive"-direction simulcast streams for a
          media source the answerer accepts in the answer would otherwise be
          paused.</t>
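          <t>A hypothetical exchange illustrating such a reason is sketched
          below; the ports, payload type, and rid-id values are assumptions
          for this example only. The offer marks the lower-resolution stream
          as initially paused:</t>
<sourcecode type="sdp">
m=video 49300 RTP/AVPF 97
a=rtpmap:97 H264/90000
a=rtcp-fb:* ccm pause nowait
a=rid:1 send pt=97;max-width=1280;max-height=720
a=rid:2 send pt=97;max-width=640;max-height=360
a=simulcast:send 1;~2
</sourcecode>
          <t>If the answerer accepts only rid-id 2, keeping the initial pause
          would leave it with no active stream to receive, so in this sketch
          it answers without the "~" prefix:</t>
<sourcecode type="sdp">
m=video 49674 RTP/AVPF 97
a=rtpmap:97 H264/90000
a=rtcp-fb:* ccm pause nowait
a=rid:2 recv pt=97;max-width=640;max-height=360
a=simulcast:recv 2
</sourcecode>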
        </section>
        <section numbered="true" toc="default">
          <name>Offerer Processing the SDP Answer</name>
          <t>An offerer that receives an answer without "a=simulcast" <bcp14>MUST NOT</bcp14>
          use simulcast towards the answerer. An offerer that receives an
          answer with "a=simulcast" without any rid-id in a specified
          direction <bcp14>MUST NOT</bcp14> use simulcast in that direction.</t>
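          <t>For illustration, assume the following hypothetical offer, which
          proposes simulcast in both directions (ports, payload type, and
          rid-id values are arbitrary):</t>
<sourcecode type="sdp">
m=video 49300 RTP/AVP 97
a=rtpmap:97 H264/90000
a=rid:1 send pt=97;max-width=1280;max-height=720
a=rid:2 send pt=97;max-width=320;max-height=180
a=rid:3 recv pt=97;max-width=1280;max-height=720
a=rid:4 recv pt=97;max-width=320;max-height=180
a=simulcast:send 1;2 recv 3;4
</sourcecode>
          <t>If the answer contains only the reversed "recv" part, as
          sketched below, the offerer can send simulcast streams 1 and 2 but
          does not use simulcast in its own "receive" direction:</t>
<sourcecode type="sdp">
m=video 49674 RTP/AVP 97
a=rtpmap:97 H264/90000
a=rid:1 recv pt=97;max-width=1280;max-height=720
a=rid:2 recv pt=97;max-width=320;max-height=180
a=simulcast:recv 1;2
</sourcecode>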
          <t>An offerer that receives an answer where some rid-id alternatives
          are kept <bcp14>MUST</bcp14> be prepared to receive any of the kept "send"-direction
          rid-id alternatives and <bcp14>MAY</bcp14> send any of the kept "receive"-direction
          rid-id alternatives.</t>
          <t>An offerer that receives an answer where some of the rid-ids are
          removed compared to the offer <bcp14>MAY</bcp14> release the corresponding
          resources (codec, transport, etc.) in its "receive" direction and <bcp14>MUST
          NOT</bcp14> send any RTP packets corresponding to the removed rid-ids.</t>
          <t>An offerer that offered some of its rid-ids as initially paused
          and receives an answer that does not indicate RTP stream
          pause/resume capability <bcp14>MUST NOT</bcp14> initially pause any simulcast
          streams.</t>
          <t>An offerer with RTP stream pause/resume capability that receives
          an answer where some rid-ids are marked as initially paused <bcp14>SHOULD</bcp14>
          initially pause those RTP streams, even if they were marked as
          initially paused also in the offer, unless it has good reason for
          those RTP streams not being initially paused. One such reason could be,
          for example, that the answerer would otherwise initially not
          receive any media of that type at all.</t>
        </section>
        <section numbered="true" toc="default">
          <name>Modifying the Session</name>
          <t>Offers inside an existing session follow the same rules as for
          an initial SDP offer, with these additions:</t>
          <ol spacing="normal" type="1">
            <li>rid-ids marked as initially paused in the offerer's "send"
              direction <bcp14>SHALL</bcp14> reflect the offerer's opinion of the current
              pause state at the time of creating the offer. This is purely
              informational, and RTP stream
              pause/resume signaling <xref target="RFC7728" format="default"/> in the ongoing
	      session <bcp14>SHALL</bcp14> take precedence in case of any conflict or
	      ambiguity.</li>
            <li>rid-ids marked as initially paused in the offerer's "receive"
              direction <bcp14>SHALL</bcp14> (as in an initial offer) reflect the offerer's
              desired rid-id pause state. Except for the case where the
              offerer already paused the corresponding RTP stream through
              <xref target="RFC7728" format="default">RTP stream pause/resume</xref> signaling,
	      this is identical to the conditions at an initial offer.</li>
          </ol>
          <t>Creation of SDP answers and processing of SDP answers inside an
          existing session follow the same rules as described above for
          initial SDP offer/answer.</t>
          <t>Session modification restrictions in <xref
	  target="RFC8851" sectionFormat="of" section="6.5">"RTP Payload Format
	  Restrictions"</xref>
          also apply.</t>
        </section>
      </section>
      <section numbered="true" toc="default">
        <name>Use with Declarative SDP</name>
        <t>This document does not define the use of "a=simulcast" in
        declarative SDP, partly because use of the <xref target="RFC8851" format="default">simulcast format identification</xref>
        is not defined for use in declarative SDP. If concrete use cases
        for simulcast in declarative SDP are identified in the future, the
        authors of this memo expect that additional specifications will
        address such use.</t>
      </section>
      <section anchor="sec-relating" numbered="true" toc="default">
        <name>Relating Simulcast Streams</name>
        <t>Simulcast RTP streams <bcp14>MUST</bcp14> be related on the RTP
	level through <xref target="RFC8852"
	format="default">RtpStreamId</xref>, as specified in the
        SDP <xref target="sec-cap" format="default">"a=simulcast" attribute
	</xref> parameters.
        This is sufficient as long as there is only a single media source per
        SDP media description. When using <xref target="RFC8843" format="default">BUNDLE</xref>, where
        multiple SDP media descriptions jointly specify a single RTP session,
        the SDES MID (Media Identification) mechanism in BUNDLE allows relating RTP
        streams back to individual media descriptions, after which the
	RtpStreamId relations described above can be used. 

	Use of the RTP header extension for the <xref target="RFC7941">RTCP
	source description items</xref> for both MID
	and RtpStreamId identifications can be important to ensure rapid
	initial reception, required to correctly interpret and process the RTP
	streams. Implementers of this specification <bcp14>MUST</bcp14>
	support the RTCP source description (SDES) item method and
	<bcp14>SHOULD</bcp14> support the RTP header extension method to signal
	RtpStreamId on the RTP level.</t>
        <dl newline="false" spacing="normal">
          <dt>NOTE:</dt>
          <dd>For the case where it is clear from SDP that the
            RTP PT uniquely maps to a corresponding RtpStreamId, an RTP receiver
            can use RTP PT to relate simulcast streams. This can sometimes
            enable decoding even in advance of receiving RtpStreamId
            information in RTCP SDES and/or RTP header extensions.</dd>
        </dl>
        <t>RTP streams <bcp14>MUST</bcp14> only use a single alternative rid-id at a time
        (based on RTP timestamps) but <bcp14>MAY</bcp14> change format (and rid-id) on a
        per-RTP packet basis. This corresponds to the existing (nonsimulcast)
        SDP offer/answer case when multiple formats are included on the "m="
        line in the SDP answer, enabling per-RTP packet change of RTP payload
        type.</t>
      </section>
      <section anchor="sec-ex" numbered="true" toc="default">
        <name>Signaling Examples</name>
        <t>These examples describe a client-to-video-conference service, using
        a centralized media topology with an RTP mixer.</t>

        <figure anchor="fig-mixer-four-party">
          <name>Four-Party Mixer-Based Conference</name>
          <artwork align="center" name="" type="" alt=""><![CDATA[
+---+      +-----------+      +---+
| A |<---->|           |<---->| B |
+---+      |           |      +---+
           |   Mixer   |
+---+      |           |      +---+
| F |<---->|           |<---->| J |
+---+      +-----------+      +---+]]></artwork>
        </figure>

        <section anchor="sec-ex-single-source" numbered="true" toc="default">
          <name>Single-Source Client</name>
          <t>Alice is calling in to the mixer with a simulcast-enabled client
          capable of a single media source per media type. The client can send
          a simulcast of 2 video resolutions and frame rates: HD 1280x720p
          30fps and thumbnail 320x180p 15fps. This is defined below using the
          <xref target="RFC6236" format="default">"imageattr"</xref> attribute. In this example, only the
          "pt" "a=rid" parameter is used to
          describe simulcast stream formats, effectively achieving a 1:1 mapping
          between RtpStreamId and media formats (RTP payload types). Alice's Offer:</t>

          <figure anchor="fig-up-offer">
            <name>Single-Source Simulcast Offer</name>
<sourcecode type="sdp">
v=0
o=alice 2362969037 2362969040 IN IP4 192.0.2.156
s=Simulcast-Enabled Client
c=IN IP4 192.0.2.156
t=0 0
m=audio 49200 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 49300 RTP/AVP 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=fmtp:97 profile-level-id=42c01f;max-fs=3600;max-mbps=108000
a=fmtp:98 profile-level-id=42c00b;max-fs=240;max-mbps=3600
a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720]
a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
a=rid:1 send pt=97
a=rid:2 send pt=98
a=rid:3 recv pt=97
a=simulcast:send 1;2 recv 3
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
</sourcecode>
          </figure>

          <t>The only thing in the SDP that indicates simulcast capability is
          the line in the video media description containing the "simulcast"
          attribute. The included "a=fmtp" and "a=imageattr" parameters
          indicate that sent simulcast streams can differ in video
          resolution. The RTP header extension for RtpStreamId is offered to
          avoid issues with the initial binding between RTP streams (SSRCs)
          and the RtpStreamId identifying the simulcast stream and its
          format.</t>
          <t>The answer from the server indicates that it, too, is simulcast
          capable. Should it not have been simulcast capable, the
          "a=simulcast" line would not have been present, and communication
          would have started with the media negotiated in the SDP. Also, the
          usage of the RtpStreamId RTP header extension is accepted.</t>

          <figure anchor="fig-up-answer">
            <name>Single-Source Simulcast Answer</name>
<sourcecode type="sdp">
v=0
o=server 823479283 1209384938 IN IP4 192.0.2.2
s=Answer to Simulcast-Enabled Client
c=IN IP4 192.0.2.43
t=0 0
m=audio 49672 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 49674 RTP/AVP 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=fmtp:97 profile-level-id=42c01f;max-fs=3600;max-mbps=108000
a=fmtp:98 profile-level-id=42c00b;max-fs=240;max-mbps=3600
a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720]
a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
a=rid:1 recv pt=97
a=rid:2 recv pt=98
a=rid:3 send pt=97
a=simulcast:recv 1;2 send 3
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
</sourcecode></figure>

          <t>Since the server is the simulcast media receiver, it reverses the
          direction of the "simulcast" and "rid" attribute parameters.</t>
        </section>
        <section anchor="sec-ex-multi-source" numbered="true" toc="default">
          <name>Multisource Client</name>
          <t>Fred is calling in to the same conference as in the example above
          with a two-camera, two-display system, thus capable of handling two
          separate media sources in each direction, where each media source is
          simulcast enabled in the "send" direction. Fred's client is restricted
          to a single media source per media description.</t>
          <t>The first two simulcast streams for the first media source use
          different codecs, <xref target="RFC6190" format="default">H264-SVC</xref> and <xref target="RFC6184" format="default">H264</xref>. These two simulcast streams also have
          a temporal dependency. Two different video codecs, <xref target="RFC7741" format="default">VP8</xref> and H264, are offered as alternatives
          for the third simulcast stream for the first media source. Only the
          highest-fidelity simulcast stream is sent from the start; the
	  lower-fidelity streams are initially paused.</t>
          <t>The second media source is offered with three different simulcast
          streams. All video streams of this second media source are loss
          protected by <xref target="RFC4588" format="default">RTP retransmission</xref>. In
	  addition, all but the highest-fidelity simulcast stream are
	  initially paused. Note that the lower-resolution simulcast stream is
	  given higher priority than the medium-resolution simulcast stream.</t>
          <t>Fred's client is also using BUNDLE to send all RTP streams from
          all media descriptions in the same RTP session on a single media
          transport. Although this example uses many different simulcast
          streams, the use of RtpStreamId as simulcast stream identification
          keeps the number of RTP payload types low.

	  Note that when using both <xref target="RFC8843"
	  format="default">BUNDLE</xref> and <xref target="RFC8851"
	  format="default">"a=rid"</xref>, it is recommended to use the RTP
	  header extension for the <xref target="RFC7941" format="default">RTCP
	  source description items</xref> for carrying
	  these RTP stream-identification fields, which is consequently also
	  included in the SDP.

Note also that for "a=rid",
	  the corresponding RtpStreamId SDES attribute RTP header extension is
	  named <xref target="RFC8852"
	  format="default">rtp-stream-id</xref>.</t>

          <figure anchor="fig-ms-offer">
            <name>Fred's Multisource Simulcast Offer</name>
<sourcecode type="sdp">
v=0
o=fred 238947129 823479223 IN IP6 2001:db8::c000:27d
s=Offer from Simulcast-Enabled Multi-Source Client
c=IN IP6 2001:db8::c000:27d
t=0 0
a=group:BUNDLE foo bar zen
m=audio 49200 RTP/AVP 99
a=mid:foo
a=rtpmap:99 G722/8000
m=video 49600 RTP/AVPF 100 101 103
a=mid:bar
a=rtpmap:100 H264-SVC/90000
a=rtpmap:101 H264/90000
a=rtpmap:103 VP8/90000
a=fmtp:100 profile-level-id=42400d;max-fs=3600;max-mbps=216000; \
    mst-mode=NI-TC
a=fmtp:101 profile-level-id=42c00d;max-fs=3600;max-mbps=108000
a=fmtp:103 max-fs=900; max-fr=30
a=rid:1 send pt=100;max-width=1280;max-height=720;max-fps=60;depend=2
a=rid:2 send pt=101;max-width=1280;max-height=720;max-fps=30
a=rid:3 send pt=101;max-width=640;max-height=360
a=rid:4 send pt=103;max-width=640;max-height=360
a=depend:100 lay bar:101
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=rtcp-fb:* ccm pause nowait
a=simulcast:send 1;2;~4,3
m=video 49602 RTP/AVPF 96 104
a=mid:zen
a=rtpmap:96 VP8/90000
a=fmtp:96 max-fs=3600; max-fr=30
a=rtpmap:104 rtx/90000
a=fmtp:104 apt=96;rtx-time=200
a=rid:1 send max-fs=921600;max-fps=30
a=rid:2 send max-fs=614400;max-fps=15
a=rid:3 send max-fs=230400;max-fps=30
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
a=rtcp-fb:* ccm pause nowait
a=simulcast:send 1;~3;~2
</sourcecode>
          </figure>

        </section>
        <section numbered="true" toc="default">
          <name>Simulcast and Redundancy</name>
          <t>The example in this section looks at applying simulcast with 
          audio and video redundancy formats. 

       The audio media description uses codec and bitrate restrictions,
       combined with the <xref target="RFC2198" format="default">RTP
       payload for redundant audio data</xref> for enhanced packet-loss
       resilience. The video media description applies both resolution and
       bitrate restrictions, combined with Forward Error Correction (FEC)
       in the form of <xref target="RFC8627" format="default">flexible
       FEC</xref> and <xref target="RFC4588" format="default">RTP
       retransmission</xref>.</t>

	  <t>
	    The audio source is offered to be sent as two simulcast
	    streams. The first simulcast stream is encoded with Opus,
	    restricted to 64 kbps (rid-id=1), and the second simulcast stream
	    (rid-id=2) is encoded with either G.711, or G.711 combined with
	    linear predictive coding (LPC) for redundancy and explicit comfort
	    noise (CN). Both simulcast streams include telephone-event
	    capability. In this example, stand-alone LPC is not offered as a
	    possible payload type for the second simulcast stream's RID; one
	    possible motivation is that stand-alone LPC would not provide
	    sufficient quality.
	  </t>

          <t>The video source is offered to be sent as two simulcast streams,
          both with two alternative simulcast formats. Redundancy and repair
          are offered in the form of both flexible FEC and RTP retransmission.
          The flexible FEC is not bound to any particular RTP streams and is
          therefore able to be used across all RTP streams that are being sent
          as part of this media description.</t>

          <figure anchor="fig-sim-red">
            <name>Simulcast and Redundancy Example</name>
<sourcecode type="sdp">
v=0
o=fred 238947129 823479223 IN IP6 2001:db8::c000:27d
s=Offer from Simulcast-Enabled Client using Redundancy
c=IN IP6 2001:db8::c000:27d
t=0 0
a=group:BUNDLE foo bar
m=audio 49200 RTP/AVP 97 98 99 100 101 102
a=mid:foo
a=rtpmap:97 G711/8000
a=rtpmap:98 LPC/8000
a=rtpmap:99 OPUS/48000/1
a=rtpmap:100 RED/8000/1
a=rtpmap:101 CN/8000
a=rtpmap:102 telephone-event/8000
a=fmtp:99 useinbandfec=1;usedtx=0
a=fmtp:100 97/98
a=fmtp:102 0-15
a=ptime:20
a=maxptime:40
a=rid:1 send pt=99,102;max-br=64000
a=rid:2 send pt=100,97,101,102
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=simulcast:send 1;2
m=video 49600 RTP/AVPF 103 104 105 106 107
a=mid:bar
a=rtpmap:103 H264/90000
a=rtpmap:104 VP8/90000
a=rtpmap:105 rtx/90000
a=rtpmap:106 rtx/90000
a=rtpmap:107 flexfec/90000
a=fmtp:103 profile-level-id=42c00d;max-fs=3600;max-mbps=108000
a=fmtp:104 max-fs=3600; max-fr=30
a=fmtp:105 apt=103;rtx-time=200
a=fmtp:106 apt=104;rtx-time=200
a=fmtp:107 repair-window=100000
a=rid:1 send pt=103;max-width=1280;max-height=720;max-fps=30
a=rid:2 send pt=104;max-width=1280;max-height=720;max-fps=30
a=rid:3 send pt=103;max-width=640;max-height=360;max-br=300000
a=rid:4 send pt=104;max-width=640;max-height=360;max-br=300000
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
a=rtcp-fb:* ccm pause nowait
a=simulcast:send 1,2;3,4
</sourcecode>
          </figure>


        </section>
      </section>
    </section>
    <section anchor="sec-rtp-aspects" numbered="true" toc="default">
      <name>RTP Aspects</name>
      <t>This section discusses what the different entities in a simulcast
      media path can expect to happen on the RTP level. This is explored from
      source to sink by starting in an endpoint with a media source that is
      simulcast to an RTP middlebox. That RTP middlebox sends media sources
      to other RTP middleboxes (cascaded middleboxes), as well as
      selecting some simulcast format of the media source and sending it to
      receiving endpoints. Different types of RTP middleboxes and their usage
      of the different simulcast formats result in several different
      behaviors.</t>
      <section numbered="true" toc="default">
        <name>Outgoing from Endpoint with Media Source</name>
        <t>The most straightforward simulcast case is the RTP streams being
        emitted from the endpoint that originates a media source. When
        simulcast has been negotiated in the sending direction, the endpoint
        can transmit up to the number of RTP streams needed for the negotiated
        simulcast streams for that media source. Each RTP stream (SSRC) is
        identified by associating it (<xref target="sec-relating" format="default"/>) with
        an RtpStreamId SDES item, transmitted in RTCP and possibly also as an
        RTP header extension. In cases where multiple media sources have been
        negotiated for the same RTP session and thus <xref target="RFC8843" format="default">BUNDLE</xref> is used, the MID SDES item will also be
	sent, similarly to the RtpStreamId.</t>
        <t>Each RTP stream might not be continuously transmitted due to any of
        the following reasons: temporarily paused using <xref target="RFC7728" format="default">Pause/Resume</xref>, sender-side application logic
        temporarily pausing it, or lack of network resources to transmit this
        simulcast stream. However, all simulcast streams that have been
        negotiated have active and maintained SSRCs (at least in regular RTCP
        reports), even if no RTP packets are currently transmitted. The
        relation between an RTP stream (SSRC) and a particular simulcast
        stream is not expected to change, except in exceptional situations
        such as SSRC collisions. The use of MID and RtpStreamId should
        enable the receiver to correctly identify the RTP streams even
        after an SSRC change.</t>
      </section>
      <section numbered="true" toc="default">
        <name>RTP Middlebox to Receiver</name>
        <t>RTP streams in a multiparty RTP session can be used in multiple
        different ways when the session utilizes simulcast at least on the
        media-source-to-middlebox legs. This is to a large degree due to the
        different RTP middlebox behaviors, but also the needs of the
        application. This text assumes that the RTP middlebox will select a
        media source and choose which simulcast stream for that media source
        to deliver to a specific receiver. In many cases, at most one
        simulcast stream per media source will be forwarded to a particular
        receiver at any instant in time, even if the selected simulcast stream
        may vary. For cases where this does not hold due to application needs,
        the RTP stream aspects will fall under the middlebox-to-middlebox
        case (<xref target="sec-rtp-box-box" format="default"/>).</t>
        <t>The selection of which simulcast streams to forward towards the
        receiver is application specific. However, in conferencing
        applications, active speaker selection is common. In case the number
        of media sources possible to forward, N, is less than the total number
        of media sources available in a multimedia session, the current and
        previous speakers (up to N in total) are often the ones forwarded. To
        avoid the need for media-specific processing to determine the current
        speaker(s) in the RTP middlebox, the endpoint providing a media source
        may include metadata, such as the <xref target="RFC6464" format="default">RTP header
        extension for client-to-mixer audio level indication</xref>.</t>
        <t>The possibilities for stream switching are media type specific, but
        for media types with significant interframe dependencies in the
        encoding, like most video coding, the switching needs to be made at
        suitable switching points in the media stream that break or otherwise
        deal with the dependency structure. Even if switching points can be
        included periodically, it is common to use mechanisms like <xref target="RFC5104" format="default">Full Intra Requests</xref> to request switching
        points from the endpoint performing the encoding of the media
        source.</t>
        <t>Inclusion of the RtpStreamId SDES item for an SSRC in the
	middlebox-to-receiver direction should only occur when use of
	RtpStreamId has
        been negotiated in that direction. It is worth noting that one can
        signal multiple RtpStreamIds when simulcast signaling indicates only
        a single simulcast stream, allowing one to use all of the RtpStreamIds
        as alternatives for that simulcast stream. One reason for including
        the RtpStreamId in the middlebox-to-receiver direction for an RTP
        stream is to let the receiver know which restrictions apply to the
        currently delivered RTP stream. In case the RtpStreamId is negotiated
        to be used, it is important to remember that the used identifiers will
        be specific to each signaling session. Even if the central entity can
        attempt to coordinate, it is likely that the RtpStreamIds need to be
        translated to the leg-specific values. The cases below assume
	that RtpStreamId is not used in the mixer-to-receiver
        direction.</t>
        <section numbered="true" toc="default">
          <name>Media-Switching Mixer</name>
          <t>This section discusses the behavior in cases where the RTP
          middlebox behaves like the media-switching mixer in
          RTP topologies (<xref target="RFC7667"
	  sectionFormat="of" section="3.6.2"/>). The
	  fundamental aspect
          here is that the media sources delivered from the middlebox will be
          the mixer's conceptual or functional ones. For example, one media
          source may be the main speaker in high-resolution video, while a
          number of other media sources are thumbnails of each
          participant.</t>
          <t>The above results in the RTP stream produced by the mixer being
          one that switches between a number of received incoming RTP streams
          for different media sources and in different simulcast versions. The
          mixer selects the media source to be sent as one of the RTP streams
          and then selects among the available simulcast streams for the most
          appropriate one. The selection criteria include available bandwidth
          on the mixer-to-receiver path and restrictions based on the
          functional usage of the RTP stream delivered to the receiver. As an
          example of the latter, it is unnecessary to forward a full HD video
          to a receiver if the display area is just a thumbnail. Thus,
          restrictions may exist to not allow some simulcast streams to be
          forwarded for some of the mixer's media sources.</t>
          <t>This will result in a single RTP stream being used for each of
          the RTP mixer's media sources. At any point in time, this RTP stream
	  is a selection of one particular RTP stream arriving at the mixer,
          where the RTP header-field values are rewritten to provide a
          consistent, single RTP stream. If the RTP mixer doesn't receive any
          incoming stream matched to this media source, the SSRC will not
          transmit but be kept alive using RTCP. The SSRC and thus RTP stream
          for the mixer's media source is expected to be long-term stable. It
          will only be changed by signaling or other disruptive events. Note
          that although the above talks about a single RTP stream, there can
          in some cases be multiple RTP streams carrying the selected
          simulcast stream for the originating media source, including
          redundancy or other auxiliary RTP streams.</t>

          <t>The mixer may communicate the identity of the originating media
          source to the receiver by including the Contributing Source (CSRC) field with the
          originating media source's SSRC value. Note that due to the
          possibility that the RTP mixer switches between simulcast versions
          of the media source, the CSRC value may change, even if the media
          source is kept the same.</t>

          <t>It is important to note that any MID SDES item from the
          originating media source needs to be removed and not be associated
          with the RTP stream's SSRC. That is, there is nothing in the
          signaling between the mixer and the receiver that is structured
          around the originating media sources, only the mixer's media
          sources. If such a MID were associated with the SSRC, the receiver
          would likely believe that there has been an SSRC collision and
          that the RTP stream is spurious, because it doesn't carry the
          identifiers used to relate it to the correct context. However,
          this is not true for CSRC values, as long as they are never used
          as an SSRC. In these cases, one could provide CNAME and MID as
          SDES items. A receiver could use this to determine which CSRC
          values are associated with the same originating media source.</t>
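          <t>As an illustrative sketch only, the SDP fragment below shows how
          the signaling on the mixer-to-receiver leg could be structured
          around the mixer's own media sources. The "a=mid" values "main" and
          "thumb1", the port numbers, and the payload type are hypothetical
          and chosen for this example; they are unrelated to any MID values
          used on the originating side.</t>
          <sourcecode type="sdp"><![CDATA[
a=group:BUNDLE main thumb1
m=video 49300 RTP/AVPF 96
a=mid:main
a=sendonly
a=rtpmap:96 H264/90000
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
m=video 49302 RTP/AVPF 96
a=mid:thumb1
a=sendonly
a=rtpmap:96 H264/90000
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
]]></sourcecode>
          <t>In this sketch, a MID SDES item sent by the mixer would carry
          "main" or "thumb1", never a MID value received from an originating
          endpoint.</t>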
          <t>If RtpStreamIds are used in the scenario described by this
          section, it should be noted that the RtpStreamId on a particular
          SSRC will change based on the actual simulcast stream selected for
          switching. These RtpStreamId identifiers will be local to this leg's
          signaling context. In addition, the defined RtpStreamIds and their
          parameters need to cover all the media sources and simulcast streams
          received by the RTP mixer that can be switched into this media
          source, sent by the RTP mixer.</t>
        </section>
        <section numbered="true" toc="default">
          <name>Selective Forwarding Middlebox</name>
          <t>This section discusses the behavior in cases where the RTP
          middlebox behaves like the Selective Forwarding Middlebox in RTP
	  topologies (<xref target="RFC7667"
	  sectionFormat="of" section="3.7"/>). Applications
          for this type of RTP middlebox result in each originating
          media source having a corresponding media source on the leg
          between the middlebox and the receiver. A Selective Forwarding
          Middlebox (SFM) could go as far as exposing all the simulcast
          streams for a media source; however, this section will focus on
          having a single simulcast stream that can contain any of the
          simulcast formats. This section will assume that the SFM projection
          mechanism works on the media-source level and maps one of the media
          source's simulcast streams onto one RTP stream from the SFM to the
          receiver.</t>
          <t>This usage will result in the individual RTP stream(s) for
          one media source being able to switch between being active and
	  paused, based on
          the subset of media sources the SFM wants to provide the receiver
          for the moment. With SFMs, there is no reason to use CSRC to
          indicate the originating stream, as there is a one-to-one
	  media-source mapping. If the application needs to know which
	  simulcast version it receives in order to function well, then
          RtpStreamId should be negotiated on the SFM-to-receiver leg.
          Which simulcast stream is being forwarded is not made explicit
          unless RtpStreamId is used on the leg.</t>
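          <t>One possible way to negotiate RtpStreamId on the SFM-to-receiver
          leg is sketched below. It assumes that the SFM describes the
          forwarded stream as a single simulcast stream with two alternative
          formats, listed comma-separated on the "a=simulcast" line. The
          rid-ids "h" and "l", their restrictions, the port number, and the
          payload type are hypothetical values chosen for this example.</t>
          <sourcecode type="sdp"><![CDATA[
m=video 49300 RTP/AVPF 96
a=sendonly
a=rtpmap:96 VP8/90000
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=rid:h send max-width=1280;max-height=720
a=rid:l send max-width=320;max-height=180
a=simulcast:send h,l
]]></sourcecode>
          <t>With the RtpStreamId header extension negotiated in this way, the
          receiver can learn from the rid-id carried in the RTP stream which
          of the two formats is currently being forwarded.</t>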
          <t>Any MID SDES items being sent by the SFM to the receiver are only
          those agreed between the SFM and the receiver, and no MID values
          from the originating side of the SFM are to be forwarded.</t>
          <t>An SFM could expose corresponding RTP streams for all the media
          sources and their simulcast streams and then, for any media source
          that is to be provided, forward one selected simulcast stream.
          However, this is not recommended, as it would unnecessarily increase
          the number of RTP streams and require the receiver to detect
          switching between simulcast streams in a timely manner. The usage
          described above requires the same SFM functionality for switching,
          while avoiding the uncertainties of detecting in a timely manner
          that an RTP stream ends. The benefit of exposing all streams would
          be that the received simulcast stream is implicitly identified by
          which RTP stream is active for a media source. However, using
          RtpStreamId to make this explicit also exposes which alternative
          format is used. The conclusion is that using one RTP stream per
          simulcast stream is unnecessary.</t>
          <t>The issue with detecting the end of a stream in a timely manner,
          independent of whether it is stopped temporarily or long term, is
          that there is no explicit indication that the transmission has
          intentionally been stopped. The
          RTCP-based <xref target="RFC7728" format="default">pause and resume
	  mechanism</xref>
          includes a PAUSED indication that provides the last RTP sequence
          number transmitted prior to the pause. Because this indication is
          carried in RTCP, the timeliness of this solution depends on when
          delivery using RTCP can occur in relation to the transmission of
          the last RTP packet. If no explicit information is provided at all,
          then detection based on nonincreasing RTCP SR field values and
          timers needs to be used to determine a pause in RTP packet
          delivery. As a result, when the last RTP packet arrives (if it
          arrives), one usually cannot determine that it will be the last.
          That it was the last is something that one learns later.</t>
        </section>
      </section>
      <section anchor="sec-rtp-box-box" numbered="true" toc="default">
        <name>RTP Middlebox to RTP Middlebox</name>
        <t>This relates to the transmission of simulcast streams between RTP
        middleboxes or other usages where one wants to enable the delivery of
        multiple simultaneous simulcast streams per media source, but the
        transmitting entity is not the originating endpoint. For a particular
        direction between middleboxes A and B, this looks very similar to the
        originating-to-middlebox case on a media-source basis. However, in
        this case, there are usually multiple media sources, originating from
        multiple endpoints. This can create situations where the number of
        simultaneously received media streams is limited, for example, due
        to limitations in network bandwidth. In this case, a subset
        of not only the simulcast streams but also media sources can be
        selected. As a result, individual RTP streams can become
        paused at any point and later be resumed based on various criteria.</t>
        <t>The MIDs used between A and B are the ones agreed between these two
        entities in signaling. The RtpStreamId values will also be provided
        to ensure explicit information about which simulcast stream each
        RTP stream carries. The associations of an RTP stream with its MID
        and RtpStreamId should here be long-term stable.</t>
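        <t>The fragment below is a minimal sketch of what middlebox A might
        signal toward middlebox B for two media sources that originate from
        different endpoints. The MID values "a1" and "a2", the rid-ids "1"
        through "4", the restrictions, the port numbers, and the payload type
        are all hypothetical and apply only to the signaling context between
        A and B.</t>
        <sourcecode type="sdp"><![CDATA[
a=group:BUNDLE a1 a2
m=video 49300 RTP/AVPF 96
a=mid:a1
a=sendonly
a=rtpmap:96 VP8/90000
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=rid:1 send max-width=1280;max-height=720
a=rid:2 send max-width=320;max-height=180
a=simulcast:send 1;2
m=video 49302 RTP/AVPF 96
a=mid:a2
a=sendonly
a=rtpmap:96 VP8/90000
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=rid:3 send max-width=1280;max-height=720
a=rid:4 send max-width=320;max-height=180
a=simulcast:send 3;4
]]></sourcecode>
        <t>In this sketch, each RTP stream sent from A to B is identified by
        one MID and one RtpStreamId. A stream that is paused and later
        resumed keeps the same pair of identifiers.</t>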
      </section>
    </section>
    <section anchor="sec-network-aspects" numbered="true" toc="default">
      <name>Network Aspects</name>
      <t>Simulcast is in this memo defined as the act of sending multiple
      alternative encoded streams of the same underlying media
      source. Transmitting multiple independent streams that originate from
      the same
      source could potentially be done in several different ways using
      RTP. A general discussion on considerations for use of the different RTP
      multiplexing alternatives can be found in <xref target="RFC8872" format="default">"Guidelines for Using the Multiplexing Features of
      RTP to Support Multiple Media Streams"</xref>. Discussion and
      clarification on how to handle multiple streams in an RTP session can be
      found in <xref target="RFC8108" format="default"/>.</t>
      <t>The network aspects that are relevant for simulcast are:</t>
      <dl newline="false" spacing="normal">
        <dt>Quality of Service (QoS):</dt>
        <dd>When using simulcast, it might be
          of interest to prioritize a particular simulcast stream, rather than
          applying equal treatment to all streams. For example, lower-bitrate
          streams may be prioritized over higher-bitrate streams to minimize
          congestion or packet losses in the low-bitrate streams. Thus, there
          is a benefit to using a simulcast solution with good QoS support.</dd>

        <dt>NAT/FW Traversal (Network Address Translator / Firewall Traversal):</dt>
        <dd>Using multiple RTP sessions incurs
          more cost for NAT/FW traversal unless they can reuse the same
          transport flow, which can be achieved by <xref target="RFC8843" format="default">multiplexing negotiation using SDP port
	  numbers</xref>.</dd>
      </dl>
      <section numbered="true" toc="default">
        <name>Bitrate Adaptation</name>
        <t>Use of multiple simulcast streams can require a significant amount
        of network resources. The aggregate bandwidth for all simulcast
        streams for a media source (and thus SDP media description) is bounded
        by any SDP "b=" line applicable to that media source. It is assumed
        that a suitable congestion-control mechanism is used by the
        application to ensure that it doesn't cause persistent congestion. If
        the amount of available network resources varies during an RTP session
        such that it does not match what is negotiated in SDP, the bitrate
        used by the different simulcast streams may have to be reduced
        dynamically. When a simulcasting media source uses a single media
        transport for all of the simulcast streams, it is likely that a joint
        congestion control across all simulcast streams is used for that media
        source. Which simulcast streams to prioritize when allocating available
        bitrate among the simulcast streams in such adaptation <bcp14>SHOULD</bcp14> be taken
        from the simulcast stream order on the "a=simulcast" line and ordering
        of alternative simulcast formats (<xref target="sec-cap" format="default"/>). Simulcast
        streams that have pause/resume capability, and that the adaptation
        process would give a bitrate too low to be useful, can be
        temporarily paused until the limiting condition clears.</t>
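        <t>As a hedged illustration of the above, the fragment below shows a
        media description whose "b=" line bounds the aggregate bandwidth of
        three simulcast streams. All values are hypothetical: the "b=AS" limit
        of 2500 kbit/s, the rid-ids "h", "m", and "l", and the "max-br"
        restrictions (in bits per second), which sum to 2450 kbit/s of media
        payload. The "a=rtcp-fb" line assumes that the pause and resume
        mechanism of <xref target="RFC7728" format="default"/> is
        supported.</t>
        <sourcecode type="sdp"><![CDATA[
m=video 49300 RTP/AVPF 96
b=AS:2500
a=rtpmap:96 VP8/90000
a=rtcp-fb:* ccm pause
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=rid:h send max-width=1280;max-height=720;max-br=1800000
a=rid:m send max-width=640;max-height=360;max-br=500000
a=rid:l send max-width=320;max-height=180;max-br=150000
a=simulcast:send h;m;l
]]></sourcecode>
        <t>If congestion control in this sketch indicates that less bitrate
        is available than negotiated, the stream order "h;m;l" suggests
        favoring "h", then "m", then "l" when dividing what remains. A stream
        left with too little bitrate to be useful is a candidate for being
        temporarily paused.</t>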
      </section>
    </section>
    <section anchor="sec-limitation" numbered="true" toc="default">
      <name>Limitation</name>
      <t>The chosen approach has a limitation that relates to the use of a
      single RTP session for all simulcast formats of a media source, which
      comes from sending all simulcast streams related to a media source under
      the same SDP media description.</t>
      <t>It is not possible to use different simulcast streams on different
      media transports, which limits the possibilities for applying different QoS to
      different simulcast streams. When using unicast, QoS mechanisms based on
      individual packet marking are feasible, since they do not require
      separation of simulcast streams into different RTP sessions to apply
      different QoS.</t>
      <t>It is also not possible to separate different simulcast streams into
      different multicast groups to allow a multicast receiver to pick the
      stream it wants, rather than receive all of them. In this case, the only
      reasonable implementation is to use different RTP sessions for each
      multicast group so that reporting and other RTCP functions operate as
      intended. Such simulcast usage in a multicast context is out of scope for
      the current document and would require additional specification.</t>
    </section>
    <section anchor="sec-iana" numbered="true" toc="default">
      <name>IANA Considerations</name>
      <t>This document registers a new media-level SDP attribute,
      "simulcast", in the "att-field (media level only)" registry within the
      "Session Description Protocol (SDP) Parameters" registry, according to the
      procedures of <xref target="RFC4566" format="default"/> and <xref target="RFC8859" format="default"/>.</t>
      <dl newline="false" spacing="normal">
        <dt>Contact name, email:</dt>
        <dd>The IESG (iesg@ietf.org)</dd>
        <dt>Attribute name:</dt>
        <dd>simulcast</dd>
        <dt>Long-form attribute name:</dt>
        <dd>Simulcast stream description</dd>
        <dt>Charset dependent:</dt>
        <dd>No</dd>
        <dt>Attribute value:</dt>
        <dd>sc-value; see <xref target="sec-attr" format="default"/> of RFC
	8853.</dd>
        <dt>Purpose:</dt>
        <dd>Signals simulcast capability for a set of RTP
          streams</dd>
        <dt>Mux category:</dt>
        <dd>NORMAL</dd>
      </dl>
    </section>
    <section anchor="sec-security" numbered="true" toc="default">
      <name>Security Considerations</name>
      <t>The simulcast capability, configuration attributes, and parameters
      are vulnerable to attacks in signaling.</t>
      <t>A false inclusion of the "a=simulcast" attribute may result in
      simultaneous transmission of multiple RTP streams that would otherwise
      not be generated. The impact is limited by the media description joint
      bandwidth, shared by all simulcast streams irrespective of their number.
      However, there may be a large number of unwanted RTP streams that will
      impact the share of bandwidth allocated for the originally wanted RTP
      stream.</t>
      <t>A hostile removal of the "a=simulcast" attribute will result in
      simulcast not being used.</t>

      <t>
	Integrity protection and source authentication of all SDP signaling,
	including simulcast attributes, can mitigate the risks of such attacks
	that attempt to alter signaling.
      </t>
      <t>Security considerations related to the use of "a=rid" and the
      RtpStreamId SDES item are covered in <xref target="RFC8851" format="default"/>
      and <xref target="RFC8852" format="default"/>. There are no additional
      security concerns related to their use in this specification.</t>
    </section>
  </middle>
  <back>

    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <xi:include
	    href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.3264.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.3550.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4566.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.5234.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7405.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7728.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>

<!-- draft-ietf-mmusic-rid (RFC 8851) -->
<reference anchor="RFC8851" target="https://www.rfc-editor.org/info/rfc8851">
  <front>
    <title>RTP Payload Format Restrictions</title>
    <author initials="A.B." surname="Roach" fullname="Adam Roach" role="editor">
      <organization/>
    </author>
    <date month="January" year="2021"/>
  </front>
    <seriesInfo name="RFC" value="8851"/>
    <seriesInfo name="DOI" value="10.17487/RFC8851"/>

</reference>

<!-- draft-ietf-avtext-rid-09 (RFC 8852) -->
<reference anchor="RFC8852" target="https://www.rfc-editor.org/info/rfc8852">
  <front>
    <title>RTP Stream Identifier Source Description (SDES)</title>
    <author initials="A.B." surname="Roach" fullname="Adam Roach"/>
    <author initials="S" surname="Nandakumar" fullname="Suhas Nandakumar"/>
    <author initials="P" surname="Thatcher" fullname="Peter Thatcher"/>
    <date month="January" year="2021"/>
  </front>
    <seriesInfo name="RFC" value="8852"/>
    <seriesInfo name="DOI" value="10.17487/RFC8852"/>

</reference>

<!-- draft-ietf-mmusic-sdp-mux-attributes-17 (RFC 8859) -->
        <reference anchor="RFC8859" target="https://www.rfc-editor.org/info/rfc8859">
          <front>
            <title>A Framework for Session Description Protocol (SDP)
            Attributes When Multiplexing</title>
            <author initials="S" surname="Nandakumar" fullname="Suhas Nandakumar">
              <organization/>
            </author>
            <date month="January" year="2021"/>
          </front>
            <seriesInfo name="RFC" value="8859"/>
            <seriesInfo name="DOI" value="10.17487/RFC8859"/>

        </reference>


<!-- draft-ietf-mmusic-sdp-bundle-negotiation (RFC 8843) -->
    <reference anchor="RFC8843" target="https://www.rfc-editor.org/info/rfc8843">
      <front>
        <title>Negotiating Media Multiplexing Using the Session Description Protocol (SDP)</title>
        <author initials="C" surname="Holmberg" fullname="Christer Holmberg">
          <organization/>
        </author>
        <author initials="H" surname="Alvestrand" fullname="Harald Alvestrand">
          <organization/>
        </author>
        <author initials="C" surname="Jennings" fullname="Cullen Jennings">
          <organization/>
        </author>
        <date month="January" year="2021"/>
      </front>
        <seriesInfo name="RFC" value="8843"/>
        <seriesInfo name="DOI" value="10.17487/RFC8843"/>
    </reference>

      </references>
      <references>
        <name>Informative References</name>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2198.xml"/>

        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.3389.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4588.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4733.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.5104.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.5109.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.5583.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.6184.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.6190.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.6236.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.6464.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7104.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7656.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7667.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7741.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8108.xml"/>
        <xi:include
	    href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7941.xml"/>
        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8627.xml"/>


<!-- draft-ietf-avtcore-multiplex-guidelines: 8872 -->
<reference anchor="RFC8872" target="https://www.rfc-editor.org/info/rfc8872">
   <front>
      <title>Guidelines for Using the Multiplexing Features of RTP to Support
      Multiple Media Streams</title>
      <author initials="M" surname="Westerlund" fullname="Magnus Westerlund">
	 <organization/>
      </author>
      <author initials="B" surname="Burman" fullname="Bo Burman">
	 <organization/>
      </author>
      <author initials="C" surname="Perkins" fullname="Colin Perkins">
	 <organization/>
      </author>
      <author initials="H" surname="Alvestrand" fullname="Harald Alvestrand">
	 <organization/>
      </author>
      <author initials="R" surname="Even" fullname="Roni Even">
      </author>
      <date month="January" year="2021"/>
   </front>
   <seriesInfo name="RFC" value="8872"/>
   <seriesInfo name="DOI" value="10.17487/RFC8872"/>
</reference>

      </references>
    </references>
    <section anchor="sec-requirements" numbered="true" toc="default">
      <name>Requirements</name>
      <t>The following requirements are met by the defined solution to support
      the <xref target="sec-use-cases" format="default">use cases</xref>:</t>

      <dl newline="false" spacing="normal">
        <dt>REQ-1:</dt>
        <dd anchor="req-1">
          <t>Identification:</t>
          <dl newline="false" spacing="normal">
            <dt>REQ-1.1:</dt>
            <dd anchor="req-1.1">It must be possible to
              identify a set of simulcasted RTP streams as originating from
              the same media source in SDP signaling.</dd>
            <dt>REQ-1.2:</dt>
            <dd anchor="req-1.2">An RTP endpoint must be
              capable of identifying the simulcast stream that a received RTP
              stream is associated with, knowing the content of the SDP
              signaling.</dd>
          </dl>
        </dd>
        <dt>REQ-2:</dt>
        <dd anchor="req-2">
          <t>Transport usage. The solution
          must work when using:</t>
          <dl newline="false" spacing="normal">
            <dt>REQ-2.1:</dt>
            <dd anchor="req-2.1">Legacy SDP with separate
              media transports per SDP media description.</dd>
            <dt>REQ-2.2:</dt>
            <dd anchor="req-2.2">
              <xref target="RFC8843" format="default">Bundled</xref>
              SDP media descriptions.</dd>
          </dl>
        </dd>

        <dt>REQ-3:</dt>
        <dd anchor="req-3">
          <t>Capability negotiation. The
	  following must be possible:</t>
          <dl newline="false" spacing="normal">
            <dt>REQ-3.1:</dt>
            <dd anchor="req-3.1">The sender can express
              capability of sending simulcast.</dd>
            <dt>REQ-3.2:</dt>
            <dd anchor="req-3.2">The receiver can express
              capability of receiving simulcast.</dd>
            <dt>REQ-3.3:</dt>
            <dd anchor="req-3.3">The sender can express
	      the maximum number of simulcast streams that can be
	      provided.</dd>
            <dt>REQ-3.4:</dt>
            <dd anchor="req-3.4">The receiver can express the
              maximum number of simulcast streams that can be received.</dd>
            <dt>REQ-3.5:</dt>
            <dd anchor="req-3.5">The sender can detail the
              characteristics of the simulcast streams that can be
              provided.</dd>
            <dt>REQ-3.6:</dt>
            <dd anchor="req-3.6">The receiver can detail the
              characteristics of the simulcast streams that it prefers to
              receive.</dd>
          </dl>
        </dd>
        <dt>REQ-4:</dt>
        <dd anchor="req-4">Distinguishing features. It must
          be possible to have different simulcast streams use different codec
          parameters, as can be expressed by SDP format values and RTP payload
          types.</dd>
        <dt>REQ-5:</dt>
        <dd anchor="req-5">
          <t>Compatibility. It must be
          possible to use simulcast in combination with other RTP mechanisms
          that generate additional RTP streams:</t>
          <dl newline="false" spacing="normal">
            <dt>REQ-5.1:</dt>
            <dd anchor="req-5.1">
              <xref target="RFC4588" format="default">RTP retransmission</xref>.</dd>
            <dt>REQ-5.2:</dt>
            <dd anchor="req-5.2">
              <xref target="RFC5109" format="default">RTP Forward Error Correction</xref>.</dd>
            <dt>REQ-5.3:</dt>
            <dd anchor="req-5.3">Related payload types
              such as audio Comfort Noise and/or DTMF.</dd>
            <dt>REQ-5.4:</dt>
            <dd>A single simulcast stream can consist of
              multiple RTP streams, to support codecs where a dependent stream
              depends on a set of encoded and dependent streams, each
              potentially carried in its own RTP stream.</dd>
          </dl>
        </dd>
        <dt>REQ-6:</dt>
        <dd anchor="req-6">
          <t>Interoperability. The solution
          must be possible to use in:</t>
          <dl newline="false" spacing="normal">
            <dt>REQ-6.1:</dt>
            <dd anchor="req-6.1">Interworking with
              nonsimulcast legacy clients using a single media source per
              media type.</dd>
            <dt>REQ-6.2:</dt>
            <dd anchor="req-6.2">WebRTC environment with
              a single media source per SDP media description.</dd>
          </dl>
        </dd>
      </dl>

    </section>
    <section anchor="sec-ack" numbered="false" toc="default">
      <name>Acknowledgements</name>
      <t>The authors would like to thank <contact fullname="Bernard Aboba"/>, <contact
      fullname="Thomas Belling"/>, <contact fullname="Roni Even"/>, <contact
      fullname="Adam Roach"/>, <contact fullname="Iñaki Baz Castillo"/>,
      <contact fullname="Paul Kyzivat"/>, and <contact fullname="Arun
      Arunachalam"/> for the feedback they provided during the development of
      this document.</t>
    </section>

    <section anchor="sec-contributors" numbered="false" toc="default">
      <name>Contributors</name>

      <t><contact fullname="Morgan Lindqvist"/> and <contact fullname="Fredrik
      Jansson"/>, both from Ericsson, contributed important material
      to the first draft versions of this document. <contact fullname="Robert
      Hanton"/> and <contact fullname="Cullen Jennings"/> from Cisco, <contact
      fullname="Peter Thatcher"/> from Google, and <contact fullname="Adam
      Roach"/> from Mozilla contributed significantly to subsequent
      versions.</t>
    </section>

  </back>
</rfc>
