<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">

<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" docName="draft-ietf-nvo3-geneve-16" number="8926" submissionType="IETF" category="std" consensus="true" obsoletes="" updates="" xml:lang="en" sortRefs="true" symRefs="true" tocInclude="true" version="3">

  <front>
    <title abbrev="Geneve Protocol">Geneve: Generic Network Virtualization Encapsulation</title>
    <seriesInfo name="RFC" value="8926"/>
    <author fullname="Jesse Gross" initials="J." role="editor" surname="Gross">
      <organization/>
      <address>
        <email>jesse@kernel.org</email>
      </address>
    </author>
    <author fullname="Ilango Ganga" initials="I." role="editor" surname="Ganga">
      <organization abbrev="Intel">Intel Corporation</organization>
      <address>
        <postal>
          <street>2200 Mission College Blvd.</street>
          <city>Santa Clara</city><region>CA</region><code>95054</code>
          <country>United States of America</country>
        </postal>
        <email>ilango.s.ganga@intel.com</email>
      </address>
    </author>
    <author fullname="T. Sridhar" initials="T." role="editor" surname="Sridhar">
      <organization abbrev="VMware">VMware, Inc.</organization>
      <address>
        <postal>
          <street>3401 Hillview Ave.</street>
          <city>Palo Alto</city><region>CA</region><code>94304</code>
          <country>United States of America</country>
        </postal>
        <email>tsridhar@utexas.edu</email>
      </address>
    </author>
    <date month="November" year="2020"/>

<keyword>overlay</keyword>
<keyword>tunnel</keyword>
<keyword>extensible</keyword>
<keyword>variable</keyword>
<keyword>metadata</keyword>
<keyword>options</keyword>
<keyword>endpoint</keyword>
<keyword>transit</keyword>

    <abstract>

      <t>
   Network virtualization involves the cooperation of devices with a
   wide variety of capabilities such as software and hardware tunnel
   endpoints, transit fabrics, and centralized control clusters.  As a
   result of their role in tying together different elements of the
   system, the requirements on tunnels are influenced by all of these
   components.  Therefore, flexibility is the most important aspect of a
   tunneling protocol if it is to keep pace with the evolution of technology.
   This document describes Geneve, an encapsulation protocol designed to
   recognize and accommodate these changing capabilities and needs.</t>
    </abstract>
  </front>
  <middle>
    <section anchor="sec-1" numbered="true" toc="default">
      <name>Introduction</name>
      <t>
   Networking has long featured a variety of tunneling, tagging, and
   other encapsulation mechanisms.  However, the advent of network
   virtualization has caused a surge of renewed interest and a
   corresponding increase in the introduction of new protocols.  The
   large number of protocols in this space -- ranging, for example, from
   VLANs <xref target="IEEE.802.1Q_2018" format="default"/> and MPLS <xref target="RFC3031" format="default"/> through the more recent
   VXLAN (Virtual eXtensible Local Area Network)  <xref target="RFC7348" format="default"/>
   and NVGRE (Network Virtualization
   Using Generic Routing Encapsulation) <xref target="RFC7637"
   format="default"/> -- often
   leads to questions about the need for new encapsulation formats and
   what it is about network virtualization in particular that leads to
   their proliferation. Note that the list of protocols presented above is non-exhaustive.</t>
      <t>
   While many encapsulation protocols seek to simply partition the
   underlay network or bridge two domains, network
   virtualization views the transit network as providing connectivity
   between multiple components of a distributed system.  In many ways,
   this system is similar to a chassis switch with the IP underlay
   network playing the role of the backplane and tunnel endpoints on the
   edge as line cards.  When viewed in this light, the requirements
   placed on the tunneling protocol are significantly different in terms of
   the quantity of metadata necessary and the role of transit nodes.</t>
      <t>
   Work such as "VL2: A Scalable and Flexible Data Center Network" <xref target="VL2" format="default"/> and "NVO3 Data Plane Requirements" <xref target="I-D.ietf-nvo3-dataplane-requirements" format="default"/> 
   have described some of the properties that the data plane must have to support network
   virtualization.  However, one additional defining requirement is the
   need to carry metadata (e.g., system state) along with the packet data; 
   example use cases of metadata are noted below. The use of
   some metadata is certainly not a foreign concept -- nearly all
   protocols used for network virtualization have at least 24 bits of identifier
   space as a way to partition between tenants.  This is often described
   as overcoming the limits of 12-bit VLANs; when seen in that
   context or any context where it is a true tenant identifier, roughly 16
   million (2^24) possible entries is a large number.  However, the reality is
   that the metadata is not exclusively used to identify tenants, and
   encoding other information quickly starts to crowd the space.  In
   fact, when compared to the tags used to exchange metadata between
   line cards on a chassis switch, 24-bit identifiers start to look
   quite small.  There are nearly endless uses for this metadata,
   ranging from storing input port identifiers for simple security policies to
   sending service-based context for advanced middlebox applications 
   that terminate and re-encapsulate Geneve traffic.</t>
      <t>
   Existing tunneling protocols have each attempted to solve different
   aspects of these new requirements only to be quickly rendered out of
   date by changing control plane implementations and advancements.
   Furthermore, software and hardware components and controllers all
   have different advantages and rates of evolution -- a fact that should
   be viewed as a benefit, not a liability or limitation.  This document describes Geneve, a protocol that seeks to avoid these problems by
   providing a framework for tunneling for network virtualization rather
   than being prescriptive about the entire system.</t>
      <section anchor="sec-1.1" numbered="true" toc="default">
        <name>Requirements Language</name>
        <t>
   The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL NOT</bcp14>",
   "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>", "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this
   document are to be interpreted as described in BCP 14 <xref target="RFC2119" format="default"/>
          <xref target="RFC8174" format="default"/> when, and only when, they appear in all
   capitals, as shown here.</t>
      </section>
      <section anchor="sec-1.2" numbered="true" toc="default">
        <name>Terminology</name>
        <t>
   The Network
   Virtualization over Layer 3 (NVO3) Framework <xref target="RFC7365" format="default"/> defines many of the concepts commonly
   used in network virtualization.  In addition, the following terms are
   specifically meaningful in this document:</t>
<dl newline="false" spacing="normal">       
 <dt>Checksum offload:</dt>
<dd>An optimization implemented by many NICs (Network Interface Controllers)
that enables computation and verification of upper-layer protocol
   checksums in hardware on transmit and receive, respectively.  This
   typically includes IP and TCP/UDP checksums that would otherwise be
   computed by the protocol stack in software.</dd>
        
<dt>Clos network:</dt>  <dd>A technique for composing network fabrics larger than
   a single switch while maintaining non-blocking bandwidth across
   connection points.  ECMP is used to divide traffic across the
   multiple links and switches that constitute the fabric.  Such fabrics are
   sometimes termed "leaf and spine" or "fat tree" topologies.</dd>
        
<dt>ECMP:</dt>
<dd>Equal Cost Multipath. A routing mechanism for selecting from
   among multiple best next-hop paths by hashing packet headers in order
   to better utilize network bandwidth while avoiding reordering of packets 
   within a flow.</dd>
        
<dt>Geneve:</dt><dd>Generic Network Virtualization Encapsulation. The tunneling
   protocol described in this document.</dd>
   
<dt>LRO:</dt><dd>Large Receive Offload. The receiver-side equivalent function
of LSO, in which multiple protocol segments (primarily TCP) are coalesced into
 larger data units.</dd>
        
<dt>LSO:</dt><dd> Large Segmentation Offload. A function provided by many
   commercial NICs that allows data units larger than the MTU to be
   passed to the NIC to improve performance; the NIC is then responsible
   for creating smaller segments of a size less than or equal to the MTU
   with correct protocol headers.  When referring specifically to TCP/IP, this
   feature is often known as TSO (TCP Segmentation Offload).</dd>
        <dt>
   Middlebox:</dt><dd>  In the context of this document, the term "middlebox" refers to network 
   service functions or service interposition appliances that typically implement tunnel endpoint functionality, terminating and re-encapsulating Geneve traffic.</dd>
        <dt>NIC:</dt><dd>Network Interface Controller. Also called "Network Interface Card" or "Network Adapter".
   A NIC could be part of a tunnel endpoint or transit device and can either
   process or aid in the processing of Geneve packets.</dd>
        <dt>
   Transit device:</dt> <dd> A forwarding element (e.g., router or switch) along the path of the tunnel
   making up part of the underlay network.  A transit device may be
   capable of understanding the Geneve packet format but does not
   originate or terminate Geneve packets.</dd>
        <dt>
   Tunnel endpoint:</dt><dd>  A component performing encapsulation and
   decapsulation of packets, such as Ethernet frames or IP datagrams, in
   Geneve headers.  As the ultimate consumer of any tunnel metadata,
   tunnel endpoints have the highest level of requirements for parsing and
   interpreting tunnel headers.  Tunnel endpoints may consist of either
   software or hardware implementations or a combination of the two.
   Tunnel endpoints are frequently a component of a Network Virtualization Edge (NVE)
   but may also be found in middleboxes or other elements making up an NVO3 network.</dd>
        <dt>VM:</dt><dd>Virtual Machine.</dd>
</dl>
      </section>
    </section>
    <section anchor="sec-2" numbered="true" toc="default">
      <name>Design Requirements</name>
      <t>
   Geneve is designed to support network virtualization use cases for data center environments.  In these situations,
   tunnels are typically established to act as a backplane between the
   virtual switches residing in hypervisors, physical switches, or
   middleboxes or other appliances.  An arbitrary IP network can be used
   as an underlay, although Clos networks composed using ECMP links are a
   common choice to provide consistent bisection bandwidth across all
   connection points. Many of the concepts of network virtualization overlays 
   over IP networks are described in the NVO3 Framework <xref target="RFC7365" format="default"/>.
   <xref target="ref-sample-geneve-deployment"/> shows an example of a
   hypervisor, a top-of-rack switch for connectivity to physical servers, and a WAN uplink
   connected using Geneve tunnels over a simplified Clos network.  These
   tunnels are used to encapsulate and forward frames from the attached
   components, such as VMs or physical links.</t>
      <figure anchor="ref-sample-geneve-deployment">
        <name>Sample Geneve Deployment</name>
        <artwork name="" type="" align="left" alt=""><![CDATA[
  +---------------------+           +-------+  +------+
  | +--+  +-------+---+ |           |Transit|--|Top of|==Physical
  | |VM|--|       |   | | +------+ /|Router |  | Rack |==Servers
  | +--+  |Virtual|NIC|---|Top of|/ +-------+\/+------+
  | +--+  |Switch |   | | | Rack |\ +-------+/\+------+
  | |VM|--|       |   | | +------+ \|Transit|  |Uplink|   WAN
  | +--+  +-------+---+ |           |Router |--|      |=========>
  +---------------------+           +-------+  +------+
         Hypervisor

              ()===================================()
                      Switch-Switch Geneve Tunnels
]]></artwork>
      </figure>

      <t>
   To support the needs of network virtualization, the tunneling protocol
   should be able to take advantage of the differing (and evolving)
   capabilities of each type of device in both the underlay and overlay
   networks.  This results in the following requirements being placed on
   the data plane tunneling protocol:</t>
      <ul spacing="normal">
        <li>The data plane is generic and extensible enough to support current
      and future control planes.</li>
        <li>Tunnel components are efficiently implementable in both hardware
      and software without restricting capabilities to the lowest common
      denominator.</li>
        <li>High performance over existing IP fabrics is maintained.</li>
      </ul>
      <t>
   These requirements are described further in the following
   subsections.</t>
      <section anchor="sec-2.1" numbered="true" toc="default">
        <name>Control Plane Independence</name>
        <t>
   Although some protocols for network virtualization have included a
   control plane as part of the tunnel format specification (most
   notably, VXLAN <xref target="RFC7348" format="default"/> prescribed a multicast-learning-based control plane), these specifications have largely been treated
   as describing only the data format.  The VXLAN packet format has
   actually seen a wide variety of control planes built on top of it.</t>
        <t>
   There is a clear advantage in settling on a data format: most of the
   protocols are only superficially different and there is little
   advantage in duplicating effort.  However, the same cannot be said of
   control planes, which are diverse in very fundamental ways.  The case
   for standardization is also less clear given the wide variety in
   requirements, goals, and deployment scenarios.</t>
        <t>
   As a result of this reality, Geneve is a pure tunnel format
   specification that is capable of fulfilling the needs of many control
   planes by explicitly not selecting any one of them.  This
   simultaneously promotes a shared data format and reduces the
   chance of obsolescence by future control plane
   enhancements.</t>
      </section>
      <section anchor="sec-2.2" numbered="true" toc="default">
        <name>Data Plane Extensibility</name>
        <t>
   Achieving the level of flexibility needed to support current and
   future control planes effectively requires an options infrastructure
   to allow new metadata types to be defined, deployed, and either
   finalized or retired.  Options also allow for differentiation of
   products by encouraging independent development in each vendor's core
   specialty, leading to an overall faster pace of advancement.  By far,
   the most common mechanism for implementing options is the Type-Length-Value (TLV) format.</t>
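        <t>
   As a rough illustration of why TLV encoding suits extensibility, the
   following Python sketch encodes and walks a generic TLV list.  The layout
   used here (a 1-byte type, a 1-byte length that counts value bytes, then the
   value) and the names encode_tlv and decode_tlvs are assumptions made for
   this example only; they are not the Geneve option format.</t>
        <sourcecode type="python"><![CDATA[
import struct


def encode_tlv(opt_type, value):
    """Encode one entry as type (1 byte), length (1 byte), value."""
    if not 0 <= opt_type <= 0xFF:
        raise ValueError("type must fit in one byte")
    if len(value) > 0xFF:
        raise ValueError("value too long for a 1-byte length")
    return struct.pack("!BB", opt_type, len(value)) + value


def decode_tlvs(buf):
    """Return a list of (type, value) pairs found in buf."""
    entries = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < 2:
            raise ValueError("truncated TLV header")
        opt_type, length = struct.unpack_from("!BB", buf, offset)
        offset += 2
        if len(buf) - offset < length:
            raise ValueError("truncated TLV value")
        entries.append((opt_type, buf[offset:offset + length]))
        # The length field alone tells the parser where the next entry starts.
        offset += length
    return entries


# Hypothetical type numbers (1 and 200) chosen only for the example.
encoded = encode_tlv(1, b"\x00\x2a") + encode_tlv(200, b"ctx")
decoded = decode_tlvs(encoded)
]]></sourcecode>
        <t>
   Because each entry carries its own length, a parser can step over a type it
   does not recognize without understanding the value.</t>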

        <t>
   It should be noted that, while options can be used to support non-wirespeed
   control packets, they are equally important in data packets
   for segregating and directing forwarding. (For instance, the
   examples given before regarding input-port-based security policies and
   terminating/re-encapsulating service interposition both require tags
   to be placed on data packets.)  Therefore, while it would be desirable to limit the
   extensibility to only control packets for the purposes of simplifying
   the datapath, that would not satisfy the design requirements.</t>
        <section anchor="sec-2.2.1" numbered="true" toc="default">
          <name>Efficient Implementation</name>
       <t>
   There is often a conflict between software flexibility and hardware
   performance that is difficult to resolve.  For a given set of
   functionality, it is obviously desirable to maximize performance.
   However, that does not mean new features that cannot be run at a desired
   speed today should be disallowed.  Therefore, for a protocol to be considered
   efficiently implementable, it is expected to have a set of common capabilities that can
   be reasonably handled across platforms as well as a graceful
   mechanism to handle more advanced features in the appropriate
   situations.</t>

          <t>
   The use of a variable-length header and options in a protocol often
   raises questions about whether the protocol is truly efficiently
   implementable in hardware.  To answer this question in the context of Geneve, it is
   important to first divide "hardware" into two categories: tunnel
   endpoints and transit devices.</t>
   <t>
   Tunnel endpoints must be able to parse the variable-length header, including any
   options, and take action.  Since these devices are actively
   participating in the protocol, they are the most affected by Geneve.
   However, as tunnel endpoints are the ultimate consumers of the data,
   transmitters can tailor their output to the capabilities of the
   recipient.</t>



          <t>
   Transit devices may be able to interpret the options; however, 
   as non-terminating devices, transit devices
   do not originate or terminate the Geneve packet. Hence, they <bcp14>MUST NOT</bcp14> modify Geneve headers and 
   <bcp14>MUST NOT</bcp14> insert or delete options, as that is the responsibility of tunnel endpoints.
   Options, if present in the packet, <bcp14>MUST</bcp14> only be generated and terminated by tunnel endpoints.
   The participation of transit devices in interpreting options is
   <bcp14>OPTIONAL</bcp14>.</t>
          <t>
   Further, either tunnel endpoints or transit devices <bcp14>MAY</bcp14> use offload
   capabilities of NICs, such as checksum offload, to improve the
   performance of Geneve packet processing.  The presence of a Geneve
   variable-length header should not prevent the tunnel endpoints and
   transit devices from using such offload capabilities.</t>
        </section>
      </section>
      <section anchor="sec-2.3" numbered="true" toc="default">
        <name>Use of Standard IP Fabrics</name>
        <t>
   IP has clearly cemented its place as the dominant transport mechanism,
   and many techniques have evolved over time to make it robust,
   efficient, and inexpensive.  As a result, it is natural to use IP
   fabrics as a transit network for Geneve.  Fortunately, the use of IP
   encapsulation and addressing is enough to achieve the primary goal of
   delivering packets to the correct point in the network through
   standard switching and routing.</t>
        <t>
   In addition, nearly all underlay fabrics are designed to exploit
   parallelism in traffic to spread load across multiple links without
   introducing reordering in individual flows.  These ECMP techniques typically involve parsing and hashing
   the addresses and port numbers from the packet to select an outgoing
   link.  However, the use of tunnels often results in poor ECMP
   performance, as without additional knowledge of the protocol, the
   encapsulated traffic is hidden from the fabric by design, and only
   tunnel endpoint addresses are available for hashing.</t>
        <t>
   Since it is desirable for Geneve to perform well on these existing
   fabrics, it is necessary for entropy from encapsulated packets to be
   exposed in the tunnel header.  The most common technique for this is
   to use the UDP source port, which is discussed further in
   <xref target="sec-3.3" format="default"/>.</t>
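        <t>
   The following Python sketch shows one possible way to derive a stable
   16-bit value from an encapsulated 5-tuple for use as the outer UDP source
   port.  The function name entropy_source_port and the choice of SHA-256 are
   assumptions made for illustration; SHA-256 is used only because it is
   deterministic and available in the standard library, and a real datapath
   would likely use a much cheaper hash.</t>
        <sourcecode type="python"><![CDATA[
import hashlib
import ipaddress
import struct


def entropy_source_port(src_ip, dst_ip, proto, src_port, dst_port):
    """Map an inner 5-tuple to a 16-bit outer UDP source port."""
    key = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + struct.pack("!BHH", proto, src_port, dst_port)
    )
    digest = hashlib.sha256(key).digest()
    # Use the first two bytes of the digest as the port value.
    return struct.unpack("!H", digest[:2])[0]


# Two packets of the same inner TCP flow map to the same outer port,
# so an ECMP hash over the outer headers keeps them on one path.
port_a = entropy_source_port("10.0.0.1", "10.0.0.2", 6, 49152, 443)
port_b = entropy_source_port("10.0.0.1", "10.0.0.2", 6, 49152, 443)
]]></sourcecode>
        <t>
   Distinct inner flows will usually, though not always, map to different
   ports, which is what allows the fabric to spread them across links.</t>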
      </section>
    </section>
    <section anchor="sec-3" numbered="true" toc="default">
      <name>Geneve Encapsulation Details</name>
      <t>
   The Geneve packet format consists of a compact tunnel header
   encapsulated in UDP over either IPv4 or IPv6.  A small fixed tunnel
   header provides control information plus a base level of
   functionality and interoperability with a focus on simplicity.  This
   header is then followed by a set of variable-length options to allow for
   future innovation.  Finally, the payload consists of a protocol data
   unit of the indicated type, such as an Ethernet frame. Sections <xref target="sec-3.1" format="counter"/>
   and <xref target="sec-3.2" format="counter"/> illustrate the Geneve packet format transported (for
   example) over Ethernet along with an Ethernet payload.</t>
      <section anchor="sec-3.1" numbered="true" toc="default">
        <name>Geneve Packet Format over IPv4</name>
<figure>
<name>Geneve Packet Format over IPv4</name>
        <artwork name="" type="" align="left" alt=""><![CDATA[
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

Outer Ethernet Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Outer Destination MAC Address                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Outer Destination MAC Address |   Outer Source MAC Address    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Outer Source MAC Address                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Optional Ethertype=C-Tag 802.1Q|  Outer VLAN Tag Information   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Ethertype = 0x0800 IPv4    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Outer IPv4 Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |Protocol=17 UDP|         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Outer Source IPv4 Address                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Outer Destination IPv4 Address              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Outer UDP Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Source Port = xxxx      |    Dest Port = 6081 Geneve    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           UDP Length          |        UDP Checksum           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Geneve Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|  Opt Len  |O|C|    Rsvd.  |          Protocol Type        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Virtual Network Identifier (VNI)       |    Reserved   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                    Variable-Length Options                    ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Inner Ethernet Header (example payload):
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Inner Destination MAC Address                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Inner Destination MAC Address |   Inner Source MAC Address    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Inner Source MAC Address                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Optional Ethertype=C-Tag 802.1Q|  Inner VLAN Tag Information   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Payload:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Ethertype of Original Payload |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
   |                                  Original Ethernet Payload    |
   |                                                               |
   ~ (Note that the original Ethernet frame's preamble, start      ~
   | frame delimiter (SFD), and frame check sequence (FCS) are not |
   | included, and the Ethernet payload need not be 4-byte aligned)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Frame Check Sequence:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   New Frame Check Sequence (FCS) for Outer Ethernet Frame     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
      </figure>
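        <t>
   As a minimal sketch, the 8-byte fixed Geneve header drawn in the figure can
   be packed and parsed as follows.  The function names, the o_bit and c_bit
   parameter names, and the dictionary keys are assumptions made for this
   example.  The sketch also assumes that Opt Len counts the options in 4-byte
   multiples and that reserved bits are sent as zero; the example value 0x6558
   is the Ethertype used for an Ethernet payload.</t>
        <sourcecode type="python"><![CDATA[
import struct

GENEVE_FIXED_LEN = 8  # bytes in the fixed header shown in the figure


def pack_geneve_header(vni, protocol_type, options=b"",
                       o_bit=False, c_bit=False, version=0):
    """Build the fixed header followed by already-encoded options."""
    if len(options) % 4 != 0:
        raise ValueError("options must be a multiple of 4 bytes")
    opt_len = len(options) // 4  # assumption: units of 4 bytes
    if opt_len > 0x3F:
        raise ValueError("options do not fit in the 6-bit Opt Len field")
    if not 0 <= vni <= 0xFFFFFF:
        raise ValueError("VNI must fit in 24 bits")
    if not 0 <= version <= 0x3:
        raise ValueError("version must fit in 2 bits")
    byte0 = (version << 6) | opt_len
    byte1 = (int(o_bit) << 7) | (int(c_bit) << 6)  # Rsvd. bits left zero
    # The VNI occupies the top 24 bits of the second 32-bit word.
    return struct.pack("!BBHI", byte0, byte1, protocol_type, vni << 8) + options


def unpack_geneve_header(buf):
    """Split a buffer into fixed-header fields, options, and payload."""
    if len(buf) < GENEVE_FIXED_LEN:
        raise ValueError("buffer shorter than the fixed header")
    byte0, byte1, protocol_type, vni_rsvd = struct.unpack_from("!BBHI", buf)
    opt_bytes = (byte0 & 0x3F) * 4
    end = GENEVE_FIXED_LEN + opt_bytes
    if len(buf) < end:
        raise ValueError("buffer shorter than the declared options")
    return {
        "version": byte0 >> 6,
        "opt_len_bytes": opt_bytes,
        "o_bit": bool(byte1 & 0x80),
        "c_bit": bool(byte1 & 0x40),
        "protocol_type": protocol_type,
        "vni": vni_rsvd >> 8,
        "options": buf[GENEVE_FIXED_LEN:end],
        "payload": buf[end:],
    }


header = pack_geneve_header(vni=0x123456, protocol_type=0x6558)
fields = unpack_geneve_header(header + b"inner-frame")
]]></sourcecode>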
      </section>
      <section anchor="sec-3.2" numbered="true" toc="default">
        <name>Geneve Packet Format over IPv6</name>
<figure><name>Geneve Packet Format over IPv6</name>
        <artwork name="" type="" align="left" alt=""><![CDATA[
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

Outer Ethernet Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Outer Destination MAC Address                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Outer Destination MAC Address |   Outer Source MAC Address    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Outer Source MAC Address                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Optional Ethertype=C-Tag 802.1Q|  Outer VLAN Tag Information   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Ethertype = 0x86DD IPv6    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Outer IPv6 Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version| Traffic Class |           Flow Label                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Payload Length        | NxtHdr=17 UDP |   Hop Limit   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                                                               |
   +                     Outer Source IPv6 Address                 +
   |                                                               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                                                               |
   +                  Outer Destination IPv6 Address               +
   |                                                               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Outer UDP Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Source Port = xxxx      |    Dest Port = 6081 Geneve    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           UDP Length          |        UDP Checksum           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Geneve Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|  Opt Len  |O|C|    Rsvd.  |          Protocol Type        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Virtual Network Identifier (VNI)       |    Reserved   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                    Variable-Length Options                    ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Inner Ethernet Header (example payload):
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Inner Destination MAC Address                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Inner Destination MAC Address |   Inner Source MAC Address    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Inner Source MAC Address                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Optional Ethertype=C-Tag 802.1Q|  Inner VLAN Tag Information   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Payload:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Ethertype of Original Payload |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
   |                                  Original Ethernet Payload    |
   |                                                               |
   ~ (Note that the original Ethernet frame's preamble, start      ~
   | frame delimiter (SFD), and frame check sequence (FCS) are not |
   | included, and the Ethernet payload need not be 4-byte aligned)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Frame Check Sequence:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   New Frame Check Sequence (FCS) for Outer Ethernet Frame     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
      </figure>
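        <t>
   The header sizes drawn in the IPv4 and IPv6 packet format figures can be
   summed to estimate the bytes that encapsulation places in front of the
   inner frame.  The following Python sketch does this arithmetic; the
   function name geneve_overhead and its parameters are assumptions made for
   illustration, the IPv4 case assumes no IPv4 options, the IPv6 case assumes
   no extension headers, and the outer FCS is not counted.</t>
        <sourcecode type="python"><![CDATA[
OUTER_ETHERNET = 14  # destination MAC + source MAC + Ethertype
VLAN_TAG = 4         # optional 802.1Q tag
IPV4_HEADER = 20     # five 32-bit words, as drawn (no IPv4 options)
IPV6_HEADER = 40     # ten 32-bit words, as drawn (no extension headers)
UDP_HEADER = 8
GENEVE_FIXED = 8


def geneve_overhead(ipv6=False, vlan=False, option_bytes=0):
    """Bytes added in front of the inner frame by the outer headers."""
    if option_bytes < 0:
        raise ValueError("option_bytes must not be negative")
    total = OUTER_ETHERNET + UDP_HEADER + GENEVE_FIXED + option_bytes
    total += IPV6_HEADER if ipv6 else IPV4_HEADER
    if vlan:
        total += VLAN_TAG
    return total


ipv4_plain = geneve_overhead()                       # 50 bytes
ipv6_plain = geneve_overhead(ipv6=True)              # 70 bytes
ipv6_tagged = geneve_overhead(ipv6=True, vlan=True,
                              option_bytes=16)       # 90 bytes
]]></sourcecode>
        <t>
   Figures like these are useful when sizing the underlay MTU so that
   encapsulated frames are not fragmented.</t>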
      </section>
      <section anchor="sec-3.3" numbered="true" toc="default">
        <name>UDP Header</name>
        <t>
   The use of an encapsulating UDP <xref target="RFC0768" format="default"/> header follows the
   connectionless semantics of Ethernet and IP in addition to providing
   entropy to routers performing ECMP.  Therefore, header fields are 
   interpreted as follows:</t>
        <dl newline="false" spacing="normal" indent="3">
          <dt>Source Port:</dt>
          <dd>
            <t>
	A source port selected by the originating tunnel endpoint.  This source port <bcp14>SHOULD</bcp14> be the same for all packets
      belonging to a single encapsulated flow to prevent reordering due
      to the use of different paths.  To encourage an even distribution
      of flows across multiple links, the source port <bcp14>SHOULD</bcp14> be
      calculated using a hash of the encapsulated packet headers, for
      example, a hash over a traditional 5-tuple.  Since the port represents a
      flow identifier rather than a true UDP connection, the entire
      16-bit range <bcp14>MAY</bcp14> be used to maximize entropy. In addition to setting the source port, 
      for IPv6, the flow label <bcp14>MAY</bcp14> also be used for providing entropy. For an example of 
      using the IPv6 flow label for tunnel use cases, see <xref target="RFC6438" format="default"/>. 
            </t>
            <t>
	If Geneve traffic is shared with other UDP listeners 
	on the same IP address, tunnel endpoints <bcp14>SHOULD</bcp14> implement a mechanism 
	to ensure ICMP return traffic arising from network errors is directed 
	to the correct listener. The definition of such a mechanism is beyond 
	the scope of this document.
            </t>
          </dd>
          <dt>Dest Port:</dt>
          <dd>
            <t>
	IANA has assigned port 6081 as the fixed well-known destination port
	for Geneve.  Although the well-known value should be used by default, it is <bcp14>RECOMMENDED</bcp14> that implementations make
      this configurable.  The chosen port is used for identification of
      Geneve packets and <bcp14>MUST NOT</bcp14> be reversed for different ends of a
      connection as is done with TCP. It is the responsibility of the control plane to manage any reconfiguration of the assigned port and its interpretation by respective devices.
      The definition of the control plane is beyond the scope of this document.
            </t>
          </dd>
          <dt>UDP Length:</dt>
          <dd>
            <t>
	The length of the UDP packet including the UDP header.</t>
          </dd>
          <dt>UDP Checksum:</dt>
          <dd>
            <t>
	In order to protect the Geneve header, options, and payload from
	potential data corruption, the UDP checksum <bcp14>SHOULD</bcp14> be generated as 
	specified in <xref target="RFC0768" format="default"/> and <xref target="RFC1122" format="default"/> when 
	Geneve is encapsulated in IPv4. To protect the IP header, Geneve header, 
	options, and payload from potential data corruption, the UDP checksum <bcp14>MUST</bcp14> 
	be generated by default as specified in <xref target="RFC0768" format="default"/> 
	and <xref target="RFC8200" format="default"/> when Geneve
	is encapsulated in IPv6, except under certain conditions, which are outlined in the next paragraph. 
	Upon receiving such packets with a non-zero UDP checksum, 
	the receiving tunnel endpoints <bcp14>MUST</bcp14> validate the checksum.
	If the checksum is not correct, the packet <bcp14>MUST</bcp14> be dropped; otherwise, 
	the packet <bcp14>MUST</bcp14> be accepted for decapsulation.
            </t>
            <t>
	Under certain conditions, the UDP checksum <bcp14>MAY</bcp14> be set to zero on transmit
	for packets encapsulated in both IPv4 and IPv6 <xref target="RFC8200" format="default"/>.
	See <xref target="sec-4.3" format="default"/> for additional
	requirements that apply when using zero 
	UDP checksum with IPv4 and IPv6. Disabling the use of UDP checksums is 
	an operational consideration that should take into account the risks
       	and effects of packet corruption.
            </t>
          </dd>
        </dl>
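        <t>
   As a minimal sketch of the source port guidance above, the following
   Python fragment derives the outer UDP source port from a hash of the
   inner 5-tuple.  The function name, the choice of SHA-256, and the
   byte layout of the hash key are illustrative assumptions and are not
   defined by this document; any hash that is stable for a given flow
   would serve the same purpose.</t>
        <sourcecode type="python"><![CDATA[
```python
import hashlib
import struct

GENEVE_DST_PORT = 6081  # IANA-assigned default destination port


def geneve_source_port(src_ip: bytes, dst_ip: bytes, proto: int,
                       sport: int, dport: int) -> int:
    """Map an inner 5-tuple to a stable outer UDP source port.

    SHA-256 is an arbitrary choice for this sketch; the text only
    asks for a hash of the encapsulated packet headers.
    """
    key = src_ip + dst_ip + struct.pack("!BHH", proto, sport, dport)
    digest = hashlib.sha256(key).digest()
    # The entire 16-bit range may be used, so keep two digest bytes.
    return int.from_bytes(digest[:2], "big")
```
]]></sourcecode>
        <t>
   Because the result depends only on the inner headers, every packet
   of a given encapsulated flow carries the same source port, which
   keeps the flow on a single ECMP path.</t>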
      </section>
      <section anchor="sec-3.4" numbered="true" toc="default">
        <name>Tunnel Header Fields</name>
        <dl newline="false" spacing="normal" indent="3">
          <dt>Ver (2 bits):</dt>
          <dd>
            <t>
	The current version number is 0.  Packets received by a tunnel endpoint with an unknown version <bcp14>MUST</bcp14> be dropped. Transit 
      devices interpreting Geneve packets with an unknown
      version number <bcp14>MUST</bcp14> treat them as UDP packets with an unknown
      payload.
            </t>
          </dd>
          <dt>Opt Len (6 bits):</dt>
          <dd>
            <t>
	The length of the option fields, expressed in 4-byte multiples, not including the 8-byte fixed tunnel
      header.  This results in a minimum total Geneve header size of 8
      bytes and a maximum of 260 bytes.  The start of the payload
      headers can be found using this offset from the end of the base
      Geneve header.
            </t>
            <t>
   Transit devices <bcp14>MUST</bcp14> maintain consistent forwarding behavior
   irrespective of the value of 'Opt Len', including ECMP link
   selection.
            </t>
          </dd>
          <dt>O (1 bit):</dt>
          <dd>
            <t>
	Control packet.  This packet contains a control message. Control messages are sent between tunnel endpoints.
      Tunnel endpoints <bcp14>MUST NOT</bcp14> forward the payload,
      and transit devices <bcp14>MUST NOT</bcp14> attempt to interpret it.
      Since control messages are less frequent, it is <bcp14>RECOMMENDED</bcp14>
      that tunnel endpoints direct these packets to a high-priority control
      queue (for example, to direct the packet to a general purpose CPU
      from a forwarding Application-Specific Integrated Circuit (ASIC) or to separate out control traffic on a
      NIC).  Transit devices <bcp14>MUST NOT</bcp14> alter forwarding behavior on the
      basis of this bit, such as ECMP link selection.
            </t>
          </dd>
          <dt>C (1 bit):</dt>
          <dd>
            <t>
	Critical options present.  One or more options have the critical bit set (see <xref target="sec-3.5" format="default"/>).  If this bit is set, then
      tunnel endpoints <bcp14>MUST</bcp14> parse the options list to interpret any
      critical options.  On tunnel endpoints where option parsing is not
      supported, the packet <bcp14>MUST</bcp14> be dropped on the basis of the 'C' bit
      in the base header.  If the bit is not set, tunnel endpoints <bcp14>MAY</bcp14>
      strip all options using 'Opt Len' and forward the decapsulated
      packet.  Transit devices <bcp14>MUST NOT</bcp14> drop packets on the
      basis of this bit.
            </t>
          </dd>
          <dt>Rsvd. (6 bits):</dt>
          <dd>
            <t>
	Reserved field, which <bcp14>MUST</bcp14> be zero on transmission and <bcp14>MUST</bcp14> be ignored on receipt.
            </t>
          </dd>
          <dt>Protocol Type (16 bits):</dt>
          <dd>
            <t>
	The type of protocol data unit appearing after the Geneve header.  This follows the Ethertype
      <xref target="ETYPES" format="default"/> convention, with Ethernet itself being represented by the
      value 0x6558.
            </t>
          </dd>
          <dt>Virtual Network Identifier (VNI) (24 bits):</dt>
          <dd>
            <t>
	An identifier for a unique element of a virtual network.  In many situations, this may
      represent an L2 segment; however, the control plane defines the
      forwarding semantics of decapsulated packets.  The VNI <bcp14>MAY</bcp14> be used
      as part of ECMP forwarding decisions or <bcp14>MAY</bcp14> be used as a mechanism
      to distinguish between overlapping address spaces contained in the
      encapsulated packet when load balancing across CPUs.
            </t>
          </dd>
          <dt>Reserved (8 bits):</dt>
          <dd>
            <t>
	Reserved field, which <bcp14>MUST</bcp14> be zero on transmission and <bcp14>MUST</bcp14> be ignored on receipt.
            </t>
          </dd>
        </dl>
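        <t>
   The field layout above can be illustrated with a short Python
   sketch that decodes the 8-byte base header.  The names
   GeneveHeader and parse_base_header are hypothetical and chosen
   only for this example; raising an exception stands in for the
   tunnel endpoint dropping the packet.</t>
        <sourcecode type="python"><![CDATA[
```python
import struct
from typing import NamedTuple


class GeneveHeader(NamedTuple):
    version: int
    opt_len_bytes: int   # 'Opt Len' converted from 4-byte units
    control: bool        # 'O' bit
    critical: bool       # 'C' bit
    protocol_type: int
    vni: int


def parse_base_header(data: bytes) -> GeneveHeader:
    """Decode the fixed Geneve header as a tunnel endpoint would."""
    if len(data) < 8:
        raise ValueError("truncated Geneve header")
    first, flags, proto, vni_rsvd = struct.unpack("!BBHI", data[:8])
    version = first >> 6
    if version != 0:
        # Endpoints drop packets with an unknown version.
        raise ValueError("unknown Geneve version")
    return GeneveHeader(
        version=version,
        opt_len_bytes=(first & 0x3F) * 4,
        control=bool(flags & 0x80),
        critical=bool(flags & 0x40),
        protocol_type=proto,
        vni=vni_rsvd >> 8,  # low 8 bits are reserved and ignored
    )
```
]]></sourcecode>
        <t>
   The payload begins 8 + opt_len_bytes bytes after the start of the
   Geneve header, so the largest possible offset is 8 + 63 * 4 = 260
   bytes.</t>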
      </section>
      <section anchor="sec-3.5" numbered="true" toc="default">
        <name>Tunnel Options</name>
      <figure anchor="geneve-options">
        <name>Geneve Option</name>
        <artwork name="" type="" align="left" alt=""><![CDATA[
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Option Class         |      Type     |R|R|R| Length  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                  Variable-Length Option Data                  ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
      </figure>
        <t>
   The base Geneve header is followed by zero or more options in Type-Length-Value format.  Each option consists of a 4-byte option
   header and a variable amount of option data interpreted according to
   the type.</t>
        <dl newline="false" spacing="normal" indent="3">
          <dt>Option Class (16 bits):</dt>
          <dd>
            <t>
	Namespace for the 'Type' field.  IANA has created a "Geneve Option Class" registry to
      allocate identifiers for organizations, technologies, and vendors
      that have an interest in creating types for options.  Each
      organization may allocate types independently to allow
      experimentation and rapid innovation.  It is expected that, over
      time, certain options will become well known, and a given
      implementation may use option types from a variety of sources.  In
      addition, IANA has reserved specific ranges for
      allocation by IETF Review and for Experimental Use (see <xref target="sec-7" format="default"/>).
            </t>
          </dd>
          <dt>Type (8 bits):</dt>
          <dd>
            <t>
	Type indicating the format of the data contained in this option.  Options are primarily designed to encourage future
      extensibility and innovation, and standardized forms of these
      options will be defined in separate documents.
            </t>
            <t>
	The high-order bit of the option type indicates that this is a
      critical option.  If the receiving tunnel endpoint does not recognize
      the option and this bit is set, then the packet <bcp14>MUST</bcp14> be dropped.
      If this bit is set in any option, then the 'C' bit in the
      Geneve base header <bcp14>MUST</bcp14> also be set.  Transit devices <bcp14>MUST NOT</bcp14>
      drop packets on the basis of this bit.  The following figure shows
      the location of the 'C' bit in the 'Type' field:
            </t>
          </dd>
        </dl>
<figure><name>&apos;C&apos; Bit in the &apos;Type&apos; Field</name>
        <artwork name="" type="" align="left" alt=""><![CDATA[
   0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |C|    Type     |
   +-+-+-+-+-+-+-+-+
]]></artwork>
</figure>
        <dl newline="false" spacing="normal" indent="3">
          <dt/>
          <dd>
      The requirement to drop a packet with an unknown option with the 'C' bit set
      applies to the entire tunnel endpoint system and not a particular
      component of the implementation.  For example, in a system
      comprised of a forwarding ASIC and a general purpose CPU, this
      does not mean that the packet must be dropped in the ASIC.  An
      implementation may send the packet to the CPU using a rate-limited
      control channel for slow-path exception handling.</dd>
        </dl>
        <dl newline="false" spacing="normal" indent="3">
          <dt>R (3 bits):</dt>
          <dd>
	Option control flags reserved for future use.  These bits <bcp14>MUST</bcp14> be
	zero on transmission and <bcp14>MUST</bcp14> be ignored on receipt.
	</dd>
          <dt>Length (5 bits):</dt>
          <dd>
            <t>
	Length of the option, expressed in 4-byte
	multiples, excluding the option header.  The total length of each
	option may be between 4 and 128 bytes. A value of 0 in the 'Length' field implies
       	an option with only an option header and no option data. Packets in which the total
      length of all options is not equal to the 'Opt Len' in the base
      header are invalid and <bcp14>MUST</bcp14> be silently dropped if received by a
      tunnel endpoint that processes the options.
            </t>
          </dd>
          <dt>Variable-Length Option Data:</dt>
          <dd>
            <t>
	Option data interpreted according to 'Type'.
            </t>
            
          </dd>
        </dl>
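        <t>
   As an illustration of the option format, the following Python
   sketch walks the option list and rejects a packet whose option
   lengths do not add up to 'Opt Len'.  The names GeneveOption and
   parse_options are hypothetical, and the input is assumed to be
   exactly the 'Opt Len' * 4 bytes that follow the base header.</t>
        <sourcecode type="python"><![CDATA[
```python
import struct
from typing import List, NamedTuple


class GeneveOption(NamedTuple):
    option_class: int
    type: int        # full 8-bit 'Type', including the 'C' bit
    critical: bool   # high-order bit of 'Type'
    data: bytes


def parse_options(buf: bytes) -> List[GeneveOption]:
    """Split the options area into individual TLVs."""
    options = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < 4:
            raise ValueError("truncated option header")
        opt_class, opt_type, flags_len = struct.unpack_from(
            "!HBB", buf, offset)
        # The low 5 bits give the data length in 4-byte units;
        # the three 'R' bits above them are ignored on receipt.
        data_len = (flags_len & 0x1F) * 4
        end = offset + 4 + data_len
        if end > len(buf):
            raise ValueError("options do not match 'Opt Len'")
        options.append(GeneveOption(
            opt_class, opt_type, bool(opt_type & 0x80),
            bytes(buf[offset + 4:end])))
        offset = end
    return options
```
]]></sourcecode>
        <t>
   A caller acting as a tunnel endpoint would treat either exception
   as a reason to silently drop the packet.</t>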
        <section anchor="sec-3.5.1" numbered="true" toc="default">
          <name>Options Processing</name>
          <t>
   Geneve options are intended to be originated and processed
   by tunnel endpoints.  However, options <bcp14>MAY</bcp14> be interpreted by transit
   devices along the tunnel path.  Transit devices not
   interpreting Geneve headers (which may or may not include options) <bcp14>MUST</bcp14> handle
   Geneve packets as any other UDP packet and maintain consistent forwarding behavior.</t>
          <t>
   In tunnel endpoints, the generation and interpretation of options is
   determined by the control plane, which is beyond the scope of this
   document.  However, to ensure interoperability between heterogeneous
   devices, some requirements are imposed on options and the devices that
   process them:</t>
          <ul spacing="normal">
            <li>Receiving tunnel endpoints <bcp14>MUST</bcp14> drop packets containing unknown options
      with the 'C' bit set in the option type.  Conversely, transit
      devices <bcp14>MUST NOT</bcp14> drop packets as a result of encountering unknown
      options, including those with the 'C' bit set.</li>
            <li>The contents of the options and their ordering <bcp14>MUST NOT</bcp14> be
      modified by transit devices.</li>
            <li>If a tunnel endpoint receives a Geneve packet with an 'Opt Len' (the total length of all options) 
	that exceeds the options-processing capability of the tunnel endpoint, then 
	the tunnel endpoint <bcp14>MUST</bcp14> drop such packets. An implementation may raise an 
	exception to the control plane in such an event. It is the responsibility 
	of the control plane to ensure the communicating peer tunnel endpoints 
	have the processing capability to handle the total length of options. 
	The definition of the control plane is beyond the scope of this document.</li>
          </ul>
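          <t>
   The endpoint requirements in the list above can be sketched as a
   single drop decision.  The function name, its parameters, and the
   representation of options as (Option Class, Type) pairs are
   assumptions made for this example only; how an endpoint learns
   which options it supports is left to the control plane.</t>
          <sourcecode type="python"><![CDATA[
```python
from typing import Iterable, Set, Tuple

OptionKey = Tuple[int, int]  # (Option Class, Type)


def endpoint_should_drop(opt_len_bytes: int,
                         options: Iterable[OptionKey],
                         known_options: Set[OptionKey],
                         max_opt_bytes: int) -> bool:
    """Return True if a receiving tunnel endpoint must drop."""
    # Total option length beyond the endpoint's capability.
    if opt_len_bytes > max_opt_bytes:
        return True
    for opt_class, opt_type in options:
        is_critical = bool(opt_type & 0x80)
        if is_critical and (opt_class, opt_type) not in known_options:
            return True
    # Unknown non-critical options are simply ignored.
    return False
```
]]></sourcecode>
          <t>
   A transit device has no equivalent check: it forwards the packet
   unmodified regardless of the options it encounters.</t>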
          <t>
   When designing a Geneve option, it is important to consider how the
   option will evolve in the future.  Once an option is defined, it is
   reasonable to expect that implementations may come to depend on a
   specific behavior.  As a result, the scope of any future changes must
   be carefully described upfront.</t>
          <t>
   Architecturally, options are intended to be self-descriptive and independent. 
   This enables parallelism in options processing and reduces implementation complexity.
   However, the control plane may impose certain ordering restrictions, as 
   described in <xref target="sec-4.5.1" format="default"/>.</t>
          <t>
   Unexpectedly significant interoperability issues may result from
   changing the length of an option that was defined to be a certain
   size.  A particular option is specified to have either a fixed
   length, which is constant, or a variable length, which may change
   over time or for different use cases.  This property is part of the
   definition of the option and is conveyed by the 'Type'.  For fixed-length options, some implementations may choose to ignore the 'Length'
   field in the option header and instead parse based on the well-known
   length associated with the type.  In this case, redefining the length
   will impact not only the parsing of the option in question but also any
   options that follow.  Therefore, options that are defined to be a fixed
   length in size <bcp14>MUST NOT</bcp14> be redefined to a different length.  Instead,
   a new 'Type' should be allocated. The actual definition of the option type is beyond 
   the scope of this document.  The option type and its interpretation should be 
   defined by the entity that owns the option class.</t>
          <t>
   Options may be processed by NIC hardware utilizing offloads (e.g., LSO and LRO) 
   as described in <xref target="sec-4.6" format="default"/>. Careful consideration should be 
   given to how the offload capabilities outlined in <xref target="sec-4.6" format="default"/> 
   impact an option's design.
          </t>
        </section>
      </section>
    </section>
    <section anchor="sec-4" numbered="true" toc="default">
      <name>Implementation and Deployment Considerations</name>
      <section anchor="sec-4.1" numbered="true" toc="default">
        <name>Applicability Statement</name>

        <t>
	Geneve is a UDP-based network virtualization overlay encapsulation protocol
	designed to establish tunnels between NVEs over an existing IP network.
	It is intended for use in public or private data center environments, 
	for deploying multi-tenant overlay networks over an existing IP underlay network.</t>
        <t>
	As a UDP-based protocol, Geneve adheres 
	to the UDP usage guidelines as specified in <xref target="RFC8085" format="default"/>. 
	The applicability of these guidelines is dependent on the underlay
	IP network and the nature of the Geneve payload protocol
	(for example, TCP/IP, IP/Ethernet).</t>
        <t>
	Geneve is intended to be deployed in a data center network environment 
	operated by a single operator or an adjacent set of cooperating network 
	operators that fits with the definition of controlled environments
       	in <xref target="RFC8085" format="default"/>. A network in a controlled environment can be
	managed to operate under certain conditions, whereas in the general
   	Internet, this cannot be done.  Hence, requirements for a tunneling
   	protocol operating under a controlled environment can be less
   	restrictive than the requirements of the general Internet.
        </t>
        <t>
	For the purpose of this document, a traffic-managed controlled environment
	(TMCE) is defined as an IP network that is traffic engineered and/or otherwise
	managed (e.g., via use of traffic rate limiters) to avoid congestion. The concept
	of a TMCE is outlined in <xref target="RFC8086" format="default"/>. Significant portions of the text 
	in <xref target="sec-4.1" format="default"/> through <xref target="sec-4.3" format="default"/> are based 
	on <xref target="RFC8086" format="default"/> as applicable to Geneve.</t>
        <t>
	It is the responsibility of the operator to ensure that the guidelines/requirements
	in this section are followed as applicable to their Geneve deployment(s).</t>
      </section>
      <section anchor="sec-4.2" numbered="true" toc="default">
        <name>Congestion-Control Functionality</name>
        <t>
	Geneve does not natively provide congestion-control functionality and relies
	on the payload protocol traffic for congestion control. As such, Geneve <bcp14>MUST</bcp14>
	be used with congestion-controlled traffic or within a TMCE to avoid congestion. An operator of a TMCE may avoid congestion through careful provisioning
	of their networks, rate-limiting user data traffic, and managing traffic 
	engineering according to path capacity.</t>
      </section>
      <section anchor="sec-4.3" numbered="true" toc="default">
        <name>UDP Checksum</name>
        <t>
	The outer UDP checksum <bcp14>SHOULD</bcp14> be used with Geneve when transported
   over IPv4; this is to provide integrity for the Geneve headers,
   options, and payload in case of data corruption (for example, to
   avoid misdelivery of the payload to different tenant systems).  The UDP checksum provides a statistical guarantee 
	that a payload was not corrupted in transit. These integrity checks are not 
	strong from a coding or cryptographic perspective and are not designed to 
	detect physical-layer errors or malicious modification of the datagram 
	(see <xref target="RFC8085" sectionFormat="of" section="3.4"/>). In deployments where such a risk exists, 
	an operator <bcp14>SHOULD</bcp14> use additional data integrity
	mechanisms such as those offered 
	by IPsec (see <xref target="sec-6.2" format="default"/>).</t>
        <t>
	An operator <bcp14>MAY</bcp14> choose to disable UDP checksums
	and use zero UDP checksum if Geneve packet integrity is provided by other data
	integrity mechanisms, such as IPsec or additional checksums, or if one of
	the conditions (a, b, or c) in <xref target="sec-4.3.1" format="default"/> is met.</t>
        <t>
	By default, UDP checksums <bcp14>MUST</bcp14> be used when Geneve is transported over IPv6.
	A tunnel endpoint <bcp14>MAY</bcp14> be configured for use with zero UDP checksum if 
	additional requirements in <xref target="sec-4.3.1" format="default"/> are met.</t>
        <section anchor="sec-4.3.1" numbered="true" toc="default">
          <name>Zero UDP Checksum Handling with IPv6</name>
          <t>
	When Geneve is used over IPv6, the UDP checksum is used to protect IPv6 headers, 
	UDP headers, and Geneve headers, options, and payload from potential data corruption.
	As such, by default, Geneve <bcp14>MUST</bcp14> use UDP checksums when transported over IPv6.
	An operator <bcp14>MAY</bcp14> choose to configure zero UDP checksum if 
	operating in a TMCE as stated in 
	<xref target="sec-4.1" format="default"/> and one of the following conditions is met:</t>
          <ol spacing="normal" type="a">
            <li>It is known that packet corruption is exceptionally
	unlikely (perhaps based on knowledge of equipment types in their underlay 
	network) and the operator is willing to risk undetected packet
	    corruption.</li>
            <li>It is judged through observational measurements (perhaps through historic
	or current traffic flows that use non-zero checksum) that the level of packet
	corruption is tolerably low and the operator is willing to risk undetected corruption.</li>
            <li>The Geneve payload is carrying applications that are tolerant of misdelivered 
	or corrupted packets (perhaps through higher-layer checksum validation
	and/or reliability through retransmission). </li>
          </ol>
          <t> In addition, Geneve tunnel implementations using zero UDP checksum <bcp14>MUST</bcp14> meet
		the following requirements:</t>
          <ol spacing="normal" type="1">
            <li>Use of UDP checksum over IPv6 <bcp14>MUST</bcp14> be the default 
	configuration for all Geneve tunnels.</li>
            <li>If Geneve is used with zero UDP checksum over IPv6, then such
	    a tunnel
	endpoint implementation <bcp14>MUST</bcp14> meet all the requirements specified
	in <xref target="RFC6936" sectionFormat="of" section="4"/> and requirement 1 as specified in <xref target="RFC6936" sectionFormat="of" section="5"/> since it is relevant to Geneve.</li>
            <li>The Geneve tunnel endpoint that decapsulates the tunnel
	    <bcp14>SHOULD</bcp14> check that the 
	source and destination IPv6 addresses are valid for the Geneve tunnel that
	is configured to receive zero UDP checksum and discard other packets 
	for which such a check fails.</li>
            <li>
              <t>The Geneve tunnel endpoint that encapsulates the tunnel <bcp14>MAY</bcp14> use different
	IPv6 source addresses for each Geneve tunnel that uses zero UDP checksum mode
	in order to strengthen the decapsulator's check of the IPv6 source address
	(i.e., the same IPv6 source address is not to be used with more than one IPv6
	destination address, irrespective of whether that destination address is
	a unicast or multicast address). When this is not possible, it is <bcp14>RECOMMENDED</bcp14>
	to use each source address for as few Geneve tunnels that use zero UDP
	checksum as is feasible.
              </t>
              <t>
	Note that for requirements 3 and 4, the receiving tunnel endpoint can apply 
	these checks only if it has out-of-band knowledge that the encapsulating tunnel 
	endpoint is applying the indicated behavior. One possibility to obtain this out-of-band 
	knowledge is through signaling by the control plane. The definition of 
	the control plane is beyond the scope of this document.</t>
            </li>
            <li>Measures <bcp14>SHOULD</bcp14> be taken to prevent Geneve traffic over IPv6 with zero UDP 
	checksum from escaping into the general Internet. Examples of such measures include
	employing packet filters at the gateways or edge of the Geneve network and/or 
	keeping logical or physical separation of the Geneve network from networks 
	carrying general Internet traffic.</li>
          </ol>
          <t> The above requirements do not change the requirements
	specified in either <xref target="RFC8200" format="default"/> or
	<xref target="RFC6936" format="default"/>.
          </t>
          <t>The use of the source IPv6 address in addition to the
	destination IPv6 address, plus the recommendation against
 	reuse of source IPv6 addresses among Geneve tunnels, collectively
	provide some mitigation for the absence of UDP checksum coverage of
	the IPv6 header. A traffic-managed controlled environment that satisfies
       	at least one of the three conditions listed at the beginning of
	this section provides additional assurance.
          </t>
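          <t>
   As a hedged illustration of the receive-side behavior, the Python
   sketch below accepts a zero UDP checksum over IPv6 only for
   address pairs that have been configured for that mode
   (requirement 3).  All names are hypothetical; checksum_ok is
   assumed to be the result of a UDP checksum validation performed
   elsewhere, and the set of configured tunnels is assumed to come
   from the control plane or local configuration.</t>
          <sourcecode type="python"><![CDATA[
```python
from ipaddress import IPv6Address
from typing import Set, Tuple

TunnelKey = Tuple[IPv6Address, IPv6Address]  # (source, destination)


def tunnel_key(src: str, dst: str) -> TunnelKey:
    """Normalize textual IPv6 addresses for comparison."""
    return (IPv6Address(src), IPv6Address(dst))


def accept_outer_udp(checksum: int, checksum_ok: bool,
                     src: str, dst: str,
                     zero_csum_tunnels: Set[TunnelKey]) -> bool:
    """Decide whether to accept a packet for decapsulation."""
    if checksum != 0:
        # A non-zero checksum is always validated.
        return checksum_ok
    # A zero checksum is accepted only on tunnels configured for it.
    return tunnel_key(src, dst) in zero_csum_tunnels
```
]]></sourcecode>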
        </section>
      </section>
      <section anchor="sec-4.4" numbered="true" toc="default">
        <name>Encapsulation of Geneve in IP</name>
        <t>
   As an IP-based tunneling protocol, Geneve shares many properties and
   techniques with existing protocols.  The application of some of these
   is described in further detail below, although, in general, most concepts
   applicable to the IP layer or to IP tunnels generally also function
   in the context of Geneve.</t>
        <section anchor="sec-4.4.1" numbered="true" toc="default">
          <name>IP Fragmentation</name>
          <t>
   It is <bcp14>RECOMMENDED</bcp14> that Path MTU Discovery (see <xref
   target="RFC1191" format="default"/> and <xref target="RFC8201" format="default"/>) be used to prevent or minimize fragmentation.
   The use of Path MTU Discovery on the transit network provides the
   encapsulating tunnel endpoint with soft-state information about the link that it may use
   to prevent or minimize fragmentation depending on its role in the
   virtualized network. The NVE can maintain this state (the MTU size of
   the tunnel link(s) associated with the tunnel endpoint), so if a
   tenant system sends large packets that, when encapsulated, exceed the
   MTU size of the tunnel link, the tunnel endpoint can discard such
   packets and send exception messages to the tenant system(s). If the
   tunnel endpoint is associated with a routing or forwarding function and/or has the capability
   to send ICMP messages, the encapsulating tunnel endpoint <bcp14>MAY</bcp14> send ICMP fragmentation 
   needed <xref target="RFC0792" format="default"/> or Packet Too Big <xref target="RFC4443" format="default"/> messages to the tenant system(s).
   When determining the MTU size of a tunnel link, the maximum length of options <bcp14>MUST</bcp14> be assumed as options may vary 
   on a per-packet basis. Recommendations and guidance for handling fragmentation in
   similar overlay encapsulation services like Pseudowire Emulation
	  Edge-to-Edge (PWE3) are provided in <xref target="RFC3985"
	  sectionFormat="of" section="5.3"/>.</t>
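          <t>
   The effect of assuming the maximum option length can be shown with
   a small worked calculation.  The sketch below is illustrative
   only: the constant and function names are hypothetical, the outer
   IP header is assumed to carry no IPv4 options or IPv6 extension
   headers, and the tunnel link MTU is taken to be the largest outer
   IP packet the underlay can carry.</t>
          <sourcecode type="python"><![CDATA[
```python
GENEVE_BASE_HEADER = 8        # fixed tunnel header, in bytes
GENEVE_MAX_OPTIONS = 63 * 4   # largest 'Opt Len', in bytes
UDP_HEADER = 8
OUTER_IP_HEADER = {4: 20, 6: 40}  # assumes no options/extensions


def max_geneve_payload(tunnel_link_mtu: int, ip_version: int) -> int:
    """Largest payload that fits without outer fragmentation.

    The payload includes the inner Ethernet header when the
    'Protocol Type' indicates an Ethernet frame.
    """
    overhead = (OUTER_IP_HEADER[ip_version] + UDP_HEADER
                + GENEVE_BASE_HEADER + GENEVE_MAX_OPTIONS)
    return tunnel_link_mtu - overhead
```
]]></sourcecode>
          <t>
   Under these assumptions, a 1500-byte tunnel link MTU leaves 1212
   bytes of payload over IPv4 and 1192 bytes over IPv6.</t>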
          <t>
   Note that some implementations may not be capable of supporting
   fragmentation or other less common features of the IP header, such as
   options and extension headers. Some of the issues associated 
   with MTU size and fragmentation in IP tunneling and use of ICMP messages are
   outlined in <xref target="I-D.ietf-intarea-tunnels"
   sectionFormat="of" section="4.2"/>.</t>
        </section>
        <section anchor="sec-4.4.2" numbered="true" toc="default">
          <name>DSCP, ECN, and TTL</name>
          <t>
   When encapsulating IP (including over Ethernet) packets in Geneve,
   there are several considerations for propagating Differentiated Services
   Code Point (DSCP) and Explicit Congestion Notification (ECN) bits
   from the inner header to the tunnel on transmission and the reverse
   on reception.</t>


          <t>
   <xref target="RFC2983" format="default"/> provides guidance for mapping DSCP between inner and outer
   IP headers.  Network virtualization is typically more closely aligned
   with the Pipe model described there, where the DSCP value on the tunnel
   header is set based on a policy (which may be a fixed value, one
   based on the inner traffic class or some other mechanism for
   grouping traffic).  Aspects of the Uniform model (which treats the
   inner and outer DSCP values as a single field by copying on ingress
   and egress) may also apply, such as the ability to re-mark the inner
   header on tunnel egress based on transit marking.  However, the
   Uniform model is not conceptually consistent with network
   virtualization, which seeks to provide strong isolation between
   encapsulated traffic and the physical network.</t>
          <t>
   <xref target="RFC6040" format="default"/> describes the mechanism for exposing ECN capabilities on IP
   tunnels and propagating congestion markers to the inner packets.
   This behavior <bcp14>MUST</bcp14> be followed for IP packets encapsulated in Geneve.</t>
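          <t>
   For orientation, the ECN handling defined in <xref target="RFC6040" format="default"/> can be
   sketched as follows.  This is a simplified, non-authoritative
   rendering of that document's normal-mode encapsulation and default
   decapsulation behavior; the function names are hypothetical,
   returning None stands in for dropping the packet, and the optional
   logging and alarm actions described there are omitted.</t>
          <sourcecode type="python"><![CDATA[
```python
from typing import Optional

# ECN codepoints as carried in the two low-order bits of the
# IPv4 TOS / IPv6 Traffic Class octet.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def encap_ecn(inner: int, compatibility_mode: bool = False) -> int:
    """Outer ECN field chosen by the encapsulating endpoint."""
    return NOT_ECT if compatibility_mode else inner


def decap_ecn(inner: int, outer: int) -> Optional[int]:
    """ECN field of the forwarded inner packet, or None to drop."""
    if inner == NOT_ECT:
        # Congestion cannot be signaled to a non-ECN transport.
        return None if outer == CE else NOT_ECT
    if inner == CE or outer == CE:
        return CE
    if outer == ECT_1:
        return ECT_1
    return inner
```
]]></sourcecode>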
          <t>
   Though either the Uniform or Pipe models could be used for handling TTL (or Hop Limit in case of IPv6) when tunneling IP packets, the Pipe model is more consistent with network virtualization.
   <xref target="RFC2003" format="default"/> provides guidance on handling TTL between inner IP header and outer IP tunnels;
   this model is similar to the Pipe model and is <bcp14>RECOMMENDED</bcp14> for 
   use with Geneve for network virtualization applications.</t>
        </section>
        <section anchor="sec-4.4.3" numbered="true" toc="default">
          <name>Broadcast and Multicast</name>
          <t>
   Geneve tunnels may either be point-to-point unicast between two
   tunnel endpoints or utilize broadcast or multicast addressing.  It is
   not required that inner and outer addressing match in this respect.
   For example, in physical networks that do not support multicast,
   encapsulated multicast traffic may be replicated into multiple
   unicast tunnels or forwarded by policy to a unicast location
   (possibly to be replicated there).</t>
          <t>
   With physical networks that do support multicast, it may be desirable
   to use this capability to take advantage of hardware replication for
   encapsulated packets.  In this case, multicast addresses may be
   allocated in the physical network corresponding to tenants,
   encapsulated multicast groups, or some other factor.  The allocation
   of these groups is a component of the control plane and, therefore,
   is beyond the scope of this document.</t>
          <t> 
   When physical multicast is in
   use, devices with heterogeneous capabilities may be present in the same group. 
   Some options may only be interpretable by a subset of the devices in the group. 
   Other devices can safely ignore such options unless the 'C' bit is set to 
   mark the unknown option as critical.  The requirements outlined in <xref target="sec-3.4" format="default"/> 
   apply for critical options.</t>
          <t>
   In addition, <xref target="RFC8293" format="default"/> provides examples of various mechanisms that can 
   be used for multicast handling in network virtualization overlay networks.</t>
        </section>
        <section anchor="sec-4.4.4" numbered="true" toc="default">
          <name>Unidirectional Tunnels</name>
          <t>
   Generally speaking, a Geneve tunnel is a unidirectional concept.  IP
   is not a connection-oriented protocol, and it is possible for two
   tunnel endpoints to communicate with each other using different paths or to
   have one side not transmit anything at all.  As Geneve is an IP-based
   protocol, the tunnel layer inherits these same characteristics.</t>
          <t>
   It is possible for a tunnel to encapsulate a protocol, such as TCP,
   that is connection oriented and maintains session state at that
   layer.  In addition, implementations <bcp14>MAY</bcp14> model Geneve tunnels as
   connected, bidirectional links, for example, to provide the abstraction of
   a virtual port.  In both of these cases, bidirectionality of the
   tunnel is handled at a higher layer and does not affect the operation
   of Geneve itself.</t>
        </section>
      </section>
      <section anchor="sec-4.5" numbered="true" toc="default">
        <name>Constraints on Protocol Features</name>
        <t>
   Geneve is intended to be flexible for use with a wide range of current and
   future applications.  As a result, certain constraints may be placed
   on the use of metadata or other aspects of the protocol in order to
   optimize for a particular use case.  For example, some applications
   may limit the types of options that are supported or enforce a
   maximum number or length of options.  Other applications may only
   handle certain encapsulated payload types, such as Ethernet or IP.
   These optimizations can be implemented either globally (throughout
   the system) or locally (for example, restricted to certain classes
   of devices or network paths).</t>
        <t>
   These constraints may be communicated to tunnel endpoints either
   explicitly through a control plane or implicitly by the nature of the
   application.  As Geneve is defined as a data plane protocol that is
   control plane agnostic, definition of such mechanisms is beyond the scope of this
   document.</t>
        <section anchor="sec-4.5.1" numbered="true" toc="default">
          <name>Constraints on Options</name>
          <t>
   While Geneve options are flexible, a control plane may restrict
   the number of option TLVs as well as the order and size of the TLVs
   between tunnel endpoints to make it simpler for a data plane
   implementation in software or hardware to handle (see <xref target="I-D.ietf-nvo3-encap" format="default"/>).
   For example, there may be some critical information, such as a secure
   hash, that must be processed in a certain order to provide the lowest
   latency, or there may be other scenarios where the options must be
	  processed in a given order due to protocol semantics.</t>
          <t>
   A control plane may negotiate a subset of option TLVs and certain TLV
   ordering; it may also limit the total number of option TLVs present
   in the packet, for example, to accommodate hardware capable of
   processing fewer options.  Hence, a control plane
   needs to have the ability to describe the supported TLV subset and
   its ordering to the tunnel endpoints.  In the absence of a control
   plane, alternative configuration mechanisms may be used for this
   purpose.  Such mechanisms are beyond the scope of this document.</t>
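        <t>
   As an illustration only, the following minimal Python sketch shows how a
   software data plane might check a packet's options against such a
   restriction. The option header layout (16-bit class, 8-bit type, 5-bit
   length in 4-byte multiples) follows the tunnel option format defined in
   this document. The function names, the policy representation, and the rule
   that each permitted option appears at most once are assumptions made for
   the example and are not defined by this document.</t>
        <sourcecode type="python"><![CDATA[
```python
import struct
from typing import List, Sequence, Tuple

Option = Tuple[int, int, bytes]  # (option class, type, option data)


def parse_options(buf: bytes) -> List[Option]:
    """Split the variable-length options area into individual TLVs."""
    opts = []
    off = 0
    while off < len(buf):
        if len(buf) - off < 4:
            raise ValueError("truncated option header")
        opt_class, opt_type, rsvd_len = struct.unpack_from("!HBB", buf, off)
        data_len = (rsvd_len & 0x1F) * 4  # low 5 bits, 4-byte multiples
        end = off + 4 + data_len
        if end > len(buf):
            raise ValueError("truncated option data")
        opts.append((opt_class, opt_type, buf[off + 4:end]))
        off = end
    return opts


def conforms(opts: Sequence[Option],
             allowed_order: Sequence[Tuple[int, int]],
             max_options: int) -> bool:
    """Hypothetical policy check: options must be drawn from
    allowed_order, appear in that relative order, and not exceed
    max_options in number."""
    if len(opts) > max_options:
        return False
    pos = 0
    for opt_class, opt_type, _ in opts:
        key = (opt_class, opt_type)
        # Skip permitted options that this packet does not carry.
        while pos < len(allowed_order) and allowed_order[pos] != key:
            pos += 1
        if pos == len(allowed_order):
            return False  # not permitted, or out of order
        pos += 1
    return True
```
]]></sourcecode>
        <t>
   In this sketch, the ordered list and the maximum count stand in for
   whatever a control plane or configuration mechanism would communicate to
   the tunnel endpoints.</t>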
        </section>
      </section>
      <section anchor="sec-4.6" numbered="true" toc="default">
        <name>NIC Offloads</name>

        <t>
   Modern NICs provide a variety of offloads to enable the
   efficient processing of packets.  The implementation of many of these
   offloads requires only that the encapsulated packet be easily parsed
   (for example, checksum offload).  However, optimizations such as LSO
   and LRO involve some processing of the options themselves since they
   must be replicated/merged across multiple packets.  In these
   situations, it is desirable not to require changes to the offload
   logic to handle the introduction of new options.  To enable this,
   some constraints are placed on the definitions of options to allow
   for simple processing rules:</t>
        <ul spacing="normal">
          <li>When performing LSO, a NIC <bcp14>MUST</bcp14> replicate the entire Geneve header
      and all options, including those unknown to the device, onto each
      resulting segment unless an option allows an exception.  
      Conversely, when performing LRO, a NIC may assume that a
      binary comparison of the options (including unknown options) is
      sufficient to ensure equality and <bcp14>MAY</bcp14> merge packets with equal
      Geneve headers.</li>
          <li>Options <bcp14>MUST NOT</bcp14> be reordered during the course of offload
      processing, including when merging packets for the purpose of LRO.</li>
          <li>NICs performing offloads <bcp14>MUST NOT</bcp14> drop packets with unknown
      options, including those marked as critical, unless explicitly configured to do so.</li>
        </ul>
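        <t>
   The first rule can be sketched in Python as follows. This is a simplified
   illustration, not a description of any particular NIC: it treats
   everything after the Geneve header as an opaque payload, whereas real LSO
   and LRO also segment or merge the inner transport payload and adjust the
   inner and outer headers. It also does not model options that allow an
   exception to replication. The function names and the mss parameter are
   assumptions made for the example.</t>
        <sourcecode type="python"><![CDATA[
```python
from typing import List


def geneve_header_len(packet: bytes) -> int:
    """Length of the fixed 8-byte Geneve header plus all options."""
    if len(packet) < 8:
        raise ValueError("truncated Geneve header")
    opt_len = (packet[0] & 0x3F) * 4  # Opt Len: 6 bits, 4-byte multiples
    total = 8 + opt_len
    if len(packet) < total:
        raise ValueError("truncated Geneve options")
    return total


def lro_can_merge(a: bytes, b: bytes) -> bool:
    """Binary comparison of header and options, including options
    that are unknown to the device."""
    la, lb = geneve_header_len(a), geneve_header_len(b)
    return la == lb and a[:la] == b[:lb]


def lso_segment(packet: bytes, mss: int) -> List[bytes]:
    """Copy the entire Geneve header and all options, unmodified
    and in order, onto every resulting segment."""
    hlen = geneve_header_len(packet)
    header, payload = packet[:hlen], packet[hlen:]
    return [header + payload[i:i + mss]
            for i in range(0, max(len(payload), 1), mss)]
```
]]></sourcecode>
        <t>
   Because both functions handle the options as an uninterpreted byte string,
   neither needs to change when new options are defined.</t>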
        <t>
   There is no requirement that a given implementation of Geneve employ
   the offloads listed as examples above.  However, as these offloads
   are currently widely deployed in commercially available NICs, the
   rules described here are intended to enable efficient handling of
   current and future options across a variety of devices.</t>
      </section>
      <section anchor="sec-4.7" numbered="true" toc="default">
        <name>Inner VLAN Handling</name>
        <t>
   Geneve is capable of encapsulating a wide range of protocols; therefore, a given implementation is likely to support only a small
   subset of the possibilities.  However, as Ethernet is expected to be
   widely deployed, it is useful to describe the behavior of VLANs
   inside encapsulated Ethernet frames.</t>
        <t>
   As with any protocol, support for inner VLAN headers is <bcp14>OPTIONAL</bcp14>.  In
   many cases, the use of encapsulated VLANs may be disallowed due to
   security or implementation considerations.  However, in other cases, the trunking of VLAN frames across a Geneve tunnel can prove useful.  As
   a result, the processing of inner VLAN tags upon ingress or egress
   from a tunnel endpoint is based upon the configuration of the tunnel
   endpoint and/or control plane and is not explicitly defined as part of
   the data format.</t>
      </section>
    </section>
    <section anchor="sec-5" numbered="true" toc="default">
      <name>Transition Considerations</name>
      <t>
   Viewed exclusively from the data plane, Geneve is compatible with existing IP networks 
   as it appears to most devices as UDP packets.
   However, as there are already a number of tunneling protocols deployed
   in network virtualization environments, there is a practical question
   of transition and coexistence.</t>
      <t>
   Since Geneve builds on the base data plane functionality provided by the most
   common protocols used for network virtualization (VXLAN and NVGRE),
   it should be straightforward to port an existing control plane
   to run on top of it.  With both the old and new
   packet formats supporting the same set of capabilities, there is no
   need for a hard transition; tunnel endpoints directly communicating with
   each other can use any common protocol, which may be different even
   within a single overall system.

   As transit devices are primarily
   forwarding packets on the basis of the IP header, all protocols
   appear to be similar, and these devices do not introduce additional
   interoperability concerns.</t>
      <t>
   To assist with this transition, it is strongly suggested that
   implementations support simultaneous operation of both Geneve and
   existing tunneling protocols, as it is expected to be common for a single
   node to communicate with a mixture of other nodes. Eventually, older
   protocols may be phased out as they are no longer in use.</t>
    </section>
    <section anchor="sec-6" numbered="true" toc="default">
      <name>Security Considerations</name>



      <t>
   As it is encapsulated within a UDP/IP packet, Geneve does not have any inherent security
   mechanisms.
   As a result, an attacker with access to the underlay
   network transporting the IP packets has the ability to snoop on, alter, or
   inject packets.  Compromised tunnel endpoints or transit devices may also
   spoof identifiers in the tunnel header to gain access to networks
   owned by other tenants.</t>
      <t>
   Within a particular security domain, such as a data center operated
   by a single service provider, the most common and highest-performing security
   mechanism is isolation of trusted components.  Tunnel traffic can be
   carried over a separate VLAN and filtered at any untrusted
   boundaries.</t>
      <t>
   When crossing an untrusted link, such as the general Internet, VPN technologies such as IPsec
   <xref target="RFC4301" format="default"/> should be used to provide authentication and/or encryption of
   the IP packets formed as part of Geneve encapsulation (see <xref target="sec-6.1.1" format="default"/>).</t>
      <t>
   Geneve does not otherwise affect the security of the encapsulated
   packets. As per the guidelines of BCP 72 <xref target="RFC3552" format="default"/>, the following sections 
   describe potential security risks that may be applicable to Geneve deployments 
   and approaches to mitigate such risks. Not all such risks apply
   to every Geneve deployment scenario; only a subset may be applicable to a given deployment. 
   An operator has to assess their network
   environment, determine which risks apply to it, and use appropriate mitigation approaches. </t>
      <section anchor="sec-6.1" numbered="true" toc="default">
        <name>Data Confidentiality</name>
        <t>
	Geneve is a network virtualization overlay encapsulation protocol 
	designed to establish tunnels between NVEs
	over an existing IP network. It can be used to deploy multi-tenant overlay networks
       	over an existing IP underlay network in a public or private data center. 

	The overlay service is typically provided by a service provider, such as a 
	cloud service provider or a private data center operator. This may or may not be 
	the same provider as the underlay service provider. Due to the nature of multi-tenancy in such environments, 
	a tenant system may expect data confidentiality to ensure that its packet data is neither tampered with
	in transit (i.e., an active attack) nor subject to unauthorized
	monitoring (i.e., a passive attack), for example, by other tenant systems or the underlay service provider.
	A compromised network node or a transit device within a
	data center may passively monitor Geneve packet data between NVEs or route
	traffic for further inspection. A tenant may
	expect the overlay service provider to provide data confidentiality as part of the service, or
	a tenant may bring its own data confidentiality mechanisms like IPsec or TLS to protect the data
	end to end between its tenant systems. The overlay provider is expected to provide 
   	cryptographic protection in cases where the underlay provider is not the 
   	same as the overlay provider to ensure the payload is not exposed to the underlay.</t>
        <t>
	If an operator determines data confidentiality is necessary in their environment 
	based on their risk analysis -- for example, in multi-tenant
	environments -- then an encryption mechanism <bcp14>SHOULD</bcp14> be used to encrypt the tenant
	data end to end between the NVEs. The NVEs may use existing well-established 
	encryption mechanisms, such as IPsec, DTLS, etc.</t>
        <section anchor="sec-6.1.1" numbered="true" toc="default">

          <name>Inter-Data Center Traffic</name>
          <t>
	A tenant system in a customer premises (private data center) may want to connect
	to tenant systems on their tenant overlay network in a public cloud data center, or a tenant may want to have its tenant systems located in multiple geographically
	separated data centers for high availability. Geneve data traffic between tenant systems
	across such separated networks should be protected from threats when traversing public networks.
	Any Geneve overlay data leaving the data center network beyond the operator's security domain
	<bcp14>SHOULD</bcp14> be secured by encryption mechanisms, such as 
	IPsec or other VPN technologies, to protect the communications between the NVEs 
	when they are geographically separated over untrusted network links. Specification of 
	data protection mechanisms employed between data centers is beyond the scope of this document.</t>
          <t>
	The principles described in <xref target="sec-4" format="default"/> regarding controlled environments still apply to 
	the geographically separated data center usage outlined in this section.</t>
        </section>
      </section>
      <section anchor="sec-6.2" numbered="true" toc="default">
        <name>Data Integrity</name>
        <t>
	Geneve encapsulation is used between NVEs to establish overlay tunnels over an existing
	IP underlay network. In a multi-tenant data center,  a rogue or compromised tenant system
	may try to launch a passive attack, such as monitoring the traffic of other tenants, or an
	active attack, such as trying to inject unauthorized Geneve encapsulated traffic such 
	as spoofing, replay, etc., into the network. To prevent such attacks, an NVE <bcp14>MUST NOT</bcp14> 
	propagate Geneve packets beyond the NVE to tenant systems and <bcp14>SHOULD</bcp14> employ packet-filtering
	mechanisms so as not to forward unauthorized traffic between tenant systems in different tenant networks.
	An NVE <bcp14>MUST NOT</bcp14> interpret Geneve packets from tenant systems other than as frames to be encapsulated.</t>
        <t>
	A compromised network node or a transit device within a data center may launch an active
	attack trying to tamper with the Geneve packet data between NVEs. Malicious tampering of
	Geneve header fields may cause the packet from one tenant to be forwarded to a different 
	tenant network. If an operator determines there is a possibility of such a threat in their environment,
	the operator may choose to employ data integrity mechanisms between NVEs. In order to prevent
	such risks, a data integrity mechanism <bcp14>SHOULD</bcp14> be used in such environments to protect the
	integrity of Geneve packets, including packet headers, options, and payload on communications
	between NVE pairs. A cryptographic data protection mechanism, such as IPsec, may be used to
	provide data integrity protection. A data center operator may choose to deploy any other 
	data integrity mechanisms as applicable and supported in their underlay networks,
	although non-cryptographic mechanisms may not protect the Geneve portion of the packet from tampering. </t>
      </section>
      <section anchor="sec-6.3" numbered="true" toc="default">
        <name>Authentication of NVE Peers</name>
        <t>
	A rogue network device or a compromised NVE in a data center environment might be able to
	spoof Geneve packets as if they came from a legitimate NVE. In order to mitigate such a risk,
	an operator <bcp14>SHOULD</bcp14> use an authentication mechanism, such as IPsec, to ensure that the 
	Geneve packet originated from the intended NVE peer in environments where the operator
	determines spoofing or rogue devices are potential threats. Other simpler source checks,
	such as ingress filtering for VLAN/MAC/IP addresses, reverse path forwarding checks, etc.,
	may be used in certain trusted environments to ensure Geneve packets originated
       	from the intended NVE peer.</t>
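        <t>
   As a hedged illustration of the simpler source checks mentioned above, the
   following Python sketch filters on the outer source IP address only. The
   function name and the list of peer prefixes are hypothetical. A check of
   this kind does not authenticate the sender and is not a substitute for a
   cryptographic mechanism such as IPsec.</t>
        <sourcecode type="python"><![CDATA[
```python
import ipaddress
from typing import Iterable


def from_known_peer(src_ip: str, peer_prefixes: Iterable[str]) -> bool:
    """Return True if the outer source address of a received Geneve
    packet falls within one of the configured NVE peer prefixes."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in ipaddress.ip_network(prefix)
               for prefix in peer_prefixes)
```
]]></sourcecode>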
      </section>
      <section anchor="sec-6.4" numbered="true" toc="default">
        <name>Options Interpretation by Transit Devices</name>
        <t>
	Options, if present in the packet, are generated and terminated by tunnel endpoints. As indicated
	in <xref target="sec-2.2.1" format="default"/>, transit devices may interpret the options. However, 
	if the packet is protected by encryption from tunnel endpoint
	to tunnel endpoint (for example, through IPsec), transit devices will not have visibility into the Geneve header or options
	in the packet.  In such cases, transit devices <bcp14>MUST</bcp14> handle Geneve packets in the same way as any other IP packet 
	and maintain consistent forwarding behavior. In cases where options are interpreted by transit devices, the operator
	<bcp14>MUST</bcp14> ensure that transit devices are trusted and not compromised. The definition of 
	a mechanism to ensure this trust is beyond the scope of this document.</t>
      </section>
      <section anchor="sec-6.5" numbered="true" toc="default">
        <name>Multicast/Broadcast</name>
        <t>
	In typical data center networks where IP multicasting is not supported in the underlay 
	network, multicasting may be supported using multiple unicast tunnels. The same security
	requirements as described in the above sections can be used to protect Geneve communications
	between NVE peers. If IP multicasting is supported in the underlay network and the operator
	chooses to use it for multicast traffic among tunnel endpoints, then the operator in such
	environments may use data protection mechanisms, such as IPsec with multicast 
	extensions <xref target="RFC5374" format="default"/>, to protect multicast traffic among Geneve NVE groups.</t>
      </section>
      <section anchor="sec-6.6" numbered="true" toc="default">
        <name>Control Plane Communications</name>
        <t>
	A Network Virtualization Authority (NVA) as outlined in <xref target="RFC8014" format="default"/> may
	be used as a control plane for configuring and managing the Geneve NVEs. The data center
	operator is expected to use security mechanisms to protect the communications between
	the NVA and NVEs and to use authentication mechanisms to detect any rogue or compromised 
	NVEs within their administrative domain.  Data protection mechanisms for control plane 
	communication or authentication mechanisms between the NVA and NVEs are beyond 
	the scope of this document.</t>
      </section>
    </section>
    <section anchor="sec-7" numbered="true" toc="default">
      <name>IANA Considerations</name>
      <t>
	IANA has allocated UDP port 6081 in the "Service Name and Transport Protocol 
	Port Number Registry" <xref target="IANA-SN" format="default"/> as the well-known destination port
       	for Geneve:</t>
<dl newline="false" spacing="compact">
<dt>Service Name:</dt><dd>geneve</dd>
<dt>Transport Protocol(s):</dt><dd>UDP</dd>
<dt>Assignee:</dt><dd>IESG &lt;iesg@ietf.org&gt;</dd>
<dt>Contact:</dt><dd>IETF Chair &lt;chair@ietf.org&gt;</dd>
<dt>Description:</dt><dd>Generic Network Virtualization Encapsulation (Geneve)</dd>
<dt>Reference:</dt><dd>[RFC8926]</dd>
<dt>Port Number:</dt><dd>6081</dd>
</dl>
      <t>
   In addition, IANA has created a new subregistry titled "Geneve Option Class"
   for option classes. This registry has been placed under  
   a new "Network Virtualization Overlay (NVO3)" heading in the IANA protocol registries <xref target="IANA-PR" format="default"/>. 
   The "Geneve Option Class" registry consists of
   16-bit hexadecimal values along with descriptive strings, assignee/contact information, and references.  
   The registration rules for the new registry are (as defined by <xref target="RFC8126" format="default"/>):</t>
      <table align="center"> <name>Geneve Option Class Registry Ranges</name>
        <thead>
          <tr>
            <th align="left"> Range</th>
            <th align="left"> Registration Procedures</th>
          </tr>
        </thead>
        <tbody>
          <tr>
            <td align="left">0x0000-0x00FF</td>
            <td align="left">IETF Review</td>
          </tr>
          <tr>
            <td align="left">0x0100-0xFEFF</td>
            <td align="left">First Come First Served</td>
          </tr>
          <tr>
            <td align="left">0xFF00-0xFFFF</td>
            <td align="left">Experimental Use</td>
          </tr>
        </tbody>
      </table>
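      <t>
   For illustration, the ranges in the table above can be expressed as a
   simple lookup. The function name is an assumption made for this example;
   the range boundaries and procedure names are taken directly from the
   table.</t>
      <sourcecode type="python"><![CDATA[
```python
def registration_procedure(option_class: int) -> str:
    """Map a 16-bit Geneve option class to its registration procedure."""
    if not 0 <= option_class <= 0xFFFF:
        raise ValueError("option class must fit in 16 bits")
    if option_class <= 0x00FF:
        return "IETF Review"
    if option_class <= 0xFEFF:
        return "First Come First Served"
    return "Experimental Use"
```
]]></sourcecode>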
    </section>

 
  </middle>
  <back>

<displayreference target="I-D.ietf-nvo3-encap" to="NVO3-ENCAP"/>
<displayreference target="I-D.ietf-nvo3-dataplane-requirements" to="NVO3-DATAPLANE"/>
<displayreference target="I-D.ietf-intarea-tunnels" to="INTAREA-TUNNELS"/>

    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.0768.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.0792.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.1122.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.1191.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2003.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4443.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6040.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6936.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7365.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8085.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8126.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8200.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8201.xml"/>
      </references>
      <references>
        <name>Informative References</name>



        <reference anchor="ETYPES" target="https://www.iana.org/assignments/ieee-802-numbers">
          <front>
            <title>IEEE 802 Numbers</title>
            <author>
              <organization>IANA</organization>
            </author>
          </front>
        </reference>

        <xi:include href="https://datatracker.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-nvo3-encap.xml"/>

        <xi:include href="https://datatracker.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-nvo3-dataplane-requirements.xml"/>

        <xi:include href="https://datatracker.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-intarea-tunnels.xml"/>



        <reference anchor="IANA-PR" target="https://www.iana.org/protocols">
          <front>
            <title>Protocol Registries</title>
            <author>
              <organization>IANA</organization>
            </author>
               </front>
        </reference>



        <reference anchor="IANA-SN" target="https://www.iana.org/assignments/service-names-port-numbers">
          <front>
            <title>Service Name and Transport Protocol Port Number Registry</title>
            <author>
              <organization>IANA</organization>
            </author>
              </front>
        </reference>

<reference anchor="IEEE.802.1Q_2018" target="http://ieeexplore.ieee.org/servlet/opac?punumber=8403925">
          <front>
            <title>IEEE Standard for Local and Metropolitan Area Networks--Bridges and Bridged Networks</title>
            <seriesInfo name="DOI" value="10.1109/IEEESTD.2018.8403927"/>
            <seriesInfo name="IEEE" value="802.1Q-2018"/>
            <author>
              <organization>IEEE</organization>
            </author>
            <date month="July" year="2018"/>
          </front>
        </reference>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2983.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3031.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3552.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3985.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4301.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5374.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6438.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7348.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7637.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8014.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8086.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8293.xml"/>



        <reference anchor="VL2" target="https://dl.acm.org/doi/10.1145/1594977.1592576">
          <front>
            <title>VL2: A Scalable and Flexible Data Center Network</title>
            <seriesInfo name="DOI" value="10.1145/1594977.1592576"/>
            <author surname="Greenberg, A., et al.">
              <organization/>
            </author>
            <date month="August" year="2009"/>
          </front>
       <refcontent>ACM SIGCOMM Computer Communication Review</refcontent>
        </reference>
      </references>
    </references>
   <section anchor="sec-9" numbered="false" toc="default">
      <name>Acknowledgements</name>
      <t> 
	The authors wish to acknowledge <contact fullname="Puneet Agarwal"/>,
	<contact fullname="David Black"/>, <contact fullname="Sami Boutros"/>,
	  <contact fullname="Scott Bradner"/>, 
	<contact fullname="Martín Casado"/>, <contact fullname="Alissa Cooper"/>,
	  <contact fullname="Roman Danyliw"/>, <contact fullname="Bruce Davie"/>,
	    <contact fullname="Anoop Ghanwani"/>, <contact fullname="Benjamin
	      Kaduk"/>, <contact fullname="Suresh Krishnan"/>, <contact
	      fullname="Mirja Kühlewind"/>, <contact fullname="Barry Leiba"/>,
	      <contact fullname="Daniel Migault"/>, <contact fullname="Greg
		Mirsky"/>, <contact fullname="Tal Mizrahi"/>, 
	<contact fullname="Kathleen Moriarty"/>, <contact fullname="Magnus
	  Nyström"/>, <contact fullname="Adam Roach"/>, <contact fullname="Sabrina
	  Tanamal"/>, <contact fullname="Dave Thaler"/>, <contact fullname="Éric Vyncke"/>, 
	<contact fullname="Magnus Westerlund"/>, and many other members of the NVO3 Working Group for their reviews, comments, and suggestions.</t>
      <t> 
	The authors would like to thank <contact fullname="Sam Aldrin"/>,
	<contact fullname="Alia Atlas"/>, <contact fullname="Matthew Bocci"/>,
	  <contact fullname="Benson Schliesser"/>, and <contact fullname="Martin Vigoureux"/> 
	for their guidance throughout the process.</t>
    </section>
    
   <section anchor="sec-8" numbered="false" toc="default">
      <name>Contributors</name>
      <t>
   The following individuals were authors of an earlier version of this
   document and made significant contributions:</t>

    <contact fullname="Pankaj Garg" >
        <organization>Microsoft Corporation</organization>
        <address>
          <postal>
            <street>1 Microsoft Way</street>
            <city>Redmond</city>
            <region>WA</region><code>98052</code>
            <country>United States of America</country>
          </postal>
          <email>pankajg@microsoft.com</email>
        </address>
    </contact>

    <contact fullname="Chris Wright" >
        <organization>Red Hat Inc.</organization>
        <address>
          <postal>
            <street>1801 Varsity Drive</street>
            <city>Raleigh</city>
            <region>NC</region><code>27606</code>
            <country>United States of America</country>
          </postal>
          <email>chrisw@redhat.com</email>
        </address>
    </contact>

    <contact fullname="Kenneth Duda" >
        <organization>Arista Networks</organization>
        <address>
          <postal>
            <street>5453 Great America Parkway</street>
            <city>Santa Clara</city>
            <region>CA</region><code>95054</code>
            <country>United States of America</country>
          </postal>
          <email>kduda@arista.com</email>
        </address>
    </contact>

    <contact fullname="Dinesh G. Dutt" >
        <organization>Independent</organization>
        <address>
          <postal>
            <street></street>
            <city></city>
            <region></region><code></code>
            <country></country>
          </postal>
          <email>didutt@gmail.com</email>
        </address>
    </contact>

    <contact fullname="Jon Hudson" >
        <organization>Independent</organization>
        <address>
          <postal>
            <street></street>
            <city></city>
            <region></region><code></code>
            <country></country>
          </postal>
          <email>jon.hudson@gmail.com</email>
        </address>
    </contact>

    <contact fullname="Ariel Hendel" >
        <organization>Facebook, Inc.</organization>
        <address>
          <postal>
            <street>1 Hacker Way</street>
            <city>Menlo Park</city>
            <region>CA</region><code>94025</code>
            <country>United States of America</country>
          </postal>
          <email>ahendel@fb.com</email>
        </address>
    </contact>
   </section>
    
  </back>
</rfc>
