<?xml version="1.0" encoding="utf-8"?>

<!-- generated by https://github.com/cabo/kramdown-rfc2629 version 1.6.5 (Ruby 2.7.0) -->

<!DOCTYPE rfc [
 <!ENTITY nbsp    "&#160;">
 <!ENTITY zwsp   "&#8203;">
 <!ENTITY nbhy   "&#8209;">
 <!ENTITY wj     "&#8288;">
]> 

<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" docName="draft-ietf-opsec-probe-attribution-09" number="9511" submissionType="IETF" category="info" consensus="true" tocInclude="true" sortRefs="true" symRefs="true" updates="" obsoletes="" xml:lang="en" version="3">

<!-- xml2rfc v2v3 conversion 2.40.0 -->
  <front>
    <title abbrev="Attribution of Internet Probes">Attribution of Internet Probes</title>
    <seriesInfo name="RFC" value="9511"/>
    <author initials="É." surname="Vyncke" fullname="Éric Vyncke">
      <organization>Cisco</organization>
      <address>
        <postal>
          <street>De Kleetlaan 6A</street>
          <city>Diegem</city>
          <code>1831</code>
          <country>Belgium</country>
        </postal>
        <email>evyncke@cisco.com</email>
      </address>
    </author>
    <author initials="B." surname="Donnet" fullname="Benoît Donnet">
      <organization>Université de Liège</organization>
      <address>
        <postal>
          <country>Belgium</country>
        </postal>
        <email>benoit.donnet@uliege.be</email>
      </address>
    </author>
    <author initials="J." surname="Iurman" fullname="Justin Iurman">
      <organization>Université de Liège</organization>
      <address>
        <postal>
          <country>Belgium</country>
        </postal>
        <email>justin.iurman@uliege.be</email>
      </address>
    </author>
    <date year="2023" month="November"/>

    <area>ops</area>
    <workgroup>opsec</workgroup>

<keyword>research</keyword>
<keyword>measurement</keyword>
<keyword>identification</keyword>
<keyword>probing</keyword>
<keyword>out-of-band</keyword>
<keyword>in-band</keyword>

    <abstract>
      <t>Active measurements over the public Internet can target either collaborating parties or non-collaborating ones. Sometimes these measurements, also called "probes", are viewed as unwelcome or aggressive.</t>
      <t>This document suggests some simple techniques for a source to identify its probes. This allows any party or organization to understand what an unsolicited probe packet is, what its purpose is, and, most importantly, who to contact. The technique relies on offline analysis of the probe; therefore, it does not require any change in the data or control plane. It has been designed mainly for layer 3 measurements.</t>
    </abstract>
  </front>
  <middle>
    <section anchor="introduction">
      <name>Introduction</name>
      <t>Most measurement research (e.g., <xref target="LARGE_SCALE"/>, <xref target="RFC7872"/>, and <xref target="I-D.vyncke-v6ops-james"/>) is about sending IP packets (sometimes with extension headers or layer 4 headers) over the public Internet, and those packets can be destined to either collaborating parties or non-collaborating ones. Such packets are called "probes" in this document.</t>
      <t>Sending unsolicited probes should obviously be done at a rate low enough to not unduly impact the other parties' resources. But even at a low rate, those probes could trigger an alarm that calls for an investigation by either the party receiving the probe (i.e., when the probe destination address is an address assigned to the receiving party) or a third party operating devices through which those probes transit (e.g., an Internet transit router). The investigation will be done offline by using packet captures; therefore, probe attribution does not require any change in the data or control planes.</t>
      <t>This document suggests some simple techniques for a source to identify its probes. This allows any party or organization to understand:</t>
      <ul spacing="normal">
        <li>what an unsolicited probe packet is,</li>
        <li>what its purpose is, and</li>
        <li>most importantly, who to contact for further information.</li>
      </ul>
      <t>It is expected that only researchers with good intentions will use these techniques, although anyone might use them. This is discussed in <xref target="security"/>.</t>
      <t>While the technique could be used to mark measurements done at any layer of the protocol stack, it is mainly designed to work for measurements done at layer 3 (and its associated options or extension headers).</t>
    </section>
    <section anchor="probe-description">
      <name>Probe Description</name>
      <t>This section provides a way for a source to describe (i.e., to identify) its probes.</t>
      <section anchor="uri">
        <name>Probe Description URI</name>
        <t>This document defines a Probe Description URI as a URI pointing to one of the following:</t>
        <ul spacing="normal">
          <li>a Probe Description File (see <xref target="file"/>) as defined in <xref target="iana"/>, e.g., "https://example.net/.well-known/probing.txt";</li>
          <li>an email address, e.g., "mailto:lab@example.net"; or</li>
          <li>a phone number, e.g., "tel:+1-201-555-0123".</li>
        </ul>
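        <t>As an illustration only, the three forms listed above could be told apart as in the following minimal sketch. The function name and the return labels are hypothetical, and accepting only the "https" scheme for the Probe Description File is an assumption of this sketch rather than a requirement stated in this document.</t>
        <sourcecode type="python"><![CDATA[
```python
from urllib.parse import urlsplit

WELL_KNOWN_PATH = "/.well-known/probing.txt"


def classify_probe_uri(uri):
    """Return 'file', 'email', or 'phone' for a Probe Description URI.

    Returns None when the URI matches none of the three forms.
    Assumption of this sketch: only https is accepted for the file form.
    """
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme == "https" and parts.netloc and parts.path == WELL_KNOWN_PATH:
        return "file"
    if scheme == "mailto" and "@" in parts.path:
        return "email"
    if scheme == "tel" and parts.path:
        return "phone"
    return None
```
]]></sourcecode>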
      </section>
      <section anchor="file">
        <name>Probe Description File</name>
        <t>As defined in <xref target="iana"/>, the Probe Description File must be made available at "/.well-known/probing.txt". The Probe Description File must follow the format defined in <xref target="RFC9116" section="4" sectionFormat="of" /> and should contain the following fields defined in <xref target="RFC9116" section="2" sectionFormat="of" />:</t>
        <ul spacing="normal">
          <li>Canonical</li>
          <li>Contact</li>
          <li>Expires</li>
          <li>Preferred-Languages</li>
        </ul>
        <t>A new field "Description" should also be included to describe the measurement. To match the format defined in <xref target="RFC9116" section="4" sectionFormat="of" />, this field must be a one-line string with no line break.</t>
        <section anchor="example">
          <name>Example</name>
          <artwork><![CDATA[
# Canonical URI (if any)
Canonical: https://example.net/measurement.txt

# Contact address
Contact: mailto:lab@example.net

# Validity
Expires: 2023-12-31T18:37:07z

# Languages
Preferred-Languages: en, es, fr

# Probes description
Description: This is a one-line string description of the probes.
]]></artwork>
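          <t>A recipient could read such a file with a few lines of code. The following is a minimal sketch, not a complete parser for the format of <xref target="RFC9116"/>: it assumes one "Name: value" field per line, treats lines starting with "#" as comments, and does not validate field values. The function names are hypothetical.</t>
          <sourcecode type="python"><![CDATA[
```python
# Fields that this document says the file should contain.
RECOMMENDED_FIELDS = (
    "canonical",
    "contact",
    "expires",
    "preferred-languages",
    "description",
)


def parse_probe_description(text):
    """Map lowercased field names to the list of their values."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a field line: {line!r}")
        fields.setdefault(name.strip().lower(), []).append(value.strip())
    return fields


def missing_fields(fields):
    """List the recommended fields that are absent from the file."""
    return [name for name in RECOMMENDED_FIELDS if name not in fields]
```
]]></sourcecode>
          <t>Applied to the example above, missing_fields() returns an empty list, and the "contact" entry holds the single value "mailto:lab@example.net".</t>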
        </section>
      </section>
    </section>
    <section anchor="out-of-band-probe-attribution">
      <name>Out-of-Band Probe Attribution</name>
      <t>A possibility for probe attribution is to build a specific URI based on the source address of the probe packet, following <xref target="RFC8615"/>. For example, with a probe source address 2001:db8:dead::1, the following URI is built:</t>
      <ul spacing="normal">
        <li>If the reverse DNS record for 2001:db8:dead::1 exists, e.g., "example.net", then the Probe Description URI is "https://example.net/.well-known/probing.txt". There should be only one reverse DNS record; otherwise, the Probe Description File should also exist for all reverse DNS records and be identical.</li>
        <li>Else (or in addition), the Probe Description URI is "https://[2001:db8:dead::1]/.well-known/probing.txt".</li>
      </ul>
      <t>The built URI must be a reference to the Probe Description File (see <xref target="file"/>).</t>
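      <t>The construction above can be sketched as follows. This is only an illustration: the function name is hypothetical, the reverse DNS lookup is assumed to be done by the caller, and the address-based URI is always appended (the "in addition" reading of the list above).</t>
      <sourcecode type="python"><![CDATA[
```python
import ipaddress

WELL_KNOWN_PATH = "/.well-known/probing.txt"


def out_of_band_uris(source_address, reverse_names=()):
    """Build candidate Probe Description URIs for a probe source address.

    reverse_names holds the names obtained by the caller from reverse
    DNS lookups on source_address (possibly none).
    """
    ip = ipaddress.ip_address(source_address)
    # IPv6 literals are enclosed in brackets inside a URI.
    host = f"[{ip}]" if ip.version == 6 else str(ip)
    uris = [
        f"https://{name.rstrip('.')}{WELL_KNOWN_PATH}"
        for name in reverse_names
    ]
    uris.append(f"https://{host}{WELL_KNOWN_PATH}")
    return uris
```
]]></sourcecode>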
      <t>As an example, the UK National Cyber Security Centre <xref target="NCSC"/> uses a similar attribution. They scan for vulnerabilities across Internet-connected systems in the UK and publish information on their scanning <xref target="NCSC_SCAN_INFO"/>, providing the address of the web page in reverse DNS.</t>
    </section>
    <section anchor="in-band-probe-attribution">
      <name>In-Band Probe Attribution</name>
      <t>Another possibility for probe attribution is to include a Probe Description URI in the probe itself. Here is a non-exhaustive list of examples:</t>
      <ul spacing="normal">
        <li>For an ICMPv6 echo request <xref target="RFC4443"/>, include it in the data field.</li>
        <li>For an ICMPv4 echo request <xref target="RFC0792"/>, include it in the data field.</li>
        <li>For a UDP datagram <xref target="RFC0768"/>, include it in the data payload if there is no upper-layer protocol after the transport layer.</li>
        <li>For a TCP segment <xref target="RFC9293"/>, include it in the data payload if there is no upper-layer protocol after the transport layer.</li>
        <li>For an IPv6 packet <xref target="RFC8200"/>, include it in a PadN option inside either a Hop-by-Hop or Destination Options header.</li>
      </ul>
      <t>The Probe Description URI must start at the first octet of the payload and must be terminated by an octet of 0x00, i.e., it must be null terminated. If the Probe Description URI cannot be placed at the beginning of the payload, then it must be preceded by an octet of 0x00. Inserting the Probe Description URI could obviously bias the measurement itself if the probe packet becomes larger than the path MTU. Some examples are given in <xref target="examples"/>.</t>
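      <t>The placement rule above can be sketched as follows. This is a minimal illustration with hypothetical function names. Recognizing a URI by its scheme prefix ("https://", "mailto:", or "tel:") is an assumption of this sketch; this document does not define an extraction algorithm.</t>
      <sourcecode type="python"><![CDATA[
```python
SCHEMES = (b"https://", b"mailto:", b"tel:")


def embed_probe_uri(uri, leading=b""):
    """Return payload octets carrying a null-terminated URI.

    When other data ('leading') has to come first, the URI is
    preceded by an octet of 0x00.
    """
    encoded = uri.encode("ascii")
    if b"\x00" in encoded:
        raise ValueError("URI must not contain an octet of 0x00")
    if leading:
        return leading + b"\x00" + encoded + b"\x00"
    return encoded + b"\x00"


def extract_probe_uri(payload):
    """Return the first null-terminated URI in payload, or None."""
    # Candidate starts: the first octet and every octet after a 0x00.
    starts = [0] + [i + 1 for i, octet in enumerate(payload) if octet == 0]
    for start in starts:
        end = payload.find(b"\x00", start)
        if end == -1:
            continue  # no terminating 0x00 after this start
        candidate = payload[start:end]
        if candidate.startswith(SCHEMES):
            return candidate.decode("ascii", errors="replace")
    return None
```
]]></sourcecode>
      <t>For instance, embedding "mailto:lab@example.net" without leading data yields the 23-octet payload carried by the TCP SYN example in <xref target="examples"/>.</t>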
      <t>Using a magic string (i.e., a unique, special opaque marker) to signal the presence of the Probe Description URI is not recommended as some transit nodes could apply different processing for packets containing this magic string.</t>
      <t>For the record, in-band probe attribution was used in <xref target="I-D.vyncke-v6ops-james"/>.</t>
    </section>
    <section anchor="operational-and-technical-considerations">
      <name>Operational and Technical Considerations</name>
      <t>Whether to use the out-of-band technique, the in-band technique, or both combined depends highly on intent and context. This section describes the upsides and downsides of each technique so that probe owners or probe makers can freely decide what works best for their cases.</t>
      <t>The advantages of using the out-of-band technique are that the probing measurement is not impacted by probe attribution and that it is easy to set up, i.e., by running a web server on a probe device to describe the measurements. Unfortunately, there are some disadvantages too. In some cases, using the out-of-band technique might not be possible due to several conditions: the presence of a NAT, too many endpoints to run a web server on, a probe source IP address that cannot be known (e.g., RIPE Atlas <xref target="RIPE_ATLAS"/> probes are sent from IP addresses not owned by the probe owner), dynamic source addresses, etc.</t>
      <t>The primary advantage of using the in-band technique is that it covers the cases where the out-of-band technique is not feasible (as described above). The primary disadvantage is that it could potentially bias the measurements, since packets with the Probe Description URI might be discarded. For example, data is allowed in TCP segments with the SYN flag <xref target="RFC9293"/>, but its presence may change the way those segments are processed, i.e., TCP segments with the SYN flag containing the Probe Description URI might be discarded. Another example is a Probe Description URI included in a PadN option inside a Hop-by-Hop or Destination Options header. <xref target="RFC4942" section="2.1.9.5" sectionFormat="of" /> (an Informational RFC) suggests that a PadN option should only contain 0s and be smaller than 8 octets, thus limiting its use for probe attribution. If a PadN option does not respect the recommendation, it is suggested that one may consider dropping such packets. For example, since version 3.5, the Linux kernel follows these recommendations and discards such packets.</t>
      <t>Having both the out-of-band and in-band techniques combined also has a big advantage, i.e., it could be used as an indirect means of "authenticating" the Probe Description URI in the in-band probe, thanks to a correlation with the out-of-band technique (e.g., a reverse DNS lookup). While the out-of-band technique alone is less prone to spoofing, the combination with the in-band technique offers a more complete solution.</t>
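      <t>One possible form of this correlation is sketched below, for illustration only. It assumes that the in-band URI is an "https" URI and that the caller has already obtained the reverse DNS names of the probe source address. The function name is hypothetical, and addresses are compared as plain strings without normalization.</t>
      <sourcecode type="python"><![CDATA[
```python
from urllib.parse import urlsplit


def corroborated(in_band_uri, reverse_names, source_address):
    """True if the host of the in-band URI matches out-of-band data.

    The host must be either one of the reverse DNS names of the probe
    source address or the source address itself.
    """
    host = urlsplit(in_band_uri).hostname
    if host is None:
        return False  # e.g., mailto: and tel: URIs carry no host
    known = {name.rstrip(".").lower() for name in reverse_names}
    known.add(source_address.lower())
    return host.lower() in known
```
]]></sourcecode>
      <t>A match only adds some confidence; as discussed in <xref target="security"/>, it does not make the information trustworthy by itself.</t>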
    </section>
    <section anchor="ethical-considerations">
      <name>Ethical Considerations</name>
      <t>Executing measurement experiments over the global Internet obviously requires ethical consideration, which is discussed in <xref target="ANRW_PAPER"/>, especially when unsolicited transit or destination parties are involved.</t>
      <t>This document proposes a common way to identify the source and the purpose of active probing in order to reduce the potential burden on the unsolicited parties.</t>
      <t>But there are other considerations to be taken into account, from the payload content (e.g., is the encoding valid?) to the transmission rate (see also <xref target="IPV6_TOPOLOGY"/> and <xref target="IPV4_TOPOLOGY"/> for some probing speed impacts). Those considerations are out of scope of this document.</t>
    </section>
    <section anchor="security">
      <name>Security Considerations</name>
      <t>This document proposes simple techniques for probe attribution. It is expected that only ethical researchers would use them, which would simplify and reduce the time to identify probes across the Internet. In fact, these techniques could be used by anyone, malicious or not, which means that the information obtained cannot be blindly trusted. Using these techniques should not mean that a probe can be trusted. Instead, third parties should use this solution to potentially understand the origin and context of such probes. This solution is not perfect, but it provides a way for probe attribution, which is better than no solution at all.</t>
      <t>Probe attribution is provided to identify the source and intent of specific probes, but there is no authentication possible for the inline information.  Therefore, a malevolent actor could provide false information while conducting the probes or spoof them so that the action is attributed to a third party. In that case, not only would this third party be wrongly accused, but it might also be exposed to unwanted solicitations (e.g., angry emails or phone calls if the malevolent actor used someone else's email address or phone number). As a consequence, the recipient of this information cannot trust it without confirmation.  If a recipient cannot confirm the information or does not wish to do so, it should treat the flows as if there were no probe attribution. Note that using probe attribution does not create a new DDoS vector since there is no expectation that third parties would automatically confirm the information obtained.</t>
      <t>As the Probe Description URI is transmitted in the clear and as the Probe Description File is publicly readable, Personally Identifiable Information (PII) should not be used for an email address and a phone number; a generic or group email address and phone number should be preferred. Also, the Probe Description File could contain malicious data (e.g., links) and therefore should not be blindly trusted.</t>
    </section>
    <section anchor="iana">
      <name>IANA Considerations</name>
      <t>IANA has added the following URI suffix to the "Well-Known URIs" registry in accordance with <xref target="RFC8615"/>:</t>
      <dl newline="false" spacing="normal">
        <dt>URI Suffix:</dt> <dd>probing.txt</dd>
        <dt>Change Controller:</dt> <dd>IETF</dd>
        <dt>Reference:</dt> <dd>RFC 9511</dd>
        <dt>Status:</dt> <dd>permanent</dd>
      </dl>
    </section>
  </middle>
  <back>

    <displayreference target="I-D.vyncke-v6ops-james" to="JAMES"/>
    
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>

<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9116.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8615.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4443.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.0792.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.0768.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9293.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8200.xml"/>

      </references>
      <references>
        <name>Informative References</name>

        <reference anchor="LARGE_SCALE" target="https://dl.acm.org/doi/pdf/10.1145/1071690.1064256">
          <front>
            <title>Efficient Algorithms for Large-Scale Topology Discovery</title>
            <author initials="B." surname="Donnet" fullname="Benoît Donnet">
              <organization>Université Pierre &amp; Marie Curie Laboratoire LiP6-CNRS</organization>
            </author>
            <author initials="P." surname="Raoult" fullname="Philippe Raoult">
              <organization>Université Pierre &amp; Marie Curie Laboratoire LiP6-CNRS</organization>
            </author>
            <author initials="T." surname="Friedman" fullname="Timur Friedman">
              <organization>Université Pierre &amp; Marie Curie Laboratoire LiP6-CNRS</organization>
            </author>
            <author initials="M." surname="Crovella" fullname="Mark Crovella">
              <organization>Boston University Computer Science Department</organization>
            </author>
            <date year="2005" month="June"/>
          </front>
          <seriesInfo name="DOI" value="10.1145/1071690.1064256"/>
        </reference>

        <reference anchor="IPV6_TOPOLOGY" target="http://www.cmand.org/papers/beholder-imc18.pdf">
          <front>
            <title>In the IP of the Beholder: Strategies for Active IPv6 Topology Discovery</title>
            <author initials="R." surname="Beverly" fullname="Robert Beverly">
              <organization>Naval Postgraduate School</organization>
            </author>
            <author initials="R." surname="Durairajan" fullname="Ramakrishnan Durairajan">
              <organization>University of Oregon</organization>
            </author>
            <author initials="D." surname="Plonka" fullname="David Plonka">
              <organization>Akamai Technologies</organization>
            </author>
            <author initials="J." surname="Rohrer" fullname="Justin P. Rohrer">
              <organization>Naval Postgraduate School</organization>
            </author>
            <date year="2018" month="October"/>
          </front>
          <seriesInfo name="DOI" value="10.1145/3278532.3278559"/>
        </reference>

        <reference anchor="IPV4_TOPOLOGY" target="http://www.cmand.org/papers/yarrp-imc16.pdf">
          <front>
            <title>Yarrp'ing the Internet: Randomized High-Speed Active Topology Discovery</title>
            <author initials="R." surname="Beverly" fullname="Robert Beverly">
              <organization>Naval Postgraduate School</organization>
            </author>
            <date year="2016" month="November"/>
          </front>
          <seriesInfo name="DOI" value="10.1145/2987443.2987479"/>
        </reference>

        <reference anchor="RIPE_ATLAS" target="https://atlas.ripe.net/">
          <front>
            <title>RIPE Atlas</title>
            <author>
              <organization>RIPE Network Coordination Centre (RIPE NCC)</organization>
            </author>
          </front>
        </reference>

        <reference anchor="NCSC" target="https://www.ncsc.gov.uk/">
          <front>
            <title>The National Cyber Security Centre</title>
            <author>
              <organization>UK NCSC</organization>
            </author>
          </front>
        </reference>

        <reference anchor="NCSC_SCAN_INFO" target="https://www.ncsc.gov.uk/information/ncsc-scanning-information">
          <front>
            <title>NCSC Scanning information</title>
            <author>
              <organization>UK NCSC</organization>
            </author>
          </front>
        </reference>

        <reference anchor="SCAPY" target="https://scapy.net/">
          <front>
            <title>Scapy</title>
            <author>
              <organization/>
            </author>
          </front>
        </reference>

        <reference anchor="ANRW_PAPER" target="https://pure.mpg.de/rest/items/item_3517635/component/file_3517636/content">
          <front>
            <title>Crisis, Ethics, Reliability &amp; a measurement.network - Reflections on Active Network Measurements in Academia</title>
            <author initials="T." surname="Fiebig" fullname="Tobias Fiebig">
              <organization>Max-Planck-Institut für Informatik</organization>
            </author>
            <date year="2023" month="July"/>
          </front>
          <seriesInfo name="DOI" value="10.1145/3606464.3606483"/>
        </reference>

<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7872.xml"/>

<reference anchor="I-D.vyncke-v6ops-james" target="https://datatracker.ietf.org/doc/html/draft-vyncke-v6ops-james-03">
<front>
<title>
Just Another Measurement of Extension header Survivability (JAMES)
</title>
<author initials="É." surname="Vyncke" fullname="Éric Vyncke">
</author>
<author initials="R." surname="Léas" fullname="Raphaël Léas">
<organization>Université de Liège</organization>
</author>
<author initials="J." surname="Iurman" fullname="Justin Iurman">
<organization>Université de Liège</organization>
</author>
<date month="January" day="9" year="2023"/>
</front>
<seriesInfo name="Internet-Draft" value="draft-vyncke-v6ops-james-03"/>
</reference>

<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4942.xml"/>

      </references>
    </references>
    <section anchor="examples">
      <name>Examples of In-Band Attribution</name>
      <t>Here are several examples generated by <xref target="SCAPY"/> and displayed in the 'tcpdump' format:</t>
      <t>IPv6 packet with the Probe Description URI inside a Destination Options extension header:</t>

      <artwork><![CDATA[
IP6 2001:db8:dead::1 > 2001:db8:beef::1: DSTOPT 60878 > traceroute:
Flags [S], seq 0, win 8192, length 0

0x0000:  6000 0000 0044 3c40 2001 0db8 dead 0000  `....D<@........
0x0010:  0000 0000 0000 0001 2001 0db8 beef 0000  ................
0x0020:  0000 0000 0000 0001 0605 012c 6874 7470  ...........,http
0x0030:  733a 2f2f 6578 616d 706c 652e 6e65 742f  s://example.net/
0x0040:  2e77 656c 6c2d 6b6e 6f77 6e2f 7072 6f62  .well-known/prob
0x0050:  696e 672e 7478 7400 edce 829a 0000 0000  ing.txt.........
0x0060:  0000 0000 5002 2000 2668 0000            ....P...&h..
]]></artwork>
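      <t>As a minimal sketch, independent of the tool used to generate the dump, the Destination Options header starting at offset 0x0028 above could be rebuilt as follows. The function name is hypothetical, and the way any remaining space is padded to a multiple of 8 octets is a choice made for this sketch.</t>
      <sourcecode type="python"><![CDATA[
```python
PAD1 = 0  # option type of Pad1 (a single octet)
PADN = 1  # option type of PadN


def destination_options_with_uri(uri, next_header=6):
    """Build a Destination Options header whose PadN option holds uri.

    next_header defaults to 6 (TCP), as in the dump above.
    """
    data = uri.encode("ascii") + b"\x00"  # null-terminated URI
    if len(data) > 255:
        raise ValueError("URI too long for a single option")
    options = bytes([PADN, len(data)]) + data
    # The whole header must be a multiple of 8 octets long.
    pad = -(2 + len(options)) % 8
    if pad == 1:
        options += bytes([PAD1])
    elif pad >= 2:
        options += bytes([PADN, pad - 2]) + bytes(pad - 2)
    hdr_ext_len = (2 + len(options)) // 8 - 1
    return bytes([next_header, hdr_ext_len]) + options
```
]]></sourcecode>
      <t>With "https://example.net/.well-known/probing.txt" as input, the result is 48 octets long and begins with 0605 012c, as in the dump.</t>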

      <t>IPv6 packet with the URI in the data payload of a TCP SYN:</t>
      <artwork><![CDATA[      
IP6 2001:db8:dead::1.15581 > 2001:db8:beef::1.traceroute:
Flags [S], seq 0:23, win 8192, length 23

0x0000:  6000 0000 002b 0640 2001 0db8 dead 0000  `....+.@........
0x0010:  0000 0000 0000 0001 2001 0db8 beef 0000  ................
0x0020:  0000 0000 0000 0001 3cdd 829a 0000 0000  ........<.......
0x0030:  0000 0000 5002 2000 c9b7 0000 6d61 696c  ....P.......mail
0x0040:  746f 3a6c 6162 4065 7861 6d70 6c65 2e6e  to:lab@example.n
0x0050:  6574 00                                  et.
]]></artwork>

      <t>IPv6 packet with another URI in the data part of an ICMPv6 echo request:</t>
      <artwork><![CDATA[      
IP6 2001:db8:dead::1 > 2001:db8:beef::1: ICMP6, echo request, id 0,
seq 0, length 28

0x0000:  6000 0000 001c 3a40 2001 0db8 dead 0000  `.....:@........
0x0010:  0000 0000 0000 0001 2001 0db8 beef 0000  ................
0x0020:  0000 0000 0000 0001 8000 2996 0000 0000  ..........).....
0x0030:  7465 6c3a 2b31 2d32 3031 2d35 3535 2d30  tel:+1-201-555-0
0x0040:  3132 3300                                123.
]]></artwork>

      <t>IPv4 packet with a URI in the data part of an ICMP echo request:</t>
            <artwork><![CDATA[
IP 192.0.2.1 > 198.51.10.1: ICMP echo request, id 0, seq 0, length 31

0x0000:  4500 0033 0001 0000 4001 8e93 c000 0201  E..3....@.......
0x0010:  c633 0a01 0800 ea74 0000 0000 6d61 696c  .3d....t....mail
0x0020:  746f 3a6c 6162 4065 7861 6d70 6c65 2e6e  to:lab@example.n
0x0030:  6574 00                                  et.
]]></artwork>
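      <t>For illustration, the ICMP message carried in the IPv4 dump above (everything after the 20-octet IPv4 header) could be rebuilt with the following minimal sketch. The function names are hypothetical; the checksum is the usual 16-bit one's complement sum used by ICMP <xref target="RFC0792"/>.</t>
      <sourcecode type="python"><![CDATA[
```python
import struct

ICMP_ECHO_REQUEST = 8


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def icmp_echo_request(uri, ident=0, seq=0):
    """Build an ICMP echo request whose data field is the URI."""
    payload = uri.encode("ascii") + b"\x00"  # null-terminated URI
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, ident, seq)
    return header + payload
```
]]></sourcecode>
      <t>With "mailto:lab@example.net" as input, the message is 31 octets long and its checksum is 0xea74, as in the dump.</t>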
    </section>
    <section anchor="acknowledgments" numbered="false">
      <name>Acknowledgments</name>
      <t>The authors would like to thank <contact fullname="Alain Fiocco"/>, <contact fullname="Fernando Gont"/>, <contact fullname="Ted Hardie"/>, <contact fullname="Mehdi Kouhen"/>, and <contact fullname="Mark Townsley"/> for helpful discussions as well as <contact fullname="Raphaël Léas"/> for an early implementation.</t>
      <t>The authors would also like to gratefully acknowledge useful reviews and comments received from <contact fullname="Warren Kumari"/>, <contact fullname="Jen Linkova"/>, <contact fullname="Mark Nottingham"/>, <contact fullname="Prapanch Ramamoorthy"/>, <contact fullname="Tirumaleswar Reddy.K"/>, <contact fullname="Andrew Shaw"/>, and <contact fullname="Magnus Westerlund"/>.</t>
    </section>
  </back>
</rfc>
