RFC 9387 | DOTS Telemetry Use Cases | April 2023 |
Hayashi, et al. | Informational | [Page] |
DDoS Open Threat Signaling (DOTS) telemetry enriches the base DOTS protocols to assist the mitigator in using efficient DDoS attack mitigation techniques in a network. This document presents sample use cases for DOTS telemetry. It discusses what components are deployed in the network, how they cooperate, and what information is exchanged to effectively use these techniques.¶
This document is not an Internet Standards Track specification; it is published for informational purposes.¶
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are candidates for any level of Internet Standard; see Section 2 of RFC 7841.¶
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9387.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Distributed Denial-of-Service (DDoS) attacks, such as volumetric attacks and resource-consuming attacks, are critical threats to be handled by service providers. When such DDoS attacks occur, service providers have to mitigate them immediately to protect or recover their services.¶
For service providers to immediately protect their network services from DDoS attacks, DDoS mitigation needs to be highly automated. To that aim, multivendor components involved in DDoS attack detection and mitigation should cooperate and support standard interfaces.¶
DDoS Open Threat Signaling (DOTS) is a set of protocols for real-time signaling, threat-handling requests, and data filtering between the multivendor elements [RFC9132] [RFC8783]. DOTS telemetry enriches the DOTS protocols with various telemetry attributes allowing optimal DDoS attack mitigation [RFC9244]. This document presents sample use cases for DOTS telemetry to enhance the overview and the purpose described in [RFC9244]. This document also presents what components are deployed in the network, how they cooperate, and what information is exchanged to effectively use attack-mitigation techniques.¶
Readers should be familiar with the terms defined in [RFC8612], [RFC8903], and [RFC9244].¶
In addition, this document uses the following terms:¶
This section describes DOTS telemetry use cases that use telemetry attributes included in the DOTS telemetry specification [RFC9244].¶
The following subsections assume that once the DOTS signal channel is established, DOTS clients will proceed with the telemetry setup configuration detailed in Section 7 of [RFC9244]. The following telemetry parameters are used:¶
Some transit providers have to mitigate large-scale DDoS attacks using DDoS Mitigation Systems (DMSes) with limited resources that are already deployed in their network. For example, recently reported large DDoS attacks exceeded several Tbps [DOTS_Overview].¶
This use case enables transit providers to use their DMS efficiently under volume-based DDoS attacks whose volume is more than the available capacity of the DMS. To enable this, the attack traffic of top talkers is redirected to the DMS preferentially by cooperation among forwarding nodes, flow collectors, and orchestrators.¶
Figure 1 gives an overview of this use case. Figure 2 provides an example of a DOTS telemetry message body that is used to signal top talkers (2001:db8:1::/48 and 2001:db8:2::/48).¶
The forwarding nodes send traffic statistics to the flow collectors, e.g., using IP Flow Information Export (IPFIX) [RFC7011]. When DDoS attacks occur, the flow collectors identify the attack traffic and send information about the top talkers to the orchestrator using the "target-prefix" and "top-talkers" DOTS telemetry attributes. The orchestrator then checks the available capacity of the DMSes using a network management protocol, such as the Simple Network Management Protocol (SNMP) [RFC3413] or YANG with the Network Configuration Protocol (YANG/NETCONF) [RFC7950]. After that, the orchestrator orders the forwarding nodes to redirect as much of the top talker's traffic to the DMSes as they can handle by dissemination of Flow Specifications using tools such as Border Gateway Protocol Dissemination of Flow Specification Rules (BGP Flowspec) [RFC8955].¶
The flow collector implements a DOTS client while the orchestrator implements a DOTS server.¶
Transit providers can deploy their DMSes in clusters. Then, they can select the DMS to be used to mitigate a DDoS attack at the time of an attack.¶
This use case enables transit providers to select a DMS with sufficient capacity for mitigation based on the volume of the attack traffic and the capacity of the DMS. Figure 3 gives an overview of this use case. Figure 4 provides an example of a DOTS telemetry message body that is used to signal percentiles for total attack traffic.¶
The forwarding nodes send traffic statistics to the flow collectors, e.g., using IPFIX. When DDoS attacks occur, the flow collectors identify the attack traffic and send information about the attack traffic volume to the orchestrator using the "target-prefix" and "total-attack-traffic" DOTS telemetry attributes. The orchestrator then checks the available capacity of the DMSes using a network management protocol, such as the Simple Network Management Protocol (SNMP) [RFC3413] or YANG with the Network Configuration Protocol (YANG/NETCONF) [RFC7950]. After that, the orchestrator selects a DMS with sufficient capacity to which attack traffic should be redirected. For example, a simple DMS selection algorithm can be used to choose a DMS whose available capacity is greater than the "peak-g" telemetry attribute indicated in the DOTS telemetry message. The orchestrator orders the appropriate forwarding nodes to redirect the attack traffic to the DMS relying upon routing policies, such as BGP [RFC4271].¶
The detailed DMS selection algorithm is out of the scope of this document.¶
The flow collector implements a DOTS client while the orchestrator implements a DOTS server.¶
A transit provider network has multiple paths to convey attack traffic to a DMS. In such a network, the attack traffic can be conveyed while avoiding congested links by adequately selecting an available path.¶
This use case enables transit providers to select a path with sufficient bandwidth for redirecting attack traffic to a DMS according to the bandwidth of the attack traffic and total traffic. Figure 5 gives an overview of this use case. Figure 6 provides an example of a DOTS telemetry message body that is used to signal percentiles for total traffic and total attack traffic.¶
The forwarding nodes send traffic statistics to the flow collectors, e.g., using IPFIX. When DDoS attacks occur, the flow collectors identify attack traffic and send information about the attack traffic volume to the orchestrator using the "target-prefix" and "total-attack-traffic" DOTS telemetry attributes. The underlying forwarding nodes send the volume of the total traffic passing the node to the orchestrator using the "total-traffic" telemetry attributes. The orchestrator then selects a path with sufficient bandwidth to which the flow of attack traffic should be redirected. For example, a simple selection algorithm can be used to choose a path whose available capacity is greater than the "peak-g" telemetry attribute that was indicated in a DOTS telemetry message. After that, the orchestrator orders the appropriate forwarding nodes to redirect the attack traffic to the DMS by dissemination of Flow Specifications using tools such as BGP Flowspec [RFC8955].¶
The detailed path selection algorithm is out of the scope of this document.¶
The flow collector and forwarding nodes implement a DOTS client while the orchestrator implements a DOTS server.¶
Short but extreme volumetric attacks, such as pulse wave DDoS attacks, are threats to Internet transit provider networks. These attacks start from zero and go to maximum values in a very short time span. The attacks go back to zero and then back to maximum values, repeating in continuous cycles at short intervals. It is difficult for transit providers to mitigate such an attack with their DMSes by redirecting attack flows because this may cause route flapping in the network. The practical way to mitigate short but extreme volumetric attacks is to offload mitigation actions to a forwarding node.¶
This use case enables transit providers to mitigate short but extreme volumetric attacks. Furthermore, the aim is to estimate the network-access success rate based on the bandwidth of the attack traffic. Figure 7 gives an overview of this use case. Figure 8 provides an example of a DOTS telemetry message body that is used to signal total pipe capacity. Figure 9 provides an example of a DOTS telemetry message body that is used to signal various percentiles for total traffic and total attack traffic.¶
When DDoS attacks occur, the network management system receives alerts. Then, it sends the target IP address(es) and volume of the DDoS attack traffic to the administrative system using the "target-prefix" and "total-attack-traffic" DOTS telemetry attributes. After that, the administrative system orders relevant forwarding nodes to carry out rate-limiting of all traffic destined to the target based on the pipe capability by the dissemination of the Flow Specifications using tools such as BGP Flowspec [RFC8955]. In addition, the administrative system estimates the network-access success rate of the target, which is calculated by (total-pipe-capability / (total-pipe-capability + total-attack-traffic)).¶
Note that total pipe capability information can be gathered by telemetry setup in advance (Section 7.2 of [RFC9244]).¶
The network management system implements a DOTS client while the administrative system implements a DOTS server.¶
Some volumetric attacks, such as DNS amplification attacks, can be detected with high accuracy by checking the Layer 3 or Layer 4 information of attack packets. These attacks can be detected and mitigated through cooperation among forwarding nodes and flow collectors using IPFIX. It may also be necessary to inspect the Layer 7 information of suspicious packets to detect attacks such as DNS water torture attacks [DNS_Water_Torture_Attack]. To carry out the DNS water torture attack, an attacker commands a botnet to make thousands of DNS requests for fake subdomains against an authoritative name server. Such attack traffic should be detected and mitigated at the DMS.¶
This use case enables transit providers to select a mitigation technique based on the type of attack traffic, whether it is an amplification attack or not. To use such a technique, the attack traffic is blocked by forwarding nodes or redirected to a DMS based on the attack type through cooperation among forwarding nodes, flow collectors, and an orchestrator.¶
Figure 10 gives an overview of this use case. Figure 11 provides an example of attack mappings that are shared using the DOTS data channel in advance. Figure 12 provides an example of a DOTS telemetry message body that is used to signal percentiles for total attack traffic, total attack traffic protocol, and total attack connection; it also shows attack details.¶
The example in Figure 11 uses the folding defined in [RFC8792] for long lines.¶
Attack mappings are shared using the DOTS data channel in advance (Section 8.1.6 of [RFC9244]). The forwarding nodes send traffic statistics to the flow collectors, e.g., using IPFIX. When DDoS attacks occur, the flow collectors identify attack traffic and send attack type information to the orchestrator using the "vendor-id" and "attack-id" telemetry attributes. The orchestrator then resolves abused port numbers and orders relevant forwarding nodes to block the amplification attack traffic flow by dissemination of Flow Specifications using tools such as BGP Flowspec [RFC8955]. Also, the orchestrator orders relevant forwarding nodes to redirect traffic other than the amplification attack traffic using a routing protocol, such as BGP [RFC4271].¶
The flow collector implements a DOTS client while the orchestrator implements a DOTS server.¶
It is possible for the transit provider to add value to the DDoS mitigation service by reporting ongoing and detailed DDoS countermeasure status to the enterprise network. In addition, it is possible for the transit provider to know whether the DDoS countermeasure is effective or not by receiving reports from the enterprise network.¶
This use case enables the mutual sharing of information about ongoing DDoS countermeasures between the transit provider and the enterprise network. Figure 13 gives an overview of this use case. Figure 14 provides an example of a DOTS telemetry message body that is used to signal total pipe capacity from the enterprise network administrator to the orchestrator in the ISP. Figure 15 provides an example of a DOTS telemetry message body that is used to signal percentiles for total traffic and total attack traffic as well as attack details from the orchestrator to the network.¶
The network management system in the enterprise network reports limits of incoming traffic volume from the transit provider to the orchestrator in the transit provider in advance. It is reported using the "total-pipe-capacity" telemetry attribute in the DOTS telemetry setup.¶
When DDoS attacks occur, DDoS mitigation orchestration [RFC8903] is carried out in the transit provider. Then, the DDoS mitigation systems report the status of DDoS countermeasures to the orchestrator by sending "attack-detail" telemetry attributes. After that, the orchestrator integrates the reports from the DDoS mitigation systems, while removing duplicate contents, and sends the integrated report to a network administrator using DOTS telemetry periodically.¶
During the DDoS mitigation, the orchestrator in the transit provider retrieves the link congestion status from the network manager in the enterprise network using the "total-traffic" telemetry attributes. Then, the orchestrator checks whether or not the DDoS countermeasures are effective by comparing the "total-traffic" and the "total-pipe-capacity" telemetry attributes.¶
The DMS implements a DOTS server while the orchestrator behaves as a DOTS client and a server in the transit provider. In addition, the network administrator implements a DOTS client.¶
DDoS detection based on tools, such as IPFIX, is a lighter-weight method of detecting DDoS attacks compared to DMSes in Internet transit provider networks. DDoS detection based on the DMSes is a more accurate method for detecting attack traffic than flow monitoring.¶
The aim of this use case is to increase flow collectors' detection accuracy by carrying out supervised machine-learning techniques according to attack detail reported by the DMSes. To use such a technique, forwarding nodes, flow collectors, and a DMS should cooperate. Figure 16 gives an overview of this use case. Figure 17 provides an example of a DOTS telemetry message body that is used to signal attack detail.¶
The forwarding nodes send traffic statistics to the flow collectors, e.g., using IPFIX. When DDoS attacks occur, DDoS mitigation orchestration is carried out (as per Section 3.3 of [RFC8903]), and the DMS mitigates all attack traffic destined for a target. The DDoS mitigation system reports the "vendor-id", "attack-id", and "top-talker" telemetry attributes to a flow collector.¶
After mitigating a DDoS attack, the flow collector attaches outputs of the DMS as labels to the statistics of the traffic flow of top talkers. The outputs, for example, are the "attack-id" telemetry attributes. The flow collector then carries out supervised machine learning to increase its detection accuracy, setting the statistics as an explanatory variable and setting the labels as an objective variable.¶
The DMS implements a DOTS client while the flow collector implements a DOTS server.¶
DMSes can detect DDoS attack traffic, which means DMSes can also identify clean traffic. This use case supports unsupervised machine learning for anomaly detection according to a baseline reported by the DMSes. To use such a technique, forwarding nodes, flow collectors, and a DMS should cooperate. Figure 18 gives an overview of this use case. Figure 19 provides an example of a DOTS telemetry message body that is used to signal baseline.¶
The forwarding nodes carry out traffic mirroring to copy the traffic destined to an IP address and to monitor the traffic by a DMS. The DMS then identifies clean traffic and reports the baseline telemetry attributes to the flow collector using DOTS telemetry.¶
The flow collector then carries out unsupervised machine learning to be able to carry out anomaly detection.¶
The DMS implements a DOTS client while the flow collector implements a DOTS server.¶
Security considerations for DOTS telemetry are discussed in Section 14 of [RFC9244]. These considerations apply to the communication interfaces where DOTS is used.¶
Some use cases involve controllers, orchestrators, and programmable interfaces. These interfaces can be misused by misbehaving nodes to further exacerbate DDoS attacks. The considerations are for end-to-end systems for DoS mitigation, so the mechanics are outside the scope of DOTS protocols. Section 5 of [RFC7149] discusses some generic security considerations to take into account in such contexts (e.g., reliable access control). Specific security measures depend on the actual mechanism used to control underlying forwarding nodes and other controlled elements. For example, Section 12 of [RFC8955] discusses security considerations that are relevant to BGP Flowspec. IPFIX-specific considerations are discussed in Section 11 of [RFC7011].¶
This document has no IANA actions.¶
The authors would like to thank Mohamed Boucadair and Valery Smyslov for their valuable feedback.¶
Thanks to Paul Wouters for the detailed AD review.¶
Many thanks to Donald Eastlake 3rd, Phillip Hallam-Baker, Sean Turner, and Peter Yee for their reviews.¶
Thanks to Lars Eggert, Murray Kucherawy, Roman Danyliw, Robert Wilton, and Éric Vyncke for the IESG review.¶