This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.
This Internet-Draft will expire on December 22, 2003.
Copyright (C) The Internet Society (2003). All Rights Reserved.
The Real Time Transport Protocol (RTP) provides unreliable transport of real time media from a sender to one or more recipients. RTP sessions are typically set up through signaling protocols such as the Session Initiation Protocol (SIP) or the Real Time Streaming Protocol (RTSP). When RTP is set up with these protocols, a potential Denial of Service (DoS) attack is introduced. This attack allows an attacker to cause a flood of RTP packets to be sent towards a target. We describe this attack, and also show how it is effectively prevented using Interactive Connectivity Establishment (ICE), first introduced as a means of handling Network Address Translator (NAT) traversal.
The Real-Time Transport Protocol (RTP) provides for carriage of real-time media, such as audio and video, from a sender to one or more receivers. An RTP session is defined as an association between a set of participants communicating with RTP. To take place, an RTP session needs a set of parameters to be shared among its participants. These parameters include the IP address and port where media is to be sent, the payload format to be used, the media codec to be used, and so on. This information is conveyed through signaling protocols, such as the Session Initiation Protocol (SIP) and the Real Time Streaming Protocol (RTSP). Both of these protocols use the Session Description Protocol (SDP) for describing the RTP session parameters.
When RTP sessions are set up using SIP or RTSP, the possibility of a denial-of-service (DoS) attack is introduced. This attack allows an attacker to direct a stream of RTP packets from a network server, used as the launching point of the attack, towards a target. Because RTP streams can require a lot of bandwidth (up to several hundred megabits per second for uncompressed high-quality video), the attack offers substantial amplification, which makes it attractive to attackers.
The principal assumption behind the attack is that there are servers on the public network which will, through protocols such as SIP and RTSP, create potentially large numbers of RTP sessions that generate media towards the client. In the case of SIP, interactive voice response systems (IVRs), telephony gateways, conferencing servers, and voicemail systems all fit into this category. Each of these devices will accept a large number of SIP calls. There is no requirement that the calls come from unauthenticated sources; the attack can be launched even if the device requires authentication. Of course, authentication might provide some amount of traceability. Interestingly, many of these systems provide authentication in the application itself. IVR servers and voicemail servers typically prompt the user for a PIN or passcode. In these cases, the media stream is already established by the time the prompt is played. Establishment of a media stream is sufficient to launch the attack, and so application-layer authentication provides no traceability.
In the case of RTSP, an RTSP server, by definition, will create large numbers of RTP sessions with clients that connect to it.
To launch an attack against a target, the attacker needs to have an IP address of the target that is reachable by the server. To some degree, this requirement can help prevent attacks against clients behind Network Address Translators (NAT). However, if the attacker has the public IP address of the NAT behind which a client sits (such as a residential NAT), the attack can be launched against the NAT itself. Such an attack may very well be just as effective as a direct attack on the client, as it may prevent the client from accessing any kind of network services.
With the IP address in hand, the attacker sets up a substantial number of RTP sessions using SIP or RTSP, depending on the server type. In the case of SIP, it would do so by sending an INVITE with an SDP body. The connection line in the SDP body would indicate the IP address of the target. In the case of RTSP, it would use the Transport header field of the request. RTSP does acknowledge the possibility of this attack, however. It says:
destination: The address to which a stream will be sent. The client may specify the multicast address with the destination parameter. To avoid becoming the unwitting perpetrator of a remote-controlled denial-of-service attack, a server SHOULD authenticate the client and SHOULD log such attempts before allowing the client to direct a media stream to an address not chosen by the server. This is particularly important if RTSP commands are issued via UDP, but implementations cannot rely on TCP as reliable means of client identification by itself. A server SHOULD not allow a client to direct media streams to an address that differs from the address commands are coming from.
Client authentication does not prevent the attack; it merely provides some form of traceability. Given the ease with which clients can typically obtain accounts with providers of streaming services, it is not clear that this traceability is sufficient to deter the attack. The last sentence of the above text would restrict attacks to ones where the client is attacking another device behind the same NAT, or where the client can spoof the source address of the target.
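To make the mechanics concrete, the following minimal sketch shows how the media destination is taken from the SDP connection line, independently of where the signaling message came from. The function name, the offer contents, and the addresses (drawn from the documentation ranges) are all assumptions made for illustration; none of them come from SIP or SDP implementations.

```python
from typing import Optional


def sdp_connection_address(sdp: str) -> Optional[str]:
    """Return the address in the first SDP connection line (c=), if any."""
    for line in sdp.splitlines():
        if line.startswith("c="):
            # Format: c=<nettype> <addrtype> <connection-address>
            parts = line[2:].split()
            if len(parts) == 3:
                # Drop any multicast TTL or address-count suffix ("/127").
                return parts[2].split("/")[0]
    return None


# Hypothetical offer: the signaling comes from 192.0.2.10, but the
# connection line names a different host, 198.51.100.7, as the place
# where media is to be sent.
OFFER = "\r\n".join([
    "v=0",
    "o=caller 2890844526 2890844526 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 198.51.100.7",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
]) + "\r\n"

signaling_source = "192.0.2.10"          # assumed source of the request
media_destination = sdp_connection_address(OFFER)
mismatch = media_destination != signaling_source
```

A server that streams to `media_destination` without further checks sends its media to a host that never took part in the signaling. A mismatch is not, by itself, proof of an attack, since legitimate clients behind NATs or with separate media hosts also produce one.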
If the server automatically accepts the signaling and sets up media sessions, the client can create a potentially large number of RTP sessions, all directing media towards a single target. Assuming even low-bandwidth media (say, 32 kbps), the client needs only a few signaling messages, each of about 1 kB, in order to create a sustained stream of 32 kbps. This provides excellent amplification properties, making such servers a ripe platform for launching attacks. Unfortunately, the attack is particularly easy to launch with SIP. SIP does not have any specific measures built in to verify that the IP address in the SDP corresponds to the source of the signaling messages. This is not an oversight; it is impossible to do in a peer-to-peer signaling system.
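The amplification can be estimated with simple arithmetic. The sketch below uses the message size and stream rate given above; the number of signaling messages and the session duration are assumed values chosen only for illustration.

```python
SIGNALING_MESSAGE_BYTES = 1_000   # "about 1 kB" per message, from the text
STREAM_RATE_BPS = 32_000          # 32 kbps media stream, from the text


def amplification(messages: int, duration_s: float) -> float:
    """Ratio of bits delivered to the target to bits sent by the attacker."""
    attacker_bits = messages * SIGNALING_MESSAGE_BYTES * 8
    target_bits = STREAM_RATE_BPS * duration_s
    return target_bits / attacker_bits


# Assumed values: 3 signaling messages per session, media flowing for 60 s.
factor = amplification(messages=3, duration_s=60)
```

Under these assumed values the target receives 80 times as many bits as the attacker sent, and the ratio grows linearly with the time the session stays up.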
In this section, we discuss several approaches that we considered for preventing the attack.
Whenever one is concerned about security issues with RTP, the first place to turn to is the Secure Real Time Transport Protocol (SRTP). Unfortunately, SRTP does not help at all in this instance. Neither integrity checks nor encryption prevents this attack. The nature of the attack is the volume of packets it creates towards an unwitting target; whether those packets are signed by the server, or are encrypted with the wrong key, is not relevant for the attack.
Another approach is RTCP. We could modify the algorithm used by the server so that it sends some small number of RTP packets, and then waits for RTCP packets in return before sending further RTP packets. This way, the target would not be flooded with RTP packets unless the attacker could send RTCP packets back to the server, and furthermore, could construct those packets properly. For example, the highest sequence number received field of the receiver report would have to be within the range of sequence numbers sent by the server. To create such a packet, the attacker would need to be capable of eavesdropping on the RTP packets sent from the server to the target. SRTP would prevent such observation, and even if SRTP is not used, the need to eavesdrop makes the attack hard to launch.
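The RTCP-based check just described can be sketched as follows. This is an illustration under stated assumptions, not a proposed algorithm: the class and function names and the probe budget are hypothetical, and only the low 16 bits of the reported sequence number are compared.

```python
SEQ_MODULUS = 1 << 16  # RTP sequence numbers are 16 bits and wrap around


def report_is_plausible(highest_seq_received: int,
                        first_seq_sent: int,
                        last_seq_sent: int) -> bool:
    """True if the reported sequence number lies in the range actually sent."""
    span = (last_seq_sent - first_seq_sent) % SEQ_MODULUS
    offset = (highest_seq_received - first_seq_sent) % SEQ_MODULUS
    return offset <= span


class ProbeGate:
    """Allow a small burst of RTP, then wait for a plausible receiver report."""

    def __init__(self, first_seq: int, probe_budget: int) -> None:
        self.first_seq = first_seq % SEQ_MODULUS
        self.next_seq = self.first_seq
        self.sent = 0
        self.probe_budget = probe_budget   # hypothetical tuning parameter
        self.confirmed = False

    def may_send(self) -> bool:
        return self.confirmed or self.sent < self.probe_budget

    def record_sent(self) -> None:
        self.sent += 1
        self.next_seq = (self.next_seq + 1) % SEQ_MODULUS

    def on_receiver_report(self, highest_seq_received: int) -> None:
        if self.sent == 0:
            return  # nothing sent yet, so no report can be valid
        last_seq = (self.next_seq - 1) % SEQ_MODULUS
        if report_is_plausible(highest_seq_received, self.first_seq, last_seq):
            self.confirmed = True


# Example: three probe packets whose sequence numbers wrap past 65535.
gate = ProbeGate(first_seq=65534, probe_budget=3)
while gate.may_send():
    gate.record_sent()
```

After the loop the gate is closed, and it stays closed until `on_receiver_report` is called with a sequence number among those sent. An attacker who cannot see the probe packets has to guess that value.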
The main issue with using RTCP is that this substantially alters the overall RTCP behavior. RTCP then serves two purposes: one is to report back on reception quality, and the other is to serve as a check for connectivity to a willing recipient before sending data. By using the same protocol for both, it becomes less effective for either. RTCP statistics become tainted because extra RTP and RTCP packets are used just for this connectivity check. The RTCP interval algorithm needs to be modified to suit both needs. Reliability is needed for the connectivity check, but it is not desirable for reporting statistics.
As such, we concluded that something else was needed.
Interactive Connectivity Establishment (ICE) is a technique for NAT traversal that makes use of peer-to-peer Simple Traversal of UDP through NAT (STUN). The basic idea is simple. A caller obtains as many IP address and port combinations as it can, as potential alternatives for receiving media. These include addresses learned by sending STUN to a server, TURN addresses, local interfaces, VPN interfaces, and addresses learned through MIDCOM. It sends all of these to the called party in an INVITE. Some of them will work, and some of them won't, depending on the network topology between the caller and called party. To determine which one to use, the called party tests all of them by sending a STUN request to the caller. When the caller receives the STUN request, it sends a response back to the called party. If the called party gets a response, it knows that it can reach the caller, and furthermore, that the caller can now reach it. The called party proceeds to send media once connectivity has been verified with the STUN request and response.
The ICE technique has a side effect of preventing the denial-of-service attack. Consider a server that supports ICE, and which refuses to set up a call to a client unless connectivity is verified with ICE. Suppose the caller is actually an attacker. It places a single IP address into the SDP of the INVITE, pointing towards the target. The server does not send RTP yet. Instead, it sends a STUN request to the target. Since the target is not an active participant, there is no STUN response. Thus, as far as the server is concerned, there is no connectivity to the recipient, and no RTP packets are sent. The attack is prevented.
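The server behavior in this scenario can be sketched as a small state machine that withholds RTP until the connectivity check succeeds. The class, state, and method names are hypothetical and are not taken from the ICE specification; the address is a documentation address.

```python
from enum import Enum


class Connectivity(Enum):
    UNVERIFIED = "unverified"   # STUN request sent, no answer yet
    VERIFIED = "verified"       # STUN response received
    FAILED = "failed"           # STUN transaction timed out


class MediaSession:
    """A media session that sends RTP only after connectivity is verified."""

    def __init__(self, destination: str) -> None:
        self.destination = destination
        self.state = Connectivity.UNVERIFIED
        self.rtp_packets_sent = 0

    def on_stun_response(self) -> None:
        if self.state is Connectivity.UNVERIFIED:
            self.state = Connectivity.VERIFIED

    def on_stun_timeout(self) -> None:
        if self.state is Connectivity.UNVERIFIED:
            self.state = Connectivity.FAILED

    def send_rtp(self) -> bool:
        """Send one RTP packet if allowed; return whether it was sent."""
        if self.state is not Connectivity.VERIFIED:
            return False
        self.rtp_packets_sent += 1
        return True


# Attack case: the target never answers the STUN request.
attack = MediaSession("198.51.100.7")
attack.on_stun_timeout()
attack_sent = attack.send_rtp()

# Legitimate case: the recipient answers, and media can flow.
legit = MediaSession("198.51.100.7")
legit.on_stun_response()
legit_sent = legit.send_rtp()
```

In the attack case the target receives only the STUN request, never an RTP packet.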
In legitimate cases, the recipient of the STUN request generates a response, and then media can flow. The STUN messages are reliable and operate relatively quickly. Since they are not RTP or RTCP packets, they don't interfere with normal RTCP operation and have no impact on RTP statistics.
An attacker could attempt to generate a fake STUN response towards the server, in an attempt to fool it into thinking there was connectivity. However, that is not easily done. The attacker would need to observe the STUN request, in order to obtain the transaction ID necessary to compute the response. The attacker would also need to time the STUN response appropriately, so that it was sent after the server sends the STUN request, but before the transaction times out. This is not impossible, but at least raises the bar further.
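The two hurdles facing a forged response, the transaction ID and the timing window, can be sketched as follows. The class name and the timeout value are assumptions for illustration. The 128-bit transaction ID length follows the STUN specification, but real STUN message encoding and retransmission are omitted.

```python
import hmac
import os
import time
from typing import Optional


class StunCheck:
    """Track one outstanding STUN request and validate a candidate response."""

    def __init__(self, timeout_s: float, now: Optional[float] = None) -> None:
        self.transaction_id = os.urandom(16)   # 128 random bits
        self.timeout_s = timeout_s             # hypothetical timeout value
        self.sent_at = time.monotonic() if now is None else now

    def accepts(self, response_transaction_id: bytes,
                now: Optional[float] = None) -> bool:
        """Accept only a response with the right ID that arrives in time."""
        now = time.monotonic() if now is None else now
        in_time = 0 <= now - self.sent_at <= self.timeout_s
        id_matches = hmac.compare_digest(response_transaction_id,
                                         self.transaction_id)
        return in_time and id_matches


# Times are passed explicitly here so the example is deterministic.
check = StunCheck(timeout_s=5.0, now=100.0)
genuine = check.accepts(check.transaction_id, now=100.2)
guessed = check.accepts(os.urandom(16), now=100.2)
too_late = check.accepts(check.transaction_id, now=106.0)
```

Here `genuine` is accepted and `too_late` is rejected. A blind guess such as `guessed` matches only with probability 2^-128, which is why the attacker must observe the request in order to forge a response.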
The end result is that, with ICE, this attack cannot be launched unless the attacker (1) can observe packets sent from the server to the target, and (2) can time the STUN response appropriately. This is a much more difficult set of criteria to meet than without ICE. Indeed, an attacker that can meet these criteria can also launch attacks against servers that generate TCP traffic, such as web content. As such, ICE makes the attack as tractable (or intractable) as similar attacks on any other client-server application.
The ICE approach works for media carried with RTP and with other protocols as well. It is not restricted to RTP. It also does not interfere with the normal operation of RTCP and RTCP-based statistics. It works for any protocol that follows a basic offer/answer exchange. This includes SIP and RTSP, but also includes protocols such as MGCP, Megaco and H.323, all of which are susceptible to this attack.
This document is entirely about security considerations.
Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 1889, January 1996.
Jacobson, V., Casner, S., Frederick, R. and H. Schulzrinne, "RTP: A Transport Protocol for Real-Time Applications", draft-ietf-avt-rtp-new-12 (work in progress), March 2003.
Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.
Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.
Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, April 1998.
Handley, M. and V. Jacobson, "SDP: Session Description Protocol", RFC 2327, April 1998.
Perkins, C. and L. Gharai, "RTP Payload Format for Uncompressed Video", draft-ietf-avt-uncomp-video-02 (work in progress), March 2003.
Baugher, M., "The Secure Real-time Transport Protocol", draft-ietf-avt-srtp-08 (work in progress), June 2003.
Rosenberg, J., "Interactive Connectivity Establishment (ICE): A Methodology for Network Address Translator (NAT) Traversal for the Session Initiation Protocol (SIP)", draft-rosenberg-sipping-ice-00 (work in progress), February 2003.
Rosenberg, J., Weinberger, J., Huitema, C. and R. Mahy, "STUN - Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs)", RFC 3489, March 2003.
Rosenberg, J., "Traversal Using Relay NAT (TURN)", draft-rosenberg-midcom-turn-01 (work in progress), March 2003.
Srisuresh, P., Kuthan, J., Rosenberg, J., Molitor, A. and A. Rayhan, "Middlebox communication architecture and framework", RFC 3303, August 2002.
Andreasen, F. and B. Foster, "Media Gateway Control Protocol (MGCP) Version 1.0", RFC 3435, January 2003.
Groves, C., Pantaleo, M., Anderson, T. and T. Taylor, "Gateway Control Protocol Version 1", RFC 3525, June 2003.
600 Lanidex Plaza
Parsippany, NJ 07054
Phone: +1 973 952-5000
The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.
Copyright (C) The Internet Society (2003). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assignees.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Funding for the RFC Editor function is currently provided by the Internet Society.