<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc='yes'?>
<?rfc tocdepth='5'?>

<?rfc compact='yes'?>
<?rfc subcompact='no'?>

<rfc ipr="full3978">


    <front>
        <title abbrev="ICE">
Interactive Connectivity Establishment (ICE):
A Methodology for Network Address Translator (NAT) Traversal for
Offer/Answer Protocols</title>
    
        <author initials="J.R." surname="Rosenberg"
                fullname="Jonathan Rosenberg">
            <organization>Cisco Systems</organization>
    
            <address>
                <postal>
                    <street>600 Lanidex Plaza</street>
                    <city>Parsippany</city> <region>NJ</region>
                    <code>07054</code>
                    <country>US</country>
                </postal>
    
                <phone>+1 973 952-5000</phone>
                <email>jdrosen@cisco.com</email>
                <uri>http://www.jdrosen.net</uri>
            </address>
        </author>
    
        <date month="July" year="2005" />
    
        <area>Transport</area>
        <workgroup>MMUSIC</workgroup>
        <keyword>SIP</keyword>
        <keyword>NAT</keyword>
        <abstract>
            <t>This document describes a methodology for Network
            Address Translator (NAT) traversal for multimedia session
            signaling protocols, such as the Session Initiation
            Protocol (SIP). This methodology is called Interactive
            Connectivity Establishment (ICE). ICE makes use of
            existing protocols, such as Simple Traversal of UDP
            Through NAT (STUN) and Traversal Using Relay NAT
            (TURN). ICE makes use of STUN in peer-to-peer cooperative
            fashion, allowing participants to discover, create and
            verify mutual connectivity.</t>
        </abstract>
    </front>

<middle>

<!-- add sender selection algorithm:

A client MUST send from the transport address on which it most recently received an RTP packet. If there is no such address (as no RTP packets are being received yet), it SHOULD send from the transport address of highest local priority on which a STUN connectivity check has been received. If there is no such address (as no connectivity checks have been received yet), it SHOULD send from the local transport address of highest priority. 

-->

<!-- update SDP/SIP mapping to mention that sending the answer in a 18x 
is a good idea, in order to reduce post-pickup delay -->

<!-- flemming: discuss interactions with preconditions -->

<!-- flemming: discuss interactions with middleboxes -->

<!-- attack: if I can snoop the signaling when it's not encrypted, I can co-opt
someone's media stream. fix this by requiring that learned addresses
have lowest priority, or that they need to be signaled in a re-invite
as they used to have to be -->

<!-- make sure domain names can be used inside of the ICE candidates; that is
a potential solution -->

<!-- STUN as keepalive instead of no-op, recommend no-op if candidate is not there -->


<section title="Introduction">

<t>
A multimedia session signaling protocol is a protocol that exchanges
control messages between a pair of agents for the purposes of
establishing the flow of media traffic between them. This media flow
is distinct from the flow of control messages, and may take a
different path through the network. Examples of such protocols are the
<xref target="RFC3261">Session Initiation Protocol (SIP)</xref>, the
<xref target="RFC2326">Real Time Streaming Protocol (RTSP)</xref> and
the International Telecommunication Union (ITU) H.323.
</t>

<t>
These protocols, by nature of their design, are difficult to operate
through Network Address Translators (NAT). Because their purpose in
life is to establish a flow of packets, they tend to carry IP
addresses within their messages, which is known to be problematic
through NAT <xref target="RFC3235"/>. The protocols also seek to
create a media flow directly between participants, so that there is no
application layer intermediary between them. This is done to reduce
media latency, decrease packet loss, and reduce the operational costs
of deploying the application. However, this is difficult to accomplish
through NAT. A full treatment of the reasons for this is
beyond the scope of this specification.
</t>

<t> Numerous solutions have been proposed for allowing these protocols
to operate through NAT. These include Application Layer Gateways
(ALGs), the <xref target="RFC3303"> Middlebox Control Protocol</xref>,
<xref target="RFC3489">Simple Traversal of UDP through NAT
(STUN)</xref>, <xref target="I-D.rosenberg-midcom-turn">Traversal
Using Relay NAT</xref>, and <xref target="RFC3102">Realm Specific
IP</xref> <xref target="RFC3103"/> along with session description
extensions needed to make them work, such as the Session Description
Protocol (SDP) <xref target="RFC2327"/> attribute for the Real Time
Control Protocol (RTCP) <xref target="RFC3605"/>. Unfortunately, these
techniques all have pros and cons which make each one optimal in some
network topologies, but a poor choice in others. The result is that
administrators and implementors are making assumptions about the
topologies of the networks in which their solutions will be
deployed. This introduces complexity and brittleness into the
system. What is needed is a single solution which is flexible enough
to work well in all situations.  </t>

<t>
This specification provides that solution for protocols based on the
offer-answer model, RFC 3264 <xref target="RFC3264"/>. It is called Interactive
Connectivity Establishment, or ICE. ICE makes use of STUN and TURN,
but uses them in a specific methodology which avoids 
many of the pitfalls of using any one alone. 
</t>

</section>

<section anchor="sec:terms" title="Terminology">

<t>
Several new terms are introduced in this specification:
</t>

<list style="hanging">

<t hangText="Peer:">From the perspective of one of the agents in a
session, its peer is the other agent. Specifically, from the
perspective of the offeror, the peer is 
the answerer. From the perspective of the answerer, the peer is the
offeror. </t>

<t hangText="Transport Address:"> The combination of an IP address and port.
</t>

<t hangText="Local Transport Address:"> A local transport address is a
transport address that has been allocated from the operating system on
the host. This includes transport addresses obtained through Virtual
Private Networks (VPNs) and transport addresses obtained through Realm
Specific IP (RSIP) <xref target="RFC3102"/> (which lives at the
operating system level). Transport addresses are typically obtained by
binding to an interface.
</t>

<t hangText="m/c line:">The media and connection lines in the SDP,
which together hold the transport address used for the receipt of
media.
</t>

<t hangText="Derived Transport Address:"> A derived transport address
is a transport address which is derived from a local transport
address. The derived transport address is related to the
associated local transport address in that packets sent to the derived
transport address are received on the socket bound to its associated
local transport address. Derived addresses are obtained using protocols like
STUN and TURN, and more generally, any <xref target="RFC3424">UNSAF
protocol</xref>.
</t>

<t hangText="Candidate Transport Address:"> A transport address
advertised by an agent in an offer or answer. A candidate transport
address can either be a local transport address or a derived transport
address.  </t>


<t hangText="Peer Derived Transport Address:"> A peer derived
transport address is a derived transport address learned from a STUN
server running within a peer in a media session.
</t>

<t hangText="TURN Derived Transport Address:"> A derived transport
address obtained from a TURN server. 
</t>

<t hangText="STUN Derived Transport Address:"> A derived transport
address obtained from a STUN server whose address has been provisioned
into the UA. This, by definition, excludes Peer Derived Transport
Addresses.
</t>

<t hangText="Candidate:">A sequence of candidate transport addresses
that form an atomic set for usage with a particular media stream. In
the case of RTP, there are two candidate transport addresses per
candidate: one for RTP, and another for RTCP. Connectivity is verified
to all of the candidate transport addresses within a candidate before
that candidate is used. The transport addresses that compose a
candidate are all of the same type - local, STUN derived, TURN derived
or peer derived. </t>

<t hangText="Local Candidate:">A candidate whose transport addresses
are local transport addresses.
</t>

<t hangText="STUN Candidate:">A candidate whose transport addresses
are STUN derived transport addresses.
</t>

<t hangText="TURN Candidate:">A candidate whose transport addresses
are TURN derived transport addresses.
</t>

<t hangText="Peer Candidate:">A candidate whose transport addresses
are peer derived transport addresses.
</t>

<t hangText="Active Candidate:">The candidate that is
in use for exchange of media. This is the one that an agent places in
the m/c line of an offer or answer. 
</t>

</list>
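
<t>The relationship between transport addresses and candidates can be
sketched as a small data model. This is an illustration only, not part
of the protocol: the names AddressType, TransportAddress and Candidate
are hypothetical, and the only rule it enforces is the one stated
above, that all transport addresses in a candidate are of the same
type.</t>

<figure><artwork><![CDATA[
```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class AddressType(Enum):
    """The four kinds of transport address named in the terminology."""
    LOCAL = "local"
    STUN = "stun-derived"
    TURN = "turn-derived"
    PEER = "peer-derived"


@dataclass(frozen=True)
class TransportAddress:
    """An IP address and port, tagged with how it was obtained."""
    ip: str
    port: int
    kind: AddressType


@dataclass(frozen=True)
class Candidate:
    """An atomic set of transport addresses for one media stream.

    For RTP this is (rtp, rtcp), or a single entry if RTCP is unused.
    """
    addresses: Tuple[TransportAddress, ...]

    def __post_init__(self) -> None:
        if not self.addresses:
            raise ValueError("a candidate needs a transport address")
        kinds = {a.kind for a in self.addresses}
        if len(kinds) != 1:
            raise ValueError("addresses in a candidate must share a type")

    @property
    def kind(self) -> AddressType:
        return self.addresses[0].kind
```
]]></artwork></figure>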

</section>

<section title="Overview of ICE">

<!-- REDO WHEN DONE -->

<t>
ICE makes the fundamental assumption that clients exist in a network
of segmented connectivity. This segmentation is the result of a number
of addressing realms in which a client can simultaneously be
connected. We use "realms" here in the broadest sense. A realm is defined
purely by connectivity. Two clients are in the same realm if, when
they exchange the addresses each has in that realm, they are able to
send packets to each other. This includes IPv6 and IPv4 realms, which
actually use different address spaces, in addition to private networks
connected to the public Internet through NAT.
</t>

<t>The key assumption in ICE is that a client cannot know, a priori,
which address realms it shares with any peer it may wish to
communicate with. Therefore, in order to
communicate, it has to try connecting to addresses in all of the
realms.
</t>

<figure anchor="fig:ice-model"><artwork>
<![CDATA[
       Agent A          TURN,STUN Servers          Agent B
          |(1) Gather Addresses |                     |
          |-------------------->|                     |
          |(2) Offer            |                     |
          |------------------------------------------>|
          |                     |(3) Gather Addresses |
          |                     |<--------------------|
          |(4) Answer           |                     |
          |<------------------------------------------|
          |(5) Media            |                     |
          |<------------------------------------------|
          |(6) Media            |                     |
          |------------------------------------------>|
          |(7) STUN Checks      |                     |
          |<------------------------------------------|
          |(8) STUN Checks      |                     |
          |------------------------------------------>|
          |(9) Offer            |                     |
          |------------------------------------------>|
          |(10) Answer          |                     |
          |<------------------------------------------|
          |(11) Media           |                     |
          |<------------------------------------------|
          |(12) Media           |                     |
          |------------------------------------------>|

]]></artwork></figure>

<t> The basic flow of operation for ICE is shown in <xref
target="fig:ice-model"/>. Before the offeror establishes a session,
it obtains local transport addresses from its operating system on as
many interfaces as it has access to. These interfaces can include
IPv4 and IPv6 interfaces, in addition to Virtual Private Network (VPN)
interfaces or ones associated with RSIP. For media protocols that
support both UDP and TCP (such as the Real Time Transport Protocol
(RTP) <xref target="RFC3550"/>, which can run over either), it
obtains both TCP and UDP transport addresses. In addition, the agent
obtains derived transport addresses from each local transport address
using protocols such as STUN and TURN. Each local and derived
transport address becomes a candidate for receipt of media traffic. 
</t>

<t> The agent will choose one of its candidate transport addresses as
its initial media transport address for inclusion in the connection
and media lines in the offer. This transport address will be
utilized for media traffic while connectivity is verified to all of
the candidates. Since these checks may take time to execute, media
clipping will occur if the media transport address is not reachable by
the peer. To minimize the probability of clipping, the transport
address that is most likely to work is chosen. This is normally a
TURN-derived transport address, but others can be utilized based on
local policy.  </t>

<t> Each candidate transport address (including the one being used as
the media transport address) is listed in an a=candidate attribute in
the offer. Each candidate is given a preference. Preference is a
matter of local policy, but typically, lowest preference would be
given to transport addresses learned from a TURN server (i.e., TURN
derived transport addresses). Each candidate is also assigned a
distinct username fragment and a password. </t>

<t> The offer is then sent to the answerer. This specification
does not address the issue of how the signaling messages themselves
traverse NAT. It is assumed that signaling protocol specific
mechanisms are used for that purpose. The answerer follows a similar
process as the offeror followed; it obtains addresses from local
interfaces, obtains derived transport addresses from those, and the
combination becomes its set of candidate transport addresses. It
picks one as its initial media transport address and places it into the
m/c line in the answer, and then lists all of them in the a=candidate
attributes in the answer, along
with a preference, username fragment and password.  </t>

<t> Once the offer/answer exchange has completed, each agent sends
media from its media transport address to the media transport address
of its peer. This media stream may or may not work, depending on
whether or not the media transport address is reachable. In parallel
with the transmission of media, a connectivity check begins. This
check makes use of STUN messages sent from each candidate to each
other candidate. These checks will allow each agent to determine
whether it can send packets from a particular candidate to a candidate
from its peer, and whether packets can be sent back. If, after a
certain period of time, an agent determines that a pair of candidates
works, and has a higher priority 
than the transport addresses currently in use for media (perhaps
because the ones in use don't work), it sends a new offer that
"promotes" its candidate into the m/c line. This causes the media
traffic to switch to this new transport address. </t>









<t> Both agents run a STUN server on each of their local transport
addresses. To verify connectivity between candidates, each agent
pairs up each of its UDP candidates with each UDP candidate from its
peer. Each pair, called a pairing, has a preference. This preference
is computed as the arithmetic average of the preferences of the
candidates that make up the pair. For each pair, each agent initiates
a STUN transaction from its candidate address to the candidate
address of its peer. The username in the STUN request is obtained by
concatenating the username fragment of its candidate with that of its
peer's candidate. The password in the STUN request is that of its peer's
candidate. Note that, because media is also being sent while the STUN
checks are in progress, each agent will need to be able to
demultiplex STUN and media traffic on the local transport address
being used for media transport.  </t>
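
<t>The pairing step can be sketched as follows. This is a minimal
illustration under stated assumptions, not a normative algorithm: the
names Cand, Pairing and make_pairings are hypothetical, addresses are
plain strings, and sorting the result by descending preference is an
assumption made for readability. The preference, username and password
rules are the ones described in the preceding paragraph.</t>

<figure><artwork><![CDATA[
```python
from dataclasses import dataclass
from itertools import product
from typing import List


@dataclass(frozen=True)
class Cand:
    transport: str      # "udp" or "tcp"
    address: str        # "ip:port", illustrative only
    preference: float   # set by local policy
    ufrag: str          # username fragment
    password: str


@dataclass(frozen=True)
class Pairing:
    local: Cand
    remote: Cand

    @property
    def preference(self) -> float:
        # Arithmetic average of the two candidate preferences.
        return (self.local.preference + self.remote.preference) / 2

    @property
    def stun_username(self) -> str:
        # Own fragment first, then the peer's fragment.
        return self.local.ufrag + self.remote.ufrag

    @property
    def stun_password(self) -> str:
        # The password is that of the peer's candidate.
        return self.remote.password


def make_pairings(local: List[Cand], remote: List[Cand]) -> List[Pairing]:
    """Pair every local UDP candidate with every remote UDP candidate."""
    pairs = [Pairing(l, r) for l, r in product(local, remote)
             if l.transport == "udp" and r.transport == "udp"]
    # Assumption: highest-preference pairings are listed first.
    return sorted(pairs, key=lambda p: p.preference, reverse=True)
```
]]></artwork></figure>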

<t>A pair of candidates is considered verified when an agent receives
a response to its STUN request sent between the pair, and receives a
STUN request from its peer. If, after a brief period of time, the pair
of candidates corresponding to the media transport addresses has not
been verified, the session originator sends an Update request,
changing the media transport address to a transport address of a pair for which
connectivity has been verified. This avoids continued media clipping
while the checks complete. Otherwise, after the STUN transactions have
completed, the session originator will send an Update request,
changing its media transport address to the one in the verified pair
with the highest preference.
</t>

<t>Furthermore, when either the offeror or answerer receives a
STUN request, it takes note of the source IP address and port of that
request. It compares that transport address to the existing set of
candidate transport addresses. If it's not amongst them, it gets added
as another 
candidate address for the peer. The incoming STUN message provides the agent
with enough context to associate that transport address with a
STUN username, STUN password, and priority, just as if it had been sent in an
offer or answer. As such, the agent begins sending STUN
messages to it as well, and if those succeed, the address can be used
if it has a higher priority.
</t>
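
<t>The bookkeeping for learning a new peer candidate from an incoming
STUN request might look like the sketch below. It is illustrative
only: the function name note_stun_source and the representation of a
transport address as an (ip, port) tuple are assumptions, and the
association of the new address with a username, password and priority
is omitted.</t>

<figure><artwork><![CDATA[
```python
from typing import Set, Tuple

TransportAddr = Tuple[str, int]   # (ip, port), an assumed representation


def note_stun_source(known: Set[TransportAddr],
                     source: TransportAddr) -> bool:
    """Record the source of an incoming STUN request.

    Returns True when the source was not yet among the peer's
    candidate transport addresses, meaning the agent should begin
    sending STUN requests to it as well.
    """
    if source in known:
        return False
    known.add(source)
    return True
```
]]></artwork></figure>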


</section>

<section anchor="sec:detail" title="ICE Procedures">

<t>
This section describes the detailed processing needed for ICE.
</t>

<section anchor="sec:init" title="Sending the Initial Offer">

<t> When an agent wishes to begin a session by sending an initial
offer, it starts by gathering transport addresses, as described in
<xref target="sec:gather"/>. This will produce a set of candidates,
including local ones, STUN-derived ones, and TURN-derived ones.  </t>

<t>This process of gathering candidates can actually happen
at any time before sending the initial offer. An agent can pre-gather
transport addresses, using a user interface cue (such as picking up
the phone, or entry into an address book) as a hint that
communication is imminent. Doing so eliminates any additional
perceivable call setup delays due to address gathering.  </t>

<t>
When it comes time to offer communications, it determines a
priority for each candidate and identifies the active candidate that
will be used for receipt of media, as described in <xref
target="sec:priority"/>.</t> 

<t> The next step is to construct the offer message. For each media
stream, it places its candidates into a=candidate attributes in the
offer and puts its active candidate into the m/c line. The process for
doing this is described in <xref target="sec:enc-candidate"/>. The offer is
then sent.
</t>


</section>

<section anchor="sec:resp" title="Receipt of the Offer and Generation
of the Answer">

<t>Upon receipt of the offer message, the agent checks if the offer
contains any a=candidate attributes. If it does, the offeror supports
ICE. In that case, it starts gathering candidates, as described in
<xref target="sec:gather"/>, and prioritizes them as described in <xref
target="sec:priority"/>.  This processing is done immediately on
receipt of the offer, to prepare for the case where the user should
accept the call, or early media needs to be generated. By gathering
candidates while the user is being alerted to the request for
communications, session establishment delays due to that gathering can
be eliminated.  </t>

<t>
At some point, the answerer will decide to accept or reject the
communications. A rejection terminates ICE processing. In
the case of acceptance, the answer is constructed, and if the offeror
supported ICE, the candidates
are encoded into the SDP as described in <xref
target="sec:enc-candidate"/>. The answer is then sent. If the offeror
supported ICE, the answerer begins its connectivity checks as
described in <xref 
target="sec:stun-send"/>.
</t>

<t> In addition, and regardless of whether the offeror supported ICE, the
answerer can begin sending media packets as it normally would. It
sends media according to the procedures in <xref
target="sec:send-media"/>.  </t>

</section>

<section title="Processing the Answer">

<t>
There are two possible cases for processing of the answer. If
the answerer did not support ICE, the answer
will not contain any a=candidate attributes. As a
result, the offeror knows that it cannot perform its connectivity
checks. In this case, it proceeds with normal media processing as if
ICE was not in use. However, the procedures for sending media, described in
<xref target="sec:send-media"/>, MUST still be followed.
</t>


<t>If the answer contains candidates, it implies that the
answerer supported ICE. In that case, the offeror begins connectivity
checks as described in <xref
target="sec:stun-send"/>. It also starts sending media, using the
candidate in the m/c line, based on the procedures described
in <xref target="sec:send-media"/>. 
</t>

</section>

<section title="Common Procedures">

<t>
This section discusses procedures that are common between offeror
and answerer.
</t>

<section anchor="sec:gather" title="Gathering Candidates">

<t>
An agent gathers candidates when it believes that communications is
imminent. For offerors, this occurs before sending an offer
(<xref target="sec:init"/>). For answerers, it occurs before
sending an answer (<xref target="sec:resp"/>).
</t>

<t> Each candidate is composed of a series of transport addresses of
the same type. In the case of RTP, the candidate is composed of either
one or two transport addresses. Normally there are two - one for RTP,
and one for RTCP. However, if RTCP is not in use, a candidate will
only contain a single transport address.  </t>

<t> The first step is to gather local candidates. Local candidates are
obtained by binding to ephemeral ports on an
interface (physical or virtual, including VPN interfaces) on the
host. Specifically, for each UDP-only media stream the agent wishes to
use, the agent SHOULD obtain a set of candidates (one for each
interface) by binding to N ephemeral UDP ports on each interface,
where N is the number of transport addresses needed for the
candidate. For RTP, N is typically two. For each TCP-only media stream
the agent wishes to use, the agent SHOULD obtain a set of candidates
by binding to N ephemeral TCP ports on each interface, where N is the
number of transport addresses needed for the candidate. For media
streams that can support either UDP or TCP, the agent SHOULD obtain a
set of candidates by binding to N ephemeral UDP and N ephemeral TCP
ports on each interface, where N is the number of transport addresses
needed for the candidate.
</t>
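
<t>Gathering one local UDP candidate on a single interface can be
sketched as follows. This is a minimal illustration, assuming an IPv4
interface address and the common convention that binding to port 0
asks the operating system for an ephemeral port. The function name
bind_local_udp_candidate is hypothetical, and enumeration of the
host's interfaces is left out.</t>

<figure><artwork><![CDATA[
```python
import socket
from typing import List


def bind_local_udp_candidate(interface_ip: str,
                             n: int = 2) -> List[socket.socket]:
    """Bind n ephemeral UDP ports on one interface.

    For RTP, n is typically two: one socket for RTP and one for RTCP.
    The caller owns the returned sockets and must close them.
    """
    socks: List[socket.socket] = []
    try:
        for _ in range(n):
            s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            socks.append(s)
            # Port 0 lets the operating system choose an ephemeral port.
            s.bind((interface_ip, 0))
    except OSError:
        # Release anything already allocated before reporting failure.
        for s in socks:
            s.close()
        raise
    return socks
```
]]></artwork></figure>

<t>The local transport addresses of the candidate are then available
from getsockname() on each returned socket.</t>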

<t>
If a host has K local interfaces, this will result in K candidates for
each UDP stream (requiring K*N transport addresses), K candidates for
each TCP stream (requiring K*N transport addresses), and 2K candidates
for streams that support UDP and TCP (requiring 2*K*N transport
addresses).
</t>
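
<t>The counts above can be checked with a few lines of arithmetic. The
helper below is purely illustrative; its name and parameters are
assumptions, with transports set to 1 for a UDP-only or TCP-only
stream and to 2 for a stream that supports both.</t>

<figure><artwork><![CDATA[
```python
from typing import Tuple


def local_counts(k: int, n: int, transports: int = 1) -> Tuple[int, int]:
    """Return (candidates, transport addresses) for one media stream.

    k: number of local interfaces
    n: transport addresses per candidate (typically 2 for RTP)
    transports: 1 for UDP-only or TCP-only, 2 for UDP and TCP
    """
    candidates = transports * k
    return candidates, candidates * n
```
]]></artwork></figure>

<t>For example, a host with three interfaces and RTP plus RTCP (K=3,
N=2) has 3 candidates and 6 transport addresses for a UDP-only stream,
and 6 candidates and 12 transport addresses for a stream that supports
both UDP and TCP.</t>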

<t> Media streams carried using the <xref target="RFC3550">Real Time
Transport Protocol (RTP)</xref> can run over TCP <xref
target="I-D.ietf-avt-rtp-framing-contrans"/>. As such, it is
RECOMMENDED that both UDP and TCP candidates be obtained. Transmission
of real time media over UDP is generally preferred to TCP. However,
many network environments, for better or for worse, permit only TCP
traffic. Obtaining a TCP candidate, and then using it in
conjunction with a TURN relay as described below, allows for ICE to
make use of the TCP media only when UDP connectivity is non-existent,
as it may be in these restricted environments. However, providers of
real-time communications services may decide that it is preferable to
have no media at all than it is to have media over TCP. To allow for
choice, it is RECOMMENDED that agents be configurable with whether
they obtain TCP candidates for real time media.
</t>

<list style="indent"> <t> Having it be configurable, and then
configuring it to be off, is far better than not having the capability
at all. An important goal of this specification is to provide a single
mechanism that can be used across all types of endpoints. As such, it
is preferable to account for provider and network variation through
configuration, instead of hard-coded limitations in an
implementation. Furthermore, network characteristics and connectivity
assumptions can, and will, change over time. Just because an agent is
communicating with a server on the public network today doesn't mean
that it won't need to communicate with one behind a NAT tomorrow. Just
because an agent is behind a full cone NAT today doesn't mean that
tomorrow its user won't pick up the agent and take it to a public
network access point where there is a symmetric NAT or one that only
allows outbound TCP. The way to handle these cases and build a
reliable system is for agents to implement a diverse set of techniques
for allocating addresses, so that at least one of them is almost
certainly going to work in any situation. Implementors should consider
very carefully any assumptions that they make about deployments before
electing not to implement one of the mechanisms for address
allocation. In particular, implementors should consider whether the
elements in the system may be mobile, and connect through different
networks with different connectivity. They should also consider
whether endpoints which are under their control, in terms of location
and network connectivity, would always be under their control. Only in
cases where there isn't now, and never will be, endpoint mobility or
nomadicity of any sort, should a technique be omitted.  </t></list>


<t>
Once the agent has obtained local candidates, it obtains candidates
with derived transport 
addresses. Agents which serve end users directly, 
such as softphones, hardphones, terminal adaptors and so on, MUST 
implement STUN and SHOULD use it to obtain STUN candidates. These
devices SHOULD implement and SHOULD use TURN to 
obtain TURN candidates. They MAY implement and MAY
use other protocols that provide derived transport addresses, such as 
<xref target="I-D.huitema-v6ops-teredo">TEREDO</xref>. As with TCP,
usage of STUN and TURN is at SHOULD strength to allow for provider
variation. If it is not to be used, it is also RECOMMENDED that it be
implemented and just disabled through configuration, so that it can
be re-enabled through configuration if conditions change in the future.
</t>

<t>Agents which
represent network servers under the control of a service provider,
such as gateways to the telephone network, media servers, or
conferencing servers that are targeted at deployment only in networks
with public IP addresses MAY use STUN, TURN or other similar protocols
to obtain candidates. 
</t>

<list style="indent"> <t>Why would these types of endpoints even
bother to implement ICE?  The answer is that such an implementation
greatly facilitates NAT traversal for endpoints that connect to
it. The ability to process STUN connectivity checks allows for the
network server to obtain peer-derived transport addresses that can be
used to provide relay-free traversal of symmetric NAT for endpoints
that connect to it. Furthermore, implementation of the STUN
connectivity checks allows for NAT bindings along the way to be kept
open. ICE also provides numerous security properties that are
independent of NAT traversal, and would benefit any multimedia
endpoint. See <xref target="sec:security"/> for a discussion on these
benefits. 
</t></list>

<t> To obtain STUN candidates (which are always UDP), the client takes
a local UDP candidate, and for each configured STUN server, produces a
STUN candidate. It is anticipated that clients may have a multiplicity
of STUN servers configured in network environments where there are
multiple layers of NAT, and that layering is known to the provider of
the client. To produce the STUN candidate from the local candidate, it
follows the procedures of Section 9 of RFC 3489 for each local
transport address in the local candidate. It obtains a shared secret
from the STUN server and then initiates a Binding Request transaction
from the local transport address to that server. The Binding Response
will provide the client with its STUN derived transport address in the
MAPPED-ADDRESS attribute. If the client had K local candidates, this
will produce S*K STUN candidates, where S is the number of configured
STUN servers. </t>
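
<t>The message handling behind a STUN candidate can be sketched as
below, based on the RFC 3489 wire format: a 20-byte header (16-bit
type, 16-bit length, 128-bit transaction ID) followed by
type-length-value attributes. This is a simplified illustration, not a
complete client. It omits the shared secret exchange, message
integrity, retransmission and transaction ID matching, handles IPv4
only, and the function names are hypothetical.</t>

<figure><artwork><![CDATA[
```python
import os
import socket
import struct
from typing import Optional, Tuple

BINDING_REQUEST = 0x0001    # RFC 3489 message type
BINDING_RESPONSE = 0x0101   # RFC 3489 message type
MAPPED_ADDRESS = 0x0001     # RFC 3489 attribute type
HEADER_LEN = 20


def build_binding_request() -> bytes:
    """Build a Binding Request with no attributes."""
    # Type, length of the attribute section (zero), then a random
    # 128-bit transaction ID.
    return struct.pack("!HH", BINDING_REQUEST, 0) + os.urandom(16)


def parse_mapped_address(message: bytes) -> Optional[Tuple[str, int]]:
    """Return (ip, port) from MAPPED-ADDRESS in a Binding Response."""
    if len(message) < HEADER_LEN:
        return None
    msg_type, length = struct.unpack("!HH", message[:4])
    if msg_type != BINDING_RESPONSE:
        return None
    body = message[HEADER_LEN:HEADER_LEN + length]
    offset = 0
    while offset + 4 <= len(body):
        attr_type, attr_len = struct.unpack("!HH", body[offset:offset + 4])
        value = body[offset + 4:offset + 4 + attr_len]
        # Value layout: unused byte, family (0x01 = IPv4), port, address.
        if (attr_type == MAPPED_ADDRESS and len(value) == 8
                and value[1] == 0x01):
            port = struct.unpack("!H", value[2:4])[0]
            return socket.inet_ntoa(value[4:8]), port
        offset += 4 + attr_len
    return None
```
]]></artwork></figure>

<t>A client would send the request from each local transport address
to each configured STUN server; the address returned by
parse_mapped_address is the STUN derived transport address for that
local transport address.</t>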

<t> To obtain UDP TURN candidates, the client takes a local UDP
candidate, and for each configured TURN server, produces a TURN
candidate. It is anticipated that clients may have a multiplicity of
TURN servers configured in network environments where there are
multiple layers of NAT, and that layering is known to the provider of
the client. To produce the TURN candidate from the local candidate, it
follows the procedures of Section 8 of <xref
target="I-D.rosenberg-midcom-turn"/> for each local transport address
in the local candidate. It initiates an Allocate Request transaction
from the local transport address to that server. The Allocate Response
will provide the client with its TURN derived transport address in the
MAPPED-ADDRESS attribute. If the client had K local candidates, this
will produce S*K UDP TURN candidates, where S is the number of
configured TURN servers. </t>

<t> To obtain TURN-derived TCP candidates, the client takes a local
TCP candidate, and for each configured TURN server, produces a TCP
TURN candidate. It is anticipated that clients may have a multiplicity
of TURN servers configured in network environments where there are
multiple layers of NAT, and that layering is known to the provider of
the client. To produce the TURN candidate from the local candidate, it
iterates through the local transport addresses in the local candidate,
and for each one, initiates a TCP connection from the same
interface as the local transport address to the TURN server. It is not
necessary to initiate the connection from the actual port in the local
transport address. Following the procedures of Section 8 of <xref
target="I-D.rosenberg-midcom-turn"/>, it initiates an Allocate Request
transaction over the connection. The Allocate Response will provide
the client with its TCP TURN derived transport address in the
MAPPED-ADDRESS attribute. If the client had K local TCP candidates,
this will produce S*K TCP TURN candidates, where S is the number of
configured TURN servers. </t>


</section>

<section anchor="sec:enc-candidate" title="Encoding Candidates into
SDP">

<t> For each candidate to be placed into the SDP, the agent includes a
series of a=candidate attributes as media-level attributes, one for
each transport address in the candidate. Each of the transport
addresses for the same candidate MUST have the same value of the
candidate-id attribute. The a=candidate attributes for different
candidates MUST be unique within that media stream. Using a simple
sequence number, incrementing by one for each candidate for a media
stream, meets these requirements. The transport, unicast-address and
port of the attribute are set to those for the candidate. The qvalue
is set to the priority of this candidate (note that, for RTP, the RTP
and RTCP transport addresses MUST have equal priority values). The tid
MUST be chosen randomly with 128 bits of randomness. The tid is chosen
only when the transport address is placed into the SDP for the first
time; subsequent offers or answers within the same session containing
that same transport address would use the same tid used previously.
</t>
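
<t>The tid rules, random with 128 bits of randomness and stable for the
life of the session, can be sketched as a small registry. This is an
illustration only: the class name TidRegistry is hypothetical, and
rendering the tid as 32 hexadecimal characters is an assumption about
its textual form.</t>

<figure><artwork><![CDATA[
```python
import secrets
from typing import Dict, Tuple

# (transport, ip, port), an assumed key for a transport address.
TransportAddr = Tuple[str, str, int]


class TidRegistry:
    """Per-session mapping from transport address to its tid."""

    def __init__(self) -> None:
        self._tids: Dict[TransportAddr, str] = {}

    def tid_for(self, addr: TransportAddr) -> str:
        # Choose the tid only the first time the address is placed
        # into the SDP; later offers and answers reuse it.
        if addr not in self._tids:
            # 16 random bytes = 128 bits, hex encoded (assumption).
            self._tids[addr] = secrets.token_hex(16)
        return self._tids[addr]
```
]]></artwork></figure>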

<t>
The tid serves as a unique identifier for each transport address. It also
gets combined, through concatenation, with the tid of a peer
candidate to form the username and password that are placed in the STUN checks
between the peers. This allows the STUN message to uniquely identify
the pairing whose connectivity it is checking. The tid is needed as a
unique identifier because the IP address within the candidate fails to
provide that uniqueness as a consequence of NAT. 
</t>

<t> Consider agents A, B, and C. A and B are within private enterprise
1, which is using 10.0.0.0/8. C is within private enterprise 2, which
is also using 10.0.0.0/8. As it turns out, B and C both have IP
address 10.0.1.1. A sends an offer to C. C, in its answer, provides A
with its transport addresses. In this case, those are 10.0.1.1:8866 and
8877. As it turns out, B is in a session at the same time, and is
also using 10.0.1.1:8866 and 8877. This means that B is prepared to
accept STUN messages on those ports, just as C is. A will send a STUN
request to 10.0.1.1:8866 and 8877. However, these do not go to C as
expected. Instead, they go to B. If B just replied to them, A would
believe it has connectivity to C, when in fact it has connectivity to
a completely different user, B. To fix this, tid takes on the role of
a unique identifier. C provides A with an identifier for its transport
address, and A provides one to C. A concatenates these two identifiers
and uses the result as the username and password in its STUN query to
10.0.1.1:8866. This STUN query arrives at B. However, the username is
unknown to B, and so the request is rejected. A treats the rejected
STUN request as if there were no connectivity to C (which is actually
true). Therefore, the error is avoided.  </t>

<t>
An unfortunate consequence of the non-uniqueness of IP addresses is
that, in the above example, B might not even be an ICE agent. It
could be any host, and the port to which the STUN packet is directed
could be any ephemeral port on that host. If there is an application
listening on this socket for packets, and it is not prepared to handle
malformed packets for whatever protocol is in use, the operation of
that application could be affected. Fortunately, since the ports
exchanged in SDP are ephemeral and usually drawn from the dynamic or
registered range, the odds are good that the port is not used to run a
server on host B, but rather is the agent side of some protocol. This
decreases the probability of hitting a port in use, due to the
transient nature of port usage in this range. However, the possibility
of a problem does exist, and network deployers should be prepared for
it. 
</t>

<t>
Note that, because there are separate transport addresses for RTP and RTCP,
each will have a distinct tid.
</t>

<t>
The active candidate is placed into the m/c lines of the SDP. For RTP
streams, this is done by placing the RTP address and port into the c
and m lines in the SDP respectively. If the agent is utilizing RTCP,
it MUST encode its address and port using the a=rtcp attribute as
defined in RFC 3605 <xref target="RFC3605"/>. If RTCP is not in use,
the agent MUST signal that using b=RS:0 and b=RR:0 as defined in RFC
3556 <xref target="RFC3556"/>. 
</t>

<t>
For media streams that are inherently TCP-based (as opposed to ones
where TCP is a fallback and would be listed as a candidate but not the
initial active address), the connections MUST be signaled using
comedia <xref target="I-D.ietf-mmusic-sdp-comedia"/>, and those
connections MUST be in "holdconn" mode. This has the effect of
suspending connection attempts via the comedia mechanisms, allowing
ICE to open the connections instead. These connections then get removed from
holdconn mode when the ICE procedures complete and an updated
offer/answer exchange takes place that promotes one of the
existing ICE-established connections to active. Note that this has the
result of increasing the post-dial delay for TCP-oriented media, but
brings with it substantial security and NAT traversal benefits.
</t>

</section>


</section>

<section anchor="sec:priority" title="Prioritizing the Transport
Addresses and Choosing an Active One">

<t>The prioritization process takes the set of candidates and
associates each with a priority. This 
priority reflects the desire that the agent has to receive media on that
address, and is assigned as a value from 0 to 1 (1 being most
preferred). Priorities are ordinal, so that their significance is only
meaningful relative to other candidates for a particular media stream.
</t>

<t>
This specification makes no normative recommendations on how the
prioritization is done. However, some useful guidelines are suggested
on how such a prioritization can be determined.
</t>

<t>
One criterion for choosing one candidate over another is
whether or not that candidate involves the use of a
relay. That is, if media is sent to that candidate, will the
media first transit a relay before being received? TURN candidates
make use of relays (the TURN server), as do any 
local candidates associated with a VPN server. When media is
transited through a relay, it can increase the latency between
transmission and reception. It can increase packet loss, because
of the additional router hops that may be taken. It may increase the
cost of providing service, since media will be routed in and right
back out of a relay run by the provider. If these concerns are
important, candidates with this property can be listed with
lower priority.
</t>

<t>
Another criterion for choosing one candidate over another is IP address
family. ICE works with both IPv4 and IPv6. It therefore provides a
transition mechanism that allows dual-stack hosts to prefer
connectivity over IPv6, but to fall back to IPv4 in case the v6
networks are disconnected (due, for example, to a failure in a 6to4
relay) <xref target="RFC3056"/>. It can also help with hosts that have
both a native IPv6 address and a 6to4 address. In such a case, higher
priority could be afforded to the native v6 address, followed by the
6to4 address, followed by a native v4 address. This allows a site to
obtain and begin using native v6 addresses immediately, yet still
fall back to 6to4 addresses when communicating with agents in other
sites that do not yet have native v6 connectivity.
</t>

<t>
Another criterion for choosing one candidate over another is security. If
a user is a telecommuter, and therefore connected to their corporate
network and a local home network, they may prefer their voice traffic
to be routed over the VPN in order to keep it on the corporate network
when communicating within the enterprise, but use the local network
when communicating with users outside of the enterprise.
</t>

<t>
Another criterion for choosing one address over another is topological
awareness. This is most useful for candidates which make use
of relays (including TURN and VPN). In those cases, if an agent has
preconfigured or dynamically discovered knowledge of the topological
proximity of the relays to itself, it can use that to select closer
relays with higher priority.
</t>

<t>
Finally, the transport protocol itself is a criterion for choosing one
candidate over another. If a particular media stream can run over UDP
or TCP, the UDP candidates might be preferred over the TCP
candidates. This allows ICE to use the lower latency UDP connectivity
if it exists, but fall back to TCP if UDP doesn't work.
</t>

<t>
Once the candidates have been prioritized, one is selected as
the active one. This is the candidate that will be used for actual
exchange of media, until replaced by an updated offer or answer. Since
the ICE connectivity checks can take a few seconds to execute, media
clipping can occur if this candidate doesn't work. The active candidate
will also be used to receive media from ICE-unaware peers. As such, it is
RECOMMENDED that one be chosen based on the likelihood of that candidate
to work with the peer that is being contacted. Unfortunately, it is
difficult to ascertain which candidate that 
might be. As an example, consider a user within an enterprise. To
reach non-ICE capable agents within the enterprise, a local
candidate has to be used, since the enterprise policies may
prevent communication between elements using a relay on the public
network. However, when communicating to peers outside of the
enterprise, a TURN-based candidate from a publicly accessible TURN
server is needed. 
</t>

<t>
Indeed, the difficulty in picking just one address that will work is
the whole problem that motivated the development of this specification
in the first place. As such, it is RECOMMENDED that the default
address be a TURN candidate from a TURN server
providing public IP addresses. Furthermore, ICE is only truly
effective when it is supported on both sides of the session. It is
therefore most prudent to deploy it to close-knit communities as a
whole, rather than piecemeal. In the example above, this would mean
that ICE would ideally be deployed completely within the enterprise,
rather than just to parts of it. 
</t>

</section>

<section anchor="sec:stun-send" title="Connectivity Checks">

<t>
Once the offer/answer exchange has completed, both agents will have a
set of candidates for each media stream. Each agent forms a set of pairings
for each media stream by combining each of its UDP candidates with
each of the UDP candidates of its peer, and by combining each of its
TCP candidates with each of the TCP candidates of its peer. If
candidates for other transport protocols were signaled through the
offer/answer exchange, a pairing is performed between each of those as
well. If an offer/answer exchange took place for a session comprised
of an audio and a video stream, and each stream had two UDP and two
TCP candidates from each agent, there would be 16 pairings, 8 for
audio and 8 for video. Each of those eight would be comprised of four UDP and
four TCP. Note that there is no requirement that the number of
candidates from each peer be the same. One agent can offer two UDP
candidates for a media stream, and the answer can contain three UDP
candidates for the same media stream. In that case, there would be six
UDP pairings.
</t>
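
<t>
The pairing counts in the examples above can be checked with a short
sketch. The function and variable names here are hypothetical.
Candidates are modeled simply as (transport, name) tuples, and a
pairing is formed only between candidates of the same transport
protocol.
</t>

<figure><artwork>
<![CDATA[
```python
from itertools import product


def form_pairings(native, remote):
    """Pair each native candidate with each remote one of the same transport."""
    return [(n, r) for n, r in product(native, remote) if n[0] == r[0]]


def candidates(n_udp, n_tcp, owner):
    """Build placeholder candidates as (transport, name) tuples."""
    udp = [("UDP", f"{owner}-udp{i}") for i in range(n_udp)]
    tcp = [("TCP", f"{owner}-tcp{i}") for i in range(n_tcp)]
    return udp + tcp


# Two UDP and two TCP candidates per agent, for each of two streams.
audio = form_pairings(candidates(2, 2, "a"), candidates(2, 2, "b"))
video = form_pairings(candidates(2, 2, "a"), candidates(2, 2, "b"))

# Two UDP candidates offered, three UDP candidates answered.
uneven = form_pairings(candidates(2, 0, "a"), candidates(3, 0, "b"))

print(len(audio) + len(video), len(uneven))
```
]]></artwork></figure>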

<t> Each candidate has a number of transport addresses. In the case of
RTP, there are either one or two. Within the pairing, the transport
addresses of each candidate are linked together one-to-one to form a
transport address pair. In the case of RTP, the result will either be
one or two transport address pairs - one for RTP, and possibly another
for RTCP. The relationships among a candidate, transport address,
pairing and transport address pair are shown in <xref
target="fig:pairing"/>. This figure shows the pairing as seen by the
agent that owns the candidate {A,B}. The candidate owned by that agent
is called the native candidate, and the one owned by its peer is the
remote candidate. As the figure shows, there is one pairing between
two candidates, and two transport address pairs ({A,C} and {B,D}). If
one of the candidates only had one transport address (in the case
where RTCP was not being used by one agent), there would only be one
transport address pair, {A,C}. Each transport address is associated
with a tid. Furthermore, each transport address pair is associated
with an ID, the transport address pair ID. This ID is equal to the
concatenation of the tid of the native transport address with the tid
of the remote transport address. This means that the identifiers are
different for each agent. For the agent that owns {A,B}, the transport
address pair ID is WY for the first transport address pair, and XZ for
the second. For the agent that owns {C,D}, it would be reversed - YW
for the first transport address pair, and ZX for the second.  </t>

<figure anchor="fig:pairing"><artwork>
<![CDATA[
             ...........................................                  
             .                                         .                  
  .......... .                                         . ..........       
  .        . .  .............           .............  . .        .       
  .        . .  .           .           .           .  . .        .       
  .    --  . .  .    --     .           .    --     .  . .   --   .       
  .   | A|<<<<<<<<<<| A|--------------------| C|>>>>>>>>>>>>| K|  .       
  .    --  . .  .    --     . Transport .    --     .  . .   --   .       
  .        . .  . Transport .  Address  . Transport .  . .        .       
  .        . .  .  Address  .   Pair    .  Address  .  . .        .       
  .        . .  .  tid=W    .   ID=WY   .   tid=Y   .  . .        .       
  .        . .  .           .           .           .  . .        .       
  .        . .  .           .           .           .  . .        .       
  .        . .  .           .           .           .  . .        .       
  .    --  . .  .    --     .           .    --     .  . .   --   .       
  .   | J|<<<<<<<<<<| B|--------------------| D|>>>>>>>>>>>>| D|  .       
  .    --  . .  .    --     . Transport .    --     .  . .   --   .       
  .......... .  . Transport .  Address  . Transport .  . ..........       
  Associated .  .  Address  .   Pair    .  Address  .  . Associated       
  Local      .  .   tid=X   .   ID=XZ   .   tid=Z   .  . Local            
  Transport  .  .           .           .           .  . Transport        
  Addresses  .  .............           .............  . Addresses        
             .       Native              Remote        .                  
             .     Candidate            Candidate      .                  
             .        and                  and         .                  
             . Transport Addresses Transport Addresses .                  
             .                                         .                  
             ...........................................                  
                                                                          
                                Pairing                                   

]]></artwork></figure>
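
<t>
The transport address pair IDs in the figure can be reproduced with a
minimal sketch. The function name is hypothetical, and the single
letter tids are the placeholders used in the figure rather than real
128-bit values. Each agent concatenates the tid of its native
transport address with the tid of the remote one, so the two agents
hold mirrored identifiers for the same pair.
</t>

<figure><artwork>
<![CDATA[
```python
def transport_address_pair_ids(native_tids, remote_tids):
    """Link transport addresses one-to-one and form each pair ID."""
    # zip() stops at the shorter list, so a candidate that carries no
    # RTCP transport address yields a single pair.
    return [n + r for n, r in zip(native_tids, remote_tids)]


# As seen by the agent that owns {A,B} (tids W and X).
left = transport_address_pair_ids(["W", "X"], ["Y", "Z"])

# As seen by the agent that owns {C,D} (tids Y and Z).
right = transport_address_pair_ids(["Y", "Z"], ["W", "X"])

# One side is not using RTCP: only the RTP pair remains.
rtp_only = transport_address_pair_ids(["W", "X"], ["Y"])
```
]]></artwork></figure>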

<t>
The figure also shows that each transport address has an associated
local transport address. The associated local transport address is the
local transport address at which the agent will receive packets sent
to the transport address. For a local transport address, its
associated local transport address is the same. This is the case for
transport addresses A and D in the diagram. For STUN derived and TURN derived
transport addresses, however, they are not the same. The associated
local transport address is the one from which the STUN or TURN
transport was derived.
</t>

<t> Next, each agent begins sending connectivity checks for each
transport address pair. The procedure differs for UDP and TCP.
</t>

<section title="UDP Connectivity Checks">

<t> An agent considers a UDP pairing validated when all of its
transport address pairs have been validated. Each transport address
pair is validated when an agent has successfully completed a STUN Binding
Request transaction from its native transport address to the
corresponding remote transport address, and has received a
STUN Binding Request on its native transport address, sent
from the remote transport address. This ensures that packets can flow
in each direction.  </t>

<t>
Because validation of a transport address pair involves a STUN
transaction in each direction, a pair can be in one of five states -
unknown, invalid, send-valid, receive-valid and valid. Each transport address
pair starts in the unknown state.
</t>

<section anchor="sec:send-valid" title="Send Validation">

<t>
To validate a transport address pair in the send direction, an agent
needs to complete a successful STUN Binding Request transaction. This
means it needs to send a Binding Request from its native transport
address to the remote transport address, and receive a successful
Binding Response back.
</t>

<t>For UDP-based transport addresses, an agent initiates a STUN
Binding Request transaction by sending the request from its native
transport address to the remote transport address. The meaning of
"sending from its native transport address" is clear in the case of a
local transport address - the request is sent such that the source IP
address and port of the packet are equal to that local transport
address. However, the meaning is different for STUN and TURN derived
transport addresses. For STUN derived transport addresses, it is sent by
sending from the local transport address used to derive that STUN
address. For TURN derived transport addresses, it is sent by using
TURN mechanisms to send the request through the TURN server (using the
SEND primitive). Sending the request through the TURN server
necessarily requires that the request be sent from the client, using
the local transport address used to derive the TURN transport
address.  </t>

<t> The Binding Request sent by the agent MUST contain the USERNAME
attribute. This attribute MUST be set to the transport address pair ID
of the corresponding transport address pair as seen by its peer. Thus,
for the first transport address pair in the example above, if the
agent on the left sends the STUN Binding Request, the USERNAME will
have the value YW. The request MAY contain the MESSAGE-INTEGRITY
attribute, computed according to RFC 3489 procedures. The Binding
Request MUST NOT contain the CHANGE-REQUEST or RESPONSE-ADDRESS
attribute.  </t>

<t> Each of these STUN transactions will generate either a timeout, or
a response. If the response is a 420, 500, or 401, the agent should
try again as described in RFC 3489. Either initially, or after such a
retry, the STUN transaction might produce a non-recoverable failure
response (error codes 400, 431, or 600) or a failure result
inapplicable to this usage of STUN and thus unrecoverable (432,
433). If this happens, the transport address pair and its corresponding
candidate are considered invalid.  If the STUN transaction produces a
430 error or times out, the client SHOULD retry with a new STUN
Binding Request transaction. The 430 response code, as described
below, is generated when the server doesn't recognize the STUN
username because the Binding Request was received prior to the
receipt of the answer. Its occurrence is a result of a failed race
between the Binding Request and the answer. This is remedied by
retrying, which allows the "slower" answer to be received. These retry
transactions carry the same USERNAME value as the original Binding
Request, and differ only in their STUN transaction ID. If these
retries have not produced a success response after Tg seconds, the
transport address pair is considered invalid. Tg SHOULD be
configurable. It is RECOMMENDED that it default to 50 seconds. This is
a reasonable approximation of the maximum SIP transaction
duration. </t>
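
<t>
The handling of transaction outcomes described above might be sketched
as a simple classification. The function name, its return labels and
the elapsed-time parameter are hypothetical. Treating any response
code not listed in the text as invalid is an assumption of this
sketch, not a rule stated here.
</t>

<figure><artwork>
<![CDATA[
```python
RETRY_PER_RFC3489 = {420, 500, 401}
UNRECOVERABLE = {400, 431, 600, 432, 433}


def classify_result(error_code=None, timed_out=False, elapsed=0.0, tg=50.0):
    """Return 'success', 'retry-3489', 'retry-new' or 'invalid'.

    error_code is None for a success response; elapsed is the number of
    seconds spent retrying so far; tg is the Tg limit (default 50).
    """
    if timed_out or error_code == 430:
        # Retry with a new transaction until Tg seconds have passed.
        return "retry-new" if elapsed < tg else "invalid"
    if error_code is None:
        return "success"
    if error_code in RETRY_PER_RFC3489:
        return "retry-3489"
    if error_code in UNRECOVERABLE:
        return "invalid"
    # Assumption: codes not mentioned in the text are treated as invalid.
    return "invalid"
```
]]></artwork></figure>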

<!-- add motivation for why this timout value is used-->

<t>
If the STUN transaction succeeds for a UDP transport address pair
(producing a success response), and the pair was previously in the
receive-valid state, it is considered valid. If the pair was
previously in the unknown state, it is considered send-valid. 
</t>

<t>
If a transport address pair is send-valid or valid, an agent MUST
generate a new STUN Binding Request transaction every Tr seconds. This
transaction ensures that NAT bindings for the transport address pair
remain open while the candidate is under consideration. They can also
be used to keep the bindings alive when the candidate is promoted to
active, as described in <xref target="sec:keepalives"/>. Tr SHOULD be
configurable, and SHOULD default to 15 seconds. Each new Binding
Request transaction is processed according to the procedures in this
Section. It is possible for a previously valid candidate to later be
invalidated by a subsequent STUN transaction. This happens in cases
where the NAT bindings expire. 
</t>

</section>

<section anchor="sec:recv-valid" title="Receive Validation">

<t> As a result of providing a list of candidates in its offer or
answer, an ICE implementation will receive STUN Binding Request
messages. An agent MUST be prepared to receive STUN Binding Requests
on each local transport address from the moment it sends an offer or
answer that contains a candidate with that local transport
address. Similarly, it MUST be prepared to receive STUN Binding
Requests on a local transport address the moment it sends an offer or
answer that contains a STUN or TURN candidate derived from a local
candidate containing that local transport address. It
can cease listening for STUN messages on that local transport address
after reliably sending an updated offer or answer which does not
include any candidates equal to or derived from that local transport
address. Here, "reliably" means that the agent knows that the offer or
answer was received by its peer. This knowledge is based on the
protocol carrying the offer/answer exchanges. In the case of SIP, if
the offer is in an INVITE, the agent knows this was received by its
peer when a 200 OK or reliable provisional response <xref
target="RFC3262"/> is received with the answer. If the offer is in a
reliable provisional response, the agent knows it was reliably
received when the PRACK arrives. If an answer is in a 200 OK response,
the agent knows this was received when the ACK is received.
</t>

<t>
The agent does not need to provide STUN service on any other
IP address or port, unlike the STUN usage described in <xref
target="RFC3489"/>. The need to run the service on multiple ports is
to support the change flags. However, those flags are not needed with
ICE, and the server SHOULD reject, with a 400 response, any STUN
requests with these flags set. The CHANGED-ADDRESS attribute in a
Binding Response is set to the transport address on which the server is
running.
</t>

<t>Furthermore, there is no need to support TLS or to be prepared to
receive SharedSecret request messages. Those messages are used to
obtain shared secrets to be used with Binding Requests. However, with
ICE, a shared secret is not needed. The tids that are exchanged and
used to form the STUN USERNAME attribute do not actually require the
security properties associated with a shared secret in order for ICE
to operate securely; this is because ICE security is bootstrapped off
of the protocol carrying the offer/answer exchanges.
</t>

<t> One of the candidates will be in use as the active candidate. For
the transport addresses comprising that candidate, the agent will
receive both STUN requests and media packets on its associated local
transport addresses. The agent MUST be able to disambiguate them. In
the case of RTP/RTCP, this disambiguation is easy. RTP and RTCP
packets start with the bits 0b10 (v=2). The first two bits in STUN are
always 0b00. This disambiguation also works for packets sent using
Secure RTP <xref target="RFC3711"/>, since the RTP header is in the
clear. Disambiguating STUN with other media stream protocols may be
more complicated. However, it is always possible with arbitrarily
high probability by selecting an appropriately random username (see
below).  </t>
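
<t>
As a minimal sketch of the RTP/RTCP case, an agent could inspect the
two most significant bits of the first byte of each received packet.
The function name and the "unknown" label are hypothetical; the bit
patterns are the ones given above.
</t>

<figure><artwork>
<![CDATA[
```python
def classify_packet(data: bytes) -> str:
    """Classify a received packet by the top two bits of its first byte."""
    if not data:
        return "unknown"
    top_two = data[0] >> 6
    if top_two == 0b10:
        return "rtp"   # RTP or RTCP, version 2
    if top_two == 0b00:
        return "stun"
    return "unknown"
```
]]></artwork></figure>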

<t>
The STUN Binding Request can only be usefully processed once an
offer/answer exchange has completed. As a result, if an offerer
receives a STUN Binding Request message prior to the receipt of an
answer to its offer, it MUST reject the request with a 430 response. 
This will cause the answerer to retry, and give time for the answer
(which is in transit) to arrive at the offerer.
</t>

<t> If the offer/answer exchange has completed, the agent MUST follow
the procedures defined in RFC 3489 and verify that the USERNAME
attribute is known to the server. Here, this is done by taking the
USERNAME attribute, and comparing it against the transport address
pair identifiers for each transport address pair as seen by that
agent. If there is no match, the STUN Binding Request generates a
400. If there is a match, the resulting transport address pair is
called the matching transport address pair. The user agent proceeds
with the processing of the request and generation of a response as per
RFC 3489. In addition, if the state of that transport address pair
was previously unknown, it changes to receive-valid. If the state was
previously send-valid, it moves to valid.  </t>
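
<t>
The state changes described in the send and receive validation
procedures might be sketched as follows. All names are hypothetical.
The text only defines transitions out of the unknown, send-valid and
receive-valid states, so leaving every other state unchanged is an
assumption of this sketch.
</t>

<figure><artwork>
<![CDATA[
```python
from enum import Enum


class PairState(Enum):
    UNKNOWN = "unknown"
    INVALID = "invalid"
    SEND_VALID = "send-valid"
    RECEIVE_VALID = "receive-valid"
    VALID = "valid"


def on_send_success(state):
    """Our own Binding Request transaction produced a success response."""
    if state is PairState.RECEIVE_VALID:
        return PairState.VALID
    if state is PairState.UNKNOWN:
        return PairState.SEND_VALID
    return state  # assumption: other states are left unchanged


def on_request_received(state):
    """A Binding Request with a matching USERNAME was received."""
    if state is PairState.UNKNOWN:
        return PairState.RECEIVE_VALID
    if state is PairState.SEND_VALID:
        return PairState.VALID
    return state  # assumption: other states are left unchanged


def handle_binding_request(username, pairs):
    """pairs maps this agent's pair IDs to states; returns an error code or None."""
    if username not in pairs:
        return 400  # no matching transport address pair
    pairs[username] = on_request_received(pairs[username])
    return None
```
]]></artwork></figure>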

<t>
An agent will continue to receive periodic STUN transactions as long
as it has listed its transport address in an a=candidate attribute. It
MUST process those transactions according to this section. It is
possible that a transport address pair that was previously valid may
become invalidated as a result of a subsequent failed STUN transaction.
</t>

</section>

<section anchor="sec:learning" title="Learning New Candidates from Connectivity Checks">

<t>
ICE makes use of candidate addresses learned through protocols like
STUN, as described in <xref target="sec:gather"/>. These addresses are
learned when STUN requests are sent to configured STUN
servers. However, the peer-to-peer STUN connectivity checks can
themselves provide additional candidates that ICE can make use
of. This happens when two agents are separated by a symmetric
NAT. When the agent behind the symmetric NAT sends a Binding Request
to the other agent (which can have a public address or be behind any
type of NAT except for symmetric), the symmetric NAT will create a new
NAT binding for this Binding Request. Because of the properties of
symmetric NAT, that binding can be used by the agent on the public
side of the symmetric NAT to send packets back to the agent behind the
symmetric NAT.
</t>

<t>
To do this, ICE agents dynamically learn new candidates by examining
the source IP addresses and MAPPED-ADDRESS attributes in STUN Binding
Requests and Responses respectively. If they don't match any existing
candidates, a new candidate is added. This candidate corresponds to
the new IP address and port created by the symmetric NAT, and is a new
point of contact for the agent behind the symmetric NAT. Since that
candidate is only reachable from the very specific IP address and port
where the STUN request was sent to, the new candidate is paired up
with that transport address on the other agent. Since all candidates
need to have properties, such as tids, priorities and candidate IDs,
these are all computed algorithmically, so that they can be determined
by both agents just from the STUN message. 
</t>

<t>The specific procedures on receipt of a Binding Request and
Response for accomplishing this are described here. 
</t>

<section title="On Receipt of a Binding Request">

<t> When a STUN Binding Request is received which generates a success
response, the source IP address and port of that request are compared
to all existing remote transport addresses. If there is no match, the
agent creates a new remote candidate, and adds a transport address to
it. It sets the IP address and port of this new remote transport
address to the IP address and port that was present in the incoming
Binding Request. Since this is a new candidate transport address, it
requires a new tid. The agent creates one algorithmically, by
concatenating the tid of the remote transport address in the matching
transport address pair (recall that the matching transport address
pair is the one whose transport address pair ID matched the username
of the incoming Binding Request) with the string representation of the
source IP address and port from the incoming Binding Request. This
string representation is defined using the grammar for "hostport" from
RFC 3261 <xref target="RFC3261"/>, which defines the familiar notation
of the IP address and port separated by a colon.  </t>
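
<t>
A minimal sketch of this tid construction follows. The function names
are hypothetical, and the tid "Y" and the addresses are placeholders.
The sketch assumes that an IPv6 address is written in square brackets,
as the RFC 3261 "hostport" grammar requires through its IPv6reference
rule.
</t>

<figure><artwork>
<![CDATA[
```python
import ipaddress


def hostport(ip: str, port: int) -> str:
    """Render an address and port as host:port, bracketing IPv6 hosts."""
    addr = ipaddress.ip_address(ip)
    host = f"[{addr}]" if addr.version == 6 else str(addr)
    return f"{host}:{port}"


def peer_derived_tid(remote_tid: str, src_ip: str, src_port: int) -> str:
    """Concatenate the matched remote tid with the observed source address."""
    return remote_tid + hostport(src_ip, src_port)


print(peer_derived_tid("Y", "192.0.2.10", 40000))
```
]]></artwork></figure>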

<t>
The priority of the new candidate MUST be set to the priority of the
remote candidate in the matching transport address pair. There is no
need to compute the candidate ID for this new candidate. 
</t>

<t>
Though this is a valid transport address, the agent does not pair it
up with each of its own transport addresses. Rather, it pairs it up
only with the native transport address from the matching transport
address pair. This creates a new transport address pair. Since
connectivity has been verified in the receive direction, the agent
sets its state to receive-valid. As with all other transport address
pairs, the agent will attempt to validate send capabilities by sending
a STUN Binding Request according to the procedures in <xref
target="sec:send-valid"/>.
</t>

<t> It is important to note that this process creates a new remote
transport address, not a whole new remote candidate. For a whole
remote candidate to come into existence, all of its component
transport addresses must come into existence, and all must have been
obtained as a result of STUN Binding Requests between transport
address pairs in the same pairing. As an example, consider the pairing
in <xref target="fig:pairing"/>. If the peer is behind a symmetric
NAT, the Binding Request sent from C to A might produce a new remote
transport address for RTP. To create a full candidate, a STUN Binding
Request from D to B has to also create a new remote transport address,
to be used for RTCP. If this were to happen, the resulting set of
relationships is shown in <xref target="fig:pairing2"/>. To simplify
the diagram, associated local transport address relationships have
been omitted. Notice how the tids of the new remote candidate have
been constructed by concatenating the tids of the original remote
candidate with the newly discovered transport addresses, here, {R,S}.
</t>

<figure anchor="fig:pairing2"><artwork>
<![CDATA[
        .............                              .............          
        .           .                              .           .          
        .    --     .                              .    --     .          
        .   | A|---------------------------------------| C|    .          
        .    -- -----------+  Transport            .    --     .          
        . Transport .      |   Address             . Transport .          
        .  Address  .      |    Pair               .  Address  .          
        .  tid=W    .      |    ID=WY              .   tid=Y   .          
        .           .      |                       .           .          
        .           .      |                       .           .          
        .           .      |                       .           .          
        .    --     .      |                       .    --     .          
        .   | B|-----------C---------------------------| D|    .          
        .    -- ---------+ |  Transport            .    --     .          
        . Transport .    | |   Address             . Transport .          
        .  Address  .    | |    Pair               .  Address  .          
        .   tid=X   .    | |    ID=XZ              .   tid=Z   .          
        .           .    | |                       .           .          
        .............    | |                       .............          
                         | |                         remote               
            native       | |                         candidate            
            candidate    | |                                              
                         | |                       .............          
                         | |                       .           .          
                         | |                       .    --     .          
                         | +---------------------------| R|    .          
                         |     Transport           .    --     .          
                         |      Address            . Transport .          
                         |       Pair              .  Address  .          
                         |       ID=WYR            .   tid=YR  .          
                         |                         .           .          
                         |                         .           .          
                         |                         .           .          
                         |                         .    --     .          
                         +-----------------------------| S|    .          
                               Transport           .    --     .          
                                Address            . Transport .          
                                 Pair              .  Address  .          
                                 ID=XZS            .   tid=ZS  .          
                                                   .           .          
                                                   .............          
                                                    peer-derived          
                                                    remote candidate      
]]></artwork></figure>

</section>

<section title="On Receipt of a Binding Response">

<t>
When an agent receives a successful Binding Response, it examines the
MAPPED-ADDRESS attribute in that response. If the MAPPED-ADDRESS does
not match any of the existing candidate transport addresses, it
represents a new peer-derived transport address. 
</t>

<t>
The agent creates a new local candidate, and adds a transport address to
it. It sets the IP address and port of this new native transport
address to the IP address and port that was present in the MAPPED-ADDRESS
attribute of the Binding Response. Since this is a new candidate
transport address, it 
requires a new tid. The agent creates one algorithmically, by
concatenating the tid of the native transport address in the transport
address pair that was being validated by the Binding Request with the string representation of the
source IP address and port from the MAPPED-ADDRESS attribute. This
string representation is defined using the grammar for "hostport" from
RFC 3261 <xref target="RFC3261"/>, which defines the familiar notation
of the IP address and port separated by a colon.  </t>
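
<t>
As an informal illustration, and not as part of this specification,
the tid construction above might be sketched in Python as follows. The
function name is hypothetical, and the sketch assumes that the two
parts are concatenated with no separator and that IPv6 literals are
bracketed as the RFC 3261 "hostport" grammar requires.
</t>

<figure><artwork><![CDATA[
```python
def peer_derived_tid(native_tid, ip, port):
    """Concatenate the native tid with the hostport string."""
    # RFC 3261 hostport writes IPv6 literals in square brackets.
    host = "[%s]" % ip if ":" in ip else ip
    # Assumption: plain concatenation, no separator character.
    return "%s%s:%d" % (native_tid, host, port)


print(peer_derived_tid("X", "192.0.2.1", 5000))
```
]]></artwork></figure>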

<t>
The priority of the new candidate MUST be set to the priority of the
native candidate that was being validated by the Binding Request. The
agent SHOULD assign a new candidate ID to this candidate.
</t>

<t>
Though this is a valid transport address, the agent does not pair it
up with each of the remote transport addresses. Rather, it pairs it up
only with the remote transport address from the transport
address pair that was being validated. This creates a new transport address pair. Since
connectivity has been verified in the send direction, the agent
sets its state to send-valid. As with all other transport address
pairs, the agent will attempt to validate receive capabilities by
waiting for a STUN Binding Request according to the procedures in <xref
target="sec:receive-valid"/>.
</t>

<t> It is important to note that this process creates a new native
transport address, not a whole new candidate. For a whole
native candidate to come into existence, all of its component
transport addresses must come into existence, and all must have been
obtained as a result of STUN Binding Requests between transport
address pairs in the same pairing. 
</t>

</section>

</section>

<section title="TCP Connectivity Checks">

<section title="Connection Establishment">

<t>
Because of the connection-oriented nature of TCP, the connectivity
checks work differently. After the offer/answer exchange completes,
each agent will have a set of TCP candidates on which it is waiting to
receive a connection, and it will have a similar set from its
peer. Thus, a pairing of TCP candidates allows for the possibility of
TCP connections in each direction. Unlike the UDP checks, where the
STUN packets are sent from the native transport addresses to the
remote ones, the TCP connections are not opened from the native TCP
transport addresses to the remote ones. This would represent a
simultaneous open, and represents an unusual condition that would
either fail, or at best result in a single TCP connection. Rather, ICE
desires to attempt two connections, one in each direction, and use one
of them if both happen to succeed.
</t>

<t> To accomplish this, each agent will attempt to open a connection
to each remote transport address in the transport address pair, and do
so "from" its native transport address. Here, however, "from" means
something different than the UDP case. If the native transport address
is a local transport address, the agent opens the TCP connection from
the same IP interface used to obtain the local transport address, but
from a different and ephemeral port. Indeed, that port
MUST NOT be the same as the port in the local transport address. If
the native transport address is a TURN-derived TCP transport address,
no attempt is made to open a connection at all. TURN-derived TCP
transport addresses can only be used in passive mode. 
</t>

<t>
As such, for each TCP transport address pair, there will be either
zero, one, or two connection attempts. If both transport addresses in
the pair are TURN-derived, there will be zero (both sides passive). If one
of the transport addresses is local, and the other TURN-derived, there
will be one connection attempt. The agent owning the local transport
address will be in active mode, and the agent owning the TURN-derived
one will be in passive mode. If both are local transport addresses,
there will be two attempts, and each agent will act in active mode.
</t>
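
<t>
The three cases above can be summarized by a short, non-normative
Python sketch. The constants and the function name are hypothetical,
and the sketch assumes that each TCP transport address is either local
or TURN-derived, the only two kinds discussed here.
</t>

<figure><artwork><![CDATA[
```python
LOCAL = "local"
TURN = "turn-derived"


def active_sides(native_kind, remote_kind):
    """Return which agents open a connection for one TCP pair."""
    sides = []
    if native_kind == LOCAL:
        sides.append("native")   # this agent acts in active mode
    if remote_kind == LOCAL:
        sides.append("remote")   # the peer acts in active mode
    return sides                 # zero, one or two attempts


print(len(active_sides(TURN, TURN)),
      len(active_sides(LOCAL, TURN)),
      len(active_sides(LOCAL, LOCAL)))
```
]]></artwork></figure>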

<t>
Because a transport address pair can produce multiple connections,
validity becomes a property of the TCP connection itself. A transport
address pair is considered valid if at least one valid connection has
been established within it. An entire pairing is valid if all
transport address pairs are valid.
</t>
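
<t>
As a rough illustration of these two validity rules, the following
Python sketch represents each transport address pair as a list of
connection states. The representation, the state names and the
function names are assumptions made for the example only.
</t>

<figure><artwork><![CDATA[
```python
def pair_is_valid(connection_states):
    """A pair is valid if at least one connection is valid."""
    return any(s == "valid" for s in connection_states)


def pairing_is_valid(pairs):
    """A pairing is valid if every pair in it is valid."""
    # Assumption: a pairing always has at least one pair.
    return bool(pairs) and all(pair_is_valid(p) for p in pairs)


rtp_pair = ["invalid", "valid"]   # two attempts, one succeeded
rtcp_pair = ["valid"]             # a single attempt succeeded
print(pairing_is_valid([rtp_pair, rtcp_pair]))
```
]]></artwork></figure>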

</section>

<section title="Sending STUN Binding Requests">

<t>Once the connection is established, the agent which opened the
connection (that is, acted in active mode) sends a STUN Binding Request over
that connection. STUN Binding Requests as described in RFC 3489 are
normally sent over UDP, but when used in conjunction with ICE for
TCP connectivity checks, they are sent over TCP.  
</t>

<t>
This unusual operation requires some explanation. At first glance, a
successful TCP connection ought to be sufficient. Clearly,
connectivity is established, as TCP packets were exchanged in both
directions via the TCP handshake. While that is true, the STUN Binding
Requests serve many purposes, only one of which is to literally test
connectivity. The STUN requests also serve as a correlation vehicle,
allowing the agent to match the source of a connection attempt with
the offer/answer signaling driving the entire mechanism. For example,
in the case of a forked SIP INVITE carrying an offer, the UAC may
receive two connection attempts to each of its passive TCP addresses,
one from each branch of the fork. These are readily disambiguated by
the STUN Binding Request which will follow, as the tid in the USERNAME
tells the UAC which branch has initiated the connection. 
</t>

<t>
More importantly, however, the STUN Binding Request is an essential
part of the security properties of ICE. Without it, an entity
eavesdropping the signaling messages would be able to deny service or
hijack media connections, and such attacks would require encryption of
the offer/answer exchanges (using a mechanism like SIPS <xref
target="RFC3261"/>) to prevent. However, when a STUN Binding Request
exchange is added, these attacks are completely foiled without the
need for SIPS, raising the overall security of ICE substantially with
minimal cost. These properties of ICE are discussed thoroughly in
<xref target="sec:security"/>. 
</t>

<t>
As such, once an agent has actively opened a TCP connection to the
remote agent, it sends a STUN Binding Request over that
connection. Recall that STUN messages include length indicators,
allowing them to be framed over a connection-oriented transport
protocol. The Binding Request MUST contain the USERNAME
attribute. This attribute MUST be set to the transport address pair ID
of the corresponding transport address pair as seen by its peer. Thus,
for the first transport address pair in <xref target="fig:pairing"/>, if the
agent on the left sends the STUN Binding Request, the USERNAME will
have the value YW. The request MAY contain the MESSAGE-INTEGRITY
attribute, computed according to RFC 3489 procedures. The
Binding Request MUST NOT contain the
CHANGE-REQUEST or ANSWER-ADDRESS attribute. The STUN Binding Request
message SHOULD NOT be retransmitted over the connection. </t>

<t>
The STUN transaction will generate either a timeout, or
a response. If the response is a 420, 500, or 401, the agent should
try again as described in RFC 3489. Either initially, or after such a
retry, the STUN transaction might produce a non-recoverable failure
response (error codes 400, 431, or 600) or a failure result
inapplicable to this usage of STUN and thus unrecoverable (432,
433). If this happens the connection is considered invalid.  If the
STUN transaction produces a 
430 error or times out, the client SHOULD retry with a new STUN
Binding Request transaction. The 430 response code is a result of a failed race
between the Binding Request and the answer. This is remedied by
retrying, which allows the "slower" answer to be received. These retry
transactions carry the same USERNAME value as the original Binding
Request, and differ only in their STUN transaction ID. If these
retries have not produced a success response after Tg seconds, the
connection is considered invalid. Tg SHOULD be
configurable. It is RECOMMENDED that it default to 50 seconds. This is
a reasonable approximation of the maximum SIP transaction
duration. </t>
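
<t>
The handling of transaction results described above might be sketched
as follows. This is a non-normative Python illustration; the function
name, the string labels it returns, and the treatment of response
codes not mentioned in the text (considered failures here) are
assumptions.
</t>

<figure><artwork><![CDATA[
```python
RETRY_PER_RFC3489 = {401, 420, 500}
UNRECOVERABLE = {400, 431, 600, 432, 433}
TG_SECONDS = 50.0  # RECOMMENDED default for Tg


def next_step(result, elapsed, tg=TG_SECONDS):
    """Decide what to do after one STUN transaction ends.

    result is "success", "timeout" or an integer error code;
    elapsed is the time in seconds since the retries began.
    """
    if result == "success":
        return "valid"
    if result in UNRECOVERABLE:
        return "invalid"
    if result in RETRY_PER_RFC3489:
        return "retry-per-rfc3489"
    if result == 430 or result == "timeout":
        # Retry with a new transaction ID until Tg has elapsed.
        if elapsed < tg:
            return "retry-new-transaction"
        return "invalid"
    return "invalid"  # assumption: unknown codes are failures


print(next_step(430, 2.5))
```
]]></artwork></figure>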

<t>
If the STUN Binding Request generates a successful response, the
connection over which it was sent is considered valid. Furthermore,
the agent stores the IP address and port from the MAPPED-ADDRESS
attribute in the STUN Binding Response. This is called the "apparent"
native transport address for the active side of the connection. It
will be used later if this connection is used for media transport. 
</t> 

<t> Once a connection is valid, the agent which initiated the
connection MUST generate a new STUN Binding Request transaction every
Tr seconds. This transaction ensures that NAT bindings for the
connection remain open while the connection is under consideration as
a candidate. Tr SHOULD be configurable, and SHOULD default to 15
seconds. Each new Binding Request transaction is processed according
to the procedures in this section. It is possible for a previously
valid candidate to later be invalidated by a subsequent STUN
transaction. This happens in cases where the NAT bindings expire. Note
that, unlike the UDP case, STUN is sent only while a connection is
not active for media. If the connection is used as the active
connection for media, STUN MUST NOT be sent.
</t>


</section>

<section title="Receiving STUN Requests">

<t>
When an agent acted as the passive side of a TCP connection, it will
receive a STUN Binding Request over that connection. 
</t>

<t> One of the candidates will be in use as the active candidate. For
the transport addresses comprising that candidate, the agent will
receive both STUN requests and media packets on its associated local
transport addresses. The agent MUST be able to disambiguate them. In
the case of RTP/RTCP, this disambiguation is easy. RTP and RTCP
packets start with the bits 0b10 (v=2). The first two bits in STUN are
always 0b00. This disambiguation also works for packets sent using
Secure RTP <xref target="RFC3711"/>, since the RTP header is in the
clear. Disambiguating STUN with other media stream protocols may be
more complicated. However, it is always possible with arbitrarily
high probability by selecting an appropriately random username (see
below).  </t>
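
<t>
The two-bit test described above can be sketched in Python as shown
below. The sketch is illustrative only; the function name and the
"unknown" result for other bit patterns or empty input are
assumptions.
</t>

<figure><artwork><![CDATA[
```python
def classify(packet):
    """Classify a received packet by its first two bits."""
    if not packet:
        return "unknown"
    top_two_bits = packet[0] >> 6
    if top_two_bits == 0b10:
        return "rtp"    # RTP or RTCP, version 2
    if top_two_bits == 0b00:
        return "stun"
    return "unknown"    # some other media stream protocol


print(classify(bytes([0x80, 0x00])), classify(bytes([0x00, 0x01])))
```
]]></artwork></figure>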

<t>
The STUN Binding Request can only be usefully processed once an
offer/answer exchange has completed. As a result, if an offerer
receives a STUN Binding Request message prior to the receipt of an
answer to its offer, it MUST reject the request with a 430 response. 
This will cause the answerer to retry, and give time for the answer
(which is in transit) to arrive at the offerer.
</t>

<!-- need to be more specific on rfc3489 since thats udp-based -->

<t> If the offer/answer exchange has completed, the agent MUST follow
the procedures defined in RFC 3489 and verify that the USERNAME
attribute is known to the server. Here, this is done by taking the
USERNAME attribute, and comparing it against the transport address
pair identifiers for each transport address pair as seen by that
agent. If there is no match, the STUN Binding Request generates a
400. If there is a match, the resulting transport address pair is
called the matching transport address pair. The user agent proceeds
with the processing of the request and generation of a response as per
RFC 3489. In addition, the agent stores the source IP address and port
of the Binding Request, and associates it with the connection. This
address is called the "apparent" remote transport address for this
connection. </t>
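
<t>
A minimal sketch of this receive-side processing is given below. It is
not normative; the function name, the tuple it returns and the
dictionary keys are hypothetical, and the remaining RFC 3489
processing (for example MESSAGE-INTEGRITY verification) is omitted.
</t>

<figure><artwork><![CDATA[
```python
def check_binding_request(username, pair_ids, source,
                          exchange_done):
    """Return (error_code, match); None as code means success."""
    if not exchange_done:
        return 430, None   # answer to our offer not yet received
    if username not in pair_ids:
        return 400, None   # USERNAME matches no known pair ID
    # Remember the "apparent" remote transport address.
    match = {"pair_id": username, "apparent_remote": source}
    return None, match


ids = {"XZ", "WYR"}
print(check_binding_request("XZ", ids, ("192.0.2.7", 4000), True))
```
]]></artwork></figure>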

<t>
An agent will continue to receive periodic STUN transactions as long
as it has listed its transport address in an a=candidate attribute. It
MUST process those transactions according to this section. It is
possible that a transport address pair that was previously valid may
become invalidated as a result of a subsequent failed STUN transaction.
</t>

<t>
Note that, unlike the UDP case, there will never be simultaneous
transmission of media and STUN packets over TCP connections. This is
because the connection is listed as on hold according to comedia
procedures, and no media will be transmitted. ICE will establish the
connections as described here. Once established, an updated
offer/answer exchange can promote those connections to active usage
through the comedia "exist" mechanism, as described below. The
additional offer/answer exchange provides a barrier synchronization
point at which a TCP connection switches from ICE control to control
by the media source and sinks. Once it is active, STUN packets will no
longer be sent on the connection.
</t>

</section>

<!-- end TCP processing -->
</section>

<!-- end connectivity checks -->
</section>

<section title="Promoting a Valid Candidate to Active">


<section title="Minimum Requirements">

<t>
As the STUN connectivity checks run, they will result in the
validation of pairings. Once validated, a pairing can be used by
promoting it to active. This promotion occurs by placing the transport
addresses for the native candidate of the pairing into the m/c
line and sending an updated offer. An agent MAY promote a candidate
associated with any validated pairing at any time, as long as the
candidate had been provided in a series of a=candidate attributes in the
most recent offer (in other words, an agent can't validate a
candidate, omit that candidate from the a=candidate attribute of an
offer, and then later on, generate a new offer that promotes the
candidate to active). The procedures for doing so are described here. 
</t>

<t> If the pairing to be promoted is UDP, the native candidate is
encoded into an updated offer as described in <xref
target="sec:enc-candidate"/>. The transport addresses constituting the
candidate SHOULD also be listed in a=candidate attributes, so that
STUN can be used as an ongoing keepalive. 
</t>

<t>If the pairing to be
promoted is TCP, it is more complicated. Recall that a TCP transport
address pairing can produce a connection in each direction. Thus, when
a connection is promoted, it must be done so that it is clear which
connection is to be used.
</t>

<t>If the agent was the active one and established the connection, it
includes its apparent native transport address in the m/c line of the
SDP (recall that this address was discovered via the STUN exchange
over the connection). Note that this is instead of the SHOULD-strength
recommendation in comedia, which recommends that the port number sent
by the entity which initiated the connection should be '9'. The actual
port number is present to facilitate identification of the
connection. The a=setup attribute MUST be present and MUST contain the
value "active". The a=connection attribute MUST be present and MUST
have the value of "existing".
</t>

<t>
If the agent was the passive one and was the recipient of the
connection, it includes its transport address in the m/c line of the
SDP. In this case, that address will be the same as the one it had
placed into the a=candidate line of the SDP. The a=setup attribute
MUST be present and MUST contain the value of "passive". The
a=connection attribute MUST be present and MUST have the value of
"existing". 
</t>

<t>Any candidates which the
agent would like to retain as valid candidates are also included in
a=candidate lines in the offer. It SHOULD include any candidates
learned from the peer-to-peer discovery processing of <xref
target="sec:learning"/>, and SHOULD include any candidates of higher
priority than the one just promoted to active. It SHOULD omit
candidates of lower priority than the one being promoted to active. It
SHOULD omit any for whom all pairings that include that candidate have
become invalid. 
</t>
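
<t>
One possible reading of these recommendations is sketched below in
Python. The record type and its field names are hypothetical. The
sketch assumes that a larger number means a higher priority, that the
promoted candidate itself remains listed (as recommended above for
UDP), and that the rule about invalid pairings takes precedence over
the others; the text does not state an order of precedence.
</t>

<figure><artwork><![CDATA[
```python
from collections import namedtuple

# Hypothetical record; the field names are illustrative only.
Candidate = namedtuple(
    "Candidate", "cid priority peer_derived all_invalid")


def candidates_for_offer(candidates, promoted):
    """Return the ids to list in a=candidate attributes."""
    keep = []
    for c in candidates:
        if c.all_invalid:
            continue  # every pairing using it has become invalid
        if (c.cid == promoted.cid or c.peer_derived
                or c.priority > promoted.priority):
            keep.append(c.cid)
    return keep


host = Candidate("host", 1.0, False, True)
stun = Candidate("stun", 0.7, False, False)
turn = Candidate("turn", 0.3, False, False)
peer = Candidate("peer", 0.3, True, False)
print(candidates_for_offer([host, stun, turn, peer], stun))
```
]]></artwork></figure>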

<t>
Once it has decided on the set of candidates to provide in the updated
offer, the agent constructs the offer and follows the procedures in
<xref target="sec:subsequent"/> which defines subsequent offer/answer
processing.
</t>

</section>

<section anchor="sec:suggest" title="Suggested Algorithm">

<t> ICE leaves substantial variability to implementors around when an
agent decides to generate a new offer. However, there are good ways to
do this, and bad ways. Perhaps the worst algorithm possible would be
to generate a new offer every time a candidate with higher priority
than the active one becomes valid. This algorithm will likely result
in a large number of offer/answer exchanges in rapid succession, many
of which will produce "glare" as each agent will independently
initiate an exchange. This will consume CPU and network resources for
little benefit. Rather, the ideal algorithm strikes a balance between
usage of network resources and the desire to use the ideal pair of
candidates.  </t>

<t>
The following algorithm provides a good tradeoff, and usage of this
algorithm is RECOMMENDED. The algorithm results in a bounded number of
additional offer/answer exchanges after the initial one - never more
than two, and frequently one or zero. The algorithm almost never
produces a glare condition.
</t>

<t>
Once the initial offer/answer exchange completes, media flow will
happen, though not optimally (where optimal is defined by the policies
used to set the priorities of the candidates), as long as the
candidate that is active has been validated. Thus, the objective of
the algorithm is to quickly make sure that there is a valid path for
media (to avoid clipping), and then do a single offer/answer exchange
to use the highest priority pairing that was validated. 
</t>

<t>
After the initial offer/answer exchange, each agent sets a timer
Tu. This timer SHOULD have a configurable baseline value, which SHOULD
default to 3 seconds. The actual timer is set to this baseline, plus a
time value chosen uniformly between -1 and 1 seconds. This causes the
actual timer to be randomized so that the timer does not fire
simultaneously at each agent. In addition, each agent monitors the status
of the active pairing. If the active media stream is UDP-based, the
status of the active candidates is equal to the status of the pairing
with matching transport addresses. In the case of TCP-based media, the
active media stream is never active initially, since it always begins
with the "holdconn" state. 
</t>
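
<t>
The randomized value of Tu can be computed as in the following
non-normative Python sketch. The function name and the injectable
random source are assumptions made so that the example is easy to
test.
</t>

<figure><artwork><![CDATA[
```python
import random


def tu_timer(baseline=3.0, rng=random):
    """Return Tu in seconds: baseline plus uniform(-1, 1)."""
    return baseline + rng.uniform(-1.0, 1.0)


print(2.0 <= tu_timer() <= 4.0)
```
]]></artwork></figure>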

<t> If, when Tu fires, the active pairing has not been validated, and
there exists at least one pairing that has been validated, the agent
generates a new offer. This offer promotes its highest priority
candidate with a validated pairing to the active candidate. If there
are no pairings that have been validated when the timer fires, the
agent waits until one is validated, and once that happens, sets a
timer to fire randomly between 0 and 2 seconds. When the timer fires,
a new offer is generated that promotes the candidate from this
validated pairing to active. If the active pairing is validated when
the timer fires, the agent does nothing at this time.  </t>
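
<t>
The decision taken when Tu fires might be sketched as follows. This
Python fragment is illustrative only; the function names and returned
labels are hypothetical, and it assumes that priorities are numbers
for which a larger value means a higher priority.
</t>

<figure><artwork><![CDATA[
```python
import random


def on_tu_fired(active_validated, validated):
    """validated maps candidate id to priority (larger wins)."""
    if active_validated:
        return ("nothing", None)
    if validated:
        best = max(validated, key=validated.get)
        return ("offer", best)   # promote best validated candidate
    return ("wait", None)        # wait for the first validation


def late_offer_delay(rng=random):
    """Delay in seconds used once the first pairing validates."""
    return rng.uniform(0.0, 2.0)


print(on_tu_fired(False, {"a": 0.5, "b": 0.9}))
```
]]></artwork></figure>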

<t>
If a new offer is to be sent, the agent includes the new active
candidate in the a=candidate attribute list. It also includes all
candidates with higher priority than the one that is active, including
ones it learned from the connectivity checks themselves. 
</t>

<t>
At this point, media is flowing successfully, since a valid candidate
is active. However, it may not be optimal. So, the next stage of the
algorithm is to let the connectivity checks continue. If those checks
indicate that a pairing between the two highest priority candidates
from both agents has been validated, each agent sets a timer whose
value is randomly set between 0 and 2 seconds. When the timer fires,
a new offer is generated that promotes the candidate from this
validated pairing to active. Otherwise, when the connectivity checks
have all concluded, such that no pairing exists in the invalid state, each agent sets a timer whose
value is randomly set between 0 and 2 seconds. When the timer fires,
a new offer is generated that promotes the candidate from the valid pairing
with the highest priority to active.  </t>

</section>

</section>


<section title="Subsequent Offer/Answer Exchanges">

<t>
An offer/answer exchange within a session can occur at any time,
whether it is the result of the algorithm described in <xref
target="sec:suggest"/>, or because one of the agents wishes to add or
remove a media stream, or add a codec, and so on. 
</t>

<section title="Sending of an Offer">

<t>
The a=candidate attributes within a subsequent offer have
the same meaning as they do in an initial offer. They are a request for
the peer to attempt (or continue to attempt if the candidate was
provided previously) a connectivity check using STUN from each of its
own candidates. As such, an a=candidate attribute is included
in subsequent offers when (1) connectivity checks haven't concluded
yet to that candidate, or (2) the checks have concluded, and the
candidate is currently active. In that case, STUN is used to keep the
bindings active.
</t>

<t>
If an agent sends an offer which omits candidates it had sent to its
peer previously, it MUST cease connectivity checks from that
candidate. Any pairings that include the absent native candidate are
discarded. Any STUN transactions in progress from that candidate are
immediately terminated - no further retransmissions take place, and no
further transactions from that candidate will be made. If a TCP
connection was opened to or from that candidate, and that connection is not
listed as the active one in the offer, the connection is torn down.
</t>

<t>
The offer MAY contain a new active candidate in the m/c line. 
</t>


</section>

<section title="Receiving the Offer and Sending an Answer">

<t>
If an agent receives an updated offer with a=candidate attributes,
it checks to see if it already knows about the listed candidates. This is
done by comparing the tid with the candidates it had received in the
previous offer or answer from the peer. If the tid is already known,
processing for that candidate continues as if no offer had been
made. Any connectivity checks in progress continue, and any ongoing
STUN keepalives continue.
</t>

<t>
If a candidate which had been listed previously is no longer present
in the offer, this tells the answerer to cease connectivity
checks. Any pairings that include the absent remote candidate are
discarded. Any STUN transactions in progress to that candidate are
immediately terminated - no further retransmissions take place, and no
further transactions to that candidate will be made. If a TCP
connection was opened to or from that candidate, and that connection is not
listed as the active one in the offer, the connection is torn down.
</t>

<t>
The agent then sends its answer. Like the offerer, it can add or
remove candidates from its answer. If it removed candidates from its
answer, it ceases STUN connectivity checks from those candidates, and
any pairings that include those candidates are discarded. Any STUN
transactions in progress from those candidates are 
immediately terminated - no further retransmissions take place, and no
further transactions from those candidates will be made. If a TCP
connection was opened to or from one of those candidates, and that connection is not
listed as the active one in the answer, the connection is torn down.
</t>

<t>
After transmission of the answer, there may be a set of candidates
which were new in the offer, and a set that were new in the
answer. The agent begins
connectivity checks as described in <xref target="sec:stun-send"/>,
pairing each new candidate in its answer with all candidates in the
offer, and each new candidate in the offer with all of its candidates
in the answer. 
</t>
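
<t>
The pairing rule above, as seen by the answerer, can be illustrated
with the following Python sketch. The function and parameter names are
hypothetical, and candidates are represented simply by their
identifiers.
</t>

<figure><artwork><![CDATA[
```python
def new_pairings(native_all, native_new, remote_all, remote_new):
    """Return the (native, remote) pairings that must be added."""
    pairs = set()
    for n in native_new:       # new in our answer
        for r in remote_all:   # all candidates in the offer
            pairs.add((n, r))
    for r in remote_new:       # new in the offer
        for n in native_all:   # all of our candidates
            pairs.add((n, r))
    return sorted(pairs)


print(new_pairings(["a", "b"], ["b"], ["x", "y"], ["y"]))
```
]]></artwork></figure>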

<t>
The m/c line may have also changed, indicating a new active
candidate. If the m/c line contains a UDP stream, the agent begins
sending media to the transport addresses listed there. In addition, it
checks to see if those transport addresses correspond to a remote
candidate in a valid pairing. So long as the remote agent has offered
up a candidate that has been validated by ICE, it should be the
case. Indeed, there may be a multitude of valid pairings containing
the transport addresses in the m/c line as the remote candidate. In
that case, the agent MUST choose the pairing whose native candidate
has the highest priority. It MUST place this candidate in the m/c
line. Transmission of media occurs as defined in <xref
target="sec:send-media"/>.
</t>
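
<t>
The selection among several matching valid pairings might look like
the following non-normative Python sketch. The dictionary keys and
the function name are hypothetical, and a larger number is assumed to
mean a higher priority.
</t>

<figure><artwork><![CDATA[
```python
def choose_pairing(valid_pairings, remote_in_mc):
    """Pick the valid pairing for the remote m/c candidate."""
    matches = [p for p in valid_pairings
               if p["remote"] == remote_in_mc]
    if not matches:
        return None  # the remote candidate was never validated
    # Highest priority native candidate wins.
    return max(matches, key=lambda p: p["native_priority"])


pairings = [
    {"native": "host", "native_priority": 1.0, "remote": "R1"},
    {"native": "turn", "native_priority": 0.3, "remote": "R1"},
    {"native": "host", "native_priority": 1.0, "remote": "R2"},
]
print(choose_pairing(pairings, "R1")["native"])
```
]]></artwork></figure>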

<t>
If the m/c line has changed, and now indicates a new TCP candidate,
the agent examines it. The comedia "a=connection" attribute will
normally be present and normally contain the value of "existing". If
not present, or if present but with a value of "new", comedia process
is followed, as apparently the peer has abandoned ICE operation for
this media stream. Assuming it contains a value of "existing", the
agent looks at whether the a=setup attribute is present. If its value
is "active", it means that a connection that was initiated by the
remote agent is to be used. The agent examines the transport address
in the m/c line. It looks for a matching value in the apparent remote
transport addresses of existing connections. If it matches multiple
connections (though it should normally match just one), one of those
connections is chosen. The native transport address of that connection
is then placed into the m/c line of the answer. If no existing
connections were matched, an error has occurred. The agent SHOULD
respond with "holdconn", and then generate its own offer with a
connection to the peer which it believes is valid.
</t>

<t>
If the a=setup attribute had a value of "passive", it means that a
connection that was initiated by the agent itself is to be used. The
agent examines the transport address in the m/c line. It looks for a
matching value amongst the remote transport addresses in valid
pairings. If multiple pairings match, it MUST choose the one whose
native transport address has the highest priority. The apparent native
transport address associated with an active connection initiated by
the agent is then placed into the m/c line, and that TCP connection is
used to send and receive media. If no pairings match, an error has
occurred. The agent SHOULD
respond with "holdconn", and then generate its own offer with a
connection to the peer which it believes is valid.
</t>

</section>

<section title="Receiving the Answer">

<t>
If an agent receives an answer with a=candidate attributes,
it checks to see if it already knows about the listed candidates. This is
done by comparing the tid with the candidates it had received in the
previous offer or answer from the peer. If the tid is already known,
processing for that candidate continues as if no new offer/answer
exchange had taken place. Any connectivity checks in progress continue, and any ongoing
STUN keepalives continue.
</t>

<t>
If a candidate which had been listed previously is no longer present
in the answer, this tells the offerer to cease connectivity
checks. Any pairings that include the absent remote candidate are
discarded. Any STUN transactions in progress to that candidate are
immediately terminated - no further retransmissions take place, and no
further transactions to that candidate will be made. If a TCP
connection was opened to or from that candidate, and that connection is not
listed as the active one in the answer, the connection is torn down.
</t>

<t>
Furthermore, there may be a set of candidates
which were new in the offer, and a set that were new in the
answer. The agent begins
connectivity checks as described in <xref target="sec:stun-send"/>,
pairing each new candidate in its offer with all candidates in the
answer, and each new candidate in the answer with all of its candidates
in the offer. 
</t>

<t>
The m/c line may have also changed, indicating a new active
candidate. If the m/c line contains a UDP stream, the agent begins
sending media to the transport addresses listed there as defined in <xref
target="sec:send-media"/>. 
</t>

<!-- fix this -->

<t>
If the m/c line has changed, and now indicates a new TCP candidate,
the agent examines it. The comedia "a=connection" attribute will
normally be present and normally contain the value of "existing". If
not present, or if present but with a value of "new", comedia process
is followed, as apparently the peer has abandoned ICE operation for
this media stream. Assuming it contains a value of "existing", the
agent looks at whether the a=setup attribute is present. If its value
is "active", it means that a connection that was initiated by the
remote agent is to be used. The agent examines the transport address
in the m/c line. It looks for a matching value in the apparent remote
transport addresses of existing connections. If it matches multiple
connections (though it should normally match just one), one of those
connections is chosen. The native transport address of that connection
is then placed into the m/c line of the answer. If no existing
connections were matched, an error has occurred. The agent SHOULD
respond with "holdconn", and then generate its own offer with a
connection to the peer which it believes is valid.
</t>

<t>
If the a=setup attribute had a value of "passive", it means that a
connection that was initiated by the agent itself is to be used. The
agent examines the transport address in the m/c line. It looks for a
matching value amongst the remote transport addresses in valid
pairings. If multiple pairings match, it MUST choose the one whose
native transport address has the highest priority. The apparent native
transport address associated with an active connection initiated by
the agent is then placed into the m/c line, and that TCP connection is
used to send and receive media. If no pairings match, an error has
occurred. The agent SHOULD
respond with "holdconn", and then generate its own offer with a
connection to the peer which it believes is valid.
</t>








<section anchor="sec:management" title="Management of Resources">

<t>
The beginning of a multimedia session results in the creation of
several resources to support ICE. These include gathered addresses,
both local and derived, along with the local STUN servers that run on
the local addresses. These resources must be maintained and eventually
freed.
</t>

<t>
It is RECOMMENDED that all gathered addresses be retained for the
duration of the session. Even if they are not used initially, this
allows them to be used later in the session should conditions change,
requiring a signaling operation to update the set of candidate
addresses. Maintaining these resources depends on the
type of resource. For a local transport address, nothing is
required. The socket is maintained until freed by the ICE
application. For STUN derived transport addresses, the bindings in the
NAT for that address need to be maintained. If the derived transport
address is used by the peer for media, the media itself serves to keep
the bindings alive (see <xref target="sec:keepalives"/>). An agent can
determine that a STUN derived transport address was used for media
when the RTP packet arrives at the associated local transport
address. For the other STUN derived transport addresses, the agent
SHOULD periodically generate STUN transactions to the STUN
server. Every 20 seconds is RECOMMENDED.
</t>
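
<t>
As a rough sketch of the bookkeeping this implies, the following
function lists the STUN derived transport addresses that are due for
a refresh. The Candidate record and its fields are assumptions made
for the example; only the 20 second period comes from the text.
</t>

<figure><artwork>
<![CDATA[
```python
from dataclasses import dataclass
from typing import List

KEEPALIVE_INTERVAL = 20.0  # seconds, the RECOMMENDED period


@dataclass
class Candidate:
    """Hypothetical bookkeeping record for one gathered address."""
    kind: str               # "local", "stun" or "turn"
    in_use_for_media: bool  # True once RTP has arrived for this address
    last_refresh: float     # time of the last STUN transaction (seconds)


def stun_refresh_due(cands: List[Candidate], now: float) -> List[Candidate]:
    """Return the STUN derived candidates that need a STUN transaction."""
    due = []
    for c in cands:
        if c.kind != "stun":
            continue  # local: nothing to do; TURN: its own keepalives
        if c.in_use_for_media:
            continue  # the media itself keeps the binding alive
        if now - c.last_refresh >= KEEPALIVE_INTERVAL:
            due.append(c)
    return due
```
]]></artwork></figure>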

<t>
For TURN derived transport addresses, the bindings in the NAT along
with the mappings in the TURN server need to be maintained. Media
traffic itself can accomplish that. The agent will know that its TURN
derived transport address is in use when an RTP packet arrives at the
associated local transport address. For other TURN derived transport
addresses, the TURN keepalive mechanisms SHOULD be used.
</t>

<t>Once the STUN servers are started on the local transport addresses,
they MUST run until a valid media packet is detected on that transport
address. Once a media packet is received, it signals that the peer has
completed its connectivity checks and has decided to use that
transport address (or the derived transport address, as the case may
be) for media communications. While the server is running, it MUST act
as a normal STUN server, but MUST only accept STUN requests from
agents that authenticate, as discussed below in <xref
target="sec:stun-recv"/>.</t>

</section>

<section anchor="sec:keepalives" title="Binding Keepalives">

<t>
Once the STUN connectivity checks complete, STUN packets are no longer
used. However, bindings in intermediate NATs need to be kept alive so
that the media can continue to flow. Doing so is the responsibility of
the media protocol.
</t>

<t>
In the case of RTP, the RTP packets themselves normally come
sufficiently quickly to keep the bindings alive. However, several
cases merit further discussion. Firstly, in some RTP usages, such as
SIP, the media streams can be "put on hold". This is accomplished by
using the SDP "sendonly" or "inactive" attributes, as defined in RFC
3264 <xref target="RFC3264"/>. RFC 3264 directs implementations to
cease transmission of media in these cases. However, doing so may
cause NAT bindings to time out, and media won't be able to come off
hold.
</t>

<t>
As such, agents SHOULD instead send a media packet periodically,
independent of whether the stream is "sendonly", "recvonly" or
"inactive". At least once every 20 seconds is RECOMMENDED. These
packets can be sent using any of the payload formats listed by the
peer in its SDP. For audio streams, it is RECOMMENDED that
implementations support the RTP payload format for comfort noise <xref
target="RFC3389"/>, which makes a good choice. For video codecs, a
minimally coded frame is a good choice. 
</t>
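
<t>
A minimal sketch of this behavior is shown below. It assumes that the
agent tracks when it last sent a media packet and that the peer's
payload formats are available as a list of encoding names; both
helper functions are hypothetical. "CN" is the encoding name used for
comfort noise.
</t>

<figure><artwork>
<![CDATA[
```python
from typing import Optional, Sequence

HOLD_KEEPALIVE_INTERVAL = 20.0  # seconds, the RECOMMENDED maximum gap


def media_keepalive_due(last_media_sent: float, now: float) -> bool:
    """True when a keepalive media packet should be sent.

    The stream direction (sendonly, recvonly, inactive) is deliberately
    not an input: the packet is sent independent of it.
    """
    return now - last_media_sent >= HOLD_KEEPALIVE_INTERVAL


def pick_keepalive_format(peer_formats: Sequence[str]) -> Optional[str]:
    """Choose a payload format listed by the peer for the keepalive.

    Comfort noise is preferred when the peer lists it; otherwise the
    first listed format is used.  None means the peer listed nothing.
    """
    if "CN" in peer_formats:
        return "CN"
    return peer_formats[0] if peer_formats else None
```
]]></artwork></figure>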

<t>
Secondly, some RTP payload formats, such as the payload format for
text conversation <xref target="I-D.ietf-avt-rfc2793bis"/>, may send
packets so infrequently that the interval exceeds the NAT binding
timeouts. In such cases, the implementation should send some kind
of content, if possible. If the payload type doesn't allow anything
meaningful to be sent, even a malformed RTP packet is superior to
nothing at all; the malformed packet would be rejected by the peer,
and have the side effect of keeping the NAT bindings open.
</t>

</section>

<section anchor="sec:send-media" title="Sending Media"/>

<t>
When an agent sends media packets, it MUST send them from the same IP
address and port it has advertised in the m/c-line. This provides a
property known as symmetry, which is an essential facet of NAT
traversal.
</t>

<t>
In the case of a STUN-derived transport address, this means that the
RTP packets are sent from the local transport address used to obtain
the STUN address. In the case of a TURN-derived transport address,
this means that media packets are sent through the TURN server (using
the TURN SEND primitive). For local transport addresses, media is sent
from that local transport address.
</t>
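
<t>
The three cases can be summarized in a few lines of code. The kind
labels and the returned method strings are illustrative only and do
not appear in the protocol.
</t>

<figure><artwork>
<![CDATA[
```python
from typing import Tuple

Addr = Tuple[str, int]  # (IP address, port)


def media_send_plan(kind: str, local: Addr) -> Tuple[Addr, str]:
    """Return (source address, method) for sending media symmetrically.

    kind is "local", "stun" or "turn" and names the type of transport
    address that was advertised; local is the local transport address
    it is associated with.
    """
    if kind in ("local", "stun"):
        # A STUN derived address is matched by sending from the local
        # transport address that was used to obtain it.
        return local, "direct"
    if kind == "turn":
        # Media goes through the TURN server (TURN SEND primitive).
        return local, "turn-send"
    raise ValueError("unknown candidate kind: " + kind)
```
]]></artwork></figure>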

<t>
This symmetric behavior MUST be followed by an agent even if its peer
in the session doesn't support ICE.
</t>

</section>

</section>

<section anchor="sec:run" 
title="Running STUN on Derived Transport Addresses">

<t>
One of the seemingly bizarre operations done during the ICE processing
is the transmission of a STUN request to a transport address which is
obtained through TURN or STUN itself. This actually does work, and in
fact, has extremely useful properties. The subsections below go
through the detailed operations that would occur at each point to
demonstrate correctness and the properties derived from it. They are
tutorial in nature.
</t>

<section title="STUN on a TURN Derived Transport Address">

<figure anchor="fig:turn-stun-step1"><artwork>
<![CDATA[
              +----------+                                                
              |          |192.0.2.1:26524                                 
              |   TURN   X                                                
              |  Server  |                                                
              |          |                                                
              |          |                                                
              +----------+                                                
   192.0.2.1:7764.    ^192.0.2.1:7764                                     
                 .    .                                                   
                 .    .192.0.2.88:5063                                    
              +----------+                                                
              |   NAT    |                                                
              +----------+                                                
       TURN      .    .                                                   
       Answer    .    . TURN Request                                      
                 .    .                                                   
   10.0.1.1:8866 V    .10.0.1.1:8866                                      
              +----------+                      +----------+              
              |          |                      |          |              
              |  Agent   |                      |  Agent   |              
              |          |                      |          |              
              |    A     |                      |    B     |              
              |          |                      |          |              
              +----------+                      +----------+
]]></artwork></figure>

<t>
Consider an agent A that is behind a NAT, shown in <xref
target="fig:turn-stun-step1"/>. It connects to a TURN server
on the public side of the NAT. To do that, A binds to a local
transport address, say 10.0.1.1:8866, and then sends a TURN request
to the TURN server. The NAT translates the net-10 address to
192.0.2.88:5063. Assume that the TURN server is running on
192.0.2.1 and listening for TURN traffic on port 7764. The TURN server
allocates a derived transport address 192.0.2.1:26524 to the agent
(shown as the X on the TURN server in the diagram),
and returns it in the TURN answer. Remember that all traffic from
the TURN server to the agent is sent from 192.0.2.1:7764 to
10.0.1.1:8866, including the TURN answer.
</t>

<t>Now, the agent runs a STUN server on 10.0.1.1:8866, and advertises
that its server actually runs on 192.0.2.1:26524. Another agent, B,
sends a STUN request to this server. It sends it from a local
transport address, 192.0.2.77:1296. When it arrives at
192.0.2.1:26524, it is discarded since agent A has not sent a packet
to 192.0.2.77:1296. Once agent A gets agent B's accept message, it
will learn about B's candidate address, and generate a STUN request
towards it. This results in a permission being installed in the TURN
server, so that packets from 192.0.2.77:1296 will now be accepted. The
next STUN request from agent B will therefore succeed. This is the
normal mode of operations for port restricted NAT; as described in
TURN, the server turns a symmetric NAT into a port restricted one
<xref target="I-D.rosenberg-midcom-turn"/>.
</t>
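
<t>
The permission behavior just described can be modeled with a toy
class. It is a sketch of the idea only, not of the TURN protocol
itself: the class and method names are invented for this example, and
no TURN messages are built or parsed.
</t>

<figure><artwork>
<![CDATA[
```python
from typing import Set, Tuple

Addr = Tuple[str, int]  # (IP address, port)


class TurnAllocation:
    """Toy model of one derived transport address on a TURN server."""

    def __init__(self, relayed: Addr) -> None:
        self.relayed = relayed            # e.g. 192.0.2.1:26524
        self.permitted: Set[Addr] = set()

    def client_sends_to(self, peer: Addr) -> None:
        # Sending towards a peer installs a permission for that peer.
        self.permitted.add(peer)

    def accepts_from(self, peer: Addr) -> bool:
        # Packets from peers without a permission are discarded.
        return peer in self.permitted
```
]]></artwork></figure>

<t>
In terms of the example, accepts_from() is false for 192.0.2.77:1296
until agent A has sent its own STUN request towards that address,
after which B's retransmitted request is accepted.
</t>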

<figure anchor="fig:turn-stun-step2"><artwork>
<![CDATA[
               +----------+                                               
               |          |192.0.2.1:26524          STUN Request          
               |   TURN   X<...............................               
               |  Server  |                 STUN Answer   .               
               |          |.........................      .               
               |          |192.0.2.1:26524         .      .               
               +----------+                        .      .               
   192.0.2.1:7764 .   ^ 192.0.2.1:7764             .      .               
                  .   .                            .      .               
  192.0.2.88:5063 V   . 192.0.2.88:5063            .      .               
               +----------+                        .      .               
               |   NAT    |                        .      .               
               +----------+                        .      .               
   192.0.2.1:7764 .   ^ 192.0.2.1:7764             .      .               
                  .   .                  192.0.2.77:1296  .               
                  .   .                            .      .               
    10.0.1.1:8866 V   . 10.0.1.1:8866              V      .192.0.2.77:1296
               +----------+                      +----------+             
               |          |                      |          |             
               |  Agent   |                      |  Agent   |             
               |          |                      |          |             
               |    A     |                      |    B     |             
               |          |                      |          |             
               +----------+                      +----------+

]]></artwork></figure>

<t>
As shown in <xref target="fig:turn-stun-step2"/>, agent B will retry,
sending its STUN request from 192.0.2.77:1296 to 192.0.2.1:26524. This
successful STUN request is forwarded to the agent, sent with a source
address of 192.0.2.1:7764 and a destination address of
192.0.2.88:5063. This passes through the NAT, which rewrites the
destination address to 10.0.1.1:8866. This arrives at A's STUN
server. The server observes the source address of 192.0.2.1:7764, and
generates a STUN answer containing this value in the MAPPED-ADDRESS
attribute. The STUN answer is sent with a source address of
10.0.1.1:8866, and a destination of 192.0.2.1:7764. This arrives at
the TURN server, which, because the current destination is
192.0.2.1:7764, sends the STUN answer with a source address of
192.0.2.1:26524 and destination of 192.0.2.77:1296, which is B's STUN
agent. </t>

<t>
Now, as far as A is concerned, it has obtained a new candidate
transport address of 192.0.2.1:7764. And indeed, it has! STUN derived
transport addresses are scoped to the session, so they can only be
used by the peer in the session. Furthermore, that peer has to send
requests from the socket on which the STUN server was running. In this
case, A is the peer, and its STUN server was on 10.0.1.1:8866. If it
sends to 192.0.2.1:7764, the packet goes to the TURN server, and since
the destination address is set to 192.0.2.77:1296, is forwarded to B,
and specifically, is forwarded to the transport address B sent the
STUN request from. Therefore, the address is indeed a valid candidate
transport address. Its priority is derived from the priority of
agent B's public IP address.
</t>

<t>
The benefit of this is that it allows two agents to share the same
TURN server for media traffic in both directions. With "normal" TURN
usage, both agents would obtain a derived address from their own TURN
servers. The result is that, for a single call, there are two bindings
allocated by each side from their respective servers, and all four are
used. With ICE, that drops to two bindings allocated from a single
server. Of course, all four bindings are allocated initially. However,
once one of the agents begins receiving media on its STUN derived
address, it can deallocate its TURN resources.
</t>

</section>

<section title="STUN on a STUN Derived Transport Address">

<t>
Consider an agent A that is behind a NAT. It connects to a STUN server
on the public side of the NAT. To do that, A binds to a local
transport address, say 10.0.1.1:8866, and then sends a STUN request
to the STUN server. The NAT translates the net-10 address to
192.0.2.88:5063. Assume that the STUN server is running on
192.0.2.1 and listening for STUN traffic on port 3478, the default
STUN port. The STUN server sees a source transport address of
192.0.2.88:5063, and returns that to the agent in the STUN
answer. The NAT forwards the answer to the agent.
</t>

<t>Now, the agent runs a STUN server on 10.0.1.1:8866, and advertises
that its transport address is 192.0.2.88:5063. Another agent, B,
sends a STUN request to this address. It sends it from a local
transport address, 192.0.2.77:1296. When it arrives at 192.0.2.88:5063
(on the NAT), the NAT rewrites the destination address to 10.0.1.1:8866,
assuming that it is of the full-cone or restricted variety <xref
target="RFC3489"/>, and the permission for 192.0.2.77:1296 is
open. This arrives at A's local STUN server. The server observes the source
address of 192.0.2.77:1296, and generates a STUN answer containing
this value in the MAPPED-ADDRESS attribute. The STUN answer is sent
with a source address of 10.0.1.1:8866, and a destination of
192.0.2.77:1296. This arrives at B's STUN agent. </t>
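
<t>
The NAT behavior assumed in this walkthrough can be sketched with a
small model. It is a simplification written for this example: the
class name is hypothetical, public ports are handed out sequentially,
and the restricted variant is modeled with a permission per remote
transport address.
</t>

<figure><artwork>
<![CDATA[
```python
from typing import Dict, Optional, Set, Tuple

Addr = Tuple[str, int]  # (IP address, port)


class ConeNat:
    """Toy cone NAT: one public binding per private transport address."""

    def __init__(self, public_ip: str, first_port: int,
                 restricted: bool) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.restricted = restricted
        self.out_map: Dict[Addr, Addr] = {}  # private -> public
        self.in_map: Dict[Addr, Addr] = {}   # public -> private
        self.permissions: Set[Tuple[Addr, Addr]] = set()

    def outbound(self, src: Addr, dst: Addr) -> Addr:
        """Translate an outgoing packet; return its public source."""
        if src not in self.out_map:
            public = (self.public_ip, self.next_port)
            self.next_port += 1
            self.out_map[src] = public
            self.in_map[public] = src
        public = self.out_map[src]
        self.permissions.add((public, dst))  # opens a permission for dst
        return public

    def inbound(self, src: Addr, dst: Addr) -> Optional[Addr]:
        """Return the private destination, or None if the packet drops."""
        private = self.in_map.get(dst)
        if private is None:
            return None
        if self.restricted and (dst, src) not in self.permissions:
            return None
        return private
```
]]></artwork></figure>

<t>
With restricted set to false (full cone), B's request to
192.0.2.88:5063 is delivered to 10.0.1.1:8866 as soon as the binding
exists. With restricted set to true, it is delivered only after A has
sent a packet towards 192.0.2.77:1296.
</t>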

<t>
Now, as far as A is concerned, the STUN request had a source transport
address which was already known to A, presumably from an ICE
exchange. As far as B is concerned, the check succeeded, and the
address is viable.
</t>

</section>

</section>


<section title="Example">

<t>
In the example that follows, messages are labeled with "message name
A,B" to mean a message from transport address A to B. For STUN
Requests, this is followed by curly brackets enclosing the username
and password. For STUN answers, this is followed by square brackets
and the value of MAPPED-ADDRESS. The example shows a flow of two
agents where one is behind a full cone NAT, and the other is on the public
Internet. 
</t>

<figure><artwork>
<![CDATA[
          A                NAT              STUN                B
          |(1) STUN Req P1,STUN-PUBLIC        |                 |
          |---------------->|                 |                 |
          |                 |(2) STUN Req U, STUN-PUBLIC        |
          |                 |---------------->|                 |
          |                 |(3) STUN Res STUN-PUBLIC, U [U]    |
          |                 |<----------------|                 |
          |(4) STUN Res STUN-PUBLIC, P1 [U]   |                 |
          |<----------------|                 |                 |
          |(5) Initiate {P2,ufrag1A,pass1A,q=0.4}               |
          |{U,ufrag2A,pass2A,q=0.4}           |                 |
          |---------------------------------------------------->|
          |                 |                 |(6) STUN Req P3,STUN-PUBLIC
          |                 |                 |<----------------|
          |                 |                 |(7) STUN Res STUN-PUBLIC,P3 [P3]
          |                 |                 |---------------->|
          |(8) Accept {P3,ufrag1B,pass1B,q=0.4}                 |
          |<----------------------------------------------------|
          |                 |(9) STUN Req P3,P2                 |
          |                 |(ufrag1Aufrag1B,pass1A)            |
          |                 |<----------------------------------|
          |                 |Timeout          |                 |
          |                 |(10) STUN Req P3,U                 |
          |                 |(ufrag2Aufrag1B,pass2A)            |
          |                 |<----------------------------------|
          |(11) STUN Req P3,P1                |                 |
          |(ufrag2Aufrag1B,pass2A)            |                 |
          |<----------------|                 |                 |
          |(12) STUN Res P1,P3 [P3]           |                 |
          |---------------->|                 |                 |
          |                 |(13) STUN Res U,P3 [P3]            |
          |                 |---------------------------------->|
          |(14) STUN Req P2,P3                |                 |
          |(ufrag1Bufrag1A,pass1B)            |                 |
          |---------------->|                 |                 |
          |                 |(15) STUN Req W,P3                 |
          |                 |(ufrag1Bufrag1A,pass1B)            |
          |                 |---------------------------------->|
          |                 |(16) STUN Res P3,W [W]             |
          |                 |<----------------------------------|
          |(17) STUN Res P3,P2 [W]            |                 |
          |<----------------|                 |                 |
          |(18) STUN Req P1,P3                |                 |
          |(ufrag1Bufrag2A,pass1B)            |                 |
          |---------------->|                 |                 |
          |                 |(19) STUN Req U,P3                 |
          |                 |(ufrag1Bufrag2A,pass1B)            |
          |                 |---------------------------------->|
          |                 |(20) STUN Res P3,U [U]             |
          |                 |<----------------------------------|
          |(21) STUN Res P3,P1 [U]            |                 |
          |<----------------|                 |                 |

]]></artwork></figure>

<t>
The offeror, agent A, binds to a local transport address P1, which
will be used as an associated local transport address. As such, it
sends a STUN request to its STUN server (message 1). This passes
through a NAT, and the NAT maps private address P1 to public address U
(message 2). The STUN server mirrors this public address in the
MAPPED-ADDRESS of the STUN answer (message 3), and it is forwarded
to the offeror (message 4). Now, agent A has a STUN derived
transport address of U. It also binds to a second local transport
address, P2, which will be a usable local transport address. It starts
STUN servers on both local transport addresses P1 and P2. It then
generates an Offer request to agent B (message 5) which contains
both of the gathered transport addresses P2 and U, along with username
fragments and passwords.
</t>

<t>
Agent B is not behind a NAT. It binds to a local transport address
P3, and sends a STUN request to its STUN server (message 6). This is
responded to by the STUN server (message 7). The agent observes that
this address is identical to its local transport address, and
therefore that local transport address, which was targeted for an
associated local transport address, is promoted to a usable local
transport address. It then sends an Accept message to agent A,
including this transport address and its username fragment and
password (message 8).
</t>

<t>
Once the Accept message is sent, the agent can perform its STUN
connectivity checks. B has a single local transport address (P3),
which it matches up with A's two remote transport addresses (P2 and
U). B tries P2 (message 9). This request fails since P2 is a
private address. In parallel, B tries U (message 10). Since A's NAT is
full cone, this packet is accepted and is passed to agent A (message
11). Agent A generates an answer (message 12) which is forwarded to
agent B (message 13). The source transport address in the STUN
packet, P3, is already known to agent A, and thus no new candidates
are learned. Agent B learns that agent A is reachable at transport
address U, but not P2. Thus, it can begin sending media to U from
local transport address P3.
</t>
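
<t>
The way an agent matches up its local transport addresses with the
remote ones can be illustrated as a simple cross product. The function
name is hypothetical, and the ordering of checks by priority is left
out of this sketch.
</t>

<figure><artwork>
<![CDATA[
```python
from itertools import product
from typing import List, Sequence, Tuple


def connectivity_check_pairs(local: Sequence[str],
                             remote: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair every local transport address with every remote one."""
    return list(product(local, remote))
```
]]></artwork></figure>

<t>
For agent B in this example, the inputs are the single local address
P3 and the remote addresses P2 and U, giving the two checks P3 to P2
and P3 to U.
</t>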

<t>
Once the Accept message arrives at agent A, it can begin its
connectivity checks. It has two local transport addresses P1 and P2,
which it combines with agent B's single transport address P3. It tries
to send a STUN packet from P2 to P3 (message 14). Since the NAT has
not seen source address P2 yet, it maps it to a new public transport
address W, and the STUN request is forwarded to agent B (message
15). Agent B generates a STUN answer (message 16), which is
forwarded back to agent A (message 17). Based on this, agent A
learns that it can reach P3 from P2. Agent B learns a new remote
transport address, W. However, the priority of this address is the
same as P2, which is 0.4, and equal to the priority of address U, to
which agent B has already connected. Thus, it does not bother to
perform the check (such a check would have succeeded if it had been
done). </t>
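
<t>
Agent B's decision not to check W can be expressed as a simple
comparison. This sketch assumes the only inputs are the priority of
the newly learned address and the best priority among addresses that
have already been validated; the function name is hypothetical.
</t>

<figure><artwork>
<![CDATA[
```python
from typing import Optional


def should_check_new_remote(new_priority: float,
                            best_valid_priority: Optional[float]) -> bool:
    """Decide whether a newly learned remote address is worth checking.

    The check is skipped when an already validated address has equal or
    higher priority, as for W (0.4) versus U (0.4) in the example.
    """
    if best_valid_priority is None:
        return True  # nothing validated yet, so the check is needed
    return new_priority > best_valid_priority
```
]]></artwork></figure>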

<t>
While the P2->P3 check is taking place, agent A also sends a
STUN request from P1 to P3 (message 18). This passes through the NAT,
which maps the source transport address to the same public address it
allocated previously, U. This STUN request arrives at agent B
(message 19). It generates an answer (message 20), which is forwarded
to agent A (message 21). Based on this check, agent A learns that P3
is also reachable from P1. Agent B did not learn a new candidate
transport address, since U was already known. Now, agent A can send
media to P3 from either P1 or P2.
</t>

</section>

<section anchor="sec:sdp-alt" title="Grammar">

<t>
This specification defines a new SDP attribute. It is called "candidate". 
The candidate attribute MUST be present within a media block of the
SDP. It contains a transport address for a candidate that can be used for
connectivity checks. There MAY be multiple candidate
attributes in a media block. 
</t>

<t>The syntax of this attribute is:</t>

<figure><artwork>
<![CDATA[
candidate-attribute   = "candidate" ":" candidate-id SP tid SP 
                        transport SP 
                        qvalue SP   ;qvalue from RFC 3261
                        unicast-address SP 
                        port SP 
                          ;unicast-address, port from RFC 2327
transport             = "UDP" / "TCP" / transport-extension
transport-extension   = token               
candidate-id          = 1*DIGIT
tid                   = non-ws-string

]]></artwork></figure>

<t> The candidate-id is used to group together the transport addresses
for a particular candidate. It MUST be a positive integer whose value
is less than (2^31 -1). It MUST have the same value for all transport
addresses within the same candidate. It MUST have a different value
for transport addresses within different candidates for the same media
stream. The tid production contains an identifier, chosen with 128 bits of
randomness, that identifies the transport address. The tids of a pair of
transport addresses are combined to form the username and password of a STUN
request from one transport address to another. The transport production
indicates the transport protocol for the candidate. This can be either
UDP or TCP. Extensibility is provided to allow for future transport
protocols to be used with ICE, such as the Datagram Congestion Control
Protocol (DCCP) <xref target="I-D.ietf-dccp-spec"/>. The
unicast-address production is from RFC 2327, and contains the IPv4 or
IPv6 address of the candidate. The port production contains its port.
</t>
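
<t>
A minimal parser for this attribute might look as follows. It is a
sketch under stated assumptions: the CandidateAttr record and the
function name are hypothetical, the optional "a=" prefix is accepted
for convenience, the q-value is assumed to lie between 0 and 1, and
the address is validated with Python's ipaddress module rather than
with the RFC 2327 grammar.
</t>

<figure><artwork>
<![CDATA[
```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateAttr:
    """Hypothetical holder for the fields of one candidate attribute."""
    candidate_id: int
    tid: str
    transport: str
    qvalue: float
    address: str
    port: int


def parse_candidate(attr: str) -> CandidateAttr:
    """Parse 'candidate:<id> <tid> <transport> <q> <address> <port>'."""
    if attr.startswith("a="):
        attr = attr[2:]
    name, sep, value = attr.partition(":")
    if name != "candidate" or not sep:
        raise ValueError("not a candidate attribute")
    fields = value.split()
    if len(fields) != 6:
        raise ValueError("expected six fields")
    cid, tid, transport, q, address, port = fields
    if not (cid.isascii() and cid.isdigit()):
        raise ValueError("candidate-id must be 1*DIGIT")
    if not 0 < int(cid) < 2**31 - 1:
        raise ValueError("candidate-id out of range")
    qvalue = float(q)
    if not 0.0 <= qvalue <= 1.0:
        raise ValueError("qvalue out of range")
    ipaddress.ip_address(address)  # raises ValueError if not IPv4/IPv6
    port_num = int(port)
    if not 0 <= port_num <= 65535:
        raise ValueError("port out of range")
    return CandidateAttr(int(cid), tid, transport, qvalue, address, port_num)
```
]]></artwork></figure>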

</section>

<section title="SIP and SDP Specific Security Considerations">

<t>
The SDP messages described here contain usernames and passwords. If
those passwords are transmitted in the clear, it introduces
significant security vulnerabilities, discussed in detail below. In
summary, those vulnerabilities would allow an eavesdropper that can
inject packets to "steal" the media streams for a call unless secure
media transport (such as SRTP) is used. Even if SRTP is used, an
attacker could disrupt a call and prevent media from flowing. These
attacks, fortunately, can be obviated by providing secure transport of
the SDP. SIP-based implementations of ICE SHOULD use the sips URI
scheme when transporting SDP with ICE information, and MAY use
S/MIME <xref target="RFC3261"/>.
</t>

</section>


</section>

<section title="Interactions with Forking">

</section>

<section title="Interactions with Preconditions">

</section>

<section anchor="sec:sec" title="Security Considerations">

<t>
STUN itself introduces many security considerations. In particular,
there are attacks whereby an eavesdropper replays STUN packets with a
modified source address. These modified packets can cause service
disruptions and denial-of-service attacks, which are only partially
mitigated by the heuristics described in STUN <xref
target="RFC3489"/>. 
</t>

<t>
Interestingly, when STUN is used within ICE, these security weaknesses
are mitigated completely, without the need for the heuristics defined
in RFC 3489.
</t>

<t>
Consider an attacker that intercepts a STUN packet used for
connectivity checks, and replays it using a faked source address. If
successful, this would fool an endpoint into thinking that this faked
source address was a valid destination for media (recall that the
source transport address of received STUN packets is used as a
potential candidate address). However, the
recipient of the replayed packet will not just send media to that
candidate. It will verify it with a STUN connectivity check. This
check will be sent to that faked source address, and if there is no
answer, the address will not be used. The attacker cannot answer the
STUN request without access to the username and password, which are
exchanged as part of the signaling. Thus, if the signaling is
protected as recommended above, the attacker cannot obtain the
username or password. 
</t>
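
<t>
The argument rests on the answer to a connectivity check being tied
to a secret that only the signaling peers hold. The sketch below
illustrates that idea with a keyed hash (HMAC-SHA1). It is an
assumption-laden illustration: the function names are hypothetical,
and the message bytes are arbitrary rather than the STUN wire format.
</t>

<figure><artwork>
<![CDATA[
```python
import hashlib
import hmac


def integrity_tag(password: bytes, message: bytes) -> bytes:
    """Compute a keyed hash over a message using the shared password."""
    return hmac.new(password, message, hashlib.sha1).digest()


def answer_is_authentic(password: bytes, message: bytes,
                        tag: bytes) -> bool:
    """Check a received tag against the one the password would yield."""
    expected = integrity_tag(password, message)
    return hmac.compare_digest(expected, tag)
```
]]></artwork></figure>

<t>
An attacker that replays a packet from a faked source address can
receive the resulting check, but without the password it cannot
produce a tag that passes answer_is_authentic(), so the faked address
is never validated.
</t>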

<t>
If an attacker instead intercepts and replays STUN packets used for
the purposes of unilateral allocation, a similar result occurs. The
target of the attack will be fooled into thinking it has a STUN
derived transport address that it does not. Its peer will perform a
connectivity check to this address, which will fail. The attacker
cannot force this check to succeed without access to the username and
password, which are protected. Thus, this address will not be used.
</t>

<t>
In the worst case, an attacker can generate enough traffic so that
none of the valid STUN checks or unilateral allocations succeed. This
would result in a service disruption. However, this attack is no worse
than any pure packet flood disruption attack launched against
any other protocol. These attacks cannot be prevented by any protocol
means. 
</t>

<t>
If an attacker could intercept and modify the contents of the Offer
or Accept messages, they could disrupt the session, divert the media, and
otherwise take control over the session. This attack is prevented by
encryption, authentication and message integrity of the signaling
channel used for ICE.
</t>

</section>

<section anchor="sec:iana" title="IANA Considerations">

<section title="SDP Attribute Name">

<t>This specification defines one new SDP attribute per the procedures
of Appendix B of RFC 2327. The required information for the
registration is:
</t>

<list style="hanging">

<t hangText="Contact Name:"> Jonathan Rosenberg, jdrosen@jdrosen.net.
</t>

<t hangText="Attribute Name:"> candidate</t>

<t hangText="Long Form:"> candidate </t>

<t hangText="Type of Attribute:"> media level </t>

<t hangText="Charset Considerations:"> The attribute is not subject
to the charset attribute. </t>

<t hangText="Purpose:"> This attribute is used with Interactive
Connectivity Establishment (ICE), and provides one of many possible
candidate addresses for communication. These addresses are validated
with an end-to-end connectivity check using Simple Traversal of UDP
Through NAT (STUN).
</t>

<t hangText="Appropriate Values:"> See <xref target="sec:sdp-alt"/> of
RFC XXXX [Note to RFC-ed: please replace XXXX with the RFC number of
this specification]. </t>

</list>

</section>

<section title="URN Sub-Namespace Registration">

<t>
This section registers a new XML namespace, per the guidelines
in <xref target="RFC3688"/>.</t>

<list style="hanging">

<t>URI: The URI for this namespace is
urn:ietf:params:xml:ns:ice.</t>

<t>Registrant Contact: IETF, MMUSIC working group,
(mmusic@ietf.org), Jonathan Rosenberg
(jdrosen@jdrosen.net).</t>

<t>XML: 
<figure><artwork>
<![CDATA[
             BEGIN
             <?xml version="1.0"?>
             <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"
                       "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">
             <html xmlns="http://www.w3.org/1999/xhtml">
             <head>
               <meta http-equiv="content-type"
                  content="text/html;charset=iso-8859-1"/>
               <title>ICE Namespace</title>
             </head>
             <body>
               <h1>Namespace for ICE Documents</h1>
               <h2>urn:ietf:params:xml:ns:ice</h2>
               <p>See <a href="[URL of published
RFC]">RFCXXXX</a>. [Note to RFC-ed: please replace XXXX with the RFC
number of this specification.]</p>
             </body>
             </html>
             END
]]></artwork></figure></t>

</list>
</section>

<section title="XML Schema Registration">

<t>This section registers an XML schema per the procedures in
<xref target="RFC3688"/>.
</t>

<list style="hanging">

<t>URI: urn:ietf:params:xml:schema:ice</t>

<t>Registrant Contact: IETF, MMUSIC working group,
(mmusic@ietf.org), Jonathan Rosenberg
(jdrosen@jdrosen.net).</t>

<t>The XML for this schema can be found as the sole content of <xref
target="sec:schema"/>.</t>

</list>

</section>

</section>

<section anchor="sec:iab" title="IAB Considerations">

<t>
The IAB has studied the problem of "Unilateral Self Address Fixing",
which is the general process by which an agent attempts to determine
its address in another realm on the other side of a NAT through a
collaborative protocol reflection mechanism <xref target="RFC3424"/>. 
ICE is an example of a
protocol that performs this type of function. Interestingly, the
process for ICE is not unilateral, but bilateral, and the difference
has a significant impact on the issues raised by the IAB. The IAB has mandated
that any protocols developed for this purpose document a specific set
of considerations. This section meets those requirements.
</t>

<section title="Problem Definition">

<t>
From RFC 3424, any UNSAF proposal must provide:
</t>

<list style="hanging">
<t>
Precise definition of a specific, limited-scope problem that is
 to be solved with the UNSAF proposal.   A short term fix should
not be generalized to solve other problems; this is why  "short
term fixes usually aren't".
</t>
</list>

<t>
The specific problems being solved by ICE are:
</t>

<list style="hanging">

<t>
Provide a means for two peers to determine the set of transport
addresses which can be used for communication.
</t>

<t>
Provide a means for resolving many of the limitations of other
UNSAF mechanisms by wrapping them in an additional layer of processing
(the ICE methodology).
</t>

<t>
Provide a means for an agent to determine an address that is
reachable by another peer with which it wishes to communicate.
</t>

</list>

</section>

<section title="Exit Strategy">

<t>
From RFC 3424, any UNSAF proposal must provide:
</t>

<list style="hanging"><t>
Description of an exit strategy/transition plan.  The better
short term fixes are the ones that will naturally see less and
less use as the appropriate technology is deployed.
</t></list>

<t>
ICE itself does not easily get phased out. It remains useful even
in a globally connected Internet, for example as a means of detecting
whether a router failure has temporarily disrupted connectivity.
What ICE does do is help phase out other UNSAF
mechanisms. ICE effectively selects among those mechanisms,
prioritizing the better ones and deprioritizing the
worse ones. Local IPv6 addresses can be preferred. As NATs
begin to disappear with the introduction of IPv6, derived transport addresses
from other UNSAF mechanisms simply never get used, because higher
priority connectivity exists. Therefore, the servers get used less and
less, and can eventually be removed when their usage goes to zero.
</t>
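
<t>
As an informal illustration of this phase-out behaviour, the following
minimal sketch picks the highest-priority transport address whose
connectivity check succeeded. It is not part of the ICE procedures: the
Candidate structure, the select_candidate function, the priority values
and the addresses are all assumptions made for the example. It only shows
that addresses derived from STUN or TURN stop being selected once a
preferred local IPv6 address is reachable.
</t>

<figure><artwork><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A transport address an agent could use (hypothetical structure)."""
    address: str
    source: str       # how it was obtained: "local-ipv6", "stun", "turn"
    priority: float   # illustrative value; higher means more preferred
    reachable: bool   # outcome of the connectivity check towards the peer


def select_candidate(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Return the highest-priority candidate whose check succeeded."""
    usable = [c for c in candidates if c.reachable]
    return max(usable, key=lambda c: c.priority, default=None)


def example_candidates(ipv6_works: bool) -> list:
    """Build an assumed candidate set; only the IPv6 result varies."""
    return [
        Candidate("[2001:db8::1]:5000", "local-ipv6", 1.0, ipv6_works),
        Candidate("192.0.2.10:6000", "stun", 0.6, True),
        Candidate("198.51.100.7:7000", "turn", 0.3, True),
    ]


# With no working IPv6 path, the STUN-derived address is chosen.
print(select_candidate(example_candidates(ipv6_works=False)).source)
# Once IPv6 connectivity exists, the derived addresses go unused.
print(select_candidate(example_candidates(ipv6_works=True)).source)
```
]]></artwork></figure>

<t>
In this sketch nothing has to be reconfigured for the transition: the
same selection rule simply stops choosing the derived addresses as soon
as a higher-priority address passes its check.
</t>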

<t>
Indeed, ICE can assist in the transition from IPv4 to IPv6. It can be
used to determine whether to use IPv6 or IPv4 when two dual-stack
hosts communicate with SIP (IPv6 gets used). It can also allow a
network with both 6to4 and native v6 connectivity to determine which
address to use when communicating with a peer. 
</t>

</section>

<section title="Brittleness Introduced by ICE">

<t>
From RFC 3424, any UNSAF proposal must provide:
</t>

<list style="hanging"><t>
Discussion of specific issues that may render systems more
"brittle".  For example, approaches that involve using data at
multiple network layers create more dependencies, increase
debugging challenges, and make it harder to transition.
</t></list>

<t> ICE actually removes brittleness from existing UNSAF
mechanisms. In particular, traditional STUN (the usage described in
RFC 3489) has several points of brittleness. One of them is the
discovery process, which requires an agent to try to classify the type
of NAT it is behind. This process is error-prone. With ICE, that
discovery process is simply not used. Rather than unilaterally
assessing the validity of the address, its validity is dynamically
determined by measuring connectivity to a peer. The process of
determining connectivity is very robust. The only potential problem is
that bilaterally fixed addresses through STUN can expire if traffic
does not keep them alive. However, that is substantially less
brittleness than the STUN discovery mechanisms.  </t>

<t>
Another point of brittleness in STUN, TURN, and any other unilateral
mechanism is its absolute reliance on an additional server. ICE makes
use of a server for allocating unilateral addresses, but allows
agents to directly connect if possible. Therefore, in some cases, the
failure of a STUN or TURN server would still allow for a call to
progress when ICE is used. 
</t>

<t>
Another point of brittleness in traditional STUN is that it assumes
that the STUN 
server is on the public Internet. Interestingly, with ICE, that is not
necessary. There can be a multitude of STUN servers in a variety of
address realms. ICE will discover the one that has provided a usable
address. 
</t>

<t>
The most troubling point of brittleness in traditional STUN is that it
doesn't work 
in all network topologies. In cases where there is a shared NAT
between each agent and the STUN server, traditional STUN may not
work. With ICE, 
that restriction can be lifted. 
</t>

<t>
Traditional STUN also introduces some security
considerations. Fortunately, those security considerations are also
mitigated by ICE.
</t>

</section>

<section title="Requirements for a Long Term Solution">

<t>From RFC 3424, any UNSAF proposal must provide:
</t>

<list style="hanging"><t>
Identify requirements for longer term, sound technical solutions
-- contribute to the process of finding the right longer term
solution.
</t>
</list>

<t>
Our conclusions from STUN remain unchanged. However, we feel ICE
actually helps because we believe it can be part of the long term
solution. 
</t>

</section>

<section title="Issues with Existing NAPT Boxes">

<t>From RFC 3424, any UNSAF proposal must provide:
</t>

<list style="hanging"><t>
Discussion of the impact of the noted practical issues with
existing, deployed NA[P]Ts and experience reports.
</t></list>

<t>
A number of NAT boxes are now being deployed into the market that try
to provide "generic" ALG functionality. These generic ALGs hunt for
IP addresses, either in text or binary form within a packet, and
rewrite them if they match a binding. This will interfere with proper
operation of any UNSAF mechanism, including ICE. 
</t>

</section>

</section>

<section title="Acknowledgements">

<t>The authors would like to thank Douglas Otis, Francois Audet and
Magnus Westerlund for their comments and input.
</t>

</section>

</middle>

<back>
<references title="Normative References">
<?rfc include="reference.RFC.3489"?>
<?rfc include="reference.RFC.3605"?>
<?rfc include="reference.RFC.3261"?>
<?rfc include="reference.RFC.3264"?>
<?rfc include="reference.RFC.3389"?>
<?rfc include="reference.RFC.3688"?>
<?rfc include="reference.I-D.ietf-mmusic-anat"?>
<?rfc include="reference.I-D.rosenberg-midcom-turn"?>
</references>

<references title="Informative References">
<?rfc include="reference.RFC.2326"?>
<?rfc include="reference.RFC.3235"?>
<?rfc include="reference.RFC.3303"?>
<?rfc include="reference.RFC.3102"?>
<?rfc include="reference.RFC.3103"?>
<?rfc include="reference.RFC.3424"?>
<?rfc include="reference.RFC.3550"?>
<?rfc include="reference.RFC.3711"?>
<?rfc include="reference.RFC.3056"?>
<?rfc include="reference.I-D.huitema-v6ops-teredo"?>
<?rfc include="reference.I-D.ietf-avt-rfc2793bis"?>


</references>
</back>
</rfc>


