CPS 472/572 Computer Networking
Lecture 10: Transport Level Protocols
TCP and UDP
Dr. Jennifer Seitzer
Motivation:
The transport level protocols underlie the applications of the Internet. There
are two main protocols. The first is the User Datagram Protocol (UDP), a
connectionless, unreliable protocol that is similar to IP but sits one level
higher. The second, and most popular, is TCP, the Transmission Control
Protocol. TCP provides reliable transport: it takes care that no packets are
lost, no packets are duplicated, receipt of intact packets is acknowledged, and
no packets are delivered out of order. Tonight, we study both transport level
protocols, including their behaviors, packet formats, algorithms, and
implementations.
Why is Reliability Needed?
problem - corrupted bits during transmission
solution - perform error detection and
correction
protocol layer with solution - data link
where techniques such as parity, checksum, and CRC were presented
transport protocols - attempt to avoid
transmission problems at the packet level (rather than the bit level)
Problems to Avoid
out of order delivery - data decomposed into packets may arrive at destination
out of sequence (e.g., consider an EMAIL message composed of erratic pieces)
duplicate packets - multiple copies of the same piece arrive at destination
lost packets - missing chunks from overall data unit
replay - an old, delayed packet arrives late and is mistaken for current data
data overrun - data is lost due to differing processing speeds of interacting nodes
congestion - each packet experiences excessive delays because the network is overrun
with packets
Underlying Goal: Reliable Transfer
reliable transfer - transfer of data without loss, ordered correctly, and
no duplications
aren’t all protocols reliable? - Yes and No.
well written - YES -- refereed and peer-reviewed before becoming a standard
reliable in the communication context - No. Reordering, retransmitting, and
duplicate checking cost time and add complexity.
Example of an unreliable protocol - UDP--User Datagram Protocol; provides a mechanism for
a single host to communicate with multiple destinations independently. (via
port numbers)
Technique 1: Sequencing
technique - sender numbers the packets; receiver tracks the number of the last packet
received in order AND maintains a list of stray, out-of-order packets
reliability problems solved - out-of-order and duplicate packets
handling out-of-order packets -
for incoming packet with sequence number j
    if j-1 is the current "last_num"
        update last_num to j
        send packet j to next layer protocol
    ELSE
        add packet j to out-of-order list
Technique 1: Sequencing
duplicate packets - often occur in WANs;
possible in LANs due to malfunctioning access hardware, for example.
handling duplicate packets -
for incoming packet with sequence number j
    if "last_num" >= j
        discard packet
    ELSE if j appears on out-of-order list
        discard packet
    ELSE
        not a duplicate
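The two receiver checks above (out-of-order handling and duplicate detection) can be combined in one small sketch. This is only an illustration: the class and method names are made up, packets are assumed to be numbered from 1, and the step that drains held packets once the gap is filled is an addition the slides do not spell out.

```python
class SequencingReceiver:
    """Receiver side of the sequencing technique (illustrative sketch)."""

    def __init__(self):
        self.last_num = 0          # highest sequence number delivered in order
        self.out_of_order = {}     # stray packets, keyed by sequence number
        self.delivered = []        # stands in for "send to next layer protocol"

    def receive(self, j, data):
        # Duplicate checks: already delivered, or already on the stray list.
        if self.last_num >= j or j in self.out_of_order:
            return "discard"
        if j == self.last_num + 1:
            self._deliver(j, data)
            # Assumption: held packets are released as soon as they fit.
            while self.last_num + 1 in self.out_of_order:
                nxt = self.last_num + 1
                self._deliver(nxt, self.out_of_order.pop(nxt))
            return "deliver"
        self.out_of_order[j] = data
        return "hold"

    def _deliver(self, j, data):
        self.last_num = j
        self.delivered.append(data)


rx = SequencingReceiver()
results = [rx.receive(j, f"pkt{j}") for j in (1, 3, 3, 2, 1, 4)]
print(results)       # ['deliver', 'hold', 'discard', 'deliver', 'discard', 'deliver']
print(rx.delivered)  # ['pkt1', 'pkt2', 'pkt3', 'pkt4']
```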
Technique 2: Packet Retransmission
technique - sender resends a packet that has not been verified in a certain amount
of time; receiver sends a small message as the verification
acknowledgment (ACK) - small verification message
reliability problems solved - lost packets
handling lost packets -
for each outgoing packet
    repeat
        sender sends packet, keeps a copy, and sets timer
        if ACK arrives before timer expires
            discard packet copy
    until packet copy discarded
Technique 2: Packet Retransmission
limitation - doesn’t work for failed hardware (e.g., receiver is down); number of
retransmissions is usually bounded
drawback - can introduce duplicate packets; a sender cannot distinguish between a
lost packet and a slow-to-acknowledge packet
handling lost packets -
for each outgoing packet
    repeat
        sender sends packet, keeps a copy, and sets timer
        if ACK arrives before timer expires
            discard packet copy
    until packet copy discarded OR #retransmits == Max
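A minimal Python sketch of the bounded retransmission loop, under stated assumptions: the timer and the network are replaced by a hypothetical `channel` callable that reports whether an ACK arrived in time, and `max_retransmits` plays the role of Max.

```python
def send_with_retransmit(packet, channel, max_retransmits=3):
    """Send `packet`, resending on timeout, at most `max_retransmits` times.

    `channel(packet)` is a hypothetical stand-in for "transmit, start the
    timer, and wait": it returns True if an ACK arrived before the timer
    expired and False on timeout.
    Returns (delivered, transmissions_used).
    """
    copy = packet                      # sender keeps a copy until it is ACKed
    transmissions = 0
    while transmissions <= max_retransmits:
        transmissions += 1
        if channel(copy):              # ACK arrived before the timer expired
            copy = None                # discard the packet copy
            return True, transmissions
        # Timer expired: loop around and retransmit the saved copy.
    return False, transmissions        # give up: retransmit bound reached


def lossy_channel(outcomes):
    """Build a scripted channel from a list of ACK/timeout outcomes."""
    it = iter(outcomes)
    return lambda packet: next(it)


print(send_with_retransmit("data", lossy_channel([False, False, True])))  # (True, 3)
print(send_with_retransmit("data", lossy_channel([False] * 4)))           # (False, 4)
```

Note that the sender cannot tell whether the first two attempts were lost or merely slow, which is exactly how retransmission can create duplicates at the receiver.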
Technique 3: Session Identifiers
technique - sender and receiver require each packet to contain a session ID, such as
a timestamp of the session start time
reliability problems solved - replay - an old packet is misinterpreted as a current
packet
difficulties of replay - aggregation of nonsense; early inadvertent termination
of session
detection of replay packets -
for each incoming packet at a particular port
    if session ID is the current session ID
        process packet
    ELSE
        discard the packet
Technique 3: Session Identifiers
EXAMPLE OF REPLAY
Suppose computers A and B exchange data with packets numbered 1 to 5
Packet 4 encounters a long delay
Sender assumes packet 4 is lost so it retransmits
Receiver receives new packet 4 & sends ACK
Computers A and B communicate later with packets numbered 1 to 10
Original packet 4 arrives during the second session, so the data from old
packet 4 is inserted into the data at the receiver's side (where it is now
meaningless)
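The replay scenario can be sketched in Python to show how a session ID check filters out the stale packet. The `Packet` fields and the session start times 1000 and 2000 are invented for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    session_id: int   # hypothetical field, e.g. the session start timestamp
    seq: int
    data: str


def accept(packet, current_session_id):
    """Return True if the packet belongs to the current session."""
    return packet.session_id == current_session_id


# Session 1 started at (made-up) time 1000, session 2 at time 2000.
delayed_old_4 = Packet(session_id=1000, seq=4, data="old session data")
arrivals = [
    Packet(2000, 3, "new-3"),
    delayed_old_4,                 # original packet 4 from the first session
    Packet(2000, 4, "new-4"),
]
processed = [p.data for p in arrivals if accept(p, current_session_id=2000)]
print(processed)   # ['new-3', 'new-4']
```

Sequence numbers alone would not catch this case, because the old packet 4 carries a number that is valid in the second session.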
Technique 4: Flow Control
technique - sender controls the amount of data sent
reliability problems solved - data overrun - a source computer sends data faster
than the destination can receive it
methods - many;
we’ll discuss stop and go and sliding window
Technique 4: Flow Control
Stop and go - sender waits for an ACK after sending a packet before continuing
advantages - prevents data overrun; easy
disadvantages - very inefficient; wastes high bandwidth provided by
the hardware
Example-
packet size == 1000 octets
throughput == 2 Mbps
delay == 50 ms
one packet (stop and go) takes 100 ms
WHY?- 50 ms for the packet; 50 ms for the ACK
actual data rate - 80Kbps
WHY?- each packet takes 0.1 s; only 10 packets can be transmitted in one second; 10 packets is 80000 bits.
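The arithmetic in this example can be checked with a few lines of Python. Like the slide, the sketch ignores the time needed to clock the bits onto the link and counts only the 50 ms delay in each direction; the variable names are illustrative.

```python
packet_bits = 1000 * 8        # 1000 octets per packet
one_way_delay_s = 0.050       # 50 ms each way
link_rate_bps = 2_000_000     # 2 Mbps hardware rate

round_trip_s = 2 * one_way_delay_s              # packet out + ACK back
packets_per_second = 1 / round_trip_s
effective_bps = packets_per_second * packet_bits
utilization = effective_bps / link_rate_bps

print(f"{effective_bps / 1000:.0f} Kbps, {utilization:.0%} of the link")
# 80 Kbps, 4% of the link
```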
Technique 4: Flow Control
sliding window - sender waits for an ACK after sending n packets
before continuing, where n is the max window size
window size - max amount of data that can be sent before an acknowledgment arrives
requirements - incoming buffer must be large enough to accommodate the window
size of packets
handling sliding window - sender
sender extracts and sends n packets, keeping copies
while more packets to send
    if ACK of rightmost window packet arrives
        discard that packet copy
        move window to the left by one
end while
Technique 4: Flow Control
handling sliding window - receiver
repeat
    if packet arrives in sequence
        sends ACK
        passes up to upper layer software
    else if out of sequence
        sends ACK
        enqueues on out-of-order list
until no more packets
sliding window - exists on the sender’s side
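A loss-free simulation of the sender side may help make the window movement concrete. It is a sketch under simplifying assumptions: ACKs return in order, one per packet, so the oldest outstanding packet is always acknowledged first, and each ACK lets the window advance by one. Packets are identified by their index in the list.

```python
from collections import deque


def sliding_window_send(packets, window_size):
    """Simulate a loss-free sliding-window sender.

    Returns the event log as a list of ("send", seq) / ("ack", seq) tuples.
    """
    log = []
    in_flight = deque()                 # copies of sent-but-unACKed packets
    next_seq = 0

    # Fill the initial window.
    while next_seq < len(packets) and len(in_flight) < window_size:
        in_flight.append(next_seq)
        log.append(("send", next_seq))
        next_seq += 1

    # Each ACK for the oldest packet frees one slot in the window.
    while in_flight:
        acked = in_flight.popleft()     # discard the acknowledged copy
        log.append(("ack", acked))
        if next_seq < len(packets):     # window advances: send one more
            in_flight.append(next_seq)
            log.append(("send", next_seq))
            next_seq += 1
    return log


log = sliding_window_send(["a", "b", "c", "d", "e"], window_size=3)
print(log[:5])
# [('send', 0), ('send', 1), ('send', 2), ('ack', 0), ('send', 3)]
```

With `window_size=1` the same function degenerates to stop and go: every send is followed by a wait for its ACK.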
Transmission Time Comparison
stop and go - each packet takes 2N time to deliver (where N is the network latency)
sliding window - improves by a factor of W (where W is the number of
packets in the window)
improvement-
Tw == Throughput of sliding window
Tg == Throughput of stop and go
W == window size (in number of packets)
Tw = Tg * W
considering bandwidth (B == bandwidth of the underlying hardware) -
Tw = min (B, Tg * W)
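Plugging in the numbers from the earlier stop and go example (Tg = 80 Kbps on a 2 Mbps link) shows where the bandwidth cap takes over. The function name is illustrative.

```python
def sliding_window_throughput(tg_bps, window, bandwidth_bps):
    """Tw = min(B, Tg * W), with all rates in bits per second."""
    return min(bandwidth_bps, tg_bps * window)


for w in (1, 10, 25, 40):
    print(w, sliding_window_throughput(80_000, w, 2_000_000))
# 1 80000
# 10 800000
# 25 2000000
# 40 2000000
```

A window of 25 packets is enough to fill this link; larger windows gain nothing because the hardware bandwidth becomes the limit.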
Congestion
definition - each packet experiences excessive delay because the entire network is
overrun
congestion collapse - a phenomenon causing the network to be unusable because all packets are repeatedly discarded and retransmitted
where does it occur?- in network systems with multiple links
graph theoretically - out degree is smaller than in degree
queues - queues of certain nodes fill up and cause excessive queuing delays for enqueued packets
packets discarded - ultimately, queue fills and node discards packets unable to be enqueued--causing a retransmission because of a "lost packet"
Avoiding Congestion Collapse
continual monitoring - of queue sizes of destination switches
inform - switches inform senders when congestion occurs; senders then reduce the send rate
packet loss - nodes use packet loss as an indicator of congestion NOT faulty hardware
loss - inferred by absence of ACK’s
Introduction to Transport Level
Internet layer - protocols provide universal
datagram service between hosts
transport layer - provides end-to-end
delivery between two entities running on hosts such as processes or programs
endpoint - one such entity
User Datagram Protocol (UDP) - provides
connectionless datagram service between two endpoints
Transmission Control Protocol (TCP) - provides
reliable connection-oriented service between two endpoints
User Datagram Protocol (UDP)
capability - distinguishes among multiple
destinations within a single host destination
purpose - to deliver independent messages
called datagrams between programs on host machines
best effort delivery - unreliable - datagrams
may be lost, delivered out of order, duplicated
identifying the ultimate destination - we
need a way of uniquely specifying an endpoint
Identifying the Ultimate destination
port - a numeric identifier of an application
protocol port - any single endpoint within a
single host destination; also called ‘port’
unique identification - ordered pair
including IP address and port number
source and destination - may use different
port numbers for the same interaction
Transport and IP
UDP and TCP - use IP to provide data delivery
by encapsulating their own data units into IP datagrams
UDP Header
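For reference, the UDP header defined in RFC 768 has four 16-bit fields: source port, destination port, length (header plus data, in octets), and checksum, for 8 octets in all. The Python sketch below packs and parses that layout; the function names are made up for illustration.

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")   # network byte order, four 16-bit fields


def build_udp_header(src_port, dst_port, payload, checksum=0):
    length = UDP_HEADER.size + len(payload)   # header (8 octets) + data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum)


def parse_udp_header(datagram):
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {"src_port": src, "dst_port": dst,
            "length": length, "checksum": checksum}


header = build_udp_header(5000, 69, b"hello")
print(header.hex())                        # 13880045000d0000
print(parse_udp_header(header + b"hello"))
# {'src_port': 5000, 'dst_port': 69, 'length': 13, 'checksum': 0}
```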
UDP Checksum
UDP unreliable - checksum is one (optional)
mechanism to verify that data has arrived intact
checksum not used - value of zero
computed value zero? - UDP uses the ‘other’
ones-complement representation of zero: all ones.
uses pseudoheader - in order to verify the
correct destination, fields outside the UDP header must be checked
not transmitted - prepended to the datagram
at both ends before checksum computation
pseudoheader format - standard specified
format of 12 octets of data
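A sketch of the checksum computation for UDP over IPv4, assuming the 12-octet pseudoheader layout from RFC 768 (source address, destination address, a zero octet, the protocol number, and the UDP length). The helper names and the addresses are invented for the example.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones-complement sum of `data`, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return total


def pseudoheader(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """12 octets that are checksummed but never transmitted."""
    return struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                       socket.inet_aton(dst_ip), 0,
                       socket.IPPROTO_UDP, udp_length)


def udp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """`segment` is the UDP header (checksum field zeroed) plus data."""
    total = ones_complement_sum(
        pseudoheader(src_ip, dst_ip, len(segment)) + segment)
    checksum = ~total & 0xFFFF
    return checksum or 0xFFFF        # a computed zero is sent as all ones


def checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """Receiver check: with the checksum in place the sum is all ones."""
    total = ones_complement_sum(
        pseudoheader(src_ip, dst_ip, len(segment)) + segment)
    return total == 0xFFFF


src, dst, payload = "10.0.0.1", "10.0.0.2", b"hi"
segment = struct.pack("!HHHH", 5000, 7, 8 + len(payload), 0) + payload
csum = udp_checksum(src, dst, segment)
sent = segment[:6] + struct.pack("!H", csum) + segment[8:]

print(checksum_ok(src, dst, sent))                  # True
print(checksum_ok(src, dst, sent[:-1] + b"j"))      # False: corrupted data
print(checksum_ok(src, "10.0.0.3", sent))           # False: wrong destination
```

The last line shows why the pseudoheader matters: the datagram itself is intact, but it fails the check when it is evaluated against a different destination address.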
Selecting UDP Port Numbers
coordination - endpoints must agree on a port
number
server - opens selected port and waits for
incoming messages
client - selects local port and sends message
to selected port
local or centrally assigned? - both; local
applications use dynamic binding and well known applications use
centrally assigned well known port numbers
Dynamic binding - network software
o ECHO - 7
o TIME OF DAY - 13
o TFTP - 69
o Quote of the Day - 17
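The server and client roles can be tried out on a single machine with Python's standard `socket` module. This is a minimal loopback sketch: binding to port 0 asks the operating system to pick a free port (dynamic binding), whereas a well-known service would bind its assigned number, such as 7 for ECHO, which usually requires administrator rights.

```python
import socket

# Server: open a port and wait for incoming messages.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS choose
server.settimeout(2.0)
server_addr = server.getsockname()     # (IP address, port) identifies the endpoint

# Client: the OS selects a local port when the first datagram is sent.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)
client.sendto(b"hello", server_addr)

data, client_addr = server.recvfrom(1024)
server.sendto(data.upper(), client_addr)   # reply to the client's own port
reply, _ = client.recvfrom(1024)

print(reply)                               # b'HELLO'
print(server_addr[1] != client_addr[1])    # True: each side uses its own port
client.close()
server.close()
```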
Applications that use UDP
o example: client side - the programmer declares certain procedures as remote;
o example: server side - the programmer implements these procedures and uses
other RPC facilities to declare them to be part of a server
Transmission Control Protocol (TCP)
popularity - most widely used transport
protocol
uses IP - provides reliable data delivery by
using IP unreliable datagram delivery
reliable - compensates for loss, delay,
duplication and out-of-order packets
reliability - recall, this takes place at
level 4 and 2
Characteristics of TCP
connection oriented - application requests
connection to destination and then uses connection to transfer data
point-to-point - a TCP connection has two
endpoints: source application and destination application
reliability - TCP guarantees data will be
delivered without loss, duplication or transmission errors
full duplex - the endpoints of a TCP
connection can exchange data in both directions simultaneously
stream interface - application delivers data
to TCP as a continuous stream, with no record boundaries
virtual connection - TCP makes no guarantees
that data will be received in same blocks as transmitted
reliable connection startup - three-way
handshake that guarantees reliable, synchronized startup between endpoints
graceful connection shutdown - TCP guarantees
to deliver all data reliably before terminating the connection
TCP and IP
similar to UDP - TCP uses IP for data
delivery
interpretation - IP treats TCP like data and
does not interpret any contents of the TCP message
similar to UDP - endpoints are identified by ports
parallelism - Allows multiple connections on
each host
identifier - ports may be associated with an
application or a process
TCP Delivery
encapsulation - TCP data units travel in IP
datagrams
routers - examine IP header to forward
datagrams; TCP at destination interprets TCP messages
Recovers from -
techniques described before:
Packet Loss and Retransmission
Technique - TCP
uses positive acknowledgment with retransmission to achieve reliable data
delivery
receiver - sends
acknowledgment control messages (ACK) to sender to verify successful receipt of
data
sender - sets
timer when data transmitted; if timer expires before acknowledgment arrives,
sender retransmits (with new timer)
TCP Data Units
stream technique - application delivers
arbitrarily large chunks of data to TCP as a "stream"
data entity - TCP breaks this data into segments,
each of which fits into an IP datagram
each segment - 20 byte header, then data
tracking segments in streams - original
stream is numbered by bytes; each segment contains sequence number of data
bytes
Acknowledgments (ACKs)
receiver - sends segment with sequence number
of acknowledged data (a position in the byte stream, not a segment number; it
covers the valid data received so far)
one ACK - can acknowledge many segments
(i.e., encompassing datagrams)
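Byte-based numbering and cumulative acknowledgment can be sketched as follows. The function names and the segment size parameter `mss` are illustrative, and the stream is numbered from 0 for readability; in TCP the acknowledgment number names the next byte the receiver expects.

```python
def segment_stream(stream: bytes, mss: int, initial_seq: int = 0):
    """Split a byte stream into (sequence_number, data) segments."""
    return [
        (initial_seq + offset, stream[offset:offset + mss])
        for offset in range(0, len(stream), mss)
    ]


def cumulative_ack(received, initial_seq: int = 0) -> int:
    """Return the next byte expected, given the segments received so far."""
    by_seq = dict(received)
    next_byte = initial_seq
    while next_byte in by_seq:          # walk forward while data is contiguous
        next_byte += len(by_seq[next_byte])
    return next_byte


segments = segment_stream(b"abcdefghij", mss=4)
print(segments)                                    # [(0, b'abcd'), (4, b'efgh'), (8, b'ij')]
print(cumulative_ack([segments[0], segments[2]]))  # 4: the middle segment is missing
print(cumulative_ack(segments))                    # 10: one ACK covers all three segments
```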
Retransmission Timer
purpose - determines max wait before
re-sending an assumed lost datagram
danger - inappropriate timeout can cause poor
performance:
too long - sender waits longer than necessary
before retransmitting; degrades performance
too short - sender generates unnecessary
traffic
varying connection distances - mandate that
timeout must be different for each connection and is set dynamically
local vs satellite link - host on same LAN
should have shorter timeout than host 20 hops away
another variation - delivery time across
internet may change over time; timeout must accommodate changes
Choosing a Timeout Value
adaptive - should accommodate all variations
of delay
Round trip time (RTT) - time from message
send to corresponding ACK received
timestamp - TCP timestamps all segments
maintains estimate of RTO - using the current
RTT, a current estimate is constantly refined
might be foiled - heuristics can sometimes
fail in the case of a quick change in RTT
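The slides do not give a formula for the estimate. One widely used approach, specified in RFC 6298, keeps an exponentially weighted average of the RTT and of its variation; the sketch below follows that scheme but leaves out the RFC's minimum RTO and clock granularity terms. The class name is made up, and times are in milliseconds.

```python
class RtoEstimator:
    """Smoothed RTT / RTO estimate in the style of RFC 6298 (sketch)."""

    ALPHA = 1 / 8   # weight of the newest sample in the smoothed RTT
    BETA = 1 / 4    # weight of the newest deviation in the RTT variation

    def __init__(self):
        self.srtt = None      # smoothed round trip time
        self.rttvar = None    # smoothed RTT variation

    def update(self, sample):
        """Fold one RTT measurement into the estimate; return the new RTO."""
        if self.srtt is None:               # first measurement
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            deviation = abs(self.srtt - sample)
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * deviation
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.srtt + 4 * self.rttvar


est = RtoEstimator()
rtos = [est.update(rtt_ms) for rtt_ms in (100, 100, 100, 500)]
print(rtos)   # [300.0, 250.0, 212.5, 634.375]
```

While the RTT is stable the timeout tightens toward it; the jump to 500 ms illustrates the "might be foiled" case, since the 212.5 ms timeout in force when that segment was sent would have expired before its ACK arrived.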
Applications that use TCP
SMTP - Simple Mail Transfer Protocol -
protocol for transferring e-mail messages from one machine to another; SMTP
specifies how two mail systems interact and the format of control messages they
exchange to transfer mail.
Rlogin - Remote Login - remote terminal
protocol developed for UNIX by Berkeley; offers essentially the same service as
TELNET
Applications that use TCP
TELNET - the TCP/IP standard protocol for
remote terminal service; it allows a user at one site to interact with a remote
timesharing system at another site as if the user’s keyboard and display were
connected directly to the remote machine.
FTP - File Transfer Protocol- protocol for
transferring files from one machine to another; differs from TFTP by providing
interactive access, format specification, and authentication control.