CPS 472/572 Computer Networking
Lecture 10: Transport Level Protocols
TCP and UDP

Dr. Jennifer Seitzer

Motivation:

The transport-level protocols underlie the applications of the Internet. There are two main protocols. The first is the User Datagram Protocol (UDP), a connectionless, unreliable protocol that offers, one level higher, much the same service as IP. The second, and most popular, is TCP, the Transmission Control Protocol. TCP provides reliable transport: it ensures that no packets are lost, duplicated, or delivered out of order, and that intact packets are acknowledged on receipt. Tonight, we study both transport-level protocols, including their behaviors, packet formats, algorithms, and implementations.

 

Why is Reliability Needed?

problem - corrupted bits during transmission

solution - perform error detection and correction

protocol layer with solution - data link where techniques such as parity, checksum, and CRC were presented

transport protocols - attempt to avoid transmission problems at the packet level (rather than the bit level)

Problems to Avoid

out of order delivery - data decomposed into packets may arrive at destination out of sequence (e.g., consider an EMAIL message composed of erratic pieces)

duplicate packets - multiple copies of the same piece arrive at destination

lost packets - missing chunks from overall data unit

replay - retransmission of an old, delayed packet

data overrun - data is lost due to differing processing speeds of interacting nodes

congestion - each packet experiences excessive delays because the network is overrun with packets

 

Underlying Goal: Reliable Transfer

reliable transfer - transfer of data without loss, ordered correctly, and no duplications

aren’t all protocols reliable? - Yes and No.

well written - YES -- refereed and peer-reviewed before becoming a standard

reliable in the communication context - No. Reordering, retransmitting, and duplication checking take time and add complexity.

Example of an unreliable protocol - UDP--User Datagram Protocol; provides a mechanism for a single host to communicate with multiple destinations independently. (via port numbers)

Technique 1: Sequencing

technique - sender numbers the packets; receiver tracks the number of the last packet received in order AND maintains a list of stray, out-of-order packets

reliability problems solved - out-of-order and duplicate packets

handling out-of-order packets -
for incoming packet with sequence number j
         if j-1 is the current "last_num"
                  update last_num to j
                  send packet j to next layer protocol
                  release any packets on the out-of-order list that now follow in sequence
         ELSE 
                  add packet j to out-of-order list

duplicate packets - often occur in WANs; possible in LANs due to malfunctioning access hardware, for example.

handling duplicate packets -
for incoming packet with sequence number j
         if "last_num" >= j
                  discard packet
         ELSE if j appears on out-of-order list
                  discard packet
         ELSE 
                  not a duplicate
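
The two sequencing checks above can be combined into one receiver. The following is a minimal Python sketch, not a definitive implementation: the class name SequencingReceiver, the method receive, and the list delivered (standing in for the next-layer protocol) are illustrative names, and sequence numbers are assumed to start at 1.

```python
class SequencingReceiver:
    """Receiver that delivers packets in order and drops duplicates."""

    def __init__(self):
        self.last_num = 0        # highest sequence number delivered in order
        self.out_of_order = {}   # stray packets, keyed by sequence number
        self.delivered = []      # stands in for the next-layer protocol

    def receive(self, seq, data):
        # Duplicate: already delivered, or already waiting on the list.
        if seq <= self.last_num or seq in self.out_of_order:
            return "discarded"
        # Not the next packet in sequence: hold it on the out-of-order list.
        if seq != self.last_num + 1:
            self.out_of_order[seq] = data
            return "queued"
        # In-order packet: deliver it, then drain any packets it unblocks.
        self.delivered.append(data)
        self.last_num = seq
        while self.last_num + 1 in self.out_of_order:
            self.last_num += 1
            self.delivered.append(self.out_of_order.pop(self.last_num))
        return "delivered"


receiver = SequencingReceiver()
for seq, data in [(1, "a"), (3, "c"), (3, "c"), (2, "b"), (1, "a"), (4, "d")]:
    print(seq, receiver.receive(seq, data))
print(receiver.delivered)   # ['a', 'b', 'c', 'd']
```

Packet 3 waits on the out-of-order list until packet 2 arrives, and the repeated copies of packets 3 and 1 are discarded.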

Technique 2: Packet Retransmission

technique - sender resends a packet that has not been verified in a certain amount of time; receiver sends a small message as the verification

acknowledgment (ACK) - small verification message

reliability problems solved - lost packets

handling lost packets -
for each outgoing packet
sender keeps a copy
repeat
         sender transmits the packet and sets timer
         if ACK arrives before timer expires
                  discard packet copy
until packet copy discarded
        

limitation - doesn’t work for failed hardware (e.g., receiver is down); the number of retransmissions is usually bounded

drawback - can introduce duplicate packets; a sender cannot distinguish between a lost packet and a slow-to-acknowledge packet

handling lost packets -
for each outgoing packet
sender keeps a copy
repeat
         sender transmits the packet and sets timer
         if ACK arrives before timer expires
                  discard packet copy
until packet copy discarded OR #retransmits == Max
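
The bounded retransmission loop can be sketched in Python as follows. This is an illustration under stated assumptions: channel is a hypothetical callable that stands in for "transmit, then wait for the timer" and returns True when an ACK arrives in time; make_lossy_channel is a test helper invented here, not part of any real library.

```python
def send_reliably(packet, channel, max_retransmits=3):
    """Send one packet, retransmitting until an ACK arrives or the bound is hit.

    Returns the number of transmissions used, or None if the sender gave up.
    """
    copy = packet                      # sender keeps a copy until it is ACKed
    # One initial transmission plus at most max_retransmits retransmissions.
    for attempt in range(1, max_retransmits + 2):
        if channel(copy):              # ACK arrived before the timer expired
            return attempt             # the copy can now be discarded
    return None                        # bound reached; report failure


def make_lossy_channel(losses):
    """Hypothetical channel that loses the first `losses` transmissions."""
    state = {"remaining": losses}

    def channel(_packet):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            return False               # timer expired with no ACK
        return True

    return channel


print(send_reliably("pkt-1", make_lossy_channel(2)))    # 3
print(send_reliably("pkt-2", make_lossy_channel(10)))   # None
```

The sketch cannot tell a lost packet from a slow ACK either, which is why retransmission is paired with the duplicate check from Technique 1.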

Technique 3: Session Identifiers

technique - sender and receiver require each packet to contain a session identifier, such as a timestamp of the session start time

reliability problems solved - replay - an old packet is misinterpreted as a current packet

difficulties of replay - aggregation of nonsense; early inadvertent termination of session

detection of replay packets -
for each incoming packet at a particular port
         if session ID is the current session ID
                  process packet
         ELSE 
                  discard the packet

EXAMPLE OF REPLAY

Suppose computers A and B exchange data with packets numbered 1 to 5

Packet 4 encounters a long delay

Sender assumes packet 4 is lost so it retransmits

Receiver receives new packet 4 & sends ACK

Computers A and B communicate later with packets numbered 1 to 10

Original packet 4 arrives during the second session, so the data from old packet 4 is inserted into the data at the receiver’s side (where it is now meaningless)
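
The replay scenario can be reproduced with a small session-ID filter. This is a sketch only: the dictionary layout of a packet and the session values 1001 and 1002 (imagined session start timestamps) are assumptions made for the example.

```python
def process(packets, current_session):
    """Keep only the payloads whose session ID matches the current session."""
    accepted = []
    for packet in packets:
        if packet["session"] == current_session:
            accepted.append(packet["data"])
        # otherwise the packet belongs to an earlier session and is discarded
    return accepted


# Second session between A and B. The delayed packet 4 of the first
# session shows up in the middle of it.
first_session, second_session = 1001, 1002
arrivals = [
    {"session": second_session, "seq": 3, "data": "new-3"},
    {"session": first_session, "seq": 4, "data": "old-4"},   # replayed packet
    {"session": second_session, "seq": 4, "data": "new-4"},
]
print(process(arrivals, second_session))   # ['new-3', 'new-4']
```

Sequence numbers alone would not catch the old packet 4, because the second session also uses the number 4; the session ID is what separates the two.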

Technique 4: Flow Control

technique - sender controls the amount of data sent

reliability problems solved - data overrun - a source computer sends data faster than the destination can receive it

methods - many; we’ll discuss stop and go and sliding window

Stop and go - sender waits for an ACK after sending a packet before continuing

advantages - prevents data overrun; easy

disadvantages - very inefficient; wastes high bandwidth provided by the hardware

Example-
packet size == 1000 octets
throughput == 2 Mbps
delay == 50 ms

one packet (stop and go) takes 100 ms

WHY?- 50 ms for the packet; 50 ms for the ACK (the 4 ms needed to transmit 8000 bits at 2 Mbps is ignored)

actual data rate - 80 Kbps

WHY?- each packet takes 0.1 s; only 10 packets can be transmitted in one second; 10 packets is 80000 bits.
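
The arithmetic of this example can be checked with a few lines of Python. The function name stop_and_go_rate_bps is illustrative; like the example, the sketch counts only the two 50 ms delays and ignores the time needed to put the packet on the link.

```python
PACKET_BITS = 1000 * 8          # 1000 octets
ONE_WAY_DELAY_MS = 50           # network delay in each direction
LINK_BPS = 2_000_000            # 2 Mbps hardware throughput


def stop_and_go_rate_bps(packet_bits, one_way_delay_ms):
    """Data rate when one packet is sent per round trip."""
    round_trip_ms = 2 * one_way_delay_ms         # packet out, ACK back
    return packet_bits * 1000 / round_trip_ms    # bits per second


rate = stop_and_go_rate_bps(PACKET_BITS, ONE_WAY_DELAY_MS)
print(rate)               # 80000.0 bits per second, i.e. 80 Kbps
print(rate / LINK_BPS)    # 0.04, so only 4% of the link is used
```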

sliding window - sender waits for an ACK after sending n packets before continuing, where n is the max window size

window size - max amount of data that can be sent before an acknowledgment arrives

requirements - incoming buffer must be large enough to accommodate the window size of packets

handling sliding window - sender
sender extracts and sends n packets, keeping copies
while more packets to send
         if ACK of leftmost (oldest) window packet arrives
                  discard that packet copy
                  move window to the right by one and send the packet that enters it
end while      

handling sliding window - receiver
repeat
         if packet arrives in sequence
                  sends ACK
                  passes up to upper layer software
         else if out of sequence
                  sends ACK
                  enqueues on out-of-order list
until no more packets

sliding window - exists on the sender’s side
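
A lossless sliding-window transfer can be simulated in a few lines. This is a simplified sketch, assuming no loss and in-order arrival so that each step ACKs the oldest outstanding packet; the function name sliding_window_send and its return values are illustrative.

```python
from collections import deque


def sliding_window_send(packets, window_size):
    """Simulate a lossless sliding-window transfer.

    Returns the largest number of unacknowledged packets outstanding at
    once and the list of packets delivered to the receiver.
    """
    in_flight = deque()        # copies of sent-but-unACKed packets (the window)
    delivered = []             # receiver side: packets passed to the upper layer
    next_to_send = 0
    max_outstanding = 0

    while next_to_send < len(packets) or in_flight:
        # Fill the window: send until window_size packets are outstanding.
        while next_to_send < len(packets) and len(in_flight) < window_size:
            in_flight.append(packets[next_to_send])
            next_to_send += 1
        max_outstanding = max(max_outstanding, len(in_flight))
        # The oldest packet arrives and is ACKed: drop its copy, which
        # slides the window forward by one.
        delivered.append(in_flight.popleft())

    return max_outstanding, delivered


print(sliding_window_send(["p1", "p2", "p3", "p4", "p5"], window_size=3))
# (3, ['p1', 'p2', 'p3', 'p4', 'p5'])
```

The sender never has more than window_size packets outstanding, which is the bound the receiver's incoming buffer must accommodate.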

Transmission Time Comparison

stop and go - each packet takes 2N time to deliver (where N is the network latency)

sliding window - improves by a factor of W (where W is the number of packets in the window)

improvement-
Tw == Throughput of sliding window
Tg == Throughput of stop and go
W == window size (in number of packets)
Tw = Tg * W

considering bandwidth-
Tw = min (B, Tg * W)
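
The bandwidth-limited formula can be tried with the numbers from the stop and go example (Tg = 80 Kbps, B = 2 Mbps). The function name and the chosen window sizes are illustrative only.

```python
def sliding_window_throughput(tg_bps, window, bandwidth_bps):
    """Tw = min(B, Tg * W): the window multiplies throughput up to the link limit."""
    return min(bandwidth_bps, tg_bps * window)


TG_BPS = 80_000        # stop and go throughput from the earlier example
B_BPS = 2_000_000      # hardware bandwidth

for w in (1, 10, 25, 50):
    print(w, sliding_window_throughput(TG_BPS, w, B_BPS))
# 1 80000
# 10 800000
# 25 2000000
# 50 2000000
```

With these numbers a window of 25 packets already fills the link, so a larger window gives no further gain.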

Congestion

definition - each packet experiences excessive delay because the entire network is overrun

congestion collapse - a phenomenon causing the network to be unusable because all packets are repeatedly discarded and retransmitted

where does it occur?- in network systems with multiple links

graph theoretically - out degree is smaller than in degree

queues - queues of certain nodes fill up and cause excessive queuing delays for enqueued packets

packets discarded - ultimately, queue fills and node discards packets unable to be enqueued--causing a retransmission because of a "lost packet"
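
How a full queue turns into discarded packets can be seen in a tiny simulation of one node whose incoming rate exceeds its outgoing rate. The model (fixed rates per tick, arrivals handled before departures) and the function name simulate_queue are simplifying assumptions made for this sketch.

```python
def simulate_queue(arrivals_per_tick, departures_per_tick, capacity, ticks):
    """Return (final queue length, packets discarded) for one congested node."""
    queue_len = 0
    dropped = 0
    for _ in range(ticks):
        space = capacity - queue_len
        accepted = min(arrivals_per_tick, space)
        dropped += arrivals_per_tick - accepted    # no room: packet is discarded
        queue_len += accepted
        queue_len -= min(queue_len, departures_per_tick)
    return queue_len, dropped


print(simulate_queue(3, 2, capacity=10, ticks=20))   # overloaded node
print(simulate_queue(2, 2, capacity=10, ticks=20))   # balanced node
```

In the overloaded case the queue grows until it is full, and from then on one packet per tick is discarded; each of those would later be retransmitted as a "lost packet", adding to the load.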

Avoiding Congestion Collapse

continual monitoring - of queue sizes of destination switches

inform - switches inform senders when congestion occurs; senders then reduce the send rate

packet loss - nodes use packet loss as an indicator of congestion NOT faulty hardware

loss - inferred by absence of ACKs

Introduction to Transport Level

Internet layer - protocols provide universal datagram service between hosts

transport layer - provides end-to-end delivery between two entities running on hosts such as processes or programs

endpoint - one such entity

User Datagram Protocol (UDP) - provides connectionless datagram service between two endpoints

Transmission Control Protocol (TCP) - provides reliable connection-oriented service between two endpoints

User Datagram Protocol (UDP)

capability - distinguishes among multiple destinations within a single host destination

purpose - to deliver independent messages called datagrams between programs on host machines

best effort delivery - unreliable - datagrams may be lost, delivered out of order, duplicated

identifying the ultimate destination - we need a way of uniquely specifying an endpoint

Identifying the Ultimate destination

port - a numeric identifier of an application

protocol port - any single endpoint within a single host destination; also called ‘port’

unique identification - ordered pair including IP address and port number

source and destination - may use different port numbers for the same interaction

Transport and IP

UDP and TCP - use IP to provide data delivery by encapsulating their own data units into IP datagrams

UDP Header

UDP Checksum

UDP unreliable - checksum is one (optional) mechanism to verify that data has arrived intact

checksum not used - value of zero

computed value zero? - UDP uses the ‘other’ ones-complement representation of zero: all ones

uses pseudoheader - in order to verify the correct destination, fields outside the UDP header must be checked

not transmitted - prepended to the datagram at both ends before checksum computation

pseudoheader format - standard specified format of 12 octets of data
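
The checksum computation can be sketched in Python for IPv4. The sketch assumes the standard layouts: a 12-octet pseudoheader of source IP address, destination IP address, a zero octet, the protocol number (17 for UDP) and the UDP length, followed by the 8-octet UDP header of source port, destination port, length and checksum. The function names and the sample addresses and ports are illustrative.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones-complement sum of data, padded to a whole 16-bit word."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return total


def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> int:
    length = 8 + len(payload)                      # UDP header is 8 octets
    # Pseudoheader: prepended for the computation only, never transmitted.
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_UDP, length))
    # Header with the checksum field set to zero while computing.
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    checksum = ~ones_complement_sum(pseudo + header + payload) & 0xFFFF
    return checksum or 0xFFFF                      # computed zero is sent as all ones


payload = b"hello"
print(hex(udp_checksum("10.0.0.1", "10.0.0.2", 5000, 7, payload)))
```

A receiver repeats the sum over the pseudoheader, header (with the received checksum in place) and data; an intact datagram yields all ones (0xFFFF).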

Selecting UDP Port Numbers

coordination - endpoints must agree on a port number

server - opens selected port and waits for incoming messages

client - selects local port and sends message to selected port

local or centrally assigned? - both; local applications use dynamic binding and well known applications use centrally assigned well known port numbers

Dynamic binding - network software assigns a port number when a program requests one

Well-known UDP port numbers - examples:

o      ECHO - 7

o      TIME OF DAY - 13

o      TFTP - 69

o      Quote of the Day - 17
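
Port selection can be tried with Python's standard socket module on the loopback interface. This is a minimal sketch: binding to port 0 asks the operating system to pick a free port, which plays the role of dynamic binding here; a well-known service such as ECHO would bind its assigned number (7) instead. The 2-second timeouts and the echo behavior are choices made for the example.

```python
import socket

# Server: opens a port and waits for incoming messages.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))            # port 0: let the OS choose a free port
server.settimeout(2.0)
server_port = server.getsockname()[1]

# Client: the OS selects its local port when the first datagram is sent.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)
client.sendto(b"hello", ("127.0.0.1", server_port))

# The server learns the client's (IP address, port) pair from the datagram
# and uses it to send the reply.
message, client_addr = server.recvfrom(1024)
server.sendto(message, client_addr)
reply, _ = client.recvfrom(1024)

print("server port:", server_port, "client port:", client_addr[1], "reply:", reply)
server.close()
client.close()
```

The two endpoints use different port numbers for the same interaction, and each is identified by its (IP address, port) pair.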

Applications that use UDP

o      Remote Procedure Call (RPC) example: client side - the programmer declares certain procedures as remote;

o      Remote Procedure Call (RPC) example: server side - the programmer implements these procedures and uses other RPC facilities to declare them to be part of a server

Transmission Control Protocol (TCP)

popularity - most widely used transport protocol

uses IP - provides reliable data delivery by using IP unreliable datagram delivery

reliable - compensates for loss, delay, duplication and out-of-order packets

reliability - recall, this takes place at levels 4 (transport) and 2 (data link)

Characteristics of TCP

connection oriented - application requests connection to destination and then uses the connection to transfer data

point-to-point - a TCP connection has two endpoints: source application and destination application

reliability - TCP guarantees data will be delivered without loss, duplication or transmission errors

full duplex - the endpoints of a TCP connection can exchange data in both directions simultaneously

stream interface - application delivers data to TCP as a continuous stream, with no record boundaries

virtual connection - TCP makes no guarantees that data will be received in same blocks as transmitted

reliable connection startup - three-way handshake that guarantees reliable, synchronized startup between endpoints

graceful connection shutdown - TCP guarantees to deliver all data reliably before terminating the connection

TCP and IP

similar to UDP - TCP uses IP for data delivery

interpretation - IP treats TCP like data and does not interpret any contents of the TCP message

similar to UDP - endpoints are identified by ports

parallelism - Allows multiple connections on each host

identifier - ports may be associated with an application or a process

TCP Delivery

encapsulation - TCP data units travel in IP datagrams

routers - examine IP header to forward datagrams; TCP at destination interprets TCP messages

Recovers from -

techniques described before:

Packet Loss and Retransmission

Technique - TCP uses positive acknowledgment with retransmission to achieve reliable data delivery

receiver - sends acknowledgment control messages (ACK) to sender to verify successful receipt of data

sender - sets timer when data transmitted; if timer expires before acknowledgment arrives, sender retransmits (with new timer)

TCP Data Units

stream technique - application delivers arbitrarily large chunks of data to TCP as a "stream"

data entity - breaks this data into segments, each of which fits into an IP datagram

each segment - header (20 bytes when no options are present), then data

tracking segments in streams - original stream is numbered by bytes; each segment contains sequence number of data bytes

Acknowledgments (ACKs)

receiver - sends segment with an acknowledgment number based on byte sequence numbers, not segment numbers; it marks the valid data received so far (the next byte expected)

one ACK - can acknowledge many segments (i.e., encompassing datagrams)
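
Byte-numbered segments and a single ACK covering several of them can be illustrated as follows. This is a sketch, not TCP itself: the function names are illustrative, the initial sequence number is taken to be 0 for readability (real TCP chooses one at connection startup), and the ACK value is the number of the next byte expected.

```python
def segment_stream(stream: bytes, max_segment_size: int, initial_seq: int = 0):
    """Split a byte stream into (sequence_number, data) segments."""
    return [(initial_seq + i, stream[i:i + max_segment_size])
            for i in range(0, len(stream), max_segment_size)]


def cumulative_ack(received, initial_seq: int = 0) -> int:
    """Return the number of the next byte expected, given received segments."""
    next_expected = initial_seq
    for seq, data in sorted(received):
        if seq > next_expected:
            break                      # gap: later segments cannot be ACKed yet
        next_expected = max(next_expected, seq + len(data))
    return next_expected


segments = segment_stream(b"abcdefghij", max_segment_size=4)
print(segments)                                     # [(0, b'abcd'), (4, b'efgh'), (8, b'ij')]
print(cumulative_ack([segments[0], segments[2]]))   # 4: bytes 4..7 are still missing
print(cumulative_ack(segments))                     # 10: one ACK covers all three segments
```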

Retransmission Timer

purpose - determines max wait before re-sending an assumed lost datagram

danger - inappropriate timeout can cause poor performance:

too long - sender waits longer than necessary before retransmitting; degrades performance

too short - sender generates unnecessary traffic

varying connection distances - mandate that timeout must be different for each connection and is set dynamically

local vs satellite link - host on same LAN should have shorter timeout than host 20 hops away

another variation - delivery time across internet may change over time; timeout must accommodate changes

Choosing a Timeout Value

adaptive - should accommodate all variations of delay

Round trip time (RTT) - time from message send to corresponding ACK received

timestamp - TCP timestamps all segments

maintains estimate of RTO - using the current RTT, a current estimate is constantly refined

might be foiled - heuristics can sometimes fail in the case of a quick change in RTT
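
One way to keep a running timeout estimate is a weighted average of RTT samples plus a margin for their variation. The sketch below follows the smoothing rule standardized for TCP in RFC 6298 (weights 1/8 and 1/4, timeout = smoothed RTT + 4 x variation); it leaves out that standard's minimum timeout and clock-granularity terms, and the class name RtoEstimator and the sample values are illustrative.

```python
class RtoEstimator:
    ALPHA = 1 / 8     # weight of the newest sample in the smoothed RTT
    BETA = 1 / 4      # weight of the newest sample in the RTT variation

    def __init__(self):
        self.srtt = None      # smoothed round trip time
        self.rttvar = None    # smoothed deviation of the samples

    def update(self, sample_rtt):
        """Fold one RTT measurement (seconds) into the estimate; return the RTO."""
        if self.srtt is None:                      # first measurement
            self.srtt = sample_rtt
            self.rttvar = sample_rtt / 2
        else:
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - sample_rtt))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample_rtt
        return self.srtt + 4 * self.rttvar


estimator = RtoEstimator()
for rtt in (0.100, 0.100, 0.100, 0.500, 0.500):
    print(f"sample {rtt:.3f} s -> timeout {estimator.update(rtt):.3f} s")
```

While the samples stay at 100 ms the timeout tightens; when the RTT jumps to 500 ms, the timeout in force at that moment is shorter than the new RTT, which is the kind of quick change that can trigger a needless retransmission.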

Applications that use TCP

SMTP - Simple Mail Transfer Protocol - protocol for transferring e-mail messages from one machine to another; SMTP specifies how two mail systems interact and the format of control messages they exchange to transfer mail.

Rlogin - Remote Login - remote terminal protocol developed for UNIX by Berkeley; offers essentially the same service as TELNET

TELNET - the TCP/IP standard protocol for remote terminal service; it allows a user at one site to interact with a remote timesharing system at another site as if the user’s keyboard and display were connected directly to the remote machine.

FTP - File Transfer Protocol - protocol for transferring files from one machine to another; differs from TFTP by providing interactive access, format specification, and authentication control.