APPENDIX D: NETWORKING BASICS
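Computers running on the Internet communicate with each other using either the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP). Both are transport-layer protocols in the four-layer Internet model: application (HTTP, ftp, telnet), transport (TCP, UDP), network (IP), and link (device drivers).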
When you write Java programs that communicate over the network, you are programming at the application layer. Typically, you don't need to concern yourself with the TCP and UDP layers--instead you can use the classes in the java.net package. These classes provide system-independent network communication. However, you do need to understand the difference between TCP and UDP to decide which Java classes your programs should use.
When two applications want to communicate with each other reliably, they establish a connection and send data back and forth over that connection. This is analogous to making a telephone call--if you want to speak to Aunt Beatrice in Kentucky, a connection is established when you dial her phone number and she answers. You send data back and forth over the connection by speaking to one another over the phone lines. Like the phone company, TCP guarantees that data sent from one end of the connection actually gets to the other end, in the same order in which it was sent (otherwise an error is reported).
Definition: TCP is a connection-based protocol that provides a reliable flow of data between two computers.
Applications that require a reliable, point-to-point channel use TCP. Hypertext Transfer Protocol (HTTP), File Transfer Protocol (ftp), and Telnet (telnet) are all examples of applications that require a reliable communication channel. The order in which the data is sent and received over the network is critical to the success of these applications--when using HTTP to read from a URL, the data must be received in the order in which it was sent; otherwise you end up with a jumbled HTML file, a corrupt zip file, or some other invalid information.
For many applications this guarantee of reliability is critical to the success of the transfer of information from one end of the connection to the other. However, other forms of communication don't require such strict guarantees and are in fact hindered by them, either because of the performance hit from the extra overhead or because the reliable connection invalidates the service altogether.
Consider, for example, a clock server that sends the current time to its client when requested to do so. If the client misses a packet, does it really make sense to resend the packet? No, because the time won't be correct by the time the client receives it. If the client makes two requests and receives packets from the server out of order, it doesn't really matter because the client can figure out that the packets are out of order and request another one. The reliable channel here is unnecessary, causes performance degradation, and may hinder the usefulness of the service.
Another example of a service that doesn't need the guarantee of a reliable channel is the ping command. The whole point of the ping command is to test the communication between two programs over the network. In fact, ping needs to know about dropped or out-of-order packets to determine how good or bad the connection is. Thus, a reliable channel would invalidate this service altogether.
The UDP protocol provides for non-guaranteed communication between two applications on the network. UDP is not connection-based like TCP. Rather, it sends independent packets of data, called datagrams, from one application to another. Sending datagrams is much like sending a letter through the mail service: the order of delivery is not important and is not guaranteed, and each message is independent of any others.
Definition: UDP is a protocol that sends independent packets of data, called datagrams, from one computer to another with no guarantees about arrival. UDP is not connection-based like TCP.
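The datagram model described above can be sketched with the DatagramSocket and DatagramPacket classes in java.net. This is only a minimal illustration, assuming both ends run in the same process and talk over the loopback interface; the class name DatagramDemo and the method sendAndReceive are made up for this example. Note that the packet itself carries the destination address and port, and that the receiver sets a timeout because nothing guarantees that a datagram will arrive.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: sends one datagram over loopback and reads it back.
class DatagramDemo {
    static String sendAndReceive(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the system to pick any free port for the receiver.
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            // UDP gives no delivery guarantee, so never wait forever.
            receiver.setSoTimeout(2000);

            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            // No connection is set up: each packet names its own destination.
            sender.send(new DatagramPacket(
                    payload, payload.length, loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[512];
            DatagramPacket incoming = new DatagramPacket(buffer, buffer.length);
            // Throws SocketTimeoutException if nothing arrives within 2 seconds.
            receiver.receive(incoming);
            return new String(incoming.getData(), incoming.getOffset(),
                    incoming.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive("hello, datagram"));
    }
}
```

On a real network the receive call could time out even though the send succeeded; an application built on UDP has to decide for itself whether to ask again or carry on.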
Ports
Generally speaking, a computer has a single physical connection to the network. All data destined for a particular computer arrives through that connection. However, the data may be intended for different applications running on the computer. So how does the computer know which application to forward data to? Through the use of ports.
Data transmitted over the Internet is accompanied by addressing information that identifies the computer and the port for which it is destined. The computer is identified by its IP address (32 bits in IPv4), which IP uses to deliver data to the right computer on the network. Ports are identified by a 16-bit number, which TCP and UDP use to deliver the data to the right application.
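As a rough illustration of this addressing pair, java.net represents a computer's address with InetAddress and an address-plus-port pair with InetSocketAddress. The sketch below assumes an IPv4 address; the class name AddressDemo, the method endpoint, and the address 192.0.2.10 (taken from a range reserved for documentation) are chosen only for this example.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical demo class: pairs an IPv4 address with a port number.
class AddressDemo {
    // Builds an endpoint from the four raw bytes of an IPv4 address plus a port.
    static InetSocketAddress endpoint(byte[] ipv4, int port) {
        try {
            return new InetSocketAddress(InetAddress.getByAddress(ipv4), port);
        } catch (UnknownHostException e) {
            // getByAddress rejects arrays that are not 4 or 16 bytes long.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress target = endpoint(new byte[] {(byte) 192, 0, 2, 10}, 8080);
        // The address part identifies the computer ...
        System.out.println(target.getAddress().getHostAddress());
        // ... and is 4 bytes, that is 32 bits, long for IPv4.
        System.out.println(target.getAddress().getAddress().length * 8);
        // The port part identifies the application on that computer.
        System.out.println(target.getPort());
    }
}
```

Building the address from raw bytes avoids any name lookup, so the example does not touch the network.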
In connection-based communication, an application establishes a connection with another application by binding a socket to a port number. This has the effect of registering the application with the system to receive all data destined for that port. No two applications can bind to the same port: Attempts to bind to a port that is already in use will fail.
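The rule that a port already in use cannot be bound again can be observed with a small experiment. The following is a sketch under the assumption that both sockets are bound to the loopback address in the same process; PortInUseDemo and secondBindFails are illustrative names, not part of java.net.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical demo class: tries to bind two server sockets to one port.
class PortInUseDemo {
    // Returns true if a second bind to an already-bound port is rejected.
    static boolean secondBindFails() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the system choose a free port for the first socket.
        try (ServerSocket first = new ServerSocket(0, 50, loopback)) {
            int port = first.getLocalPort();
            try (ServerSocket second = new ServerSocket(port, 50, loopback)) {
                return false; // not expected while 'first' is still open
            } catch (BindException e) {
                return true;  // the system reports that the address is in use
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind rejected: " + secondBindFails());
    }
}
```

A program that must run on a fixed port would typically catch the BindException and report that another application already owns the port.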
In datagram-based communication, the datagram packet contains the port number of its destination.
Definition: The TCP and UDP protocols use ports to map incoming data to a particular process running on a computer.
Port numbers range from 0 to 65535 (because ports are represented by 16-bit numbers). The port numbers from 0 to 1023 are restricted--they are reserved for use by well-known services such as HTTP and ftp and by other system services. Your applications should not attempt to bind to these ports. Ports that are reserved for well-known services such as HTTP and ftp are called well-known ports.
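The port range can be checked from Java code. The sketch below relies on two documented behaviors of java.net: InetSocketAddress rejects port values outside 0 to 65535 with an IllegalArgumentException, and passing port 0 to ServerSocket asks the system to allocate a free port automatically. The class and method names (PortRangeDemo, isValidPort, pickFreePort) are invented for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical demo class: explores the 16-bit port number range.
class PortRangeDemo {
    // Largest value that fits in 16 bits: 65535.
    static final int MAX_PORT = (1 << 16) - 1;

    // True if java.net accepts the value as a port number.
    static boolean isValidPort(int port) {
        try {
            new InetSocketAddress(port);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // Asks the system for a currently free port by binding to port 0.
    static int pickFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidPort(MAX_PORT));     // true
        System.out.println(isValidPort(MAX_PORT + 1)); // false
        System.out.println("system-assigned port: " + pickFreePort());
    }
}
```

Letting the system assign the port is a convenient way to stay out of the restricted range, since automatically allocated ports typically lie well above 1023.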
Through the classes in java.net, Java programs can use TCP or UDP to communicate over the Internet. The URL, URLConnection, Socket, and ServerSocket classes all use TCP to communicate over the network. The DatagramPacket and DatagramSocket classes use UDP.
All About Sockets
You use URLs and URLConnections to communicate over the network at a relatively high level and for a specific purpose: accessing resources on the Internet. Sometimes your programs require lower-level network communication, for example, when you want to write a client-server application.
In client-server applications, the server provides some service, such as processing database queries or sending out current stock prices. The client uses the service provided by the server to some end: displaying database query results to the user or making stock purchase recommendations to an investor. The communication that occurs between the client and the server must be reliable--no data can be dropped and it must arrive on the client side in the same order that it was sent by the server.
TCP provides a reliable, point-to-point communication channel, which client-server applications on the Internet use to communicate. The Socket and ServerSocket classes in java.net provide a system-independent communication channel using TCP.
What Is a Socket?
A socket is one end-point of a two-way communication link between two programs running on the network. Socket classes are used to represent the connection between a client program and a server program. The java.net package provides two classes--Socket and ServerSocket--that implement the client side of the connection and the server side of the connection, respectively.
A server application normally listens on a specific port, waiting for connection requests from a client. When a connection request arrives, the client and the server establish a dedicated connection over which they can communicate. During the connection process, the client is assigned a local port number and binds a socket to it. The client talks to the server by writing to the socket and gets information from the server by reading from it. On the server side, accepting the connection produces a new socket that is bound to the same local port and has its remote end-point set to the client's address and port. The server needs this new socket so that it can continue to listen for connection requests on the original socket while it communicates with the client by reading from and writing to the new one.
The client and the server must agree on a protocol--that is, they must agree on the language of the information transferred back and forth through the socket.
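To make the idea of an agreed protocol concrete, here is a minimal sketch of a client and a server that talk over the loopback interface. It assumes a made-up protocol in which the client sends one line of text at a time and the server answers each line with the same text prefixed by "echo: ". The names EchoDemo, serveOneClient, and roundTrip are hypothetical, and a real server would handle more than one client and deal with errors more carefully.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: one-client echo server plus a matching client.
class EchoDemo {
    // Server side: accept one client and answer each line until the client closes.
    static void serveOneClient(ServerSocket listener) {
        try (Socket connection = listener.accept();
             BufferedReader in = new BufferedReader(new InputStreamReader(
                     connection.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(new OutputStreamWriter(
                     connection.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println("echo: " + line); // the reply format both sides agreed on
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Client side: send each line, then read exactly one reply line for it.
    static List<String> roundTrip(List<String> lines) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 50, loopback)) {
            Thread server = new Thread(() -> serveOneClient(listener));
            server.start();
            List<String> replies = new ArrayList<>();
            try (Socket client = new Socket(loopback, listener.getLocalPort());
                 BufferedReader in = new BufferedReader(new InputStreamReader(
                         client.getInputStream(), StandardCharsets.UTF_8));
                 PrintWriter out = new PrintWriter(new OutputStreamWriter(
                         client.getOutputStream(), StandardCharsets.UTF_8), true)) {
                for (String line : lines) {
                    out.println(line);
                    replies.add(in.readLine());
                }
            }
            server.join(); // closing the client ends the server loop
            return replies;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Arrays.asList("first", "second")));
    }
}
```

Because TCP delivers the bytes in the order in which they were written, the client can rely on the first reply belonging to the first line it sent. If either side broke the one-line-in, one-line-out agreement, the other would block waiting for data that never comes.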
Definition: A socket is one end-point of a two-way communication link between two programs running on the network.
The java.net package in the Java development environment provides a class, Socket, that represents one end of a two-way connection between your Java program and another program on the network. The Socket class implements the client side of the two-way link. If you are writing server software, you will also be interested in the ServerSocket class, which implements the server side of the two-way link.
This lesson shows you how to use the Socket and ServerSocket classes.