Networking and the BSD Sockets APIs

by Michael Beam
12/27/2002

Original article: http://www.macdevcenter.com/pub/a/mac/2002/12/26/cocoa.html?page=1

We’ve spent a great deal of time talking about Rendezvous in the previous two columns, but Rendezvous can’t exist in a vacuum. Having user-friendly service discovery does us no good unless we can make our applications talk to one another. Indeed, Rendezvous has absolutely no provisions for facilitating general network communications between applications, as it is only a protocol for advertising and discovering services on a network. It is a discovery protocol, not a communications protocol.

Today we shift gears into the communications side of this business; we will have very little to say about Rendezvous. Due to its Unix lineage, Mac OS X is a wonderful platform for learning about networking, since it has such a rich set of APIs to offer; in particular, we can program with the venerable BSD sockets API. Today we’ll learn about this API, and in doing so we will write a tiny pair of C applications that demonstrate how clients and servers can be made to talk with one another. In the next column, we will finish RCE with what we learn today by adding some Cocoa.

Concerning Sockets

Most of us have likely heard of sockets in the course of our experiences programming. I had always heard about sockets, but up until about half a year ago, I had never had the pleasure of programming with them. Having heard of something is far from understanding it, which is essential to being able to effectively use a technology. In this column and the next, I hope to spark some interest in the subject to give you a feel for the technology. Hopefully, many of you who had previously shied away from sockets and networking will go on to learn more about this interesting and relevant topic.

So just what is a socket? The man page for the socket() function, which we use to create sockets, describes this little thing in four words: an endpoint for communication. The analogy often used to relate sockets to everyday experiences is that of a telephone, which, as we will see, is indeed an accurate comparison. A telephone is, after all, an endpoint, or an interface, to a communications network that we use to communicate with other people.

In the same way that we speak into and listen to a phone, applications both send data across a network by writing to a socket, and receive data sent by a remote host by reading from the socket. If you are familiar with the Unix APIs for reading and writing to a file, you will be comfortable with sockets, as the same functions for file I/O are used for socket I/O – namely, read() and write().
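To make the file I/O parallel concrete, here is a minimal sketch that pushes a message through two connected sockets using nothing but write() and read(). It assumes socketpair(), a call this column does not otherwise cover, purely as a shortcut for getting two connected descriptors without writing a server; the function name send_through_socket_pair is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes msg into one end of a connected socket pair
   and reads it back from the other end. Returns the number of bytes read,
   or -1 on error. out must have room for outlen bytes (outlen >= 1). */
ssize_t send_through_socket_pair( const char *msg, char *out, size_t outlen )
{
    int fds[2];
    ssize_t n;

    /* Assumption: socketpair() stands in for a real client/server pair. */
    if ( socketpair( AF_UNIX, SOCK_STREAM, 0, fds ) < 0 )
        return -1;

    /* The same calls used for file I/O work on socket descriptors. */
    if ( write( fds[0], msg, strlen(msg) ) < 0 ) {
        close( fds[0] );
        close( fds[1] );
        return -1;
    }

    n = read( fds[1], out, outlen - 1 );
    if ( n >= 0 )
        out[n] = '\0';

    close( fds[0] );
    close( fds[1] );
    return n;
}
```

Calling send_through_socket_pair( "Howdy!\n", out, sizeof(out) ) with a 32-byte out should leave the same seven characters in out, which is the whole point: the descriptor could just as well have referred to a file.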

Like two telephones that facilitate a conversation between two people, network connections exist between a pair of sockets, one for each end of the connection. Sockets are often talked about in pairs: one for the server side of the application, and one for the client side. The networking model that we are accustomed to is that of the relationship between a client and a server. A server is an application that is listening for connection requests from clients, and handling them appropriately. A client is a program that connects to a server. Usually clients and servers are two completely different applications, as is the case with a Web client and server: Apache is a Web server, while OmniWeb, Internet Explorer, and Mozilla are all Web clients.

We will see in the next column how this distinction between server and client blurs when we talk about peer-to-peer chat applications like RCE. Sometimes, one application is both a client and a server that allows connections from other like applications. This is especially true of peer-to-peer applications, such as the chat application we’re building. We’ll get into this more in the next column, but understand as we progress through our discussion today that RCE will have both server functionality and client functionality.

Working With Sockets

Because of the differing tasks of a server and a client, their use of sockets is accordingly different. The role each side takes in establishing a communications link is reflected in the nature of the sockets each side uses. To wit, servers use what are known as passive sockets, and clients use active sockets.

When a server process starts up, it must create a socket; bind it to a local, unused port; tell that socket to listen for new connections from clients; and finally, begin waiting for new connections. This socket is often referred to as a listening socket, or a passive socket, or a server socket. All of these names suggest that the role of the socket is to sit patiently while listening to its assigned port for clients requesting a connection with the server. In the analogy of the telephone, creating the socket is like buying a phone, binding is akin to getting a hookup from the phone company, listening is plugging your phone into the wall, and finally, accepting is the act of answering the phone when it rings.

When a connection is received by the listening socket, the server must accept the connection and return a new connected socket that is used to communicate with the client. This new socket has an established connection to the client’s remote socket. By creating a new socket to handle the new connection, the listening socket is free to continue doing its thing, listening for connections from other clients.

Clients use sockets in a different way. A client creates a socket in the same way as a server; however, after the socket is created the use of the socket differs. With a socket in hand, the client uses that socket to attempt to connect to a server. Once the connection has been accepted by the server, the client can begin sending and receiving data from the server. Referring back to our phone analogy, connecting is no different than dialing a phone number for someone you want to talk to.

Our Sockets Toolkit

What are all of these functions that we have been alluding to without mentioning? They are the functions of the BSD Sockets API, which is primarily defined in the header sys/socket.h (header file paths are always referenced relative to the path /usr/include). There are eight functions that we will discuss, three of which are part of the standard library. They are:

  • int socket( int domain, int type, int protocol )
    Creates a new socket and returns the socket file descriptor. The domain argument is a constant to specify the address family of the socket; we will use AF_INET, which is IPv4 addressing. The argument type specifies the socket type; we will pass the constant SOCK_STREAM here, which is a TCP stream socket. The protocol argument does not concern us here, so we pass 0. Returns -1 if there is an error.

  • int connect( int s, const struct sockaddr *name, int namelen )
    Connects the socket identified by the file descriptor s to the remote socket specified in the address structure name. Returns 0 on success, -1 on error.

  • int bind( int s, const struct sockaddr *name, int namelen )
    Binds the socket s to the port specified in the address structure name. Returns 0 on success, -1 on error.

  • int listen( int s, int backlog )
    Converts the socket s into a passive listening (server) socket. The parameter backlog specifies how many pending connections the kernel will allow before clients who attempt to connect will receive a connection refused error. Returns 0 on success, -1 on error.

  • int accept( int s, struct sockaddr *addr, int *addrlen )
    This function will return a socket connected to the remote socket with the first connection request in the connection queue. The socket returned is not the same socket as s, but it has the same properties as s. The address structure of the connected socket is returned in the struct addr. Returns -1 if there is an error.

  • ssize_t read( int d, void *buf, size_t nbytes )
    Attempts to read nbytes of data from the socket d into the array buf. Returns the number of bytes that was actually read.

  • ssize_t write( int d, const void *buf, size_t nbytes )
    This function writes to the socket d nbytes number of bytes from the array buf.

  • int close( int s )
    This function closes the socket.

Let’s take a moment to look at these functions. We discussed above that servers and clients use sockets in different ways. As such, some of these functions are only appropriate for use by a client and others are used only by servers. First, both clients and servers use socket(), read(), write(), and close(). The function connect() is used by clients, while the remaining three – bind(), listen(), and accept() – are used by servers.
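As a rough map of how these calls fit together, the sketch below exercises all eight of them in a single process over the loopback interface. It is only an illustration under a few assumptions that are not part of this column: the function name loopback_round_trip is hypothetical, port 0 is used so the kernel picks a free port, and getsockname() is called to find out which port that was. The client's connect() can complete before accept() is called because the kernel holds the request in the listen queue.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: runs a server socket and a client socket inside one
   process. Copies what the client receives into out (outlen >= 1) and
   returns the byte count, or -1 on error. */
ssize_t loopback_round_trip( const char *msg, char *out, size_t outlen )
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int listenfd = -1, clientfd = -1, connfd = -1;
    size_t total = 0;
    ssize_t got = 0, result = -1;

    memset( &addr, 0, sizeof(addr) );
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl( INADDR_LOOPBACK );
    addr.sin_port = htons( 0 );   /* assumption: let the kernel pick a port */

    /* Server side: socket(), bind(), listen(). */
    if ( (listenfd = socket( AF_INET, SOCK_STREAM, 0 )) < 0 ) goto done;
    if ( bind( listenfd, (struct sockaddr *)&addr, sizeof(addr) ) < 0 ) goto done;
    if ( listen( listenfd, 5 ) < 0 ) goto done;

    /* Find out which port was assigned (not covered in the text). */
    if ( getsockname( listenfd, (struct sockaddr *)&addr, &len ) < 0 ) goto done;

    /* Client side: socket(), connect(). The request waits in the queue. */
    if ( (clientfd = socket( AF_INET, SOCK_STREAM, 0 )) < 0 ) goto done;
    if ( connect( clientfd, (struct sockaddr *)&addr, sizeof(addr) ) < 0 ) goto done;

    /* Server side again: accept(), write(), close(). */
    if ( (connfd = accept( listenfd, NULL, NULL )) < 0 ) goto done;
    if ( write( connfd, msg, strlen(msg) ) < 0 ) goto done;
    close( connfd );
    connfd = -1;

    /* Client side: read() until the server's close() shows up as 0. */
    while ( total < outlen - 1 &&
            (got = read( clientfd, out + total, outlen - 1 - total )) > 0 )
        total += (size_t)got;
    if ( got >= 0 ) {
        out[total] = '\0';
        result = (ssize_t)total;
    }

done:
    if ( connfd >= 0 ) close( connfd );
    if ( clientfd >= 0 ) close( clientfd );
    if ( listenfd >= 0 ) close( listenfd );
    return result;
}
```

A real client and server would of course live in separate programs, which is how the rest of this column builds them; the single-process version only shows the order of the calls.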

Our Program for the Day

Today we’ll create two small C applications. We won’t be learning any new Cocoa today. To code a C application, you have two options. In the first, you can create a new project in Project Builder for the client application, and another project for the server application. The type of project you need to make is a Standard Tool, which is the last item in the New Project Assistant’s list of project types. When you create a new standard tool project, you will be presented with a single file in the Sources group: main.c. This file has some code in it that does the whole “Hello, World!” bit – we can replace all of this with our own code (even the main function and #include statements). If you do the code here today, remember that you have to create a project for both the client and the server code, since we need two executables.

The other alternative is to do this all from Terminal, which is what I did when I wrote the code for this column. Here we only have to work with two files, server.c and client.c, and the cc command. When we get to the relevant parts, I’ll show how to compile the code from both the Unix shell and from Project Builder. In each section where I present client and server code, I will go through the steps we have to code in a somewhat isolated manner, and at the end present the entire source code for the component (client or server).

So, with that, onward!

Clients

Clients use sockets in the following manner: first, a client application must create a socket, which is done using the socket() function, as shown here:

    int sockfd;

    if ( (sockfd = socket( AF_INET, SOCK_STREAM, 0 )) < 0 ) {
        perror( "socket" );
        exit(1);
    }

This will create a TCP/IP streaming socket. A TCP streaming socket is a very reliable means of communication, with all sorts of error checking built into TCP, and a convenient, two-way, byte-based interface for communicating. The socket() function takes three arguments: the domain, the type, and the protocol. With the domain parameter we indicate the communications domain of the socket, which is based on a particular address family. The constant AF_INET that we use here tells the socket() function that we will be working with IPv4 addresses. The second argument, the type, specifies the type of socket we want. SOCK_STREAM indicates that we want a TCP streaming socket. Other types include datagram and raw sockets. Finally, we have the protocol argument. This argument isn’t used much, since each combination of domain and type usually supports only one protocol. So, we pass 0, and leave it at that.

Note how we handle an error in the socket() function. If a call to socket() is successful, it will return the file descriptor for the new socket, which is a small, non-negative integer. However, if the call to socket() resulted in an error, -1 is returned. We can check to see if the value of sockfd is negative in an if-statement, as shown above. If there is an error, we handle it by printing the error message with a call to perror(), and we exit with a status of 1, which indicates to the parent process (usually a shell such as tcsh) that there was an error in the program. For more information on perror(), see the perror man page by typing man perror in the Terminal.
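The following sketch shows roughly what perror() does behind the scenes: a failed call leaves an error code in the global errno, and strerror() turns that code into readable text. The helper name create_stream_socket is hypothetical, and passing a nonsense domain such as -1 is just a convenient way to provoke a failure.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: tries to create a stream socket in the given domain.
   On failure it reports the reason much as perror( "socket" ) would, using
   the errno value that socket() sets, and returns -1. */
int create_stream_socket( int domain )
{
    int fd = socket( domain, SOCK_STREAM, 0 );

    if ( fd < 0 ) {
        /* perror( "socket" ) prints essentially the same text to stderr. */
        fprintf( stderr, "socket: %s (errno %d)\n", strerror(errno), errno );
        return -1;
    }
    return fd;
}
```

Calling create_stream_socket( -1 ) should return -1 and print a message about the unsupported address family; the exact wording of the message depends on the system.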

With a socket in hand, we call the connect() function, which will attempt to make a connection between the specified socket and a socket listening on a remote host. In the list of functions above, we saw that the connect() function’s second argument is a pointer to struct sockaddr, which is also used in the bind() and accept() functions. Interestingly, we never work with a sockaddr struct directly. Rather, sockaddr is kind of like an abstract superclass for protocol and address family-specific socket address structures. In other words, we never work with an actual sockaddr structure, but rather with an address structure for IPv4 or IPv6 addresses.

For IPv4, the address structure we use is of type struct sockaddr_in. The primary reason for the existence of the generic socket address structure type is that when the socket APIs were first written, ANSI C hadn’t settled on void * as the generic pointer type. However, sockets had to support multiple address families, and the idea of having a separate set of socket functions for each domain was unappealing to the API developers. To get around this, the writers of the sockets API had to come up with their own generic pointer type specifically for socket address structures: struct sockaddr. That bit of history aside, the definition for struct sockaddr_in is found in the header netinet/in.h (when we are using the sockets API with IPv4, we have to include this header in addition to sys/socket.h). The definition for struct sockaddr_in is as follows:

    struct sockaddr_in {
        u_char sin_len;
        u_char sin_family;
        u_short sin_port;
        struct in_addr sin_addr;
        char sin_zero[8];
    };

The first member of the structure, u_char sin_len, is the length in bytes of the structure. The data type u_char is a synonym for unsigned char; the type definitions for that and other commonly-used types can be found in the header sys/types.h. Except for special uses of sockets, we don’t need to set or examine the value of the sin_len member; it is used internally by the kernel in various socket routines. The second member of sockaddr_in is sin_family, which is the address family of the socket. This member is set to the same constant that we passed as the domain argument of the socket() function (AF_INET, for example). The next member, sin_port, is the port number to which we wish to connect. Our next member, sin_addr, is itself a structure of type struct in_addr. This structure looks like this (and you’ll probably think this is dumb):

    struct in_addr {
        in_addr_t s_addr;
    };

The sole member of this structure, s_addr, is type in_addr_t. The definition for this type is again found in sys/types.h, in which we find that in_addr_t is a 32-bit unsigned integer, which is usually an unsigned int. When initializing our socket address structure, this is where we put the IP address to which we are either binding or attempting to connect. Finally, getting back up to sockaddr_in, the last member is just an array that pads the structure to a certain size. All we have to do is make sure that it is initialized to 0.
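The "abstract superclass" idea can be sketched in a few lines. A function that takes the generic struct sockaddr pointer, as connect() and bind() do, can still discover which concrete structure it was handed by looking at the family field. Both helper names below are hypothetical.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: accepts any socket address through the generic
   struct sockaddr pointer and reports which address family the caller
   actually passed. */
int address_family_of( const struct sockaddr *address )
{
    /* Every family-specific structure starts with the same family field
       layout as struct sockaddr, so this read is valid after the cast. */
    return address->sa_family;
}

/* Hypothetical helper: builds an IPv4 address and passes it generically. */
int family_of_ipv4_example( void )
{
    struct sockaddr_in ipv4;

    memset( &ipv4, 0, sizeof(ipv4) );
    ipv4.sin_family = AF_INET;

    /* The cast is the same one used when calling connect() or bind(). */
    return address_family_of( (const struct sockaddr *)&ipv4 );
}
```

Here family_of_ipv4_example() returns AF_INET, and handing address_family_of() a struct sockaddr_in6 whose sin6_family is AF_INET6 would return AF_INET6 instead.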

Now that we know what all is contained in a socket address structure, we can move on with our client code to connect to a server host. Before we can use the connect() function, we have to prepare an address structure so that the function knows where to connect. Socket address structures are prepared by first zeroing the memory that stores the structure, and then we go on to fill in the relevant fields, as shown here:

    bzero( &serverAddress, sizeof(serverAddress) );
    serverAddress.sin_family = AF_INET;
    serverAddress.sin_port = htons( 12345 );
    inet_pton( AF_INET, "127.0.0.1", &serverAddress.sin_addr );

There are several things to take note of here. The first thing we do is initialize the entire structure to 0 using the bzero() function, which stands for byte zero. This function takes the starting address of a chunk of memory that we want to zero, and the number of bytes to zero. In our case we pass the address of the variable serverAddress, obtained using the address operator &, and we pass the result of the sizeof operator applied to serverAddress (sizeof is used quite frequently in C to determine the size in bytes of a variable or data type; the compiler normally works this out at compile time).

Next we set sin_family to AF_INET: the same address family we created our socket with. Next we set the port number in sin_port to 12345. When we get to the server side of all of this, we will see that 12345 is the port number that we will bind our listening socket to.

Note the use of the function htons(). This function stands for host to network short, and it is used to convert a short int from the host byte order (little- or big-endian) to the network byte order (big-endian). If you haven’t been introduced to issues of byte order, then here is a quick rundown. Let’s continue with the example of a short int: a short int is 16 bits in size, or 2 bytes. Different computer architectures store and read the value of a multi-byte data type in memory differently. If you’re scanning through memory and come across the two bytes of a short int variable, you will come across either the low-order byte (little end) or the high-order byte (big end) first. In little-endian systems, you will run across the little end first, while in big-endian systems you run across the big end first.

The use of htons() and other similar functions is necessary because not all platforms order bytes within a multi-byte data type (such as short int, 2 bytes) the same way. We will see later another variant of this function, htonl() – host to network long.
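A small sketch makes the effect of htons() visible. The hypothetical helper below converts a port number and copies out its two bytes in the order they sit in memory; after the conversion, that order is the same on every host.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: converts a port number to network byte order and
   copies the two bytes out in the order they sit in memory. */
void port_bytes_in_network_order( uint16_t port, unsigned char bytes[2] )
{
    uint16_t networkOrder = htons( port );

    memcpy( bytes, &networkOrder, sizeof(networkOrder) );
}
```

For our port 12345, which is 0x3039 in hexadecimal, bytes[0] ends up as 0x30 (the big end) and bytes[1] as 0x39, whether the host is little- or big-endian. The companion function ntohs() converts back, so ntohs( htons( 12345 ) ) is 12345 again.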

Lastly, we set the address to which we want to connect. For the purposes of this column, we will connect our socket to the server socket bound to port 12345 on localhost, which has an IP address of 127.0.0.1. The function inet_pton() is a convenient function for converting a string into an in_addr structure. The name of the function is short for Internet Presentation to Number. Presentation refers to the human-readable representation of an IP address, while Number refers to the integer representation of an IP address required by the in_addr struct. By specifying AF_INET in the first argument, we are telling the function that we are working with IPv4 addresses.
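The snippets in this column call inet_pton() without looking at its return value, which is fine for a hard-coded address. For an address typed in by a user, a sketch along the following lines might be preferable. The helper names parse_ipv4 and format_ipv4 are hypothetical; inet_ntop() is the standard counterpart that goes from the number back to the presentation form, and both functions are declared in arpa/inet.h.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: converts dotted-decimal text into an in_addr.
   Returns 1 on success and 0 if the text is not a valid IPv4 address. */
int parse_ipv4( const char *text, struct in_addr *out )
{
    /* inet_pton() returns 1 on success, 0 for unparsable text, and -1 if
       the address family is unsupported. */
    return inet_pton( AF_INET, text, out ) == 1;
}

/* Hypothetical helper: the reverse trip, from in_addr back to text.
   Returns buffer on success or NULL on error. */
const char *format_ipv4( const struct in_addr *address,
                         char *buffer, socklen_t size )
{
    return inet_ntop( AF_INET, address, buffer, size );
}
```

Note that inet_pton() only understands numeric addresses: parse_ipv4( "127.0.0.1", &a ) succeeds, while parse_ipv4( "localhost", &a ) returns 0 because no name lookup is performed.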

Now, let’s connect. To connect to a remote socket we do the following:

    if ( connect( sockfd, (struct sockaddr *)&serverAddress,
                  sizeof(serverAddress)) < 0 ) {
        perror( "connect" );
        exit(1);
    }

If connect() returns successfully, then we can begin communicating with the server. In our simple client example, we will do nothing more than read a short message sent by the server when a connection is received, which is done using the read() function:

    // Declared up above
    char buffer[201];
    int n;

    while ( (n = read( sockfd, buffer, 200 )) > 0 ) {
        buffer[n] = 0; // Terminate string with null character
        printf( "%s", buffer ); // Never use received data as the format string
    }

    if ( n < 0 ) {
        perror( "read" );
        exit(1);
    }

Due to the relatively slow nature of network connections, a client application is not guaranteed to receive the full message sent by the server in the time between calling connect() and the first read() call. Thus, we stick read() in a while loop and check the return value of read() at each pass to see how many bytes were read, and continue reading until read() returns 0, which happens once the server has closed its end of the connection (more data will be received in the time between subsequent calls to read()). Within the loop, we set the nth character of the buffer to the null character, and print out the bytes received in that pass. The array buffer was made 201 bytes large to make room for the null character if the read function did indeed happen to get 200 bytes in one pass. Since we set the next character after the last one read to null, printf() will only print those characters received in the last call to read(). Functions that work with character arrays as strings always look for a null character to signal the end of the string, so that they don’t attempt to access memory that is not theirs to access.
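The loop above prints each chunk as it arrives. A common variation, sketched here under the assumption that you would rather collect the whole message before using it, accumulates the chunks into one buffer. The helper name read_until_closed is hypothetical, and the retry on EINTR (a read interrupted by a signal) is an extra precaution that the column's code does not take.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reads from fd until end-of-file or until
   capacity - 1 bytes have arrived, then null-terminates the result.
   Returns the total number of bytes stored, or -1 on a read error.
   capacity must be at least 1. */
ssize_t read_until_closed( int fd, char *buffer, size_t capacity )
{
    size_t total = 0;

    while ( total < capacity - 1 ) {
        ssize_t n = read( fd, buffer + total, capacity - 1 - total );

        if ( n == 0 )
            break;                 /* peer closed the connection */
        if ( n < 0 ) {
            if ( errno == EINTR )
                continue;          /* interrupted by a signal; try again */
            return -1;
        }
        total += (size_t)n;
    }

    buffer[total] = '\0';
    return (ssize_t)total;
}
```

Because read() and write() treat every descriptor alike, the helper works equally well on a pipe or a file, which makes it easy to try out without a server.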

So let’s take a look at our simple client program in its entirety, and then we’ll get on to discuss how to make a super-simple server for our client. Here is client.c:

    #include <stdio.h>       // printf(), perror()
    #include <stdlib.h>      // exit()
    #include <strings.h>     // bzero()
    #include <unistd.h>      // read(), close()
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>   // inet_pton()

    int main( int argc, char **argv )
    {
        int n, sockfd;
        char buffer[201];
        struct sockaddr_in serverAddress;

        // Create a TCP socket
        if ( (sockfd = socket( AF_INET, SOCK_STREAM, 0 )) < 0 ) {
            perror( "socket" );
            exit(1);
        }

        // Describe the server: port 12345 on localhost
        bzero( &serverAddress, sizeof(serverAddress) );
        serverAddress.sin_family = AF_INET;
        serverAddress.sin_port = htons( 12345 );

        inet_pton( AF_INET, "127.0.0.1", &serverAddress.sin_addr );

        if ( connect( sockfd, (struct sockaddr *)&serverAddress,
                      sizeof(serverAddress)) < 0 ) {
            perror( "connect" );
            exit(1);
        }

        // Read until the server closes the connection (read() returns 0)
        while ( (n = read( sockfd, buffer, 200 )) > 0 ) {
            buffer[n] = 0;
            printf( "%s", buffer );
        }

        if ( n < 0 ) {
            perror( "read" );
            exit(1);
        }

        close( sockfd );
        return 0;
    }

Servers

Now we discuss the startup procedure for a server, which is slightly more complicated than what a client must do. The first step is the same – create a socket:

    struct sockaddr_in serverAddress;
    int listenfd, connectfd;

    if ( (listenfd = socket( AF_INET, SOCK_STREAM, 0 )) < 0 ) {
        perror( "socket" );
        exit(1);
    }

The only thing that changed here is that we declared a second socket file descriptor variable to hold the return value of accept(), and we changed the name of the original socket file descriptor variable from sockfd to listenfd, to reflect the changed nature of the socket in a server application.

Next, we have to bind the socket to a port and address using the bind() function. This function takes a sockaddr_in struct, so we have to prepare one as we did for the client:

    bzero( &serverAddress, sizeof(serverAddress) );
    serverAddress.sin_family = AF_INET;
    serverAddress.sin_port = htons( 12345 );
    serverAddress.sin_addr.s_addr = htonl( INADDR_ANY );

Initializing the server’s socket address structure is done in the same way as the client’s, with a few small changes. We first zero the memory space occupied by serverAddress, and then set the family, port number, and address. Notice, however, that we set the address differently than before. We could have used inet_pton() with the same localhost IP address that we used for the client, but to do so would restrict the server to accepting connections only on the localhost interface. In other words, our server would not be able to accept connections from the network, since it would only be listening for connections on the IP address 127.0.0.1.

There are functions that let us obtain the IP address of the Ethernet interface, but there is a better solution that will allow the socket to accept connections on any of the available interfaces (Ethernet, localhost, AirPort, FireWire, etc.). By setting serverAddress.sin_addr.s_addr to the network representation (obtained using htonl()) of the constant INADDR_ANY, we tell the kernel to bind the socket to all available network interfaces. Thus, if we simultaneously have an active Ethernet connection, an active AirPort connection, and the loopback interface (127.0.0.1), we will be able to connect to the socket over any of these interfaces’ respective IP addresses.
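The two choices can be put side by side in a short sketch. The helper name make_listen_address is hypothetical, and INADDR_LOOPBACK is the standard constant for 127.0.0.1, used here in place of the inet_pton() call so that both branches read the same way.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fills in an IPv4 address for a listening socket.
   If loopbackOnly is nonzero the socket would accept local connections
   only; otherwise it would accept connections on every interface. */
void make_listen_address( struct sockaddr_in *address,
                          unsigned short port, int loopbackOnly )
{
    memset( address, 0, sizeof(*address) );
    address->sin_family = AF_INET;
    address->sin_port = htons( port );
    address->sin_addr.s_addr =
        htonl( loopbackOnly ? INADDR_LOOPBACK : INADDR_ANY );
}
```

Binding to the loopback address can be handy while developing, since nothing outside the machine can reach the server; switching to INADDR_ANY is then a one-argument change.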

Next we have to bind the socket to the address specified in the struct serverAddress. This is done using the bind() function:

    if ( bind( listenfd, (struct sockaddr *)&serverAddress,
               sizeof(serverAddress)) < 0 ) {
        perror( "bind" );
        exit(1);
    }

As with the connect() function, we pass the socket file descriptor, the socket address structure that specifies the address and port to bind to, and finally, the length of the address structure. As always, we check to see if the function executed successfully by comparing the return value to zero.

Next we call the listen() function to tell the socket to listen for incoming connections. By calling listen(), we are converting our socket into a passive socket that can accept connections. Calling listen() is pretty straightforward:

    if ( listen( listenfd, 5 ) < 0 ) {
        perror( "listen" );
        exit(1);
    }

listen() takes two arguments: the socket file descriptor and the backlog. The backlog argument is used to limit the size of the queue for incoming connections. Thus, by passing 5 we tell the kernel to queue up to 5 pending connections. If a client attempts to connect when the queue is full, the kernel will refuse the connection, and the client’s call to connect() will return with a connection-refused error. Note that the backlog does not specify the total number of connections the server can handle, because once a connection has been accepted by the server, the request is removed from the queue, thus making room for additional connection requests.

The next thing we do is call the accept() function within a rudimentary run loop. What accept() does is connect to the host whose connection request is at the front of the queue, and return a socket file descriptor for this connection. We can then read and write data to the client using this new socket. Let’s take a look at how our simple server will send a message to clients:

    for (;;) {
        char *buffer = "Howdy!\n";

        if ( (connectfd = accept( listenfd,
                                  (struct sockaddr *)NULL, NULL )) < 0 ) {
            perror( "accept" );
            exit(1);
        }

        if ( write( connectfd, buffer, strlen(buffer)) < 0 ) {
            perror( "write" );
            exit(1);
        }
        close( connectfd );
    }

The rudimentary run loop I mentioned above is done with the infinite for loop; the server will continue waiting for connections until the user kills the process (using Ctrl-C, for example). Our server is pretty inflexible, since the message it sends to clients that connect is hard-coded, and short: “Howdy!” The message is coded as a null-terminated string. In the call to accept(), we pass the file descriptor for the listening socket (and nothing for the address structure, since we don’t need any of that information that accept() returns in this structure), and in return, accept() gives us the file descriptor for the connected socket. This socket, connectfd, is our end of the connection between the server host and the client host. This is the socket that we read data from and write data to when we communicate with the client.

Finally, after we send the message, we close the connected socket using the close() function and return to the top of the loop, ready to accept a new connection.
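Our server passes NULL for the address arguments of accept(). If you did want to know who connected, a sketch along these lines would collect the client's address. The helper name accept_with_peer is hypothetical; inet_ntop() is the standard call for turning an in_addr back into text, and socklen_t is the type that current systems use for the length argument that the function list above shows as int.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: accepts one connection on listenfd and writes the
   client's IPv4 address, as text, into peerText. Returns the connected
   socket's file descriptor, or -1 on error. */
int accept_with_peer( int listenfd, char *peerText, socklen_t peerTextSize )
{
    struct sockaddr_in clientAddress;
    /* On entry: size of our structure. On return: bytes actually filled. */
    socklen_t length = sizeof(clientAddress);
    int connectfd;

    connectfd = accept( listenfd, (struct sockaddr *)&clientAddress, &length );
    if ( connectfd < 0 )
        return -1;

    if ( inet_ntop( AF_INET, &clientAddress.sin_addr,
                    peerText, peerTextSize ) == NULL ) {
        close( connectfd );
        return -1;
    }
    return connectfd;
}
```

A buffer of INET_ADDRSTRLEN bytes is large enough for any IPv4 address in text form. For a client connecting from the same machine over the loopback interface, peerText would come back as 127.0.0.1.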

Now, putting all of these pieces together (with error checking in place), we have the following code for a small server application:

    #include <stdio.h>       // perror()
    #include <stdlib.h>      // exit()
    #include <string.h>      // strlen()
    #include <strings.h>     // bzero()
    #include <unistd.h>      // write(), close()
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    int main( int argc, char **argv )
    {
        struct sockaddr_in serverAddress;
        int listenfd, connectfd;

        // Create the listening socket
        if ( (listenfd = socket( AF_INET, SOCK_STREAM, 0 )) < 0 ) {
            perror( "socket" );
            exit(1);
        }

        // Bind to port 12345 on all available interfaces
        bzero( &serverAddress, sizeof(serverAddress) );
        serverAddress.sin_family = AF_INET;
        serverAddress.sin_port = htons( 12345 );
        serverAddress.sin_addr.s_addr = htonl( INADDR_ANY );

        if ( bind( listenfd, (struct sockaddr *)&serverAddress,
                   sizeof(serverAddress) ) < 0 ) {
            perror( "bind" );
            exit(1);
        }

        // Queue up to 5 pending connections
        if ( listen( listenfd, 5 ) < 0 ) {
            perror( "listen" );
            exit(1);
        }

        // Greet each client in turn, then hang up
        for (;;) {
            char *buffer = "Howdy!\n";

            if ( (connectfd = accept( listenfd,
                                      (struct sockaddr *)NULL, NULL )) < 0 ) {
                perror( "accept" );
                exit(1);
            }

            if ( write( connectfd, buffer, strlen(buffer) ) < 0 ) {
                perror( "write" );
                exit(1);
            }

            close( connectfd );
        }
    }

And that, my friends, is our server. If you created a standard tool project for both the client and the server, you can compile and run each. Before you run the client, however, make sure you have the server running. If you want to compile and run these in a shell, type the following commands to invoke the compiler:

    % cc -o server server.c
    % cc -o client client.c

When running, you might want to open a second shell so you have one for the client and one for the server. Again, make sure you have the server running (by typing ./server from the directory where you compiled the code), and then run the client (./client from the same directory). If you have trouble, here are the source files I worked with for you to play around with. If you want to see something kind of cool, try typing the following in the shell while the server is running (this, incidentally, is a good way of testing server applications):

    % telnet localhost 12345

So there you have it – a very simple example that shows how Unix does networking. If you’re interested at all in networking, I can’t recommend strongly enough that you pick up the book Unix Network Programming, Volume 1 by W. Richard Stevens. With the prominence of networking capabilities in most applications today, every programmer should have this book. There is a lot more to network programming than what we saw here. There are many considerations to be made for scalability, protocol independence, security, and more. This book covers it all.

In the next column, we’ll take what we learned here and see how we use some of the Foundation classes to make RCE an application that can actually communicate over a network.