This document describes many of the functions used with sockets. Some are functions that are only for sockets; others are more general functions, but have important uses for sockets.
The purpose of this document is to help you learn the functions, what they do, and what sort of data they deal with. However, the code needed to use them is often verbose, which can hinder a basic understanding. Therefore, we:
For more details on these functions, see the appropriate man(1) pages. 'man' is short for 'manual'. The '(1)' means section 1 of the manual. You can view man pages via one of the following ways:
Here are some terms we will commonly use to describe the data. Normally, words in all caps specify constants you actually write in the C code.
Note that the address family constants are often used where the protocol family constants are supposed to be used. This will work on many systems, but we will consider it wrong in this lab.
#define DEFAULT_PROTOCOL 0
. . .
sd = socket ( PF_INET, SOCK_STREAM, DEFAULT_PROTOCOL);
Note that in TCP/IP, there is only one transport protocol (TCP or UDP) for each communication type (see comm_type), so this is the value you normally specify for the protocol parameter of the socket function.
Normally you specify this as your host's IP address, even if your host has only one network interface.
This value, or your host's IP address, is normally used as part of the socket name (or socket address).
Note that sometimes address family constants are used where the protocol family constants are supposed to be used. This will work on many systems, but it is wrong, and might cause you problems in the future.
sd = socket ( proto_fam, comm_type, protocol )
creates the socket. For sockets, you use this instead of the open function. Unlike open, you do not specify the name of the socket. You can call the bind function to do that. However, there will be times when you do not need to specify a name.
For the protocol, you usually specify the 'def_protocol' value to have the system assign the protocol.
Note that in this call, you should use the protocol family constants instead of the address family constants.
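As a minimal sketch of the call described above, the following C fragment creates an unnamed TCP socket. The helper name open_tcp_socket is made up for this illustration; DEFAULT_PROTOCOL follows the definition shown earlier in this document.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

#define DEFAULT_PROTOCOL 0   /* let the system pick TCP for SOCK_STREAM */

/* Hypothetical helper: create an unnamed TCP socket.
   Returns the socket descriptor, or -1 on error (errno is set). */
int open_tcp_socket(void)
{
    /* Protocol family constant (PF_INET), not the address family one. */
    return socket(PF_INET, SOCK_STREAM, DEFAULT_PROTOCOL);
}
```

The returned descriptor has no name yet; it is still an ordinary descriptor, so it should be closed when you are done with it.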
status = socketpair ( proto_fam, comm_type, protocol, &sd_pair )
This is only for the PF_UNIX protocol family, so it will not be used in the TCP/IP lab.
This function returns a pair of unnamed sockets in sd_pair. This is designed to be an extension of pipes, in that it allows 2-way communication. It is designed for communication between parent and child processes.
Note that in this call, you should use the protocol family constants instead of the address family constants.
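To illustrate, here is a small sketch that sends a message into one end of a socket pair and reads it back from the other end. The C prototype has four parameters: family, type, protocol, and a two-element descriptor array. The helper name pair_round_trip is made up for this example, and to stay short it uses both ends in a single process instead of forking a child.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define DEFAULT_PROTOCOL 0

/* Hypothetical helper: write msg to one end of a socket pair and read it
   back from the other end. Returns bytes read, or -1 on error. */
ssize_t pair_round_trip(const char *msg, char *reply, size_t reply_size)
{
    int sd_pair[2];
    ssize_t bytes_read = -1;

    if (socketpair(PF_UNIX, SOCK_STREAM, DEFAULT_PROTOCOL, sd_pair) == -1)
        return -1;

    /* Either end can send; here end 0 writes and end 1 reads. */
    if (write(sd_pair[0], msg, strlen(msg)) == (ssize_t)strlen(msg))
        bytes_read = read(sd_pair[1], reply, reply_size);

    close(sd_pair[0]);
    close(sd_pair[1]);
    return bytes_read;
}
```

In a parent/child setup, you would normally call fork after socketpair, and each process would close the end it does not use.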
status = close (sd)
The same 'close' function that is used for files. As with files, sockets should be closed.
TCP: In addition, you might need to use the shutdown function.
status = shutdown ( sd, direction )
This is sometimes used for TCP, and almost never for UDP. I'm not an expert in when exactly you need this and when you don't. To be safe, I usually add it to my TCP code. Below is information based on my best understanding of what I have read.
TCP: Stops communication in one/both directions, and informs the process you are connected with. 'direction' can specify one of the following:
- I no longer want to receive data
- I no longer want to send data
- I no longer want to either receive or send data
UDP: stops communication on the socket, but does not inform the process(es) you are communicating with.
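In C, the three directions are spelled SHUT_RD, SHUT_WR, and SHUT_RDWR (from sys/socket.h). The sketch below shows the "informs the process you are connected with" part: after one end stops sending, the other end's next read reports end-of-file. To run without a network, it uses a connected local socket pair as a stand-in for a TCP connection; that substitution and the helper name are assumptions of this example.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: shut down the sending direction of one end of a
   connected stream pair and report what the peer's next read returns.
   A return value of 0 means the peer saw end-of-file. */
ssize_t read_after_peer_shutdown(void)
{
    int sd_pair[2];
    char buf[8];
    ssize_t bytes_read = -1;

    if (socketpair(PF_UNIX, SOCK_STREAM, 0, sd_pair) == -1)
        return -1;

    /* SHUT_WR: "I no longer want to send data". */
    if (shutdown(sd_pair[0], SHUT_WR) == 0)
        bytes_read = read(sd_pair[1], buf, sizeof buf);

    /* shutdown does not release the descriptor; close is still needed. */
    close(sd_pair[0]);
    close(sd_pair[1]);
    return bytes_read;
}
```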
status = unlink (socket_name)
This is only for the AF_UNIX address family, so it will not be used in the TCP/IP lab.
In the AF_UNIX address family, if you specify a name for the socket, an actual file is created. In addition to closing the socket, you must also delete it via the unlink function (or the rm command).
status = bind ( sd, sname, sname_len )
binds a name to a socket. For AF_INET, the name consists of an IP address and an IP port. The IP address is usually specified as INADDR_ANY, which means any of the host's IP addresses. If the port is set to undef_port, then the system will assign an unused IP port to the socket.
Actually, in the internet, the protocol is also part of the identification, but that was set in the socket function.
This is normally called only by servers. In clients, binding occurs automatically the first time you call one of the write/send or connect functions.
For AF_UNIX, you specify a UNIX path name. This will create an actual file (of a special type).
status = getsockname ( sd, &sname, &sname_len )
Sometimes you did not set the name of the socket, for example:
- if you called bind where the IP address = INADDR_ANY and/or the port = undef_port.
- OR if you bound a name to the socket implicitly via one of the write/send or connect functions.
This function will return the name of the socket.
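The following sketch combines bind and getsockname: it binds a socket to INADDR_ANY, lets the system choose the port, and then asks which port was assigned. It assumes that undef_port corresponds to the value 0; the names UNDEF_PORT and bind_any_port are made up for this example.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define UNDEF_PORT 0   /* assumed value of undef_port: system picks a port */

/* Hypothetical helper: bind sd to any local address and let the system
   pick the port. Returns the chosen port in host byte order, or -1. */
int bind_any_port(int sd)
{
    struct sockaddr_in sname;
    socklen_t sname_len = sizeof sname;

    memset(&sname, 0, sizeof sname);
    sname.sin_family = AF_INET;                 /* address family in the name */
    sname.sin_addr.s_addr = htonl(INADDR_ANY);
    sname.sin_port = htons(UNDEF_PORT);

    if (bind(sd, (struct sockaddr *)&sname, sizeof sname) == -1)
        return -1;

    /* Ask the system which name it actually assigned. */
    if (getsockname(sd, (struct sockaddr *)&sname, &sname_len) == -1)
        return -1;

    return ntohs(sname.sin_port);
}
```

Note that the address and port inside the name are stored in network byte order, which is why htonl, htons, and ntohs appear here.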
status = getpeername ( sd, &sname, &sname_len )
TCP: Gets the name of the peer, that is, the socket you are connected to. See the section on connections. Usually you run this after executing the accept function. It returns the name of that remote socket in sname.
UDP: For connectionless communication, you can get the name of the remote socket via the recvfrom or recvmsg functions.
clear buffer 'save_buffer'
bytes_read = read (sd, buf, max_bytes)
while ( bytes_read > 0 )
    append contents of buf to save_buffer
    bytes_read = read (sd, buf, max_bytes)
UDP: the data the caller specifies is sent in one packet. Thus, you need to call read/recv only once for each packet. However, if the packet is bigger than the read/recv call is willing to take, I believe that the excess data is discarded.
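The TCP read loop in the pseudocode above might look like this in C. This is only a sketch: the helper name read_all and the tiny 4-byte buffer (chosen to force several read calls) are assumptions of this example, and data that does not fit into save_buffer is simply dropped.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read from sd until end-of-file, appending into
   save_buffer (capacity save_size). Returns the total bytes saved, or -1
   on a read error. Data beyond save_size is read but dropped. */
ssize_t read_all(int sd, char *save_buffer, size_t save_size)
{
    char buf[4];                 /* deliberately tiny to force several reads */
    size_t total = 0;
    ssize_t bytes_read;

    while ((bytes_read = read(sd, buf, sizeof buf)) > 0) {
        size_t room = save_size - total;
        size_t n = (size_t)bytes_read < room ? (size_t)bytes_read : room;
        memcpy(save_buffer + total, buf, n);    /* append buf to save_buffer */
        total += n;
    }
    return bytes_read < 0 ? -1 : (ssize_t)total;
}
```

The loop ends when read returns 0, which on a stream socket happens once the sender has closed or shut down its sending side.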
bytes_read = read ( sd, &buf, max_bytes )
the normal read function. A maximum of max_bytes is read into buf. bytes_read is set to the actual number of bytes read, or to -1 if there was an error.
bytes_read = readv ( sd, &buf_set, num_bufs )
like read, but reads into a set of buffers. num_bufs = how many buffers there are.
bytes_read = recv ( sd, &buf, max_bytes, flags )
Like 'read', with the addition of flags. Flags can contain one or more of the following:
- read out of band data
- peek at the data on the socket. But the data will remain around, so the next 'recv' will also get it.
bytes_read = recvfrom ( sd, &buf, max_bytes, flags, &sender_sname, &sender_sname_len )
Like recv, but also gets the name of the sender.
bytes_read = recvmsg ( sd, &msg_header, flags )
msg_header is a special structure that contains the set of buffers to read into, a buffer to get the access rights, plus (optionally) a buffer that receives the name of the sending socket.
Another way to get the name of the remote socket is via the getpeername function.
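As an illustration of the recv flags, the sketch below peeks at waiting data and then reads it for real. In C the peek flag is MSG_PEEK (and the out-of-band flag is MSG_OOB). The helper name peek_then_recv is made up for this example.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: look at the waiting data without consuming it,
   then read the same bytes for real. Returns the number of bytes
   consumed, or -1 on error. */
ssize_t peek_then_recv(int sd, char *peeked, char *received, size_t max_bytes)
{
    /* MSG_PEEK copies the data out but leaves it queued on the socket. */
    ssize_t bytes_peeked = recv(sd, peeked, max_bytes, MSG_PEEK);
    if (bytes_peeked < 0)
        return -1;

    /* Without flags, recv removes the data it returns. */
    return recv(sd, received, (size_t)bytes_peeked, 0);
}
```

Both buffers end up with the same bytes, because the peek did not remove anything from the socket.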
These functions can be understood by looking at their parallel read/recv functions above.
bytes_written = write ( sd, buf, buflen )
bytes_written = writev ( sd, buf_set, num_bufs )
bytes_written = send ( sd, buf, buflen, flags )
flags can be one or more of the following:
- send out of band data
- usually used only by diagnostic or routing programs.
bytes_written = sendto ( sd, buf, buflen, flags, recv_sname, recv_sname_len )
bytes_written = sendmsg ( sd, msg_header, flags )
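Here is a sketch of connectionless communication with sendto and recvfrom. Two UDP sockets are bound to the loopback address within one process; one sends a datagram to the other, and the receiver checks that the sender name reported by recvfrom matches the sender's real name. The helper names and the use of the loopback address are assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a UDP socket bound to 127.0.0.1 on a
   system-chosen port and store its name in *sname. Returns sd or -1. */
int udp_loopback_socket(struct sockaddr_in *sname)
{
    socklen_t sname_len = sizeof *sname;
    int sd = socket(PF_INET, SOCK_DGRAM, 0);
    if (sd == -1)
        return -1;

    memset(sname, 0, sizeof *sname);
    sname->sin_family = AF_INET;
    sname->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sname->sin_port = htons(0);

    if (bind(sd, (struct sockaddr *)sname, sizeof *sname) == -1 ||
        getsockname(sd, (struct sockaddr *)sname, &sname_len) == -1) {
        close(sd);
        return -1;
    }
    return sd;
}

/* Hypothetical demo: send msg as one datagram and receive it. Sets
   *names_match to 1 if recvfrom reported the sender's real port.
   Returns bytes received, or -1 on error. */
ssize_t udp_round_trip(const char *msg, char *buf, size_t max_bytes,
                       int *names_match)
{
    struct sockaddr_in send_sname, recv_sname, from_sname;
    socklen_t from_len = sizeof from_sname;
    ssize_t bytes_read = -1;
    int send_sd = udp_loopback_socket(&send_sname);
    int recv_sd = udp_loopback_socket(&recv_sname);

    *names_match = 0;
    if (send_sd != -1 && recv_sd != -1 &&
        sendto(send_sd, msg, strlen(msg), 0,
               (struct sockaddr *)&recv_sname, sizeof recv_sname) != -1) {
        bytes_read = recvfrom(recv_sd, buf, max_bytes, 0,
                              (struct sockaddr *)&from_sname, &from_len);
        if (bytes_read != -1)
            *names_match = (from_sname.sin_port == send_sname.sin_port);
    }
    if (send_sd != -1) close(send_sd);
    if (recv_sd != -1) close(recv_sd);
    return bytes_read;
}
```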
These functions are normally used in TCP, but not UDP.
status = listen ( sd, queue_max )
Before a socket can accept connections, you must call this to set up the queue of waiting connections.
queue_max specifies the maximum length of the queue. Once, it could not be more than 5, but some systems now have a much higher maximum. What happens if you specify a value larger than the system can handle? On Linux and SunOS, the system maximum is used; no error status is returned. On Solaris, there is no system maximum.
status = connect ( sd, other_sname, other_sname_len )
TCP: Starts a connection between socket sd and another socket, named by other_sname. A socket is normally connected only once.
UDP: this does not really start a connection, since UDP is connectionless, but it can be used to set the sending address. Thus, after calling this, you can use the write or send functions instead of the sendto function. You might call this several times to change the association. Then you can disconnect by calling this, but specifying an invalid socket name.
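The UDP case can be sketched as follows: after connect records the destination, the plain write function is enough to send a datagram. As before, both sockets live in one process on the loopback address, and the helper name is made up for this example.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: connect a UDP socket to a receiver, send msg with
   write, and read it on the receiver. Returns bytes received, or -1. */
ssize_t udp_connected_write(const char *msg, char *buf, size_t max_bytes)
{
    struct sockaddr_in recv_sname;
    socklen_t recv_sname_len = sizeof recv_sname;
    ssize_t bytes_read = -1;
    int recv_sd = socket(PF_INET, SOCK_DGRAM, 0);
    int send_sd = socket(PF_INET, SOCK_DGRAM, 0);

    if (recv_sd == -1 || send_sd == -1)
        goto done;

    /* Name the receiver: loopback address, system-chosen port. */
    memset(&recv_sname, 0, sizeof recv_sname);
    recv_sname.sin_family = AF_INET;
    recv_sname.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    recv_sname.sin_port = htons(0);
    if (bind(recv_sd, (struct sockaddr *)&recv_sname, sizeof recv_sname) == -1 ||
        getsockname(recv_sd, (struct sockaddr *)&recv_sname, &recv_sname_len) == -1)
        goto done;

    /* For UDP, connect only records the destination; no packets are sent. */
    if (connect(send_sd, (struct sockaddr *)&recv_sname, sizeof recv_sname) == -1)
        goto done;

    /* write works now because the destination is already set. */
    if (write(send_sd, msg, strlen(msg)) == -1)
        goto done;
    bytes_read = read(recv_sd, buf, max_bytes);

done:
    if (send_sd != -1) close(send_sd);
    if (recv_sd != -1) close(recv_sd);
    return bytes_read;
}
```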
status = accept ( sd, &from_sname, &from_sname_len)
socket sd must have been bound to a name, and you must have already run 'listen' on it. This function extracts the first waiting connection on the queue. from_sname is the name of the socket that was accepted.
After this returns, you can get the name of the remote socket via the getpeername function.
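To tie listen, connect, accept, and getpeername together, here is a sketch that plays both server and client in a single process over the loopback address. This works because a connection to a listening socket waits in the queue until accept takes it. The helper name, the single-process arrangement, and the queue length of 5 are assumptions of this example; a real server and client would be separate programs.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if the names reported by accept and
   getpeername both match the client's own name, 0 if not, -1 on error. */
int accept_reports_client_name(void)
{
    struct sockaddr_in server_sname, client_sname, from_sname, peer_sname;
    socklen_t len = sizeof server_sname;
    int conn_sd = -1, result = -1;
    int listen_sd = socket(PF_INET, SOCK_STREAM, 0);
    int client_sd = socket(PF_INET, SOCK_STREAM, 0);

    if (listen_sd == -1 || client_sd == -1)
        goto done;

    /* Server side: bind a name, then set up the queue. */
    memset(&server_sname, 0, sizeof server_sname);
    server_sname.sin_family = AF_INET;
    server_sname.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    server_sname.sin_port = htons(0);
    if (bind(listen_sd, (struct sockaddr *)&server_sname, sizeof server_sname) == -1 ||
        getsockname(listen_sd, (struct sockaddr *)&server_sname, &len) == -1 ||
        listen(listen_sd, 5) == -1)
        goto done;

    /* Client side: the connection waits in the queue until accepted. */
    if (connect(client_sd, (struct sockaddr *)&server_sname, sizeof server_sname) == -1)
        goto done;

    /* accept returns a new descriptor for this one connection. */
    len = sizeof from_sname;
    conn_sd = accept(listen_sd, (struct sockaddr *)&from_sname, &len);
    if (conn_sd == -1)
        goto done;

    len = sizeof peer_sname;
    if (getpeername(conn_sd, (struct sockaddr *)&peer_sname, &len) == -1)
        goto done;
    len = sizeof client_sname;
    if (getsockname(client_sd, (struct sockaddr *)&client_sname, &len) == -1)
        goto done;

    result = (from_sname.sin_port == client_sname.sin_port &&
              peer_sname.sin_port == client_sname.sin_port);
done:
    if (conn_sd != -1) close(conn_sd);
    if (client_sd != -1) close(client_sd);
    if (listen_sd != -1) close(listen_sd);
    return result;
}
```

Note that accept returns a new socket descriptor for the connection; the original socket sd keeps listening.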
shutdown: see above.
host_struct = gethostbyname (hostname)
Given the host name, get the IP address. Actually, host_struct might contain more than one valid name and IP address for the specified host. We will not go into details of this structure here.
host_struct = gethostbyaddr (address, address_length, addr_fam)
Given an IP address, get the host name. Actually, host_struct might contain more than one valid name and IP address for the specified host. We will not go into details of this structure here.
status = gethostname ( hostname, hostname_len)
get the name of the current host.
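As a rough sketch of a host name lookup, the helper below turns a host name into the text form of its first IPv4 address. The helper name host_to_ip_text is made up for this example, and it looks only at the first address in host_struct. Newer code generally uses getaddrinfo instead of gethostbyname.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: look up hostname and write its first IPv4 address
   as text (for example "127.0.0.1") into ip_text.
   Returns 0 on success, -1 if the lookup fails. */
int host_to_ip_text(const char *hostname, char *ip_text, socklen_t ip_text_len)
{
    struct hostent *host_struct = gethostbyname(hostname);

    if (host_struct == NULL || host_struct->h_addrtype != AF_INET ||
        host_struct->h_addr_list[0] == NULL)
        return -1;

    /* The address is stored in binary network format; convert it to text. */
    if (inet_ntop(AF_INET, host_struct->h_addr_list[0],
                  ip_text, ip_text_len) == NULL)
        return -1;
    return 0;
}
```

You could pass it the name obtained from gethostname to find an address of the current host, although whether that lookup succeeds depends on how the host is configured.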
A 'service' means the type of network service we want. Examples include World Wide Web, telnet, FTP, and daytime. Standard services are associated with 'well known' ports. These functions convert service names or port numbers from text strings to the binary format needed by the socket functions.
service_struct = getservbyname (service_name, transport_protocol)
Given the names of a service and a transport protocol (UDP or TCP), get the port number, which will be part of service_struct. We will not go into details of this structure here.
service_struct = getservbyport (port_number, transport_protocol)
Like getservbyname above, but you pass it a binary port number instead of the name of a service.
'long' here means 4 bytes, and 'short' means 2 bytes.
Note the pattern of the function names. For example:
ntohs = n to h s = network to host, short
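|FUNCTION||FROM FORMAT||"TO"||TO FORMAT||DATA LENGTH|
|htons||host||to||network||short (2 bytes)|
|htonl||host||to||network||long (4 bytes)|
|ntohs||network||to||host||short (2 bytes)|
|ntohl||network||to||host||long (4 bytes)|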
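A small sketch of what the conversion does: network byte order puts the most significant byte first, whatever the host's own byte order is. The helper name port_to_wire_bytes is made up for this example.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert a port number to network byte order and
   copy its two bytes, in memory order, into out[0] and out[1]. */
void port_to_wire_bytes(uint16_t host_port, unsigned char out[2])
{
    uint16_t net_port = htons(host_port);   /* host to network, short */
    memcpy(out, &net_port, sizeof net_port);
}
```

For port 8080 (0x1F90 in hexadecimal), out[0] is 0x1F and out[1] is 0x90. Converting back with ntohs returns the original value.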
You can set several socket options and get their current values. I will not list all of the different options here. See the man pages for these functions for more details.
status = getsockopt ( sd, protocol_level, socket_option, &value, &value_length )
The protocol_level specifies to which level the option is relevant. Examples are the general socket level, the IP level, and the TCP level.
The socket_option is the specific option we are interested in.
The value is the data we store the option's value in. The value_length is the length of that data structure. The type of the data depends on the option.
status = setsockopt ( sd, protocol_level, socket_option, value, value_length )
Like getsockopt above, but we set the value of the option.
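As one example, the sketch below sets the SO_REUSEADDR option at the general socket level (SOL_SOCKET) and then reads it back. The choice of this particular option and the helper name enable_reuse_addr are assumptions of this example; for this option the value is an int.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: turn on SO_REUSEADDR for sd, then read the option
   back. Returns 1 if the option is now on, 0 if off, -1 on error. */
int enable_reuse_addr(int sd)
{
    int value = 1;                          /* nonzero means "on" */
    socklen_t value_length = sizeof value;

    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &value, sizeof value) == -1)
        return -1;

    /* Read the option back to confirm the current setting. */
    value = 0;
    if (getsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &value, &value_length) == -1)
        return -1;

    return value != 0;
}
```

Servers often set this option before bind so that they can be restarted quickly on the same port.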
|1||One reason is that the socket interface is designed not only for normal TCP/IP [version 4], but also the next generation of TCP/IP [version 6], 'UNIX' sockets [pipes], Novell IPX, and many others. In Linux, see /usr/include/bits/socket.h for the different protocol families defined for sockets.|
|2||In his book, UNIX Network Programming, 2nd Edition, section 4.2, W. Richard Stevens argues that, for TCP/IP at least, there is no need to distinguish between address and protocol families. However, in this lab, we require that you do.|
|3||deprecated means that its use is discouraged.|