2.4. Data I/O Over TCP/IP Transport Layer

The libASSA library defines a set of object-oriented wrappers around the UNIX BSD socket library functions for data transmission. Both TCP and UDP sockets are supported.

2.4.1. Socket Class

An abstract class Socket defines an interface in terms of a concrete type of PEER_STREAM.

Figure 2-9. Socket class hierarchy

The derived class IPv4Socket is of particular interest to our discussion. This class allows us to transfer data over the TCP transport layer on top of IPv4.

Member functions of class Socket can be roughly divided into these categories:

  1. Connection Establishment - methods used directly by Connector (see Section 2.3) and Acceptor (see Section 2.2) patterns:

         open ();
         close ();
         connect ();
         accept ();
         bind ();
    		  
  2. Socket manipulation - methods that let you configure Socket class behavior and obtain data availability information on the socket.

          getBytesAvail ();
          in_avail ();
          getOptions ();
          setOptions ();
    		  
  3. Byte-oriented data transfer - methods to read/write blocks of bytes to/from a peer stream.

          read ();
          write ();
          flush ();
          ignore ();
    		  
  4. XDR-formatted base types data transfer - methods that transfer built-in data types over the wire in the network-independent (XDR) format.

          operator >> (T);
          operator << (T);
    		  

    where type T is one of the following:

          char
          unsigned char
          signed char
          int
          unsigned int
          float
          double
          STL string
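
To make "network-independent format" concrete, here is a minimal sketch of how XDR represents a 32-bit integer: four bytes, most significant byte first, regardless of the host's byte order. The helper names xdr_encode_int and xdr_decode_int are hypothetical and are not part of libASSA; the library's own operator << and operator >> may be implemented differently.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helpers, not libASSA API: XDR stores a 32-bit integer
// as four bytes in big-endian (most significant byte first) order.
std::array<unsigned char, 4> xdr_encode_int (std::int32_t value_)
{
    const std::uint32_t u = static_cast<std::uint32_t> (value_);
    return {{ static_cast<unsigned char> ((u >> 24) & 0xFF),
              static_cast<unsigned char> ((u >> 16) & 0xFF),
              static_cast<unsigned char> ((u >>  8) & 0xFF),
              static_cast<unsigned char> ( u        & 0xFF) }};
}

// Rebuild the host-order value from its four wire bytes.
std::int32_t xdr_decode_int (const std::array<unsigned char, 4>& bytes_)
{
    const std::uint32_t u =
        (static_cast<std::uint32_t> (bytes_[0]) << 24) |
        (static_cast<std::uint32_t> (bytes_[1]) << 16) |
        (static_cast<std::uint32_t> (bytes_[2]) <<  8) |
         static_cast<std::uint32_t> (bytes_[3]);
    return static_cast<std::int32_t> (u);
}
```

Because both peers agree on the wire layout, a value written on a little-endian host decodes to the same value on a big-endian one.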
    		  

With this generic description behind us, let us turn to the different ways you can read and write data with the libASSA library.

2.4.2. Testing Socket Stream State

Class Socket is modeled on the C++ stream classes. Its state can be tested for TRUE or FALSE upon completion of an I/O operation.

When an I/O operation on the Socket fails, or EOF is reached (i.e. the peer closes its end of the socket), the Socket's internal state is changed to reflect the exception that has occurred. For example, let us pretend that we are reading a stream of characters from a peer and processing each character until the peer closes the connection:


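int ServiceHandler::handle_read (int fd_)
{
    IPv4Socket& s = *this;
    char buf[4096];
    int ret = 0;

    // Read at most sizeof (buf) - 1 bytes so that the terminating
    // '\0' written below always fits inside the buffer.
    while (s && (ret = s.read (buf, sizeof (buf) - 1)) > 0) 
    {
        buf[ret] = '\0';
        std::cout << buf;
    }
    std::cout << std::flush;

    // Report end-of-file to the caller, otherwise the bytes still pending.
    return s.eof () ? -1 : s.in_avail ();
}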
	  

After each attempt to read data into the buf character buffer, whether or not it succeeds, Socket sets its internal state to good, bad, or end-of-file. Note that most of the time the latter two are equivalent in terms of how application code deals with them.

In the while() condition, the C++ compiler notices that a value of type bool is required and attempts an implicit conversion to bool (or to int if the compiler doesn't support bool as a language keyword). The compiler tries to convert Socket& to bool (or int). The Socket class has a couple of conversion operators defined just for that purpose:


class Socket {
public:
    operator void* () const;
    bool operator! () const;

    // other methods ...
};
	  

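The return value of operator void* () const follows the C++ stream convention: a null pointer if the Socket is in a failed state, and a non-null pointer otherwise.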

Thus, the while() loop condition calls the function:


(s).operator void* ();
	  

At any time you can test the state of your Socket object with the following methods:


bool bad() const;
bool fail() const;
bool good() const;
bool operator! () const;

iostate rdstate() const;
void setstate(iostate flag_);
void clear(iostate state_);
	  

Please refer to Chapter 4 for further details.
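
The conversion mechanism described above can be tried out in isolation. The sketch below uses a hypothetical MiniStream class that stands in for Socket and keeps only the state-testing interface; the flag values and the rule that fail() covers both failbit and badbit are borrowed from the standard C++ streams and are an assumption about libASSA, not a statement of its implementation.

```cpp
// Hypothetical stand-in for Socket, reduced to its state-testing interface.
class MiniStream
{
public:
    enum iostate { goodbit = 0, eofbit = 1, failbit = 2, badbit = 4 };

    MiniStream () : m_state (goodbit) { }

    bool good () const { return m_state == goodbit; }
    bool eof  () const { return (m_state & eofbit) != 0; }
    bool fail () const { return (m_state & (failbit | badbit)) != 0; }
    bool bad  () const { return (m_state & badbit) != 0; }

    iostate rdstate () const { return static_cast<iostate> (m_state); }
    void setstate (iostate flag_) { m_state |= flag_; }
    void clear (iostate state_ = goodbit) { m_state = state_; }

    // Non-null while the stream is usable, so that `while (s)` and
    // `if (s)` compile and test the state.
    operator void* () const
    {
        return fail () ? nullptr : const_cast<MiniStream*> (this);
    }

    // Lets `if (!s)` test for failure directly.
    bool operator! () const { return fail (); }

private:
    int m_state;
};
```

With such a class, a loop written as `while (s) { ... }` keeps running until some operation calls setstate() with failbit or badbit, and clear() makes the object usable again.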

2.4.3. Blocking vs. Non-Blocking I/O

A blocking read occurs when one of the data input I/O system calls (e.g. read()) is called and the TCP socket involved is set to blocking mode. If there is no data available in the socket receive buffer, the application program is suspended until some data arrives. Since TCP is a byte-oriented stream protocol, the application program is awakened as soon as at least 1 byte is available for reading.

In non-blocking read mode, if the read operation cannot be satisfied, it returns immediately with the EWOULDBLOCK error.

A blocking write occurs when one of the output I/O system calls (e.g. write()) is called and the socket is set to blocking mode. If there is no free room available in the socket transmit buffer, the application program is put to sleep until some room becomes available.

In non-blocking write mode, if there is no room at all in the socket send buffer, the write function call returns immediately with the EWOULDBLOCK error. If there is some room available in the buffer, the return value is the number of bytes that the kernel was able to copy into the buffer.
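
The non-blocking read behavior can be observed directly at the system-call level, without libASSA. The following sketch makes two assumptions: it runs on a POSIX system, and it uses a local socketpair() stream socket rather than a real TCP connection so that it needs no network setup. The function names are illustrative only.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

// Put descriptor fd_ into non-blocking mode; returns true on success.
bool set_nonblocking (int fd_)
{
    const int flags = ::fcntl (fd_, F_GETFL, 0);
    return flags != -1 && ::fcntl (fd_, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Illustrative check: returns true when a read on an empty non-blocking
// stream socket fails immediately with EWOULDBLOCK (or EAGAIN, which
// some systems define as a separate value).
bool empty_read_would_block ()
{
    int sv[2];
    if (::socketpair (AF_UNIX, SOCK_STREAM, 0, sv) == -1) {
        return false;
    }

    bool result = false;
    if (set_nonblocking (sv[0])) {
        char c;
        errno = 0;
        const ssize_t n = ::read (sv[0], &c, 1);   // nothing was written yet
        result = (n == -1) && (errno == EWOULDBLOCK || errno == EAGAIN);
    }

    ::close (sv[0]);
    ::close (sv[1]);
    return result;
}
```

Without the call to set_nonblocking(), the same read() would suspend the caller until the other end of the pair wrote something.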

Note

IPv4Socket sets the underlying socket to non-blocking mode by default in its open() member function.

You can switch the Socket mode between blocking and non-blocking with a call to the turnOptionOn() member function:

Example 2-5. Setting socket to nonblocking mode


IPv4Socket sock;

// Set to the nonblocking mode
//
sock.turnOptionOn (Socket::nonblocking);

// Set to the blocking mode
//
sock.turnOptionOn (Socket::blocking);
		

2.4.4. Buffered vs Non-Buffered I/O

Every TCP socket has a pair of send and receive buffers associated with it, allocated in kernel space. They are referred to as kernel socket buffers. To access kernel socket buffers, we have to make system calls (read(), write(), ioctl(), etc.), which are very expensive. Our goal is to transfer data in the least expensive way.

For this reason, class Socket implements an additional level of buffering in user space: socket buffers. This pair of buffers (incoming and outgoing) is introduced to increase data transfer performance.

Class Socketbuf implements socket buffers.

Figure 2-10. Socket buffering

Some application programs (telnet for one) are required to write each and every byte immediately to the kernel socket buffer. To achieve that, we configure Socket to perform unbuffered I/O.


IPv4Socket sock;

// Set Socket to unbuffered mode

sock.rdbuf ()->unbuffered (true);

// Set Socket to buffered mode

sock.rdbuf ()->unbuffered (false);
	  

Member function Socket::rdbuf() returns a pointer to the Streambuf object used for buffering.

When reading data from the socket stream, every Socket::read() reads not only the number of bytes asked for, but all bytes available in the kernel socket buffer with one system call, up to the maximum buffer size. If a later Socket::read() requests less data than is available in the Streambuf's read buffer, the request is satisfied immediately. Thus, we avoid extra system calls, which are expensive to make.

When writing data to the socket stream, every consecutive Socket::write() call from application code adds bytes to the Streambuf's write buffer. When it becomes full, the whole buffer is written to the kernel write buffer with one system call.
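
The write-side behavior can be illustrated with a small model. The WriteBuffer class below is hypothetical and is not the libASSA Streambuf: it collects bytes in user space and counts how many times it would have to hand a full buffer to the kernel, with a std::string standing in for the kernel write buffer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical user-space write buffer. Each flush() models exactly one
// write() system call that moves the pending bytes to the kernel.
class WriteBuffer
{
public:
    explicit WriteBuffer (std::size_t capacity_)
        : m_capacity (capacity_), m_flushes (0)
    {
        m_buf.reserve (capacity_);
    }

    // Append bytes; flush automatically whenever the buffer fills up.
    void write (const char* data_, std::size_t len_)
    {
        for (std::size_t i = 0; i < len_; ++i) {
            m_buf.push_back (data_[i]);
            if (m_buf.size () == m_capacity) {
                flush ();
            }
        }
    }

    // Push pending bytes to the sink; does nothing if the buffer is empty.
    void flush ()
    {
        if (m_buf.empty ()) {
            return;
        }
        m_sink.append (m_buf.begin (), m_buf.end ());
        m_buf.clear ();
        ++m_flushes;
    }

    std::size_t flushes () const { return m_flushes; }
    std::size_t pending () const { return m_buf.size (); }
    const std::string& sink () const { return m_sink; }

private:
    std::size_t       m_capacity;
    std::size_t       m_flushes;
    std::vector<char> m_buf;    // user-space buffer
    std::string       m_sink;   // stands in for the kernel write buffer
};
```

With a capacity of 8, three writes of 3, 3 and 4 bytes cost a single simulated system call, and the last 2 bytes stay pending until the buffer fills again or flush() is called explicitly.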

The sizes of the read and write buffers are defined by the Streambuf::MAXTCPFRAMESZ static constant:


const int Streambuf::MAXTCPFRAMESZ = 1416; 
	  

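This number is the largest amount of data that TCP can transmit unfragmented over a link with an MTU of 1500 bytes: 1500 - 20 - 60 = 1420, rounded down to a multiple of 8, which gives 1416. The 20 bytes account for the IP header and the 60 bytes for the largest possible TCP header: a TCP header can carry options (growing to as much as 60 bytes in total) which, if ignored, might cause fragmentation. The rounding is applied because the length of the IP packet must be evenly divisible by 8.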
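
The arithmetic behind the constant can be checked step by step. The sketch below assumes a 20-byte IPv4 header without options and the 60-byte maximum TCP header; the constant names are illustrative and do not come from libASSA.

```cpp
// Assumed sizes: 20-byte IPv4 header without options, and the 60-byte
// maximum TCP header (20 fixed bytes plus up to 40 bytes of options).
const int MTU            = 1500;
const int IP_HEADER      = 20;
const int TCP_HEADER_MAX = 60;

// Room left for data after both headers: 1500 - 20 - 60 = 1420.
const int payload = MTU - IP_HEADER - TCP_HEADER_MAX;

// Round down to a multiple of 8: 1420 - (1420 % 8) = 1420 - 4 = 1416.
const int aligned = payload - (payload % 8);
```

The subtraction alone yields 1420; it is the rounding down to a multiple of 8 that produces the value 1416 used for MAXTCPFRAMESZ.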

Application code can request a flush of the write buffer at any time by inserting the flush manipulator into the Socket stream:


void foo ()
{
    ASSA::IPv4Socket sock;

    sock.write ();        // write some data to the socket buffer
    sock.write ();        // write more data to the socket buffer

    sock << ASSA::flush;        // flush data to the socket
}
	  

2.4.5. Non-Blocking I/O (Polling)

Another way to read data from the stream is to set the Socket to nonblocking mode, attempt to read, and then test errno for the EWOULDBLOCK error. Alternatively, you can peek into both the kernel socket buffer and the Streambuf buffer for data availability.

Function getBytesAvail() of class Socket returns the sum of bytes available both in the kernel socket buffer and Streambuf buffer space.

Function in_avail() of class Streambuf returns the number of bytes available in the kernel socket buffer space.

class Socket 
{
public: 
    int getBytesAvail (void) const;

    // ...
};

class Streambuf 
{
public:
    int in_avail () const;

    // ...
};
	  

These two functions let you find out how much data is available for processing at any given time.
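
At the system-call level, a common way to ask how many bytes are waiting in the kernel receive buffer is ioctl() with the FIONREAD request. The sketch below assumes a Linux or BSD-like system and uses a local socketpair() in place of a TCP connection; it illustrates the technique and is not a description of how getBytesAvail() is implemented in libASSA. The function names are hypothetical.

```cpp
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

// Returns the number of bytes waiting in the kernel receive buffer of
// fd_, or -1 if the ioctl() call fails.
int kernel_bytes_avail (int fd_)
{
    int n = 0;
    return ::ioctl (fd_, FIONREAD, &n) == -1 ? -1 : n;
}

// Illustrative check: write 5 bytes into one end of a socket pair and
// report how many bytes the other end can read without blocking.
int bytes_visible_after_write ()
{
    int sv[2];
    if (::socketpair (AF_UNIX, SOCK_STREAM, 0, sv) == -1) {
        return -1;
    }

    int avail = -1;
    const char msg[] = "hello";
    if (::write (sv[0], msg, 5) == 5) {
        avail = kernel_bytes_avail (sv[1]);
    }

    ::close (sv[0]);
    ::close (sv[1]);
    return avail;
}
```

Every call to kernel_bytes_avail() is itself a system call, which is why calling it in a tight loop is costly.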

The disadvantage of reading data this way is poor performance. Each of the function calls above implies making a system call. In addition, polling is generally inefficient in terms of CPU consumption. A better way of exchanging data with a peer is to follow the I/O multiplexing model of data communication, which is described in the next section.