Monday, November 15, 2010

Creating a std::iostream socket class (Part 3)

So far we've covered implementation of a custom stream buffer that can be used with standard library functions. Now all that's left to go is the actual socket code itself! For more information on modern socket programming, I highly recommend Beej's Guide to Network Programming. It's simple, easy to read, and covers what you need to know as you need to know it. It's also the reference I used while implementing my socket code. :)

Connecting to a server is done in three steps: Resolve the address, create a socket, and connect the socket to the address. Resolving the address looks like this:
  addrinfo hints, *info, *cur;
  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  int ret = getaddrinfo(host.c_str(), port.c_str(), &hints, &info);
  if (ret != 0) handle_error();
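
If you want to poke at this step on its own, here's a minimal sketch that resolves a host and returns every result as a numeric address string. It isn't part of tcpstream; the helper name resolve_numeric() is made up for illustration, and it assumes a POSIX system.

```cpp
#include <cstring>
#include <string>
#include <vector>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper (not from tcpstream): resolve host/port and return
// each entry of the getaddrinfo() result list as a numeric address string.
std::vector<std::string> resolve_numeric(const std::string &host,
                                         const std::string &port) {
  std::vector<std::string> out;
  addrinfo hints, *info = NULL;
  std::memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;      // accept both IPv4 and IPv6 results
  hints.ai_socktype = SOCK_STREAM;  // TCP only
  int ret = getaddrinfo(host.c_str(), port.c_str(), &hints, &info);
  if (ret != 0) return out;         // gai_strerror(ret) describes the failure

  for (addrinfo *cur = info; cur != NULL; cur = cur->ai_next) {
    char text[64];                  // large enough for a numeric IPv6 address
    if (getnameinfo(cur->ai_addr, cur->ai_addrlen, text, sizeof(text),
                    NULL, 0, NI_NUMERICHOST) == 0)
      out.push_back(text);
  }
  freeaddrinfo(info);
  return out;
}
```

Calling resolve_numeric("127.0.0.1", "80") should give back a list whose first entry is "127.0.0.1"; a real hostname may produce several entries, mixing IPv4 and IPv6.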


Once we've resolved the address, we create a socket. Since getaddrinfo() returns a linked list of address results, we need to find one that has an address type we can connect to. This way, it will automatically use IPv6 if that's all that's available. Once we have a connected socket we can free the returned address info.
  for (cur = info; cur != NULL && m_data->socket == -1; cur = cur->ai_next) {
    m_data->socket = socket(cur->ai_family,
      cur->ai_socktype, cur->ai_protocol);
    if (m_data->socket != -1) {
      // we can bind via this protocol, can we connect?
      if (::connect(m_data->socket, cur->ai_addr, cur->ai_addrlen) == -1) {
        ::close(m_data->socket);
        m_data->socket = -1;
      } else {
        m_data->remotehost = host + ":" + port;
      }
    }
  }
  freeaddrinfo(info);
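
For reference, the resolve and connect steps can be put together into one standalone function. This is only a sketch under a few assumptions: it returns a plain file descriptor instead of storing it in m_data, and the name connect_to() is hypothetical rather than anything from tcpstream.

```cpp
#include <cstring>
#include <string>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns a connected socket, or -1 on failure.
int connect_to(const std::string &host, const std::string &port) {
  addrinfo hints, *info = NULL;
  std::memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  if (getaddrinfo(host.c_str(), port.c_str(), &hints, &info) != 0) return -1;

  int fd = -1;
  // walk the result list until one address accepts the connection
  for (addrinfo *cur = info; cur != NULL && fd == -1; cur = cur->ai_next) {
    fd = socket(cur->ai_family, cur->ai_socktype, cur->ai_protocol);
    if (fd == -1) continue;  // this address family isn't usable, try the next
    if (::connect(fd, cur->ai_addr, cur->ai_addrlen) == -1) {
      ::close(fd);
      fd = -1;
    }
  }
  freeaddrinfo(info);
  return fd;
}
```

The caller owns the returned descriptor and should close() it when done.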


To read from the socket, use either read() (which works with all file descriptors) or recv(), which also takes socket-specific flags. Note that by default, either will block if there's nothing to be read yet.
  int num = recv(m_data->socket, m_data->buffer, BUFSIZE, 0);
  if (num <= 0) handle_socket_closed();
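
A return value of 0 from recv() means the other end closed the connection, and -1 means an error. As a rough sketch of how that plays out in a loop (read_until_closed() is a made-up name, and the 512-byte buffer is an arbitrary choice):

```cpp
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: keep reading until the peer closes the connection
// (recv() returns 0) or an error occurs (recv() returns -1).
std::string read_until_closed(int fd) {
  std::string data;
  char buffer[512];
  for (;;) {
    ssize_t num = recv(fd, buffer, sizeof(buffer), 0);  // blocks if no data yet
    if (num <= 0) break;
    data.append(buffer, static_cast<std::size_t>(num));
  }
  return data;
}
```

A quick way to try it without a network is a socketpair(): write to one end, close it, and read everything back from the other.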


To check whether there's anything to read on a socket, use the select() function:
  timeval waittime = { 0, 0 };
  fd_set readset;
  FD_ZERO(&readset);
  FD_SET(m_data->socket, &readset);
  select(m_data->socket+1, &readset, NULL, NULL, &waittime);
  if (FD_ISSET(m_data->socket, &readset)) read_socket_data();
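
Wrapped up as a function, the same check might look like the sketch below. The name has_pending_data() is hypothetical; note that select() also reports a socket as readable when the peer has closed it, in which case the following recv() returns 0.

```cpp
#include <cstddef>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper: returns true if a read on fd would not block.
bool has_pending_data(int fd) {
  timeval waittime = { 0, 0 };  // zero timeout: poll and return immediately
  fd_set readset;
  FD_ZERO(&readset);
  FD_SET(fd, &readset);
  // the first argument is the highest descriptor in any set, plus one
  int ready = select(fd + 1, &readset, NULL, NULL, &waittime);
  return ready > 0 && FD_ISSET(fd, &readset);
}
```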


If you want to listen for incoming network connections the setup is slightly different. We use a 'listener socket' which listens on a given port, and then when a connection attempt is made, a call to accept() will return another socket which is connected to the remote client.

To listen:
  // bind to the requested port
  addrinfo hints, *info, *cur;
  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_PASSIVE;
  int r;
  if ((r = getaddrinfo(NULL, port.c_str(), &hints, &info)) != 0) return false;

  // try to get a socket to bind to the port
  for (cur = info; cur != NULL && m_data->socket == -1; cur = cur->ai_next) {
    m_data->socket = socket(cur->ai_family,
      cur->ai_socktype, cur->ai_protocol);
    if (m_data->socket != -1) {
      // insert lame joke about rings here
      if (bind(m_data->socket, cur->ai_addr, cur->ai_addrlen) == -1)
        close();
    }
  }
  freeaddrinfo(info);

  // if we have a socket, listen on it
  if (m_data->socket == -1) return false;
  if (listen(m_data->socket, m_data->backlog) == -1) return false;

  // accept the incoming connection
  sockaddr_storage addr;
  socklen_t addrlen = sizeof(addr);
  sock.m_data->socket = ::accept(m_data->socket, (sockaddr *)&addr, &addrlen);
  if (sock.m_data->socket == -1) return false; // fail :(
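
Here's a standalone sketch of the listening side, assuming plain file descriptors instead of the m_data wrapper. Both listen_on() and bound_port() are hypothetical names, not part of tcpstream. Passing "0" as the port asks the OS to pick a free one, which bound_port() can then report.

```cpp
#include <cstring>
#include <string>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns a listening socket, or -1 on failure.
int listen_on(const std::string &port, int backlog) {
  addrinfo hints, *info = NULL;
  std::memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_PASSIVE;  // NULL host means "any local address"
  if (getaddrinfo(NULL, port.c_str(), &hints, &info) != 0) return -1;

  int fd = -1;
  for (addrinfo *cur = info; cur != NULL && fd == -1; cur = cur->ai_next) {
    fd = socket(cur->ai_family, cur->ai_socktype, cur->ai_protocol);
    if (fd == -1) continue;
    if (bind(fd, cur->ai_addr, cur->ai_addrlen) == -1 ||
        listen(fd, backlog) == -1) {
      ::close(fd);
      fd = -1;
    }
  }
  freeaddrinfo(info);
  return fd;
}

// Hypothetical helper: the local port fd is bound to, or 0 on failure.
unsigned short bound_port(int fd) {
  sockaddr_storage addr;
  socklen_t len = sizeof(addr);
  if (getsockname(fd, (sockaddr *)&addr, &len) == -1) return 0;
  if (addr.ss_family == AF_INET)
    return ntohs(((sockaddr_in *)&addr)->sin_port);
  if (addr.ss_family == AF_INET6)
    return ntohs(((sockaddr_in6 *)&addr)->sin6_port);
  return 0;
}
```

Once listen_on() has returned a valid descriptor, each call to accept() on it blocks until a client connects and then hands back a new socket for that client.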


So there you go; almost everything you need to know to write socket code. And if you just want something that works, here's the full source of tcpstream.cpp and tcpstream.h. I've released it under an attribution license, so feel free to use it in whatever projects you want, commercial or otherwise.

Saturday, November 13, 2010

You can't reuse ports within 60 seconds

As the title says, just a little gotcha I ran into. After having an open socket on a port, you can't bind another socket to that port until a timeout period has passed. My test program for my tcpstream class worked fine, but if I tried to re-run it, I couldn't re-bind to the port I was using until a short while had passed. After an hour or so of trying to figure out why my socket wasn't closing, I finally found that it was actually Working As Intended(TM). It's fully explained in this post:

The short answer is:
"No, you may not re-use the port for the first 60 seconds
after a bound socket is closed (explicitly or because
the program exited)".
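
This isn't from the post quoted above, but a commonly used way to relax the restriction while developing is the SO_REUSEADDR socket option, which lets bind() succeed while old connections on that port are still timing out. A minimal sketch, assuming a POSIX system (the helper name allow_address_reuse() is made up):

```cpp
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: must be called after socket() and before bind().
// Returns true if the option was set.
bool allow_address_reuse(int fd) {
  int yes = 1;
  return setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes)) == 0;
}
```

The exact semantics of SO_REUSEADDR differ between operating systems, so treat this as something to check against your platform's documentation rather than a universal fix.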

Friday, November 12, 2010

Creating a std::iostream socket class (Part 2)

Last time I outlined how the tcpstream class should work. Having written socket code before, the biggest unknown was "how does I makes a stream of my owns"? The answer is "you implement a custom std::streambuf object." Once you have a stream buffer reading from your source, you simply create a std::iostream object and set it to use that stream buffer, et voila!

The standard documentation for streambuf is rather terse and more than a little ambiguous, but happily, C++ Annotations Version 6.2.3 chapter 20 begins by detailing the implementation of a custom std::streambuf object.

In short, all that's required to create your own stream buffer is to implement some or all of the following methods:

// implement for std::streambuf read functionality
virtual int underflow();

// implement for std::streambuf write functionality
virtual int overflow(int c = EOF);

// optional but may improve performance
virtual std::streamsize xsgetn(char_type *s, std::streamsize n);
virtual std::streamsize xsputn(const char_type *s, std::streamsize n);


The first of these, underflow(), is called when the owner of the stream buffer runs out of data while reading. The general form that this method should take is:

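int underflow() {
  int bytes_read = fillbuffer(m_pBuffer, MAX_BUFFER_SIZE);
  if (bytes_read <= 0) return EOF; // nothing to read (yet)
  setg(m_pBuffer, m_pBuffer, m_pBuffer + bytes_read);
  // return the first new character without consuming it
  return (unsigned char)*m_pBuffer;
}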


The setg() call sets up the stream buffer's get area, the region of memory the stream reads from. The arguments are, in order:
  • A pointer to the first byte of the buffer.

  • A pointer to the current byte in the buffer (which will usually be the first byte of the buffer, if you read a whole buffer at a time).

  • A pointer to the byte after the end of the buffer.
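
To see underflow() and setg() working together without any sockets involved, here's a small self-contained sketch. The chunkbuf class and read_line_via() are made up for illustration: the buffer serves a string to the stream four bytes at a time, so reading a single line forces several underflow() calls.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical example class: hands out a string in 4-byte pieces.
class chunkbuf : public std::streambuf {
public:
  explicit chunkbuf(const std::string &src) : m_src(src), m_pos(0) {}

protected:
  int_type underflow() override {
    // if the get area still has data, just report the current character
    if (gptr() < egptr()) return traits_type::to_int_type(*gptr());

    std::size_t n = std::min(sizeof(m_buf), m_src.size() - m_pos);
    if (n == 0) return traits_type::eof();  // source exhausted

    std::memcpy(m_buf, m_src.data() + m_pos, n);
    m_pos += n;
    // first byte, current byte, one past the last byte
    setg(m_buf, m_buf, m_buf + n);
    return traits_type::to_int_type(*gptr());
  }

private:
  std::string m_src;   // stand-in for the real data source
  std::size_t m_pos;   // how much of m_src has been handed out
  char m_buf[4];       // deliberately tiny to force repeated refills
};

// Reads the first line of text through a chunkbuf-backed istream.
std::string read_line_via(const std::string &text) {
  chunkbuf buf(text);
  std::istream in(&buf);
  std::string line;
  std::getline(in, line);
  return line;
}
```

Note that underflow() returns the current character without advancing past it; the stream does the advancing itself.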

For tcpstream, I added a helper method, fillbuffer(), which checks the socket for new data and is called at the beginning of all data-related methods. Since I didn't want to limit my stream to a predetermined fixed line length, I needed to buffer an arbitrary amount of data, and for this I settled on storing a linked list of data chunks. Each chunk represents a single read operation on the socket, and can store up to BUFFER_SIZE bytes.

struct datachunk {
  int bytes;
  uint8_t *data;
  datachunk *next; // pointer to next datachunk
  datachunk(int size) : bytes(size),
     data(new uint8_t[size]), next(NULL) { }
  ~datachunk() { delete[] data; }
};
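
As an illustration of how chunks might get queued onto such a list, here's a sketch. The helpers append_chunk() and free_chunks() are hypothetical and not taken from tcpstream; the struct is repeated so the example compiles on its own.

```cpp
#include <cstddef>
#include <cstring>
#include <stdint.h>

struct datachunk {
  int bytes;
  uint8_t *data;
  datachunk *next; // pointer to next datachunk
  datachunk(int size) : bytes(size),
     data(new uint8_t[size]), next(NULL) { }
  ~datachunk() { delete[] data; }
};

// Hypothetical helper: copy size bytes into a new chunk at the list's tail.
void append_chunk(datachunk *&head, const void *src, int size) {
  datachunk *chunk = new datachunk(size);
  std::memcpy(chunk->data, src, size);
  if (head == NULL) { head = chunk; return; }
  datachunk *tail = head;
  while (tail->next != NULL) tail = tail->next;
  tail->next = chunk;
}

// Hypothetical helper: delete every chunk and reset the head pointer.
void free_chunks(datachunk *&head) {
  while (head != NULL) {
    datachunk *old = head;
    head = old->next;
    delete old;
  }
}
```

Walking to the tail on every append is linear in the list length; keeping a separate tail pointer would avoid that if the list ever gets long.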


With this linked list, the logic became pretty simple: If there's a new data chunk queued up, delete the current chunk and start using the new chunk's data. If there isn't then we've temporarily run out of data and we return EOF.

// implementation of underflow() for std::streambuf
int tcpbuf::underflow() {
  fillbuffer();

  // bail if we don't have another buffer to swap to yet
  if (m_data->inbox == NULL || m_data->inbox->next == NULL)
    return EOF;

  // we've used up the buffer at the front of the inbox, free it
  datachunk *old = m_data->inbox;
  m_data->inbox = old->next;
  delete old;

  // and set the new buffer up for the stream to read from
  char_type *newdata = (char_type *)m_data->inbox->data;
  setg(newdata, newdata, newdata + m_data->inbox->bytes);
  return (unsigned char)*newdata; // cast so a 0xFF byte isn't mistaken for EOF
}


Note: In order to keep the interface completely platform-independent without resorting to factory methods and the like, the class stores its data in a wrapper struct m_data. For this method it's enough to know that m_data->inbox is a pointer to the head of the list of datachunks.

The second of the mandatory methods, overflow(), is called in the inverse condition, when data is written to the stream. It just needs to make sure the byte it's given gets to where it needs to be. Unless you're buffering the data to write later (not a good plan for a socket class that's intended to be used somewhat interactively!) then it's usually a simple matter:

// implementation of overflow() for std::streambuf
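int tcpbuf::overflow(int c) {
  if (m_data->socket == -1) return EOF;
  if (c == EOF) return 0; // nothing to write; any non-EOF value means success
  char ch = (char)c;      // send exactly one byte, not the whole int
  return write(m_data->socket, &ch, 1) == 1 ? c : EOF;
}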


In this case, I didn't want to write my bytes to the socket individually every time, so I added an implementation for the optional xsputn() method, which simply writes n bytes instead of a single byte:

// implementation of xsputn() for std::streambuf
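std::streamsize tcpbuf::xsputn(const char *s, std::streamsize n) {
  // xsputn() returns the number of bytes written, so failure is 0, not -1
  if (m_data->socket == -1) return 0;
  ssize_t written = write(m_data->socket, s, n);
  return written < 0 ? 0 : written;
}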
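
To try the write side in isolation, here's a self-contained sketch of an unbuffered stream buffer that writes to any file descriptor. The fdbuf class is made up for this example and is not the tcpbuf implementation; unlike the snippet above, its xsputn() loops, because write() is allowed to accept fewer bytes than requested.

```cpp
#include <ostream>
#include <streambuf>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical example class: unbuffered output to a file descriptor.
class fdbuf : public std::streambuf {
public:
  explicit fdbuf(int fd) : m_fd(fd) {}

protected:
  // called for single characters, since there is no put area to store them
  int_type overflow(int_type c) override {
    if (traits_type::eq_int_type(c, traits_type::eof()))
      return traits_type::not_eof(c);  // nothing to write, report success
    char ch = traits_type::to_char_type(c);
    return ::write(m_fd, &ch, 1) == 1 ? c : traits_type::eof();
  }

  // called for runs of characters; returns how many were actually written
  std::streamsize xsputn(const char_type *s, std::streamsize n) override {
    std::streamsize written = 0;
    while (written < n) {
      ssize_t r = ::write(m_fd, s + written,
                          static_cast<std::size_t>(n - written));
      if (r <= 0) break;  // error: stop and report what we managed
      written += r;
    }
    return written;
  }

private:
  int m_fd;
};
```

With a std::ostream constructed on top of an fdbuf, string output goes through xsputn() while formatted numbers typically arrive one character at a time through overflow().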


So there you go. That's all you have to do to implement a std::streambuf. Of course, I haven't covered any of the socket code here. That's for next time.

One last thing; it's a drag to create a stream buffer, connect that to your network, and then create a new std::iostream to use that streambuffer, every time you want to connect a socket. There's a simple way around this though:

// tcpstream - a tcpbuf-based iostream, for convenience
// (tcpbuf is listed first so the buffer is constructed before the stream that uses it)
class tcpstream : public tcpbuf, public std::iostream {
public:
  tcpstream() : std::iostream(this) { }
};


So now all of the socket-related methods in tcpbuf are available on a tcpstream, and you can also use the stream for your overloaded IO operators.

Saturday, November 6, 2010

Creating a std::iostream socket class (Part 1)

In the course of a project I'm currently working on, I've run across the need for some networking. In the past I've just wrapped up some Berkeley sockets in a class and been done with it, but with this project I'm aiming to do things 'properly', using some of the more advanced features of C++ (including finally bidding a fond adieu to printf() and starting to use std::cin / std::cout). After a bit of thought, it hit me. What better way to wrap up a socket than in a std::streambuf?

First port of call was trusty old cplusplus.com. Hmm. Nope, the same "you don't DESERVE to understand this" style as the standard library itself. Blargh. After some googling around, C++ Annotations 6.2.3 to the rescue! Now we're getting somewhere. Aha! Dr. Dobb's adds some insight.

Encouraged by the fact that it seemed possible, I looked around for existing libraries. A few options presented themselves in this excellent forum post, but they were all either too heavyweight or too awkward. Also, I just wanted to do it myself. Here's a snippet showing how simple it is to use:

std::string line;
tcpstream sock;
sock.connect("www.google.com", 80);
if (sock.bad()) handleError();
sock << "GET / HTTP/1.0\r\nConnection: close\r\n\r\n";
while (sock.connected()) {
  std::getline(sock, line);
  std::cout << line << '\n'; // getline() strips the newline, so put it back
}

Next time I'll detail the process for creating a custom std::streambuf.