In this post we’ll discuss sockets, more precisely Network Sockets. It’s based on two excellent articles from Brian “Beej Jorgensen” Hall. We’ll first cover some basic OS and Network concepts and then go over some important functions needed to create a simple socket server that can handle requests from multiple clients.

This post assumes basic familiarity with the Linux operating system and the C programming language.

OS Basics

File Descriptors

A file descriptor is a numerical identifier (id) that indexes a lookup table kept for each process. It is used to model not only files, but also things like stdin (special id = 0), stdout (id = 1) and stderr (id = 2). Sockets are also represented as file descriptors, as we’ll see later.


fork() is a system call the current process can use to create copies of itself (known as children) that run the same program. One observation is that the child process gets a copy of the parent’s data.

This will be important for our example where the main process uses fork() to generate children to handle connections. The child needs to inherit some of the file descriptors from the parent.

fork() returns 0 in the child process, and in the parent it returns a non-zero value corresponding to the process id (pid) of the child (or -1 if no child could be created). A common way to use fork() is the following:

if (!fork()) {
    printf("I'm the child!\n");
} else {
    printf("I'm the parent!\n");
}


The return value can be used to distinguish between a parent and child and hence we can have them execute different code.
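As a slightly fuller sketch of the same idea, the function below also handles the error case and has the parent wait for the child. The helper name fork_and_wait is ours, not Beej's, and it assumes a POSIX system:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks, lets the child exit with `code`, and returns
 * the exit status observed by the parent, or -1 on failure. */
int fork_and_wait(int code) {
    pid_t pid = fork();
    if (pid < 0) {              /* fork failed: no child was created */
        perror("fork");
        return -1;
    }
    if (pid == 0) {             /* child: fork() returned 0 */
        _exit(code);
    }
    /* parent: fork() returned the pid of the child */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_and_wait(7) from the parent should return 7, since that is the status the child exits with.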


Signals are one of the simplest ways to communicate with a process. We just send a code that the process knows how to handle. There are OS-level signals like SIGKILL and SIGTERM and user-defined ones such as SIGUSR1. It’s possible to override how a process handles specific signals (SIGKILL being a notable exception) via sigaction(), which takes a structure that points to a handler:

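void sigint_handler(int sig) {
    /* write() is async-signal-safe, unlike printf() */
    write(STDOUT_FILENO, "Ahhh! SIGINT!\n", 14);
}

int main(void) {
    struct sigaction sa;
    sa.sa_handler = sigint_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask); /* block no extra signals while handling */
    sigaction(SIGINT, &sa, NULL);
    /* ... rest of the program ... */
}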

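A common pattern is to keep the handler tiny and only record that the signal arrived. The sketch below does that for the user-defined SIGUSR1. The function and variable names are hypothetical, chosen for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Flag set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t got_usr1 = 0;

static void on_usr1(int sig) {
    (void)sig;       /* unused */
    got_usr1 = 1;    /* only record the event, do the real work elsewhere */
}

/* Installs the handler for SIGUSR1. Returns 0 on success, -1 on error. */
int install_usr1_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Returns 1 once SIGUSR1 has been delivered, 0 otherwise. */
int usr1_received(void) {
    return got_usr1;
}
```

After install_usr1_handler() succeeds, sending the signal (for instance with raise(SIGUSR1) from the same process, or kill from another one) flips the flag instead of terminating the process.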

Network Basics

Beej describes the Layered Network Model but follows with this interesting observation:

Now, this model is so general you could probably use it as an automobile repair guide if you really wanted to. A layered model more consistent with Unix might be:

  • Application Layer (telnet, ftp, etc.)
  • Host-to-Host Transport Layer (TCP, UDP)
  • Internet Layer (IP and routing)
  • Network Access Layer (Ethernet, wi-fi, or whatever)

In this model, sockets are in the application layer since they rely on TCP or UDP on top of IP. We’ll go over these 3 things next.

IP the Internet Protocol

The guide discusses some details of the IP, including the IPv4 and IPv6 distinction and different address types. The flags AF_INET and AF_INET6 are part of the socket API and associated to these 2 types.

One interesting detail is the byte order (Little-Endian and Big-Endian): while each computer system can represent data in a different way, the order is standardized for the internet, and it is Big-Endian. This is also known as Network Byte Order.
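The socket API ships with helpers such as htons() ("host to network short") and ntohs() to convert between the two orders. A minimal sketch, where first_wire_byte is a hypothetical helper of ours that shows which byte goes on the wire first:

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Returns the first byte of `port` as it would travel on the wire,
 * i.e. after conversion to Network Byte Order (Big-Endian). */
unsigned char first_wire_byte(uint16_t port) {
    uint16_t net = htons(port);       /* host order -> network order */
    unsigned char bytes[2];
    memcpy(bytes, &net, sizeof net);  /* look at the raw memory layout */
    return bytes[0];
}
```

Regardless of the endianness of the machine running it, the most significant byte comes first, so first_wire_byte(0x1234) yields 0x12.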


The User Datagram Protocol is connectionless (stateless) and provides no guarantees on the order of the datagrams, their delivery or that duplicates are avoided. The messages sent via UDP are known as datagrams.

The Transmission Control Protocol relies on a connection (established via a 3-way handshake), guarantees order and performs retries. The data sent via TCP is treated as a continuous data stream.

The socket types SOCK_STREAM and SOCK_DGRAM are associated to the TCP and UDP protocols respectively.

Socket API

Before we proceed with our example, we’ll cover the C functions in sys/socket.h that correspond to the Linux socket APIs:

getaddrinfo() is a relatively high-level function that is capable of resolving a hostname and service name to an actual IP address and port. An example of use is:

int status;
struct addrinfo hints;
struct addrinfo *servinfo; // will point to the results
memset(&hints, 0, sizeof hints); // make sure the struct is empty
hints.ai_family = AF_INET6; // IPv6
hints.ai_socktype = SOCK_STREAM; // TCP stream sockets
hints.ai_flags = AI_PASSIVE; // fill in my IP for me
status = getaddrinfo("", "https", &hints, &servinfo);
if (status != 0) { // getaddrinfo() reports errors via its return value
    fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(status));
    exit(1);
}
// The first result holds an IPv6 address because we asked for AF_INET6
struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)servinfo->ai_addr;
void *addr = &(ipv6->sin6_addr);
char ipstr[INET6_ADDRSTRLEN];
inet_ntop(servinfo->ai_family, addr, ipstr, sizeof ipstr);
printf("IP: %s\n", ipstr);
freeaddrinfo(servinfo); // free the linked list


The variable servinfo points to a struct addrinfo, which is a node of a linked list:

struct addrinfo {
    int ai_flags; // AI_PASSIVE, AI_CANONNAME, etc.
    int ai_family; // AF_INET, AF_INET6, AF_UNSPEC
    int ai_socktype; // SOCK_STREAM, SOCK_DGRAM
    int ai_protocol; // use 0 for "any"
    size_t ai_addrlen; // size of ai_addr in bytes
    struct sockaddr *ai_addr; // struct sockaddr_in or _in6
    char *ai_canonname; // full canonical hostname
    struct addrinfo *ai_next; // linked list, next node
};


getaddrinfo() returns a list of such values, one for each address that matches the criteria from the input parameters. In Beej’s client code, we’ll see that it iterates over that list until it finds a set of parameters that it can connect to.
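To make the linked-list traversal concrete, here is a minimal sketch that walks the results via ai_next and prints the first usable address into a buffer. The helper first_address and its signature are our own invention, not part of Beej's code:

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolves host/service and writes the textual form of
 * the first IPv4 or IPv6 result into `out`. Returns 0 on success, -1 otherwise. */
int first_address(const char *host, const char *service,
                  char *out, socklen_t outlen) {
    struct addrinfo hints, *res, *p;
    int found = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    /* Walk the linked list until one entry can be converted to text. */
    for (p = res; p != NULL && found != 0; p = p->ai_next) {
        const void *addr;
        if (p->ai_family == AF_INET)
            addr = &((struct sockaddr_in *)p->ai_addr)->sin_addr;
        else if (p->ai_family == AF_INET6)
            addr = &((struct sockaddr_in6 *)p->ai_addr)->sin6_addr;
        else
            continue;                 /* some other family: skip it */
        if (inet_ntop(p->ai_family, addr, out, outlen) != NULL)
            found = 0;
    }
    freeaddrinfo(res);                /* the caller of getaddrinfo() owns the list */
    return found;
}
```

A client would follow the same loop shape, but call socket() and connect() on each node instead of inet_ntop().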

socket() returns a file descriptor. It basically creates an entry with a given id in the process’s descriptor table and returns that identifier, which we’ll use to establish a connection later. The interface is as follows:

int socket(int domain, int type, int protocol);

  • domain could be one of AF_INET and AF_INET6, or equivalently PF_INET and PF_INET6 (IPv4 / IPv6)
  • type is one of SOCK_STREAM or SOCK_DGRAM (TCP / UDP)
  • protocol can be set to 0 to choose the proper protocol for the given type

In practice AF_INET is the same as PF_INET. Beej says:

This PF_INET thing is a close relative of the AF_INET (…) they’re so closely related that they actually have the same value (…), it was thought that maybe an address family (what the “AF” in “AF_INET” stands for) might support several protocols that were referred to by their protocol family (what the “PF” in “PF_INET” stands for). That didn’t happen.

More conveniently we can also use the results of getaddrinfo() to fill these for us:

getaddrinfo("", "http", &hints, &res);
s = socket(res->ai_family, res->ai_socktype, res->ai_protocol);


bind() binds a socket to a specific local IP address and port. It is not always needed (for example in the client case). The client doesn’t usually care which port is used on its own side, so it can let the OS choose. For the server case it’s important because it defines the IP address and port the socket will listen to.

This information can be more easily provided via the struct returned from getaddrinfo(). By passing NULL as the host to getaddrinfo() and setting AI_PASSIVE in ai_flags, we’ll have this function fill the IP in res for us:

hints.ai_family = AF_INET6;
hints.ai_socktype = SOCK_STREAM;
// Important! Fill in my IP for me
hints.ai_flags = AI_PASSIVE;
// Use the address from localhost
getaddrinfo(NULL, "3490", &hints, &res);
sockfd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
bind(sockfd, res->ai_addr, res->ai_addrlen);


connect() is the function a client can use to indicate the desire to establish a connection with a given server. Similarly to bind() we can use the results from getaddrinfo():

getaddrinfo("", "3490", &hints, &res);
sockfd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
connect(sockfd, res->ai_addr, res->ai_addrlen);


Note how we didn’t need to call bind(). When connect() is called on an unbound socket, the OS fills in the local address and picks an unused local port for us.

listen() is the function a server calls to start listening on a specific socket and to define how many incoming connections can wait in the queue until they are accepted. Beej’s article mentions that most systems limit this number to about 20, but using 5 to 10 seems to work in practice.

accept() is the function that actually connects with a specific client. So that the original socket can keep listening to other incoming connections, accept() returns a new socket descriptor which will be used to send and receive messages.

send() / recv() are used to send and receive messages via the established connection. One important aspect is that while you specify the size of the data being sent, the API does not guarantee it will send all of it in one call, so you need to write a loop to make sure all data is sent/received.
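Such a loop could look like the sketch below, in the spirit of the sendall() helper from Beej's guide. The name send_all and the exact error handling are our own choices:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keeps calling send() until all `len` bytes of `buf` have been handed to
 * the kernel. Returns 0 on success, -1 on error (errno is set by send()). */
int send_all(int fd, const char *buf, size_t len) {
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: just retry */
            return -1;
        }
        sent += (size_t)n;     /* send() may have accepted only part of it */
    }
    return 0;
}
```

The receiving side needs a similar loop around recv(), since a single call may return fewer bytes than requested.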


For the server, the high-level sequence of calls for the API above is:

  • getaddrinfo()
  • socket()
  • bind()
  • listen()
  • accept()
  • recv()/send()

For the client we have:

  • getaddrinfo()
  • socket()
  • connect()
  • send()/recv()

Beej provides examples for server and client codes in C: server.c and client.c.
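As a rough sketch of the first steps of the server sequence (getaddrinfo(), socket(), bind() and listen()), consider the function below. It is a simplification under our own assumptions, not Beej's server.c: the name make_listener is hypothetical, and SO_REUSEADDR is set only to make restarts on the same port easier.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns a listening TCP socket bound to `port` on
 * all local addresses, or -1 on failure. */
int make_listener(const char *port, int backlog) {
    struct addrinfo hints, *res, *p;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */
    hints.ai_flags = AI_PASSIVE;      /* fill in my IP for me */
    if (getaddrinfo(NULL, port, &hints, &res) != 0)
        return -1;

    /* Try each candidate address until one can be bound and listened on. */
    for (p = res; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        int yes = 1;
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);
        if (bind(fd, p->ai_addr, p->ai_addrlen) == 0 &&
            listen(fd, backlog) == 0)
            break;                    /* success: keep this descriptor */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

The server would then loop on accept() using the returned descriptor and, as discussed earlier, fork() a child to call recv()/send() on each new connection.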


Beej has an entire guide dedicated to inter-process communication. The guide covers the basic concepts such as creating new processes via fork() and handling signals; synchronization mechanisms such as locks and semaphores; and communication mechanisms such as pipes, message queues and sockets. I like the conversational style of his writings.

I probably wrote code using sockets a long time ago. I didn’t have time to dig deep on this subject so I didn’t feel like I learned a ton. It was a good refresher nevertheless.


[1] Beej’s Guide to Unix Interprocess Communication
[2] Beej’s Guide to Network Programming

Related Posts

Content Delivery Network

In this post we’ll explore the basics of a Content Delivery Network (CDN) [1].


Here are the topics we’ll explore in this post.

  • Introduction
  • Advantages
  • Definitions
  • Caching
  • Routing


A Content Delivery Network, in a simplistic definition, is like the cache of the internet. It serves static content to end users. This cache is a bunch of machines that are located closer to the users than data centers.

Differently from data centers that can operate in a relatively independent manner, the location of CDNs is subject to the infrastructure of internet providers, which means a lot more resource/space sharing between companies is necessary and there are more regulations.

Companies often use CDNs via third-party companies like Akamai and Cloudflare, or, as it happens with large internet companies like Google, go with their own solution.

Next, let’s see the reasons why a company would spend money on CDN services or infrastructure.


CDNs are closer to the user. Due to smaller scale and partnerships with internet providers, CDN machines are often physically located closer to the user than a data center can be.

Why does this matter? Data has to traverse the distance between the provider and the requester. Being closer to the user means lower latency. It also means fewer hops to go through, meaning less required bandwidth [2].

Redundancy and scalability. Adding more places where data lives increases redundancy and scalability. Requests will not all go to a few places (data centers) but will first hit the more numerous and distributed CDNs.

Due to shared infrastructure, CDN companies can shield small companies from DDoS attacks by distributing traffic among their own servers, absorbing the requests from the attack.

Improve performance of TLS/SSL connections. For websites that use secure HTTP connections, it’s necessary to perform 2 round-trips, one to establish the HTTP connection, another to validate the TLS certificate. Since CDNs must connect to the data center to retrieve the static content, they can leverage that to keep the HTTP connection alive, so that subsequent connections from client to data centers can skip the initial round trip [3].

Before we go into more details on how CDNs work, let’s review some terminology and concepts.

Definitions and Terminology

Network edge. It is the part of the network a request traverses before reaching the data center. Think of the data center as a polygonal area and the edge as its perimeter/boundary.

Internet exchange points (IXP). It’s a physical location containing network switches where Internet Service Providers (ISPs) connect with CDNs. The term seems to be used interchangeably with PoP (Point of Presence). According to this forum, the latter seems to be outdated terminology [4]. The map below shows the location of IXPs in the world:

Locations of IXP in the world (source: Data Center Map)

Border Gateway Protocol (BGP). The BGP is used for routing traffic between different ISPs. To connect a source IP with its destination, the connection might need to rely on networks from a different provider than the original one.

CDN peering. CDN companies rely on each other to provide full world coverage. The cost of having a single company cover the entire world is prohibitive. This is analogous to connections traversing networks from multiple ISPs.

Origin server. To disambiguate between the CDN servers and the servers where the original data is being served from, we qualify the latter as origin server.

Modes of communication. There are four main modes [5]:

  • Broadcast – one node sends a message to all nodes in the network
  • Multicast  – one node sends a message to a subset of the nodes in the network (defined via subscription)
  • Unicast – one node sends a message to a specific node
  • Anycast – one node sends a message to another node, but there’s flexibility to which node it will send it to.

ISP Tiers. Internet providers can be categorized into 3 tiers based on their network. According to Wikipedia [6] there’s no authoritative source defining to which tier networks belong, but the tiers have the following definition:

  • Tier 1: can exchange traffic with other Tier 1 networks without paying for it.
  • Tier 2: can exchange some traffic for free, but pays for at least some portion of it.
  • Tier 3: pays for all its traffic.

Most ISPs are Tier 2 [7].


One of the primary purposes of a CDN is caching static content for websites [8]. The CDN machines act as a look-through cache, that is, the CDN serves the data if cached, or makes a request to the origin server behind the scenes, caches it and then returns it, making the actual data fetching transparent to the user. We can see that the CDN acts as a proxy server, or an intermediary which the client can talk to instead of the origin server.

Let’s consider the details of the case in which the data is not found in the cache. The CDN will issue an HTTP request to the origin server. In the response, the origin server can indicate in its HTTP header whether the file is cacheable with the following property:
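The look-through behavior can be illustrated with a toy in-memory cache. Everything here is a simplification under our own assumptions: the names cdn_cache, cdn_get and fake_origin are hypothetical, the origin is a local function rather than an HTTP request, and eviction is deliberately naive.

```c
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 8
#define MAX_LEN 64

struct cache_entry { char path[MAX_LEN]; char body[MAX_LEN]; int used; };
struct cdn_cache { struct cache_entry slots[CACHE_SLOTS]; int origin_hits; };

/* Stand-in for the HTTP request the CDN would send to the origin server. */
typedef void (*origin_fetch_fn)(const char *path, char *body, size_t len);

void fake_origin(const char *path, char *body, size_t len) {
    snprintf(body, len, "contents of %s", path);
}

/* Serves `path` from the cache if present; otherwise fetches it from the
 * origin, stores it, and returns it. The caller cannot tell the difference. */
const char *cdn_get(struct cdn_cache *c, const char *path, origin_fetch_fn fetch) {
    int free_slot = -1;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (c->slots[i].used && strcmp(c->slots[i].path, path) == 0)
            return c->slots[i].body;             /* cache hit */
        if (!c->slots[i].used && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        free_slot = 0;                           /* naive eviction: reuse slot 0 */
    struct cache_entry *e = &c->slots[free_slot];
    snprintf(e->path, sizeof e->path, "%s", path);
    fetch(path, e->body, sizeof e->body);        /* cache miss: ask the origin */
    e->used = 1;
    c->origin_hits++;
    return e->body;
}
```

With a zero-initialized cdn_cache, requesting the same path twice only increments origin_hits once: the second request is served from the cache.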

Cache-Control: public

For example, if we visit, we can inspect the network tab and look at the HTTP response header of a random JavaScript file:

we can see the line cache-control: public. In fact, if we look at the origin of this file, we see it’s coming from the domain which is Google’s own CDN.

Caching and Privacy

One of the important aspects of caching is to make sure privacy is honored. Consider the case of a private photo that is cached by a CDN. We need to make sure that it’s highly unlikely that someone without access will be able to see it.

Since CDNs store the static content but not the logic that performs access control, how can we make it privacy-safe? One option is to generate an image path that is very hard to reverse-engineer.

For example, the origin server could hash the image content using a cryptographic hash function and use that as the URL. Then, the corresponding entry in the CDN will have that name.

In theory, if someone has access to the URL they can see my private photo, but for this to happen I’d need to share the image URL, which is not much different from downloading the photo and sending it to them. In practice there’s an additional risk if one uses a public computer and happens to navigate to the raw image URL. The URL will be in the browser history even if the user logs out. For reasons like these, it’s best to use incognito mode in these cases.

To make it extra safe the server could generate a new hash every so often so that even if someone got hold of a URL, it will soon be rendered invalid, minimizing unintended leaks. This is similar to some strategies for two-factor authentication we discussed previously, where a generated code is only valid for a short time, so the system is vulnerable only very temporarily.


Because CDNs sit in between the client and the origin server, and due to their positioning, both physical and strategic, they can provide routing as well. According to this article from Imperva [9], this can mean improved performance by use of better infrastructure:

Much focus is given to CDN caching and FEO features, but it’s direct tier 1 network access that often provides the largest performance gains. It can revolutionize your website page load speeds and response times, especially if you’re catering to a global audience.

CDNs are often comprised of multiple distributed data centers, which they can leverage to distribute load and, as mentioned previously, protect against DDoS. In this context, we covered Consistent Hashing as a way to distribute load among multiple hosts which are constantly coming in and out of the availability pool.

CDNs can also rely on advanced routing techniques such as Anycast [10], which performs routing at the Network Layer, to avoid bottlenecks and single points of failure in a given piece of hardware.


I wanted to understand CDNs better than being “the cache of the internet”. Some key concepts were new to me, including some aspects of the routing and the ISPs tiers.

While writing this post, I realized I know very little of the practical aspects of the internet: How it is structured, its major players, etc. I’ll keep studying these topics further.


[1] Cloudflare – What is a CDN?
[2] Cloudflare – CDN Performance
[3] Cloudflare – CDN SSL/TLS | CDN Security
[4] Cloudflare –  What is an Internet Exchange Point
[5] Cloudflare – What is Anycast?
[6] Wikipedia – Tier 1 network
[7] Wikipedia – Tier 2 network
[8] Imperva – CDN Caching
[9] Imperva – Route Optimization
[10] Cloudflare – Load Balancing without Load Balancers

Domestic server using Raspberry Pi

There are tons of tutorials on setting up a domestic server using a Raspberry Pi. I’ll add one more to the mix by describing my experience and lessons learned in creating a simple server.

Raspberry Pi

Before starting, let’s first introduce some concepts and terminology. If you already know the basics of IP (Internet Protocol), feel free to skip to the next section (Hardware).


The scenario we’re using as an example is a typical home setup, in which we have a bunch of devices that are able to access the internet through a router. The connections between these devices and the router form a private network.

IP Address. The router is assigned a public IP address by the ISP (Internet Service Provider – e.g. Comcast, AT&T, Verizon, etc). This IP usually changes from time to time, so it’s also called dynamic IP.

An IP (IPv4) address is a 32-bit integer often written in 4 groups separated by dots. For example, The creators of the IP address didn’t envision such an explosive growth of the internet and we’re now running out of IPv4 addresses. With that in mind, a new version, called IPv6, was designed, which uses 128 bits. IPv6 is not fully deployed yet, and for the purpose of this tutorial we’ll assume IPv4 throughout.

Public and Private IP Addresses. Because of the shortage of IPv4 addresses, we don’t have the luxury of assigning a unique IP address to every possible device that exists. To work around that, only the router has a unique IP address (public IP). The devices in the local network are assigned what we call private IPs. While within one local network private IP addresses must be unique, they don’t have to be unique across multiple local networks (for example, my laptop might have the same private IP address as yours but my router will have a different public IP address than yours).

To avoid confusion and routing problems, the sets of public and private IP addresses are disjoint. The private IPs must fall within these 3 ranges: ( to, ( to and ( to
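A small check for whether an address is private could look like the sketch below. It assumes the three ranges are the ones reserved by RFC 1918 (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16), and the function name is_private_ipv4 is ours:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

/* Returns 1 if `text` is a private IPv4 address, 0 if it is a valid but
 * non-private one, and -1 if it cannot be parsed.
 * Assumption: "private" means the RFC 1918 ranges. */
int is_private_ipv4(const char *text) {
    struct in_addr a;
    if (inet_pton(AF_INET, text, &a) != 1)
        return -1;
    uint32_t ip = ntohl(a.s_addr);        /* back to host order for arithmetic */
    return (ip >> 24) == 10 ||            /* 10.0.0.0/8     */
           (ip >> 20) == 0xAC1 ||         /* 172.16.0.0/12  */
           (ip >> 16) == 0xC0A8;          /* 192.168.0.0/16 */
}
```

For instance, is_private_ipv4("192.168.1.10") returns 1, while is_private_ipv4("8.8.8.8") returns 0.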

It’s up to the router to assign the IP addresses to the devices in the local area network.

Let’s analyze two use cases: a computer acting as a client and another where it acts as a host.

Client. A computer in the private network mostly acts as a client to some remote server, for example, when we open an URL using a browser (aka agent) and receive back some HTML page.

Behind the scenes, first we need to resolve the URL to an actual IP address (the address of the host that will serve the HTML page). The operating system will request this by talking to a DNS server, which has a mapping from domains to IPs.

Then it executes an HTTP (or HTTPS) request to that remote address, using port 80 (or 443 for HTTPS). Since we need to receive the response back, we also need to send the source’s IP. The problem is that the IP of the sender is a private IP and is not supposed to be used externally. To work around that, the router will assign a random port, associate it with that private IP address, and send its own public IP address plus this random port, so that when it receives the response back it can route it to the right computer. Because it translates an internal IP to an external IP and vice-versa, we also say the router is a NAT (Network Address Translation) gateway.

Server. A less common scenario, and one which we explore in this tutorial, is when one computer in our private network serves as a host that external agents can talk to.
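The bookkeeping the router does can be pictured as a small translation table. The sketch below is a toy model with hypothetical names (nat_table, nat_outbound, nat_inbound). It hands out public ports sequentially from 40000 to keep the example deterministic, whereas a real router chooses them differently.

```c
#include <stdint.h>
#include <stdio.h>

#define NAT_MAX 16

struct nat_entry {
    char     private_ip[16];  /* dotted IPv4 of the device on the LAN */
    uint16_t private_port;    /* port used by the device */
    uint16_t public_port;     /* port the router exposes on its public IP */
};

struct nat_table {
    struct nat_entry entries[NAT_MAX];
    int      count;
    uint16_t next_port;       /* next public port to hand out */
};

void nat_init(struct nat_table *t) {
    t->count = 0;
    t->next_port = 40000;
}

/* Outbound packet: remember who sent it and return the public port that
 * replaces the private source port (0 if the table is full). */
uint16_t nat_outbound(struct nat_table *t, const char *ip, uint16_t port) {
    if (t->count == NAT_MAX)
        return 0;
    struct nat_entry *e = &t->entries[t->count++];
    snprintf(e->private_ip, sizeof e->private_ip, "%s", ip);
    e->private_port = port;
    e->public_port = t->next_port++;
    return e->public_port;
}

/* Inbound reply: find which device the public port belongs to. */
const struct nat_entry *nat_inbound(const struct nat_table *t, uint16_t public_port) {
    for (int i = 0; i < t->count; i++)
        if (t->entries[i].public_port == public_port)
            return &t->entries[i];
    return NULL;              /* no mapping: the router cannot route it */
}
```

A reply arriving on a port with no entry has nowhere to go, which is why a host inside the network needs an explicit rule before outsiders can reach it.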

In this case we need to register a domain to get a user-friendly URL that maps to our external IP. We also need to instruct the router how to route the request to the computer acting as host. Since the external agent doesn’t know about the internals of our network, only about the external IP address, we manually need to tell the router what to do, and we do this via port forwarding. When a request to our external IP is made with a specific port, we’ll have a rule which tells the router to forward the request to a specific private IP address.

Wikipedia has much more information on this subject. With this brief introduction, we’re ready to start the tutorial:


I got the Raspberry Pi Model B+ (back in 2014), but as of January 2017, there’s already the Raspberry 3 Model B with much better specs. For the record, my pi has the following specs:

* CPU: 700MHz Broadcom BCM2835, ARM architecture
* RAM: 512 MB SDRAM @ 400MHz
* 10/100 Ethernet RJ45 on-board network

As peripherals:

* Wi-fi USB card: Edimax 150Mbs
* SD card for storage: Samsung EVO 32GB of space and 48MB/s transfer.
* Zebra Case (see photo below)
* A micro USB power adapter (5V/2000mA)

All these totalled around $100.


OS. I decided to try the Raspbian OS, which is a fork of Debian (wheezy) adapted for Raspberry. We first download it and write the image to the SD card.

We can then insert the card into the Raspberry Pi and connect a monitor/keyboard/mouse. The boot process will ask us to fill in some information and should be straightforward. The wi-fi adapter worked out of the box.

SSH. After installing, it felt very sluggish to run the GUI, so I decided to do everything through SSH. Instructables has a very detailed guide for enabling it through the UI. After changing the password as instructed in the guide, I got the internal IP of the Pi using

hostname -I

I can then connect through:

ssh pi@<IP>

We can then install everything through command line.

Text editor and webserver. I installed my favorite editor and the server nginx (we could have used Apache’s HTTP server alternatively).

sudo apt-get update
sudo apt-get install emacs nginx

To test if the server is installed properly, we can run it out of the box:

sudo /etc/init.d/nginx start

If you put the Pi’s IP on the browser address bar in your laptop you should be able to see the default page served.

Server Configuration

We can make some changes in the default nginx config to ease development.

We can edit the configuration file /etc/nginx/sites-available/default (has to be sudo). The first thing I changed is for it to read files from my home folder instead of /usr/share/nginx/www

server {
    root /home/pi/www;
    # ... remaining directives unchanged ...
}

Since I’m using it for private purposes, I turned HTTP authentication on. First we need to register a login entry in a .htpasswd file using the htpasswd application. This can be obtained in Debian via

sudo apt-get install apache2-utils

Then we run:

sudo htpasswd -c /etc/nginx/.htpasswd <USER>

replacing <USER> with the username you’ll provide at login. When running the command above, you’ll be prompted to provide the password. The user name and a hashed password will be saved to /etc/nginx/.htpasswd. Now we can configure nginx to use credentials from that file to perform the access check:

server {
    auth_basic "Private Site";
    auth_basic_user_file /etc/nginx/.htpasswd;
    # ... remaining directives unchanged ...
}

We can now add a custom HTML file to /home/pi/www (or whatever path you put in the nginx config), such as /home/pi/www/index.html

<html>
  <head><title>Pi's webpage</title></head>
  <body>Hello world</body>
</html>

Restart the server and reload the page, and you should get the new custom page!

sudo /etc/init.d/nginx restart

In a future post we’ll see how to work with a Node.js server, but this is as far as we’ll go in this first tutorial.

Network Configuration

Static Internal IP. To make sure the internal IP of the Pi doesn’t keep changing you might need to configure your router. My router is a MediaLink 300N, which stores a table of MAC addresses (a unique identifier for your hardware) to internal IPs automatically so I don’t have to do anything.

Static External IP. The remaining problem is your external IP. Unless you have asked for static IP, chances are that your ISP (mine is Comcast) will change the external IP from time to time, so you don’t have much control over that.

Dynamic DNS. To solve that, first we need to get a domain (I registered a new one via Google domains). You can configure it to point to a specific IP (your external IP) which will be stored in a DNS. The problem, as we said, is that your external IP might change, so we need to update the mapping periodically.

We don’t want to do this manually, so we can use a system like ddclient which runs a daemon on your server machine (the Pi in our case) that will periodically check the external IP and update the DNS entry with the new IP in case it has changed.

To install we can simply do

sudo apt-get install ddclient

We then need to configure it so it knows where to go to update the entry. The file lives in /etc/ddclient.conf (need to edit as sudo). The configuration will depend on what is your domain provider. For google domains it will look like:


There’s a good tutorial for how to setup ddclient using Google domains.

To run the daemon, we can do

sudo ddclient -debug

Port Forwarding. When we access an URL like, it’s implicitly assuming port 80 when routing that domain to the actual IP. We’ll need to tell our router to forward that request to a specific internal IP (otherwise how does it know whether it should go to your laptop or the pi?). Most routers offer a way to perform this mapping, which is also known as port forwarding.

For my MediaLink router it’s under Advanced Settings > Virtual Server > Port Range Forwarding.

NOTE: There’s one current issue I haven’t been able to figure out. The port forwarding seems to only work if I access it from outside of my local network, that is, through my phone network or via some VPN. It might be some issue with MediaLink.


In this post we learned some details of the Internet Protocol and learned how to configure a Raspberry Pi to act as a server in a domestic network.


[1] How-To Geek – How to Forward Ports on your Router.
[2] Wikipedia – IP address
[3] Server Fault – How does the HTTP GET method work in relation to DNS protocol?
[4] Wikipedia – Classful network
[5] Page to test if a specific port is open and being forwarded
[6] Setting up a nginx server on a Raspberry Pi

HTTP Basics

In this post we’ll cover basic concepts of the HTTP protocol and its variants, including HTTP over TLS, commonly known as HTTPS. We’ll also talk about related features such as cookies and the next generation of HTTP.



HTTP is the acronym for Hypertext Transfer Protocol [1]. It’s an application level protocol, intended to be used between a client and a server. A common use case is a web browser acting as a client and a remote machine providing the contents of a given website acting as the server.

It assumes a reliable underlying transport level protocol, and this is often TCP (we talked a bit about transport protocols in a previous post).

The original version of HTTP was defined in RFC1945 in 1996. The most popular version currently is HTTP/1.1 (RFC2068, in 1997). The HTTP/2.0 spec was recently finished (RFC7540 in 2015).

The protocol consists of two parts: first, the client sends a request to a server. Then, the server sends a response back.

HTTP Request

When a client wants to connect to a server, it creates a request message, which has a common format:

GET /index.html HTTP/1.1
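A client could assemble such a message as in the following sketch. The function name build_get_request is hypothetical. The Host header is included because HTTP/1.1 requires it, and Connection: close is our own choice to keep the exchange simple.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes a minimal HTTP/1.1 GET request for `path` on `host` into `out`.
 * Returns the number of bytes written, or -1 if the buffer is too small. */
int build_get_request(char *out, size_t outlen,
                      const char *host, const char *path) {
    int n = snprintf(out, outlen,
                     "GET %s HTTP/1.1\r\n"   /* request line */
                     "Host: %s\r\n"          /* mandatory in HTTP/1.1 */
                     "Connection: close\r\n"
                     "\r\n",                 /* blank line ends the headers */
                     path, host);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return n;
}
```

The resulting buffer is what would be written to a connected TCP socket with send().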

There are specific commands a requester can send, defined in the HTTP/1.1 specification, including:

* GET – should be used to request data. GET requests should not have major side effects, for example writing to databases, though bumping a counter to track the number of visitors is fine. This is only a guideline and is not enforced. It’s up to the application to protect against malicious GET requests.

* HEAD – similar to GET, but the response should only contain the head of the response, without the body

* POST – should be able to cause side effects. Oftentimes the results of forms are sent to the server as a POST request. Additional data is sent in the request message body, as opposed to GET requests, where data is sent as part of the URL.

* OPTIONS – returns the list of available HTTP commands this server implements.

* PUT – replaces the content of the address specified by the URI.

* DELETE – deletes the content of the address specified by the URI.

* TRACE – echoes back the request message received by the server

GET and POST are the most common commands in the context of Browser/Server communication. Methods such as PUT and DELETE can be seen in applications like Elasticsearch.

We can test some of these commands using the curl command line:

> curl -X GET
HTML contents

If we try TRACE, we get an error back (405):

> curl -X TRACE
Error 405 (Method Not Allowed)!!1

HTTP Response

After the server processes the request, it will return a response. The first line of the response is the status line, with the status code and a textual reason phrase (note that these phrases are not standardized, so clients should not rely on them).

The most common status line is

HTTP/1.1 200 OK


HTTP/1.1 404 File not found

The HTTP specification defines 5 groups of response status code, based on the response nature. The status code always has 3 digits, and the first digit represents the group:

* Informational 1XX
* Successful 2XX
* Redirection 3XX
* Client Error 4XX
* Server Error 5XX
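Since the group is determined by the first digit, mapping a code to its group is a one-liner. A sketch, with the hypothetical function name status_class:

```c
/* Maps a 3-digit HTTP status code to the name of its group, based on the
 * first digit. Codes outside 100..599 are reported as "Unknown". */
const char *status_class(int code) {
    switch (code / 100) {
    case 1:  return "Informational";
    case 2:  return "Successful";
    case 3:  return "Redirection";
    case 4:  return "Client Error";
    case 5:  return "Server Error";
    default: return "Unknown";
    }
}
```

For example, status_class(404) returns "Client Error" and status_class(200) returns "Successful".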

An example response [1]:

Date: Mon, 23 May 2005 22:38:34 GMT
Server: Apache/ (Unix) (Red-Hat/Linux)
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
ETag: "3f80f-1b6-3e1cb03b"
Content-Type: text/html; charset=UTF-8
Content-Length: 138
Accept-Ranges: bytes
Connection: close

  An Example Page

  Hello World, this is a very simple HTML document.

The first lines represent the response header. After a blank line follows the response body.
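A client can use this layout to pull the status code out of the first line and to locate the body after the blank line. The two helpers below are a minimal sketch with names of our choosing. They assume the whole response is already in memory as a NUL-terminated string with CRLF line endings.

```c
#include <stdio.h>
#include <string.h>

/* Returns the status code from a status line such as "HTTP/1.1 200 OK",
 * or -1 if the response does not start with one. */
int parse_status_code(const char *response) {
    int code;
    if (sscanf(response, "HTTP/%*d.%*d %d", &code) != 1)
        return -1;
    return code;
}

/* Returns a pointer to the body (the text after the first blank line),
 * or NULL if the header section is not terminated. */
const char *find_body(const char *response) {
    const char *sep = strstr(response, "\r\n\r\n");
    return sep ? sep + 4 : NULL;
}
```

A real client would read from the socket incrementally and use Content-Length to know when the body is complete, which this sketch leaves out.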


HTTPS was originally specified over SSL (Secure Sockets Layer), but SSL has security flaws and has since evolved into a more robust layer, TLS, which stands for Transport Layer Security.

Motivation. One problem with HTTP is that the request/response are exchanged over an unsecured network. It’s possible for an attacker to intercept a user’s connection with the server and to have access to both requests and responses. This is a common form of attack known as a Man-in-the-middle attack.

By adding an encryption layer, the entire HTTP message can be encrypted using TLS. The idea behind TLS is the following [6]:

* The client sends a connection request to the server, providing a list of ciphers it can use (e.g. RSA) and a list of ways it can hash the data for integrity checks (e.g. MD5 and SHA-1).

* The server responds with a digital certificate and its public key.

* In one implementation, the client generates a random number associated with the current session, encrypts it using the server’s public key, and sends it to the server. Both client and server now have a shared key to encrypt and decrypt data, so they can use a symmetric-key cipher such as AES (Advanced Encryption Standard).

The advantage of this over encrypting every message directly with the server’s public key (as in a plain RSA implementation) is that each session gets its own key. This adds some protection in the scenario in which someone saves all the encrypted messages and some day manages to steal the server’s private key. With plain RSA, they would be able to decrypt all the stored messages; with a unique key per session, the attacker also needs to have stored the initial message containing the session key and to associate that key with the right messages. Note that this is weaker than forward secrecy in the strict sense: an attacker who did record that initial message can still recover the session key, which is why ephemeral Diffie-Hellman key exchanges are used when real forward secrecy is required.

The digital certificate is issued by a Certificate Authority (CA), a commonly trusted third party such as Symantec, which attests to the site’s identity. The structure of the certificate is defined by the X.509 standard.

Browsers ship with a pre-installed list of trusted CAs, so when receiving a certificate from a trusted CA, the browser can look at the CA’s signature (which was encrypted with the CA’s private key) and decrypt it using the CA’s public key. The decrypted signature should match the information the browser has.

For untrusted CAs, it’s up to the user to decide whether to trust a particular CA.

In Chrome, to inspect the certificates, we can go to Settings… > HTTPS/SSL > Manage Certificates… (on Mac it will open the Keychain application). In the Keychain app, under System Roots and the Certificates category, we can see a list of trusted CAs, like the one below, from VeriSign:

Sample Certificate

Let’s Encrypt. In an ideal world, all HTTP connections would be secure, so no one could eavesdrop while you’re browsing the web. Companies like Google encourage the use of HTTPS by having Chrome use it whenever possible and by boosting the rank of websites that support it.

One major obstacle for this is having to rely on a non-free and complicated procedure with third-party CAs. To address this problem, recently the Internet Security Research Group (ISRG) proposed a new way to issue certificates for free.

The key is to simplify the process of proving that a given agent owns a given domain. For example, Let’s Encrypt (LE) will ask the agent to perform an action only the domain owner can do, such as putting a file at a given path under it [7].

LE will then do a regular HTTPS request to get that file. Note that LE doesn’t have a trusted certificate for that domain, but it doesn’t need one at this initial stage.

In addition, LE needs the agent’s public key and needs to validate it. This is simple: LE gets the agent’s public key, generates a random string and encrypts it with the agent’s public key. The agent is able to decrypt the message using its own private key, and then encrypts it again using LE’s public key. It finally sends it back, and LE can decrypt it in turn. If the resulting string is the same one LE sent originally, it will associate the agent’s public key with the domain.

Now LE can issue certificates for that particular domain. Major browsers already trust LE as a Certificate Authority, so this requires no extra work from the agent.

HTTP Cookies

One of the characteristics of HTTP requests is that they’re stateless. That means that in theory an HTTP request is independent from the previous HTTP request. One way to simulate a state is having the browser and server pass (meta)data around carrying state information. This extra data is basically implemented as an HTTP cookie.

The term cookie came from magic cookie (another programming term), which came from fortune cookie

Cookies are part of the HTTP protocol. A server can send a cookie to the browser in the HTTP response’s header:

HTTP/1.0 200 OK
Set-Cookie: lu=Rg3vHJZnehYLjVg7qi3bZjzg; Expires=Tue, 15-Jan-2013 21:47:38 GMT; Path=/;; HttpOnly
Set-Cookie: made_write_conn=1295214458; Path=/;
Set-Cookie: reg_fb_gate=deleted; Expires=Thu, 01-Jan-1970 00:00:01 GMT; Path=/;; HttpOnly

In the example above, the server is sending three cookies back. The first part of each cookie is an expression of the form name=value. The rest are attributes such as the expiration date and path + domain. Path and domain let the client know it should send these cookies with requests whose URLs match that path and domain.

Once the cookie is set on the client, subsequent requests to the server will contain the cookie information. For example:

GET /spec.html HTTP/1.1
Cookie: made_write_conn=1295214458; reg_fb_gate=deleted
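
On the server side, reading a cookie back boils down to scanning the value of the Cookie header for a name=value pair. Below is a minimal sketch in C, assuming the pairs are separated by "; " as in the request above. The function cookie_lookup() is a hypothetical helper written for this example, and it ignores corner cases such as quoted values.

```c
#include <stddef.h>
#include <string.h>

/* Looks up `name` in the value of a Cookie header ("a=1; b=2").
   Copies the cookie's value into `out` (NUL-terminated, truncated to
   fit) and returns 1 if found, 0 otherwise. */
int cookie_lookup(const char *cookies, const char *name,
                  char *out, size_t out_len) {
    size_t name_len = strlen(name);
    const char *p = cookies;

    if (out_len == 0)
        return 0;
    while (*p != '\0') {
        while (*p == ' ' || *p == ';')      /* skip separators */
            p++;
        const char *end = strchr(p, ';');   /* end of this pair */
        if (end == NULL)
            end = p + strlen(p);
        /* The pair must start with exactly `name` followed by '='. */
        if ((size_t)(end - p) > name_len &&
            strncmp(p, name, name_len) == 0 && p[name_len] == '=') {
            size_t value_len = (size_t)(end - p) - name_len - 1;
            if (value_len >= out_len)
                value_len = out_len - 1;
            memcpy(out, p + name_len + 1, value_len);
            out[value_len] = '\0';
            return 1;
        }
        p = end;
    }
    return 0;
}
```

With the header value from the request above, looking up reg_fb_gate copies "deleted" into the output buffer, while a name that was never set returns 0.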

Types. Cookies can be named based on their characteristics:

* Session and Persistent cookies: when a cookie doesn’t have the expiration time set, it lasts only for the current session. That is, the cookie is gone once the user closes the browser. As opposed to a session cookie, a persistent cookie has the expiration time attribute set and will last until that time.

* Secure cookie: has the Secure attribute. It indicates that the browser should only send the cookie over a secure connection (usually HTTPS).

* HTTP-Only cookie: has the HttpOnly attribute. It indicates that the cookie should only be transmitted via HTTP. This tells the browser to block non-HTTP APIs (such as JavaScript APIs) from accessing the cookie.

* First-party and Third-party cookies: if the Domain/Path attributes in the cookie are different from the server address, it’s called a third-party cookie. This can be used for user tracking. For example, an ads agency can partner with several websites so they deliver cookies for this agency. In this way, the ads agency can track user activity across different sites.

This is a bit controversial. In Chrome there is an option to disallow third-party cookies.

Settings > Show Advanced Settings… > Content Settings… > Block third-party cookies and site data

A lot of websites use cookies for creating authenticated sessions. It’s even more important to only use HTTPS connections in this scenario, because cookies are sent as plain text in the HTTP request header. There are many attacks that can be performed exploiting cookies:

Man-in-the-middle attacks. These can be used on a LAN or a public Wi-Fi network to hijack a cookie, by intercepting the HTTP requests and obtaining the cookies, as explained in detail in [9].

DNS Poisoning. Since browsers use the Domain/Path to decide whether to send a cookie in a request, attackers can hack the DNS server to make the domain specified in the cookie point to the attacker’s server, so the browser would send the cookies to the attacker. If it’s an HTTPS connection, the request wouldn’t go through because the attacker won’t have a valid certificate.

Cross-site Scripting. The server might contain HTML poisoned with malicious JavaScript code which has access to cookies, and could send those as plain text to an attacker server:

<a href="#">Click here!</a>

This would work even if the site we got the HTML from had a secure connection. This attack can be prevented if the cookies containing sensitive information have the HttpOnly attribute.


SPDY is a protocol created by Google aiming to improve the performance of HTTP requests. The overview provided in their draft is very descriptive:

One of the bottlenecks of HTTP implementations is that HTTP relies on multiple connections for concurrency. This causes several problems, including additional round trips for connection setup, slow-start delays, and connection rationing by the client, where it tries to avoid opening too many connections to any single server. HTTP pipelining helps some, but only achieves partial multiplexing.

SPDY adds a framing layer for multiplexing multiple, concurrent streams across a single TCP connection (or any reliable transport stream). The framing layer is optimized for HTTP-like request-response streams, such that applications which run over HTTP today can work over SPDY with little or no change on behalf of the web application writer.

The SPDY session offers four improvements over HTTP:

* Multiplexed requests: There is no limit to the number of requests that can be issued concurrently over a single SPDY connection.

* Prioritized requests: Clients can request certain resources to be delivered first. This avoids the problem of congesting the network channel with non-critical resources when a high-priority request is pending.

* Compressed headers: Clients today send a significant amount of redundant data in the form of HTTP headers. Because a single web page may require 50 or 100 subrequests, this data is significant.

* Server pushed streams: Server Push enables content to be pushed from servers to clients without a request.

HTTP/2 is inspired by SPDY’s ideas. The majority of browsers already support the HTTP/2 protocol, though only a bit over 6% of websites use it as of January 2016.


While reading the material for this post, we’ve learned a lot of things, including

* Details of TLS, plus history of SSL and TLS
* Symmetric key encryption
* Man in the middle attacks
* Several network interception tools
* Forward secrecy
* Third-party HTTP cookies

Regarding encryption algorithms, I was familiar with RSA, and heard about elliptic curve encryption, though I have no idea how they work. I’m interested in learning more about the elliptic curve Diffie-Hellman algorithm.

There are also several topics we didn’t cover, like HTTP pipelining or general web attacks such as Heartbleed. This Wikipedia list is an interesting follow-up reading.

Overall it was very interesting to read about internet security.


[1] Wikipedia – Hypertext Transfer Protocol
[2] Wikipedia – HTTPS
[3] Wikipedia – Transport Layer Security
[4] Wikipedia – Advanced Encryption Standard
[5] Wikipedia – X.509 Standard
[6] Microsoft – Overview of SSL/TLS Encryption
[7] Let’s Encrypt – Technical Overview
[8] Wikipedia – HTTP cookie
[9] Grey Hats Speak – Sniffing HTTP On A LAN, MITM Attack
[10] Wikipedia – SPDY
[11] Wikipedia – HTTP/2

Haskell Basic Networking

This post is a set of notes from Chapter 27 of Real World Haskell. In this chapter, the authors discuss basic network programming using Haskell. It presents two simple client-server communication examples: one using UDP and the other TCP.

In this post, we’ll start by reviewing some basic computer network concepts and then comment on different parts of the examples presented in the book.


The Transport Layer

The communication between two computers is often organized in multiple layers, following the OSI model standard. One of the layers is the transport layer. This layer is responsible for transferring data from a source to a destination, offering different levels of guarantees. The most famous transport layer protocols are UDP and TCP.

UDP stands for User Datagram Protocol and TCP for Transmission Control Protocol.


UDP provides a lightweight abstraction to send data from one host to another, by sending pieces of information, called datagrams, one at a time. According to [2]:

A datagram is an independent, self-contained message sent over the network whose arrival, arrival time, and content are not guaranteed.

Because of this, we have no guarantee that the packets will arrive in order or that they will arrive at all. UDP uses a checksum to verify whether a given packet that arrived at the host was corrupted.
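
To give an idea of how such a checksum works, below is a minimal C sketch of the 16-bit one’s complement sum used by the Internet protocols. It only shows the arithmetic: the real UDP checksum is also computed over a pseudo-header with the IP addresses, which is left out here. The function name internet_checksum() is chosen for this example, and the code assumes inputs no larger than a single datagram.

```c
#include <stddef.h>
#include <stdint.h>

/* Computes the one's complement of the one's complement sum of the
   data, taken as a sequence of big-endian 16-bit words. */
uint16_t internet_checksum(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (i < len)                     /* odd length: pad with a zero byte */
        sum += (uint32_t)(data[i] << 8);
    while (sum >> 16)                /* fold the carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```

The receiver runs the same computation over the data plus the transmitted checksum. If nothing was corrupted the result is 0; any other value means the packet should be discarded.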

TCP offers more guarantees than UDP, at the cost of more overhead. It first establishes a connection between the client and the server and then sends TCP segments. Within a connection, the TCP implementation on the server is able to sort the segments in the order they were sent by the client. Also, the sender can retransmit segments if it doesn’t receive confirmation.

Network sockets

A network socket is the endpoint of inter-process communication between computers in a network.

Socket types include:

* Datagram, which uses the User Datagram Protocol (UDP)
* Stream, which uses the Transmission Control Protocol (TCP) or Stream Control Transmission Protocol (SCTP).
* Raw sockets, which bypass the transport layer.

Unix-based systems use the Berkeley sockets API which uses file descriptors (integers) to identify a socket.

Client-server using UDP

Let’s study the code. As the authors mention [1], the functions provided by the Network.Socket module correspond to the low-level functions in C, so we can refer to those for documentation.

The getaddrinfo() function

The getaddrinfo() function takes a node (hostname), a service (port) and a set of hints flags as inputs and returns a list of structures called addrinfo as output. It will try to find all the addresses matching the constraints provided from the inputs.

There are two modes we’re interested in here: listening and publishing. For the listening mode, we can provide a flag AI_PASSIVE to the hints flags and a null value to node. According to the man page:

If the AI_PASSIVE flag is specified in hints.ai_flags, and node is NULL, then the returned socket addresses will be suitable for bind(2)ing a socket that will accept(2) connections

In Haskell we’re doing exactly that for the server:

addrinfos <- getAddrInfo
               (Just (defaultHints {addrFlags = [AI_PASSIVE]}))
               Nothing (Just port)

For the publishing mode, the docs say:

If the AI_PASSIVE flag is not set in hints.ai_flags, then the returned socket addresses will be suitable for use with connect(2), sendto(2), or sendmsg(2)

In our client code we then do:

addrinfos <- getAddrInfo
               Nothing
               (Just hostname)
               (Just port)
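
Since the Haskell functions mirror the C ones, it can help to see the two modes in C as well. The sketch below is only an illustration: the helper count_udp_addresses() is hypothetical, and it uses AI_PASSIVE whenever the node is NULL, matching the listening mode described above.

```c
#define _POSIX_C_SOURCE 200112L  /* expose getaddrinfo() under strict -std modes */
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolves the addresses for a UDP endpoint. Pass node == NULL for the
   listening mode. Returns the number of addresses found, or -1 if
   getaddrinfo() failed. */
int count_udp_addresses(const char *node, const char *service) {
    struct addrinfo hints, *res, *p;
    int count = 0;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_DGRAM;   /* UDP */
    if (node == NULL)
        hints.ai_flags = AI_PASSIVE;  /* addresses suitable for bind() */

    if (getaddrinfo(node, service, &hints, &res) != 0)
        return -1;
    for (p = res; p != NULL; p = p->ai_next)  /* walk the linked list */
        count++;
    freeaddrinfo(res);
    return count;
}
```

A server would call count_udp_addresses(NULL, "1514") and a client something like count_udp_addresses("localhost", "1514"). In real code, instead of counting, we would pick one of the entries (the Haskell examples take the first) and pass its address to bind() or sendto().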

The socket() function

A socket is like a network file descriptor. The socket() function takes the address family (domain), the type of socket and the protocol. It’s not obvious from the docs what this protocol refers to, except that 0 selects the default protocol, which depends on the address family (first parameter) and the socket type. Despite what one might guess, it is not the application layer protocol (e.g. HTTP, POP3): it identifies a specific protocol within the family, such as UDP or TCP (see the discussion in [3]).

Since we’re going to use UDP, the arguments passed to the socket function in our Haskell code are:

sock <- socket
          (addrFamily serveraddr)
          Datagram defaultProtocol
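
The role of the protocol argument is easier to see in C. In the sketch below (the helper socket_type_of() is hypothetical), we create a socket with the given arguments and ask the OS which type it ended up with. The protocol constants are transport-level ones such as IPPROTO_UDP and IPPROTO_TCP, not application protocols.

```c
#define _POSIX_C_SOURCE 200112L  /* expose POSIX socket declarations */
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Opens a socket and returns its type as reported by the OS
   (SOCK_DGRAM, SOCK_STREAM, ...), or -1 if it could not be created. */
int socket_type_of(int family, int type, int protocol) {
    int actual = -1;
    socklen_t len = sizeof actual;
    int fd = socket(family, type, protocol);  /* a socket is just an int fd */

    if (fd < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &actual, &len) < 0)
        actual = -1;
    close(fd);
    return actual;
}
```

Both socket_type_of(AF_INET, SOCK_DGRAM, 0) and socket_type_of(AF_INET, SOCK_DGRAM, IPPROTO_UDP) give a datagram socket, since UDP is the default for that combination. Asking for a mismatched pair such as SOCK_DGRAM with IPPROTO_TCP makes socket() fail, at least on Linux.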

Server: Listening

With the socket file descriptor, we can bind an address to it using the bind function. It takes a socket file descriptor and the address and returns 0 on success or -1 on error.

To receive the messages, we use the recvfrom() function, which takes the socket, the maximum size of the packet and will return the message and the address of the sender. In the Haskell version, we have recvFrom implemented in Network.Socket. The documentation has the following warning though:

Do not use the send and recv functions defined in this module in new code, as they incorrectly represent binary data as a Unicode string. As a result, these functions are inefficient and may lead to bugs in the program. Instead use the send and recv functions defined in the ByteString module.

We can use the ByteString version by doing

import Network.Socket hiding (send, sendTo, recv, recvFrom)
import Network.Socket.ByteString

We also need to update all the places where we use String to use ByteString instead.

Client: Sending data

From the client side, we can use the sendto() function, providing the socket file descriptor, the data and the address of the server. The function will return the number of bytes sent.

In our Haskell code, we have

sendTo (slSocket syslogh) omsg (slAddress syslogh)

Where slSocket gets the socket, omsg is the message, and slAddress gets the server address. This call might not send the entire message at once, so we have to keep calling this function until the message is completely sent.
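
For reference, here is how the underlying C calls (bind(), sendto() and recvfrom()) fit together. This is a self-contained sketch rather than a port of the book’s code: it runs the "server" and the "client" in the same process over the loopback interface, lets the OS pick a free port instead of using 1514, and the function name udp_loopback_roundtrip() is made up for this example.

```c
#define _POSIX_C_SOURCE 200112L  /* expose POSIX socket declarations */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends `msg` as one UDP datagram from a client socket to a server
   socket on 127.0.0.1. Copies what the server received into `out` and
   returns the number of bytes received, or -1 on error. */
ssize_t udp_loopback_roundtrip(const char *msg, char *out, size_t out_len) {
    struct sockaddr_in addr, from;
    socklen_t addr_len = sizeof addr, from_len = sizeof from;
    struct timeval timeout = {2, 0};  /* don't wait forever for a lost datagram */
    ssize_t received = -1;
    int server, client;

    if (out_len == 0)
        return -1;
    server = socket(AF_INET, SOCK_DGRAM, 0);
    client = socket(AF_INET, SOCK_DGRAM, 0);
    if (server < 0 || client < 0)
        goto done;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);         /* 0 = let the OS pick a free port */

    /* Server: bind the address, then find out which port we got. */
    if (bind(server, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto done;
    if (getsockname(server, (struct sockaddr *)&addr, &addr_len) < 0)
        goto done;
    setsockopt(server, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof timeout);

    /* Client: send the data to the server's address. */
    if (sendto(client, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof addr) < 0)
        goto done;

    /* Server: receive the message and the address of the sender. */
    received = recvfrom(server, out, out_len - 1, 0,
                        (struct sockaddr *)&from, &from_len);
    if (received >= 0)
        out[received] = '\0';

done:
    if (server >= 0) close(server);
    if (client >= 0) close(client);
    return received;
}
```

Note that no connection is established at any point: the client simply addresses each datagram to the server, and the server learns who sent it from the address filled in by recvfrom().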


After trying to run the code above for the client and server, I was not able to get the server to print out the messages sent from the client on Mac OS X. My first suspicion was that the server code had some missing configuration or bug.

I’ve tried using netcat, a tool for reading or writing to network connections via UDP or TCP. To listen to port 1514 using UDP we can do it by running:

nc -u -l -k 1514

The u flag indicates we’re using UDP (default is TCP). The l flag indicates we’re listening instead of sending, and k tells netcat not to disconnect after the client disconnects. So we now basically have a simple server on localhost:1514.

I’ve made a binary for the syslogclient.hs code example, by simply adding a main function and compiling it using ghc:

main = do
  message <- getLine
  h <- openlog "localhost" "1514" "syslogclient"
  syslog h USER INFO message
  closelog h

When running:

$ ghc syslogclient.hs
$ ./syslogclient 
hello world

I didn’t see any output from the netcat side. The next test was verifying if the client code had an issue. I took a similar approach with the syslogserver.hs code, adding the main function and generating a binary:

main = do
  putStrLn "Starting server...\n"
  serveLog "1514" plainHandler

Then started the server up:

$ ghc syslogserver.hs
$ ./syslogserver 

This time I used netcat to send the message using UDP. The command I ended up using was

echo "hello world" | nc -4u localhost 1514

As in the listening mode, the u flag here tells netcat to use the UDP protocol and 4 forces it to use IPv4 only. And this finally worked!

At this point there were a couple of questions hanging: what configuration is missing from the client code and why the server only displays the message if I force it to use IPv4 addresses?

Trace. One tool I’ve been missing in Haskell is the ability to print the values of variables at specific points in the code. I’ve found on StackOverflow an interesting discussion which points out Debug.Trace.trace as a simple function to do this.

It’s an impure function and its output depends on when lazy evaluation actually forces it, so it’s recommended only for debugging purposes. It can be used in a neat way. Say we have a function

someFunction x y = x + y 

and we want to print the contents of x and y during runtime. We can just add one line, with minimal modification to existing code:

someFunction x y | trace ("Value of x: " ++ show x ++ " and y: " ++ show y) False = undefined
someFunction x y = x + y

Because trace prints its first argument and returns the second, this is basically equivalent to writing

someFunction x y | False = undefined
someFunction x y = x + y

The guard in the first equation is evaluated first (printing the message), but since it returns False, we end up executing the second equation of someFunction. Another option is to create a standalone function to print a given value. For example,

printValue x = trace ("Value of x: " ++ show x) x

With this trick in our toolkit, we can inspect which addresses are returned by getAddrInfo() in the client by adding

traceAddrs :: [AddrInfo] -> [AddrInfo]
traceAddrs addrs = trace (intercalate ", " (map (show . addrAddress) addrs)) addrs

When running the client code again, we get the following output:

[::1]:1514, [::1]:1514,,

The first two values, “::1“, represent an IPv6 address (0000:0000:0000:0000:0000:0000:0000:0001). According to Wikipedia,

Consecutive sections of zeroes are replaced with a double colon (::). The double colon may only be used once in an address, as multiple use would render the address indeterminate
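
We can check this expansion with the standard inet_pton() function, which parses the textual form of an address into its 16 raw bytes. The sketch below is only an illustration, and the helper name expand_ipv6() is made up for this example.

```c
#define _POSIX_C_SOURCE 200112L  /* expose inet_pton() under strict -std modes */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/socket.h>

/* Writes the fully expanded form of an IPv6 address into `out`, which
   must hold at least 40 bytes (8 groups of 4 digits, 7 colons, NUL).
   Returns 1 on success, 0 if `text` is not a valid IPv6 address. */
int expand_ipv6(const char *text, char *out, size_t out_len) {
    struct in6_addr addr;
    int i;

    if (out_len < 40)
        return 0;
    if (inet_pton(AF_INET6, text, &addr) != 1)
        return 0;
    for (i = 0; i < 8; i++) {
        /* Each group is two bytes, stored in network (big-endian) order. */
        snprintf(out + 5 * i, 6, "%02x%02x%s",
                 addr.s6_addr[2 * i], addr.s6_addr[2 * i + 1],
                 i < 7 ? ":" : "");
    }
    return 1;
}
```

Calling expand_ipv6("::1", ...) produces 0000:0000:0000:0000:0000:0000:0000:0001, while an address with two double colons, such as "1::2::3", is rejected because it is ambiguous.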

Since we pick up the first address returned by getAddrInfo, we’re using IPv6 to connect to the server. We can force it to use IPv4 by passing the AF_INET flag:

addrinfos <- getAddrInfo
    -- set it to use IPv4
    (Just (defaultHints {addrFamily = AF_INET}))
    (Just hostname)
    (Just port)

We can now run the client and send a message, and it will successfully be sent to the server.

Doing a similar investigation on the server code, we get:

,, [::1]:1514, [::1]:1514,

Since we’re picking the head of the list, the server is actually listening on an IPv4 address. We can force it to use IPv6 by passing the AF_INET6 flag.

addrinfos <- getAddrInfo
    (Just (defaultHints {addrFlags = [AI_PASSIVE], addrFamily = AF_INET6}))
    Nothing (Just port)

Now the server can listen to requests of both IPv4 and IPv6 clients. Mystery solved!

Client-server using TCP

Server: Multi-threaded Listening

There are a couple of differences between the TCP and UDP server.

1. The socket type we use is Stream instead of a Datagram.

2. We call the listen function, which marks the socket as accepting connections. The second argument is the maximum size of the connection queue:

listen sock 5

3. Instead of recvFrom(), we then call accept, which picks the first of the pending connections in the queue, and creates a new socket. The server then spawns a new thread to handle that socket, so that the main thread can continue processing more connections.

procRequests :: MVar () -> Socket -> IO ()
procRequests lock mastersock = do
  (connsock, clientaddr) <- accept mastersock
  forkIO $ procMessages lock connsock clientaddr
  procRequests lock mastersock

4. Use a file handle instead of a socket. Because we keep a sticky connection, we can use a file handle to abstract the reading from the socket.

Each thread reads the messages from the connection:

-- Converts a socket (connsock) to a handle
connhdl <- socketToHandle connsock ReadMode
-- Set handle to buffering mode
hSetBuffering connhdl LineBuffering
-- Read contents
messages <- hGetContents connhdl
-- Print messages
mapM_ (handle lock clientaddr) (lines messages)
-- Close connection
hClose connhdl

Here we use an MVar as a lock to guarantee that at most one thread is writing to stdout at a time. Otherwise we would see messages from different threads mixed up. This is the exact same approach we used in our Haskell Concurrent Programming post, when talking about using MVar as a lock.

Client: Sticky connection

Our TCP client also looks similar to the UDP counterpart, with a couple of differences.

1. As we did with the TCP server, we use Stream instead of Datagram.

2. We also mark the socket as keep-alive:

setSocketOption sock KeepAlive 1

which is basically telling the OS to periodically send packets to probe the server we’re connected to. This serves both as a check to see if the server is still alive and as a way to prevent the connection from being dropped due to inactivity [4].

3. We establish a sticky connection with the server:

connect sock (addrAddress serveraddr)

4. As in the TCP server, we use a file handle instead of a socket:

h <- socketToHandle sock WriteMode

which lets us use common IO file functions like hPutStrLn().

Every time we type a line, we want to send that string to the server. In the code below, we write a line to our handle and flush it so it is sent to the server immediately.

hPutStrLn (slHandle syslogh) sendmsg
hFlush (slHandle syslogh)

5. Keep reading lines from stdin and sending them until EOF

I’ve added a simple main function to the code so we can compile the client code into a binary, and also added a function, readData(), to read lines from stdin until we send an EOF character:

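import Control.Monad (unless)
import System.IO (isEOF)

-- Reads lines from stdin and sends each one to the server until EOF
readData :: SyslogHandle -> IO ()
readData h = do
  done <- isEOF
  unless done readLine
  where
    readLine = do
      message <- getLine
      syslog h USER INFO message
      readData h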

main = do
  h <- openlog "localhost" "1514" "syslogtcpclient"
  readData h
  closelog h
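
To tie the TCP steps together, here is a rough C equivalent of the calls the Haskell functions wrap: listen() and accept() on the server side, SO_KEEPALIVE and connect() on the client side. As a simplification, both ends live in one process and talk over the loopback interface on a port picked by the OS, so no threads are needed. The function name tcp_loopback_roundtrip() is made up for this example.

```c
#define _POSIX_C_SOURCE 200112L  /* expose POSIX socket declarations */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends `msg` over a TCP connection on 127.0.0.1 and reads it back on
   the accepted socket. Returns the number of bytes read (stored in
   `out`, NUL-terminated), or -1 on error. */
ssize_t tcp_loopback_roundtrip(const char *msg, char *out, size_t out_len) {
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof addr;
    int keepalive = 1;
    int master, client, conn = -1;
    size_t total = 0;
    ssize_t n, received = -1;

    if (out_len == 0)
        return -1;
    master = socket(AF_INET, SOCK_STREAM, 0);   /* Stream, not Datagram */
    client = socket(AF_INET, SOCK_STREAM, 0);
    if (master < 0 || client < 0)
        goto done;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                   /* let the OS pick a port */

    /* Server: bind, then mark the socket as accepting connections,
       with a queue of at most 5 pending connections. */
    if (bind(master, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;
    if (getsockname(master, (struct sockaddr *)&addr, &addr_len) < 0) goto done;
    if (listen(master, 5) < 0) goto done;

    /* Client: enable keep-alive probes, then connect. */
    if (setsockopt(client, SOL_SOCKET, SO_KEEPALIVE,
                   &keepalive, sizeof keepalive) < 0) goto done;
    if (connect(client, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;

    /* Server: take the pending connection off the queue. This returns
       a new socket dedicated to this client. */
    conn = accept(master, NULL, NULL);
    if (conn < 0) goto done;

    if (send(client, msg, strlen(msg), 0) < 0) goto done;
    close(client);                              /* the server will see EOF */
    client = -1;

    /* Server: read until EOF (or until the buffer is full). */
    while ((n = recv(conn, out + total, out_len - 1 - total, 0)) > 0)
        total += (size_t)n;
    if (n == 0) {
        out[total] = '\0';
        received = (ssize_t)total;
    }

done:
    if (conn >= 0) close(conn);
    if (client >= 0) close(client);
    if (master >= 0) close(master);
    return received;
}
```

The main difference from the real server is that here accept() is called only once. The Haskell version calls it in a loop and hands each new connection socket to its own thread.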


With our binaries in hand, I started the server first and then ran two client binaries. I was able to type messages in each of the clients and verified the server was handling them properly.


Writing this post, I’ve learned about network programming and debugging in Haskell. I had classes about network programming back in college, but they didn’t seem fun at the time. When we study things for our own curiosity, it’s much more interesting.

Also, in studying this chapter, I’ve tried using a more “curious mindset”, always questioning why things are this way or another, and this forced me to do more research and learn things beyond those the book provided.


[1] Real World Haskell – Chapter 27. Sockets and Syslog
[2] Oracle Java – What Is a Datagram?
[3] StackOverflow – Socket Protocol Fundamentals
[4] TCP Keepalive HOWTO