The application layer
Hi! This is my second blog entry. As I mentioned in my previous entry, the first real posts would be summaries of the chapters I liked in the book I am reading, “Computer Networking: A Top-Down Approach”. This one is about chapter 2, or more specifically, about the application layer and how it works. It doesn’t cover everything in the chapter, but it is the summary I wrote for myself and it works for me. I hope you like it!
In the application development world there are two major architectural paradigms:
- Client-server architecture
- P2P architecture
In the client-server architecture, a host or end system (the client) requests a resource from another end system called the server. The client requests objects that are stored on the server; for example, each image in an HTML document is an object. The server has a fixed, well-known IP address so that clients can always reach it.
P2P is an architectural paradigm in which hosts (peers) connect directly to each other and transfer files without going through a dedicated server in the middle. Each peer that downloads a file also starts serving it to other peers.
It’s important to understand how processes communicate. An end host runs a program that communicates with another program on another host. Strictly speaking, it is not the programs that communicate but the processes running inside each host. Processes within the same system use inter-process communication (IPC), which is controlled by the OS. Processes on two different hosts communicate by sending and receiving messages across the network.
When a process sends a message, it goes from the application layer (layer 7 in the OSI model) down through a socket to the transport layer (layer 4), so the socket API is the interface between the two layers.
To send a message to another host we need an IP address, that is, an identifier for that specific host. We also need a port number, which is how the receiving host identifies the application (the process) that the message is meant for. Well-known port numbers identify different applications; for example, HTTP uses port 80 and HTTPS uses port 443.
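To make the (host, port) pair concrete, here is a small Python sketch. The helper name `endpoint_for` and the `example.com` URLs are made up for illustration; only the default ports 80 and 443 come from the text above.

```python
from urllib.parse import urlsplit

# Default ports mentioned above; other schemes are out of scope for this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def endpoint_for(url):
    """Return the (host, port) pair a client would connect to for this URL."""
    parts = urlsplit(url)
    # An explicit port in the URL wins; otherwise fall back to the scheme default.
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port
```

Calling `endpoint_for("https://example.com/")` gives the host name and port 443. The host name still has to be resolved to an IP address before a socket can use it.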
Throughput is the rate at which the sending process can deliver bits to the receiving process. Bandwidth-sensitive applications need a guaranteed minimum throughput in order to work. Elastic applications, such as email, can work with whatever throughput happens to be available.
Transport services offered by the Internet:
- TCP service: TCP offers a connection-oriented service and a reliable data transfer service. Connection-oriented means that TCP first performs a 3-way handshake with the other host to set up the connection before any application data is sent. TCP itself does not encrypt anything, but it can be enhanced with SSL, or Secure Sockets Layer (now superseded by TLS). SSL establishes a secure channel on top of TCP, offering encryption, data integrity and end-point authentication. SSL is implemented in the application layer and has its own socket API, similar to the socket we mentioned before. The application passes cleartext data to the SSL socket, SSL encrypts it and passes the encrypted data to the TCP socket. When the data arrives at the other host, TCP passes the encrypted data to SSL, which decrypts it and hands it to the application as cleartext. For reliable data transfer, TCP numbers the bytes it sends and expects acknowledgements from the other end, retransmitting anything that is not acknowledged, so the data arrives complete and in order. Another important feature of TCP is congestion control: TCP throttles the sender when the network is congested in order to reduce data loss.
- UDP service: UDP provides a connectionless service with unreliable data transfer, so there is no guarantee that messages will arrive at the other end. This suits applications that can tolerate some loss and don’t need retransmission, such as VoIP.
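In Python's socket API, the choice between the two services is made when the socket is created. The sketch below is only an illustration; the function name `open_socket` and its `reliable` flag are my own, not something from the book.

```python
import socket


def open_socket(reliable):
    """Create an IPv4 socket: TCP if reliable delivery is wanted, UDP otherwise."""
    # SOCK_STREAM selects TCP (connection-oriented, reliable byte stream).
    # SOCK_DGRAM selects UDP (connectionless datagrams, no delivery guarantee).
    kind = socket.SOCK_STREAM if reliable else socket.SOCK_DGRAM
    return socket.socket(socket.AF_INET, kind)
```

A file transfer would probably call `open_socket(True)`, while something like VoIP could use `open_socket(False)`.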
Overview of HTTP:
HTTP stands for HyperText Transfer Protocol and is the application-layer protocol of the web. HTTP is implemented in two programs, a client and a server: the client sends an HTTP request and the server sends back an HTTP response.
Web pages are written in a markup language called HTML (HyperText Markup Language) and are made up of objects. An object is simply a file, such as the HTML file itself or an image, so an HTML document with 5 images consists of 6 different objects. Web browsers (Firefox or Chrome) are the client side; web servers (Apache or MS IIS) are the server side.
HTTP is stateless, that is, the server doesn’t keep any information about past requests from a client. It works in the client-server architecture and uses TCP as its transport protocol for reliable data delivery. The socket API passes data between the application layer (HTTP) and the transport layer (TCP).
Types of connections:
- Persistent connections: a TCP connection is established and kept open for a period of time, so several requests and responses are sent over the same TCP connection.
- Non-persistent connections: a new TCP connection is established for each request/response pair. The connection is closed once the server has responded and the client has received the data.
HTTP messages:
- Request: an HTTP request is the message a client sends to get information from a server. It has a request line, header lines and a body. The request line contains the method, the URL and the HTTP version; the method can be GET, POST, HEAD, PUT or DELETE. The Host header specifies the host on which the object lives. The request can also carry parameters, for example when searching for something on a website (such as looking for a book with id 12345 and author tan ah teck on Google). With GET the parameters go in the URL and the body is empty; with POST the form data travels in the message body.
- Response: what the server sends back in answer to our request. It is divided into three parts: status line, header lines and entity body. The status line holds the protocol version, the status code and the status message; there are several HTTP status codes, depending on the outcome of the request (for example 200 OK or 404 Not Found). The entity body carries the requested object itself, for example the HTML of the website.
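The message layout described above can be sketched in a few lines of Python. This is a simplified illustration, not a full HTTP implementation; the function names `build_get_request` and `parse_response` are hypothetical, and the `Connection: close` header is my own choice to ask for a non-persistent connection.

```python
def build_get_request(host, path="/"):
    """Build the bytes of a minimal HTTP/1.1 GET request."""
    head = [
        f"GET {path} HTTP/1.1",  # request line: method, URL, version
        f"Host: {host}",         # header line: host on which the object lives
        "Connection: close",     # header line: ask for a non-persistent connection
    ]
    # Every line ends with CRLF; one extra blank line marks the end of the headers.
    return ("\r\n".join(head) + "\r\n\r\n").encode("ascii")


def parse_response(raw):
    """Split a raw response into (version, status code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: protocol version, status code, status message.
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body
```

Feeding `parse_response` the bytes of a real server reply separates the status line, the header lines and the entity body, the same three parts listed above.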
As HTTP is stateless and doesn’t keep any information about the user, sites need cookies to tell users apart. Cookies allow sites to keep track of users. When we contact a server, it might include a cookie in its response (in a Set-Cookie header). The browser stores the cookie and sends it back to that server with every later request (in a Cookie header). The server reads the cookie and identifies the user, so it can provide data based on the interests of that person.
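Here is a rough sketch of the browser side of that exchange, using Python's standard `http.cookies` module. The function name `cookie_header_from` and the cookie name `session_id` are invented for the example.

```python
from http.cookies import SimpleCookie


def cookie_header_from(set_cookie_value):
    """Turn the value of a Set-Cookie response header into a Cookie request header value."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)  # parses name=value plus attributes such as Path
    # Only the name=value pairs are sent back; attributes stay on the client side.
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())
```

If the server answered with `Set-Cookie: session_id=1678; Path=/`, later requests to that server would carry `Cookie: session_id=1678`.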
A web cache (also called a proxy server) is a device that sits between the client and the origin server and can answer client requests faster than the server where the site is hosted. A browser can be configured to go to the web cache first: it opens a TCP connection to the cache and makes its HTTP request there. The web cache checks whether it has a copy of the object the client is asking for and, if so, returns it. If it doesn’t, it opens a TCP connection to the origin server and makes the same HTTP request. When the server responds, the cache saves a copy for later use and forwards the response to the client. A web cache is faster mainly because it is usually much closer to the client than the origin server, so a cached object doesn’t have to cross the slower links out to the Internet. It also reduces traffic on those links. The web cache acts as a server and a client at the same time: a server towards the browser and a client when requesting the object from the origin server.
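The hit-or-miss logic described above could be sketched like this. It is a toy in-memory version: the class name `WebCache` is my own, and `fetch_from_origin` is a hypothetical callable standing in for the HTTP request to the origin server.

```python
class WebCache:
    """Toy web cache; fetch_from_origin is a hypothetical callable url -> bytes."""

    def __init__(self, fetch_from_origin):
        self._fetch_from_origin = fetch_from_origin
        self._store = {}

    def get(self, url):
        if url in self._store:
            # Cache hit: answer the client without contacting the origin server.
            return self._store[url]
        # Cache miss: act as a client towards the origin server.
        body = self._fetch_from_origin(url)
        # Keep a copy so the next request for this URL is a hit.
        self._store[url] = body
        return body
```

A real cache would also have to decide when a stored copy is out of date, which this sketch ignores.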
Web caches are typically installed by ISPs to speed up web access. CDNs nowadays distribute web caches geographically to bring this faster lookup closer to users. Google and Microsoft have their own web caches.
Programming a UDP socket:
Two things must be attached to each packet we send:
- The IP address of the destination host
- The port of the service we want to reach (where the server is listening for requests).
The example program uses UDP sockets, and each line is commented with what it does. When we run it (server and client), the client asks for a lowercase sentence, sends it, waits for a response, prints the response and then terminates. The server keeps running and listening on port 12000; as soon as it gets a request it responds and prints out some information.
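A minimal sketch of such a client/server pair follows. It is not the code from my repository: the function names are made up, and I am assuming that the server replies with the sentence converted to uppercase. The server function takes an already-bound socket and a request limit so it can be tried out without running forever.

```python
import socket


def udp_upper_server(sock, max_requests=1):
    """Answer max_requests datagrams on an already-bound UDP socket."""
    for _ in range(max_requests):
        # recvfrom returns the data and the sender's (IP, port) address.
        data, client_addr = sock.recvfrom(2048)
        # Assumption: the reply is the same sentence in uppercase.
        sock.sendto(data.decode().upper().encode(), client_addr)


def udp_client(message, server_addr, timeout=2.0):
    """Send one datagram to server_addr = (IP, port) and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)  # UDP gives no guarantees, so don't wait forever
        # No connection set-up: the destination is attached to each datagram.
        client.sendto(message.encode(), server_addr)
        reply, _ = client.recvfrom(2048)
        return reply.decode()
```

For the setup described above, the server socket would be created with `socket.SOCK_DGRAM`, bound with `sock.bind(("", 12000))` and served in an endless loop, while the client would read its sentence with `input()`.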
Programming a TCP socket:
As TCP is connection-oriented, the client and server first need to establish a connection with a 3-way handshake. Once this is done, the client can start sending data. After all the data has been processed the server closes the connection, meaning this is a non-persistent connection.
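The same idea over TCP might look like the sketch below. Again, the function names are hypothetical and the uppercase reply is an assumption on my part. The server takes a socket that is already bound and listening, handles a limited number of connections, and closes each one after a single reply.

```python
import socket


def tcp_upper_server(listen_sock, max_connections=1):
    """Serve max_connections clients on a socket that is already bound and listening."""
    for _ in range(max_connections):
        # accept() hands back a new socket for a connection whose handshake is done.
        conn, _addr = listen_sock.accept()
        with conn:
            data = conn.recv(1024)
            # Assumption: the reply is the same sentence in uppercase.
            conn.sendall(data.decode().upper().encode())
        # Leaving the with-block closes the connection (non-persistent).


def tcp_client(message, server_addr, timeout=2.0):
    """Connect to server_addr = (IP, port), send one message and return the reply."""
    # create_connection performs the handshake before returning.
    with socket.create_connection(server_addr, timeout=timeout) as client:
        client.sendall(message.encode())
        return client.recv(1024).decode()
```

Unlike the UDP version, the client does not attach the destination address to every send: the connection already identifies the other end. A more robust version would loop on `recv`, since TCP delivers a byte stream rather than whole messages.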
The code can be found in my GitHub repository: Check it on GitHub
Source of the information: Computer Networking: A Top-Down Approach by James F. Kurose and Keith W. Ross.