Introduction
Since I started working as a software engineer, I have come across multiple web APIs (Application Programming Interfaces), which were built either on a REST design or on the GraphQL query language. At the time of writing, GraphQL is one of the hottest terms in the industry, mainly because it allows clients to write their own queries and request exactly the data they need. However, most existing API services have been built using the traditional REST design, and in this tutorial I will attempt to explain some of the principles and best practices that come with the REST style. This article is the beginning of a series that aims to explain the two main API architectures and to implement practical examples of each using Node.js and MongoDB. But before we dive into the world of APIs, let's try to understand some basic definitions.
What is an API?
When I first heard the term, I wasn't sure what it meant. Eventually, after a lot of digging and reading some good resources, I think I got it. In the context of the web, it is just a fancy term for a web service hosted on a server. A server is responsible for handling incoming requests from multiple clients, and an API defines the functions and routines that handle those requests, usually expressed in a consistent format. But how does an API differ from a web app that has a back-end? The back-end of a web application mainly handles incoming requests by serving HTML templates, which are then rendered on the client. If a media type (MIME type) is not explicitly specified, browsers typically try to guess the content and often render the response as HTML. An API, on the other hand, often returns data in JSON (JavaScript Object Notation) format.
An analogy I like is the following. Think of a customer (client) who goes into a bar and orders a beer. The barman, who in our case is the API, is responsible for fulfilling the customer's request and then moving on to the rest of the customers waiting in the queue. This is exactly what an API does. It is a service focused on a specific task and responsible for fulfilling network requests made by clients (usually browsers). Bear in mind that the barman does not act as the server in this example, but rather as a service hosted in the bar (the server).
What is a CRUD cycle?
The Pareto principle (80/20 rule) states that for many events, roughly 80% of the effects come from 20% of the causes. A similar pattern shows up in web APIs: most of the time, software engineers expose endpoints that allow CRUD operations on certain resources. CRUD stands for Create, Read, Update and Delete, the four types of operations that can be applied to a resource. A resource is a “thing”, a data model or entity that exists in a database system. For example, let's assume that we are building a website that provides some service X and we want to keep track of the users in our system. We need a piece of functionality that creates a user (C) and another that reads the users that have been stored (R). Additionally, if users want to change their personal details, we need to update their existing profile (U), and if they decide to leave our website, we need to delete their personal details (D). Together these operations form the acronym CRUD, and they apply directly to the design of web APIs. Now, let's proceed with the main topic of the article.
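To make the four operations concrete, here is a minimal sketch of a CRUD cycle for users. It assumes an in-memory Map in place of a real database, and the function names (createUser, readUser, updateUser, deleteUser) are my own choices for illustration rather than part of any library.

```javascript
// In-memory stand-in for a database table of users (illustrative only).
const users = new Map()
let nextId = 1

// C: create a user and let the store assign the id.
function createUser(details) {
  const user = { id: nextId++, ...details }
  users.set(user.id, user)
  return user
}

// R: read one user by id, or every user when no id is given.
function readUser(id) {
  if (id === undefined) return [...users.values()]
  return users.get(id) ?? null
}

// U: merge new details into an existing user; null if the user is unknown.
function updateUser(id, details) {
  const existing = users.get(id)
  if (!existing) return null
  const updated = { ...existing, ...details, id }
  users.set(id, updated)
  return updated
}

// D: delete a user; returns true only if something was removed.
function deleteUser(id) {
  return users.delete(id)
}

const alice = createUser({ name: "Alice" })
updateUser(alice.id, { name: "Alice Smith" })
console.log(readUser(alice.id)) // { id: 1, name: 'Alice Smith' }
console.log(deleteUser(alice.id)) // true
console.log(readUser()) // []
```

In a real service each of these functions would sit behind an endpoint and talk to a database, but the shape of the cycle stays the same.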
What is a RESTful service?
Any service that has been developed following the REST architectural style is called RESTful. Essentially, REST defines a set of six constraints that a web service must follow to be considered RESTful. I will briefly mention each of them, but I will focus on the most important one, the uniform interface:
Client-Server: The system is built on a client-server architecture, so that each component can act and evolve independently. This decouples the client side from the server side, which older architectures used to combine.
Stateless: When a client makes a request to a server, it is responsible for providing all the information (state) that the service needs to handle the request. This rules out session-based authentication that stores user information on the server; however, other authentication schemes, such as JSON Web Tokens (JWT), may be employed.
Cacheable: Any response should be implicitly or explicitly labelled as cacheable or non-cacheable, which lets clients and intermediaries reuse responses and improves performance. This enables a client-cache-server style, meaning that a cache acts as a mediator between a client and a server. Note that HTTP provides a header called Cache-Control that controls how a request or response may be cached.
Layered System: A client should not be able to tell whether it is connected directly to the end server or to an intermediary along the way.
Code-on-Demand (optional): The client can request code from the server, which returns an executable script. The client can then execute this script.
Uniform Interface
This constraint consists of four sub-constraints:
Identification of Resources: A resource is identified using a URI. Continuing our previous example, the URIs /users and /users/123 identify the whole collection of users and the user with the id 123, respectively.
Manipulation of Resources through Representations: A representation is the returned state of a target resource; in simple words, it is the JSON data that is returned (assuming that JSON is the default format). This rule implies that a representation should include enough information to identify the resource in the system. For example, when a user is created, the response should include a header that specifies the location of the created resource (the Location header, if HTTP is used).
Self-Descriptive Messages: A request or response should describe itself. Common examples are the media type (MIME type) specified for a response and its caching information.
Hypermedia As The Engine Of Application State (HATEOAS): The state of the application should be driven only through hypermedia. Clients deliver state using the request body, query-string parameters, headers and the URI of a request, while services deliver state using the response body, headers and status codes. In its stricter reading, this also means that responses carry links that tell the client which resources and actions are available next.
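The four sub-constraints are easier to see in a single response. The sketch below builds the status, headers and body that a service might return after creating a user. The function name describeCreatedUser, the links field and the choice of Cache-Control value are assumptions made for illustration; REST itself does not prescribe them.

```javascript
// Build the pieces of an HTTP response for a newly created user.
// The names under `links` are illustrative; REST does not prescribe them.
function describeCreatedUser(user) {
  const location = `/users/${user.id}` // identification of the resource
  return {
    status: 201,
    headers: {
      "Content-Type": "application/json", // self-descriptive: how to parse the body
      "Cache-Control": "no-store", // self-descriptive: do not cache this response
      "Location": location, // where the created resource lives
    },
    body: {
      ...user, // the representation of the resource
      links: {
        self: location, // hypermedia: where the client can go next
        books: `${location}/books`,
      },
    },
  }
}

console.log(describeCreatedUser({ id: 123, name: "Alice" }))
```

The URI identifies the resource, the body is its representation, the headers make the message self-descriptive, and the links hint at what the client can do next.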
REST Best Practices
1. Use HTTP and HTTP verbs
Although REST does not enforce the use of HTTP (HyperText Transfer Protocol), it is the recommended approach because it is in line with most of the constraints of a REST design. For example, HTTP is stateless and provides headers such as Cache-Control to handle the caching of a resource, which means that the stateless and cacheable constraints are supported out of the box. Let's apply HTTP to our existing example:
GET: The GET method is used to query any resource in the service. In our example, we may query the collection of users or a single user using the identifiers /users and /users/:id, respectively. Note the plural form of the noun users. If the request succeeds, the service responds with the state of the resource and a status code of 200 OK. If the request is malformed, a status of 400 Bad Request is returned. If the user 123 does not exist in the database, a 404 Not Found status is returned. The state of the resource is returned only when the request succeeds.
POST: The POST method implies the creation of a resource, so to create a user our service exposes an endpoint with the URI /users. To create a user, the client calls that endpoint and populates the request body with the data of the user. It is preferable not to specify the id of the user, which signals that the service is responsible for populating that field. If the request succeeds, we return a status of 201 Created and the user, and we add the Location header with a link to the newly created resource. Although some people suggest leaving the response body empty, in industry you will mostly see the user being returned, as it saves a follow-up request to the resource. If the request fails, a 400 Bad Request status is returned.
PUT: This method is mostly used to update a resource, although it can also be used to create one (not recommended unless you intend to set a specific user id). We can update the user with the id 123 by calling /users/123 and adding the updated data of the user to the body. If the request succeeds, the service returns the updated user and a status of 200 OK. However, if the client has no intention of using the updated user, a status of 204 No Content with an empty response body is also possible. If the request fails, a status of 400 Bad Request is returned. If the user 123 does not exist in the database, a 404 Not Found status is returned.
DELETE: This method is used to delete a user from the service and is available by calling /users/123. If the request succeeds, the service returns either the deleted user with a status of 200 OK or an empty response with a status of 204 No Content. If the request fails, a status of 400 Bad Request is returned. If the user 123 does not exist in the database, a 404 Not Found status is returned. A 404 Not Found status should also be returned if the request is repeated after a successful first attempt, since the user no longer exists.
| HTTP Method | /users | /users/123 |
| --- | --- | --- |
| GET | Success: return a status of 200 OK with the collection of users. Failure: return a status of 400 Bad Request with an empty response body. | Success: return a status of 200 OK with the queried user. Failure: return a status of 400 Bad Request with an empty response body. Not Found: return a status of 404 Not Found with an empty response body. |
| POST | Success: return a status of 201 Created with the newly created user. Populate the Location header with a link to the created resource. Failure: return a status of 400 Bad Request with an empty response body. | - |
| PUT | - | Success: return a status of 200 OK with the updated user if the client will use it. Otherwise, return a status of 204 No Content and an empty body. Failure: return a status of 400 Bad Request with an empty response body. Not Found: return a status of 404 Not Found with an empty response body. |
| DELETE | - | Success: return a status of 200 OK with the deleted user if the client will use it. Otherwise, return a status of 204 No Content and an empty body. Failure: return a status of 400 Bad Request with an empty response body. Not Found: return a status of 404 Not Found with an empty response body. |
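The rules above can be sketched as a single function that maps a method and a path to a status, headers and body. This is a simplified illustration under a few assumptions of mine: the store is an in-memory Map, a user is valid when it has a non-empty name, DELETE answers with 204 rather than 200, and unsupported methods get 405 Method Not Allowed, a standard HTTP status that the discussion above does not cover. The names handle and isValidUser are hypothetical.

```javascript
// In-memory store standing in for a database (illustrative only).
const store = new Map()
let lastId = 0

// Assumed validation rule: a user only needs a non-empty name.
function isValidUser(body) {
  return Boolean(body) && typeof body.name === "string" && body.name.length > 0
}

// Map an HTTP method and path to a { status, headers, body } result.
function handle(method, path, body) {
  const match = /^\/users(?:\/(\d+))?$/.exec(path)
  if (!match) return { status: 404, headers: {}, body: null }
  const id = match[1] === undefined ? null : Number(match[1])

  // Collection endpoint: /users
  if (id === null) {
    if (method === "GET") {
      return { status: 200, headers: {}, body: [...store.values()] }
    }
    if (method === "POST") {
      if (!isValidUser(body)) return { status: 400, headers: {}, body: null }
      const user = { id: ++lastId, name: body.name } // the service assigns the id
      store.set(user.id, user)
      return { status: 201, headers: { Location: `/users/${user.id}` }, body: user }
    }
    return { status: 405, headers: { Allow: "GET, POST" }, body: null }
  }

  // Single-resource endpoint: /users/:id
  const existing = store.get(id)
  if (!existing) return { status: 404, headers: {}, body: null }
  if (method === "GET") return { status: 200, headers: {}, body: existing }
  if (method === "PUT") {
    if (!isValidUser(body)) return { status: 400, headers: {}, body: null }
    const updated = { id, name: body.name }
    store.set(id, updated)
    return { status: 200, headers: {}, body: updated }
  }
  if (method === "DELETE") {
    store.delete(id)
    return { status: 204, headers: {}, body: null }
  }
  return { status: 405, headers: { Allow: "GET, PUT, DELETE" }, body: null }
}

const created = handle("POST", "/users", { name: "Alice" })
console.log(created.status, created.headers.Location) // 201 /users/1
console.log(handle("PUT", "/users/1", { name: "Alice Smith" }).status) // 200
console.log(handle("DELETE", "/users/1").status) // 204
console.log(handle("DELETE", "/users/1").status) // 404
console.log(handle("POST", "/users", {}).status) // 400
```

Note how the second DELETE returns 404, because the user no longer exists after the first one succeeded.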
2. Use Hypertext wisely
Although there are various options for handling application state using hypertext, each option should be used only when it is needed. For example, a query-string and a URI identifier could both be used to retrieve a resource from an API. In our example, using a query-string we could expose a single endpoint to fetch either the collection of users (/users) or a single user (/users?id=123). Thus, a single endpoint does the same job that our previous design did with two. One might assume that this is a better solution; however, it is not. The sole purpose of identifiers is to indicate a resource within a service, and they should be preferred over any other option for that purpose, because they improve readability and clarity as the service grows. Let's assume that we want to query the API to retrieve the books of a user. The routes /users?id=123&type=books and /users/123/books achieve the same goal with a query-string and a URI identifier, respectively; however, the latter is more readable and expresses a hierarchy. Nonetheless, a query-string is great when there is a need to filter results or add pagination.
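As a rough sketch of that division of labour, the function below reads the resource hierarchy from the path and the pagination options from the query-string, using the URL class built into Node.js. The parameter names page and limit and their default values are assumptions for illustration, as is the function name parseRequest.

```javascript
// Split a request URL into the resource path (identity) and the query
// options (pagination). The base URL is only needed to parse a relative path.
function parseRequest(rawUrl) {
  const url = new URL(rawUrl, "http://localhost")
  const segments = url.pathname.split("/").filter(Boolean)
  return {
    segments, // e.g. users -> 123 -> books
    page: Number(url.searchParams.get("page") ?? 1), // assumed default: first page
    limit: Number(url.searchParams.get("limit") ?? 20), // assumed default: 20 items
  }
}

const parsed = parseRequest("/users/123/books?page=2&limit=10")
console.log(parsed.segments, parsed.page, parsed.limit)
// [ 'users', '123', 'books' ] 2 10
```

The path says which resource is wanted, while the query-string only adjusts how much of it comes back.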
3. Use JSON
A RESTful API may serve a representation in either XML or JSON format. Since XML is less common in modern web APIs and tends to be more verbose, set the default format of your service to JSON. If you are a client and want an API to return JSON data, add the Accept header to state the desired output format explicitly.
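On the service side, the Accept header can be checked before a response is built. The sketch below is deliberately simplified: it ignores quality values (q=...) and treats a missing header as "no preference". The function name acceptsJson is my own. A service that cannot satisfy the header would normally answer with 406 Not Acceptable.

```javascript
// Decide whether a request's Accept header allows a JSON response.
// Simplified on purpose: quality values (q=...) are ignored.
function acceptsJson(acceptHeader) {
  if (!acceptHeader) return true // no preference stated, so serve the default
  return acceptHeader
    .split(",")
    .map((part) => part.split(";")[0].trim().toLowerCase())
    .some((type) => ["application/json", "application/*", "*/*"].includes(type))
}

console.log(acceptsJson("application/json")) // true
console.log(acceptsJson("text/html, */*;q=0.8")) // true
console.log(acceptsJson("application/xml")) // false
```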
4. Use wrapped responses
I suggest using wrapped responses when you return data to the client, for two main reasons. Firstly, some frameworks do not expose the status of a response, and since developers decide whether a request succeeded based on that status, it is preferable to also return it in the response body. Secondly, there are dozens of HTTP status codes, of which we focus on only a few; wrapped responses let you narrow those codes down to the ones you care about. An example of wrapped responses follows, first for a failed request and then for a successful one:
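// failure.js
// `app` is assumed to be an Express application created elsewhere
app.get("/", (req, res, next) => {
  // logic to handle an incoming request
  const errorMsg = "Bad request" // placeholder error message
  res.status(400).json({ status: 400, success: false, message: errorMsg })
})
// success.js
app.get("/", (req, res, next) => {
  // logic to handle an incoming request
  const data = {} // placeholder payload
  res.status(200).json({ status: 200, success: true, data: data })
})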
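To avoid repeating the envelope in every route, the wrapping can be pulled into small helpers. This is only a sketch of one way to do it: the helper names success, failure and summarize are hypothetical, and the envelope fields mirror the ones used in the handlers above.

```javascript
// Wrap a successful result in the envelope used in this article.
function success(data, status = 200) {
  return { status, success: true, data }
}

// Wrap a failure in the same envelope, with a message instead of data.
function failure(message, status = 400) {
  return { status, success: false, message }
}

// A client can branch on `success` without inspecting the HTTP status line.
function summarize(response) {
  return response.success
    ? `ok: ${JSON.stringify(response.data)}`
    : `error ${response.status}: ${response.message}`
}

console.log(summarize(success({ id: 123, name: "Alice" })))
// ok: {"id":123,"name":"Alice"}
console.log(summarize(failure("Name is required")))
// error 400: Name is required
```

A route handler could then pass the result of success(...) or failure(...) straight to res.json, keeping every response in the same shape.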
Summary
When an API is built on a REST design, its architecture follows principles that enhance performance, scalability and reliability. It exposes multiple endpoints that allow clients/consumers to access a resource and perform CRUD operations. Resources are accessed through URIs, which, combined with HTTP verbs such as GET, POST, PUT and DELETE, give the API a form of structure and hierarchy. JSON is the recommended format for returning a representation to the client. Lastly, building API services is not only a great skill to have, but APIs are also a great tool in a microservice architecture. Since many companies are moving to microservice architectures, a good knowledge of APIs is becoming a necessary skill. The next article of the series will be a RESTful implementation of an API using Node.js and MongoDB. Stay tuned!