An Introduction to Concurrent Programming in Golang

Introduction

Go, often called Golang, is a statically typed language developed at Google and announced in 2009, and it is well known for its approach to concurrent programming. While languages such as Java and Ruby rely mainly on a Shared Memory concurrency model built on threads, mutexes and locks, Go also builds a model named CSP (Communicating Sequential Processes) into the language. If you are wondering how these models differ and why Go's designers made that decision, you are in the right place. In this article I will talk about the Shared Memory and CSP models, and give a short introduction to CSP in Golang.

Concurrent vs Parallel Programming

A concurrent system is software designed to deal with multiple tasks during overlapping periods of time. Don’t confuse this with parallel programming, which executes multiple tasks at the same instant. Let’s demonstrate the difference with a couple of examples. Assume that Sam goes for a walk in the park near his house. At some point his phone rings and he picks it up. He stops walking to take the call and resumes his walk as soon as he hangs up. In this example, Sam handles his tasks concurrently: both are in progress, but only one advances at a time. In the second example, Sam keeps walking through the park while talking on the phone, so both tasks advance at the same time. This is an example of executing tasks in parallel.

In programming, the same distinction applies to running a program concurrently and in parallel. You may be wondering why we don’t simply run every program in parallel, since that seems faster. Concurrency is a property of the program itself, while parallelism depends on the hardware, such as the number of cores a machine has. Since a programmer’s main focus is writing software, they concentrate on writing software that is concurrent, which may then run in parallel if the hardware allows it. The details of managing a core are not within their control; they are an abstraction hidden behind a series of layers.
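
To make the distinction concrete, here is a minimal sketch. The helper name sumSquares is hypothetical and chosen only for illustration. The program is written concurrently once; whether its goroutines actually run in parallel depends on how many OS threads may execute Go code at the same time, which the sketch varies with runtime.GOMAXPROCS. The result is the same in both cases.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumSquares is a hypothetical helper: it computes 1*1 + 2*2 + ... + n*n,
// giving each term its own goroutine.
func sumSquares(n int) int {
	results := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = (i + 1) * (i + 1) // each goroutine writes only its own slot
		}(i)
	}
	wg.Wait()

	total := 0
	for _, r := range results {
		total += r
	}
	return total
}

func main() {
	// Only one thread may execute Go code: concurrent, but never parallel.
	prev := runtime.GOMAXPROCS(1)
	fmt.Println(sumSquares(10)) // prints 385

	// Restore the previous setting: the same goroutines may now run in
	// parallel if the machine has more than one core.
	runtime.GOMAXPROCS(prev)
	fmt.Println(sumSquares(10)) // prints 385
}
```

The code does not change between the two calls; only the degree of parallelism offered by the runtime and the hardware does.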

Shared Memory Model

Most programming languages support a Shared Memory Model. In this model, multiple processes or threads access the same region of memory. For example, let’s assume that we create an API that handles a series of requests. In that case, we often create a pool of threads, with each thread handling a single request. However, when multiple threads access the same memory, that access needs synchronization, and this is where most of the difficulties in concurrent programming come from. Primitives such as mutexes and locks are used to mitigate this issue.

Although this model is available in Golang through the sync package, it's not what made the language famous. The sync package exposes primitives like Mutex, Locker, and Pool, but I will cover them in another article.
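
For now, a minimal sketch gives a feel for the model. It assumes a hypothetical handleRequests function standing in for the API described above: a fixed pool of workers records every handled request in one shared counter, and a sync.Mutex serialises access to it.

```go
package main

import (
	"fmt"
	"sync"
)

// handleRequests is a hypothetical stand-in for an API server: each worker
// in the pool handles its share of requests and counts them in shared memory.
func handleRequests(workers, requestsPerWorker int) int {
	var (
		mu      sync.Mutex
		handled int // shared state, read and written by every worker
		wg      sync.WaitGroup
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := 0; r < requestsPerWorker; r++ {
				mu.Lock()
				handled++ // without the lock this increment would be a data race
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return handled
}

func main() {
	fmt.Println(handleRequests(8, 1000)) // prints 8000
}
```

Every access to the counter has to go through the mutex; forgetting it in a single place is enough to introduce a race, which is the kind of difficulty mentioned above.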

Communicating Sequential Processes (CSP)

The history of CSP starts back in 1978, when Charles Antony Richard Hoare published a paper titled “Communicating Sequential Processes”. In this paper, Hoare suggests building message-passing primitives into a language. Processes would use these primitives to exchange information with one another. In his paper, a process is anything that accepts an input and produces an output, which is roughly what we call a function. With that in mind, Golang exposes primitives that can be used to write a concurrent system.

CSP in Golang

Rather than using threads directly, Golang introduces the concept of a goroutine, which is essentially a function that runs concurrently with the rest of the program. It’s very lightweight: it starts with a small stack that grows or shrinks as needed. It's common to spin up many goroutines, even thousands, without worrying about memory limitations. However, a goroutine does not replace threads; rather, it introduces a new abstraction on top of OS threads. Internally, the Go runtime multiplexes goroutines onto OS threads and is responsible for scheduling their execution. Although this introduces a slight performance penalty, we get a couple of benefits. Firstly, any improvements made to the runtime are available to us as well. Secondly, when goroutines communicate instead of sharing state, we rarely need to manage access to memory by hand, which means code that is less error-prone, more robust, more scalable and easier to maintain.
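
As a rough illustration of how cheap goroutines are, the sketch below starts ten thousand of them and keeps them all alive at once. The helper name parkGoroutines and the gate mechanism are assumptions made for this example, not something the runtime prescribes.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parkGoroutines is a hypothetical helper: it starts n goroutines that all
// wait on a gate, reports how many goroutines exist at that moment, and
// then releases them.
func parkGoroutines(n int) int {
	var gate, done sync.WaitGroup
	gate.Add(1)
	for i := 0; i < n; i++ {
		done.Add(1)
		go func() {
			defer done.Done()
			gate.Wait() // block until the gate opens
		}()
	}
	alive := runtime.NumGoroutine() // the n waiting goroutines plus the caller
	gate.Done()                     // open the gate
	done.Wait()                     // wait for every goroutine to return
	return alive
}

func main() {
	fmt.Println("goroutines alive:", parkGoroutines(10000))
}
```

The go keyword in front of the function call is all it takes to start a goroutine; the runtime decides which OS thread runs it and when.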

Moving on, goroutines may communicate using a channel. A channel acts as a pipe between goroutines and allows data to be transferred between them. This is why we say that CSP introduces a message-passing style. One goroutine can send data through a channel, and another goroutine can receive it. Lastly, using a select statement, we can wait on several channel operations at once and proceed with whichever is ready first.
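
A minimal sketch of these two primitives follows. The helpers produce and sumBoth are hypothetical names used for illustration: two goroutines each send values on their own channel, and a single select loop receives from whichever channel is ready until both are closed.

```go
package main

import "fmt"

// produce sends the given values on a new channel and then closes it.
func produce(values ...int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch)
		for _, v := range values {
			ch <- v // blocks until a receiver is ready
		}
	}()
	return ch
}

// sumBoth receives from two channels until both are closed and returns
// the sum of everything it received.
func sumBoth(a, b <-chan int) int {
	total := 0
	for a != nil || b != nil {
		select {
		case v, ok := <-a:
			if !ok {
				a = nil // a nil channel is never ready, so this case is disabled
				continue
			}
			total += v
		case v, ok := <-b:
			if !ok {
				b = nil
				continue
			}
			total += v
		}
	}
	return total
}

func main() {
	fmt.Println(sumBoth(produce(1, 2, 3), produce(10, 20))) // prints 36
}
```

The order in which values arrive may differ between runs, but no value is lost and no lock is needed, because each value is handed from exactly one sender to exactly one receiver.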

In the example below, we create a goroutine that sends random numbers through a channel and terminates after three seconds. In main, we consume the random numbers from the channel and print them to stdout. We stop iterating over the channel once the first goroutine closes it.

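package main

import (
	"context"
	"fmt"
	"math/rand"
	"time"
)

// getStream starts a goroutine that sends random numbers on the returned
// channel until ctx is done, then closes the channel.
func getStream(ctx context.Context) <-chan int {
	stream := make(chan int)

	go func() {
		// Closing the channel tells the receiver that no more values will arrive.
		defer close(stream)

		for {
			select {
			case <-ctx.Done():
				return
			case stream <- rand.Intn(100):
			}
		}
	}()

	return stream
}

func main() {
	// The context is cancelled automatically after three seconds.
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel() // release the resources held by the timeout

	stream := getStream(ctx)

	defer fmt.Println("Terminating...")
	// The loop ends once the goroutine in getStream closes the channel.
	for v := range stream {
		fmt.Printf("Value: %v\n", v)
	}
}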

CSP vs Shared Memory Model

CSP and Shared Memory models both solve the same problem: writing concurrent code. However, since Golang supports both of them, how do you know which one to use?

As a rule of thumb, use CSP when you want to transfer the ownership of data. Data has an owner, which is essentially the piece of code that controls it. Therefore, if you want to hand the data over to another part of the program, prefer channels and goroutines. The same applies when you want to coordinate multiple pieces of logic. The idea behind this is that channels and goroutines behave like constructs we are already familiar with. We can create a channel and pass it as an argument to a function, or even iterate over it. In addition, we run a goroutine by adding the go keyword in front of a function invocation. The mental model for a goroutine is very similar to that of a function, which means that the way we think about writing code won’t change too much.
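
The ownership idea can be sketched as follows. The message type and the producer and consumer functions are hypothetical, introduced only for this example: the producer builds a value, sends it, and never touches it again, so the consumer can modify it without any lock.

```go
package main

import (
	"fmt"
	"strings"
)

// message is a hypothetical payload whose ownership moves with the send.
type message struct {
	words []string
}

// producer builds one message per line and hands each over on the channel.
// After the send it never touches that message again.
func producer(lines []string) <-chan *message {
	out := make(chan *message)
	go func() {
		defer close(out)
		for _, line := range lines {
			m := &message{words: strings.Fields(line)}
			out <- m // ownership of m passes to the receiver here
		}
	}()
	return out
}

// consumer owns every message it receives and may mutate it freely.
func consumer(in <-chan *message) []string {
	var result []string
	for m := range in {
		for i, w := range m.words {
			m.words[i] = strings.ToUpper(w)
		}
		result = append(result, strings.Join(m.words, " "))
	}
	return result
}

func main() {
	lines := []string{"hello gophers", "share by communicating"}
	for _, s := range consumer(producer(lines)) {
		fmt.Println(s)
	}
}
```

Nothing in the language enforces the hand-off; it is a convention that the sender stops using a value once it has been sent.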

On the other hand, prefer to synchronise access to memory when you want to guard the internal state of a struct. For example, let’s say that you create a custom data store which has state, and you want to protect and synchronise access to it. In that scenario, the shared memory model is a perfect candidate. Lastly, if you have a bottleneck in your system and profiling has shown that channels are the cause, then it may be worth switching to mutexes and locks. In general, the shared memory model tends to be more performant, but this benefit often does not outweigh maintainability, scalability and readability.
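
A custom data store of this kind might look like the sketch below. The Store type and its methods are hypothetical and kept deliberately small: the struct hides a map behind a sync.RWMutex, so callers never see the lock.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical in-memory data store. The mutex guards data and
// is never exposed to callers.
type Store struct {
	mu   sync.RWMutex
	data map[string]int
}

// NewStore returns an empty, ready-to-use Store.
func NewStore() *Store {
	return &Store{data: make(map[string]int)}
}

// Add adds delta to the value stored under key.
func (s *Store) Add(key string, delta int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] += delta
}

// Get returns the value stored under key and whether the key exists.
func (s *Store) Get(key string) (int, bool) {
	s.mu.RLock() // many readers may hold the read lock at once
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

func main() {
	store := NewStore()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			store.Add("hits", 1)
		}()
	}
	wg.Wait()

	v, _ := store.Get("hits")
	fmt.Println(v) // prints 100
}
```

Because the locking lives inside the methods, the synchronisation stays in one place and callers can use the store from any number of goroutines.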

Summary

In this article we talked about concurrency and went through the Shared Memory and Communicating Sequential Processes models. Both models solve the same problem, but each has its pros and cons. Golang supports both of them, but CSP is the recommended default, since it increases clarity and allows you to focus on the business logic of your application.