What Is Multithreading? Multitasking for Machines

In our increasingly digital world, people expect their software to be fast, responsive, and available 24/7. Squeezing as much performance as possible from the hardware that supports our apps has never been more important for programmers. In this article, we’ll discuss what multithreading is and how it can make your apps faster and more responsive.
What is multithreading?
Multithreading is the ability of a Central Processing Unit (CPU), working together with the operating system, to run multiple threads of execution from a single process concurrently. If that definition sounds like a mouthful, have no fear. In the following sections we will break down the basic concepts needed to understand multithreading and its larger place in the world of programming. If you’re a veteran who is just looking for a quick refresher on multithreading and its uses, feel free to skip ahead to the later sections.
What is a process?
When you launch an instance of a program on any computing device—be it a computer, tablet, phone, or even wearable tech—that program is called a process. Each process receives a dedicated address space in computer memory for execution and storage.
Each process...
- Is isolated from other processes and doesn’t normally share information with them
- Has to be launched separately (i.e., processes require separate system calls to execute)
- Has its own dedicated stack, heap memory, and data map, which makes it heavyweight
Processes can communicate with each other through something called Inter-Process Communication (IPC), but at the cost of having to make multiple system calls.
What is a thread?
Processes are memory-intensive. To help programs run faster and use memory more efficiently, programmers break large programs into smaller tasks that run on execution threads.
An execution thread is the smallest sequence of programmed instructions that can be independently managed by a scheduler, the abstraction responsible for determining when, where, and in what order threads are allowed to execute.
Each thread…
- Shares memory and data with other threads running within the same parent process
- Can communicate with other threads with few to no system calls
- Is lightweight, consuming less time and fewer resources for creation, execution, and context switching
Execution threads are abstract data structures, also called execution contexts, that contain all the information needed to perform a specific task. Shared memory and shared resources make spinning up a new thread easier than spinning up a new process.
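To make the shared-memory point concrete, here is a minimal Python sketch. The names results and worker are hypothetical and chosen only for illustration. Four threads in one process write into the same dictionary, and the main thread reads the values back without any inter-process communication.

```python
import threading

# Hypothetical shared state: every thread in this process sees the same dict.
results = {}

def worker(name: str, value: int) -> None:
    # Each thread writes only its own key, so this sketch does not use a lock.
    results[name] = value * value

threads = [
    threading.Thread(target=worker, args=(f"thread-{i}", i))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every thread to finish before reading the results

print(sorted(results.items()))
```

Once threads start updating the same value rather than separate keys, they need synchronization, which is where much of the difficulty discussed later comes from.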
Synchronous vs. asynchronous programming
Here is a popular interview question for developers: Can you explain the difference between synchronous and asynchronous programming models?
- Synchronous programming (Sync): In this execution model you write your code as a series of tasks that must be executed step-by-step. Your program only moves from one step to the next once the previous step has completed in its entirety.
- Asynchronous programming (Async): In this execution model you write your code as a group of tasks that can be started without waiting for earlier ones to finish. The tasks may execute at roughly the same time, but you leave it up to the operating system or scheduler to decide when each one runs.
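The two models can be sketched side by side. The example below is only an illustration using Python's asyncio module, where the event loop plays the role of the scheduler. The fetch function and its delays are made up, and asyncio.sleep stands in for a slow operation such as a network call.

```python
import asyncio

async def fetch(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a slow operation such as a network call.
    await asyncio.sleep(delay)
    return f"{name} done"

async def run_sync_style() -> list:
    # Step by step: "a" must finish completely before "b" is even started.
    first = await fetch("a", 0.02)
    second = await fetch("b", 0.01)
    return [first, second]

async def run_async_style() -> list:
    # Both tasks are started together; the event loop decides how they interleave.
    return list(await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01)))

sync_results = asyncio.run(run_sync_style())
async_results = asyncio.run(run_async_style())
print(sync_results, async_results)
```

Both versions return the same values. The difference is that the second one does not force "b" to wait for "a".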
Concurrent vs. parallel programming
You may have noticed we’ve been using the word concurrent to describe asynchronous programming. While in the English language the words concurrent, parallel, and asynchronous are more or less synonymous, these terms take on a more specific meaning in the programming world.
- Concurrency: To run a group of tasks concurrently means you don’t have to wait for one task to complete before starting another. They may execute simultaneously, take turns progressing, or some combination of those two states. To the end user they may as well have occurred at the same time.
- Parallelism: A specific type of concurrency where tasks are truly executed simultaneously. This is only possible on hardware with more than one core or processor.
Now that we know the differences between threads and processes, sync and async, concurrency and parallelism, we are finally ready to talk about multiprocessing vs. multithreading.
Multiprocessing vs. multithreading
Both multiprocessing and multithreading are performance optimization techniques for speeding up and improving the responsiveness of applications through multitasking. The difference lies in the granularity:
- Multiprocessing refers to the use of multiple cores to increase the raw computing power available for running applications. The speed boost comes from using multiple cores to run multiple processes concurrently.
- Multithreading refers to boosting the performance of a single process by splitting its tasks across multiple execution threads that can run concurrently.
These techniques are not mutually exclusive and you can use both to improve the performance of your apps. In fact, a multiprocessing system can run threads of a single process across multiple cores.
How does multithreading work?
Multiple threads can be executed concurrently (within a single core) or in parallel (across multiple cores), depending on the hardware available and the needs of the developer. To illustrate how multithreading works, we will accompany each section with a cake-baking analogy.
Concurrently within a single core
In this model of program execution, a process is split into multiple threads with the intention that different parts of the computation will execute concurrently. That means the program will make progress on more than one task over the same period of time, but not necessarily simultaneously. Often the threads are managed by a thread pool, which keeps a set of threads waiting for tasks to be allocated to them by the supervising program. The single core constantly switches between tasks, creating the illusion of parallelism. This is faster than running the tasks sequentially whenever a task spends part of its time waiting, for example on I/O.
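As a rough illustration of the thread pool idea, the sketch below uses Python's concurrent.futures.ThreadPoolExecutor. The task function and the pool size of three are arbitrary choices for the example, and time.sleep stands in for time spent waiting on I/O.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def task(n: int) -> int:
    time.sleep(0.01)  # stands in for waiting on I/O
    return n * 2

# A pool of three worker threads waits for tasks to be handed to it.
with ThreadPoolExecutor(max_workers=3) as pool:
    # map() submits one task per input and yields results in input order.
    doubled = list(pool.map(task, range(6)))

print(doubled)
```

While one worker is asleep inside task, the others can pick up the remaining inputs, so the six tasks do not have to run back to back.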
Let’s illustrate concurrency on a single core system with the following analogy:
You are a baker (processor) who has to fill an order (process) for 3 cakes. The process of baking 3 cakes can be broken up into 3 jobs (threads).
Executing a process without multithreading using one core
You bake each cake sequentially, one after the other.
[Figure: sequential execution]
The total time to fulfill the order to bake three cakes sequentially is six hours.
Executing a process with multithreading on a single core
You take advantage of the fact that you can start working on the next cake while one is still baking in the oven.
The total time to bake three cakes was reduced to four hours by taking advantage of the idle oven time.
[Figure: concurrent, non-parallel execution]
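The arithmetic behind the six-hour and four-hour figures can be reproduced with a small simulation. This is a sketch under stated assumptions, not a benchmark: it assumes each cake needs one hour of hands-on preparation and one hour in the oven (the article gives only the totals), and it shrinks one hour to 0.05 seconds. A lock models the single baker, who can prepare only one cake at a time.

```python
import threading
import time

HOUR = 0.05      # assumption: one simulated hour lasts 0.05 s of real time
PREP_HOURS = 1   # assumption: hands-on time per cake
BAKE_HOURS = 1   # assumption: oven time per cake

baker = threading.Lock()  # the single baker (core) preps one cake at a time

def make_cake() -> None:
    with baker:
        time.sleep(PREP_HOURS * HOUR)  # the baker is busy with this cake
    time.sleep(BAKE_HOURS * HOUR)      # the cake bakes; the baker is free

def bake_sequentially(count: int) -> float:
    start = time.perf_counter()
    for _ in range(count):
        make_cake()  # finish each cake before starting the next
    return (time.perf_counter() - start) / HOUR

def bake_concurrently(count: int) -> float:
    start = time.perf_counter()
    threads = [threading.Thread(target=make_cake) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return (time.perf_counter() - start) / HOUR

sequential_hours = bake_sequentially(3)
concurrent_hours = bake_concurrently(3)
print(f"sequential: about {sequential_hours:.1f} simulated hours")
print(f"concurrent: about {concurrent_hours:.1f} simulated hours")
```

On an idle machine the two numbers should land close to 6 and 4. The saving comes entirely from overlapping one cake's oven time with the next cake's preparation.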
Parallel execution across multiple cores
Alternatively, we can choose to use multiple cores to run individual tasks simultaneously. Assigning each thread to a different core allows those tasks to be performed with true parallelism: they run at precisely the same time, with no context switching between them required. While this will make your program execute faster, you also have to deal with the added complexity of synchronizing threads that run on different cores. Parallel multithreading is significantly more challenging to implement.
Multithreading with three cores
You and two friends, each with your own kitchen, each bake one cake at the same time.
The total time to bake three cakes was reduced to two hours.
[Figure: concurrent, parallel execution]
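The three-kitchen case can be modeled the same way. In the sketch below, each worker thread stands for one baker with a private kitchen, and a whole cake takes two simulated hours, as the totals in the analogy imply. The 0.05-second hour and the function name bake_one_cake are assumptions made for the example. Note that this only models the schedule with sleeping threads: in standard CPython builds, a global interpreter lock keeps CPU-bound threads from running Python code on several cores at once, so real CPU-heavy work would typically use separate processes or a language whose threads run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
import time

HOUR = 0.05     # assumption: one simulated hour lasts 0.05 s of real time
CAKE_HOURS = 2  # one cake takes two hours from start to finish

def bake_one_cake(kitchen: int) -> int:
    # Nothing is shared between kitchens, so no lock is needed here.
    time.sleep(CAKE_HOURS * HOUR)
    return kitchen

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as kitchens:
    finished = list(kitchens.map(bake_one_cake, range(3)))
parallel_hours = (time.perf_counter() - start) / HOUR

print(f"kitchens finished: {finished}")
print(f"parallel: about {parallel_hours:.1f} simulated hours")
```

Because the three workers never wait on each other, the total stays near the time for a single cake.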
This analogy is grossly simplified. In a typical multithreaded app running across three cores, the tasks being performed by execution threads don’t have to be identical. Some threads may run simultaneously while others might alternate in an interleaved pattern. It all depends on the dependencies of the program and what you are trying to do.
[Figure: concurrent, interleaved and parallel execution]
Why use multithreading?
Whether it's concurrency or parallelism, the purpose of using multithreading is to increase the throughput and performance of an application. When processors were first introduced, a single process would have to execute all its tasks sequentially. Multithreading was introduced to allow some tasks to progress while another is still completing. Context switching within a single-core environment allowed programmers to give the end user the illusion of multitasking: the graphical user interface (GUI) could keep responding while another program ran in the background.
Advantages of multithreading
Multithreading comes with a number of advantages, including:
- Better CPU efficiency
- Improved system reliability
- Faster processing speeds
- Shorter response times
These advantages make multithreading well suited to I/O-heavy operations and to speeding up machine learning workloads.
Disadvantages of multithreading
The disadvantages of multithreading are directly related to the challenges of implementing this model of programming execution.
Expect greater difficulty:
- Writing and testing code: a multithreaded application takes more time to develop because you must account for multiple threads of execution
- Managing memory and concurrency between threads: you’ll need careful synchronization to avoid deadlock, a condition where two or more threads are blocked forever because each is waiting for a resource that another one holds
- Ensuring code portability: the more closely your multithreaded app is optimized for the underlying hardware, the harder it is to port to new devices
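One common way to avoid the deadlock described above is to make every thread acquire locks in the same order. The sketch below is an illustration only. The Account class, the transfer function, and the amounts are hypothetical, and lock ordering is just one of several possible techniques. Two threads move money in opposite directions between the same two accounts, a pattern that could deadlock if each thread locked its own source account first.

```python
import threading

class Account:
    def __init__(self, account_id: int, balance: int) -> None:
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src: Account, dst: Account, amount: int) -> None:
    # Always lock the account with the lower id first. Because every thread
    # uses the same order, no two threads can each hold one lock while
    # waiting for the other.
    first, second = sorted((src, dst), key=lambda acct: acct.account_id)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount

def many_transfers(src: Account, dst: Account, amount: int) -> None:
    for _ in range(100):
        transfer(src, dst, amount)

a = Account(1, 1000)
b = Account(2, 1000)

# The two threads transfer in opposite directions at the same time.
t1 = threading.Thread(target=many_transfers, args=(a, b, 1))
t2 = threading.Thread(target=many_transfers, args=(b, a, 2))
t1.start()
t2.start()
t1.join()
t2.join()

print(a.balance, b.balance)  # 1100 900
```

The locks also keep each balance update atomic, so the combined total across both accounts stays at 2000 no matter how the threads interleave.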
Which programming languages support multithreading?
Much of the confusion around which languages support multithreading has to do with whether the program requires truly parallel multithreaded programming or simpler single-core concurrency.
Many languages not only support conventional concurrent multithreading within a single core, but also offer built-in support or specialized libraries for truly parallel multithreading on multi-core systems. These include:
- C/C++
- Java
- Haskell
- Clojure
- Go
- Rust
- C#
Conclusion
To summarize, multithreading allows programmers to split processes into smaller subtasks called threads that can be executed concurrently. These threads may be run asynchronously, concurrently, or in parallel across one or more processors to improve the performance of the application. The ability to run tasks concurrently also makes multithreaded applications and APIs more responsive to the end user.