Pipe: an Inter-Process Communication Method
- Mia Combeau
- C
- October 31, 2022
By default, it is difficult to get two processes to communicate with each other. As we’ve seen in a previous article, even parent and child processes don’t share the same memory space. So we need to find ways to establish inter-process communication. One of these communication mechanisms is the pipe.
What is a Pipe?
A pipe is a memory buffer managed by the kernel, meant to facilitate communication between processes. It is a unidirectional channel with a read end and a write end: one process writes to the write end, and the data is stored in the buffer until another process reads it from the read end.
A pipe is a sort of file, stored outside of the file system, that has no name or any other particular attribute. But we can handle it like a file thanks to its two file descriptors. We’ve had the opportunity to discover the concept in a previous article about file descriptors. In a nutshell, a file descriptor (fd for short) is a non-negative integer, an index into a per-process table that refers to the files the process has open. So when we create a pipe, we get two file descriptors pointing to it, one opened in read-only mode and the other in write-only mode.
Let’s keep in mind that there is a limit to a pipe’s size, which varies depending on the operating system (on Linux, the default capacity is 65,536 bytes). When this limit is reached, a process will no longer be able to write to the pipe until another process reads enough data from it.
Creating a Pipe
We can create a pipe with the aptly-named pipe system call. Here is its prototype in the <unistd.h> library:
int pipe(int pipefd[2]);
As its only parameter, pipe takes an array of two integers where the two file descriptors should be stored. Of course, these file descriptors represent the pipe’s two ends:
- pipefd[0]: the read end
- pipefd[1]: the write end
The pipe system call will open the pipe’s file descriptors and then store them in the provided array.
On success, pipe returns 0. However, on failure, it returns -1 and describes the encountered error in errno, without filling the provided array.
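The call and its error check can be wrapped in a few lines. This is only a minimal sketch: open_pipe is a hypothetical helper name, not something from <unistd.h>.

```c
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: creates a pipe and reports any failure.
// Returns 0 on success, -1 on failure (errno is set by pipe).
int open_pipe(int pipefd[2])
{
    if (pipe(pipefd) == -1)
    {
        perror("pipe");
        return -1;
    }
    // From here on, pipefd[0] is the read end and pipefd[1] the write end.
    return 0;
}
```

The caller remains responsible for closing both descriptors once it is done with them.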
In order to establish inter-process communication between a parent and child process, we first have to create a pipe. Then, when we create the child, it will have a duplicate of the pipe’s descriptors, since a child process is a clone of its parent. This way, the child will be able to read from pipefd[0] information written by the parent in pipefd[1], or the other way around. In practice, each pipe should carry data in one direction only, so two-way communication is best served by two pipes. Of course, we could also allow two child processes to communicate with each other in this way.
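As an illustration of the child-to-parent direction, here is a small sketch in which the child writes a message and the parent reads it. The function name receive_from_child and its parameters are made up for this example.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that writes msg into a pipe, then
// reads it back in the parent. Stores a NUL-terminated copy in out and
// returns the number of bytes read, or -1 on error.
ssize_t receive_from_child(const char *msg, char *out, size_t outsize)
{
    int     pipefd[2];
    pid_t   pid;
    ssize_t n = 0;
    size_t  total = 0;

    if (outsize == 0 || pipe(pipefd) == -1)
        return -1;
    pid = fork();
    if (pid == -1)
    {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (pid == 0) // Child: writer
    {
        close(pipefd[0]); // The child never reads
        if (write(pipefd[1], msg, strlen(msg)) == -1)
            _exit(1);
        close(pipefd[1]);
        _exit(0);
    }
    // Parent: reader
    close(pipefd[1]); // The parent never writes
    while (total < outsize - 1
        && (n = read(pipefd[0], out + total, outsize - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(pipefd[0]);
    waitpid(pid, NULL, 0);
    return (n == -1) ? -1 : (ssize_t)total;
}
```

Each process closes the end it does not use before doing anything else, which is what lets the parent’s read loop terminate once the child has finished writing.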
Reading and Writing in a Pipe
A pipe’s file descriptors aren’t very different from other regular file descriptors. To input or retrieve data from one, we can use the read and write system calls from the <unistd.h> library.
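One detail worth knowing when using these calls: write may transfer fewer bytes than requested, so robust code usually loops until everything has been sent. The sketch below shows one way to do it; write_all is a hypothetical helper, not a standard function.

```c
#include <errno.h>
#include <unistd.h>

// Hypothetical helper: keeps calling write until all len bytes of buf
// have been written to fd. Returns 0 on success, -1 on error.
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    ssize_t     n;

    while (len > 0)
    {
        n = write(fd, p, len);
        if (n == -1)
        {
            if (errno == EINTR) // Interrupted by a signal: try again
                continue;
            return -1;
        }
        p += n;           // Skip the bytes already written
        len -= (size_t)n; // and only send what is left
    }
    return 0;
}
```

It works on any file descriptor, including a pipe’s write end, for example write_all(pipefd[1], "secret", 6).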
However, there are two points to keep in mind:
- If a process attempts to read from an empty pipe, read will remain blocked until data is written to it.
- Inversely, if a process tries to write to a full pipe (one that has reached its size limit), write will remain blocked until enough data has been read to allow the write operation to complete.
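The full-pipe case can be observed without freezing the program by switching the write end to non-blocking mode with fcntl and O_NONBLOCK: write then fails with EAGAIN instead of blocking. The following sketch uses this to estimate how much data a pipe accepts before it is full. The helper name pipe_capacity is an assumption of this example, and the value it reports depends on the system.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: fills a fresh pipe in non-blocking mode and
// returns how many bytes it accepted before being full, or -1 on error.
long pipe_capacity(void)
{
    int     pipefd[2];
    char    chunk[1024] = {0};
    long    total = 0;
    ssize_t n;
    int     flags;

    if (pipe(pipefd) == -1)
        return -1;
    // Make write fail with EAGAIN instead of blocking on a full pipe
    flags = fcntl(pipefd[1], F_GETFL);
    if (flags == -1 || fcntl(pipefd[1], F_SETFL, flags | O_NONBLOCK) == -1)
    {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    // Nobody reads, so the pipe fills up
    while ((n = write(pipefd[1], chunk, sizeof(chunk))) > 0)
        total += n;
    if (n == -1 && errno != EAGAIN) // A real error, not a full pipe
        total = -1;
    close(pipefd[0]);
    close(pipefd[1]);
    return total;
}
```

Without O_NONBLOCK, the last write in this loop would block forever, since no process ever reads from the pipe.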
Closing a Pipe
The read and write ends of a pipe can be closed with the <unistd.h> library’s close system call, just like any other file descriptor. However, there are a few aspects to beware of when closing its descriptors.
When all of the file descriptors referring to the write end of a pipe are closed, a process attempting to read from the read end will receive EOF (end of file) and the read function will return 0.
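Data already in the pipe is not lost when the write end is closed: the reader first receives the buffered bytes, and only then EOF. Here is a minimal single-process sketch of this behavior, with a hypothetical helper named drain_until_eof.

```c
#include <unistd.h>

// Hypothetical helper: writes 3 bytes, closes the write end, then counts
// the bytes read before read returns 0 (EOF). Returns -1 on error.
long drain_until_eof(void)
{
    int     pipefd[2];
    char    c;
    long    count = 0;
    ssize_t n;

    if (pipe(pipefd) == -1)
        return -1;
    if (write(pipefd[1], "abc", 3) != 3)
        count = -1;
    close(pipefd[1]); // No write end remains open anywhere
    // read delivers the buffered bytes, then returns 0 instead of blocking
    while (count >= 0 && (n = read(pipefd[0], &c, 1)) != 0)
    {
        if (n == -1)
            count = -1;
        else
            count++;
    }
    close(pipefd[0]);
    return count;
}
```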
Inversely, if all of the file descriptors referring to the read end of a pipe are closed and a process attempts to write to it, the kernel sends the SIGPIPE signal to the writing process. If that signal is ignored, write fails instead and sets errno to EPIPE.
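By default, SIGPIPE terminates the process that receives it, so observing EPIPE requires ignoring the signal first. The sketch below does so with signal and SIG_IGN; the function name errno_when_no_reader is invented for the example.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

// Hypothetical helper: writes to a pipe whose read end is closed while
// SIGPIPE is ignored. Returns the errno set by write (0 if write
// unexpectedly succeeded), or -1 if the pipe could not be created.
int errno_when_no_reader(void)
{
    int  pipefd[2];
    int  err = 0;
    void (*old_handler)(int);

    if (pipe(pipefd) == -1)
        return -1;
    // Ignore SIGPIPE so that write fails instead of killing the process
    old_handler = signal(SIGPIPE, SIG_IGN);
    close(pipefd[0]); // No read end remains open anywhere
    if (write(pipefd[1], "x", 1) == -1)
        err = errno;
    close(pipefd[1]);
    // Restore the previous SIGPIPE behavior
    if (old_handler != SIG_ERR)
        signal(SIGPIPE, old_handler);
    return err;
}
```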
To ensure that the processes correctly receive the termination indicators (EOF, SIGPIPE/EPIPE), it is essential to close all unused duplicate file descriptors. Otherwise, we risk processes getting stuck in a suspended state.
Pipe Example
So let’s try to communicate a secret from the parent to the child process. In this C program, we will create a pipe and then fork a child process. The child will therefore inherit a duplicate pair of file descriptors referring to the same pipe. The parent and child processes will then close the file descriptors they won’t be using. Then, the parent will write a secret into the pipe while the child attempts to read it one byte at a time, copying it to the standard output.
#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
// Utility function for write
void writestr(int fd, const char *str)
{
    write(fd, str, strlen(str));
}

// Main
int main(void)
{
    int     pipefd[2]; // Stores the pipe's fds:
                       // - pipefd[0]: read only
                       // - pipefd[1]: write only
    pid_t   pid;       // Stores fork's return value
    char    buf;       // Stores characters read by read

    // Create a pipe. Stop everything on failure.
    if (pipe(pipefd) == -1)
    {
        perror("pipe");
        exit(EXIT_FAILURE);
    }
    // Create a child process
    pid = fork();
    if (pid == -1) // Failure, stop everything
    {
        perror("fork");
        exit(EXIT_FAILURE);
    }
    else if (pid == 0) // Child process
    {
        // Close the unused write end
        close(pipefd[1]);
        writestr(STDOUT_FILENO, "Child: What is the secret in this pipe?\n");
        writestr(STDOUT_FILENO, "Child: \"");
        // Read characters from the pipe one by one
        while (read(pipefd[0], &buf, 1) > 0)
        {
            // Write the read character to standard output
            write(STDOUT_FILENO, &buf, 1);
        }
        writestr(STDOUT_FILENO, "\"\n");
        writestr(STDOUT_FILENO, "Child: Wow! I must go see my father.\n");
        // Close the read end of the pipe
        close(pipefd[0]);
        exit(EXIT_SUCCESS);
    }
    else // Parent process
    {
        // Close unused read end
        close(pipefd[0]);
        writestr(STDOUT_FILENO, "Parent: I'm writing a secret in this pipe...\n");
        // Write into the pipe (\033 is the standard escape for ANSI colors)
        writestr(pipefd[1], "\033[33mI am your father mwahahaha!\033[0m");
        // Close write end of the pipe (reader will see EOF)
        close(pipefd[1]);
        // Wait for child
        wait(NULL);
        writestr(STDOUT_FILENO, "Parent: Hello child!\n");
        exit(EXIT_SUCCESS);
    }
}
There! We’ve managed to establish inter-process communication!
However, if we forget to close the unused ends of the pipe in each of the processes, meaning if we delete the close(pipefd[1]) call in the child and the close(pipefd[0]) call in the parent, the program never finishes: the child prints the secret and then hangs.
The child process stays blocked because read did not receive EOF (end of file), even though the parent did close its write end file descriptor. The child itself still holds an open descriptor for the write end, since it did not close it before attempting to read from the pipe. Hence the importance of making sure we’ve closed all of our unused file descriptors in each one of the processes.
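This effect can be reproduced without actually hanging. In the sketch below, dup stands in for the copy of the write end that a forked child inherits, and the read end is set to non-blocking mode so that read returns EAGAIN ("no data yet") where it would otherwise block. The function name sees_eof and its parameter are assumptions made for the illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: returns 1 if read reports EOF, 0 if it reports
// "no data yet" (EAGAIN), and -1 on any other outcome.
// close_duplicate tells whether the extra write-end descriptor is
// closed before reading.
int sees_eof(int close_duplicate)
{
    int     pipefd[2];
    int     dup_fd;
    int     flags;
    int     result;
    char    c;
    ssize_t n;

    if (pipe(pipefd) == -1)
        return -1;
    dup_fd = dup(pipefd[1]); // Second descriptor for the write end
    flags = fcntl(pipefd[0], F_GETFL);
    if (dup_fd == -1 || flags == -1
        || fcntl(pipefd[0], F_SETFL, flags | O_NONBLOCK) == -1)
    {
        if (dup_fd != -1)
            close(dup_fd);
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    close(pipefd[1]); // The "parent" closes its write end
    if (close_duplicate)
        close(dup_fd); // The "child" closes its copy too
    n = read(pipefd[0], &c, 1);
    if (n == 0)
        result = 1; // EOF: every write end is closed
    else if (n == -1 && errno == EAGAIN)
        result = 0; // A write end is still open: a blocking read would hang
    else
        result = -1;
    if (!close_duplicate)
        close(dup_fd);
    close(pipefd[0]);
    return result;
}
```

Calling sees_eof(1) corresponds to the correct program, while sees_eof(0) corresponds to the version where one descriptor for the write end was forgotten.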
Reproducing the Shell’s Pipe “|” Operator
Shells like Bash also use pipes to handle commands with "|" operators.
For example, let’s say we have a test.txt file, and we want to know how many lines it contains. The cat test.txt command will display the contents of the file. If we add the wc -l command (which counts the number of lines) with the "|" operator, we will of course display the number of lines in our file.
The first thing we might notice is that when we do cat test.txt | wc -l, the contents of the file don’t appear at all. So what is this "|" operator doing here exactly?
The shell creates a pipe and two child processes, one for the cat command and one for wc. Then, it redirects cat’s standard output towards wc’s standard input. Therefore, cat’s standard output no longer points to our terminal but to the write end of the pipe, and wc’s standard input points to the pipe’s read end rather than to the keyboard. Neither command knows the difference: cat writes to its standard output and wc reads from its standard input as usual, and the data travels through the pipe.
To reproduce this behavior, we could duplicate the write end of the pipe over the standard output in the first child, and the read end over the standard input of the second child. We’ve previously learnt about the dup2 function that would allow us to do this in the article about file descriptors.
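Put together, the approach could look like the sketch below. It is not how any particular shell is implemented, only an illustration with dup2 and execvp. The function name pipe_two_commands and the choice to return the second command’s exit status are assumptions of this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Child side: plug fd over target (stdin or stdout), close both pipe
// ends, then replace the process with the command.
static void exec_with_redirect(int fd, int target, int pipefd[2],
                               char *const argv[])
{
    if (dup2(fd, target) == -1)
        _exit(126);
    close(pipefd[0]);
    close(pipefd[1]);
    execvp(argv[0], argv);
    _exit(127); // Only reached if execvp failed
}

// Hypothetical helper: runs "left | right" and returns the exit status
// of right, or -1 on error.
int pipe_two_commands(char *const left[], char *const right[])
{
    int   pipefd[2];
    pid_t pid_left;
    pid_t pid_right;
    int   status = 0;

    if (pipe(pipefd) == -1)
        return -1;
    pid_left = fork();
    if (pid_left == 0) // First child: stdout goes to the write end
        exec_with_redirect(pipefd[1], STDOUT_FILENO, pipefd, left);
    pid_right = -1;
    if (pid_left > 0)
        pid_right = fork();
    if (pid_right == 0) // Second child: stdin comes from the read end
        exec_with_redirect(pipefd[0], STDIN_FILENO, pipefd, right);
    // Parent: it uses neither end, so it must close both
    close(pipefd[0]);
    close(pipefd[1]);
    if (pid_left > 0)
        waitpid(pid_left, NULL, 0);
    if (pid_right == -1 || waitpid(pid_right, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

To mimic cat test.txt | wc -l, left would be the array {"cat", "test.txt", NULL} and right would be {"wc", "-l", NULL}. Note that the parent closes both ends as well: if it kept the write end open, wc would never receive EOF.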
Creating a Pipeline Like a Shell
Of course, a shell can string multiple commands together with the "|" operator. For example, we can do commands like man bash | head -n 50 | grep shell | grep bash | wc -l. This is called a pipeline.
If we tried to use a single pipe for all of the child processes’ inputs and outputs in our endeavor to replicate this kind of pipeline, we’d quickly encounter big issues. Since the child processes are executed simultaneously, they will start fighting to read from and write to a single pipe. And one will inevitably end up waiting for input that will never arrive.
To build a pipeline, then, we need one pipe (a pair of file descriptors) fewer than the number of child processes: n commands are connected by n - 1 pipes. That way, the first child can write to its own pipe, the second can read from the first one’s pipe and write to its own, and so on, until the last child, which only reads.
And above all, we can’t forget to close all of the pipes’ unused file descriptors in each child process!
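One possible way to organize this is a loop that creates each pipe just before forking the command that writes to it, and remembers the read end for the next command. The sketch below follows that idea. The function name run_pipeline and its interface are assumptions, and for simplicity it reaps children with wait, which assumes the calling process has no other children.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs cmds[0] | cmds[1] | ... | cmds[n - 1].
// Returns the exit status of the last command, or -1 on error.
int run_pipeline(char *const *cmds[], int n)
{
    int   pipefd[2];
    int   prev_read = -1; // Read end of the previous pipe, if any
    pid_t pid;
    pid_t last_pid = -1;
    int   status;
    int   last_status = -1;
    int   i;

    for (i = 0; i < n; i++)
    {
        int has_next = (i < n - 1);

        if (has_next && pipe(pipefd) == -1)
            break;
        pid = fork();
        if (pid == -1)
        {
            if (has_next)
            {
                close(pipefd[0]);
                close(pipefd[1]);
            }
            break;
        }
        if (pid == 0) // Child number i
        {
            if (prev_read != -1) // Not the first: read from previous pipe
            {
                dup2(prev_read, STDIN_FILENO);
                close(prev_read);
            }
            if (has_next) // Not the last: write to the new pipe
            {
                close(pipefd[0]);
                dup2(pipefd[1], STDOUT_FILENO);
                close(pipefd[1]);
            }
            execvp(cmds[i][0], cmds[i]);
            _exit(127); // Only reached if execvp failed
        }
        // Parent: close what it no longer needs, keep the new read end
        if (prev_read != -1)
            close(prev_read);
        prev_read = -1;
        if (has_next)
        {
            close(pipefd[1]);
            prev_read = pipefd[0];
        }
        else
            last_pid = pid;
    }
    if (prev_read != -1) // Left open only if the loop stopped early
        close(prev_read);
    // Reap every child and remember the status of the last command
    while ((pid = wait(&status)) > 0)
    {
        if (pid == last_pid && WIFEXITED(status))
            last_status = WEXITSTATUS(status);
    }
    return last_status;
}
```

Because the parent closes each pipe end as soon as the corresponding child has been created, at most one pipe is open in the parent at any time, and no child inherits descriptors from pipes it is not connected to.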
A little tip to share, a nagging question to ask, or a strange discovery to discuss about pipes or pipelines? I’d love to read and respond to it all in the comments. Happy coding!