Pipelines in Computing and Software Engineering - Conceptual Article
This article describes pipelines in computing and software engineering. It starts with the fundamentals of the pipeline concept in computing, moves on to the basic instruction pipeline, and then covers pipelines in Unix and the Streams API in Java 8.
Definition of Pipelining in Computing
Pipelining is a segmentation of a computational process into several sub-processes which are executed by dedicated autonomous units. In other words, pipelining can be defined as the technique of decomposing a repeated sequential process into sub-processes, each of which can be executed efficiently on a special dedicated autonomous module that operates concurrently with the others.
What the definition means
Put simply, pipelines are a way to execute a series of computing instructions efficiently by dividing them into subsets of instructions. These subsets are executed in parallel, making effective use of the computing resources at hand, so the computing system as a whole achieves greater efficiency.
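The idea of dedicated units working concurrently on their own part of the job can be sketched in Java. This is a minimal illustration under stated assumptions, not a reference implementation: the class name StagePipeline, the three arithmetic stages and the END marker are all hypothetical, chosen only to keep the example small. Each stage runs on its own thread and hands its results to the next stage through a queue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.IntUnaryOperator;

// Hypothetical sketch: a three-stage pipeline in which every stage is its own thread.
class StagePipeline {
    // Marker for "no more input" (assumption: real values are never Integer.MIN_VALUE).
    static final int END = Integer.MIN_VALUE;

    // Starts one stage: take an item from `in`, apply `work`, pass the result to `out`.
    static void startStage(BlockingQueue<Integer> in, BlockingQueue<Integer> out,
                           IntUnaryOperator work) {
        new Thread(() -> {
            try {
                while (true) {
                    int item = in.take();
                    if (item == END) {
                        out.put(END); // forward the marker so later stages stop too
                        return;
                    }
                    out.put(work.applyAsInt(item));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }).start();
    }

    // Pushes every input through the three stages and collects the results in order.
    static List<Integer> run(List<Integer> inputs) {
        BlockingQueue<Integer> q0 = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> q1 = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> q2 = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> q3 = new LinkedBlockingQueue<>();

        startStage(q0, q1, x -> x + 1); // stage 1
        startStage(q1, q2, x -> x * 2); // stage 2
        startStage(q2, q3, x -> x - 3); // stage 3

        List<Integer> results = new ArrayList<>();
        try {
            for (int x : inputs) {
                q0.put(x);
            }
            q0.put(END);
            while (true) {
                int r = q3.take();
                if (r == END) {
                    break;
                }
                results.add(r);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running the pipeline", e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(1, 2, 3))); // prints [1, 3, 5]
    }
}
```

While stage 3 is finishing the first item, stage 2 can already be working on the second and stage 1 on the third, which is where the gain over running all three steps one item at a time comes from.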
[Diagram: a complex computation C divided into sub-computations C1, C2 and C3]
Explanation of the diagram
- A complex computation C is broken down into sub-computations (C1, C2 and C3). These sub-computations then work in parallel, and their outputs are assimilated at the end to get the final result.
- C1, C2 and C3, the sub-computations, are the stages of the pipeline.
- The division and subsequent assimilation is efficient because the sub-computations are done in parallel, making the best use of available resources.
Instruction Pipeline
The concept of pipelines in computing can be directly applied to optimizing the execution of instructions in a processor. A set of instructions that are to be executed in sequential order can be divided into subsets of instructions that are executed in parallel. Even overlapping instructions only near their starts and ends is more efficient than executing them strictly one after another.
Pipelines in Unix
Pipelines in Unix are well known. Multiple programs are chained together using the vertical bar, as shown below -
program1 | program2 | program3
The programs (program1, program2, ...) in the Unix command snippet above each achieve an individual objective. They are chained together as a pipeline in such a way that the output of program1 is the input of program2, and the output of program2 is the input of program3, provided they all use the standard streams for I/O.
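Chaining programs so that each one's output becomes the next one's input can be imitated inside a single Java program. The sketch below is only an analogy, not how a Unix shell implements pipes: the Program interface, the class name PipeSketch and the three small "programs" (upper-casing, filtering and counting lines) are hypothetical stand-ins, and the standard java.io piped streams play the role of the vertical bar.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical sketch: three in-process "programs" connected like program1 | program2 | program3.
class PipeSketch {
    // A "program" only knows its standard input and its standard output.
    interface Program {
        void run(BufferedReader stdin, PrintWriter stdout) throws IOException;
    }

    // Runs `program` on its own thread with `in` as its input and
    // returns the stream from which its output can be read (the "pipe").
    static InputStream pipeInto(InputStream in, Program program) throws IOException {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream nextIn = new PipedInputStream(out);
        new Thread(() -> {
            try (BufferedReader r = new BufferedReader(
                         new InputStreamReader(in, StandardCharsets.UTF_8));
                 PrintWriter w = new PrintWriter(
                         new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
                program.run(r, w);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }).start();
        return nextIn;
    }

    // program1: upper-case every line.
    static void upper(BufferedReader in, PrintWriter out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            out.println(line.toUpperCase(Locale.ROOT));
        }
    }

    // program2: keep only the lines that contain "PIPE".
    static void keepPipeLines(BufferedReader in, PrintWriter out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            if (line.contains("PIPE")) {
                out.println(line);
            }
        }
    }

    // program3: print the number of lines received.
    static void countLines(BufferedReader in, PrintWriter out) throws IOException {
        int n = 0;
        while (in.readLine() != null) {
            n++;
        }
        out.println(n);
    }

    // Feeds `input` through the three programs and returns the final output line.
    static String run(String input) {
        try {
            InputStream source =
                    new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8));
            InputStream end = pipeInto(
                    pipeInto(pipeInto(source, PipeSketch::upper), PipeSketch::keepPipeLines),
                    PipeSketch::countLines);
            BufferedReader result =
                    new BufferedReader(new InputStreamReader(end, StandardCharsets.UTF_8));
            return result.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("unix pipe\njava stream\nnamed pipe\n")); // prints 2
    }
}
```

None of the three programs knows about the others. Each one only reads its input and writes its output, which is the property that lets real Unix programs be combined freely with the vertical bar.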
Unix has built-in mechanisms for exploiting parallelism while executing its pipes: the programs in a pipeline run concurrently, each consuming the previous program's output as it is produced. GNU Parallel is a command-line utility that builds on the concept of pipes. It typically reads a list of inputs from a pipe and runs a job for each of them in parallel.
Pipelines in Java 8 using Streams API
Java 8 introduced pipelining and parallel processing through the concept of Streams. The Streams API has built-in operations which can be chained together and can be invoked in parallel using the Stream.parallel() method. A detailed tutorial on the basics of the Streams API is available in the Java 8 Streams API tutorial.
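As a small illustration of chaining stream operations, the sketch below builds a filter, map and sum pipeline and runs it either sequentially or in parallel. The class and method names (StreamPipelineSketch, sumOfEvenSquares) and the sample numbers are hypothetical; only stream(), parallel(), filter(), mapToInt() and sum() come from the Streams API itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical sketch: one stream pipeline, run sequentially or in parallel.
class StreamPipelineSketch {
    static int sumOfEvenSquares(List<Integer> numbers, boolean parallel) {
        Stream<Integer> stream = numbers.stream();
        if (parallel) {
            stream = stream.parallel();      // same pipeline, parallel execution
        }
        return stream
                .filter(n -> n % 2 == 0)     // keep the even numbers
                .mapToInt(n -> n * n)        // square each of them
                .sum();                      // combine into one result
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(sumOfEvenSquares(numbers, false)); // prints 56
        System.out.println(sumOfEvenSquares(numbers, true));  // prints 56
    }
}
```

The operations here are stateless and the final sum does not depend on the order of evaluation, so the parallel run gives the same answer as the sequential one. Whether it is actually faster depends on the amount of data and the work done per element.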
Summary
We looked at the basic definition of pipelining in computing and what it means, learnt about instruction pipelines, took a brief look at how pipelines are used in Unix, and finally saw that the Streams API adds a pipelining facility to Java 8.