Stream module

Intro

This is the sixth in a series of tutorials on Node.js.

We created a project folder called nodejs-tutorial to hold example code from each tutorial.
The code for the whole series is available on GitHub at github.com/LearnByCheating/nodejs-tutorial

This tutorial covers the Stream module.

Create a new directory in the nodejs-tutorial project named 6-stream.
Then open your Terminal application and cd into the new folder.

Read it, watch it, do it, review it:

This tutorial has both a written version and an accompanying video version. Timestamps under the headings point to the matching part of the video.

Read it: For each topic heading, first read the topic.
Watch it: Then watch the associated section of the video.
Do it: Then follow the instructions to replicate the steps on your computer.

The Node.js CheatSheet has a major category that aligns with this tutorial.

Review it: When you are done with the whole tutorial, open the CheatSheet and review the Stream category from this tutorial. Make sure you understand everything, and can refer back to the CheatSheet when you are working on your own projects in the future.

Read File without streaming

Let's start with the standard way a file is read using the example files below.

Assume you have a text file called smallfile.txt that contains a few lines of text.
If you use the fs.readFile method to read its contents and then log them to the terminal, Node loads the entire file into memory before printing anything.

nodejs-tutorial/6-stream/smallfile.txt

Put some lines of text in here.

nodejs-tutorial/6-stream/readfile.js

const fs = require('fs');

fs.readFile('smallfile.txt', 'utf8', function(err, data) {
  if (err) return console.error(err.message);
  console.log(data);
});

Execute the file with:

node readfile

That is fine for a small file, but what if you have a huge text file, a large image, or a video? Accumulating that much data in memory all at once will hurt performance.

The Stream module

[1:31 Video timestamp]

For large files, the solution is to use the stream module to break the file content up into a stream of data chunks.
A data stream is the transfer of data between two points.
The Stream module provides an API for working with data streams.
I/O-intensive APIs such as fs, http, and process.stdin/stdout build on the Stream module, which gives them access to stream methods. In this tutorial we will be using streams with the fs module.

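Streams in turn rely on two other core modules: every stream is an EventEmitter (from the Events module), and binary chunks are passed around as Buffer objects (from the Buffer module).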

Streams can be readable, writable, or both (duplex).
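The relationships described above can be checked directly. The sketch below is only an illustration, and the property names in the checks object are made up for this example. It inspects prototypes rather than opening any files, so it runs anywhere Node is installed.

```javascript
const { EventEmitter } = require('events');
const fs = require('fs');
const { Readable, Writable, Duplex, PassThrough } = require('stream');

// Prototype checks need no file I/O, so nothing is opened here.
const checks = {
  // File streams created by the fs module are stream objects.
  fileReadStreamIsReadable: fs.ReadStream.prototype instanceof Readable,
  fileWriteStreamIsWritable: fs.WriteStream.prototype instanceof Writable,
  // Every readable stream is also an event emitter.
  readableIsEventEmitter: Readable.prototype instanceof EventEmitter,
  // Standard output is a writable stream.
  stdoutIsWritable: process.stdout instanceof Writable,
  // PassThrough is a simple built-in stream that is both readable and writable.
  passThroughIsDuplex: new PassThrough() instanceof Duplex,
};

console.log(checks); // every value should be true
```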

Read Stream

Let's start with a read stream.

In the nodejs-tutorial project on GitHub there is a 2.7 megabyte file called hugefile.txt filled with pages and pages of Lorem Ipsum filler text.

To read it, create a file called readstream.js and populate it with the below:

nodejs-tutorial/6-stream/readstream.js

const fs = require('fs');

const readStream = fs.createReadStream('hugefile.txt', 'utf8');

let count = 0;
readStream.once('data', () => {
  console.log('Start the data stream');
});
readStream.on('data', (chunk) => {
  ++count;
  console.log(chunk);
});
readStream.on('end', () => {
  console.log('Data stream complete. Chunk count: ', count);
});
readStream.on('error', (err) => {
  console.error(err.message);
});

Then execute the file. Note, VS Code's Terminal panel won't print out such a large file, so use your Terminal app.
node readstream

Our code will read the file, then log its contents to the Terminal. But instead of loading the file all at once before printing it, it will be streamed so it can be loaded and printed in chunks.

Let's walk through the code:

const fs = require('fs');

We are reading from a file, so import the File System module.

const readStream = fs.createReadStream('hugefile.txt', 'utf8');

The format to create a read stream object is: fs.createReadStream(path[, options])

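Passing 'utf8' as the options argument sets the encoding, so each chunk arrives as a string. Without it, each chunk is a Buffer object.
Calling this method creates the stream and opens the file. Chunks start flowing once a data listener is attached.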
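To see the effect of the encoding option, here is a minimal sketch. The sample file and the firstChunk helper are hypothetical and exist only for this demonstration. It reads the same file twice, once with 'utf8' and once without, and reports the type of the chunk received.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical sample file, created in the temp directory for this demo only.
const samplePath = path.join(os.tmpdir(), `encoding-demo-${process.pid}.txt`);
fs.writeFileSync(samplePath, 'Hello stream');

// Resolve with the first chunk that a read stream emits.
function firstChunk(options) {
  return new Promise((resolve, reject) => {
    const stream = fs.createReadStream(samplePath, options);
    stream.once('data', resolve);
    stream.once('error', reject);
  });
}

const chunkTypes = Promise.all([firstChunk('utf8'), firstChunk()]).then(
  ([text, raw]) => ({
    withEncoding: typeof text,             // 'string'
    withoutEncoding: Buffer.isBuffer(raw), // true
  })
);

chunkTypes.then(console.log);
```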

Data Stream Explanation

Streams rely on both the Buffer and Events modules.
The file data is loaded as a stream of bytes which are temporarily accumulated in a buffer object in memory.

Node's core Buffer module manages this part behind the scenes.
The buffer stores the bytes in an array. When the buffer's array reaches its maximum size it passes the data as a chunk to be processed for output (e.g., played as a video).

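For file read streams the default buffer size is 64 KB. In plain ASCII text one byte holds one character, so each chunk holds about 64K characters. Characters outside the ASCII range take more than one byte in UTF-8.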
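The byte and character arithmetic can be sketched as follows. The expectedChunks helper and the 2,650,000 byte file size are assumptions made for illustration; the real hugefile.txt may differ slightly. The chunk estimate assumes every read fills the buffer, which is typical for local files.

```javascript
// Characters outside the ASCII range take more than one byte in UTF-8.
const ascii = 'node';
const accented = 'caf\u00e9'; // "café" with a single accented character

console.log(ascii.length, Buffer.byteLength(ascii, 'utf8'));       // 4 4
console.log(accented.length, Buffer.byteLength(accented, 'utf8')); // 4 5

// Estimated number of data events for a file of a given size.
const DEFAULT_CHUNK_BYTES = 64 * 1024; // 65,536 bytes
function expectedChunks(fileSizeBytes, chunkBytes = DEFAULT_CHUNK_BYTES) {
  return Math.ceil(fileSizeBytes / chunkBytes);
}

// Hypothetical file size close to 2.7 MB.
console.log(expectedChunks(2650000)); // 41
```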

Stream objects inherit from the EventEmitter class.

When a chunk of data is sent from the buffer, a data event is emitted behind the scenes.
A listener function needs to be in place to handle the data chunk on the data event.

For large files, streaming has two advantages.

It uses less memory than if you loaded the whole file at once.
If it is a video file, the user can start watching the video before it is fully loaded.
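The memory advantage can be observed by tracking the largest chunk a stream hands over. The sketch below is an illustration under assumptions: the sample file and the measure helper are hypothetical, and it uses the highWaterMark option of fs.createReadStream to change the buffer size. However large the file is, no single chunk should exceed the buffer size.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical 256,000-byte sample file, created in the temp directory.
const samplePath = path.join(os.tmpdir(), `chunk-demo-${process.pid}.txt`);
fs.writeFileSync(samplePath, 'x'.repeat(256000));

// Count the chunks and track the largest one for a given buffer size.
function measure(highWaterMark) {
  return new Promise((resolve, reject) => {
    const stats = { chunks: 0, largestChunk: 0 };
    // Leaving highWaterMark out keeps the 64 KB default.
    const options = highWaterMark ? { highWaterMark } : {};
    fs.createReadStream(samplePath, options)
      .on('data', (chunk) => {
        stats.chunks += 1;
        stats.largestChunk = Math.max(stats.largestChunk, chunk.length);
      })
      .on('end', () => resolve(stats))
      .on('error', reject);
  });
}

const measurements = Promise.all([measure(), measure(16 * 1024)]);
measurements.then(([defaults, smaller]) => {
  console.log('Default 64 KB buffer:', defaults);
  console.log('16 KB buffer:', smaller);
});
```

A smaller buffer means more data events, each carrying less data.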

ReadStream Events

[2:19, 2:36, and 2:47 Video timestamps]

Let's go back to the code in our readstream.js file.

let count = 0;

Start the count variable at 0. We will be adding 1 to count every time a new chunk is created.

readStream.once('data', () => {
  console.log('Start the data stream');
});

When a data chunk is sent it emits a data event. Use the once method to listen for the data event the first time it occurs, then handle it with a callback function.
The format is:

readStream.once('data', (data) => { /* Handle event once */ });

readStream.on('data', (chunk) => {
  ++count;
  console.log(chunk);
});

To listen for the data event every time, use the on method. The format is:

readStream.on('data', (chunk) => { /* Handle chunk data. */ });

Each time the event occurs, the data chunk is passed as the argument to the callback.
The callback adds one to our count variable.
Then it logs the chunk of data to the console.

readStream.on('end', () => {
  console.log('Data stream complete. Chunk count: ', count);
});

The end event occurs when the data stream has completed. The format is:

readStream.on('end', () => { /* Handle end of stream */ });

We log a message including the chunk count which is 41.

readStream.on('error', (err) => {
  console.error(err.message);
});

Make sure to also listen for the error event to handle errors. The format is:

readStream.on('error', (err) => { /* Handle error */ });
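To see the error event in action, the sketch below tries to stream a file that does not exist. The path is hypothetical and chosen so that it should not be present. Only the error listener fires. Without that listener, the unhandled error event would crash the process.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical path that should not exist, used only to trigger an error.
const missingPath = path.join(
  os.tmpdir(),
  `no-such-file-${process.pid}-${Date.now()}.txt`
);

const outcome = new Promise((resolve) => {
  const events = []; // names of the events that fired, in order
  const readStream = fs.createReadStream(missingPath, 'utf8');
  readStream.on('data', () => events.push('data'));
  readStream.on('end', () => events.push('end'));
  readStream.on('error', (err) => {
    events.push('error');
    resolve({ events, code: err.code });
  });
});

outcome.then(console.log); // { events: [ 'error' ], code: 'ENOENT' }
```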

Create a Write Stream

[3:00 Video timestamp]

A write stream lets you write a large amount of data into a file.

To demonstrate, create a file called readWriteStream.js and populate it with the below:

nodejs-tutorial/6-stream/readWriteStream.js

const fs = require('fs');

const readStream = fs.createReadStream('hugefile.txt', 'utf8');
const writeStream = fs.createWriteStream('output.txt');

let count = 0;
readStream.on('data', (chunk) => {
  console.log(`Chunk ${++count} received: ${chunk.length} characters`);
  writeStream.write(chunk);
});

We are actually combining a read stream with a write stream.

Start by creating a readStream object, passing in hugefile.txt, encoded as utf8 text. As a reminder, the format to create a readStream object is:

fs.createReadStream(path[, options])

And create a writeStream object with the output file named output.txt as the argument. The format for a writeStream object is:

fs.createWriteStream(path[, options])
The default encoding option is utf8, so you can omit it.

In the readStream:

The on method listens for the data event, which occurs every time a chunk of data is ready.
In the callback function, the first argument is the chunk of data received.
We increment the chunk count variable by 1 and log it along with the chunk's length property. Because the stream is encoded as utf8, each chunk is a string, so length is the number of characters. In plain ASCII text that is also the number of bytes.
Then we call the write method on our writeStream object, passing in the chunk of data.

Execute the file:
node readWriteStream
It will stream all the data from hugefile.txt into output.txt and log the chunk count and length on every data event.
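The example above writes each chunk as soon as it arrives and never closes the output file explicitly. One possible refinement, sketched below with hypothetical temp files, checks the return value of write, pauses reading until the drain event when the write buffer is full, and calls end once reading is complete. Treat it as an illustration of those stream methods rather than a required pattern.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical input and output files in the temp directory.
const inputPath = path.join(os.tmpdir(), `backpressure-in-${process.pid}.txt`);
const outputPath = path.join(os.tmpdir(), `backpressure-out-${process.pid}.txt`);
fs.writeFileSync(inputPath, 'lorem ipsum '.repeat(50000)); // 600,000 characters

const readStream = fs.createReadStream(inputPath, 'utf8');
const writeStream = fs.createWriteStream(outputPath);

readStream.on('data', (chunk) => {
  // write() returns false when the write buffer is full.
  const canContinue = writeStream.write(chunk);
  if (!canContinue) {
    readStream.pause(); // stop reading until the buffer empties
    writeStream.once('drain', () => readStream.resume());
  }
});

// Close the output file once every chunk has been read.
readStream.on('end', () => writeStream.end());

const copied = new Promise((resolve, reject) => {
  // The finish event fires after end() once all data has been flushed.
  writeStream.on('finish', () => resolve(fs.statSync(outputPath).size));
  readStream.on('error', reject);
  writeStream.on('error', reject);
});

copied.then((bytes) => console.log(`Wrote ${bytes} bytes`)); // Wrote 600000 bytes
```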

The Pipe method

[3:53 Video timestamp]

Node has a shortcut for reading then writing the same data, called the pipe method.
It works like the Unix pipe operator, using the output of the read stream as the input for the write stream.

Create a file named pipeStream.js and populate it with the below:

nodejs-tutorial/6-stream/pipeStream.js

const fs = require('fs');

const readStream = fs.createReadStream('hugefile.txt', 'utf8');
const writeStream = fs.createWriteStream('output2.txt');

readStream.pipe(writeStream);

As before, we set variables for the readStream and writeStream objects.
The shortcut is this: instead of chaining the on method to the readStream, listening for the data event, and writing each chunk to the writeStream, we simply chain the pipe method to the readStream object and pass in the writeStream as the argument.

This has the same result as the last file we executed except we aren't logging the chunk count this time. The output file this time is output2.txt.

Execute the file:
node pipeStream

You should see the output2.txt file is created.
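The pipe method does not forward errors from one stream to the other, so each stream needs its own error listener. As an alternative, the stream module also exports a pipeline function that connects the streams and reports any error through a single callback. The sketch below is a minimal illustration using hypothetical temp files rather than hugefile.txt.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { pipeline } = require('stream');

// Hypothetical source and target files in the temp directory.
const sourcePath = path.join(os.tmpdir(), `pipeline-in-${process.pid}.txt`);
const targetPath = path.join(os.tmpdir(), `pipeline-out-${process.pid}.txt`);
fs.writeFileSync(sourcePath, 'Put some lines of text in here.\n'.repeat(1000));

const piped = new Promise((resolve, reject) => {
  pipeline(
    fs.createReadStream(sourcePath),
    fs.createWriteStream(targetPath),
    (err) => {
      // The callback runs once, on completion or on the first error.
      if (err) return reject(err);
      const source = fs.readFileSync(sourcePath, 'utf8');
      const target = fs.readFileSync(targetPath, 'utf8');
      resolve(source === target);
    }
  );
});

piped.then((identical) => console.log('Copy matches source:', identical));
```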

Create a video streaming web application

[4:45 Video timestamp]

Lastly we will create a very simple video streaming web application using the Express web framework.
By streaming the video, the user can start watching the video before it's fully loaded.

The nodejs-tutorial project on GitHub has a video file at 6-stream/stream-video-webapp/stream-tutorial.mp4. This is an actual video on how to make a video streaming web app. We will use it in our own video streaming web app.
The project folder also contains a README file that is a copy of the below instructions.
If you haven't used Express before, it is the most popular Node.js framework for building websites. It is a large topic in itself so in this tutorial we won't be explaining in detail how Express works.

Create a new directory called stream-video-webapp.
The Unix command is:
mkdir stream-video-webapp
And cd into it:
cd stream-video-webapp

Initiate a package.json file

From the stream-video-webapp folder initiate a Node project.
The -y (yes) option accepts all defaults:
npm init -y

In the scripts property, delete the test script and add the following two scripts:

package.json

{
  ...
  "scripts": {
    "start": "node index.js",
    "dev": "nodemon index.js"
  },
  ...
}

Then when we are done we can start the app with the start script:
npm start
Or, if you have nodemon installed globally, you can use the dev script, which restarts the server automatically whenever a file changes:
npm run dev

Install Express

Install the express npm package:
npm install express

index.html file

Create an HTML document named index.html

Note: if you are in VS Code, you can generate a skeleton HTML document from inside the empty index.html file by entering an exclamation mark then a tab: !+Tab

Populate the file with the below:

index.html

<html>
  <head>
    <title>Node stream video example</title>
  </head>
  <body>
    <h1>How To Build a Video App</h1>
    <!-- controls attribute shows the video controls at the bottom of the video screen -->
    <video width="1080" controls> 
      <source src="/video" type="video/mp4">
    </video>
  </body>
</html>

Our web page just has a heading element and a video element.

The width attribute sets the width of the video player to 1080 pixels. Without it, the player is sized to the video's own dimensions.
The controls attribute adds a control panel to the bottom of the video screen.
Embedded in the video element is the source element which has a src attribute set to the path where the video can be accessed. In this case we are serving the video on the /video path.

index.js

Create a file in the project root named index.js.

Populate it with the below:

index.js

const express = require('express');
const fs = require('fs');
const path = require('path');
const app = express();

// 1. Respond to request to / root path with the index.html document 
app.get('/', (req, res) => {
  res.sendFile(path.join(__dirname, '/index.html'));
});

// 2. Respond to request to /video path.
app.get('/video', (req, res) => {
  const videoPath = './stream-tutorial.mp4';
  // 3. Get video size (in bytes)
  const videoSize = fs.statSync(videoPath).size;

  // 4. Set chunk size to 1MB (default is 64KB)
  const chunkSize = 1000000;
  // 5. Get the video range string from the request headers
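  const range = req.headers.range;
  // Without a Range header there is nothing to parse, so reject the request.
  if (!range) return res.status(400).send('Requires Range header');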
  // 6. Starting point: remove non-digits from the range string and convert it to a number.
  const start = Number(range.replace(/\D/g, ''));
  const end = Math.min(start + chunkSize, videoSize - 1);
  // 7. Get the chunk size being sent. 
  const contentLength = end - start + 1;

  // 8. Create the response headers.
  const headers = {
    'Content-Range': `bytes ${start}-${end}/${videoSize}`,
    'Accept-Ranges': 'bytes',
    'Content-Length': contentLength,
    'Content-Type': 'video/mp4'
  };
  // 9. Add headers to response object. HTTP Status 206 is for partial content
  res.writeHead(206, headers);

  // 10. Create read stream for the current chunk of video.
  const readStream = fs.createReadStream(videoPath, { start, end });
  // 11. Pipe the video chunk to the response object to send to the client.
  readStream.pipe(res);
});

// 12. Listen for client requests on port 3000
app.listen(3000, () => {
  console.log('Listening on port 3000');
});

Start the server and view the app in the browser

Start the server:
npm start
Or if you have nodemon installed globally (recommended) run:
npm run dev

View the HTML page at http://localhost:3000
Refresh the browser.
It should show the index.html page with the video player.
If you click play, the video should start playing.
You can move to a different place in the video and it should start playing from that point.
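Seeking works because the browser sends a new request whose Range header names the new starting byte. The sketch below illustrates that round trip under several assumptions: it uses Node's built-in http module instead of Express so it runs without installing anything, a 26-byte text file stands in for the video, and the chunk size is shrunk to 10 so the partial response is easy to inspect. It starts a server on a free port, requests bytes from position 5 onward, and prints what comes back.

```javascript
const fs = require('fs');
const http = require('http');
const os = require('os');
const path = require('path');

// Hypothetical stand-in for the video: a 26-byte file in the temp directory.
const filePath = path.join(os.tmpdir(), `range-demo-${process.pid}.bin`);
fs.writeFileSync(filePath, 'abcdefghijklmnopqrstuvwxyz');
const fileSize = fs.statSync(filePath).size;
const chunkSize = 10; // tiny chunk so the partial response is easy to inspect

const server = http.createServer((req, res) => {
  // Same calculation as the tutorial's /video route.
  const start = Number((req.headers.range || 'bytes=0-').replace(/\D/g, ''));
  const end = Math.min(start + chunkSize, fileSize - 1);
  res.writeHead(206, {
    'Content-Range': `bytes ${start}-${end}/${fileSize}`,
    'Accept-Ranges': 'bytes',
    'Content-Length': end - start + 1,
  });
  fs.createReadStream(filePath, { start, end }).pipe(res);
});

const result = new Promise((resolve, reject) => {
  // Port 0 asks the operating system for any free port.
  server.listen(0, '127.0.0.1', () => {
    const { port } = server.address();
    const options = {
      host: '127.0.0.1',
      port,
      agent: false, // one-off connection so the server can close right away
      headers: { Range: 'bytes=5-' },
    };
    http.get(options, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        server.close();
        resolve({
          status: res.statusCode,
          contentRange: res.headers['content-range'],
          body,
        });
      });
    }).on('error', reject);
  });
});

result.then(console.log);
// { status: 206, contentRange: 'bytes 5-15/26', body: 'fghijklmnop' }
```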

Index.js detailed code explanation:

The comments explain what each statement does. If you want to know exactly what each line of code above is doing, reference the number in the comment and read the corresponding explanation below.

1. When an HTTP request is made to the root route of our site /,
we send the index.html document as the response.

2. In the index.html page, the src attribute of the source element points to the /video path, so the browser requests that path to get the video data. This route handles those requests.

3. The fs.statSync method gets the size of the video file.

4. We set the size of a stream chunk to 1MB instead of the 64KB default.

5. Get the range from the request headers sent from the client's browser.

The range includes the byte number of the video to start with.
The end number is omitted from the headers so the request is for the whole rest of the video. We will only send 1MB at a time though.
When starting from the beginning the start of the range is 0.

6. The range comes in as a string, so we need to strip the non-digit characters from it leaving just the starting number.

For the end number we use the smaller of:

The start number plus the 1MB chunk size.
Or, if that would run past the end of the video, the position of the last byte, which is videoSize minus 1.

7. Calculate the actual chunk size (in bytes). It will be about 1MB except for the last chunk, which is usually smaller. This value is sent in the response headers so the client's browser knows how much data to expect.

8. Create the response headers which will give the browser the information it needs to play the video, including:

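Content-Range: Tells the browser which bytes of the video this response contains, and the total size of the video.
Accept-Ranges: Tells the browser that the server accepts range requests measured in bytes.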
Content-Length: The size of the chunk.
Content-Type: The type of data sent, which is video/mp4.

9. Send the headers with the HTTP response back to the client.
Status code 206 is for successfully sending partial content.

10. Create the readStream from the video file, using the start and end options to stream that specific part of the video.

11. Use the pipe method to pipe each readStream chunk to the HTTP response object.

12. At the bottom of the file we start the server, which listens for incoming HTTP requests. In development we are listening on port 3000.
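The calculations in steps 5 through 8 can be pulled into a small function and tried with sample values. The sketch below is an illustration, not part of the tutorial's index.js. The byteWindow name and the 2,500,000 byte video size are hypothetical. It also differs from step 6 in one respect: it reads only the digits that follow "bytes=" instead of stripping every non-digit, because a header such as bytes=100-200 would otherwise be read as 100200.

```javascript
// Hypothetical helper: computes the byte window for one partial response.
function byteWindow(rangeHeader, videoSize, chunkSize = 1000000) {
  // Take only the digits after "bytes=" so an end value cannot be merged in.
  const match = /^bytes=(\d+)-/.exec(rangeHeader || '');
  if (!match) return null; // missing or unsupported Range header
  const start = Number(match[1]);
  if (start >= videoSize) return null; // starting point is past the end of the file
  const end = Math.min(start + chunkSize, videoSize - 1);
  return {
    start,
    end,
    contentLength: end - start + 1,
    contentRange: `bytes ${start}-${end}/${videoSize}`,
  };
}

console.log(byteWindow('bytes=0-', 2500000));
// { start: 0, end: 1000000, contentLength: 1000001, contentRange: 'bytes 0-1000000/2500000' }
console.log(byteWindow('bytes=2000000-', 2500000));
// { start: 2000000, end: 2499999, contentLength: 500000, contentRange: 'bytes 2000000-2499999/2500000' }
console.log(byteWindow(undefined, 2500000)); // null
```

Because both start and end are inclusive, a full chunk carries chunkSize plus one bytes, which is why the first result reports a content length of 1000001.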
