Node.js Readable: maximize throughput/performance for a compute-intensive Readable - Writable doesn't pull data fast enough

Question

General setup

I developed an application on AWS Lambda with Node.js 14. I use a custom Readable implementation, FrameCreationStream, that uses node-canvas to draw images, SVGs and more on a canvas. The result is then extracted as a raw image buffer in BGRA. A single image buffer contains 1920 * 1080 * 4 bytes = 8294400 bytes, roughly 8 MB. It is then piped to the stdin of a child_process running ffmpeg. The highWaterMark of my Readable (objectMode: true) is set to 25, so the internal buffer can use up to 8 MB * 25 = 200 MB.

All this works fine and doesn't consume too much RAM. But after some time I noticed that the performance is not ideal.
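
The buffer arithmetic above can be double-checked with a few lines. This is only a sketch: frameBytes and bufferBudgetBytes are hypothetical helper names that are not part of my application, and the budget assumes every buffered object is one full raw frame.

```typescript
// Bytes needed for one raw frame; BGRA uses 4 bytes per pixel.
function frameBytes(width: number, height: number, bytesPerPixel: number = 4): number {
    return width * height * bytesPerPixel;
}

// Upper bound for the Readable's internal buffer.
// In objectMode, highWaterMark counts objects (here: frames), not bytes.
function bufferBudgetBytes(width: number, height: number, highWaterMark: number): number {
    return frameBytes(width, height) * highWaterMark;
}

const perFrame = frameBytes(1920, 1080);            // 8294400 bytes
const budget = bufferBudgetBytes(1920, 1080, 25);   // 207360000 bytes

console.log(`${perFrame} bytes per frame, ${budget} bytes for 25 buffered frames`);
```

So the "8 MB * 25 = 200 MB" figure is a rounded version of 207360000 bytes.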

Performance not optimal

I have an example input that generates a video of 315 frames. If I set highWaterMark to a value above 25, the performance improves, and it keeps improving up to a value of 315 or above.

For some reason ffmpeg doesn't start to pull any data until highWaterMark is reached. Obviously that's not what I want. ffmpeg should consume data whenever at least 1 frame is cached in the Readable and it has finished processing the previous frame. And the Readable should produce more frames until highWaterMark is reached or the last frame has been produced. So ideally the Readable and the Writable are busy all the time.

I found another way to improve the speed: if I add a timeout of 100 ms in the _read() method of the Readable after, say, every tenth frame, the ffmpeg Writable uses this pause to write some frames to ffmpeg.

It seems like frames aren't passed to ffmpeg during frame creation because the Node.js main thread is busy?

I get the fastest result if I increase highWaterMark above the number of frames, which doesn't work for longer videos because it would make the AWS Lambda RAM explode. It also makes the whole streaming idea useless. Using timeouts always gives me stomach pain. Also, a well-fitting timeout might differ between execution environments. Any ideas?
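
One way to test this hypothesis without a hard-coded delay would be to yield to the event loop via setImmediate, which runs its callback only after pending I/O callbacks have had a chance to run. The following is a minimal sketch under that assumption: yieldToEventLoop is a hypothetical helper, not part of my code, and I have not verified that it actually fixes the throughput.

```typescript
// Hypothetical helper: resolves with the given value on the event loop's
// next "check" phase instead of after a fixed number of milliseconds.
// Unlike a promise that resolves immediately (a microtask), this lets
// pending I/O callbacks, such as writes to a child process pipe, run first.
function yieldToEventLoop<T>(value: T): Promise<T> {
    return new Promise<T>(resolve => setImmediate(() => resolve(value)));
}

// Small demonstration of the ordering: the plain microtask is scheduled
// first and runs first; the setImmediate-based continuation runs later.
const order: string[] = [];
Promise.resolve().then(() => order.push('microtask'));
const yielded = yieldToEventLoop('frame').then(value => {
    order.push('after setImmediate');
    console.log(value, order);
    return value;
});
```

In the _read() chain this would stand in for the timeout step, for example `.then(buffer => yieldToEventLoop(buffer))`, so the pause no longer depends on an environment-specific number of milliseconds.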

FrameCreationStream

import canvas from 'canvas';
import {Readable} from 'stream';
import {IMAGE_STREAM_BUFFER_SIZE, PerformanceUtil, RenderingLibraryError, VideoRendererInput} from 'vm-rendering-backend-commons';
import {AnimationAssets, BufferType, DrawingService, FullAnimationData} from 'vm-rendering-library';

/**
 * This is a back-pressure-compatible implementation of Readable that provides a stream to read single frames from.
 * Whenever _read() is called, a new frame is created and added to the stream.
 * _read() is called internally until options.highWaterMark has been reached;
 * after that, calls are paused until one frame has been read from the stream.
 */
export class FrameCreationStream extends Readable {

    drawingService: DrawingService;
    endFrameIndex: number;
    currentFrameIndex: number = 0;
    startFrameIndex: number;
    frameTimer: [number, number];
    readTimer: [number, number];
    fullAnimationData: FullAnimationData;

    constructor(animationAssets: AnimationAssets, fullAnimationData: FullAnimationData, videoRenderingInput: VideoRendererInput, frameTimer: [number, number]) {
        super({highWaterMark: IMAGE_STREAM_BUFFER_SIZE, objectMode: true});

        this.frameTimer = frameTimer;
        this.readTimer = PerformanceUtil.startTimer();

        this.fullAnimationData = fullAnimationData;

        this.startFrameIndex = Math.floor(videoRenderingInput.startFrameId);
        this.currentFrameIndex = this.startFrameIndex;
        this.endFrameIndex = Math.floor(videoRenderingInput.endFrameId);

        this.drawingService = new DrawingService(animationAssets, fullAnimationData, videoRenderingInput, canvas);
        console.time("read");
    }

    /**
     * this method is only overridden for debugging
     * @param size
     */
    read(size?: number): string | Buffer {

        console.log("read("+size+")");
        const buffer = super.read(size);
        console.log(buffer);
        console.log(buffer?.length);
        if(buffer) {
            console.timeLog("read");
        }
        return buffer;
    }

    // _read() will be called when the stream wants to pull more data in.
    // _read() will be called again after each call to this.push(dataChunk) once the stream is ready to accept more data. https://nodejs.org/api/stream.html#readable_readsize
    // this way it is ensured, that even though this.createImageBuffer() is async, only one frame is created at a time and the order is kept
    _read(): void {
        // as frame numbers are consecutive and unique, we have to draw each frame number (also the first and the last one)
        if (this.currentFrameIndex <= this.endFrameIndex) {
            PerformanceUtil.logTimer(this.readTimer, 'WAIT   -> READ\t');
            this.createImageBuffer()
                .then(buffer => this.optionalTimeout(buffer))
                // push means adding a buffered raw frame to the stream
                .then((buffer: Buffer) => {
                    this.readTimer = PerformanceUtil.startTimer();
                    // the following two frame numbers start with 1 as first value
                    const processedFrameNumberOfScene = 1 + this.currentFrameIndex - this.startFrameIndex;
                    const totalFrameNumberOfScene = 1 + this.endFrameIndex - this.startFrameIndex;
                    // the overall frameId or frameIndex starts with frameId 0
                    const processedFrameIndex = this.currentFrameIndex;
                    this.currentFrameIndex++;
                    this.push(buffer); // nothing besides logging should happen after calling this.push(buffer)
                    console.log(processedFrameNumberOfScene + ' of ' + totalFrameNumberOfScene + ' processed - full video frameId: ' + processedFrameIndex + ' - buffered frames: ' + this.readableLength);
                })
                .catch(err => {
                    // errors will be finally handled, when subscribing to frameCreation stream in ffmpeg service
                    // this log is just generated for tracing errors and if for some reason the handling in ffmpeg service doesn't work
                    console.log("createImageBuffer: ", err);
                    this.emit("error", err);
                });
        } else {
            // push(null) makes clear that this stream has ended
            this.push(null);
            PerformanceUtil.logTimer(this.frameTimer, 'FRAME_STREAM');
        }
    }

    private optionalTimeout(buffer: Buffer): Promise<Buffer> {
        if(this.currentFrameIndex % 10 === 0) {
            return new Promise(resolve => setTimeout(() => resolve(buffer), 140));
        }
        return Promise.resolve(buffer);
    }

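    // prevent memory leaks - without this, Lambda memory will increase with every call
    // the callback has to be invoked, otherwise the stream never finishes destroying and never emits 'close'
    _destroy(error: Error | null, callback: (error?: Error | null) => void): void {
        this.drawingService.destroyStage();
        callback(error);
    }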

    /**
     * This creates a raw pixel buffer that contains a single frame of the video drawn by the rendering library
     *
     */
    public async createImageBuffer(): Promise<Buffer> {

        const drawTimer = PerformanceUtil.startTimer();
        try {
            await this.drawingService.drawForFrame(this.currentFrameIndex);
        } catch (err: any) {
            throw new RenderingLibraryError(err);
        }

        PerformanceUtil.logTimer(drawTimer, 'DRAW   -> FRAME\t');

        const bufferTimer = PerformanceUtil.startTimer();
        // Creates a raw pixel buffer, containing simple binary data
        // the exact same information (BGRA/screen ratio) has to be provided to ffmpeg, because ffmpeg cannot detect format for raw input
        const buffer = await this.drawingService.toBuffer(BufferType.RAW);
        PerformanceUtil.logTimer(bufferTimer, 'CANVAS -> BUFFER');

        return buffer;
    }
}

FfmpegService

import {ChildProcess, execFile} from 'child_process';
import {Readable} from 'stream';
import {FPS, StageSize} from 'vm-rendering-library';
import {
    FfmpegError,
    LOCAL_MERGE_VIDEOS_TEXT_FILE, LOCAL_SOUND_FILE_PATH,
    LOCAL_VIDEO_FILE_PATH,
    LOCAL_VIDEO_SOUNDLESS_MERGE_FILE_PATH
} from "vm-rendering-backend-commons";

/**
 * This class bundles all ffmpeg usages for rendering one scene.
 * FFmpeg is a console program which can transcode nearly all types of sounds, images and videos from one to another.
 */
export class FfmpegService {

    ffmpegPath: string;


    constructor(ffmpegPath: string) {
        this.ffmpegPath = ffmpegPath;
    }

    /**
     * Convert a stream of raw images into an .mp4 video using the command line program ffmpeg.
     *
     * @param inputStream an input stream containing images in raw format BGRA
     * @param stageSize the size of a single frame in pixels (minimum is 2*2)
     * @param outputPath the filepath to write the resulting video to
     */
    public imageToVideo(inputStream: Readable, stageSize: StageSize, outputPath: string): Promise<void> {
        const args: string[] = [
            '-f',
            'rawvideo',
            '-r',
            `${FPS}`,
            '-pix_fmt',
            'bgra',
            '-s',
            `${stageSize.width}x${stageSize.height}`,
            '-i',
            // input "-" means input will be passed via pipe (streamed)
            '-',
            // codec that also QuickTime player can understand
            '-vcodec',
            'libx264',
            '-pix_fmt',
            'yuv420p',
            /*
                * "-movflags faststart":
                * metadata at beginning of file
                * needs more RAM
                * file will be broken, if not finished properly
                * higher application compatibility
                * better for browser streaming
            */
            '-movflags',
            'faststart',
            // "-preset ultrafast", //use this to speed up compression, but quality/compression ratio gets worse
            // don't overwrite an existing file here,
            // but delete file in the beginning of execution index.ts
            // (this is better for local testing believe me)
            outputPath
        ];

        return this.execFfmpegPromise(args, inputStream);
    }

    private execFfmpegPromise(args: string[], inputStream?: Readable): Promise<void> {
        // an arrow function keeps `this` bound to the service instance
        return new Promise<void>((resolve, reject) => {
            const executionProcess: ChildProcess = execFile(this.ffmpegPath, args, (err) => {
                if (err) {
                    reject(new FfmpegError(err));
                } else {
                    console.log('ffmpeg finished');
                    resolve();
                }
            });
            if (inputStream) {
                // it's important to listen on errors of input stream before piping it into the write stream
                // if we don't do this here, we get an unhandled promise exception for every issue in the input stream
                inputStream.on("error", err => {
                    reject(err);
                });
                // don't reject promise here as the error will also be thrown inside execFile and will contain more debugging info
                // this log is just generated for tracing errors and if for some reason the handling in execFile doesn't work
                inputStream.pipe(executionProcess.stdin).on("error", err => console.log("pipe stream: " , err));
            }
        });
    }
}
