Memory heap size shoots up beyond 1.5 GB for 50MB pdf file #324

ladvishal1985 · 2023-08-31T10:25:53Z

Thanks for such a great library. We are able to reliably write watermarks on PDFs, but we are having trouble with memory consumption. It prevents us from using the library on our Node servers, because the memory growth gets our pods terminated.
For example:

// We download the file via a signed URL and pass the response in as fileBuffer:
const recipe = new Recipe(fileBuffer);
const used = process.memoryUsage().heapUsed / 1024 / 1024;
this.logger.log(`The script uses approximately after recipe read ${Math.round(used * 100) / 100} MB`);
// Creating the new recipe pushes the heap size above 1.5 GB.
// After creating the stream:
const readerStream = new Muhammara.PDFRStreamForBuffer(fileBuffer);
// the memory size shoots up to almost 3 GB.
// Then we create the reader:
const reader = Muhammara.createReader(readerStream); // We need this to get the total page count.
const pageCount = reader.getPagesCount();

Is there any solution to this problem?
Currently we are targeting file sizes of up to 50 MB, and this may go up to 100 MB.

julianhille · 2023-08-31T16:46:53Z

Some information is missing.

Where does fileBuffer come from and what is it?
- Please add a sample showing how you initialise it. Is it uploaded to the server? I'm not sure what the "signed file url" part means.
Why do you initialise recipe and then not use it? Just for demonstration purposes?
What is your goal: watermarking or getting the page count?

About the 3 GB: that is what I would expect if (!!) recipe shoots up to 1.5 GB, as recipe just uses muhammara under the hood, and both recipe and readerStream create their own objects from the buffer. Node cannot free any memory between recipe = ... and readerStream = ..., because recipe is still in use and not dereferenced. So there is that. :>

Are you able to provide a sample file?
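To check the "two independent copies" explanation above, it can help to log more than heapUsed. heapUsed only covers objects on the V8 heap, while Node.js Buffers and memory held by native code tend to show up under arrayBuffers, external, or rss instead. The following is a minimal sketch of a logging helper; describeMemory and memoryDelta are hypothetical names, and the snapshot shape is assumed to match the object returned by process.memoryUsage().

```typescript
// The fields we read; process.memoryUsage() returns an object of this shape.
interface MemorySnapshot {
  rss: number;          // total resident memory of the process, in bytes
  heapUsed: number;     // V8 heap actually in use
  external: number;     // C++ objects bound to JavaScript objects
  arrayBuffers: number; // ArrayBuffers, including Node.js Buffers
}

// Convert bytes to megabytes, rounded to two decimals.
const toMB = (bytes: number): number =>
  Math.round((bytes / 1024 / 1024) * 100) / 100;

// Format one snapshot as a single log line.
function describeMemory(label: string, m: MemorySnapshot): string {
  return (
    `${label}: rss=${toMB(m.rss)} MB, heapUsed=${toMB(m.heapUsed)} MB, ` +
    `external=${toMB(m.external)} MB, arrayBuffers=${toMB(m.arrayBuffers)} MB`
  );
}

// Field-by-field difference between two snapshots (after minus before).
function memoryDelta(before: MemorySnapshot, after: MemorySnapshot): MemorySnapshot {
  return {
    rss: after.rss - before.rss,
    heapUsed: after.heapUsed - before.heapUsed,
    external: after.external - before.external,
    arrayBuffers: after.arrayBuffers - before.arrayBuffers,
  };
}
```

Taking one snapshot with process.memoryUsage() before new Recipe(fileBuffer), one after it, and one after createReader, and logging describeMemory("recipe", memoryDelta(a, b)) for each step, should show which stage accounts for which part of the growth.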

ladvishal1985 · 2023-09-01T09:13:45Z

Check the snippet below.

async downloadAndAddwatermark(signedUrl: string, waterMark: string) {
    try {
      const file$ = this.downloadFileUsingSignedUrl(signedUrl);
      const fileBuffer = await firstValueFrom(file$.pipe(take(1))); // <-- The file is downloaded here as an array buffer
      const modifiedBuffer = await this.addWatermark(fileBuffer, waterMark);
      return modifiedBuffer;
    } catch (error) {
      // catch error here
    }
  }

  private addWatermark(fileBuffer, waterMark: string) {
    try {
      const reciepe = new Recipe(fileBuffer); // <-- Memory consumption increases after this.
      const readerStream = new Muhammara.PDFRStreamForBuffer(fileBuffer);
      const reader = Muhammara.createReader(readerStream);
      const pageCount = reader.getPagesCount();
      
      const modifiedReciepe = this.addWatermarkPage(reciepe, {
        currentPage: 1,
        watermark: waterMark,
        pageCount
      });

      return modifiedReciepe.endPDF((outputBuffer) => outputBuffer);
    } catch (error) {
      //catch error here
    }
  }
  private addWatermarkPage(recipe: Recipe, { currentPage, watermark, pageCount }) {
    if (currentPage > pageCount) {
      return recipe;
    }
    const pgWidth = recipe.pageInfo(currentPage).width;
    const pgHeight = recipe.pageInfo(currentPage).height;
    const initialConfig: FileBufferEditConfig = {
      size: 20,
      text: watermark,
      width: pgWidth,
      x: 0
    };
    const textDetails = this.getTextDetails(initialConfig); // Gets the initial config object for the text
    const newRecipe = recipe
      .editPage(currentPage)
      .text(watermark, textDetails.x, pgHeight - 30, textDetails.textOptions)
      .text(watermark, textDetails.x, 30, textDetails.textOptions)
      .endPage();

    return this.addWatermarkPage(newRecipe, {
      currentPage: currentPage + 1,
      watermark: watermark,
      pageCount
    });
  }
  private getTextDetails(options: FileBufferEditConfig) {
    const writer = Muhammara.createWriter(new Muhammara.PDFWStreamForBuffer());
    const fontFile = path.join(this.fontPath, 'Helvetica.ttf');
    const fontObject = writer.getFontForFile(fontFile);
    let textWidth = fontObject.calculateTextDimensions(options.text, options.size).width;
    while (textWidth >= options.width - 20) {
      options.size = options.size - 1;
      textWidth = fontObject.calculateTextDimensions(options.text, options.size).width;
    }
    options.x = options.width / 2 - textWidth / 2;
    const textOptions = {
      font: 'Helvetica',
      size: options.size,
      colorspace: "rgb",
      color: '#F21A1A',
      opacity: 0.5,
    };
    return {
      textOptions: textOptions,
      x: options.x
    };
  }
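In the snippet above, getTextDetails runs once per page, and each call creates a new writer and loads the font file again. Its shrink loop also has no lower bound, so on a very narrow page the size could drop below zero without the loop ending. The following is a minimal sketch of the same fitting logic as a pure function with a minimum size and a per-width cache. fitText, createCachedFitter, and minSize are hypothetical names, and measure is assumed to wrap fontObject.calculateTextDimensions(text, size).width on a font object that is loaded once.

```typescript
// Returns the rendered width of `text` at the given font size.
type MeasureWidth = (text: string, size: number) => number;

interface FittedText {
  size: number; // font size that fits the page
  x: number;    // x offset that centres the text horizontally
}

// Shrink the font size until the text fits within pageWidth - margin,
// stopping at minSize so the loop always terminates.
function fitText(
  measure: MeasureWidth,
  text: string,
  pageWidth: number,
  startSize = 20,
  minSize = 4,
  margin = 20
): FittedText {
  let size = startSize;
  let width = measure(text, size);
  while (width >= pageWidth - margin && size > minSize) {
    size -= 1;
    width = measure(text, size);
  }
  return { size, x: pageWidth / 2 - width / 2 };
}

// Cache results per page width, so pages of equal width are measured once.
function createCachedFitter(
  measure: MeasureWidth,
  text: string
): (pageWidth: number) => FittedText {
  const cache = new Map<number, FittedText>();
  return (pageWidth: number): FittedText => {
    let fitted = cache.get(pageWidth);
    if (fitted === undefined) {
      fitted = fitText(measure, text, pageWidth);
      cache.set(pageWidth, fitted);
    }
    return fitted;
  };
}
```

This is unlikely to be the main cause of the memory growth reported here, but it removes repeated font loading from the per-page path.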

ladvishal1985 · 2023-09-04T09:33:53Z

@julianhille: Provided the sample here.

julianhille · 2023-09-07T20:56:22Z

If files are that large, the file is usually stored on disk, even if only temporarily.
Please check whether it is possible to use new muhammara.PDFRStreamForFile('./huge.pdf'); this could possibly reduce the memory usage greatly.
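If the PDF only exists as a downloaded buffer, it first has to be written to disk before a path can be handed to PDFRStreamForFile. The following is a minimal sketch of that step using only Node.js built-ins; withTempPdf is a hypothetical helper name, and the temporary directory prefix is arbitrary. Whether the download can instead be streamed straight to disk depends on the HTTP client, which is not shown in this thread.

```typescript
import { mkdtemp, rm, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write `data` to a fresh temporary directory, run `work` with the file path,
// and remove the directory afterwards, even if `work` throws.
async function withTempPdf<T>(
  data: Uint8Array,
  work: (filePath: string) => Promise<T>
): Promise<T> {
  const dir = await mkdtemp(join(tmpdir(), "pdf-"));
  const filePath = join(dir, "input.pdf");
  try {
    await writeFile(filePath, data);
    return await work(filePath);
  } finally {
    await rm(dir, { recursive: true, force: true });
  }
}
```

Inside the callback, the path can be passed to new muhammara.PDFRStreamForFile(filePath). Dropping the reference to the original buffer once the file is written gives the garbage collector a chance to reclaim it before the PDF is parsed.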

julianhille · 2023-09-12T08:37:19Z

You may also want to have a look at CopyingContext, which might also help reduce memory usage.
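As a rough illustration of the copying-context idea, the sketch below copies every page of a source file into a writer, one page at a time. It is written against small interfaces rather than the library itself: PdfWriterLike, CopyingContextLike, and SourceParserLike are hypothetical types that only mirror the method names createPDFCopyingContext, getSourceDocumentParser, getPagesCount, appendPDFPageFromPDF, and end as they appear in the HummusJS documentation. It is an assumption that a writer from muhammara.createWriter('./out.pdf') satisfies them, and the sketch does not draw any watermark.

```typescript
// Minimal structural types; assumed to match the relevant library methods.
interface SourceParserLike {
  getPagesCount(): number;
}

interface CopyingContextLike {
  getSourceDocumentParser(): SourceParserLike;
  appendPDFPageFromPDF(pageIndex: number): unknown;
}

interface PdfWriterLike {
  createPDFCopyingContext(sourcePath: string): CopyingContextLike;
  end(): void;
}

// Copy all pages from the file at `sourcePath` into `writer`, then finish
// the output. Returns the number of pages copied.
function copyAllPages(writer: PdfWriterLike, sourcePath: string): number {
  const context = writer.createPDFCopyingContext(sourcePath);
  const pageCount = context.getSourceDocumentParser().getPagesCount();
  for (let pageIndex = 0; pageIndex < pageCount; pageIndex++) {
    context.appendPDFPageFromPDF(pageIndex); // page indexes are zero-based
  }
  writer.end();
  return pageCount;
}
```

Because the source is opened from a path, this route also yields the page count without building a second reader from an in-memory buffer.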

julianhille · 2023-09-28T19:02:56Z

Did you solve it? Did you have a chance to look at the copying context?

ladvishal1985 · 2023-10-06T04:05:25Z

No, we did not get a chance to use the copying context. An example might help us. For now we have solved the issue by writing the file to disk and modifying it there, which works reasonably well for us. This is how we do it:

const pageCount = reader.getPagesCount();
const fontObject = writer.getFontForFile(this.fontFile);
const xobjectForm = writer.createFormXObjectsFromPDF(source, Muhammara.ePDFPageBoxMediaBox);

.....

pageContent
  .doXObject(page.getResourcesDictionary().addFormXObjectMapping(xobjectForm[i] as any))
  .writeText(watermark, config.x, yTop, textOptions)
  .writeText(watermark, config.x, yBottom, textOptions)
  .Q();
writer.writePage(page);

ladvishal1985 · 2023-10-06T04:05:45Z

You can close this issue.

julianhille closed this as completed Oct 11, 2024
