Nodejs out of memory on pipe for pdfMake - javascript

I have a Node.js server that is being used to create about 1200 PDF forms that can be downloaded by a client later. They are created using pdfmake and then written to a folder on the server. When I execute the code as written, Node.js runs out of memory after about 350 documents. I know there must be a better way to save them, but I cannot seem to figure it out.
The method below is called from a .map() over an array of data returned by a Mongoose query. The relevant code for creating and saving the forms is as follows:
const whichForm = certList => {
  certList.map(cert => {
    if (cert.Cert_Details !== null) {
      switch (cert.GWMA) {
        case 'OA':
        case 'PC':
          // Don't provide reports for Feedlots
          if (cert.Cert_Details.cert_type !== null) {
            if (cert.Cert_Details.cert_type === 'Irrigation') {
              createOAReport(cert);
            }
          }
          break;
        case 'FA':
          // Don't provide reports for Feedlots
          if (cert.Cert_Details.cert_type === 'Irrigation') {
            createFAReport(cert);
          }
          break;
      }
    }
  });
};
Different File:
const PdfPrinter = require('pdfmake/src/printer');
const fs = require('fs');
const path = require('path');

const createOAReport = data => {
  console.log('PC or OA Cert ', data.Cert_ID);
  // console.log(data);
  let all_meters_maint = [];
  data.Flowmeters.map(flowmeter => {
    // Each flow meter
    // console.log(`Inside Flowmeter ${flowmeter}`);
    if (flowmeter.Active === true) {
      let fm_maint = [];
      fm_maint.push({
        text: `Meter Serial Number: ${flowmeter.Meter_Details.Serial_num}`
      });
      fm_maint.push({
        text: `Type of Meter: ${flowmeter.Meter_Details.Manufacturer}`
      });
      fm_maint.push({ text: `Units: ${flowmeter.Meter_Details.units}` });
      fm_maint.push({ text: `Factor: ${flowmeter.Meter_Details.factor}` });
      all_meters_maint.push(fm_maint);
    }

    // docDefinition is defined elsewhere in this file
    docDefinition.content.push({
      style: 'tableExample',
      table: {
        widths: [200, 200, '*', '*'],
        body: all_meters_maint
      },
      layout: 'noBorders'
    });

    const fonts = {
      Roboto: {
        normal: path.join(__dirname, '../', '/fonts/Roboto-Regular.ttf'),
        bold: path.join(__dirname, '../', '/fonts/Roboto-Medium.ttf'),
        italics: path.join(__dirname, '../', '/fonts/Roboto-Italic.ttf'),
        bolditalics: path.join(__dirname, '../', '/fonts/Roboto-MediumItalic.ttf')
      }
    };

    const printer = new PdfPrinter(fonts);
    const pdfDoc = printer.createPdfKitDocument(docDefinition);

    // Build file path
    const fullfilePath = path.join(
      __dirname,
      '../',
      '/public/pdffiles/',
      `${data.Cert_ID}.pdf`
    );

    pdfDoc.pipe(fs.createWriteStream(fullfilePath));
    pdfDoc.end();
  });
};
Is there a different way to save the files that doesn't force them through a stream and doesn't keep them all in memory?

Before we get to the answer, I'm making one huge assumption based on the information in the question. The question states it creates about 1200 PDF forms, so I'm assuming that in the whichForm function the certList parameter is an array of roughly 1200 items, or at least 1200 items that will end up calling createOAReport. You get the idea. I'm assuming the problem is that we are calling that method to create the PDFs 1200 times within that Array.map call, which I believe makes sense given the question and the context of the code.
On to the answer. The major problem is that you aren't just trying to create 1200 PDFs; you are trying to create all 1200 of them at once. Kicking off that many jobs simultaneously puts a strain on the system, maybe even more so on a single-threaded runtime like Node.js.
The easy, hacky solution is to just increase the memory available to Node.js by using the --max-old-space-size flag and setting the memory size in MB when running your node command. You can find more information about this at this tutorial, but the short version is a command like node --max-old-space-size=8192 main.js, which would increase the heap size of Node.js to 8192 MB, or 8 GB.
There are a few problems with that method. Mainly, it's not very scalable. What if someday you have 5000 PDFs to create? You'd have to increase that memory size again, and maybe upgrade the machine it's running on.
The second solution, which you could actually combine with the first, is to stop doing all of this work at once and limit how many PDFs are being created at any one time. Depending on many factors and how optimized the current system is, this will probably increase the total time it takes to create all of these PDFs.
Coding this is roughly a two-step process. The first step is to set up your createOAReport function to return a promise that indicates when it's done. The second step is to change your whichForm function to limit how many items can be running at any single point in time.
You will of course have to play around with the system to determine how many items you want to run at one time without overloading it. Fine-tuning that number is not something I focused on, and you could probably increase it by giving Node.js more memory as well.
And of course, there are tons of different ways to do this. I have a few ideas for methods that are better than the one I'm going to show here, but they are a lot more complicated. The foundational idea of limiting how many items are running at once remains the same though; you can optimize it to fit your needs.
I've developed systems like this before, but I don't think the way I've done it is the best or cleanest way. Below is some sample code for your example to illustrate the point.
const _ = require('lodash');
const MAX_RUNNING_PROMISES = 10; // You will have to play with this number to get it right for your needs

const whichForm = async certList => {
  // If certList is ["a", "b", "c", "d"]
  // And we run the following function with MAX_RUNNING_PROMISES = 2
  // array would equal [["a", "b"], ["c", "d"]]
  // Of course you can use something other than Lodash here, but I chose it because it's the first thing that came to mind
  certList = _.chunk(certList, MAX_RUNNING_PROMISES);

  for (let i = 0; i < certList.length; i++) {
    const certArray = certList[i];

    // The following line will wait until all the promises have been resolved or completed before moving on
    await Promise.all(certArray.map(cert => {
      if (cert.Cert_Details !== null) {
        switch (cert.GWMA) {
          case 'OA':
          case 'PC':
            // Don't provide reports for Feedlots
            if (cert.Cert_Details.cert_type !== null) {
              if (cert.Cert_Details.cert_type === 'Irrigation') {
                return createOAReport(cert);
              }
            }
            break;
          case 'FA':
            // Don't provide reports for Feedlots
            if (cert.Cert_Details.cert_type === 'Irrigation') {
              return createFAReport(cert);
            }
            break;
        }
      }
    }));
  }
};
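For reference, the same limit can also be applied in a rolling fashion, where a new PDF starts as soon as a previous one finishes instead of waiting for a whole chunk. A minimal sketch using the p-limit package (an assumption on my part; p-limit v3 or lower for require(), and any similar limiter would do), reusing the same createOAReport/createFAReport functions:

const pLimit = require('p-limit');

// Keep at most 10 PDF jobs in flight at any moment (tune to taste).
const limit = pLimit(10);

const whichForm = certList =>
  Promise.all(certList.map(cert => limit(() => {
    if (cert.Cert_Details === null) return;
    if (cert.Cert_Details.cert_type !== 'Irrigation') return;
    if (cert.GWMA === 'OA' || cert.GWMA === 'PC') return createOAReport(cert);
    if (cert.GWMA === 'FA') return createFAReport(cert);
  })));

Compared to chunking, this keeps exactly ten PDFs running at all times, so one slow document does not hold up the rest of its chunk.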
Then, for your other file, we just have to convert it to return a promise.
const PdfPrinter = require('pdfmake/src/printer');
const fs = require('fs');
const path = require('path');

const createOAReport = data => {
  return new Promise((resolve, reject) => {
    console.log('PC or OA Cert ', data.Cert_ID);
    // console.log(data);
    let all_meters_maint = [];
    const flowmeter = data.Flowmeters[0];
    if (flowmeter.Active === true) {
      let fm_maint = [];
      fm_maint.push({
        text: `Meter Serial Number: ${flowmeter.Meter_Details.Serial_num}`
      });
      fm_maint.push({
        text: `Type of Meter: ${flowmeter.Meter_Details.Manufacturer}`
      });
      fm_maint.push({
        text: `Units: ${flowmeter.Meter_Details.units}`
      });
      fm_maint.push({
        text: `Factor: ${flowmeter.Meter_Details.factor}`
      });
      all_meters_maint.push(fm_maint);
    }

    // docDefinition is defined elsewhere in this file
    docDefinition.content.push({
      style: 'tableExample',
      table: {
        widths: [200, 200, '*', '*'],
        body: all_meters_maint
      },
      layout: 'noBorders'
    });

    const fonts = {
      Roboto: {
        normal: path.join(__dirname, '../', '/fonts/Roboto-Regular.ttf'),
        bold: path.join(__dirname, '../', '/fonts/Roboto-Medium.ttf'),
        italics: path.join(__dirname, '../', '/fonts/Roboto-Italic.ttf'),
        bolditalics: path.join(__dirname, '../', '/fonts/Roboto-MediumItalic.ttf')
      }
    };

    const printer = new PdfPrinter(fonts);
    const pdfDoc = printer.createPdfKitDocument(docDefinition);

    // Build file path
    const fullfilePath = path.join(
      __dirname,
      '../',
      '/public/pdffiles/',
      `${data.Cert_ID}.pdf`
    );

    // The pdf document is a readable stream; the write stream is what emits
    // 'finish' once everything has been flushed to disk, so resolve there.
    const writeStream = fs.createWriteStream(fullfilePath);
    writeStream.on('finish', resolve);
    writeStream.on('error', reject);
    pdfDoc.pipe(writeStream);
    pdfDoc.end();
  });
};
I just realized, after getting quite far into this answer, that my original assumption is incorrect, since some of those PDFs might be created inside the second function via the data.Flowmeters.map loop. Although I'm not going to demonstrate it, you will have to apply the same ideas from this answer to that loop as well. For now I have removed that section and am just using the first item in the array, since this is only an example.
Once you have a feel for this, you might want to restructure your code so that a single function handles creating a PDF, rather than having .map calls scattered all over the place. Abstract the .map calls out and keep them separate from the PDF creation process; that way it's easier to limit how many PDFs are being created at any one time.
It'd also be a good idea to add in some error handling around all of these processes.
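As a rough sketch of what that error handling could look like inside the chunked loop (processCert here is a hypothetical stand-in for the switch logic above):

// Sketch: catch per-certificate failures so one bad record doesn't reject the
// whole Promise.all and abort the rest of the batch.
await Promise.all(certArray.map(cert =>
  Promise.resolve(processCert(cert)).catch(err => {
    console.error(`Failed to create PDF for cert ${cert.Cert_ID}:`, err);
  })
));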
NOTE I didn't actually test this code at all, so there might be some bugs in it. But the overall ideas and principles still apply.

Related

fs.createWriteStream doesn't use back-pressure when writing data to a file, causing high memory usage

Problem
I'm trying to scan a drive directory (recursively walk all the paths) and write all the paths to a file (as it's finding them) using fs.createWriteStream in order to keep the memory usage low, but it doesn't work: the memory usage reaches 2 GB during the scan.
Expected
I was expecting fs.createWriteStream to automatically handle memory/disk usage at all times, keeping memory usage at a minimum with back-pressure.
Code
const fs = require('fs')
const walkdir = require('walkdir')

let dir = 'C:/'
let options = {
  "max_depth": 0,
  "track_inodes": true,
  "return_object": false,
  "no_return": true,
}

const wstream = fs.createWriteStream("C:/Users/USERNAME/Desktop/paths.txt")
let walker = walkdir(dir, options)

walker.on('path', (path) => {
  wstream.write(path + '\n')
})

walker.on('end', (path) => {
  wstream.end()
})
Is it because I'm not using .pipe()? I tried creating a new Stream.Readable({ read() {} }) and then pushing paths into it with readable.push(path) inside the .on('path') handler, but that didn't really work.
UPDATE:
Method 2:
I tried the drain method proposed in the answers, but it doesn't help much; it does reduce memory usage to about 500 MB (which is still too much for a stream), but it slows down the code significantly (from seconds to minutes).
Method 3:
I also tried using readdirp; it uses even less memory (~400 MB) and is faster, but I don't know how to pause it and apply the drain method there to reduce the memory usage further:
const fs = require('fs')
const readdirp = require('readdirp')

let dir = 'C:/'
const wstream = fs.createWriteStream("C:/Users/USERNAME/Desktop/paths.txt")

readdirp(dir, {alwaysStat: false, type: 'files_directories'})
  .on('data', (entry) => {
    wstream.write(`${entry.fullPath}\n`)
  })
Method 4:
I also tried doing this operation with a custom recursive walker; even though it uses only 30 MB of memory, which is what I wanted, it is about 10 times slower than the readdirp method and it is synchronous, which is undesirable:
const fs = require('fs')
const path = require('path')

let dir = 'C:/'

function customRecursiveWalker(dir) {
  fs.readdirSync(dir).forEach(file => {
    let fullPath = path.join(dir, file)
    // Folders
    if (fs.lstatSync(fullPath).isDirectory()) {
      fs.appendFileSync("C:/Users/USERNAME/Desktop/paths.txt", `${fullPath}\n`)
      customRecursiveWalker(fullPath)
    }
    // Files
    else {
      fs.appendFileSync("C:/Users/USERNAME/Desktop/paths.txt", `${fullPath}\n`)
    }
  })
}

customRecursiveWalker(dir)
Preliminary observation: you've attempted to get the results you want using multiple approaches. One complication when comparing them is that they do not all do the same work. If you run tests on a file tree that contains only regular files and no mount points, you can probably compare the approaches fairly, but once you start adding mount points, symbolic links, etc., you may get different memory and time statistics merely because one approach excludes files that another approach includes.
I initially attempted a solution using readdirp, but unfortunately that library appears buggy to me. Running it on my system here, I got inconsistent results: one run would output 10 MB of data, another run with the same input parameters would output 22 MB, then I'd get another number, and so on. I looked at the code and found that it does not respect the return value of push:
_push(entry) {
  if (this.readable) {
    this.push(entry);
  }
}
As per the documentation, the push method may return false, in which case the Readable stream should stop producing data and wait until _read is called again. readdirp entirely ignores that part of the specification. Paying attention to the return value of push is crucial for proper handling of back-pressure. There are also other things in that code that seemed questionable to me.
So I abandoned that and worked on a proof of concept showing how it could be done. The crucial parts are:
When the push method returns false it is imperative to stop adding data to the stream. Instead, we record where we were, and stop.
We start again only when _read is called.
If you uncomment the console.log statements that print START and STOP, you'll see them printed out in succession on the console. We start, produce data until Node tells us to stop, then we stop until Node tells us to start again, and so on.
const stream = require("stream");
const fs = require("fs");
const { readdir, lstat } = fs.promises;
const path = require("path");
class Walk extends stream.Readable {
constructor(root, maxDepth = Infinity) {
super();
this._maxDepth = maxDepth;
// These fields allow us to remember where we were when we have to pause our
// work.
// The path of the directory to process when we resume processing, and the
// depth of this directory.
this._curdir = [root, 1];
// The directories still to process.
this._dirs = [this._curdir];
// The list of files to process when we resume processing.
this._files = [];
// The location in `this._files` where to continue processing when we resume.
this._ix = 0;
// A flag recording whether or not the fetching of files is currently going
// on.
this._started = false;
}
async _fetch() {
// Recall where we were by loading the state in local variables.
let files = this._files;
let dirs = this._dirs;
let [dir, depth] = this._curdir;
let ix = this._ix;
while (true) {
// If we've gone past the end of the files we were processing, then
// just forget about them. This simplifies the code that follows a bit.
if (ix >= files.length) {
ix = 0;
files = [];
}
// Read directories until we have files to process.
while (!files.length) {
// We've read everything, end the stream.
if (dirs.length === 0) {
// This is how the stream API requires us to indicate the stream has
// ended.
this.push(null);
// We're no longer running.
this._started = false;
return;
}
// Here, we get the next directory to process and get the list of
// files in it.
[dir, depth] = dirs.pop();
try {
files = await readdir(dir, { withFileTypes: true });
}
catch (ex) {
// This is a proof-of-concept. In a real application, you should
// determine what exceptions you want to ignore (e.g. EPERM).
}
}
// Process each file.
for (; ix < files.length; ++ix) {
const dirent = files[ix];
// Don't include in the results those files that are not directories,
// files or symbolic links.
if (!(dirent.isFile() || dirent.isDirectory() || dirent.isSymbolicLink())) {
continue;
}
const fullPath = path.join(dir, dirent.name);
if (dirent.isDirectory() && depth < this._maxDepth) {
// Keep track that we need to walk this directory.
dirs.push([fullPath, depth + 1]);
}
// Finally, we can put the data into the stream!
if (!this.push(`${fullPath}\n`)) {
// If the push returned false, we have to stop pushing results to the
// stream until _read is called again, so we have to stop.
// Uncomment this if you want to see when the stream stops.
// console.log("STOP");
// Record where we were in our processing.
this._files = files;
// The element at ix *has* been processed, so ix + 1.
this._ix = ix + 1;
this._curdir = [dir, depth];
// We're stopping, so indicate that!
this._started = false;
return;
}
}
}
}
async _read() {
// Do not start the process that puts data on the stream over and over
// again.
if (this._started) {
return;
}
this._started = true; // Yep, we've started.
// Uncomment this if you want to see when the stream starts.
// console.log("START");
await this._fetch();
}
}
// Change the paths to something that makes sense for you.
stream.pipeline(new Walk("/home/", 5),
fs.createWriteStream("/tmp/paths3.txt"),
(err) => console.log("ended with", err));
When I run the first attempt you made with walkdir here, I get the following statistics:
Elapsed time (wall clock): 59 sec
Maximum resident set size: 2.90 GB
When I use the code I've shown above:
Elapsed time (wall clock): 35 sec
Maximum resident set size: 0.1 GB
The file tree I use for the tests produces a file listing of 792 MB
You could exploit the return value of WritableStream.write(): it essentially tells you whether you should continue to read or not. A WritableStream has an internal property that stores the threshold after which the buffer should be processed by the OS. The drain event will be emitted when the buffer has been flushed, i.e. when you can safely call WritableStream.write() again without risking excessively filling the buffer (which means the RAM). Luckily for you, walkdir lets you control the process: the walker object exposes pause (pause the walk; no more events will be emitted until resume) and resume (resume the walk), so you can pause and resume the walk according to the state of your write stream. Try this:
// "walker" is the instance created with walkdir(dir, options) in the question.
let is_emitter_paused = false;

wstream.on('drain', (evt) => {
  if (is_emitter_paused) {
    walker.resume();
  }
});

walker.on('path', function(path, stat) {
  is_emitter_paused = !wstream.write(path + '\n');
  if (is_emitter_paused) {
    walker.pause();
  }
});
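The same pattern should also work for the readdirp attempt from the question, since readdirp returns a standard Readable stream with pause() and resume() (a sketch assuming readdirp v3, not tested):

const fs = require('fs');
const readdirp = require('readdirp');

const wstream = fs.createWriteStream("C:/Users/USERNAME/Desktop/paths.txt");
const entries = readdirp('C:/', { alwaysStat: false, type: 'files_directories' });

entries.on('data', (entry) => {
  // write() returns false when its internal buffer is full: pause the walk.
  if (!wstream.write(`${entry.fullPath}\n`)) entries.pause();
});
wstream.on('drain', () => entries.resume()); // buffer flushed, keep walking
entries.on('end', () => wstream.end());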
Here's an implementation inspired by Louis's answer. I think it's a bit easier to follow, and in my minimal testing it performs about the same.
const fs = require('fs');
const path = require('path');
const stream = require('stream');
class Walker extends stream.Readable {
constructor(root = process.cwd(), maxDepth = Infinity) {
super();
// Dirs to process
this._dirs = [{ path: root, depth: 0 }];
// Max traversal depth
this._maxDepth = maxDepth;
// Files to flush
this._files = [];
}
_drain() {
while (this._files.length > 0) {
const file = this._files.pop();
if (file.isFile() || file.isDirectory() || file.isSymbolicLink()) {
const filePath = path.join(this._dir.path, file.name);
if (file.isDirectory() && this._maxDepth > this._dir.depth) {
// Add directory to be walked at a later time
this._dirs.push({ path: filePath, depth: this._dir.depth + 1 });
}
if (!this.push(`${filePath}\n`)) {
// Halt walking
return false;
}
}
}
if (this._dirs.length === 0) {
// Walking complete
this.push(null);
return false;
}
// Continue walking
return true;
}
async _step() {
try {
this._dir = this._dirs.pop();
this._files = await fs.promises.readdir(this._dir.path, { withFileTypes: true });
} catch (e) {
this.emit('error', e); // Uh oh...
}
}
async _walk() {
this.walking = true;
while (this._drain()) {
await this._step();
}
this.walking = false;
}
_read() {
if (!this.walking) {
this._walk();
}
}
}
stream.pipeline(new Walker('some/dir/path', 5),
fs.createWriteStream('output.txt'),
(err) => console.log('ended with', err));

raw h264 to GIF node js

I am trying to use the "pi-camera" library, which is working and allows me to record video in a raw h264 format on my Raspberry Pi. However, the Node.js library "gifify" continuously gives me the error "RangeError: Maximum call stack size exceeded". Looking this error up, it seems to be related to calling many functions within functions repeatedly, or something along those lines. However, my code only uses one function, which contains a simple command to take the video and then convert it.
const PiCamera = require('pi-camera');
var fs = require('fs');
var gifify = require('gifify');
var path = require('path');
var sleep = require('system-sleep');

const myCamera = new PiCamera({
  mode: 'video',
  output: `/home/pi/Videos/video.h264`,
  width: 640,
  height: 480,
  time: 5000,
  nopreview: true,
  vflip: true,
});

var input = path.join('/home/pi/Videos', 'video.h264');
var output = path.join('/home/pi/Videos', 'daily.gif');
var gif = fs.createWriteStream(output);

var options = {
  speed: 5,
  text: 'Daily Plant GIF'
};

sleep(5000);
setInterval(vid, 10000);

function vid() {
  myCamera.record()
    .then((result) => {
      console.log('success');
      gifify(input, options).pipe(gif);
    })
    .catch((error) => {
      console.log(error);
    });
}
Any information on what this error truly means in this scenario, or how to fix it, would be much appreciated. Thank you!
An error can be related not only to your code but also to the libraries you are using.
I see at least a few issues reported against gifify about "maximum call stack size exceeded".
Here is an open one:
https://github.com/vvo/gifify/issues/94
I'm not sure if there is a workaround in your case; maybe you need to try different parameters or look for a different library.
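For example, one possible workaround (assuming ffmpeg is installed on the Pi, which gifify itself requires anyway) is to skip gifify and shell out to ffmpeg directly, roughly like this:

const { execFile } = require('child_process');
const path = require('path');

const input = path.join('/home/pi/Videos', 'video.h264');
const output = path.join('/home/pi/Videos', 'daily.gif');

function makeGif(done) {
  // Drop the frame rate and scale down so the resulting GIF stays small.
  execFile('ffmpeg', ['-y', '-i', input, '-vf', 'fps=10,scale=320:-1', output],
    (err) => done(err));
}

makeGif(err => console.log(err ? err : `GIF written to ${output}`));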

Listen for hot update events on the client side with webpack-dev-server?

This is a bit of an edge case but it would be helpful to know.
When developing an extension using webpack-dev-server to keep the extension code up to date, it would be useful to listen to "webpackHotUpdate"
Chrome extensions with content scripts often have two sides to the equation:
Background
Injected Content Script
When using webpack-dev-server with HMR, the background page stays in sync just fine. However, content scripts require a reload of the extension in order to reflect the changes. I can remedy this by listening to the "webpackHotUpdate" event from the hot emitter and then requesting a reload. At present I have this working in a terrible and very unreliable, hacky way.
var hotEmitter = __webpack_require__(XX)

hotEmitter.on('webpackHotUpdate', function() {
  console.log('Reloading Extension')
  chrome.runtime.reload()
})
XX simply represents the number that is currently assigned to the emitter. As you can imagine, this changes whenever the build changes, so it's a very temporary proof-of-concept sort of thing.
I suppose I could set up my own socket but that seems like overkill, given the events are already being transferred and I simply want to listen.
I am just recently getting more familiar with the webpack ecosystem so any guidance is much appreciated.
Okay!
I worked this out by looking around here:
https://github.com/facebookincubator/create-react-app/blob/master/packages/react-dev-utils/webpackHotDevClient.js
Many thanks to the create-react-app team for their judicious use of comments.
I created a slimmed down version of this specifically for handling the reload condition for extension development.
var SockJS = require('sockjs-client')
var url = require('url')

// Connect to WebpackDevServer via a socket.
var connection = new SockJS(
  url.format({
    // Default values - update to your own
    protocol: 'http',
    hostname: 'localhost',
    port: '3000',
    // Hardcoded in WebpackDevServer
    pathname: '/sockjs-node',
  })
)
var isFirstCompilation = true
var mostRecentCompilationHash = null

connection.onmessage = function(e) {
  var message = JSON.parse(e.data)
  switch (message.type) {
    case 'hash':
      handleAvailableHash(message.data)
      break
    case 'still-ok':
    case 'ok':
    case 'content-changed':
      handleSuccess()
      break
    default:
      // Do nothing.
  }
}

// Is there a newer version of this code available?
function isUpdateAvailable() {
  /* globals __webpack_hash__ */
  // __webpack_hash__ is the hash of the current compilation.
  // It's a global variable injected by Webpack.
  return mostRecentCompilationHash !== __webpack_hash__
}

function handleAvailableHash(data) {
  mostRecentCompilationHash = data
}

function handleSuccess() {
  var isHotUpdate = !isFirstCompilation
  isFirstCompilation = false
  if (isHotUpdate) { handleUpdates() }
}

function handleUpdates() {
  if (!isUpdateAvailable()) return
  console.log('%c Reloading Extension', 'color: #FF00FF')
  chrome.runtime.reload()
}
When you are ready to use it (during development only) you can simply add it to your background.js entry point
module.exports = {
  entry: {
    background: [
      path.resolve(__dirname, 'reloader.js'),
      path.resolve(__dirname, 'background.js')
    ]
  }
}
For actually hooking into the event emitter, as was originally asked, you can just require it from webpack/hot/emitter, since that file exports the EventEmitter instance that's used.
if (module.hot) {
  var lastHash

  var upToDate = function upToDate() {
    return lastHash.indexOf(__webpack_hash__) >= 0
  }

  var clientEmitter = require('webpack/hot/emitter')

  clientEmitter.on('webpackHotUpdate', function(currentHash) {
    lastHash = currentHash
    if (upToDate()) return
    console.log('%c Reloading Extension', 'color: #FF00FF')
    chrome.runtime.reload()
  })
}
This is just a stripped down version straight from the source:
https://github.com/webpack/webpack/blob/master/hot/dev-server.js
I've fine-tuned the core logic of the crx-hotreload package and come up with a build-tool agnostic solution (meaning it will work with Webpack but also with anything else).
It asks the extension for its directory (via chrome.runtime.getPackageDirectoryEntry) and then watches that directory for file changes. Once a file is added/removed/changed inside that directory, it calls chrome.runtime.reload().
If you'd need to also reload the active tab (when developing a content script), then you should run a tabs.query, get the first (active) tab from the results and call reload on it as well.
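For that tab reload, something along these lines should work (a sketch; it assumes the extension has the tabs permission):

// Reload the currently active tab so an updated content script is re-injected.
chrome.tabs.query({ active: true, currentWindow: true }, tabs => {
  if (tabs[0]) chrome.tabs.reload(tabs[0].id)
})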
The whole logic is ~35 lines of code:
/* global chrome */
const filesInDirectory = dir => new Promise(resolve =>
  dir.createReader().readEntries(entries =>
    Promise.all(entries.filter(e => e.name[0] !== '.').map(e =>
      e.isDirectory
        ? filesInDirectory(e)
        : new Promise(resolve => e.file(resolve))
    ))
    .then(files => [].concat(...files))
    .then(resolve)
  )
)

const timestampForFilesInDirectory = dir => filesInDirectory(dir)
  .then(files => files.map(f => f.name + f.lastModifiedDate).join())

const watchChanges = (dir, lastTimestamp) => {
  timestampForFilesInDirectory(dir).then(timestamp => {
    if (!lastTimestamp || (lastTimestamp === timestamp)) {
      setTimeout(() => watchChanges(dir, timestamp), 1000)
    } else {
      console.log('%c 🚀 Reloading Extension', 'color: #FF00FF')
      chrome.runtime.reload()
    }
  })
}

// Init if in dev environment
chrome.management.getSelf(self => {
  if (self.installType === 'development' &&
    'getPackageDirectoryEntry' in chrome.runtime
  ) {
    console.log('%c 📦 Watching for file changes', 'color: #FF00FF')
    chrome.runtime.getPackageDirectoryEntry(dir => watchChanges(dir))
  }
})
You should add this script to your manifest.json file's background scripts entry:
"background": ["reloader.js", "background.js"]
And a Gist with a light explanation in the Readme: https://gist.github.com/andreasvirkus/c9f91ddb201fc78042bf7d814af47121

How to handle Shopify's API call limit using microapps Node.js module

I have been banging my head trying to find an answer for this, and I just can't figure it out. I am using the Node.js module for the Shopify API by microapps. I have a JSON file containing a list of product ids and skus that I need to update, so I am looping through the file and calling a function that calls the API. Shopify's API limits calls and sends a response header with the remaining quota, and this node module exposes an object containing the limits and usage. My question is: based on the code below, how can I add a setTimeout or similar when I am reaching the limit? Once you make your first call, it will return the limits object like this:
{
  remaining: 30,
  current: 10,
  max: 40
}
Here is what I have, without respecting the limits, as everything I tried failed:
const products = JSON.parse(fs.readFileSync('./skus.json', 'utf8'));

for (var i = 0; i < products.length; i++) {
  updateProduct(products[i]);
}

function updateProduct(product) {
  shopify.productVariant.update(product.id, { sku: product.sku })
    .then(result => cb(shopify.callLimits.remaining)) // cb is defined elsewhere
    .catch(err => console.error(err.statusMessage));
}
I know I need to implement some sort of callback to check if the remaining usage is low and then wait a few seconds before calling again. Any help would be greatly appreciated.
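In other words, something along these lines is what I'm after (just a sketch of the idea, not tested):

// Sketch: process products sequentially and wait whenever the remaining
// call quota drops too low. Threshold and delay are arbitrary.
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

async function updateAll(products) {
  for (const product of products) {
    const { remaining } = shopify.callLimits;
    if (remaining !== undefined && remaining <= 5) {
      await delay(2000); // give the bucket time to refill
    }
    await shopify.productVariant.update(product.id, { sku: product.sku });
  }
}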
I would use something to limit the execution rate of the function used by shopify-api-node (Shopify.prototype.request) to create the request, for example https://github.com/lpinca/valvelet.
The code below is not tested but should work. It should respect the limit of 2 calls per second.
var Shopify = require('shopify-api-node');
var valvelet = require('valvelet');

var products = require('./skus');

var shopify = new Shopify({
  shopName: 'your-shop-name',
  apiKey: 'your-api-key',
  password: 'your-app-password'
});

// Prevent the private shopify.request method from being called more than twice per second.
shopify.request = valvelet(shopify.request, 2, 1000);

var promises = products.map(function (product) {
  return shopify.productVariant.update(product.id, { sku: product.sku });
});

Promise.all(promises).then(function (values) {
  // Do something with the responses.
}).catch(function (err) {
  console.error(err.stack);
});
Try making use of the autoLimit option, for example:
import Shopify from 'shopify-api-node';

const getAutoLimit = (plan: string) => {
  if (plan === 'plus') {
    return { calls: 4, interval: 1000, bucketSize: 80 };
  } else {
    return { calls: 2, interval: 1000, bucketSize: 40 };
  }
};

const shopify = new Shopify({
  shopName: process.env.SHOPIFY_SHOP_NAME!,
  apiKey: process.env.SHOPIFY_SHOP_API_KEY!,
  password: process.env.SHOPIFY_SHOP_PASSWORD!,
  apiVersion: '2020-07',
  autoLimit: getAutoLimit(process.env.SHOPIFY_SHOP_PLAN),
});

export default shopify;
According to the library's documentation:
- `autoLimit` - Optional - This option allows you to regulate the request rate
in order to avoid hitting the [rate limit][api-call-limit]. Requests are
limited using the token bucket algorithm. Accepted values are a boolean or a
plain JavaScript object. When using an object, the `calls` property and the
`interval` property specify the refill rate and the `bucketSize` property the
bucket size. For example `{ calls: 2, interval: 1000, bucketSize: 35 }`
specifies a limit of 2 requests per second with a burst of 35 requests. When
set to `true` requests are limited as specified in the above example. Defaults
to `false`.
And this is the version I tried: "shopify-api-node": "^3.3.2"
Regarding the rate limits, refer to Shopify's documentation.
try this...
const Shopify = require("shopify-api-node");

const waitonlimit = 2;
let calllimitremain = 40;

const shopify = new Shopify({
  shopName: process.env.SHOPIFY_URL,
  apiKey: process.env.SHOPIFY_KEY,
  password: process.env.SHOPIFY_PWD,
  autoLimit: true,
});

shopify.on("callLimits", (limits) => {
  calllimitremain = limits.remaining;
  if (limits.remaining < 10) {
    console.log(limits);
  }
});

// onlineVariantId, price and promo are assumed to be defined in the surrounding scope.
exports.update = async () => {
  // Run this before the update: make cheap calls (and await them, so the
  // callLimits event can update calllimitremain) until the bucket refills.
  while (calllimitremain <= waitonlimit) {
    await shopify.product.list({ limit: 1, fields: "id, title" });
    console.log(`Waiting for bucket to fill: ${calllimitremain}`);
  }

  // Update
  await shopify.productVariant.update(
    onlineVariantId,
    { compare_at_price: price, price: promo }
  );
};
If you look at Shopify's own code, their GitHub repository has a CLI, and that CLI deals with these limits. You can quickly learn how Shopify handles them by looking at that code.
Since their code is in Ruby, it is pretty easy to digest. It should not take a skilled JS programmer more than a few minutes to see how to deal with the limits based on that code, even abstracting from Ruby.
So my suggestion is to read that Shopify code and then morph your JS code to match the same pattern.
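The pattern those tools use boils down to "pause and retry when the API says you are over the limit". A generic sketch of that idea (the isRateLimited check is a placeholder; adapt it to however your client surfaces HTTP 429 responses):

const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

// Placeholder predicate: for most REST clients a rate-limit error carries an
// HTTP 429 status, but check how your library reports it.
const isRateLimited = err => err && err.statusCode === 429;

async function withRetry(fn, { retries = 5, waitMs = 2000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries || !isRateLimited(err)) throw err;
      await delay(waitMs); // back off, then try again
    }
  }
}

// Example: withRetry(() => shopify.productVariant.update(id, { sku: 'ABC-1' }));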

Update HTML object with node.js and javascript

I'm new to Node.js and jQuery, and I'm trying to update a single HTML element using a script.
I am using a Raspberry Pi 2 and an ultrasonic sensor to measure distance. I want to measure continuously and update the HTML document with the real-time values at the same time.
When I try to run my code it behaves like a server and not a client. Everything that I console.log() prints in the cmd window and not in the browser's console. When I run my code now I do it with "sudo node surveyor.js", but nothing happens in the HTML document. I have linked the script properly in the document. I have also tried document.getElementsByTagName("h6").innerHTML = distance.toFixed(2), but the error is "document is not defined".
Is there any easy way to fix this?
My code so far is:
var statistics = require('math-statistics');
var usonic = require('r-pi-usonic');
var fs = require("fs");
var path = require("path");
var jsdom = require("jsdom");
var htmlSource = fs.readFileSync("../index.html", "utf8");
var init = function(config) {
usonic.init(function (error) {
if (error) {
console.log('error');
} else {
var sensor = usonic.createSensor(config.echoPin, config.triggerPin, config.timeout);
//console.log(config);
var distances;
(function measure() {
if (!distances || distances.length === config.rate) {
if (distances) {
print(distances);
}
distances = [];
}
setTimeout(function() {
distances.push(sensor());
measure();
}, config.delay);
}());
}
});
};
var print = function(distances) {
var distance = statistics.median(distances);
process.stdout.clearLine();
process.stdout.cursorTo(0);
if (distance < 0) {
process.stdout.write('Error: Measurement timeout.\n');
} else {
process.stdout.write('Distance: ' + distance.toFixed(2) + ' cm');
call_jsdom(htmlSource, function (window) {
var $ = window.$;
$("h6").replaceWith(distance.toFixed(2));
console.log(documentToSource(window.document));
});
}
};
function documentToSource(doc) {
// The non-standard window.document.outerHTML also exists,
// but currently does not preserve source code structure as well
// The following two operations are non-standard
return doc.doctype.toString()+doc.innerHTML;
}
function call_jsdom(source, callback) {
jsdom.env(
source,
[ 'jquery-1.7.1.min.js' ],
function(errors, window) {
process.nextTick(
function () {
if (errors) {
throw new Error("There were errors: "+errors);
}
callback(window);
}
);
}
);
}
init({
echoPin: 15, //Echo pin
triggerPin: 14, //Trigger pin
timeout: 1000, //Measurement timeout in ยตs
delay: 60, //Measurement delay in ms
rate: 5 //Measurements per sample
});
Node.js is a server-side implementation of JavaScript. It's fine to do all the sensor operations and calculations on the server side, but you need some mechanism to deliver the results to your clients. If they are going to use your application through a web browser, you must run an HTTP server, like Express.js, and create a route (something like http://localhost/surveyor or just http://localhost/) that calls a method you have implemented on the server side and does something with the result. One possible way to return this resulting data to the clients is by rendering an HTML page that shows it; for that you should use a template engine.
Any DOM manipulation should be done on the client side (you could, for example, include a <script> tag inside your template HTML just to try it out and understand how it works, but that is not recommended in production environments).
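As a rough sketch of the server-side part (assuming Express is installed; the /distance route name is made up), the measurement loop can store its latest median value and a route can expose it, while a small script in index.html polls that route and writes the value into the <h6> element:

// Server-side sketch: expose the latest measurement over HTTP.
const express = require('express');
const app = express();

let latestDistance = null; // update this from print(), e.g. latestDistance = distance;

app.use(express.static('public')); // serves index.html and its client-side JS
app.get('/distance', (req, res) => res.json({ distance: latestDistance }));

app.listen(3000, () => console.log('Listening on http://localhost:3000'));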
Try searching Google for Node.js examples and tutorials and you will get it :)
