Performance issue with gulp.watch() on directory with ~400 files

I’ve noticed that if I gulp.watch() a directory with ~400 Markdown files, my Gulp task takes a long time to initialize (~19s on my machine). If I remove this specific .watch() call, my task initialization time shrinks to less than 100ms.
gulp.task('my-task', function () {
    // these calls are very quick (< 100ms)
    gulp.watch('source/styl/*.styl', ['build-css']);
    gulp.watch('source/js/index/*.js', ['build-js']);
    gulp.watch(['app.js', 'modules/*.js', 'routes/*.js', 'views/*.jade'], [server.run]);
    gulp.watch(['source/js/index/*.js', 'app.js', 'modules/*.js', 'routes/*.js', 'gulpfile.js'], ['lint-js']);
    gulp.watch('source/img/**/*', ['compress-images']);
    // this call takes ~19s to complete
    gulp.watch('source/md/releases/*.md', ['build-releases']);
});
Is there anything I can do to resolve this performance issue, or is watching directories with hundreds of files not feasible in Gulp?
Update: I have switched to a callback function:
gulp.watch('source/md/releases/*.md', function (e) {
    // console.log(e.path);
});
and I’m still experiencing the same performance issue.

Related

How do I make a gulp function run in series? (gulp-download)

I've been trying to set up functions to download and then install frameworks into my development workflow with gulp-download. Every time I run the download in series, it ends up running last: the install function tries to move the files before they have downloaded.
I tried using merge to run everything in a single function, and I also tried splitting it up and running the pieces as tasks in series. Neither approach worked.
// DOWNLOAD BOOTSTRAP
function downloadBootstrap(cb) {
    download('https://github.com/twbs/bootstrap/releases/download/v4.0.0/bootstrap-4.0.0-dist.zip')
        .pipe(unzip())
        .pipe(gulp.dest('downloads/bootstrap'));
    cb();
}
// INSTALL BOOTSTRAP
function installBootstrap(cb) {
    var css = gulp.src('downloads/bootstrap/css/*.{min.css,min.css.map}')
        .pipe(dest('_developer/2-css/0-frameworks/bootstrap'));
    var js = gulp.src('downloads/bootstrap/js/*.{min.js,min.js.map}')
        .pipe(dest('_developer/3-js/0-frameworks/bootstrap'));
    var clear = gulp.src('downloads/bootstrap', {read: false, allowEmpty: true})
        .pipe(clean());
    return merge(css, js, clear); // Combined src
    cb();
}
gulp.task('bootstrap', gulp.series('downloadBootstrap', 'installBootstrap'));
You need to make sure you only call the callback once your task is actually complete. Inside downloadBootstrap, the download(...) call returning doesn't mean the download has finished. However, since you're using a pipeline here, you can eliminate the callback parameter altogether. (On an unrelated note, I would avoid having two things named download.)
// DOWNLOAD BOOTSTRAP
function downloadBootstrap() {
    return download('https://github.com/twbs/bootstrap/releases/download/v4.0.0/bootstrap-4.0.0-dist.zip')
        .pipe(unzip())
        .pipe(gulp.dest('downloads/bootstrap'));
}
Returning the stream will ensure that the stream has been fully processed before the next task continues. Also, you can just remove the callback from your install function since it will never actually be used. (Nothing runs after your function returns.)
I worked it out: I just had to add return to the download. Now the download finishes before it moves on to the install, but now only the CSS is migrated; the JS remains and the clean isn't running. I had these working before.
gulp.task("bootstrap", gulp.series(downloadBootstrap, installBootstrap));
function downloadBootstrap() {
return download('https://github.com/twbs/bootstrap/releases/download/v4.0.0/bootstrap-4.0.0-dist.zip')
.pipe(unzip())
.pipe(gulp.dest('downloads/bootstrap'));
}
function installBootstrap() {
return gulp.src('downloads/bootstrap/css/*.{min.css,min.css.map}') // Gather [developer] css files
.pipe(gulp.dest('_developer/2-css/0-frameworks/bootstrap')) // Deliver css files to [build]
return gulp.src('downloads/bootstrap/js/*.{min.js,min.js.map}') // Gather [developer] css files
.pipe(gulp.dest('_developer/3-js/0-frameworks/bootstrap')) // Deliver css files to [build]
return gulp.src('downloads/bootstrap', { allowEmpty: true })
.pipe(clean());
}
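For completeness, here is one way to finish the job (a sketch of mine, not from the thread, assuming gulp 4 and the asker's paths). A function exits at its first return, which is why only the CSS stream ever ran; splitting the three streams into plain functions and composing them with gulp.series and gulp.parallel lets every stream complete, and defers the clean until both copies have finished:
function copyCss() {
    return gulp.src('downloads/bootstrap/css/*.{min.css,min.css.map}')
        .pipe(gulp.dest('_developer/2-css/0-frameworks/bootstrap'));
}

function copyJs() {
    return gulp.src('downloads/bootstrap/js/*.{min.js,min.js.map}')
        .pipe(gulp.dest('_developer/3-js/0-frameworks/bootstrap'));
}

function cleanDownloads() {
    // read: false skips loading file contents; clean only needs paths
    return gulp.src('downloads/bootstrap', { read: false, allowEmpty: true })
        .pipe(clean());
}

gulp.task('bootstrap', gulp.series(
    downloadBootstrap,              // from the snippet above
    gulp.parallel(copyCss, copyJs), // both copies read from the same source dir
    cleanDownloads                  // delete only after both copies finish
));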

Gulp Watch Not calling associated gulp task

So I'm coming back to a personal project after a (prolonged) break, and I'm getting back into the flow of things. One thing that smacks me in the face is that my watchers aren't working. If, for example, I run gulp watch-for-jshint, I expect JSHint to run anytime I modify a JS file, either a root one (gulp.js) or one of my client-app ones (./client-app/js/controllers/register.js). That's not happening. I can see the watch starting, but it never calls JSHint when I change my files!
Anyone have any ideas what is going wrong or how to diagnose? As far as I can tell, I have the syntax right... And because (for testing purposes) I'm directly calling watch-for-jshint, I should be avoiding the usual pitfall of never actually calling my watcher.
Simplified gulp file:
var gulp = require('gulp'),
    clean = require('gulp-clean'),
    jshint = require('gulp-jshint'),
    notify = require('gulp-notify'),
    watch = require('gulp-watch');

gulp.task('jshint', function(){
    console.log("Running JSHint");
    return gulp.src(['*.js', 'client-app/**/*.js'], ['jshint'])
        .pipe(jshint({"moz": true}))
        .pipe(jshint.reporter('default'))
        .pipe(jshint.reporter('fail'))
        .on('error', notify.onError({ sound: "Funk" }));
});

gulp.task('watch-for-jshint', function(){
    return watch(['*.js', './client-app/**/*.js'], ['jshint']);
});
I've truncated a lot of extraneous tasks, but what's above should be everything relevant.
And naturally, five minutes after posting, I notice one difference between my syntax and what a lot of other questions are using: most people use gulp.watch, while I was following the syntax on the gulp-watch npm module page and calling plain watch.
I'm not entirely sure why one works and the other doesn't; oddly, it seems like the new version shouldn't work, since it never references the gulp-watch module.
Remove the following line from the gulp task if it is not necessary:
.on('error', notify.onError({ sound: "Funk" }))
This line makes the gulp task stop and surface the error, so the watch stops along with it.
Replace
return watch(['*.js', './client-app/**/*.js'], ['jshint']);
with
gulp.watch(['*.js', './client-app/**/*.js'], ['jshint']);
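To spell out why the original call silently did nothing (my reading, not stated in the answer): the two APIs share a name but disagree about the second argument. gulp 3's built-in gulp.watch accepts an array of task names, while the gulp-watch plugin expects an options object or a callback, so a task array handed to the plugin is ignored:
// gulp 3 built-in: the second argument is an array of task names
gulp.watch(['*.js', './client-app/**/*.js'], ['jshint']);

// gulp-watch plugin: the second argument must be a callback (or options);
// to run a task from it, trigger the task manually with gulp 3's gulp.start()
watch(['*.js', './client-app/**/*.js'], function () {
    gulp.start('jshint');
});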

RequireJS main require call back never invoked

I've tried this post's advice to no avail. No matter what I do, RequireJS's primary callback is never fired. Here's my dependency graph:
/module/main.js
-- /module/mainController.js
-- /vendor/preloadjs
-- /module/itemController.js
-- /module/router.js
-- /vendor/crossroads.js
-- /vendor/signals.js
-- /vendor/hasher.js
-- /vendor/signals.js
require.config({
    baseUrl: "script/module",
    paths: {
        signals: "vendor/signals"
    }
});

require(["main", function(){
    console.log("main function!");
}]);
The "main" module makes use of js-signals, and actually gets invoked. In fact the entire dependency tree is loaded (confirmed via the web inspector). I have a single entry point for the application. All modules start up and actually run fine. You'd think that if the main application callback doesn't run that one or all of its dependencies would have failed.
I'm sure there is some stupid reason I'm not kicking off the primary require's callback. For the record I've tried using the requirejs() method and get the same results.
No files have code in them except for dependencies and console.logs.
Does anyone have any ideas for what I'm doing wrong?
I'll answer my own silliness. The problem is right there in the source: I'm using Angular module injection syntax and passing the callback as part of the dependency array. Oops. It needs to be the second argument!
Long story short, the code is otherwise right, but this is the change:
require.config({
    baseUrl: "script/module",
    paths: {
        signals: "vendor/signals"
    }
});

require(["main"], function(){
    console.log("main function!");
});
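As an aside (my addition, not from the thread), require() also accepts an error callback as a third argument. Wiring one up is a quick way to distinguish "a dependency failed to load" from a call-site mistake like the one above:
require(["main"], function () {
    console.log("main function!");
}, function (err) {
    // err.requireModules lists the module IDs that failed to load
    console.error("Failed to load:", err.requireModules, err);
});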

Make Gulp task synchronous without creating a bunch of other tasks

So I am writing up my gulpfile.js and I have come to a point where I need to avoid JavaScript's asynchronous behavior. In short, this is for file-system read/write.
The problem is that all the solutions I have found online so far create several sub-tasks, which I want to avoid so that I don't have to write confusing documentation about which tasks should and shouldn't be run from the command line, what order they need to run in, and so on.
My Question: How can I make the below script run each part synchronously, without creating sub-tasks?
gulp.task('rebuild', function(){
    // Remove old build
    gulp.src('build/', {read: false})
        .pipe(rimraf());
    // Copy all of the non generated files
    gulp.src('src/**/*')
        .pipe(gulp.dest('build/'));
    // Parse SASS/LESS and minify JS
    build_styles();
    build_scripts();
});
Well, if all else fails, you can always fall back to callback hell:
gulp.task('rebuild', function(){
    // Remove old build
    gulp.src('build/', {read: false})
        .pipe(rimraf())
        .on('end', function () {
            // Copy all of the non generated files
            gulp.src('src/**/*')
                .pipe(gulp.dest('build/'))
                .on('end', function () {
                    // Parse SASS/LESS and minify JS
                    build_styles();
                    build_scripts();
                });
        });
});
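For contrast, a sketch of how the same pipeline can avoid both the nesting and the extra command-line tasks (my addition, assuming gulp 4 is an option and that build_styles and build_scripts return their streams): gulp.series composes plain functions, so nothing new is registered as a runnable task:
function cleanBuild() {
    // Remove old build; read: false skips loading file contents
    return gulp.src('build/', { read: false, allowEmpty: true })
        .pipe(rimraf());
}

function copySources() {
    // Copy all of the non-generated files
    return gulp.src('src/**/*')
        .pipe(gulp.dest('build/'));
}

// Each function returns its stream, so series waits for each step to finish.
gulp.task('rebuild', gulp.series(cleanBuild, copySources, build_styles, build_scripts));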

Best way to read many files with nodejs?

I have a large glob of file paths. I'm getting this path list from a streaming glob module https://github.com/wearefractal/glob-stream
I was piping this stream to another stream that was creating fileReadStreams for each path and quickly hitting some limits. I was getting the:
warning: possible EventEmitter memory leak detected. 11 listeners added. Use emitter.setMaxListeners() to increase limit
and also Error: EMFILE, open
I've tried bumping the maxListeners, but I have ~9000 files that would each be creating a stream, and I'm concerned that will eat memory; that number is not constant and will grow. Am I safe to remove the limit here?
Should I be doing this synchronously? Or should I iterate over the paths and read the files sequentially? Won't a for loop still kick off all the reads at once?
The max listeners thing is purely a warning. setMaxListeners only controls when that message is printed to the console, nothing else. You can disable it or just ignore it.
The EMFILE error is your OS enforcing a limit on the number of open files (file descriptors) your process can have at a single time. You could avoid it by raising the limit with ulimit -n.
Saturating the disk by running many thousands of concurrent filesystem operations won't gain you any performance; in fact, it will hurt, especially on traditional non-SSD drives. So it is a good idea to only run a controlled number of operations at once.
I'd probably use an async queue, which allows you to push the name of every file to the queue in one loop, and then only runs n operations at once. When an operation finishes, the next one in the queue starts.
For example:
var async = require('async');
var fs = require('fs');

var q = async.queue(function (file, cb) {
    var stream = fs.createReadStream(file.path);
    // ...
    stream.on('end', function() {
        // finish up, then
        cb();
    });
}, 2);

// globStream is the stream from the glob-stream module in the question
globStream.on('data', function(file) {
    q.push(file);
});

globStream.on('end', function() {
    // We don't want to add the `drain` handler until *after* the globstream
    // finishes. Otherwise, we could end up in a situation where the globber
    // is still running but all pending file read operations have finished.
    q.drain = function() {
        // All done with everything.
    };

    // ...and if the queue is empty when the globber finishes, make sure the done
    // callback gets called.
    if (q.idle()) q.drain();
});
You may have to experiment a little to find the right concurrency number for your application.
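Another option worth knowing about (my addition, not part of the original answer) is graceful-fs, a drop-in replacement for the core fs module that transparently queues operations when the file-descriptor limit is reached instead of throwing EMFILE:
// graceful-fs mirrors the core fs API, queueing open() calls internally
// once the process would otherwise hit EMFILE
var fs = require('graceful-fs');

var stream = fs.createReadStream('source/md/releases/example.md'); // hypothetical path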
