Webpack plugin API - update content of an asset and rebuild hash - javascript

I am writing a plugin that needs to swap the contents of a certain JSON file after all of the modules have been bundled. I implemented it in two steps: a loader replaces the content with a placeholder, and the plugin replaces the placeholder.
The loader looks like this:
const loader = function (source) {
  this.clearDependencies();
  return JSON.stringify('REGENERATED_JSON');
};
The plugin looks roughly like this:
compilation.hooks.optimizeChunkAssets.tapAsync(PLUGIN_NAME, (chunks, callback) => {
  chunks.forEach((chunk) => {
    chunk.files.forEach((filePath) => {
      const asset = compilation.assets[filePath];
      const source = asset.source();
      replacements.forEach((id) => {
        const pattern = 'JSON.parse("\\"REGENERATED_JSON\\"")';
        const index = source.indexOf(pattern);
        if (index < 0) return;
        const content = JSON.stringify(json_content, null, 2);
        // ReplaceSource comes from webpack-sources; its end index is inclusive
        const updatedSource = new ReplaceSource(asset);
        updatedSource.replace(index, index + pattern.length - 1, content);
        compilation.assets[filePath] = updatedSource;
      });
    });
  });
  callback();
});
This code has several issues:
It's fragile because it's tied to the JSON.parse call; I wasn't able to trick webpack into treating the file as JavaScript after it was imported as JSON.
The content hash isn't rebuilt, and neither is the file size estimate, so the JSON might be very large and webpack wouldn't know.
Is there a way to solve these problems within webpack?

This comment helped me solve the issue: https://github.com/webpack/webpack/issues/8830#issuecomment-580095801
In short:
use compilation.hooks.finishModules.tap to get at the modules after every module has been parsed
inject a new loader there and then, or attach information to the module(s) in question (directly on the module object)
trigger a rebuild of those module(s) with compilation.rebuildModule(module, callback); use promisify from the built-in util package to convert it to a promise and handle multiple parallel rebuilds (see the sketch below)
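A minimal sketch of that approach (assuming webpack 4, where finishModules is an async hook; the plugin name, the .json predicate, and the regeneratedJson flag are placeholders for however your plugin identifies and annotates the target module):

const { promisify } = require('util');

class RegenerateJsonPlugin {
  apply(compiler) {
    compiler.hooks.compilation.tap('RegenerateJsonPlugin', (compilation) => {
      compilation.hooks.finishModules.tapPromise('RegenerateJsonPlugin', async (modules) => {
        const rebuild = promisify(compilation.rebuildModule.bind(compilation));
        // pick the module(s) whose content should be regenerated (placeholder predicate)
        const targets = modules.filter((m) => m.resource && m.resource.endsWith('.json'));
        // annotate each module, then rebuild it so the content hash and
        // size are recalculated from the new source
        await Promise.all(targets.map((m) => {
          m.regeneratedJson = true; // hypothetical hand-off to the injected loader
          return rebuild(m);
        }));
      });
    });
  }
}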

Related

How to insert a javascript file in another javascript at build time using comments

Is there something on npm, or in VS Code, or anywhere, that can bundle or concatenate JavaScript files based on comments, like:
function myBigLibrary(){
  //#include util.js
  //#include init.js
  function privateFunc(){...}
  function publicFunc(){
    //#include somethingElse.js
    ...
  }
  return {
    init, publicFunc, etc
  }
}
Or something like that? I would think that such a thing is common when your JavaScript files get very large. All I can find are complicated things like webpack.
I'm looking for any equivalent solution that allows you to include arbitrary code at arbitrary positions in other code. I suppose that would cause trouble for IntelliSense, but an extension could handle that.
I'm not sure what you really mean, but if you mean referencing variables from other JavaScript files, this will probably help:
const { variableName } = require('the javascript file path')
Example, in index.js:
const { blue } = require('./../js/blue.js')
console.log(blue)
Meanwhile in blue.js:
const blue = "dumbass"
module.exports = { blue }
If this doesn't help you, just ignore this.
So here is a bare-bones way to do what I wanted. I have been learning more about what you can do with esbuild and other bundlers, but I didn't quite find something that fit my needs, and this is simpler and more flexible. It can work for any file type. You can rebuild automatically when files change by running this code with nodemon instead of node.
const fs = require('fs')
/////////////////////////
const input = 'example.js'
const output = 'output.js'
const ext = '.js'
// looks for file with optional directory as: //== dir/file
const regex = /\/\/== *([\w-\/]+)/g
const fileContent = fs.readFileSync(input).toString()
// replace the comment references with the corresponding file content
const replacement = fileContent.replace(regex, (match, group) => {
  const comment = '//////// ' + group + ext + ' ////////\n\n'
  const replace = fs.readFileSync(group + ext).toString()
  return comment + replace
})
// write replacement to a file
fs.writeFileSync(output, replacement)
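For example (the file names here are hypothetical), given an example.js containing:

function myLib() {
  //== util
}

running the script (with node, or nodemon for automatic rebuilds) produces an output.js in which the //== util line is replaced by a //////// util.js //////// banner followed by the full contents of util.js.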

How to loop through Cheerio content inside an async function and populate an outside variable?

I need to create an API that web-scrapes GitHub repos, getting the following data:
File name;
File extension;
File size (bytes, kbytes, mbytes, etc.);
Number of lines;
I'm using Node with TypeScript, so to get the most out of it I decided to create an interface called FileInterface that has the four attributes mentioned above.
And of course, the variable is an array of that interface:
let files: FileInterface[] = [];
Let's take my own repo to use as an example: https://github.com/raphaelalvarenga/git-hub-web-scraping
So far so good.
I'm already pointing at the HTML's files section with the request-promise dependency and storing it in a Cheerio variable so I can traverse the "tr" tags in a loop. As you might think, those "tr" tags represent the files/folders inside a "table" tag (if you inspect the page, it can easily be found). The loop will fill a temp variable called:
let tempFile: FileInterface;
And at the end of every cycle of the loop, the array will be populated:
files.push(tempFile);
On a GitHub repo's initial page we can find the file names and their extensions, but not the size and total number of lines. Those are only shown after clicking through to the file's own page. Let's say we clicked on README.md:
Ok, now we can see README.md has 2 lines and 91 Bytes.
My problem is that, since this will take a long time, it needs to be an async function, and I can't handle the loop over the Cheerio content inside the async function.
Things that I've tried:
Using the map and each methods to loop through it and push into the files array;
Using await before the loop. I knew this one wouldn't actually work, since it's just a loop that doesn't return anything;
The last thing I tried, and believed would work, was Promises. But TypeScript complains that the Promise resolves to the "unknown" type, and I'm not allowed to put the result into the files array, since the types "unknown" and "FileInterface[]" are not compatible.
Below I'll put the code I've created so far. I'll upload the repo in case you want to download and test it (the link is at the beginning of this post), but I need to warn you that this code is in the branch "repo-request-bad-loop", not in master. Don't forget, because the master branch doesn't have any of what I mentioned =)
I'm making a request in Insomnia to the route "/" and passing this object:
{
  "action": "getRepoData",
  "url": "https://github.com/raphaelalvarenga/git-hub-web-scraping"
}
index-controller.ts file:
As you can see, it calls getRowData, the problematic one. And here it is.
getRowData.ts file:
I will try to help you, although I don't know TypeScript. I redid the getRowData function a bit and now it works for me:
import cheerio from "cheerio";
import FileInterface from "../interfaces/file-interface";
import getFileRemainingData from "../routines/getFileRemaningData";

const getRowData = async (html: string): Promise<FileInterface[]> => {
  const $ = cheerio.load(html);
  const promises: any[] = $('.files .js-navigation-item').map(async (i: number, item: CheerioElement) => {
    const tempFile: FileInterface = { name: "", extension: "", size: "", totalLines: "" };
    const svgClasses = $(item).find(".icon > svg").attr("class");
    const isFile = svgClasses?.split(" ")[1] === "octicon-file";
    if (isFile) {
      // Get the file name
      const content: Cheerio = $(item).find("td.content a");
      tempFile.name = content.text();
      // Get the extension. In case the name is such as ".gitignore", the whole name will be considered
      const [filename, extension] = tempFile.name.split(".");
      tempFile.extension = filename === "" ? tempFile.name : extension;
      // Get the total lines and the size. A new request to the file screen will be needed
      const relativeLink = content.attr("href");
      const FILEURL = `https://github.com${relativeLink}`;
      const fileRemainingData: { totalLines: string, size: string } = await getFileRemainingData(FILEURL, tempFile);
      tempFile.totalLines = fileRemainingData.totalLines;
      tempFile.size = fileRemainingData.size;
    } else {
      // is not a file
    }
    return tempFile;
  }).get();
  const files: FileInterface[] = await Promise.all(promises);
  return files;
};

export default getRowData;
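The getFileRemainingData routine isn't shown in the question, so here is a minimal sketch of what it might look like (in plain JavaScript for brevity). The .file-info selector and the "2 lines (2 sloc) 91 Bytes" header format are assumptions about GitHub's markup at the time and may need adjusting:

const rp = require("request-promise");
const cheerio = require("cheerio");

const getFileRemainingData = async (fileUrl, tempFile) => {
  // tempFile is accepted to match the call site above but isn't needed here
  const html = await rp(fileUrl);
  const $ = cheerio.load(html);
  // e.g. "2 lines (2 sloc) 91 Bytes" -- the selector is an assumption
  const header = $(".file-info").first().text();
  const linesMatch = header.match(/(\d+) lines?/);
  const sizeMatch = header.match(/([\d.]+ (?:Bytes|KB|MB|GB))/);
  return {
    totalLines: linesMatch ? linesMatch[1] : "",
    size: sizeMatch ? sizeMatch[1] : ""
  };
};

module.exports = getFileRemainingData;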

Webpack: How to convert variables on build

About using Vue (vue-loader) + Webpack and Chromatism
Example (in dev / source):
let textColor = chromatism.contrastRatio('#ffea00').cssrgb // => rgb(0,0,0)
Is it possible to tell Webpack to convert this to rgb(0,0,0) in the build version?
So in the build version it should be converted to something like this (for performance):
let textColor = 'rgb(0,0,0)'
As previous answers and comments have already mentioned, there are no readily available AOT compilers to handle this kind of situation (this is a very specific case that no general-purpose tool could be expected to handle), but nothing is stopping you from rolling your own loader/plugin to handle this task!
You can use a custom Webpack Loader and Node's VM Module to execute the code at buildtime, get its output and replace it with the function call in your source file.
A sample implementation of this idea can look like the following snippet:
// file: chromatismOptimizer.js
// node's vm module https://nodejs.org/api/vm.html
const vm = require('vm')
const chromatism = require('chromatism')

// a basic and largely incomplete regex to extract lines where chromatism is called
let regex = /^(.*)(chromatism\S*)(.*)$/

// create a sandbox
// https://nodejs.org/api/vm.html#vm_what_does_it_mean_to_contextify_an_object
// this is roughly equivalent to the global context the script will execute in
let sandbox = {
  chromatism: chromatism
}

// now create an execution context for the script
let context = vm.createContext(sandbox)

// export a webpack sync loader function
module.exports = function chromatismOptimizer(source) {
  let compiled = source.split('\n').reduce((agg, line) => {
    let parsed = line.replace(regex, (ig, x, source, z) => {
      // parse and execute the statement inside the context;
      // the return value is the result of execution
      // https://nodejs.org/api/vm.html#vm_script_runincontext_contextifiedsandbox_options
      let res = new vm.Script(source).runInContext(context)
      return `${x}'${res}'${z ? z : ''}`
    })
    agg.push(parsed)
    return agg
  }, []).join('\n')
  return compiled
}
Then in your production.webpack.js (or some file like that), taken from this issue:

// Webpack config
resolveLoader: {
  alias: {
    'chromatism-optimizer': path.join(__dirname, './scripts/chromatism-optimizer'),
  },
}

// In module.rules
{
  test: /\.js$/,
  use: ['chromatism-optimizer'],
}
NOTE: This is just a reference implementation and is largely incomplete. The regex used here is quite basic and may not cover many other use cases, so make sure you update it. All of this could also be implemented as a webpack plugin (it's just that I don't have sufficient knowledge of how to create one). For a quick start, refer to this wiki to learn how to create a custom plugin.
The basic idea is simple.
First create a sandboxed environment:
let sandbox = { chromatism: chromatism, ... }
Then create an execution context:
let context = vm.createContext(sandbox)
Then, for each valid source statement, execute it in the context and get the result:
let result = new vm.Script(source).runInContext(context)
Then replace the original source statement with the result.
Try using callback-loader to process all your JavaScript. Define a callback function for this specific case, or even something more general like:
evalDuringCompile: function(code) { return JSON.stringify(eval(code)); }
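A rough sketch of how that might be wired up; the configuration shape is an assumption based on callback-loader's documented usage, so check its README before relying on it:

// webpack.config.js (sketch; option names are assumptions)
module.exports = {
  module: {
    rules: [{ test: /\.js$/, use: 'callback-loader' }]
  },
  callbackLoader: {
    evalDuringCompile: function (code) {
      // runs at build time in Node; the stringified result replaces the call site
      return JSON.stringify(eval(code));
    }
  }
};

// in your source
let textColor = evalDuringCompile("require('chromatism').contrastRatio('#ffea00').cssrgb");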
State-of-the-art optimizers cannot handle arbitrary expressions. In this case, the most reliable solution is to hard-code the constant in your code, like this:
let textColor = process.env.NODE_ENV === 'production' ? 'rgb(0,0,0)' : chromatism.contrastRatio('#ffea00').cssrgb;
Some futuristic optimizers do handle situations like this: prepack will evaluate code at compile time and output the computed values. However, it is not generally considered production-ready.

Trading RAM for CPU (performance issue)

I'm working with a program that deals with files; I can do several things like rename them, read their contents, etc.
Today I'm initializing it as follows:
return new Promise((resolve, reject) => {
  glob("path/for/files/**/*", {
    nodir: true
  }, (error, files) => {
    files = files.map((file) => {
      // properties like full name, basename, extension, etc.
    });
    resolve(files);
  });
});
So I read the contents of a specific directory, return all the files in an array, and then use Array.map to iterate over the array and turn the paths into objects with properties.
Sometimes I work with 200,000 text files, so this is becoming a problem because it consumes too much RAM.
So I want to replace this with a constructor function that does lazy loading, but I've never done that before, so I'm looking for a helping hand.
That's my code:
class File {
  constructor(path) {
    this.path = path;
  }
  extension() {
    return path.extname(this.path);
  }
  // etc
}
So my main question is: should I only return the evaluated property, or should I replace it? Like this:
extension() {
  this.extension = path.extname(this.path);
}
I understand this is a trade-off: I'm going to trade memory for CPU usage.
Thank you.
If you want to reduce RAM usage, I suggest you store an extra metadata file for each path, as follows:
Keep the paths in memory, or only some of them, as necessary.
Save the file properties to the hard drive:
files.forEach((file) => {
  // collect the properties you want for the file
  // ...
  var json = { path: file, extension: extension, ... };
  // mark the metadata file so you can access it later, for example: put it in the same path with a suffix
  var metaFile = file + '_meta.json';
  fs.writeFile(metaFile, JSON.stringify(json), (err) => {
    if (err) throw err;
  });
});
Now all the metadata is on the hard drive. This way, I believe, you trade memory for disk space and CPU calls.
If you wish to get the properties of a file, just read and JSON.parse its corresponding metadata file.
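A minimal sketch of reading the properties back (readMeta is a hypothetical helper; it assumes the _meta.json suffix used above):

const fs = require('fs');

function readMeta(file, callback) {
  // read the sidecar metadata file written above and parse it into an object
  fs.readFile(file + '_meta.json', 'utf8', (err, data) => {
    if (err) return callback(err);
    callback(null, JSON.parse(data));
  });
}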
There's no reason to trade CPU for space. Just walk the tree and process files as they're found. The space needed for walking the tree is proportional to its depth if it's done depth-first, which almost certainly has the same overhead as just creating the list of paths in your existing code.
For directory walking, the node.js FAQ recommends node-findit. The documentation there is pretty clear. Your code will look something like:
var finder = require('findit')(root_directory);
var path = require('path');
var basenames = [];
finder.on('file', function (file, stat) {
  basenames.push(path.basename(file));
  // etc
});
Or you can wrap the captured values in an object if you like.
If you store only the path property, the Node.js class instances in your example take about 200k * (path.length * 2 + 6) bytes of memory.
If you want lazy loading for basenames, extensions, etc., use lazy getters:
const path = require('path');

class File {
  constructor(path) {
    this.path = path;
    this._basename = null;
    this._extname = null;
  }
  get extname() {
    return this._extname || (this._extname = path.extname(this.path));
  }
  get basename() {
    return this._basename || (this._basename = path.basename(this.path));
  }
}
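Usage is then as cheap as property access; each value is computed on first access and cached (the file path here is just an example):

const file = new File('/some/dir/report.txt');
console.log(file.extname);  // computes and caches '.txt'
console.log(file.basename); // computes and caches 'report.txt'
console.log(file.extname);  // second access returns the cached value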

NodeJS & Gulp Streams & Vinyl File Objects - Gulp Wrapper for NPM package producing incorrect output

Goal
I am currently trying to write a Gulp wrapper for NPM Flat that can be easily used in Gulp tasks. I feel this would be useful to the Node community and would also accomplish my goal. The repository is here for everyone to view, contribute to, play with and pull-request. I am attempting to make flattened (dot-notation) copies of multiple JSON files, then copy them to the same folder with the file extension changed from *.json to *.flat.json.
My problem
The results I am getting back in my JSON files look like vinyl-files or byte code. For example, I expect output like
"views.login.usernamepassword.login.text": "Login", but I am getting something like {"0":123,"1":13,"2":10,"3":9,"4":34,"5":100,"6":105 ...etc
My approach
I am brand new to developing Gulp tasks and node modules, so definitely keep your eyes out for fundamentally wrong things.
The repository will have the most up-to-date code, but I'll also try to keep the question up to date with it.
Gulp-Task File
var gulp = require('gulp'),
    plugins = require('gulp-load-plugins')({ camelize: true });
var gulpFlat = require('gulp-flat');
var gulpRename = require('gulp-rename');
var flatten = require('flat');

gulp.task('language:file:flatten', function () {
  return gulp.src(gulp.files.lang_file_src)
    .pipe(gulpFlat())
    .pipe(gulpRename(function (path) {
      path.extname = '.flat.json';
    }))
    .pipe(gulp.dest("App/Languages"));
});
Node module's index.js (A.k.a what I hope becomes gulp-flat)
var through = require('through2');
var gutil = require('gulp-util');
var flatten = require('flat');
var PluginError = gutil.PluginError;

// consts
const PLUGIN_NAME = 'gulp-flat';

// plugin level function (dealing with files)
function flattenGulp() {
  // creating a stream through which each file will pass
  var stream = through.obj(function (file, enc, cb) {
    if (file.isBuffer()) {
      //FIXME: I believe this is the problem line!!
      var flatJSON = new Buffer(JSON.stringify(
        flatten(file.contents)));
      file.contents = flatJSON;
    }
    if (file.isStream()) {
      this.emit('error', new PluginError(PLUGIN_NAME, 'Streams not supported! NYI'));
      return cb();
    }
    // make sure the file goes through the next gulp plugin
    this.push(file);
    // tell the stream engine that we are done with this file
    cb();
  });
  // returning the file stream
  return stream;
}

// exporting the plugin main function
module.exports = flattenGulp;
Resources
https://github.com/gulpjs/gulp/blob/master/docs/writing-a-plugin/README.md
https://github.com/gulpjs/gulp/blob/master/docs/writing-a-plugin/using-buffers.md
https://github.com/substack/stream-handbook
You are right about where the error is. The fix is simple: you just need to parse file.contents, since the flatten function operates on an object, not on a Buffer.
...
var flatJSON = new Buffer(JSON.stringify(
  flatten(JSON.parse(file.contents))));
file.contents = flatJSON;
...
That should fix your problem.
And since you are new to the Gulp plugin thing, I hope you don't mind a suggestion: consider giving your users the option to prettify the JSON output. To do so, just have your main function accept an options object, and then you can do something like this:
...
var flatJson = flatten(JSON.parse(file.contents));
var jsonString = JSON.stringify(flatJson, null, options.pretty ? 2 : null);
file.contents = new Buffer(jsonString);
...
You might find that the options object comes in useful for other things if you plan to expand your plugin in future.
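For instance, the main function could default the option, and the task in the question could then opt in (the pretty flag is the hypothetical option suggested above):

function flattenGulp(options) {
  // default pretty to false so existing callers are unaffected
  options = Object.assign({ pretty: false }, options);
  ...
}

// in the gulpfile
.pipe(gulpFlat({ pretty: true }))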
Feel free to have a look at the repository for a plugin I wrote called gulp-transform. I am happy to answer any questions about it. (For example, I could give you some guidance on implementing the streaming-mode version of your plugin if you would like).
Update
I decided to take you up on your invitation for contributions. You can view my fork here and the issue I opened up here. You're welcome to use as much or as little as you like, and in case you really like it, I can always submit a pull request. Hopefully it gives you some ideas at least.
Thank you for getting this project going.
