Webpack: Custom loader - reuse file extension - javascript

I'm working on writing a loader for filename.xyz.json files.
Since Webpack version 2, Webpack supports loading JSON files out of the box.
So I've managed to get my loader to work when using a completely custom file extension like .xyz.jayson.
But because I'm using .json, the other, already existing loader gets triggered after my loader has done its magic, which causes an error because at that point it's not JSON anymore. How can I prevent that?
If I understand the Webpack docs correctly, the !! prefix with inline usage would do just that. But I would like to disable post/pre loaders in the config. Is this possible?
Also, I was thinking of actually using that given JSON loader instead of dodging it, because why parse the JSON myself when there is already a loader for it? But I'm not quite sure if that is possible, since the source returned from the JSON loader is already wrapped in module.exports. Would I need to strip the module.exports and then run JSON.parse on it to work with it as an actual JS object instead of a string?
So as a quick summary:
I'd like to either not trigger the JSON loader at all and parse the JSON myself to manipulate it, or use the built-in JSON loader first and then manipulate the JSON data myself.

I found the solution:
Setting the type of my rule to javascript/auto gave me the expected result.
More information here
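
For reference, a minimal sketch of that fix ("xyz-loader" stands in for the custom loader from the question):

// webpack.config.js -- type: 'javascript/auto' opts this rule out of
// webpack's built-in JSON handling, so only the custom loader runs
module.exports = {
  module: {
    rules: [
      {
        test: /\.xyz\.json$/,
        type: 'javascript/auto',
        use: ['xyz-loader'],
      },
    ],
  },
};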

Related

Node not reading full contents of file

I'm trying to get it to read & parse a JSON file so I can update it, but it's not reading the full file: it stops partway through and doesn't read any more of it. It's a massive JSON file because I can't really store it as anything else, besides multiple JSON files.
The code of CacheManager is here
The size of what it read is 143,360 bytes, but the actual size of the file is 153,840 bytes. I've never really run into this issue before, so I have no clue how to remedy it. I'm using fs-extra in the code, but I've verified that the same issue happens with the built-in fs module. I've printed out the content of what it got, so I can see that it is reading the file, and it is reading the right content; it's just not getting all of it. I'll link the right content and what it's getting. It's cut off at the end; you can see the part of the JSON for the md5. The code writing it to the file is just writing the raw content of the read file here (look at the part below the first screenshot to see the regular code).
If the issue is caused by the size of the file, you may look in some streaming parsing alternatives to standard JSON parse (like https://www.npmjs.com/package/stream-json).
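A minimal sketch of that suggestion, assuming the top level of the file is a JSON object; the file name is a placeholder, and stream-json pipelines are usually assembled with its companion stream-chain package:

const fs = require('fs');
const { chain } = require('stream-chain');
const { parser } = require('stream-json');
const { streamObject } = require('stream-json/streamers/StreamObject');

const pipeline = chain([
  fs.createReadStream('cache.json'), // hypothetical file name
  parser(),
  streamObject(), // emits one { key, value } pair per top-level property
]);

pipeline.on('data', ({ key, value }) => {
  // handle each entry without holding the whole file in memory
  console.log(key, value);
});
pipeline.on('error', (err) => console.error('parse failed:', err));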
Note: I'll check and let you know
Edit for the reader: so far it seems to be some kind of race condition plus OS caching; discussion in the comments.

Telling if a requested file is Javascript

I have a program that logs every GET/POST request made by a website during the page load process. I want to go through these requests one by one, execute them, and then determine whether the file that was returned is JavaScript. Given that it won't have a .js ending (because of scripts like this, yanked from google.com a minute ago), how can I parse the file gotten from the request and identify whether it is a JavaScript file?
Thanks!
EDIT:
It is better to get a false positive than a false negative. That is, I would rather have some non-JS included in the JS-list than cut some real JS from the list.
The JavaScript link that you referred to does not have a content type, nor does it have the .js extension.
Any text file can be considered javascript if it can get executed which can make detection from scratch very difficult. There are two methods that come to mind.
Run a linter on the file contents. If the error is a syntax error or a parsing error, it is not JavaScript. If there are no syntax or parsing errors, it should be considered JavaScript.
Parse the AST (abstract syntax tree) for the file contents. A JavaScript file would parse without errors. There are a number of AST libraries available. I haven't worked with JS ASTs, so I can't recommend any one of them, but a quick search should give you some options.
I am not sure, but a linter probably also builds an AST before doing its syntax checks. In that case, running a parser directly seems like the lighter option (see the sketch below).
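As a sketch of that parser approach, using acorn as one arbitrary choice (any ES parser would do):

const acorn = require('acorn');

// returns true if the contents parse as JavaScript -- note this will
// also accept plain JSON and other text that happens to be valid JS,
// which fits the "false positives are OK" requirement above
function looksLikeJavascript(source) {
  try {
    acorn.parse(source, { ecmaVersion: 'latest' });
    return true;
  } catch (err) {
    return false; // SyntaxError: not parseable as JS
  }
}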
The easiest way would be to check whether anything identifies JavaScript files by their URI, because the alternatives are a lot heavier. But since you said this isn't an option, you can always check the syntax of the contents of each file using some heuristic tool. You can also check the response headers for the content-type.

Execute NormalModule at build time, after it is built by loaders, then save to json file

I am coding a plugin that, for specific modules, will try to execute the module generated at build time in order to save the result to a json file.
For that, I am tapping into compilation.hooks.succeedModule, which receives a NormalModule object already built. Then I am trying to eval the source replacing webpack variables like __webpack_public_path__.
While it kind of works, this approach feels terribly wrong. Like I am missing something.
Is there a nice way to execute modules at build time from a NormalModule object with basic access to vars like __webpack_public_path__? Maybe Webpack offers a better way to do this kind of thing?
Ok, yeah, sounds like you can solve this another way, I've done similar stuff where I needed to change what a module output, write stuff to disk, trigger side effects, etc. It sounds like you want loaders rather than a plugin. The run-loader (https://www.npmjs.com/package/webpack-run-loader) executes the module it loads and exports or returns the result.
You can write a custom loader that you chain to run after responsive-loader, and run-loader, and which receives the JSON from run-loader and writes it to disk where you want it (as a side effect), and then returns an empty string so that nothing is added to the build. The end result would be that requiring this module in your app gets your image files created (by responsive-loader), and the JSON written out to disk where you need it (by your custom loader). Alternately you could skip run-loader and in your custom loader use regex to just grab the JSON from the output of responsive-loader. Using regex on code generated by a project dependency seems fragile, but as long as you have your dependency versions locked down it can work just fine in practice, and it's a bit simpler conceptually than adding run-loader to the pipeline.
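A rough sketch of what that rule might look like (the test pattern is a placeholder; remember webpack applies loaders bottom-to-top):

// webpack.config.js
const path = require('path');

module.exports = {
  module: {
    rules: [
      {
        test: /\.(png|jpe?g)$/,
        use: [
          path.resolve('./img-info-logging-loader.js'), // runs last
          'webpack-run-loader',
          'responsive-loader', // runs first
        ],
      },
    ],
  },
};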
If you're writing webpack plugins I imagine you're comfortable writing loaders as well, but if not they're pretty straightforward -- just a function that accepts source code from the loader that came before it and returns code, and does whatever you want in between. The docs aren't bad for the API, but looking at the source of a few published loaders is helpful, too. It might look roughly (just spitballing from memory) like:
// img-info-logging-loader.js
// regex version, expects the source arg to be the output of responsive-loader
import * as fs from 'fs';

// a webpack loader needs to be the module's default export
export default function imgInfoLoggingLoader(source) {
  const jsonFinderRegex = /someregexto(match)onsource/;
  const matchArr = jsonFinderRegex.exec(source);
  // exec returns null when there is no match, so guard both cases
  if (!matchArr || !matchArr[1]) {
    throw new ReferenceError('json output not found in loader source.');
  }
  const imgConfigJsonString = matchArr[1];
  // you would write a fn to generate a filename based on the
  // source, or based on the module's filename, which is available
  // via the webpack loader api (this.resourcePath)
  const fileNameToWrite = getFileNameToWrite();
  try {
    // async might be preferable depending on your webpack
    // performance needs
    fs.writeFileSync(fileNameToWrite, imgConfigJsonString);
  } catch (err) {
    throw new Error(`error writing ${fileNameToWrite}`);
  }
  // what the loader inserts into your JS asset: empty string
  return '';
}
EDIT:
Since per your comment you are looking to output a single JSON object with all of the image info in it, you would want a slightly different approach that does use a plugin (this is the most elegant way I know to do it, there may be others). As far as I know a plugin is the only way to 'do something' when webpack is done loading modules.
You still want a custom loader that extracts the JSON from the output of responsive-loader, as described above. It won't write each one to disk, though. Instead your loader will call a method on the following module:
You also write a json-collector.js that is just a little node module that you will use to hold on to the JSON object you're building. This bit is awkward because it's separate from the loader but the loader needs it. This collector module is simple, though, and if you wanted to be cleaner you could turn it into a more generic module and treat it as a proper, separate node dependency. All it is is an object with a method for adding JSON data, which appends it to an internal JSON object, and one for reading out the collected data, which returns the JSON.
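A minimal sketch of such a collector (all names here are hypothetical):

// json-collector.js -- module-level state; Node's module cache means
// the loader and the plugin share the same instance
const collected = {};

module.exports = {
  // called from the custom loader once per image module
  add(key, data) {
    collected[key] = data;
  },
  // called from the plugin once the build is sealed
  read() {
    return collected;
  },
};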
And then you have a plugin that hooks into the end of the build (I think there's one for 'build sealed' that I've used). When that hook is reached, you know webpack has no more modules to load, so the plugin calls the 'read' method on the json-collector, gets the JSON object from it, and writes that to disk.
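A sketch of that plugin, assuming the webpack 4+ hook API and using the emit hook (which fires after all modules are built, just before assets are written; the output path is a placeholder):

// write-img-json-plugin.js
const fs = require('fs');
const collector = require('./json-collector'); // the module sketched above

class WriteImgJsonPlugin {
  apply(compiler) {
    compiler.hooks.emit.tap('WriteImgJsonPlugin', () => {
      fs.writeFileSync('img-info.json', JSON.stringify(collector.read(), null, 2));
    });
  }
}

module.exports = WriteImgJsonPlugin;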
This solution doesn't fit the standalone plugin/standalone loader convention in webpack but if that doesn't bother you it's actually pretty straightforward, each of the three pieces has a simple job to do. I've used this pattern multiple times and it's worked for me.

Understanding the Communication between Modules in jQuery Source Code Structure [duplicate]

Uncompressed jQuery file: http://code.jquery.com/jquery-2.0.3.js
jQuery Source code: https://github.com/jquery/jquery/blob/master/src/core.js
What are they doing to make it seem like the final output is not using Require.js under the hood? Require.js examples tell you to insert the entire library into your code to make it work standalone as a single file.
Almond.js, a smaller version of Require.js, also tells you to insert itself into your code to have a standalone JavaScript file.
When minified, I don't care about the extra bloat; it's only a few extra kilobytes (for almond.js). But unminified it's barely readable: I have to scroll all the way down, past the almond.js code, to see my application logic.
Question
How can I make my code to be similar to jQuery, in which the final output does not look like a Frankenweenie?
Short answer:
You have to create your own custom build procedure.
Long answer
jQuery's build procedure works only because jQuery defines its modules according to a pattern that allows a convert function to transform the source into a distributed file that does not use define. If anyone wants to replicate what jQuery does, there's no shortcut: 1) the modules have to be designed according to a pattern which will allow stripping out the define calls, and 2) you have to have a custom conversion function. That's what jQuery does. The entire logic that combines the jQuery modules into one file is in build/tasks/build.js.
This file defines a custom configuration that it passes to r.js. The important options are:
out, which is set to "dist/jquery.js". This is the single file produced by the optimization.
wrap.startFile, which is set to "src/intro.js". This file will be prepended to dist/jquery.js.
wrap.endFile, which is set to "src/outro.js". This file will be appended to dist/jquery.js.
onBuildWrite, which is set to convert. This is a custom function.
The convert function is called every time r.js wants to output a module into the final output file. The output of that function is what r.js writes to the final file. It does the following:
If a module is from the var/ directory, the module will be transformed as follows. Let's take the case of src/var/toString.js:

define([
    "./class2type"
], function( class2type ) {
    return class2type.toString;
});
It will become:
var toString = class2type.toString;
Otherwise, the define(...) call is replaced with the contents of the callback passed to define, the final return statement is stripped, and any assignments to exports are stripped.
I've omitted details that do not specifically pertain to your question.
You can use a tool called AMDClean by gfranko https://www.npmjs.org/package/amdclean
It's much simpler than what jQuery is doing and you can set it up quickly.
All you need to do is create a very abstract module (the one that you want to expose to the global scope) and include all your sub-modules in it.
Another alternative that I've recently been using is Browserify. You can export/import your modules the Node.js way and use them in any browser. You need to compile them before using them. It also has gulp and grunt plugins for setting up a workflow. For better explanations, read the documentation on browserify.org.
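For example (file names made up), you write plain CommonJS modules and then bundle them:

// square.js
module.exports = function square(x) { return x * x; };

// main.js
var square = require('./square');
console.log(square(4)); // 16

// bundle for the browser with:
//   browserify main.js -o bundle.js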

Converting `grunt.file.readJSON` to gulp

I'm in the process of converting a Grunt file to a Gulp file. My Grunt file contains the following line
var config = grunt.file.readJSON('json/config.json');
What this line does is set some variables which are then injected into the HTML it generates, specifically related to languages.
I tried converting the file automatically with grunt2gulp.js but it always fails with config being undefined. How would I write grunt.file.readJSON using gulp?
The easiest way to load a JSON file in node/io.js is to use require directly:
var config = require('./json/config.json');
This can substitute any readJSON calls you have and also works generally. Node/io.js have the ability to synchronously require json files out of the box.
Since this is a .json file, Benjamin's answer works just fine (just require() it in).
If you have any configs that are valid JSON but not stored in files that end in a .json extension, you can use the jsonfile module to load them in, or use the slightly more verbose
JSON.parse(require('fs').readFileSync("...your file path here..."))
(if you have fs already loaded, this tends to be the path of least resistance)
The one big difference between require (which pretty much uses this code to load in JSON files) and this code is that require uses Node's caching mechanism, so multiple requires will only ever import the file once and then return a reference to the same parsed data, effectively making everything share the same data object. Sometimes that's great, sometimes it's absolutely disastrous, so keep that in mind.
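A quick demonstration of that sharing (config.json is a placeholder):

var a = require('./config.json');
var b = require('./config.json');
a.name = 'mutated';
console.log(b.name); // 'mutated' -- a and b point at the same object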
(If you absolutely need unique data but you like the convenience of require, you can always do a quick var data = require("..."); var copied = JSON.parse(JSON.stringify(data));)
