I've been working on a CommonJS implementation for ExtendScript Toolkit lately, and I am stuck on the "dependency cycle" requirement. My code passes most of the CommonJS compliance tests except these: cyclic, determinism, exactExports, monkeys.
The wiki states that:
If there is a dependency cycle, the foreign module may not have finished executing at the time it is required by one of its transitive dependencies; in this case, the object returned by "require" must contain at least the exports that the foreign module has prepared before the call to require that led to the current module's execution.
Can somebody please explain further how this part of the specification should be implemented? Should I throw an exception when a dependency cycle is detected?
You can check my code at: https://github.com/madevelopers/estk
Only tested on ExtendScript Toolkit CS6
In CommonJS, you attach the things you're exporting onto an exports object. The intention of the spec is this: if a require call partway through a file leads to a required file that in turn requires the original file, then that second require gets the exports object in whatever state it is in at that point. I'll provide an example.
// file A.js
exports.A1 = {};
exports.A2 = {};
// The object returned from this require call has B1, B2 and B3
var B = require('./B.js');
exports.A3 = B.B1;
And in file B.js:
// file B.js
exports.B1 = {};
exports.B2 = {};
// This require call causes a cycle. The object returned has only A1 and A2.
// A3 will be added to the exports object later, *after* this file is loaded.
var A = require('./A.js');
exports.B3 = A.A1;
This example works correctly even though there is a cycle in the code. Here's another example that works even though it is circular:
var OtherModule = require('./OtherModule.js');
// even if OtherModule requires this file and causes a circular dependency
// this will work, since doAThing is only defined and not called by requiring this
// this file. By the time doAThing is called, OtherModule will have finished
// loading.
exports.doAThing = function() {
return OtherModule.doSomething() + 3;
};
Even though OtherModule.doSomething doesn't exist yet when this file's code is executed and doAThing is defined, everything is fine as long as doAThing isn't called until later.
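To answer the implementation question directly: no, you don't throw on a cycle. The standard trick is to register the module's exports object in the loader's cache before executing the module body, so a re-entrant require of the same file simply returns the partially populated exports. A minimal sketch of such a loader (resolvePath and readFileSource are hypothetical helpers, not part of any API):

var cache = {}; // filename -> module record

function requireModule(id) {
    var filename = resolvePath(id); // hypothetical: id -> absolute path
    if (cache[filename]) {
        // Already loaded, or still loading (a cycle): hand back whatever
        // exports exist so far, exactly as the spec paragraph above requires.
        return cache[filename].exports;
    }
    var module = { exports: {} };
    cache[filename] = module; // register BEFORE running the body
    var source = readFileSource(filename); // hypothetical file reader
    var factory = new Function("require", "module", "exports", source);
    factory(requireModule, module, module.exports);
    return module.exports;
}

Always handing back that exact same exports object is also, as far as I can tell, what the exactExports test is checking for.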
I've been thinking about ES modules lately, and this is how I think they work:
There's a global object that can't be accessed by a user (let's call it #moduleMap). It maps modules' absolute URLs to their exports:
#moduleMap = {
"https://something.com/module.mjs": { exportName: "value" }
}
The procedure for evaluating a module is as follows:
module is fetched
module is parsed to ast
ast is modified as follows:
import { x } from "./module1.mjs" => all x references are replaced with #moduleMap["abs path to module1.mjs"].x (and the imported module is fetched)
export const y = "some value" => #moduleMap["abs path to this module"].y = "some value"
(as #Bergi pointed out, it's not that simple with exports: the temporal dead zone for consts and hoisting for functions are not reflected by plain property assignments)
(the above is what's called binding that produce 'live bindings')
the operation is repeated for each module imported
when all modules are fetched, parsed, and modified, evaluation starts from the entry module (each module is executed in isolation, roughly as if wrapped in an IIFE with strict mode on)
As #Bergi pointed out, modules are evaluated eagerly starting from the entry module, with a module's imports evaluated before the module's own code is executed (with an exception for circular dependencies), which in practice means the module requested last (deepest in the chain) is executed first.
when evaluation reaches any code that accesses #moduleMap["some module"], the browser checks if the module was evaluated
if it wasn't evaluated it is evaluated at this point after which the evaluation returns to this place (now the module (or its exports to be exact) is 'cached' in #moduleMap)
if it was evaluated the import is accessible from #moduleMap["some module"].someImport
if an import is reassigned from another module, the browser throws an error
That's basically all that's happening, AFAIK. Am I correct?
You have a pretty good understanding, but there are a few inaccuracies that should be corrected.
ES Modules
In ECMA-262, all modules will have this general shape:
Abstract Module {
Environment // Holds lexical environment for variables declared in module
Namespace // Exotic object that reaches into Environment to access exported values
Instantiate()
Evaluate()
}
There are a lot of different places that modules can come from, so there are "subclasses" of this Abstract Module. The one we are talking about here is called a Source Text Module.
Source Text Module : Abstract Module {
ECMAScriptCode // Concrete syntax tree of actual source text
RequestedModules // List of imports parsed from source text
LocalExportEntries // List of exports parsed from source text
Evaluate() // interprets ECMAScriptCode
}
When a variable is declared in a module (const a = 5) it is stored in the module's Environment. If you add an export declaration to that, it will also show up in LocalExportEntries.
When you import a module, you are actually grabbing the Namespace object, which has exotic behaviour, meaning that while it appears to be a normal object, things like getting and setting properties might do different things than what you were expecting.
In the case of Module Namespace Objects, getting a property namespace.a, actually looks up that property as a name in the associated Environment.
So if I have two modules, A, and B:
// A
export const a = 5;
// B
import { a } from 'A';
console.log(a);
Module B imports A, and then in module B, a is bound to A.Namespace.a. So whenever a is accessed in module B, it is actually looked up on A.Namespace, which looks it up in A.Environment. (This is how live bindings actually work.)
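A quick way to see those live bindings in action (the file names are invented for the example):

// counter.mjs
export let count = 0;
export function increment() {
  count += 1; // reassigns the binding in counter.mjs's Environment
}

// main.mjs
import { count, increment } from './counter.mjs';
console.log(count); // 0
increment();
console.log(count); // 1 — the read goes through the Namespace into the Environment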
Finally onto the subject of your module map. All modules will be Instantiated before they can be Evaluated. Instantiation is the process of resolving the module graph and preparing the module for Evaluation.
Module Map
The idea of a "module map" is implementation specific, but for the purposes of browsers and node, it looks like this: Module Map <URL, Abstract Module>.
A good way to show how browsers/node use this module map is dynamic import():
// Pseudocode: GetActiveScriptOrModule and FetchModuleSomehow stand in for
// host internals, and `import` is not actually a callable function name.
async function import(specifier) {
  const referrer = GetActiveScriptOrModule().specifier;
  const url = new URL(specifier, referrer);
  if (moduleMap.has(url)) {
    // Already in the map: reuse the cached module's Namespace.
    return moduleMap.get(url).Namespace;
  }
  const module = await FetchModuleSomehow(url);
  moduleMap.set(url, module);
  return module.Namespace;
}
You can actually see this exact behaviour in Node.js: https://github.com/nodejs/node/blob/e24fc95398e89759b132d07352b55941e7fb8474/lib/internal/modules/esm/loader.js#L98
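You can also observe the map's caching from user code: importing the same URL twice hands back the same Namespace object (./some-module.mjs is just a placeholder name):

// In a module context (top-level await), with a hypothetical './some-module.mjs':
const [ns1, ns2] = await Promise.all([
  import('./some-module.mjs'),
  import('./some-module.mjs'),
]);
console.log(ns1 === ns2); // true — both resolve through the same module map entry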
export const y = "some value" => #moduleMap["abs path to this module"].y = "some value"
(the above is what's called binding that produce 'live bindings')
Yeah, basically - you correctly understood that they all reference the same thing, and when the module reassigns it the importers will notice that.
However, it's a bit more complicated as const y stays a const variable declaration, so it still is subject to the temporal dead zone, and function declarations are still subject to hoisting. This isn't reflected well when you think of exports as properties of an object.
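For example, in a cycle a function export is usable immediately, while a const export throws if it is read before its declaration has run — something plain property assignments wouldn't capture (module names invented for illustration):

// a.mjs (the entry module)
import { b } from './b.mjs';
export const value = 42;
export function getValue() { return value; }

// b.mjs — circular: imports back into a.mjs, which hasn't run yet
import { getValue, value } from './a.mjs';
export const b = 'b';
console.log(getValue); // a function — declarations are hoisted and already initialized
console.log(value);    // ReferenceError — `value` is still in its temporal dead zone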
when evaluation reaches any code that accesses #moduleMap["some module"], the browser checks if the module was evaluated
if it wasn't evaluated it is evaluated at this point after which the evaluation returns to this place (now the module (or its exports to be exact) is 'cached' in #moduleMap)
if it was evaluated the import is accessible from #moduleMap["some module"].someImport
No. Module evaluation doesn't happen lazily, when the interpreter comes across the first reference to an imported value. Instead, modules are evaluated strictly in the order of the import statements (starting from the entry module). A module does not start to be evaluated before all of its dependencies have been evaluated (apart from when it has a circular dependency on itself).
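Concretely, with no cycles the logs come out depth-first in import-statement order (file names invented):

// entry.mjs
import './dep1.mjs';
import './dep2.mjs';
console.log('entry');

// dep1.mjs
console.log('dep1');

// dep2.mjs
console.log('dep2');

// Console output: dep1, dep2, entry — every dependency finishes before its importer runs.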
I'm trying to convert our requirejs calls to use SystemJS, but I'm not exactly sure what I'm doing wrong.
Our original calls look like this:
return function(callback) {
requirejs(["/app/shared.js"], function(result){
callbackFunction = callback;
callback(dashboard);
main();
})
}
And what I'm trying instead is:
return function(callback) {
console.log(callback.toString())
SystemJS.import('app/shared.js').then(function(result){
callbackFunction = callback;
callback(dashboard);
main();
});
}
I've had to remove some leading / characters to get things to load properly, which is fine, but I've now run into an issue where variables defined at the top of shared.js aren't visible in my local main.js file. In my browser console I get:
Potentially unhandled rejection [1] ReferenceError: dashboard is not defined
shared.js defines dashboard:
var dashboard = { rows: [] };
// Other definitions...
define(["/app/custom-config.js", /* etc */]);
I guess I have two questions:
is this the correct way to replace requirejs calls?
if so, why aren't my variables from shared.js accessible?
For a fuller picture, main() just sets up the dashboard object, and then calls callbackFunction(dashboard) on it.
Your problem can be reduced to the following case, where you have two AMD modules: one that leaks into the global space, and a second one that tries to use what the first one leaked. Like the two following modules.
src/a.js requires the module that leaks and depends on what that module leaks:
define(["./b"], function () {
console.log("a loaded");
callback();
});
src/b.js leaks into the global space:
// This leaks `callback` into the global space.
var callback = function () {
    console.log("callback called");
};

define(function () {
    console.log("b loaded");
});
With RequireJS, the code above will work. Oh, it is badly designed because b.js should not leak into the global space, but it will work. You'll see callback called on the console.
With SystemJS, the code above won't work. Why? RequireJS loads modules by adding a script element to the header and lets the script execute the module's code, so callback ends up in the global space in exactly the same way it would if you had written your own script element with an src attribute pointing to your script. (You'd get a "Mismatched anonymous define" error, but that's a separate issue that need not detain us here.) SystemJS, by default, uses eval rather than creating script elements, and this changes how the code is evaluated. Usually it does not matter, but sometimes it does. In the case at hand, callback does not end up in the global space, and module a fails.
Ultimately, your AMD modules should be written so that they don't use the global space to pass information from one another.
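For instance, b.js could return the callback as its module value and a.js could receive it through its dependency list; this should work under both loaders:

// src/b.js — export the callback instead of leaking it
define([], function () {
    console.log("b loaded");
    return function () {
        console.log("callback called");
    };
});

// src/a.js — receive it as a dependency argument
define(["./b"], function (callback) {
    console.log("a loaded");
    callback();
});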
However, there is another solution which may be useful as a stepping-stone towards a final solution. You can use scriptLoad: true to tell SystemJS to use script elements like RequireJS does. (See the documentation on meta for details and caveats.) Here is a configuration that does that:
System.config({
baseURL: "src",
meta: {
"*": {
scriptLoad: true, // This is what fixes the issue.
}
},
packages: {
// Yes, this empty package does something. It makes `.js` the
// default extension for modules.
"": {}
},
});
// We have to put `define` in the global space
// so that our modules can find it.
window.define = System.amdDefine;
If I run the example code I've given here without scriptLoad: true, then module a cannot call the callback. With scriptLoad: true, it can, and I get this on the console:
b loaded
a loaded
callback called
In a CommonJS implementation of a module through Node, I have this infantModule.js:
filename: infantModule.js
var infant = function(gender) {
this.gender = gender;
//technically, when passed through this line, I'm born!
};
var infantInstance = new infant('female');
module.exports = infantInstance;
My question is:
When is this module's constructor function really executed, considering another module consumes infantModule, such as:
filename: index.js -- entry point of the application
var infantPerson = require('./infantModule');
// is it "born" at this line? (1st time it is "required")
console.log(infantPerson);
// or is it "born" at this line? (1st time it is referenced)
Since my infantModule exposes a ready-made instantiated object, will all future requires of this module, by any other module besides the index.js entry point, reference this same object, which behaves like a shared instance in the application? Is it correct to put it that way?
If there's an additional line of code at the bottom of index.js, such as:
infantPerson.gender = 'male';
will any other module in my application that requires infantModule at a future point in time get the object with the gender property changed? Is that the correct assumption?
require returns a normal object. Nothing magical happens when you access that object.
Specifically, the first time you call require(), Node will execute the entire contents of the required file, and will then return the value of its module.exports property.
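A quick sketch that confirms both of your follow-up assumptions (counter.js is a made-up module):

// counter.js — the body runs only once per process
console.log('counter.js is being evaluated');
module.exports = { value: 0 };

// index.js
var a = require('./counter'); // logs "counter.js is being evaluated"
var b = require('./counter'); // logs nothing — returned from the cache
console.log(a === b);         // true — the very same object, a shared instance
a.value = 99;
console.log(require('./counter').value); // 99 — later requires see the mutation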
I have created a JavaScript library which can be used for logging purposes.
I also want to support the logging of requirejs.
Which functions/events of requirejs can I prototype/wrap so I can log when a module is initialized, and when it is done initializing and returns the initialized object?
For instance, if I call require(["obj1", "obj2", "obj3"], function (obj1, obj2, obj3) {})
I would like to know when requirejs begins initializing each of the objects, and I would like to know when each object is completely initialized.
I looked into the documentation/code, but could not find any useful functions I can access from the requirejs object or the require object.
Note: I do not want to change the existing code of requirejs I wish to append functionality from the outside by either prototyping or wrapping.
What I have tried (the problem is that this only captures the beginning and end of the entire batch of modules):
var oldrequire = require;
require = function (deps, callback, errback, optional) {
    console.log("start");
    var callbackWrapper = function () {
        console.log("end");
        var args = Array.prototype.slice.call(arguments);
        callback.apply(this, args);
    };
    oldrequire.call(this, deps, callbackWrapper, errback, optional);
};
This is a "better than nothing answer", not a definitive answer, but it might help you look in another direction. Not sure if that's good or bad, certainly it's brainstorming.
I've looked into this recently for a single particular module I had to wrap. I ended up writing a second module ("module-wrapper") for which I added a paths entry with the name of the original module ("module"). I then added a second entry ("module-actual") that references the actual module, which I require() as a dependency in the wrapper.
I can then add code before and after initialization, and finally return the actual module. This is transparent to user modules as well as the actual module, and very clean and straightforward from a design standpoint.
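A sketch of that arrangement, using the placeholder names from above ("module", "module-actual"); the real path is whatever your project uses:

// RequireJS configuration: requests for "module" are routed to the wrapper,
// while "module-actual" points at the real file.
require.config({
    paths: {
        "module": "module-wrapper",
        "module-actual": "path/to/the/real/module"
    }
});

// module-wrapper.js
define(["module-actual"], function (actual) {
    console.log("module finished initializing");
    // instrument or decorate `actual` here if needed
    return actual; // user modules importing "module" get the real thing
});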
However, it is obviously not practical to create a wrapper per module manually in your case, but you might be able to generate them dynamically with some trickery. Or somehow figure out what name was used to import the (unique) wrapper module from within it so that it can in turn dynamically import the associated actual module (with an async require, which wouldn't be transparent to user code).
Of course, it would be best if requirejs provided official hooks. I've never seen such hooks in the docs, but you might want to go through them again if you're not more certain than me.
We have two modules that get loaded (with 'define') by require.js:
ds.test.js
ds.js
As you might guess, the former tests the latter. The preamble to ds.test.js is as follows, with some console/logging I've added:
define(["ds", "test", "assert"], function (ds, test, assert) {
console.log(arguments);
// the rest is a 'pure' module --
// no executable code outside of a returned object/map of methods
The output from the console/logging being what I expect: [Object, Object, Object]
The preamble (with console/logging) of ds.js is as follows:
define(["ds"], function (ds) {
console.log(arguments);
// the rest is a 'pure' module
The output from the console/logging however is: [undefined]
Why would the former (ds.test.js) be able to successfully load ds, but ds.js itself cannot? This causes one of my tests to fail, as one of the methods returned by ds refers to a method within itself, i.e.: 'ds.assoc()'. Interestingly before require.js we used a home-rolled dependency manager, and the test did not fail on the same method -- ds.js was able to refer to itself.
Would this be an issue of a so-called "circular dependency"? In that ds.test.js relies on ds.js, and ds.js relies on itself. If so how might I resolve my issue?
For what it's worth, ds.test.js gets loaded first -- it gets picked up as a global var named "SUITE" by 'test.runner.js', the preamble of which is as follows:
define(["test", SUITE], function (test, suite) {
Then whatever test suite gets loaded (in this case, ds.test.js) in turn loads the module which it is testing (e.g. "ds").
Some final context: I've just inherited this code in the last few weeks, and what I'm doing is based on our existing conversion of another application from our home-rolled dependency manager to require.js. So I guess I'm asking that this be taken into consideration before any sniping with comments such as "why are you using a global variable"; if you've got suggestions for a concrete alternative, great, I look forward to them.
(Comment added as an answer by request.)
If ds refers to a method within itself, couldn't you call your example function assoc() directly, rather than trying to use ds.assoc()? (This also eliminates the perceived need for ds to load itself.)
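In other words, ds.js can reference its own functions locally and drop the self-dependency entirely; a sketch with a made-up assoc:

// ds.js — no dependency on "ds" itself
define([], function () {
    function assoc(key, value) {
        var pair = {};
        pair[key] = value;
        return pair;
    }
    return {
        assoc: assoc,
        // other methods call assoc() directly, not ds.assoc()
        singleton: function (key, value) {
            return [assoc(key, value)];
        }
    };
});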