I would like to add custom methods to the puppeteer.Page object, so I could invoke them like so:
let page = await browser.newPage();
page.myNewCustomMethod();
Here is one of many custom methods I have created. Given an array of XPath expressions, it returns the first element matched by any of them:
const findAnyByXPath = async function (page: puppeteer.Page, expressions: string[]) {
for (const exp of expressions) {
const elements = await page.$x(exp);
if (elements.length) {
return elements[0];
}
}
return null;
}
I have to invoke it like so...
let element = await findAnyByXPath(page, arrayOfExpressions);
To me, that looks odd in the editor, especially in a region where many custom methods are invoked; the call reads as a bit "out of context". So I would rather invoke it like this:
page.findAnyByXPath(arrayOfExpressions);
I'm aware that there is a page.exposeFunction method, but it is not what I'm looking for.
What is a way to achieve this?
Can you do this? Yes.
You can extend any object in JavaScript by modifying its prototype. In order to add a function to a Page object, you can access the prototype of a Page object by using the __proto__ property.
Here is a simple example adding the function customMethod to all Page objects:
const page = await browser.newPage();
page.__proto__.customMethod = async function () {
// ...
return 123;
}
console.log(await page.customMethod()); // 123
const anotherPage = await browser.newPage();
console.log(await anotherPage.customMethod()); // 123
Note that you need a Page object first in order to access the prototype, because the constructor function (or class) itself is not exposed.
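The same mechanism can be sketched without Puppeteer. In the snippet below, HiddenPage and newPage are hypothetical stand-ins for a class that a library keeps private and for browser.newPage(); Object.getPrototypeOf is the standard way to reach the prototype, while __proto__ is a legacy accessor for the same object.

```javascript
// Hypothetical stand-in for a class that a library does not export.
class HiddenPage {
  constructor(name) {
    this.name = name;
  }
}
// Hypothetical factory standing in for browser.newPage().
const newPage = (name) => new HiddenPage(name);

const first = newPage("first");
// Reach the shared prototype through an existing instance.
const proto = Object.getPrototypeOf(first);
proto.customMethod = function () {
  return `custom:${this.name}`;
};

// Instances created later see the method too, because they share the prototype.
const second = newPage("second");
console.log(first.customMethod());  // custom:first
console.log(second.customMethod()); // custom:second
```

Everything said in this answer about not modifying prototypes you do not own applies to this variant as well.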
Should you do this? No.
You have probably already noticed the red warnings in the MDN documentation for __proto__. Read them carefully. In general, it is not recommended to change the prototype of objects that you use but did not create yourself. Whoever created the prototype did not expect anyone to tinker with it. For further information, see this Stack Overflow question:
"Why is extending native objects a bad practice?"
How to do it instead?
Instead, you should just use your own functions. There is nothing wrong with having your own functions and calling them with page as an argument, like this:
// simple function
findAnyByXPath(page);
// your own "namespace" with more functionality
myLibrary.findAnyByXPath(page);
myLibrary.anotherCustomFunction(page);
Normally, you could also extend the class Page, but in this case the library does not export the class itself. Therefore, you can only create a wrapper class that executes the same functions internally and offers more functionality on top. But this would be a very sophisticated approach and is really not worth the effort in this case.
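If the main goal is call-site readability, a lightweight middle ground is a helper object bound to one page. This is only a sketch: createPageHelpers and fakePage are hypothetical names, and the only thing assumed about page is the $x(expression) method used in the question.

```javascript
// Hypothetical helper factory: returns functions that close over one page.
const createPageHelpers = (page) => ({
  async findAnyByXPath(expressions) {
    for (const expression of expressions) {
      const elements = await page.$x(expression);
      if (elements.length) {
        return elements[0];
      }
    }
    return null;
  },
});

// Minimal fake page, for demonstration only.
const fakePage = {
  async $x(expression) {
    return expression === "//b" ? ["<b> element"] : [];
  },
};

const helpers = createPageHelpers(fakePage);
helpers.findAnyByXPath(["//a", "//b"]).then((element) => console.log(element)); // <b> element
```

The call then reads helpers.findAnyByXPath(arrayOfExpressions), and the Page object itself stays untouched.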
To expand on @Thomas's answer, if you want to override an original method of Page:
const extendPage = (page: Page) => {
const { goto: originalGoto } = page;
page.goto = function goto(url, options) {
console.log("Goto:", url);
// do your things
return originalGoto.apply(page, arguments);
};
return page;
};
const page = extendPage(await browser.newPage());
await page.goto("https://google.com"); // Goto: https://google.com
To attach additional methods every time a new Page is created, you can listen to the targetcreated event from the Browser and extend the page in the callback:
const browser = await puppeteer.launch();
browser.on("targetcreated", async (target: Target) => {
if (target.type() === "page") {
const page = await target.page();
extendPage(page);
}
});
const page = await browser.newPage(); // extended page
If you want to add a new method and update the TypeScript definitions:
import { Page, PageEmittedEvents, Target } from "puppeteer";
async function htmlOnly(this: Page) {
await this.setRequestInterception(true); // enable request interception
this.on(PageEmittedEvents.Request, (req) => {
if (req.resourceType() === 'document') return req.continue();
return req.abort();
});
}
declare module "puppeteer" {
interface Page {
htmlOnly: () => Promise<void>;
}
}
export const extendPage = (page: Page) => {
page.htmlOnly = htmlOnly;
return page;
};
browser.on("targetcreated", async (target: Target) => {
if (target.type() === "page") {
const page = await target.page();
extendPage(page);
}
});
const page = await browser.newPage();
await page.htmlOnly();
The problem is rather simple. We need to imbue a function with a parameter, and then simply extract that parameter from the body of the function. I'll present the outline in typescript...
abstract class Puzzle {
abstract assign(param, fn): any;
abstract getAssignedValue(): any;
async test() {
const wrapped = this.assign(222, async () => {
return 555 + this.getAssignedValue();
});
console.log("Expecting", await wrapped(), "to be", 777);
}
}
Let's set the scene:
Assume strict mode, no arguments or callee. Should work reasonably well on the recent-ish version of v8.
The function passed to assign() must be an anonymous arrow function that doesn't take any parameters.
... and it's also async. The assigned value could just be stored somewhere for the duration of the invocation, but because the function is async and can contain awaits, you can't rely on the value surviving multiple interleaved invocations.
this.getAssignedValue() takes no parameters, returning whatever we assigned with the assign() method.
It would be great to find a more elegant solution than those I've presented below.
Edit
Okay, we seem to have found a good, solid solution inspired by zone.js. The same type of problem is solved there, and the solution is to override some system-level primitives, such as setTimeout and Promise. The only headache above was the async keyword, which means that the body of the function can effectively be reordered. Async functions are ultimately driven by promises, so you have to replace Promise with something that is context-aware. It's quite involved, and because my use case is outside of the browser or even Node, I won't bore you with the details. For most people hitting this kind of problem: just use zone.js.
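For readers who are on Node.js (which, as noted, is not the environment in this question), the built-in AsyncLocalStorage class from node:async_hooks provides the same kind of context tracking. This is a different technique from zone.js, shown only as a sketch; the class mirrors the Puzzle outline, and sleep, slow, and fast are hypothetical names used to force interleaving.

```javascript
const { AsyncLocalStorage } = require("node:async_hooks");

const storage = new AsyncLocalStorage();

class Puzzle {
  // Wrap fn so that each invocation runs inside its own async context.
  assign(param, fn) {
    return () => storage.run(param, fn);
  }
  // Read the value bound to the current async context (undefined outside of one).
  getAssignedValue() {
    return storage.getStore();
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const puzzle = new Puzzle();

// Two wrapped functions whose bodies interleave across awaits.
const slow = puzzle.assign(222, async () => {
  await sleep(20);
  return 555 + puzzle.getAssignedValue();
});
const fast = puzzle.assign(1, async () => {
  await sleep(5);
  return 555 + puzzle.getAssignedValue();
});

Promise.all([slow(), fast()]).then((results) => console.log(results)); // [ 777, 556 ]
```

Each invocation sees its own value even though fast finishes while slow is still waiting.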
Hacky Solution 2
class HackySolution2 extends Puzzle {
assign(param: any, fn: AnyFunction): AnyFunction {
const sub = Object(this);
sub["getAssignedValue"] = () => param;
return function () { return eval(fn.toString()); }.call(sub);
}
getAssignedValue() {
return undefined;
}
}
In this solution, I'm making an object that overrides the getAssignedValue() method, and re-evaluates the source code of the passed function, effectively changing the meaning of this. Still not quite production grade...
Edit.
Oops, this breaks closures.
I don't know typescript so possibly this isn't useful, but what about something like:
const build_assign_hooks = () => {
let assignment;
const get_value = () => assignment;
const assign = (param, fn) => {
assignment = param;
return fn;
}
return [assign, get_value];
};
class Puzzle {
constructor() {
const [assign, getAssignedValue] = build_assign_hooks();
this.assign = assign;
this.getAssignedValue = getAssignedValue;
}
async test() {
const wrapped = this.assign(222, async () => {
return 555 + this.getAssignedValue();
});
console.log("Expecting", await wrapped(), "to be", 777);
}
}
const puzzle = new Puzzle();
puzzle.test();
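One caveat with a single shared slot, which ties back to the interleaving concern in the question: every call to assign overwrites the previous value, so two wrapped functions created from the same hooks both see the latest one. A minimal sketch of that behavior, using camelCase copies of the names above:

```javascript
const buildAssignHooks = () => {
  let assignment; // one slot shared by every wrapped function
  const getValue = () => assignment;
  const assign = (param, fn) => {
    assignment = param; // overwritten on each call to assign
    return fn;
  };
  return [assign, getValue];
};

const [assign, getAssignedValue] = buildAssignHooks();
const first = assign(222, async () => 555 + getAssignedValue());
const second = assign(1, async () => 555 + getAssignedValue());

// Both functions read the value from the most recent assign call.
Promise.all([first(), second()]).then((results) => console.log(results)); // [ 556, 556 ]
```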
Hacky Solution 1
We actually have a working implementation. It's such a painful hack, but proves that this should be possible. Somehow. Maybe there's even a super simple solution that I'm missing just because I've been staring at this for too long.
class HackySolution extends Puzzle {
private readonly repo = {};
assign(param: any, fn) {
// code is a random field for repo. It must also be a valid JS fn name.
const code = 'd' + Math.floor(Math.random() * 1000001);
// Store the parameter with this code.
this.repo[code] = param;
// Create a function that has code as part of the name.
const name = `FN_TOKEN_${code}_END_TOKEN`;
const wrapper = new Function(`return function ${name}(){ return this(); }`)();
// Proceed with normal invocation, sending fn as the this argument.
return () => wrapper.call(fn);
}
getAssignedValue() {
// Comb through the stack trace for our FN_TOKEN / END_TOKEN pair, and extract the code.
const regex = /FN_TOKEN_(.*)_END_TOKEN/gm;
const code = regexGetFirstGroup(regex, new Error().stack);
return this.repo[code];
}
}
So the idea in our solution is to examine the stack trace of the new Error().stack, and wrap something we can extract as a token, which in turn we'll put into a repo. Hacky? Very hacky.
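The helper regexGetFirstGroup is not shown above. One plausible shape for it, purely as an assumption about what it does (return the first capture group of the first match, or undefined), is:

```javascript
// Hypothetical implementation of the helper used in getAssignedValue().
function regexGetFirstGroup(regex, text) {
  regex.lastIndex = 0; // a regex with the g flag remembers its position between exec calls
  const match = regex.exec(text);
  return match ? match[1] : undefined;
}

// A made-up stack line containing the token pair.
const sampleStack = "Error\n    at FN_TOKEN_d42_END_TOKEN (file.js:1:1)";
console.log(regexGetFirstGroup(/FN_TOKEN_(.*)_END_TOKEN/gm, sampleStack)); // d42
```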
Notes
Testing shows that this is actually quite workable, but requires a more modern execution environment than we have - i.e. ES2017+.
I have 3 classes, all extend the previous one.
Entity -> Body -> Player
Each one has a die() method that does something very different.
Entity.die() will call the db
Body.die() will animate the body
Player.die() will call the UI and play special sound.
I don't want to manually call Entity.die() inside Body.die method, mainly because I have many classes and many common methods and I don't want to forget something.
I wrote this little piece of code which does exactly this, the Error stack is easy to understand and points to the correct lines.
function overLoadMethods (parent, children) {
const methods = {}
for (let [fname, fn] of Object.entries(parent)) {
if (typeof fn === 'function') {
if (children[fname]) {
methods[fname] = function () {
fn()
children[fname]()
}
Object.defineProperty(methods[fname], 'name', { value: fname })
} else {
methods[fname] = fn
}
}
}
return methods
}
function createEntity () {
return {
die: () => {
console.log(new Error().stack)
console.log('entity die')
}
}
}
const bodyMethods = {
die: () => {
console.log(new Error().stack)
console.log('body die')
}
}
function createBody () {
const entity = createEntity()
const overLoadedMethods = overLoadMethods(entity, bodyMethods)
return {
...entity,
...bodyMethods,
...overLoadedMethods
}
}
const playerMethods = {
die: () => {
console.log(new Error().stack)
console.log('player die')
}
}
function createPlayer () {
const body = createBody()
const overLoadedMethods = overLoadMethods(body, playerMethods)
return {
...body,
...playerMethods,
...overLoadedMethods
}
}
const player = createPlayer()
// will call Entity.die() then Body.die() then Player.die()
player.die()
Everything works fine, but I have never seen this pattern before, and I guess there is a good reason for that which I'm unaware of.
Could someone point out the weaknesses of this pattern, if there are any (I'm pretty sure there are)?
Common Lisp has something similar. When you define a method in a derived class you can decide whether this method should be executed:
:before (i.e. the base method will be called automatically after the specialized one)
:after (i.e. the base method will be called automatically before the specialized one)
:around (i.e. only the specialized method will be called, but inside its body you can call the base method with call-next-method, a special syntax that calls the base method either with the parameters specified by the caller or with parameters that you pass instead).
C++, for example, only offers the around style for general methods (without the ability to call the base version with the original parameters), and instead forces before in constructors and after in destructors.
I understand the desire to avoid repeating code and to write code that makes it hard to make mistakes or forget things. But you still have code that you need to remember to wire up. For example, instead of calling Entity.die() you need to call overLoadMethods(). I'm not sure that's an improvement over regular classes and calling super.die().
You can get the chained method behavior using ES6 classes (you can also get it using prototypes). This has a lot of advantages:
• The pattern is baked into the language.
• It's very clear to see parent/child relationship
• There's a lot of commentary, theory, and examples of different patterns
class Entity {
die() {
// Entity-specific behavior
console.log('entity die')
}
}
class Body extends Entity {
die() {
super.die()
// Body-specific behavior
console.log('body die')
}
}
class Player extends Body {
die() {
super.die()
// Player-specific behavior
console.log('player die')
}
}
const player = new Player
// will call Entity.die() then Body.die() then Player.die()
player.die()
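One concrete weakness of the overLoadMethods wrapper from the question is that it calls fn() and children[fname]() bare, so arguments, this, and return values are all dropped. If you stay with the composition approach, a variant that forwards them could look like this sketch (chainMethods and the sample objects are hypothetical names):

```javascript
// Variant of the question's wrapper that forwards `this`, arguments, and the return value.
function chainMethods(parent, children) {
  const methods = {};
  for (const [name, parentFn] of Object.entries(parent)) {
    if (typeof parentFn !== "function") continue;
    const childFn = children[name];
    methods[name] =
      typeof childFn === "function"
        ? function (...args) {
            parentFn.apply(this, args); // parent runs first
            return childFn.apply(this, args); // the child's result is returned
          }
        : parentFn;
  }
  return methods;
}

// Method shorthand (not arrow functions), so `this` is the object the method is called on.
const entity = { die(reason) { this.log.push(`entity:${reason}`); } };
const body = { die(reason) { this.log.push(`body:${reason}`); return this.log.length; } };

const player = { log: [], ...chainMethods(entity, body) };
console.log(player.die("fall"), player.log); // 2 [ 'entity:fall', 'body:fall' ]
```

With classes and super.die(...args), all of this forwarding comes for free.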
What I was trying to accomplish: I wanted to share a single canvas (because what I'm doing is very heavy), so I thought I'd make a limited resource manager. You'd ask it for the resource via a promise, in this case a CanvasRenderingContext2D. It would wrap the context in a revocable proxy. When you're finished, you are required to call release, which both returns the canvas to the limited resource manager so it can give it to someone else AND revokes the proxy so the user can't accidentally use the resource again.
Except that when I make a proxy of a CanvasRenderingContext2D, it fails.
const ctx = document.createElement('canvas').getContext('2d');
const proxy = new Proxy(ctx, {});
// try to change the width of the canvas via the proxy
test(() => { proxy.canvas.width = 100; }); // ERROR
// try to translate the origin via the proxy
test(() => { proxy.translate(1, 2); }); // ERROR
function test(fn) {
try {
fn();
} catch (e) {
console.log("FAILED:", e, fn);
}
}
The code above generates Uncaught TypeError: Illegal invocation in Chrome and TypeError: 'get canvas' called on an object that does not implement interface CanvasRenderingContext2D. in Firefox
Is that an expected limitation of Proxy or is it a bug?
Note: of course there are other solutions. I can remove the proxy and just not worry about it. I can also wrap the canvas in some JavaScript object that exposes only the functions I need and proxy that. I'm just curious whether this is supposed to work or not. This Mozilla blog post indirectly suggests that it's supposed to be possible, since it actually mentions using a proxy with an HTMLElement, if only to point out that it would certainly fail if you called someElement.appendChild(proxiedElement). Given the simple code above, though, I'd expect that it's actually not possible to meaningfully wrap any DOM elements or other native objects.
Below is proof that proxies work with plain JS objects and with class-based objects (where the functions are on the prototype chain), but not with native objects.
const img = document.createElement('img')
const proxy = new Proxy(img, {});
console.log(proxy.src);
This also fails with the same error, whereas the same operations succeed on JavaScript objects:
function testNoOpProxy(obj, msg) {
log(msg, '------');
const proxy = new Proxy(obj, {});
check("get property:", () => proxy.width);
check("set property:", () => proxy.width = 456);
check("get property:", () => proxy.width);
check("call fn on object:", () => proxy.getContext('2d'));
}
function check(msg, fn) {
let success = true;
let r;
try {
r = fn();
} catch (e) {
success = false;
}
log(' ', success ? "pass" : "FAIL", msg, r, fn);
}
const test = {
width: 123,
getContext: function() {
return "test";
},
};
class Test {
constructor() {
this.width = 123;
}
getContext() {
return `Test width = ${this.width}`;
}
}
const testInst = new Test();
const canvas = document.createElement('canvas');
testNoOpProxy(test, 'plain object');
testNoOpProxy(testInst, 'class object');
testNoOpProxy(canvas, 'native object');
function log(...args) {
const elem = document.createElement('pre');
elem.textContent = [...args].join(' ');
document.body.appendChild(elem);
}
pre { margin: 0; }
Well, FWIW, the solution I chose was to wrap the canvas in a small class that does the thing I was using it for. The advantage is that it's easier to test (since I can pass in a mock), and I can proxy that object with no problem. Still, I'd like to know:
Why doesn't Proxy work for native objects?
Do any of the reasons Proxy doesn't work with native objects apply to situations with JavaScript objects?
Is it possible to get Proxy to work with native objects?
const handlers = {
get: (target, key) => key in target ? target[key] : undefined,
set: (target, key, value) => {
if (key in target) {
target[key] = value;
}
return true; // a set trap must report success; a falsy return throws a TypeError in strict mode
}
};
const { revoke, proxy } = Proxy.revocable(ctx, handlers);
// elsewhere
try {
proxy.canvas.width = 500;
} catch (e) { console.log("Access has been revoked", e); }
Something like that should do what you're expecting.
A revocable proxy, with handlers for get and set traps, for the context.
Just keep in mind that when an instance of Proxy.revocable() is revoked, any subsequent access of that proxy will throw, and thus everything now needs to use try/catch, in the case that it has, indeed, been revoked.
Just for fun, here's how you can do the exact same thing without fear of throwing (in terms of simply using the accessor; no guarantee for doing something wrong while you still have access):
const RevocableAccess = (item, revoked = false) => ({
access: f => revoked ? undefined : f(item),
revoke: () => { revoked = true; }
});
const { revoke, access: useContext } = RevocableAccess(ctx);
useContext(ctx => ctx.canvas.width = 500);
revoke();
useContext(ctx => ctx.canvas.width = 200); // never fires
Edit
As pointed out in the comments below, I completely neglected to test for the method calls on the host object, which, it turns out, are all protected. This comes down to weirdness in the host objects, which get to play by their own rules.
With a proxy as above, proxy.drawImage.apply(ctx, args) would work just fine.
This, however, is counter-intuitive.
Cases that I'm assuming fail here, are Canvas, Image, Audio, Video, Promise (for instance based methods) and the like. I haven't conferred with the spec on this part of Proxies, and whether this is a property-descriptor thing, or a host-bindings thing, but I'm going to assume that it's the latter, if not both.
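For what it's worth, the same kind of failure can be reproduced without the DOM. Built-ins such as Map keep their state in internal slots, and their methods check those slots on this. A proxy does not have the target's internal slots, so a method looked up through the proxy and called with the proxy as this is rejected. DOM objects appear to fail for a comparable reason, although that part is an assumption here rather than something checked against the specs. A small sketch:

```javascript
const map = new Map([["width", 123]]);
const proxy = new Proxy(map, {});

let message;
try {
  // `get` is looked up through the proxy, then called with `this === proxy`.
  proxy.get("width");
} catch (error) {
  message = `${error.name}: ${error.message}`;
}
// Logs a TypeError about an incompatible receiver (exact wording varies by engine).
console.log(message);

// Calling the same method with the real Map as `this` works.
console.log(Map.prototype.get.call(map, "width")); // 123
```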
That said, you should be able to override it with the following change:
const { proxy, revoke } = Proxy.revocable(ctx, {
get(object, key) {
if (!(key in object)) {
return undefined;
}
const value = object[key];
return typeof value === "function"
? (...args) => value.apply(object, args)
: value;
}
});
Here, I am still "getting" the method off of the original object, to call it.
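It just so happens that when the value is a function, I return a wrapper that calls it with the original object as this (the same effect as bind), so the method keeps its this relationship to the original context. A proxy without such a trap does not do this for you: methods read through it are invoked with the proxy as this.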
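The same idea can be tried outside a browser. In this sketch a Map stands in for the canvas context (an assumption made so that it runs in Node), and Function.prototype.bind is used instead of a hand-written wrapper:

```javascript
const target = new Map([["width", 123]]);

const { proxy, revoke } = Proxy.revocable(target, {
  get(object, key) {
    // Read against the real object so that getters also see the right `this`.
    const value = Reflect.get(object, key, object);
    return typeof value === "function" ? value.bind(object) : value;
  },
});

console.log(proxy.get("width")); // 123
console.log(proxy.size); // 1

revoke();
let revokedError;
try {
  proxy.get("width"); // any access after revoke() throws
} catch (error) {
  revokedError = error;
}
console.log(revokedError instanceof TypeError); // true
```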
...this causes its own security concern; someone could cache the value out, now, and have permanent access to, say, drawImage, by saying
const draw = proxy.drawImage;...
Then again, they already had the ability to save the real render context, just by saying
const ctx = proxy.canvas.getContext("2d");
...so I'm assuming some level of good-faith, here.
For a more secure solution, there are other fixes, though with canvas, unless it's in-memory only, the context is ultimately going to be available to anyone who can read the DOM.
I have an API route that is being refactored to use ES6 promises to avoid callback hell.
After successfully converting to a promise chain, I wanted to export my .then() functions to a separate file for cleanliness and clarity.
The route file:
The functions file:
This works fine. However, what I'd like to do is move the functions declared in the Class constructor() function into independent methods, which can reference the values instantiated by the constructor. That way it all reads nicer.
But when I do, I run into scoping problems: this is not defined, etc. What is the correct way to do this? Is an ES6 class appropriate to use here, or should I use some other structure?
RAW CODE:
route...
.post((req, res) => {
let SubmitRouteFunctions = require('./functions/submitFunctions.js');
let fn = new SubmitRouteFunctions(req, res);
// *******************************************
// ***** THIS IS WHERE THE MAGIC HAPPENS *****
// *******************************************
Promise.all([fn.redundancyCheck, fn.getLocationInfo])
.then(fn.resetRedundantID)
.then(fn.constructSurveyResult)
.then(fn.storeResultInDB)
.then(fn.redirectToUniqueURL)
.catch((err) => {
console.log(err);
res.send("ERROR SUBMITTING YOUR RESULT: ", err);
});
})
exported functions...
module.exports = class SubmitRouteFunctions {
constructor (req, res) {
this.res = res;
this.initialData = {
answers : req.body.responses,
coreFit : req.body.coreFit,
secondFit : req.body.secondFit,
modules : req.body.modules,
};
this.newId = shortid.generate();
this.visitor = ua('UA-83723251-1', this.newId, {strictCidFormat: false}).debug();
this.clientIp = requestIp.getClientIp(req);
this.redundancyCheck = mongoose.model('Result').findOne({quizId: this.newId});
this.getLocationInfo = request.get('http://freegeoip.net/json/' + this.clientIp).catch((err) => err);
this.resetRedundantID = ([mongooseResult, clientLocationPromise]) => {
console.log(mongooseResult);
if (mongooseResult != null) {
console.log('REDUNDANT ID FOUND - GENERATING NEW ONE')
this.newId = shortid.generate();
this.visitor = ua('UA-83723251-1', this.newId, {strictCidFormat: false});
console.log('NEW ID: ', this.newId);
};
return clientLocationPromise.data;
}
this.constructSurveyResult = (clientLocation) => {
let additionalData = {quizId: this.newId, location: clientLocation};
return Object.assign({}, this.initialData, additionalData);
}
this.storeResultInDB = (newResult) => mongoose.model('Result').create(newResult).then((result) => result).catch((err) => err);
this.redirectToUniqueURL = (mongooseResult) => {
let parsedId = '?' + queryString.stringify({id: mongooseResult.quizId});
let customUrl = 'http://explore-your-fit.herokuapp.com/results' + parsedId;
this.res.send('/results' + parsedId);
}
}
}
ALTERNATIVE #1:
Rather than using ES6 classes, an alternate way to perform the same behavior that cleans up the code just a little bit is to export an anonymous function as described by Nick Panov here: In Node.js, how do I "include" functions from my other files?
FUNCTIONS FILE:
module.exports = function (req, res) {
this.initialData = {
answers : req.body.responses,
coreFit : req.body.coreFit,
secondFit : req.body.secondFit,
modules : req.body.modules,
};
this.newId = shortid.generate();
this.visitor = ua('UA-83723251-1', this.newId, {strictCidFormat: false}).debug();
this.clientIp = requestIp.getClientIp(req);
this.redundancyCheck = mongoose.model('Result').findOne({quizId: this.newId});
this.getLocationInfo = request.get('http://freegeoip.net/json/' + this.clientIp).catch((err) => err);
this.resetRedundantID = ([mongooseResult, clientLocationPromise]) => {
if (mongooseResult != null) {
console.log('REDUNDANT ID FOUND - GENERATING NEW ONE')
this.newId = shortid.generate();
this.visitor = ua('UA-83723251-1', this.newId, {strictCidFormat: false});
console.log('NEW ID: ', this.newId);
};
return clientLocationPromise.data;
}
this.constructSurveyResult = (clientLocation) => {
let additionalData = {quizId: this.newId, location: clientLocation};
return Object.assign({}, this.initialData, additionalData);
}
this.storeResultInDB = (newResult) => mongoose.model('Result').create(newResult).then((result) => result).catch((err) => err);
this.redirectToUniqueURL = (mongooseResult) => {
let parsedId = '?' + queryString.stringify({id: mongooseResult.quizId});
let customUrl = 'http://explore-your-fit.herokuapp.com/results' + parsedId;
res.send('/results' + parsedId);
}
}
Although this does not avoid having to tag each method with this.someFn()..., as I originally wanted, it does save a step in the routing file: doing things this way means I don't have to assign a specific namespace to the methods.
ROUTES FILE
.post((req, res) => {
require('./functions/submitFunctions_2.js')(req, res);
Promise.all([redundancyCheck, getLocationInfo])
.then(resetRedundantID)
.then(constructSurveyResult)
.then(storeResultInDB)
.then(redirectToUniqueURL)
.catch((err) => {
console.log(err);
res.send("ERROR SUBMITTING YOUR RESULT: ", err);
});
})
The functions are reset to reflect each new req and res objects as POST requests hit the route, and the this keyword is apparently bound to the POST route callback in each of the imported methods.
IMPORTANT NOTE: You cannot export an arrow function using this method. The exported function must be a traditional, anonymous function. Here's why, per Udo G's comment on the same thread:
It should be worth to note that this works because this in a function is the global scope when the function is called directly (not bound in any way).
ALTERNATIVE #2:
Another option, courtesy of Bergi from: How to use arrow functions (public class fields) as class methods?
What I am looking for, really, is an experimental feature....
There is a proposal which might allow you to omit the constructor() and directly put the assignment in the class scope with the same functionality, but I wouldn't recommend using that as it's highly experimental.
However, there is still a way to separate the methods:
Alternatively, you can always use .bind, which allows you to declare the method on the prototype and then bind it to the instance in the constructor. This approach has greater flexibility as it allows modifying the method from the outside of your class.
Based on Bergi's example:
module.exports = class SomeClass {
constructor() {
this.someMethod = this.someMethod.bind(this);
this.someOtherMethod = this.someOtherMethod.bind(this);
…
}
someMethod(val) {
// Do something with val
}
someOtherMethod(val2) {
// Do something with val2
}
}
Obviously, this is more in-line with what I was originally looking for, as it enhances the overall readability of the exported code. BUT doing so will require that you assign a namespace to the new class in your routes file like I did originally:
let SubmitRouteFunctions = require('./functions/submitFunctions.js');
let fn = new SubmitRouteFunctions(req, res);
Promise.all([fn.redundancyCheck, fn.getLocationInfo])
.then(...)
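To make the effect of .bind concrete, here is a small self-contained sketch. Counter, add, and addUnbound are hypothetical names; the point is only that a bound method can be passed straight to .then(), while an unbound one loses its this.

```javascript
class Counter {
  constructor(start) {
    this.count = start;
    // Shadow the prototype method with a copy permanently bound to this instance.
    this.add = this.add.bind(this);
  }
  add(amount) {
    this.count += amount;
    return this.count;
  }
  addUnbound(amount) {
    this.count += amount; // `this` is undefined when the method is passed around detached
    return this.count;
  }
}

const counter = new Counter(1);

Promise.resolve(2)
  .then(counter.add)
  .then((total) => console.log(total)); // 3

Promise.resolve(2)
  .then(counter.addUnbound)
  .catch((error) => console.log(error instanceof TypeError)); // true
```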
PROPOSED / EXPERIMENTAL FEATURE:
This is not really my wheelhouse, but per Bergi, there is currently a Stage-2 proposal (https://github.com/tc39/proposal-class-public-fields) that is attempting to get "class instance fields" added to the next ES spec.
"Class instance fields" describe properties intended to exist on
instances of a class (and may optionally include initializer
expressions for said properties)
As I understand it, this would solve the issue described here entirely, by allowing methods attached to class objects to reference the instance they belong to. Therefore, the issues with this would disappear, and methods could optionally be bound automatically.
My (limited) understanding is that the arrow function would be used to accomplish this, like so:
class SomeClass {
constructor() {...}
someMethod = (val) => {
// Do something with val
// Where 'this' is bound to the current instance of SomeClass
}
}
Apparently this can be done now using a Babel compiler, but is obviously experimental and risky. Plus, in this case we're trying to do this in Node / Express which makes that almost a moot point :)
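Public class fields have since been standardized (ES2022) and are supported natively by current Node.js releases. The syntax assigns an arrow function to a field. A minimal sketch, with Greeter and greet as hypothetical names:

```javascript
class Greeter {
  name = "world"; // instance field

  // Arrow function stored in an instance field: `this` is always the instance.
  greet = (greeting) => `${greeting}, ${this.name}!`;
}

const { greet } = new Greeter(); // deliberately detached from the instance
console.log(greet("Hello")); // Hello, world!
Promise.resolve("Hi").then(greet).then(console.log); // Hi, world!
```

Note that each instance gets its own copy of the function, unlike a prototype method.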
I'm working on an emulator. The task at hand is an incoming request on a certain endpoint. The request may contain 1-4 options in the req.body.options. The basic design idea is that an object contains the options and the corresponding method calls (as some sort of a sub-router).
let dataActions = {
option1: optionMethod(param1, param2),
option2: optionMethod2(param1, param2),
option3: optionMethod3(params),
option4: optionMethod4(params)
}
for (key in req.body.options) {
...
}
The for...in loop should fire the methods (which live in other files) whenever a key in the request matches a key in dataActions. Is there an idiomatic way, or an established design pattern, to make this work?
The problem is that you already fire the methods yourself.
let dataActions = {
option1: optionMethod(param1, param2) // <-- this is a function call
}
Doing it this way you assign the result of optionMethod() to option1. The above is effectively shorthand for
let dataActions = {};
dataActions.option1 = optionMethod(param1, param2);
Written that way, it may be more obvious that the function is called immediately.
You don't want to call the methods immediately. You want to store them for later use. Either store them directly:
let dataActions = {
option1: optionMethod // <-- this is a function reference
}
...or store a function that calls them in some specific way:
let dataActions = {
option1: function () {
return optionMethod('some', 'parameters');
}
}
Now you can call them at a later time, for example like this:
Object.keys(dataActions).filter(a => a in req.body.options).forEach(a => {
var optionMethod = dataActions[a];
optionMethod();
});
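Putting the pieces together, here is a self-contained sketch of such a dispatch table. The handler bodies, the fake req object, and the runActions helper are all made up for illustration; each stored function receives the value that the request supplied for its option.

```javascript
// Hypothetical handlers standing in for optionMethod, optionMethod2, ...
const optionMethod = (a, b) => `one:${a}:${b}`;
const optionMethod2 = (a, b) => `two:${a}:${b}`;

// Each entry stores a function; nothing is called at this point.
const dataActions = {
  option1: (value) => optionMethod(value, "fixed"),
  option2: (value) => optionMethod2(value, "fixed"),
};

function runActions(options) {
  return Object.keys(options)
    // Ignore request keys that have no registered action.
    .filter((key) => Object.prototype.hasOwnProperty.call(dataActions, key))
    .map((key) => dataActions[key](options[key]));
}

// Fake request body, for demonstration only.
const req = { body: { options: { option2: "b", unknown: "x", option1: "a" } } };
console.log(runActions(req.body.options)); // [ 'two:b:fixed', 'one:a:fixed' ]
```

Checking own properties rather than using the in operator avoids matching inherited keys such as toString.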