How to ensure webpage DOM is fully loaded before using jQuery - javascript

I am trying to load the DOM of the currently open webpage in a Google Chrome extension that I am developing, then retrieve some elements and store them in variables for manipulation. Currently, I am just trying to grab the sender email from the span with the gD class. As of now, whatever I have in the doStuffWithDOM function is output to the console twice. I have tried several ways of ensuring that the page is fully loaded before calling the function, such as checking tab.status == 'complete' and/or changeInfo.status == 'complete', but the function is still called multiple times. Any suggestions on how to ensure the webpage's DOM is fully loaded before calling the doStuffWithDOM function?
background.js
function doStuffWithDOM(domContent) {
    senderName = $(domContent).find("span.gD");
    console.log(senderName);
}
chrome.tabs.onUpdated.addListener(function(id, changeInfo, tab) {
    if (changeInfo.status == 'complete' && tab.status == 'complete') { // To send message after the webpage has loaded
        chrome.tabs.sendMessage(tab.id, { text: "report_back" }, function(response) {
            doStuffWithDOM(response);
        });
    }
});
console output:
test
jQuery.fn.init [prevObject: jQuery.fn.init(62)]
test
jQuery.fn.init [span.gD, prevObject: jQuery.fn.init(70)]

Related

How to get a javascript (emberjs) rendered HTML source in javascript

Problem: I am working on an extension in javascript which needs to be able to view the source HTML of a page after everything is rendered.
The problem is that no matter what method I use, I can only seem to retrieve the pre-rendered source. The website is using emberjs for generating the content of the page.
Example:
Site: https://www.playstation.com/en-us/explore/games/ps4-games/?console=ps4
When I right click and view source, I get the page before the content is loaded. When I right click and inspect element, I want to get the source after the content has loaded.
What I've tried:
background.js
var acceptedURLPattern = "playstation.com";
tabUpdatedCallback = function(tabID, changeInfo, tab) {
    if(tab.url.indexOf(acceptedURLPattern) == -1) return;
    var eventJsonScript = {
        code: "console.log(\"Script Injected\"); window.addEventListener(\"load\", (event) => { " + browserString + ".runtime.sendMessage({ \"html\": document.documentElement.outerHTML });});"
    };
    browser.tabs.executeScript(tabID, eventJsonScript);
}
handleHTMLMessage = function(request, sender, sendResponse) {
    console.log(request);
}
browser.tabs.onUpdated.addListener(tabUpdatedCallback);
browser.runtime.onMessage.addListener(handleHTMLMessage);
The above script is injecting an eventListener onto the page I want to grab the source of after it fires the "load" event which will then send a message back to background.js containing that source.
I've tried changing the documentElement to innerHTML/outerHTML as well as changing the eventListener to document.addEventListener(\"DOMContentLoaded\"), but none of these changes seemed to have any effect.
I've also tried using these: Get javascript rendered html source using phantomjs and get a browser rendered html+javascript but they are using phantomjs to load and execute the page, then return the html. In my solution, I need to be able to grab the already rendered page.
Thanks for the help in advance!
Edit #1:
I took a look at MutationObserver as mentioned by @wOxxOm and changed the eventJsonScript variable to look like this:
var eventJsonScript = {
    code: "console.log(\"Script Injected\"); var mutationObserver = new MutationObserver( (mutations) => { mutations.forEach((mutation) => {if( JSON.stringify(mutation).indexOf(\"Yakuza\") != -1) { console.log(mutation); } });}); mutationObserver.observe(document.documentElement, {attributes: true, characterData: true, childList: true, subtree: true, attributeOldValue: true, characterDataOldValue: true}); mutationObserver.takeRecords()"
};
However, despite the site clearly having a section for Yakuza 6, the event doesn't get fired. I did remove the if condition in the injected script to verify that events do get fired normally; they just don't seem to contain the information that I'm looking for.
So the good news is that someone has already written the code to do this in Ember, you can find it here:
https://github.com/emberjs/ember-test-helpers/blob/031969d016fb0201fd8504ac275526f3a0ab2ecd/addon-test-support/%40ember/test-helpers/settled.js
This is the code Ember tests use to wait until everything is rendered and complete, or "settled".
The bad news is it is a nontrivial task to extract it correctly for your extension.
Basically, you will want to:
1. Wait till the page is loaded (window.load event).
2. setTimeout at least 200 ms to ensure the Ember app has booted.
3. Wait until settled, using the code linked above.
4. Wait until the browser is idle (requestIdleCallback in latest Chrome, or get a polyfill).
Hope this helps get you started.
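The sequence above could be chained roughly as follows. This is only a sketch: waitForRenderedHtml and waitUntilSettled are hypothetical names, and waitUntilSettled stands in for whatever you extract from the Ember settled code linked above. It is assumed to take a callback and call it once the app has settled. The window object is passed in as win, so an injected script would pass window.

```javascript
// Sketch only: waitUntilSettled is a hypothetical stand-in for logic
// extracted from Ember's settled(); it must call its callback when done.
function waitForRenderedHtml(win, waitUntilSettled, onReady) {
  function whenIdle() {
    onReady(win.document.documentElement.outerHTML);
  }

  function afterLoad() {
    // Give the Ember app at least 200 ms to boot.
    win.setTimeout(function () {
      // Wait until the app reports that it is settled.
      waitUntilSettled(function () {
        // Wait until the browser is idle, falling back to a zero-delay
        // timeout where requestIdleCallback is unavailable.
        if (typeof win.requestIdleCallback === "function") {
          win.requestIdleCallback(whenIdle);
        } else {
          win.setTimeout(whenIdle, 0);
        }
      });
    }, 200);
  }

  // Wait for the load event unless the page has already finished loading.
  if (win.document.readyState === "complete") {
    afterLoad();
  } else {
    win.addEventListener("load", afterLoad);
  }
}
```

In the injected script, the onReady callback is where the message back to background.js would be sent, for example with runtime.sendMessage({ html: html }).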

Chrome extension to refresh a page every minute and run a (javascript) script every time it refreshes

What I want:
My purpose is to check whether new content was added to a page (that I do not own), so I was thinking of making a script that saves the last content added in a cookie and refreshes the page every minute. If the cookie doesn't match the last content added, that would mean there is new content, and I would receive a notification.
Let's try with pseudocode:
main_file:
include: functions.js;
cookie last_content_added= get_first_paragraph();
//Refresh script
do (every_minute){
page_reload();
}
when.page.reload.complete {
run script_check_content
}
functions.js
script_check_content{
var content_check = get_first_paragraph();
if (content_check == cookie[last_content_added])
{
//do nothing
}
else{
//new content was added
play.notification.mp3
cookie[last_content.added] = get_first_paragraph();
}
}
Am I missing an easier solution for what I'm looking for?
I'm new to Chrome extensions. If you could separate the code into different files as in a real extension, I would appreciate it very much.
I recommend using chrome.tabs.query, which gets all tabs that have the specified properties (or all tabs if no properties are specified), and chrome.tabs.executeScript to inject JavaScript code that calls window.location.reload() to refresh the page.
Here's sample code to get the current tab and reload it using chrome.tabs methods:
chrome.tabs.query({active: true, currentWindow: true}, function (arrayOfTabs) {
    var code = 'window.location.reload();';
    chrome.tabs.executeScript(arrayOfTabs[0].id, {code: code});
});
Also, add an onCompleted listener to be notified when the page is completely loaded and initialized:
chrome.webNavigation.onCompleted.addListener(callback);
Take a look at MutationObserver, it provides a way to react to changes in a DOM. You can provide a callback to react to DOM changes and don't need to use a timer.
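The comparison step from the question's pseudocode could be sketched as below. The names createContentChecker, storage, and notify are illustrative assumptions, not part of any Chrome API. Here storage is anything with getItem and setItem (for example window.localStorage in a content script), used instead of a cookie.

```javascript
// Sketch only: remembers the last seen content and reports changes.
function createContentChecker(storage, notify) {
  var KEY = "last_content_added"; // key name borrowed from the pseudocode

  return function check(currentContent) {
    var previous = storage.getItem(KEY);
    if (previous === currentContent) {
      return false; // nothing new
    }
    storage.setItem(KEY, currentContent);
    // Skip the notification on the very first run, when nothing is stored yet.
    if (previous !== null) {
      notify(currentContent);
    }
    return previous !== null;
  };
}
```

A content script that runs after each reload would call the returned function with the text of the first paragraph and play the notification sound inside notify.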

chrome.tabs.create/executeScript > call function that belongs to the page

I'm developing an extension for Google Chrome and the problem I'm having is I need to be able to call a JavaScript function that belongs to the webpage that's opened in the tab.
For details, the website is my website, therefore I know that function does exist. That function does a lot of things based on a string value. I want the user to be able to highlight text on any webpage, click a button from the Chrome extension that automatically loads my webpage and calls that function with the highlighted text as its value.
Here's what I got so far:
chrome.tabs.create({ url: "https://mywebsite.com" }, function (tab) {
    var c = "initPlayer('" + request.text + "');"; // 'request.text' is the highlighted text which works
    chrome.tabs.executeScript(tab.id, { code: c });
});
But Chrome console says: "Uncaught ReferenceError: initPlayer is not defined."
I know that function does exist as it is in my code on my own website.
Any help is highly appreciated. Thanks!
This happens because pages and content scripts run inside two separate JavaScript contexts. This means that content scripts cannot access functions and variables inside a page directly: you'll need to inject a script into the page itself to make it work.
Here is a simple solution:
chrome.tabs.create({url: "https://mywebsite.com"}, function (tab) {
    var c = "var s = document.createElement('script');\
    s.textContent = \"initPlayer('" + request.text + "');\";\
    document.head.appendChild(s);"
    chrome.tabs.executeScript(tab.id, {code: c});
});
NOTE: since January 2021, use Manifest V3 with chrome.scripting.executeScript() instead of chrome.tabs.executeScript().
With the above code you will basically:
1. Create the tab.
2. Inject the code (variable c) into it as a content script that will:
   a. Create a script with the code you want to execute on the page.
   b. Inject the script in the page and, therefore, run its code in the page context.
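One caveat with building the code by string concatenation: if the highlighted text contains a quote or a line break, the generated script is no longer valid JavaScript. A possible variant, assuming a hypothetical helper named buildPageCall, escapes the text with JSON.stringify before embedding it:

```javascript
// Sketch only: builds content-script code that injects a page-context call.
function buildPageCall(functionName, argument) {
  // JSON.stringify produces a valid JavaScript string literal for the argument.
  var pageCode = functionName + "(" + JSON.stringify(argument) + ");";
  // A second JSON.stringify embeds the page code as a literal in the
  // content-script code, so quotes and newlines survive both layers.
  return "var s = document.createElement('script');" +
    "s.textContent = " + JSON.stringify(pageCode) + ";" +
    "document.head.appendChild(s);";
}
```

The result would be passed in place of the variable c, for example chrome.tabs.executeScript(tab.id, {code: buildPageCall("initPlayer", request.text)}).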

How do I provide an extra function to javascript code through an extension?

I want to write an extension that does the following:
Defines a custom function
Allows Javascript code loaded from the Internet to run such a function
The function should take as a parameter an event listener. Basically, something like:
newApiFunctionDefinedInExtension(function(responseHeaders) {
    console.log("Headers arrived!", responseHeaders);
});
Then using chrome.webRequest, my extension (which made newApiFunctionDefinedInExtension available in the first place) will call the listener (in the locally loaded page) every time response headers are received from the network.
I am new to Chrome extensions and cannot find a way to make that happen. It would be great to know:
How to make a function defined in a module available to the loaded page's scope
How to make such an EventEmitter -- is there a constructor class I can extend?
My goal is simple: the loaded page should define a function, and that function should be called every time there is a network connection.
Every webRequest event receives information about a request, including the ID of the originating tab.
So, assuming that the tab exists (see note 1), you can use the following flow:
// background.js
chrome.webRequest.onHeadersReceived.addListener(function(details) {
    if (details.tabId == -1)
        return; // Not related to any tab
    chrome.tabs.sendMessage(details.tabId, {
        responseHeaders: details.responseHeaders
    });
}, {
    urls: ['*://*/*'], // e.g. all http(s) URLs. See match patterns docs
    // types: ['image'] // for example, defaults to **all** request types
}, ['responseHeaders']);
Then, in a content script (declared in the manifest file), you take the message and pass it to the web page:
// contentscript.js
chrome.runtime.onMessage.addListener(function(message) {
    // Assuming that all messages from the background are meant for the page:
    document.dispatchEvent(new CustomEvent('my-extension-event', {
        detail: message
    }));
});
After doing that, your web page can just receive these events as follows:
document.addEventListener('my-extension-event', function(event) {
    var message = event.detail;
    if (message.responseHeaders) {
        // Do something with response headers
    }
});
If you want to put an abstraction on top (e.g. implementing a custom EventEmitter), then you need to inject a script in the main execution environment, and declare your custom API over there.
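Such an abstraction might look like the sketch below, which reuses the event name from the examples above. defineExtensionApi is a hypothetical helper. It would have to run in the page's own context (for instance via an injected script element), with window and document passed in as target and doc.

```javascript
// Sketch only: exposes newApiFunctionDefinedInExtension on `target`.
function defineExtensionApi(target, doc) {
  target.newApiFunctionDefinedInExtension = function (listener) {
    doc.addEventListener("my-extension-event", function (event) {
      var message = event.detail;
      // Forward only messages that carry response headers.
      if (message && message.responseHeaders) {
        listener(message.responseHeaders);
      }
    });
  };
}
```

After defineExtensionApi(window, document) has run in the page, code loaded from the Internet could register a listener exactly as shown in the question.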
note 1. For simplicity, I assumed that the tab existed. In reality, that is never true for type "main_frame" (and "sub_frame"), because the page has not yet been rendered. If you want to get response headers for the top-level/frame documents, then you need to temporarily store the response headers in some data structure (e.g. a queue / dictionary) in the background page, and send the data to the content script whenever the script is ready.
This can be implemented by using chrome.runtime.sendMessage in the content script to send a message to the background page. Then, whenever a page has loaded and the content script is ready, the background page can use sendResponse to deliver any queued messages.
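The buffering described in note 1 could be sketched like this. createHeaderQueue, push, and drain are hypothetical names; the comments indicate where the chrome.* calls from the answer would plug in.

```javascript
// Sketch only: buffers response headers per tab until the content script asks.
function createHeaderQueue() {
  var pending = {}; // tabId -> array of responseHeaders arrays

  return {
    // Call from the chrome.webRequest.onHeadersReceived listener with
    // details.tabId and details.responseHeaders.
    push: function (tabId, responseHeaders) {
      if (!pending[tabId]) {
        pending[tabId] = [];
      }
      pending[tabId].push(responseHeaders);
    },
    // Call from the chrome.runtime.onMessage listener when the content script
    // reports that it is ready, and pass the result to sendResponse.
    // Returns the queued entries and empties the queue for that tab.
    drain: function (tabId) {
      var queued = pending[tabId] || [];
      delete pending[tabId];
      return queued;
    }
  };
}
```

In the background page's onMessage listener, the tab ID for drain would come from sender.tab.id.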

PhantomJS page.open with multiple URLs slows down

I have a set of URLs which I open with page.open(). After processing the contents, I call page.release() on the page and then call the function recursively to open the next page. The webpage has JavaScript on it, and I test for a condition showing when the JavaScript has loaded results. The first page.open() call loads the JS in 1 second, but all subsequent calls take about 6 seconds. I am using page.release(), the pages loaded aren't blank, and PhantomJS is not crashing. I am wondering why this is happening. I've also tried using page.close().
doAnalysis = function (i) {
    var url = 'http://myurl.com';
    page.open(url, function(status) {
        //get html and process it
        page.release();
        doAnalysis(i + 1); // i++ would pass the old value of i again
    });
}
