Get the text from an external HTML document - javascript

My goal is to get the text from an HTML document which does not call any functions from my .jsp file.
I've looked around and I thought I had found the answer to my problem but it doesn't seem to be working, and other answers consist of using jQuery (which I am both unfamiliar with and not allowed to use).
This is my code so far:
function getText(divID) {
var w = window.open("test.html");
var body = w.document.body;
var div = document.getElementById(divID);
var textContent = body.textContent || body.innerText;
console.log(textContent);
//div.appendChild(document.createTextNode(textContent));
}
So as you can see, I'm trying to get the body of one HTML document and have it appear in another. Am I on the right track?
EDIT: OK, so I seem to have made my problem quite confusing. I call the function in an HTML document called html.html, but I want to get the text from test.html and then have it appear in html.html. It has to be like this because I can't assume that the HTML document I want to read from will include my .jsp file in its head.
At the moment I am getting the following error.
Uncaught TypeError: Cannot read property 'body' of undefined

document.body in the other window is undefined because the other window has not yet loaded and rendered the document.
One solution would be to wait for the window's load event.
function getText(divID) {
var w = window.open("test.html");
w.addEventListener("load", function() {
var body = w.document.body;
var div = document.getElementById(divID);
var textContent = body.textContent || body.innerText;
console.log(textContent);
});
}
Make sure you run the getText function in response to a user event such as a click; otherwise the browser's popup blocker will likely stop window.open from opening the window.
If all you want to do is get the contents of the other window, using AJAX would probably be a better option.
function getText(divID) {
  var xhr = new XMLHttpRequest();
  xhr.onreadystatechange = function() {
    // readyState 4 means the request is done; response is null if it failed
    if (xhr.readyState === 4 && xhr.response) {
      // With responseType "document", xhr.response is a parsed Document
      var body = xhr.response.body;
      var div = document.getElementById(divID);
      var textContent = body.textContent || body.innerText;
      console.log(textContent);
    }
  };
  xhr.open("GET", "test.html", true);
  xhr.responseType = "document";
  xhr.send();
}

Related

Returned JavaScript value wont render inside a div element

This is a basic question (posting this again, as it was not re-opened after I updated the question), but I couldn't find any duplicates on SO.
This is a script I intend to use in my project on different pages. The purpose is to override the default ID shown in a span element to the order number from the URL parameter session_order. This doesn't affect anything and only enhances the UX for my project.
scripts.js (loaded in the header):
function get_url_parameter(url, parameter) {
var url = new URL(url);
return url.searchParams.get(parameter);
}
And in my HTML template, I call the function this way,
<div onload="this.innerHTML = get_url_parameter(window.location.href, 'session_order');">24</div>
Also tried this,
<div><script type="text/javascript">document.write(get_url_parameter(window.location.href, 'session_order'));</script></div>
When the page is rendered, nothing changes. No errors or warnings in the console either for the first case.
For the second case, console logged an error Uncaught ReferenceError: get_url_parameter is not defined, although script.js loads before the div element (without any errors).
Normally, I'd do this on the server-side with Flask, but I am trying out JavaScript (I am new to JavaScript) as it's merely a UX enhancement.
What am I missing?
Try this:
// This is commented because it can't be tested inside the stackoverflow editor
//const url = window.location.href;
const url = 'https://example.com?session_order=13';
function get_url_parameter(url, parameter) {
const u = new URL(url);
return u.searchParams.get(parameter);
}
window.onload = function() {
const el = document.getElementById('sessionOrder');
const val = get_url_parameter(url, 'session_order');
if (val) {
el.textContent = val; // textContent avoids injecting markup taken from the URL
}
}
<span id="sessionOrder">24</span>
Define the function you need for getting the URL parameter, then get the parameter and update the element in the window load event. (A div never fires its own load event, which is why the inline onload attribute did nothing.)
Here you go. Try to stay away from inline scripts using document.write.
function get_url_parameter(url, parameter) {
var url = new URL(url);
return url.searchParams.get(parameter);
}
window.addEventListener('load', function() {
const url = 'https://yourpagesdomain.name/?session_order=hello'; //window.location.href;
const sessionOrder = get_url_parameter(url, 'session_order');
document.getElementById('sessionOrder').innerText = sessionOrder;
});
<div id="sessionOrder"></div>
The order of your markup and script matters...
<div></div>
<script>
function get_url_parameter(url, parameter) {
var url = new URL(url);
return url.searchParams.get(parameter);
}
</script>
<script>
document.querySelector('div').innerHTML = get_url_parameter('https://example.com?session_order=2', 'session_order');
</script>
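The URL-parsing part can be checked on its own, without a page. Here is a minimal sketch, assuming an environment with the standard URL class (modern browsers or Node). The names getUrlParameter and textOrDefault are made up for this example and are not part of the original script:

```javascript
// Same idea as get_url_parameter above, without shadowing the `url` argument.
function getUrlParameter(url, parameter) {
  return new URL(url).searchParams.get(parameter);
}

// Hypothetical helper: keep the default text when the parameter is absent.
function textOrDefault(url, parameter, fallback) {
  var value = getUrlParameter(url, parameter);
  return value === null ? fallback : value;
}

console.log(getUrlParameter('https://example.com/?session_order=13', 'session_order')); // "13"
console.log(getUrlParameter('https://example.com/', 'session_order')); // null
console.log(textOrDefault('https://example.com/', 'session_order', '24')); // "24"
```

searchParams.get returns null when the parameter is missing, so checking for null lets the span keep its default value (24 in the question).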

Javascript - DOM parser load ajax requests, scripts no run

When a user clicks on a link, instead of loading a whole new page I load the new page's HTML through an ajax request (with a query string that tells the server not to send the nav bar data each time). I then put the resulting data through DOMParser so that I can get just the content from the div with the id "content" and replace the innerHTML of the current document's "content" div.
After a request made this way, though, any script tags within newDOM don't run after being put in the content div. They don't appear to run while they are in newDOM either, because a script that edits the document immediately as it loads has had no effect when you log newDOM.
AjaxRequest(href, function(data) {
var parser = new DOMParser();
var newDOM = parser.parseFromString(data.responseText, "text/html");
//Setup new title
var title = '';
if (newDOM.getElementsByTagName('title').length > 0 && newDOM.getElementsByTagName('title')[0] !== null) {
title = newDOM.getElementsByTagName('title')[0].innerHTML;
} else {
title = rawhref;
}
document.title = title;
history.pushState({}, title, rawhref);
if (newDOM.getElementById('content') === null) {
//If there is an error message insert whole body into the content div to get full error message
document.getElementById('content').appendChild(newDOM.getElementsByTagName('body')[0]);
} else {
document.getElementById('content').appendChild(newDOM.getElementById('content'));
}
MapDOM();
if (typeof(onPageLoad) == "function") {
onPageLoad();
}
});
Note: the variable "rawhref" is just the request URL without ?noheader, so that it is easier for users to go back through their history.
NOTE: After any new load I also have a function that overwrites any new a tag so that it will work through this method for the next new page.
Also, it would be much preferred if the answer didn't use jQuery.
Someone answered this and then deleted their solution while I was testing it. Thanks so much, whoever you were. For anyone who has this problem in the future, here is the code they showed. I haven't had time to fully understand why it works, but I think I can work it out.
function subLoader(dest, text) {
var p = new DOMParser();
var doc = p.parseFromString(text, 'text/html');
var f = document.createDocumentFragment();
while (doc.body.firstChild) {
f.appendChild(doc.body.firstChild);
}
[].map.call(f.querySelectorAll('script'), function(script) {
var scriptParent = script.parentElement || f;
var newScript = document.createElement('script');
if (script.src) {
newScript.src = script.src;
} else {
newScript.textContent = script.textContent;
}
scriptParent.replaceChild(newScript, script);
});
dest.appendChild(f);
}

HTML JavaScript delay downloading img src until node in DOM

Hi I have markup sent to me from a server and I set it as the innerHTML of a div element for the purpose of traversing the tree, finding image nodes, and changing their src values. Is there a way to prevent the original src value from being downloaded?
Here is what I am doing
function replaceImageSrcsInMarkup(markup) {
var div = document.createElement('div');
div.innerHTML = markup;
var images = div.getElementsByTagName('img');
images.forEach(replaceSrc);
return div.innerHTML;
}
The problem is that in browsers, as soon as you do:
var img = document.createElement('img'); img.src = 'someurl.com';
the browser fires off a request to someurl.com. Is there a way to prevent this without resorting to parsing the markup myself? If there is no other way, does anyone know a good way of parsing the markup with as little code as possible to accomplish my goal?
I know you are already happy with your solution, but I think it would be worth sharing a safe method for future users.
You can now simply use the DOMParser object to generate an external document from your HTML string, instead of using a div created by your current document as container.
DOMParser specifically avoids the pitfalls mentioned in the question and other threats: no img src download, no JavaScript execution, even in elements attributes.
So in your case you can safely do:
function replaceImageSrcsInMarkup(markup) {
var parser = new DOMParser(),
doc = parser.parseFromString(markup, "text/html");
// Manipulate `doc` as a regular document
var images = doc.getElementsByTagName('img');
for (var i = 0; i < images.length; i += 1) {
replaceSrc(images[i]);
}
return doc.body.innerHTML;
}
Demo: http://jsfiddle.net/94b7gyg9/1/
Note: with your current code, browsers will still try downloading the resource initially specified in your img nodes' src attribute, even if you change it before the end of JS execution. Trace network transactions in this demo: http://jsfiddle.net/94b7gyg9/
Rather than append the new markup to the DOM before you change the img sources, create an element, set its inner HTML, change the source of the images and then finally, append the changed markup to the page.
Here's a fully-worked sample.
<!DOCTYPE html>
<html>
<head>
<script>
"use strict";
function byId(id,parent){return (parent == undefined ? document : parent).getElementById(id);}
//function allByClass(className,parent){return (parent == undefined ? document : parent).getElementsByClassName(className);}
function allByTag(tagName,parent){return (parent == undefined ? document : parent).getElementsByTagName(tagName);}
function newEl(tag){return document.createElement(tag);}
//function newTxt(txt){return document.createTextNode(txt);}
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
window.addEventListener('load', onDocLoaded, false);
function onDocLoaded()
{
byId('goBtn').addEventListener('click', onGoBtnClick, false);
}
var dummyString = "<img src='img/girl.png'/><img src='img/gfx07.jpg'/>";
function onGoBtnClick(evt)
{
var div = newEl('div');
div.innerHTML = dummyString;
var mImgs = allByTag('img', div);
for (var i=0, n=mImgs.length; i<n; i++)
{
mImgs[i].src = "img/murderface.jpg";
}
document.body.appendChild(div);
}
</script>
<style>
</style>
</head>
<body>
<button id='goBtn'>GO!</button>
</body>
</html>
You could parse the markup string directly with a regex: search for all the img src URLs in the string, then replace them with the new URL.
var regex = /<img[^>]+src="?([^"\s]+)"?\s*\/>/g;
var imgUrls = [];
var m; // declare the match variable instead of leaking a global
while ((m = regex.exec(markup)) !== null) {
  imgUrls.push(m[1]);
}
imgUrls.forEach(function(url) {
  // Note: this replaces the first occurrence of the URL anywhere in the string
  markup = markup.replace(url, 'new-url');
});
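The same idea can be done in a single pass with String.prototype.replace and a callback, which avoids collecting the URLs first. This is only a sketch: replaceImgSrcs and mapUrl are hypothetical names, it assumes quoted src attributes, and regex-based HTML handling stays fragile compared with a real parser:

```javascript
// Hypothetical helper: rewrite every <img> src in an HTML string in one pass.
// Assumes the src value is wrapped in single or double quotes.
function replaceImgSrcs(markup, mapUrl) {
  // \ssrc requires whitespace before "src", so data-src is left alone.
  var imgSrc = /(<img\b[^>]*?\ssrc\s*=\s*)(["'])(.*?)\2/gi;
  return markup.replace(imgSrc, function (match, prefix, quote, url) {
    return prefix + quote + mapUrl(url) + quote;
  });
}

var sample = '<p><img src="a.png" alt="x"/><img class="c" src=\'b.png\'></p>';
console.log(replaceImgSrcs(sample, function (url) { return 'cdn/' + url; }));
// <p><img src="cdn/a.png" alt="x"/><img class="c" src='cdn/b.png'></p>
```

Since the markup never becomes DOM nodes here, nothing is downloaded while the string is rewritten.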
Another solution might be, if you have access to it, to set all the img src attributes to an empty string and put the URL in a data-src attribute, so that your markup string looks something like this:
markup = '<img src="" data-src="original-url.png" />'; // original-url.png is a placeholder
Then setting this markup to your div.innerHTML won't trigger any download from the browser. And you can still parse it using regular DOM selector.
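div.innerHTML = markup;
var images = div.getElementsByTagName('img');
// getElementsByTagName returns an HTMLCollection, which has no forEach
for (var i = 0; i < images.length; i++) {
  var oldSrc = images[i].getAttribute('data-src'); // original URL, if needed
  images[i].setAttribute('src', 'new-url');
}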

Loading an image but onload/onerror not working as expected

I have a div
<div id='cards'>
I want to fill it with images based on some logic, but only once the images have first been loaded into memory. Otherwise, through onerror, I want to fill in some text.
function pasteCard(card, to){
if (typeof(card) == 'string')
card = [card];
var image = [];
for (var i = 0; i < card.length; i++) {
image[i] = new Image();
image[i].src = '/sprites/open/' + card[i] + '.png';
image[i].onload = function() {
pasteImage(to, image[i]);
}
image[i].onerror = function() {
pasteText(to, card[i]);
}
// alert(card[i]) #1
}
function pasteImage(to, image) {
to.append(image);
}
function pasteText(to, text) {
// alert(card[i]) #2
to.append(text);
}
}
pasteCard(['ABC123', 'DEF456', 'GHI789'], $('#cards'));
But this isn't working.
Problem/weirdness: If only #2 alert is active it returns nothing. But strangely if #1 alert is also active it does kinda work... (but still doesn't load my images, and mostly fails too when other code is involved)
Question: Why is it not working without #1 alert (at least in that jsfiddle)
suggestions?: what should I do?
The onload and onerror handlers run after the loop has finished, so by then i equals card.length and both image[i] and card[i] are undefined. Inside the handler, this refers to the image object, so you can set a data attribute on each image and read it back in your error handler.
Here is an example:
http://jsfiddle.net/7CfEu/4/
The callbacks run later, after the loop has finished, so each one needs its own copy of the card and the image instead of looking them up through the shared loop index.
The i variable has already changed by the time a callback fires, so using it inside the callback gives you undefined. Because var is function-scoped, wrapping the loop body in a function gives each iteration its own variables:
for (var i = 0; i < card.length; i++) {
  (function(current_card) {
    var current_image = new Image();
    current_image.onload = function() {
      pasteImage(to, current_image);
    };
    current_image.onerror = function() {
      pasteText(to, current_card);
    };
    // Attach the handlers before setting src so a cached image cannot fire early
    current_image.src = '/sprites/open/' + current_card + '.png';
    image[i] = current_image;
  })(card[i]);
}
Fiddle: http://jsfiddle.net/7CfEu/6/
(Also - closing the div tag is never a bad idea)
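The loop-variable problem is not specific to images, so it can be reproduced without the DOM. Here is a small sketch with made-up function names (makeCallbacksShared, makeCallbacksCaptured) that contrasts callbacks sharing one var i with callbacks that each capture their own value:

```javascript
// All closures share the single function-scoped `i`.
function makeCallbacksShared(items) {
  var callbacks = [];
  for (var i = 0; i < items.length; i++) {
    callbacks.push(function () { return items[i]; });
  }
  return callbacks; // by now i === items.length
}

// Each forEach call gets its own `item` binding.
function makeCallbacksCaptured(items) {
  var callbacks = [];
  items.forEach(function (item) {
    callbacks.push(function () { return item; });
  });
  return callbacks;
}

var cards = ['ABC123', 'DEF456', 'GHI789'];
function call(f) { return f(); }
console.log(makeCallbacksShared(cards).map(call));   // [ undefined, undefined, undefined ]
console.log(makeCallbacksCaptured(cards).map(call)); // [ 'ABC123', 'DEF456', 'GHI789' ]
```

In the shared version every callback reads items[3], which does not exist, and that matches the empty output the question describes.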
Just in case anyone ends up here for the same reason I did.
I was going crazy because onload and onerror were not firing in the page I was building. I tried copy-pasting:
var myimage = new Image();
myimage.onload = function() { alert("Success"); };
myimage.onerror = function() { alert("Fail"); };
myimage.src = "mog.gif" //Doesn't exist.
This worked in CodePen and in random, otherwise blank pages.
Turns out the problem I was having was that I was doing AJAX requests earlier in the page. This involved authorization which in turn involved a call to
setRequestHeader();
This was resulting in a net::ERR_FILE_NOT_FOUND error instead of the expected GET mog.gif 404 (Not Found)
This seemed to prevent proper triggering of events.
Revert with
xhr.setRequestHeader("Authorization", "");

Chrome Extension doesn't considers optionvalue until reloaded

I'm working on my first Chrome extension. After learning some interesting things about jQuery, I've moved to raw JavaScript thanks to "Rob W".
The extension makes an XMLHttpRequest to a remote page with some parameters and, after manipulating the result, renders an HTML list in the popup window.
Now that everything is up and running, I'm moving on to adding some options.
The first one is "how many elements do you want to load", which sets a limit on the number of list elements.
I'm using fancy-settings to manage my options, and here's the problem.
The extension acts as if there were a "cache" of the local storage settings.
If I do not set anything and perform a clean installation of the extension, the default number of elements is loaded correctly.
If I change the value, I need to reload the extension to see the change.
Only if I remove the setting do I see the extension work as intended immediately.
Now for some more specific information.
This is the popup.js script:
chrome.extension.sendRequest({action: 'gpmeGetOptions'}, function(theOptions) {
//Load the limit for topic shown
console.log('NGI-LH -> Received NGI "max_topic_shown" setting ('+theOptions.max_topic_shown+')');
//Initializing the async connection
var xhr = new XMLHttpRequest();
xhr.open('GET', 'http://gaming.ngi.it/subscription.php?do=viewsubscription&pp='+theOptions.max_topic_shown+'&folderid=all&sort=lastpost&order=desc');
xhr.onload = function() {
var html = "<ul>";
var doc = xhr.response;
var TDs = doc.querySelectorAll('td[id*="td_threadtitle_"]');
[].forEach.call(TDs, function(td) {
//Removes useless elements from the source
var tag = td.querySelector('img[src="images/misc/tag.png"]'); (tag != null) ? tag.parentNode.removeChild(tag) : false;
var div_small_font = td.querySelector('div[class="smallfont"]'); (div_small_font != null ) ? div_small_font.parentNode.removeChild(div_small_font) : false;
var span_small_font = td.querySelector('span[class="smallfont"]'); (span_small_font != null ) ? span_small_font.parentNode.removeChild(span_small_font) : false;
var span = td.querySelector('span'); (span != null ) ? span.parentNode.removeChild(span) : false;
//Change the look of some elements
var firstnew = td.querySelector('img[src="images/buttons/firstnew.gif"]'); (firstnew != null ) ? firstnew.src = "/img/icons/comment.gif" : false;
var boldtext = td.querySelector('a[style="font-weight:bold"]'); (boldtext != null ) ? boldtext.style.fontWeight = "normal" : false;
//Modify the lenght of the strings
var lenght_str = td.querySelector('a[id^="thread_title_"]');
if (lenght_str.textContent.length > 40) {
lenght_str.textContent = lenght_str.textContent.substring(0, 40);
lenght_str.innerHTML += "<span style='font-size: 6pt'> [...]</span>";
}
//Removes "Poll:" and Tabulation from the strings
td.querySelector('div').innerHTML = td.querySelector('div').innerHTML.replace(/(Poll)+(:)/g, '');
//Modify the URL from relative to absolute and add the target="_newtab" for the ICON
(td.querySelector('a[id^="thread_title"]') != null) ? td.querySelector('a[id^="thread_title"]').href += "&goto=newpost" : false;
(td.querySelector('a[id^="thread_goto"]') != null) ? td.querySelector('a[id^="thread_goto"]').href += "&goto=newpost": false;
(td.querySelector('a[id^="thread_title"]') != null) ? td.querySelector('a[id^="thread_title"]').target = "_newtab": false;
(td.querySelector('a[id^="thread_goto"]') != null) ? td.querySelector('a[id^="thread_goto"]').target = "_newtab": false;
//Store the td into the main 'html' variable
html += "<li>"+td.innerHTML+"</li>";
// console.log(td);
});
html += "</ul>";
//Send the html variable to the popup window
document.getElementById("content").innerHTML = html.toString();
};
xhr.responseType = 'document'; // Chrome 18+
xhr.send();
});
Below is background.js (the HTML page just loads /fancy-settings/source/lib/store.js and this script, as the fancy-settings how-to explains):
//Initialization fancy-settings
var settings = new Store("settings", {
"old_logo": false,
"max_topic_shown": "10"
});
//Load settings
var settings = settings.toObject();
//Listener who send back the settings
chrome.extension.onRequest.addListener(function(request, sender, sendResponse) {
if (request.action == 'gpmeGetOptions') {
sendResponse(settings);
}
});
console.log shows the value as if it had been cached, as I said.
If I set the value to "20", it remains at the default until I reload the extension.
If I change it to 30, it remains at 20 until I reload the extension.
If anything more is needed, just ask and I'll edit the question.
The problem appears to be a conceptual misunderstanding. The background.js script in a Chrome Extension is loaded once and continues to run until either the extension or the Chrome Browser is restarted.
This means in your current code the settings variable value is loaded only when the extension first starts. In order to access values that have been updated since the extension is loaded the settings variable value in background.js must be reloaded.
There are a number of ways to accomplish this. The simplest is to move the settings related code into the chrome.extension.onRequest.addListener callback function in background.js. This is also the most inefficient solution, as settings are reloaded every request whether they have actually been updated or not.
A better solution would be to reload the settings value in background.js only when the values are updated in the options page. This uses the persistence, or caching, of the settings variable to your advantage. You'll have to check the documentation for implementation details, but the idea would be to send a message from the options page to the background.js page, telling it to update settings after the new settings have been stored.
As an unrelated aside, the var keyword in the line var settings = settings.toObject(); is not needed. There is no need to redeclare the variable, it is already declared above.
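The difference between the approaches described above can be sketched without any extension APIs. Everything here is a stand-in: storage plays the role of the persisted options, and the handle... functions and onSettingsChanged are hypothetical names for what the request listener in background.js would do:

```javascript
// Hypothetical stand-in for the persisted options.
var storage = { max_topic_shown: '10' };

// Behaviour in the question: read once at startup, then reuse the snapshot.
var snapshot = { max_topic_shown: storage.max_topic_shown };
function handleRequestWithSnapshot() {
  return snapshot;
}

// Simplest fix: read the stored values inside the handler, on every request.
function handleRequestFresh() {
  return { max_topic_shown: storage.max_topic_shown };
}

// Cached alternative: keep a copy, but refresh it when told the options changed.
var cached = { max_topic_shown: storage.max_topic_shown };
function onSettingsChanged() {
  cached = { max_topic_shown: storage.max_topic_shown };
}
function handleRequestCached() {
  return cached;
}

// The user changes the option after the background script has started.
storage.max_topic_shown = '20';
console.log(handleRequestWithSnapshot().max_topic_shown); // "10" (stale)
console.log(handleRequestFresh().max_topic_shown);        // "20"
console.log(handleRequestCached().max_topic_shown);       // "10" until notified
onSettingsChanged();
console.log(handleRequestCached().max_topic_shown);       // "20"
```

In the real extension, the notification would be a message sent from the options page after saving. How to send it depends on the messaging API in use, so check the documentation for that part.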
