Get innerHTML from XMLHttpResponse - javascript

I'm trying to write a Chrome extension that will get a value from the current page, then use that information to request another page and pull a specific element out of the HTML response. I can get the HTML response fine, but I'm unable to parse it to get the specific element.
content.js
function getTicketInfo(){
    var ticketURI = document.getElementById("p3_lkid").value;
    var ticketNumber = document.getElementById("p3_lkold").value;
    var xhr = new XMLHttpRequest();
    xhr.open('GET', "remotePage.html", true);
    xhr.onreadystatechange = function(){
        if(xhr.readyState == 4 && xhr.status == 200){
            handleResponse(xhr);
        }
    };
    xhr.send();
}
function handleResponse(xhr){
    var contactElement = xhr.getElementById("CF00N80000005MAX6_ileinner");
    alert(contactElement.clildNodes[0].nodeValue);
}
remotePage.html
<html>
    <div id="CF00N80000005MAX6_ileinner">
        Text I need!
    </div>
</html>
How can I get this value from the external page? Is there a better way to request this information?

Your XHR response is a string, and not a DOM.
With jQuery you'll be able to convert it to a DOM, and query it.
function handleResponse(xhr){
    // Wrap the response in a detached element so .find() can locate the target
    var contactText = $('<div>').html(xhr.response).find('#CF00N80000005MAX6_ileinner').text();
    alert(contactText);
}

The problem is simply that you're not parsing the HTML response into a DOM object. According to MDN, this is how you parse XML or HTML with vanilla JavaScript:
var parser = new DOMParser();
var doc = parser.parseFromString(xhr.responseText, "text/html"); // use "text/xml" for XML
And then using the new DOM Object doc for accessing elements.
var contactElement = doc.getElementById("CF00N80000005MAX6_ileinner");
alert(contactElement.childNodes[0].nodeValue);
I also noticed you misspelled childNodes, but that isn't the main problem.
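Putting both fixes together, a corrected handleResponse might look like this (a minimal sketch; the element ID is the one from the question):
function handleResponse(xhr){
    // Parse the raw response text into a detached HTML document
    var parser = new DOMParser();
    var doc = parser.parseFromString(xhr.responseText, "text/html");
    // Query the parsed document, not the XHR object itself
    var contactElement = doc.getElementById("CF00N80000005MAX6_ileinner");
    if (contactElement) {
        alert(contactElement.childNodes[0].nodeValue);
    }
}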

Related

Only parse necessary html data and skip unwanted html data

I am working on a script that fetches a URL and parses all of its HTML, but it only uses the "data-style-name", "href", and "data-sold-out" attributes and the <select> element.
This is how I parse the HTML:
function loadHTMLSource(urlSource) {
    var xhttp = new XMLHttpRequest();
    xhttp.open("GET", urlSource, false);
    xhttp.send();
    return xhttp.response;
}
var page_html = loadHTMLSource(url);
var parser = new DOMParser();
var my_document = parser.parseFromString(page_html, "text/html");
and I'm only pulling info from these parts of the HTML:
my_document.querySelectorAll("[data-style-name]");
attributes["data-sold-out"].value
my_document.querySelector("meta[name='csrf-token']");
my_document.querySelector('select');
Is it possible to pull only these specific parts, so I don't end up parsing data that I don't need?
Any help is appreciated. Thank you.

When I replace XML DOM with some HTML and inject some js, the js doesn’t run

I'm trying to make a bookmarklet that swaps out the DOM on the current page for a new one, and then injects and runs some JavaScript.
It works fine when the original page is HTML (so I don't think this is a CORS problem). However, when the original page is XML, the injected JavaScript doesn't run :( Why not? How can I get it working?
Here’s some example bookmarklet code:
(function () {
    var jsHref = 'https://rawgit.com/andylolz/b2e894fa5ccdecacd901c05769fa97fe/raw/289719871f859a1b19e06f8b8ce3769f0002ce55/js.js';
    var htmlHref = 'https://rawgit.com/andylolz/b2e894fa5ccdecacd901c05769fa97fe/raw/289719871f859a1b19e06f8b8ce3769f0002ce55/html.html';
    // fetch some HTML
    var xhr = new XMLHttpRequest();
    xhr.open('GET', htmlHref, false);
    xhr.send();
    var htmlString = xhr.response;
    var parser = new DOMParser();
    var result = parser.parseFromString(htmlString, 'text/html');
    // swap the DOM for the fetched HTML
    document.replaceChild(document.adoptNode(result.documentElement), document.documentElement);
    // inject some javascript
    var sc = document.createElement('script');
    sc.setAttribute('src', jsHref);
    document.documentElement.appendChild(sc);
})();
Here it is working on codepen (on an HTML page):
https://codepen.io/anon/pen/wEWemL?editors=1010
Again: if I run the above on an XML page, it mostly works, but the injected JavaScript doesn't execute.
Thanks!

XHR loading resources from separate directory

I have a simple XHR to load 'testdirectory/testpage.html'
var xhr = window.XMLHttpRequest ? new XMLHttpRequest() : new ActiveXObject("Microsoft.XMLHTTP");
xhr.onload = function() {
    if(xhr.readyState === 4 && xhr.status === 200) {
        var preview = xhr.responseText;
        document.getElementById("content").innerHTML = preview;
    }
};
xhr.open("GET", "testdirectory/testpage.html", true);
xhr.send();
I have it set up to display on button click. Works great.
Let's say testpage.html looks like this:
<h1>Loaded Page Heading!</h1>
<img src="main.png">
It will load but the image that is displaying is not main.png from that directory, it is main.png from the directory of the page that placed the XHR.
Is there a way to get the returned HTML to point to the 'testdirectory/main.png' image and not just use the current directory from the XHR? I'm looking for an alternative to changing the HTML of the page retrieved since that would defeat the purpose of what I'm trying to do.
I've been searching through StackOverflow for about 20 minutes, and I've googled a couple of different things. It seems like a question that must have been asked sometime before but is difficult to phrase/find.
I'm afraid you won't be able to achieve what you want without changing the retrieved HTML.
The xhr.responseText you receive from the XHR request is a string of HTML. When you do:
document.getElementById("content").innerHTML = preview;
you're taking that string of HTML and assigning it to the innerHTML property of that element. The innerHTML property parses that string of HTML and assigns the resulting nodes as children of the element. Any URLs in that parsed HTML will use the current document's base URL.
So basically, all you did was take some HTML string, parse it and append it to the current document. The innerHTML property doesn't care or know from where you obtained that HTML. The resulting nodes will behave exactly as any other HTML in the rest of the document.
It seems to me you're expecting the behavior of an iframe. But that's a totally different beast, and works differently from what you're doing here. An iframe is basically an embedded document, with its own Document Object Model (DOM) and, therefore, its own base URL.
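If the iframe behavior is what you're actually after, the markup is as simple as:
<iframe src="testdirectory/testpage.html"></iframe>
Relative URLs inside the loaded page then resolve against testdirectory/, with no rewriting needed.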
In case you decide it's acceptable to modify the HTML string, you could do something like this:
var preview = xhr.responseText.replace('src="', 'src="testdirectory/');
Bear in mind, though, that you would need to do the same for URLs in other types of attributes, such as href.
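For example, here is a minimal sketch that prefixes plain relative src and href attributes after parsing (it assumes the fetched page only uses simple relative paths, with no ../ segments):
function fixRelativeUrls(html, baseDir) {
    // Parse into an inert document so nothing is fetched during the rewrite
    var doc = new DOMParser().parseFromString(html, 'text/html');
    var elems = doc.querySelectorAll('[src], [href]');
    for (var i = 0; i < elems.length; i++) {
        var attr = elems[i].hasAttribute('src') ? 'src' : 'href';
        var value = elems[i].getAttribute(attr);
        // Skip absolute, protocol-relative and root-relative URLs
        if (!/^(https?:)?\/\//.test(value) && value.charAt(0) !== '/') {
            elems[i].setAttribute(attr, baseDir + '/' + value);
        }
    }
    return doc.body.innerHTML;
}
document.getElementById("content").innerHTML = fixRelativeUrls(xhr.responseText, 'testdirectory');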

How can I copy text from a website and use it in my own HTML file

I am making a website to look for prices of flights. Every time that I load my HTML file I have to copy the prices from another website that is not mine and insert them in my HTML file.
The source code of the other website indicates that the tag that I am looking for is a span tag, like <span class="amount price-amount">250</span>
So the question is: How can I copy or extract that info and use it or insert it in my HTML file?
I would like to solve it using HTML, CSS, JavaScript and/or Bootstrap.
Client-Side Webscraping
You do this using page stripping. At least that's what I call it. A basic example is:
var xhr = new XMLHttpRequest();
xhr.onreadystatechange = function () {
    if (xhr.readyState === 4) {
        var doc = document.createElement('div');
        doc.innerHTML = xhr.responseText;
        var elems = doc.getElementsByTagName('*'),
            prices = [];
        for (var i = 0; i < elems.length; i += 1) {
            if ((elems[i].getAttribute('class') || '').indexOf('price-amount') > -1 &&
                (elems[i].getAttribute('class') || '').indexOf('amount') > -1) {
                prices.push(elems[i].innerHTML);
            }
        }
    }
};
xhr.open('GET', 'airlinesite.com/path/to/page', true);
xhr.send();
This will get the HTML from airlinesite.com/path/to/page, then loop through all of its elements. If an element's class contains both amount and price-amount, its value is pushed into an array. The values end up stored in prices.
For this to work, the target domain must allow cross-origin requests (CORS), which it probably does.
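The same stripping can be written more compactly with querySelectorAll, since both class names fit in one selector; this sketch is equivalent for markup like the question's <span class="amount price-amount">:
var doc = document.createElement('div');
doc.innerHTML = xhr.responseText;
// Collect the contents of every element carrying both classes
var prices = Array.prototype.map.call(
    doc.querySelectorAll('.amount.price-amount'),
    function (el) { return el.innerHTML; }
);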
Use a web scraper; I recommend request and cheerio. This assumes you have Node.js and know how to install packages.
Here's a simple sample code:
var request = require('request');
var cheerio = require('cheerio');

request(this.url, function(error, response, body) {
    if (!error && response.statusCode == 200) {
        // body is the scraped html
        var $ = cheerio.load(body); // the jQuery-like selector
        var price = $('span.price-amount').text(); // the price you want. Use the selector accordingly.
    }
});
Use inspect element: right-click the page and choose Inspect Element. In the top-left corner of the panel there is a button with an arrow pointing into a box; that is the element picker. Click it, then select the part of the page you want, and the corresponding markup will be highlighted so you can copy and paste it.

parse rss feed using javascript

I am parsing an RSS feed using PHP and JavaScript. First I created a proxy with PHP to obtain the RSS feed; then I get individual data from this feed using JavaScript. My issue is with the JavaScript. I am able to get the entire document with console.log(rssData); with no errors. But if I try to get individual elements within this document, for example <title>, <description>, or <pubDate>, using rssData.getElementsByName("title"); it gives the error "Uncaught TypeError: Object....has no method 'getElementsByName'". So my question is: how do I obtain the elements in the RSS feed?
Javascript (Updated)
function httpGet(theUrl) {
    var xmlHttp = new XMLHttpRequest();
    xmlHttp.open("GET", theUrl, false);
    xmlHttp.send(null);
    return xmlHttp.responseXML;
}
// rss source
var rssData = httpGet('http://website.com/rss.php');
// rss values
var allTitles = rssData.getElementsByTagName("title"); // title
var allDate = rssData.getElementsByTagName("pubDate"); // date
Try changing the last line of the httpGet function to:
return xmlHttp.responseXML;
After all, you are expecting an XML response back. You may also need to add this line to your PHP proxy:
header("Content-type: text/xml");
To force the return content to be sent as XML.
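Once responseXML gives you a real XML document, you can walk the feed with the standard DOM API. A minimal sketch, assuming the usual RSS layout of <item> elements containing <title> and <pubDate>:
var items = rssData.getElementsByTagName("item");
for (var i = 0; i < items.length; i++) {
    // textContent reads the text inside each child element
    var title = items[i].getElementsByTagName("title")[0].textContent;
    var pubDate = items[i].getElementsByTagName("pubDate")[0].textContent;
    console.log(title + " | " + pubDate);
}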
