Erase/reset DOM and global variables with JavaScript


I'm writing an Electron app, but this is really a question about JavaScript/HTML5 in general. I want to load local content in a webview and then open iframes from a particular remote resource inside it. Unfortunately I can't, because of X-Frame-Options. So I came up with a workaround: load the remote content, erase the DOM, and inject my own local content, using a custom file protocol to embed local resources.
Basically I want to erase everything, no matter what is loaded into the webview. I've got the DOM-erasing part working with document.write(). But how can I unset all variables that may have been set by that page? Or could I prevent the document from being written to in the first place? Or is there a better, less hacky way to do what I want? This is my current code, which erases the DOM.
It runs from a preload script, before anything else:
(function() {
// names of the global properties that exist before the remote DOM is loaded
var originalProperties = Object.getOwnPropertyNames(window);
var injectDOM = function() {
document.removeEventListener('DOMContentLoaded', injectDOM);
//trying to erase global variables set by remote resource, if any. Is there a better way?
var newProperties = Object.getOwnPropertyNames(window);
var difference = newProperties.filter(x => originalProperties.indexOf(x) == -1);
for (var i = 0; i < difference.length; i++) {
if (window.hasOwnProperty(difference[i])) {
window[difference[i]] = null;
delete window[difference[i]];
//some variables still stay, delete returns false, however they are nulled
//but is there a better way to do that and what about possible attached event listeners?
}
}
var html = '';
html += '<!-- automagically injected-->';
html += '<!DOCTYPE html>';
html += '<html>';
html += '<head>';
//html += '<script>alert("test");</script>';
html += '</head>';
html += '<body>';
html += 'hello world';
html += '</body>';
html += '</html>';
document.write(html);
};
document.addEventListener('DOMContentLoaded', injectDOM);
})();
I also tried comparing Object.getOwnPropertyNames(window) before and after the DOM was loaded, but something tells me it's not the best way to do it.
Update: I managed to solve the problem more elegantly with @wOxxOm's help. I posted my solution in the original GitHub issue: https://github.com/electron/electron/issues/5036
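For readers who want to experiment with the snapshot-and-delete idea from the question, here is a minimal sketch. It is not the solution from the linked issue. The helper names snapshotKeys and removeNewKeys are hypothetical, and a plain object stands in for window so the snippet runs anywhere. It also shows why some names survive: a property that is non-configurable (as globals declared with var or function are) cannot be deleted, so the delete reports false.

```javascript
// Record the property names present on `target` right now.
function snapshotKeys(target) {
  return new Set(Object.getOwnPropertyNames(target));
}

// Delete every property added since the snapshot.
// Returns the names that could not be deleted.
function removeNewKeys(target, snapshot) {
  const survivors = [];
  for (const name of Object.getOwnPropertyNames(target)) {
    if (snapshot.has(name)) continue;
    // Reflect.deleteProperty returns false (instead of throwing)
    // when the property is non-configurable.
    if (!Reflect.deleteProperty(target, name)) {
      survivors.push(name);
    }
  }
  return survivors;
}

// Usage with a stand-in for `window` (assumption: a plain object).
const fakeWindow = { keep: 1 };
const before = snapshotKeys(fakeWindow);
fakeWindow.leaked = 'remote state';
Object.defineProperty(fakeWindow, 'pinned', { value: 42, configurable: false });
const stuck = removeNewKeys(fakeWindow, before);
console.log(stuck);
```

Note that this only touches properties of the object itself. Event listeners, timers and anything else registered by the remote page are not affected by it.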

Related

Manipulate C# / UWP from HTML / JS

I just managed to implement a small web server on my Raspberry Pi.
The web server is a headless UWP app.
It can use JavaScript, which is pretty helpful.
I have only just started with HTML and JS, so I'm a beginner and need some help.
I have already managed to show the same data that I show on the web page in a headed app on the same device.
Now I want to be able to manipulate the data from the web page, but I don't know how I'm supposed to do that.
I build the HTML / JS as one complete string, so I can't use variables I defined in the C# code. I need another way to do this.
My code for the web server is currently this:
public sealed class StartupTask : IBackgroundTask
{
private static BackgroundTaskDeferral _deferral = null;
public async void Run(IBackgroundTaskInstance taskInstance)
{
_deferral = taskInstance.GetDeferral();
var webServer = new MyWebServer();
await ThreadPool.RunAsync(workItem => { webServer.Start(); });
}
}
class MyWebServer
{
private const uint BufferSize = 8192;
public async void Start()
{
var listener = new StreamSocketListener();
await listener.BindServiceNameAsync("8081");
listener.ConnectionReceived += async (sender, args) =>
{
var request = new StringBuilder();
using (var input = args.Socket.InputStream)
{
var data = new byte[BufferSize];
IBuffer buffer = data.AsBuffer();
var dataRead = BufferSize;
while (dataRead == BufferSize)
{
await input.ReadAsync(buffer, BufferSize, InputStreamOptions.Partial);
request.Append(Encoding.UTF8.GetString(data, 0, data.Length));
dataRead = buffer.Length;
}
}
string query = GetQuery(request);
using (var output = args.Socket.OutputStream)
{
using (var response = output.AsStreamForWrite())
{
string htmlContent = "<html>";
htmlContent += "<head>";
htmlContent += "<script>";
htmlContent += "function myFunction() {document.getElementById('demo').innerHTML = 'Paragraph changed.'}";
htmlContent += "</script>";
htmlContent += "</head>";
htmlContent += "<body>";
htmlContent += "<h2>JavaScript in Head</h2>";
htmlContent += "<p id='demo'>A paragraph.</p>";
htmlContent += "<button type='button' onclick='myFunction()'>Try it!</button>";
htmlContent += "</body>";
htmlContent += "</html>";
var html = Encoding.UTF8.GetBytes(htmlContent);
using (var bodyStream = new MemoryStream(html))
{
var header =
$"HTTP/1.1 200 OK\r\nContent-Length: {bodyStream.Length}\r\nConnection: close\r\n\r\n";
var headerArray = Encoding.UTF8.GetBytes(header);
await response.WriteAsync(headerArray, 0, headerArray.Length);
await bodyStream.CopyToAsync(response);
await response.FlushAsync();
}
}
}
};
}
public static string GetQuery(StringBuilder request)
{
var requestLines = request.ToString().Split(' ');
var url = requestLines.Length > 1
? requestLines[1]
: string.Empty;
var uri = new Uri("http://localhost" + url);
var query = uri.Query;
return query;
}
}

Your question is a bit vague, so I have to guess what you're trying to do. Do you mean that a browser (or another app with a web view) will connect to your Pi server, grab some data from it, and then manipulate the data to format or display them in a particular way on the page? If so, you first need to decide how you get the data. You seem to imply the data will just be a stream of HTML, though it's not clear how you'll be passing that string to the browser. The traditional way of grabbing the data is Ajax, possibly with JSON. It's also possible to use an old-fashioned iframe (maybe a hidden one), but if you are starting from scratch, Ajax is the better choice.
The basic question is: which page will access the data on the server, and in what format? Is it a local page served from the client app's file store, which then opens a connection to the server, grabs the data and displays them in a <div> or an <iframe>? Or is it a page on your server that arrives with the data already in one part of the DOM, which you want to transform and display in another element?
Let's now assume your client app has received the data in an element like <div id="myData">data</div>. A script on the client page can read those data as a string with document.getElementById('myData').innerHTML (see getElementById); note that innerHTML is a property, not a method. You can then transform the data as necessary with JavaScript methods. Then there are various DOM techniques for inserting the transformed data either back into the same element or into a different one.
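As a rough illustration of that read, transform and insert cycle, here is a minimal sketch. The data format (values separated by semicolons), the helper name parseReadings and the target element id output are all assumptions made for the example. Only myData comes from the text above. The DOM part is guarded so the snippet also runs outside a browser.

```javascript
// Hypothetical transformation: split "21.5; 22.0; 19.8" into clean values.
function parseReadings(raw) {
  return raw
    .split(';')
    .map(function (s) { return s.trim(); })
    .filter(function (s) { return s.length > 0; });
}

// In a browser: read from #myData and append one <li> per value to #output.
// (#output is an assumed <ul> element, not something from the question.)
if (typeof document !== 'undefined') {
  const source = document.getElementById('myData');
  const target = document.getElementById('output');
  if (source && target) {
    parseReadings(source.textContent).forEach(function (value) {
      const item = document.createElement('li');
      item.textContent = value; // textContent avoids re-parsing the value as HTML
      target.appendChild(item);
    });
  }
}
```

The sketch reads textContent rather than innerHTML because it only needs the text. Use innerHTML when you want the markup inside the element as well.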
Alternatively, let's assume you have received the data via XMLHttpRequest. Then you'll need to pick out just the data you want from the response. That might involve treating the response as a string and using a regular expression, or, more likely, using DOM selection methods on the response until you have the part of the data you want. When you've extracted the data / node / element, you can insert it into a <div> on your page as above.
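A minimal sketch of the XMLHttpRequest route might look like this. It assumes the server answers with JSON rather than the HTML string in the question. The /data endpoint, the temperature field and the helper name extractTemperature are hypothetical. The request itself only runs where XMLHttpRequest and document exist.

```javascript
// Pull one numeric field out of a JSON response body.
// Returns null if the body is not valid JSON or the field is missing.
function extractTemperature(responseText) {
  try {
    const payload = JSON.parse(responseText);
    return typeof payload.temperature === 'number' ? payload.temperature : null;
  } catch (err) {
    return null;
  }
}

// Browser-only part: fetch the data and show it in #myData.
if (typeof XMLHttpRequest !== 'undefined' && typeof document !== 'undefined') {
  const xhr = new XMLHttpRequest();
  xhr.open('GET', '/data'); // hypothetical endpoint on the Pi server
  xhr.onload = function () {
    const value = extractTemperature(xhr.responseText);
    const target = document.getElementById('myData');
    if (target && value !== null) {
      target.textContent = String(value);
    }
  };
  xhr.send();
}
```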
Sorry if this is all a bit vague and abstract, but hopefully it can point you in the right direction to look up further things as needed. https://www.w3schools.com/ is a great resource for beginners.

How to get full interpreted html source with iframes in PhantomJS

With PhantomJS, I want to print the HTML source of a web page the way Firebug shows it: interpreted, and including the iframes.
var page = require('webpage').create();
page.open('http://google.com', function () {
console.log(page.content);
phantom.exit();
});
This only seems to show the interpreted HTML, without the iframes' HTML. Using evaluate can't help, because my iframes are on another domain, so I think JavaScript will not have access to them.

I found that going through frames to get content did not work because page.framesCount in phantomjs counts only the child frames and not the main frame. Here is working code to display the HTML of all frames:
// Apparently framesCount doesn't include the main frame so add 1
var frameCount = page.framesCount + 1
var html = page.frameContent + '\n\n'
for (var i = 1; i < frameCount; ++i) {
page.switchToFrame(i)
html += page.frameContent + '\n\n'
}
One last important thing: if you don't want the source but want to access the iframe DOM, even if it's on another domain, start PhantomJS like this:
phantomjs --web-security=no
The code to access the iframe body is:
var i = document.getElementsByTagName('iframe')
var body = i[0].contentWindow.document.body

Append <script> when preparing a html code for iframe: without executing

I am preparing HTML code in memory for an iframe. When I use append, it executes the code.
html = $(parser.parseFromString($("#EHtml").val(), "text/html"));
js = '<script>' + $("#EJs").val() + '</script>';
html.find('body').append(js);
$("#EHtml").val() contains HTML code
and the append function does its job but also executes the code.
Any thoughts here?

You just need to store a reference to the string of code and do one of two things: either do the append only at the point when you want the code to run, or call eval(jsString) when you want to run it.
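The second option could be sketched as follows. The helper name deferScript is hypothetical, and the snippet uses an indirect eval so that the stored code runs in global scope rather than inside the helper. Keep in mind that this runs the code in the current page's context, not the iframe's, and that evaluating user-supplied strings is only acceptable when you trust the input.

```javascript
// Keep a code string around without running it; run it once on demand.
function deferScript(code) {
  let ran = false;
  return {
    code: code,
    run: function () {
      if (ran) return undefined; // later calls do nothing
      ran = true;
      // Indirect eval: evaluates `code` in global scope.
      return (0, eval)(code);
    }
  };
}

// Usage: nothing happens until run() is called.
const deferred = deferScript('globalThis.deferredResult = 6 * 7;');
const typeBeforeRun = typeof globalThis.deferredResult;
deferred.run();
console.log(typeBeforeRun, globalThis.deferredResult);
```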

Script tags won't execute if their [type] attribute is set to anything wacky.
<script type="wacky/non-executing">
console.log("This will not execute! You will not see this!");
</script>

It is expected that the script runs when you insert it between script tags, because the DOM has already finished loading.
It will also run twice because you put it inside the body, so when the body content changes the script runs again.
So put the script tag in the head of the iframe; then it runs only when you insert it or reload, and not again and again.
I don't suggest using eval(), because it is dangerous to use for script evaluation; eval() is really meant for other purposes.
Use <script></script> tags to run your script, and place them in the head if you don't want it to run twice.
var script = document.createElement('script');
script.innerHTML = $("#EJs").val();
iframe.contentWindow.document.head.appendChild(script);
Maybe this will help you.

Try using entities, like
var encodeHtmlEntity = function(str) {
var buf = [];
for (var i=str.length-1;i>=0;i--) {
buf.unshift(['&#', str[i].charCodeAt(), ';'].join(''));
}
return buf.join('');
};
html.find('body').append(encodeHtmlEntity(js));
to append and
var decodeHtmlEntity = function(str) {
return str.replace(/&#(\d+);/g, function(match, dec) {
return String.fromCharCode(dec);
});
};
decodeHtmlEntity(html.find('body').val());
to read.

Got it working the way I need.
I was trying to add the script tag without resorting to workarounds; I just wanted to add the tag as it is. Thanks, everybody, for the input. I got the solution from your advice.
In the end I could append to the iframe, but the script was executing in the main page context. The problem was caused by using jQuery to do the appending. So here it is:
frame = document.getElementById("frame");
out = (frame.contentWindow) ? frame.contentWindow : (frame.contentDocument.document) ? frame.contentDocument.document : frame.contentDocument;
out.document.open();
out.document.write(html.find('html')[0].outerHTML);//HTML added here
//JS appended here
js=out.document.createElement('script');
js.innerHTML = 'js code here';
out.document.getElementsByTagName("body")[0].appendChild(js);
out.document.close();
I hope it's useful for someone.

How to prevent resource loading of unattached elements in Chrome

I'm working on a Chrome extension and I have the following problem:
var myDiv = document.createElement('div');
myDiv.innerHTML = '<img src="a.png">';
What happens now is that Chrome tries to load the "a.png" resource, even if I don't attach the "div" element to the document. Is there a way to prevent it?
In the extension I need to get data from a site that doesn't provide any API, so I have to parse the whole HTML to get the necessary data. Writing my own simple HTML parser could be tricky, so I would rather use the native HTML parser. However, in Chrome, when I put the whole source code into a temporary non-attached element (so that it gets parsed and I can filter out the necessary data), all the images (and possibly other resources) start to load as well, causing higher traffic or (in the case of relative paths) lots of errors in the console.

To prevent the resources from being loaded, you'll need to create your Node in an entirely new #document. You can use document.implementation.createHTMLDocument for this.
var dom = document.implementation.createHTMLDocument(); // make new #document
// now use this to..
var myDiv = dom.createElement('div'); // ..create a <div>
myDiv.innerHTML = '<img src="a.png">'; // ..parse HTML

You can delay parsing/loading HTML by storing it in a non-standard attribute and assigning it to innerHTML when the time comes:
myDiv.setAttribute('deferredHtml', '<img src="http://upload.wikimedia.org/wikipedia/commons/4/4e/Single_apple.png">');
global.loadDeferredImage = function() {
if(myDiv.hasAttribute('deferredHtml')) {
myDiv.innerHTML = myDiv.getAttribute('deferredHtml');
myDiv.removeAttribute('deferredHtml');
}
};
... onclick="loadDeferredImage()"
I created a jsfiddle illustrating this idea:
http://jsfiddle.net/akhikhl/CbCst/3/

getScript or eval in specific location?

I was wondering if eval (or some variant of jQuery's getScript) can be used to position external javascript in places other than the end of the DOM or at the head. I've tried:
var head = document.getElementById("fig");
instead of
var head = document.getElementsByTagName("head")[0];
with
var script = document.createElement("script");
script.text = $(".code").val();
head.appendChild(script);
But I can't seem to get it to work regardless. (The code does work, but Firebug shows the code being placed both at #fig and at the end of the page, right before the </body> tag.)
Basically the JavaScript toolkit I'm using renders things based on where the script tag is located, and I'm trying to dynamically modify the javascript based on user input (hence I can't really refer to a new external JS file - I'd rather run an eval, which isn't ideal).
I guess the worst case scenario would be to save the user input into a "new" file using PHP or something and use getScript pointing to that new PHP file, but it seems exceedingly hacky.
Thank you once again!

Does the "JavaScript toolkit" you refer to use document.write or document.writeln to insert output into the page? If so, you could override that function to append the script output into the correct location:
document.write = function(s) {
$('#fig').append(s);
};
document.writeln = function(s) {
$('#fig').append(s + '\n');
};
and then load and execute the script using $.getScript.
Edit: A more robust way of doing it, depending on how the code is added:
var output = '';
document.write = function(s) {
output += s;
};
document.writeln = function(s) {
output += s + '\n';
};
$.getScript('URL of script here', function() {
$('#fig').append(output);
});
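One thing the overrides above leave open is restoring the original functions afterwards. A minimal sketch of a capture-and-restore helper follows, assuming the toolkit code runs synchronously. The name captureWrites is hypothetical, and a plain object stands in for document so the snippet runs anywhere.

```javascript
// Temporarily replace write/writeln on `doc`, run `fn`, and return
// everything that was written. The originals are restored afterwards,
// even if `fn` throws.
function captureWrites(doc, fn) {
  const originalWrite = doc.write;
  const originalWriteln = doc.writeln;
  let output = '';
  doc.write = function (s) { output += s; };
  doc.writeln = function (s) { output += s + '\n'; };
  try {
    fn();
  } finally {
    doc.write = originalWrite;
    doc.writeln = originalWriteln;
  }
  return output;
}

// Usage with a stand-in for `document` (assumption: a plain object).
const fakeDoc = {
  write: function () { return 'original write'; },
  writeln: function () { return 'original writeln'; }
};
const captured = captureWrites(fakeDoc, function () {
  fakeDoc.write('<p>a</p>');
  fakeDoc.writeln('<p>b</p>');
});
console.log(JSON.stringify(captured));
```

With an asynchronous loader such as $.getScript, the restore step would have to move into the completion callback instead of the finally block, since the script has not run yet when the call returns.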
