Remove all elements from website except X

I'm not really familiar with Javascript, and even less with how Javascript works in Chrome's F12 developer tools. What I'm trying to do is have a favorite which, when clicked on, loads a web page but removes some of the clutter of the page which is loaded (I don't really care if it removes it before the page is loaded, or loads it and then removes it)
For now, I'm trying to figure out how to remove all elements except the one I want to keep (and its children), namely the one with the following HTML:
<div>
<ul class="c-list-news u-relative" data-load-more-content>...</ul>
</div>
I'm trying the following (from what I could find on SO), but I can't find the right selector (or I'm doing something else wrong, not quite sure):
var elem = document.querySelectorAll('body *:not(div ul.c-list-news, div ul.c-list-news *)');
for(var i=0;i<elem.length;i++) {
elem[i].parentElement.removeChild(elem[i]);
}
(PS : I haven't yet looked into how to put it into a favorite/extension, it will come later)

It's probably easier than you realize. :-) You can get the first element matching .c-list-news like this:
const cListNews = document.querySelector(".c-list-news");
If you want to keep its parent, just add .parentNode to that:
const divContainer = document.querySelector(".c-list-news").parentNode;
Then, wipe out body entirely:
document.body.innerHTML = "";
...and put the element back:
document.body.appendChild(cListNews); // Or `divContainer`
I'm not sure I'd expect the page to continue to be readable, though, since of course this completely changes where the element is in the DOM, which may well make the CSS fail.
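If keeping the element at its original position in the DOM matters (so that CSS selectors depending on its ancestors keep matching), one alternative is to leave the element and its ancestors in place and remove only their siblings. This is a minimal sketch of that idea, not the approach shown above: pruneToElement is a hypothetical helper name, and it assumes the target really sits somewhere inside root.

```javascript
// Hypothetical helper: keep `target`, its descendants, and its ancestors up to
// `root`; remove every sibling found along that ancestor chain.
function pruneToElement(target, root) {
  let node = target;
  while (node && node !== root) {
    const parent = node.parentNode;
    if (!parent) break; // target is not inside root; stop instead of throwing
    // Copy the child list first, because removing nodes mutates a live list.
    for (const sibling of Array.from(parent.childNodes)) {
      if (sibling !== node) parent.removeChild(sibling);
    }
    node = parent;
  }
  return target;
}

// In a browser, with the selector from the question:
// pruneToElement(document.querySelector(".c-list-news"), document.body);
```

The trade-off is that wrapper elements stay in the page, so any padding or layout they carry stays too.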
You can't make a bookmark (favorite) that both loads the page and does this in one go, because javascript: bookmarks run within the context of the current page. You could use something like Tampermonkey, an extension that lets you run a script automatically when you go to matching URLs.
But you can make a bookmark that you use when you're already on the page: Just use the javascript: pseudo-protocol and follow it with JavaScript code. For instance:
javascript:var divContainer %3D document.querySelector(".c-list-news").parentNode%3Bdocument.body.innerHTML %3D ""%3Bdocument.body.appendChild(divContainer)%3Bconsole.log("done")%3B
I created that by simply removing line breaks from the code (optional), running the code through encodeURIComponent, and putting javascript: on the front. (Some folks would also convert spaces to %20.)
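The conversion described above can be scripted. This is a minimal sketch: toBookmarklet is a hypothetical helper name, and it assumes every statement in the source ends with a semicolon and that there are no // comments, since the lines are joined without separators. Because encodeURIComponent also encodes spaces and quotes, the result looks a little different from the hand-made URL above.

```javascript
// Hypothetical helper: turn a snippet of JavaScript into a javascript: URL.
function toBookmarklet(source) {
  const oneLine = source
    .split("\n")
    .map((line) => line.trim())          // drop indentation
    .filter((line) => line.length > 0)   // drop empty lines
    .join("");                           // relies on explicit semicolons
  return "javascript:" + encodeURIComponent(oneLine);
}

const code = `
var divContainer = document.querySelector(".c-list-news").parentNode;
document.body.innerHTML = "";
document.body.appendChild(divContainer);
`;
console.log(toBookmarklet(code));
```

Paste the printed string into the URL field of a new bookmark.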

Save the element to keep to a variable. Remove all nodes from the body, or the element that you want, and add the element to keep. Example:
let elementToKeep = document.getElementById('side');
const myNode = document.getElementsByTagName("body")[0];
while (myNode.firstChild) {
myNode.removeChild(myNode.firstChild);
}
myNode.appendChild(elementToKeep);
Using the removeChild method is faster than setting innerHTML to an empty string.
Check here: Remove all child elements of a DOM node in JavaScript
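Current browsers also offer replaceChildren(), which empties a node and appends the new children in a single call. This is a minimal sketch that falls back to the removeChild loop above when the method is missing; keepOnly is a hypothetical helper name.

```javascript
// Hypothetical helper: make `elementToKeep` the only child of `container`.
function keepOnly(container, elementToKeep) {
  if (typeof container.replaceChildren === "function") {
    // One call: removes all existing children, then appends the argument.
    container.replaceChildren(elementToKeep);
  } else {
    // Fallback for older browsers: same loop as in the answer above.
    while (container.firstChild) {
      container.removeChild(container.firstChild);
    }
    container.appendChild(elementToKeep);
  }
  return container;
}

// In a browser:
// keepOnly(document.body, document.getElementById("side"));
```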

Related

Why doesn't it format the javascript code? [duplicate]

In tutorials I've learnt to use document.write. Now I understand that by many this is frowned upon. I've tried print(), but then it literally sends it to the printer.
So what are alternatives I should use, and why shouldn't I use document.write? Both w3schools and MDN use document.write.

The reason that your HTML is replaced is because of an evil JavaScript function: document.write().
It is most definitely "bad form." It only works with webpages if you use it on the page load; and if you use it during runtime, it will replace your entire document with the input. And if you're applying it as strict XHTML structure it's not even valid code.
the problem:
document.write writes to the document stream. Calling document.write on a closed (or loaded) document automatically calls document.open which will clear the document.
-- quote from the MDN
document.write() has two henchmen, document.open(), and document.close(). When the HTML document is loading, the document is "open". When the document has finished loading, the document has "closed". Using document.write() at this point will erase your entire (closed) HTML document and replace it with a new (open) document. This means your webpage has erased itself and started writing a new page - from scratch.
I believe document.write() causes the browser to have a performance decrease as well (correct me if I am wrong).
an example:
This example writes output to the HTML document after the page has loaded. Watch document.write()'s evil powers clear the entire document when you press the "exterminate" button:
I am an ordinary HTML page. I am innocent, and purely for informational purposes. Please do not <input type="button" onclick="document.write('This HTML page has been successfully exterminated.')" value="exterminate"/>
me!
the alternatives:
.innerHTML This is a wonderful alternative, but this attribute has to be attached to the element where you want to put the text.
Example: document.getElementById('output1').innerHTML = 'Some text!';
.createTextNode() is the alternative recommended by the W3C.
Example: var para = document.createElement('p');
para.appendChild(document.createTextNode('Hello, '));
document.body.appendChild(para); // attach the paragraph so it becomes visible
NOTE: This is known to have some performance decreases (slower than .innerHTML). I recommend using .innerHTML instead.
the example with the .innerHTML alternative:
I am an ordinary HTML page.
I am innocent, and purely for informational purposes.
Please do not
<input type="button" onclick="document.getElementById('output1').innerHTML = 'There was an error exterminating this page. Please replace <code>.innerHTML</code> with <code>document.write()</code> to complete extermination.';" value="exterminate"/>
me!
<p id="output1"></p>
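The open/closed distinction described in this answer can be checked at runtime through document.readyState, which is "loading" only while the parser is still running. This is a minimal sketch, assuming it is called from an ordinary synchronous script; writeOrAppend is a hypothetical helper name, and the document is passed in as a parameter so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: use document.write only while the document is still
// being parsed; afterwards append to <body> instead of wiping the page.
function writeOrAppend(doc, html) {
  if (doc.readyState === "loading") {
    // Parser still running: the text goes into the document stream.
    doc.write(html);
    return "write";
  }
  // Document already parsed: document.write would implicitly call
  // document.open() and clear everything, so append instead.
  doc.body.insertAdjacentHTML("beforeend", html);
  return "append";
}

// In a browser:
// writeOrAppend(document, "<p>Some text!</p>");
```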

Here is code that should replace document.write in-place:
document.write=function(s){
var scripts = document.getElementsByTagName('script');
var lastScript = scripts[scripts.length-1];
lastScript.insertAdjacentHTML("beforebegin", s);
}

You can combine insertAdjacentHTML method and document.currentScript property.
The insertAdjacentHTML() method of the Element interface parses the specified text as HTML or XML and inserts the resulting nodes into the DOM tree at a specified position:
'beforebegin': Before the element itself.
'afterbegin': Just inside the element, before its first child.
'beforeend': Just inside the element, after its last child.
'afterend': After the element itself.
The document.currentScript property returns the <script> element whose script is currently being processed. The best position is beforebegin: the new HTML is inserted before the <script> itself. To match document.write's native behavior one would use afterend, but then the nodes from consecutive calls are placed in reverse order rather than in call order (as document.write places them). The order in which your HTML appears is probably more important than where it is placed relative to the <script> tag, hence the use of beforebegin.
document.currentScript.insertAdjacentHTML(
'beforebegin',
'This is a document.write alternative'
)
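One caveat: document.currentScript is only set while the script is being evaluated synchronously. Inside timers and event handlers it no longer points at your script (it is null there), and it is also null in module scripts. This is a minimal sketch that captures the element once and reuses it later; makeWriter is a hypothetical helper name.

```javascript
// Hypothetical helper: returns a function that inserts HTML before `scriptEl`.
function makeWriter(scriptEl) {
  return function write(html) {
    scriptEl.insertAdjacentHTML("beforebegin", html);
  };
}

// In a browser, capture the element at the top level of the script:
if (typeof document !== "undefined" && document.currentScript) {
  const write = makeWriter(document.currentScript);
  write("<p>first</p>");
  // Still works later, when document.currentScript is no longer this script.
  setTimeout(() => write("<p>second</p>"), 0);
}
```

Because every call uses beforebegin on the same element, later output lands after earlier output, in call order.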

As a recommended alternative to document.write you could use DOM manipulation to directly query and add node elements to the DOM.

Just dropping a note here to say that, although using document.write is highly frowned upon due to performance concerns (synchronous DOM injection and evaluation), there is also no actual 1:1 alternative if you are using document.write to inject script tags on demand.
There are a lot of great ways to avoid having to do this (e.g. script loaders like RequireJS that manage your dependency chains) but they are more invasive and so are best used throughout the site/application.

I fail to see the problem with document.write. If you are using it before the onload event fires, as you presumably are, to build elements from structured data for instance, it is the appropriate tool to use. There is no performance advantage to using insertAdjacentHTML or explicitly adding nodes to the DOM after it has been built. I just tested it three different ways with an old script I once used to schedule incoming modem calls for a 24/7 service on a bank of 4 modems.
By the time it is finished this script creates over 3000 DOM nodes, mostly table cells. On a 7 year old PC running Firefox on Vista, this little exercise takes less than 2 seconds using document.write from a local 12kb source file and three 1px GIFs which are re-used about 2000 times. The page just pops into existence fully formed, ready to handle events.
Using insertAdjacentHTML is not a direct substitute as the browser closes tags which the script requires remain open, and takes twice as long to ultimately create a mangled page. Writing all the pieces to a string and then passing it to insertAdjacentHTML takes even longer, but at least you get the page as designed. Other options (like manually re-building the DOM one node at a time) are so ridiculous that I'm not even going there.
Sometimes document.write is the thing to use. The fact that it is one of the oldest methods in JavaScript is not a point against it, but a point in its favor - it is highly optimized code which does exactly what it was intended to do and has been doing since its inception.
It's nice to know that there are alternative post-load methods available, but it must be understood that these are intended for a different purpose entirely; namely modifying the DOM after it has been created and memory allocated to it. It is inherently more resource-intensive to use these methods if your script is intended to write the HTML from which the browser creates the DOM in the first place.
Just write it and let the browser and interpreter do the work. That's what they are there for.
PS: I just tested using an onload param in the body tag and even at this point the document is still open and document.write() functions as intended. Also, there is no perceivable performance difference between the various methods in the latest version of Firefox. Of course there is a ton of caching probably going on somewhere in the hardware/software stack, but that's the point really - let the machine do the work. It may make a difference on a cheap smartphone though. Cheers!

The question depends on what you are actually trying to do.
Usually, instead of calling document.write you can use someElement.innerHTML or, better, document.createElement together with someElement.appendChild.
You can also consider using a library like jQuery and using the modification functions in there: http://api.jquery.com/category/manipulation/

This is probably the most correct, direct replacement: insertAdjacentHTML.

Use getElementById() or getElementsByName() to access a specific element, then set its innerHTML property:
<html>
<body>
<div id="myDiv1"></div>
<div id="myDiv2"></div>
<script type="text/javascript">
var myDiv1 = document.getElementById("myDiv1");
var myDiv2 = document.getElementById("myDiv2");
myDiv1.innerHTML = "<b>Content of 1st DIV</b>";
myDiv2.innerHTML = "<i>Content of second DIV element</i>";
</script>
</body>
</html>

Use
var documentwrite =(value, method="", display="")=>{
switch(display) {
case "block":
var x = document.createElement("p");
break;
case "inline":
var x = document.createElement("span");
break;
default:
var x = document.createElement("p");
}
var t = document.createTextNode(value);
x.appendChild(t);
if(method==""){
document.body.appendChild(x);
}
else{
document.querySelector(method).appendChild(x);
}
}
and call the function based on your requirement as below
documentwrite("My sample text"); //print value inside body
documentwrite("My sample text inside id", "#demoid", "block"); // print value inside id and display block
documentwrite("My sample text inside class", ".democlass","inline"); // print value inside class and display inline

I'm not sure if this will work exactly, but I thought of
var docwrite = function(doc) {
document.write(doc);
};
This solved the problem with the error messages for me.

Is it possible to reliably insert a HTML element at script's location?

I'm writing a Javascript file which will be a component in a webpage. I'd like it to be simple to use - just reference the script file in your page, and it is there. To that end however there is a complication - where should the HTML go that the Javascript generates? One approach would be to require a placeholder element in the page with a fixed ID or class or something. But that's an extra requirement. It would be better if the HTML was generated at the location that the script is placed (or, at the start of body, if the script is placed in head). Also, for extra customizability, if the fixed ID was found, the HTML would be placed inside that placeholder.
So I'm wondering - how do I detect my script's location in the page? And how do I place HTML there? document.write() comes to mind, but that is documented as being pretty unreliable. Also it doesn't help if the script is in the head. Not to mention what happens if my script is loaded dynamically via some AJAX call, but I suppose that can be left as an unsupported scenario.

I am doing that with this code...
// This is for Firefox only at the moment.
var thisScriptElement = document.currentScript,
// Generic `a` element for exploiting its ability to return `pathname`.
a = document.createElement('a');
if ( ! thisScriptElement) {
// Iterate backwards, to look for our script.
var scriptElements = document.body.getElementsByTagName('script'),
i = scriptElements.length;
while (i--) {
if ( ! scriptElements[i].src) {
continue;
}
a.href = scriptElements[i].src;
if (a.pathname.replace(/^.*\//, '') == 'name-of-your-js-code.js') {
thisScriptElement = scriptElements[i];
break;
}
}
}
Then, to add your element, it's as simple as...
thisScriptElement.parentNode.insertBefore(newElement, thisScriptElement);
I simply add a script element anywhere (and multiple times if necessary) in the body element to include it...
<script type="text/javascript" src="somewhere/name-of-your-js-code.js?"></script>
Ensure the code runs as is, not in DOM ready or window's load event.
Basically, we first check for document.currentScript, which is Firefox only but still useful (if it becomes standardised and/or other browsers implement it, it should be most reliable and fastest).
Then I create a generic a element to exploit some of its functionality, such as extracting the path portion of the href.
I then iterate backwards over the script elements (because in parse order the last script element should be the currently executing script), comparing the filename to what we know ours is called. You may be able to skip this, but I am doing this to be safe.

document.write is very reliable if used as you indicate (a default SharePoint 2010 page uses it 6 times). If placed in the head, it will write content to immediately after the body element. The trick is to build a single string of HTML and write it in one go, don't write snippets of half-formed HTML.
An alternative is to use document.getElementsByTagName('script') while the document is loading and assume the last one is the current script element. Then you can look at the parent and if it's the head, use the load or DOM ready event to add your elements after the body. Otherwise, just add it before or after the script element as appropriate.
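The alternative described above might look like the following. This is a minimal sketch under a few assumptions: insertAtScript is a hypothetical helper name, it must be called synchronously while the script is being parsed (not from a deferred or async script), and for a script in the head it places the element at the start of body, which is what the question asked for.

```javascript
// Hypothetical helper: put `newElement` where the running script sits, or at
// the start of <body> when the script lives in <head>.
function insertAtScript(doc, newElement) {
  const scripts = doc.getElementsByTagName("script");
  // While parsing, the last script in the document is the one running now.
  const current = scripts[scripts.length - 1];
  if (current.parentNode === doc.head) {
    // <body> does not exist yet; wait until the DOM is ready.
    doc.addEventListener("DOMContentLoaded", function () {
      doc.body.insertBefore(newElement, doc.body.firstChild);
    });
    return "deferred";
  }
  current.parentNode.insertBefore(newElement, current);
  return "inline";
}

// In a browser:
// insertAtScript(document, document.createElement("div"));
```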

Append html to jQuery element without running scripts inside the html

I have written some code that takes a string of html and cleans away any ugly HTML from it using jQuery (see an early prototype in this SO question). It works pretty well, but I stumbled on an issue:
When using .append() to wrap the html in a div, all script elements in the code are evaluated and run (see this SO answer for an explanation why this happens). I don't want this, I really just want them to be removed, but I can handle that later myself as long as they are not run.
I am using this code:
var wrapper = $('<div/>').append($(html));
I tried to do it this way instead:
var wrapper = $('<div>' + html + '</div>');
But that just brings forth the "Access denied" error in IE that the append() function fixes (see the answer I referenced above).
I think I might be able to rewrite my code to not require a wrapper around the html, but I am not sure, and I'd like to know if it is possible to append html without running scripts in it, anyway.
My questions:
How do I wrap a piece of unknown html
without running scripts inside it,
preferably removing them altogether?
Should I throw jQuery out the window
and do this with plain JavaScript and
DOM manipulation instead? Would that help?
What I am not trying to do:
I am not trying to put some kind of security layer on the client side. I am very much aware that it would be pointless.
Update: James' suggestion
James suggested that I should filter out the script elements, but look at these two examples (the original first and the James' suggestion):
jQuery("<p/>").append("<br/>hello<script type='text/javascript'>console.log('gnu!'); </script>there")
keeps the text nodes but writes gnu!
jQuery("<p/>").append(jQuery("<br/>hello<script type='text/javascript'>console.log('gnu!'); </script>there").not('script'))
Doesn't write gnu!, but also loses the text nodes.
Update 2:
James has updated his answer and I have accepted it. See my latest comment to his answer, though.

How about removing the scripts first?
var wrapper = $('<div/>').append($(html).not('script'));
Alternatively:
Create the div container
Use plain JS to put html into div
Remove all script elements in the div
Assuming script elements in the html are not nested in other elements:
var wrapper = document.createElement('div');
wrapper.innerHTML = html;
$(wrapper).children().remove('script');
If the script elements may be nested in other elements, use find() instead:
var wrapper = document.createElement('div');
wrapper.innerHTML = html;
$(wrapper).find('script').remove();
This works for the case where html is just text and where html has text outside any elements.

You should remove the script elements:
var wrapper = $('<div/>').append($(html).remove("script"));
Second attempt:
node-validator can be used in the browser:
https://github.com/chriso/node-validator
var str = sanitize(large_input_str).xss();
Alternatively, PHPJS has a strip_tags function (regex/evil based):
http://phpjs.org/functions/strip_tags:535

The scripts in the html kept executing for me with all the simple methods mentioned here, then I remembered jQuery has a tool for this (since 1.8): jQuery.parseHTML. There's still a catch: according to the documentation, event handlers inside attributes (e.g. <img onerror>) will still run.
This is what I'm using:
var $dom = $($.parseHTML(d));
$dom will be a jquery object with the elements found
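For the "plain JavaScript" part of the question, the browser's DOMParser is another option: scripts in a document it produces are not executed, so they can be removed before the markup touches the live page. This is a minimal sketch, not a security filter. stripScripts is a hypothetical helper name, the optional parser parameter exists only so the function can be exercised outside a browser, and anything the parser places in head (for example a leading style element) is dropped because only the body is returned. Inline handlers such as onerror remain in the markup.

```javascript
// Hypothetical helper: parse `html` in an inert document, remove every
// <script> element, and return the remaining body markup as a string.
function stripScripts(html, parser) {
  parser = parser || new DOMParser();
  const doc = parser.parseFromString(html, "text/html");
  doc.querySelectorAll("script").forEach(function (script) {
    script.remove();
  });
  return doc.body.innerHTML;
}

// In a browser:
// var wrapper = document.createElement("div");
// wrapper.innerHTML = stripScripts(html);
```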

DOCTYPE breaks style.display

I have a (legacy) JS function, that shows or hides child nodes of argument element. It is used in mouseover and mouseout event handlers to show-hide img tags.
The function looks like this:
function displayElem(elem, value, handlerRoot){
try{
var display = 'inline';
if(!value)
display = 'none';
if(handlerRoot)
elem.style.display = display;
var childs = elem.childNodes;
for (i = 0; i < childs.length; i++){
if(childs[i].nodeType == Node.ELEMENT_NODE){
childs[i].style.display = display;
alert("Node "+childs[i].tagName+" style set to " +childs[i].style.display);
}
}
}catch(e){
alert('displayElem: ' + e);
}
}
Here, value and handlerRoot are boolean flags.
This function works perfectly, if target html page has no doctype. Adding any doctype (strict or transitional) breaks this. Alert shows style has been set to the right value, but child elements are not displayed.
Would be good, if this function could work with any DOCTYPE.
Image (a child node of elem) is initialized like this (perhaps something is wrong here?):
var img = new Image();
img.style.cssText =
'background: transparent url("chrome://{appname}/content/dbutton.png") right top no-repeat;' +
'position: relative;' +
'height:18px;'+
'width:18px;'+
'display:none;';

JavaScript doesn't really work over plain HTML but on the DOM tree generated by the browser. Thus the DOCTYPE does not have a direct influence on JavaScript but on the way the browser handles invalid HTML and CSS.
I think the first step is to clean-up the HTML and make sure it's valid, esp. that tags are used in allowed places and properly nested. That will guarantee that the generated node tree is the same no matter the rendering mode.
You can also use your favourite browser tool (such as Firebug) to inspect the real tree and make sure nodes are placed where you think they are.

Update:
I wonder if when dealing with a document in standards mode (the document has a DOCTYPE), Firefox is inserting an implied element that it doesn't insert in backward-compat mode (no DOCTYPE), and so the image isn't an immediate child of elem but instead a child of this implied element that's then a child of elem; so you won't see the image in elem.childNodes. Walking through the code in a debugger is the best way to tell, but failing that, alert the tagName of each of the child nodes you're iterating through in the loop.
For example, with this markup:
<table id='theTable'>
<tr><td>Hi there</td></tr>
</table>
...Firefox will insert a tbody element, so the DOM looks like this:
<table id='theTable'>
<tbody>
<tr><td>Hi there</td></tr>
</tbody>
</table>
...but it won't be that specific example unless the DOCTYPE is a red herring, because I just tested and Firefox does that even in backward-compat mode. But perhaps you were testing two slightly different documents? Or perhaps it does it with some elements only in standards mode.
Original:
Not immediately seeing the problem, but I do see two issues:
i isn't declared in the function, and so you're falling prey to the Horror of Implicit Globals. Since your alert is showing the correct value, I can't see why that would be the problem.
I originally also claimed that url(..) in CSS doesn't use quotes, but it can, optionally.

Thanks to Álvaro G. Vicario. Though he didn't give an exact answer, the direction was right.
I've checked the page with w3c validator, and found that my Image objects were missing src attribute. Thus, adding img.src = "chrome://{appname}/content/dbutton.png"; helped.
Still, I'm not sure, why the original code author used background style instead of src... Perhaps, that would remain a mystery. :)

Recommended method to locate the current script?

I am writing a script that needs to add DOM elements to the page, at the place where the script is located (widget-like approach).
What is the best way to do this?
Here are the techniques I am considering:
Include an element with an id="Locator" right above the script. Issues:
I don't like the extra markup
If I reuse the widget in the page, several elements will have the same "Locator" id. I was thinking about adding a line in the script to remove the id once used, but still...
Add an id to the script. Issues:
even though it seems to work, the id attribute is not valid for the script element
same issue as above, several elements will have the same id if I reuse the script in the page.
Use getElementsByTagName("script") and pick the last element. This has worked for me so far, it just seems a little heavy and I am not sure if it is reliable (thinking about deferred scripts)
document.write: not elegant, but seems to do the job.
[Edit] Based on the reply from idealmachine, I am thinking about one more option:
Include in the script tag an attribute, for example goal="tabify".
Use getElementsByTagName("script") to get all the scripts.
Loop through the scripts and check the goal="tabify" attribute to find my script.
Remove the goal attribute in case there's another widget in the page.
[Edit] Another idea, also inspired by the replies so far:
Use getElementsByTagName("script") to get all the scripts.
Loop through the scripts and check innerHTML to find my script.
At the end of the script, remove the script tag in case there's another widget in the page.

Out of the box : document.currentScript (not supported by IE)

I've worked for OnlyWire which provides, as their main service, a widget to put on your site.
We use the var scripts = document.getElementsByTagName("script"); var thisScript = scripts[scripts.length - 1]; trick and it seems to work pretty well. Then we use thisScript.parentNode.insertBefore(ga, thisScript); to insert whatever we want before it, in the DOM tree.
I'm not sure I understand why you consider this a "heavy" solution... it doesn't involve iteration, it's a pure cross-browser solution which integrates perfectly.

This works with multiple copies of same code on page as well as with dynamically inserted code:
<script type="text/javascript" class="to-run">
(function(self){
if (self == window) {
var script = document.querySelector('script.to-run');
script.className = '';
Function(script.innerHTML).call(script);
} else {
// Do real stuff here. self refers to current script element.
console.log(1, self);
}
})(this);
</script>

Either document.write or picking the last script element will work for synchronously loaded scripts in the majority of web pages. However, there are some options I can think of that you did not consider to allow for async loading:
Adding a div with class="Locator" before the script. HTML classes have the advantage that duplicates are not invalid. Of course, to handle the multiple widget case, you will want to change the element's class name when done adding the HTML elements so you do not add them twice. (Note that it is also possible for an element to be a member of multiple classes; it is a space-separated list.)
Checking the src of each script element can ensure that tracking code (e.g. Google Analytics legacy tracking code) and other scripts loaded at the very end of the page will not prevent your script from working properly when async loading is used. Again, to handle the multiple widget case, you may need to remove the script elements when done with them (i.e. when the desired code has been added to the page).
One final comment I will make (although you may already be aware of this) is that when coding a widget, you need to declare all your variables using var and enclose all your code within: (JSLint can help check this)
(function(){
...
})();
This has been called a "self-executing function" and will ensure that variables used in your script do not interfere with the rest of the Web page.
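To see the effect, here is a small illustration. The variable name counter is made up for the example; the outer one stands in for a variable that belongs to the host page.

```javascript
// A variable owned by the host page.
var counter = "page value";

(function () {
  // Declared with var inside the function, so it is local to the widget
  // and does not overwrite the page's variable of the same name.
  var counter = 0;
  counter += 1;
  // ... widget code would go here ...
})();

console.log(counter); // prints "page value"
```

Without the wrapping function, the second var counter would refer to the same global variable and overwrite the page's value.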

Whether you drop a <script> tag in or a <div class="mywidget">, you're adding something to the markup. Personally, I prefer the latter as the script itself is only added once. Too many scripts in the page body can slow down the page load time.
But if you need to add the script tag where the widget is going to be, I don't see what's wrong with using document.write() to place a div.

I just found another method that seems to answer my question:
How to access parent Iframe from javascript
Embedding the script in an iframe allows you to locate it at any time, as the script always keeps a reference to its own window.
I vote this the best approach, as it'll always work no matter how many times you add the script to the page (think widget). You're welcome to comment.
What pushed me to consider iframes in the first place was an experiment I did to build a Google gadget.

In many cases this works well (hud.js is the name of the script):
var jsscript = document.getElementsByTagName("script");
var host; // host of the server that delivered hud.js
for (var i = 0; i < jsscript.length; i++) {
var pattern = /hud\.js/i; // escape the dot so it only matches a literal "."
var src = jsscript[i].getAttribute("src");
if ( src && pattern.test(src) )
{
var parser = document.createElement('a');
parser.href = src;
host = parser.host;
}
}

Also you can add individual script's name inside them.
either inside some js-script
dataset['my_prefix_name'] = 'someScriptName'
or inside HTML - in the <script> tag
data-my_prefix_name='someScriptName'
and next search appropriate one by looping over document.scripts array:
... function(){
for (var i = 0, n = document.scripts.length; i < n; i++) {
var prefix = document.scripts[i].dataset['my_prefix_name']
if (prefix == 'whatYouNeed')
return prefix
}
}

I haven't had access to Internet Explorer since forever, but this should work pretty much everywhere:
<script src="script.js"
data-count="30"
data-headline="My headline"
onload="uniqueFunctionName(this)"
defer
></script>
and inside script.js:
window.uniqueFunctionName = function (currentScript) {
var dataset = currentScript.dataset
console.log(dataset['count'])
console.log(dataset['headline'])
}
