How to remove all javascript and js calls from web page? - javascript

I have a web page which is completely rendered on the server side (Node.js + PhantomJS), and I want to send this page to a client browser. The problem is that the client browser tries to re-execute the JavaScript. Hence, I have two options:
Disable javascript in the iframe that loads that page in the client
Strip every javascript and js call/event from the page
Although I will not use the original javascript of the page, I will later on need to be able to add javascript events to the iframe.
It seems the first option can be realised by using the iframe 'sandbox' attribute, but that will prevent me from injecting other JavaScript later on. Hence I need a way to realize the second option, i.e. removing all the original JavaScript from the page.
Is there an efficient (and reliable) way to do so? I guess using regex could be a solution, but is it reliable?

I found a solution which appears to work. I am not sure it is the best solution, but for my purposes it is surely better than manually removing every JS reference from the document.
Here's the trick: hijack JS! I am just prepending the following to the <head>:
<script>
Function.prototype.call = function(){};
Function.prototype.apply = function(){};
Function.prototype.bind = function(){};
</script>
and JavaScript is disabled.
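For comparison, here is a minimal sketch of the second option from the question (stripping the scripts instead of hijacking them). Overriding call, apply, and bind does not stop plain function calls or inline handlers, so stripping may still be needed. The function name stripJavaScript is hypothetical, and the regular expressions assume well-formed, trusted markup; regex cannot reliably parse arbitrary HTML, and the attribute pattern may also hit ordinary text that happens to look like an on...= attribute. A real HTML parser would be the safer choice.

```javascript
// Naive regex-based stripper (hypothetical helper, not a sanitizer).
// Assumes well-formed markup produced by your own server-side renderer.
function stripJavaScript(html) {
  return html
    // Remove <script>...</script> blocks together with their contents.
    .replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, '')
    // Remove inline event-handler attributes such as onclick="..." or onload='...'.
    .replace(/\son[a-z]+\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)/gi, '')
    // Replace javascript: URLs in href/src attributes with a harmless "#".
    .replace(/\b(href|src)\s*=\s*(["'])\s*javascript:[^"']*\2/gi, '$1=$2#$2');
}
```

Because the result contains no script of its own, new event handlers can still be attached to the iframe document afterwards.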

Related

SEO ajax and links

I have been mulling over SEO, ajax and links. I get confused when looking at code from different web pages and how they seem to handle this issue.
I have always made sure that a static context exists for the function that makes the ajax call. I have not been placing JavaScript inline in my markup; rather, I have been using ids to hook up the functions from external JS files. A typical example of my own is the following:
Link
And then hookup the id with a click function.
But what I see on some major pages is that they use things like:
Link
Link
Are there any benefits to using inline JavaScript like above? I don't get it; major sites seem to be using it.
The second way is bad because a crawler that does not execute JavaScript would not be able to follow such a link.
The first method would still work without JavaScript.
As long as your links are properly named and contextually appropriate, AND behave correctly without JavaScript enabled, you should be 100% fine.
Note that some crawlers do execute JavaScript though, so even though the second variation is a poor one, it might still work sometimes.
tl;dr: If it works without JavaScript, you're good.
On HTML part write this way:
Link
On JavaScript part, write this way:
function ajaxCall() {
// AJAX functionalities will go here
return false;
}
Search engines will index the URL, as the JavaScript code is not executed when crawlers fetch the page. But when a user browses this page in a browser, the JavaScript code will be executed (assuming the user did not disable JavaScript), and the ajaxCall function will be called.
Note: As the function returns false, the user will not navigate to the URL defined in the href attribute. But if it returns true or returns nothing, the user will be navigated to the defined location.
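The same idea can be sketched without an inline handler, which matches the id-based hookup described in the question. The names enhanceLink and loadViaAjax are hypothetical; the sketch only assumes the link is a normal anchor with a real href, so crawlers and users without JavaScript still get a working link.

```javascript
// Hypothetical helper: upgrade a plain link so that a click loads its
// target via AJAX instead of navigating. Without JavaScript the href
// still works as a normal link.
function enhanceLink(link, loadViaAjax) {
  link.addEventListener('click', function (event) {
    event.preventDefault();   // same effect as "return false" in an inline handler
    loadViaAjax(link.href);   // caller-supplied function that performs the request
  });
}
```

In a page this might be called as enhanceLink(document.getElementById('my-link'), fetchFragment), where both the id and fetchFragment are placeholders.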

How do I get the path of the currently running script with Javascript?

We have an IE extension implemented as a Browser Helper Object (BHO). We have a utility function written in C++ that we add to the window object of the page so that other scripts in the page can use it to load local script files dynamically. In order to resolve relative paths to these local script files, however, we need to determine the path of the JavaScript file that calls our function:
myfunc() written in C++ and exposed to the page's JavaScript
file:///path/to/some/javascript.js
(additional stack frames)
From the top frame I want to get the information that the script calling myfunc() is located in file:///path/to/some/javascript.js.
I first expected that we could simply use the IActiveScriptDebug interface to get a stacktrace from our utility function. However, it appears to be impossible to get the IActiveScript interface from an IWebBrowser2 interface or associated document (see Full callstack for multiple frames JS on IE8).
The only thing I can think of is to register our own script debugger implementation and have myfunc() break into the debugger. However, I'm skeptical that this will work without prompting the user about whether they want to break into the debugger.
Before doing more thorough tests of this approach, I wanted to check whether anyone has definitive information about whether this is likely to work and/or can suggest an alternative approach that will enable a function written in C++ to get a stack trace from the scripting engine that invoked it.
Each script you load may have an id, and each method of the script calling myfunc() may pass this id to myfunc(). This means that you first have to modify myfunc() and then alter your scripts and calls.
This answer describes how I solved the actual issue I described in the original question. The question description isn't great since I was making assumptions about how to solve the problem that actually turned out to be unfounded. What I was really trying to do is determine the path of the currently running script. I've changed the title of the question to more accurately reflect this.
This is actually fairly easy to achieve since scripts are executed in an HTML document as they are loaded. So if I am currently executing some JavaScript that is loaded by a script tag (synchronously, that is, without async or defer), that script tag will always be the last script tag in the document at that moment (since the rest of the document hasn't loaded yet). To solve this problem, it is therefore enough just to get the URL of the src attribute of the last script tag and resolve any relative paths based on that.
Of course this doesn't work for script embedded directly in the HTML page, but that is bad practice anyway (IMO) so this doesn't seem like a very important limitation.
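A minimal sketch of that lookup follows. The function names are hypothetical, and the document is passed in as a parameter only to keep the sketch easy to test. It must run while the script is first being evaluated, not from a later callback. Resolving with the URL constructor is an assumption of this sketch: it exists in modern browsers and Node.js but not in older Internet Explorer versions, where the path would have to be joined by hand.

```javascript
// Returns the src of the last <script> element currently in the document,
// or null if there is none. For inline scripts src is an empty string.
function lastScriptUrl(doc) {
  var scripts = doc.getElementsByTagName('script');
  var last = scripts[scripts.length - 1];
  return last ? last.src : null;
}

// Resolves a path relative to the directory of the given script URL.
function resolveAgainstScript(relativePath, scriptUrl) {
  return new URL(relativePath, scriptUrl).href;
}
```

For example, resolveAgainstScript('lib/util.js', lastScriptUrl(document)) would give the location of lib/util.js next to the running script. Modern browsers also expose document.currentScript, which avoids the last-tag assumption.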

How to detect whether a browser enables javascript s.t. the master page can make a proper response?

I am developing a site using Asp.net MVC 3 with Razor.
In _Layout.cshtml (the master page) I want to add logic based on whether or not the browser has JavaScript enabled.
What is the simplest way to make this logic?
For the sake of simplicity, let the master page just output as follows:
@if(....)//need to modify
{ <p>javascript enabled...</p>}
else {<p>javascript disabled...</p>}
If you want to block the access of your application you can use something like this
<noscript>
<meta http-equiv="refresh" content="0;url=../Controller/Error" />
</noscript>
There's no way to find this out on the server, therefore there's no way to find out before the first page is loaded. The best you can do is to put a bit of JavaScript into the page that sets a cookie or sends an AJAX request to the server telling it that JavaScript is active, so you can do something about it on subsequent page requests. Even apart from the obvious problem of the first page load, it's a bad strategy since the user may switch off JavaScript in the meantime while your server still thinks it's active.
Graceful degradation/progressive enhancement are the keywords here. Make your page assume by default that no Javascript is active and act accordingly, i.e. serve plain HTML in either case. Include Javascript that will "upgrade" the site's functionality if Javascript is active. Let the client figure out if Javascript is working or not and give it the means to work in either case.
I'm afraid there's no good solution. Almost all of the solutions out there somehow involve running a script to do the check and it doesn't feel right (at least to me). The best solution I can suggest is use the <noscript /> tag and redirect to a different page that does not depend on javascript.
Here is one trick...
Assume the user has JavaScript blocked (off). We put this code into the index.aspx:
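<script>
// Redirect only once; skip when the flag is already present to avoid a reload loop.
if (location.search.indexOf("js=1") === -1) {
document.location.href = "index.aspx?js=1";
}
</script>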
If the request arrives with js=1, you know that the user has JS enabled.
So you can generate the page according to whether the user has JS or not.
The other way is to generate content with some special class, e.g. <div class="noscript">, and then run this script (jQuery) to hide it:
$(".noscript").hide();
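The cookie approach mentioned in the answers above can be sketched as follows. Both function names and the cookie name js are hypothetical. The first function would run in the browser; the second only illustrates how a raw Cookie header could be checked and is not ASP.NET code. As the answers point out, the flag is only available from the second request onwards and can go stale if the user disables JavaScript later.

```javascript
// Client side: this code only runs when JavaScript is enabled,
// so the mere presence of the cookie is the signal.
function markJavaScriptEnabled(doc) {
  doc.cookie = 'js=1; path=/';
}

// Illustrative check against a raw Cookie header value such as "a=b; js=1".
function hasJavaScriptFlag(cookieHeader) {
  return (cookieHeader || '').split(';').some(function (pair) {
    return pair.trim() === 'js=1';
  });
}
```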

Where to put JavaScript configuration functions?

What is the general developer opinion on including JavaScript code directly in the page instead of in an external file referenced by a script tag?
So we all agree that jquery needs to be included with a script file, like below:
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.3/jquery.min.js"
type="text/javascript"></script>
My question concerns functions that are needed on one page but not on all pages of a site. Do we include functions like the one below in that page, or in a global include file like the one above, called mysite.js?
$(document).ready(function(){
    $(".clickme").click(function(event){
        alert("Thanks for visiting!");
    });
});
OK. So the question is: if the code above is going to run for every class="clickme" element on specific pages, and you can either load it from a separate include file called mysite.js or put it in the content of the page, which way would you go?
Arguments are:
If you include it on the page, it will only run on the specific pages where the JS functionality is needed.
Or you include it as a file, which the browser caches, but then jQuery has to spend x ms finding out that the handler is never triggered on a page without the "clickme" class in it.
EDIT 1:
OK. One point that I want to make sure people address: what is the effect of having the document.ready function call things that do not exist in the page? Will that cause any kind of delay in the browser? Is that a significant impact?
First of all - $("#clickme") will find the id="clickme" not class="clickme". You'd want $(".clickme") if you were looking for classes.
I (try to) never put any actual JavaScript code inside my XHTML documents, unless I'm working on testing something on a page quickly. I always link to an external JS file to load the functionality I want. Browsers without JS (like web crawlers) will not load these files, and it makes your code look much cleaner to the "view source".
If I need a bit of functionality only on one page - it sometimes gets its own include file. It all depends on how much functionality / slow selectors it uses. Just because you put your JS in an external JS file doesn't mean you need to include it on every page.
The main reason I use this practice - if I need to change some JavaScript code, it will all be in the same place, and change site wide.
As far as the question about performance goes: some selectors take a lot of time, but most of them (especially those that deal with IDs) are very quick. Searching for a selector that doesn't exist is a waste of time, but when you put that up against the wasted time of a second script HTTP request (which blocks the DOM from being ready, btw), searching for an empty selector will generally win as the lesser of the two evils. jQuery 1.3 Performance Notes and SlickSpeed will hopefully help you decide how many ms you really are losing to searching for a class.
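To make the cost on pages without matching elements concrete, here is a small sketch. It uses the plain DOM API rather than jQuery, and the function name bindClickMe is hypothetical. On a page with no .clickme elements the work is one selector lookup and a loop that never runs.

```javascript
// Binds a click handler to every .clickme element and reports how many
// were found. With zero matches nothing is bound and 0 is returned.
function bindClickMe(doc, onClick) {
  var targets = doc.querySelectorAll('.clickme');
  for (var i = 0; i < targets.length; i++) {
    targets[i].addEventListener('click', onClick);
  }
  return targets.length;
}
```

Called from a shared file such as mysite.js, this is safe on every page; whether the lookup time matters depends on the selector and the size of the document.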
I tend to use an external file so if a change is needed it is done in one place for all pages, rather than x changes on x pages.
Also if you leave the project and someone else has to take over, it can be a massive pain to dig around the project trying to find some inline js.
My personal preference is
completely global functions, plugins and utilities - in a separate JavaScript file and referenced in each page (much like the jQuery file)
specific page functionality - in a separate JavaScript file and only referenced in the page it is needed for
Remember that you can also minify and gzip the files too.
I'm a firm believer in Unobtrusive JavaScript and therefore try to avoid having any JavaScript code in with the markup, even if the JavaScript is in its own script block.
I agree: never have code in your HTML page. In ASP.NET I have programmatically added a check for each page to see if it has a JavaScript file of the same name.
Eg. MyPage.aspx will look for a MyPage.aspx.js
For my MVC master page I have this code to add a javascript link:
// Add each page's JavaScript file
if (Page.ViewContext.View is WebFormView)
{
    WebFormView view = Page.ViewContext.View as WebFormView;
    string shortUrl = view.ViewPath + ".js";
    if (File.Exists(Server.MapPath(shortUrl)))
    {
        _clientScriptIncludes["PageJavascript"] = Page.ResolveUrl(shortUrl);
    }
}
This works well because:
It is automagically included in my files
The .js file lives alongside the page itself
Sorry if this doesn't apply to your language/coding style.

Executing JavaScript on page load selectively

Mending a bug in our SAP BW web application, I need to call two javascript functions from the web framework library upon page load. The problem is that each of these functions reloads the page as a side-effect. In addition, I don't have access to modify these functions.
Any great ideas on how to execute a piece of code on the "real" page load, then another piece of code on the subsequent load caused by this function, and then execute no code on the third load?
My best idea so far is to set a cookie on each go to determine what to run. I don't greatly love this solution. Anything better would be very welcome. And by the way, I do realize loading a page three times is absolutely ridiculous, but that's how we roll with SAP.
A cookie would work just fine. Or you could modify the query string each time with a "mode=x" or "load=x" parameter.
This would present a problem if the user tries to bookmark the final page, though. If that's an option, the cookie solution is fine. I would guess they need cookies enabled to get that far in the app anyway?
A cookie, or pass a query string parameter indicating which javascript function has been run. We had to do something along these lines to trip out a piece of our software. That's really the best I got.
Use a cookie or set a hidden field value. My vote would be for the field value.
This might be a cute case for using window.name, 'the property that survives page reloads'.
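A minimal sketch of the window.name idea, under the assumption that nothing else on the page uses window.name. The function name runStagedOnLoad and the stage=N format are hypothetical, and the steps array stands in for the two framework functions that each reload the page.

```javascript
// Runs steps[0] on the first load, steps[1] on the reload it causes, and so on.
// Once all steps have run, nothing is executed and the marker is cleared.
function runStagedOnLoad(win, steps) {
  var match = /^stage=(\d+)$/.exec(win.name);
  var stage = match ? parseInt(match[1], 10) : 0;
  if (stage >= steps.length) {
    win.name = '';                     // all done; a later fresh visit starts over
    return stage;
  }
  win.name = 'stage=' + (stage + 1);   // record progress before the step reloads the page
  steps[stage]();
  return stage;
}
```

On the page this would be called once per load, for example as runStagedOnLoad(window, [firstFrameworkCall, secondFrameworkCall]), where both function names are placeholders. The same counter could be kept in a cookie, a query-string parameter, or a hidden field instead.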
