Why Do Developers Split Up <script in JavaScript?

I see so many things like this:
S = "<scr" + "ipt language=\"JavaScript1.2\">\n<!--\n";
Why do they do this, is there an application/browser that messes up if you just use straight "<script>"?

Have a look at this question:
Javascript external script loading strangeness.
Taken from bobince's answer:
To see the problem, look at that top line in its script element:
<script type="text/javascript">
document.write('<script src="set1.aspx?v=1234" type="text/javascript"></script>');
</script>
So an HTML parser comes along and sees the opening <script> tag. Inside <script>, normal <tag> parsing is disabled (in SGML terms, the element has CDATA content). To find where the script block ends, the HTML parser looks for the matching close-tag </script>.
The first one it finds is the one inside the string literal. An HTML parser can't know that it's inside a string literal, because HTML parsers don't know anything about JavaScript syntax, they only know about CDATA. So what you are actually saying is:
<script type="text/javascript">
document.write('<script src="set1.aspx?v=1234" type="text/javascript">
</script>
That is, an unclosed string literal and an unfinished function call. These result in JavaScript errors and the desired script tag is never written.
A common attempt to solve the problem is:
document.write('...</scr' + 'ipt>');
This wouldn't explain why it's done in the start tag though.
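A rough way to see the parser's point of view is to mimic its scan in JavaScript. The sketch below is a simplification: the helper name scriptBodySeenByParser is made up for illustration, and real HTML tokenization has more states. It simply cuts the script text at the first </script, wherever that occurs.

```javascript
// Simplified model: the HTML parser ends the script element at the first
// "</script" it meets, with no knowledge of JavaScript string literals.
// scriptBodySeenByParser is a hypothetical helper, not a browser API.
function scriptBodySeenByParser(textAfterOpenTag) {
  const end = textAfterOpenTag.search(/<\/script/i);
  return end === -1 ? textAfterOpenTag : textAfterOpenTag.slice(0, end);
}

// What the page author wrote after the opening tag.
// The "<\/" escapes only keep this sample safe to paste into an inline script.
const written =
  "document.write('<script src=\"set1.aspx?v=1234\"><\/script>');\n<\/script>";

// What actually reaches the JavaScript engine: an unterminated string literal.
console.log(scriptBodySeenByParser(written));

// With the split-string workaround the whole statement survives.
const workaround =
  "document.write('<script src=\"set1.aspx?v=1234\"></scr' + 'ipt>');\n<\/script>";
console.log(scriptBodySeenByParser(workaround));
```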

The more appropriate way to append scripts is to use the DOM.
Create an element of type <script>. See documentation for document.createElement.
Set its attributes (src, type etc.)
Use body.appendChild to add this to the DOM.
This is a much cleaner approach.
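Those steps could look roughly like the sketch below. The helper name appendScript and the URL are placeholders, and the document is passed in as a parameter only so that the function can also be exercised outside a browser.

```javascript
// Hypothetical helper: add an external script through the DOM instead of
// document.write, so no "</script>" text ever appears inside a string.
function appendScript(doc, src) {
  const el = doc.createElement('script'); // 1. create the element
  el.type = 'text/javascript';            // 2. set its attributes
  el.src = src;
  doc.body.appendChild(el);               // 3. attach it to the DOM
  return el;
}

// In a browser (the URL is a placeholder):
if (typeof document !== 'undefined' && document.body) {
  appendScript(document, 'set1.aspx?v=1234');
}
```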

Related

Programmatically escape / script closing tag in javascript

I have this variable, which contains the HTML code of a script element:
<script>
var script = "<script>console.log('script here')</script>"
</script>
How do we programmatically escape the / in the closing tag </script> so that it looks like the code below?
<script>
var script = "<script>console.log('script here')<\/script>"
</script>

It does not work as you think.
The first fragment of code does not work because the browser finds the </script> piece in the string and thinks that it is the closing tag of the script element. It treats the rest of the script and the real </script> closing tag as regular text and displays it in the page (except for the </script> tag).
This means that only a fragment of your script is parsed, the parser finds a syntax error in it (the string is not closed) and the script does not run.
There is no way to fix this using JavaScript code. It is not a coding problem. It is an HTML problem (kind of) and its only solution is to write the HTML in a way that avoids the issue.
The HTML document contains a closing tag </script> inside the body of a script element. For normal HTML content (a paragraph, for example) the solution is straightforward: use &lt; and &gt; to encode < and >:
<p> This is a paragraph that contains a &lt;/p&gt; closing tag</p>
You should do it anyway everywhere you want < and > to represent themselves (to be rendered and not interpreted as tag markers) to produce correct HTML.
This simple solution is not possible in the <script> element because the content of the <script> element is not parsed by the HTML parser. It only finds the first appearance of the </script> closing tag and passes the content to the JavaScript parser. And the JavaScript parser does not understand &lt; and &gt;.
However, there is a simple solution for your problem. Make sure that the script does not contain the string </script> and everything will work without problems.
Usually, this is done either by writing:
var script = "<script>console.log('script here')<\/script>"
or by splitting the string in two sub-strings in the middle of the script word:
var script = "<script>console.log('script here')</scr" + "ipt>"
The first solution looks a little better.
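If the script text is generated (for example by a server-side template), the same escaping can be applied to the source before it is embedded. This is only a sketch: makeInlineSafe is a made-up name, and it assumes that every </script in the source sits inside a string literal, regular expression, or comment, where <\/ is equivalent to </.

```javascript
// Hypothetical build-time helper: rewrite "</script" as "<\/script" in
// JavaScript SOURCE TEXT so the HTML parser no longer sees an end tag.
// Assumes the sequence only occurs inside string/regex literals or comments.
function makeInlineSafe(jsSource) {
  return jsSource.replace(/<\/(script)/gi, '<\\/$1');
}

// The value below is source text destined for an inline <script> element.
const source = 'var script = "<script>console.log(\'script here\')</script>"';
const safe = makeInlineSafe(source);

// The printed source now contains <\/script> instead of </script>.
console.log(safe);
```

The JavaScript engine reads the escaped source as the same string, so nothing changes at run time; only the HTML parser's view of the text differs.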
Another, even easier, solution is to not put the JavaScript code into an inline script element but keep it in a .js file and link that file into the HTML document:
<script src="my-fancy-script.js"></script>
The file my-fancy-script.js looks like this:
var script = "<script>console.log('script here')</script>"
This way, the content of the my-fancy-script.js file is passed directly to the JavaScript parser that is not fooled by any appearance of </script> in the code.

An approach could be:
var script = "<script>console.log('script here')</script>";
script = script.replace('</script>', '<\/script>');
The same for the opposite:
var script = "<script>console.log('script here')<\/script>";
script = script.replace('<\/script>', '</script>');
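One caveat about the replace calls above, as far as standard JavaScript string semantics go: \/ inside a string literal evaluates to a plain /, so both literals denote the same string and the call changes nothing at run time. The escape only matters in the source text that the HTML parser reads. A quick check, meant to be run from a .js file or a console rather than an inline script:

```javascript
const plain = "</script>";
const escaped = "<\/script>";

// The backslash is consumed by the string literal; no "\" is stored.
console.log(plain === escaped); // true
console.log(escaped.length);    // 9

// So replacing one with the other leaves the string unchanged.
const script = "<script>console.log('script here')</script>";
console.log(script.replace('</script>', '<\/script>') === script); // true
```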

How do browsers parse a script tag exactly?

I've just run into a pathological case with HTML parsing. I've always thought that a <script> tag would run until the first closing </script> tag. But it turns out this is not always the case.
This is valid:
<script><!--
alert('<script></script>');
--></script>
And even this is valid:
<script><!--
alert('<script></script>');
</script>
But this is not:
<script><!--
alert('</script>');
--></script>
And neither is this:
<script>
alert('<script></script>');
</script>
This behavior is consistent in Firefox and Chrome. So, as hard as it is to believe, browsers seem to accept an open+close script tag inside an html comment inside a script tag. So the question is: how do browsers really parse script tags?
This matters because the HTML parsing library I'm using, Nokogiri, assumed the obvious (but incorrect) until-the-first-closing-tag rule and did not handle this edge case. I imagine most other libraries would not handle it either.

After poring over the links given by Tim and Jukka I came to the following answer:
after the opening <script> tag, the parser goes to data1 state
if <!-- is encountered while in data1 state, switch to data2 state
if --> is encountered while in any state, switch to data1 state
if <script[\s/>] is encountered while in data2 state, switch to data3 state
if </script[\s/>] is encountered while in data3 state, switch to data2 state
if </script[\s/>] is encountered while in any other state, stop parsing
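The rules above can be turned into a small scanner. This is a sketch of the list as written in this answer, not of the full HTML5 tokenizer; findScriptEnd is a hypothetical name, and the input is assumed to be the text that follows the opening <script> tag.

```javascript
// Returns the index at which parsing stops (the start of the terminating
// "</script"), or -1 if the element is never closed.
function findScriptEnd(text) {
  let state = 'data1';
  let i = 0;
  while (i < text.length) {
    if (text.startsWith('-->', i)) {
      state = 'data1';                  // "-->" in any state
      i += 3;
    } else if (state === 'data1' && text.startsWith('<!--', i)) {
      state = 'data2';
      i += 4;
    } else if (/^<\/script[\s\/>]/i.test(text.slice(i, i + 9))) {
      if (state !== 'data3') return i;  // real end of the element
      state = 'data2';                  // closes the nested "<script"
      i += 8;
    } else if (state === 'data2' && /^<script[\s\/>]/i.test(text.slice(i, i + 8))) {
      state = 'data3';
      i += 7;
    } else {
      i += 1;
    }
  }
  return -1;
}

// The four examples from the question, minus the opening tag.
const samples = [
  "<!--\nalert('<script><\/script>');\n--><\/script>",
  "<!--\nalert('<script><\/script>');\n<\/script>",
  "<!--\nalert('<\/script>');\n--><\/script>",
  "\nalert('<script><\/script>');\n<\/script>",
];
for (const sample of samples) {
  // Print the script body that would be handed to the JavaScript engine.
  console.log(JSON.stringify(sample.slice(0, findScriptEnd(sample))));
}
```

For the first two samples the scan stops at the final end tag, so the whole alert call survives; for the last two it stops at the end tag inside the string literal, which matches the behavior reported in the question.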

All the examples are invalid as per the HTML 4.01 specification: the content of script is declared as CDATA, and the description of CDATA says:
“Although the STYLE and SCRIPT elements use CDATA for their data model, for these elements, CDATA must be handled differently by user agents. Markup and entities must be treated as raw text and passed to the application as is. The first occurrence of the character sequence "</" (end-tag open delimiter) is treated as terminating the end of the element's content. In valid documents, this would be the end tag for the element.”
As you have observed, browsers might not enforce this rule but instead recognize pairs of start and end tags, in some situations. From the spec perspective, this is handling of invalid documents, i.e. error processing. It is not clear what exactly they are doing here and why. It seems to depend on the presence of <!--, which should not have any effect on HTML 4.01 parsing (it is not a comment opener in CDATA content).
In XHTML, partly different rules apply, because in XHTML, <!-- opens a comment within the content of a script element.
As an aside, all the examples are invalid HTML 4.01 and invalid XHTML due to the lack of the type attribute in script. The attribute is not needed (browsers default to treating the content as JavaScript), but it’s required by those specs.
In HTML5, other rules apply. They are rather complicated, and they are supposed to describe browser behavior. In addition to imposing restrictions on content (forbidding e.g. <!-- without matching -->), HTML5 also specifies parsing rules.

Content of tags is still HTML, unless you mark it as not being HTML. In HTML, <word> is taken to be a tag; < needs to be written as &lt; to avoid this behaviour. Alternatively, if you want to make the contents of <script> a text node, use this formula:
<script type="text/javascript">
//<![CDATA[
// your code, with < and & and "", woohoo!
//]]>
</script>
<![CDATA[ ... ]]> delineates a part of the document as pure text, without markup. Slashes are there so JavaScript wouldn't get confused; the first set of slashes is outside CDATA, but they're HTML-safe, so there's no problem.
EDIT: Just realised the question is about parsing, not writing HTML. Oops.

Hypothetically, if the tags are parsed first and the comments are parsed later, the HTML parser would give you those results.
(I don't mean this is necessarily the case, just a possible explanation only.)
1st case
<script><!--
alert('<script></script>');
--></script>
There is a set of <script></script> inside another <script></script>. The parser may ignore the names of the tags at first and just check that they open and close properly. Then it parses the comments.
<script><!--
--></script>
So this is valid.
2nd case
<script><!--
alert('<script></script>');
</script>
There is a set of <script></script> inside another <script></script>. Then it parses the comments.
<script><!--
The comment extends all the way to the end of the document. This is not strictly valid, but the browser handles it correctly.
3rd case
<script><!--
alert('</script>');
--></script>
There is a single closing tag inside the set of <script></script>. It is invalidated before the parser gets to treat the </script> as part of a comment.
4th case
<script>
alert('<script></script>');
</script>
There is a set of <script></script> inside another <script></script>, and there are no comments. The first pass is valid but then it really looks into the tags to see what they are. It may not accept a pair of <script> tags inside another one so it invalidates the case.

Painlessly pass HTML to javascript

I need to pass html to javascript so that I can show the html on demand.
I can do it using textareas by having a textarea tag with the html content on the page, like so: <textarea id="html">{whatever html I want except other textareas}</textarea>
then using jquery I can present it on the page:
$("#target").html($("#html").val());
What I want to know is how to do it properly, without having to use textareas or having the html present in the <body> of the page at all?

You could use jquery templates. It's a bit more complex, but offers lots of other nice features.
https://github.com/codepb/jquery-template

Just save it in a variable:
<script type="text/javascript">
var myHTML = '<div>Foo Bar</div>';
</script>

As far as I know there is no painless way to do this due to the nature of html and javascript.
You can store your html as a string in a javascript variable such as:
var string = '<div class="someClass">your text here</div>';
However, you should note that strings are enclosed within either ' or ", and if you use either in your html you will prematurely end the string and cause errors with invalid javascript.
You can decide to use only one type of quote in your html, say ", and then ' to hold strings in javascript, but a more robust way is to escape your quotes in html like so:
<div class=\"someClass\">your text here</div>
By putting \ before a quote character you are telling the JavaScript parser to treat it as a literal character rather than as the end of the string. When you print the string, the quote is still there but the \ is not, giving you functioning html.
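Rather than escaping quotes by hand, the literal can be generated. The sketch below leans on JSON.stringify, which for a string yields a double-quoted literal that is also valid JavaScript. The helper name toJsStringLiteral is invented here, and the extra replace for </ is an added precaution for inline scripts rather than something this answer asked for.

```javascript
// Hypothetical helper: turn arbitrary HTML into a JavaScript string literal.
// JSON.stringify escapes quotes, backslashes and newlines; the replace then
// rewrites "</" as "<\/" so the result is also safe inside an inline script.
function toJsStringLiteral(html) {
  return JSON.stringify(html).replace(/<\//g, '<\\/');
}

const html = '<div class="someClass">It\'s your text here</div>';
const literal = toJsStringLiteral(html);

// Source text that could be emitted into a page or a .js file.
console.log('var myHTML = ' + literal + ';');
```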

Just like remy mentioned, you can use jQuery templates, and it's even cooler if you combine it with mustache! (which supports a lot of platforms)
Plus the mustache jQuery plugin is way more advanced than jQuery templates.
https://github.com/jonnyreeves/jquery-Mustache

How do I get the original innerHTML source without the Javascript generated contents?

Is it possible to get in some way the original HTML source without the changes made by the processed Javascript? For example, if I do:
<div id="test">
<script type="text/javascript">document.write("hello");</script>
</div>
If I do:
alert(document.getElementById('test').innerHTML);
it shows:
<script type="text/javascript">document.write("hello");</script>hello
In simple terms, I would like the alert to show only:
<script type="text/javascript">document.write("hello");</script>
without the final hello (the result of the processed script).

I don't think there's a simple solution to just "grab original source" as it'll have to be something that's supplied by the browser. But, if you are only interested in doing this for a section of the page, then I have a workaround for you.
You can wrap the section of interest inside a "frozen" script:
<script id="frozen" type="text/x-frozen-html">
The type attribute I just made up, but it will force the browser to ignore everything inside it. You then add another script tag (proper javascript this time) immediately after this one - the "thawing" script. This thawing script will get the frozen script by ID, grab the text inside it, and do a document.write to add the actual contents to the page. Whenever you need the original source, it's still captured as text inside the frozen script.
And there you have it. The downside is that I wouldn't use this for the whole page... (SEO, syntax highlighting, performance...) but it's quite acceptable if you have a special requirement on part of a page.
Edit: Here is some sample code. Also, as @FlashXSFX correctly pointed out, any script tags within the frozen script will need to be escaped. So in this simple example, I'll make up a <x-script> tag for this purpose.
<script id="frozen" type="text/x-frozen-html">
<div id="test">
<x-script type="text/javascript">document.write("hello");</x-script>
</div>
</script>
<script type="text/javascript">
// Grab contents of frozen script and replace `x-script` with `script`
function getSource() {
return document.getElementById("frozen")
.innerHTML.replace(/x-script/gi, "script");
}
// Write it to the document so it actually executes
document.write(getSource());
</script>
Now whenever you need the source:
alert(getSource());
See the demo: http://jsbin.com/uyica3/edit

A simple way is to fetch it from the server again. It will most probably be in the cache. Here is my solution using jQuery.get(). It takes the original URI of the page and loads the data with an ajax call:
$.get(document.location.href, function(data,status,jq) {console.log(data);})
This will print the original code without any javascript. It does not do any error handling!
If you don't want to use jQuery to fetch the source, consult the answer to this question: How to make an ajax call without jquery?
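Without jQuery, the same idea could be written with the standard fetch API, this time with basic error handling. The function name fetchOriginalSource and the injectable fetchImpl parameter are assumptions made for this sketch (the latter only so that it can be tried outside a browser).

```javascript
// Hypothetical helper: request the page again and resolve with its markup
// as returned by the server, before any script has modified it.
async function fetchOriginalSource(url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error('Request failed with status ' + response.status);
  }
  return response.text();
}

// In a browser:
if (typeof document !== 'undefined') {
  fetchOriginalSource(document.location.href)
    .then((source) => console.log(source))
    .catch((err) => console.error('Could not reload the page source:', err));
}
```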

Could you send an Ajax request to the same page you're currently on and use the result as your original HTML? This is foolproof given the right conditions, since you are literally getting the original HTML document. However, this won't work if the page changes on every request (with dynamic content), or if, for whatever reason, you cannot make a request to that specific page.

Brute force approach
var orig = document.getElementById("test").innerHTML;
alert(orig.replace(/<\/script>[.\n\r]*.*/i,"</script>"));
EDIT:
This could be better
var orig = document.getElementById("test").innerHTML + "<<>>";
alert(orig.replace( /<\/script>[^(<<>>)]+<<>>/i, "<\/script>"));

If you override document.write to add some identifiers at the beginning and end of everything written to the document by the script, you will be able to remove those writes with a regular expression.
Here's what I came up with:
<script type="text/javascript" language="javascript">
var docWrite = document.write;
document.write = myDocWrite;
function myDocWrite(wrt) {
docWrite.apply(document, ['<!--docwrite-->' + wrt + '<!--/docwrite-->']);
}
</script>
Added your example somewhere in the page after the initial script:
<div id="test">
<script type="text/javascript"> document.write("hello");</script>
</div>
Then I used this to alert what was inside:
var regEx = /<!--docwrite-->(.*?)<!--\/docwrite-->/gm;
alert(document.getElementById('test').innerHTML.replace(regEx, ''));

If you want the pristine document, you'll need to fetch it again. There's no way around that. If it weren't for the document.write() (or similar code that would run during the load process) you could load the original document's innerHTML into memory on load/domready, before you modify it.

I can't think of a solution that would work the way you're asking. The only code that Javascript has access to is via the DOM, which only contains the result after the page has been processed.
The closest I can think of to achieve what you want is to use Ajax to download a fresh copy of the raw HTML for your page into a Javascript string, at which point since it's a string you can do whatever you like with it, including displaying it in an alert box.

A tricky way is to use a <style> tag as the template, so that you no longer need to rename x-script.
console.log(document.getElementById('test').innerHTML);
<style id="test" type="text/html+template">
<script type="text/javascript">document.write("hello");</script>
</style>
But I do not like this ugly solution.

I think you want to traverse the DOM nodes:
function getScriptSources(id) {
  var childNodes = document.getElementById(id).childNodes, i, output = [];
  // Collect only the source text of the direct <script> children
  for (i = 0; i < childNodes.length; i++) {
    if (childNodes[i].nodeName == "SCRIPT") {
      output.push(childNodes[i].innerHTML);
    }
  }
  return output.join('');
}

Missing } in XML expression

I have an external javascript file that I want to, upon include, write some HTML to the end of the web page.
Upon doing so though I get the error Missing } in XML expression on the line that uses dropdownhtml.
Here is my code
var dropdownhtml = '<div id="dropdown"></div>';
$(document).ready(function(){
//$(document).append(dropdownhtml);
alert(dropdownhtml);
});
The XHTML webpage that includes this file does so like this:
<script type="text/javascript" src="/web/resources/js/dropdownmenu.js"></script>
Doing either append or alert throws up the same error, what is going wrong?

I got this error because I called an external JavaScript within an existing JavaScript, so ended up with:
<script type="text/javascript">
<script type="text/javascript">
code
</script>
code
</script>

Edit: Your update changes the question a bit. :-)
There's nothing wrong with your quoted Javascript or with the script tag that includes it, the problem must lie elsewhere on the page.
The old answer:
If you're including Javascript inside an XML document, you must wrap it up in a CDATA section, or you'll run into trouble like this because the XML parser neither knows nor cares about your Javascript quotes, and instead sees markup (your <div>s in the string).
E.g.:
<foo>
<bar><![CDATA[
var dropdownhtml = '<div id="dropdown"></div>';
$(document).ready(function(){
//$(document).append(dropdownhtml);
alert(dropdownhtml);
});
]]></bar>
</foo>
Naturally you need to ensure that the ]]> sequence never appears in a string (or comment, etc.) in your script, but that's quite easy to do (for instance: "Be sure to interrupt the end sequence with a harmless backslash like this: ]]\>; that escape just resolves to > anyway.")

There's definitely a missing ); at the end of your code sample. Don't get where there may be a missing } though.

I had an empty script element on my page:
<script src=""></script>
and this led to the same error.
