html generation via javascript: avoiding htmltidy error

I have a web page that generates HTML via JavaScript.
When validating using HTML validator 0.9.5.1 in Firefox 22, I get an error: 'document type does not allow element "span" here'
I am using this JavaScript:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
...<body>...
<script type='text/javascript'>
var someHtml = '<span>Hello world!</span>';
var e;
e = window.document.createElement('div');
e.innerHTML = someHtml;
window.document.body.appendChild(e);
</script>
Apparently the validator assumes that <span> is an element nested inside <script>.
How should I rewrite the JavaScript so that it passes HTML validation? I would prefer a solution that does not require me to create the HTML elements one by one.
Note: The answers in "Avoiding HTML-in-string / html() in a jQuery script" do not help me, since I know that the code works. I only want to reformat it to avoid validation errors.

The pre element is for preformatted text, not more HTML.
HTML Tidy's objection is that you are inserting something that it believes you expect the browser to render as HTML. You need to escape the entities (replacing < and > with &lt; and &gt;) so that it is interpreted as text.
UPDATE IN RESPONSE TO COMMENT:
With an XHTML doctype, the document must be well-formed XML. So you need CDATA markers inside your script tag:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<title>Hello world</title>
</head>
<body>
<script type='text/javascript'>
//<![CDATA[
var someHtml = "<div>Hello world!</div>";
var e;
e = window.document.createElement('div');
e.innerHTML = someHtml;
window.document.body.appendChild(e);
//]]>
</script>
</body>
</html>

You can't put <div> inside <pre>. <pre> can contain phrasing content only, and <div> is not phrasing content.
You should also wrap your script in a <![CDATA[ ... ]]> section, since the doctype is XHTML.
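If the goal is only to keep tag-like text out of the script source, another option (not taken from the answers above) is to write the less-than sign as a JavaScript escape sequence. A minimal sketch, assuming the validator objects only to literal tag delimiters inside the script element:

```javascript
// \x3C is the JavaScript string escape for the less-than sign, so the
// source text contains no literal tags for a validator to flag.
var someHtml = '\x3Cspan>Hello world!\x3C/span>';

// The runtime value still starts with a real less-than sign (char code 60).
console.log(someHtml.charAt(0) === String.fromCharCode(60)); // true

// In the page, the value is used exactly as in the question:
// var e = window.document.createElement('div');
// e.innerHTML = someHtml;
// window.document.body.appendChild(e);
```

The string is unchanged at run time, so the rest of the script does not need to be touched.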

Related

Convert jQuery element to html string

This JS code tries to modify raw HTML. It converts the HTML to a jQuery object and applies the modifications to that object. The part that is not working is converting the result back to a raw HTML string.
According to the docs, .html() does not work on XML documents.
How can I convert the jQuery object back to a raw HTML string? Thanks
let jQ = $($.parseHTML(raw_html));
// modify jQ to your heart's content
console.log(jQ.html()); //<-- undefined
The raw_html
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
//...
</html>

Do this:
var a = $('<div>').append(raw_html);
// do modifications to variable a
$(a).html() will then return the correct HTML.
NOTE: this will strip <head> and other document-level tags.

Javascript XMLSerializer case sensitive

I'm generating a KML document in JavaScript and I'm trying to use XMLSerializer to produce the XML file, but it generates all-lower-case tags even though I create the tags with capital letters in the DOM.
Is it the DOM that mangles the capitalization, or the XMLSerializer? Is there any way around it, or am I missing something? I've tried this in both Chrome and Firefox.
The KML document is to be imported into Google Earth, which does not seem to accept lower-case tags.

Based on testing in FF4, the following will work:
Use document.createElementNS ("http://www.opengis.net/kml/2.2", elementName) instead of document.createElement(elementName).
Use elt.appendChild (document.createTextNode (text)) instead of elt.innerHTML = text.

The following works for me (preserving case) in FF 5 beta in an XHTML page:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>test</title>
<script type="text/javascript">
function test() {
var kml = document.getElementsByTagName("kml").item(0);
window.alert (new XMLSerializer().serializeToString(kml));
}
</script>
</head>
<body onload="test()">
<kml id="kml" xmlns="http://www.opengis.net/kml/2.2">
<Document>
<name>KML Samples</name>
<open>1</open>
<description>samples</description>
<Style id="downArrowIcon">
<IconStyle>
<Icon>
<href>http://maps.google.com/mapfiles/kml/pal4/icon28.png</href>
</Icon>
</IconStyle>
</Style>
</Document>
</kml>
</body>
</html>

It doesn't matter if you add elements with capital letters: in an HTML document, the DOM always manages them in lower case. Just check it with Firebug; you won't see uppercase tags.
If your doctype is set to XHTML, uppercase tags even break standards compliance.
In XHTML, element and attribute names must be all lower case.
UPDATE: just checked following:
var test = document.createElement("DIV");
// test.outerHTML returns "<div></div>"
So already when you create the element, it's being parsed and converted to lowercase.

Document Type Definition in html

If I add the JS script above the line <!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">, then the CSS stops working. Is there any way to solve this issue?
<script type="text/javascript">
<?php $data3 = getmaildata(); ?>
var collection = [<?php echo $data3; ?>];
</script>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<title>.:: sample ::.</title>
<link rel="stylesheet" href="css/stylesheet.css" type="text/css">

A script element can appear in the head or in the body, it can't appear before the Doctype and no element can appear outside the root element (<html>).
If the Doctype (with a couple of provisos which don't apply in this case) isn't the very first thing in a document then browsers will enter Quirks mode (and emulate bugs seen in older browsers with CSS and DOM handling).
There is no way around this (that is well supported by browsers), so just write valid code and don't try to put a script element somewhere that it isn't allowed.

<script> tags are usually placed in <head> or just before </body>. I don't know whether that is related to your problem, but your code is invalid either way.

What happens if you put the SCRIPT element in its proper place, inside the HEAD section or in the BODY?
Also, I don't know what $data3 contains, but if it's a string and not an integer for instance, then it should be encapsulated in quotation marks.

The doctype declaration should be the very first thing in an HTML document, before the <html> tag.
The doctype declaration is not an HTML tag; it is an instruction to the web browser about what version of the markup language the page is written in.
The doctype declaration refers to a Document Type Definition (DTD). The DTD specifies the rules for the markup language, so that the browsers can render the content correctly.
http://www.w3schools.com/tags/tag_DOCTYPE.asp

Removing a particular line from a paragraph using javascript

I pass a string from one page to another using AJAX. The string is stored in the database. The response contains the string followed by the whole client-side markup of the page I am retrieving the data from. I want to extract the string alone from the rest of the data using JavaScript. The string that I retrieve from the database will keep changing.
This is how it looks:
**live its live** <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><htmlxmlns="http://www.w3.org/1999/xhtml"> <head><title> Untitled Page </title>
I want to remove everything except **live its live**. The problem is that the **live its live** text changes whenever it is updated in the database.
Can anybody please help me sort out this problem?

demo
var str = '**live its live** <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head><title> Untitled Page </title><link href="App_Themes/Modern/default.css" type="text/css" rel="stylesheet" /></head> <body> <form name="form1" method="post" action="/Web/MessageDisplay.aspx" id="form1"> <div> <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUJNzgzNDMwNTMzZGTa4056JeZHQioLQvNmbYjBQvHt8A==" /> </div> <div> </div> </form> </body> </html>';
alert(str.slice(0,str.indexOf('<!DOCTYPE')))

Your code must have a way to uniquely identify the "live its live" text, otherwise it is fragile. If you just slice the string based on the content that follows, it may fail. Assuming that you use "**" as a wrapper to uniquely identify the text, an alternative to slicing the string or taking a substring is to use a JavaScript regular expression.
E.g.
var re = /\*\*.*?\*\*/; // non-greedy, so the match stops at the first closing **
var str = '**live its live** <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><htmlxmlns="http://www.w3.org/1999/xhtml"> <head><title> Untitled Page </title>;';
alert(re.exec(str)[0]);
For more information on RegExp, see https://developer.mozilla.org/en/core_javascript_1.5_guide/regular_expressions
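The slicing approach from the first answer can be made a little more defensive. The following is a minimal sketch, assuming the wanted text always precedes the doctype declaration; textBeforeDoctype is a hypothetical helper name, not something from the original answers:

```javascript
// Hypothetical helper: returns the text that precedes the first doctype
// declaration, or the whole response if no doctype is found.
function textBeforeDoctype(response) {
  // search() returns -1 when the pattern does not occur.
  var index = response.search(/<!DOCTYPE/i);
  var text = index === -1 ? response : response.slice(0, index);
  // Drop the whitespace that separates the text from the markup.
  return text.trim();
}

var sample = '**live its live** <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">';
console.log(textBeforeDoctype(sample)); // **live its live**
```

Unlike a bare indexOf followed by slice, this does not cut off the last character when the response contains no doctype at all.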

Extremely strange glitch in Chrome - parses contents of string!

Okay - this is the dumbest glitch I have seen in a while:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<script type='text/javascript'>
var data = "</script>";
</script>
</head>
<body>
This should break!
</body>
</html>
This causes syntax errors because the JavaScript parser is actually reading the contents of the string. How stupid!
How can I put </script> in my code? Is there any way?
Is there a valid reason for this behavior?

Within X(HT)ML (when actually treated as such), scripts are required to be escaped as CDATA for precisely this reason. http://www.w3.org/TR/xhtml1/diffs.html#h-4.8
In XHTML, the script and style elements are declared as having #PCDATA content. As a result, < and & will be treated as the start of markup, and entities such as &lt; and &amp; will be recognized as entity references by the XML processor to < and & respectively. Wrapping the content of the script or style element within a CDATA marked section avoids the expansion of these entities.
<script type="text/javascript">
<![CDATA[
... unescaped script content ...
]]>
</script>
If your XHTML document is just served as text/html and treated as tag soup, that doesn't apply and you'll just have to "escape" the string like '</scr' + 'ipt>'.

It's not a glitch - this is normal expected behaviour and quite rightly so if you think about it. HTML specs do not define scripting languages, so all the engine should see is plain text up until </script>, which closes the tag. There are a couple of options, other than the ones already outlined:
// escape the / character, changing the format of the "closing" tag
var data = "<\/script>";
// break up the string
var data = "</"+"script>";
The first method works because HTML doesn't use \ for escaping, it's treated as a literal character, and of course <\/script> isn't a valid closing tag. The second one works for more obvious reasons, but I've been told by someone else here that it shouldn't be used (and I never quite understood why).

Write it this way:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<script type='text/javascript'>
<!--
var data = "</script>";
-->
</script>
</head>
<body>
This should break!
</body>
</html>
The reason is simply that HTML is parsed before executing javascript and the <!-- and --> make the parser ignore all tags that appear in this section.

If you can believe the HTML4 standard, the script content
ends at the first ETAGO ("</") delimiter followed by a name start character ([a-zA-Z])
So, the JavaScript parser is not reading the contents of the string as you describe; the JavaScript parser never gets anything after var data = ", which obviously isn't a valid script.
The simplest way to avoid accidentally ending your JavaScript early is to use Andy E's first suggestion:
var data = "<\/script>";
This way the HTML parser doesn't see </ so the script content doesn't end, and \/ is equivalent to / in a JavaScript string literal, so the results are correct. This is also the method shown for JavaScript in the standard.
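When the script text is generated rather than typed by hand, the same escaping can be applied automatically. A minimal sketch, assuming it runs in an external .js file or a build step (not inline in the page, since the example itself contains the closing tag); escapeForInlineScript is a hypothetical helper name:

```javascript
// Hypothetical helper: rewrites every "</" as "<\/" so that the text can be
// placed inside a JavaScript string literal in an inline script element
// without the HTML parser seeing an end tag.
function escapeForInlineScript(source) {
  return source.replace(/<\//g, '<\\/');
}

// Build the source text of a double-quoted string literal.
var literal = '"' + escapeForInlineScript('</script>') + '"';
console.log(literal); // "<\/script>"
```

Because \/ is equivalent to / inside a JavaScript string literal, the emitted literal evaluates to the original text.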
