How can I render raw HTML [duplicate] - javascript

How can I show HTML snippets on a webpage without needing to replace each < with &lt; and > with &gt;?
In other words, is there a tag that means "don't render HTML until you hit the closing tag"?

The tried and true method for HTML:
Replace the & character with &amp;
Replace the < character with &lt;
Replace the > character with &gt;
Optionally surround your HTML sample with <pre> and/or <code> tags.

sample 1:
<pre>
This text has
been formatted using
the HTML pre tag. The browser should
display all white space
as it was entered.
</pre>
sample 2:
<pre>
<code>
My pre-formatted code
here.
</code>
</pre>
sample 3:
(If you are actually "quoting" a block of code, then the markup would be)
<blockquote>
<pre>
<code>
My pre-formatted "quoted" code here.
</code>
</pre>
</blockquote>
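If you would rather not do the three replacements by hand, here is a minimal sketch of them as a plain JavaScript function. The name escapeHtml and the sample snippet are made up for illustration; they are not from the answer above.

```javascript
// Hypothetical helper: applies the three replacements listed above so that a
// snippet can be placed inside <pre><code> and shown as text.
function escapeHtml(source) {
  return source
    .replace(/&/g, "&amp;") // must run first, or the entities added below get escaped again
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const snippet = '<p class="note">Fish & chips</p>';
console.log("<pre><code>" + escapeHtml(snippet) + "</code></pre>");
```

The order matters: if & were replaced last, the & in the freshly produced &lt; and &gt; would be escaped a second time.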

is there a tag for don't render HTML until you hit the closing tag?
No, there is not. In HTML proper, there’s no way short of escaping some characters:
& as &amp;
< as &lt;
(Incidentally, there is no need to escape > but people often do it for reasons of symmetry.)
And of course you should surround the resulting, escaped HTML code within <pre><code>…</code></pre> to (a) preserve whitespace and line breaks, and (b) mark it up as a code element.
All other solutions, such as wrapping your code into a <textarea> or the (deprecated) <xmp> element, will break.1
XHTML that is declared to the browser as XML (via the HTTP Content-Type header! — merely setting a DOCTYPE is not enough) could alternatively use a CDATA section:
<![CDATA[Your <code> here]]>
But this only works in XML, not in HTML, and even this isn’t a foolproof solution, since the code mustn’t contain the closing delimiter ]]>. So even in XML the simplest, most robust solution is via escaping.
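For the XML case, one common workaround for the ]]> problem is to split the text into several CDATA sections wherever that delimiter occurs. A minimal sketch, assuming you generate the markup from JavaScript (the function name toCdata is hypothetical):

```javascript
// Hypothetical helper: wraps text in CDATA, splitting every "]]>" so that
// "]]" closes one section and ">" opens the next one.
function toCdata(text) {
  return "<![CDATA[" + text.split("]]>").join("]]]]><![CDATA[>") + "]]>";
}

console.log(toCdata("Your <code> here"));
console.log(toCdata("if (a[b[0]]>1) {}"));
```

An XML parser concatenates adjacent CDATA sections, so the original text comes back unchanged. It is still more machinery than plain escaping, which supports the point above.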
1 Case in point:
textarea {border: none; width: 100%;}
<textarea readonly="readonly">
<p>Computer <textarea>says</textarea> <span>no.</span>
</textarea>
<xmp>
Computer <xmp>says</xmp> <span>no.</span>
</xmp>

A naive way to display code is to put it in a textarea and add the disabled attribute so it's not editable.
<textarea disabled> code </textarea>
Hope that helps someone looking for an easy way to get stuff done.
But warning, this won't escape the tags for you, as you can see here (the following obviously does not work):
<textarea disabled>
This is the code to create a textarea:
<textarea></textarea>
</textarea>

Deprecated, but works in FF3 and IE8.
<xmp>
<b>bold</b><ul><li>list item</li></ul>
</xmp>
Recommended:
<pre><code>
code here, escape it yourself.
</code></pre>

i used <xmp> just like this :
http://jsfiddle.net/barnameha/hF985/1/

The deprecated <xmp> tag essentially does that but is no longer part of the XHTML spec. It should still work though in all current browsers.
Here's another idea, a hack/parlor trick, you could put the code in a textarea like so:
<textarea disabled="true" style="border: none;background-color:white;">
<p>test</p>
</textarea>
Putting angle brackets and code like this inside a text area is invalid HTML and will cause undefined behavior in different browsers. In Internet Explorer the HTML is interpreted, whereas Mozilla, Chrome and Safari leave it uninterpreted.
If you want it to be non-editable and look different then you could easily style it using CSS. The only issue would be that browsers will add that little drag handle in the bottom-right corner to resize the box. Or alternatively, try using an input tag instead.
The right way to inject code into your textarea is to use server side language like this PHP for example:
<textarea disabled="true" style="border: none;background-color:white;">
<?php echo '<p>test</p>'; ?>
</textarea>
Then it bypasses the html interpreter and puts uninterpreted text into the textarea consistently across all browsers.
Other than that, the only way is really to escape the code yourself if static HTML or using server-side methods such as .NET's HtmlEncode() if using such technology.

If your goal is to show a chunk of code that you're executing elsewhere on the same page, you can use textContent (it's pure-js and well supported: http://caniuse.com/#feat=textcontent)
<div id="myCode">
<p>
hello world
</p>
</div>
<div id="loadHere"></div>
// Copy the markup of the source div into the target div as plain text
document.getElementById("loadHere").textContent = document.getElementById("myCode").innerHTML;
To get multi-line formatting in the result, you need to set css style "white-space: pre;" on the target div, and write the lines individually using "\r\n" at the end of each.
Here's a demo: https://jsfiddle.net/wphps3od/
This method has an advantage over using a textarea: the code won't be reformatted as it would be in a textarea. (Things like &nbsp; are removed entirely in a textarea.)

In HTML? No.
In XML/XHTML? You could use a CDATA block.

I assume:
you want to write 100% valid HTML5
you want to place the code snippet (almost) literal in the HTML
especially < should not need escaping
All your options are in this tree:
with HTML syntax
there are five kinds of elements
those called "normal elements" (like <p>)
can't have a literal <
it would be considered the start of the next tag or comment
void elements
they have no content
you could put your HTML in a data attribute (but this is true for all elements)
that would need JavaScript to move the data elsewhere
in double-quoted attributes, " and &thing; need escaping: &quot; and &amp;thing; respectively
raw text elements
<script> and <style> only
they are never rendered visible
but embedding your text in JavaScript might be feasible
Javascript allows for multi-line strings with backticks
it could then be inserted dynamically
a literal </script is not allowed anywhere in <script>
escapable raw text elements
<textarea> and <title> only
<textarea> is a good candidate to wrap code in
it is totally legal to write </html> in there
not legal is the substring </textarea for obvious reasons
escape this special case with &lt;/textarea or similar
&thing; needs escaping: &amp;thing;
foreign elements
elements from MathML and SVG namespaces
at least SVG allows embedding of HTML again...
and CDATA is allowed there, so it seems to have potential
with XML syntax
covered by Konrad's answer
Note: > never needs escaping. Not even in normal elements.
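The <textarea> branch of the tree can be sketched as a small function. This is only an illustration of the two rules listed for escapable raw text elements; the name escapeForTextarea is an assumption and does not come from any library.

```javascript
// Hypothetical helper for content placed between <textarea> and </textarea>:
// only ampersands and the "</textarea" substring need attention.
function escapeForTextarea(source) {
  return source
    .replace(/&/g, "&amp;")                 // &thing; becomes &amp;thing;
    .replace(/<(?=\/textarea)/gi, "&lt;");  // </textarea becomes &lt;/textarea
}

console.log(escapeForTextarea("<p>Tom &amp; Jerry</p></textarea></html>"));
```

Everything else, including </html> and other tags, passes through untouched.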

It's very simple.
Use this xmp code
<xmp id="container">
<xmp >
<p>a paragraph</p>
</xmp >
</xmp>

<textarea ><?php echo htmlentities($page_html); ?></textarea>
works fine for me..
"keeping in mind Alexander's suggestion, here is why I think this is a good approach"
If we just use a plain <textarea>, it may not always work, since the source may contain a closing textarea tag that prematurely closes the parent tag and dumps the rest of the HTML source into the parent document, which would look awkward.
Using htmlentities converts all applicable characters, such as < and >, to HTML entities, which eliminates any possibility of leaks.
There may be benefits or shortcomings to this approach, or a better way of achieving the same result; if so, please comment, as I would love to learn from them :)

This is a simple trick and I have tried it in Safari and Firefox
<code>
<span><</span>meta property="og:title" content="A very fine cuisine" /><br>
<span><</span>meta property="og:image" content="http://www.example.com/image.png" />
</code>
It will show like this:
You can see it live Here

You could try:
Hello! Here is some code:
<xmp>
<div id="hello">
</div>
</xmp>

This is a bit of a hack, but we can use something like:
body script {
display: block;
font-family: monospace;
white-space: pre;
}
<script type="text/html">
<h1>Hello World</h1>
<ul>
<li>Enjoy this dodgy hack,
<li>or don't!
</ul>
</script>
With that CSS, the browser will display scripts inside the body. It won’t attempt to execute this script, as it has an unknown type text/html. It’s not necessary to escape special characters inside a <script>, unless you want to include a closing </script> tag.
I’m using something like this to display executable JavaScript in the body of the page, for a sort of "literate programming".
There’s some more info in this question: "When should tags be visible and why can they?"
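Since the only thing that breaks this hack is a closing script tag inside the content, you could check for it before embedding. A rough sketch (canEmbedInScript is a made-up name, and it covers only the one case mentioned above):

```javascript
// Hypothetical check: returns false when the snippet contains "</script"
// in any letter case, which would end the surrounding <script> early.
function canEmbedInScript(source) {
  return !/<\/script/i.test(source);
}

console.log(canEmbedInScript("<h1>Hello World</h1>"));
console.log(canEmbedInScript("<p>before</p></SCRIPT><p>after</p>"));
```

Snippets that fail the check need ordinary escaping and a <pre><code> block instead.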

function escapeHTML(string)
{
var pre = document.createElement('pre');
var text = document.createTextNode(string);
pre.appendChild(text);
return pre.innerHTML;
}//end escapeHTML
it will return the escaped Html

Ultimately the best (though annoying) answer is "escape the text".
There are however a lot of text editors -- or even stand-alone mini utilities -- that can do this automatically. So you never should have to escape it manually if you don't want to (Unless it's a mix of escaped and un-escaped code...)
Quick Google search shows me this one, for example: http://malektips.com/zzee-text-utility-html-escape-regular-expression.html

This is by far the best method for most situations:
<pre><code>
code here, escape it yourself.
</code></pre>
I would have up voted the first person who suggested it but I don't have reputation. I felt compelled to say something though for the sake of people trying to find answers on the Internet.

You could use a server side language like PHP to insert raw text:
<?php
$str = <<<EOD
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="description" content="Minimal HTML5">
<meta name="keywords" content="HTML5,Minimal">
<title>This is the title</title>
<link rel='stylesheet' href='style.css'>
</head>
<body>
</body>
</html>
EOD;
?>
then dump out the value of $str htmlencoded:
<div style="white-space: pre">
<?php echo htmlentities($str); ?>
</div>

There are a few ways to escape everything in HTML, none of them nice.
Or you could put in an iframe that loads a plain old text file.

Actually there is a way to do this. It has limitation (one), but is 100% standard, not deprecated (like xmp), and works.
And it's trivial. Here it is:
<div id="mydoc-src" style="display: none;">
LlNnlljn77fggggkk77csJJK8bbJBKJBkjjjjbbbJJLJLLJo
<!--
YOUR CODE HERE.
<script src="WidgetsLib/all.js"></script>
^^ This is a text, no side effects trying to load it.
-->
LlNnlljn77fggggkk77csJJK8bbJBKJBkjjjjbbbJJLJLLJo
</div>
Please let me explain. First of all, ordinary HTML comment does the job, to prevent whole block be interpreted. You can easily add in it any tags, all of them will be ignored. Ignored from interpretation, but still available via innerHTML! So what is left, is to get the contents, and filter the preceding and trailing comment tokens.
Except (remember - the limitation) you can't put there HTML comments inside, since (at least in my Chrome) nesting of them is not supported, and very first '-->' will end the show.
Well, it is a nasty little limitation, but in certain cases it's not a problem at all, if your text is free of HTML comments. And it's easier to escape one construct than a whole bunch of them.
Now, what is that weird LlNnlljn77fggggkk77csJJK8bbJBKJBkjjjjbbbJJLJLLJo string? It's a random string, like a hash, unlikely to be used in the block, and used for? Here's the context, why I have used it. In my case, I took the contents of one DIV, then processed it with Showdown markdown, and then the output assigned into another div. The idea was, to write markdown inline in the HTML file, and just open in a browser and it would transform on the load on-the-fly. So, in my case, <!-- became transformed to <p><!--</p>, the comment properly escaped. It worked, but polluted the screen. So, to easily remove it with regex, the random string was used. Here's the code:
var converter = new showdown.Converter();
converter.setOption('simplifiedAutoLink', true);
converter.setOption('tables', true);
converter.setOption('tasklists', true);
var src = document.getElementById("mydoc-src");
var res = document.getElementById("mydoc-res");
res.innerHTML = converter.makeHtml(src.innerHTML)
.replace(/<p>.{0,10}LlNnlljn77fggggkk77csJJK8bbJBKJBkjjjjbbbJJLJLLJo.{0,10}<\/p>/g, "");
src.innerHTML = '';
And it works.
If somebody is interested, this article is written using this technique. Feel free to download, and look inside the HTML file.
It depends what you are using it for. Is it user input? Then use <textarea>, and escape everything. In my case, and probably it's your case too, I simply used comments, and it does the job.
If you don't use markdown, and just want to get it as is from a tag, then it's even simpler:
<div id="mydoc-src" style="display: none;">
<!--
YOUR CODE HERE.
<script src="WidgetsLib/all.js"></script>
^^ This is a text, no side effects trying to load it.
-->
</div>
and JavaScript code to get it:
var src = document.getElementById("mydoc-src");
var YOUR_CODE = src.innerHTML.replace(/(<!--|-->)/g, "");
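A slightly stricter variant of that last step, sketched as a pure function so it can be tried without a page. Instead of deleting every comment delimiter, it returns only the text between the first <!-- and the first --> after it. The function name is hypothetical.

```javascript
// Hypothetical helper: takes the innerHTML of the hidden div and returns the
// text inside its first HTML comment, or null when there is no complete comment.
function extractCommentedSource(innerHtml) {
  const open = innerHtml.indexOf("<!--");
  if (open === -1) return null;
  const close = innerHtml.indexOf("-->", open + 4);
  if (close === -1) return null;
  return innerHtml.slice(open + 4, close);
}

const hidden = '\n<!--\n<script src="WidgetsLib/all.js"></script>\n-->\n';
console.log(extractCommentedSource(hidden));
```

The limitation described above still applies: a --> inside the code ends the comment early.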

This is how I did it:
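$str = file_get_contents("my-code-file.php");
echo "<textarea disabled='true' style='border: none;background-color:white;'>";
// Escape the source so a literal </textarea> or an entity in the file cannot break the output
echo htmlspecialchars($str);
echo "</textarea>";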

It may not work in every situation, but placing code snippets inside of a textarea will display them as code.
You can style the textarea with CSS if you don't want it to look like an actual textarea.

If you are looking for a solution that works with frameworks.
const code = `
<div>
this will work in react
</div>
`
<pre>
<code>{code}</code>
</pre>
And you can give it a nice look with css:
pre code {
background-color: #eee;
border: 1px solid #999;
display: block;
padding: 20px;
}

JavaScript template literals can be used to write the HTML across multiple lines. Obviously, JavaScript, ES6 (ECMAScript 2015) in particular, is required for this solution.
.createTextNode paired with CSS white-space: pre-wrap; does the trick.
.innerText alone also works. Run code snippet below.
let codeBlock = `
<!DOCTYPE HTML>
<html lang="en">
<head>
<title>My Page</title>
</head>
<body>
<h1>Welcome to my page</h1>
<p>I like cars and lorries and have a big Jeep!</p>
<h2>Where I live</h2>
<p>I live in a small hut on a mountain!</p>
</body>
</html>
`
const codeElement = document.querySelector("#a");
let textNode = document.createTextNode(codeBlock);
codeElement.appendChild(textNode);
const divElement = document.querySelector("#b");
divElement.innerText = codeBlock;
#a {
white-space: pre-wrap;
}
<div id=a>
</div>
<div id=b>
</div>

//To show xml tags in table columns you will have to encode the tags first
function htmlEncode(value) {
//create an in-memory div, set its inner text (which jQuery automatically encodes)
//then grab the encoded contents back out. The div never exists on the page.
return $('<div/>').text(value).html();
}
html = htmlEncode(html)

A combination of a couple answers that work together here:
function c(s) {
// Escape & first so the entities produced for < and > are not escaped again.
return s.split("&").join("&amp;").split("<").join("&lt;").split(">").join("&gt;")
}
displayMe.innerHTML = ok.innerHTML;
console.log(
c(ok.innerHTML)
)
<textarea style="display:none" id="ok">
<script>
console.log("hello", 5&9);
</script>
</textarea>
<div id="displayMe">
</div>

I used this a long time ago and it did the trick for me, I hope it helps you too.
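var preTag = document.querySelectorAll('pre');
// querySelectorAll returns a NodeList, so log its size rather than a nonexistent innerHTML
console.log(preTag.length);
for (var i = 0; i < preTag.length; i++) {
var pattern = preTag[i].innerHTML;
// Turn the angle brackets into entities so the markup is shown instead of rendered
pattern = pattern.replace(/</g, "&lt;").replace(/>/g, "&gt;");
console.log(pattern);
preTag[i].innerHTML = pattern;
}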
<pre>
<p>example</p>
<span>more text</span>
</pre>

You can separate the tags by changing them to spans.
Like this:
<span><</span> <!-- opening bracket of h1 here -->
<span>h1></span> <!-- opening tag of h1 completed here -->
<span>hello</span> <!-- text to print -->
<span><</span> <!-- closing h1 tag's bracket here -->
<span>/h1></span> <!-- closing h1 tag ends here -->
Alternatively, you can wrap only the < (opening angle bracket) in spans:
<span><</span> <!-- opening bracket of h1 here -->
h1> <!-- opening tag of h1 completed here -->
hello <!-- text to print -->
<span><</span> <!-- closing h1 tag's bracket here -->
/h1><!-- closing h1 tag ends here -->

<code><?php $str='<';echo htmlentities($str);?></code>
I found this to be the easiest, fastest and most compact.

Related

Is it ok to write almost any unescaped text inside textarea and get it from JS with its value property?

I've noticed that inside HTML textarea element I can set almost any default value without escaping and access it from JavaScript through that element node's value property like below:
HTML:
<textarea id="txt">
<h1>A h1 heading</h2>
<!-- this is comment -->
&lt;textarea>This is a textarea&lt;/textarea>
</textarea>
JS:
console.log(txt.value)
JS Output:
<h1>A h1 heading</h2>
<!-- this is comment -->
<textarea>This is a textarea</textarea>
To print the nested textarea tag I only needed to escape that <. If escape all the angle brackets then it also works. But that is too much work.
My question: is this the standard behavior? I've found no articles or docs about it. I looked into the spec, but it seemed too complex. Is it safe to build an app assuming it will work like this in all browsers? Can you please give a reference to the part of the spec that explains this?

How to automate selecting certain codes in an html?

Hi, I have a question about automating the selection of certain content in an HTML file. If we save a webpage as HTML only, we get the HTML code along with stylesheets and JavaScript code. However, I only want to extract the HTML between <div class='post-content' itemprop='articleBody'> and </div> and then create a new HTML file that contains the extracted HTML. Is there a way to do it? Example code is below:
<html>
<script src='.....'>
</script>
<style>
...
</style>
<div class='header-outer'>
<div class='header-title'>
<div class='post-content' itemprop='articleBody'>
<p>content we want</p>
</div>
</div></div>
<div class='footer'>
</div>
</html>
While I'm typing, I'm thinking about javascript, which seems to be able to manipulate HTML DOM elements..Is Ruby able to do that? Can I generate a new clean html that only contains content between <div class='post-content' itemprop='articleBody'>and</div> by using javascript or Ruby? However, as for how to write the actual code, I don't have a clue.
So anybody has any idea about it? Thank you so much!
I'm not quite sure what you're asking, but I'll take a crack at it.
Can Ruby modify the DOM on a webpage?
Short answer, no. Browsers don't know how to run Ruby. They do know how to run JavaScript, so that's what is usually used for real-time DOM manipulation.
Can I generate a new clean html
Yes? At the end of the day, HTML is just a specifically formatted string. If you want to download the source from that page and find everything in the <div class='post-content' itemprop='articleBody'> tag, there are a couple of ways to go about that. The best is probably the nokogiri gem, which is a ruby HTML parser. You'll be able to feed it a string (from a file or otherwise) that represents the old page and strip out what you want. Doing that would look something like this:
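require 'nokogiri'
require 'open-uri' # provides URI.open for fetching the page over HTTP
page = Nokogiri::HTML(URI.open("https://googleblog.blogspot.com"))
# takes the first element with class "post-content" and returns its text
# (use .inner_html instead of .text to keep the markup)
text = page.css('.post-content')[0].text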
I believe that gives you the text you're looking for. More detailed nokogiri instructions can be found here.
You want to use a regular expression. For example:
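//[\s\S]*? matches across line breaks, so no "m" flag is needed.
//innerHTML serializes attribute values with double quotes, so accept either quote style.
var regEx = /<div class=['"]post-content['"] itemprop=['"]articleBody['"]>([\s\S]*?)<\/div>/;
//The content (you'll put the JavaScript at the bottom of the page)
var bodyCode = document.body.innerHTML;
var match = bodyCode.match( regEx );
//Prints to the console
console.dir( match );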
You can see this in action here: https://regex101.com/r/kJ5kW6/1
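One caveat with a non-greedy regex: it stops at the first </div>, so it cuts the content short when the target div contains other divs. Below is a rough sketch that counts opening and closing div tags instead. The function name and the sample page are made up, and it is still string matching rather than real parsing, so a proper HTML parser (DOMParser in the browser, or Nokogiri as suggested above) remains the more robust choice.

```javascript
// Hypothetical helper: finds the opening tag with a non-global regex, then
// walks the following <div ...> and </div> tags until the nesting depth is 0.
function extractDivContent(html, openTagPattern) {
  const open = openTagPattern.exec(html);
  if (!open) return null;
  const start = open.index + open[0].length;
  const tagPattern = /<div\b[^>]*>|<\/div\s*>/gi;
  tagPattern.lastIndex = start; // begin scanning right after the opening tag
  let depth = 1;
  let tag;
  while ((tag = tagPattern.exec(html)) !== null) {
    depth += tag[0][1] === "/" ? -1 : 1; // closing tags start with "</"
    if (depth === 0) return html.slice(start, tag.index);
  }
  return null; // unbalanced markup
}

const page =
  "<div class='header-outer'><div class='post-content' itemprop='articleBody'>" +
  "<div>inner</div><p>content we want</p></div></div>";
console.log(extractDivContent(page, /<div class=['"]post-content['"][^>]*>/));
```

It will still be fooled by div tags inside comments, scripts or attribute values.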

How to show JavaScript code on a web page?

How to show the following Javascript code, but not run it?
<script>alert("test");</script>
I tried to show this code on my page, but when user loads the page it alerts "test".
How can I show the code but not run it on page load?
Use &lt; for < and &gt; for >
<code>
&lt;script&gt;alert("test");&lt;/script&gt;
</code>
<pre><code>&lt;script&gt;alert('test')&lt;/script&gt;
</code></pre>
OR
Using Jquery you can simply add text on particular selector using .text()
Jquery example:
$(document).ready(function () {
$('.showText').text("<script>alert('text')<"+"/script>");
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.0/jquery.min.js"></script>
<span class="showText"></span>
<!-- Sadly (totally)... this is obsolete :( -->
<xmp>
<script>alert("test");</script>
</xmp>
http://www.w3.org/TR/REC-html32#xmp
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/plaintext
https://code.google.com/p/doctype-mirror/wiki/PlaintextElement
http://www.blooberry.com/indexdot/html/tagpages/x/xmp.htm
http://www.blooberry.com/indexdot/html/tagpages/p/plaintext.htm
;) Great stuff, but sadly you're at the browser's mercy since <plaintext> and <xmp> are obsolete,
so
escaping the < and > characters with &lt; and &gt; (or with hex &#x3C;, &#x3E;)
is a far better idea.
&lt;script&gt;alert("test");&lt;/script&gt;
You can wrap the above for semantic reasons in tags like <pre> or <code> although it's not needed to achieve the desired.
So the point is that we're told over and over to use <pre>, yet we still lack a proper, valid tag to represent <, & and > literally without escaping.

How to keep line breaks in CKEditor WYSIWYG editor

I have an HTML code as follows
<div class="editable" ...>
<div class="code">
This is an example.
This is a new line
</div>
</div>
In CSS, code has the "white-space: pre" property, so that the text in the inner DIV shows as two lines. I use CKEditor with the DIV replacement method to edit it. However, it becomes
<div class="code">
This is an example.This is a new line
</div>
The text inside the HTML tag becomes one long line; leading and trailing spaces and newlines are stripped. So in CKEditor, although I have specified config.contentsCss, it still shows one line because CKEditor has merged those two lines into one (I checked this with Chrome's "Inspect Element" in CKEditor's iframe editor). Therefore, when I view the source code or the saved HTML, the two-line format is not preserved.
I've googled and tried the CKEditor HTML writer and addRules to control the indentation and newlines around opening and closing tags; however, those seem to work on HTML tags, not on the document text. Is there any other method to preserve line breaks in text?
I found it.
// preserve newlines in source
config.protectedSource.push( /\n/g );
http://docs.ckeditor.com/#!/api/CKEDITOR.config-cfg-protectedSource
$(document).on('paste', 'textarea', function (e) {
$(e.target).keyup(function (e) {
var inputText = $(e.target).val();
$(e.target).val(inputText.replace(/\n/g, '<br />'))
.unbind('keyup');
});
});
Use the <pre> HTML tag. Like this:
<div class="editable" ...>
<div class="code"><pre>
This is an example in a "pre".
This is a new line
</pre></div>
</div>
<div class="editable" ...>
<div class="code">
This is an example NOT in a "pre".
Therefore this is NOT a new line
</div>
</div>
Or you can put a <br/> tag in between your lines. It's the same as hitting enter.
In my particular case, it was an extra tag, univis, that I needed to give similar semantics (i.e., leave indentation and linebreaks alone), and what we ended up doing was:
CKEDITOR.dtd.$block.univis=1;
CKEDITOR.dtd.$cdata.univis=1;
CKEDITOR.dtd.univis=CKEDITOR.dtd.script;
But that looks like it might or might not be extensible to classes.
I got some Craft sites running and I don't want to paste the config file everywhere. For everyone else still having the problem: Just use redactor. Install and replace the field type. Correct everything once and you're done.

jQuery: Parse/Manipulate HTML without executing scripts

I'm loading some HTML via Ajax with this format:
<div id="div1">
... some content ...
</div>
<div id="div2">
...some content...
</div>
... etc.
I need to iterate over each div in the response and handle it separately. Having a separate string for the HTML content of each div mapped to the id would satisfy my requirements. However, the divs may contain script tags, which I need to preserve but not execute (they'll execute later when I stick the HTML into the document, so executing during parsing would be bad). My first thought was to do something like this:
// data being the result from $.get
var clean = data.replace(/<script[\s\S]*?<\/script>/g,function() {
// insert some unique token, save the tag, put it back while I'm processing
});
$('<div/>').html(clean).children().each( /* ... process here ... */);
But I worry that some stupid dev is going to come along and put something like this in one of the divs:
<script> var foo = '</script>'; // ... </script>
Which would screw it all up. Not to mention, the whole thing feels like a hack to begin with. Does anyone know a better way?
EDIT: Here's the solution I've come up with:
var divSplitRegex = /(?:^|<\/div>)\s*<div\s+id="prefix-(.+?)">/g,
idReplacement = preDelimeter+'$1'+postDelimeter;
var r = data.replace(/<\/div>\s*$/,'').
replace(divSplitRegex,idReplacement).split(preDelimeter);
$.each(r,function() {
var content;
if(this) {
callback.apply(null,this.split(postDelimeter));
}
});
Where preDelimeter and postDelimeter are just unique strings like "###I'd have to be an idiot to embed this string in my content unescaped because it would break everything###", and callback is a function expecting the div id and the div content. This only works because I know that the divs will have only an id attribute, and the id will have a special prefix. I suppose someone could put a div in their content with an id having the same prefix and it would screw things up too.
So, I still don't love this solution. Anyone have a better one?
FYI, using an unescaped </script> in any JavaScript script causes this issue in a browser. Developers have to escape it anyway, so there is no excuse. So you can "trust" that it would break in any case.
<body>
<div>
<script>
alert('<script> tags </script> are not '+
'valid in regular old HTML without being escaped.');
</script>
</body>
See
http://jsbin.com/itevu
to see it break. :)
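For reference, a minimal sketch of how a string containing a closing script tag can be written so that the HTML parser never sees it. The helper name toInlineScriptLiteral is made up; the idea is simply to emit every < as the JavaScript escape \u003c.

```javascript
// Hypothetical helper: turns a value into a JavaScript literal that is safe to
// place inside an inline <script> block, because it contains no "<" at all.
function toInlineScriptLiteral(value) {
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

const literal = toInlineScriptLiteral("var foo = '</script>';");
console.log(literal);             // the text to write into the page
console.log(JSON.parse(literal)); // evaluates back to the original string
```

When writing the code by hand, splitting or escaping the tag, as in '<\/script>', achieves the same thing.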
In some cases removing script tags results in invalid html:
<html>
<head>
</head>
<body>
<p>This should be
<script type="text/javascript">
document.writeln("<b");
</script>>bolded</b>.
</body>
</html>
