How can I show HTML snippets on a webpage without needing to replace each < with &lt; and > with &gt;?
In other words, is there a tag for don't render HTML until you hit the closing tag?
The tried and true method for HTML:
Replace the & character with &amp;
Replace the < character with &lt;
Replace the > character with &gt;
Optionally surround your HTML sample with <pre> and/or <code> tags.
sample 1:
<pre>
This text has
been formatted using
the HTML pre tag. The browser should
display all white space
as it was entered.
</pre>
sample 2:
<pre>
<code>
My pre-formatted code
here.
</code>
</pre>
sample 3:
(If you are actually "quoting" a block of code, then the markup would be)
<blockquote>
<pre>
<code>
My pre-formatted "quoted" code here.
</code>
</pre>
</blockquote>
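The three replacements above can also be done by a small helper rather than by hand. This is only a minimal sketch: the names escapeHtml and toCodeBlock are made up for illustration and are not part of any library.

```javascript
// Escape the characters that are significant in HTML text content.
// "&" is handled first; otherwise the "&" in "&lt;" would be escaped again.
function escapeHtml(source) {
  return source
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap the escaped snippet in <pre><code> so whitespace is preserved.
function toCodeBlock(source) {
  return "<pre><code>" + escapeHtml(source) + "</code></pre>";
}

console.log(toCodeBlock("<p>Tom & Jerry</p>"));
```

The output string can be pasted into a static page or written out by whatever generates the page.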
is there a tag for don't render HTML until you hit the closing tag?
No, there is not. In HTML proper, there’s no way short of escaping some characters:
& as &amp;
< as &lt;
(Incidentally, there is no need to escape > but people often do it for reasons of symmetry.)
And of course you should surround the resulting, escaped HTML code within <pre><code>…</code></pre> to (a) preserve whitespace and line breaks, and (b) mark it up as a code element.
All other solutions, such as wrapping your code into a <textarea> or the (deprecated) <xmp> element, will break.1
XHTML that is declared to the browser as XML (via the HTTP Content-Type header! — merely setting a DOCTYPE is not enough) could alternatively use a CDATA section:
<![CDATA[Your <code> here]]>
But this only works in XML, not in HTML, and even this isn’t a foolproof solution, since the code mustn’t contain the closing delimiter ]]>. So even in XML the simplest, most robust solution is via escaping.
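For the XML case, a common workaround for the ]]> restriction is to end the CDATA section in the middle of the delimiter and immediately start a new one. A minimal sketch, with toCdata being a hypothetical helper name:

```javascript
// Wrap text in CDATA, splitting every "]]>" across two sections so the
// closing delimiter never appears intact inside a single section.
function toCdata(text) {
  return "<![CDATA[" + text.split("]]>").join("]]]]><![CDATA[>") + "]]>";
}

console.log(toCdata("Your <code> here"));
console.log(toCdata("a]]>b"));
```

An XML parser concatenates the adjacent sections back into the original text, but as the answer says, plain escaping is usually simpler.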
1 Case in point:
textarea {border: none; width: 100%;}
<textarea readonly="readonly">
<p>Computer <textarea>says</textarea> <span>no.</span>
</textarea>
<xmp>
Computer <xmp>says</xmp> <span>no.</span>
</xmp>
A somewhat naive way to display code is to put it in a textarea and add the disabled attribute so it's not editable.
<textarea disabled> code </textarea>
Hope that helps someone looking for an easy way to get stuff done.
But warning, this won't escape the tags for you, as you can see here (the following obviously does not work):
<textarea disabled>
This is the code to create a textarea:
<textarea></textarea>
</textarea>
Deprecated, but works in FF3 and IE8.
<xmp>
<b>bold</b><ul><li>list item</li></ul>
</xmp>
Recommended:
<pre><code>
code here, escape it yourself.
</code></pre>
i used <xmp> just like this :
http://jsfiddle.net/barnameha/hF985/1/
The deprecated <xmp> tag essentially does that but is no longer part of the XHTML spec. It should still work though in all current browsers.
Here's another idea, a hack/parlor trick, you could put the code in a textarea like so:
<textarea disabled="true" style="border: none;background-color:white;">
<p>test</p>
</textarea>
Putting angle brackets and code like this inside a text area is invalid HTML and will cause undefined behavior in different browsers. In Internet Explorer the HTML is interpreted, whereas Mozilla, Chrome and Safari leave it uninterpreted.
If you want it to be non-editable and look different then you could easily style it using CSS. The only issue would be that browsers will add that little drag handle in the bottom-right corner to resize the box. Or alternatively, try using an input tag instead.
The right way to inject code into your textarea is to use server side language like this PHP for example:
<textarea disabled="true" style="border: none;background-color:white;">
<?php echo '<p>test</p>'; ?>
</textarea>
Then it bypasses the html interpreter and puts uninterpreted text into the textarea consistently across all browsers.
Other than that, the only way is really to escape the code yourself if static HTML or using server-side methods such as .NET's HtmlEncode() if using such technology.
If your goal is to show a chunk of code that you're executing elsewhere on the same page, you can use textContent (it's pure-js and well supported: http://caniuse.com/#feat=textcontent)
<div id="myCode">
<p>
hello world
</p>
</div>
<div id="loadHere"></div>
document.getElementById("loadHere").textContent = document.getElementById("myCode").innerHTML;
To get multi-line formatting in the result, you need to set css style "white-space: pre;" on the target div, and write the lines individually using "\r\n" at the end of each.
Here's a demo: https://jsfiddle.net/wphps3od/
This method has an advantage over using textarea: code won't be reformatted as it would be in a textarea. (Things like &nbsp; are removed entirely in a textarea.)
In HTML? No.
In XML/XHTML? You could use a CDATA block.
I assume:
you want to write 100% valid HTML5
you want to place the code snippet (almost) literal in the HTML
especially < should not need escaping
All your options are in this tree:
with HTML syntax
there are five kinds of elements
those called "normal elements" (like <p>)
can't have a literal <
it would be considered the start of the next tag or comment
void elements
they have no content
you could put your HTML in a data attribute (but this is true for all elements)
that would need JavaScript to move the data elsewhere
in double-quoted attributes, " and &thing; need escaping: &quot; and &amp;thing; respectively
raw text elements
<script> and <style> only
they are never rendered visible
but embedding your text in Javascript might be feasible
Javascript allows for multi-line strings with backticks
it could then be inserted dynamically
a literal </script is not allowed anywhere in <script>
escapable raw text elements
<textarea> and <title> only
<textarea> is a good candidate to wrap code in
it is totally legal to write </html> in there
not legal is the substring </textarea for obvious reasons
escape this special case with &lt;/textarea or similar
&thing; needs escaping: &amp;thing;
foreign elements
elements from MathML and SVG namespaces
at least SVG allows embedding of HTML again...
and CDATA is allowed there, so it seems to have potential
with XML syntax
covered by Konrad's answer
Note: > never needs escaping. Not even in normal elements.
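The escaping rules from the tree for <textarea> content and for double-quoted attributes could be sketched as two small helpers. The function names are invented for illustration, and the code assumes the rules are exactly as listed above.

```javascript
// Escape text for use inside <textarea> (an escapable raw text element).
// Only "&" and the substring "</textarea" (in any letter case) are touched.
function escapeForTextarea(source) {
  return source
    .replace(/&/g, "&amp;")
    .replace(/<(\/textarea)/gi, "&lt;$1");
}

// Escape text for use inside a double-quoted attribute value,
// for example data-code="...".
function escapeForAttribute(source) {
  return source
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;");
}

console.log(escapeForTextarea("<p>&nbsp;</p> </textarea>"));
console.log(escapeForAttribute('<a href="x">Tom & Jerry</a>'));
```

Note that ordinary tags such as <p> pass through unchanged in both cases, which is the point of choosing these contexts.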
It's very simple ....
Use this xmp code
<xmp id="container">
<xmp >
<p>a paragraph</p>
</xmp >
</xmp>
<textarea ><?php echo htmlentities($page_html); ?></textarea>
works fine for me..
"keeping in mind Alexander's suggestion, here is why I think this is a good approach"
if we just try plain <textarea> it may not always work since there may be closing textarea tags which may wrongly close the parent tag and display rest of the HTML source on the parent document, which would look awkward.
using htmlentities converts all applicable characters such as < > to HTML entities which eliminates any possibility of leaks.
There may be benefits or shortcomings to this approach or a better way of achieving the same results, if so please comment as I would love to learn from them :)
This is a simple trick and I have tried it in Safari and Firefox
<code>
<span><</span>meta property="og:title" content="A very fine cuisine" /><br>
<span><</span>meta property="og:image" content="http://www.example.com/image.png" />
</code>
It will show like this:
You can see it live Here
You could try:
Hello! Here is some code:
<xmp>
<div id="hello">
</div>
</xmp>
This is a bit of a hack, but we can use something like:
body script {
display: block;
font-family: monospace;
white-space: pre;
}
<script type="text/html">
<h1>Hello World</h1>
<ul>
<li>Enjoy this dodgy hack,
<li>or don't!
</ul>
</script>
With that CSS, the browser will display scripts inside the body. It won’t attempt to execute this script, as it has an unknown type text/html. It’s not necessary to escape special characters inside a <script>, unless you want to include a closing </script> tag.
I’m using something like this to display executable JavaScript in the body of the page, for a sort of "literate programming".
There’s some more info in this question When should tags be visible and why can they?.
function escapeHTML(string)
{
var pre = document.createElement('pre');
var text = document.createTextNode(string);
pre.appendChild(text);
return pre.innerHTML;
}//end escapeHTML
it will return the escaped Html
Ultimately the best (though annoying) answer is "escape the text".
There are however a lot of text editors -- or even stand-alone mini utilities -- that can do this automatically. So you never should have to escape it manually if you don't want to (Unless it's a mix of escaped and un-escaped code...)
Quick Google search shows me this one, for example: http://malektips.com/zzee-text-utility-html-escape-regular-expression.html
This is by far the best method for most situations:
<pre><code>
code here, escape it yourself.
</code></pre>
I would have up voted the first person who suggested it but I don't have reputation. I felt compelled to say something though for the sake of people trying to find answers on the Internet.
You could use a server side language like PHP to insert raw text:
<?php
$str = <<<EOD
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="description" content="Minimal HTML5">
<meta name="keywords" content="HTML5,Minimal">
<title>This is the title</title>
<link rel='stylesheet.css' href='style.css'>
</head>
<body>
</body>
</html>
EOD;
?>
then dump out the value of $str htmlencoded:
<div style="white-space: pre">
<?php echo htmlentities($str); ?>
</div>
There are a few ways to escape everything in HTML, none of them nice.
Or you could put in an iframe that loads a plain old text file.
Actually there is a way to do this. It has limitation (one), but is 100% standard, not deprecated (like xmp), and works.
And it's trivial. Here it is:
<div id="mydoc-src" style="display: none;">
LlNnlljn77fggggkk77csJJK8bbJBKJBkjjjjbbbJJLJLLJo
<!--
YOUR CODE HERE.
<script src="WidgetsLib/all.js"></script>
^^ This is a text, no side effects trying to load it.
-->
LlNnlljn77fggggkk77csJJK8bbJBKJBkjjjjbbbJJLJLLJo
</div>
Please let me explain. First of all, an ordinary HTML comment does the job of preventing the whole block from being interpreted. You can put any tags in it, and all of them will be ignored. Ignored for interpretation, but still available via innerHTML! So what is left is to get the contents and filter out the leading and trailing comment tokens.
Except (remember the limitation) you can't put HTML comments inside, since (at least in my Chrome) nesting them is not supported, and the very first '-->' will end the show.
Well, it is a nasty little limitation, but in certain cases it's not a problem at all, if your text is free of HTML comments. And it's easier to escape one construct than a whole bunch of them.
Now, what is that weird LlNnlljn77fggggkk77csJJK8bbJBKJBkjjjjbbbJJLJLLJo string? It's a random string, like a hash, unlikely to be used in the block, and used for? Here's the context, why I have used it. In my case, I took the contents of one DIV, then processed it with Showdown markdown, and then the output assigned into another div. The idea was, to write markdown inline in the HTML file, and just open in a browser and it would transform on the load on-the-fly. So, in my case, <!-- became transformed to <p><!--</p>, the comment properly escaped. It worked, but polluted the screen. So, to easily remove it with regex, the random string was used. Here's the code:
var converter = new showdown.Converter();
converter.setOption('simplifiedAutoLink', true);
converter.setOption('tables', true);
converter.setOption('tasklists', true);
var src = document.getElementById("mydoc-src");
var res = document.getElementById("mydoc-res");
res.innerHTML = converter.makeHtml(src.innerHTML)
.replace(/<p>.{0,10}LlNnlljn77fggggkk77csJJK8bbJBKJBkjjjjbbbJJLJLLJo.{0,10}<\/p>/g, "");
src.innerHTML = '';
And it works.
If somebody is interested, this article is written using this technique. Feel free to download, and look inside the HTML file.
It depends what you are using it for. Is it user input? Then use <textarea>, and escape everything. In my case, and probably it's your case too, I simply used comments, and it does the job.
If you don't use markdown, and just want to get it as is from a tag, then it's even simpler:
<div id="mydoc-src" style="display: none;">
<!--
YOUR CODE HERE.
<script src="WidgetsLib/all.js"></script>
^^ This is a text, no side effects trying to load it.
-->
</div>
and JavaScript code to get it:
var src = document.getElementById("mydoc-src");
var YOUR_CODE = src.innerHTML.replace(/(<!--|-->)/g, "");
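The regex above removes every <!-- and --> token it finds, so a literal <!-- inside the payload would be lost too. A slightly more careful variant strips only the outermost pair. This is a sketch; unwrapComment is a hypothetical name, and it assumes the element contains exactly one wrapping comment.

```javascript
// Remove only the first "<!--" and the last "-->" (plus surrounding
// whitespace), leaving any other text in the payload untouched.
function unwrapComment(innerHtml) {
  const start = innerHtml.indexOf("<!--");
  const end = innerHtml.lastIndexOf("-->");
  if (start === -1 || end === -1 || end < start + 4) {
    return innerHtml; // no complete comment found; return input unchanged
  }
  return innerHtml.slice(start + 4, end).trim();
}

console.log(unwrapComment("\n<!--\nYOUR CODE HERE.\n-->\n"));
```

In the page itself it would be called as unwrapComment(src.innerHTML).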
This is how I did it:
$str = file_get_contents("my-code-file.php");
echo "<textarea disabled='true' style='border: none;background-color:white;'>";
echo $str;
echo "</textarea>";
It may not work in every situation, but placing code snippets inside of a textarea will display them as code.
You can style the textarea with CSS if you don't want it to look like an actual textarea.
If you are looking for a solution that works with frameworks.
const code = `
<div>
this will work in react
</div>
`
<pre>
<code>{code}</code>
</pre>
And you can give it a nice look with css:
pre code {
background-color: #eee;
border: 1px solid #999;
display: block;
padding: 20px;
}
JavaScript template literals (backtick strings) can be used to write the HTML across multiple lines. Obviously JavaScript, and ES6 (ECMAScript 2015) in particular, is required for this solution.
.createTextNode paired with CSS white-space: pre-wrap; does the trick.
.innerText alone also works. Run code snippet below.
let codeBlock = `
<!DOCTYPE HTML>
<html lang="en">
<head>
<title>My Page</title>
</head>
<body>
<h1>Welcome to my page</h1>
<p>I like cars and lorries and have a big Jeep!</p>
<h2>Where I live</h2>
<p>I live in a small hut on a mountain!</p>
</body>
</html>
`
const codeElement = document.querySelector("#a");
let textNode = document.createTextNode(codeBlock);
codeElement.appendChild(textNode);
const divElement = document.querySelector("#b");
divElement.innerText = codeBlock;
#a {
white-space: pre-wrap;
}
<div id=a>
</div>
<div id=b>
</div>
//To show xml tags in table columns you will have to encode the tags first
function htmlEncode(value) {
//create an in-memory div, set its inner text (which jQuery automatically encodes)
//then grab the encoded contents back out. The div never exists on the page.
return $('<div/>').text(value).html();
}
html = htmlEncode(html)
A combination of a couple answers that work together here:
function c(s) {
return s.split("&").join("&amp;").split("<").join("&lt;").split(">").join("&gt;")
}
displayMe.innerHTML = ok.innerHTML;
console.log(
c(ok.innerHTML)
)
<textarea style="display:none" id="ok">
<script>
console.log("hello", 5&9);
</script>
</textarea>
<div id="displayMe">
</div>
I used this a long time ago and it did the trick for me, I hope it helps you too.
var preTag = document.querySelectorAll('pre');
console.log(preTag.length); // preTag is a NodeList, which has no innerHTML of its own
for (var i = 0; i < preTag.length; i++) {
var pattern = preTag[i].innerHTML;
pattern = pattern.replace(/</g, "&lt;").replace(/>/g, "&gt;");
console.log(pattern);
preTag[i].innerHTML = pattern;
}
<pre>
<p>example</p>
<span>more text</span>
</pre>
You can separate the tags by changing them to spans.
Like this:
<span><</span> <!-- opening bracket of h1 here -->
<span>h1></span> <!-- opening tag of h1 completed here -->
<span>hello</span> <!-- text to print -->
<span><</span> <!-- closing h1 tag's bracket here -->
<span>/h1></span> <!-- closing h1 tag ends here -->
And also, you can just only add the <(opening angle bracket) to the spans
<span><</span> <!-- opening bracket of h1 here -->
h1> <!-- opening tag of h1 completed here -->
hello <!-- text to print -->
<span><</span> <!-- closing h1 tag's bracket here -->
/h1><!-- closing h1 tag ends here -->
<code><?php $str='<';echo htmlentities($str);?></code>
I found this to be the easiest, fastest and most compact.
I've downloaded a Markdown JS library, but I don't know whether it supports syntax highlighting in either of its two supported dialects (gruber/maruku), because it's the first time I've tried to add Markdown support to my webpages. So I would like to know how to integrate a syntax highlighter (like Alex Gorbatchev's JS library) with Markdown.
Any other libraries are welcome. Basically, my Markdown snippets are in .md files loaded that way:
<div class="markdown-f">
<?= file_get_contents("file.md"); ?>
</div>
and it contains code snippets together with common Markdown text. I need a JS library to be able of doing something like:
<script>
$('.markdown-f').each(function() {
var contents = $(this).text();
$(this).empty();
contents = markdown.toHTML(contents);
$(this).text(contents);
});
</script>
with a dialect or any other hacktrick supporting syntax highlighting (specifying manually the target language for example).
I've used that markdown parser on my website to display the README files of repos I've created. It wraps code blocks in <pre><code> code goes here </code></pre>
It does not use highlighting, but you could then use the other library you mentioned after calling:
$("code").addClass("brush: js") // assuming you want to highlight javascript
Finally I used highlightjs.
The #A.OzanEkici solution has the (small) downside that I lose the Markdown highlighting in my text editor (Emacs's markdown-mode), since the contents inside the <pre> tag must be un-indented so that the indentation doesn't show in the rendered page. The #JaredBeach solution doesn't work either, because Alex Gorbatchev's library only works on <pre> tags, not on <pre><code> tags, which is what the Markdown syntax is replaced with.
So, my solution was simply:
<script>
$('.markdown-f').each(function(){
$(this).html(markdown.toHTML($(this).text()));
});
hljs.initHighlightingOnLoad();
</script>
And that has the advantage that the language is automatically detected.
I use Alex Gorbatchev's JS library to do this and it works great.
First you should create a <pre> element like this;
<pre class="brush: __yourFileType__"> + data + </pre>
data refers to your contents and __yourFileType__ can be one of the supported brush names.
Ex: class="brush: xml" , class="brush: txt"
After that you just simply call it;
SyntaxHighlighter.highlight();
Hi, I have a question about automating the selection of certain content in an HTML file. If we save a webpage as HTML only, we get the HTML code along with stylesheets and JavaScript code. However, I only want to extract the HTML between <div class='post-content' itemprop='articleBody'> and </div>, and then create a new HTML file that contains the extracted HTML. Is there a way to do it? Example code is below:
<html>
<script src='.....'>
</script>
<style>
...
</style>
<div class='header-outer'>
<div class='header-title'>
<div class='post-content' itemprop='articleBody'>
<p>content we want</p>
</div>
</div></div>
<div class='footer'>
</div>
</html>
While I'm typing, I'm thinking about javascript, which seems to be able to manipulate HTML DOM elements..Is Ruby able to do that? Can I generate a new clean html that only contains content between <div class='post-content' itemprop='articleBody'>and</div> by using javascript or Ruby? However, as for how to write the actual code, I don't have a clue.
So anybody has any idea about it? Thank you so much!
I'm not quite sure what you're asking, but I'll take a crack at it.
Can Ruby modify the DOM on a webpage?
Short answer, no. Browsers don't know how to run Ruby. They do know how to run javascript, so that's what's usually used for real-time DOM manipulation.
Can I generate a new clean html
Yes? At the end of the day, HTML is just a specifically formatted string. If you want to download the source from that page and find everything in the <div class='post-content' itemprop='articleBody'> tag, there are a couple of ways to go about that. The best is probably the nokogiri gem, which is a ruby HTML parser. You'll be able to feed it a string (from a file or otherwise) that represents the old page and strip out what you want. Doing that would look something like this:
require 'nokogiri'
require 'open-uri'
page = Nokogiri::HTML(URI.open("https://googleblog.blogspot.com"))
# finds the first element with class "post-content" and returns its text
# (use .inner_html instead of .text to keep the markup)
text = page.css('.post-content')[0].text
I believe that gives you the text you're looking for. More detailed nokogiri instructions can be found here.
You want to use a regular expression. For example:
//The "m" means multi-line
var regEx = /<div class='post-content' itemprop='articleBody'>([\s\S]*?)<\/div>/m;
//The content (you'll put the javascript at the bottom)
var bodyCode = document.body.innerHTML;
var match = bodyCode.match( regEx );
//Prints to the console
console.dir( match );
You can see this in action here: https://regex101.com/r/kJ5kW6/1
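One caveat with the non-greedy pattern above: it stops at the first </div>, so it cuts the content short if the target div contains nested divs. A rough sketch that counts nesting depth instead is shown here. extractDiv is a made-up helper, it assumes reasonably well-formed markup, and it is not a substitute for a real HTML parser.

```javascript
// Return the inner HTML of the first <div> whose opening tag equals
// `openTag`, counting nested <div> elements to find the matching close.
function extractDiv(html, openTag) {
  const start = html.indexOf(openTag);
  if (start === -1) return null;
  const contentStart = start + openTag.length;
  const tagPattern = /<(\/?)div\b[^>]*>/gi;
  tagPattern.lastIndex = contentStart; // begin scanning after the opening tag
  let depth = 1;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    depth += match[1] === "/" ? -1 : 1;
    if (depth === 0) {
      return html.slice(contentStart, match.index);
    }
  }
  return null; // no matching closing tag found
}

const sample =
  "<div class='header-outer'>" +
  "<div class='post-content' itemprop='articleBody'>" +
  "<p>content we want</p><div>nested</div>" +
  "</div></div>";
console.log(extractDiv(sample, "<div class='post-content' itemprop='articleBody'>"));
```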
I have a string in JavaScript and it includes an a tag with an href. I want to remove all links and the text. I know how to just remove the link and leave the inner text but I want to remove the link completely.
For example:
var s = "check this out <a href='http://www.google.com'>Click me</a>. cool, huh?";
I would like to use a regex so I'm left with:
s = "check this out. cool, huh?";
This will strip out everything between <a and /a>:
mystr = "check this out <a href='http://www.google.com'>Click me</a>. cool, huh?";
alert(mystr.replace(/<a\b[^>]*>(.*?)<\/a>/i,""));
It's not really foolproof, but maybe it'll do the trick for your purpose...
Just to clarify, in order to strip link tags and leave everything between them untouched, it is a two step process - remove the opening tag, then remove the closing tag.
txt.replace(/<a\b[^>]*>/i,"").replace(/<\/a>/i, "");
Working sample:
<script>
function stripLink(txt) {
return txt.replace(/<a\b[^>]*>/i,"").replace(/<\/a>/i, "");
}
</script>
<p id="strip">
<a href="#">
<em>Here's the text!</em>
</a>
</p>
<p>
<input value="Strip" type="button" onclick="alert(stripLink(document.getElementById('strip').innerHTML))">
</p>
Regexes are fundamentally bad at parsing HTML (see Can you provide some examples of why it is hard to parse XML and HTML with a regex? for why). What you need is an HTML parser. See Can you provide an example of parsing HTML with your favorite parser? for examples using a variety of parsers.
If you only want to remove <a> elements, the following should work well:
s.replace(/<a [^>]+>[^<]*<\/a>/, '');
This should work for the example you gave, but it won't work for nested tags, for example it wouldn't work with this HTML:
<a href="http://www.google.com"><em>Google</em></a>
Just commented about John Resig's HTML parser. Maybe it helps on your problem.
Examples above do not remove all occurrences. Here is my solution:
str.replace(/<a\b[^>]*>/gm, '').replace(/<\/a>/gm, '')
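The snippet above removes every <a> tag but keeps the link text. For the original question (removing the links together with their text), a global variant could look like the sketch below. removeLinks is a hypothetical name, and like the other regex answers this is not foolproof on arbitrary HTML.

```javascript
// Remove every <a>...</a> element including its text, plus any whitespace
// immediately before it. The "g" flag handles multiple links, and [\s\S]
// lets the link body span line breaks and contain nested tags such as <em>.
function removeLinks(text) {
  return text.replace(/\s*<a\b[^>]*>[\s\S]*?<\/a>/gi, "");
}

const s = "check this out <a href='http://www.google.com'>Click me</a>. cool, huh?";
console.log(removeLinks(s)); // check this out. cool, huh?
```

Swallowing the whitespace before the link is a design choice made here so that the result matches the string asked for in the question.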