Getting pure text from HTML text using JavaScript


I am using Struts. I get HTML text from the database, store it in a string, and pass it to a JSP. In the JSP I have to extract the pure text from that HTML string and display it in a textarea using JavaScript.
Please suggest some solutions. I am not allowed to use jQuery.

You could try something like a mini-parser, such as this function:
function HTMLtoBB(html) {
  // Each pattern is paired with the replacement at the same index.
  // Note: attribute values are only matched when they are single-quoted.
  var search = [
    /<b>(.*?)<\/b>/g,
    /<i>(.*?)<\/i>/g,
    /<u>(.*?)<\/u>/g,
    /<font size='(.*?)'>(.*?)<\/font>/g,
    /<font color='(.*?)'>(.*?)<\/font>/g,
    /<img src='(.*?)'>/g,
    /<a href='(.*?)'>(.*?)<\/a>/g,
    /<blockquote>(.*?)<\/blockquote>/g,
    /<center>(.*?)<\/center>/g
  ];
  var replace = [
    "[b]$1[/b]",
    "[i]$1[/i]",
    "[u]$1[/u]",
    "[size=$1]$2[/size]",
    "[color=$1]$2[/color]",
    "[img=$1]",
    "[url=$1]$2[/url]",
    "[quote]$1[/quote]",
    "[center]$1[/center]"
  ];
  for (var i = 0; i < search.length; i++) {
    html = html.replace(search[i], replace[i]);
  }
  return html;
}
This converts the HTML tags to BB codes. You can also change the replacement strings to produce something other than BB codes.

You could attach the loaded HTML to the DOM and then read element.innerText, which strips away all the HTML and leaves just the plain text (if this is what you want to do; it is not completely clear from your question).
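A minimal sketch of that idea in plain JavaScript, assuming a browser environment. Instead of attaching the markup to the live page it uses DOMParser, which parses the string into a separate document, and it reads textContent rather than innerText. The function names and the textarea id are hypothetical.

```javascript
// Parse the HTML string into a separate document and return only its text.
// Scripts contained in the markup are not run by DOMParser.
function htmlToPlainText(html) {
  const parsed = new DOMParser().parseFromString(html, 'text/html');
  return parsed.body.textContent || '';
}

// Put the extracted text into a textarea. The id passed in is an
// assumption about the page, e.g. <textarea id="output"></textarea>.
function showPlainText(html, textareaId) {
  document.getElementById(textareaId).value = htmlToPlainText(html);
}
```

Usage would be something like showPlainText(htmlFromServer, 'output'). Note that textContent returns the text of all descendant nodes as-is, whereas innerText takes rendering into account, so the two can differ in whitespace and line breaks.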

Related

Convert escaped HTML string to Javascript object (No JQuery)

I'm practicing my JavaScript by making a browser plugin that displays external comments from Reddit on other webpages. The comments come in this format:
&lt;div class="md"&gt;&lt;p&gt;I have them all over my yard. I didn&amp;#39;t realize they spread so bad when I planted them.
They look cool with early morning dew on them though.&lt;/p&gt;
&lt;/div&gt;
I need to re-introduce the HTML characters (i.e. &lt;div&gt; => <div>), in order to put the formatted HTML onto the page.
Is there some native functionality Javascript provides to do this?
From what I can tell: x = document.createElement("div"); x.innerHTML = rawComment does not work, as the HTML is escaped, and the innerHTML returns a <div> with a string in it instead of a series of DOM nodes.

What you might try is the following:
// some dummy encoded text
let encoded = '&lt;div class="md"&gt;&lt;p&gt;I have them all over my yard.&lt;/p&gt;&lt;/div&gt;';
// create a new textarea and insert your encoded text
let dummyElement = document.createElement('textarea');
dummyElement.innerHTML = encoded;
// retrieve the textarea's value, which will be your decoded text
let decoded = dummyElement.value;
// decoded will be: <div class="md"><p>I have them all over my yard.</p></div>
This works without jQuery, as it only uses the browser's native JavaScript functions.
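Where no DOM is available (for example in a script that runs outside a page), a small lookup-based decoder can stand in for the textarea trick. This is only a sketch under stated assumptions: it handles the five basic named entities plus numeric references and leaves everything else (such as &nbsp;) untouched. The names BASIC_ENTITIES and decodeBasicEntities are hypothetical.

```javascript
// The five basic named entities; anything else is left as it is.
const BASIC_ENTITIES = new Map([
  ['amp', '&'], ['lt', '<'], ['gt', '>'], ['quot', '"'], ['apos', "'"],
]);

function decodeBasicEntities(text) {
  // One pass over the string, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
  return text.replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X';
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // String.fromCodePoint throws above U+10FFFF, so keep such input unchanged.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    return BASIC_ENTITIES.has(body) ? BASIC_ENTITIES.get(body) : match;
  });
}
```

Because the Reddit sample is escaped twice in places (&amp;#39;), the decoded markup still contains &#39;, which the browser resolves once the markup is inserted as HTML.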

How to have tags that are within quotes ignored? HTML, Javascript

I am trying to use innerHTML to make a paragraph display.
innerHTML="<html><head></head><body></body></html>";
But I am of course getting errors, because the browser is trying to process the tags. How can I get the browser to ignore these tags, since they're in quotes?

Try:
innerHTML="<html><head></head><body></body></html>".replace(/</g,'&lt;');
In words: replacing every < with &lt; makes the browser render it as the literal character < instead of trying to parse the string as HTML. You could of course also replace every > with &gt;.
Alternatively, you can use textContent instead of innerHTML, which treats the string as plain text, so no escaping is needed:
[yourElement].textContent = '<html><head></head><body></body></html>';
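The replace-based approach can be extended to cover all five characters that are significant in HTML text and attribute values. A minimal sketch follows; escapeHtmlChars and the lookup table are illustrative names, not part of any library.

```javascript
// Characters that are significant in HTML and their entity forms.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Escape in a single pass so an ampersand produced by one
// replacement is never escaped a second time.
function escapeHtmlChars(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}
```

The result can then be assigned to innerHTML and will be displayed as literal text.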

You should escape the HTML in this case.
How about a common function:
// Named escapeHTML so that it does not shadow the legacy global escape().
function escapeHTML(s) {
  var el = document.createElement("div");
  // Assign as text, then read it back as markup: the browser does the escaping.
  el.innerText = el.textContent = s;
  s = el.innerHTML;
  return s;
}
Usage:
innerHTML = escapeHTML("<html><head></head><body></body></html>");
I think using the browser's built-in behavior to escape HTML is better than trying to replace all special characters.

How to insert HTML entities with createTextNode?

What if I want to add an ASCII symbol from JS to a node somewhere?
I tried it as a TextNode, but it wasn't parsed as a character code:
var dropdownTriggerText = document.createTextNode('blabla ∧');

You can't create text nodes from HTML entities. Your alternatives would be to use Unicode escape values
var dropdownTriggerText = document.createTextNode('blabla \u0026');
or set innerHTML of the element. You can of course directly input &...

createTextNode is supposed to take any text input and insert it into the DOM exactly like it is. This makes it impossible to insert for example HTML elements, and HTML entities. It’s actually a feature, so you don’t need to escape these first. Instead you just operate on the DOM to insert text nodes.
So, you can actually just use the & symbol directly:
var dropdownTriggerText = document.createTextNode('blabla &');
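For characters that are awkward to type directly, the same idea works with a JavaScript escape sequence or String.fromCodePoint. The sketch below assumes the intended symbol is U+2227, the character that the entity &and; names in HTML; the variable names are illustrative.

```javascript
// '\u2227' is the JavaScript string escape for U+2227 (logical and).
const viaEscape = 'blabla \u2227';

// String.fromCodePoint builds the same character from its numeric code point.
const viaCodePoint = 'blabla ' + String.fromCodePoint(0x2227);

// Code points above U+FFFF need the braces form of the escape.
const aboveBmp = '\u{1F600}';
```

Either string can be passed to document.createTextNode unchanged, since the character is already in the string by the time the node is created.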

I couldn't find an automated way to do this. So I made a function.
// render HTML as text for inserting into text nodes
function renderHTML(txt) {
  var tmpDiv = document.createElement("div");
  tmpDiv.innerHTML = txt;
  return tmpDiv.innerText || tmpDiv.textContent || txt;
}

How to remove only html tags in a string using javascript

I want to remove HTML tags from a given string using JavaScript. I looked into the current approaches, but they leave some problems unsolved.
Current solutions
(1) Using javascript, creating virtual div tag and get the text
function remove_tags(html)
{
var tmp = document.createElement("DIV");
tmp.innerHTML = html;
return tmp.textContent||tmp.innerText;
}
(2) Using regex
function remove_tags(html)
{
return html.replace(/<(?:.|\n)*?>/gm, '');
}
(3) Using JQuery
function remove_tags(html)
{
return jQuery(html).text();
}
These three solutions work correctly, but if the string is like this
<div> hello <hi all !> </div>
the stripped string is
hello . But I need to remove only the HTML tags, leaving hello <hi all !>
Edited: The background is that I want to remove all HTML tags that the user enters in a particular text area, while still allowing users to enter text like <hi all>. The current approaches remove any content enclosed in <>.

Using a regex might not be a problem if you consider a different approach. For instance, looking for all tags, and then checking to see if the tag name matches a list of defined, valid HTML tag names:
var protos = document.body.constructor === window.HTMLBodyElement,
validHTMLTags =/^(?:a|abbr|acronym|address|applet|area|article|aside|audio|b|base|basefont|bdi|bdo|bgsound|big|blink|blockquote|body|br|button|canvas|caption|center|cite|code|col|colgroup|data|datalist|dd|del|details|dfn|dir|div|dl|dt|em|embed|fieldset|figcaption|figure|font|footer|form|frame|frameset|h1|h2|h3|h4|h5|h6|head|header|hgroup|hr|html|i|iframe|img|input|ins|isindex|kbd|keygen|label|legend|li|link|listing|main|map|mark|marquee|menu|menuitem|meta|meter|nav|nobr|noframes|noscript|object|ol|optgroup|option|output|p|param|plaintext|pre|progress|q|rp|rt|ruby|s|samp|script|section|select|small|source|spacer|span|strike|strong|style|sub|summary|sup|table|tbody|td|textarea|tfoot|th|thead|time|title|tr|track|tt|u|ul|var|video|wbr|xmp)$/i;
function sanitize(txt) {
var // This regex normalises anything between quotes
normaliseQuotes = /=(["'])(?=[^\1]*[<>])[^\1]*\1/g,
normaliseFn = function ($0, q, sym) {
return $0.replace(/</g, '&lt;').replace(/>/g, '&gt;');
},
replaceInvalid = function ($0, tag, off, txt) {
var
// Is it a valid tag?
invalidTag = protos &&
document.createElement(tag) instanceof HTMLUnknownElement
|| !validHTMLTags.test(tag),
// Is the tag complete?
isComplete = txt.slice(off+1).search(/^[^<]+>/) > -1;
return invalidTag || !isComplete ? '&lt;' + tag : $0;
};
txt = txt.replace(normaliseQuotes, normaliseFn)
.replace(/<(\w+)/g, replaceInvalid);
var tmp = document.createElement("DIV");
tmp.innerHTML = txt;
return "textContent" in tmp ? tmp.textContent : tmp.innerHTML;
}
Working Demo: http://jsfiddle.net/m9vZg/3/
This works because browsers parse '>' as text if it isn't part of a matching '<' opening tag. It doesn't suffer the same problems as trying to parse HTML tags using a regular expression, because you're only looking for the opening delimiter and the tag name, everything else is irrelevant.
It's also future proof: the WebIDL specification tells vendors how to implement prototypes for HTML elements, so we try and create a HTML element from the current matching tag. If the element is an instance of HTMLUnknownElement, we know that it's not a valid HTML tag. The validHTMLTags regular expression defines a list of HTML tags for older browsers, such as IE 6 and 7, that do not implement these prototypes.

If you want to keep invalid markup untouched, regular expressions are your best bet. Something like this might work:
text = html.replace(/<\/?(span|div|img|p...)\b[^<>]*>/g, "")
Expand (span|div|img|p...) into a list of all tags (or only those you want to remove). NB: the list must be sorted by length, longer tags first!
This may provide incorrect results in some edge cases (like attributes with <> characters), but the only real alternative would be to program a complete html parser by yourself. Not that it would be extremely complicated, but might be an overkill here. Let us know.
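A small sketch of how such a pattern could be built from a list of tag names. The list here is deliberately short and purely illustrative, and the names TAGS_TO_STRIP and stripKnownTags are hypothetical. It assumes the tag names contain only letters and digits, so they need no regex escaping.

```javascript
// Illustrative subset only: extend it with the tags you want removed.
const TAGS_TO_STRIP = ['div', 'span', 'img', 'p', 'b', 'i', 'a', 'br'];

function stripKnownTags(html, tags = TAGS_TO_STRIP) {
  // Matches <tag ...> and </tag ...> for listed names only; \b prevents
  // "p" from matching the start of a longer name such as "pre".
  const pattern = new RegExp('<\\/?(?:' + tags.join('|') + ')\\b[^<>]*>', 'gi');
  return html.replace(pattern, '');
}
```

With the sample from the question, stripKnownTags('<div> hello <hi all !> </div>') leaves ' hello <hi all !> ' because "hi" is not in the list. User text that happens to begin with a listed name, such as <i all>, would still be removed.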

var StrippedString = OriginalString.replace(/(<([^>]+)>)/ig,"");

Here is my solution:
function removeTags(){
var txt = document.getElementById('myString').value;
var rex = /(<([^>]+)>)/ig;
alert(txt.replace(rex , ""));
}

I use a regular expression to prevent HTML tags in my textarea.
Example:
<form>
<textarea class="box"></textarea>
<button>Submit</button>
</form>
<script>
$(".box").focusout( function(e) {
var reg =/<(.|\n)*?>/g;
if (reg.test($('.box').val()) == true) {
alert('HTML Tag are not allowed');
}
e.preventDefault();
});
</script>

<script type="text/javascript">
function removeHTMLTags() {
var str="<html><p>I want to remove HTML tags</p></html>";
alert(str.replace(/<[^>]+>/g, ''));
}</script>

How do you check for equality between a jquery .html() variable and ruby on rails generated html?

Currently I am trying to compare two variables, newhtml and oldhtml.
I get the old html using:
var oldhtml = $('#pop').html();
I get the new html using:
var newhtml = ("<%= escape_javascript render(:file => 'shared/_content') %>");
I'm then trying to compare them using:
if(newhtml == oldhtml)
{
//is never called
$('#pop').hide();
}
else
{
//is always called
$('#pop').html(newhtml);
}
but it's not working. In this scenario the oldhtml variable is actually just the previous newhtml variable. So, if you click on the same content on the main page, newhtml should just be a Ruby on Rails generated version of oldhtml (the HTML present in <div id="pop">). I've tried appending them to see where the difference lies, but they look exactly the same to me. I was hoping someone could nudge me in the right direction. Thanks!

I'd try to put the newhtml in a hidden DOM element and then get that content back from jquery .html()
$('#HiddenDOM').html(newhtml);
if ($('#HiddenDOM').html() === $('#pop').html()) { ... }
Not tested, but it might cancel out HTML rendering issues.
Also, use the strict equality comparison in JavaScript: === instead of ==. See Crockford's "JavaScript: The Good Parts" talk on YouTube for an explanation.
Overall, I'd rethink your design, because the comparison might not be truly possible and would open your architecture to all sorts of browser mysteries. Sometimes, you first have to find out what doesn't work to get to the working solution.

I can't recall any guarantees regarding the formatting of the HTML returned by html(), particularly whitespace.
But, at least when I pull a chunk of HTML out of a page in Firefox, what comes out seems to be formatted the same as how it was sent by the server.
So, perhaps there are only trivial differences between the strings, e.g. in leading and trailing whitespace. Try using a function like this to compare the two HTML snippets:
function compare_html(a,b) {
a = a.replace(/^\s+/,"").replace(/\s+$/,"");
b = b.replace(/^\s+/,"").replace(/\s+$/,"");
return a == b;
}
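If the differences also include whitespace inside the markup, the same idea can be pushed a little further. The following is a sketch only: normalizeHtml and sameHtml are hypothetical names, and the normalisation is an assumption about which differences are harmless. It does not account for attribute order, quoting style, or tag-name casing.

```javascript
// Reduce whitespace-only differences before comparing two HTML strings.
function normalizeHtml(html) {
  return html
    .replace(/>\s+</g, '><') // drop whitespace between adjacent tags
    .replace(/\s+/g, ' ')    // collapse remaining runs to a single space
    .trim();
}

function sameHtml(a, b) {
  return normalizeHtml(a) === normalizeHtml(b);
}
```

Dropping whitespace between tags can change how inline elements render, so the normalised strings are meant for comparison only, not for insertion into the page.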

The jQuery documentation warns that with .html() "some browsers may not return HTML that exactly replicates the HTML source in an original document". This will also apply to dynamically inserted HTML.
Your best bet is to compare two javascript strings, neither of which is read back from the DOM.
eg:
var newhtml = ("<%= escape_javascript render(:file => 'shared/_content') %>"),
oldhtml = newhtml;
...
if(newhtml == oldhtml)
{
$('#pop').hide();
}
else
{
oldhtml = newhtml;
$('#pop').html(newhtml);
}
It's undoubtedly more complicated than this but I'm sure you get the idea.

That was all very helpful and I actually sort of combined the answers to get something that worked. What ended up working is this:
Get the oldhtml like before:
var oldhtml = $('#pop').html();
Then insert the new html into the div:
$('#pop').html("<%= escape_javascript render(:file => 'shared/_content') %>");
Following that, get the newhtml from the div, the same way I retrieved the oldhtml:
var newhtml = $('#pop').html();
The if statement then works, but you want to make sure to change the div to something arbitrary that can't be retrieved from your database:
if(newhtml == oldhtml)
{
$('#pop').html("<h>toggled_off</h>");
$('#pop').hide();
}
else
{
$('#pop').html(newhtml);
}

Strip the tags from both HTML strings and then compare with ==:
var StrippedString = OriginalString.replace(/(<([^>]+)>)/ig,"");
