Convert String from Element to working HTML code with jQuery - javascript

I have a checkbox which is generated dynamically. The checkbox text contains a string with some HTML code inside of it. The text comes directly from the database and is displayed not as HTML but as a plain string. Is it possible to convert the string to HTML so that it gets displayed correctly? The checkbox:
<label for="id_122_gen">"I hereby consent to the processing of my above-mentioned data according
to the <a href="/declarationofconsent.pdf" target="_blank">declaration of consent." </label>
<input type="checkbox" name="confirm" id="id_122_gen" >
I tried to get the contained text with the $.text() method, which worked so far.
$mystring = $("#id_122_gen").text();
After that I've tried to use jQuery method $.parseHTML() and save the result again.
$myhtml = $.parseHTML( $mystring );
Apparently it is saved as an array, because when I try to save the result again with the $.text() method, it displays:
[object Text],/declarationofconsent.pdf,[object Text]
It's just this. No clickable link, and the checkbox disappeared as well. I'm a bit confused about what to do now and don't know how I can display the correct content with a clickable link.

The solution depends on how the HTML of your page is generated. Your label either has its inner HTML escaped or not.
This is important: the inspector view can mislead you by showing escaped HTML as if it were proper HTML, so make sure to check the page source.
Most likely your data is HTML escaped and that's why you can't see the link on initial render.
If you see &lt; and &gt; inside the label's source, it's HTML escaped.
If it's escaped and you just want to convert it to proper HTML and set it to the label, use this:
$("label[for='id_122_gen']").html( $("label[for='id_122_gen']").text() );
Basically this unescapes the label's value. .text() reads the label's content as text, which turns the escaped characters into real ones; setting that string back with .html() makes the label's innerHTML unescaped.
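To make the unescaping step concrete without a browser, here is a minimal sketch of what reading escaped markup as text amounts to. The function name unescapeHtml and the ENTITIES table are made up for this illustration, and only five common entities are handled; the browser's own decoder covers far more.

```javascript
// Hypothetical helper: decodes the five most common HTML entities.
// A real browser decodes many more; this only illustrates the idea.
const ENTITIES = {
  '&lt;': '<',
  '&gt;': '>',
  '&quot;': '"',
  '&#39;': "'",
  '&amp;': '&'
};

function unescapeHtml(escaped) {
  // A single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
  return escaped.replace(/&(?:lt|gt|quot|#39|amp);/g, (m) => ENTITIES[m]);
}

console.log(unescapeHtml('&lt;a href=&quot;/declarationofconsent.pdf&quot;&gt;declaration of consent&lt;/a&gt;'));
```

The result is a string of real markup, which is what .html() then parses into an actual link.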
If you just want to get the link, read on.
If the value inside the label is HTML escaped then you'll have to use .text() to read that value. If it contains unescaped HTML then you'll have to use .html().
Afterwards the flow is the same, you parse the html first.
Since there is text and a link, the parsed html will return as an array with multiple elements. If you just want to get the link you have to search in the array.
You can check out the code below.
$mystring = $("label[for='id_122_gen']").text(); //Use .html() for unescaped
$myhtml = $.parseHTML( $mystring );
var mylink = null;
var e;
while (e = $myhtml.pop())
{
if(e.tagName == "A"){
mylink = e;
break;
}
}
console.log(mylink);

Related

How do I get the HTML contained within a td using jQuery?

I have a table cell that contains HTML, e.g.,
<td id="test">something&trade; is here</td>
I have an input field that I want to use to edit the HTML inside the table cell, e.g.,
<input type="text" id="editor" value="">
I need to get the string something&trade; is here from the table cell so I can put it into the <input> for editing. I have tried
var txt=$("#test").text();
var htm=$("#test").html();
Both of them are returning "something™ is here" rather than the raw HTML - I have a breakpoint in Firebug immediately after setting the two test values, and that's what I'm seeing.
Reading the jQuery documentation, I really expected the .html() method to return the raw HTML I'm looking for, but that's not what is happening.
I know that Javascript doesn't have an encoder like PHP's htmlspecialchars() function and that I have to work around that, but all four of these operations produce the same results:
var enchtm=$("<div/>").text(htm).html();
var enctxt=$("<div/>").text(txt).html();
var htmenc=$("<div/>").html(htm).text();
var txtenc=$("<div/>").html(txt).text();
Every permutation puts "something™ is here" in the edit field, not the raw HTML.
How do I get the string something&trade; is here from the table cell into the <input> so I can edit it?
It doesn't exist. Entities are decoded before the DOM is produced, and .html() (which is really just a wrapper for the innerHTML property) doesn't re-encode it because there's no reason for it to -- something™ is exactly as valid a representation of the HTML as something&trade; is. There is no "completely raw" (pre-character-decoding) view of the HTML provided by the browser.
Suggestion: provide the initial value as the value attribute of the input, instead of having it as the content of the div, so that the flow of data is always one way and this problem doesn't occur.
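Although the original spelling cannot be recovered, the decoded text can be re-encoded into some ASCII-only form for the editor. A minimal sketch, assuming numeric character references are acceptable; encodeNonAscii is a made-up name, and the output will not necessarily match what the author originally typed (for example &#8482; instead of &trade;).

```javascript
// Hypothetical helper: re-encodes non-ASCII characters, plus & < >,
// as numeric character references.
function encodeNonAscii(text) {
  let out = '';
  for (const ch of text) {          // for...of iterates by code point
    const code = ch.codePointAt(0);
    const needsEncoding = code > 126 || ch === '&' || ch === '<' || ch === '>';
    out += needsEncoding ? '&#' + code + ';' : ch;
  }
  return out;
}

console.log(encodeNonAscii('something\u2122 is here')); // something&#8482; is here
```

With jQuery this could be combined as $("#editor").val(encodeNonAscii($("#test").text())) for cells that contain only text.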
As the other answers have indicated, what I was trying to do is literally impossible - the original HTML source code that was used to populate the table cell no longer exists in the browser by the time it gets written to the DOM document.
The way I worked around this was using a title attribute on the table cell, e.g.,
// in the PHP/HTML source document
<?php $text='something&trade; is here'; // the contents of the table cell ?>
<td id="test" title="<?php echo htmlspecialchars($text) ?>"><?php echo $text ?></td>
Now the Javascript is relatively simple:
$("#editor").val($("#test").attr("title"));
The problem with this is that some of these table cells are supposed to have tooltips - which use the title attribute - and some are not. Fortunately for me, the table cells that may contain HTML that needs to be edited never need to show a tooltip, and the ones that show tooltips have simple text that can be retrieved using the .text() method. I can set a class attribute on the cells that need a tooltip, e.g.,
<td class="tooltipped" title="This is a tooltip">Hover here!</td>
I can then use jQuery's .not() method to suppress the tooltips on the cells where the title is being used for storing encoded HTML:
// suppress browser's default tooltip on td titles
$("td[title]").not(".tooltipped").mouseover(function()
{ var elem=$(this);
elem.data("title",elem.attr("title"));
// Using null here wouldn't work in IE, but empty string does
elem.attr("title","");
}).mouseout(function()
{ var elem=$(this);
elem.attr("title",elem.data("title"));
});
I don't think you will be able to get the exact value, because the browser parses the HTML document before displaying it, and I don't know of any keyword or library that retrieves the exact source contents of an element.
But you could use Ajax in this case: fetch the whole file as a string and extract the exact value from it.
var client = new XMLHttpRequest();
client.open('GET', './file.html'); // get the file by location
client.onreadystatechange = function() {
if (client.readyState === 4) {
if (client.status === 200) { // when the file is ready
var file_contents = (client.responseText); // the whole file is stored in the string format in the variable
get_value(file_contents); // function get_value fetches the exact value
}
}
}
client.send();
function get_value(str_file) { // named so the callback above can call it
    var indextest = str_file.indexOf("test"); // index of the id "test"
    var indexclosing = str_file.indexOf("</td>", indextest); // first closing td tag after that id
    var td_content = "";
    var condition = false;
    for (var i = indextest; i < indexclosing; i++) {
        if (str_file.charAt(i - 1) == ">") {
            condition = true; // the opening tag has ended once the previous character is '>'
        }
        if (condition) { // from here on, collect the characters of the cell's content
            td_content += str_file.charAt(i);
        }
    }
    console.log(td_content); // display it
}
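The same extraction from the fetched source can also be sketched with a single regular expression. This is only an illustration: extractCellSource is a made-up name, and it assumes the cell has a double-quoted id attribute and contains no nested td elements.

```javascript
// Hypothetical helper: returns the raw source between <td id="..."> and </td>,
// or null when no such cell is found in the source string.
function extractCellSource(source, id) {
  // Escape regex metacharacters so the id is matched literally.
  const safeId = id.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const re = new RegExp('<td[^>]*\\bid="' + safeId + '"[^>]*>([\\s\\S]*?)</td>', 'i');
  const m = re.exec(source);
  return m ? m[1] : null;
}

const page = '<table><tr><td id="test">something&trade; is here</td></tr></table>';
console.log(extractCellSource(page, 'test')); // something&trade; is here
```

Because it works on the response text rather than the DOM, the entity is returned exactly as it was written in the file.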
JavaScript has an escape() function, but it won't help you in this case. Here is a quick trick, but wait and see whether a better answer comes along.
var htm = $("#test").html().replace(/™/g, '&trade;');
$('#editor').val(htm);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<table>
<tr>
<td id="test">something&trade; is here</td>
</tr>
</table>
<input type="text" id="editor" value="">

Get HTML attribute value as is via JavaScript

I have a website where I feed information to an analytics engine via the meta tag as such:
<meta property="analytics-track" content="Hey&nbsp;There!">
I am trying to write a JavaScript script (no libraries) to access the content section and retrieve the information as is. In essence, it should include the HTML entity and not transform/strip it.
The reason is that I am using PhantomJS to examine which pages have HTML entities in the meta data and remove them, as they screw up my analytics data (for example, I'll have entries that include both Hey&nbsp;There! and Hey There! when in fact they are both the same page, and thus should not have two separate data points).
The most simple JS format I have is this:
document.getElementsByTagName('meta')[4].getAttribute("content")
And when I examined it in on console, it returns the text in the following format:
"Hey There!"
What I would like it to return is:
"Hey&nbsp;There!"
How can I ensure that the data returned will keep the HTML entity? If that's not possible, is there a way to detect an HTML entity via JavaScript? I tried:
document.getElementsByTagName('meta')[4].getAttribute("content").includes('&nbsp;')
But it returns false
Use querySelector to select the element with the property value "analytics-track", outerHTML to get the element as a string, and match to select the unparsed value of the content attribute with a regex.
document.querySelector('[property=analytics-track]').outerHTML.match(/content="(.*)"/)[1];
See http://jsfiddle.net/sjmcpherso/mz63fnjg/
You can't; the &nbsp; isn't really there. It's just an encoding for a non-breaking space. To the document, the DOM, the web page, to everything, it looks like:
Hey There!
Except the character between the y and the T isn't a space of the sort you'd get by hitting the space bar; it's a completely different character.
Observe:
<span id='a' data-a='Hey&nbsp;There!'></span>
<span id='a1' data-a='Hey&nbsp;There!'></span>
<span id='b' data-b='Hey There!'></span>
var a = document.getElementById('a').getAttribute('data-a')
var a1 = document.getElementById('a1').getAttribute('data-a')
var b = document.getElementById('b').getAttribute('data-b')
console.log(a,b,a==b)
console.log(a,a1,a==a1)
Gives:
Hey There! Hey There! false
Hey There! Hey There! true
Instead, consider altering your method of 'equality' to view a space and a non-breaking space as equal:
var re = /\u00A0|&nbsp;/g; // the non-breaking space character, or its entity spelling in raw source
x = x.replace(re, ' ');
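The comparison and detection can be sketched as follows. The non-breaking space is the single character U+00A0, so it has to be searched for as that character rather than as the six-character string "&nbsp;". The names NBSP, hasNbsp and normalizeSpaces are made up for this illustration.

```javascript
// U+00A0 is what &nbsp; decodes to in the DOM.
const NBSP = '\u00A0';

// True when the decoded attribute value contains a non-breaking space.
function hasNbsp(text) {
  return text.includes(NBSP);
}

// Replaces every non-breaking space with an ordinary space.
function normalizeSpaces(text) {
  return text.replace(/\u00A0/g, ' ');
}

const fromEntity = 'Hey\u00A0There!'; // what getAttribute("content") returns
console.log(hasNbsp(fromEntity));                          // true
console.log(normalizeSpaces(fromEntity) === 'Hey There!'); // true
```

Normalizing both values before they reach the analytics engine makes the two spellings count as one data point.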
To get the HTML of the meta tag as is, use outerHTML:
document.getElementsByTagName('meta')[4].outerHTML
Working Snippet:
console.log(document.getElementsByTagName('meta')[0].outerHTML);
<meta property="analytics-track" content="Hey&nbsp;There!">
<h3>Check your console</h3>
Element.outerHTML - Web APIs | MDN
Update 1:
To filter out the meta content, use the following:
metaInfo.match(/content="(.*)">/)[1]; // assuming that content attribute is always at the end of the meta tag
Working Snippet:
var metaInfo = document.getElementsByTagName('meta')[0].outerHTML;
console.log(metaInfo);
console.log('Meta Content = ' + metaInfo.match(/content="(.*)">/)[1]);
<meta property="analytics-track" content="Hey&nbsp;There!">
<h3>Check your console</h3>
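The regex above relies on content being the last attribute. A slightly more tolerant sketch, assuming the string comes from outerHTML (which writes attribute values in double quotes); extractContent is a made-up name:

```javascript
// Hypothetical helper: pulls the serialized content attribute out of an
// outerHTML string, wherever the attribute sits inside the tag.
function extractContent(outerHtml) {
  const m = /\bcontent="([^"]*)"/i.exec(outerHtml);
  return m ? m[1] : null;
}

console.log(extractContent('<meta property="analytics-track" content="Hey&nbsp;There!">'));
// Hey&nbsp;There!
```

Note that the serialized form is produced by the browser, so a non-breaking space typed literally in the source would show up as &nbsp; here as well.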

detect not closed html tags in a text, and fix it

I have an HTML text called news which is generated by CKEditor, so it may contain HTML tags like bold, italic, paragraph and so on.
I want to take the first 100 characters of it and show them to the users of my app. But taking the first 100 characters of news may produce HTML text with unclosed tags (the closing tags may come after character number 100).
Is there any PHP or JS function to parse a text and fix the unclosed HTML tags, or at least remove them?
As far as removing tags altogether, the strip_tags function in PHP should do the trick.
strip_tags($input, '<br><br/>');
The second argument is the allowed tags (the ones that don't get stripped).
Tidy is a binding for the Tidy HTML clean and repair utility which allows you to not only clean and otherwise manipulate HTML documents, but also traverse the document tree.
Try this
You could remove the HTML Tags by using the PHP function strip_tags().
<?php
$text = '<h1>Test Header. <b>On Example</b>';
echo strip_tags($text);
// Output: Test Header. On Example
?>
Please check the reference on the function at https://secure.php.net/strip_tags
Side Note:
I have a function to chopContent that you could use, to get the first 100 characters:
public function chopContent($content, $content_length = 100)
{
if (strlen($content) > $content_length) {
$offset = ($content_length - 13) - strlen($content);
$content = mb_substr($content, 0, strrpos($content, ' ', $offset)) . '...';
}
return $content;
}
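For the part of the question about cutting at 100 characters without leaving tags open, here is a minimal JavaScript sketch. It is not a library function: truncateHtml, VOID_TAGS and the tokenizing regex are made up for this illustration. It assumes reasonably well-formed editor output, counts an entity such as &amp; as several characters (and may cut one in half), and silently drops a stray < that is not part of a tag.

```javascript
// Elements that never take a closing tag (a partial list for this sketch).
const VOID_TAGS = new Set(['br', 'hr', 'img', 'input', 'meta', 'link']);

function truncateHtml(html, limit) {
  // Matches either one tag (capturing its name) or one run of text.
  const tokenRe = /<\/?([a-zA-Z][a-zA-Z0-9]*)[^>]*>|[^<]+/g;
  const open = [];  // stack of tag names that are currently open
  let out = '';
  let used = 0;     // number of text characters emitted so far
  let match;
  while (used < limit && (match = tokenRe.exec(html)) !== null) {
    const token = match[0];
    const name = match[1] && match[1].toLowerCase();
    if (!name) {
      // Text run: keep only as much as still fits within the limit.
      const piece = token.slice(0, limit - used);
      out += piece;
      used += piece.length;
    } else if (token[1] === '/') {
      // Closing tag: emit it only if it closes the innermost open tag.
      if (open[open.length - 1] === name) {
        open.pop();
        out += token;
      }
    } else {
      out += token;
      if (!VOID_TAGS.has(name) && !token.endsWith('/>')) {
        open.push(name);
      }
    }
  }
  // Close whatever is still open, innermost first.
  while (open.length) {
    out += '</' + open.pop() + '>';
  }
  return out;
}

console.log(truncateHtml('<p>Hello <b>big world</b> again</p>', 9));
// <p>Hello <b>big</b></p>
```

On the PHP side, the Tidy extension mentioned above is the more robust route for repairing markup.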

Does javascript consider everything enclosed in <> as html tags?

I am tasked with converting hundreds of Word document pages into a knowledge base html application. This means copying and pasting the HTML of the word document into an editor like Notepad++ and cleaning it up. (Since it is internal document I need to convert, I cannot use online converters).
I have been able to do most of what I need with a javascript function that works "onload" of the body tag. I then copy the resulting HTML into my application framework.
Here is part of the function I wrote: (it shows only code for removing attributes of div and p tags but works for all html tags in the document)
function removeatts() //this function will remove all attributes from all elements and also remove empty span elements
{//for removing div tag attributes
var divs=document.getElementsByTagName('div'); //look at all div tags
var divnum=divs.length; //number of div tags on the page
for (var i=0; i<divnum; i++) //run through all the div tags
{//remove attributes for each div tag
divs[i].removeAttribute("class");
divs[i].removeAttribute("id");
divs[i].removeAttribute("name");
divs[i].removeAttribute("style");
divs[i].removeAttribute("lang");
}
//for removing p tag attributes
var ps=document.getElementsByTagName('p'); //look at all p tags
var pnum=ps.length; //number of p tags on the page
for (var i=0; i<pnum; i++) //run through all the p tags
{//remove attributes for each p tag
var para=ps[i].innerHTML;
if (para.length!==0) //ie if there is content inside the p tag
{
ps[i].removeAttribute("class");
ps[i].removeAttribute("id");
ps[i].removeAttribute("name");
ps[i].removeAttribute("style");
ps[i].removeAttribute("lang");
}
else
{//remove empty p tag
ps[i].remove() ;
}
if (para=="<o:p></o:p>" || para=="<o:p> </o:p>" || para=="<o:p> </o:p>")
{
ps[i].remove() ;
}
}
The first problem I encountered is that if I included the if (para=="<o:p></o:p>" || para=="<o:p> </o:p>" || para=="<o:p> </o:p>") part in an else if statement, the whole function stopped executing.
However, without the if (para=="<o:p></o:p>" || para=="<o:p> </o:p>" || para=="<o:p> </o:p>") part, the function does exactly what it is supposed to.
If, however, I keep it the way it is right now, it does some of what I want it to do.
The trouble occurs over some of the Word generated html that looks like this:
<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto; margin-
left:.25in;text-align:justify;text-indent:-.25in;line-height:150%;
mso-list:l0 level1 lfo1;tab-stops:list .75in'>
<![if !supportLists]><span style='font-family:Symbol;mso-fareast-font-family:Symbol;mso-bidi-font-family:Symbol;color:black'><span style='mso-list:Ignore'>·
<span style='font:7.0pt "Times New Roman"'>
</span></span></span>
<![endif]><span style='font-family:"Arial","sans-serif";mso-fareast-font-family:Calibri;color:black'>
SOME TEXT.<span style='mso-spacerun:yes'>  </span>SOME MORE TEXT.<span style='mso-spacerun:yes'>  </span>EVEN MORE TEXT.
<span style='mso-spacerun:yes'>  </span>BLAH BLAH BLAH.<o:p></o:p></span></p>
<p><o:p></o:p></p>
Notice the <o:p></o:p> in the last two lines..... This is not getting removed either when treated as plain text or if I write code for it in the function just like the divs and paragraphs as shown in the function above. When I run the function on this, I get
<p>
<![if !supportLists]><span>·
<span>
</span></span></span>
<![endif]><span>
SOME TEXT.<span> </span>SOME MORE TEXT.<span> </span>EVEN MORE TEXT.
<span> </span>BLAH BLAH BLAH.<o:p></o:p></span></p>
<p><o:p></o:p></p>
I have looked around but cannot find any information about whether javascript works the same on known html tags and on something like this that follows the principle of opening and closing tags but doesn't match known HTML tags!
Any ideas about a workaround would be greatly appreciated!
Javascript has no special processing of HTML tags in javascript strings. It honestly doesn't know anything about HTML in the string.
More likely your issue is trying to compare the .innerHTML of a tag to a predetermined string. You cannot and should not do that, because there is no guarantee about the format of .innerHTML. The same HTML can be formatted in hundreds of ways, and some browsers don't remember the original HTML but reconstitute it when you ask for .innerHTML, so you simply can't do that type of string comparison.
To be sure of your comparison, you will have to actually parse the HTML (at least with some sort of crude parser, which perhaps could even be a regex) to see if it matches what you want, because you can't rely on optional spacing or optional capitalization in a direct string comparison.
Or, perhaps even better, since your HTML is already parsed, why not just look at the actual HTML objects themselves and see if you have what you want there. You shouldn't even have to remove all those attributes then.
It's not Javascript that is unhappy with the unknown tags. It's the browser.
For JS it's simply a string. So, if it's a very specific case that you don't need <o:p> in particular then you could just remove it by running it with a regex itself.
para = para.replace(/<\/?o:p>/ig, ""); // replace() returns a new string, so keep the result
But if there are many more, I would strongly suggest you to get familiar with XSLT transformation.
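A slightly more tolerant version of the regex idea, as a sketch on a plain string: it removes only empty <o:p> pairs, whatever their capitalization and whether they hold nothing, whitespace, or a non-breaking space. The function name stripEmptyOfficeParagraphs is made up for this illustration.

```javascript
// Hypothetical helper: removes <o:p>...</o:p> pairs that contain only
// whitespace. In JavaScript, \s also matches U+00A0 (non-breaking space);
// the literal "&nbsp;" spelling is handled separately.
function stripEmptyOfficeParagraphs(html) {
  return html.replace(/<o:p>(?:\s|&nbsp;)*<\/o:p>/gi, '');
}

console.log(stripEmptyOfficeParagraphs('BLAH BLAH BLAH.<o:p></o:p></span></p>'));
// BLAH BLAH BLAH.</span></p>
```

An <o:p> element that contains real text is left alone, so no content is lost.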
The first problem I encountered is that if I included the if (para=="<o:p></o:p>" || para=="<o:p> </o:p>" || para=="<o:p> </o:p>")
part in an else if statement, the whole function stopped executing.
This is because you cannot have else if after else.
Notice the <o:p></o:p> in the last two lines..... This is not getting removed
I cannot confirm that. When I run your function it removes the <o:p> inside the <p>, as it is supposed to. The <o:p> within the <span> is not processed, because your function does not do that.
If you want to remove all <o:p>s, try
[].forEach.call(document.querySelectorAll('o\\:p'), function (el) {
el.remove();
});
After that, you may want to remove empty <p>s like this
[].forEach.call(document.querySelectorAll('p'), function (el) {
if (!el.childNodes.length) {
el.remove();
}
});

Add code examples on your page with Javascript

I have a html code inside string
string_eng += '<b>Year Bonus</b> - bonus for each year</br></br>';
And I want to put this inside textarea, but when I do it, the result is:
- bonus for each year
It simply deletes all things inside the html tags. I just want to show all the code inside the string. I already tried <xmp>,<pre>, but none of them worked.
Thanks for any help.
EDIT.
Code with which I input data from the array to the textarea/code.
$('body').append('<code class="code_text"></code>');
for(var i=0; i<tag_list.length; i++){
var string='';
string+='---------------------------------------------------------------------------------\n';
string+='tag: '+tag_list[i][0]+'\n';
string+='nazwa_pl '+tag_list[i][1]+'\n';
string+='nazwa_eng '+tag_list[i][2]+'\n';
string+='tekst_pl '+tag_list[i][3]+'\n';
string+='tekst_eng '+tag_list[i][4]+'\n';
string+='\n\n\n';
$('.code_text').append(string);
}
I tried this using jsfiddle:
HTML
<textarea id="code"></textarea>
JavaScript
$(document).ready(function() {
var string_eng = '';
string_eng += '<b>Year Bonus</b> - bonus for each year</br></br>';
$("#code").html(string_eng);
});
Output (contained in textarea)
<b>Year Bonus</b> - bonus for each year</br></br>
Try it here: http://jsfiddle.net/UH53y/
It does not omit values held within tags; however, if you were expecting the <b></b> tags to render as bold within the textarea, or the <br /> tags to render as line breaks, this won't happen either. A textarea does not support formatting.
See this question for more information: HTML : How to retain formatting in textarea?
It's because you're using the jQuery .append method which seems to parse the string and insert it afterwards. I don't know jQuery at all, so there might be another special jQuery method, but here is a simple fix:
$('.code_text').append(document.createTextNode(string));
Edit:
I just read and tried the answer of Salman A. The "special jQuery method" exists and he used it. You can use this:
$('.code_text').text(string);
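If the markup has to be built as one HTML string instead (for example when concatenating a larger template), the same effect can be sketched by escaping the special characters first. escapeHtml is a made-up helper for this illustration, not a jQuery function.

```javascript
// Hypothetical helper: escapes the characters that HTML treats specially,
// so the string is displayed as code instead of being parsed as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')   // must run first, before other entities are added
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

console.log(escapeHtml('<b>Year Bonus</b> - bonus for each year</br></br>'));
// &lt;b&gt;Year Bonus&lt;/b&gt; - bonus for each year&lt;/br&gt;&lt;/br&gt;
```

The escaped string can then be passed to .append() or .html() and will show up as literal code.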
