Encoding input in textarea with javascript for TinyMCE editor - javascript

This is a bit complicated to put into words, so here is my problem -
I have a short-code generator form built on the WordPress TinyMCE editor; it is basically an HTML document with two <textarea></textarea> inputs. When the user clicks "Insert short-code", the JavaScript on this page builds the short-code and inserts it into the editor in WordPress.
Now, if the user inserts something like -
>>>This is an "amazing post"! You can't wait to get your hands on it.<<<
in one of the textareas, the short-code generator produces the corresponding attribute like this -
c_text=">>>This is an "amazing post"! You can't wait to get your hands on it.<<<"
This short-code, along with its other attributes, is then saved to the database by WordPress. When it is retrieved at the front-end, the add_shortcode() function in WordPress messes everything up, which is to be expected since >, <, ' and " exist inside c_text.
An obvious choice in PHP would be to use htmlentities with ENT_QUOTES, but how can that be done in this case? Using the phpjs equivalent function would mean including a lot of JavaScript. How can I do this efficiently?
UPDATE : I tried using the phpjs equivalent function. It does the conversion properly. But, the quotes are replaced by WordPress.
Example - I put a random string <> <test> <<><<<>><<> ' <"">?sdg##Y^#ASCST##Y^
and did console.log(htmlentities(string,'ENT_QUOTES'));, it logged
<> <test> <<><<<>><<> ' <"">?sdg##Y^#ASCST##Y^
which is correct. But when I switched to the HTML view of TinyMCE, it showed
<> <test> <<><<<>><<> '
<"">?sdg##Y^#ASCST##Y^
The quotes were replaced by TinyMCE.

I ended up using urlencode function and it worked fine.
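For readers wondering what the urlencode approach could look like on the JavaScript side, here is a minimal sketch. The names encodeShortcodeAttr and buildShortcode and the tag my_shortcode are hypothetical, not taken from the original form. The built-in encodeURIComponent leaves ! ' ( ) * untouched, so the sketch percent-encodes those as well to make sure no quote survives inside the attribute.

```javascript
// Percent-encode a value so it contains no quotes, angle brackets or spaces.
// encodeURIComponent does not touch ! ' ( ) *, so those are encoded by hand.
function encodeShortcodeAttr(value) {
  return encodeURIComponent(value).replace(/[!'()*]/g, function (ch) {
    return '%' + ch.charCodeAt(0).toString(16).toUpperCase();
  });
}

// Hypothetical helper: build a shortcode string from a tag name and an attribute map.
function buildShortcode(tag, attrs) {
  var parts = Object.keys(attrs).map(function (name) {
    return name + '="' + encodeShortcodeAttr(attrs[name]) + '"';
  });
  return '[' + tag + ' ' + parts.join(' ') + ']';
}

var raw = '>>>This is an "amazing post"! You can\'t wait to get your hands on it.<<<';
var shortcode = buildShortcode('my_shortcode', { c_text: raw });
console.log(shortcode);
```

The shortcode handler on the PHP side would then have to decode the attribute again (for example with rawurldecode) before printing it.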

Related

Strip HTML elements within DIV

I have a simple search engine on one of our older websites. This site is running IIS 6.0 on Windows Server 2003. The search functionality is provided by Microsoft Indexing Service.
You can see the search functionality on our website. (Just type in "speakers" and you will see some hits.)
I would like to use the "FullHit" feature offered by the indexing service. When using this feature the Indexing service inserts the full hit results in between "begindetail" and "enddetail" on a target web page.
The problem that I have is that the documents that are being returned have HTML. This looks messy. (Just click on "Hit Locator Tool" in the search results above to see what I mean.)
I would like to create a DIV section such as ...
<DIV name="target">
begindetail
enddetail
</DIV>
Then, after the page is populated I would like to use javascript to strip out all of the HTML elements (but not the data) between the opening and closing DIV.
For example, <FONT color="magenta">Good Data</FONT> would be modified to only show Good Data.
I can also use Classic ASP if necessary.
Please let me know if you have any suggestions or know of any functions that I can add to the target page to accomplish this task.
Thanks in advance.
I inspected your webpage, and there definitely must be some logic errors in your ASP code. (1) Instead of something like <div></div> being passed to the browser, it is HTML entities for special characters, so it is being passed like &ltDIV&gt &lt/DIV&gt, which is very ugly and is why it is rendering as text instead of HTML code. In your ASP code, you must not be parsing the search result text before passing it to the browser. (2) All of this improperly-formatted code is inserted after the first closing html tag, and then there are closing body and html tags after the improperly-formatted code, so somewhere in your ASP code, you are telling it to append the code to the end of the document, rather than insert it inside the original <body></body>.
If you want to decode the mixture of HTML entities, <br> tags, and text into rendered HTML, this JavaScript may work:
window.onload = function() {
    var text = decodeHTMLEntities(document.body.innerText);
    document.write(text);
};
function decodeHTMLEntities(text) {
    // '&amp;' is decoded last so that text like '&amp;lt;' is not decoded twice
    var entities = [
        ['apos', '\''],
        ['#x27', '\''],
        ['#x2F', '/'],
        ['#39', '\''],
        ['#47', '/'],
        ['lt', '<'],
        ['gt', '>'],
        ['nbsp', ' '],
        ['quot', '"'],
        ['amp', '&']
    ];
    for (var i = 0, max = entities.length; i < max; ++i)
        text = text.replace(new RegExp('&'+entities[i][0]+';', 'g'), entities[i][1]);
    return text;
}
jsFiddle: https://jsfiddle.net/6ohc1tkr/
But first things first, you need to fix your ASP code, or whatever you use to parse and then display the search results. That's what is causing the improper formatting and display of the HTML. Show us your back-end code and then we can help you.
This is what I used to accomplish what you are trying to do.
string-strip-html
It worked pretty well for me.
I now have the search feature working as expected. I would like to thank everyone for their insightful comments. This feedback helped me identify and fix the problem.
OS: Windows Server 2003
IIS: 6.0
Microsoft Index Server
The hit locator tool will only work properly for HTML pages. If you use this tool with a simple TXT file then the results will not be displayed correctly.

Prevent Javascript Injection in data attribute

I have a script that pulls a text from an API and sets that as a tooltip in my html.
<div class="item ttip" data-html="<?php echo $obj->titleTag;?>">...</div>
The API allows html and javascript to be entered on their side for that field.
I tried this: $obj->titleTag = htmlentities(strip_tags_content($this->channel->status));
I then had a user who entered the following (or something similar; he is blocked now, so I cannot check it again):
\" <img src="xx" onerror=window.location.replace(https://www.youtube.com/watch?v=IAISUDbjXj0)>
which does not get caught by the above.
I could str_replace the window.location stuff, but that seems dirty.
What would be the right approach? I am reading a lot about "whitelists", but I don't understand the concept for such a case.
//EDIT strip_tags_content comes from here: https://php.net/strip_tags#86964
Well, it's not tags you're replacing now but code within tags. You need to allow only certain attributes in your code rather than stripping tags, since you've only got one tag in there ;)
What you want to do is check for any event handlers bound in the markup (a full list here) and remove them whenever an attribute is something like onerror.
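Since the question asks what a whitelist means here, the following is a rough sketch of the idea: instead of hunting for known-bad strings such as window.location, only a short list of known-good tags is let through, every attribute is dropped, and everything else is escaped. The tag list and the names escapeHtml and sanitizeWithWhitelist are assumptions made for this illustration; a vetted sanitizer library is the safer choice in production.

```javascript
// Tags that may pass through (an illustrative choice, not from the question).
var ALLOWED_TAGS = ['b', 'i', 'em', 'strong', 'br'];

var HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

// Escape the characters that have a special meaning in HTML.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, function (ch) {
    return HTML_ESCAPES[ch];
  });
}

// Keep whitelisted tags (re-emitted without any attributes), drop all other
// tags, and escape every piece of text in between.
function sanitizeWithWhitelist(input) {
  var tagPattern = /<\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g;
  var out = '';
  var last = 0;
  var m;
  while ((m = tagPattern.exec(input)) !== null) {
    out += escapeHtml(input.slice(last, m.index));
    var name = m[1].toLowerCase();
    if (ALLOWED_TAGS.indexOf(name) !== -1) {
      out += (m[0].charAt(1) === '/' ? '</' : '<') + name + '>';
    }
    last = tagPattern.lastIndex;
  }
  return out + escapeHtml(input.slice(last));
}

console.log(sanitizeWithWhitelist('<b onclick="x()">hi</b> & <i>ok</i>'));
```

Because attributes are never copied to the output, handlers such as onerror cannot survive, whatever they contain. The result is still HTML, so it has to be attribute-escaped when it is written into data-html.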

how to avoid fetching a part of html page which is being called inside another page?

I am loading an .html page (say A.html, which is generated dynamically by another piece of software each time a request is made) into another web page (say B.html) using the .load() function. Everything works fine, but I do not want the many empty "br" tags at the end of A.html to end up in B.html. Is there any way to avoid fetching those "br" tags into B.html? Any suggestion would be of great help. Thank you in advance.
You can't avoid loading part of a file when you are just accessing it.
The best option would be to simply remove the extra <br> tags from the document to begin with. There is probably a better way to accomplish whatever they are attempting to accomplish.
With some server-side scripting, it could be possible to strip them automatically when you load it, but would probably be pretty bothersome to do.
Instead, if you can't remove the <br> elements for some reason, what might be easier, if you are just dealing with a handful of <br> tags would be to simply strip them out.
Since you mention using the load() function, I'm guessing you are using jQuery.
If that's the case, something like this would cleanly strip out any extra <br> tags from the end of the document.
Here is a JSfiddle which will do it: http://jsfiddle.net/dMJ2F/
var html = "<p>A</p><br><p>B</p><br><p>C</p><br><br /><br/>";
var $html = $('<div>').append(html);
var $br;
while (($br = $html.find('br:last-child')).length > 0) {
    $br.remove();
}
$('p').text($html.html());
Basically, throw the loaded stuff in to a div (in memory), then loop through and remove each <br> at the end until there aren't any. You could use regex to do this as well, but it runs a few risks that this jQuery method doesn't.
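For comparison, the regex alternative mentioned above might look like the sketch below. The function name stripTrailingBr is hypothetical, and the pattern assumes the unwanted tags sit at the very end of the loaded string, outside any wrapper element.

```javascript
// Remove a run of <br>, <br/> or <br /> tags (and surrounding whitespace)
// from the very end of an HTML string. A <br> that is the last child of a
// wrapper, as in "<div>text<br></div>", is not touched by this pattern,
// which is one of the risks of the regex approach.
function stripTrailingBr(html) {
  return html.replace(/(?:\s*<br\s*\/?>)+\s*$/i, '');
}

var html = "<p>A</p><br><p>B</p><br><p>C</p><br><br /><br/>";
console.log(stripTrailingBr(html)); // <p>A</p><br><p>B</p><br><p>C</p>
```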
You should delete the br tags in your A.html.
Replace them by giving the class .sequence a margin-top: 30px,
and use a different value in your B.html file.
You can also run this:
$('br', '.sequence').remove();
in the load-function. It will strip all br-tags.
You can't avoid fetching a part of your page, but you CAN fetch only a part of it.
According to the jQuery docs, you can call load like this:
$("#result").load("urlorpage #form-id");
That way, you only load the form html inside the result element.

Replacing to HTML Character Entities and reverting back

When I replace characters in my chat room, the text comes up in the box as HTML character entities. However, I want it to revert and show the character that was actually typed when the message is displayed in the chat room. I am using the following code to stop any HTML from being entered and damaging the chat room, by replacing certain HTML characters with their entities (I want to get one or two working before I look at the others; I know there are many more) ....
Javascript
var str1 = this.value.replace(/>/g, '&gt;');
if (str1!=this.value) this.value=str1;
var str2 = this.value.replace(/</g, '&lt;');
if (str2!=this.value) this.value=str2;
The following code then displays the text after it has been entered into the database etc.; when the chat box updates, it uses this to add the new messages ...
Returned from php and then displayed through the following javascript
$('#chatroomarea').append($("<p>"+ data.text[i] +"</p>"));
I have messed around with this a few times changing it to val and using
.html(.append($("<p>"+ data.text[i] +"</p>")));
Etc. But I have had no luck. I'm not quite sure how to do this; I just need the HTML character entities to show up as their true characters instead of displaying something such as... '&#62'
This might be something I need to put within the replacing code itself, so that it includes code of its own when replacing, such as (this is just an example, I'm not exactly sure how I would write it) ....
var str1 = this.value.replace(/>/g, '.html(<)');
Any help on this would be much appreciated, Thank you.
$('#chatroomarea').append($("<xmp>"+ data.text[i] +"</xmp>"));
HTML xmp tag
Its use is deprecated, but it is supported in most browsers.
Another option would be to use a styled textarea. To my knowledge, these two are the only tags that display HTML markup as-is instead of rendering it.
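Another way to look at the problem, offered as an assumption about the cause rather than a diagnosis: if the text is converted to entities while it is typed and then escaped again somewhere on the way to the page, the browser ends up showing the entity itself. Escaping exactly once, at the moment the message is turned into HTML, avoids that. The names escapeHtml and messageToHtml are hypothetical.

```javascript
var HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

// Escape the characters that have a special meaning in HTML.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, function (ch) {
    return HTML_ESCAPES[ch];
  });
}

// Store and transmit the raw message; escape only when building the markup.
function messageToHtml(text) {
  return '<p>' + escapeHtml(text) + '</p>';
}

console.log(messageToHtml('<b>hi</b> 5 > 3'));
// Escaping twice produces "&amp;gt;", which the browser displays as the
// literal text "&gt;" instead of ">".
console.log(escapeHtml(escapeHtml('>')));
```

With jQuery this could be used as $('#chatroomarea').append(messageToHtml(data.text[i])); building the element with $('<p>').text(data.text[i]) has the same effect without a helper.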

Use JS to replace text in Gmail message body

I want to write a GnuPG extension for Google Chrome. So far, everything works as expected: if I detect ASCII-armored ciphertext, I parse it with my extension and then replace it (after the password has been entered).
Gmail, however, litters the message body with an insane amount of tags, so my simple JS approach doesn't work anymore. Is there something that can select a certain amount of visible text, no matter how many tags it contains, and replace it with some other text? (The tags don't need to survive.) I.e. I want to decrypt the mail body in place.
What you need is something like this:
/<[^>]+>/g
This regexp will match all tags, so replacing the matches with nothing leaves plain text. Something like this:
"<p>text <b>full</b> of <i>junk</i> and <u>unwanted</u> tags</p>".replace(/<[^>]+>/g, "");
...and for selecting a specific part you can use substring, I guess!
What I really needed to do was a little different:
1) expand my regex so it didn't care about tags:
var re = /-----[\s\S]+?-----[\s\S]+?-----[\s\S]+?-----/gm;
2) store all the matches, with tags
3) use the regex provided by gibatronic to remove tags and then further process the cleaned text using gpg
4) use body.innerHTML.replace() to replace the matches from 1) with the processed text from 3)
It works now, the only problem is it breaks Gmail. Site layout stays intact, but all buttons and links become defunct. Only solution is to reload the page. Gotta fix this :S
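The matching and cleaning steps described above can be sketched roughly as follows. The helper names htmlToText, findArmoredBlocks and replaceBlocks are hypothetical, the process callback stands in for the gpg step (which is out of scope here), and converting <br> to newlines before stripping tags is an assumption added so that the armor lines stay separated.

```javascript
// Regex from the post: matches an armored block even with tags mixed in.
var re = /-----[\s\S]+?-----[\s\S]+?-----[\s\S]+?-----/gm;

// Turn an HTML fragment into plain text: <br> becomes a newline (assumption
// of this sketch), every other tag is removed with gibatronic's regex.
function htmlToText(fragment) {
  return fragment.replace(/<br\s*\/?>/gi, '\n').replace(/<[^>]+>/g, '');
}

// Collect every match together with its cleaned text.
function findArmoredBlocks(html) {
  var blocks = [];
  var match;
  re.lastIndex = 0;
  while ((match = re.exec(html)) !== null) {
    blocks.push({ raw: match[0], text: htmlToText(match[0]) });
  }
  return blocks;
}

// Replace each raw match with the result of process(cleanedText).
// split/join is used so that "$" in the output is not treated specially.
function replaceBlocks(html, process) {
  return findArmoredBlocks(html).reduce(function (out, block) {
    return out.split(block.raw).join(process(block.text));
  }, html);
}

var body = '<div>Hello</div><div>-----BEGIN PGP MESSAGE-----<br>abc<wbr>def<br>-----END PGP MESSAGE-----</div>';
console.log(findArmoredBlocks(body)[0].text);
console.log(replaceBlocks(body, function () { return 'DECRYPTED'; }));
```

Working on a string and assigning it back to body.innerHTML rebuilds every element in the page, which is a plausible reason for the dead buttons and links; applying the replacement only to the element that holds the message would limit the damage.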
