jQuery adding unwanted characters to string? - javascript

I'm using a jQuery to add localised currency signs to a page. Seemingly an innocent and straightforward procedure:
$('.currency').text( userCurrency() )
However, if the currency string is £, the output is instead Â£.
I've got no idea what might be causing it, as I cannot recreate the issue in jsFiddle.
It doesn't happen in Firefox, IE, or Safari, only in Chrome.
By setting a breakpoint at the function call, it is possible to see that the text is not actually visible (or loaded?) in the browser (even though the code is only run after the window has loaded).
I understand I'm not giving you much to work with, and I'm sorry - this is a very bizarre issue indeed.
Has anyone out there encountered anything similar, maybe someone has ideas how I can go about troubleshooting?

This is due to a mismatch between encodings. Make sure all documents use the UTF-8 encoding, and set the meta tag for UTF-8 as well. But, as Qambar Raza's answer mentions, you should probably use the HTML entity for certain characters instead of the literal character.

You need to use this in the <head> of your HTML page:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
It sets the content type of your page to the UTF-8 encoding, meaning it can represent the character set needed for currency symbols.
Also, make sure you use HTML entities, like:
&pound; for £
i.e. your function userCurrency() should return HTML entities (e.g. &pound;) rather than the symbol directly. You can find more details about this topic at the following link:
http://webdesign.about.com/od/localization/l/blhtmlcodes-cur.htm
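As a minimal sketch of that advice (the body of userCurrency() is assumed here, since the question doesn't show it; note that jQuery's .text() inserts the literal string "&pound;", so the entity only renders as £ if you switch to .html()):

// Hypothetical sketch: userCurrency() returns an HTML entity string.
function userCurrency() {
  return '&pound;'; // assumed mapping for a UK locale
}
// .html() parses the entity; .text() would display "&pound;" literally.
$('.currency').html( userCurrency() );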

Related

define word boundaries for HTML5 spellcheck

I have some HTML in a contenteditable that looks like <span>hello wor</span><strong>ld</strong>. If I change it so that "world" is misspelt, I would like to be able to get suggestions for the complete word. However, this is what actually happens:
The text is separated into two words, left clicking simply gives suggestions for one or the other.
Is there any recourse?
The implementation of “spelling checks” requested by using the spellcheck attribute (which the question is apparently about) is heavily browser-dependent, and the HTML5 spec intentionally leaves the issue open. Browsers may implement whatever checks they like, the way they like, and they do. You cannot change this in your code.
Although modern browsers generally have spelling checks of some kind at least for English, they differ in the treatment of cases like this, among other things. Firefox treats adjacent inline elements (with no whitespace between them) as constituting one word, but Chrome and IE do not. Moreover, browsers might not spellcheck initial content, only content as entered or edited by the user.
The only way to get consistent spelling checking is to implement it yourself: instead of using the spellcheck attribute ("HTML5 spellcheck"), you would need a spell-checking routine and to integrate it into your HTML document using JavaScript. People who have implemented such systems normally run the routine server-side and have the HTML page communicate with it using Ajax.
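A minimal sketch of that approach, assuming a hypothetical /spellcheck endpoint that accepts the editor's text and returns misspellings with suggestions as JSON (the endpoint and the response shape are made up for illustration):

// Send the contenteditable's plain text to a server-side checker on each edit.
var editor = document.querySelector('[contenteditable]');
editor.addEventListener('input', function () {
  // textContent joins adjacent inline elements, so "wor" + "ld" arrives as "world"
  var text = editor.textContent;
  fetch('/spellcheck', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: text })
  })
  .then(function (response) { return response.json(); })
  .then(function (result) {
    // assumed shape: result.misspellings = [{ word: 'worlld', suggestions: ['world'] }]
    console.log(result.misspellings);
  });
});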

How to clear MS Word content when printed as PDF using Pechkin

Our client asked our team to implement print functionality for the companies that are their clients. The page where they add companies has a textarea which they use to write a short description. However, they sometimes copy and paste from MS Word or other sources. When copied into the textarea it looks normal, but when printed it often contains strange characters (see the screenshot at the following URL: http://prntscr.com/4oadw3 )
Is there any way we can clean what they paste before we convert the HTML to PDF?
I would appreciate your help with this.
Thanks
You can always do some pre-processing before generating the PDF, either by properly encoding the "strange characters" or by deleting them.
Also check whether the HTML is shown properly in a browser; wkhtmltopdf uses WebKit to render the HTML.
Having this in the HTML header might help too.
<meta charset="utf-8" />
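As a rough sketch of such pre-processing on the client side, the common MS Word "smart" characters could be mapped back to plain ASCII before the HTML ever reaches the converter (the character list here is illustrative, not exhaustive):

// Replace common MS Word "smart" characters with plain ASCII equivalents.
function cleanWordCharacters(text) {
  return text
    .replace(/[\u2018\u2019]/g, "'")  // curly single quotes
    .replace(/[\u201C\u201D]/g, '"')  // curly double quotes
    .replace(/\u2013/g, '-')          // en dash
    .replace(/\u2014/g, '--')         // em dash
    .replace(/\u2026/g, '...')        // ellipsis
    .replace(/\u00A0/g, ' ');         // non-breaking space
}

var box = document.querySelector('textarea');
box.addEventListener('paste', function () {
  // Let the pasted text land in the textarea first, then normalise it.
  setTimeout(function () { box.value = cleanWordCharacters(box.value); }, 0);
});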

Adding the script tag before Doctype Declaration

I just did a little test and if I add a script in the manner illustrated below it seems to execute in most modern major browsers. It executes before the page is loaded, I was wondering if this would work across all browsers (including historic)?
<script type="text/javascript">
alert("hello world");
</script>
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
</body>
</html>
I am of course trying to find a way to execute a script to set a page up before any of it is loaded...any input towards this end would be greatly appreciated. Would it be wrong to use this method?
Thanks in advance for any help!
The script gets executed, but the markup (any element before the DOCTYPE string) puts some browsers into quirks mode, which means a large number of poorly documented quirks and oddities, and mess, too.
So whatever your reasons are, you should at least put the element in the first syntactically correct place, namely right after the <head> tag. This can hardly matter as compared with placing it at the start of the document.
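For example, the question's markup could be rearranged like this (the same script, simply moved to a conforming position):

<!DOCTYPE html>
<html>
<head>
<!-- script now sits inside <head>, after the DOCTYPE -->
<script type="text/javascript">
alert("hello world");
</script>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
</body>
</html>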
Whether the placement solves your real problem is an entirely different thing. You should ask a separate question, clearly describing the problem (rather than an assumed solution), preferably illustrated by some code that demonstrates the problem.
According to the specs,
A conformant document in the HTML syntax must consist of the following
parts, in the following order:
Optionally, a single U+FEFF BYTE ORDER MARK (BOM) character.
Any number of comments and space characters.
A doctype.
Any number of comments and space characters.
An html element, with its attributes (if any) and its contents (if any).
Browsers follow these specs, and your code (even though it works now) may break in the future, since it clearly breaks the rule about the order of elements.
Secondly, it's almost always better to load scripts last, for a performance gain.
You mention in the comments that you want to hide/show elements before the page is displayed, and that onload is too slow. Try using DOMContentLoaded instead, as it triggers as soon as the HTML DOM is built, but before all images, CSS, and other external resources have loaded.
That has always worked for me - though I use jQuery's ready event to make it work cross-browser. And it keeps your HTML valid.
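A minimal sketch of both variants (the #setup-target id is made up for illustration):

// Plain DOM: runs once the HTML is parsed, before images finish loading.
document.addEventListener('DOMContentLoaded', function () {
  document.getElementById('setup-target').style.display = 'none';
});

// jQuery equivalent: the ready event papers over old-browser differences.
$(document).ready(function () {
  $('#setup-target').hide();
});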
I'm seeing malicious JavaScript code injected exactly like this, which somehow creates a blank space at the top of a WordPress page. If you click that blank space, you're taken to a site that talks about crypto. The malicious script uses the JavaScript atob() function; once the payload is base64-decoded and HTML-unescaped, it causes the crypto page to be loaded.
Just so you know...

Reading from the HTML DOM returns UTF-8 characters

I have a contenteditable div where I'm reading individual characters and sending them off to a server (for more background this is similar to Google Wave where typing a character automatically sends it)
I was using a plain old html textfield before and everything worked fine until I "upgraded" to a contenteditable div.
My problem is that now the characters are in UTF-8 format, which is causing some weird problems on the server that I would rather not debug. It would be much easier to force everything to be ASCII on the client side.
Is there any way to do this? I tried putting in a meta tag stating the HTML file is charset=ISO-8859-1, but it doesn't seem to work. Reading from the div tag still returns UTF-8 codes. (One example: when I press space I get the byte pair 0xC2 0xA0, which corresponds to a non-breaking space.)
Why are you using ASCII for user input? You're just delaying an enormous headache.
But, to answer your question: if your application expects ASCII, you need to check user input and convert it to ASCII manually. It sounds like you need to check every keystroke and convert it on the fly before it is even rendered to the screen. charset doesn't apply to user input; that depends on system-specific settings.
But again: just use UTF-8.
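If the client side really must produce ASCII, one sketch of that manual conversion is to normalise the text read from the div before it is sent, mapping the non-breaking space from the question back to a plain space (the fallback replacement character is an assumption):

// Normalise contenteditable text to ASCII before sending it to the server.
// U+00A0 is the non-breaking space that arrives as the bytes 0xC2 0xA0 in UTF-8.
function toPlainAscii(text) {
  return text
    .replace(/\u00A0/g, ' ')         // non-breaking space -> ordinary space
    .replace(/[^\x00-\x7F]/g, '?');  // crude fallback for other non-ASCII characters
}

var editor = document.querySelector('[contenteditable]');
var payload = toPlainAscii(editor.textContent);
// send `payload` to the server from here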

Weird javascript

I've got a very interesting thing with some HTML here.
<html>
<head>
<script language="Javascript">
var str = "me&myhtml";
function delimit() {
  document.write(str);
  document.write("<hr>");
}
</script>
</head>
<body onload="delimit()">
</body>
</html>
You see 'me&myhtml' with a line beneath. However, if you comment out document.write("<hr>");
then you just see 'me', which is weird.
The thing is, JavaScript normally doesn't output '&' as-is, but rather interprets it as a command. However, if you have any document.write() with at least 6 characters, or a 'br', or 'hr', then it displays '&' as it is.
Any ideas?
If there is no HTML tag after &myhtml, JavaScript or the HTML rendering engine probably interprets it as an unrecognized or incomplete entity (lacking the ending ;) and does not render it. If you follow me&myhtml with an HTML tag, then JavaScript or the HTML rendering engine recognizes that &myhtml is not an incomplete entity, because it is followed by an HTML tag instead of a ;.
It doesn't matter what HTML tag you follow the string with, <a> or <p> work just as well as <br>.
The behavior is not consistent across all browsers. IE 6, 7 & 8 and Firefox 2, 3 & 3.5 behave the way you describe, Safari for Windows 3 & 4 render nothing when you comment out the second document.write() or do something like document.write("abc");. Opera 9.64 & 10b3 render the text correctly regardless of the content of the second write().
Note that using document.write() in the onload event without writing out correctly formatted and valid HTML (complete with the doctype and <html>, <head>, etc.) is probably a bug anyway. Note that the problem does not manifest itself if you do:
<html>
<head>
<script>
function delimit() {
  document.write('me&myhtml');
}
</script>
</head>
<body>
<script>
delimit();
</script>
</body>
</html>
To answer your question: it is a bug either in the implementation of a specification, or in an undefined part of a specification that each browser implements differently. What you are doing is slightly incorrect; the outcome of that slightly incorrect code is not defined, and it cannot be relied on to behave consistently across all of the major browsers in use today. What you are doing should be avoided.
Try str = "me&amp;myhtml";
I just wanted to know why this is happening.
Well, the browser doesn't yet know whether your &myhtml is an entity reference you haven't finished writing yet, or just broken code it will have to fix up. For example you can say:
document.write('&eac');
document.write('ute;');
(Of course there is no entity reference like &myhtmlpotato; that you could be referring to, but the parser doesn't know that yet.)
If you let the parser know there's no more bits of entity reference coming, by document-writing something that couldn't possibly be in an entity reference, such as a <tag>, or spaces, it'll give up and decide your code was simply broken, and fix it.
Normally the end of the page would be a place where this would happen, but you don't have an end of the page, because the script isn't doing what you think it's doing.
Instead of calling document.write() during the original page loading process, when it can write some content to your current page, you're calling it in the onload, by which time the document is completely loaded and you can't add to it. In this state, calling document.write() actually calls document.open() implicitly, destroying the current page and starting a new one, to which you then write 'me&myhtml'. That new page you have opened stays open and not fully loaded until you call document.close() to tell it you aren't going to write any more to it. At that point your partial entity reference will be resolved as bad markup and fixed.
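A small sketch of that fix, applied to the question's page (the same write, followed by an explicit close so the parser can finish and resolve the dangling &myhtml as plain text):

<body onload="delimit()">
<script type="text/javascript">
function delimit() {
  document.write("me&myhtml"); // implicit document.open(): the old page is replaced
  document.close();            // no more writes coming; the new document can finish parsing
}
</script>
</body>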
It probably has to do with the fact that & is essentially an escaping character in HTML. If you want to write out an ampersand to the page, you should use &amp; instead.
