JavaScript encodeURIComponent doesn't encode single quotes

Try it out:
encodeURIComponent("'@#$%^&");
If you try this out you will see all the special characters are encoded except for the single quote. What function can I use to encode ALL the characters and use PHP to decode them?
Thanks.

I'm not sure why you would want them to be encoded. If you only want to escape single quotes, you could use .replace(/'/g, "%27"). However, good references are:
When are you supposed to use escape instead of encodeURI / encodeURIComponent?
Comparing escape(), encodeURI(), and encodeURIComponent() at xkr.us
Javascript Madness: Query String Parsing #Javascript Encode/Decode Functions

You can use:
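function fixedEncodeURIComponent(str) {
  // escape() is deprecated and leaves * unchanged, so build the escape sequence by hand
  return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
    return '%' + c.charCodeAt(0).toString(16).toUpperCase();
  });
}
fixedEncodeURIComponent("'@#$%^&");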
Check reference: http://mdn.beonex.com/en/JavaScript/Reference/Global_Objects/encodeURIComponent.html

You can use btoa() and atob(). They Base64-encode and decode the given string, including single quotes.
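Two caveats apply to the Base64 route: btoa() throws on characters above U+00FF, and its output can contain +, / and =, which are themselves significant in a query string. The following is a minimal sketch of one way around both, assuming an environment that provides TextEncoder and TextDecoder (modern browsers, recent Node.js). The helper names toBase64Param and fromBase64Param are made up for illustration.

```javascript
// Hypothetical helpers: Base64 a string so it can travel as a query-string value.
function toBase64Param(str) {
  // btoa() only accepts code units up to 0xFF, so go through UTF-8 bytes first.
  const bytes = new TextEncoder().encode(str);
  let binary = '';
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  // Base64 output may contain + / =, so percent-encode it for the URL.
  return encodeURIComponent(btoa(binary));
}

function fromBase64Param(param) {
  const binary = atob(decodeURIComponent(param));
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

const encoded = toBase64Param("it's 100% (a & b)*");
console.log(encoded);
console.log(fromBase64Param(encoded)); // it's 100% (a & b)*
```

On the PHP side, calling base64_decode() on the received parameter should recover the same UTF-8 bytes, since PHP already URL-decodes request parameters.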

Just try encodeURI() and encodeURIComponent() yourself...
console.log(encodeURIComponent('@#$%^&*'));
Input: @#$%^&*. Output: %40%23%24%25%5E%26*. So, wait, what happened to *? Why wasn't this converted? TLDR: You actually want fixedEncodeURIComponent() and fixedEncodeURI(). Long story...
encodeURIComponent() : Do not use. Use fixedEncodeURIComponent(), as defined and explained by the MDN encodeURIComponent() Documentation, emphasis mine...
To be more stringent in adhering to RFC 3986 (which reserves !, ', (, ), and *), even though these characters have no formalized URI delimiting uses, the following can be safely used:
function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
    return '%' + c.charCodeAt(0).toString(16);
  });
}
While we're on the topic, also don't use encodeURI(). MDN also has their own rewrite of it, as defined by the MDN encodeURI() Documentation. To quote their explanation...
If one wishes to follow the more recent RFC3986 for URLs, which makes square brackets reserved (for IPv6) and thus not encoded when forming something which could be part of a URL (such as a host), the following code snippet may help:
function fixedEncodeURI(str) {
  return encodeURI(str).replace(/%5B/g, '[').replace(/%5D/g, ']');
}
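To see what the two MDN helpers quoted above change in practice, here is a small side-by-side sketch. The sample strings are made up for illustration, and the hex digits come out lowercase because the helper calls toString(16) as written.

```javascript
function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
    return '%' + c.charCodeAt(0).toString(16);
  });
}

function fixedEncodeURI(str) {
  return encodeURI(str).replace(/%5B/g, '[').replace(/%5D/g, ']');
}

// Component encoding: the five extra characters are only handled by the fixed version.
const sample = "it's (all) good!*";
console.log(encodeURIComponent(sample));      // it's%20(all)%20good!*
console.log(fixedEncodeURIComponent(sample)); // it%27s%20%28all%29%20good%21%2a

// Whole-URL encoding: the fixed version restores the brackets of an IPv6 host.
const ipv6Url = 'http://[::1]:8080/a b';
console.log(encodeURI(ipv6Url));      // http://%5B::1%5D:8080/a%20b
console.log(fixedEncodeURI(ipv6Url)); // http://[::1]:8080/a%20b
```

Either form decodes back to the original with decodeURIComponent(), so the stricter output should not cause trouble on the receiving side.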

Recent answer (2021)
Using JavaScript's URLSearchParams:
console.log(new URLSearchParams({ encoded: "'@#$%^&" }).toString())
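As a rough illustration of what URLSearchParams produces, the sketch below serializes a made-up value and parses it back. The parameter name q and the sample string are assumptions chosen for the example.

```javascript
// Serialize one key/value pair using form-style (application/x-www-form-urlencoded) rules.
const params = new URLSearchParams({ q: "it's 100% (a & b)" });
const query = params.toString();
console.log(query); // q=it%27s+100%25+%28a+%26+b%29

// Parsing the query string again recovers the original value.
const roundTripped = new URLSearchParams(query).get('q');
console.log(roundTripped); // it's 100% (a & b)
```

Note that spaces are written as + rather than %20, and that * is left unencoded by this serializer. PHP's urldecode() and the $_GET array both treat + as a space, so the value should arrive intact.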

I found a neat trick that never misses any characters. I tell it to replace everything except for nothing. I do it like this (URL encoding):
function encode(w) {
  // [^] matches every UTF-16 code unit, including newlines
  return w.replace(/[^]/g, function (c) {
    // pad to two hex digits; only valid for code units up to 0xFF
    return '%' + c.charCodeAt(0).toString(16).padStart(2, '0');
  });
}
loader.value = encode(document.body.innerHTML);
<textarea id=loader rows=11 cols=55>www.WHAK.com</textarea>
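One caveat with encoding each character through charCodeAt() is that code units above 0xFF yield more than two hex digits, which is not valid percent-encoding. A possible variant, sketched here under the assumption that TextEncoder is available, encodes every UTF-8 byte instead. The name encodeEveryByte is hypothetical.

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte of the input.
function encodeEveryByte(str) {
  const bytes = new TextEncoder().encode(str);
  return Array.from(
    bytes,
    (b) => '%' + b.toString(16).toUpperCase().padStart(2, '0')
  ).join('');
}

console.log(encodeEveryByte("a'\n")); // %61%27%0A
console.log(decodeURIComponent(encodeEveryByte('åäö 😎'))); // åäö 😎
```

Because the output is plain UTF-8 percent-encoding, decodeURIComponent() in JavaScript or rawurldecode() in PHP should be able to reverse it.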

As @Bergi wrote, you can just replace all the characters:
function encodePicture(pictureUrl) {
  // characters to escape on top of what encodeURI() already handles
  var map = {
    '&': '%26',
    '<': '%3c',
    '>': '%3e',
    '"': '%22',
    "'": '%27'
  };
  var encodedPic = encodeURI(pictureUrl);
  var result = encodedPic.replace(/[&<>"']/g, function (m) { return map[m]; });
  return result;
}

Related

Escaping ' and & and similar characters in url

I need a way to encode both ' and & in a url. Check the following examples:
// "get_records.php?artist=Mumford%20%26%20Sons"
"get_records.php?artist=" + encodeURIComponent("Mumford & Sons");
// "get_records.php?artist=Gigi%20D'Agostino"
"get_records.php?artist=" + encodeURIComponent("Gigi D'Agostino");
encodeURIComponent doesn't encode '. I can use escape instead, but it's deprecated, I guess. What do I do in this case? Create a custom encoder?
I'll be escaping other characters too: :, /, ., ,, !, for example, for the following strings
"11:59"
"200 km/h in the Wrong Lane"
"P.O.D."
"Everybody Else Is Doing It, So Why Can't We"
"Up!"
So creating a custom encoder seems like the best option. Is there an alternative approach that I can use?
You will have to implement this functionality yourself.
MDN covers this exact topic. Extending their proposal to cover the & character and others as well should be trivial.
To be more stringent in adhering to RFC 3986 (which reserves !, ', (, ), and *), even though these characters have no formalized URI delimiting uses, the following can be safely used:
function fixedEncodeURIComponent (str) {
return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
return '%' + c.charCodeAt(0).toString(16);
});
}
Source
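Applied to the titles from the question, the MDN helper behaves as sketched below. The loop and the titles array are only there for illustration. Note that :, / and , are already handled by encodeURIComponent(), while . is an unreserved character and stays as it is.

```javascript
function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
    return '%' + c.charCodeAt(0).toString(16);
  });
}

// Sample titles taken from the question.
const titles = [
  "11:59",
  "200 km/h in the Wrong Lane",
  "P.O.D.",
  "Everybody Else Is Doing It, So Why Can't We",
  "Up!",
  "Gigi D'Agostino"
];

for (const title of titles) {
  console.log("get_records.php?artist=" + fixedEncodeURIComponent(title));
}
// e.g. get_records.php?artist=11%3A59
//      get_records.php?artist=Gigi%20D%27Agostino
```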
You can replace the characters before creating the URL:
' = %27
See this reference:
http://www.w3schools.com/tags/ref_urlencode.asp

Replacement for javascript escape?

I know that the escape function has been deprecated and that you should use encodeURI or encodeURIComponent instead. However, encodeURI and encodeURIComponent don't do the same thing as escape.
I want to create a mailto link in JavaScript with the Swedish characters åäö. Here is a comparison between escape, encodeURIComponent and encodeURI:
var subject="åäö";
var body="bodyåäö";
console.log("mailto:?subject="+escape(subject)+"&body=" + escape(body));
console.log("mailto:?subject="+encodeURIComponent(subject)+"&body=" + encodeURIComponent(body));
console.log("mailto:?subject="+encodeURI(subject)+"&body=" + encodeURI(body));
Output:
mailto:?subject=My%20subject%20with%20%E5%E4%F6&body=My%20body%20with%20more%20characters%20and%20swedish%20%E5%E4%F6
mailto:?subject=My%20subject%20with%20%C3%A5%C3%A4%C3%B6&body=My%20body%20with%20more%20characters%20and%20swedish%20%C3%A5%C3%A4%C3%B6
mailto:?subject=My%20subject%20with%20%C3%A5%C3%A4%C3%B6&body=My%20body%20with%20more%20characters%20and%20swedish%20%C3%A5%C3%A4%C3%B6
Only the mailto link created with "escape" opens a properly formatted mail in Outlook using IE or Chrome. When using encodeURI or encodeURIComponent the subject says:
My subject with Ã¥Ã¤Ã¶
and the body is also looking messed up.
Is there some other function besides escape that I can use to get the working mailto link?
escape() is defined in section B.2.1.2 escape and the introduction text of Annex B says:
... All of the language features and behaviours specified in this annex have one or more undesirable characteristics and in the absence of legacy usage would be removed from this specification. ...
For characters whose code unit value is 0xFF or less, escape() produces a two-digit escape sequence: %xx. This basically means that escape() converts a string containing only characters from U+0000 to U+00FF to a percent-encoded string using the Latin-1 encoding.
For characters with a greater code unit value, the four-digit format %uxxxx is used. This is not allowed within the hfields section (where subject and body are stored) of a mailto: URI (as defined in RFC 6068):
mailtoURI = "mailto:" [ to ] [ hfields ]
to = addr-spec *("," addr-spec )
hfields = "?" hfield *( "&" hfield )
hfield = hfname "=" hfvalue
hfname = *qchar
hfvalue = *qchar
...
qchar = unreserved / pct-encoded / some-delims
some-delims = "!" / "$" / "'" / "(" / ")" / "*"
/ "+" / "," / ";" / ":" / "@"
unreserved and pct-encoded are defined in STD66:
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded = "%" HEXDIG HEXDIG
A percent sign is only allowed if it is directly followed by two hexdigits, percent followed by u is not allowed.
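The difference between the two output formats can be checked directly. The sketch below is only illustrative: hasOnlyValidPercentEscapes is a hypothetical helper that tests the pct-encoded rule quoted above, not a full RFC 6068 validator.

```javascript
// escape() is a legacy Annex B function; it is called here only to show its output.
console.log(escape('åäö')); // %E5%E4%F6     (Latin-1 bytes)
console.log(escape('😎'));  // %uD83D%uDE0E  (UTF-16 code units, %u format)

console.log(encodeURIComponent('åäö')); // %C3%A5%C3%A4%C3%B6 (UTF-8 bytes)
console.log(encodeURIComponent('😎'));  // %F0%9F%98%8E

// Hypothetical check: every % must be followed by exactly two hex digits.
function hasOnlyValidPercentEscapes(s) {
  return !/%(?![0-9A-Fa-f]{2})/.test(s);
}

console.log(hasOnlyValidPercentEscapes(escape('åäö')));            // true
console.log(hasOnlyValidPercentEscapes(escape('😎')));             // false
console.log(hasOnlyValidPercentEscapes(encodeURIComponent('😎'))); // true
```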
Using a self-implemented version that behaves exactly like escape() doesn't solve anything. Instead, just continue to use escape(); it won't be removed anytime soon.
To summarise: your previous usage of escape() generated Latin-1 percent-encoded mailto URIs if all characters were in the range U+0000 to U+00FF; otherwise an invalid URI was generated (which might still be correctly interpreted by some applications, if they had JavaScript encode/decode compatibility in mind).
It is more correct (no risk of creating invalid URIs) and future-proof, to generate UTF8-percent-encoded mailto-URIs using encodeURIComponent() (don't use encodeURI(), it does not escape ?, /, ...). RFC6068 requires usage of UTF-8 in many places (but allows other encodings for "MIME encoded words and for bodies in composed email messages").
Example:
var text_latin1 = "Swedish åäö";
var text_other = "Emoji 😎";
document.getElementById('escape-latin-1-link').href="mailto:?subject="+escape(text_latin1);
document.getElementById('escape-other-chars-link').href="mailto:?subject="+escape(text_other);
document.getElementById('utf8-link').href="mailto:?subject="+encodeURIComponent(text_latin1);
document.getElementById('utf8-other-chars-link').href="mailto:?subject="+encodeURIComponent(text_other);
function mime_word(text) {
  var q_encoded = encodeURIComponent(text) // to UTF-8 percent-encoding
    .replace(/[_!'()*]/g, function (c) { return '%' + c.charCodeAt(0).toString(16).toUpperCase(); }) // also encode the characters encodeURIComponent leaves alone
    .replace(/%20/g, '_') // MIME Q-encoding uses an underscore for a space
    .replace(/%/g, '='); // MIME Q-encoding uses = instead of %
  return encodeURIComponent('=?utf-8?Q?' + q_encoded + '?='); // wrap as a MIME encoded word and escape for the URI
}
//don't use mime_word for body!!!
document.getElementById('mime-word-link').href="mailto:?subject="+mime_word(text_latin1);
document.getElementById('mime-word-other-chars-link').href="mailto:?subject="+mime_word(text_other);
<a id="escape-latin-1-link">escape()-latin1</a><br/>
<a id="escape-other-chars-link">escape()-emoji</a><br/>
<a id="utf8-link">utf8</a><br/>
<a id="utf8-other-chars-link">utf8-emoji</a><br/>
<a id="mime-word-link">mime-word</a><br/>
<a id="mime-word-other-chars-link">mime-word-emoji</a><br/>
For me, the UTF-8 links and the MIME-word links work in Thunderbird. Only the plain UTF-8 links work in the Windows 10 built-in Mail app and my up-to-date version of Outlook.
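Pulling the UTF-8 recommendation together, a mailto link could be assembled roughly as follows. The helper name buildMailto is made up for this sketch, and it assumes the recipient address is a plain addr-spec that needs no further encoding.

```javascript
// Hypothetical helper: build a mailto: URI with UTF-8 percent-encoded header fields.
function buildMailto(to, subject, body) {
  const fields = [];
  if (subject) fields.push('subject=' + encodeURIComponent(subject));
  if (body) fields.push('body=' + encodeURIComponent(body));
  // Assumption: `to` is already a valid, plain address (or empty).
  return 'mailto:' + to + (fields.length ? '?' + fields.join('&') : '');
}

console.log(buildMailto('', 'åäö', 'bodyåäö'));
// mailto:?subject=%C3%A5%C3%A4%C3%B6&body=body%C3%A5%C3%A4%C3%B6
```

Whether a given mail client displays the result correctly still depends on the client, as the observations above show.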
To quote the MDN Documentation directly...
This function was used mostly for URL queries (the part of a URL following ?)—not for escaping ordinary String literals, which use the format "\xHH". (HH are two hexadecimal digits, and the form \xHH\xHH is used for higher-plane Unicode characters.)
The problem you are experiencing is because escape() does not support UTF-8, while encodeURI() and encodeURIComponent() do.
But to be absolutely clear: never use encodeURI() or encodeURIComponent(). Let's just try it out:
console.log(encodeURIComponent('@#*'));
Input: @#*. Output: %40%23*. Ordinarily, once user input is cleansed, I feel like I can trust that cleansed input. But if I ran rm * on my Linux system to delete a file specified by a user, that would literally delete every file in the current directory, even if I did the encoding 100% completely server-side. This is a massive bug in encodeURI() and encodeURIComponent(), which the MDN Web Docs clearly point out, along with a solution.
Use fixedEncodeURI(), when trying to encode a complete URL (i.e., all of example.com?arg=val), as defined and further explained at the MDN encodeURI() Documentation...
function fixedEncodeURI(str) {
return encodeURI(str).replace(/%5B/g, '[').replace(/%5D/g, ']');
}
Or, you may need to use fixedEncodeURIComponent(), when trying to encode part of a URL (i.e., the arg or the val in example.com?arg=val), as defined and further explained at the MDN encodeURIComponent() Documentation...
function fixedEncodeURIComponent(str) {
return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
return '%' + c.charCodeAt(0).toString(16);
});
}
If you are having trouble distinguishing what fixedEncodeURI(), fixedEncodeURIComponent(), and escape() do, I always like to simplify it with:
fixedEncodeURI(): will not encode +@?=:#;,$& to their percent-encoded equivalents (as & and + are common URL operators).
fixedEncodeURIComponent(): will encode +@?=:#;,$& to their percent-encoded equivalents.
The escape() function was deprecated in JavaScript version 1.5. Use encodeURI() or encodeURIComponent() instead.
Example:
string: "May/June 2016, Volume 72, Issue 3"
escape: "May/June%202016%2C%20Volume%2072%2C%20Issue%203"
encodeURI: "May/June%202016,%20Volume%2072,%20Issue%203"
encodeURIComponent: "May%2FJune%202016%2C%20Volume%2072%2C%20Issue%203"
Source: https://www.w3schools.com/jsref/jsref_escape.asp

javascript encodeURIComponent and converting spaces to + symbols

I would like to encode my URL, but I want to convert spaces to plus symbols.
This is what I attempted to do...
var search = "Testing this here &";
encodeURIComponent(search.replace(/ /gi,"+"));
The output from that is Testing%2Bthis%2Bhere%2B%26, but what I would like it to be is Testing+this+here+%26. I tried replacing the space with %20 to convert it into a plus symbol, but that didn't seem to work. Can anyone tell me what it is I'm doing wrong here?
encodeURIComponent(search).replace(/%20/g, "+");
What you're doing wrong here is that first you convert spaces to pluses, but then encodeURIComponent converts pluses to "%2B".
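The order of operations described above can be wrapped in a small helper. This is a minimal sketch; the name encodeWithPlus is hypothetical.

```javascript
// Hypothetical helper: encode first, then turn the encoded spaces into plus signs.
function encodeWithPlus(str) {
  return encodeURIComponent(str).replace(/%20/g, '+');
}

console.log(encodeWithPlus('Testing this here &')); // Testing+this+here+%26
console.log(encodeWithPlus('1 + 1'));               // 1+%2B+1
```

Because the replacement runs after encoding, a literal + in the input is already %2B by then and cannot be confused with a space.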
Just try encodeURI() and encodeURIComponent() yourself...
console.log(encodeURIComponent('@#$%^&*'));
Input: @#$%^&*. Output: %40%23%24%25%5E%26*. So, wait, what happened to *? Why wasn't this converted? TLDR: You actually want fixedEncodeURIComponent() and fixedEncodeURI(). Long story...
Don't use encodeURIComponent() directly.
You should use fixedEncodeURIComponent(), as indicated by the MDN Documentation. encodeURIComponent() does not encode any of the following: !, ', (, ), and *. You need to use this other function for those characters. Note that it still encodes spaces as %20, so append .replace(/%20/g, '+') if you want plus signs.
function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
    return '%' + c.charCodeAt(0).toString(16);
  });
}
To quote the MDN Documentation encodeURIComponent()...
To be more stringent in adhering to RFC 3986 (which reserves !, ', (, ), and *), even though these characters have no formalized URI delimiting uses, the following can be safely used: fixedEncodeURIComponent().

Why doesn't this particular regex work in JavaScript?

I have this regex in JavaScript:
var myString="aaa@aaa.com";
var mailValidator = new RegExp("\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*");
if (!mailValidator.test(myString))
{
alert("incorrect");
}
but it shouldn't alert "incorrect" with aaa@aaa.com.
It should return "incorrect" for aaaaaa.com instead (as an example).
Where am I wrong?
When you create a regex from a string, you have to take into account the fact that the parser will strip out backslashes from the string before it has a chance to be parsed as a regex.
Thus, by the time the RegExp() constructor gets to work, all the \w tokens have already been changed to just plain "w" in the string constant. You can either double the backslashes so the string parse will leave just one, or you can use the native regex constant syntax instead.
It works if you do this:
var mailValidator = /\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*/;
What happens in yours is that you need to double escape the backslash because they're inside a string, like "\\w+([-+.]\\w+)*...etc
Here's a link that explains it (in the "How to Use The JavaScript RegExp Object" section).
Try var mailValidator = new RegExp("\\w+([-+.]\\w+)*@\\w+([-.]\\w+)*\\.\\w+([-.]\\w+)*");
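The effect of the string parser can be observed by inspecting the source property of each regex. The sketch below uses a deliberately shortened pattern (\w+\.\w+) rather than the full validator, and the variable names are made up for illustration.

```javascript
const fromLiteral = /\w+\.\w+/;
const fromEscapedString = new RegExp("\\w+\\.\\w+");
// String.raw keeps backslashes as written, which avoids the doubling.
const fromRawString = new RegExp(String.raw`\w+\.\w+`);
// Here the string parser drops the backslashes before RegExp ever sees them.
const fromUnescapedString = new RegExp("\w+\.\w+");

console.log(fromUnescapedString.source);                      // w+.w+
console.log(fromEscapedString.source === fromLiteral.source); // true
console.log(fromRawString.source === fromLiteral.source);     // true

console.log(fromLiteral.test('aaa.com'));         // true
console.log(fromUnescapedString.test('aaa.com')); // false: it now looks for literal "w" characters
```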

Javascript equivalent to php's urldecode()

I wrote a custom XML parser and it's locking up on special characters. So naturally I urlencoded them into my database.
I can't seem to find an equivalent to PHP's urldecode().
Are there any extensions for jQuery or JavaScript that can accomplish this?
You could use the decodeURIComponent function to convert the %xx into characters. However, to convert + into spaces you need to replace them in an extra step.
function urldecode(url) {
return decodeURIComponent(url.replace(/\+/g, ' '));
}
Check out this one
function urldecode (str) {
return decodeURIComponent((str + '').replace(/\+/g, '%20'));
}
I think you need the decodeURI function.
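To compare the suggestions, the sketch below runs the same made-up inputs through a urldecode() helper and through decodeURI(). The sample strings are assumptions chosen to exercise + and %26.

```javascript
// Same approach as the answers above: turn + into a space, then decode.
function urldecode(str) {
  return decodeURIComponent(String(str).replace(/\+/g, ' '));
}

console.log(urldecode('Mumford+%26+Sons')); // Mumford & Sons
console.log(urldecode('1+%2B+1'));          // 1 + 1

// decodeURI() leaves + alone and does not decode reserved characters such as %26.
console.log(decodeURI('Mumford+%26+Sons')); // Mumford+%26+Sons
```

Replacing + before decoding matters: doing it afterwards would also turn a decoded %2B into a space. Malformed input (for example a lone %) makes decodeURIComponent() throw a URIError, so a try/catch may be worth adding for untrusted data.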
