Should I use encodeURI or encodeURIComponent for encoding URLs? - javascript

Which of these two methods should be used for encoding URLs?

It depends on what you are actually wanting to do.
encodeURI assumes that the input is a complete URI that might have some characters which need encoding in it.
encodeURIComponent will encode everything with special meaning, so you use it for components of URIs such as
var world = "A string with symbols & characters that have special meaning?";
var uri = 'http://example.com/foo?hello=' + encodeURIComponent(world);

If you're encoding a string to put in a URL component (a querystring parameter), you should call encodeURIComponent.
If you're encoding an existing URL, call encodeURI.
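To make the distinction concrete, here is a minimal sketch. The URLs and the `title` value are made up for illustration; only `encodeURI` and `encodeURIComponent` come from the language itself.

```javascript
// A whole (slightly broken) URL: encodeURI escapes the space but leaves
// the characters that give the URL its structure (: / ? = &) alone.
const wholeUrl = encodeURI('http://example.com/my page.html?a=1&b=2');

// A single value that happens to contain reserved characters:
// encodeURIComponent escapes them so they cannot be mistaken for delimiters.
const title = 'Q&A: 50% off?';
const withParam = 'http://example.com/search?title=' + encodeURIComponent(title);

console.log(wholeUrl);  // http://example.com/my%20page.html?a=1&b=2
console.log(withParam); // http://example.com/search?title=Q%26A%3A%2050%25%20off%3F
```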

xkr.us has a great discussion, with examples. To quote their summary:
The escape() method does not encode the + character which is
interpreted as a space on the server side as well as generated by
forms with spaces in their fields. Due to this shortcoming and the
fact that this function fails to handle non-ASCII characters
correctly, you should avoid use of escape() whenever possible. The
best alternative is usually encodeURIComponent().
escape() will not encode: @*/+
Use of the encodeURI() method is a bit more specialized than escape()
in that it encodes for URIs as opposed to the querystring, which is
part of a URL. Use this method when you need to encode a string to be
used for any resource that uses URIs and needs certain characters to
remain un-encoded. Note that this method does not encode the '
character, as it is a valid character within URIs.
encodeURI() will not encode: ~!@#$&*()=:/,;?+'
Lastly, the encodeURIComponent() method should be used in most cases
when encoding a single component of a URI. This method will encode
certain chars that would normally be recognized as special chars for
URIs so that many components may be included. Note that this method
does not encode the ' character, as it is a valid character within
URIs.
encodeURIComponent() will not encode: ~!*()'

Here is a summary.
escape() will not encode @ * _ + - . /
Do not use it.
encodeURI() will not encode A-Z a-z 0-9 ; , / ? : @ & = + $ - _ . ! ~ * ' ( ) #
Use it when your input is a complete URL like 'https://searchexample.com/search?q=wiki'
encodeURIComponent() will not encode A-Z a-z 0-9 - _ . ! ~ * ' ( )
Use it when your input is part of a complete URL
e.g.
const queryStr = encodeURIComponent(someString)
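As a slightly fuller sketch of the "part of a complete URL" case, the helper below builds a query string by encoding each key and value separately. The name `buildQuery` and the sample parameters are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: turns { key: value } pairs into "k1=v1&k2=v2",
// encoding every key and every value as an individual URI component.
function buildQuery(params) {
  return Object.entries(params)
    .map(([key, value]) => encodeURIComponent(key) + '=' + encodeURIComponent(String(value)))
    .join('&');
}

const searchUrl = 'https://searchexample.com/search?' + buildQuery({ q: 'cats & dogs', page: 2 });
console.log(searchUrl); // https://searchexample.com/search?q=cats%20%26%20dogs&page=2
```

The `=` and `&` that separate the pairs are added after encoding, so they stay meaningful while the same characters inside a value do not.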

encodeURI and encodeURIComponent are used for different purposes.
Some of the differences are:
encodeURI is used to encode a full URL whereas encodeURIComponent is used for encoding a URI component such as a query string.
There are 11 characters which are not encoded by encodeURI, but encoded by encodeURIComponent.
List:
Character   encodeURI   encodeURIComponent
#           #           %23
$           $           %24
&           &           %26
+           +           %2B
,           ,           %2C
/           /           %2F
:           :           %3A
;           ;           %3B
=           =           %3D
?           ?           %3F
@           @           %40
Notes:
encodeURIComponent does not encode -_.!~*'(). If you want these characters encoded as well, you have to replace them yourself with the corresponding percent-encoded sequences.
If you want to learn more about encodeURI and encodeURIComponent, please check the reference link.
Reference Link
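The list of 11 characters can be checked rather than memorized. The sketch below walks the printable ASCII range and keeps every character that encodeURI leaves alone but encodeURIComponent escapes; the function name `charsOnlyComponentEncodes` is made up for this example.

```javascript
// Collect printable ASCII characters that encodeURI passes through
// unchanged but encodeURIComponent percent-encodes.
function charsOnlyComponentEncodes() {
  const found = [];
  for (let code = 0x20; code < 0x7f; code++) {
    const ch = String.fromCharCode(code);
    if (encodeURI(ch) === ch && encodeURIComponent(ch) !== ch) {
      found.push(ch);
    }
  }
  return found;
}

const onlyComponent = charsOnlyComponentEncodes();
console.log(onlyComponent.join(' ')); // # $ & + , / : ; = ? @
console.log(onlyComponent.length);    // 11
```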

encodeURIComponent(): assumes that its argument is a portion (such as the protocol, hostname, path, or query string)
of a URI. Therefore it escapes the punctuation characters that are used to separate the portions of a URI.
encodeURI(): is used for encoding an existing URL.

Difference between encodeURI and encodeURIComponent:
encodeURIComponent(value) is mainly used to encode query string parameter values, and it encodes every applicable character in value. encodeURI leaves the characters that give a URL its structure unencoded, so the protocol prefix (http://) and the domain name pass through intact.
In very, very rare cases, when you want to implement manual encoding to encode additional characters (though they don't need to be encoded in typical cases) like: ! * , then
you might use:
function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[!*]/g, function(c) {
    return '%' + c.charCodeAt(0).toString(16);
  });
}
(source)

Other answers describe the purposes. Here are the characters each function will actually convert:
control = '\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F'
        + '\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F'
        + '\x7F'
encodeURI         (control + ' "%<>[\\]^`{|}'                             )
encodeURIComponent(control + ' "%<>[\\]^`{|}' + '#$&,:;=?' + '+/@'        )
escape            (control + ' "%<>[\\]^`{|}' + '#$&,:;=?' +         "!'()~")
All characters above are converted to percent-hexadecimal codes. Space to %20, percent to %25, etc. The characters below pass through unchanged.
Here are the characters the functions will NOT convert:
pass_thru = '*-._0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
encodeURI         (pass_thru + '#$&,:;=?' + '+/@' + "!'()~")
encodeURIComponent(pass_thru +                      "!'()~")
escape            (pass_thru +              '+/@'          )

As a general rule, use encodeURIComponent. Don't be scared off by the long name or assume it is more specialized in its use; to me it's the more commonly used method. Also, don't be suckered into using encodeURI just because you tested it and it appeared to encode properly. It's probably not what you meant to use: even though your simple test using "Fred" in a first name field worked, it will fail later with more advanced text, such as a value containing an ampersand or a hash sign. You can look at the other answers for the reasons why.
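The failure described here is easy to reproduce. In the sketch below the field name `firstName` and its value are invented, and the standard `URLSearchParams` class stands in for whatever the receiving side uses to parse the query string, which is an assumption about the server.

```javascript
const firstName = 'Tom & Jerry';

// encodeURI leaves "&" alone, so the value is split into two parameters.
const wrongQuery = 'firstName=' + encodeURI(firstName);
// encodeURIComponent escapes "&" as %26, so the value survives intact.
const rightQuery = 'firstName=' + encodeURIComponent(firstName);

// Parse both the way a typical receiver would.
const parsedWrong = new URLSearchParams(wrongQuery).get('firstName');
const parsedRight = new URLSearchParams(rightQuery).get('firstName');

console.log(wrongQuery, JSON.stringify(parsedWrong)); // firstName=Tom%20&%20Jerry "Tom "
console.log(rightQuery, JSON.stringify(parsedRight)); // firstName=Tom%20%26%20Jerry "Tom & Jerry"
```

A test value such as "Fred" contains no reserved characters, so both variants produce the same string and the bug stays hidden.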

Related

Escape content for javascript file download [duplicate]

I am using a URL to open an HTML page, and I am sending data in the query string along with the page URL.
For example: abc.html?firstParameter=firstvalue&secondParameter=secondvalue
The problem is that if firstvalue or secondvalue contains
a special character like #, (, ), %, or {, then my URL is not constructed correctly and does not validate.
I am doing all this in JavaScript.
Can anybody please help me out with this?
You have 3 options:
escape() will not encode: @*/+
encodeURI() will not encode: ~!@#$&*()=:/,;?+'
encodeURIComponent() will not encode: ~!*()'
But in your case, if you want to pass a url into a GET parameter of other page, you should use escape or encodeURIComponent, but not encodeURI.
To be safe and ensure that you've escaped all the reserved characters specified in both RFC 1738 and RFC 3986 you should use a combination of encodeURIComponent, escape and a replace for the asterisk('*') like this:
encoded = encodeURIComponent( parm ).replace(/[!'()]/g, escape).replace(/\*/g, "%2A");
[Explanation]
While RFC 1738: Uniform Resource Locators (URL) specifies that the *, !, ', ( and ) characters may be left unencoded in the URL,
Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.
RFC 3986, pages 12-13, states that these special characters are reserved as sub-delimiters.
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
The escape() function has been deprecated but can be used to URL encode the exclamation mark, single quote, left parenthesis and right parenthesis. And since there is some ambiguity on whether an asterisk must be encoded in a URL, and it doesn't hurt to encode it, you can explicitly encode it using something like the replace() function call. [Note that the escape() function is being passed as the second parameter to the first replace() function call. As used here, replace calls the escape() function once for each matched special character of !, ', ( or ), and escape merely returns the 'escape sequence' for that character back to replace, which reassembles any escaped characters with the other fragments.]
Also see 'https://stackoverflow.com/questions/6533561/urlencode-the-asterisk-star-character'
Also, while some websites have even identified the asterisk (*) as being a reserved character under RFC 3986, they don't include it in their URL component encoding tool.
Unencoded URL parms:
parm1=this is a test of encoding !@#$%^&*()'
parm2=note that * is not encoded
Encoded URL parms:
parm1=this+is+a+test+of+encoding+%21%40%23%24%25%5E%26*%28%29%27
parm2=note+that+*+is+not+encoded
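Since escape() is deprecated, the same result can be sketched without it by replacing the five leftover sub-delimiters in one pass. The name `encodeRFC3986Component` is hypothetical, and this is one possible approach rather than the method used in the answer above.

```javascript
// Hypothetical helper: encodeURIComponent plus the five characters it
// leaves alone that RFC 3986 lists as sub-delimiters: ! ' ( ) *
function encodeRFC3986Component(str) {
  return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
    // Two-digit uppercase hex, e.g. "*" becomes "%2A"
    return '%' + c.charCodeAt(0).toString(16).toUpperCase();
  });
}

console.log(encodeRFC3986Component("this is a test of encoding !@#$%^&*()'"));
// this%20is%20a%20test%20of%20encoding%20%21%40%23%24%25%5E%26%2A%28%29%27
```

Note that spaces come out as %20 here, not as the + used in the form-style listing above.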

JavaScript encodeURIComponent With Backslash

w3schools says the following about encodeURIComponent function:
This function encodes special characters. In addition,
it encodes the following characters: , / ? : @ & = + $ #.
Does that mean that it cannot encode a backslash (\)?
This function encodes special characters. In addition, it encodes the following characters: , / ? : @ & = + $ # .
This definition is vague as to what "special characters" are. It sounds like a comparison between encodeURI and encodeURIComponent. Both will correctly escape \ as %5C, so you don't have to worry about backslashes.
encodeURI will leave the listed characters alone, as it assumes that an entire URI is being encoded:
encodeURI('http://example.com/foo bar/baz.html');
//produces "http://example.com/foo%20bar/baz.html"
encodeURIComponent will escape everything as it is assumed that the string is to be used as part of a query-string:
'http://example.com?foo=' + encodeURIComponent('http://example.com/fizz/buzz.html');
//produces "http://example.com?foo=http%3A%2F%2Fexample.com%2Ffizz%2Fbuzz.html"

What is the difference between decodeURIComponent and decodeURI?

What is the difference between the JavaScript functions decodeURIComponent and decodeURI?
To explain the difference between these two let me explain the difference between encodeURI and encodeURIComponent.
The main difference is that:
The encodeURI function is intended for use on the full URI.
The encodeURIComponent function is intended to be used on .. well .. URI components that is
any part that lies between separators (; / ? : @ & = + $ , #).
So, in encodeURIComponent these separators are encoded also because they are regarded as text and not special characters.
Now back to the difference between the decode functions, each function decodes strings generated by its corresponding encode counterpart taking care of the semantics of the special characters and their handling.
encodeURIComponent/decodeURIComponent() is almost always the pair you want to use, for concatenating together and splitting apart text strings in URI parts.
encodeURI is less common, and misleadingly named: it should really be called fixBrokenURI. It takes something that's nearly a URI, but has invalid characters such as spaces in it, and turns it into a real URI. It has a valid use in fixing up invalid URIs from user input, and it can also be used to turn an IRI (URI with bare Unicode characters in) into a plain URI (using %-escaped UTF-8 to encode the non-ASCII).
Where encodeURI should really be named fixBrokenURI(), decodeURI() could equally be called potentiallyBreakMyPreviouslyWorkingURI(). I can think of no valid use for it anywhere; avoid.
js> s = "http://www.example.com/string with + and ? and & and spaces";
http://www.example.com/string with + and ? and & and spaces
js> encodeURI(s)
http://www.example.com/string%20with%20+%20and%20?%20and%20&%20and%20spaces
js> encodeURIComponent(s)
http%3A%2F%2Fwww.example.com%2Fstring%20with%20%2B%20and%20%3F%20and%20%26%20and%20spaces
Looks like encodeURI produces a "safe" URI by encoding spaces and some other (e.g. nonprintable) characters, whereas encodeURIComponent additionally encodes the colon and slash and plus characters, and is meant to be used in query strings. The encoding of + and ? and & is of particular importance here, as these are special chars in query strings.
As I had the same question, but didn't find the answer here, I made some tests in order to figure out what the difference actually is.
I did this, since I need the encoding for something, which is not URL/URI related.
encodeURIComponent("A") returns "A", it does not encode "A" to "%41"
decodeURIComponent("%41") returns "A".
encodeURI("A") returns "A", it does not encode "A" to "%41"
decodeURI("%41") returns "A".
That means both can decode alphanumeric characters, even though they did not encode them. However...
encodeURIComponent("&") returns "%26".
decodeURIComponent("%26") returns "&".
encodeURI("&") returns "&".
decodeURI("%26") returns "%26".
Even though encodeURIComponent does not encode all characters, decodeURIComponent can decode any value between %00 and %7F.
Note: If you try to decode a value above %7F that is not part of a valid UTF-8 sequence, the call fails with a URIError.
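These observations can be collected into a short sketch. The wrapper `tryDecodeComponent` is a made-up name; it only shows one way to guard against the URIError mentioned in the note.

```javascript
// Hypothetical wrapper: returns null instead of throwing when the input
// is not a valid percent-encoded UTF-8 sequence.
function tryDecodeComponent(text) {
  try {
    return decodeURIComponent(text);
  } catch (err) {
    if (err.name === 'URIError') return null;
    throw err; // anything else is unexpected, so let it propagate
  }
}

console.log(decodeURI('%26'));             // %26 (reserved characters stay encoded)
console.log(decodeURIComponent('%26'));    // &
console.log(tryDecodeComponent('%C3%A5')); // å (a valid two-byte UTF-8 sequence)
console.log(tryDecodeComponent('%80'));    // null (a lone byte above %7F is malformed)
```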
encodeURIComponent()
Converts the input into a URL-encoded string.
encodeURI()
URL-encodes the input, but assumes a full URL is given, so returns a valid URL by not encoding the protocol (e.g. http://) and host name (e.g. www.stackoverflow.com).
decodeURIComponent() and decodeURI() are the opposite of the above
decodeURIComponent will decode URI special markers such as &, ?, #, etc, decodeURI will not.
encodeURIComponent
Not Escaped:
A-Z a-z 0-9 - _ . ! ~ * ' ( )
encodeURI()
Not Escaped:
A-Z a-z 0-9 ; , / ? : @ & = + $ - _ . ! ~ * ' ( ) #
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI
Encode URI:
The encodeURI() method does not encode:
, / ? : @ & = + $ * #
Example
URI: https://my test.asp?name=ståle&car=saab
Encoded URI: https://my%20test.asp?name=st%C3%A5le&car=saab
Encode URI Component:
The encodeURIComponent() method also encodes:
, / ? : @ & = + $ #
Example
URI: https://my test.asp?name=ståle&car=saab
Encoded URI: https%3A%2F%2Fmy%20test.asp%3Fname%3Dst%C3%A5le%26car%3Dsaab
For more: W3Schools.com
