Character in the url is changed when using window.location.search - javascript

I have a URL like: file:///C:/Users/index.html?Scale:%20Service-Qualität
When I use window.location.search to get the parameter from the URL, I expect Scale: Service-Qualität, but what I actually receive is Scale:%20Service-Qualit%C3%A4t. I don't know why my character ä changed to %C3%A4. When I tested it in the console, it displayed as Scale: Service-Qualität
Can anyone help me to fix this problem?

I found the solution to my problem. I need to decode the URL again using decodeURIComponent(url); then I get the exact URL string back.
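A minimal sketch of that decoding step, assuming the query string from the question. The value is hard-coded here (instead of being read from window.location.search) so the snippet also runs outside a browser; the variable names are only illustrative.

```javascript
// The query string as window.location.search reports it.
var rawSearch = '?Scale:%20Service-Qualit%C3%A4t';

// Drop the leading "?" and decode the percent-encoded UTF-8 bytes.
var decodedSearch = decodeURIComponent(rawSearch.slice(1));

console.log(decodedSearch); // Scale: Service-Qualität
```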

You are seeing two issues here.
The ä being converted into %C3%A4 is called URL or percent encoding.
It's because URLs can't, technically, contain non-ASCII characters.
Browsers and servers work around this by converting non-ASCII characters in URLs to their percent encoded equivalents.
It's generally nothing to worry about.
In your case however, there seems to be an actual problem as well. The weird output in the console could be because your web page uses a single-byte encoding (like ISO-8859-1) instead of UTF-8.
Switching the web page to UTF-8 might solve the problem, using this Meta tag:
<meta charset="utf-8"/>
and, of course, saving the HTML file as UTF-8 in your editor.

Related

FileSaver.js can't specify the charset

I'm using FileSaver.js to save an RTF file. This is working fine.
However when I'm using special chars, it goes wrong... The charset automatically "changes" from ansi to utf-8 and characters aren't displayed right in the rtf document.
I've tried "forcing" ansi, but it seems like all browsers ignore this setting?
Here is a part of my script:
var blob = new Blob([rtf], {type: "application/rtf;charset=windows-1252"});
saveAs(blob, filename);
This can be fixed by converting my special characters to Unicode or hex escapes. However, using the correct charset seems simpler. Why isn't my script working and how should I fix this?
Thanks!
I've come across the same problem, and it seems there's no possibility of stating the charset in the constructor, other than to manually encode your data into the encoding you need.
See this issue https://github.com/eligrey/FileSaver.js/issues/14
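As a rough sketch of what "manually encoding" could look like: the hypothetical helper below turns a string into one byte per character. It assumes every character falls in the ISO-8859-1 range (U+0000 to U+00FF), which matches windows-1252 except for bytes 0x80 to 0x9F (so characters such as € would need an extra mapping table). Passing the resulting bytes, rather than the string, to the Blob constructor keeps the browser from re-encoding the text as UTF-8.

```javascript
// Hypothetical helper (not part of FileSaver.js): encode a string as
// one byte per character, assuming ISO-8859-1 compatible input.
function encodeSingleByte(str) {
  var bytes = new Uint8Array(str.length);
  for (var i = 0; i < str.length; i++) {
    var code = str.charCodeAt(i);
    // Anything outside the single-byte range is replaced with "?".
    bytes[i] = code <= 0xFF ? code : 0x3F;
  }
  return bytes;
}

var rtfBytes = encodeSingleByte('{\\rtf1\\ansi Qualit\u00e4t}');

// In the browser, hand the bytes to the Blob instead of the string:
// saveAs(new Blob([rtfBytes], {type: 'application/rtf'}), filename);
```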

Escape HTML tags. Any issue possible with charset encoding?

I have a function to escape HTML tags, to be able to insert text into HTML.
Very similar to:
Can I escape html special chars in javascript?
I know that JavaScript uses Unicode internally, but HTML pages may be encoded in different charsets like UTF-8 or ISO-8859-1, etc.
My question is: Is there any issue with this very simple conversion? Or should I take the page charset into consideration?
If yes, how do I handle that?
PS: For example, the equivalent PHP function (http://php.net/manual/en/function.htmlspecialchars.php) has a parameter to select a charset.
No, JavaScript lives in the Unicode world so encoding issues are generally invisible to it. escapeHtml in the linked question is fine.
The only place I can think of where JavaScript gets to see bytes would be data: URLs (typically hidden beneath base64). So this:
var markup = '<p>Hello, '+escapeHtml(user_supplied_data);
var url = 'data:text/html;base64,'+btoa(markup);
iframe.src = url;
is in principle a bad thing. Although I don't know of any browsers that will guess UTF-7 in this situation, a charset=... parameter should be supplied to ensure that the browser uses the appropriate encoding for the data. (btoa uses ISO-8859-1, for what it's worth.)
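A small sketch of a data: URL with an explicit charset. The escapeHtml function here is a stand-in written for illustration, not the exact function from the linked question, and the example swaps base64 for percent-encoding: encodeURIComponent always emits UTF-8 bytes, so charset=utf-8 can be stated with confidence. The sample input is made up.

```javascript
// Illustrative stand-in for the escapeHtml function from the linked question.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

var userSuppliedData = '\u00dc <script>';
var markup = '<p>Hello, ' + escapeHtml(userSuppliedData);

// encodeURIComponent percent-encodes the UTF-8 bytes of the markup,
// so the charset parameter matches the bytes actually in the URL.
var url = 'data:text/html;charset=utf-8,' + encodeURIComponent(markup);

// In the browser: iframe.src = url;
```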

HTML Encode String

I am trying to HTML-Encode a string with jQuery, but I can't seem to find the right encoding format.
What I got is a String like Ütest.docx. The server doesn't handle special characters very well so that I get a FileNotFoundException from Java (I have no way of editing the server itself).
I experimented and found out that the URL works when I replace Ü with %DC. I thought this was called HTML encoding and googled a bit, but I always get results about URL encoding. I checked that, and it doesn't seem to be the right encoding, because Ü is being encoded to %C3%9C, which doesn't work for the server.
Now, which encoding is it that would encode Ü to %DC? And is there a function in JavaScript or jQuery that would do the encoding for me?
Thanks for any help, I've been trying to find out which encoding I need for an hour now, but no luck.
They are both URL encoding, just that the UTF-8 one is a newer standard.
If you are using Tomcat, you can use just encodeURIComponent() which uses UTF-8
and works when you set the Tomcat connector URIEncoding attribute: <Connector URIEncoding="UTF-8" ...>
If that's not ok, you can use this:
function uriEncodeLegacy( str ) {
    return escape(str.replace( /[\u0100-\uFFFF]/g, ""));
}
uriEncodeLegacy("Ü"); //%DC
However UTF-8 is recommended, otherwise you cannot even support the € character for example.
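To make the difference between the two encodings concrete, here is a sketch that does not rely on the deprecated escape() function. percentEncodeLatin1 is a hypothetical helper written for this illustration; it treats each character as a single ISO-8859-1 byte and, like the function above, silently drops anything beyond U+00FF.

```javascript
// Hypothetical helper: percent-encode a string as ISO-8859-1 bytes.
function percentEncodeLatin1(str) {
  var out = '';
  for (var i = 0; i < str.length; i++) {
    var code = str.charCodeAt(i);
    if (code > 0xFF) continue; // not representable in one byte, dropped
    var ch = str.charAt(i);
    if (/[A-Za-z0-9\-_.~]/.test(ch)) {
      out += ch; // unreserved characters pass through unchanged
    } else {
      out += '%' + (code < 16 ? '0' : '') + code.toString(16).toUpperCase();
    }
  }
  return out;
}

console.log(encodeURIComponent('\u00dctest.docx'));  // %C3%9Ctest.docx (UTF-8)
console.log(percentEncodeLatin1('\u00dctest.docx')); // %DCtest.docx (ISO-8859-1)
console.log(percentEncodeLatin1('\u20ac'));          // empty string: € is lost
```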

How to encode periods for URLs in Javascript?

The SO post below is comprehensive, but all three methods described fail to encode for periods.
Post: Encode URL in JavaScript?
For instance, if I run the three methods (i.e., escape, encodeURI, encodeURIComponent), none of them encode periods.
So "food.store" comes out as "food.store," which breaks the URL. It breaks the URL because the Rails app cannot recognize the URL as valid and displays the 404 error page. Perhaps it's a configuration mistake in the Rails routes file?
What's the best way to encode periods with Javascript for URLs?
I know this is an old thread, but I didn't see anywhere here any examples of URLs that were causing the original problem. I encountered a similar problem myself a couple of days ago with a Java application. In my case, the string with the period was at the end of the path element of the URL eg.
http://myserver.com/app/servlet/test.string
In this case, the Spring library I'm using was only passing me the 'test' part of that string to the relevant annotated method parameter of my controller class, presumably because it was treating the '.string' as a file extension and stripping it away. Perhaps this is the same underlying issue with the original problem above?
Anyway, I was able to workaround this simply by adding a trailing slash to the URL. Just throwing this out there in case it is useful to anybody else.
John
Periods shouldn't break the url, but I don't know how you are using the period, so I can't really say. None of the functions I know of encode the '.' for a url, meaning you will have to use your own function to encode the '.' .
You could base64 encode the data (browsers provide btoa() for this, though it only accepts characters up to U+00FF). You could also replace all periods with their percent-encoded form (%2E) on both the client and server side.
Basically, it's not generally necessary to encode '.', so if you need to do it, you'll need to come up with your own solution. You may want to also do further testing to be sure the '.' will actually break the url.
hth
I had this same problem where my .htaccess was breaking input values with .
Since I did not want to change what the .htaccess was doing I used this to fix it:
var val="foo.bar";
var safevalue=encodeURIComponent(val).replace(/\./g, '%2E');
This does all the standard encoding, then replaces each . with its percent-encoded form, %2E. PHP automatically converts it back to . in the $_REQUEST value, but the .htaccess doesn't see it as a period, so things are all good.
Periods do not have to be encoded in URLs. Here is the RFC to look at.
If a period is "breaking" something, it may be that your server is making its own interpretation of the URL, which is a fine thing to do of course but it means that you have to come up with some encoding scheme of your own when your own metacharacters need escaping.
I had the same question and maybe my solution can help someone else in the future.
In my case the url was generated using javascript. Periods are used to separate values in the url (sling selectors), so the selectors themselves weren't allowed to have periods.
My solution was to replace all periods with the HTML entity, as in Figure 1:
Figure 1: Solution
var urlPart = 'foo.bar';
var safeUrlPart = encodeURIComponent(urlPart.replace(/\./g, '&#46;'));
console.log(safeUrlPart); // foo%26%2346%3Bbar
console.log(decodeURIComponent(safeUrlPart)); // foo&#46;bar
I had problems with periods in REST API URLs. They are interpreted as file extensions, which in its own way makes sense. Escaping doesn't help because they are unescaped before the call (as already noted). Adding a trailing / didn't help either. I got around this by passing the value as a named argument instead, e.g. changing api/Id/Text.string to api/Id?arg=Text.string. You'll need to modify the routing on the controller, but the handler itself can stay the same.
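On the client side, moving the value into the query string could look like the sketch below. The host name is a placeholder; only the api/Id path and the arg parameter name come from the answer above.

```javascript
// Placeholder host; the path and parameter name follow the answer above.
var endpoint = new URL('http://example.com/api/Id');

// searchParams takes care of encoding; the period is left as-is, which is
// fine because it no longer sits in the path where it looked like an extension.
endpoint.searchParams.set('arg', 'Text.string');

console.log(endpoint.toString()); // http://example.com/api/Id?arg=Text.string
```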
If it's possible, using a .htaccess file would make this really easy. Just add a \ before the period. Something like: \.
It is a rails problem, see Rails REST routing: dots in the resource item ID for an explanation (and Rails routing guide, Sec. 3.2)
You shouldn't be using encodeURI() or encodeURIComponent() anyway.
console.log(encodeURIComponent('%^&*'));
Input: %^&*. Output: %25%5E%26*. So, to be clear, this doesn't convert *. Hopefully you know this before you run rm * after "cleansing" that input server-side!
Luckily, MDN gave us a work-around to fix this glaring problem, fixedEncodeURI() and fixedEncodeURIComponent(), which is based on this regex: [!'()*]. (Source: MDN Web Docs: encodeURIComponent().) Just rewrite it to add in a period and you'll be fine:
function fixedEncodeURIComponent(str) {
    return encodeURIComponent(str).replace(/[.!'()*]/g, function(c) {
        return '%' + c.charCodeAt(0).toString(16);
    });
}
console.log(fixedEncodeURIComponent('hello.'));

Javascript Special Characters coming back incorrectly

There is a page where I have certain special characters on and when retrieving values of these via javascript I am getting an odd conversion. The character 'Œ' is coming back as 'R' and its lower case version 'œ' is coming back as 'S'. Is this a limitation of javascript or could it possibly be the browser. This is from testing in firefox. Also this is being retrieved via a repl client (Jssh/MozRepl) so it seems that it could be an issue with these clients themselves rather than the browser.
You likely have an encoding problem somewhere. There are many opportunities to mis-handle the encoding of text. If you post some code, we might be able to help you find it.
Output streams aren't scriptably safe for non-ASCII characters, so you will need to wrap the stream in an nsIBinaryOutputStream, an nsIUnicharOutputStream or an nsIConverterOutputStream.
