I have a website homepage encoded in ISO-8859-1.
In that page I include different CSS and JavaScript files that are encoded in UTF-8.
Is there a way to show the characters from the JS files correctly on the page without changing all the encodings?
It should not be an issue. You've probably failed to identify the encoding of some of the files. To be on the safe side:
Configure your web server to add a correct Content-Type HTTP header with a charset attribute, e.g.:
Content-Type: application/javascript; charset=utf-8
When the language supports it, identify the encoding from the document itself, e.g.:
HTML 4: <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
HTML 5: <meta charset="iso-8859-1">
CSS: @charset "UTF-8";
Declare the charset when linking the resource, e.g.:
<script type="text/javascript" src="foo.js" charset="utf-8"></script>
(This is actually deprecated.)
In practice, you can probably omit some of these steps. I'd say the first one (the Content-Type HTTP header) is the most important.
If you mix encodings, you will run into difficulties in the future, especially if your pages contain different locales. So always use UTF-8.
You can also convert the page from ISO-8859-1 to UTF-8 without changing its content, because Unicode contains every character of ISO-8859-1. Note that the file itself has to be re-saved as UTF-8; changing only the declared charset is not enough.
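To illustrate what the conversion involves, here is a minimal sketch for Node.js (an assumption; the thread does not say what tooling is available). The helper name latin1ToUtf8 is made up for this example. It shows that an accented character stored as one byte in ISO-8859-1 becomes two bytes in UTF-8, which is why the file has to be re-encoded rather than just relabeled.

```javascript
// Re-encode ISO-8859-1 bytes as UTF-8 (Node.js; Buffer is a global there).
// latin1ToUtf8 is a hypothetical helper name, not part of any library.
function latin1ToUtf8(latin1Bytes) {
  // 'latin1' maps each byte 0x00-0xFF to the code point with the same value.
  const text = Buffer.from(latin1Bytes).toString("latin1");
  // Encoding that string as UTF-8 produces the converted bytes.
  return Buffer.from(text, "utf8");
}

const original = Buffer.from([0x63, 0x61, 0x66, 0xe9]); // "café" in ISO-8859-1
const converted = latin1ToUtf8(original);
console.log(converted); // the single byte 0xE9 becomes the pair 0xC3 0xA9
```

Pure ASCII content comes out byte-for-byte identical, so only files that contain accented or other non-ASCII characters actually change.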
Related
I have links in an html file like
href="%87%d9%84-%d9%8a%d9%86%d9%81%d8%b9-%d8%a7%d8%ae%d9%84%d9%89-%d8%a8%d8%b1%d9%86%d8%a7%d9%85%d8%ac-%d8%a7%d9%84%d9%85%d9%8a%d8%aa%d8%a7%d8%aa%d8%b1%d9%8a%d8%af-%d9%8a%d9%86%d8%a8%d9%87%d9%86%d9%89/index.html"
And when the user clicks on this link in the browser, I want the link to be
RealUtf8Text/index.html
Is there any way to do this using an .htaccess file?
If not, how can we do it using a JavaScript file?
I don't want to make changes in the files themselves; I just want to add an .htaccess or JavaScript file that solves the problem.
The problem appears to be that your URL does not contain valid UTF-8 data. There is no UTF-8 sequence that begins with the octet 87.
I'm guessing that your URL is missing a d9 or d8 octet. This URL:
http://localhost/%d9%87%d9%84-%d9%8a%d9%86%d9%81%d8%b9-%d8%a7%d8%ae%d9%84%d9%89-%d8%a8%d8%b1%d9%86%d8%a7%d9%85%d8%ac-%d8%a7%d9%84%d9%85%d9%8a%d8%aa%d8%a7%d8%aa%d8%b1%d9%8a%d8%af-%d9%8a%d9%86%d8%a8%d9%87%d9%86%d9%89/index.html
is shown as Arabic characters in my browser.
How the URL is displayed will of course depend on the browser's support for Arabic characters, and is not something that can be affected by JavaScript or .htaccess.
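One way to check whether a percent-encoded string is valid UTF-8 is to try decoding it. The sketch below uses the standard decodeURIComponent function, which throws a URIError on malformed sequences. The helper name tryDecode and the shortened sample strings (just the first few octets of the URLs above) are only for illustration.

```javascript
// Return the decoded text, or null if the percent-encoded bytes are not valid UTF-8.
// tryDecode is a hypothetical helper name used only for this example.
function tryDecode(encoded) {
  try {
    return decodeURIComponent(encoded);
  } catch (err) {
    if (err instanceof URIError) {
      return null; // malformed sequence, e.g. a stray continuation byte
    }
    throw err;
  }
}

const broken = "%87%d9%84";      // starts with the continuation byte 0x87
const repaired = "%d9%87%d9%84"; // same bytes with the missing 0xD9 restored

console.log(tryDecode(broken));   // null
console.log(tryDecode(repaired)); // two Arabic letters, U+0647 U+0644
```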
You can use urldecode:
urldecode("%87%d9%84-%d9%8a%d9%86%d9%81%d8%b9-%d8%a7%d8%ae%d9%84%d9%89-%d8%a8%d8%b1%d9%86%d8%a7%d9%85%d8%ac-%d8%a7%d9%84%d9%85%d9%8a%d8%aa%d8%a7%d8%aa%d8%b1%d9%8a%d8%af-%d9%8a%d9%86%d8%a8%d9%87%d9%86%d9%89/index.html")
See the documentation at
http://php.net/manual/en/function.urldecode.php.
Also make sure you have this in your HTML:
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
I'm still new to webdev and dealing with character set encodings. I've read http://kunststube.net/encoding/ along with a few other pieces on the subject.
My problem is that I've got a bunch of text that I'm pulling from a server. It is encoded and served as utf-8.
However, when I display the strings, the french / spanish accents are garbled up. I've googled around and it seems JavaScript engines use UCS-2 or UTF-16 internally. Is there something I have to do to get it to treat my text as UTF-8? I have the <meta charset="utf-8"> in my html, but it doesn't seem to do anything.
Any ideas?
Without any links, I can't inspect what you are doing directly, but you shouldn't need to do anything special inside JavaScript to get it to work, just make sure all your sources are set to UTF-8 correctly, and that the browser is interpreting them as such.
You may need to make sure your server (Apache? IIS?) is setting the appropriate encoding header. For example in PHP:
header('Content-Type: text/plain; charset=utf-8');
header('Content-Type: text/html; charset=utf-8');
Or in .htaccess there are many ways to do it. A couple of ways:
AddCharset UTF-8 .html
or specific files:
<Files "example.js">
AddCharset UTF-8 .js
</Files>
refs:
http://us2.php.net/manual/fr/function.header.php
https://www.w3.org/International/questions/qa-htaccess-charset.en
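If the files are served by your own JavaScript server code rather than Apache or PHP, the same idea applies: pick the MIME type and append the charset parameter. This is only a sketch; contentTypeFor and the small extension table are hypothetical, and a real server would need a fuller list.

```javascript
// Hypothetical lookup table; extend it for the file types you actually serve.
const MIME_TYPES = new Map([
  [".html", "text/html"],
  [".css", "text/css"],
  [".js", "application/javascript"],
]);

// Build a Content-Type header value with an explicit charset for text resources.
function contentTypeFor(fileName, charset = "utf-8") {
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  if (MIME_TYPES.has(ext)) {
    return `${MIME_TYPES.get(ext)}; charset=${charset}`;
  }
  // Unknown types are treated as binary, so no charset is attached.
  return "application/octet-stream";
}

console.log(contentTypeFor("example.js")); // application/javascript; charset=utf-8
```

In a Node.js HTTP handler the result could be passed to response.setHeader("Content-Type", ...).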
If you don't have a meta tag in your HTML, put this one in the head:
<meta charset="UTF-8">
Otherwise, you have to declare the character encoding in your script file.
I'm a bit stuck, given my page includes an external JavaScript which uses document.write. The problem is my page is UTF-8 encoded, and the contents written are encoded in latin-1, which causes some display problems.
Is there any way to handle this ?
I have to admit never having had to mix encodings, but in theory you should be able to specify the charset attribute on the script tag. Just be sure you're not conflicting with it when serving the external file. From the specification:
The charset attribute gives the character encoding of the external script resource...its value must be a valid character encoding name, must be an ASCII case-insensitive match for the preferred MIME name for that encoding, and must match the encoding given in the charset parameter of the Content-Type metadata of the external file, if any.
So that will tell the browser how to interpret the script data, provided your server provides the same charset (or doesn't supply any charset) in the Content-Type header when serving up the script file.
Once the browser is reading the script with the right charset, you should be okay, because by the time JavaScript is dealing with strings, they're UTF-16 (according to Section 8.4 of the 5th edition spec).
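A quick way to see the UTF-16 representation is to list the code units of a string once it is inside JavaScript. The helper name codeUnits is made up for this sketch.

```javascript
// List the UTF-16 code units of a string as hex. codeUnits is a hypothetical helper.
function codeUnits(str) {
  const units = [];
  for (let i = 0; i < str.length; i++) {
    units.push(str.charCodeAt(i).toString(16));
  }
  return units;
}

console.log(codeUnits("\u00e9"));    // [ 'e9' ] (one code unit for "é")
console.log(codeUnits("\u{1F600}")); // [ 'd83d', 'de00' ] (a surrogate pair)
```

Whatever encoding the script file was served in, these values are the same, as long as the browser decoded the file with the right charset.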
I have an HTML document stored in a file, with a UTF-8 encoding, and I want my extension to display this file in the browser, so I call loadURIWithFlags('file://' + file.path, flags, null, 'UTF-8', null); but it loads it as ISO-8859-1 instead of UTF-8. (I can tell because ISO-8859-1 is selected on the View>Character Encoding menu, and because non-breaking-space characters are showing up as an Â followed by a space. If I switch to UTF-8 using the Character Encoding menu, then everything looks right.)
I tried including LOAD_FLAGS_BYPASS_CACHE and LOAD_FLAGS_CHARSET_CHANGE in the flags but that didn't seem to have any effect. I also checked that auto-detect was turned off, so that wasn't the problem either. Adding <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> to the document seems to have solved the problem, but I would expect that using the 'charset' argument of loadURIWithFlags should work just as well, so I'm wondering if I did something wrong in my initial attempt.
You did the right thing. The only reliable solution is to include the encoding information inside the document: if you rely only on HTTP headers, the document will not load correctly once it is saved to disk, because files have no headers.
If you are the one saving the file, you could add the UTF-8 BOM to it to ensure that it is loaded properly by Firefox or other applications.
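If the file is written from JavaScript, adding the BOM amounts to prepending the three bytes EF BB BF. The following is a minimal sketch for Node.js (an assumption; extension code would use its own file APIs), with withUtf8Bom as a made-up helper name.

```javascript
// The UTF-8 byte order mark.
const UTF8_BOM = Buffer.from([0xef, 0xbb, 0xbf]);

// Return a copy of the bytes that starts with the UTF-8 BOM, adding it only if missing.
// withUtf8Bom is a hypothetical helper name used for this example.
function withUtf8Bom(bytes) {
  const buf = Buffer.from(bytes);
  if (buf.subarray(0, 3).equals(UTF8_BOM)) {
    return buf; // already marked, leave as is
  }
  return Buffer.concat([UTF8_BOM, buf]);
}

const html = Buffer.from("<p>caf\u00e9</p>", "utf8");
const marked = withUtf8Bom(html);
console.log(marked.subarray(0, 3)); // <Buffer ef bb bf>
```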
I've got a multilingual site, that allows users to input text to search a form field, but the text goes through Javascript before heading off to the backend.
Special chars like "欢" are being properly handled in Firefox, but not in any version of IE.
Can someone help me understand what's going on?
Thanks!
You might find it useful to add the accept-charset attribute to your form. This specifies to the browser what character-set the server accepts. Your JS should follow this and send it in that format.
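If the JavaScript builds the request itself, one option is to pass the search text through encodeURIComponent, which always percent-encodes using UTF-8 regardless of the page's encoding. In this sketch the /search path and the q parameter are invented for illustration.

```javascript
// Build a search URL with the term percent-encoded as UTF-8.
// buildSearchUrl, "/search" and "q" are hypothetical names for this example.
function buildSearchUrl(baseUrl, term) {
  return `${baseUrl}?q=${encodeURIComponent(term)}`;
}

console.log(buildSearchUrl("/search", "\u6b22")); // /search?q=%E6%AC%A2
```

The backend then has to decode the query string as UTF-8 as well.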
Some other things that can affect the way IE handles character encoding:
Specifying the correct doctype (i.e., standards vs. "compliance" modes).
The Content-Type header sent by the server; I believe most browsers adhere to the header over the meta-tag, so if your server is specifying ISO-8859-1 and your page specifies UTF-8 there will be some confusion.
The format of the Content-Type header; some "modern" browsers (specifically FF) accept utf8 as an alias of utf-8. IE does not, and falls-back to ISO-8859-1. (This comes from painful personal experience! ;)
Character-sets are a real pain. You need to ensure that all the components are talking the same "language" front-to-back - that includes both storage and communication.
The next step to track down what is going on is to have your server code log the headers for your JS request to be sure that the encoding matches what you're expecting.
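When looking through logged headers, a small check like the one below can flag charset labels such as utf8 that some browsers reject. It is only a sketch: charsetOf and hasStandardUtf8Label are hypothetical helpers, and the parsing is deliberately simple.

```javascript
// Extract the charset parameter from a Content-Type header value, lowercased,
// or null if there is none. charsetOf is a hypothetical helper name.
function charsetOf(contentType) {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || "");
  return match ? match[1].toLowerCase() : null;
}

// True only for the label "utf-8" (with the hyphen), compared case-insensitively.
function hasStandardUtf8Label(contentType) {
  return charsetOf(contentType) === "utf-8";
}

console.log(hasStandardUtf8Label("text/html; charset=UTF-8")); // true
console.log(hasStandardUtf8Label("text/html; charset=utf8"));  // false
```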
Some browsers use different defaults, so set the encoding for your site explicitly with the meta tag, like this:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
UTF-8 may not be what you're looking for, but it is the most likely. To test, go to View > Character Encoding in Firefox and set it manually. Once you know which encoding works, you will know which one to declare. A list of encodings is here:
http://tlt.its.psu.edu/suggestions/international/web/tips/declare.html
and more here: http://tlt.its.psu.edu/suggestions/international/bylanguage/index.html
Ensure your characters and your page are both using UTF-8 encoding.