How to get the actual currency symbol from the internationalization standard - javascript

Well one can format a value like this:
const formatter = new Intl.NumberFormat('de-DE', { style: 'currency', currency: 'EUR' });
console.log(formatter.format(200));
This automatically selects the "correct" currency symbol. What I would like is to get the symbol itself (given the locale). formatToParts doesn't quite do that on its own, since the "parts" come in a different order depending on the locale: in German the symbol is at the end, while in English it comes before the amount.
I would like the symbol to be displayed in front of other fields (i.e. inputs).
The most important thing to me is not even that the correct symbol is used (though it should be), but that it is always the same symbol as the one used for displaying formatted values.
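One possible approach, as a sketch: each entry returned by formatToParts carries a type, so the symbol can be picked out by type rather than by position. The helper name getCurrencySymbol is hypothetical, and falling back to the currency code is an assumption for the case where no currency part is present.

```javascript
// Hypothetical helper: returns the currency symbol that Intl.NumberFormat
// itself uses for the given locale and currency, regardless of its position.
function getCurrencySymbol(locale, currency) {
  const formatter = new Intl.NumberFormat(locale, { style: 'currency', currency });
  // Look the part up by type, not by index, because the order is locale dependent.
  const part = formatter.formatToParts(0).find((p) => p.type === 'currency');
  return part ? part.value : currency; // assumption: fall back to the ISO code
}

console.log(getCurrencySymbol('de-DE', 'EUR')); // "€"
console.log(getCurrencySymbol('en-US', 'USD')); // "$"
```

Because the symbol comes from the same formatter options that produce the displayed value, it should stay consistent with what format() prints.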

Related

How can I convert this UTF-8 string to plain text in javascript and how can a normal user write it in a textarea [duplicate]

While reviewing JavaScript concepts, I found String.normalize(). This is not something that shows up in W3School's "JavaScript String Reference", and, hence, it is the reason I might have missed it before.
I found more information about it in HackerRank which states:
Returns a string containing the Unicode Normalization Form of the
calling string's value.
With the example:
var s = "HackerRank";
console.log(s.normalize());
console.log(s.normalize("NFKC"));
having as output:
HackerRank
HackerRank
Also, in GeeksForGeeks:
The string.normalize() is an inbuilt function in javascript which is
used to return a Unicode normalisation form of a given input string.
with the example:
<script>
// Taking a string as input.
var a = "GeeksForGeeks";
// calling normalize function.
b = a.normalize('NFC')
c = a.normalize('NFD')
d = a.normalize('NFKC')
e = a.normalize('NFKD')
// Printing normalised form.
document.write(b +"<br>");
document.write(c +"<br>");
document.write(d +"<br>");
document.write(e);
</script>
having as output:
GeeksForGeeks
GeeksForGeeks
GeeksForGeeks
GeeksForGeeks
Maybe the examples given are just really bad as they don't allow me to see any change.
I wonder... what's the point of this method?
It depends on what you will do with the strings: often you do not need it (if you are just getting input from the user and showing it back to the user). But to check, search, or use such strings as keys, you may want a unique way to identify the same string (semantically speaking).
The main problem is that you may have two strings which are semantically the same but have two different representations: e.g. one with an accented character [one code point], and one with a base character combined with an accent [one code point for the character, one for the combining accent]. Users may not be in control of how the input text is sent, so you may end up with two different user names, or two different passwords. Also, if you transform the data, you may get different results depending on the initial string. Users do not like that.
Another problem is the order of combining characters. You may have an accent and a lower tail (e.g. a cedilla), and you may express this with several combinations: "pure char, tail, accent", "pure char, accent, tail", "char+tail, accent", "char+accent, cedilla".
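A small sketch of the ordering problem, using the letter c with a cedilla and an acute accent. The specific characters are an assumption chosen for illustration; any base letter with two combining marks of different classes should behave the same way.

```javascript
// Same visible character, three different code point sequences.
const tailThenAccent = 'c\u0327\u0301';  // c + combining cedilla + combining acute
const accentThenTail = 'c\u0301\u0327';  // c + combining acute + combining cedilla
const precomposedTail = '\u00e7\u0301';  // ç (one code point) + combining acute

console.log(tailThenAccent === accentThenTail);  // false
console.log(tailThenAccent === precomposedTail); // false

// After normalization all three share one sequence of code points.
const normalized = [tailThenAccent, accentThenTail, precomposedTail]
  .map((s) => s.normalize('NFC'));
console.log(normalized.every((s) => s === normalized[0])); // true
```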
And you may have degenerate cases (especially if you type from a keyboard): you may get code points which should be removed (you may have a very long string which could be equivalent to a few bytes).
In any case, for sorting strings, you (or your library) require a normalized form: if you already provide the right one, the library will not need to transform it again.
So: you want the same (semantically speaking) string to have the same sequence of Unicode code points.
Note: if you are working directly on UTF-8, you should also care about special cases of UTF-8: the same code point could be written in different ways [using more bytes]. This could also be a security problem.
The K forms are often used for "searches" and similar tasks: CO2 and CO₂ will be interpreted in the same manner. But this could change the meaning of the text, so it should usually be used only internally, for temporary tasks, while keeping the original text.
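As a rough sketch of that "search only" use of the K forms: the helper name searchKey is hypothetical, and the lowercasing step is an extra assumption that is not part of normalization itself.

```javascript
// Hypothetical helper: builds a key used only for matching, never for display.
function searchKey(text) {
  return text.normalize('NFKC').toLowerCase();
}

const stored = 'CO\u2082 emissions'; // "CO₂ emissions", the original text, kept as-is
const query = 'co2';

console.log(stored.includes(query));                       // false
console.log(searchKey(stored).includes(searchKey(query))); // true
console.log(stored === 'CO\u2082 emissions');              // true, original untouched
```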
As stated in the MDN documentation, String.prototype.normalize() returns the Unicode Normalization Form of the string. This is because in Unicode, some characters can be represented by more than one sequence of code points.
This is the example (taken from MDN):
const name1 = '\u0041\u006d\u00e9\u006c\u0069\u0065';
const name2 = '\u0041\u006d\u0065\u0301\u006c\u0069\u0065';
console.log(`${name1}, ${name2}`);
// expected output: "Amélie, Amélie"
console.log(name1 === name2);
// expected output: false
console.log(name1.length === name2.length);
// expected output: false
const name1NFC = name1.normalize('NFC');
const name2NFC = name2.normalize('NFC');
console.log(`${name1NFC}, ${name2NFC}`);
// expected output: "Amélie, Amélie"
console.log(name1NFC === name2NFC);
// expected output: true
console.log(name1NFC.length === name2NFC.length);
// expected output: true
As you can see, the string Amélie has two different Unicode representations. With normalization, we can reduce the two forms to the same string.
Very beautifully explained here --> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize
Short answer: characters are represented through a coding scheme like ASCII, UTF-8, etc. (we mostly use UTF-8), and some characters have more than one representation. So two strings may render identically while their Unicode code points differ, and a string comparison may fail. We use normalize to get a single representation.
// source from MDN
let string1 = '\u00F1'; // ñ
let string2 = '\u006E\u0303'; // ñ
string1 = string1.normalize('NFC');
string2 = string2.normalize('NFC');
console.log(string1 === string2); // true
console.log(string1.length); // 1
console.log(string2.length); // 1
Normalization of strings isn't exclusive to JavaScript; see, for instance, Python. The valid values for the argument are defined by Unicode (more on Unicode normalization).
When it comes to JavaScript, note that the documentation refers to both String.normalize() and String.prototype.normalize(). As @ChrisG mentions:
String.prototype.normalize() is correct in a technical sense, because
normalize() is a dynamic method you call on instances, not the class
itself. The point of normalize() is to be able to compare Strings that
look the same but don't consist of the same characters, as shown in
the example code on MDN.
Then, when it comes to its usage, I found a great example of String.normalize():
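let s1 = 'sabi\u00e1';  // sabiá in NFC: á as a single code point
let s2 = 'sabia\u0301'; // sabiá in NFD: a followed by a combining acute accent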
// one is in NFC, the other in NFD, so they're different
console.log(s1 == s2); // false
// with normalization, they become the same
console.log(s1.normalize('NFC') === s2.normalize('NFC')); // true
// transform string into array of codepoints
function codepoints(s) { return Array.from(s).map(c => c.codePointAt(0).toString(16)); }
// printing the codepoints you can see the difference
console.log(codepoints(s1)); // [ "73", "61", "62", "69", "e1" ]
console.log(codepoints(s2)); // [ "73", "61", "62", "69", "61", "301" ]
So while sabiá and sabiá in this example look the same to the human eye, even when printed with console.log(), comparing them without normalization gives different results. By analyzing the code points, we can see that they are different.
There are some great answers here already, but I wanted to throw in a practical example.
I enjoy Bible translation as a hobby. I wasn't too thrilled at the flashcard option out there in the wild in my price range (free) so I made my own. The problem is, there is more than one way to do Hebrew and Greek in Unicode to get the exact same thing. For example:
בָּא
בָּא
These should look identical on your screen, and for all practical purposes they are identical. However, the first was typed with the qamats (the little t shaped thing under it) before the dagesh (the dot in the middle of the letter) and the second was typed with the dagesh before the qamats. Now, since you're just reading this, you don't care. And your web browser doesn't care. But when my flashcards compare the two, then they aren't the same. To the code behind the scenes, it's no different than saying "center" and "centre" are the same.
Similarly, in Greek:
ἀ
ἀ
These two should look nearly identical, but the top is one Unicode character and the second one is two Unicode characters. Which one is going to end up typed in my flashcards is going to depend on which keyboard I'm sitting at.
When I'm adding flashcards, believe it or not, I don't always type in vocab lists of 100 words. That's why God gave us spreadsheets. And sometimes the places I'm importing the lists from do it one way, and sometimes they do it the other way, and sometimes they mix it. But when I'm typing, I'm not trying to memorize the order in which the dagesh or qamats appear, or whether the accents are typed as a separate character or not. Regardless of whether I remember to type the dagesh first or not, I want to get the right answer, because really it's the same answer in every practical sense either way.
So I normalize the order before saving the flashcards and I normalize the order before checking it, and the result is that it doesn't matter which way I type it, it comes out right!
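The flashcard code itself isn't shown, so the following is only a sketch of the idea under stated assumptions: the helper name isSameAnswer is hypothetical, and NFC is assumed as the normalization form (any form works as long as both sides use the same one).

```javascript
// Hypothetical helper: compare a typed answer with the stored one after
// bringing both into the same normalization form (NFC assumed here).
function isSameAnswer(expected, typed) {
  return expected.normalize('NFC') === typed.normalize('NFC');
}

// Hebrew bet with qamats and dagesh, typed in both orders, followed by alef.
const qamatsFirst = '\u05d1\u05b8\u05bc\u05d0';
const dageshFirst = '\u05d1\u05bc\u05b8\u05d0';
console.log(qamatsFirst === dageshFirst);            // false
console.log(isSameAnswer(qamatsFirst, dageshFirst)); // true

// Greek alpha with smooth breathing: precomposed vs. letter + combining mark.
console.log(isSameAnswer('\u1f00', '\u03b1\u0313')); // true
```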
If you want to check out the results:
https://sthelenskungfu.com/flashcards/
You need a Google or Facebook account to log in, so it can track progress and such. As far as I know (or care) only my daughter and I currently use it.
It's free, but eternally in beta.

Extracting valid date from a string

I need to extract a valid date from a list of random strings. The date can be present in any date format ("01/25/16", "25/01/2016", "20-01-2016", "3-Nov-2016", etc.) with different kinds of separators.
I tried using Date.parse() and new Date(), but these methods also return a valid value for any number passed, which ideally is not a date.
For example: Date.parse("1") = 978336000000
My current solution is to check each string with the following regex
if(!string.match(/^\d+$|[a-zA-Z]+\s*[a-zA-Z0-9]*/) && (string.length > 7)) {
const date = Date.parse(string)
return (!isNaN(date))
}
This regex works to identify date strings like "01/25/16", "25/01/2016", "20-01-2016".
It matches most regular text like "100", "hello", "123hello", "1h ello12" (so those are rejected) and lets through values like "123-123" and "01/25/16", which Date.parse() then handles fairly well.
But this misses date strings like "23-Nov-2016", so I added one more regex along with the previous one:
if ((!string.match(/^\d+$|[a-zA-Z]+\s*[a-zA-Z0-9]*/) && string.length > 7) || (string.match(/^\d+$|[a-zA-Z]+\s*[a-zA-Z0-9]*/) && string.toLowerCase().match(/jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec/))) {
const date = Date.parse(string)
return (!isNaN(date))
}
I definitely believe that there is a much simpler solution than using such a large set of regexes in JavaScript.
EDIT : I don't control the date input rules to specifically validate certain formats.
Unfortunately, I don't think there is a better solution than using a set of regular expressions.
The problem is that there are at least a million different ways to write the same date. It seems like no matter what date formats you have planned for, your users will always come up with something that doesn't fit. So I approached this in the following way for a project I'm working on:
Make a list of acceptable date formats.
Tell the users not to use different formats and enforce it via client-side validation.
In my case, I'm living in the US, and dates are usually written like 'M/D/YY'. To allow for a reasonable range of variation, I wrote my code to accept M/D/YY, M/D/YYYY, and M/D (where the current year is substituted if the year is omitted). These formats are recognized using regular expressions then parsed using the Moment.js library.
You may want to expand the list of permitted formats if your users habitually use them - that's fine. But the important thing is to realize that you can't plan for all possible formats - there are just too many variations.
If you can meet your users' expectations 90% of the time (with the most common formats) and train your users that these are the accepted formats, you'll have happy users and date parsing code that's not 10,000 lines long.
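A minimal sketch of the "short list of accepted formats" approach for M/D, M/D/YY and M/D/YYYY. It uses plain Date instead of Moment.js to stay self-contained, so it is not the code from the project mentioned above. The name parseUsDate and the rule that two-digit years belong to the 2000s are assumptions.

```javascript
// Accepts M/D, M/D/YY and M/D/YYYY; everything else is rejected.
const US_DATE = /^(\d{1,2})\/(\d{1,2})(?:\/(\d{2}|\d{4}))?$/;

// Hypothetical helper: returns a Date, or null if the input is not accepted.
function parseUsDate(input, today = new Date()) {
  const match = US_DATE.exec(input.trim());
  if (!match) return null;

  const month = Number(match[1]);
  const day = Number(match[2]);
  let year = today.getFullYear(); // year omitted: substitute the current year
  if (match[3]) {
    // Assumption: two-digit years belong to the 2000s.
    year = match[3].length === 2 ? 2000 + Number(match[3]) : Number(match[3]);
  }

  // Date silently rolls over invalid values (e.g. 2/30), so check the round trip.
  const date = new Date(year, month - 1, day);
  const valid = date.getFullYear() === year
    && date.getMonth() === month - 1
    && date.getDate() === day;
  return valid ? date : null;
}

console.log(parseUsDate('1/25/16') !== null); // true
console.log(parseUsDate('2/30/2016'));        // null
console.log(parseUsDate('hello'));            // null
```

Inputs outside the list, such as "25/01/2016" or "3-Nov-2016", come back as null, which is where client-side validation can ask the user for one of the accepted formats.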

Getting the user's region with navigator.language

For some time, I've been using something like this to get my user's country (ISO-3166):
const region = navigator.language.split('-')[1]; // 'US'
I've always assumed the string would be similar to en-US -- where the country would hold the 2nd position of the array.
I now think this assumption is incorrect. According to the MDN docs, navigator.language returns a "string representing the language version as defined in BCP 47." Reading BCP 47, the primary language subtag is guaranteed to be first (e.g., 'en'), but the region code is not guaranteed to be the second subtag. There can be subtags that precede and follow the region subtag.
For example "sr-Latn-RS" is a valid BCP 47 language tag:
sr | Latn | RS
primary language | script subtag | region subtag
Is the value returned from navigator.language a subset of BCP 47 containing only language and region? Or is there a library or regex that is commonly used to extract the region subtag from a language tag?
Your solution is based on the false premise that the browser's language tag reliably matches the user's country. E.g., I have set my browser language to German, even though I am living nowhere near Germany at the moment, but rather in the United States.
Also, in Chrome for example, many language packs do not require you to specify the region modifier. Setting Chrome's display language to German provides the following language tag:
> navigator.language
< "de"
No region tag at all, and a fairly common language.
Bottom line is, my browser setup results in language tag de, even though I live in the United States.
A more accurate and possibly more reliable way to determine the user's location would be to derive it from the IP address associated with the request. Numerous services offer this; ip-api.com is one of them:
$.get("http://ip-api.com/json", function(response) {
console.log(response.country); // "United States"
console.log(response.countryCode); // "US"
}, "jsonp");
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
You can now extract the region from a locale identifier using the Locale object in the Internationalization API.
const { region } = new Intl.Locale('sr-Latn-RS') // region => 'RS'
Note that this is not currently compatible with Internet Explorer.
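Since many tags carry no region at all, a slightly more defensive sketch may be useful. The helper name getRegion is hypothetical; it returns undefined both when the tag has no region subtag and when the tag is not well formed (in which case the Intl.Locale constructor throws a RangeError).

```javascript
// Hypothetical helper: returns the region subtag, or undefined when the tag
// has no region or is not a well-formed language tag.
function getRegion(tag) {
  try {
    return new Intl.Locale(tag).region;
  } catch (error) {
    if (error instanceof RangeError) return undefined; // malformed tag
    throw error;
  }
}

console.log(getRegion('sr-Latn-RS')); // "RS"
console.log(getRegion('en-US'));      // "US"
console.log(getRegion('de'));         // undefined
```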
Regex found here: https://github.com/gagle/node-bcp47/blob/master/lib/index.js
var re = /^(?:(en-GB-oed|i-ami|i-bnn|i-default|i-enochian|i-hak|i-klingon|i-lux|i-mingo|i-navajo|i-pwn|i-tao|i-tay|i-tsu|sgn-BE-FR|sgn-BE-NL|sgn-CH-DE)|(art-lojban|cel-gaulish|no-bok|no-nyn|zh-guoyu|zh-hakka|zh-min|zh-min-nan|zh-xiang))$|^((?:[a-z]{2,3}(?:(?:-[a-z]{3}){1,3})?)|[a-z]{4}|[a-z]{5,8})(?:-([a-z]{4}))?(?:-([a-z]{2}|\d{3}))?((?:-(?:[\da-z]{5,8}|\d[\da-z]{3}))*)?((?:-[\da-wy-z](?:-[\da-z]{2,8})+)*)?(-x(?:-[\da-z]{1,8})+)?$|^(x(?:-[\da-z]{1,8})+)$/i;
let foo = re.exec('de-AT'); // German in Austria
let bar = re.exec('zh-Hans-CN'); // Simplified Chinese using Simplified script in mainland China
console.log(`region ${foo[5]}`); // 'region AT'
console.log(`region ${bar[5]}`); // 'region CN'
In Firefox, you can choose your language settings in preferences:
The list of languages has 269 items, 192 of which do not include any region code.
The region is only useful when a language has different variants depending on the location. This way users can tell the server in which language variant they prefer the response to be.
Do not use this approach to locate the user. It's too unreliable, because the user may not specify any region, or because the user could physically be in another place.
If you want to locate the user, you should use the Geolocation API.
Be careful: there are both navigator.language and navigator.languages.
language:
console.log(navigator.language); // "fr"
languages:
console.log(navigator.languages); // ["fr", "fr-FR", "en-US", "en"]
To find countries, see Wikipedia on ISO 3166-1 or use a JavaScript library:
i18n-iso-countries
country.js
ISO Country List - HTML select/dropdown snippet
Just as @TimoSta said, try this:
$.getJSON('http://freegeoip.net/json/', function(result) {
alert(result.country_code);
});
From "Get visitors language & country code with javascript (client-side)". See the answer by @noducks.
The value you are receiving stems from the Accept-Language header of the HTTP request.
The values of the header can be quite complex like
Accept-Language: da, en-GB;q=0.8, en;q=0.7
As the name implies, the Accept-Language header basically defines acceptable languages, not countries.
A language tag may also contain additional location information, as in 'en-GB', but others like 'en' do not.
In that case, there is simply no information about the country.
It is also not always possible to exactly map a language like 'en' to a country.
If the language is 'en', the country might be 'GB' but it may also be 'US'.
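To make the structure of the header concrete, here is a simplified parsing sketch. The helper name parseAcceptLanguage is hypothetical, and the parser only handles the common "tag;q=value" shape; a missing q parameter is treated as q=1.

```javascript
// Hypothetical helper: splits an Accept-Language value into language tags
// sorted by preference. A missing q parameter means q=1.
function parseAcceptLanguage(header) {
  return header
    .split(',')
    .map((item) => {
      const [tag, ...params] = item.trim().split(';');
      const qParam = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const q = qParam ? Number(qParam.slice(2)) : 1;
      return { tag: tag.trim(), q: Number.isNaN(q) ? 0 : q };
    })
    .filter((entry) => entry.tag !== '')
    .sort((a, b) => b.q - a.q);
}

const parsed = parseAcceptLanguage('da, en-GB;q=0.8, en;q=0.7');
console.log(parsed.map((entry) => entry.tag)); // [ 'da', 'en-GB', 'en' ]
```

Only the second entry contains a region subtag; the other two say nothing about a country.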
What you can do:
Determine the country only if the language tag contains one, as in 'en-GB'.
If the language tag does not contain a country, you have the following options:
A few languages are only used in one country, like 'da' (Danish), which is spoken only in Denmark (I am guessing here), so you may map these cases.
You may use a default for other cases, depending on the language, e.g. map 'en' to 'GB'.
You may use a general default like 'US' for all cases where no country can be determined.
You can use additional information, e.g. the client's IP address, to determine the country.
Finally, you may ask the user to enter the country.
I collected some additional information about the Accept-Language header here
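The first few options above can be sketched as a chain of fallbacks. Everything named here is hypothetical: the function guessCountry and the three lookup tables are illustrative placeholders, not authoritative data, and the IP-based and ask-the-user options are left out.

```javascript
// Assumption: these tables are illustrative only, not authoritative data.
const SINGLE_COUNTRY_LANGUAGES = { da: 'DK' };
const LANGUAGE_DEFAULTS = { en: 'GB' };
const GENERAL_DEFAULT = 'US';

// Hypothetical helper: best-effort country guess from a single language tag.
function guessCountry(languageTag) {
  const locale = new Intl.Locale(languageTag);
  if (locale.region) return locale.region;          // tag has a region, e.g. 'en-GB'
  return SINGLE_COUNTRY_LANGUAGES[locale.language]  // language tied to one country
    ?? LANGUAGE_DEFAULTS[locale.language]           // per-language default
    ?? GENERAL_DEFAULT;                             // general default
}

console.log(guessCountry('en-GB')); // "GB"
console.log(guessCountry('da'));    // "DK"
console.log(guessCountry('fr'));    // "US"
```

The result is still only a guess about the country, which is why asking the user remains the last option in the list.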

Browser intl.NumberFormat not displaying currency symbols correctly

I'm attempting to write a currency formatting function using Intl.NumberFormat.
It works correctly when I pass it things like USD, or EUR as the currency, but seems to break when I pass it more obscure currency codes like PLN or COL, and instead of displaying their symbols as requested it displays the Codes. It is clearly recognizing the code because when I ask it to display the name instead it works correctly:
Intl.NumberFormat("en-US",{
style:'currency',
minimumIntegerDigits:1,
currency: 'PLN',
currencyDisplay: 'symbol'
}).format(43);
Displays "PLN43" while
Intl.NumberFormat("en-US",{
style:'currency',
minimumIntegerDigits:1,
currency: 'PLN',
currencyDisplay: 'name'
}).format(43);
Displays "43.00 Polish zlotys"
Intl.NumberFormat should have the symbols you need; you just have to make sure you specify the correct language code.
You can find a mapping of ISO language codes here:
https://www.w3schools.com/tags/ref_language_codes.asp
In this case you will need to use the Polish value "pl" instead of "en-US"
Intl.NumberFormat("pl",{
style:'currency',
minimumIntegerDigits:1,
currency: 'PLN',
currencyDisplay: 'symbol'
}).format(43);
According to the spec:
However, the set of combinations of currency code and language tag for which localized currency symbols are available is implementation dependent. Where a localized currency symbol is not available, the ISO 4217 currency code is used for formatting.
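A sketch for detecting that fallback at runtime: read the currency part from formatToParts and compare it with the ISO code. The helper names currencyLabel and fallsBackToCode are hypothetical, and the results for less common combinations depend on the locale data shipped with the browser or runtime.

```javascript
// Hypothetical helper: returns the text that style 'currency' shows for the
// currency in this locale (a localized symbol, or the ISO 4217 code).
function currencyLabel(locale, currency, currencyDisplay = 'symbol') {
  const parts = new Intl.NumberFormat(locale, {
    style: 'currency',
    currency,
    currencyDisplay,
  }).formatToParts(1);
  return parts.find((part) => part.type === 'currency').value;
}

// True when the implementation only offers the ISO code for this combination.
function fallsBackToCode(locale, currency) {
  return currencyLabel(locale, currency) === currency;
}

console.log(fallsBackToCode('en-US', 'USD')); // false, "$" is available
console.log(currencyLabel('pl', 'PLN'));      // implementation dependent
console.log(currencyLabel('en-US', 'PLN'));   // implementation dependent
```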

jQuery Globalization. Pass currency unit as argument for C format

I am setting up a multilingual site that deals with currencies. I want to be able to display correct currency formats based on the selected language. The server-side PHP stuff is a doddle. Using a combination of PHP's NumberFormatter and strftime, I have been able to format currencies and dates correctly.
There is however a requirement to have the same degree of formatting done client side with javascript.
I have come across Globalization (former jQuery plugin) and it looks quite promising.
If I want to display a dollar value in American English I can do something like this:
jQuery.preferCulture("en-US");
// Formatting price
var price = jQuery.format(3899.888, "c");
//Assigning stock price to the control
jQuery("#price").html(price);
and this will output:
$3,899.89
While doing:
jQuery.preferCulture("fr-FR");
// Formatting price
var price = jQuery.format(3899.888, "c");
//Assigning stock price to the control
jQuery("#price").html(price);
outputs:
3 899,89 €
which looks perfect. However, I need to output multiple currencies. So, if I have 'fr-FR' selected as my preferred culture, how can I output, say, a dollar value like so:
3 899,89 $
so that the format is French, but the value is in American dollars? I have looked but not found any way to pass a currency symbol as an argument.
The only documented way to modify the currency symbol in Globalize is to change the numberFormat.currency.symbol property of a given culture—in this case, the fr-FR culture. This will kind of do what you want, but it’s not a very elegant solution, and you would need to manually build a table of correct symbols for each locale and write another method to swap them out. (n.b. It is possible to pass a third argument to Globalize.format with a different locale identifier, but this just formats the number using that locale’s cultural settings.) Looking at the culture definition syntax, there is simply no provision for displaying different currencies using a given locale.
If you were to look elsewhere, the dojo/currency module in the Dojo Toolkit does do exactly what you need, using data from the Unicode Common Locale Data Repository to determine how to represent various currencies in different locales. So you can set your locale to fr, write currency.format(3899.888, { currency: "USD" }), and it will output the currency in USD in the correct format for the French locale.
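For comparison, the same idea (the locale controls the format, the currency is passed separately) can also be sketched with the built-in Intl.NumberFormat. This is an alternative to Dojo and Globalize, not something either library provides; the helper name formatMoney is hypothetical. The exact symbol and spacing for the French output come from the runtime's locale data, so no exact output is claimed for it.

```javascript
// Locale decides grouping, decimal separator and symbol position;
// the currency is an independent option.
function formatMoney(value, locale, currency) {
  return new Intl.NumberFormat(locale, { style: 'currency', currency }).format(value);
}

console.log(formatMoney(3899.888, 'fr-FR', 'EUR')); // French format, euro
console.log(formatMoney(3899.888, 'fr-FR', 'USD')); // French format, US dollar
console.log(formatMoney(3899.888, 'en-US', 'USD')); // "$3,899.89"
```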
I had the same problem. In the end, I just replaced the default currency symbol in the output with the symbol I wanted to display. It's a bit primitive, but it keeps the formatting correct for the locale while showing the currency symbol you want.
function formatCurrency(value, format, symbol){
var formattedValue = Globalize.format(value, format);
if (typeof symbol === "string") {
formattedValue = formattedValue.replace(Globalize.culture().numberFormat.currency.symbol, symbol);
}
return formattedValue;
}
document.getElementById("price1").innerHTML = formatCurrency(123.34,"c"); //<-- $123.34
document.getElementById("price2").innerHTML = formatCurrency(123.34,"c","£"); //<-- £123.34
Here is the fiddle
