How to compare locale dependent float numbers?


I need to compare a float value entered in a web form against a range. The problem is that the client computers may have various locale settings, meaning that users may use either "." or "," to separate the integer part from the decimal part.
Is there a simple way to do it? Since it is for an intranet where users are only allowed to use IE, VBScript is fine, even though I would prefer to use JavaScript.
EDIT: Let me clarify it a bit:
I cannot rely on the system locale because, for example, a lot of our French customers use a computer with an English locale, even though they still use the comma to fill in data in the web forms.
So I need a way to perform a "string to double" conversion that works across multiple locales.
I know the catch is "what about numbers with 3 decimal digits", but in our environment this kind of input never happens, and if it does, it will be treated as an out-of-range error due to the multiplication by a thousand, so it's not a real issue for us.

In JavaScript, use parseFloat on the text value to get a number. Similarly, in VBScript, use CDbl on the text value. CDbl conforms to the locale settings in force for the user; note, however, that parseFloat always expects "." as the decimal separator, regardless of locale.

This code should work:
function toFloat(localFloatStr) {
var x = localFloatStr.split(/,|\./),
x2 = x[x.length-1],
x3 = x.join('').replace(new RegExp(x2+'$'),'.'+x2);
return parseFloat(x3);
// x2 is for clarity, could be omitted:
//=>x.join('').replace(new RegExp(x[x.length-1]+'$'),'.'+x[x.length-1])
}
alert(toFloat('1,223,455.223')); //=> 1223455.223
alert(toFloat('1.223.455,223')); //=> 1223455.223
// your numbers ;~)
alert(toFloat('3.123,56')); //=> 3123.56
alert(toFloat('3,123.56')); //=> 3123.56

What we do is try parsing using the culture of the user and if that doesn't work, parse it using an invariant culture.
I wouldn't know how to do it in javascript or vbscript exactly though.

I used KooiInc's answer but changed it a bit, because it didn't handle some cases (for example, a number with no separator at all).
function toFloat(strNum) {
var full = strNum.split(/[.,]/);
if (full.length == 1) return parseFloat(strNum);
var back = full[full.length - 1];
var result = full.join('').replace(new RegExp(back + '$'), '.' + back);
return parseFloat(result);
}

Forbid using any thousands separator.
Give the user an example: "Reals should look like this: 3123.56 or 3123,56". Then simply change , to . and parse it.
You can always tell user that he did something wrong with a message like this:
"I don't understand what you mean by "**,**,**".
Please format numbers like "3123.56."
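A minimal sketch of the "no thousands separator" approach described above. The names parseUserFloat and isInRange are hypothetical, and the accepted format (optional minus sign, digits, at most one "." or ",") is an assumption about what the form should allow:

```javascript
// Hypothetical helper: accepts "3123.56" or "3123,56", rejects anything else.
function parseUserFloat(text) {
  const trimmed = text.trim();
  // Exactly one optional separator, so "1,223,455" is rejected instead of guessed at.
  if (!/^-?\d+([.,]\d+)?$/.test(trimmed)) {
    return NaN;
  }
  return parseFloat(trimmed.replace(',', '.'));
}

// Hypothetical helper: true only if the text parses and lies within [min, max].
function isInRange(text, min, max) {
  const value = parseUserFloat(text);
  return !Number.isNaN(value) && value >= min && value <= max;
}

console.log(parseUserFloat('3123,56')); // 3123.56
console.log(parseUserFloat('3123.56')); // 3123.56
console.log(isInRange('2,5', 0, 10));   // true
```

When parseUserFloat returns NaN, the page can show the "Please format numbers like ..." message instead of silently guessing.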

Related

Why doesn't my function correctly replace when using some regex pattern

This is an extension of this SO question
I made a function to see if I can correctly format any number. The answers below work on tools like https://regex101.com and https://regexr.com/, but not within my function (tried in Node and in the browser):
const format = (num, regex) => String(num).replace(regex, '$1')
Basically, any whole number should not exceed 15 significant digits, and any decimal should not exceed 2 decimal places.
Now
format(0.12345678901234567890, /^\d{1,13}(\.\d{1,2}|\d{0,2})$/)
returns 0.123456789012345678 instead of 0.123456789012345
but
format(0.123456789012345,/^-?(\d*\.?\d{0,2}).*/)
returns the number formatted to 2 decimal places, as expected.

Let me try to explain what's going on.
For the given input 0.12345678901234567890 and the regex /^\d{1,13}(\.\d{1,2}|\d{0,2})$/, let's go step by step and see what's happening.
^\d{1,13} Does indeed match the start of the string 0
(\. Now you've opened a new group, and it does match .
\d{1,2} It does find the digits 1 and 2
|\d{0,2} So this part is skipped
) So this is the end of your capture group.
$ This indicates the end of the string, but it won't match, because you've still got 345678901234567890 remaining.
Javascript returns the whole string because the match failed in the end.
Let's try removing $ at the end, to become /^\d{1,13}(\.\d{1,2}|\d{0,2})/
You'd get back ".12345678901234567890". This generates a couple of questions.
Why did the preceding 0 get removed?
Because it was not part of your matching group, enclosed with ().
Why did we not get only two decimal places, i.e. .12?
Remember that you're doing a replace. Which means that by default, the original string will be kept in place, only the parts that match will get replaced. Since 345678901234567890 was not part of the match, it was left intact. The only part that matched was 0.12.
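The three behaviours described above can be reproduced with a shorter input. This is only an illustrative sketch: the input string and the third pattern (which matches the whole string so that nothing is left over) are assumptions, not taken from the question.

```javascript
const input = '0.12345678';

// Anchored with $: the pattern cannot reach the end of the string,
// so nothing matches and replace() returns the input unchanged.
const anchored = input.replace(/^\d{1,13}(\.\d{1,2}|\d{0,2})$/, '$1');

// Without $: "0.12" matches and is replaced by group 1 (".12");
// the unmatched tail "345678" is left in place.
const unanchored = input.replace(/^\d{1,13}(\.\d{1,2}|\d{0,2})/, '$1');

// Matching the whole string (.* swallows the tail) leaves only the group.
const wholeString = input.replace(/^(\d+\.\d{0,2}).*$/, '$1');

console.log(anchored);    // "0.12345678"
console.log(unanchored);  // ".12345678"
console.log(wholeString); // "0.12"
```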

Answer to title question: your function doesn't replace, because there's nothing to replace - the regex doesn't match anything in the string. csb's answer explains that in all details.
But that's perhaps not the answer you really need.
Now, it seems like you have an XY problem. You ask why your call to .replace() doesn't work, but .replace() is definitely not the function you should use. The role of .replace() is to replace parts of a string, while you actually want to create a different string. Moreover, in the comments you suggest that your formatting is not only for presenting data to the user, but that you also intend to use it in some further computation. You also mention cryptocurrencies.
Let's cope with these problems one-by-one.
What to do instead of replace?
Well, just produce the string you need instead of replacing something in the string you don't like. There are some edge cases. Instead of writing all-in-one regex, just handle them one-by-one.
The following code is definitely not the best possible, but its main aim is to be simple and show exactly what is going on.
function format(n) {
const max_significant_digits = 15;
const max_precision = 2;
let digits_before_decimal_point;
if (n < 0) {
// Don't count minus sign.
digits_before_decimal_point = n.toFixed(0).length - 1;
} else {
digits_before_decimal_point = n.toFixed(0).length;
}
if (digits_before_decimal_point > max_significant_digits) {
throw new Error('No good representation for this number');
}
const available_significant_digits_for_precision =
Math.max(0, max_significant_digits - digits_before_decimal_point);
const effective_max_precision =
Math.min(max_precision, available_significant_digits_for_precision);
const with_trailing_zeroes = n.toFixed(effective_max_precision);
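// I want to keep the string and change just the matching part,
// so here .replace() is a proper method to use.
// Only strip zeroes that follow a decimal point, so that "1000" stays "1000".
const without_trailing_zeroes = with_trailing_zeroes.includes('.')
? with_trailing_zeroes.replace(/\.?0+$/, '')
: with_trailing_zeroes;
return without_trailing_zeroes;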
}
So, you got the number formatted the way you want. What now?
What can you use this string for?
Well, you can display it to the user. And that's mostly it. The value was rounded to (1) represent it in a different base and (2) fit in limited precision, so it's pretty much useless for any computation. And, BTW, why would you convert it to String in the first place, if what you want is a number?
Was the value you are trying to print ever useful in the first place?
Well, that's the most serious question here. Because, you know, floating point numbers are tricky. And they are absolutely abysmal for representing money. So, most likely the number you are trying to format is already a wrong number.
What to use instead?
Fixed-point arithmetic is the most obvious answer. It works most of the time. However, it's pretty tricky in JS, where a number may slip into floating-point representation at almost any time. So, it's better to use a decimal arithmetic library. Optionally, switch to a language that has built-in bignums and decimals, like Python.
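As a rough sketch of what fixed-point arithmetic can look like in JavaScript, the example below stores amounts as an integer number of hundredths in a BigInt. The helper names parseCents and formatCents, the two-decimal scale, and the use of BigInt (which needs a reasonably modern engine) are all assumptions made for illustration; a real decimal library would cover far more cases.

```javascript
// Floating point: the classic surprise.
console.log(0.1 + 0.2 === 0.3); // false

// Hypothetical helper: "12.34" -> 1234n (amount in hundredths).
function parseCents(text) {
  const match = /^(-?)(\d+)(?:\.(\d{1,2}))?$/.exec(text);
  if (!match) {
    throw new Error('Unsupported amount: ' + text);
  }
  const [, sign, whole, fraction = ''] = match;
  const cents = BigInt(whole) * 100n + BigInt(fraction.padEnd(2, '0'));
  return sign === '-' ? -cents : cents;
}

// Hypothetical helper: 1234n -> "12.34".
function formatCents(cents) {
  const sign = cents < 0n ? '-' : '';
  const absolute = cents < 0n ? -cents : cents;
  const whole = absolute / 100n; // BigInt division truncates
  const fraction = String(absolute % 100n).padStart(2, '0');
  return `${sign}${whole}.${fraction}`;
}

const total = parseCents('0.10') + parseCents('0.20');
console.log(formatCents(total)); // "0.30"
```

All arithmetic stays in integers, so no rounding happens until the value is formatted for display.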

How to "unformat" a numerical string? JavaScript

So I know how to format a string or integer like 2000 to 2K, but how do I reverse it?
I want to do something like:
var string = "$2K".replace("/* K with 000 and remove $ symbol in front of 2 */");
How do I start? I am not very good with regular expressions, but I have been taking some more time out to learn them. If you can help, I would certainly appreciate it. Is it possible to do the same thing for M for millions (adding 000000 at the end) or B for billions (adding 000000000 at the end)?

var string = "$2K".replace(/\$(\d+)K/, "$1000");
will give output as
2000
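To cover the M and B suffixes from the question as well, one possible sketch maps each suffix to a multiplier instead of appending zeroes. The function name unformat, the optional leading $, and returning NaN for unrecognised input are assumptions made for this example:

```javascript
// Hypothetical helper: "$2K" -> 2000, "1.5M" -> 1500000, "3B" -> 3000000000.
function unformat(text) {
  const multipliers = { K: 1e3, M: 1e6, B: 1e9 };
  const match = /^\$?(\d+(?:\.\d+)?)([KMB])?$/i.exec(text.trim());
  if (!match) {
    return NaN; // not a format this sketch understands
  }
  const suffix = (match[2] || '').toUpperCase();
  return parseFloat(match[1]) * (multipliers[suffix] || 1);
}

console.log(unformat('$2K'));  // 2000
console.log(unformat('1.5M')); // 1500000
console.log(unformat('3B'));   // 3000000000
```

Multiplying, rather than appending "000", also handles fractional inputs such as "1.5M".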

I'm going to take a different approach to this, as the best way to do this is to change your app so that it does not lose the original numeric information. I recognize that this isn't always possible (for example, if you're scraping formatted values...), but it could be a useful way to think about it for other users with a similar question.
Instead of just storing the numeric values or the display values (and then trying to convert back to the numeric values later on), try to update your app to store both in the same object:
var value = {numeric: 2000, display: '2K'}
console.log(value.numeric); // 2000
console.log(value.display); // 2K
The example here is a bit simplified, but if you pass around your values like this, you don't need to convert back in the first place. It also allows you to have your formatted values change based on locale, currency, or rounding, and you don't lose the precision of your original values.

Is it better to compare strings using toLowerCase or toUpperCase in JavaScript?

I'm going through a code review and I'm curious if it's better to convert strings to upper or lower case in JavaScript when attempting to compare them while ignoring case.
Trivial example:
var firstString = "I might be A different CASE";
var secondString = "i might be a different case";
var areStringsEqual = firstString.toLowerCase() === secondString.toLowerCase();
or should I do this:
var firstString = "I might be A different CASE";
var secondString = "i might be a different case";
var areStringsEqual = firstString.toUpperCase() === secondString.toUpperCase();
It seems like either "should" or would work with limited character sets like only English letters, so is one more robust than the other?
As a note, MSDN recommends normalizing strings to uppercase, but that is for managed code (presumably C# & F# but they have fancy StringComparers and base libraries):
http://msdn.microsoft.com/en-us/library/bb386042.aspx

Revised answer
It's been quite a while since I answered this question. While the cultural issues still hold true (and I don't think they will ever go away), the development of the ECMA-402 standard made my original answer... outdated (or obsolete?).
The best solution for comparing localized strings seems to be using function localeCompare() with appropriate locales and options:
var locale = 'en'; // that should be somehow detected and passed on to JS
var firstString = "I might be A different CASE";
var secondString = "i might be a different case";
if (firstString.localeCompare(secondString, locale, {sensitivity: 'accent'}) === 0) {
// do something when equal
}
This will compare two strings case-insensitive, but accent-sensitive (for example ą != a).
If this is not sufficient for performance reasons, you may want to use either toLocaleUpperCase() or toLocaleLowerCase(), passing the locale as a parameter:
if (firstString.toLocaleUpperCase(locale) === secondString.toLocaleUpperCase(locale)) {
// do something when equal
}
In theory there should be no differences. In practice, subtle implementation details (or lack of implementation in the given browser) may yield different results...
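When many comparisons are needed, one option is to build an Intl.Collator (also part of ECMA-402) once and reuse it, rather than passing the locale and options to localeCompare() on every call. A minimal sketch, where the helper name equalsIgnoringCase is hypothetical and 'en' is only a placeholder locale:

```javascript
// Assumption: the locale is detected elsewhere; 'en' is a placeholder.
const collator = new Intl.Collator('en', { sensitivity: 'accent' });

// Hypothetical helper: case-insensitive but accent-sensitive equality.
function equalsIgnoringCase(first, second) {
  return collator.compare(first, second) === 0;
}

console.log(equalsIgnoringCase('I might be A different CASE',
                               'i might be a different case')); // true
console.log(equalsIgnoringCase('resume', 'résumé'));            // false
```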
Original answer
I am not sure if you really meant to ask this question in Internationalization (i18n) tag, but since you did...
Probably the most unexpected answer is: neither.
There are tons of problems with case conversion, which inevitably leads to functional issues if you want to convert the character case without indicating the language (like in JavaScript case). For instance:
There are many natural languages that don't have concept of upper- and lowercase characters. No point in trying to convert them (although this will work).
There are language specific rules for converting the string. German sharp S character (ß) is bound to be converted into two upper case S letters (SS).
Turkish and Azerbaijani (or Azeri if you prefer) have a "very strange" concept of two i characters: dotless ı (which converts to uppercase I) and dotted i (which converts to uppercase İ <- this font does not allow for correct presentation, but this is really a different glyph).
The Greek language has many "strange" conversion rules. One particular rule concerns the uppercase letter sigma (Σ), which, depending on its place in a word, has two lowercase counterparts: regular sigma (σ) and final sigma (ς). There are also other conversion rules for "accented" characters, but they are commonly omitted when implementing a conversion function.
Some languages have title-case letters, e.g. ǈ, which should be converted to things like Ǉ or, less appropriately, LJ. The same may apply to ligatures.
Finally there are many compatibility characters that may mean the same as what you are trying to compare to, but be composed of completely different characters. To make it worse, things like "ae" may be the equivalent of "ä" in German and Finnish, but equivalent of "æ" in Danish.
I am trying to convince you that it is really better to compare user input literally, rather than converting it. If it is not user-related, it probably doesn't matter, but case conversion will always take time. Why bother?
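The German sharp S case mentioned above is easy to reproduce. In this small sketch (the sample words are chosen only for illustration), the two conversion directions disagree about whether the same pair of strings is equal:

```javascript
const lower = 'straße';
const upper = 'STRASSE';

// ß uppercases to "SS", so the uppercase forms are identical...
const equalViaUpper = lower.toUpperCase() === upper.toUpperCase();

// ...but "SS" lowercases to "ss", never back to ß.
const equalViaLower = lower.toLowerCase() === upper.toLowerCase();

console.log('ß'.toUpperCase()); // "SS"
console.log(equalViaUpper);     // true
console.log(equalViaLower);     // false
```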

Some other options have been presented, but if you must use toLowerCase, or
toUpperCase, I wanted some actual data on this. I pulled the full list
of two byte characters that fail with toLowerCase or toUpperCase. I then
ran this test:
let pairs = [
[0x00E5,0x212B],[0x00C5,0x212B],[0x0399,0x1FBE],[0x03B9,0x1FBE],[0x03B2,0x03D0],
[0x03B5,0x03F5],[0x03B8,0x03D1],[0x03B8,0x03F4],[0x03D1,0x03F4],[0x03B9,0x1FBE],
[0x0345,0x03B9],[0x0345,0x1FBE],[0x03BA,0x03F0],[0x00B5,0x03BC],[0x03C0,0x03D6],
[0x03C1,0x03F1],[0x03C2,0x03C3],[0x03C6,0x03D5],[0x03C9,0x2126],[0x0392,0x03D0],
[0x0395,0x03F5],[0x03D1,0x03F4],[0x0398,0x03D1],[0x0398,0x03F4],[0x0345,0x1FBE],
[0x0345,0x0399],[0x0399,0x1FBE],[0x039A,0x03F0],[0x00B5,0x039C],[0x03A0,0x03D6],
[0x03A1,0x03F1],[0x03A3,0x03C2],[0x03A6,0x03D5],[0x03A9,0x2126],[0x0398,0x03F4],
[0x03B8,0x03F4],[0x03B8,0x03D1],[0x0398,0x03D1],[0x0432,0x1C80],[0x0434,0x1C81],
[0x043E,0x1C82],[0x0441,0x1C83],[0x0442,0x1C84],[0x0442,0x1C85],[0x1C84,0x1C85],
[0x044A,0x1C86],[0x0412,0x1C80],[0x0414,0x1C81],[0x041E,0x1C82],[0x0421,0x1C83],
[0x1C84,0x1C85],[0x0422,0x1C84],[0x0422,0x1C85],[0x042A,0x1C86],[0x0463,0x1C87],
[0x0462,0x1C87]
];
let upper = 0, lower = 0;
for (let pair of pairs) {
let row = 'U+' + pair[0].toString(16).padStart(4, '0') + ' ';
row += 'U+' + pair[1].toString(16).padStart(4, '0') + ' pass: ';
let s = String.fromCodePoint(pair[0]);
let t = String.fromCodePoint(pair[1]);
if (s.toUpperCase() == t.toUpperCase()) {
row += 'toUpperCase ';
upper++;
} else {
row += ' ';
}
if (s.toLowerCase() == t.toLowerCase()) {
row += 'toLowerCase';
lower++;
}
console.log(row);
}
console.log('upper pass: ' + upper + ', lower pass: ' + lower);
Interestingly, one of the pairs fails with both. But based on this,
toUpperCase is the best option.

It never depends on the browser, as only JavaScript is involved.
Both will perform according to the number of characters that need to be changed (case flipped):
var areStringsEqual = firstString.toLowerCase() === secondString.toLowerCase();
var areStringsEqual = firstString.toUpperCase() === secondString.toUpperCase();
If you use the test prepared by @adeneo, it may feel browser dependent, but try some other test inputs like:
"AAAAAAAAAAAAAAAAAAAAAAAAAAAA"
and
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
and compare.
JavaScript performance depends on the browser when a DOM API or any DOM manipulation/interaction is involved; otherwise, plain JavaScript will give the same performance.

Running an equation from a string

I am trying to create a simple online calculator that can run basic calculations in JavaScript.
I have managed to create the interface so that numbers and operators are stored in a form field.
What I would like to be able to do is pass the values within the form field to a function that will calculate the total of the form field.
The form field could contain anything from a simple "10 + 10" to more complex equations using brackets. The operators in use are +, -, *, and /.
Is it possible to pass the form field's text (a string) to a JavaScript function that can recognize the operators and then perform the corresponding operations on the values?
A possible value in the text field would be:
120/4+130/5
The function should then return 56 as the answer. I have done this in JavaScript when I know the values like this:
function WorkThisOut(a,b,c,d) {
var total = a/b+c/d;
alert (total);
}
WorkThisOut(120,4,130,5);
What I would like to be able to do is pass the full value "120/4+130/5" to the function and it be able to extract the numbers and operators to create the total.
Does anyone have any ideas on how this could be done, or if it is even possible? This may get more complex, as I may need to pass values in parentheses: "(120/4)+(130/5)"

I may get blasted for this. But, here it goes anyway.
There are three solutions I can think of for this:
Implement your own parser, lexer and parse out the code.
That's not super easy, but it may be a great learning experience.
Run an eval under a subdomain meant only for that, so that scripts can't maliciously access your site
Sanitize the input so that it contains only the characters 0123456789+-/*(). and then eval it:
eval(input.replace(/[^0-9\(\)\+\-\*\/\.]/g, ""));
Please blast away with tricks to get around this solution
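For the first option (writing your own parser), a small recursive-descent evaluator is enough for +, -, *, / and parentheses. The sketch below is one possible shape, not a hardened implementation; the function name evaluate and the grammar in the comments are assumptions made for illustration.

```javascript
// Hypothetical helper: evaluates +, -, *, / and parentheses without eval.
function evaluate(expression) {
  const tokens = expression.match(/\d+(?:\.\d+)?|[-+*/()]/g) || [];
  // Reject input containing anything other than numbers, operators,
  // parentheses, and whitespace.
  if (tokens.join('') !== expression.replace(/\s+/g, '')) {
    throw new SyntaxError('Unexpected character in: ' + expression);
  }
  let position = 0;

  // expression := term (('+' | '-') term)*
  function parseExpression() {
    let value = parseTerm();
    while (tokens[position] === '+' || tokens[position] === '-') {
      const operator = tokens[position++];
      const right = parseTerm();
      value = operator === '+' ? value + right : value - right;
    }
    return value;
  }

  // term := factor (('*' | '/') factor)*
  function parseTerm() {
    let value = parseFactor();
    while (tokens[position] === '*' || tokens[position] === '/') {
      const operator = tokens[position++];
      const right = parseFactor();
      value = operator === '*' ? value * right : value / right;
    }
    return value;
  }

  // factor := number | '(' expression ')' | '-' factor
  function parseFactor() {
    const token = tokens[position++];
    if (token === '(') {
      const value = parseExpression();
      if (tokens[position++] !== ')') throw new SyntaxError('Missing )');
      return value;
    }
    if (token === '-') return -parseFactor();
    if (token === undefined || !/^\d/.test(token)) {
      throw new SyntaxError('Unexpected token: ' + token);
    }
    return parseFloat(token);
  }

  const result = parseExpression();
  if (position < tokens.length) {
    throw new SyntaxError('Unexpected token: ' + tokens[position]);
  }
  return result;
}

console.log(evaluate('120/4+130/5'));     // 56
console.log(evaluate('(120/4)+(130/5)')); // 56
```

Because the input is never handed to eval, identifiers such as alert are rejected as unexpected characters instead of being executed.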

You can use the expression parser included with the math.js library:
http://mathjs.org
Example usage:
math.eval('1.2 / (2.3 + 0.7)'); // 0.4
math.eval('5.08 cm in inch'); // 2 inch
math.eval('sin(45 deg) ^ 2'); // 0.5
math.eval('9 / 3 + 2i'); // 3 + 2i
math.eval('det([-1, 2; 3, 1])'); // -7

It is pretty hard to do much damage with eval if you don't allow identifiers.
function reval(str){
var answer='';
if(/^[\d()\/*.+-]+$/.test(str)){
try{
answer= eval(str);
}
catch(er){
answer= er.name+', '+er.message;
}
}
return answer;
}

What about eval?
Consider calc to be the id of the text field. Then:
$('#calc').change(function(e){
alert(eval($(this).val()));
})
but remember to validate input before processing.

This is a pretty old topic, but for the new visitors who have similar problem: the string-math package calculates the [String] equations like in the sample above 120/4+130/5. It also recognizes parentheses.

Javascript percentage validation

I am after a regular expression that validates a percentage from 0 to 100 and allows two decimal places.
Does anyone know how to do this, or know of a good web site that has examples of common regular expressions used for client-side validation in JavaScript?
@Tom - Thanks for the questions. Ideally there would be no leading 0's or other trailing characters.
Thanks to all those who have replied so far. I have found the comments really interesting.

Rather than using regular expressions for this, I would simply convert the user's entered number to a floating point value, and then check for the range you want (0 to 100). Trying to do numeric range validation with regular expressions is almost always the wrong tool for the job.
var x = parseFloat(str);
if (isNaN(x) || x < 0 || x > 100) {
// value is out of range
}

I propose this one:
(^100(\.0{1,2})?$)|(^([1-9]([0-9])?|0)(\.[0-9]{1,2})?$)
It matches 100, 100.0 and 100.00 using this part
^100(\.0{1,2})?$
and numbers like 0, 15, 99, 3.1, 21.67 using
^([1-9]([0-9])?|0)(\.[0-9]{1,2})?$
Note that leading zeros are prohibited, but trailing zeros are allowed (though no more than two decimal places).
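A quick way to check the behaviour described above is to run the pattern over a handful of inputs. The sample values below are chosen only for illustration:

```javascript
const percentage = /(^100(\.0{1,2})?$)|(^([1-9]([0-9])?|0)(\.[0-9]{1,2})?$)/;

const samples = ['100', '100.00', '0', '15', '3.1', '21.67', // expected to match
                 '100.01', '015', '21.678', '101'];          // expected to fail

const results = samples.map(sample => [sample, percentage.test(sample)]);
for (const [sample, matched] of results) {
  console.log(sample + ' -> ' + matched);
}
```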

This reminds me of an old blog Entry By Alex Papadimoulis (of The Daily WTF fame) where he tells the following story:
"A client has asked me to build and install a custom shelving system. I'm at the point where I need to nail it, but I'm not sure what to use to pound the nails in. Should I use an old shoe or a glass bottle?"
How would you answer the question?
It depends. If you are looking to pound a small (20lb) nail in something like drywall, you'll find it much easier to use the bottle, especially if the shoe is dirty. However, if you are trying to drive a heavy nail into some wood, go with the shoe: the bottle will shatter in your hand.
There is something fundamentally wrong with the way you are building; you need to use real tools. Yes, it may involve a trip to the toolbox (or even to the hardware store), but doing it the right way is going to save a lot of time, money, and aggravation through the lifecycle of your product. You need to stop building things for money until you understand the basics of construction.
This is a question where most people see it as a challenge to come up with the correct regular expression, but it would be much better to just say that a regular expression is the wrong tool for the job.
The problem with using a regex to validate numeric ranges is that it is hard to change when the requirements for the allowed range change. Today the requirement may be to validate numbers between 0 and 100, and it is possible to write a regex for that which doesn't make your eyes bleed. But next week the requirement may change so that values between 0 and 315 are allowed. Good luck altering your regex.
The solution given by Greg Hewgill is probably better - even though it would validate "99fxx" as "99". But given the circumstances that might actually be ok.
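One way to keep the range check numeric while still rejecting input such as "99fxx" is to let a simple pattern validate only the shape of the number and leave the limits to ordinary comparisons. A hedged sketch, where the function name isValidPercentage and its default limits are assumptions for this example:

```javascript
// Hypothetical helper: the regex checks the format (no leading zeros,
// at most two decimals); the range is checked numerically.
function isValidPercentage(text, min = 0, max = 100) {
  if (!/^(0|[1-9]\d*)(\.\d{1,2})?$/.test(text)) {
    return false;
  }
  const value = Number(text);
  return value >= min && value <= max;
}

console.log(isValidPercentage('99fxx'));       // false
console.log(isValidPercentage('21.67'));       // true
console.log(isValidPercentage('100.01'));      // false
console.log(isValidPercentage('315', 0, 315)); // true
```

If the allowed range changes to 0 through 315, only the arguments change; the pattern stays the same.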

Given that your value is in str
str.match(/^(100(\.0{1,2})?|([0-9]?[0-9](\.[0-9]{1,2})?))$/)

^100(\.(0){0,2})?$|^([1-9]?[0-9])(\.(\d{0,2}))?\%$
This would match:
100.00
optional "1-9" followed by a digit (this makes the int part), optionally followed by a dot and two digits
From what I see, Greg Hewgill's example doesn't really work that well because parseFloat('15x') would simply return 15 which would match the 0<x<100 condition. Using parseFloat is clearly wrong because it doesn't validate the percentage value, it tries to force a validation. Some people around here are complaining about leading zeroes and some are ignoring trailing invalid characters. Maybe the author of the question should edit it and make clear what he needs.

I recommend this, if you are not exclusively developing for English-speaking users:
[0-9]{1,2}((,|\.)[0-9]{1,10})?%?
You can simply replace the 10 by a 2 to get two decimal places.
My example will match:
15.5
5.4366%
1,43
50,55%
34
45%
Of course the output of this one is harder to cast, but something like this will do (Java code):
private static Double getMyVal(String myVal) {
if (myVal.contains("%")) {
myVal = myVal.replace("%", "");
}
if (myVal.contains(",")) {
myVal = myVal.replace(',', '.');
}
return Double.valueOf(myVal);
}

None of the above solutions worked for me, as I needed my regex to allow for values with numbers and a decimal while the user is typing ex: '18.'
This solution allows for an empty string so the user can delete their entire input, and accounts for the other rules articulated above.
/(^$)|(^100(\.0{1,2})?$)|(^([1-9]([0-9])?|0)\.(\.[0-9]{1,2})?$)|(^([1-9]([0-9])?|0)(\.[0-9]{1,2})?$)/

(100|[0-9]{1,2})(\.[0-9]{1,2})?
That should be the regex you want. I suggest you read Mastering Regular Expressions and download RegexBuddy or The Regex Coach.

@mlarsen:
It's not that a regex won't do the job here.
Remember that validation must be done on both the client and the server side, so something like:
100|(([1-9][0-9])|[0-9])(\.(([0-9][1-9])|[1-9]))?
would be a cross-language check, just beware of checking the input length with the output match length.

(100(\.(0){1,2})?|([1-9]{1}|[0-9]{2})(\.[0-9]{1,2})?)
