Why is "_" getting removed from a number in JavaScript?

I tried entering the below code in the Chrome console:
var a = 16_11;
The value is not wrapped in " or ', yet a evaluates to 1611 instead of 16_11. Why is the _ getting removed?

What you're seeing is a numeric separator, a feature that began as a proposal and has shipped since V8 v7.5 / Chrome 75.
This feature enables developers to make their numeric literals more readable by creating a visual separation between groups of digits. Large numeric literals are difficult for the human eye to parse quickly, especially when there are long digit repetitions. This impairs both the ability to get the correct value / order of magnitude...
1000000000 // Is this a billion? a hundred millions? Ten millions?
101475938.38 // what scale is this? what power of 10?
...but also fails to convey some use-case information, such as fixed-point arithmetic using integers. For instance, financial computations often work in 4- to 6-digit fixed-point arithmetics, but even storing amounts as cents is not immediately obvious without separators in literals:
const FEE = 12300;
// is this 12,300? Or 123, because it's in cents?
const AMOUNT = 1234500;
// is this 1,234,500? Or cents, hence 12,345? Or financial, 4-fixed 123.45?
Using underscores (_, U+005F) as separators helps improve readability for numeric literals, both integers and floating-point (and in JS, it's all floating-point anyway):
1_000_000_000 // Ah, so a billion
101_475_938.38 // And this is hundreds of millions
let fee = 123_00; // $123 (12300 cents, apparently)
let fee = 12_300; // $12,300 (woah, that fee!)
let amount = 12345_00; // 12,345 (1234500 cents, apparently)
let amount = 123_4500; // 123.45 (4-fixed financial)
let amount = 1_234_500; // 1,234,500
Also, this works on the fractional and exponent parts, too:
0.000_001 // 1 millionth
1e10_000 // 10^10000 -- granted, far less useful / in-range...
Some more sources:
ES proposal: numeric separators
Numeric separators
var a = 1_000;
console.log(a);

Because Chrome implements the experimental numeric separator proposal, which permits optional underscores between any digits in a number literal. Without that, it would be just a syntax error.
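A minimal sketch of how the separator behaves, assuming an engine with numeric separator support (Chrome 75 or later, per the answer above). The variable names are illustrative. It shows that the underscore exists only in the source text of a literal, and that string-to-number conversions do not accept it:

```javascript
// The underscore is part of how the literal is spelled, not part of its value.
const a = 16_11;
console.log(a);           // 1611
console.log(a === 1611);  // true

// String conversions do not understand the separator.
const fromNumber = Number('16_11');         // NaN: the whole string must be a valid number
const fromParseInt = parseInt('16_11', 10); // 16: parsing stops at the first invalid character
console.log(fromNumber, fromParseInt);

// To keep the underscore, store the value as a string.
const label = '16_11';
console.log(label);       // 16_11
```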

What are you going to do with that code snippet? 16_11 is not a meaningful number, I think.
So if you want the string 16_11, then
var a = "16_11";
will work.

It was introduced to improve readability, because some numbers are large and difficult to read in source code. The underscore acts purely as a separator, so you can easily see how many digits there are.
For example, in the snippet below you can tell at a glance that the value is a trillion:
var a = 1_000_000_000_000;
console.log(a);

Related

How to generate a GUID with a custom alphabet, that behaves similar to an MD5 hash (in JavaScript)?

I am wondering how to generate a GUID given an input string, such that the same input string results in the same GUID (sort of like an MD5 hash). The problem with MD5 hashes is they just guarantee low collision rate, rather than uniqueness. Instead I would like something like this:
guid('v1.0.0') == 1231231231123123123112312312311231231231
guid('v1.0.1') == 6154716581615471658161547165816154716581
guid('v1.0.2') == 1883939319188393931918839393191883939319
How would you go about implementing this sort of thing (ideally in JavaScript)? Is it even possible to do? I am not sure where to start. Things like the uuid module don't take a seed string, and they don't let you use a custom format/alphabet.
I am not looking for the canonical UUID format, but rather a GUID, ideally one made up of just integers.
What you would need is define a one-to-one mapping of text strings (such as "v1.0.0") onto 40 digit long strings (such as "123123..."). This is also known as a bijection, although in your case an injection (a simple one-to-one mapping from inputs to outputs, not necessarily onto) may be enough. As you note, hash functions don't necessarily ensure this mapping, but there are other possibilities, such as full-period linear congruential generators (if they take a seed that you can map one-to-one onto input string values), or other reversible functions.
However, if the set of possible input strings is larger than the set of possible output strings, then you can't map all input strings one-to-one with all output strings (without creating duplicates), due to the pigeonhole principle.
For example, you can't generally map all 120-character strings one-to-one with all 40-digit strings unless you restrict the format of the 120-character strings in some way. However, your problem of creating 40-digit output strings can be solved if you can accept limiting input strings to no more than 10^40 values (about 132 bits), or if you can otherwise exploit redundancy in the input strings so that they are guaranteed to compress losslessly to 40 decimal digits (about 132 bits) or less, which may or may not be possible. See also this question.
The algorithm involves two steps:
First, transform the string to a BigInt by building up the string's charCodeAt() values similarly to the stringToInt method given in another answer. Throw an error if any charCodeAt() is 0x80 or greater, or if the resulting BigInt is equal to or greater than BigInt(alphabet_length)**BigInt(output_length).
Then, transform the integer to another string by taking the mod of the BigInt and the output alphabet's size and replacing each remainder with the corresponding character in the output alphabet, until the BigInt reaches 0.
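The two steps above can be sketched as follows. This is only an illustration under stated assumptions: the helper names (stringToBigInt, bigIntToId, guid), the base-128 packing, the digit alphabet and the 40-character default are all choices made for this sketch, and BigInt support is assumed. NUL characters are rejected in addition to non-ASCII ones, so that a leading NUL cannot make two different strings pack to the same integer.

```javascript
// Hypothetical helper names; the alphabet and length defaults are illustrative.

// Step 1: read the string as a base-128 number, one ASCII code per "digit".
function stringToBigInt(str) {
  let n = 0n;
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    // Assumption: NUL is rejected too, so "\0a" and "a" cannot collide.
    if (code === 0 || code >= 0x80) {
      throw new RangeError('Only non-NUL ASCII characters are supported');
    }
    n = n * 128n + BigInt(code);
  }
  return n;
}

// Step 2: write the integer in the output alphabet, padded to a fixed length.
function bigIntToId(n, alphabet = '0123456789', length = 40) {
  const base = BigInt(alphabet.length);
  if (n >= base ** BigInt(length)) {
    throw new RangeError('Input is too long for the requested output length');
  }
  let out = '';
  while (n > 0n) {
    out = alphabet[Number(n % base)] + out; // most significant digit ends up first
    n /= base;                              // BigInt division truncates
  }
  return out.padStart(length, alphabet[0]);
}

const guid = (str) => bigIntToId(stringToBigInt(str));

console.log(guid('v1.0.0'));
console.log(guid('v1.0.1'));
```

With these defaults, inputs of up to 18 ASCII characters always fit (126 bits), while longer inputs may throw. That is the pigeonhole limit described above showing up in practice.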
One approach would be to use the method from that answer:
/*
* uuid-timestamp (emitter)
* UUID v4 based on timestamp
*
* Created by tarkh
* tarkh.com (C) 2020
* https://stackoverflow.com/a/63344366/1261825
*/
const uuidEmit = () => {
  // Get now time
  const n = Date.now();
  // Generate random
  const r = Math.random(); // <- swap this
  // Stringify now time and generate additional random number
  const s = String(n) + String(~~(r*9e4)+1e4);
  // Form UUID and return it
  return `${s.slice(0,8)}-${s.slice(8,12)}-4${s.slice(12,15)}-${[8,9,'a','b'][~~(r*3)]}${s.slice(15,18)}-${s.slice(s.length-12)}`;
};
// Generate 5 UUIDs
console.log(`${uuidEmit()}
${uuidEmit()}
${uuidEmit()}
${uuidEmit()}
${uuidEmit()}`);
And simply swap out the Math.random() call to a different random function which can take your seed value. (There are numerous algorithms out there for creating a seedable random method, so I won't try prescribing a particular one).
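Most seedable random functions expect a numeric seed, so you could convert a seed string to an integer by adding up the character values, each multiplied by a power of 10 based on its position. Because character codes are larger than 9, this weighting makes collisions unlikely for short strings but does not strictly guarantee a unique number (for example, "ab" and "bX" both give 10680):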
const stringToInt = str =>
  Array.prototype.slice.call(str).reduce((result, char, index) => result += char.charCodeAt(0) * (10**(str.length - index)), 0);
console.log(stringToInt("v1.0.0"));
console.log(stringToInt("v1.0.1"));
console.log(stringToInt("v1.0.2"));
If you want to generate the same exact string every time, you can take a similar approach to tarkh's uuidEmit() method but get rid of the bits that change:
const strToInt = str =>
  Array.prototype.slice.call(str).reduce((result, char, index) => result += char.charCodeAt(0) * (10**(str.length - index)), 0);
const strToId = (str, len = 40) => {
  // Derive a deterministic number from the input string (no randomness involved)
  const r = strToInt(str);
  // Multiply the number by some things to get it to the right number of digits
  const rLen = `${r}`.length; // length of r as a string
  // If you want to avoid any chance of collision, you can't provide too long of a string
  // If a small chance of collision is okay, you can instead just truncate the string to
  // your desired length
  if (rLen > len) throw new Error('String too long');
  // our string length is n * (rLen + m) + e = len, so we'll do some math to get n and m
  const mMax = 9; // maximum for the exponent, too much longer and it might be represented as an exponent. If you discover "e" showing up in your string, lower this value
  let m = Math.floor(Math.min(mMax, len / rLen)); // exponent
  let n = Math.floor(len / (m + rLen)); // number of times we repeat r and m
  let e = len - (n * (rLen + m)); // extra to pad us to the right length
  return (new Array(n)).fill(0).map((_, i) => String(r * (i * 10**m))).join('')
    + String(10**e);
};
console.log(strToId("v1.0.0"));
console.log(strToId("v1.0.1"));
console.log(strToId("v1.0.2"));
console.log(strToId("v1.0.0") === strToId("v1.0.0")); // check they are the same
console.log(strToId("v1.0.0") === strToId("v1.0.1")); // check they are different
Note that this only works with short strings (probably about 10 characters at most), and for those it is intended to avoid collisions. You could tweak it to handle longer strings (remove the multiplication from stringToInt), but then the risk of collisions goes up.
I suggest using MD5...
Following the classic birthday problem, all things being equal, the odds of 2 people sharing a birthday out of a group of 23 people is ( see https://en.wikipedia.org/wiki/Birthday_problem )...
For estimating MD5 collisions, I'm going to simplify the birthday problem formula, erring in the favor of predicting a higher chance of a collision...
Note though that whereas in the birthday problem a collision is a positive result, in the MD5 problem a collision is a negative result, and therefore providing higher than expected collision odds gives a conservative estimate of the chance of an MD5 collision. Plus, this higher predicted chance can in some way be considered a fudge factor for any uneven distribution in the MD5 output, although I do not believe there is any way to quantify this without a God computer...
An MD5 hash is 16 bytes long, resulting in a range of 256^16 possible values. Assuming that the MD5 algorithm is generally uniform in its results, let's suppose we create one quadrillion (i.e., a million billion or 10^15) unique strings to run through the hash algorithm. Then using the modified formula (to ease the collision calculations and to add a conservative fudge factor), the odds of a collision are...
So, after 10^15 or one quadrillion unique input strings, the estimated odds of a hash collision are on par with the odds of winning the Powerball or the Mega Millions Jackpot (which are on order of 1 in ~300,000,000 per https://www.engineeringbigdata.com/odds-winning-powerball-grand-prize-r/ ).
Note too that 256^16 is 340282366920938463463374607431768211456, which is 39 digits, falling within the desired range of 40 digits.
So, suggest using the MD5 hash ( converting to BigInt ), and if you do run into a collision, I will be more than glad to spot you a lottery ticket, just to have a chance to tap into your luck and split the proceeds...
( Note: I used https://keisan.casio.com/calculator for the calculations. )
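A minimal sketch of the "MD5 hash converted to BigInt" suggestion, assuming a Node.js environment (the built-in crypto module) and BigInt support. The helper name md5Digits and the zero-padding to 40 digits are assumptions made for this illustration, not part of the answer above:

```javascript
const { createHash } = require('crypto');

// Hypothetical helper: the MD5 digest of a string, written as a
// zero-padded decimal string (40 digits by default).
function md5Digits(str, length = 40) {
  const hex = createHash('md5').update(str, 'utf8').digest('hex');
  // A 128-bit digest has at most 39 decimal digits, so padding to 40 never truncates.
  return BigInt('0x' + hex).toString().padStart(length, '0');
}

console.log(md5Digits('v1.0.0'));
console.log(md5Digits('v1.0.1'));
console.log(md5Digits('v1.0.0') === md5Digits('v1.0.0')); // true
```

In a browser there is no built-in MD5, so a third-party implementation would be needed to produce the hex digest.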
While UUID v4 is just used for random ID generation, UUID v5 is more like a hash for a given input string and namespace. It's perfect for what you describe.
As you already mentioned, you can use this npm package:
npm install uuid
And it's pretty easy to use.
import {v5 as uuidv5} from 'uuid';
// use a UUIDV4 as a unique namespace for your application.
// you can generate one here: https://www.uuidgenerator.net/version4
const UUIDV5_NAMESPACE = '...';
// Finally, provide the input and namespace to get your unique id.
const uniqueId = uuidv5(input, UUIDV5_NAMESPACE);

Pitfalls with using scientific notation in JavaScript

This question is not seeking developer code formatting opinions. Personally, I prefer to use scientific notation in my JS code when I can because I believe it is more readable. For me, 6e8 is more readable than 600000000. That being said, I am solely looking for potential risks and disadvantages of specifying numbers in scientific notation in JS. I don't see it often in the wild and was wondering if there is technical reasoning for that or if it is simply because of developers' druthers.
You don't see scientific notation "often in the wild" because the only numbers that actually get typed in JS tend to be constants:
Code-centric constants (such as enums and levels) tend to be small.
Physical/mathematical constants (such as π or e) tend to be highly specific.
Neither of these benefit from scientific notation too much.
I have seen Planck's constant 'in the wild' as:
const h = 6.62607004e-34;
console.log('Planck', h);
The other place it often makes sense is time limits, for instance the number of ms in a day as 864e5. For instance:
function addDaysToDate(date, days) {
  if (days === 0)
    return date;
  date.setTime(864e5 * days + date.valueOf());
  return date;
}
const now = new Date();
const thisTimeTomorrow = addDaysToDate(now, 1);
console.log('This time tomorrow', thisTimeTomorrow);
I don't think there's any technical reason not to use this notation, it's more that developers avoid hard coding numbers at all.
I don't think there are any risks. You may have to be careful with numbers in strings, but if you're doing that then this syntax is a far smaller issue than, say, number localisation (for instance a DE user entering "20.000,00", expecting 2e4, but getting 2e6 thanks to invariant number formatting swapping the thousand and decimal separators).
I'd add that JS will output that syntax by default anyway for very small numbers, but avoids it for large numbers up to a point (the language specification switches to exponent form at 1e21):
console.log('Very small', 1234 / 100000000000)
console.log('Large, but still full in some browsers', 1e17 * 1234)
console.log('Large, scientific', 1e35 * 1234)
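The switch between plain and exponent output can be checked directly. This is a small sketch with illustrative variable names. The thresholds shown are the ones the default number-to-string conversion uses:

```javascript
// Default formatting stays in plain digits below 1e21 and down to 1e-6.
const big = String(1e20);          // "100000000000000000000"
const bigger = String(1e21);       // "1e+21"
const small = String(0.000001);    // "0.000001"
const smaller = String(0.0000001); // "1e-7"

// How the literal is spelled in source does not affect the value.
const sameValue = 6e8 === 600000000; // true

// Exponent form can also be requested explicitly.
const explicit = (600000000).toExponential(); // "6e+8"

console.log(big, bigger, small, smaller, sameValue, explicit);
```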
From O. R. Mapper in this question:
Human users are not the only ones who want to read numbers. It seems
D3 will throw an exception when encountering a translate
transformation that contains coordinates in scientific notation
In addition, if you want to change the string representation, as opposed to just how the literal looks in your source, you'll have to be careful with serialized or stored data.
Also, from experience, you will often have large numbers whose significance lies in their individual digits, such as an ID or a phone number. In these cases, reducing the numbers to scientific notation hurts readability.
E-notation indicates a number that should be multiplied by 10 raised
to a given power.
This is E-notation, not scientific exponential notation. One pitfall is that the e in a JavaScript literal means "times ten raised to the power of"; it is not the number e, the base of the natural logarithm, which JavaScript exposes as Math.E. For readers familiar with the mathematical constant e, the JavaScript e has an entirely different meaning. 6 * Math.pow(10, 8) returns the expected result without using the JavaScript e notation.
Although the E stands for exponent, the notation is usually referred
to as (scientific) E-notation rather than (scientific) exponential
notation. The use of E-notation facilitates data entry and readability
in textual communication since it minimizes keystrokes, avoids reduced
font sizes and provides a simpler and more concise display, but it is
not encouraged in publications. Submission Guidelines for Authors: HPS 2010 Midyear Proceedings

Intl formatting of huge floating point numbers

I'm trying to better understand why large numbers with potentially high precision are handled inconsistently, specifically in JavaScript and its localization facilities (e.g. ECMA-402/Intl). I'm assuming this has to do with the use of floating point numbers, but I'd like to understand where the limits are and/or how to avoid these pitfalls.
For example, using Intl.NumberFormat:
console.log(new Intl.NumberFormat('en-US', { minimumFractionDigits: 3, maximumFractionDigits: 3 }).format(9999999999990.001)); // logs 9,999,999,999,990.000
let test1 = 9999999999990.001
console.log(test1); // logs 9999999999990.002
How would I be able to figure out where these numbers start to get inconsistent? Is there some kind of limit? Does that limit change as I increase decimal precision, e.g. :
let test2 = 9999999999990.0004;
console.log(test2) // logs 9999999999990
Is there some kind of limit? Does that limit change as I increase decimal precision?
Yes, and yes. Floating-point numbers in JavaScript are themselves stored in 64 bits of space, which means they are limited in the precision they can represent. See this answer for more information.
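To make the limit concrete, the spacing between neighbouring 64-bit floating-point values can be estimated for a given magnitude. This is a rough sketch: the helper name ulp is an assumption made for illustration, and it is only meant for positive, finite, normal numbers.

```javascript
// Hypothetical helper: gap between adjacent doubles around x.
// Assumption: Math.log2 is accurate enough here; values sitting exactly on a
// power of two may need extra care.
function ulp(x) {
  const exponent = Math.floor(Math.log2(Math.abs(x)));
  return Math.pow(2, exponent - 52); // 52 explicit fraction bits in a double
}

const gap = ulp(9999999999990.001);
console.log(gap); // 0.001953125

// Decimal fractions finer than the gap collapse onto the nearest double.
console.log(9999999999990.001 === 9999999999990.002); // true
console.log(9999999999990.0004 === 9999999999990);    // true

// Above this value, not even every integer is representable.
console.log(Number.MAX_SAFE_INTEGER); // 9007199254740991
```

At this magnitude the gap is about 0.002, which is why the third decimal place in the question's examples cannot be preserved. The larger the integer part, the fewer fractional digits survive.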
How would I be able to figure out where these numbers start to get inconsistent?
Pass your "numeric literals" to a function in the form of strings, and check to see if that string, when coerced to a number and back, returns the correct literal:
function safeNumber (s) {
  if (String(+s) !== s)
    throw new Error('Unsafe number!')
  return +s
}
let safe = safeNumber('999999999999999')
console.log(safe)
let unsafe = safeNumber('9999999999990.001')
console.log(unsafe)

how to get the number of digits in a number with leading zeros

For normal numbers, like var a = 123, it is easy to count the number of digits (with a.toString().length), but what if var a = 00123? (assume it is still in decimal).
There are a couple of problems you might run into here, with a fairly easy solution. First, a number literal written with leading zeros is interpreted differently than you might expect: in sloppy mode, a literal such as 00123 whose digits are all below 8 is read as legacy octal (giving 83), and in strict mode it is a syntax error. If you just want the length of the value as written, you need to store it as a string.
var a = '00123';
console.log(a.length);
Just keep in mind that if you don't store it as a string, the leading zeros are lost and the literal may not be read as decimal.
This is a common JavaScript gotcha with a simple solution:
Just specify the base, or 'radix', like so:
parseInt('000123',10); // 123
You could also use Number:
Number('000123'); // 123
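Putting the answers above together, here is a small sketch that keeps the value as a string while counting and converts only when a number is needed. The variable names are illustrative:

```javascript
const raw = '00123';

// Count digits on the string, where leading zeros still exist.
const digitCount = raw.length; // 5

// Convert explicitly as decimal when the numeric value is needed.
const value = parseInt(raw, 10); // 123

// Number of leading zeros (at least one digit is always kept).
const leadingZeros = raw.length - raw.replace(/^0+(?=\d)/, '').length; // 2

// Rebuild the padded form from the number if required.
const padded = String(value).padStart(digitCount, '0'); // "00123"

console.log(digitCount, value, leadingZeros, padded);
```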

Working with string (array?) of bits of an unspecified length

I'm a javascript code monkey, so this is virgin territory for me.
I have two "strings" that are just zeros and ones:
var first = "00110101011101010010101110100101010101010101010";
var second = "11001010100010101101010001011010101010101010101";
I want to perform a bitwise & (which I've never before worked with) to determine if there's any index where 1 appears in both strings.
These could potentially be VERY long strings (in the thousands of characters). I thought about adding them together as numbers, then converting to strings and checking for a 2, but javascript can't hold precision in large intervals and I get back numbers as strings like "1.1111111118215729e+95", which doesn't really do me much good.
Can I take two strings of unspecified length (they may not be the same length either) and somehow use a bitwise & to compare them?
I've already built the loop-through-each-character solution, but 1001^0110 would strike me as a major performance upgrade. Please do not give the javascript looping solution as an answer, this question is about using bitwise operators.
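As you already noticed yourself, JavaScript has limited capabilities when it comes to large integer values. You'll have to chop your strings into manageable chunks and work your way through them. Since the parseInt() function accepts a base, you could convert each chunk to an integer and use the & operator to test for set bits. Note that JavaScript's bitwise operators coerce their operands to 32-bit integers, so each chunk should be at most 32 characters long, and that != binds more tightly than &, so the test has to be written as if ((a & b) != 0).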
var first = "00110101011101010010101110100101010101010101010010001001010001010100011111",
second = "10110101011101010010101110100101010101010101010010001001010001010100011100",
firstInt = parseInt(first, 2),
secondInt = parseInt(second, 2),
xorResult = firstInt ^ secondInt, //524288
xorString = xorResult.toString(2); //"10000000000000000000"
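Where BigInt is available, the chunking can be avoided altogether, because BigInt bitwise operators are not limited to 32 bits. The following is a sketch under stated assumptions: the helper name hasCommonSetBit is hypothetical, and indexes are assumed to count from the left, so the shorter string is padded with zeros on the right.

```javascript
// Hypothetical helper: true when some index holds '1' in both bit strings.
// Assumption: indexes are counted from the left, so the shorter string is
// right-padded with zeros to line the positions up.
function hasCommonSetBit(a, b) {
  const width = Math.max(a.length, b.length);
  // The extra leading '0' keeps BigInt() valid even for an empty string.
  const x = BigInt('0b0' + a.padEnd(width, '0'));
  const y = BigInt('0b0' + b.padEnd(width, '0'));
  return (x & y) !== 0n; // parentheses matter: !== binds tighter than &
}

console.log(hasCommonSetBit('1001', '0110')); // false
console.log(hasCommonSetBit('1001', '0011')); // true
```

If the strings should instead be aligned at their last character, replace padEnd with padStart, or drop the padding entirely since BigInt values align on the least significant bit.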
