Find numbers at a specific position - javascript

I'm trying to find an expression for JavaScript which gives me the two characters at a specific position.
It's always the same call so its may be not too complicated.
I have always a 10 char long number and i want to replace the first two, the two at place 3 and 4 or the two at place 5 and 6 and so on.
So far I've done this:
number.replace(/\d{2}/, index));
this replace my first 2 digits with 2 others digits.
but now I want to include some variables at which position the digits should be replaced, something like:
number.replace(/\d{atposx,atpox+1}/, index));
that means:
01234567891
and I want sometimes to replace 01 with 02 and sometimes 23 with 56.
(or something like this with other numbers).
I hope I pointed out what I want.

This function works fine:
function replaceChars(input, startPos, replacement){
return input.substring(0,startPos) +
replacement +
input.substring(startPos+replacement.length)
}
Usage:
replaceChars("0123456789",2,"55") // output: 0155456789
Live example: http://jsfiddle.net/FnkpT/

Numbers are fairly easily interpreted as strings in JS. So, if you're working with an actual number (i.e. 9876543210) and not a number that's represented by a string (i.e. '987654321'), just turn the number into a string (''.concat(number); ) and don't limit yourself to the constraints of what you can do with just numbers.
Both of the above examples are fine (bah, they beat me to it), but you can even think about it like this:
var numberString = ''.concat(number);
var numberChunks = numberString.match(/(\d{2})/g);
You've now got an array of chunks that you can either walk through, switch through, or whatever other kind of flow you want to follow. When you're done, just say...
numberString = numberChunks.join('');
number = parseInt(numberString, 10);
You've got your number back as a native number (or skip the last part to just get the string back). And, aside from that, if you're doing multiple replacements.. the more replacements you do in the number, the more efficient breaking it up into chunks and dealing with the chunks are. I did a quick test, and running the 'replaceChars' function was faster on a single change, but will be slower than just splitting into an array if you're doing two or more changes to the data.
Hope that makes sense!

You can try this
function replaceAtIndex(str,value,index) {
return str.substr(0,index)+value+str.substr(index+value.length);
}
replaceAtIndex('0123456789','X',3); // returns "012X456789"
replaceAtIndex('0123456789','XY',3); // returns "012XY56789"

Related

How can I convert this UTF-8 string to plain text in javascript and how can a normal user write it in a textarea [duplicate]

While reviewing JavaScript concepts, I found String.normalize(). This is not something that shows up in W3School's "JavaScript String Reference", and, hence, it is the reason I might have missed it before.
I found more information about it in HackerRank which states:
Returns a string containing the Unicode Normalization Form of the
calling string's value.
With the example:
var s = "HackerRank";
console.log(s.normalize());
console.log(s.normalize("NFKC"));
having as output:
HackerRank
HackerRank
Also, in GeeksForGeeks:
The string.normalize() is an inbuilt function in javascript which is
used to return a Unicode normalisation form of a given input string.
with the example:
<script>
// Taking a string as input.
var a = "GeeksForGeeks";
// calling normalize function.
b = a.normalize('NFC')
c = a.normalize('NFD')
d = a.normalize('NFKC')
e = a.normalize('NFKD')
// Printing normalised form.
document.write(b +"<br>");
document.write(c +"<br>");
document.write(d +"<br>");
document.write(e);
</script>
having as output:
GeeksForGeeks
GeeksForGeeks
GeeksForGeeks
GeeksForGeeks
Maybe the examples given are just really bad as they don't allow me to see any change.
I wonder... what's the point of this method?
It depends on what will do with strings: often you do not need it (if you are just getting input from user, and putting it to user). But to check/search/use as key/etc. such strings, you may want a unique way to identify the same string (semantically speaking).
The main problem is that you may have two strings which are semantically the same, but with two different representations: e.g. one with a accented character [one code point], and one with a character combined with accent [one code point for character, one for combining accent]. User may not be in control on how the input text will be sent, so you may have two different user names, or two different password. But also if you mangle data, you may get different results, depending on initial string. Users do not like it.
An other problem is about unique order of combining characters. You may have an accent, and a lower tail (e.g. cedilla): you may express this with several combinations: "pure char, tail, accent", "pure char, accent, tail", "char+tail, accent", "char+accent, cedilla".
And you may have degenerate cases (especially if you type from a keyboard): you may get code points which should be removed (you may have a infinite long string which could be equivalent of few bytes.
In any case, for sorting strings, you (or your library) requires a normalized form: if you already provide the right, the lib will not need to transform it again.
So: you want that the same (semantically speaking) string has the same sequence of unicode code points.
Note: If you are doing directly on UTF-8, you should also care about special cases of UTF-8: same codepoint could be written in different ways [using more bytes]. Also this could be a security problem.
The K is often used for "searches" and similar tasks: CO2 and CO₂ will be interpreted in the same manner, but this could change the meaning of the text, so it should often used only internally, for temporary tasks, but keeping the original text.
As stated in MDN documentation, String.prototype.normalize() return the Unicode Normalized Form of the string. This because in Unicode, some characters can have different representation code.
This is the example (taken from MDN):
const name1 = '\u0041\u006d\u00e9\u006c\u0069\u0065';
const name2 = '\u0041\u006d\u0065\u0301\u006c\u0069\u0065';
console.log(`${name1}, ${name2}`);
// expected output: "Amélie, Amélie"
console.log(name1 === name2);
// expected output: false
console.log(name1.length === name2.length);
// expected output: false
const name1NFC = name1.normalize('NFC');
const name2NFC = name2.normalize('NFC');
console.log(`${name1NFC}, ${name2NFC}`);
// expected output: "Amélie, Amélie"
console.log(name1NFC === name2NFC);
// expected output: true
console.log(name1NFC.length === name2NFC.length);
// expected output: true
As you can see, the string Amélie as two different Unicode representations. With normalization, we can reduce the two forms to the same string.
Very beautifully explained here --> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize
Short answer : The point is, characters are represented through a coding scheme like ascii, utf-8 , etc.,(We use mostly UTF-8). And some characters have more than one representation. So 2 string may render similarly, but their unicode may vary! So string comparrision may fail here! So we use normaize to return a single type of representation
// source from MDN
let string1 = '\u00F1'; // ñ
let string2 = '\u006E\u0303'; // ñ
string1 = string1.normalize('NFC');
string2 = string2.normalize('NFC');
console.log(string1 === string2); // true
console.log(string1.length); // 1
console.log(string2.length); // 1
Normalization of strings isn't exclusive of JavaScript - see for instances in Python. The values valid for the arguments are defined by the Unicode (more on Unicode normalization).
When it comes to JavaScript, note that there's documentation with String.normalize() and String.prototype.normalize(). As #ChrisG mentions
String.prototype.normalize() is correct in a technical sense, because
normalize() is a dynamic method you call on instances, not the class
itself. The point of normalize() is to be able to compare Strings that
look the same but don't consist of the same characters, as shown in
the example code on MDN.
Then, when it comes to its usage, found a great example of the usage of String.normalize() that has
let s1 = 'sabiá';
let s2 = 'sabiá';
// one is in NFC, the other in NFD, so they're different
console.log(s1 == s2); // false
// with normalization, they become the same
console.log(s1.normalize('NFC') === s2.normalize('NFC')); // true
// transform string into array of codepoints
function codepoints(s) { return Array.from(s).map(c => c.codePointAt(0).toString(16)); }
// printing the codepoints you can see the difference
console.log(codepoints(s1)); // [ "73", "61", "62", "69", "e1" ]
console.log(codepoints(s2)); // [ "73", "61", "62", "69", "61", "301" ]
So while saibá e saibá in this example look the same to the human eye or even if we used console.log(), we can see that without normalization when comparing them we'd get different results. Then, by analyzing the codepoints, we see they're different.
There are some great answers here already, but I wanted to throw in a practical example.
I enjoy Bible translation as a hobby. I wasn't too thrilled at the flashcard option out there in the wild in my price range (free) so I made my own. The problem is, there is more than one way to do Hebrew and Greek in Unicode to get the exact same thing. For example:
בָּא
בָּא
These should look identical on your screen, and for all practical purposes they are identical. However, the first was typed with the qamats (the little t shaped thing under it) before the dagesh (the dot in the middle of the letter) and the second was typed with the dagesh before the qamats. Now, since you're just reading this, you don't care. And your web browser doesn't care. But when my flashcards compare the two, then they aren't the same. To the code behind the scenes, it's no different than saying "center" and "centre" are the same.
Similarly, in Greek:
ἀ
ἀ
These two should look nearly identical, but the top is one Unicode character and the second one is two Unicode characters. Which one is going to end up typed in my flashcards is going to depend on which keyboard I'm sitting at.
When I'm adding flashcards, believe it or not, I don't always type in vocab lists of 100 words. That's why God gave us spreadsheets. And sometimes the places I'm importing the lists from do it one way, and sometimes they do it the other way, and sometimes they mix it. But when I'm typing, I'm not trying to memorize the order that the dagesh or quamats appear or if the accents are typed as a separate character or not. Regardless if I remember to type the dagesh first or not, I want to get the right answer, because really it's the same answer in every practical sense either way.
So I normalize the order before saving the flashcards and I normalize the order before checking it, and the result is that it doesn't matter which way I type it, it comes out right!
If you want to check out the results:
https://sthelenskungfu.com/flashcards/
You need a Google or Facebook account to log in, so it can track progress and such. As far as I know (or care) only my daughter and I currently use it.
It's free, but eternally in beta.

Why doesn't my function correctly replace when using some regex pattern

This is an extension of this SO question
I made a function to see if i can correctly format any number. The answers below work on tools like https://regex101.com and https://regexr.com/, but not within my function(tried in node and browser):
const
const format = (num, regex) => String(num).replace(regex, '$1')
Basically given any whole number, it should not exceed 15 significant digits. Given any decimal, it should not exceed 2 decimal points.
so...
Now
format(0.12345678901234567890, /^\d{1,13}(\.\d{1,2}|\d{0,2})$/)
returns 0.123456789012345678 instead of 0.123456789012345
but
format(0.123456789012345,/^-?(\d*\.?\d{0,2}).*/)
returns number formatted to 2 deimal points as expected.
Let me try to explain what's going on.
For the given input 0.12345678901234567890 and the regex /^\d{1,13}(\.\d{1,2}|\d{0,2})$/, let's go step by step and see what's happening.
^\d{1,13} Does indeed match the start of the string 0
(\. Now you've opened a new group, and it does match .
\d{1,2} It does find the digits 1 and 2
|\d{0,2} So this part is skipped
) So this is the end of your capture group.
$ This indicates the end of the string, but it won't match, because you've still got 345678901234567890 remaining.
Javascript returns the whole string because the match failed in the end.
Let's try removing $ at the end, to become /^\d{1,13}(\.\d{1,2}|\d{0,2})/
You'd get back ".12345678901234567890". This generates a couple of questions.
Why did the preceding 0 get removed?
Because it was not part of your matching group, enclosed with ().
Why did we not get only two decimal places, i.e. .12?
Remember that you're doing a replace. Which means that by default, the original string will be kept in place, only the parts that match will get replaced. Since 345678901234567890 was not part of the match, it was left intact. The only part that matched was 0.12.
Answer to title question: your function doesn't replace, because there's nothing to replace - the regex doesn't match anything in the string. csb's answer explains that in all details.
But that's perhaps not the answer you really need.
Now, it seems like you have an XY problem. You ask why your call to .replace() doesn't work, but .replace() is definitely not a function you should use. Role of .replace() is replacing parts of string, while you actually want to create a different string. Moreover, in the comments you suggest that your formatting is not only for presenting data to user, but you also intend to use it in some further computation. You also mention cryptocurriencies.
Let's cope with these problems one-by-one.
What to do instead of replace?
Well, just produce the string you need instead of replacing something in the string you don't like. There are some edge cases. Instead of writing all-in-one regex, just handle them one-by-one.
The following code is definitely not best possible, but it's main aim is to be simple and show exactly what is going on.
function format(n) {
const max_significant_digits = 15;
const max_precision = 2;
let digits_before_decimal_point;
if (n < 0) {
// Don't count minus sign.
digits_before_decimal_point = n.toFixed(0).length - 1;
} else {
digits_before_decimal_point = n.toFixed(0).length;
}
if (digits_before_decimal_point > max_significant_digits) {
throw new Error('No good representation for this number');
}
const available_significant_digits_for_precision =
Math.max(0, max_significant_digits - digits_before_decimal_point);
const effective_max_precision =
Math.min(max_precision, available_significant_digits_for_precision);
const with_trailing_zeroes = n.toFixed(effective_max_precision);
// I want to keep the string and change just matching part,
// so here .replace() is a proper method to use.
const withouth_trailing_zeroes = with_trailing_zeroes.replace(/\.?0*$/, '');
return withouth_trailing_zeroes;
}
So, you got the number formatted the way you want. What now?
What can you use this string for?
Well, you can display it to the user. And that's mostly it. The value was rounded to (1) represent it in a different base and (2) fit in limited precision, so it's pretty much useless for any computation. And, BTW, why would you convert it to String in the first place, if what you want is a number?
Was the value you are trying to print ever useful in the first place?
Well, that's the most serious question here. Because, you know, floating point numbers are tricky. And they are absolutely abysmal for representing money. So, most likely the number you are trying to format is already a wrong number.
What to use instead?
Fixed-point arithmetic is the most obvious answer. Works most of the time. However, it's pretty tricky in JS, where number may slip into floating-point representation almost any time. So, it's better to use decimal arithmetic library. Optionally, switch to a language that has built-in bignums and decimals, like Python.

Dynamically splitting an array depending on character count

I have an array like so:
const testArray = [ 'blah', 'abc​testtt​​', 'atestc​', 'testttttt' ]
I would like to split the string once it reaches a certain character count, for example lets use 10 characters. Also, I would like the output to swap itself to be able to use within 10 characters. Please see the expected output below if this doesn't really make sense to you. Please assume that each item in the array will not be above 10 characters just for example purpose.
So once the testArray reaches 10 characters I would like the next item to be under a new variable maybe? Not sure if thats the best way of doing this.
Something like this maybe? Again this may be very inefficient, if so please feel free to use another method.
const testArray = [ 'blah', 'abctesttt​​', 'atestc', 'testttttt' ]
if ((testArray.join('\n')).length) >= 10 {
/* split the string into parts and store it under a variable maybe?
console.log((the_splitted_testArray).join('\n')); */
}
Expected output:
"blah
atestc" //instead of using "abctesttt" it would use "atestc" as it's the next element in the array and it also avoids reaching the 10 character limit, if adding "atestc" caused the character limit to go over 10, I would like it to check the next element and so on
"abctesttt" // it can't add the remaining "testttttt" since that would cause the character limit to be reached
"testttttt"
First of all, as you can't create a new variable out of nowhere at run time, you are probably going to use a "parent"-array, which then contains the actual strings with a length of 10 maximally.
For the grouping you probably have to design an algorithm yourself. My first idea for an algorithm is something like below. Probably not the best and most efficient way (as the description of "efficient" depends on your personal priorities), but feel free to optimise it yourself :)
Walk through $testArray[], sort all strings into a new two-dimensional array: $stringLength[$messagesWithSameLength[]]. Like array(1=>array('.','a'),2=>array('hi','##',...),...)
Now, always try to get as many strings together as possible. Start with one of the longest strings, calculate the remaining space and get a string suiting best into it. If none fits, start a new group.
Always try to use the space as good as possible

Grab multiple numbers 1-10 from string

I am parsing a string of multiple numbers between 1 and 10 with the eventual goal of adding them to a set.
There will be multiple concatenated numbers after a text identifier such as {text}12345678910.
I am currently using match(/\d/g) to grab the numbers but it separates 1 and 0 in 10. I then look for 0 in my String Array, see if there's a 1 in the element before it, turn it into a 10 and delete the other entry. Not very elegant.
How can I clean up my matching code? I definitely don't need to use regex for this, but it makes grabbing the numbers fairly easy.
You could just match with this regex:
/10|\d/g
(instead of the one you use currently, not additionally)
Regex is executed left-to-right, so first it finds any occurrences of 10, and then of other digits (so using, for example /\d|10/g or even /\d|(10)/g won't work either).

How to "unformat" a numerical string? JavaScript

So I know how to format a string or integer like 2000 to 2K, but how do I reverse it?
I want to do something like:
var string = "$2K".replace("/* K with 000 and remove $ symbol in front of 2 */");
How do I start? I am not very good regular expressions, but I have been taking some more time out to learn them. If you can help, I certainly appreciate it. Is it possible to do the same thing for M for millions (adding 000000 at the end) or B for billions (adding 000000000 at the end)?
var string = "$2K".replace(/\$(\d+)K/, "$1000");
will give output as
2000
I'm going to take a different approach to this, as the best way to do this is to change your app to not lose the original numeric information. I recognize that this isn't always possible (for example, if you're scraping formatted values...), but it could be useful way to think about it for other users with similar question.
Instead of just storing the numeric values or the display values (and then trying to convert back to the numeric values later on), try to update your app to store both in the same object:
var value = {numeric: 2000, display: '2K'}
console.log(value.numeric); // 2000
console.log(value.display); // 2K
The example here is a bit simplified, but if you pass around your values like this, you don't need to convert back in the first place. It also allows you to have your formatted values change based on locale, currency, or rounding, and you don't lose the precision of your original values.

Categories

Resources