I have a code...
var userArray=userIn.match(/(?:[A-Z][a-z]*|\d+|[()])/g);
...that separates the user input of a chemical formula into its components.
For example, entering Cu(NO3)2N3 will yield
Cu , ( , N , O , 3 , ) , 2 , N , 3.
In finding the percentage of each element in the entire weight, I need to count how many times each element is entered.
So in the example above,
Cu : 1 ,
N : 5 ,
O : 6
Any suggestions of how I should go about doing this?
You need to build a parser
There is no simple way around that. You need nesting and memory, and a regular expression can't handle that very well (a regular expression in the strict CS sense can't handle it at all).
First, you take the result of the regexp you already have. This is called tokenization.
Now, you have to actually parse that.
I suggest the following approach. I will give you pseudocode because I think it will be more instructive. If you have any questions about it, let me know:
method chemistryExpression(tokens): # tokens is the result of your regex
1. Create an empty map called map
2. While the next token is a letter, consume it (remove it from the tokens)
2.1 Add the letter to the map with occurrence 1 or increment it by one if it's already inside the map
3. If the next token is (, consume it: # Deal with nesting
3.1 Add the occurrences from chemistryExpression(tokens) to the map (note, tokens changed)
3.2 Remove the extra ) you've just encountered
4. num = consume tokens while the next token is a number and convert to int
5. Multiply the occurrences of all elements in the map by num
6. Return the map
Implementation suggestion
The map can just be an object.
Adding to the map means checking whether the key is there: if it is not, set it to 1; if it is, increment its value by one.
Multiplying can be done using a for... in loop.
This solution is recursive, which means you're using a function that calls itself (chemistryExpression in this case). This parser is a very basic example of a recursive descent parser, and it handles nesting well.
Common sense and good practice necessitate two methods
peek - what is the next token in the tokens, this is tokens[0]
next - grab the next token from tokens, this is tokens.shift()
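As a rough illustration, the steps above could be turned into JavaScript along these lines. This is only a sketch and a slight variation of the pseudocode: it assumes a number applies to the element or group immediately before it, and that the input is a well-formed formula. The names tokenize, parseFormula and readNumber are made up for this example.

```javascript
// Tokenize with the regex from the question.
function tokenize(formula) {
  return formula.match(/(?:[A-Z][a-z]*|\d+|[()])/g) || [];
}

// Parse tokens into an object mapping element -> count.
// Note: this consumes (mutates) the tokens array.
function parseFormula(tokens) {
  const counts = {};
  const peek = () => tokens[0];
  const next = () => tokens.shift();
  // Read an optional multiplier; default to 1 when the next token is not a number.
  const readNumber = () => (/^\d+$/.test(peek()) ? parseInt(next(), 10) : 1);

  while (tokens.length > 0 && peek() !== ')') {
    const token = next();
    if (token === '(') {
      const inner = parseFormula(tokens); // recurse into the nested group
      next(); // drop the closing ')'
      const times = readNumber();
      for (const element in inner) {
        counts[element] = (counts[element] || 0) + inner[element] * times;
      }
    } else {
      // Assumed to be an element symbol such as "Cu" or "N".
      counts[token] = (counts[token] || 0) + readNumber();
    }
  }
  return counts;
}

const result = parseFormula(tokenize('Cu(NO3)2N3'));
console.log(result); // { Cu: 1, N: 5, O: 6 }
```

Because the recursion returns as soon as it sees a closing parenthesis, nested groups such as Ca(Al(OH)4)2 are handled by the same function.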
For each value in userArray, check whether there is a next element and whether that next element is a number. If so, add this number to the count of the current element type; otherwise add 1. You can use an object as a map to store a count for each distinct element type:
var map = { }
map[userArray[/*an element*/]] = ...
EDIT: if you have numbers longer than one digit, then in a loop, while the next token is a number, concatenate all numbers into a string and parseInt() it
This is an extension of this SO question
I made a function to see if I can correctly format any number. The answers below work in tools like https://regex101.com and https://regexr.com/, but not within my function (tried in Node and the browser):
const format = (num, regex) => String(num).replace(regex, '$1')
Basically, any whole number should not exceed 15 significant digits, and any decimal should not exceed 2 decimal places.
so...
Now
format(0.12345678901234567890, /^\d{1,13}(\.\d{1,2}|\d{0,2})$/)
returns 0.123456789012345678 instead of 0.123456789012345
but
format(0.123456789012345,/^-?(\d*\.?\d{0,2}).*/)
returns the number formatted to 2 decimal places, as expected.
Let me try to explain what's going on.
For the given input 0.12345678901234567890 and the regex /^\d{1,13}(\.\d{1,2}|\d{0,2})$/, let's go step by step and see what's happening.
^\d{1,13} Does indeed match the start of the string 0
(\. Now you've opened a new group, and it does match .
\d{1,2} It does find the digits 1 and 2
|\d{0,2} So this part is skipped
) So this is the end of your capture group.
$ This indicates the end of the string, but it won't match, because you've still got 345678901234567890 remaining.
JavaScript returns the whole string unchanged because the match failed at the end.
Let's try removing $ at the end, to become /^\d{1,13}(\.\d{1,2}|\d{0,2})/
You'd get back ".12345678901234567890". This generates a couple of questions.
Why did the preceding 0 get removed?
Because it was not part of your matching group, enclosed with ().
Why did we not get only two decimal places, i.e. .12?
Remember that you're doing a replace. Which means that by default, the original string will be kept in place, only the parts that match will get replaced. Since 345678901234567890 was not part of the match, it was left intact. The only part that matched was 0.12.
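To make the three cases concrete, here is a small sketch. It passes the input as a string (an assumption made for this example, so that floating-point rounding does not interfere) and uses made-up variable names.

```javascript
const input = '0.12345678901234567890';

// Anchored with $: the regex cannot match the whole string, so nothing is replaced.
const anchored = input.replace(/^\d{1,13}(\.\d{1,2}|\d{0,2})$/, '$1');

// Without $: only "0.12" matches and is replaced by group 1 (".12");
// the unmatched tail "345678901234567890" stays in place.
const unanchored = input.replace(/^\d{1,13}(\.\d{1,2}|\d{0,2})/, '$1');

// The second regex from the question ends in .*, so it consumes the whole string,
// and the replacement keeps only group 1.
const truncated = input.replace(/^-?(\d*\.?\d{0,2}).*/, '$1');

console.log(anchored);   // "0.12345678901234567890"
console.log(unanchored); // ".12345678901234567890"
console.log(truncated);  // "0.12"
```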
Answer to title question: your function doesn't replace, because there's nothing to replace - the regex doesn't match anything in the string. csb's answer explains that in all details.
But that's perhaps not the answer you really need.
Now, it seems like you have an XY problem. You ask why your call to .replace() doesn't work, but .replace() is definitely not the function you should use. The role of .replace() is to replace parts of a string, while you actually want to create a different string. Moreover, in the comments you suggest that your formatting is not only for presenting data to the user, but that you also intend to use it in some further computation. You also mention cryptocurrencies.
Let's cope with these problems one-by-one.
What to do instead of replace?
Well, just produce the string you need instead of replacing something in the string you don't like. There are some edge cases. Instead of writing all-in-one regex, just handle them one-by-one.
The following code is definitely not the best possible, but its main aim is to be simple and show exactly what is going on.
function format(n) {
  const max_significant_digits = 15;
  const max_precision = 2;
  let digits_before_decimal_point;
  if (n < 0) {
    // Don't count minus sign.
    digits_before_decimal_point = n.toFixed(0).length - 1;
  } else {
    digits_before_decimal_point = n.toFixed(0).length;
  }
  if (digits_before_decimal_point > max_significant_digits) {
    throw new Error('No good representation for this number');
  }
  const available_significant_digits_for_precision =
    Math.max(0, max_significant_digits - digits_before_decimal_point);
  const effective_max_precision =
    Math.min(max_precision, available_significant_digits_for_precision);
  const with_trailing_zeroes = n.toFixed(effective_max_precision);
  // I want to keep the string and change just matching part,
  // so here .replace() is a proper method to use.
  // Strip only when there is a decimal point: with precision 0 the
  // trailing zeroes belong to the integer part and must stay.
  const without_trailing_zeroes = with_trailing_zeroes.includes('.')
    ? with_trailing_zeroes.replace(/\.?0*$/, '')
    : with_trailing_zeroes;
  return without_trailing_zeroes;
}
So, you got the number formatted the way you want. What now?
What can you use this string for?
Well, you can display it to the user. And that's mostly it. The value was rounded to (1) represent it in a different base and (2) fit in limited precision, so it's pretty much useless for any computation. And, BTW, why would you convert it to String in the first place, if what you want is a number?
Was the value you are trying to print ever useful in the first place?
Well, that's the most serious question here. Because, you know, floating point numbers are tricky. And they are absolutely abysmal for representing money. So, most likely the number you are trying to format is already a wrong number.
What to use instead?
Fixed-point arithmetic is the most obvious answer. It works most of the time. However, it's pretty tricky in JS, where numbers may slip into floating-point representation almost any time. So, it's better to use a decimal arithmetic library. Alternatively, switch to a language that has built-in bignums and decimals, like Python.
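A minimal sketch of the fixed-point idea, using BigInt to keep amounts as integer hundredths. The scale of two decimal places, the function names toMinorUnits and fromMinorUnits, and the restriction to non-negative amounts are all assumptions made for this example; a real currency will likely need a different scale and proper rounding rules.

```javascript
// Floating point cannot represent 0.1 or 0.2 exactly.
const floatSum = 0.1 + 0.2;
console.log(floatSum === 0.3); // false

// Parse a non-negative decimal string such as "12.34" into integer hundredths.
// Digits beyond the second decimal place are truncated, not rounded.
function toMinorUnits(text) {
  const [whole, fraction = ''] = text.split('.');
  return BigInt(whole + fraction.padEnd(2, '0').slice(0, 2));
}

// Format non-negative integer hundredths back into a decimal string.
function fromMinorUnits(units) {
  const digits = units.toString().padStart(3, '0');
  return digits.slice(0, -2) + '.' + digits.slice(-2);
}

const fixedSum = toMinorUnits('0.10') + toMinorUnits('0.20');
console.log(fromMinorUnits(fixedSum)); // "0.30"
```

All arithmetic happens on integers, so the value never passes through a floating-point representation.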
I have an array like so:
const testArray = [ 'blah', 'abctesttt', 'atestc', 'testttttt' ]
I would like to split the string once it reaches a certain character count; for example, let's use 10 characters. Also, I would like the output to reorder the items so that each group stays within 10 characters. Please see the expected output below if this doesn't really make sense to you. Please assume, for the purpose of this example, that no item in the array is longer than 10 characters.
So once the testArray reaches 10 characters I would like the next item to go under a new variable, maybe? Not sure if that's the best way of doing this.
Something like this maybe? Again this may be very inefficient, if so please feel free to use another method.
const testArray = [ 'blah', 'abctesttt', 'atestc', 'testttttt' ]
if (testArray.join('\n').length >= 10) {
  /* split the string into parts and store it under a variable maybe?
  console.log((the_splitted_testArray).join('\n')); */
}
Expected output:
"blah
atestc" //instead of using "abctesttt" it would use "atestc" as it's the next element in the array and it also avoids reaching the 10 character limit, if adding "atestc" caused the character limit to go over 10, I would like it to check the next element and so on
"abctesttt" // it can't add the remaining "testttttt" since that would cause the character limit to be reached
"testttttt"
First of all, as you can't create a new variable out of nowhere at run time, you are probably going to use a "parent" array, which then contains the actual groups of strings, each at most 10 characters long.
For the grouping you probably have to design an algorithm yourself. My first idea for an algorithm is something like below. Probably not the best and most efficient way (as the description of "efficient" depends on your personal priorities), but feel free to optimise it yourself :)
Walk through $testArray[], sort all strings into a new two-dimensional array: $stringLength[$messagesWithSameLength[]]. Like array(1=>array('.','a'),2=>array('hi','##',...),...)
Now, always try to get as many strings together as possible. Start with one of the longest strings, calculate the remaining space and get a string suiting best into it. If none fits, start a new group.
Always try to use the space as well as possible.
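The greedy idea described above could be sketched in JavaScript roughly as follows. This is one possible reading of the algorithm, not a definitive one: groupByLength is a made-up name, the limit counts only the characters of the strings (not the newlines used to join them), and the groups come out longest-first rather than in the order shown in the question.

```javascript
// Group strings so that the total length of each group stays within `limit`.
function groupByLength(strings, limit) {
  // Work on a copy sorted longest-first.
  const remaining = [...strings].sort((a, b) => b.length - a.length);
  const groups = [];
  while (remaining.length > 0) {
    // Start a new group with the longest remaining string.
    const group = [remaining.shift()];
    let space = limit - group[0].length;
    // The list is sorted longest-first, so the first string that fits
    // is the one that fills the remaining space best.
    let i = remaining.findIndex((s) => s.length <= space);
    while (i !== -1) {
      const [fitting] = remaining.splice(i, 1);
      group.push(fitting);
      space -= fitting.length;
      i = remaining.findIndex((s) => s.length <= space);
    }
    groups.push(group);
  }
  return groups;
}

const testArray = ['blah', 'abctesttt', 'atestc', 'testttttt'];
const groups = groupByLength(testArray, 10);
console.log(groups.map((g) => g.join('\n')));
// [ 'abctesttt', 'testttttt', 'atestc\nblah' ]
```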
I am parsing a string of multiple numbers between 1 and 10 with the eventual goal of adding them to a set.
There will be multiple concatenated numbers after a text identifier such as {text}12345678910.
I am currently using match(/\d/g) to grab the numbers but it separates 1 and 0 in 10. I then look for 0 in my String Array, see if there's a 1 in the element before it, turn it into a 10 and delete the other entry. Not very elegant.
How can I clean up my matching code? I definitely don't need to use regex for this, but it makes grabbing the numbers fairly easy.
You could just match with this regex:
/10|\d/g
(instead of the one you use currently, not additionally)
The alternatives in a regex are tried left to right at each position, so it first tries to match 10 and only then falls back to a single digit. That is why the order matters: /\d|10/g or even /\d|(10)/g won't work, because \d always matches the 1 first.
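A short sketch of how this could feed the set mentioned in the question. The variable names are made up for the example, and the input string is the one from the question.

```javascript
const input = '{text}12345678910';

// Try "10" first, then fall back to a single digit.
const matches = input.match(/10|\d/g) || [];
const numbers = new Set(matches.map(Number));

// With the alternatives reversed, "10" is split into "1" and "0".
const wrongOrder = input.match(/\d|10/g) || [];

console.log(matches);    // [ '1', '2', '3', '4', '5', '6', '7', '8', '9', '10' ]
console.log(wrongOrder); // [ '1', '2', '3', '4', '5', '6', '7', '8', '9', '1', '0' ]
```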
First of all, I know regular expressions aren't the best tool to achieve what I want here. I have done enough research to know that much. Still, the problem I am stuck on requires me to build a regex that finds the values between some lower and upper bounds.
So here is the problem. I have a large set of data, let's say ranging between 1 and 1000000. That data is not under my direct control, and I cannot manipulate it directly. The only way of searching for values in that data is a regex. Now, the user can give two values, a minimum and a maximum, and I need to construct a regex based on these two values and then query the large data set with it to get all the values lying within that range. So, if my data contains [1,5,7,9,15,30,45,87] and the user sets the range min:10, max:40, the regex should return only the values 15 and 30.
From whatever I have searched, I know it is very much possible to build a regex for finding out values between fixed values (if we know them beforehand) for example, values between 1 to 100 can be found by:
^(100|[1-9][0-9]?)$
But what gets so tricky about my problem is that the input range can be anything from pretty much 1 digit values to up to 10 digit values. 10000-550000 can be an example user input for a large data set.
I know this will require some complex logic and loops involved on the basis of number of digits in the lower bound and number of digits in the upper bound of the range and then some recursive or other magical logic to build a regex that covers all the number lying in that range.
I've been filling up pages to come up with a logic but I'm afraid it surpasses my knowledge of regex. If anyone has ever done something like this before or try to point me in the right direction or attempt it him/herself - it'll be quite helpful. Thanks.
The language I will be using this in is JavaScript, and I read somewhere that JS doesn't support conditional regex. Keeping that in mind, the solution doesn't have to be specific to any language.
If your task is to get the numbers between the min and max value from the dataset, you can try the filter method.
var filteredResults = Dataset.filter(function (item) {
  // Keep only the items strictly between min and max.
  return item > min && item < max;
});
I'm trying to find an expression for JavaScript which gives me the two characters at a specific position.
It's always the same call, so it may not be too complicated.
I always have a 10-character number and I want to replace the first two digits, the two at places 3 and 4, the two at places 5 and 6, and so on.
So far I've done this:
number.replace(/\d{2}/, index);
This replaces my first 2 digits with 2 other digits.
but now I want to include some variables at which position the digits should be replaced, something like:
number.replace(/\d{atposx,atposx+1}/, index);
that means:
01234567891
and I want sometimes to replace 01 with 02 and sometimes 23 with 56.
(or something like this with other numbers).
I hope I pointed out what I want.
This function works fine:
function replaceChars(input, startPos, replacement) {
  return input.substring(0, startPos) +
    replacement +
    input.substring(startPos + replacement.length);
}
Usage:
replaceChars("0123456789",2,"55") // output: 0155456789
Live example: http://jsfiddle.net/FnkpT/
Numbers are fairly easily interpreted as strings in JS. So, if you're working with an actual number (e.g. 9876543210) and not a number that's represented by a string (e.g. '987654321'), just turn the number into a string (''.concat(number);) and don't limit yourself to what you can do with just numbers.
Both of the above examples are fine (bah, they beat me to it), but you can even think about it like this:
var numberString = ''.concat(number);
var numberChunks = numberString.match(/(\d{2})/g);
You've now got an array of chunks that you can either walk through, switch through, or whatever other kind of flow you want to follow. When you're done, just say...
numberString = numberChunks.join('');
number = parseInt(numberString, 10);
You've got your number back as a native number (or skip the last part to just get the string back). Aside from that, the more replacements you do in the number, the more efficient it is to break it up into chunks and deal with the chunks. I did a quick test: running the 'replaceChars' function was faster for a single change, but slower than splitting into an array if you're doing two or more changes to the data.
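Put together, the chunk approach might look like the sketch below. The helper name replaceChunk is made up for this example, and it assumes the input has an even number of digits (otherwise the final odd digit would be dropped by the regex). It returns a string so that leading zeros survive.

```javascript
// Replace the two-digit chunk at chunkIndex (0-based) with `replacement`.
function replaceChunk(number, chunkIndex, replacement) {
  const chunks = String(number).match(/\d{2}/g) || [];
  chunks[chunkIndex] = replacement;
  return chunks.join('');
}

console.log(replaceChunk('0123456789', 1, '56')); // "0156456789"
console.log(replaceChunk(9876543210, 0, '12'));   // "1276543210"
```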
Hope that makes sense!
You can try this
function replaceAtIndex(str, value, index) {
  return str.substr(0, index) + value + str.substr(index + value.length);
}
replaceAtIndex('0123456789','X',3); // returns "012X456789"
replaceAtIndex('0123456789','XY',3); // returns "012XY56789"