Match hyphenated floats - javascript

I'm working on JavaScript code, and I need to extract float or int numbers between hyphens. Like this:
var string="someText-180.5-200.70-someOtherText";
I need to get:
var number1="180.5";
var number2="200.70";

While other RegExp based solutions work, I'm going to propose a solution that does not use regular expressions.
My motivation for this is that regular expressions can be difficult to read, modify and/or debug sometimes.
This code returns an array with all the floats from the string (assuming the floats are positive):
"someText-180.5-200.70-someOtherText".split("-") // split on hyphens
  .map(function (elem) {
    return parseFloat(elem); // convert each part to a float
  })
  .filter(function (elem) {
    // filter out the not-a-numbers, that is, the parts that failed to convert;
    // NaN is the only value that is not equal to itself
    return elem === elem;
  });
This could also be done without map/filter in a less 'functional' way if you're more comfortable with that. I believe code readability is very important and you should feel comfortable with your own code. You can do
var str = "someText-180.5-200.70-someOtherText";
var splitStr = str.split("-");
var arrayOfFloats = [];
for(var i=0;i<splitStr.length;i++){
var asFloat = parseFloat(splitStr[i]);
if(asFloat===asFloat){ //check for NaN, that is parse float fails
arrayOfFloats.push(asFloat);
}
}
//now arrayOfFloats contains all the floats in your expression
If you only want floats that are strictly between hyphens (that is, you do not want to match the first and last parts), you can slice those elements off first :)
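A minimal sketch of that slicing step, assuming the same split/parseFloat approach as above. The helper name floatsBetweenHyphens is made up for illustration and is not part of the original answer.

```javascript
// Sketch: collect only the numbers that sit strictly between two hyphens.
// floatsBetweenHyphens is a hypothetical helper name.
function floatsBetweenHyphens(str) {
  return str
    .split("-")
    .slice(1, -1) // drop the first and last parts, which are not enclosed by hyphens
    .map(function (part) {
      return parseFloat(part);
    })
    .filter(function (value) {
      return !isNaN(value); // keep only the parts that parsed as numbers
    });
}

console.log(floatsBetweenHyphens("someText-180.5-200.70-someOtherText")); // [180.5, 200.7]
```

Note that parseFloat accepts a numeric prefix, so a part such as "12abc" would be kept as 12. If that matters, a stricter check on each part is needed.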

One problem is that a regex like this matches "-180.5-", so the search for the next match resumes in "200.70-someOtherText". The hyphen in front of 200.70 has already been consumed, so 200.70 is not matched.
We need to do a bit more than write a regex.
In my solution, I cut off the part of the string that has already been examined and run the regex again on the rest, repeating for as long as there is a match.
See the code below:
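function findAllINeed(str){
  var result = [], match; // declared locally so they do not leak as globals
  while ((match = /(?:\-)(([0-9]+)\.?([0-9]*))(?:\-)/.exec(str)) != null) {
    // drop the examined part; the closing hyphen stays in str so it can open the next match
    str = str.substr(match.index + match[1].length);
    result.push(parseFloat(match[1]));
  }
  return result;
}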
I tried this:
findAllINeed("someText-180.5-200.70-someOtherText sd sdf -6 -6.777 7- s 4.55 -4-sdfsdfsdf -45.77-4-")
[180.5, 200.7, 4, 45.77, 4]
It does not match -6, -6.777, 4.55 or 7-, because none of them has a hyphen on both sides.
But it finds every positive float or integer between '-' characters.
I hope this helps.
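An alternative sketch of the same idea, not taken from the answer above: a lookahead checks for the closing hyphen without consuming it, so a single global regex can find consecutive numbers without cutting the string. The function name findBetweenHyphens is hypothetical.

```javascript
// Sketch: match numbers enclosed by hyphens without consuming the closing hyphen.
// findBetweenHyphens is a hypothetical name used for illustration.
function findBetweenHyphens(str) {
  // (?=-) asserts that a hyphen follows but leaves it in place for the next match
  var re = /-(\d+(?:\.\d+)?)(?=-)/g;
  var result = [];
  var match;
  while ((match = re.exec(str)) !== null) {
    result.push(parseFloat(match[1]));
  }
  return result;
}

console.log(findBetweenHyphens("someText-180.5-200.70-someOtherText")); // [180.5, 200.7]
```

Because the regex has the g flag and is created once per call, exec advances through the string by itself and the loop ends when no further match is found.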

You can use the following regex to accomplish what you are looking for:
[0-9]*\.?,?[0-9]+
Input:
someText-180.5-200.70-someOtherText
Output:
180.5
200.70

You can use this regex:
var string="someText-180.5-200.70-someOtherText";
var match = string.match(/.*?(\d+(\.\d+)?).*?(\d+(\.\d+)?)/);
console.log(match[1]); // prints 180.5
console.log(match[3]); // prints 200.70

var numbers = string.match(/[\d.]+/g)
Or if you're concerned about multiple periods creating invalid numbers:
var numbers = string.match(/\d+(\.\d+)?/g)

Related

How can I split the word by numbers but also keep the numbers in Node.js?

I would like to split a word by numbers, but at the same time keep the numbers in node.js.
For example, take this following sentence:
var a = "shuan3jia4";
What I want is:
"shuan3 jia4"
However, if you split with a regular expression, the digits matched by the pattern are dropped from the result. For example:
a.split(/[0-9]/)
The result is:
[ 'shuan', 'jia', '' ]
So is there any way to keep the numbers that are used on the split?
You can use match to actually split it per your requirement:
var a = "shuan3jia4";
console.log(a.match(/[a-z]+[0-9]/ig));
Use parentheses (a capture group) around the part of the match you want to keep; split() then includes it in the result.
For further details, see "Javascript and regex: split string and keep the separator".
var s = "shuan3jia4";
var arr = s.split(/([0-9])/);
console.log(arr);
var s = "shuan3jia4";
var arr = s.split(/(?<=[0-9])/);
console.log(arr);
This will work as per your requirements. This answer was adapted from @arhak's answer and from "C# split string but keep split chars / separators".
As @codybartfast said, (?<=PATTERN) is a positive look-behind for PATTERN. It matches at any position where the preceding text fits PATTERN, so there is a match (and a split) after each occurrence of a digit.
Split, map, join, trim.
const a = 'shuan3jia4';
const splitUp = a.split('').map(function(char) {
// test with a regex: parseInt(char) is 0 for "0", which is falsy and would be skipped
if (/[0-9]/.test(char)) return `${char} `;
return char;
});
const joined = splitUp.join('').trim();
console.log(joined);
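For comparison, here is a sketch that produces the requested string directly with replace. It is not from any of the answers above, and it assumes that a "number" may be a run of several digits; the function name spaceAfterNumbers is hypothetical.

```javascript
// Sketch: insert a space after each run of digits that is followed by a non-digit.
// spaceAfterNumbers is a hypothetical name used for illustration.
function spaceAfterNumbers(str) {
  // The lookahead (?=[^0-9]) keeps a trailing number from getting a space after it.
  return str.replace(/([0-9]+)(?=[^0-9])/g, "$1 ");
}

console.log(spaceAfterNumbers("shuan3jia4")); // "shuan3 jia4"
```

Since the final run of digits is not followed by anything, no trailing space is added and no trim() is needed.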

How to split a string by a character not directly preceded by a character of the same type?

Let's say I have a string: "We.need..to...split.asap". What I would like to do is to split the string by the delimiter ., but I only wish to split by the first . and include any recurring .s in the succeeding token.
Expected output:
["We", "need", ".to", "..split", "asap"]
In other languages, I know that this is possible with a look-behind /(?<!\.)\./ but Javascript unfortunately does not support such a feature.
I am curious to see your answers to this question. Perhaps there is a clever use of look-aheads that presently evades me?
I was considering reversing the string, then re-reversing the tokens, but that seems like too much work for what I am after... plus controversy: How do you reverse a string in place in JavaScript?
Thanks for the help!
Here's a variation of the answer by guest271314 that handles more than two consecutive delimiters:
var text = "We.need.to...split.asap";
var re = /(\.*[^.]+)\./;
var items = text.split(re).filter(function(val) { return val.length > 0; });
It relies on the fact that if the split expression includes a capture group, the captured items are included in the returned array. These captures are what we are interested in; apart from the final token, the tokens between the matches are all empty strings, which we filter out.
EDIT: Unfortunately there's perhaps one slight bug with this. If the text to be split starts with a delimiter, that will be included in the first token. If that's an issue, it can be remedied with:
var re = /(?:^|(\.*[^.]+))\./;
var items = text.split(re).filter(function(val) { return !!val; });
(I think this regex is ugly and would welcome an improvement.)
You can do this without any lookaheads:
var subject = "We.need.to....split.asap";
var regex = /\.?(\.*[^.]+)/g;
var matches, output = [];
while(matches = regex.exec(subject)) {
output.push(matches[1]);
}
document.write(JSON.stringify(output));
It seemed like it'd work in one line, as it did on https://regex101.com/r/cO1dP3/1, but had to be expanded in the code above because the /g option by default prevents capturing groups from returning with .match (i.e. the correct data was in the capturing groups, but we couldn't immediately access them without doing the above).
See: JavaScript Regex Global Match Groups
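In engines that provide String.prototype.matchAll (part of ES2020), the capture groups of a global regex can be read without a manual exec loop. This is a sketch using the same regex as above, not part of the original answer.

```javascript
var subject = "We.need.to....split.asap";
// matchAll needs a regex with the g flag and yields one match object per hit,
// each with its capture groups intact.
var output = Array.from(subject.matchAll(/\.?(\.*[^.]+)/g), function (m) {
  return m[1];
});
console.log(JSON.stringify(output)); // ["We","need","to","...split","asap"]
```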
An alternative solution with the original one liner (plus one line) is:
document.write(JSON.stringify(
"We.need.to....split.asap".match(/\.?(\.*[^.]+)/g)
.map(function(s) { return s.replace(/^\./, ''); })
));
Take your pick!
Note: This answer can't handle more than 2 consecutive delimiters, since it was written according to the example in the revision 1 of the question, which was not very clear about such cases.
var text = "We.need.to..split.asap";
// split "." if followed by "."
var res = text.split(/\.(?=\.)/).map(function(val, key) {
// if `val[0]` does not begin with "." split "."
// else split "." if not followed by "."
return val[0] !== "." ? val.split(/\./) : val.split(/\.(?!.*\.)/)
});
// concat arrays `res[0]` , `res[1]`
res = res[0].concat(res[1]);
document.write(JSON.stringify(res));
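JavaScript engines that implement ES2018 do support look-behind assertions, so in those environments the look-behind from the question can be used directly. A minimal sketch, assuming such an engine:

```javascript
// Split on a "." only when it is not directly preceded by another "."
var tokens = "We.need..to...split.asap".split(/(?<!\.)\./);
console.log(JSON.stringify(tokens)); // ["We","need",".to","..split","asap"]
```

In older engines the regex literal itself is a syntax error, so the other approaches in this thread remain useful where such engines must be supported.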

regex - get numbers after certain character string

I have a text string of arbitrary length, and I would like to attach an order number to the end of it so that I can pluck the order number off when I need it again. Since the number may vary in length, I would like a regular expression that catches everything after the = sign in the string ?order_num=
So the whole string would be
"aijfoi aodsifj adofija afdoiajd?order_num=3216545"
I've tried an online regular expression generator, but with no luck. Can someone please help me extract the number at the end into one variable and put whatever comes before ?order_num=203823 into another variable?
I'll post some attempts of my own, but I foresee failure and confusion.
var s = "aijfoi aodsifj adofija afdoiajd?order_num=3216545";
var m = s.match(/([^\?]*)\?order_num=(\d*)/);
var num = m[2], rest = m[1];
But remember that regular expressions can be slower than plain string operations. Use indexOf and substring/slice when you can. For example:
var p = s.indexOf("?");
var num = s.substring(p + "?order_num=".length), rest = s.substring(0, p);
I see no need for regex for this:
var str="aijfoi aodsifj adofija afdoiajd?order_num=3216545";
var n=str.split("?");
n will then be an array, where index 0 is before the ? and index 1 is after.
Another example:
var str="aijfoi aodsifj adofija afdoiajd?order_num=3216545";
var n=str.split("?order_num=");
Will give you the result:
n[0] = aijfoi aodsifj adofija afdoiajd and
n[1] = 3216545
You can take the substring from the first ? onward and then apply a much simpler regex to it. This may also improve performance, although the difference is probably negligible and not worth worrying about unless you are doing this over thousands of iterations. In addition, this matches order_num= anywhere within the query string, not just at the very end.
var match = s.substr(s.indexOf('?')).match(/order_num=(\d+)/);
if (match) {
alert(match[1]);
}
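The indexOf/slice approach from the earlier answers can be wrapped in a small helper that also handles a missing marker. This is a sketch under the assumption that the order number is digits only and sits at the very end of the string; the name splitOrderNumber and the shape of the returned object are made up for illustration.

```javascript
// Sketch: split "<text>?order_num=<digits>" into its two parts without a capturing regex.
// splitOrderNumber and the { rest, orderNum } result shape are hypothetical.
function splitOrderNumber(s) {
  var marker = "?order_num=";
  var p = s.lastIndexOf(marker); // use the last marker, since the number is appended at the end
  if (p === -1) return null; // no order number attached
  var orderNum = s.slice(p + marker.length);
  if (!/^\d+$/.test(orderNum)) return null; // assumption: the order number is digits only
  return { rest: s.slice(0, p), orderNum: orderNum };
}

var parts = splitOrderNumber("aijfoi aodsifj adofija afdoiajd?order_num=3216545");
console.log(parts.rest);     // "aijfoi aodsifj adofija afdoiajd"
console.log(parts.orderNum); // "3216545"
```

Returning null for unexpected input keeps the caller from silently using a wrong number, at the cost of one extra check at the call site.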

getting contents of string between digits

I have a regex problem :(
What I would like to do is find the contents between two or more numbers.
var string = "90+*-+80-+/*70"
I'm trying to edit the symbols in between so that only the last symbol remains and not the ones before it, turning the variable above into 90+80*70. This is just an example, though, and I have no idea how to do this: the length of the numbers, how many "sets" of numbers there are, and the length of the symbol runs in between could be anything.
Many thanks,
Steve,
The trick is in matching '90+*-+' and '80-+/*' separately, and selecting only the number and the last symbol.
The expression for finding a number followed by one or more non-numbers would be
\d+[^\d]+
To select the number and the last non-number, add parens:
(\d+)[^\d]*([^\d])
Finally add a /g to repeat the procedure for each match, and replace it with the 2 matched groups for each match:
js> '90+*-+80-+/*70'.replace(/(\d+)[^\d]*([^\d])/g, '$1$2');
90+80*70
js>
Or you can use lookahead assertion and simply remove all non-numerical characters which are not last: "90+*-+80-+/*70".replace(/[^0-9]+(?=[^0-9])/g,'');
You can use a regular expression to match the non-digits and a callback function to process the match and decide what to replace:
var test = "90+*-+80-+/*70";
var out = test.replace(/[^\d]+/g, function(str) {
return(str.substr(-1));
})
alert(out);
See it work here: http://jsfiddle.net/jfriend00/Tncya/
This works by using a regular expression to match sequences of non-digits and then replacing that sequence of non-digits with the last character in the matched sequence.
I would use this tutorial first, then review this for JavaScript-specific regex questions.
This should do it -
var string = "90+*-+80-+/*70"
var result = '';
var arr = string.split(/(\d+)/)
for (var i = 0; i < arr.length; i++) {
if (!isNaN(arr[i])) result = result + arr[i];
else result = result + arr[i].slice(arr[i].length - 1, arr[i].length);
}
alert(result);
Working demo - http://jsfiddle.net/ipr101/SA2pR/
Similar to @Arnout Engelen
var string = "90+*-+80-+/*70";
string = string.replace(/(\d+)[^\d]*([^\d])(?=\d+)/g, '$1$2');
This was my first thought on how the regex should behave. It also looks ahead to make sure the non-digit pattern is followed by another digit, which is what the question asked for (between two numbers).
Similar to @jfriend00
var string = "90+*-+80-+/*70";
string = string.replace( /(\d+?)([^\d]+?)(?=\d+)/g
, function(){
return arguments[1] + arguments[2].substr(-1);
});
Instead of matching any run of non-digits, it matches only non-digits between two numbers, which is what the question asked for.
Why would this be any better?
It is better if your equation is embedded in a paragraph or other string of text, like:
This is a test where I want to clean up something like 90+*-+80-+/*70 and don't want to scrap the whole paragraph.
Result (Expected) :
This is a test where I want to clean up something like 90+80*70 and don't want to scrap the whole paragraph.
Why would this not be any better?
There is more pattern matching, which makes it theoretically slower (negligible)
It would fail if your paragraph had embedded numbers. Like:
This is a paragraph where Sally bought 4 eggs from the supermarket, but only 3 of them made it back in one piece.
Result (Unexpected):
This is a paragraph where Sally bought 4 3 of them made it back in one piece.
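One way to avoid that failure is to match only operator characters between the digits instead of any non-digit. The sketch below assumes the symbols are limited to + - * and /, which the question does not state; the function name collapseOperators is hypothetical.

```javascript
// Sketch: between two digits, keep only the last operator of a run of operators.
// Assumption: the only symbols of interest are + - * and /.
// collapseOperators is a hypothetical name used for illustration.
function collapseOperators(text) {
  // (\d)         the digit just before the run of operators
  // [+\-*\/]*    all operators except the last one (dropped)
  // ([+\-*\/])   the last operator (kept)
  // (?=\d)       the run must be followed by a digit
  return text.replace(/(\d)[+\-*\/]*([+\-*\/])(?=\d)/g, "$1$2");
}

console.log(collapseOperators("90+*-+80-+/*70")); // "90+80*70"
```

Because ordinary words and spaces are not operator characters, the text between "4" and "3" in the Sally paragraph is left alone.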

Splitting Nucleotide Sequences in JS with Regexp

I'm trying to split up a nucleotide sequence into amino acid strings using a regular expression. I have to start a new string at each occurrence of the string "ATG", but I don't want to actually stop the first match at the "ATG". Valid input is any ordering of a string of As, Cs, Gs, and Ts.
For example, given the input string: ATGAACATAGGACATGAGGAGTCA
I should get two strings: ATGAACATAGGACATGAGGAGTCA (the whole thing) and ATGAGGAGTCA (from the second "ATG" onward). A string that contains "ATG" n times should result in n results.
I thought the expression /(?:[ACGT]*)(ATG)[ACGT]*/g would work, but it doesn't. If this can't be done with a regexp it's easy enough to just write out the code for, but I always prefer an elegant solution if one is available.
If you really want to use regular expressions, try this:
var str = "ATGAACATAGGACATGAGGAGTCA",
re = /ATG.*/g, match, matches=[];
while ((match = re.exec(str)) !== null) {
matches.push(match[0]); // store the matched string rather than the whole match array
re.lastIndex = match.index + 3;
}
But be careful with exec and changing the index. You can easily make it an infinite loop.
Otherwise you could use indexOf to find the indices and substr to get the substrings:
var str = "ATGAACATAGGACATGAGGAGTCA",
offset=0, matches=[];
// always search the original string, so that offset stays an index into str
while ((offset = str.indexOf("ATG", offset)) > -1) {
matches.push(str.substr(offset));
offset += 3;
}
I think what you want is
var subStrings = inputString.split('ATG');
KISS :)
Splitting a string before each occurrence of ATG is simple, just use
result = subject.split(/(?=ATG)/i);
(?=ATG) is a positive lookahead assertion, meaning "Assert that you can match ATG starting at the current position in the string".
This will split GGGATGTTTATGGGGATGCCC into GGG, ATGTTT, ATGGGG and ATGCCC.
So now you have an array of (in this case four) strings. I would now take those and discard the first one unless the input itself starts with ATG (the first string can only contain ATG at its very start). Then join each remaining string with all the strings after it: no. 2 + ... + n, then 3 + ... + n etc. until you have exhausted the list.
Of course, this regex doesn't do any validation as to whether the string only contains ACGT characters as it only matches positions between characters, so that should be done before, i. e. that the input string matches /^[ACGT]*$/i.
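The split-then-join procedure described above can be sketched as follows. The function name atgSuffixes is made up for illustration, and the input check mirrors the /^[ACGT]*$/i validation mentioned in the answer.

```javascript
// Sketch: return every suffix of the sequence that starts at an occurrence of ATG.
// atgSuffixes is a hypothetical name used for illustration.
function atgSuffixes(subject) {
  if (!/^[ACGT]*$/i.test(subject)) {
    throw new Error("sequence may only contain A, C, G and T");
  }
  var parts = subject.split(/(?=ATG)/i); // split before each ATG
  var results = [];
  for (var i = 0; i < parts.length; i++) {
    // Only the first part can lack a leading ATG; skip it in that case.
    if (/^ATG/i.test(parts[i])) {
      results.push(parts.slice(i).join("")); // this part plus everything after it
    }
  }
  return results;
}

console.log(atgSuffixes("ATGAACATAGGACATGAGGAGTCA"));
// ["ATGAACATAGGACATGAGGAGTCA", "ATGAGGAGTCA"]
```

Testing each part for a leading ATG, instead of always discarding the first part, keeps the whole-string result when the input itself begins with ATG.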
Since you want to capture from every "ATG" to the end split isn't right for you. You can, however, use replace, and abuse the callback function:
var matches = [];
seq.replace(/atg/gi, function(m, pos){ matches.push(seq.substr(pos)); });
This isn't with regex, and I don't know if this is what you consider "elegant," but...
var sequence = 'ATGAACATAGGACATGAGGAGTCA';
var matches = [];
var idx;
// a plain while loop also copes with back-to-back ATGs and with input that has no ATG
while ((idx = sequence.indexOf('ATG')) !== -1) {
sequence = sequence.slice(idx + 3);
matches.push('ATG' + sequence);
}
I'm not completely sure if this is what you're looking for. For example, with an input string of ATGabcdefghijATGklmnoATGpqrs, this returns ATGabcdefghijATGklmnoATGpqrs, ATGklmnoATGpqrs, and ATGpqrs.
