I am trying to extract a multi-digit number which is preceded by a non-digit from a string in JavaScript, but I'm having trouble. For example, I want to get the "23" out of "B23 is the floor".
var str = "B23 is the floor."
var num = str.match(/\d*/);
$("#result").append(Object.prototype.toString.call(num) + ', ');
$("#result").append(Object.prototype.toString.call(num[0]) + ', ');
$("#result").append(num[0] + ', ');
$("#result").append(num[0].length);
Returns:
Result:[object Array], [object String], , 0
num[0] seems to be an empty string.
For some reason the regex /\d*/ does not work the way it is supposed to. I have tried /(\d)/, /(\d)*/, /[0-9]/, and some other reasonable and unreasonable things, but it just doesn't seem to work.
Here is my jsFiddle if you want to take a look:
http://jsfiddle.net/PLYHF/3/
The problem is that the regex is satisfied too easily. \d* is perfectly fine with "nothing" (since * means 0 or more), so it succeeds at the very first position, before the "B", and returns an empty string.
Instead, try /\d+/. This will force there to be at least one digit.
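To make the difference concrete, here is a small sketch that runs both patterns against the string from the question and prints what was matched and where. The variable names are just for illustration.

```javascript
var str = "B23 is the floor.";

// \d* is satisfied by zero digits, so it succeeds at index 0 (before the "B").
var star = str.match(/\d*/);

// \d+ needs at least one digit, so the engine moves on until it reaches "23".
var plus = str.match(/\d+/);

console.log(JSON.stringify(star[0]), star.index); // "" 0
console.log(plus[0], plus.index);                 // 23 1
```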
parseInt(/[A-Z]+([0-9]+)/.exec('B23 is the floor.')[1]); // 23
What you'll want to do is match a non-digit with a negated character class such as [^0-9], followed by a capturing group of digits, and then get the second value (Array[1]) from the returned array. That will contain what's captured by the first group. It would end up looking like this:
var str = "B23 is the floor."
var num = str.match(/[^0-9\s]([0-9]+)/);
$('#result').append(num[1]);
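Since match returns null when nothing matches, num[1] would throw on input without a number. A minimal sketch of a null-safe wrapper follows; extractNumber is a hypothetical helper name, not something from the question.

```javascript
// Hypothetical helper: returns the first run of digits that directly follows
// a non-digit, non-whitespace character, or null when there is none.
function extractNumber(str) {
  var m = str.match(/[^0-9\s]([0-9]+)/);
  return m ? parseInt(m[1], 10) : null;
}

console.log(extractNumber("B23 is the floor.")); // 23
console.log(extractNumber("no digits here"));    // null
```

Note that with this pattern a number at the very start of the string (nothing in front of it) is not matched, which is what the question asked for.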
I would like to split a spreadsheet cell reference (eg, A10, AB100, ABC5) to two string parts: column reference and row reference.
A10 => A and 10
AB100 => AB and 100 ...
Does anyone know how to do this by string functions?
var res = "AA123";
//Method 1
var arr = res.match(/[a-z]+|[^a-z]+/gi);
document.write(arr[0] + "<br>" + arr[1]);
//Method 2 (as deceze suggested)
var arr = res.match(/([^\d]+)(\d+)/);
document.write("<br>" + arr[1] + "<br>" + arr[2]);
//Note here [^\d] is the same as \D
This is easiest to do with a regular expression (regex). For example:
var ref = "AA100";
var matches = ref.match(/^([a-zA-Z]+)([1-9][0-9]*)$/);
if (matches) {
var column = matches[1];
var row = Number(matches[2]);
console.log(column); // "AA"
console.log(row); // 100
} else {
throw new Error('invalid ref "' + ref + '"');
}
The important part here is the regex literal, /^([a-zA-Z]+)([1-9][0-9]*)$/. I'll walk you through it.
^ anchors the regex to the start of the string. Otherwise you might match something like "123ABC456".
[a-zA-Z]+ matches one or more characters from a-z or A-Z.
[1-9][0-9]* matches exactly one character from 1-9, and then zero or more characters from 0-9. This makes sure that the number you are matching never starts with zero (i.e. "A001" is not allowed).
$ anchors the regex to the end of the string, so that you don't match something like "ABC123DEF".
The parentheses around ([a-zA-Z]+) and ([1-9][0-9]*) "capture" the strings inside them, so that we can later find them using matches[1] and matches[2].
This example is strict about only matching valid cell references. If you trust the data you receive to always be valid then you can get away with a less strict regex, but it is good practice to always validate your data anyway in case your data source changes or you use the code somewhere else.
It is also up to you to decide what you want to do if you receive invalid data. In this example I make the script throw an error, but there might be better choices in your situation (e.g. prompt the user to enter another value).
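If throwing is too heavy-handed for your case, one alternative is to return null and let the caller decide. This is only a sketch using the same regex; parseCellRef is a hypothetical name.

```javascript
// Hypothetical helper: returns { column, row } for a valid reference,
// or null for anything that does not look like one.
function parseCellRef(ref) {
  var m = /^([a-zA-Z]+)([1-9][0-9]*)$/.exec(ref);
  if (!m) {
    return null;
  }
  return { column: m[1], row: Number(m[2]) };
}

var cell = parseCellRef("AB100");
console.log(cell.column, cell.row); // AB 100
console.log(parseCellRef("A001"));  // null (row may not start with zero)
```

The caller can then check for null and, for example, ask the user for another value.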
I'm trying to build a regular expression that parses a string and skips things in brackets.
Something like
string = "A bc defg hi [hi] jkl mnop.";
The .match() should return "hi" but not [hi]. I've spent 5 hours running through RE's but I'm throwing in the towel.
Also this is for javascript or jquery if that matters.
Any help is appreciated. Also I'm working on getting my questions formatted correctly : )
EDIT:
Ok I just had a eureka moment and figured out that the original RegExp I was using actually did work. But when I was replacing the matches with the [matches], it simply replaced the first match in the string... over and over. I thought this was my regex refusing to skip the brackets, but after much time trying almost all of the solutions below, I realized that I was derping hardcore.
When .replace was working its magic it was on the first match, so I quite simply added a space to the end of the result word as follows:
var result = string.match(regex);
var modifiedResult = '[' + result[0].toString() + ']';
string = string.replace(result[0].toString() + ' ', modifiedResult + ' '); // replace() returns a new string, so assign it back
This got it to stop targeting the original word in the string and stop adding a new set of brackets to it with every match. Thank you all for your help. I am going to give answer credit to the post that prodded me in the right direction.
preprocess the target string by removing everything between brackets before trying to match your RE
string = "A bc defg hi [hi] jkl mnop."
tmpstring = string.replace(/\[.*\]/, "")
then apply your RE to tmpstring
correction: made the match for brackets non-greedy (lazy) per nhahtd comment below, and also made the RE global
string = "A bc defg hi [hi] jkl mnop."
tmpstring = string.replace(/\[.*?\]/g, "")
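The difference between the greedy and the non-greedy pattern only shows up when the string has more than one bracketed part. A quick sketch, using a made-up input with two pairs of brackets:

```javascript
var input = "A [bc] defg [hi] jkl";

// Greedy: .* runs to the LAST "]", so the text between the two pairs is lost too.
var greedy = input.replace(/\[.*\]/, "");

// Non-greedy + global: each bracketed part is removed on its own.
var lazy = input.replace(/\[.*?\]/g, "");

console.log(JSON.stringify(greedy)); // "A  jkl"
console.log(JSON.stringify(lazy));   // "A  defg  jkl"
```

The spaces that surrounded the brackets stay in place, so you may want to collapse double spaces afterwards.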
You don't necessarily need regex for this. Simply use string manipulation:
var arr = string.split("[");
var final = arr[0] + arr[1].split("]")[1];
If there are multiple bracketed expressions, use a loop:
while (string.indexOf("[") != -1){
var arr = string.split("[");
string = arr[0] + arr.slice(1).join("[").split("]").slice(1).join("]");
}
Using only Regular Expressions, you can use:
hi(?!])
as an example.
Look here about negative lookahead: http://www.regular-expressions.info/lookaround.html
Unfortunately, JavaScript did not support negative lookbehind until ES2018, so older engines will reject it.
I used http://regexpal.com/ to test, abcd[hi]jkhilmnop as test data, hi(?!]) as the regex to find. It matched 'hi' without matching '[hi]'. Basically it matched the 'hi' so long as there was not a following ']' character.
This of course, can be expanded if needed. This has a benefit of not requiring any pre-processing for the string.
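For reference, the same test can be reproduced in a console. This sketch collects the index of every match in the test data above; the variable names are illustrative.

```javascript
var data = "abcd[hi]jkhilmnop";
var re = /hi(?!\])/g; // "hi" not immediately followed by "]"

var positions = [];
var m;
while ((m = re.exec(data)) !== null) {
  positions.push(m.index);
}

console.log(positions.join(",")); // 10
```

Keep in mind that the lookahead only inspects the single character after "hi". Something like "[hi there]" would still match, so this is a shortcut for simple input rather than real bracket awareness.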
/\[(.*)\]/
Just play around with this if you want to use regular expressions.
What do you want to do with it? If you want to selectively replace parts like "hi" except when it's "[hi]", then I often use a system where I match what I want to avoid first and then what I want to match; if it matches what I want to avoid then I return the match unchanged, otherwise I return the processed match.
Like this:
return string.replace(/(\[\w+\])|(\w+)/g, function(all, m1, m2) {return m1 || m2.toUpperCase()});
which, with the given string, returns:
"A BC DEFG HI [hi] JKL MNOP."
Thus: it replaces every word with uppercase (m1 is empty), except if the word is between square brackets (m1 is not empty).
This builds an array of all the strings contained in [ ]:
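var regex = /\[([^\]]*)\]/g; // the g flag is required: without it exec() restarts at index 0 every time and the loop never ends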
var string = "A bc defg hi [hi] [jkl] mnop.";
var results=[], result;
while(result = regex.exec(string))
results.push(result[1]);
edit
To answer the question, this regex returns the string without everything inside [ ], and trims the extra whitespace:
"A bc defg [hi] mnop [jkl].".replace(/(\s{0,1})\[[^\]]*\](\s{0,1})/g,'$1')
Instead of skipping the match you can probably try something different - match everything but do not capture the string within square brackets (inclusive) with something like this:
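var s = "A [bc] [defg] hi [hi] j[kl]u m[no]p."; // sample input
var r = /(?:\[.*?[^\[\]]\])|(.)/g;
var result;
var str = [];
while((result = r.exec(s)) !== null){
if(result[1] !== undefined){ // true for a character outside brackets; undefined when a [string] was matched but not captured
str.push(result[1]);
}
}
console.log(str.join(''));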
The last line will print parts of the string which do not match the [string] pattern. For example, when called with the input "A [bc] [defg] hi [hi] j[kl]u m[no]p." the code prints "A hi ju mp." with whitespaces intact.
You can try different things with this code e.g. replacing etc.
I'm sorry if it is a confusing question. I was trying to find a way to do this but couldn't find it so, if it is a repeated question, my apologies!
I have a text something like this: something:"askjnqwe234"
I want to be able to get askjnqwe234 using a RegExp. Notice that I want to omit the quotes. I was trying this using /[^"]+(?=(" ")|"$)/g but it returns an array. I want a RegExp to return a single string, not an array.
I don't know if it's possible but I do not want to specify the position of the array; something like this:
var x = string.match(/[^"]+(?=(" ")|"$)/g)[0];
Thanks!
Try:
/"([^"]*)"/g
In English: look for a ", then match and record anything that isn't a " until you see another ".
match and exec always return an array or null, so, assuming you have a single double-quoted value and no newlines in the string, you could use
var x;
var str = 'something:"askjnqwe234"';
x = str.replace( /^[^"]*"|".*/g, '' );
// "askjnqwe234"
Or, if you may have other quoted values in the string
x = str.replace( /.*?something:"([^"]*)".*/, '$1' );
where $1 refers to the substring captured by the sub-pattern [^"]* between the ().
Further explanation on request.
Notwithstanding the above, I recommend that you tolerate the array indexing and just use match.
You can capture the information inside quotes like this, assuming it matches:
var x = string.match(/something:"([^"]*)"/)[1];
The memory capture at index 1 is the part inside the double quotes.
If you're not sure it will match:
var match = string.match(/something:"([^"]*)"/);
if (match) {
// use match[1] here
}
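If the array indexing bothers you, it can be hidden in a tiny wrapper so the calling code only ever sees a string or null. A sketch, assuming a non-global regex with at least one capture group; firstGroup is a hypothetical name.

```javascript
// Hypothetical helper: returns the first capture group as a string,
// or null when the pattern does not match.
// Assumes the regex has no g flag (a global regex keeps state in lastIndex).
function firstGroup(str, regex) {
  var m = regex.exec(str);
  return m ? m[1] : null;
}

var x = firstGroup('something:"askjnqwe234"', /something:"([^"]*)"/);
console.log(x); // askjnqwe234
```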
How would you change this:
a-10-b-19-c
into something like this:
a-10-b-20-c
using regular expressions in Javascript?
It should also change this:
a-10-b-19
into this:
a-10-b-20
The only solution I've found so far is:
reverse the original string -> "c-91-b-01-a"
find the first number (with \d+) -> "91"
reverse it -> "19"
turn it into a number (parseInt) -> 19
add 1 to it -> 20
turn it into a string again (toString) -> "20"
reverse it again -> "02"
replace the original match with this new number -> "c-02-b-01-a"
reverse the string -> "a-10-b-20-c"
I was hoping someone on SO would have a simpler way to do this... Anyone?
Here is a simple way.
var str = 'a-10-b-19-c';
str = str.replace(/(\d*)(?=(\D*)?$)/, +str.match(/(\d*)(?=(\D*)?$)/)[0]+1);
+str.match finds 19, adds 1 to it and returns 20. The unary + converts the matched string to a number, so the +1 is addition rather than string concatenation. str.replace finds 19 and replaces it with that result, which is 20.
Explanation
(\d*) - matches any digits
(?=...) - positive lookahead, doesn't change regex position, but makes sure that pattern exists further on down the line.
(\D*)?$ - it doesn't have to, but can match anything that is not a number multiple times and then matches the end of the string
//replaces last digit sequences with 20
'a-10-b-19-c'.replace(/\d+(?!.*\d+)/, '20')
/ --> Start of regex
\d+ --> Match any digit (one or more)
(?!.*\d+) --> negative look ahead assertion that we cannot find any future (one or more) digits
/ --> end of regex
Edit: Just reread about adding,
Can use match for that, e.g.:
var m ='a-10-b-19-c'.match(/\d+(?!.*\d+)/);
'a-10-b-19-c'.replace(/\d+(?!.*\d+)/, parseInt(m[0]) + 1);
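The match and the replace can also be folded into a single pass by giving replace a callback. This is a sketch under the assumption that the input is a single line (the dot in the lookahead does not cross newlines); incrementLastNumber is a hypothetical name.

```javascript
// Hypothetical helper: increments the last run of digits in the string.
// Strings without any digits are returned unchanged.
function incrementLastNumber(str) {
  // \d+(?!.*\d) = a run of digits with no further digit anywhere after it
  return str.replace(/\d+(?!.*\d)/, function (n) {
    return String(parseInt(n, 10) + 1);
  });
}

console.log(incrementLastNumber('a-10-b-19-c')); // a-10-b-20-c
console.log(incrementLastNumber('a-10-b-19'));   // a-10-b-20
console.log(incrementLastNumber('19-c'));        // 20-c
```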
Here's an even simpler one:
str.replace(/(.*\D)(\d+)/, function(s, pfx, n) {return pfx + ((+n) + 1)})
or
str.replace(/.*\D(\d+)/, function(s, n) {return s.slice(0, -n.length) + ((+n) + 1)})
Neither of these will work if the number is the first thing in the string, but this one will:
(' ' + str).replace(/.*\D(\d+)/,
function(s, n) {
return s.slice(1, -n.length) + ((+n) + 1)
})
(Why does Javascript need three different substring functions?)
Here's the simplest way I can think of:
var str = 'a-10-b-19-c';
var arr = str.split('-');
arr[3] = parseInt(arr[3]) + 1;
str = arr.join('-');
Edit to explain:
The split() method takes the parameter (in this case the hyphen) and breaks it up into an array at each instance it finds. If you type arr into your JavaScript console after this part runs you'll get ["a", "10", "b", "19", "c"]
We know that we need to change the 4th item here, which is accessed by index 3 via arr[3]. Each piece of this array is a string. If you try to increment a string by 1 it will simply concatenate the string with a 1 (JS uses the + for addition and concatenation) so you need to use parseInt() to make it an integer before you do the addition.
Then we use the join() method to glue the array back together into a string!
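The hard-coded index 3 only works for the first example. If the position of the number can vary (as in a-10-b-19), one option is to walk the array from the end and bump the first all-digit piece you find. A sketch, with incrementLastNumericPart as a hypothetical name:

```javascript
// Hypothetical helper: splits on "-", increments the last piece that
// consists only of digits, and joins everything back together.
function incrementLastNumericPart(str) {
  var parts = str.split('-');
  for (var i = parts.length - 1; i >= 0; i--) {
    if (/^\d+$/.test(parts[i])) {
      parts[i] = String(parseInt(parts[i], 10) + 1);
      break; // only the last number should change
    }
  }
  return parts.join('-');
}

console.log(incrementLastNumericPart('a-10-b-19-c')); // a-10-b-20-c
console.log(incrementLastNumericPart('a-10-b-19'));   // a-10-b-20
```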
Try this one:
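var str='a-10-b-19-c';
var pattern=/\d+/g;
var matches, last=null;
// Walk through every run of digits; after the loop, last holds the final match.
while((matches=pattern.exec(str))!==null)
{
last=matches;
}
// Splice by index so an earlier identical number (e.g. 'a-19-b-19') is not replaced by mistake.
var newStr=last===null ? str : str.slice(0, last.index)+(parseInt(last[0], 10)+1)+str.slice(last.index+last[0].length);
console.log(newStr);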
The code outputs a-10-b-20-c
I have a regex problem :(
What I would like to do is find the contents between two or more numbers.
var string = "90+*-+80-+/*70"
I'm trying to edit the symbols in between so that only the last symbol shows up and not the ones before it, so the above variable should be turned into 90+80*70. Although this is just an example, I have no idea how to do this. The length of the numbers, how many "sets" of numbers there are, and the length of the symbols in between could be anything.
many thanks,
Steve,
The trick is in matching '90+*-+' and '80-+/*' separately, and selecting only the number and the last symbol.
The expression for finding a number followed by 1 or more non-numbers would be
\d+[^\d]+
To select the number and the last non-number, add parens:
(\d+)[^\d]*([^\d])
Finally add a /g to repeat the procedure for each match, and replace it with the 2 matched groups for each match:
js> '90+*-+80-+/*70'.replace(/(\d+)[^\d]*([^\d])/g, '$1$2');
90+80*70
js>
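The three steps above can be checked one at a time in a console. This sketch just prints what each intermediate pattern matches on the string from the question; the variable names are illustrative.

```javascript
var input = '90+*-+80-+/*70';

// Step 1: a number followed by one or more non-digits.
var chunks = input.match(/\d+[^\d]+/g);
console.log(chunks.join(' | ')); // 90+*-+ | 80-+/*

// Step 2: capture the number and only the LAST non-digit of the run.
var first = /(\d+)[^\d]*([^\d])/.exec(input);
console.log(first[1], first[2]); // 90 +

// Step 3: apply it globally and keep just the two captured groups.
var cleaned = input.replace(/(\d+)[^\d]*([^\d])/g, '$1$2');
console.log(cleaned); // 90+80*70
```

The trailing 70 is left alone because no non-digit follows it.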
Or you can use lookahead assertion and simply remove all non-numerical characters which are not last: "90+*-+80-+/*70".replace(/[^0-9]+(?=[^0-9])/g,'');
You can use a regular expression to match the non-digits and a callback function to process the match and decide what to replace:
var test = "90+*-+80-+/*70";
var out = test.replace(/[^\d]+/g, function(str) {
return(str.substr(-1));
})
alert(out);
See it work here: http://jsfiddle.net/jfriend00/Tncya/
This works by using a regular expression to match sequences of non-digits and then replacing that sequence of non-digits with the last character in the matched sequence.
I would use this tutorial first, then review this for JavaScript-specific regex questions.
This should do it -
var string = "90+*-+80-+/*70"
var result = '';
var arr = string.split(/(\d+)/)
for (var i = 0; i < arr.length; i++) {
if (!isNaN(arr[i])) result = result + arr[i];
else result = result + arr[i].slice(arr[i].length - 1, arr[i].length);
}
alert(result);
Working demo - http://jsfiddle.net/ipr101/SA2pR/
Similar to #Arnout Engelen
var string = "90+*-+80-+/*70";
string = string.replace(/(\d+)[^\d]*([^\d])(?=\d+)/g, '$1$2');
This was my first thought on how the regex should behave. It also looks ahead to make sure the non-digit run is followed by another digit, which is what the question asked for (between two numbers).
Similar to #jfriend00
var string = "90+*-+80-+/*70";
string = string.replace( /(\d+?)([^\d]+?)(?=\d+)/g
, function(){
return arguments[1] + arguments[2].substr(-1);
});
Instead of only matching on non-digits, it matches on non-digits between two numbers, which is what the question asked
Why would this be any better?
If your equation was embedded in a paragraph or string of text. Like:
This is a test where I want to clean up something like 90+*-+80-+/*70 and don't want to scrap the whole paragraph.
Result (Expected) :
This is a test where I want to clean up something like 90+80*70 and don't want to scrap the whole paragraph.
Why would this not be any better?
There is more pattern matching, which makes it theoretically slower (negligible)
It would fail if your paragraph had embedded numbers. Like:
This is a paragraph where Sally bought 4 eggs from the supermarket, but only 3 of them made it back in one piece.
Result (Unexpected):
This is a paragraph where Sally bought 4 3 of them made it back in one piece.
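One possible way around the embedded-numbers problem is to stop matching arbitrary non-digits and only collapse runs of operator characters. This is a sketch under the assumption that the only symbols of interest are +, -, * and /; cleanOperators is a hypothetical name and the character set would need adjusting for other operators.

```javascript
// Hypothetical helper: between two digits, collapses a run of + - * /
// down to its last character. Ordinary text between numbers is untouched.
function cleanOperators(text) {
  return text.replace(/(\d)[+\-*\/]*([+\-*\/])(?=\d)/g, '$1$2');
}

var equation = "I want to clean up something like 90+*-+80-+/*70 and keep the rest.";
var sally = "Sally bought 4 eggs from the supermarket, but only 3 of them made it back.";

console.log(cleanOperators(equation)); // I want to clean up something like 90+80*70 and keep the rest.
console.log(cleanOperators(sally) === sally); // true
```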