Need a regex that finds "string" but not "[string]" - javascript

I'm trying to build a regular expression that parses a string and skips things in brackets.
Something like
string = "A bc defg hi [hi] jkl mnop.";
The .match() should return "hi" but not [hi]. I've spent 5 hours running through RE's but I'm throwing in the towel.
Also this is for javascript or jquery if that matters.
Any help is appreciated. Also I'm working on getting my questions formatted correctly : )
EDIT:
Ok I just had a eureka moment and figured out that the original RegExp I was using actually did work. But when I was replacing the matches with the [matches], it simply replaced the first match in the string... over and over. I thought this was my regex refusing to skip the brackets, but after spending a lot of time trying almost all of the solutions below, I realized that I was derping hardcore.
When .replace was working its magic it was on the first match, so I quite simply added a space to the end of the result word as follows:
var result = string.match(regex);
var modifiedResult = '[' + result[0].toString() + ']';
string = string.replace(result[0].toString() + ' ', modifiedResult + ' '); // replace() returns a new string, so assign it back
This got it to stop targeting the original word in the string and stop adding a new set of brackets to it with every match. Thank you all for your help. I am going to give answer credit to the post that prodded me in the right direction.

Preprocess the target string by removing everything between brackets before trying to match your RE:
string = "A bc defg hi [hi] jkl mnop."
tmpstring = string.replace(/\[.*\]/, "")
Then apply your RE to tmpstring.
Correction: made the bracket match lazy (non-greedy) per nhahtd's comment below, and also made the RE global:
string = "A bc defg hi [hi] jkl mnop."
tmpstring = string.replace(/\[.*?\]/g, "")
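The two steps above (strip the bracketed spans, then match) can be put together roughly like this. It is only a sketch: the helper name `matchOutsideBrackets` is made up for illustration, and it assumes brackets are never nested.

```javascript
// Hypothetical helper: remove every [bracketed] span, then run the real
// regex against what is left. Assumes brackets are not nested.
function matchOutsideBrackets(text, regex) {
  var stripped = text.replace(/\[.*?\]/g, "");
  return stripped.match(regex);
}

var found = matchOutsideBrackets("A bc defg hi [hi] jkl mnop.", /hi/g);
console.log(found); // one match: the "hi" that sits outside the brackets
```

Like any call to .match() with the g flag, the helper returns null rather than an empty array when nothing matches.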

You don't necessarily need regex for this. Simply use string manipulation:
var arr = string.split("[");
var final = arr[0] + arr[1].split("]")[1];
If there are multiple bracketed expressions, use a loop:
while (string.indexOf("[") != -1){
var arr = string.split("[");
string = arr[0] + arr.slice(1).join("[").split("]").slice(1).join("]");
}

Using only Regular Expressions, you can use:
hi(?!])
as an example.
Look here about negative lookahead: http://www.regular-expressions.info/lookaround.html
Unfortunately, JavaScript did not support negative lookbehind at the time of writing (it was added later, in ES2018).
I used http://regexpal.com/ to test, abcd[hi]jkhilmnop as test data, hi(?!]) as the regex to find. It matched 'hi' without matching '[hi]'. Basically it matched the 'hi' so long as there was not a following ']' character.
This of course, can be expanded if needed. This has a benefit of not requiring any pre-processing for the string.
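Lookbehind was later added to the language in ES2018, so on engines that support it the lookahead can be paired with a lookbehind and both brackets get checked. A minimal sketch using the same test data as above, assuming an ES2018 engine (older browsers will reject the pattern with a syntax error):

```javascript
// Requires ES2018 lookbehind support (an assumption about the runtime).
// Matches "hi" unless it is preceded by "[" or followed by "]".
var outsideBrackets = /(?<!\[)hi(?!\])/g;
var data = "abcd[hi]jkhilmnop";
var hits = data.match(outsideBrackets);
console.log(hits); // only the "hi" inside "jkhil"
```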

r"\[(.*)\]"
Just play around with this if you want to use regular expressions.

What do you want to do with it? If you want to selectively replace parts like "hi" except when it's "[hi]", then I often use a system where I match what I want to avoid first and then what I want to match; if it matches what I want to avoid then I return the match unchanged, otherwise I return the processed match.
Like this:
return string.replace(/(\[\w+\])|(\w+)/g, function(all, m1, m2) {return m1 || m2.toUpperCase()});
which, with the given string, returns:
"A BC DEFG HI [hi] JKL MNOP."
Thus: it replaces every word with uppercase (m1 is empty), except if the word is between square brackets (m1 is not empty).
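The same avoid-first alternation also covers the original goal of wrapping a word in brackets without touching occurrences that are already bracketed. This is a sketch rather than the asker's code; the function name `bracketWord` is hypothetical, and the word is assumed to contain only \w characters.

```javascript
// Hypothetical helper: wrap every unbracketed occurrence of `word` in [ ].
// Assumes `word` consists only of \w characters (no regex escaping is done).
function bracketWord(text, word) {
  var re = new RegExp("(\\[\\w+\\])|\\b(" + word + ")\\b", "g");
  return text.replace(re, function (all, bracketed, bare) {
    // Already-bracketed text is returned unchanged; bare words get wrapped.
    return bracketed || "[" + bare + "]";
  });
}

console.log(bracketWord("A bc defg hi [hi] jkl hi mnop.", "hi"));
// A bc defg [hi] [hi] jkl [hi] mnop.
```

Because bracketed words are matched first and passed through, running the function a second time on its own output does not add another layer of brackets.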

This builds an array of all the strings contained in [ ]:
var regex = /\[([^\]]*)\]/g; // the g flag is required, otherwise exec() never advances and the loop never ends
var string = "A bc defg hi [hi] [jkl] mnop.";
var results=[], result;
while(result = regex.exec(string))
results.push(result[1]);
edit
To answer the question: this replace returns the string with everything in [ ] removed, and trims the extra whitespace:
"A bc defg [hi] mnop [jkl].".replace(/(\s{0,1})\[[^\]]*\](\s{0,1})/g,'$1')

Instead of skipping the match you can probably try something different - match everything but do not capture the string within square brackets (inclusive) with something like this:
var s = "A [bc] [defg] hi [hi] j[kl]u m[no]p.";
var r = /(?:\[.*?[^\[\]]\])|(.)/g;
var result;
var str = [];
while((result = r.exec(s)) !== null){
if(result[1] !== undefined){ // true only for a character outside [string]; bracketed spans match but are not captured
str.push(result[1]);
}
}
console.log(str.join(''));
The last line will print parts of the string which do not match the [string] pattern. For example, when called with the input "A [bc] [defg] hi [hi] j[kl]u m[no]p." the code prints "A hi ju mp." with whitespaces intact.
You can try different things with this code e.g. replacing etc.

Related

RegExp match a single quoted text without quotes - JavaScript

I'm sorry if it is a confusing question. I was trying to find a way to do this but couldn't find it so, if it is a repeated question, my apologies!
I have a text something like this: something:"askjnqwe234"
I want to be able to get askjnqwe234 using a RegExp. You can notice I want to omit the quotes. I was trying this using /[^"]+(?=(" ")|"$)/g but it returns an array. I want a RegExp to return a single string, not an array.
I don't know if it's possible but I do not want to specify the position of the array; something like this:
var x = string.match(/[^"]+(?=(" ")|"$)/g)[0];
Thanks!
Try:
/"([^"]*)"/g
In English: look for a ", then match and capture anything that isn't a " until you see another ".
match and exec always return an array or null, so, assuming you have a single double-quoted value and no newlines in the string, you could use
var x;
var str = 'something:"askjnqwe234"';
x = str.replace( /^[^"]*"|".*/g, '' );
// "askjnqwe234"
Or, if you may have other quoted values in the string
x = str.replace( /.*?something:"([^"]*)".*/, '$1' );
where $1 refers to the substring captured by the sub-pattern [^"]* between the ().
Further explanation on request.
Notwithstanding the above, I recommend that you tolerate the array indexing and just use match.
You can capture the information inside quotes like this, assuming it matches:
var x = string.match(/something:"([^"]*)"/)[1];
The capture group at index 1 is the part inside the double quotes.
If you're not sure it will match:
var match = string.match(/something:"([^"]*)"/);
if (match) {
// use match[1] here
}
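If the array indexing is the main annoyance, it can be hidden in a small wrapper that returns either the string or null. A sketch only: the helper name `quotedValue` is made up, and the key is assumed to contain only word characters.

```javascript
// Hypothetical helper: return the text inside key:"..." or null if absent.
// Assumes `key` contains only word characters, so it needs no regex escaping.
function quotedValue(text, key) {
  var match = text.match(new RegExp(key + ':"([^"]*)"'));
  return match ? match[1] : null;
}

console.log(quotedValue('something:"askjnqwe234"', "something")); // askjnqwe234
console.log(quotedValue("nothing quoted here", "something"));     // null
```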

How to detect a series of characters in a string?

For example, I have a string:
"This is the ### example"
I would like to substring the ### out of the above string?
The number of Hash keys may vary, so I would like to find out and replace the ### pattern with, say, 001 for example.
Can anybody help?
You can also do a replace. I am familiar with the C# version of this,
string stringValue = "This is the ### example";
stringValue = stringValue.Replace("###", ""); // Replace returns a new string, so assign it back
This would remove ### completely from the above string. Again you would have to know the exact string.
In JavaScript, it's similar - .replace (with a lowercase r) is used. So:
var stringValue = "This is the ### example";
var replacedValue = stringValue.replace('###', '');
You'll want to investigate either "Regular Expressions" for this, or, if you know the precise position and length of the characters you are interested in, you can simply use String's .substring method.
If you want to capture multiple # characters, then you'll need regular expressions:
var myString = "This is #### the example";
var result = myString.replace(/#+/g, '');
If you want to remove the space too, you can use the regex /#+\s|\s#+|#+/.
If the rest of the string is known, just get the part that you need:
var example = str.substr(12, str.length - 20);
The javascript match method will return an array of substrings matching a regular expression. You can use this to determine the number of matching characters to be replaced. Assuming you want to replace each octothorpe with a random digit, you could use code like this:
var exampleStr = "This is the ### example";
var swapThese = exampleStr.match(/#/g);
if (swapThese) {
for (var i=0;i<swapThese.length;i++) {
var swapThis = new RegExp(swapThese[i]);
exampleStr = exampleStr.replace(swapThis,Math.floor(Math.random()*10)); // *10 yields a digit from 0 to 9
}
}
alert(exampleStr); // or whatever you want to do with it
Note that the code only loops the length of the array if it's present: if (swapThese) {
This check is necessary because if the match method finds no matches, it returns null rather than an empty array. Trying to iterate through null value will break.
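For the original goal of turning ### into something like 001, a replacement callback can size the number to the length of each run of hashes. This is a sketch that assumes ES2017's padStart is available; `fillHashes` is a made-up name.

```javascript
// Hypothetical helper: replace each run of "#" with `number`, left-padded
// with zeros to the length of that run.
// Requires String.prototype.padStart (ES2017).
function fillHashes(text, number) {
  return text.replace(/#+/g, function (run) {
    return String(number).padStart(run.length, "0");
  });
}

console.log(fillHashes("This is the ### example", 1));    // This is the 001 example
console.log(fillHashes("This is the ##### example", 42)); // This is the 00042 example
```

If the number has more digits than there are hashes, padStart leaves it unpadded, so the number is inserted in full.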

getting contents of string between digits

have a regex problem :(
What I would like to do is find the contents between two or more numbers.
var string = "90+*-+80-+/*70"
I'm trying to edit the symbols in between so that only the last symbol shows up and not the ones before it, i.e. to turn the above variable into 90+80*70. This is just an example and I have no idea how to do this; the length of the numbers, how many "sets" of numbers there are, and the length of the symbol runs in between could be anything.
many thanks,
Steve,
The trick is in matching '90+*-+' and '80-+/*' separately, and selecting only the number and the last symbol.
The expression for finding a number followed by 1 or more non-numbers would be
\d+[^\d]+
To select the number and the last non-number, add parens:
(\d+)[^\d]*([^\d])
Finally add a /g to repeat the procedure for each match, and replace it with the 2 matched groups for each match:
js> '90+*-+80-+/*70'.replace(/(\d+)[^\d]*([^\d])/g, '$1$2');
90+80*70
js>
Or you can use lookahead assertion and simply remove all non-numerical characters which are not last: "90+*-+80-+/*70".replace(/[^0-9]+(?=[^0-9])/g,'');
You can use a regular expression to match the non-digits and a callback function to process the match and decide what to replace:
var test = "90+*-+80-+/*70";
var out = test.replace(/[^\d]+/g, function(str) {
return(str.substr(-1));
})
alert(out);
See it work here: http://jsfiddle.net/jfriend00/Tncya/
This works by using a regular expression to match sequences of non-digits and then replacing that sequence of non-digits with the last character in the matched sequence.
i would use this tutorial, first, then review this for javascript-specific regex questions.
This should do it -
var string = "90+*-+80-+/*70"
var result = '';
var arr = string.split(/(\d+)/)
for (var i = 0; i < arr.length; i++) {
if (!isNaN(arr[i])) result = result + arr[i];
else result = result + arr[i].slice(arr[i].length - 1, arr[i].length);
}
alert(result);
Working demo - http://jsfiddle.net/ipr101/SA2pR/
Similar to @Arnout Engelen
var string = "90+*-+80-+/*70";
string = string.replace(/(\d+)[^\d]*([^\d])(?=\d+)/g, '$1$2');
This was my first thinking of how the RegEx should perform, it also looks ahead to make sure the non-digit pattern is followed by another digit, which is what the question asked for (between two numbers)
Similar to @jfriend00
var string = "90+*-+80-+/*70";
string = string.replace( /(\d+?)([^\d]+?)(?=\d+)/g
, function(){
return arguments[1] + arguments[2].substr(-1);
});
Instead of only matching on non-digits, it matches on non-digits between two numbers, which is what the question asked
Why would this be any better?
If your equation was embedded in a paragraph or string of text. Like:
This is a test where I want to clean up something like 90+*-+80-+/*70 and don't want to scrap the whole paragraph.
Result (Expected) :
This is a test where I want to clean up something like 90+80*70 and don't want to scrap the whole paragraph.
Why would this not be any better?
There is more pattern matching, which makes it theoretically slower (negligible)
It would fail if your paragraph had embedded numbers. Like:
This is a paragraph where Sally bought 4 eggs from the supermarket, but only 3 of them made it back in one piece.
Result (Unexpected):
This is a paragraph where Sally bought 4 3 of them made it back in one piece.
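One way to sidestep the embedded-numbers problem is to match only operator characters between the digits instead of any non-digit. A sketch under the assumption that the only operators are + - * and /; the name `collapseOperators` is hypothetical.

```javascript
// Sketch: collapse a run of arithmetic operators between two digits down to
// the last operator. Only + - * / count as operators (an assumption).
function collapseOperators(text) {
  return text.replace(/(\d)[+\-*\/]*([+\-*\/])(?=\d)/g, "$1$2");
}

console.log(collapseOperators("90+*-+80-+/*70")); // 90+80*70
console.log(collapseOperators("Sally bought 4 eggs, but only 3 made it back."));
// unchanged, because there are no operators between the 4 and the 3
```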

java script Regular Expressions patterns problem

My problem starts like this:
var str='0|31|2|03|.....|4|2007'
str=str.replace(/[^|]\d*[^|]/,'5');
so the output becomes "0|5|2|03|....|4|2007", i.e. it replaces 31 with 5.
But this doesn't work for replacing other segments when I change the code like this:
str=str.replace(/[^|]{2}\d*[^|]/,'6');
It doesn't change 2 to 6.
What am I actually missing here? Any help?
I think a regular expression is a bad solution for that problem. I'd rather do something like this:
var str = '0|31|2|03|4|2007';
var segments = str.split("|");
segments[1] = "35";
segments[2] = "123";
Can't think of a good way to solve this with a regexp.
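The split approach can be wrapped up so that the string is rebuilt after the change. A minimal sketch; `replaceSegment` is a made-up name, and leaving the string untouched for an out-of-range index is my own choice.

```javascript
// Hypothetical helper: replace the segment at a zero-based index and
// rebuild the pipe-separated string.
function replaceSegment(str, index, value) {
  var segments = str.split("|");
  if (index < 0 || index >= segments.length) {
    return str; // index out of range: leave the string unchanged
  }
  segments[index] = String(value);
  return segments.join("|");
}

console.log(replaceSegment("0|31|2|03|4|2007", 1, 5)); // 0|5|2|03|4|2007
console.log(replaceSegment("0|31|2|03|4|2007", 2, 6)); // 0|31|6|03|4|2007
```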
Here is a specific regex solution which replaces the number following the first | pipe symbol with the number 5:
var re = /^((?:\d+\|){1})\d+/;
return text.replace(re, '$15');
If you want to replace the digits following the third |, simply change the {1} portion of the regex to {3}
Here is a generalized function that will replace any given number slot (zero-based index), with a specified new number:
function replaceNthNumber(text, n, newnum) {
var re = new RegExp("^((?:\\d+\\|){"+ n +'})\\d+');
return text.replace(re, '$1'+ newnum);
}
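One thing to watch with the string form of the replacement: '$1' + newnum produces a pattern such as $15, which relies on the engine reading it as group 1 followed by a literal 5 because there is no group 15. A replacement callback avoids that ambiguity. A sketch, with the made-up name `replaceNthNumberCb`:

```javascript
// Variant of the function above that uses a replacement callback, so the
// new number is never parsed as part of a "$n" group reference.
function replaceNthNumberCb(text, n, newnum) {
  var re = new RegExp("^((?:\\d+\\|){" + n + "})\\d+");
  return text.replace(re, function (all, prefix) {
    // prefix holds the first n "number|" slots, kept as they are
    return prefix + newnum;
  });
}

console.log(replaceNthNumberCb("0|31|2|03|4|2007", 1, 5)); // 0|5|2|03|4|2007
console.log(replaceNthNumberCb("0|31|2|03|4|2007", 2, 6)); // 0|31|6|03|4|2007
```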
Firstly, you don't have to escape | in the character set, because it doesn't have any special meaning in character sets.
Secondly, you don't put quantifiers in character sets.
And finally, to create a global matching expression, you have to use the g flag.
[^\|] means anything but a '|', so in your case it only matches a digit. So it will only match anything with 2 or more digits.
Second you should put the {2} outside of the []-brackets
I'm not sure what you want to achieve here.

assign matched values from jquery regex match to string variable

I am doing it wrong. I know.
I want to assign the matched text that is the result of a regex to a string var.
basically the regex is supposed to pull out anything in between two colons
so blah:xx:blahdeeblah
would result in xx
var matchedString= $(current).match('[^.:]+):(.*?):([^.:]+');
alert(matchedString);
I am looking to get this to put the xx in my matchedString variable.
I checked the jquery docs and they say that match should return an array. (string char array?)
When I run this nothing happens, No errors in the console but I tested the regex and it works outside of js. I am starting to think I am just doing the regex wrong or I am completely not getting how the match function works altogether
"I checked the jquery docs and they say that match should return an array."
No such method exists for jQuery. match is a standard javascript method of a string. So using your example, this might be
var str = "blah:xx:blahdeeblah";
var matchedString = str.match(/([^.:]+):(.*?):([^.:]+)/);
alert(matchedString[2]);
// -> "xx"
However, you really don't need a regular expression for this. You can use another string method, split() to divide the string into an array of strings using a separator:
var str = "blah:xx:blahdeeblah";
var matchedString = str.split(":"); // split on the : character
alert(matchedString[1]);
// -> "xx"
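If a regex is still preferred, a single capture group between the two colons is enough, and checking for null avoids an error when there is no match. A sketch with a hypothetical helper name, `betweenColons`:

```javascript
// Hypothetical helper: return the text between the first two colons,
// or null when the string does not contain two colons.
function betweenColons(text) {
  var match = /:([^:]*):/.exec(text);
  return match ? match[1] : null;
}

console.log(betweenColons("blah:xx:blahdeeblah")); // xx
console.log(betweenColons("no colons here"));      // null
```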
String.match
String.split
