JavaScript regular expression matching prior and trailing characters

I have this string in an object:
<FLD>dsfgsdfgdsfg;NEW-7db5-32a8-c907-82cd82206788</FLD><FLD>dsfgsdfgsd;NEW-480e-e87c-75dc-d70cd731c664</FLD><FLD>dfsgsdfgdfsgfd;NEW-0aad-440a-629c-3e8f7eda4632</FLD>
this.model.get('value_long').match(/[<FLD>\w+;](NEW[-|\d|\w]+)[</FLD>]/g)
Returns:
[";NEW-7db5-32a8-c907-82cd82206788<", ";NEW-480e-e87c-75dc-d70cd731c664<", ";NEW-0aad-440a-629c-3e8f7eda4632<"]
What is wrong with my regular expression that it is picking up the preceding ; and the trailing <?
Here is a link to the regex:
http://regexr.com?30k3m
Updated:
This is what I would like returned:
["NEW-7db5-32a8-c907-82cd82206788", "NEW-480e-e87c-75dc-d70cd731c664", "NEW-0aad-440a-629c-3e8f7eda4632"]
Here is a JSFiddle for it:
http://jsfiddle.net/mwagner72/HHMLK/

Square brackets create a character class, which you do not want here. Try changing your regex to the following:
<FLD>\w+;(NEW[-\d\w]+)</FLD>
Since it looks like you want to grab the capture group from each match, you can use the following code to construct an array with the capture group in it:
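var regex = /<FLD>\w+;(NEW[\-\d\w]+)<\/FLD>/g;
var match = regex.exec(string); // string holds the input text
var matches = [];
while (match !== null) {
  matches.push(match[1]); // match[1] is the first capture group
  match = regex.exec(string); // with the g flag, exec resumes after the previous match
}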
[<FLD>\w+;] matches a single character out of those inside the square brackets, whereas you actually want to match that whole sequence. Also, in the other character class, [-|\d|\w], you can remove the |: a character class already matches any one of its members, so | there is just a literal pipe. | is only needed for alternation outside of a character class, for example inside a group.
Here is an updated link with the new regex: http://jsfiddle.net/RTkzx/1
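As a rough sketch of the same extraction, assuming an environment that supports String.prototype.matchAll (ES2020), the capture groups can also be collected without a manual exec loop. The sample string is copied from the question; the variable names are illustrative only.

```javascript
// Sample input copied from the question; variable names are illustrative.
const fldInput =
  '<FLD>dsfgsdfgdsfg;NEW-7db5-32a8-c907-82cd82206788</FLD>' +
  '<FLD>dsfgsdfgsd;NEW-480e-e87c-75dc-d70cd731c664</FLD>' +
  '<FLD>dfsgsdfgdfsgfd;NEW-0aad-440a-629c-3e8f7eda4632</FLD>';

// <FLD> and </FLD> are matched as literal sequences, not as character classes.
// \w already covers digits, so \d is not needed inside the class.
const fldRegex = /<FLD>\w+;(NEW[-\w]+)<\/FLD>/g;

// matchAll requires the g flag and yields one match array per occurrence;
// index 1 of each array is the first capture group.
const fldIds = Array.from(fldInput.matchAll(fldRegex), (m) => m[1]);

console.log(fldIds);
```

Unlike match with the g flag, which returns only the full matches, matchAll keeps the capture groups for every occurrence.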

Related

select the key and the value without the equal using regex

I have a string like this:
"let key1=value1; let key2=value2;"
I want to capture the key and the value as groups using a regex. I tried using a lookaround:
/(\w+)(?=\=)(\w+);/g
but it doesn't work for me. Any suggestions?
The following regex should do the trick: (let (\w+) ?= ?(\w+);?)+.
Each let statement will be a separate match, with the key in group 2 and the value in group 3.
The (?=\=) part is a lookahead, a zero-width assertion: it does not consume text but requires it to be present immediately to the right. With (?=\=)(\w+), the \w+ pattern has to start matching at the = character. As \w does not match =, your regex always fails.
Use
/(\w+)=(\w+);/g
JavaScript (borrowed from How do you access the matched groups in a JavaScript regular expression?):
var myString = "let key1=value1; let key2=value2;";
var myRegexp = /(\w+)=(\w+);/g;
var match = myRegexp.exec(myString);
while (match != null) {
console.log(match[1] + "," + match[2]);
match = myRegexp.exec(myString);
}
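To also allow optional spaces around = and an optional trailing semicolon, we could use (\w+) ?= ?(\w+);? (without a trailing +, since JavaScript does not support the possessive ?+ quantifier and rejects it as a syntax error).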
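The difference between asserting = and consuming it can be checked directly. The following is a minimal sketch, assuming String.prototype.matchAll is available; the variable names are made up for illustration.

```javascript
const letSource = 'let key1=value1; let key2=value2;';

// The lookahead asserts that "=" comes next but does not consume it, so the
// following \w+ would have to start at "=" and can never match.
const withLookahead = letSource.match(/(\w+)(?=\=)(\w+);/g);

// Consuming "=" explicitly lets both groups match.
const pairs = {};
for (const m of letSource.matchAll(/(\w+)=(\w+);/g)) {
  pairs[m[1]] = m[2]; // group 1 is the key, group 2 is the value
}

console.log(withLookahead); // null
console.log(pairs);
```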

Javascript Regex Word Boundary with optional non-word character

I am looking to find a keyword match in a string. I am trying to use word boundary, but this may not be the best case for that solution. The keyword could be any word, and could be preceded with a non-word character. The string could be any string at all and could include all three of these words in the array, but I should only match on the keyword:
['hello', '#hello', '#hello'];
Here is my code, which includes an attempt found in post:
let userStr = 'why hello there, or should I say #hello there?';
let keyword = '#hello';
let re = new RegExp(`/(#\b${userStr})\b/`);
re.exec(keyword);
This would be great if the string always started with #, but it does not.
I then tried this /(#?\b${userStr})\b/, but if the string does start with #, it tries to match ##hello.
The matchThis str could be any of the 3 examples in the array, and the userStr may contain several variations of the matchThis but only one will be exact
You need to account for three things here:
1) A \b word boundary is a context-dependent construct; if your input is not always alphanumeric-only, you need unambiguous word boundaries.
2) You need to double the backslashes when writing a pattern as a string for the RegExp constructor (\\W, not \W).
3) As you pass a variable to a regex, you need to make sure all special chars in it are properly escaped.
Use
let userStr = 'why hello there, or should I say #hello there?';
let keyword = '#hello';
let re_pattern = `(?:^|\\W)(${keyword.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')})(?!\\w)`;
let res = [], m;
// To find a single (first) match
console.log((m=new RegExp(re_pattern).exec(userStr)) ? m[1] : "");
// To find multiple matches:
let rx = new RegExp(re_pattern, "g");
while (m=rx.exec(userStr)) {
res.push(m[1]);
}
console.log(res);
Pattern description
(?:^|\\W) - a non-capturing group matching the start of string or any non-word char
(${keyword.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')}) - Group 1: a keyword value with escaped special chars
(?!\\w) - a negative lookahead that fails the match if there is a word char immediately to the right of the current location.
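To see why \b alone is not enough, here is a small sketch that wraps the pattern above in two helpers. The names escapeRegExp and findKeyword are hypothetical and not part of any library.

```javascript
// Escape characters that are special in a regular expression so the
// keyword is matched literally.
function escapeRegExp(text) {
  return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Hypothetical helper: returns every occurrence of `keyword` that is
// preceded by the start of string or a non-word char and not followed
// by a word char.
function findKeyword(haystack, keyword) {
  const pattern = `(?:^|\\W)(${escapeRegExp(keyword)})(?!\\w)`;
  return Array.from(haystack.matchAll(new RegExp(pattern, 'g')), (m) => m[1]);
}

const sentence = 'why hello there, or should I say #hello there?';

// \b needs a word character on exactly one side, so it cannot sit
// between a space and "#".
console.log(/\b#hello\b/.test(sentence)); // false
console.log(findKeyword(sentence, '#hello')); // [ '#hello' ]
```

Note that with this pattern a bare keyword such as hello is also found inside #hello, because # is itself a non-word character; whether that is acceptable depends on the use case.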
Check whether the keyword already begins with a special character. If it does, don't include it in the regular expression.
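var re;
// Backslashes must be doubled inside a template literal; a single \b would be a backspace character.
if ("##".indexOf(keyword[0]) == -1) {
  // Keyword starts with an ordinary character: allow an optional leading special character.
  re = new RegExp(`[##]?\\b${keyword}\\b`);
} else {
  // Keyword already starts with a special character; a leading \b there would only match after a word character.
  re = new RegExp(`${keyword}\\b`);
}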

Get all characters not matching the Reg expression Pattern in Javascript

I have the requirement below, where entered text may contain only the allowed characters listed below, and I need to get all characters that do not match the pattern.
0-9
A-Z,a-z
And special characters like:
space,.#,-_&()'/*=:;
carriage return
end of line
The regular expression which I could construct is as below
/[^a-zA-Z0-9\ \.#\,\r\n*=:;\-_\&()\'\/]/g
For a given example, say input='123.#&-_()/*=:/\';#$%^"~!?[]av'. The invalid characters are '$%^"~!?[]' (# is in the allowed list).
Below is the approach I followed to get the not matched characters.
1) Construct the negation of allowed reg expn pattern like below.
/^([a-zA-Z0-9\ \.#\,\r\n*=:;\-_\&()\'\/])/g (please correct if this reg exp is right?)
2) Use replace function to get all characters
var nomatch = '';
for (var index = 0; index < input.length; index++) {
nomatch += input[index].replace(/^([a-zA-Z0-9\ \.#\,\r\n*=:;\-_\&()\'\/])/g, '');
}
so that finally nomatch = '$%^"~!?[]'
But here the replace function always returns a single unmatched character, so I use a loop to collect them all. If the input is 100 characters long, it loops 100 times, which is unnecessary.
Is there a better approach to get all characters not matching the pattern, along these lines?
A better regular expression to get the disallowed characters (than the negation I used above)?
Avoid unnecessary looping?
A single-line approach?
Many thanks for any help on this.
You can simplify it by inverting the approach: replace every run of allowed characters with an empty string, so that only the disallowed characters are left in the output:
var re = /[\w .#,\r\n*=:;&()'\/-]+/g;
var input = '123.#&-_()/*=:/\';#$%^"~!?[]av';
// Strip every run of allowed characters; only disallowed ones remain.
var invalid = input.replace(re, '');
console.log(invalid);
//=> $%^"~!?[]
Also note that many special characters don't need to be escaped inside a character class.
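An alternative sketch, not taken from the answer above, is to negate the character class and collect the matches directly. The sample input below is made up for illustration.

```javascript
// Negated class: matches any single character that is NOT allowed.
// The dash is placed last so it is read literally, and the slash is
// escaped only because this is a regex literal.
const notAllowed = /[^\w .#,\r\n*=:;&()'\/-]/g;

// Hypothetical sample input.
const sample = 'abc 123 $%^ ok?';

// match() with the g flag returns all matches, or null when there are none.
const rejected = (sample.match(notAllowed) || []).join('');

console.log(rejected); // "$%^?"
```

The `|| []` fallback keeps the one-liner safe for input that contains only allowed characters.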

Javascript regular expression between brackets

Let's say in the following text
I want [this]. I want [this too]. I don't want \[this]
I want the contents of anything between [] but not \[]. How would I go about doing that? So far I've got /\[([^\]]+)\]/gi, but it matches everything.
Use this one: /(?:^|[^\\])\[(.*?)\]/gi
Here's a working example: http://regexr.com/3clja
?: Non-capturing group
^|[^\\] Beginning of the string or any character except \
\[(.*?)\] Match anything between []
Here's a snippet:
var string = "[this i want]I want [this]. I want [this too]. I don't want \\[no]";
var regex = /(?:^|[^\\])\[(.*?)\]/gi;
var match = null;
document.write(string + "<br/><br/><b>Matches</b>:<br/> ");
while(match = regex.exec(string)){
document.write(match[1] + "<br/>");
}
Use this regexp, which first matches the \[] version (but doesn't capture it, thereby "throwing it away"), then the [] cases, capturing what's inside:
var r = /\\\[.*?\]|\[(.*?)\]/g;
//       ^^^^^^^^^            MATCH \[this] (not captured)
//                 ^^^^^^^^^  MATCH [this] (captured in group 1)
Loop with exec to get all the matches:
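var match;
while ((match = r.exec(str)) !== null) {
  // match[1] is undefined when the escaped \[...] alternative matched
  if (match[1] !== undefined) console.log(match[1]);
}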
/(?:[^\\]|^)\[([^\]]*)/g
The content is in the first capture group, $1
(?:^|[^\\]) matches the beginning of the string or any character that's not a backslash, non-capturing.
\[ matches an opening bracket.
([^\]]*) captures any number of consecutive characters that are not closing brackets
\] matches a closing bracket
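A different technique from the answers above is a negative lookbehind. This sketch assumes an engine with ES2018 lookbehind support, which older browsers lack; the variable names are illustrative.

```javascript
// Sample text from the question; "\\[" in source is a literal backslash + "[".
const bracketText = "I want [this]. I want [this too]. I don't want \\[this]";

// (?<!\\) is a negative lookbehind: the "[" must not be preceded by a backslash.
// Being zero-width, it does not consume the preceding character.
const unescapedBrackets = /(?<!\\)\[(.*?)\]/g;

const contents = Array.from(bracketText.matchAll(unescapedBrackets), (m) => m[1]);

console.log(contents); // [ 'this', 'this too' ]
```

Because the lookbehind consumes nothing, adjacent groups such as [a][b] are both found, whereas a pattern starting with (?:^|[^\\]) has to consume the character before each opening bracket and therefore skips the second group.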

Regex match Array words with dash

I want to match some keywords in the url
var parentURL = document.referrer;
var greenPictures = /redwoods-are-big.jpg|greendwoods-are-small.jpg/;
var existsGreen = greenPictures.test(parentURL);
existsGreen becomes true when it finds greendwoods-are-small.jpg, but also when it finds small.jpg.
What can I do so that it is only true if there is exactly greendwoods-are-small.jpg?
You can use ^ to match the beginning of a string and $ to match the end:
var greenPictures = /^(redwoods-are-big.jpg|greendwoods-are-small.jpg)$/;
var existsGreen = greenPictures.test(parentURL);
But of course document.referrer is not equal to either redwoods-are-big.jpg or greendwoods-are-small.jpg, so I would match a / followed by the file name at the end of the URL:
var greenPictures = /\/(redwoods-are-big\.jpg|greendwoods-are-small\.jpg)$/; // <-- See how I escaped the / and the . there? (\/ and \.)
var existsGreen = greenPictures.test(parentURL);
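A quick way to check the anchored pattern is to run it against a few sample values. The referrer URLs below are hypothetical and only serve as illustration.

```javascript
// Matches only when the URL ends with "/" followed by one of the two file names.
const greenPicturesStrict = /\/(redwoods-are-big\.jpg|greendwoods-are-small\.jpg)$/;

// Hypothetical referrer values, for illustration only.
const referrers = [
  'http://example.com/img/greendwoods-are-small.jpg',
  'http://example.com/img/small.jpg',
  'http://example.com/img/redwoods-are-bigXjpg',
];

const strictResults = referrers.map((url) => greenPicturesStrict.test(url));
console.log(strictResults); // [ true, false, false ]

// An unescaped "." matches any character, so this looser pattern accepts "bigXjpg".
console.log(/redwoods-are-big.jpg/.test(referrers[2])); // true
```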
Try this regex:
/(redwoods-are-big|greendwoods-are-small)\.jpg/i
I used the i flag for ignoring the character cases in parentURL variable.
Demo
http://regex101.com/r/aI4yJ6
Dashes do not have any special meaning outside character sets, where they denote ranges, e.g.:
[a-f], [^x-z] etc.
The characters with special meaning in your regexp are | and .
/redwoods-are-big.jpg|greendwoods-are-small.jpg/
| denotes either or.
. matches any character except the newline characters \n \r \u2028 or \u2029.
In other words: There is something else iffy going on in your code.
More on RegExp.
Pages like these can be rather helpful if you struggle with writing regexp's:
regex101 (with sample)
RegexPlanet
RegExr
Debuggex
etc.
