RegEx for matching the first word - javascript

I have the following prop {priority} that outputs ‘high priority’. Is there a way I can render it simply as ‘high’? Could I use standard JS or something like the below?
var getPriority = {priority};
var priority = getPriority.replace( regex );
console.log( priority );
How do I solve this problem?

If you wish to do that with a regular expression, this expression would do so, even if there might be a misspelling in the word "priority":
(.+)(\s[priorty]+)
It uses capturing groups: the first group captures the text before "priority", and the second matches a whitespace character followed by any run of the letters p, r, i, o, t and y, which is why misspellings still match. You can add boundaries to it if your input strings change.
const regex = /(.+)(\s[priorty]+)/gmi;
const str = `high priority
low priority
medium priority
under-processing pririty
under-processing priority
400-urget priority
400-urget Priority
400-urget PRIority`;
let m;
while ((m = regex.exec(str)) !== null) {
// This is necessary to avoid infinite loops with zero-width matches
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
// The result can be accessed through the `m`-variable.
m.forEach((match, groupIndex) => {
console.log(`Found match, group ${groupIndex}: ${match}`);
});
}
Performance Test
This JavaScript snippet shows the performance of that expression using a simple 1-million times for loop.
const repeat = 1000000;
const start = Date.now();
let match;
for (let i = repeat; i >= 0; i--) {
  const string = "high priority";
  const regex = /(.+)(\s[priorty]+)/gmi;
  // Replace the whole match with the first capture group.
  match = string.replace(regex, "$1");
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚💚💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");

You can use substring to get the required string. Note that a fixed length of 4 only works when the first word is "high":
var str = 'high priority';
console.log(str.substring(0, 4));
// expected output: "high"
So in your code:
var getPriority = {priority};
var priority = getPriority.priority.substring(0, 4);
console.log( priority );

You can get just the first word of the string using .split().
The code below prints the first word of the string:
var getPriority = {priority};
console.log( getPriority.priority.split(' ', 1)[0]);
Or, if the value always ends with the word "priority", you can drop it by using it as the separator for .split():
var getPriority = {priority};
console.log( getPriority.priority.split(' priority')[0] );
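As a regex alternative that does not depend on the word "priority" or on a fixed length, here is a minimal sketch that takes the first run of non-whitespace characters. The helper name `firstWord` is hypothetical, and the sketch assumes the prop value is a plain string.

```javascript
// Hypothetical helper: returns the first whitespace-delimited word of a string.
function firstWord(text) {
  // ^\s* skips leading whitespace; (\S+) captures the first run of non-whitespace.
  const match = /^\s*(\S+)/.exec(text);
  return match ? match[1] : '';
}

console.log(firstWord('high priority'));             // "high"
console.log(firstWord('under-processing priority')); // "under-processing"
console.log(firstWord(''));                          // ""
```

Returning an empty string when nothing matches is a design choice for this sketch; returning null may suit other callers better.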

Related

Regex NodeJS Javascript not applying

I have the following node.js code:
let regex = /^<#&\!?(\d+)>$/g;
console.log("Before match ", gettingargR) // gettingargR = <#&702877893436375100> or <#&!702877893436375100>
gettingargR = gettingargR.match(regex);
console.log("After match ", gettingargR)
console.log("After match value", gettingargR[0])
My target (expected) return is After match [ '702877893436375100' ] and After match value 702877893436375100
When I try to achieve this, console returns:
Before match <#&702877893436375100>
After match [ '<#&702877893436375100>' ]
After match value <#&702877893436375100>
Which means my regex isn't applying. How can I apply it to my string?
You need to remove the g flag from your regexp. With the global flag, match() returns a list of all complete matches and leaves out the capture groups; without it, match() returns the details of the first match, including its capture groups.
let regex = /^<#&\!?(\d+)>$/; // remove the g flag
console.log("Before match ", gettingargR);
gettingargR = gettingargR.match(regex);
console.log("After match ", gettingargR)
console.log("After match value", gettingargR[1]); // capture group is element 1
Without the g flag the interpretation of the result is different. The return array now has the following structure:
[
'<#&702877893436375100>', // full match
'702877893436375100', // first capture group
// second capture group (if any),
// third capture group (if any),
// etc..
]
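The difference can be seen side by side in a short sketch. It uses the sample value from the question; the variable names are illustrative only.

```javascript
const input = '<#&702877893436375100>';

// With the g flag, match() returns only the full matches.
const globalResult = input.match(/^<#&!?(\d+)>$/g);
console.log(globalResult); // [ '<#&702877893436375100>' ]

// Without the g flag, index 1 holds the first capture group.
const singleResult = input.match(/^<#&!?(\d+)>$/);
console.log(singleResult[1]); // 702877893436375100

// match() returns null when nothing matches, so guard before indexing.
const noMatch = 'not a mention'.match(/^<#&!?(\d+)>$/);
const id = noMatch ? noMatch[1] : null;
console.log(id); // null
```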
To collect every match in a large input, you can call exec in a loop. For a single first match, use a regex without the "g" flag:
Sample:
const matchAll = (str, reg) => {
let m;
let result = [];
do {
m = reg.exec(str);
if (m) {
result.push(m[1]);
}
} while (m);
return result;
};
gettingargR = `<#&702877893436375100>
<#&!702877893436375100>`;
let regex = /^<#&\!?(\d+)>$/gm;
console.log("After match ", matchAll(gettingargR, regex));
Maybe this could help.
const regex = /^<#&\!?(\d+)>$/g;
const str = `<#&702877893436375100>`;
let m;
while ((m = regex.exec(str)) !== null) {
// This is necessary to avoid infinite loops with zero-width matches
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
// The result can be accessed through the `m`-variable.
m.forEach((match, groupIndex) => {
console.log(`Found match, group ${groupIndex}: ${match}`);
});
}
Verified from https://regex101.com/

Regex optimization and best practice

I need to parse information out from a legacy interface. We do not have the ability to update the legacy message. I'm not very proficient at regular expressions, but I managed to write one that does what I want it to do. I just need peer-review and feedback to make sure it's clean.
The message from the legacy system returns values resembling the example below.
%name0=value
%name1=value
%name2=value
Expression: /\%(.*)\=(.*)/g;
var strBody = body_text.toString();
var myRegexp = /\%(.*)\=(.*)/g;
var match = myRegexp.exec(strBody);
var objPair = {};
while (match != null) {
if (match[1]) {
objPair[match[1].toLowerCase()] = match[2];
}
match = myRegexp.exec(strBody);
}
This code works, and I can add partial matches in the middle of the names/values without anything breaking. I have to assume that any combination of characters could appear in the "values" match, meaning it could have equals and percent signs within the message.
Is this clean enough?
Is there something that could break the expression?
First of all, don't escape characters that don't need escaping: %(.*)=(.*)
The problem with your expression: an equals sign in the value would break your parser. Because the first (.*) is greedy, %name0=val=ue would be split into the name "name0=val" and the value "ue" instead of the name "name0" and the value "val=ue".
One possible fix is to make the first repetition lazy by appending a question mark: %(.*?)=(.*)
But this is not optimal due to unneeded backtracking. You can do better by using a negated character class: %([^=]*)=(.*)
And finally, if empty names should not be allowed, replace the first asterisk with a plus: %([^=]+)=(.*)
This is a good resource: Regex Tutorial - Repetition with Star and Plus
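As a rough sketch of how the final expression behaves, the loop from the question can be wrapped in a function. The name `parsePairs` and the sample input are made up for illustration.

```javascript
// Hypothetical helper: parses "%name=value" lines into an object.
function parsePairs(body) {
  // [^=]+ stops the name at the first "=", so the value may contain "=" and "%".
  const pattern = /%([^=]+)=(.*)/g;
  const pairs = {};
  let match;
  while ((match = pattern.exec(body)) !== null) {
    pairs[match[1].toLowerCase()] = match[2];
  }
  return pairs;
}

const sample = '%Name0=val=ue\n%name1=50%\n%name2=';
console.log(parsePairs(sample));
// { name0: 'val=ue', name1: '50%', name2: '' }
```

Note that `[^=]` also matches line breaks, so a line that contains `%` but no `=` would let the name run into the next line. If such input is possible, `[^=\r\n]+` is one way to prevent that.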
Your expression is fine, and two capturing groups are a simple way to get your names and values.
You do not need to escape % or =, and the expression still works without the backslashes.
A simplified version that you can test and modify:
%(.+)=(.+)
Since your data is pretty structured, you can also do so with string split and get the same desired outputs, if you want.
JavaScript Test
const regex = /%(.+)=(.+)/gm;
const str = `%name0=value
%name1=value
%name2=value`;
let m;
while ((m = regex.exec(str)) !== null) {
// This is necessary to avoid infinite loops with zero-width matches
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
// The result can be accessed through the `m`-variable.
m.forEach((match, groupIndex) => {
console.log(`Found match, group ${groupIndex}: ${match}`);
});
}
Performance Test
This JavaScript snippet shows the performance of that expression using a simple 1-million times for loop.
const repeat = 1000000;
const start = Date.now();
for (var i = repeat; i >= 0; i--) {
const string = '%name0=value';
const regex = /(%(.+)=(.+))/gm;
var match = string.replace(regex, "\nGroup #1: $1 \n Group #2: $2 \n Group #3: $3 \n");
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚💚💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");

Named Groups in Regexp transformed on new `RegExp` construction from string

I am trying to build a Regexp from a series of smaller Regexes in either string or primitive form.
I'm using Node v10.15.0.
Here are my 3 components individually
Month Matcher: /\b(?<month>\bjan(?:uary)?\b|\bfeb(?:ruary)?\b|\bmar(?:ch)?\b|\bapr(?:il)?\b|\bmay\b|\bjun(?:e)?\b|\bjul(?:y)?\b|\baug(?:ust)?\b|\bsep(?:tember)?\b|\boct(?:ober)?\b|\bnov(?:ember)?\b|\bdec(?:ember)?\b)/i
Day Matcher: /(?<day>\d{1,2})/i
Year Matcher: /(?<year>20\d\d)/i
I am trying to create a Regexp from each of these which would look something like this:
new RegExp(/\b(?<month>\bjan(?:uary)?\b|\bfeb(?:ruary)?\b|\bmar(?:ch)?\b|\bapr(?:il)?\b|\bmay\b|\bjun(?:e)?\b|\bjul(?:y)?\b|\baug(?:ust)?\b|\bsep(?:tember)?\b|\boct(?:ober)?\b|\bnov(?:ember)?\b|\bdec(?:ember)?\b) (?<day>\d{1,2}), (?<year>20\d\d)/i);
This would match 'Apr 14, 2018', 'Jun 25, 2019' etc etc.
I've made a number of attempts constructing with:
new RegExp(/my-pattern/i)
new RegExp('my-pattern' + 'my-other-pattern', 'i')
new RegExp(new RegExp('my-pattern', 'i') + new RegExp('other-pattern', 'i')) (this one feels most silly).
One strange effect I noticed was that when I tried to build a string via addition, the constructor would clip the output. See how the 'month' named group is altered below:
var z = new RegExp('\b(?<month>\bjan(?:uary)?\b|\bfeb(?:ruary)?\b|\bmar(?:ch)?\b|\bapr(?:il)?\b|\bmay\b|\bjun(?:e)?\b|\bjul(?:y)?\b|\baug(?:ust)?\b|\bsep(?:tember)?\b|\boct(?:ober)?\b|\bnov(?:ember)?\b|\bdec(?:ember)?\b)' + '(?<day>\d{1,2})', 'i');
undefined
>>> (?<monthjan(?:uary)feb(?:ruary)mar(?:ch)apr(?:il)majun(?:e)jul(?:y)aug(?:ust)sep(?:tember)oct(?:ober)nov(?:ember)dec(?:ember))(?<day>d{1,2})/i
Can anyone advise on the best approach for this? Otherwise I'm likely to declare the months/days/years matchers over and over again in very verbose patterns.
Thanks
This expression might help you to match your desired date strings.
((1[0-2]|0?[1-9])\/(3[01]|[12][0-9]|0?[1-9])\/(?:[0-9]{2})?[0-9]{2})|(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4}
You can simplify it and reduce the boundaries if you wish.
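On the composition part of the question: in a plain string literal, `\b` is a backspace character and `\d` is simply `d`, which is consistent with the clipped output shown in the question. A minimal sketch of one way around this is to keep each component as a regex literal and join their `source` properties. The month pattern here is a simplified variant of the one in the question, and the variable names are illustrative.

```javascript
// Component patterns as regex literals, so backslashes need no extra escaping.
const month = /\b(?<month>jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|jun(?:e)?|jul(?:y)?|aug(?:ust)?|sep(?:tember)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)\b/;
const day = /(?<day>\d{1,2})/;
const year = /(?<year>20\d\d)/;

// .source returns the pattern text with its backslashes intact.
const dateRegex = new RegExp(`${month.source} ${day.source}, ${year.source}`, 'i');

const result = dateRegex.exec('Released on Apr 14, 2018.');
console.log(result.groups.month, result.groups.day, result.groups.year); // Apr 14 2018

// In a plain string literal, "\b" is a backspace character and "\d" is just "d".
console.log('\b'.charCodeAt(0)); // 8
console.log('\d' === 'd');       // true
```

If the components must be strings instead, doubling each backslash (for example `'\\b'` and `'\\d{1,2}'`) should give the same pattern text.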
JavaScript Test
const regex = /((1[0-2]|0?[1-9])\/(3[01]|[12][0-9]|0?[1-9])\/(?:[0-9]{2})?[0-9]{2})|(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4}/gm;
const str = `Apr 14, 2018`;
let m;
while ((m = regex.exec(str)) !== null) {
// This is necessary to avoid infinite loops with zero-width matches
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
// The result can be accessed through the `m`-variable.
m.forEach((match, groupIndex) => {
console.log(`Found match, group ${groupIndex}: ${match}`);
});
}
Basic Performance Test
This JavaScript snippet reports the runtime of the for loop below. Raise repeat (for example to 1000000) for a more meaningful measurement.
const repeat = 1;
const start = Date.now();
for (var i = repeat; i >= 0; i--) {
const string = 'Apr 14, 2018';
const regex = /(((1[0-2]|0?[1-9])\/(3[01]|[12][0-9]|0?[1-9])\/(?:[0-9]{2})?[0-9]{2})|(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4})/gm;
var match = string.replace(regex, "Group #1: $1");
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚💚💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");

How to search strings with brackets using Regular expressions

I have a case wherein I want to search for all Hello (World) in an array. Hello (World) is coming from a variable and can change. I want to achieve this using RegExp and not indexOf or includes methods.
testArray = ['Hello (World', 'Hello (World)', 'hello (worlD)']
My match should return index 1 & 2 as answers.
Use the RegExp constructor after escaping the string (algorithm from this answer), and use some array methods:
const testArray = ['Hello (World', 'Hello (World)', 'hello (worlD)'];
const string = "Hello (World)".replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
const regex = new RegExp(string, "i");
const indexes = testArray.map((e, i) => e.match(regex) == null ? null : i).filter(e => e != null);
console.log(indexes);
This expression might help you to do so:
(\w+)\s\((\w+)
You may not need to bound it on the left and right, since your input strings are well structured. You can focus on the capturing groups, which I have assumed are each a single word; modify them if that is not the case.
With a simple string replace you can match and capture both of them.
Performance Test
This JavaScript snippet shows the performance of that expression using a simple 1-million times for loop.
const repeat = 1000000;
const start = Date.now();
let match;
for (let i = repeat; i >= 0; i--) {
  const string = "Hello (World";
  const regex = /(\w+)\s\((\w+)/g;
  // Rewrite the match using both capture groups.
  match = string.replace(regex, "$1 & $2");
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚💚💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");
const testArray = ['Hello (World', 'Hello (World)', 'hello (worlD)'];
const indexes = [];
testArray.forEach((word, i) => {
  // Keep the index of every entry that contains a parenthesized part.
  if (word.match(/\(.*\)/)) {
    indexes.push(i);
  }
});
console.log(indexes);

Regular Expression Split and match

I'm learning how to use regular expressions and am a bit confused by something that hopefully someone can clarify. If I use the following string and expression, I get the expected results with match but the exact opposite if I use split. I've been beating my head against the wall and don't understand why.
var a = "212,0,,456,,0,67889";
var patt = /,\d{1,5},/gmi;
pos=a.match(patt);
alert(pos);// returns ,0, ,456, and ,0,
pos=a.split(patt);
alert(pos); //returns 212, and ,67889
Split means: look for matches of the pattern in the string and split the string at every match. Each match is also removed from the result.
This link has some good examples:
http://www.tizag.com/javascriptT/javascript-string-split.php
"~ a delimiter is used by the split function as a way of breaking up the string. Every time it sees the delimiter we specified, it will create a new element in an array. The first argument of the split function is the delimiter." (The delimiter is the pattern)
Example one:
<script type="text/javascript">
var myString = "123456789";
var mySplitResult = myString.split("5");
document.write("The first element is " + mySplitResult[0]);
document.write("<br /> The second element is " + mySplitResult[1]);
</script>
Output:
The first element is 1234
The second element is 6789
"Make sure you realize that because we chose the 5 to be our delimiter, it is not in our result. This is because the delimiter is removed from the string and the remaining characters are separated by the chasm of space that the 5 used to occupy."
Example Two:
<script type="text/javascript">
var myString = "zero one two three four";
var mySplitResult = myString.split(" ");
for(i = 0; i < mySplitResult.length; i++){
document.write("<br /> Element " + i + " = " + mySplitResult[i]);
}
</script>
Output:
Element 0 = zero
Element 1 = one
Element 2 = two
Element 3 = three
Element 4 = four
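Adjacent matches are not merged: each one still splits the string and leaves an empty string between them. The result is ['212', '', '', '67889'], which alert displays as 212,,,67889, so you are seeing only the parts of the string that don't match the regex.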
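A short sketch using the string and pattern from the question shows what each call returns, along with two ways to get the numbers themselves. The variable names are illustrative.

```javascript
const input = '212,0,,456,,0,67889';
const pattern = /,\d{1,5},/g;

// match() returns the matched text.
console.log(input.match(pattern)); // [ ',0,', ',456,', ',0,' ]

// split() returns what lies between the matches; adjacent matches leave empty strings.
console.log(input.split(pattern)); // [ '212', '', '', '67889' ]

// To collect the numbers, match the digits directly...
const viaMatch = input.match(/\d+/g);
console.log(viaMatch); // [ '212', '0', '456', '0', '67889' ]

// ...or split on runs of commas.
const viaSplit = input.split(/,+/);
console.log(viaSplit); // [ '212', '0', '456', '0', '67889' ]
```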
You can use split or match to achieve the same, but you need different regex. For instance, you can use for match:
\d+
Code
var re = /\d+/g;
var str = '212,0,,456,,0,67889';
var m;
while ((m = re.exec(str)) !== null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}
Or you can use this regex to split:
,+
