regex for ignoring character if inside () parenthesis? - javascript

I am working on a regex and ran into a problem.
Given a string such as "+1/(1/10)+(1/30)+1/50", I use the regex /\+.[^\+]*/g,
which works fine and gives me ['+1/(1/10)', '+(1/30)', '+1/50'].
The real problem appears when a + is inside parentheses (),
like this: "+1/(1+10)+(1/30)+1/50",
because the result is then ['+1/(1', '+10)', '+(1/30)', '+1/50'],
which isn't what I want. What I want is ['+1/(1+10)', '+(1/30)', '+1/50'].
So when the regex sees \(.*\), it should skip it as if it weren't there.
How can I make the regex ignore it?
My code (JS):
const tests = {
correct: "1/(1/10)+(1/30)+1/50",
wrong : "1/(1+10)+(1/30)+1/50"
}
function getAdditionArray(string) {
const REGEX = /\+.[^\+]*/g; // change this to ignore the () even if they have the + sign
const firstChar = string[0];
if (firstChar !== "-") string = "+" + string;
return string.match(REGEX);
}
console.log(
getAdditionArray(tests.correct),
getAdditionArray(tests.wrong),
)

You can exclude parentheses from the first character class, and then optionally match a (...) group:
\+[^+()]*(?:\([^()]*\))?
The pattern matches:
\+ Match a +
[^+()]* Match optional chars other than + ( )
(?: Non capture group to match as a whole part
\([^()]*\) Match from (...)
)? Close the non capture group and make it optional
See a regex101 demo.
Another option could be to be more specific about the digits and the + and / and use a character class to list the allowed characters.
\+(?:\d+[+/])?(?:\(\d+[/+]\d+\)|\d+)
See another regex101 demo.

Related

Regex replace last part of url path if condition matches?

I am basically trying to remove the last part of a URL if the URL contains the path /ice/flag/. Example:
Input:
https://test.com/plants/ice/flag/237468372912873
Desired Output:
Because the above URL has /ice/flag/ in its path, I want the last part of the URL to be replaced with redacted.
https://test.com/plants/ice/flag/redacted
However, if the URL did not have /ice/flag (ex: https://test.com/plants/not_ice/flag/237468372912873), it shouldn't be replaced.
What I tried to do is to use the answer mentioned here to change the last part of the path:
var url = 'https://test.com/plants/ice/flag/237468372912873'
url = url.replace(/\/[^\/]*$/, '/redacted')
This works in doing the replacement, but I am unsure how to modify this so that it only matches if /ice/flag is in the path. I tried putting \/ice\/flag in certain parts of the regex to change the behavior to only replace if that is in the string, but nothing has been working. Any tips from those more experienced with regex on how to do this would be greatly appreciated, thank you!
Edit: The URL can be formed in different ways, so there may be other paths before or after /ice/flag/. So all of these are possibilities:
Input:
https://test.com/plants/ice/flag/237468372912873
https://test.com/plants/extra/ice/flag/237468372912873
https://test.com/plants/ice/flag/extra/237468372912873
https://test.com/plants/ice/flag/extra/extra/237468372912873
https://test.com/plants/ice/flag/extra/237468372912873?paramOne=1&paramTwo=2#someHash
Desired Output:
https://test.com/plants/ice/flag/redacted
https://test.com/plants/extra/ice/flag/redacted
https://test.com/plants/ice/flag/extra/redacted
https://test.com/plants/ice/flag/extra/extra/redacted
https://test.com/plants/ice/flag/extra/redacted?paramOne=1&paramTwo=2#someHash
You may search for this regex:
(\/ice\/flag\/(?:[^?#]*\/)?)[^\/#?]+
and replace it with:
$1redacted
RegEx Demo
RegEx Breakup:
(: Start capture group #1
\/ice\/flag\/: Match /ice/flag/
(?:[^?#]*\/)?: Match 0 or more of any char that is not # and ? followed by a / as an optional match
): End capture group #1
[^\/#?]+ Match 1+ of any char that is not / and # and ?
Code:
var arr = [
'https://test.com/plants/ice/flag/237468372912873',
'https://test.com/plants/ice/flag/a/b/237468372912873',
'https://test.com/a/ice/flag/e/237468372912873?p=2/12#aHash',
'https://test.com/plants/not_ice/flag/237468372912873'];
var rx = /(\/ice\/flag\/(?:[^?#\n]*\/)?)[^\/#?\n]+/;
var subst = '$1redacted';
arr.forEach(el => console.log(el.replace(rx, subst)));
Here is functional code with test input strings based on your requirements:
const input = [
'https://test.com/plants/ice/flag/237468372912873',
'https://test.com/plants/extra/ice/flag/237468372912873',
'https://test.com/plants/ice/flag/extra/237468372912873',
'https://test.com/plants/ice/flag/extra/extra/237468372912873',
'https://test.com/plants/ice/flag/extra/237468372912873#someHash',
'https://test.com/plants/ice/flag/extra/237468372912873?paramOne=1&paramTwo=2#someHash',
'https://test.com/plants/not_ice/flag/237468372912873'
];
const re = /(\/ice\/flag\/([^\/#?]+\/)*)[^\/#?]+/;
input.forEach(str => {
console.log('str: ' + str + '\n => ' + str.replace(re, '$1redacted'));
});
Output:
str: https://test.com/plants/ice/flag/237468372912873
=> https://test.com/plants/ice/flag/redacted
str: https://test.com/plants/extra/ice/flag/237468372912873
=> https://test.com/plants/extra/ice/flag/redacted
str: https://test.com/plants/ice/flag/extra/237468372912873
=> https://test.com/plants/ice/flag/extra/redacted
str: https://test.com/plants/ice/flag/extra/extra/237468372912873
=> https://test.com/plants/ice/flag/extra/extra/redacted
str: https://test.com/plants/ice/flag/extra/237468372912873#someHash
=> https://test.com/plants/ice/flag/extra/redacted#someHash
str: https://test.com/plants/ice/flag/extra/237468372912873?paramOne=1&paramTwo=2#someHash
=> https://test.com/plants/ice/flag/extra/redacted?paramOne=1&paramTwo=2#someHash
str: https://test.com/plants/not_ice/flag/237468372912873
=> https://test.com/plants/not_ice/flag/237468372912873
Regex:
( - capture group start
\/ice\/flag\/ - expect /ice/flag/
([^\/#?]+\/)* - zero or more patterns of chars other than /, #, ?, followed by /
) - capture group end
[^\/#?]+ - discard anything that is not /, #, ? but expect at least one char (this will force stuff after the last /)
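You can use a ternary operator to check whether the URL includes /ice/flag via url.includes('/ice/flag'): if it does, return url.replace(/\/[^\/]*$/, '/redacted'), otherwise return the URL as it is. Note that this replaces everything after the last /, so a query string or hash that follows the last path segment is replaced as well.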
function replace(url) {
return url.includes('/ice/flag') ? url.replace(/\/[^\/]*$/, '/redacted') : url;
}
console.log(replace("https://test.com/plants/ice/flag/237468372912873"))
console.log(replace("https://test.com/plants/not_ice/flag/237468372912873"));
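If the query string and hash need to survive, one possible variation on the conditional approach is to let the built-in URL class split the address first and only touch the pathname. This is a sketch under assumptions: the function name redactLastSegment is made up, and it expects absolute URLs (new URL throws on relative ones).

```javascript
function redactLastSegment(rawUrl) {
  const url = new URL(rawUrl);
  // Leave the URL untouched unless /ice/flag/ appears in the path.
  if (!url.pathname.includes("/ice/flag/")) return rawUrl;
  // Replace only the last non-empty path segment; search and hash are kept.
  url.pathname = url.pathname.replace(/\/[^\/]+$/, "/redacted");
  return url.toString();
}

console.log(redactLastSegment("https://test.com/plants/ice/flag/extra/237468372912873?paramOne=1&paramTwo=2#someHash"));
console.log(redactLastSegment("https://test.com/plants/not_ice/flag/237468372912873"));
```

The first call should print https://test.com/plants/ice/flag/extra/redacted?paramOne=1&paramTwo=2#someHash, and the second should print its input unchanged.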

How do I replace the last character of the selected regex?

I want this string {Rotation:[45f,90f],lvl:10s} to turn into {Rotation:[45,90],lvl:10}.
I've tried this:
const bar = `{Rotation:[45f,90f],lvl:10s}`
const regex = /(\d)\w+/g
console.log(bar.replace(regex, '$&'.substring(0, -1)))
I've also tried to just select the letter at the end using $ but I can't seem to get it right.
You can use
bar.replace(/(\d+)[a-z]\b/gi, '$1')
See the regex demo.
Here,
(\d+) - captures one or more digits into Group 1
[a-z] - matches any letter
\b - at the word boundary, ie. at the end of the word
gi - all occurrences, case insensitive
The replacement is Group 1 value, $1.
See the JavaScript demo:
const bar = `{Rotation:[45f,90f],lvl:10s}`
const regex = /(\d+)[a-z]\b/gi
console.log(bar.replace(regex, '$1'))
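For what it's worth, the attempt in the question fails because '$&'.substring(0, -1) is evaluated before replace runs, and substring treats a negative end index as 0, so the replacement string is simply empty. If you prefer to trim the match itself rather than use a capture group, a replacer function is one way to sketch it (the variable name stripped is arbitrary):

```javascript
const bar = "{Rotation:[45f,90f],lvl:10s}";

// The callback receives each full match (e.g. "45f") and drops its last character.
const stripped = bar.replace(/\d+[a-z]\b/gi, (match) => match.slice(0, -1));

console.log(stripped); // {Rotation:[45,90],lvl:10}
```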
Check this out (note that it removes only the second-to-last character, the s before the closing brace):
const str = `{Rotation:[45f,90f],lvl:10s}`.split('');
const x = str.splice(str.length - 2, 1)
console.log(str.join(''));
You can use positive lookahead to match the closing brace, but not capture it. Then the single character can be replaced with a blank string.
const bar= '{Rotation:[45f,90f],lvl:10s}'
const regex = /.(?=})/g
console.log(bar.replace(regex, ''))
{Rotation:[45f,90f],lvl:10}
The following regex will match each group of one or more digits followed by f or s.
$1 represents the contents captured by the capture group (\d+).
const bar = `{Rotation:[45f,90f],lvl:10s}`
const regex = /(\d+)[fs]/g
console.log(bar.replace(regex, '$1'))

regex to extract numbers starting from second symbol

Sorry for adding one more to the tons of regexp questions, but I can't find anything similar to my needs. I want to output a string which can contain a digit or the letter 'A' as the first symbol and digits only in the other positions. The input is any string, for example:
---INPUT--- -OUTPUT-
A123asdf456 -> A123456
0qw#$56-398 -> 056398
B12376B6f90 -> 12376690
12A12345BCt -> 1212345
What I tried is replace(/[^A\d]/g, '') (I use JS), which almost does the job except the case when there's A in the middle of the string. I tried to use ^ anchor but then the pattern doesn't match other numbers in the string. Not sure what is easier - extract matching characters or remove unmatching.
I think you can do it like this, using a negative lookahead and then replacing with an empty string.
In a non-capturing group (?:, use a negative lookahead (?! to assert that what follows is neither an A at the beginning of the string (^A) nor a digit (\d). If that is the case, match any character with the dot.
(?:(?!^A|\d).)+
var pattern = /(?:(?!^A|\d).)+/g;
var strings = [
"A123asdf456",
"0qw#$56-398",
"B12376B6f90",
"12A12345BCt"
];
for (var i = 0; i < strings.length; i++) {
console.log(strings[i] + " ==> " + strings[i].replace(pattern, ""));
}
You can match and capture desired and undesired characters within two different sides of an alternation, then replace those undesired with nothing:
^(A)|\D
JS code:
var inputStrings = [
"A-123asdf456",
"A123asdf456",
"0qw#$56-398",
"B12376B6f90",
"12A12345BCt"
];
console.log(
inputStrings.map(v => v.replace(/^(A)|\D/g, "$1"))
);
You can use the following regex: /(^A|\d)/g, and join the matches:
var arr = ['A123asdf456','0qw#$56-398','B12376B6f90','12A12345BCt', 'A-123asdf456'],
result = arr.map(s => s.match(/(^A|\d)/g).join(''));
console.log(result);
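If the anchor inside the pattern feels too subtle, the same rule can also be sketched in two plain steps: decide whether to keep a leading A, then strip every non-digit. The function name keepLeadingAAndDigits is only illustrative.

```javascript
function keepLeadingAAndDigits(input) {
  // Keep an "A" only when it is the very first character.
  const prefix = input.startsWith("A") ? "A" : "";
  // \D removes everything that is not a digit, including any other "A".
  return prefix + input.replace(/\D/g, "");
}

["A123asdf456", "0qw#$56-398", "B12376B6f90", "12A12345BCt"].forEach((s) => {
  console.log(s + " -> " + keepLeadingAAndDigits(s));
});
```

For the four inputs from the question this should give A123456, 056398, 12376690 and 1212345.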

Javascript validation regex for names

I am looking to accept names in my app with letters and hyphens or dashes. I based my code on an answer I found here
and wrote this:
function validName(n){
var nameRegex = /^[a-zA-Z\-]+$/;
if(n.match(nameRegex) == null){
return "Wrong";
}
else{
return "Right";
}
}
The only problem is that it accepts a hyphen as the first character (even multiple ones), which I don't want.
Thanks
Use a negative lookahead assertion to reject a string that starts with a hyphen. Note that there is no need to escape - in a character class when it is placed at the end of the class. To also reject a trailing hyphen, either end the pattern with a character class that does not contain - or add a second lookahead assertion.
var nameRegex = /^(?!-)[a-zA-Z-]*[a-zA-Z]$/;
// or
var nameRegex = /^(?!-)(?!.*-$)[a-zA-Z-]+$/;
var nameRegex = /^(?!-)[a-zA-Z-]*[a-zA-Z]$/;
// or
var nameRegex1 = /^(?!-)(?!.*-$)[a-zA-Z-]+$/;
function validName(n) {
if (n.match(nameRegex) == null) {
return "Wrong";
} else {
return "Right";
}
}
function validName1(n) {
if (n.match(nameRegex1) == null) {
return "Wrong";
} else {
return "Right";
}
}
console.log(validName('abc'));
console.log(validName('abc-'));
console.log(validName('-abc'));
console.log(validName('-abc-'));
console.log(validName('a-b-c'));
console.log(validName1('abc'));
console.log(validName1('abc-'));
console.log(validName1('-abc'));
console.log(validName1('-abc-'));
console.log(validName1('a-b-c'));
FYI: You can use the RegExp#test method to check for a match; it returns a boolean.
if(nameRegex.test(n)){
return "Right";
}
else{
return "Wrong";
}
UPDATE: If you want only a single optional - between words, use a group repeated 0 or more times that starts with -, as in Wiktor Stribiżew's answer.
var nameRegex = /^[a-zA-Z]+(?:-[a-zA-Z]+)*$/;
You need to decompose your single character class into 2, moving the hyphen outside of it, and use a grouping construct to match sequences of a hyphen followed by letters:
var nameRegex = /^[a-zA-Z]+(?:-[a-zA-Z]+)*$/;
See the regex demo
This will match letters (1 or more) at the start of the string and then will match 0 or more occurrences of - followed by one or more letters up to the end of the string.
If there can be only 1 hyphen in the string, replace * at the end with ? (see the regex demo).
If you also want to allow whitespace between the letters, replace the - with [\s-] (demo).
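To make the three variants easier to compare, here is a small sketch that puts them side by side. The constant names are my own, and the sample inputs are just illustrations.

```javascript
// Any number of hyphen-separated letter groups.
const multiHyphen = /^[a-zA-Z]+(?:-[a-zA-Z]+)*$/;
// At most one hyphen: the group is optional instead of repeatable.
const singleHyphen = /^[a-zA-Z]+(?:-[a-zA-Z]+)?$/;
// Either whitespace or a hyphen may separate the letter groups.
const hyphenOrSpace = /^[a-zA-Z]+(?:[\s-][a-zA-Z]+)*$/;

console.log(multiHyphen.test("a-b-c"));          // true
console.log(multiHyphen.test("-abc"));           // false
console.log(singleHyphen.test("a-b-c"));         // false
console.log(hyphenOrSpace.test("Mary Ann-Lee")); // true
```

Because every separator must be followed by at least one letter, none of the three accepts a leading, trailing, or doubled hyphen.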
You can either use a negative lookahead like Pranav C Balan proposed or just use this simple expression:
^[a-zA-Z]+[a-zA-Z-]*$
Live example: https://regex101.com/r/Dj0eTH/1
The below regex is useful for surnames if one wants to forbid leading or trailing non-alphabetic characters, while permitting a small set of common word-joining characters in between two names.
^[a-zA-Z]+[- ']{0,1}[a-zA-Z]+$
Explanation
^[a-zA-Z]+ must begin with at least one letter
[- ']{0,1} allow zero or at most one of any of -, a space, or '
[a-zA-Z]+$ must end with at least one letter
Test cases
(The double-quotes have been added purely to illustrate the presence of whitespace.)
"Blair" => match
" Blair" => no match
"Blair " => no match
"-Blair" => no match
"- Blair" => no match
"Blair-" => no match
"Blair -" => no match
"Blair-Nangle" => match
"Blair--Nangle" => no match
"Blair Nangle" => match
"Blair -Nangle" => no match
"O'Nangle" => match
"BN" => match
"BN " => no match
" O'Nangle" => no match
"B" => no match
"3Blair" => no match
"!Blair" => no match
"van Nangle" => match
"Blair'" => no match
"'Blair" => no match
Limitations include:
No single-character surnames
No surnames composed of more than two words
Check it out on regex101.
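A few of the test cases above can be replayed in JavaScript with a small table-driven check. This is only a sketch: the names SURNAME and cases are made up for the example, and the table covers a subset of the list.

```javascript
const SURNAME = /^[a-zA-Z]+[- ']{0,1}[a-zA-Z]+$/;

// Each entry pairs an input with the result listed in the test cases above.
const cases = [
  ["Blair", true],
  [" Blair", false],
  ["Blair-Nangle", true],
  ["Blair--Nangle", false],
  ["O'Nangle", true],
  ["van Nangle", true],
  ["B", false],
  ["Blair'", false],
];

for (const [name, expected] of cases) {
  const actual = SURNAME.test(name);
  console.log(JSON.stringify(name) + " => " + (actual ? "match" : "no match") +
    (actual === expected ? "" : "  (unexpected)"));
}
```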

Extract word between '=' and '('

I have the following string
234234=AWORDHERE('sdf.'aa')
where I need to extract AWORDHERE.
Sometimes there can be space in between.
234234= AWORDHERE('sdf.'aa')
Can I do this with a regular expression?
Or should I do it manually by finding indexes?
The datasets are huge, so it's important to do it as fast as possible.
Try this regex:
\d+=\s?(\w+)\(
Check Demo
In JavaScript it would look like this:
var myString = "234234=AWORDHERE('sdf.'aa')";// or 234234= AWORDHERE('sdf.'aa')
var myRegexp = /\d+=\s?(\w+)\(/g;
var match = myRegexp.exec(myString);
console.log(match[1]); // AWORDHERE
You could do this at least three ways. You need to benchmark to see what's fastest.
Substring w/ indexes
function extract(from) {
var ixEq = from.indexOf("=");
var ixParen = from.indexOf("(");
return from.substring(ixEq + 1, ixParen).trim(); // trim() drops the optional space after =
}
Splits
function extract(from) {
var spEq = from.split("=");
var spParen = spEq[1].split("(");
return spParen[0].trim(); // trim() drops the optional space after =
}
Regex (demo)
Here is some sample regex you could use
/[^=]+=([^(]+).*/g
This says
[^=]+ - One or more character which is not an =
= - The = itself
( - creates a matching group so you can access your match in code
[^(]+ - One or more character which is not a (
) - closes the matching group
.* - Matches the rest of the line
The /g flag on the end makes the match global, so every match in the input (for example, one per line) is found rather than only the first.
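Since the answer is "benchmark it", here is a minimal sketch of how such a comparison could be set up. Everything in it is an assumption for illustration: the function names, the iteration count, and the use of Date.now() (which is coarse; a real benchmark would use a finer timer and more realistic data).

```javascript
function bySubstring(line) {
  return line.substring(line.indexOf("=") + 1, line.indexOf("(")).trim();
}

function byRegex(line) {
  const m = /=\s*(\w+)\(/.exec(line);
  return m ? m[1] : null;
}

// Runs fn repeatedly and reports the elapsed wall-clock time.
function timeIt(label, fn, line, iterations) {
  const start = Date.now();
  let last = null;
  for (let i = 0; i < iterations; i++) last = fn(line);
  console.log(label + ": " + (Date.now() - start) + " ms, result " + last);
  return last;
}

const sample = "234234= AWORDHERE('sdf.'aa')";
timeIt("substring", bySubstring, sample, 100000);
timeIt("regex", byRegex, sample, 100000);
```

The timings will vary by engine and input, so treat the numbers as relative and measure on your own data.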
Using look around you can search for string preceded by = and followed by ( as following.
Regex: (?<==)[A-Z ]+(?=\()
Explanation:
(?<==) checks if [A-Z ] is preceded by an =.
[A-Z ]+ matches your pattern.
(?=\() checks if matched pattern is followed by a (.
Regex101 Demo
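In JavaScript this could be used roughly as follows. It is a sketch that assumes an engine with lookbehind support (ES2018 or later); the names LOOKAROUND and extractWord are made up, and the trim() is there because [A-Z ]+ also picks up the optional space after the =.

```javascript
const LOOKAROUND = /(?<==)[A-Z ]+(?=\()/;

function extractWord(line) {
  const m = line.match(LOOKAROUND);
  // m[0] is the whole match; the lookarounds themselves consume nothing.
  return m ? m[0].trim() : null;
}

console.log(extractWord("234234=AWORDHERE('sdf.'aa')"));  // AWORDHERE
console.log(extractWord("234234= AWORDHERE('sdf.'aa')")); // AWORDHERE
```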
var str = "234234= AWORDHERE('sdf.'aa')";
var regexp = /.*=\s+(\w+)\(.*\)/g;
var match = regexp.exec(str);
alert( match[1] );
I made my solution for this just a little more general than you asked for, but I don't think it takes much more time to execute. I didn't measure. If you need greater efficiency than this provides, comment and I or someone else can help you with that.
Here's what I did, using the command prompt of node:
> var s = "234234= AWORDHERE('sdf.'aa')"
undefined
> var a = s.match(/(\w+)=\s*(\w+)\s*\(.*/)
undefined
> a
[ '234234= AWORDHERE(\'sdf.\'aa\')',
'234234',
'AWORDHERE',
index: 0,
input: '234234= AWORDHERE(\'sdf.\'aa\')' ]
>
As you can see, this matches the number before the = in a[1], and it matches the AWORDHERE name as you requested in a[2]. This will work with any number (including zero) of spaces after the = and before the (.
