Javascript regex assistance - javascript

I have the following javascript regex...
.replace("/^(a-zA-Z\-)/gi", '');
It isn't complete... in fact it's wrong. Essentially what I need to do is take a string like "XJ FC-X35 (1492)" and remove the parentheses, their contents, and the whitespace before the opening parenthesis.

replace(/^(.*?)\s?\(([^)]+)\)$/gi, "$1$2")
Takes XJ FC-X35 (1492) and returns XJ FC-X351492.
Remove the $2 to turn XJ FC-X35 (1492) into XJ FC-X35, if that's what you wanted instead.
Long explanation
^ // From the start of the string
( // Capture a group ($1)
.*? // That contains 0 or more elements (non-greedy)
) // Finish group $1
\s? // Followed by 0 or 1 whitespace characters
\( // Followed by a "("
( // Capture a group ($2)
[ // That contains any characters in the following set
^) // Not a ")"
]+ // One or more times
) // Finish group $2
\)$ // Followed by a ")" followed by the end of the string.

Try this:
x = "XJ FC-X35 (1492)"
x.replace(/\s*\(.*?\)/,'');
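To compare the two answers, both patterns can be run against the sample string. This is only a sketch: the helper names stripAnchored and stripFirst are made up for illustration, and the extra input "XJ (1) FC-X35" is an assumed test case that is not in the question.

```javascript
// Hypothetical helper names; the patterns are the ones from the answers above.

// Anchored variant: keeps everything before a trailing "(...)" and drops the rest.
function stripAnchored(s) {
  return s.replace(/^(.*?)\s?\(([^)]+)\)$/, "$1");
}

// Unanchored variant: removes the first "(...)" plus any whitespace before it.
function stripFirst(s) {
  return s.replace(/\s*\(.*?\)/, "");
}

console.log(stripAnchored("XJ FC-X35 (1492)")); // XJ FC-X35
console.log(stripFirst("XJ FC-X35 (1492)"));    // XJ FC-X35

// Assumed extra case: the parenthesis is not at the end of the string.
console.log(stripFirst("XJ (1) FC-X35"));       // XJ FC-X35
console.log(stripAnchored("XJ (1) FC-X35"));    // XJ (1) FC-X35 (unchanged)
```

The anchored pattern only fires when the parenthesized part ends the string, so it is the stricter choice if other parentheses must be left alone.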

Related

What will be the best regex Expression for censoring email?

Hello I am stuck on a problem for censoring email in a specific format, but I am not getting how to do that, please help me!
Email : exampleEmail#example.com
Required : e***********#e******.com
How can I get this in JavaScript?
The code I currently use to censor:
const email = 'exampleEmail#example.com';
const regex = /(?<!^).(?!$)/g;
const censoredEmail = email.replace(regex, '*');
Output: e**********************m
Please help me getting e***********#e******.com
You can use
const email = 'exampleEmail#example.com';
const regex = /(^.|#[^#](?=[^#]*$)|\.[^.]+$)|./g;
const censoredEmail = email.replace(regex, (x, y) => y || '*');
console.log(censoredEmail );
// => e***********#e******.com
Details:
( - start of Group 1:
^.| - start of string and any one char, or
#[^#](?=[^#]*$)| - a # and any one char other than # that are followed with any chars other than # till end of string, or
\.[^.]+$ - a . and then any one or more chars other than . till end of string
) - end of group
| - or
. - any one char.
The (x, y) => y || '*' replacement means the matches are replaced with Group 1 value if it matched (participated in the match) or with *.
If there should be a single # present in the string, you can capture all the parts of the string and do the replacement on the specific groups.
^ Start of string
([^\s#]) Capture the first char other than a whitespace char or # that should be unmodified
([^\s#]*) Capture optional repetitions of the same
# Match literally
([^\s#]) Capture the first char other than a whitespace char or # after it that should be unmodified
([^\s#]*) Capture optional repetitions of the same
(\.[^\s.#]+) Capture a dot and 1+ other chars than a dot, # or whitespace char that should be unmodified
$ End of string
In the replacement use all 5 capture groups, where you replace group 2 and 4 with *.
const regex = /^([^\s#])([^\s#]*)#([^\s#])([^\s#]*)(\.[^\s.#]+)$/;
[
  "exampleEmail#example.com",
  "test"
].forEach(email =>
  console.log(
    email.replace(regex, (_, g1, g2, g3, g4, g5) =>
      `${g1}${"*".repeat(g2.length)}#${g3}${"*".repeat(g4.length)}${g5}`)
  )
);
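If the caller needs to know whether the input had the expected shape at all, the same five-group pattern can be wrapped in a function that returns null on a mismatch instead of passing the string through unchanged. This is a minimal sketch; censorEmail and EMAIL_PARTS are hypothetical names, and returning null is an assumed convention.

```javascript
// Hypothetical wrapper; the pattern is the five-group regex from the answer above.
const EMAIL_PARTS = /^([^\s#])([^\s#]*)#([^\s#])([^\s#]*)(\.[^\s.#]+)$/;

function censorEmail(email) {
  const m = EMAIL_PARTS.exec(email);
  if (m === null) return null; // input does not have the expected shape
  const [, first, local, domainFirst, domain, tld] = m;
  // Keep the first character of each part and the final ".xyz"; mask the rest.
  return first + "*".repeat(local.length) + "#" +
         domainFirst + "*".repeat(domain.length) + tld;
}

console.log(censorEmail("exampleEmail#example.com")); // e***********#e******.com
console.log(censorEmail("test"));                     // null
```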

Is there a RegEx to extract semicolon separated values from a string... possibly containing string with semicolon

I would like to extract semicolon-separated values from a string. The values can themselves contain semicolons that are not separators. The regex I found on this website (I lost the post) is almost complete, but it separates each key from its value.
Example:
my.parameter 10; the.foo "Procedural Map"; pve; server.description "This; is \"my\", my description,.\n"
Current result with /[^; "]+|"(?:\\"|[^"])*"/g
[
'server.seed',
'10',
'pve',
'server.level',
'"Procedural Map"',
'server.description',
'"This; is \\"my\\", server; description,."'
]
Desired result
[
'my.parameter 10',
'the.foo "Procedural Map"',
'pve',
'server.description "This; is \"my\", server; description,.\n"'
]
Can you help me improve the regex so that each parameter stays grouped with its value?
You could use a repeating pattern to first match any char except the ; and then optionally match from an opening till closing double quote and match the escaped double quotes in between.
After that optionally repeat the character class [^";\\]* to also match what comes after the closing double quote.
[^;"\\]+(?:"(?:[^"\\]*(?:\\.[^"\\]*)*)"[^";\\]*)*
[^;"\\]+ Match 1+ times any char except ; " \
(?: Non capture group to repeat as a whole
" Match literally
(?: Non capture group
[^"\\]* Match 0+ times any char except " \
(?:\\.[^"\\]*)* Optionally repeat matching \ and any char followed by 0+ times any char except " and \
) Close the non capture group
" Match literally
[^";\\]* Optionally match any char except " ; \
)* Close the outer non capture group and optionally repeat
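The pattern can be tried against the sample from the question as follows. This is a sketch: the g flag, the String.raw input, and the trim() step are additions made here, since each match keeps the space that follows the preceding separator.

```javascript
// The pattern from the answer above, with the g flag so match() returns every hit.
const pattern = /[^;"\\]+(?:"(?:[^"\\]*(?:\\.[^"\\]*)*)"[^";\\]*)*/g;

// String.raw keeps the backslashes in the sample exactly as typed.
const input = String.raw`my.parameter 10; the.foo "Procedural Map"; pve; server.description "This; is \"my\", my description,.\n"`;

// Matches keep the space that follows each separator, so trim them.
const parameters = (input.match(pattern) || []).map(s => s.trim());
console.log(parameters);
```

The result has four entries, and the semicolons and escaped quotes inside the last quoted value stay part of that value.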
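I found a workaround: replace each separator with the unit separator symbol (␟), then split the result. The lookahead treats a ; as a separator only when an even number of double quotes follows it, so it assumes that quotes (including escaped ones) come in pairs.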
const separatorPattern = /; (?=([^"]*"[^"]*")*[^"]*$)/g;
const myRawString = "server.seed 10; server.pve, server.level \"Procedural Map\"; server.description \"This; is \\\"my\\\", server; description,.\"";
const replacedSeparator = myRawString.replace(separatorPattern, "␟");
const parameters = replacedSeparator.split("␟");
console.log(parameters);
/*[
'server.seed 10',
'server.pve, server.level "Procedural Map"',
'server.description "This; is \\"my\\", server; description,."'
]*/

How to match 2 separate numbers in Javascript

I have this regex that should match when there's two numbers in brackets
/(P|C\(\d+\,{0,1}\s*\d+\))/g
for example:
C(1, 2) or P(2 3) //expected to match
C(43) or C(43, ) // expect not to match
but it also matches the ones with only one number. How can I fix it?
You have a couple of issues. Firstly, your regex will match either P on its own or C followed by numbers in parentheses; you should replace P|C with [PC] (you could use (?:P|C), but [PC] is more performant). Secondly, since your regex makes both the , and the spaces optional, it can match 43 without an additional number (the 4 matches the first \d+ and the 3 the second \d+). You need to force the string to include either a , or at least one space between the numbers. You can do that with this regex:
[PC]\(\d+[ ,]\s*\d+\)
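Both issues show up when the two patterns are run side by side on the examples from the question. A minimal sketch (the variable names original and fixed are chosen here for illustration):

```javascript
const original = /(P|C\(\d+\,{0,1}\s*\d+\))/g; // pattern from the question
const fixed = /[PC]\(\d+[ ,]\s*\d+\)/g;        // pattern from this answer

for (const s of ["C(1, 2) or P(2 3)", "C(43) or C(43, )"]) {
  console.log(s, "| original:", s.match(original), "| fixed:", s.match(fixed));
}
// original: ["C(1, 2)", "P"] and ["C(43)"]
// fixed:    ["C(1, 2)", "P(2 3)"] and null
```

The original pattern matches a bare P and the single-number C(43); the fixed one matches only the two-number forms.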
Try this regex
[PC]\(\d+(?:,| +) *\d+\)
Explanation:
[PC]\( - matches either P( or C(
\d+ - matches 1+ digits
(?:,| +) - matches either a , or 1+ spaces
 *\d+ - matches 0+ spaces followed by 1+ digits (this part of the pattern starts with a literal space)
\) - matches )
You can relax the separator between the numbers by allowing any combination of commas and whitespace, using \d+[,\s]+\d+. Test case:
const regex = /[PC]\(\d+[,\s]+\d+\)/g;
[
  'C(1, 2) or P(2 3)',
  'C(43) or C(43, )'
].forEach(str => {
  let m = str.match(regex);
  console.log(str + ' ==> ' + JSON.stringify(m));
});
Output:
C(1, 2) or P(2 3) ==> ["C(1, 2)","P(2 3)"]
C(43) or C(43, ) ==> null
Your regex should require the presence of at least one delimiting character between the numbers.
I suppose you want to get the numbers out of it separately, like in an array of numbers:
let tests = [
  "C(1, 2)",
  "P(2 3)",
  "C(43)",
  "C(43, )"
];
for (let test of tests) {
  console.log(
    test.match(/[PC]\((\d+)[,\s]+(\d+)\)/)?.slice(1)?.map(Number)
  );
}

Regex replace all character except last 5 character and whitespace with plus sign

I want to replace every character with +, except the whitespace and the last 5 characters.
var str = "HFGR56 GGKDJ JGGHG JGJGIR"
var returnstr = str.replace(/\d+(?=\d{4})/, '+');
The result should be "++++++ +++++ +++++ JGJGIR", but with the code above I don't know how to exclude the whitespace.
You need to match each character individually, and you need to allow a match only if at least six characters of that type follow.
I'm assuming that you want to replace alphanumeric characters. Those can be matched by \w. All other characters will be matched by \W.
This gives us:
returnstr = str.replace(/\w(?=(?:\W*\w){6})/g, "+");
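The number of trailing characters to keep can be made a parameter by building the same lookahead pattern at run time. This is a sketch only; maskExceptLast and its keep parameter are hypothetical names, not part of the answer.

```javascript
// Hypothetical helper: builds the lookahead pattern for a given number of
// trailing word characters to keep.
function maskExceptLast(str, keep) {
  // A word character is replaced only if at least `keep` more word characters
  // follow it; non-word characters in between are skipped by \W*.
  const pattern = new RegExp("\\w(?=(?:\\W*\\w){" + keep + "})", "g");
  return str.replace(pattern, "+");
}

console.log(maskExceptLast("HFGR56 GGKDJ JGGHG JGJGIR", 6)); // ++++++ +++++ +++++ JGJGIR
console.log(maskExceptLast("HFGR56 GGKDJ JGGHG JGJGIR", 5)); // ++++++ +++++ +++++ +GJGIR
```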
The pattern \d+(?=\d{4}) does not match in the example string, as it matches 1+ digits only when 4 more digits follow directly to the right.
Another option is to match the space and 5+ word characters till the end of the string or match a single word character in group 1 using an alternation.
In the callback of replace, return a + if you have matched group 1, else return the match.
/ \w{5,}$|(\w)/
let pattern = / \w{5,}$|(\w)/g;
let str = "HFGR56 GGKDJ JGGHG JGJGIR"
.replace(pattern, (m, g1) => g1 ? '+' : m);
console.log(str);
Another way is to replace one group at a time, where the number of + characters inserted equals the length of the matched group:
var target = "HFGR56 GGKDJ JGGHG JGJGIR";
target = target.replace(
  /(\S+)(?!$|\S)/g, // a run of non-whitespace that is followed by whitespace
  function (m, g1) {
    // return "+".repeat(g1.length); // Non-IE (quick)
    return Array(g1.length + 1).join("+"); // IE (slow)
  });
console.log(target);
You can use negative lookahead with string end anchor.
\w(?!\w{0,5}$)
Match any word character that is not followed by 0 to 5 word characters and then the end of the string.
var str = "HFGR56 GGKDJ JGGHG JGJGIR"
var returnstr = str.replace(/\w(?!\w{0,5}$)/g, '+');
console.log(returnstr)

Regex to validate a comma separated list of unique numbers

I am trying to validate a comma separated list of numbers 1-7 unique (not repeating).
i.e.
2,4,6,7,1 is valid input.
2,2,6 is invalid
2 is valid
2, is invalid
1,2,3,4,5,6,7,8 is invalid ( only 7 number)
I tried ^[1-7](?:,[1-7])*$ but it's accepting repeating numbers
var data = [
'2,4,6,7,1',
'2,2,6',
'2',
'2,',
'1,2,3,2',
'1,2,2,3',
'1,2,3,4,5,6,7,8'
];
data.forEach(function(str) {
document.write(str + ' gives ' + /(?!([1-7])(?:(?!\1).)\1)^((?:^|,)[1-7]){1,7}$/.test(str) + '<br/>');
});
Regexes are not well suited to this. Split the list into an array and check each condition:
function isValid(list) {
  var arrList = list.split(",");
  if (arrList.length > 7) { // more than 7 items means duplicates or out-of-range values
    return false;
  }
  var seen = {};
  for (var i = 0; i < arrList.length; i++) {
    if (!/^[1-7]$/.test(arrList[i])) return false; // empty or not a single digit 1-7
    seen[arrList[i]] = true;
  }
  // fewer distinct keys than items means there are duplicates
  return Object.keys(seen).length === arrList.length;
}
console.log(isValid("2,4,6,7,1")); // true
console.log(isValid("2,2,6")); // false
console.log(isValid("2")); // true
console.log(isValid("2,")); // false
console.log(isValid("1,2,3,4,5,6,7,8")); // false
console.log(isValid("1,2,3")); // true
console.log(isValid("1,2,3,7,7")); // false
No regex is needed. This is much more maintainable and explicit than a convoluted regular expression would be:
function isValid(a) {
  var s = new Set(a);
  s.delete(''); // for the hanging comma case, i.e. "2,"
  return a.length <= 7 && a.length == s.size; // up to 7 items, all distinct
}
var a = '2,4,6,7,1'.split(',');
alert(isValid(a)); // true
a = '2,2,6'.split(',');
alert(isValid(a)); // false
a = '2'.split(',');
alert(isValid(a)); // true
a = '2,'.split(',');
alert(isValid(a)); // false
a = '1,2,3,4,5,6,7,8'.split(',');
alert(isValid(a)); // false
You were pretty close.
^ # BOS
(?! # Validate no dups
.*
( [1-7] ) # (1)
.*
\1
)
[1-7] # Unrolled-loop, match 1 to 7 numb's
(?:
,
[1-7]
){0,6}
$ # EOS
var data = [
'2,4,6,7,1',
'2,2,6',
'2',
'2,',
'1,2,3,2',
'1,2,2,3',
'1,2,3,4,5,6,7,8'
];
data.forEach(function(str) {
document.write(str + ' gives ' + /^(?!.*([1-7]).*\1)[1-7](?:,[1-7]){0,6}$/.test(str) + '<br/>');
});
Output
2,4,6,7,1 gives true
2,2,6 gives false
2 gives true
2, gives false
1,2,3,2 gives false
1,2,2,3 gives false
1,2,3,4,5,6,7,8 gives false
For a number range that exceeds one digit, just add word boundaries around
the capture group and the backreference.
This isolates a complete number.
This particular one covers the number range 1-31.
^ # BOS
(?! # Validate no dups
.*
( # (1 start)
\b
(?: [1-9] | [1-2] \d | 3 [0-1] ) # number range 1-31
\b
) # (1 end)
.*
\b \1 \b
)
(?: [1-9] | [1-2] \d | 3 [0-1] ) # Unrolled-loop, match 1 to 7 numb's
(?: # in the number range 1-31
,
(?: [1-9] | [1-2] \d | 3 [0-1] )
){0,6}
$ # EOS
var data = [
'2,4,6,7,1',
'2,2,6',
'2,30,16,3',
'2,',
'1,2,3,2',
'1,2,2,3',
'1,2,3,4,5,6,7,8'
];
data.forEach(function(str) {
document.write(str + ' gives ' + /^(?!.*(\b(?:[1-9]|[1-2]\d|3[0-1])\b).*\b\1\b)(?:[1-9]|[1-2]\d|3[0-1])(?:,(?:[1-9]|[1-2]\d|3[0-1])){0,6}$/.test(str) + '<br/>');
});
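For comparison, the same multi-digit check can be sketched without a regex for the uniqueness part. Everything here is an assumption made for illustration: the function name isUniqueListInRange, its parameters, and the choice to reject leading zeros (which also means min is assumed to be at least 1).

```javascript
// Hypothetical helper; min, max and maxCount are parameters chosen for this sketch.
function isUniqueListInRange(str, min, max, maxCount) {
  const parts = str.split(",");
  if (parts.length > maxCount) return false;
  const seen = new Set();
  for (const part of parts) {
    // Only plain digit runs without a leading zero count as numbers,
    // so empty items (e.g. from a trailing comma) are rejected here too.
    if (!/^[1-9]\d*$/.test(part)) return false;
    const n = Number(part);
    if (n < min || n > max || seen.has(n)) return false;
    seen.add(n);
  }
  return true;
}

console.log(isUniqueListInRange("2,30,16,3", 1, 31, 7)); // true
console.log(isUniqueListInRange("1,2,3,2", 1, 31, 7));   // false
console.log(isUniqueListInRange("2,", 1, 31, 7));        // false
```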
Like other commenters, I recommend using something other than regular expressions to solve your problem.
I have a solution, but it is too long to be a valid answer here (answers are limited to 30k characters). My solution is actually a regular expression in the language-theory sense, and is 60616 characters long. I will show you the code I used to generate the regular expression; it is written in Python, but easily translated into any language you like. I confirmed that it works in principle with a smaller example (that uses only the numbers 1 to 3):
^(2(,(3(,1)?|1(,3)?))?|3(,(1(,2)?|2(,1)?))?|1(,(3(,2)?|2(,3)?))?)$
Here's the code used to generate the regex:
def build_regex(chars):
    if len(chars) == 1:
        return list(chars)[0]
    return ('('
            +
            '|'.join('{}(,{})?'.format(c, build_regex(chars - {c})) for c in chars)
            +
            ')')
Call it like this:
'^' + build_regex(set("1234567")) + "$"
The concept is the following:
To match a single number a, we can use the simple regex /a/.
To match two numbers a and b, we can match the disjunction /(a(,b)?|b(,a)?)/
Similarly, to match n numbers, we match the disjunction of all elements, each followed by the optional match for the subset of size n-1 not containing that element.
Finally, we wrap the expression in ^...$ in order to match the entire text.
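Since the question is about JavaScript, the generator can be ported roughly as follows. This is a sketch of a translation, not the answerer's code: buildRegex is a name chosen here, and it takes an array of single-character strings instead of a Python set, so the order of the alternatives is fixed rather than arbitrary.

```javascript
// JavaScript port of build_regex; buildRegex is a name chosen for this sketch.
function buildRegex(chars) {
  if (chars.length === 1) return chars[0];
  // Each element, optionally followed by a comma and a valid list of the rest.
  const alternatives = chars.map(
    c => c + "(," + buildRegex(chars.filter(x => x !== c)) + ")?"
  );
  return "(" + alternatives.join("|") + ")";
}

const small = new RegExp("^" + buildRegex(["1", "2", "3"]) + "$");
console.log(small.test("2,3,1")); // true
console.log(small.test("2,2"));   // false
console.log(small.test("1,"));    // false

const full = "^" + buildRegex([..."1234567"]) + "$";
console.log(full.length); // 60616
```

The length of the full pattern agrees with the 60616 characters quoted above, which shows how quickly this construction grows with the number of allowed digits.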
Edit:
Fixed error when repeating digit wasn't the first one.
One way of doing it is:
^(?:(?:^|,)([1-7])(?=(?:,(?!\1)[1-7])*$))+$
It captures a digit and then uses a look-ahead to make sure it doesn't repeat itself.
^ # Start of line
(?: # Non capturing group
(?: # Non capturing group matching:
^ # Start of line
| # or
, # comma
) #
([1-7]) # Capture digit being between 1 and 7
(?= # Positive look-ahead
(?: # Non capturing group
, # Comma
(?!\1)[1-7] # Digit 1-7 **not** being the one captured earlier
)* # Repeat group any number of times
$ # Up to end of line
) # End of positive look-ahead
)+ # Repeat group (must be present at least once)
$ # End of line
var data = [
'2,4,6,7,1',
'2,2,6',
'2',
'2,',
'1,2,3,4,5,6,7,8',
'1,2,3,3,6',
'3,1,5,1,8',
'3,2,1'
];
data.forEach(function(str) {
document.write(str + ' gives ' + /^(?:(?:^|,)([1-7])(?=(?:,(?!\1)[1-7])*$))+$/.test(str) + '<br/>');
});
Note! Don't know if performance is an issue, but this does it in almost half the number of steps compared to sln's solution ;)
