Complex assignments with comma separator - javascript

I have a series of strings that will be passed to a function, and that function must return an array. Each string is a list of variables to be exported in bash, and some of those variables may contain JSON. Here are the possible strings, as examples, with the expected result for each:
string | return | desc
--- | --- | ---
ONE=one | [ "ONE=one" ] | Array of one element
ONE="{}" | [ 'ONE="{}"' ] | Array of one element with quoted value.
ONE='{}' | [ "ONE='{}'" ] | Array of one element with simple quoted value
ONE='{attr: \"value\"}' | [ "ONE='{attr: \\"value\\"}'" ] | Array of one element
ONE='{attr1: \"value\", attr2:\"value attr 2\"}' | [ "ONE='{attr1: \\"value\\", attr2:\\"value attr 2\\"}'" ] | Array of one element and json inside with multiple values
ONE=one,TWO=two | [ "ONE=one", "TWO=two" ] | Array of two elements
ONE=one, TWO=two | [ "ONE=one", "TWO=two" ] | Array of two elements (Ignoring space after comma)
ONE='{}', TWO=two | [ "ONE='{}'", "TWO=two" ] | Array of two elements, one quoted
ONE='{}',TWO='{}',THREE='{}' | [ "ONE='{}'", "TWO='{}'", "THREE='{}'" ] | Array of three elements
ONE='{}', TWO=two, THREE=three | [ "ONE='{}'", "TWO=two", "THREE=three" ] | Array of three elements, one quoted
How can I get the correct regex or process to produce the expected result for each one?
This is what I have:
function parseVars(envString) {
  let matches = envString.matchAll(/([A-Za-z][A-Za-z0-9]+=(["']?)((?:\\\2|(?:(?!\2)).)*)(\2))(\,\s?)?/g);
  let ret = [];
  for (const match of matches) {
    ret.push(match[1].trim());
  }
  return ret;
}
And tests:
describe("parseVars function", () => {
it("should be one simple variable", () => {
expect(parseVars("ONE=one")).toMatchObject([
"ONE=one"
]);
});
it("should be two simple variable", () => {
expect(parseVars("ONE=one,TWO=two")).toMatchObject([
"ONE=one",
"TWO=two"
]);
});
it("should be two simple variable (Trim space)", () => {
expect(parseVars("ONE=one, TWO=two")).toMatchObject([
"ONE=one",
"TWO=two"
]);
});
it("should be simple json", () => {
expect(parseVars("ONE='{}'")).toMatchObject([
"ONE='{}'",
]);
});
it("should be three simple json", () => {
expect(parseVars("ONE='{}',TWO='{}',THREE='{}'")).toMatchObject([
"ONE='{}'",
"TWO='{}'",
"THREE='{}'",
]);
});
it("should be three simple json (Simple quote)", () => {
expect(parseVars("ONE='{}'")).toMatchObject([
"ONE='{}'",
]);
});
it("should be three simple json with attribute", () => {
expect(parseVars("ONE='{attr: \"value\"}'")).toMatchObject([
"ONE='{attr: \"value\"}'",
]);
});
it("should be complex json with multiple attributes", () => {
expect(parseVars("ONE='{attr1: \"value\", attr2:\"value attr 2\"}'")).toMatchObject([
"ONE='{attr1: \"value\", attr2:\"value attr 2\"}'",
]);
});
it("should be one json and one simple var", () => {
expect(parseVars("ONE='{}', TWO=two")).toMatchObject([
"ONE='{}'",
"TWO=two",
]);
});
it("should be one json and two simple vars", () => {
expect(parseVars("ONE='{}', TWO=two, THREE=three")).toMatchObject([
"ONE='{}'",
"TWO=two",
"THREE=three",
]);
});
});
And the results:
parseVars function
✕ should be one simple variable (4ms)
✕ should be two simple variable (1ms)
✕ should be two simple variable (Trim space)
✓ should be simple json (1ms)
✓ should be three simple json
✓ should be three simple json (Simple quote)
✓ should be three simple json with attribute
✓ should be complex json with multiple attributes
✕ should be one json and one simple var (1ms)
✕ should be one json and two simple vars (1ms)

The issue with your regex is that it only handles quoted values such as ONE='{attr: \"value\"}'; it does not allow for unquoted ones such as ONE=one.
When an optional capture group like (['"]?) matches nothing, it still participates in the match and captures an empty string. Combined with the negative lookahead (?!\2), that makes everything fail: an empty string can be matched at every position, so the lookahead rejects every character and the value is never consumed.
You just need to add |[^,]* as an alternative to the quote-enclosure branch, so that it works for both scenarios.
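To see the empty-backreference effect in isolation, here is a minimal sketch. The reduced pattern and the helper name capturedValue are made up for illustration; they are not part of the original code.

```javascript
// Reduced form of the question's pattern: optional quote, value, closing quote.
const reduced = /=(["']?)((?:(?!\1).)*)\1/;

// Hypothetical helper: returns what the value group (group 2) captured.
function capturedValue(text) {
  const m = reduced.exec(text);
  return m ? m[2] : null;
}

// Quoted: \1 is "'", so the lookahead only stops at the closing quote.
console.log(JSON.stringify(capturedValue("ONE='x'"))); // "x"
// Unquoted: \1 is empty, so (?!\1) fails everywhere and nothing is consumed.
console.log(JSON.stringify(capturedValue("ONE=one"))); // ""
```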
Here's a simplified version of your concept:
/(?=\b[a-z])\w+=(?:(['"])(?:\\\1|(?!\1).)*\1|[^,]*)/gi
Explanation
(?=\b[a-z])\w+  one or more word characters, the first of which must be a letter
=  an equals sign
(?:  start of a non-capturing group
(['"])(?:\\\1|(?!\1).)*\1  a quoted value, which may contain backslash-escaped quotes
|[^,]*  or any run of characters that contains no comma
)  end of the non-capturing group
See the proof
const texts = [
`ONE=one`,
`ONE="{}"`,
`ONE='{}'`,
`ONE='{attr: \"value\"}'`,
`ONE='{attr1: \"value\", attr2:\"value attr 2\"}'`,
`ONE=one,TWO=two`,
`ONE=one, TWO=two`,
`ONE='{}', TWO=two`,
`ONE='{}',TWO='{}',THREE='{}'`,
`ONE='{}', TWO=two, THREE=three`
];
const regex = /(?=\b[a-z])\w+=(?:(['"])(?:\\\1|(?!\1).)*\1|[^,]*)/gi;
texts.forEach(text => {
console.log(text, '=>', text.match(regex));
})
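Plugged back into the function from the question, the pattern could be used roughly as follows. This is a sketch that assumes the trimming step is wanted for unquoted values followed by a space before the comma; it is not code from the original answer.

```javascript
// Sketch of parseVars built on the pattern above.
function parseVars(envString) {
  const regex = /(?=\b[a-z])\w+=(?:(['"])(?:\\\1|(?!\1).)*\1|[^,]*)/gi;
  // match() with the g flag returns every full match, or null when there is none.
  const matches = envString.match(regex) ?? [];
  // Assumption: trailing spaces of unquoted values should be dropped.
  return matches.map((entry) => entry.trim());
}

console.log(parseVars("ONE=one, TWO=two"));
console.log(parseVars("ONE='{}', TWO=two, THREE=three"));
console.log(parseVars(`ONE='{attr1: "value", attr2:"value attr 2"}'`));
```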

You might also start the match with a char a-z followed by optional word chars, then match either from an opening to a closing " or ', or match everything except whitespace or a comma. This needs no lookarounds or capture groups.
Use a case-insensitive match with /i:
\b[a-z]\w*=(?:"[^"\\]*(?:\\.[^"\\]*)*"|\'[^\'\\]*(?:\\.[^\'\\]*)*\'|[^\s,]+)
The pattern matches:
\b A word boundary to prevent a partial match
[a-z]\w*= Match a char a-z, optional word chars and =
(?: Non capture group
"[^"\\]*(?:\\.[^"\\]*)*" Match from " till " not stopping at an escaped one
| Or
\'[^\'\\]*(?:\\.[^\'\\]*)*\' Match from ' till ' not stopping at an escaped one
| Or
[^\s,]+ Match 1+ times any char except a whitespace char or ,
) Close non capture group
See a Regex demo
const regex = /\b[a-z]\w*=(?:"[^"\\]*(?:\\.[^"\\]*)*"|\'[^\'\\]*(?:\\.[^\'\\]*)*\'|[^\s,]+)/gi;
[
`ONE=one`,
`ONE="{}"`,
`ONE='{}'`,
`ONE='{attr: \"value\"}'`,
`ONE="{attr: \"value\"}"`,
`ONE='{attr1: \"value\", attr2:\"value attr 2\"}'`,
`ONE=one,TWO=two`,
`ONE=one, TWO=two`,
`ONE='{}', TWO=two`,
`ONE='{}',TWO='{}',THREE='{}'`,
`ONE='{}', TWO=two, THREE=three`
].forEach(s => console.log(s.match(regex)))
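The double-quote branch of that pattern is what lets a value contain escaped quotes. A small comparison with a naive quote pattern shows the difference; the sample string and variable names here are invented for the example.

```javascript
// Naive: stops at the first double quote, escaped or not.
const naive = /"[^"]*"/;
// Escape-aware: the double-quote branch from the pattern above.
const escapeAware = /"[^"\\]*(?:\\.[^"\\]*)*"/;

// String.raw keeps the backslashes in the sample text.
const sample = String.raw`MSG="say \"hi\" now", NEXT=1`;

console.log(sample.match(naive)[0]);       // "say \"
console.log(sample.match(escapeAware)[0]); // "say \"hi\" now"
```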

Related

How to add access property and wrap the arguments with curly brackets with regex in js?

I have an array of strings. Each string might contain function calls.
I made this regular expression to match the word between $ctrl. and (, as long as it is immediately after $ctrl. and it will also match the parameters inside the parentheses if they exist.
/(\$ctrl\.)(\w+)(\(([^)]+)\))?/g;
Once there is a match, I append .emit to the function name ($ctrl.foo() becomes $ctrl.foo.emit()), and if there are arguments, I wrap them in curly brackets ($ctrl.foo(account, user) becomes $ctrl.foo.emit({ account, user })).
The problem is this regex doesn't work for some cases.
const inputs = [
'$ctrl.foo(account)',
'$ctrl.foo(account, bla)',
'$ctrl.foo(account, bla); $ctrl.some()',
'$ctrl.foo(account, bla);$ctrl.some(you)',
'$ctrl.foo.some(account, bla);$ctrl.fn.some(you)',
'$ctrl.gog',
];
const regex = /(\$ctrl\.)(\w+)(\(([^)]+)\))?/g;
inputs.forEach((input) => {
let output = input.replace(regex, '$1$2.emit({$4})');
console.log(output);
});
The results:
$ctrl.foo.emit({account})
$ctrl.foo.emit({account, bla})
$ctrl.foo.emit({account, bla}); $ctrl.some.emit({})()
$ctrl.foo.emit({account, bla});$ctrl.some.emit({you})
$ctrl.foo.emit({}).some(account, bla);$ctrl.fn.emit({}).some(you)
$ctrl.gog.emit({})
The first two results are excellent: the regex adds emit and wraps the arguments in {..}.
But the regex does not work when there are no arguments, or when there is another property access before the function call, as in $ctrl.foo.bar() (this case should not match at all).
What is missing in my regex to get these results?
$ctrl.foo.emit({account})
$ctrl.foo.emit({account, bla})
$ctrl.foo.emit({account, bla}); $ctrl.some.emit()
$ctrl.foo.emit({account, bla});$ctrl.some.emit({you})
$ctrl.foo.some(account, bla);$ctrl.fn.some(you)
$ctrl.gog
Maybe this modified regexp works better for you?
const inputs = [
'$ctrl.foo(account)',
'$ctrl.foo(account, bla)',
'$ctrl.foo(account, bla); $ctrl.some()',
'$ctrl.foo(account, bla);$ctrl.some(you)',
'$ctrl.foo.some(account, bla);$ctrl.fn.some(you)',
'$ctrl.gog',
];
const regex = /(\$ctrl\.)(\w+)(\(([^)]*)\))/g;
inputs.forEach((input) => {
let output = input.replace(regex, '$1$2.emit({$4})');
console.log(output);
});
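I changed two quantifiers in the regexp:
[^)]+ to [^)]*, which also allows an empty argument list
)?/g to )/g, which makes the \( ... \) group at the end of the pattern compulsory instead of optional.
Note that a call without arguments, such as $ctrl.some(), still becomes $ctrl.some.emit({}) rather than $ctrl.some.emit().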
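If the empty-argument case has to produce emit() exactly as in the expected output, one option is a replacer function instead of a replacement string. This is only a sketch; the helper name addEmit and the regrouped pattern are assumptions, not part of the answer above.

```javascript
// Hypothetical helper: rewrites $ctrl.fn(args) into $ctrl.fn.emit({args}).
function addEmit(input) {
  // Group 1: "$ctrl." plus the function name; group 2: the raw argument list.
  const callPattern = /(\$ctrl\.\w+)\(([^)]*)\)/g;
  return input.replace(callPattern, (whole, callee, args) =>
    // No arguments: plain emit(); otherwise wrap the arguments in braces.
    args.trim() === "" ? `${callee}.emit()` : `${callee}.emit({${args}})`
  );
}

[
  "$ctrl.foo(account, bla); $ctrl.some()",
  "$ctrl.foo.some(account, bla);$ctrl.fn.some(you)",
  "$ctrl.gog",
].forEach((input) => console.log(addEmit(input)));
```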

How to split a string based on a regex pattern with conditions (JavaScript)

I am trying to split a string so that I can separate it depending on a pattern. I'm having trouble getting the correct regex pattern to do so. I also need to insert the results into an array of objects. Perhaps by using a regex pattern, the string can be split into a resulting array object to achieve the objective. Note that the regex pattern must not discriminate between - or --. Or is there any better way to do this?
I tried using string split() method, but to no avail. I am trying to achieve the result below:
const example1 = `--filename test_layer_123.png`;
const example2 = `--code 1 --level critical -info "This is some info"`;
const result1 = [{ name: "--filename", value: "test_layer_123.png" }];
const result2 = [
{ name: "--code", value: "1" },
{ name: "--level", value: "critical" },
{ name: "-info", value: "This is some info" },
];
If you really want to use a regex to solve this, try this pattern: /((?:--|-)\w+)\s+"?([^-"]+)"?/g
Note that [^-"]+ stops at the first hyphen, so a value that contains a - is cut short.
Code example:
function matchAllCommands(text, pattern) {
  let new_array = [];
  let matches = text.matchAll(pattern);
  for (const match of matches) {
    // [^-"]+ also swallows the space before the next flag, so trim the value
    new_array.push({ name: match.groups.name, value: match.groups.value.trim() });
  }
  return new_array;
}
let RegexPattern = /(?<name>(?:--|-)\w+)\s+"?(?<value>[^-"]+)"?/g;
let text = '--code 1 --level critical -info "This is some info"';
console.log(matchAllCommands(text, RegexPattern));
Here is a solution that splits the argument string using a positive lookahead, and creates the array of key & value pairs using a map:
function getArgs(str) {
return str.split(/(?= --?\w+ )/).map(str => {
let m = str.match(/^ ?([^ ]+) (.*)$/);
return {
name: m[1],
value: m[2].replace(/^"(.*)"$/, '$1')
};
});
}
[
'--filename test_layer_123.png', // example1
'--code 1 --level critical -info "This is some info"' // example2
].forEach(str => {
var result = getArgs(str);
console.log(JSON.stringify(result, null, ' '));
});
Positive lookahead regex for split (spaces shown in quotes):
(?= -- positive lookahead start
" --?\w+ " -- expect a space, 1 or 2 dashes, 1+ word chars, a space
) -- positive lookahead end
Match regex in map (spaces shown in quotes):
^ -- anchor at start of string
" ?" -- optional space
([^ ]+) -- capture group 1: capture everything up to the next space
" " -- a space
(.*) -- capture group 2: capture everything that's left
$ -- anchor at end of string
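For comparison, a single matchAll pass can treat quoted and unquoted values as separate alternatives, which also keeps hyphens inside a value. This is a sketch under the assumption that every flag is followed by exactly one value; the name parseArgs and the pattern are illustrative and not taken from either answer.

```javascript
// Hypothetical helper: "--name value" pairs into { name, value } objects.
function parseArgs(text) {
  // Group 1: flag with one or two dashes; group 2: quoted value; group 3: bare value.
  const pattern = /(--?\w+)\s+(?:"([^"]*)"|(\S+))/g;
  return Array.from(text.matchAll(pattern), ([, name, quoted, bare]) => ({
    name,
    // Only one of the two value groups participates in a given match.
    value: quoted ?? bare,
  }));
}

console.log(parseArgs('--code 1 --level critical -info "This is some info"'));
console.log(parseArgs("--filename test-layer_123.png"));
```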

How do you access the groups of match/matchAll like an array?

Here's what I would like to be able to do:
function convertVersionToNumber(line) {
const groups = line.matchAll(/^# ([0-9]).([0-9][0-9]).([0-9][0-9])\s*/g);
return parseInt(groups[1] + groups[2] + groups[3]);
}
convertVersionToNumber("# 1.03.00")
This doesn't work because groups is an IterableIterator<RegExpMatchArray>, not an array. Array.from doesn't seem to turn it into an array of groups either. Is there an easy way (ideally something that can fit on a single line) that can convert groups into an array?
The API of that IterableIterator<RegExpMatchArray> is a little inconvenient, and I don't know how to skip the first element in a for...of. I mean, I do know how to use both of these, it just seems like it's going to add 4+ lines so I'd like to know if there is a more concise way.
I am using typescript, so if it has any syntactic sugar to do this, I'd be happy to use that.
1) matchAll returns an iterator (a RegExp String Iterator), not an array.
Spreading that iterator into an array gives you all the matches. Since this input produces only one match, the array contains a single element:
[ '# 1.03.00', '1', '03', '00', index: 0, input: '# 1.03.00', groups: undefined ]
So use the spread operator to collect the matches into an array, then take the first element:
[...result]
function convertVersionToNumber(line) {
const result = line.matchAll(/^# ([0-9]).([0-9][0-9]).([0-9][0-9])\s*/g);
const groups = [...result][0];
return parseInt(groups[1] + groups[2] + groups[3]);
}
console.log(convertVersionToNumber("# 1.03.00"));
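Since your regex, /^# ([0-9]).([0-9][0-9]).([0-9][0-9])\s*/, is anchored to the start of the string with ^, it can match at most once, so the first element is the only one.
2) If there are multiple matches, you can spread the results into an array and then use for...of to loop over the matches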
function convertVersionToNumber(line) {
const iterator = line.matchAll(/# ([0-9]).([0-9][0-9]).([0-9][0-9])\s*/g);
const results = [...iterator];
for (let arr of results) {
const [match, g1, g2, g3] = arr;
console.log(match, g1, g2, g3);
}
}
convertVersionToNumber("# 1.03.00 # 1.03.00");
Alternate solution: You can also get the same result using simple match also
function convertVersionToNumber(line) {
const result = line.match(/\d/g);
return +result.join("");
}
console.log(convertVersionToNumber("# 1.03.00"));
You do not need .matchAll in this concrete case. You simply want to match a string in a specific format and re-format it by only keeping the three captured substrings.
You may do it with .replace:
function convertVersionToNumber(line) {
return parseInt(line.replace(/^# (\d)\.(\d{2})\.(\d{2})[\s\S]*/, '$1$2$3'));
}
console.log( convertVersionToNumber("# 1.03.00") );
You may check if the string before replacing is equal to the new string if you need to check if there was a match at all.
Note you need to escape dots to match them as literal chars.
The ^# (\d)\.(\d{2})\.(\d{2})[\s\S]* pattern matches
^ - start of string
"# " - a # followed by a space
(\d) - Group 1: a digit
\. - a dot
(\d{2}) - Group 2: two digits
\. - a dot
(\d{2}) - Group 3: two digits
[\s\S]* - the rest of the string (zero or more chars, as many as possible).
The $1$2$3 replacement pattern is the concatenated Group 1, 2 and 3 values.
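If the goal is simply to reach the groups like an array in one line, a non-global match combined with array destructuring is another option. The sketch below assumes that returning NaN for a non-matching line is acceptable; that choice is not from the original posts.

```javascript
function convertVersionToNumber(line) {
  // Without the g flag, match() returns [fullMatch, group1, group2, group3] or null.
  const [, major, minor, patch] = line.match(/^# (\d)\.(\d{2})\.(\d{2})/) ?? [];
  // Assumption: signal "no match" with NaN rather than throwing.
  if (major === undefined) return NaN;
  return parseInt(major + minor + patch, 10);
}

console.log(convertVersionToNumber("# 1.03.00")); // 10300
```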

why condition is always true in javascript?

Could you please tell me why my condition is always true? I am trying to validate my value using a regex. I have a few conditions:
Name should not contain test "text"
Name should not contain three consecutive characters, for example "abc", "pqr", "xyz"
Name should not contain the same character three times, for example "aaa", "ccc", "zzz"
This is what I have:
https://jsfiddle.net/aoerLqkz/2/
var val = 'ab dd'
if (/test|[^a-z]|(.)\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz/i.test(val)) {
alert( 'match')
} else {
alert( 'false')
}
I tested my code with the following strings and got unexpected results:
input string "abc": output is fine: "match"
input string "aaa": output is fine: "match"
input string "aa a": output is "match". Why does it match? There is a space between the characters.
input string "sa c": output is "match". Why does it match? The characters are different and there is a space between them.
The string sa c includes a space, and the pattern [^a-z] (anything other than a to z) matches the space.
Possibly you want to use ^ and $ so that your pattern is anchored to the start and end of the string instead of looking for a match anywhere inside it.
there is space between them why it matched ????
Because of the [^a-z] part of your regular expression, which matches the space:
> /[^a-z]/i.test('aa a');
true
The issue is the [^a-z]. This means that any string that has a non-letter character anywhere in it will be a match. In your example, it is matching the space character.
The solution? Simply remove |[^a-z]. Without it, your regex meets all three criteria.
test checks if the value contains the word 'test'.
abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz checks if the value contains three sequential letters.
(.)\1\1 checks if any character is repeated three times.
Complete regex:
/test|(.)\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz/i
I find it helpful to use a regex tester, like https://www.regexpal.com/, when writing regular expressions.
NOTE: I am assuming that the second criteria actually means "three consecutive letters", not "three consecutive characters" as it is written. If that is not true, then your regex doesn't meet the second criteria, since it only checks for three consecutive letters.
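As a quick check of the regex without |[^a-z], the sketch below builds the same alternation programmatically instead of typing out all 24 three-letter runs. The variable names are invented for the example.

```javascript
const alphabet = "abcdefghijklmnopqrstuvwxyz";
// "abc", "bcd", ..., "xyz": every run of three sequential letters.
const runs = Array.from({ length: alphabet.length - 2 }, (_, i) => alphabet.slice(i, i + 3));
// Same alternatives as the complete regex above, with the i flag.
const forbidden = new RegExp(`test|(.)\\1\\1|${runs.join("|")}`, "i");

for (const name of ["abc", "aaa", "aa a", "sa c"]) {
  // true means the name breaks one of the rules
  console.log(JSON.stringify(name), forbidden.test(name));
}
```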
I would not do this with regular expressions: the expression will only get more complicated, and you lose the flexibility you would have if you programmed it.
The rules you describe suggest the concept of a string derivative. The derivative of a string is the distance between each pair of successive characters. It is especially useful for password security checks and for measuring string variation in general.
const derivative = (str) => {
const result = [];
for(let i=1; i<str.length; i++){
result.push(str.charCodeAt(i) - str.charCodeAt(i-1));
}
return result;
};
//these strings have the same derivative: [0,0,0,0]
console.log(derivative('aaaaa'));
console.log(derivative('bbbbb'));
//these strings also have the same derivative: [1,1,1,1]
console.log(derivative('abcde'));
console.log(derivative('mnopq'));
//up and down: [1,-1,1,-1]
console.log(derivative('ababa'));
With this in mind, you can apply each of your rules to each string.
// Rules:
// 1. Name should not contain test "text"
// 2. Name should not contain three consecutive characters example "abc" , "pqr" ,"xyz"
// 3. Name should not contain the same character three times example "aaa", "ccc" ,"zzz"
const derivative = (str) => {
const result = [];
for(let i=1; i<str.length; i++){
result.push(str.charCodeAt(i) - str.charCodeAt(i-1));
}
return result;
};
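// True when `sub` occurs as a contiguous run inside `master`.
// The surrounding commas stop "1,1" from matching inside "11,1" or "-1,1".
const arrayContains = (master, sub) =>
  `,${master.join(",")},`.includes(`,${sub.join(",")},`);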
const rule1 = (text) => !text.includes('text');
const rule2 = (text) => !arrayContains(derivative(text),[1,1]);
const rule3 = (text) => !arrayContains(derivative(text),[0,0]);
const testing = [
"smthing textual",'abc','aaa','xyz','12345',
'1111','12abb', 'goodbcd', 'weeell'
];
const results = testing.map((input)=>
[input, rule1(input), rule2(input), rule3(input)]);
console.log(results);
Based on the 3 conditions in the post, the following regex should work.
Regex: ^(?:(?!test|([a-z])\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz).)*$
Demo
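Unlike the "contains" patterns earlier in the thread, this regex matches only when the whole name is valid, so the result of test() flips meaning. A minimal usage sketch, with the helper name isValidName assumed for illustration:

```javascript
// The pattern from the answer above, copied as-is (no i flag).
const validName = /^(?:(?!test|([a-z])\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz).)*$/;

// Hypothetical helper: true means the name passes all three rules.
function isValidName(name) {
  return validName.test(name);
}

for (const name of ["ab dd", "abc", "zzz", "contest"]) {
  console.log(JSON.stringify(name), isValidName(name));
}
```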

JS regexp: match repeated pattern

I wonder why these regexps aren't equivalent:
/(a)(a)(a)/.exec ("aaa").toString () => "aaa,a,a,a" , as expected
/(a){3}/.exec ("aaa").toString () => "aaa,a" :(
/(a)*/.exec ("aaa").toString () => "aaa,a" :(
How must the last two be reformulated so that they behave like the first? The important thing is that I want arbitrary multiples matched and remembered.
The following line
/([abc])*/.exec ("abc").toString () => "abc,c"
suggests that only one character is saved per parenthesis - the last match.
You probably are looking for this:
var re = /([abc])/g,
    matches = [],
    input = "abc",
    match; // declared here so the loop does not create an implicit global
while ((match = re.exec(input)) !== null) matches.push(match[1]);
console.log(matches);
//=> ["a", "b", "c"]
Remember that a repeated capturing group gives you only the last iteration it matched, not all of them.
RegexBuddy describes it very well:
Note: you repeated the capturing group itself. The group will capture
only the last iteration
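If the repetition also needs to be validated as a whole, one approach is to check the full string first and then collect each iteration separately. This is a sketch; the function name captureRepeats and the choice to return null on invalid input are assumptions.

```javascript
// Hypothetical helper: every iteration of ([abc])* as its own array entry.
function captureRepeats(input) {
  // Step 1: make sure the whole string is a repetition of the unit.
  if (!/^[abc]*$/.test(input)) return null;
  // Step 2: collect each iteration with a global match.
  return Array.from(input.matchAll(/[abc]/g), (m) => m[0]);
}

console.log(captureRepeats("abc")); // ["a", "b", "c"]
console.log(captureRepeats("abx")); // null
```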
