How can I extract a specific column from a string table? - javascript

I have some text. How can I extract the Name column with JavaScript and a RegExp? I have some code, but it doesn't work correctly:
const stdout =
`Name Enabled Description
---- ------- -----------
6ytec True
DefaultAccount False Some text.
John True
WDAGUtilityAccount False Some text
Admin False Some text
Guest False Some text`
const regexp = new RegExp("([а-яА-Яa-zA-Z_0-9]+)\\s+(True|False)\\s+");
let result = stdout.match(regexp);
console.log(result);

An alternate approach without regex:
const stdout =
`Name Enabled Description
---- ------- -----------
6ytec True
DefaultAccount False Some text.
John True
WDAGUtilityAccount False Some text
Admin False Some text
Guest False Some text`
const firstColumn = stdout
.split('\n')
.map(line => line.split(' '))
.map(word => word[0])
.slice(2)
console.log(firstColumn);

You can use the following:
[...stdout.matchAll(/^(?!----|Name\b)\S+\b/gm)]
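^ - matches the beginning of a line (because of the m flag)
(?!----|Name\b) - negative lookahead: skip lines that start with ---- or with the word Name
| - or (inside the lookahead)
\S+ - one or more non-whitespace characters
\b - a word boundary, i.e. the position between a word character and a non-word character (here, where the name ends)
/gm - the g (global) and m (multiline) flags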
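matchAll returns match objects rather than strings, so a small mapping step is needed to get a plain array. A minimal sketch, assuming stdout holds the table from the question (the variable name names is only for illustration):

```javascript
// Sample table copied from the question.
const stdout = `Name Enabled Description
---- ------- -----------
6ytec True
DefaultAccount False Some text.
John True
WDAGUtilityAccount False Some text
Admin False Some text
Guest False Some text`;

// Each item yielded by matchAll is a match object; m[0] is the full matched text.
const names = Array.from(
  stdout.matchAll(/^(?!----|Name\b)\S+\b/gm),
  m => m[0]
);

console.log(names);
// [ '6ytec', 'DefaultAccount', 'John', 'WDAGUtilityAccount', 'Admin', 'Guest' ]
```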

I found a solution! Thanks for mentioning groups:
console.log(stdout);
const regexp = new RegExp("(?<name>([а-яА-Яa-zA-Z_0-9]+))\\s+(True|False)\\s+", "mg");
for (const match of stdout.matchAll(regexp)) {
console.log(match.groups.name);
}

Related

JavaScript Alphanumeric Regex and allow asterisk at the start of the string but do not allow asterisk at the last 4 digits of the string

I have this regex, ^[a-zA-Z0-9*]+$, which allows only alphanumeric characters and the asterisk (*). I would like to allow asterisks only towards the start of the string: an asterisk must not appear in the last 4 characters of the string.
new RegExp('^[a-zA-Z0-9*]+$').test('test') ---Valid
new RegExp('^[a-zA-Z0-9*]+$').test('test1234') --Valid
new RegExp('^[a-zA-Z0-9*]+$').test('test##_')--Invalid
new RegExp('^[a-zA-Z0-9*]+$').test('****1234') --Valid
new RegExp('^[a-zA-Z0-9*]+$').test('*tes**1234') --Valid
new RegExp('^[a-zA-Z0-9*]+$').test('test****') --Should be Invalid
How would I allow asterisks only at the start of the string? If an asterisk is present in any of the last 4 positions, the string should be invalid.
You can use this regex to allow only alphanumeric chars and asterisks, but no asterisk in the last 4 character positions:
const regex = /^(?:[a-z\d*]*[a-z\d]{4}|[a-z\d]{1,3})$/i;
[
'1',
'12',
'test',
'test1234',
'****1234',
'*tes**1234',
'*1*2345',
'test##_',
'test****',
'test***5',
'test**4*',
'*3**'
].forEach(str => {
let result = regex.test(str);
console.log(str, '==>', result);
});
Output:
1 ==> true
12 ==> true
test ==> true
test1234 ==> true
****1234 ==> true
*tes**1234 ==> true
*1*2345 ==> true
test##_ ==> false
test**** ==> false
test***5 ==> false
test**4* ==> false
*3** ==> false
Explanation of regex:
^ -- anchor at start of string
(?: -- start non-capture group (for logical OR)
[a-z\d*]*[a-z\d]{4} -- allow alphanumeric chars and asterisk, followed by 4 alphanumeric chars
| -- logical OR
[a-z\d]{1,3} -- allow 1 to 3 alphanumeric chars
) -- close group
$ -- anchor at end of string
Note that it is easier to read and more efficient to use a regex literal /.../ instead of new RegExp("..."). You need the RegExp constructor only if the pattern is built from variable input.
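To illustrate when the constructor is actually needed, here is a minimal sketch in which the number of protected trailing characters is a variable. The helper name buildMaskRegex and its tailLength parameter are hypothetical and not part of the original answer, and the sketch assumes tailLength is at least 2:

```javascript
// Hypothetical helper: builds the same pattern as above, but for a
// configurable number of trailing characters that must not be asterisks.
// Assumes tailLength >= 2 (otherwise the {1,n} quantifier would be invalid).
function buildMaskRegex(tailLength) {
  const shortMax = tailLength - 1;
  // Backslashes are doubled because the pattern is written as a string.
  return new RegExp(
    `^(?:[a-z\\d*]*[a-z\\d]{${tailLength}}|[a-z\\d]{1,${shortMax}})$`,
    'i'
  );
}

const lastFour = buildMaskRegex(4);
console.log(lastFour.test('****1234')); // true
console.log(lastFour.test('test***5')); // false
```

With a fixed length of 4, the literal form shown in the answer stays the simpler choice.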

split string based on words and highlighted portions with `^` sign

I have a string that has highlighted portions with ^ sign:
const inputValue = 'jhon duo ^has a car^ right ^we know^ that';
How can I get an array that is split by words and by ^ highlights, so that the result is this array:
['jhon','duo', 'has a car', 'right', 'we know', 'that']
Using const input = inputValue.split('^'); to split by ^ and const input = inputValue.split(' '); to split by words is not working and I think we need a better idea.
How would you do this?
You can use matchAll with a regular expression:
const inputValue = 'jhon duo ^has a car^ right ^we know^ that';
const result = Array.from(inputValue.matchAll(/\^(.*?)\^|([^^\s]+)/g),
([, a, b]) => a || b);
console.log(result);
\^(.*?)\^ will match a literal ^ and all characters until the next ^ (including it), and the inner part is captured in a capture group
([^^\s]+) will match a series of non-white space characters that are not ^ (a "word") in a second capture group
| makes the above two patterns alternatives: if the first doesn't match, the second is tried.
The Array.from callback will extract only what occurs in a capture group, so excluding the ^ characters.
trincot's answer is good, but here's a version that doesn't use regex and will throw an error when there are mismatched ^:
function splitHighlights (inputValue) {
const inputSplit = inputValue.split('^');
let highlighted = true
const result = inputSplit.flatMap(splitVal => {
highlighted = !highlighted
if (splitVal == '') {
return [];
} else if (highlighted) {
return splitVal.trim();
} else {
return splitVal.trim().split(' ')
}
})
if (highlighted) {
throw new Error(`unmatched '^' char: expected an even number of '^' characters in input`);
}
return result;
}
console.log(splitHighlights('^jhon duo^ has a car right ^we know^ that'));
console.log(splitHighlights('jhon duo^ has^ a car right we^ know that^'));
console.log(splitHighlights('jhon duo^ has a car^ right ^we know^ that'));
console.log(splitHighlights('jhon ^duo^ has a car^ right ^we know^ that'));
You can still use split() but capture the split-sequence to include it in the output.
For splitting you could use / *\^([^^]*)\^ *| +/ to get trimmed items in the results.
const inputValue = 'jhon duo ^has a car^ right ^we know^ that';
// filtering avoids empty items if split-sequence at start or end
let input = inputValue.split(/ *\^([^^]*)\^ *| +/).filter(Boolean);
console.log(input);
regex -- matches
" *\^" -- any amount of space followed by a literal caret
"([^^]*)" -- captures any amount of non-carets
"\^ *" -- literal caret followed by any amount of space
"| +" -- OR split at one or more spaces

Regex to capture all vars between delimiters

How do I capture all of 1, 2 and 3 in not |1|2|3|?
My regex \|(.*?)\| skips 2.
const re = /\|(.*?)\|/gi;
const text = 'not |1|2|3|'
console.log(text.match(re).map(m => m[1]));
You can use
const re = /\|([^|]*)(?=\|)/g;
const text = 'not |123|2456|37890|'
console.log(Array.from(text.matchAll(re), m => m[1]));
Details:
\| - a | char
([^|]*) - Group 1: zero or more chars other than |
(?=\|) - a positive lookahead that matches a location that is immediately followed with |.
If you do not care about matching the | on the right, you can remove the lookahead.
If you also need to match till the end of string when the trailing | is missing, you can use /\|([^|]*)(?=\||$)/g.
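A short sketch comparing the two variants; the names strict, lenient and capture are only for illustration. One thing to watch for: when the input does end with |, the $ variant also matches that final | with an empty group, so an empty string shows up at the end unless it is filtered out.

```javascript
const strict = /\|([^|]*)(?=\|)/g;      // requires a closing | on the right
const lenient = /\|([^|]*)(?=\||$)/g;   // also accepts end of string

// Collect the contents of group 1 from every match.
const capture = (re, text) => Array.from(text.matchAll(re), m => m[1]);

const closedStrict = capture(strict, 'not |1|2|3|');
const openStrict = capture(strict, 'not |1|2|3');
const openLenient = capture(lenient, 'not |1|2|3');
const closedLenient = capture(lenient, 'not |1|2|3|');

console.log(closedStrict);  // [ '1', '2', '3' ]
console.log(openStrict);    // [ '1', '2' ]
console.log(openLenient);   // [ '1', '2', '3' ]
console.log(closedLenient); // [ '1', '2', '3', '' ]
```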

Struggling with RegEx validation and formatting for a specific ID format

I have a couple of specific string formats I want to enforce for different entities:
Entity 1: 1111-abcd-1111 or 1111-abcd-111111
Entity 2: [10 any symbol or letter(all cap) or number]-[3 letters]
Entity 3: [3 letters all cap]-[3 any]-[5 number]
I'm not sure regex is the best approach, because I also want to use this as a validator: when the user starts typing characters, the input is checked against the selected entity and then against that entity's regex.
Here is a regex with some input strings:
const strings = [
'1111-abcd-1111', // match
'1111-abcd-111111', // match
'1111-abcd-1111111', // no match
'ABCS#!%!3!-ABC', // match
'ABCS#!%!3!-ABCD', // no match
'ABC-#A3-12345', // match
'ABC-#A3-1234' // no match
];
const re = /^([0-9]{4}-[a-z]{4}-[0-9]{4,6}|.{10}-[A-Za-z]{3}|[A-Z]{3}-.{3}-[0-9]{5})$/;
strings.forEach(str => {
console.log(str + ' => ' + re.test(str));
});
Result:
1111-abcd-1111 => true
1111-abcd-111111 => true
1111-abcd-1111111 => false
ABCS#!%!3!-ABC => true
ABCS#!%!3!-ABCD => false
ABC-#A3-12345 => true
ABC-#A3-1234 => false
Explanation of regex:
^ - anchor text at beginning, i.e. what follows must be at the beginning of the string
( - group start
[0-9]{4}-[a-z]{4}-[0-9]{4,6} - 4 digits, -, 4 lowercase letters, -, 4-6 digits
| - logical OR
.{10}-[A-Za-z]{3} - any 10 chars, -, 3 letters
| - logical OR
[A-Z]{3}-.{3}-[0-9]{5} - 3 uppercase letters, -, any 3 chars, -, 5 digits
) - group end
$ - anchor at end of string
Your definition is not clear; you can tweak the regex as needed.
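Since the question mentions validating against the selected entity, one possible tweak is to keep one pattern per entity instead of a single combined regex. This is only a sketch: the entity keys and the isValidId helper are hypothetical, and the patterns are simply the three alternatives from the regex above.

```javascript
// Hypothetical entity keys mapped to the three alternatives of the combined regex.
const entityPatterns = new Map([
  ['entity1', /^[0-9]{4}-[a-z]{4}-[0-9]{4,6}$/],
  ['entity2', /^.{10}-[A-Za-z]{3}$/],
  ['entity3', /^[A-Z]{3}-.{3}-[0-9]{5}$/],
]);

// Hypothetical helper: checks a value against the pattern of the selected entity.
function isValidId(entity, value) {
  const pattern = entityPatterns.get(entity);
  if (!pattern) {
    throw new Error(`unknown entity: ${entity}`);
  }
  return pattern.test(value);
}

console.log(isValidId('entity1', '1111-abcd-1111')); // true
console.log(isValidId('entity1', 'ABC-#A3-12345'));  // false
console.log(isValidId('entity3', 'ABC-#A3-12345'));  // true
```

Note that [0-9]{4,6} also accepts five digits. If only 4 or 6 digits are intended, (?:[0-9]{4}|[0-9]{6}) would be one way to tighten it.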

How can I regex-match a filename with exactly 1 underscore in JavaScript?

I need to check whether filenames have exactly 1 underscore. For example:
Prof. Leonel Messi_300001.pdf -> true
Christiano Ronaldo_200031.xlsx -> true
Eden Hazard_3322.pdf -> true
John Terry.pdf -> false
100023.xlsx -> false
300022_Fernando Torres.pdf -> false
So the format is: name_id.extension
Note: name is a string and id is a number.
I tried this: [a-zA-Z\d]+_[0-9\d]
Is my regex correct?
The filename has the form name_id.extension: the name consists of letters or spaces, [a-z\s]+?, followed by an underscore, _, then the id, which is a number, [0-9]+?, then the dot, which is a special character and must be escaped with a backslash, \., and finally the extension, [a-z]+.
const checkFileName = (fileName) => {
const result = /[a-z\s]+?_\d+?\.[a-z]+/i.test(fileName);
console.log(result);
return result;
}
checkFileName('Prof. Leonel Messi_300001.pdf')
checkFileName('Christiano Ronaldo_200031.xlsx')
checkFileName('Eden Hazard_3322.pdf')
checkFileName('John Terry.pdf')
checkFileName('100023.xlsx')
checkFileName('300022_Fernando Torres.pdf')
[a-zA-Z]+_[0-9\d]+
or
[a-zA-Z]+_[\d]+
You should use ^...$ to match the line. Then just try to search a group before _ which doesn't have _, and the group after, without _.
^(?<before>[^_]*)_(?<after>[^_]*)\.\w+$
https://regex101.com/r/ZrA7B1/1
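The anchored pattern above accepts any text on either side of the single underscore, so it would also accept 300022_Fernando Torres.pdf. As a sketch of one possible variation (not the pattern from the linked demo), the part after the underscore can be restricted to digits; the names oneUnderscore and files are only for illustration:

```javascript
// Exactly one underscore: no underscores before it, only digits after it,
// then a dot and the extension.
const oneUnderscore = /^[^_]+_\d+\.\w+$/;

const files = [
  'Prof. Leonel Messi_300001.pdf',
  'Christiano Ronaldo_200031.xlsx',
  'Eden Hazard_3322.pdf',
  'John Terry.pdf',
  '100023.xlsx',
  '300022_Fernando Torres.pdf',
];

const results = files.map(name => oneUnderscore.test(name));
files.forEach((name, i) => console.log(name, '->', results[i]));
// The first three print true, the last three print false.
```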
Regex
My try with separate groups for
name: can contain anything; it ends at the last _
id: can contain only digits; it starts right after the last _
ext: follows the . after the id; can contain only a-z and must be at least one character
/^(?<name>.+)\_(?<id>\d+)\.(?<ext>[a-z]+)/g
JS
const fileName = "Lionel Messi_300001.pdf"
const r = /^(?<name>.+)\_(?<id>\d+)\.(?<ext>[a-z]+)/g
const fileNameMatch = r.test(fileName)
if (fileNameMatch) {
r.lastIndex = 0
console.log(r.exec(fileName).groups)
}
