Why does RegExp give different results in different scenarios? - javascript

Run the code below in either Node.js or the browser:
const sep = '\\';
const regExpression = `/b\\${sep}|a\\${sep}/`;
const testCases = ['a\\abb\\abc','b\\'];
const regTest = new RegExp(regExpression);
console.log(`Result for ${testCases[0]} is ${regTest.test(testCases[0])}`)
console.log(`Result for ${testCases[1]} is ${regTest.test(testCases[1])}`)
Both outputs are false.
However, if I change the pattern to this:
const regExpression = `/c|b\\${sep}|a\\${sep}/`;
Both results will be true. Why?
Another interesting thing: the first alternative never matches on its own. Taking `/c|b\\${sep}|a\\${sep}/` as an example, 'c' will NOT match.

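This is caused by the pattern string itself. new RegExp() expects the pattern without the surrounding slashes, so any slashes inside the string are treated as literal characters to match.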
const regExpression = "/test/";
const regTest = new RegExp(regExpression);
console.log(regTest); // /\/test\//
console.log(regTest.test("test")) // false
console.log(regTest.test("/test/")) // true
In the first case the string /b\\\\|a\\\\/ becomes the regex //b\\|a\\//, which looks for /b\\ or a\\/ with the slashes as literal characters. Neither test value contains those, so both fail:
'a\\abb\\abc' => FALSE
'b\\' => FALSE
'a\\/abb\\abc' => TRUE (matches a\\/)
'/b\\' => TRUE (matches /b\\)
In the second case the string /c|b\\\\|a\\\\/ becomes the regex //c|b\\|a\\//, which looks for /c or b\\ or a\\/. Only the middle alternative has no literal slash, and it is the one that matches:
'a\\abb\\abc' => TRUE (matches b\\)
'b\\' => TRUE (matches b\\)
So you can solve the problem by dropping the slashes from the pattern string:
const regExpression = `b\\${sep}|a\\${sep}`;
This looks for b\\ or a\\. If you need to match the whole string rather than a substring, remember the ^ and $ anchors too. You can try your patterns out on regex101.
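To see the difference side by side, here is a minimal sketch that builds both pattern strings from the question and runs them against the same test cases. The names withSlashes, withoutSlashes and results are made up for this illustration.

```javascript
const sep = '\\'; // a single backslash

// Pattern string wrapped in slashes: the slashes become literal characters.
const withSlashes = new RegExp(`/b\\${sep}|a\\${sep}/`);
// Pattern string without the slashes, as suggested above.
const withoutSlashes = new RegExp(`b\\${sep}|a\\${sep}`);

const testCases = ['a\\abb\\abc', 'b\\'];

// Run both patterns against every test case.
const results = testCases.map((s) => ({
  input: s,
  withSlashes: withSlashes.test(s),
  withoutSlashes: withoutSlashes.test(s),
}));

console.log(results);
```

Both inputs fail against the first pattern and pass against the second. A regex literal such as /b\\|a\\/ would behave like the second pattern, because in a literal the slashes are only delimiters.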

Related

Comparing two RegEx objects in Node.js

I'm using NodeRED to perform some logic on a string which has been created from image analysis (OCR) on Microsoft Azure Cognitive Services. The image analysis doesn't allow for any pattern matching / input pattern.
The resulting string (let's call it 'A') sometimes interprets characters slightly incorrectly, typical things like 'l' = '1' or 's' = '5'.
The resulting string can be one of only a few different formats; for argument's sake, let's say:
[a-z]{4,5}
[a-g]{3}[0-9]{1,2}
[0-9][a-z]{4}
What I need to do is determine which format the interpreted string ('A') most closely aligns to ('1', '2' or '3'). Once I establish this, I was planning to adjust the misinterpreted characters and hopefully be left with a string that is (near) perfect.
My initial plan was to convert 'A' into a RegEx: if 'A' came back as "12345", I would change this to a RegEx object [1l][2z]34[5s], compare this object to the RegEx objects above, and hopefully one would come back as a match.
In reality, the interpreted string is more like 8 alphanumeric characters and there are five different (fairly complex) RegEx possibilities, but I've tried to simplify the problem for the purposes of this question.
So the question: is it possible to compare RegEx in this way? Does anyone have any other suggestions on how this image analysis could be improved?
Thanks
Here is a solution using a Cartesian product to compare a string for possible matches. Test string is 'abclz', which could match pattern1 or pattern2:
const cartesian = (...a) => a.reduce((a, b) => a.flatMap(d => b.map(e => [d, e].flat())));
const charMapping = {
  '1': ['1','l'],
  'l': ['1','l'],
  '2': ['2','z'],
  'z': ['2','z'],
  '5': ['5','s'],
  's': ['5','s']
};
const buckets = {
  pattern1: /^[a-z]{4,5}$/,
  pattern2: /^[a-g]{3}[0-9]{1,2}$/,
  pattern3: /^[0-9][a-z]{4}$/
};
const input = 'abclz';
console.log('input:', input);
let params = input.split('').map(c => charMapping[c] || [c]);
let toCompare = cartesian(...params).map(arr => arr.join(''));
console.log('toCompare:', toCompare);
let potentialMatches = toCompare.flatMap(str => {
  return Object.keys(buckets).map(pattern => {
    let match = buckets[pattern].test(str);
    console.log(str, pattern + ':', match);
    return match ? str : null;
  }).filter(Boolean);
});
console.log('potentialMatches:', potentialMatches);
Output:
input: abclz
toCompare: [
"abc12",
"abc1z",
"abcl2",
"abclz"
]
abc12 pattern1: false
abc12 pattern2: true
abc12 pattern3: false
abc1z pattern1: false
abc1z pattern2: false
abc1z pattern3: false
abcl2 pattern1: false
abcl2 pattern2: false
abcl2 pattern3: false
abclz pattern1: true
abclz pattern2: false
abclz pattern3: false
potentialMatches: [
"abc12",
"abclz"
]
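Since the question asks which format the string aligns to, it may help to record the pattern name next to each candidate. The following is a minimal sketch of that idea, not part of the answer above: expand and classify are hypothetical helper names, and the mapping and patterns are copied from the snippet.

```javascript
// Look-alike characters and target formats, copied from the answer above.
const charMapping = {
  '1': ['1', 'l'], l: ['1', 'l'],
  '2': ['2', 'z'], z: ['2', 'z'],
  '5': ['5', 's'], s: ['5', 's'],
};
const buckets = {
  pattern1: /^[a-z]{4,5}$/,
  pattern2: /^[a-g]{3}[0-9]{1,2}$/,
  pattern3: /^[0-9][a-z]{4}$/,
};

// Hypothetical helper: build every candidate string one character at a time.
function expand(input) {
  return [...input].reduce(
    (prefixes, ch) =>
      prefixes.flatMap((p) => (charMapping[ch] || [ch]).map((alt) => p + alt)),
    ['']
  );
}

// Hypothetical helper: list each candidate together with the pattern it matches.
function classify(input) {
  return expand(input).flatMap((candidate) =>
    Object.entries(buckets)
      .filter(([, regex]) => regex.test(candidate))
      .map(([pattern]) => ({ candidate, pattern }))
  );
}

console.log(classify('abclz'));
```

For 'abclz' this reports 'abc12' under pattern2 and 'abclz' under pattern1, the same two candidates as the output above. The number of candidates doubles with every ambiguous character, so this approach suits short strings.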

Javascript start match function on second character

I'm trying to split a string from the second position in my string which I pass to the function.
Current code:
commandHandler(player: PlayerMp, command: string) {
  if(command.startsWith("/", 0)){
    const cmd = command.match(/\S+/g);
    cmd.forEach(element => console.log(element));
  }
}
If I pass "/test this" to this function, I get 1) "/test" 2) "this", but I need 1) "test" 2) "this".
What am I doing wrong?
You can use slice(1) to remove the first character of the string then proceed as before.
const command = "/test this";
if(command.startsWith("/")){
  const cmd = command.slice(1).match(/\S+/g);
  cmd.forEach(element => console.log(element));
}
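One thing to watch for: match() returns null when nothing matches, which happens if the command is just "/". A minimal sketch that guards against this and separates the command name from its arguments could look like the following. parseCommand is a hypothetical helper, not something from the question.

```javascript
// Hypothetical helper: splits "/name arg1 arg2" into a name and its arguments.
function parseCommand(command) {
  if (!command.startsWith('/')) return null; // not a command at all

  // match() returns null when there are no non-space tokens, so fall back to [].
  const parts = command.slice(1).match(/\S+/g) || [];
  const [name = '', ...args] = parts;
  return { name, args };
}

console.log(parseCommand('/test this'));
console.log(parseCommand('/'));
console.log(parseCommand('hello'));
```

Returning null for non-commands and an empty name for a bare "/" is a design choice for this sketch; the calling code decides how to treat each case.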

How to get the first string from the right-hand side

So I have this code that should take the letters from the right-hand side of a string and stop at the first digit, but for some reason it's not working for me.
Example input for fUnit is "CS_25x2u".
The expected output is "u".
The actual output is "undefined".
function buildUnit(fUnit){// what's going to be passed here is the gibberish unit, and the output of this function is the clear unit
fUnit = fUnit.toString;
const regex = /[a-zA-Z]*$/;
const unit = (x) => x.match(regex)[0];
fUnit = unit(fUnit);
If you need more info please let me know
Thank you
const regex = /[a-zA-Z]*$/;
console.log(regex.exec(sample)); // sample is the input string, e.g. "CS_25x2u"
Or, assuming the fUnit variable contains your string:
const unit = (x) => x.match(regex)[0];
console.log(unit(fUnit));
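A complete version with a few edge cases, just in case:
function buildUnit(fUnit) {
  // [a-zA-Z] rather than [A-z], which would also match [ \ ] ^ _ and the backtick
  return fUnit.toString().match(/[a-zA-Z]*$/)[0];
}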
console.log(buildUnit('CS_25x2u')); // 'u'
console.log(buildUnit('')); // ''
console.log(buildUnit(123)); // ''
console.log(buildUnit('aaa123')); // ''
console.log(buildUnit('aaa 123 bbb')); // 'bbb'
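The answers above fix the code without saying what went wrong. Judging from the question's code, a likely culprit is the line fUnit = fUnit.toString; which has no parentheses, so fUnit ends up holding the method itself instead of a string. A minimal sketch of the difference (notCalled and called are names made up for this illustration):

```javascript
const fUnit = 'CS_25x2u';

// Without parentheses this is a reference to the method, not the string.
const notCalled = fUnit.toString;
console.log(typeof notCalled);       // "function"
console.log(typeof notCalled.match); // "undefined": functions have no match()

// With parentheses the method runs and returns the string.
const called = fUnit.toString();
console.log(called.match(/[a-zA-Z]*$/)[0]); // "u"
```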

Jest, match to regex

Currently I have this test:
import toHoursMinutes from '../../../app/utils/toHoursMinutes';
describe('app.utils.toHoursMinutes', () => {
  it('should remove 3rd group of a time string from date object', async () => {
    expect(toHoursMinutes(new Date('2020-07-11T23:59:58.000Z'))).toBe('19:59');
  });
});
toHoursMinutes receives a Date object and transforms it like this:
export default (date) => `${('' + date.getHours()).padStart(2, '0')}:${('' + date.getMinutes()).padStart(2, '0')}`;
My local time offset is -4, so my test passes when I compare 23:59 with 19:59, but I want the test to run anywhere. I would prefer to compare the output of toHoursMinutes() with a regular expression like this one, which checks the hh:mm format: ^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$
But how can I compare against a regex instead of an explicit string?
I tried this:
const expected = [
  expect.stringMatching(/^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$/)
];
it.only('matches even if received contains additional elements', () => {
  expect(['55:56']).toEqual(
    expect.arrayContaining(expected)
  );
});
But I get:
Expected: ArrayContaining [StringMatching /^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$/]
Received: ["55:56"]
There is a toMatch function on expect() that does just that.
expect('12:59').toMatch(/^\d{1,2}:\d{2}$/); // stripped-down regex
https://jestjs.io/docs/expect#tomatchregexp--string
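Applied to the question, a minimal sketch might look like this. toHoursMinutes and the regex are copied from the question; HH_MM is a name made up here, and the check is run with plain RegExp.test() so the snippet works outside Jest, with the equivalent Jest assertion shown in a comment.

```javascript
// toHoursMinutes as defined in the question.
const toHoursMinutes = (date) =>
  `${('' + date.getHours()).padStart(2, '0')}:${('' + date.getMinutes()).padStart(2, '0')}`;

// hh:mm format check from the question (HH_MM is a hypothetical name).
const HH_MM = /^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$/;

const output = toHoursMinutes(new Date('2020-07-11T23:59:58.000Z'));

// The time itself depends on the local time zone; the format check does not.
console.log(output, HH_MM.test(output));

// Inside a Jest test the same check would read:
// expect(toHoursMinutes(new Date('2020-07-11T23:59:58.000Z'))).toMatch(HH_MM);
```

Note that this only verifies the format, not that the hours and minutes are the right ones for the given date.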
If you want to match a regex inside of other jest functions, you can do so by using expect.stringMatching(/regex/).
expect({
  name: 'Peter Parker',
}).toHaveProperty('name', expect.stringMatching(/peter/i));
https://jestjs.io/docs/expect#expectstringmatchingstring--regexp
My approach was fine; the problem was my dummy data, '55:56', which does not match the regex. In case anyone needs it, this works:
const expected2 = [
  expect.stringMatching(/^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$/)
];
it('matches even if received contains additional elements', () => {
  expect(['12:59']).toEqual(
    expect.arrayContaining(expected2)
  );
});
In my case, I can check the format of a time in a span using toHaveTextContent().
const span = screen.getByRole("presentation", { name: /time/i });
expect(span).toHaveTextContent(/^(0[0-9]|1[0-9]|2[0-3]):[0-5][0-9]$/);
Docs for toHaveTextContent(): https://github.com/testing-library/jest-dom#tohavetextcontent

Do a simple search regardless of upper and lower case

Could someone explain how I can write this simple code in JavaScript so that it ignores upper and lower case?
if(res.search('em')!=-1){ unit='em'; res.replace(unit,'');}
if(res.search('vh')!=-1){ unit='vh'; res.replace(unit,'');}
if(res.search('px')!=-1){ unit='px'; res.replace(unit,'');}
Without a better idea, this is what I have coded. It's a lot of code:
if(res.search('Em')!=-1){ unit='Em'; res.replace(unit,'');}
if(res.search('eM')!=-1){ unit='eM'; res.replace(unit,'');}
if(res.search('EM')!=-1){ unit='EM'; res.replace(unit,'');}
...
I'm sure there is a better way to do that!?
Thanks a lot.
You could use a regular expression with replace and save the found unit as a side effect of the replacer function. This would allow you to replace the unit without searching the string twice:
let res = "Value of 200Em etc."
let unit
let newRes = res.replace(/(em|vh|px)/i, (found) => {unit = found.toLowerCase(); return ''})
console.log("replaced:", newRes, "Found Unit:", unit)
For the first part you can use toLowerCase()
if(res.toLowerCase().search('em') != -1)
You can use alternation in a regex together with the case-insensitive flag.
/(em|vh|px)/i matches em or vh or px.
function replaceUnit(input){
  return input.replace(/(em|px|vh)/i ,'replaced')
}
console.log(replaceUnit('height: 20em'))
console.log(replaceUnit('width:=20Em'))
console.log(replaceUnit('border-radius: 2Px'))
console.log(replaceUnit('unit=pX'))
console.log(replaceUnit('max-height=20Vh'))
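You can use toLowerCase() to convert the whole string to lower case before comparing, then do the replacement with a case-insensitive regex so the original casing is still found:
var tobereplaced = 'em';
if(res.toLowerCase().search(tobereplaced) > -1){ res = res.replace(new RegExp(tobereplaced, 'i'), ''); }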
If you can make these three assumptions:
The string always starts with a number
The string always ends with a unit
The unit is always two characters
Then it could be as simple as:
const str = '11.5px';
const unit = str.slice(-2); //=> 'px'
const value = parseFloat(str); //=> 11.5
Or with a function:
const parse = str => ({unit: str.slice(-2), value: parseFloat(str)});
const {unit, value} = parse('11.5px');
// unit='px', value=11.5
All you need to do is force your string to lowercase (or uppercase) before testing its contents:
if( res.toLowerCase().search('em') !== -1){ /* do stuff */ }
To handle replacing the actual substring value in res, something like this should work:
let caseInsensitiveUnit = "em";
let unitLength;
let actualUnit;
let position = res.toLowerCase().search(caseInsensitiveUnit);
if(position > -1){
  unitLength = caseInsensitiveUnit.length;
  actualUnit = res.substring(position, position + unitLength);
  res = res.replace(actualUnit, "");
}
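Whichever approach is used, keep in mind that JavaScript strings are immutable: replace() returns a new string and leaves the original alone, so a bare res.replace(unit, '') as in the question has no visible effect. A minimal sketch (original and unchanged are names made up for this illustration):

```javascript
const original = 'Value of 200Em etc.';
let res = original;

// The return value is discarded here, so res still holds the old text.
res.replace(/em/i, '');
const unchanged = res === original;

// Assigning the result back keeps the change.
res = res.replace(/em/i, '');

console.log(unchanged, res);
```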
