How to prevent regex characters from being changed after page is rendered? - javascript

I'm stuck: after searching and trying several tests, I still can't figure out how to fix the following issue.
I use the characters \x3c, \x3e and \x22 in a regex and save it in a variable in *.component.ts, but when I use the variable in the markup/HTML, they turn into <, > and ". The result is that my pattern doesn't work as expected.
Here is one of my tests on regex101.com; as you can see, it works as it should:
^(?=.*[a-zA-Z\d!\x22#$%&\'()*+,.:;\x3c=\x3e?#[\]^_`{|}~/\\-])[A-Za-z\d!\x22#$%&\'()*+,.:;\x3c=\x3e?#[\]^_`{|}~/\\-]{8,50}$
How can I prevent this and keep the characters as they are in the original when the page is rendered? Is this behavior caused by TypeScript, by the browser's JavaScript engine, or by something else? Any hint would be great.

First of all, you need to use double backslashes to introduce literal backslashes into the regex patterns. I.e. if you write "\x22" as a string literal, it is in fact a mere ". So, to define \x22 in a string literal, write "\\x22".
Then, you have
^(?=.*[a-zA-Z\d!\x22#$%&\'()*+,.:;\x3c=\x3e?#[\]^_`{|}~/\\-])[A-Za-z\d!\x22#$%&\'()*+,.:;\x3c=\x3e?#[\]^_`{|}~/\\-]{8,50}$
The lookahead here is redundant because it requires the same set of chars as is required by the consuming part. The lookahead can be removed, or better replaced with the one you need, (?=[^A-Z]*[A-Z]), requiring at least 1 uppercase ASCII letter:
^(?=[^A-Z]*[A-Z])[A-Za-z\d!\x22#$%&\'()*+,.:;\x3c=\x3e?#[\]^_`{|}~/\\-]{8,50}$
As a string literal:
"^(?=[^A-Z]*[A-Z])[A-Za-z\\d!\\x22#$%&'()*+,.:;\\x3c=\\x3e?#[\\]^_`{|}~/\\\\-]{8,50}$"
See the regex demo.
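A short sketch of the difference between the two string forms, using the pattern from this answer. The variable names and the sample inputs are illustrative assumptions, not part of the original post.

```javascript
// In a string literal, "\x22" is already a double quote;
// "\\x22" keeps the four characters \x22 for the regex engine.
const collapsed = "\x22";
const preserved = "\\x22";
console.log(collapsed.length, preserved.length); // 1 4

// Pattern from the answer, written as a string literal with doubled backslashes.
const passwordPattern = new RegExp(
  "^(?=[^A-Z]*[A-Z])[A-Za-z\\d!\\x22#$%&'()*+,.:;\\x3c=\\x3e?#[\\]^_`{|}~/\\\\-]{8,50}$"
);

console.log(passwordPattern.test("Abcdefg<")); // true: 8 characters, one uppercase letter
console.log(passwordPattern.test("abcdefg<")); // false: no uppercase letter
```

If the pattern is stored as a string in the component, this doubled form is what keeps \x3c, \x3e and \x22 from being resolved before the regex engine sees them.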


matchAll() not functioning as documented: Regex testing site says regex works, nodejs says its undefined, why? [duplicate]

So, I'm trying to write a regex that matches all numbers. Here is that regex:
/\b[\d \.]+\b/g
And I try to use it on the string:
100 two 100
And everything works fine; it matches both of the numbers.
But I want to rewrite the regex in the form:
new RegExp(pattern,modifiers)
Because I think it looks clearer.
So I write it like this:
new RegExp('\b[\d \.]+\b','g')
But now it won't match the former test string. I have tried everything, but I just can't get it to work. What am I doing wrong?
Your problem is that the backslash in a string has a special meaning; if you want a backslash in your regexp, you first need to get literal backslashes in the string passed to the regex:
new RegExp('\\b[\\d \\.]+\\b','g');
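To see what the constructor actually receives in each case, here is a minimal sketch using the test string from the question. The variable names are made up for illustration.

```javascript
// Inside a string literal, '\b' is a backspace character (U+0008),
// while '\d' and '\.' simply lose their backslash.
const broken = new RegExp('\b[\d \.]+\b', 'g');
const fixed = new RegExp('\\b[\\d \\.]+\\b', 'g');

const input = '100 two 100';
const brokenMatches = input.match(broken);
const fixedMatches = input.match(fixed);

console.log(brokenMatches);       // null: the input contains no backspace characters
console.log(fixedMatches.length); // 2
```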
Note that this is a pretty bad (permissive) regex, as it will match ". . . " as a 'number', or "1 1...3 42". Better might be:
/-?\d+(?:\.\d+)?\b/
Note that this matches odd things like 0000.3, and that it does not match:
Leading +
Scientific notation, e.g. 1.3e7
Missing leading digit, e.g. .4
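The cases above can be checked with a small sketch. The firstMatch helper is hypothetical and only returns the text the pattern picked up; for the unsupported forms the pattern still finds a piece of the input, just not the whole number.

```javascript
const numberPattern = /-?\d+(?:\.\d+)?\b/;

// Hypothetical helper: return the matched text, or null when nothing matches.
const firstMatch = (text) => {
  const m = numberPattern.exec(text);
  return m ? m[0] : null;
};

console.log(firstMatch('-12.5'));  // "-12.5"
console.log(firstMatch('0000.3')); // "0000.3"
console.log(firstMatch('+5'));     // "5" (the leading + is left out)
console.log(firstMatch('1.3e7'));  // "1" (not the full number)
console.log(firstMatch('.4'));     // "4" (the leading dot is left out)
```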
Also, note that using the RegExp constructor is (marginally) slower and certainly less idiomatic than using a RegExp literal. Using it is only a good idea when you need to construct your RegExp from supplied strings. Most anyone with more than passing familiarity with JavaScript will find the /.../ notation fully clear.

JavaScript RegEx - Match quoted string - Possibly unexpected result?

Why does this:
console.log(/^(['"])(?:(?:\\[^])|[^\\])*\1/.test('"\"'))
result in true? Is this expected behavior or a bug? If it's expected, how to achieve intended behavior, which is to result in false as the last closing quote in the example shouldn't be matched as it's escaped? Maybe I made a mistake in writing the RegEx, in which case, I hope someone can kindly point out the error to me...
For the uninitiated, the above regular expression in JavaScript is intended to match only a complete single- or double-quoted string that may or may not contain backslash-escaped special characters. ("Complete" means that the matched portion should be a complete quoted string, NOT that the whole input string should be one.) Nested levels of escaped strings may be present. Also, for simplicity, and as per requirement, the match starts from the beginning of the input string, as otherwise a match could incorrectly start from an already escaped quote.
Tested in Firefox 82.0.2 and Edge 86.0.622.63
Ah, never mind! I figured out that the problem is not in the RegEx, but in the way I crafted the input string. The way I've written it, the outer string interprets the escape instead of the backslash acting as an escape for the inner string! The correct way to write it is to escape the backslash, so the above code should be rewritten as:
console.log(/^(['"])(?:(?:\\[^])|[^\\])*\1/.test('"\\"'))
So, the result is as expected after all, and not a bug!
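A quick way to confirm this is to look at the length of each input string before testing it. This is only a sketch; the variable names are assumptions.

```javascript
const quoted = /^(['"])(?:(?:\\[^])|[^\\])*\1/;

const unescaped = '"\"';  // two characters: ""
const escaped = '"\\"';   // three characters: "\"

console.log(unescaped.length, quoted.test(unescaped)); // 2 true
console.log(escaped.length, quoted.test(escaped));     // 3 false
console.log(quoted.test('"a\\"b"'));                   // true: "a\"b" is a complete quoted string
```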

Unable to find a string matching a regex pattern

While trying to submit a form, a JavaScript regex validation always evaluates to false for a string.
Regex:- ^(([a-zA-Z]:)|(\\\\{2}\\w+)\\$?)(\\\\(\\w[\\w].*))+(.jpeg|.JPEG|.jpg|.JPG)$
I have tried following strings against it
abc.jpg,
abc:.jpg,
a:.jpg,
a:asdas.jpg,
What string could possibly match this regex?
This regex won't match against anything because of that $? in the middle of the string.
Apparently, using the optional modifier ? on the end-of-string symbol $ is not correct (if you paste it on https://regex101.com/ it will indeed give you an error). If the JavaScript parser ignores the error and keeps the regex as it is, this still means you are going to match an end of string in the middle of a string which is supposed to continue.
Unescaped, it was supposed to match a \$ (dollar symbol), but as it is written it won't work.
If you want your string to be accepted at any cost, you can probably use Firebug or a similar developer tool and edit the string inside the JavaScript code (this assumes there's no server-side check too, and that it isn't wrong as well). If you ignore the $? then a matching string will be \\\\w\\\\ww.jpg (but since the . is unescaped even \\\\w\\\\ww%jpg is a match)
Of course, I wrote this answer assuming the escaping is indeed the one you showed in the question. If you need to find a matching pattern for the correctly escaped one ^(([a-zA-Z]:)|(\\{2}\w+)\$?)(\\(\w[\w].*))+(\.jpeg|\.JPEG|\.jpg|\.JPG)$ then you can use this tool to find one http://fent.github.io/randexp.js/ (though it will find weird matches). A matching pattern is c:\zz.jpg
If you are just looking for a regular expression to match what you got there, go ahead and test this out:
(\w+:?\w*\.[jpe?gJPE?G]+,)
That should match exactly what you are looking for. Remove the optional comma at the end if you feel like it, of course.
If you remove escape level, the actual regex is
^(([a-zA-Z]:)|(\\{2}\w+)\$?)(\\(\w[\w].*))+(.jpeg|.JPEG|.jpg|.JPG)$
After the ^ start anchor comes the first alternation, (([a-zA-Z]:)|(\\{2}\w+)\$?), which matches either a letter followed by a colon, or two backslashes followed by one or more word characters and an optional literal $. Some of the parentheses inside are needless.
The second part, (\\(\w[\w].*))+, matches a backslash followed by two word characters. The \w[\w] looks weird because it's equivalent to \w\w (the second \w doesn't need a character class). This is followed by any number of any characters, and the whole group repeats one or more times.
In the last part (.jpeg|.JPEG|.jpg|.JPG) one probably forgot to escape the dot for matching a literal. \. should be used. This part can be reduced to \.(JPE?G|jpe?g).
It would match something like
A:\12anything.JPEG
\\1$\anything.jpg
Play with it at regex101. A more readable version could be
^([a-zA-Z]:|\\{2}\w+\$?)(\\\w{2}.*)+\.(jpe?g|JPE?G)$
Also read the explanation on regex101 to understand any pattern, it's helpful!
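As a rough check of the simplified pattern, the sketch below runs it both as a regex literal and through the RegExp constructor, where every backslash has to be doubled. The variable names are illustrative; the sample strings are the ones mentioned in this thread.

```javascript
// Simplified pattern from the answer, as a regex literal.
const imagePath = /^([a-zA-Z]:|\\{2}\w+\$?)(\\\w{2}.*)+\.(jpe?g|JPE?G)$/;

// The same pattern through the RegExp constructor: every backslash is doubled.
const imagePathFromString = new RegExp(
  "^([a-zA-Z]:|\\\\{2}\\w+\\$?)(\\\\\\w{2}.*)+\\.(jpe?g|JPE?G)$"
);

// In these string literals, each "\\" is a single backslash in the tested value.
const samples = ["A:\\12anything.JPEG", "\\\\1$\\anything.jpg", "abc.jpg", "a:asdas.jpg"];
for (const sample of samples) {
  console.log(sample, imagePath.test(sample), imagePathFromString.test(sample));
}
```

The first two samples should be accepted and the last two rejected, since they have no backslash after the drive letter or share name.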

Regexp in JavaScript to find text between two sentences in multi-line string [duplicate]

Morning All
I have a javascript regular expression that doesn't work correctly and I'm not sure why.
I'm calling the API at https://uptimerobot.com, and getting back a JSON string with details of the monitor statuses. This is, however, wrapped in function-call syntax, like this:
jsonUptimeRobotApi({MASKED-STATUES-OBJ})
As this call is being made from a generic script I was hoping to test the response to see if it had this type of syntax wrapping then parse it accordingly.
However I can't seem to find a RegEx syntax to match the logic:
Start of string
An unknown number of characters [a-zA-Z]
Open parentheses
Open brace
An unknown number of any character
Close brace
Close parentheses
End of string
This looks right:
^[a-zA-Z]+\(\{.*\}\)$
And works in regex101: https://regex101.com/r/sE7dM6/1
However it fails in my code and via jsFiddle: https://jsfiddle.net/po49pww3/1/
The "m" was added in regex101 as the actual string is much longer and failed to match without it; however, a number of small tweaks that I've tried haven't resulted in a match in jsFiddle.
Anyone know what's wrong?
Escape all the backslashes one more time because within " delimiters, you must escape the backslash one more time or otherwise it would be treated as an escape sequence.
var regEx = new RegExp("^[a-zA-Z]+\\(\\{.*\\}\\)$", "m");
DEMO
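For the original goal of detecting the wrapper and then parsing the payload, something along these lines might work. It is a sketch only: unwrapJsonp is a hypothetical helper, the payloads are made up, and [\s\S]* is swapped in for .* on the assumption that the real response spans several lines.

```javascript
// Hypothetical helper: strip a JSONP-style wrapper such as callbackName({...}) before parsing.
function unwrapJsonp(text) {
  // Doubled backslashes keep \( \{ \} \) intact inside the string literal.
  // [\s\S]* is used instead of .* so the body may span several lines.
  const wrapper = new RegExp("^[a-zA-Z]+\\((\\{[\\s\\S]*\\})\\)$");
  const match = wrapper.exec(text.trim());
  return JSON.parse(match ? match[1] : text);
}

// Made-up payloads for illustration only.
console.log(unwrapJsonp('jsonUptimeRobotApi({"stat": "ok"})').stat); // ok
console.log(unwrapJsonp('{"stat": "ok"}').stat);                     // ok
```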

RegExp in JavaScript, when a quantifier is part of the pattern

I have been trying to use a regexp that matches any text that is between a caret, less than and a greater than, caret.
So it would look like: ^< THE TEXT I WANT SELECTED >^
I have tried something like this, but it isn't working: ^<(.*?)>^
I'm assuming this is possible, right? I think the reason I have been having such a tough time is because the caret serves as a quantifier. Thanks for any help I get!
Update
Just so everyone knows, the following from am not i am worked
/\^<(.*?)>\^/
But, it turned out that I was getting html entities since I was getting my string by using the .innerHTML property. In other words,
&gt; ... >
&lt; ... <
To solve this, my regexp actually looks like this:
\^<(.*?)((.|\n)*)>\^
This includes the fact that the string in between should be any character or new line. Thanks!
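One possible way to handle both cases, sketched under the assumption that the text may arrive either with raw < and > or with the &lt; and &gt; entities that .innerHTML returns. The pattern and variable names below are illustrative, not the exact expression from the update.

```javascript
// Accept either the raw characters or their HTML entities around the wanted text.
// [\s\S]*? matches any character, including newlines, as few times as possible.
const marker = /\^(?:<|&lt;)([\s\S]*?)(?:>|&gt;)\^/;

const fromInnerHtml = '^&lt;THE TEXT\nI WANT SELECTED&gt;^';
const fromPlainText = '^<THE TEXT I WANT SELECTED>^';

const entityCapture = marker.exec(fromInnerHtml)[1];
const plainCapture = marker.exec(fromPlainText)[1];
console.log(JSON.stringify(entityCapture)); // "THE TEXT\nI WANT SELECTED"
console.log(plainCapture);                  // THE TEXT I WANT SELECTED
```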
You need to escape the ^ symbol since it has special meaning in a JavaScript regex.
/\^<(.*?)>\^/
In a JavaScript regex, the ^ means beginning of the string, unless the m modifier was used, in which case it means beginning of the line.
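The difference is easy to see side by side. In this sketch the variable names are assumptions, and the last line shows the doubled backslashes needed when the same pattern is passed as a string.

```javascript
const unescapedCaret = /^<(.*?)>^/;  // ^ is an anchor here, so this input cannot match
const escapedCaret = /\^<(.*?)>\^/;  // \^ is a literal caret
const fromString = new RegExp("\\^<(.*?)>\\^"); // same pattern built from a string

const subject = "^<blah example>^";
console.log(unescapedCaret.test(subject));  // false
console.log(escapedCaret.exec(subject)[1]); // blah example
console.log(fromString.exec(subject)[1]);   // blah example
```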
This should work:
\^<(.*?)>\^
In a regex, if you want to use a character that has a special meaning (caret, brackets, pipe, ...), you have to escape it using a backslash. For example, (\w\b)*\w\. will select a sequence of words terminated by a dot.
Careful!
If you have to pass the regex pattern as a string, i.e. there's no regex literal like in JavaScript or Perl, you may have to use a double backslash, which the programming language will unescape to a single one, which is then processed by the regex engine.
Same regex in multiple languages:
Python:
import re
myRegex=re.compile(r"\^<(.*?)>\^") # The r before the string prevents backslash escaping
PHP:
$result=preg_match("/\\^<(.*?)>\\^/",$subject); // Notice the double backslashes here?
JavaScript:
var myRegex=/\^<(.*?)>\^/,
subject="^<blah example>^";
subject.match(myRegex);
If you tell us what programming language you're writing in, we'll be able to give you some finished code to work with.
Edit: Whoops, didn't even notice this was tagged as javascript. Then, you don't have to worry about double backslash at all.
Edit 2: \b represents a word boundary. Though I agree yours is what I would have used myself.
