In an example piece of code, I stumbled upon this line:
// Change the string into lower case and remove all non-alphanumeric characters
var cstr = str_entry.toLowerCase().replace(/[^a-zA-Z0-9]+/g,'');
I think I understand that the /g inside the parameter makes everything in between the // become empty strings (''). Am I correct?
What does the ^ part of the parameter do? What does everything inside the [ ] brackets mean?
The first parameter of the replace function is a regular expression, which is a way of determining if a string matches a complex pattern.
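The g flag after the closing slash means 'global': if several parts of the str_entry string match, they will all be replaced with an empty string, instead of just the first one.
The ^ at the start of [] means 'not', so [^a-zA-Z0-9] matches any character that is not a letter or a digit. The + after it extends the match to a run of one or more such characters.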
More simply, the regular expression is identifying any non-alphanumeric characters in your string. Using it with replace(..., '') will remove those characters.
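To see the effect of the g flag, here is a minimal sketch. The input value is made up for illustration; only the variable name str_entry comes from the question.

```javascript
// Hypothetical input; only the name str_entry comes from the question.
const str_entry = 'Hello, World! 123';

// With the g flag every run of non-alphanumeric characters is removed.
const globalResult = str_entry.toLowerCase().replace(/[^a-zA-Z0-9]+/g, '');

// Without the g flag only the first run (", ") is removed.
const firstOnly = str_entry.toLowerCase().replace(/[^a-zA-Z0-9]+/, '');

console.log(globalResult); // helloworld123
console.log(firstOnly);    // helloworld! 123
```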
Take a look at Regex101 for more information about how regular expressions work. You can punch in your regular expression and it will tell you what each part of it does.
While trying to submit a form a javascript regex validation always proves to be false for a string.
Regex:- ^(([a-zA-Z]:)|(\\\\{2}\\w+)\\$?)(\\\\(\\w[\\w].*))+(.jpeg|.JPEG|.jpg|.JPG)$
I have tried following strings against it
abc.jpg,
abc:.jpg,
a:.jpg,
a:asdas.jpg,
What string could possibly match this regex?
This regex won't match against anything because of that $? in the middle of the string.
Apparently using the optional modifier ? on the end-of-string symbol $ is not correct (if you paste it into https://regex101.com/ it will indeed give you an error). If the JavaScript parser ignores the error and keeps the regex as it is, this still means you are trying to match the end of the string in the middle of a string that is supposed to continue.
Unescaped it was supposed to match a \$ (dollar symbol) but as it is written it won't work.
If you want your string to be accepted at any cost you can probably use Firebug or a similar developer tool and edit the string inside the JavaScript code (assuming there is no server-side check too, and that it is not wrong as well). If you ignore the $? then a matching string will be \\\\w\\\\ww.jpg (but since the . is unescaped even \\\\w\\\\ww%jpg is a match)
Of course, I wrote this answer assuming the escaping is indeed the one you showed in the question. If you need to find a matching pattern for the correctly escaped one ^(([a-zA-Z]:)|(\\{2}\w+)\$?)(\\(\w[\w].*))+(\.jpeg|\.JPEG|\.jpg|\.JPG)$ then you can use this tool to find one http://fent.github.io/randexp.js/ (though it will find weird matches). A matching pattern is c:\zz.jpg
If you are just looking for a regular expression to match what you got there, go ahead and test this out:
(\w+:?\w*\.[jpe?gJPE?G]+,)
That should match exactly what you are looking for. Remove the optional comma at the end if you feel like it, of course.
If you remove one level of escaping, the actual regex is
^(([a-zA-Z]:)|(\\{2}\w+)\$?)(\\(\w[\w].*))+(.jpeg|.JPEG|.jpg|.JPG)$
After the ^ start anchor comes the first group, (([a-zA-Z]:)|(\\{2}\w+)\$?), which matches either a letter followed by a colon, or two backslashes followed by one or more word characters and an optional literal $. Some of the parentheses inside are unnecessary.
The second part, (\\(\w[\w].*))+, matches a backslash followed by two word characters. \w[\w] looks odd because it is equivalent to \w\w (the second \w does not need a character class). That is followed by any number of any characters, and the whole group can repeat one or more times.
In the last part, (.jpeg|.JPEG|.jpg|.JPG), the author probably forgot to escape the dot to match it literally; \. should be used. This part can be reduced to \.(JPE?G|jpe?g).
It would match something like
A:\12anything.JPEG
\\1$\anything.jpg
Play with it at regex101. A more readable version could be
^([a-zA-Z]:|\\{2}\w+\$?)(\\\w{2}.*)+\.(jpe?g|JPE?G)$
Also read the explanation on regex101 to understand any pattern, it's helpful!
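As a quick check, the simplified pattern can be run against the strings from the question and the examples from the answers. This is only a sketch: the names simplified, candidates and verdicts are made up for illustration, and the backslashes are doubled because the test strings are written as JavaScript string literals.

```javascript
// The simplified pattern from above, written as a regex literal.
const simplified = /^([a-zA-Z]:|\\{2}\w+\$?)(\\\w{2}.*)+\.(jpe?g|JPE?G)$/;

const candidates = [
  'abc.jpg',               // from the question
  'abc:.jpg',              // from the question
  'a:.jpg',                // from the question
  'a:asdas.jpg',           // from the question
  'c:\\zz.jpg',            // c:\zz.jpg
  'A:\\12anything.JPEG',   // A:\12anything.JPEG
  '\\\\1$\\anything.jpg'   // \\1$\anything.jpg
];

const verdicts = candidates.map(s => simplified.test(s));
console.log(verdicts); // [false, false, false, false, true, true, true]
```

None of the strings from the question match because the pattern requires a backslash right after the drive letter or share name.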
This seems a very simple question but I haven't been able to get this to work.
How do I convert the following string:
var origin_str = "abc/!/!"; // Original string
var modified_str = "abc!!"; // replaced string
I tried this:
console.log(origin_str.replace(/\\/,''));
This only removes the first occurrence of backslash. I want to replaceAll. I followed this instruction in SO: How to replace all occurrences of a string in JavaScript?
origin_str.replace(new RegExp('\\', 'g'), '');
This code throws me an error SyntaxError: Invalid regular expression: /\/: \ at end of pattern. What's the regex for removing backslash in javascript.
A quick basic overview of regular expressions in JavaScript
When using regular expressions you can define the expression in two ways.
Either directly in the function or variable by using /regular expression/
Or by using the RegExp constructor: new RegExp('regular expression').
Please note the difference between the two ways of defining. In the first the search pattern is enclosed in forward slashes, while in the second the search pattern is passed as a string.
Remember that regular expressions are in fact a search language with its own syntax. Some characters have a special meaning: \, ^, $, . (dot), |, ?, *, +, (, ), [, {. These characters are called metacharacters and need to be escaped if you want them to be matched literally. The forward slash / must also be escaped inside a regex literal because it delimits the pattern, and ' or " must be escaped inside a string of the same quote type passed to the constructor. If not, they will be treated as operators or generate script errors. Escaping is done with the backslash. E.g. in \\ the first backslash escapes the second, so the pattern searches for a literal backslash. When the pattern is passed as a string, each backslash must be written twice, because the string literal consumes one level of escaping; that is why new RegExp('\\', 'g') fails and new RegExp('\\\\', 'g') works.
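The difference between the two notations is easiest to see with the backslash itself. The following is a minimal sketch; the sample string and variable names are made up for illustration.

```javascript
// Hypothetical sample: the seven-character string abc\!\!
const withBackslashes = 'abc\\!\\!';

// Regex literal: \\ is one escaped backslash.
const viaLiteral = withBackslashes.replace(/\\/g, '');

// Constructor: the string literal '\\\\' holds two characters (\\),
// which the regex engine then reads as one escaped backslash.
const viaConstructor = withBackslashes.replace(new RegExp('\\\\', 'g'), '');

// The string '\\' holds a single backslash, which is an incomplete
// escape sequence for the regex engine, so the constructor throws.
let constructorError = null;
try {
  new RegExp('\\', 'g');
} catch (e) {
  constructorError = e;
}

console.log(viaLiteral);                              // abc!!
console.log(viaConstructor);                          // abc!!
console.log(constructorError instanceof SyntaxError); // true
```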
There are a multitude of options you can add to your search pattern:
Examples
Adding \d makes the pattern match a single digit, i.e. [0-9]. (To also match letters and the underscore, use \w.) Simple regular expressions are parsed from left to right.
/javascript/
Searches for the word javascript in a string.
/[a-z]/
When a pattern is put between square brackets, it matches a single character equal to any one of the values inside the brackets. This will find d in 229302d34330
You can build a regular expression with multiple blocks.
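/(java|ecma)script/
Finds javascript or ecmascript in a string. The | is the or operator. It applies to everything on either side of it, so the alternatives must be grouped in parentheses to share the script suffix.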
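Because | has the lowest precedence of all regex operators, where the parentheses go changes what is matched. A small sketch with two hypothetical patterns, grouped and ungrouped:

```javascript
// Alternatives grouped: "java" or "ecma", followed by "script".
const grouped = /(java|ecma)script/;

// No grouping: the alternatives are "java" and "ecmascript".
const ungrouped = /java|ecmascript/;

const groupedMatch = 'javascript'.match(grouped)[0];
const ungroupedMatch = 'javascript'.match(ungrouped)[0];

console.log(groupedMatch);           // javascript
console.log(ungroupedMatch);         // java
console.log(grouped.test('java'));   // false
console.log(ungrouped.test('java')); // true
```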
/a/ vs. /a+/
The first matches the first a in aaabbb, the second matches a repetition of a until another character is found. So the second matches: aaa.
The plus sign + means find a one or more times. You can also use * which means zero or more times.
/^\d+$/
We've seen \d earlier, and also the plus sign. Together they mean: find one or more numeric characters. The ^ (caret) and $ (dollar sign) are new. The ^ anchors the match to the beginning of the string, while the $ anchors it to the end of the string, so the whole string has to consist of digits. This expression will match: 574545485 but not d43849343, 549854fff or 4348d8788.
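The quantifier and anchor examples above can be checked directly. This is a sketch using the sample strings from the text; the variable names are made up for illustration.

```javascript
// /a/ stops after one character, /a+/ takes the whole run.
const single = 'aaabbb'.match(/a/)[0];
const run = 'aaabbb'.match(/a+/)[0];

// * also accepts zero repetitions, so it matches the empty string here.
const zeroOrMore = 'bbb'.match(/a*/)[0];

// Anchored pattern: the whole string must consist of digits.
const digitsOnly = /^\d+$/;
const samples = ['574545485', 'd43849343', '549854fff', '4348d8788'];
const results = samples.map(s => digitsOnly.test(s));

console.log(single);     // a
console.log(run);        // aaa
console.log(zeroOrMore); // (empty string)
console.log(results);    // [true, false, false, false]
```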
Flags
Flags are modifiers declared after the closing slash of the regular expression: /regular expression/flags
The three most commonly used flags in JavaScript are:
g (global) Searches multiple times for the pattern.
i (ignore case) Ignores case in pattern.
m (multiline) treat beginning and end characters (^ and $) as working over multiple lines (i.e., match the beginning or end of each line (delimited by \n or \r), not only the very beginning or end of the whole input string)
So a regular expression like this:
/d[0-9]+/ig
matches D094938 and D344783 in 98498D094938A37834D344783.
The i makes the search case-insensitive, so the d in the pattern also matches a D. If the D is followed by one or more digits, the pattern matches. The g flag tells the expression to look for the pattern globally or, simply said, multiple times.
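A short sketch of how each flag changes the result. The first input is the one from the text above; the two-line string used for the m flag is made up for illustration.

```javascript
const input = '98498D094938A37834D344783';

// i + g: case-insensitive, all matches.
const allMatches = input.match(/d[0-9]+/ig);

// g only: the lowercase d never occurs, so match returns null.
const caseSensitive = input.match(/d[0-9]+/g);

// i only: match stops after the first hit.
const firstMatch = input.match(/d[0-9]+/i)[0];

// m: ^ and $ also match at line breaks (hypothetical two-line input).
const perLine = 'a\nb'.match(/^\w$/gm);
const wholeInput = 'a\nb'.match(/^\w$/g);

console.log(allMatches);    // ['D094938', 'D344783']
console.log(caseSensitive); // null
console.log(firstMatch);    // D094938
console.log(perLine);       // ['a', 'b']
console.log(wholeInput);    // null
```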
In your case @Qwerty provided the correct regex:
origin_str.replace(/\//g, "")
Here the search pattern is a single forward slash /, escaped by a backslash so that it is not read as the end of the regex literal. The g flag tells the replace function to search for all occurrences of the forward slash in the string and replace them with an empty string "".
For a comprehensive tutorial and reference : http://www.regular-expressions.info/tutorial.html
Looking for this?
origin_str.replace(/\//g, "")
The syntax for replace is
.replace(/pattern/flags, replacement)
So in my case the pattern is \/ - an escaped slash
and g is global flag.
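Applied to the string from the question, this looks roughly as follows. The split/join variant is an alternative added here for comparison; it is not from the answers above.

```javascript
const origin_str = 'abc/!/!'; // original string from the question

// Escaped forward slash with the global flag.
const modified_str = origin_str.replace(/\//g, '');

// Alternative without a regex: split on the slash, join the pieces.
const viaSplitJoin = origin_str.split('/').join('');

console.log(modified_str); // abc!!
console.log(viaSplitJoin); // abc!!
```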
Can somebody explain what this regular expression does?
document.cookie.match(/cookieInfo=([^;]*).*$/)[1]
Also it would be great if I can strip out the double quotes I'm seeing in the cookieInfo values. i.e. when cookieInfo="xyz+asd" - I want to strip out the double quotes using the above regular expression.
It is basically saying: grab as many characters as possible that are not semicolons and that follow the string 'cookieInfo='
Try this to eliminate the double quotes:
document.cookie.match(/cookieInfo="([^;]*)".*$/)[1]
It searches the document.cookie string for cookieInfo=.
Next it grabs all of the characters which are not ; (until it hits the first semicolon).
[...] matches any one character from the set inside the brackets.
[^...] matches any one character that is not in the set.
Then it lets the RegEx search through all other characters.
.* any character, 0 or more times.
$ end of string (or in some special cases, end of line).
You could replace " a couple of different ways, but rather than stuffing it into the regex, I'd recommend doing a replace on it after the fact:
var string = document.cookie.match(...)[1],
cleaned_string = string.replace(/^"|"$/g, "");
That second regex says "look at the start of the string and see if there's a " there, or look at the end of the string and see if there's a " there".
Normally, a RegEx would stop after it did the first thing it found. The g at the end means to keep going for every match it can possibly find in the string that you gave it.
I wouldn't put it in the original RegEx, because playing around with optional quotes can be ugly.
If they're guaranteed to always, always be there, then that's great, but if you assume they are, and you hit one that doesn't have them, then you're going to get a null match.
The regular expression matches a string starting with 'cookieInfo=', followed by 0 or more non-semicolon characters (which are captured), followed by 0 or more 'anythings'.
To strip out the double quotes you can use the regex /"/g and replace each match with an empty string.
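Putting the pieces together, here is a minimal sketch. The cookie string is a made-up stand-in for document.cookie so the snippet runs outside a browser, and the null check covers the case where the cookie is missing.

```javascript
// Hypothetical cookie string standing in for document.cookie.
const cookieString = 'theme=dark; cookieInfo="xyz+asd"; session=abc123';

// Capture everything after cookieInfo= up to the next semicolon.
const found = cookieString.match(/cookieInfo=([^;]*)/);

// match returns null when nothing is found, so guard before indexing.
const raw = found ? found[1] : null;

// Strip one leading and one trailing double quote, if present.
const cleaned = raw === null ? null : raw.replace(/^"|"$/g, '');

console.log(raw);     // "xyz+asd" (quotes included)
console.log(cleaned); // xyz+asd
```

Stripping the quotes in a second step keeps the first pattern working for values that are not quoted at all.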
I'm trying to extract (potentially hyphenated) words from a string that have been marked with a '#'.
So for example from the string
var s = '#moo, #baa and #moo-baa are writing an email to a#bc.de'
I would like to return
['#moo', '#baa', '#moo-baa']
To make sure I don't capture the email address, I check that the group is preceded by a white-space character OR the beginning of the line:
s.match(/(^|\s)#(\w+[-\w+]*)/g)
This seems to do the trick, but it also captures the spaces, which I don't want:
["#moo", " #baa", " #moo-baa"]
Silencing the grouping like this
s.match(/(?:^|\s)#(\w+[-\w+]*)/g)
doesn't seem to work, it returns the same result as before. I also tried the opposite, and checked that there's no \w or \S in front of the group, but that also excludes the beginning of the line. I know I could simply trim the spaces off, but I'd really like to get this working with just a single 'match' call.
Anybody have a suggestion what I'm doing wrong? Thanks a lot in advance!!
[edit]
I also just noticed: Why is it returning the '#' symbols as well?! I mean, it's what I want, but why is it doing that? They're outside of the group, aren't they?
As far as I know, String.match returns only the whole matches when the "g" modifier is used. With the modifier you are telling the function to find every match of the whole expression, instead of returning one match plus its numbered sub-expressions (groups). That is also why the '#' shows up: a global match does not return the groups, it returns the full text of each match.
In your case, the regular expression you were looking for might be this:
'#moo, #baa and #moo-baa are writing an email to a#bc.de'.match(/(?!\b)(#[\w\-]+)/g);
You are looking for every "#" symbol that doesn't follow a word boundary. So there is no need for silent groups.
If you don't want to capture the space, don't put the \s inside of the parentheses. Anything inside the parentheses will be returned as part of the capture group.
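Two ways of getting the desired array can be sketched as follows. The first uses \B, which is equivalent to the (?!\b) used above. The second uses String.prototype.matchAll (ES2020), which does expose capture groups for global patterns; the variable names are made up for illustration.

```javascript
const s = '#moo, #baa and #moo-baa are writing an email to a#bc.de';

// \B requires that there is no word boundary right before the '#',
// which rules out the '#' inside a#bc.de.
const viaNonBoundary = s.match(/\B#[\w-]+/g);

// matchAll yields one result per match including its groups, so the
// leading whitespace can stay outside the captured part.
const viaMatchAll = Array.from(
  s.matchAll(/(?:^|\s)(#[\w-]+)/g),
  m => m[1]
);

console.log(viaNonBoundary); // ['#moo', '#baa', '#moo-baa']
console.log(viaMatchAll);    // ['#moo', '#baa', '#moo-baa']
```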
I'd like to compare 2 strings with each other, but I have a little problem with the brackets.
The string I want to find looks like this:
CAPPL:LOCAL.L_hk[1].vorlauftemp_soll
Escaping those two brackets seems to have no effect.
I tried it with this code
var regex = new RegExp("CAPPL:LOCAL.L_hk\[1\].vorlauftemp_soll","gi");
var value = "CAPPL:LOCAL.L_hk[1].vorlauftemp_soll";
regex.test(value);
Somebody who can help me??
It has no effect because you're using a string. You need to escape the backslashes as well:
var regex = new RegExp("CAPPL:LOCAL.L_hk\\[1\\].vorlauftemp_soll","gi");
Or use a regex literal:
var regex = /CAPPL:LOCAL.L_hk\[1\].vorlauftemp_soll/gi
Unknown escape characters are ignored in JavaScript, so "\[" results in the same string as "[".
In value, you have (1) instead of [1]. So if you expect the regular expression to match and it doesn't, it is because of that.
Another problem is that you're wrapping the expression in "". To write a regular expression literal in JavaScript, use /.../g instead of "...".
You may also want to escape the dots in your expression. . means "any character that is not a line break". You, on the other hand, want the dot to be matched literally, so write it as \. instead.
You are generating a regular expression (in which [ is a special character that can be escaped with \) using a string (in which \ is a special character).
var regex = /CAPPL:LOCAL.L_hk\[1\].vorlauftemp_soll/gi;
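The effect of the lost backslashes can be seen by inspecting the source property of each regex. This is a sketch; the variable names are made up, the dots are escaped as suggested above, and the g flag is left out because a global regex keeps state in lastIndex between test calls.

```javascript
const value = 'CAPPL:LOCAL.L_hk[1].vorlauftemp_soll';

// "\[" in a string literal is just "[", so the regex engine sees an
// unescaped character class [1] and looks for "hk1", not "hk[1]".
const unescaped = new RegExp("CAPPL:LOCAL.L_hk\[1\].vorlauftemp_soll", "i");

// Doubling the backslashes keeps them in the pattern.
const escaped = new RegExp("CAPPL:LOCAL\\.L_hk\\[1\\]\\.vorlauftemp_soll", "i");

// A regex literal needs only one level of escaping.
const literal = /CAPPL:LOCAL\.L_hk\[1\]\.vorlauftemp_soll/i;

console.log(unescaped.source);      // CAPPL:LOCAL.L_hk[1].vorlauftemp_soll
console.log(unescaped.test(value)); // false
console.log(escaped.test(value));   // true
console.log(literal.test(value));   // true
```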