Validate jquery selector syntax using regex


I am trying to write a regular expression in JavaScript to validate whether a string is a valid jQuery selector. This is strictly educational and not a requirement in any project of mine.
Pattern
/^(\$|Jquery)\(('|")[\.|#]?[a-zA-Z][a-zA-Z0-9!]*('|")\)$/gi
It works fine for below tests
$("#id")//true
$('.class')//true
jquery('.class')//true
jquery('div')//true
My problem is that the test on $('#id") also returns true, even though mixing single and double quotes like that is invalid in JS. How can I restrict this? Can we have a conditional regex?
const pattern = /^(\$|Jquery)\(('|")[\.|#]?[a-zA-Z][a-zA-Z0-9!]*('|")\)$/gi;
[
`$("#id")`, //true
`$('.class')`, //true
`jquery('.class')`, //true
`jquery('div')`, //true
].forEach(str => console.log(pattern.test(str)));

You can capture the first quote or doublequote in a group, and require that same group (the same quote or doublequote) at the end, using a backreference:
const re = /^(?:\$|Jquery)\((['"])[\.#]?[a-zA-Z][a-zA-Z0-9!]*\1\)$/i; // no g flag: with g, test() keeps lastIndex between calls
console.log(re.test(`$("#id")`)) // true
console.log(re.test(`$('#id")`)) // false
console.log(re.test(`$("#id')`)) // false
console.log(re.test(`$('#id')`)) // true
There are also a couple other things to fix:
/^\$|Jquery...
meant that any string starting with $ would fulfill the regex. Enclose it in a group instead.
Single quote ' doesn't need escaping - best to remove the backslash.
Rather than
[\.|#]?
if you want to possibly match . or # (and not a pipe), use [\.#]? instead
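A side note on the g flag, as a minimal sketch: a regex object with the g flag remembers a lastIndex between test() calls, so reusing one pattern on several strings (as the forEach in the question does) can give alternating results. The helper name isSimpleSelectorCall is made up for illustration; the pattern is the backreference version without g.

```javascript
// Hypothetical helper: no g flag, so test() is stateless between calls.
const SELECTOR_CALL = /^(?:\$|jquery)\((['"])[.#]?[a-z][a-z0-9!]*\1\)$/i;

function isSimpleSelectorCall(source) {
  return SELECTOR_CALL.test(source);
}

const samples = [`$("#id")`, `$('.class')`, `jquery('div')`, `$('#id")`];
const results = samples.map(isSimpleSelectorCall);
console.log(results); // [ true, true, true, false ]

// Same pattern with the g flag: lastIndex carries over after a successful match.
const withG = /^(?:\$|jquery)\((['"])[.#]?[a-z][a-z0-9!]*\1\)$/gi;
const first = withG.test(`$("#id")`);     // true, lastIndex is now 8
const second = withG.test(`$('.class')`); // false, the search started at index 8
console.log(first, second);
```

Dropping g is the simplest fix here, since a validation pattern anchored with ^ and $ never needs more than one match per string.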

Related

Javascript Regex vs Java Regex

I have a regex in JavaScript that works great: /:([\w]+):/g
I am converting my JavaScript app to Java, and I know to escape the \ with another \, i.e. /:([\\w]+):/g, yet my tests still return no match for the string "hello :testsmilie: how are you?"
Pattern smiliePattern = Pattern.compile("/:([\\w]+):/g");
Matcher m = smiliePattern.matcher(message);
if (m.find()) {
    System.out.println(m.group(0));
}
In JavaScript it returns ":testsmilie:" just fine, so I'm not sure what the difference is. Any help would be much appreciated!

In Java, your regex can simply be:
Pattern.compile(":[^:]+:")
This matches a colon, followed by one or more characters that are not a colon, followed by another colon.
Or, if you want to use \w, you can use:
Pattern.compile(":\\w+:")
Note that you don't need the capturing group (), so to get the result you can just use:
System.out.println(m.group());

In a JavaScript regex literal, the / characters are only the delimiters of the actual regex, and g is the modifier for global matching; neither belongs in a Java pattern string.
In Java the equivalent is :([\\w]+): and there is no need for a global flag, since you just call .find() repeatedly to get all the matches.
Take a look at regex101, which is a good website for testing regexes.
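The same delimiter mistake can be reproduced on the JavaScript side with the RegExp constructor, which, like Java's Pattern.compile, takes the pattern as a plain string. This is only an illustrative sketch; the variable names are made up.

```javascript
const message = "hello :testsmilie: how are you?";

// Regex literal: the slashes delimit the pattern and g is a flag.
const literal = /:(\w+):/g;

// Constructor: pattern and flags are separate strings, and the backslash is doubled.
const built = new RegExp(":(\\w+):", "g");

// Analogous to the Java attempt: delimiters and flag end up inside the pattern itself.
const wrong = new RegExp("/:(\\w+):/g");

console.log(literal.source === built.source); // true
console.log(wrong.test(message));             // false, the message contains no "/"

// Collect every match, similar to calling find() in a loop in Java.
const found = [...message.matchAll(built)].map(m => m[0]);
console.log(found); // [ ':testsmilie:' ]
```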

Non-capturing groups in Javascript regex

I am matching a string in Javascript against the following regex:
(?:new\s)(.*)(?:[:])
The string I use the function on is "new Tag:var;"
What it is supposed to return is only "Tag", but instead it returns an array containing "new Tag:" as well as the desired result.
I found out that I might need to use a lookbehind instead but since it is not supported in Javascript I am a bit lost.
Thank you in advance!

Well, I don't really get why you need such a complicated regexp for what you want to extract:
(?:new\\s)(.*)(?:[:])
when it can be solved using the following:
var s = "new Tag:var;";
var out = s.replace(/new\s([^:]*):.*;/, "$1") // "Tag"
where there is only one capturing group, which is the one you're looking for.

\\s (double escaping) is only needed when creating a RegExp instance from a string.
Also, your regex uses the greedy pattern .* which may match more than desired.
Make it non-greedy:
(?:new\s)(.*?)(?:[:])
OR better use negation:
(?:new\s)([^:]*)(?:[:])
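For reference, a minimal sketch of reading the result: match() without the g flag returns the whole match at index 0 and the capture groups after it, so the wanted value is simply index 1. The second variant assumes an engine with ES2018 lookbehind support; the variable names are made up.

```javascript
const input = "new Tag:var;";

// match() without g returns [fullMatch, group1, ...] or null.
const m = input.match(/new\s([^:]*):/);
const tag = m ? m[1] : null;
console.log(m[0]); // "new Tag:"
console.log(tag);  // "Tag"

// Assumption: the engine supports lookbehind (ES2018 and later).
const viaLookbehind = input.match(/(?<=new\s)[^:]*(?=:)/);
console.log(viaLookbehind && viaLookbehind[0]); // "Tag"
```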

difference between ruby regex and javascript regex

I made this regular expression: /.net.(\w*)/
I'm trying to capture the qa in a string like this:
https://xxxxxx.cloudfront.net/qa/club/Slide1.PNG
I'm doing .replace on it like so: location.replace(/.net.(\w*)/, data.newName);
But instead of capturing qa, it captures .net, when I run the code in Javascript
According to this online regex tool made for ruby, it captures qa as intended
http://rubular.com/r/ItrG7BRNRn
What's the difference between Javascript regexes and Ruby regexes, and how can I make my regex work as intended in javascript?
Edit:
I changed my code to this:
var str = `https://xxxxxxxxxx.cloudfront.net/qa/club`;
var re = /\.net\/([^\/]*)\//;
console.log(data2.files[i].location.replace(re,'$1'+ "test"));
And instead of
https://dm7svtk8jb00c.cloudfront.net/test/club
I get this:
https://dm7svtk8jb00c.cloudfrontqatestclub
If I remove the $1 I get https://dm7svtk8jb00c.cloudfronttestclub, which is closer, but I want to keep the slashes.

This would be a better regex:
/\.net\/([^\/]*)\//
Remember that . will match any character, not just the period character. To match a literal period you need to escape it with a leading backslash: \.
Also, \w will only match numbers, letters and underscores. You could quite legitimately have a dash in that part of the URL. Therefore you're far better off matching anything that isn't a forward slash.

I am not sure how Ruby works, but JavaScript's replace does not replace just the capture group; it replaces the whole matched string. By adding another capture group, you can use $1 to add back the part of the string you want to keep.
...replace(/(.net.)(\w*)/, "$1" + data.newName);

You have to do that like this:
location.replace(/(\.net.)(\w*)/, '$1' + data.newName)
replace replaces the whole matched substring, not a particular group. Ruby works exactly in the same way:
ruby -e "puts 'https://xxxxxx.cloudfront.net/qa/club/Slide1.PNG'.sub(/.net.(\w*)/, '##')"
https://xxxxxx.cloudfront##/club/Slide1.PNG
ruby -e "puts 'https://xxxxxx.cloudfront.net/qa/club/Slide1.PNG'.sub(/(.net.)(\w*)/, '\\1' + '##')"
https://xxxxxx.cloudfront.net/##/club/Slide1.PNG

There's no difference (at least with the pattern you've provided). In both cases, the expression matches ".net/qa", with qa being the first capture group within the expression. Notice that even in your linked example the entire match is highlighted.
I'd recommend something like this:
location.replace(/(.net.)\w*/, "$1" + data.newName);
Or this, to be a bit safer:
location.replace(/(.net.)\w*/, function(m, a) { return a + data.newName; });
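To make the "safer" part concrete, here is a small sketch using the URL from the question and a slightly tighter pattern. The helper name replaceSegment is made up for illustration. With a string replacement, sequences such as $& inside the new name are interpreted; with a function replacer they are inserted literally.

```javascript
const url = "https://xxxxxx.cloudfront.net/qa/club/Slide1.PNG";
const pattern = /(\.net\/)[^\/]*/;

// Hypothetical helper: swaps the first path segment after ".net/".
function replaceSegment(input, newName) {
  // The return value of a function replacer is used as-is,
  // so "$" sequences in newName are not expanded.
  return input.replace(pattern, (match, prefix) => prefix + newName);
}

console.log(replaceSegment(url, "test"));
// https://xxxxxx.cloudfront.net/test/club/Slide1.PNG

// String replacement: "$&" in the new name expands to the whole match.
const viaString = url.replace(pattern, "$1" + "$&");
console.log(viaString);
// https://xxxxxx.cloudfront.net/.net/qa/club/Slide1.PNG

// Function replacement: "$&" stays literal.
console.log(replaceSegment(url, "$&"));
// https://xxxxxx.cloudfront.net/$&/club/Slide1.PNG
```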

It's not so much a difference between JavaScript's and Ruby's implementations of regular expressions; it's your pattern that needs a bit of work. It's not tight enough.
You can use something like /\.net\/([^\/]+)/, which you can see in action at Rubular.
That returns the characters delimited by / following .net.
Regex patterns are very powerful, but they're also fraught with dangerous side effects that open up big holes easily, causing false positives that can ruin results unexpectedly. Until you know them well, start simply, and test them every imaginable way. And once you think you know them well, keep doing that. Patterns in the code we write where I work are a particular hot button for me; I'm always finding holes in them in our code reviews and requiring them to be tightened until they do exactly what the developer meant, not what they thought they meant.
While the pattern above works, I'd probably do it a bit differently in Ruby. Using the tools made for the job:
require 'uri'
URL = 'https://xxxxxx.cloudfront.net/qa/club/Slide1.PNG'
uri = URI.parse(URL)
path = uri.path # => "/qa/club/Slide1.PNG"
path.split('/')[1] # => "qa"
Or, more succinctly:
URI.parse(URL).path.split('/')[1] # => "qa"

JS RegEx: simple match with optional parts

I think I don't really get RegEx stuff, so I need help matching the following simple pattern:
SOME_TEXT _Syn: SYN_TEXT _Ant: ANT_TEXT
X_TEXT is any text (that does not contain the special abbreviations _Syn: or _Ant:), and the _Syn and _Ant parts are optional.
I need to get SOME_TEXT, SYN_TEXT and ANT_TEXT in an array.
So, for example, if the _Syn part is not present (input is SOME_TEXT _Ant: ANT_TEXT), the result should be [SOME_TEXT, '', ANT_TEXT].
I tried different approaches with lazy modifiers but failed to implement it.

/(.*?)(?:_Syn:(.*?))?(?:_Ant:(.*?))?$/
The important parts are the ? after the .* which make them reluctant (not greedy) and the $ at the end that forces the match in spite of all of the optional matches.
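A minimal sketch of using that pattern to build the requested array. Optional groups that did not take part in the match come back as undefined, so they are mapped to '' here, and surrounding whitespace is trimmed. The helper name splitEntry is made up for illustration.

```javascript
const PARTS = /(.*?)(?:_Syn:(.*?))?(?:_Ant:(.*?))?$/;

// Hypothetical helper: returns [text, synonyms, antonyms], with '' for missing parts.
function splitEntry(entry) {
  const m = entry.match(PARTS);
  // Groups that did not participate in the match are undefined; map them to ''.
  return m.slice(1).map(part => (part || "").trim());
}

console.log(splitEntry("SOME_TEXT _Syn: SYN_TEXT _Ant: ANT_TEXT"));
// [ 'SOME_TEXT', 'SYN_TEXT', 'ANT_TEXT' ]
console.log(splitEntry("SOME_TEXT _Ant: ANT_TEXT"));
// [ 'SOME_TEXT', '', 'ANT_TEXT' ]
console.log(splitEntry("SOME_TEXT _Syn: SYN_TEXT"));
// [ 'SOME_TEXT', 'SYN_TEXT', '' ]
```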

Use this regex
var n=str.match(/(SOME|SYN|ANT)_TEXT/g);
n would contain an array of matched strings

Regex validation rules

I'm writing a database backup function as part of my school project.
I need to write a regex rule so the database backup name can only contain legal characters.
By 'legal' I mean a string that doesn't contain ANY symbols or spaces. Only letters from the alphabet and numbers.
An example of a valid string would be '31Jan2012' or '63927jkdfjsdbjk623' or 'hello123backup'.
Here's my JS code so far:
// Check if the input box contains the characters a-z, A-Z, or 0-9 with a regular expression.
function checkIfContainsNumbersOrCharacters(elem, errorMessage){
    var regexRule = new RegExp("^[\w]+$");
    if(regexRule.test( $(elem).val() ) ){
        return true;
    }else{
        alert(errorMessage);
        return false;
    }
}
//call the function
checkIfContainsNumbersOrCharacters("#backup-name", "Input can only contain the characters a-z or 0-9.");
I've never really used regular expressions before, but after a quick bit of googling I found this tool, from which I wrote the following regex rule:
^[\w]+$
^ = start of string
[\w] = a-z/A-Z/0-9
'+' = characters after the string.
When running my function, whatever string I input seems to return false :( Is my code wrong, or am I not using regex rules correctly?

The problem here is that when you write \w inside a string, you escape the w, and the resulting regular expression looks like this: ^[w]+$, containing the w as a literal character. When creating a regular expression with a string argument passed to the RegExp constructor, you need to escape the backslash, like so: new RegExp("^[\\w]+$"), which will create the regex you want.
There is a way to avoid that, using the shorthand notation provided by JavaScript: var regex = /^[\w]+$/; which does not need any extra escaping.
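The difference is easy to see by inspecting the source property of each regex. A small sketch, with made-up variable names:

```javascript
// In a string literal, "\w" is just "w", so the class only allows the letter w.
const broken = new RegExp("^[\w]+$");
console.log(broken.source);           // ^[w]+$
console.log(broken.test("hello123")); // false
console.log(broken.test("www"));      // true

// Doubling the backslash in the string, or using a literal, gives the intended pattern.
const fixed = new RegExp("^[\\w]+$");
const literal = /^[\w]+$/;
console.log(fixed.source === literal.source); // true
console.log(fixed.test("hello123"));          // true
```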

It can be simpler. This works:
function checkValid(name) {
return /^\w+$/.test(name);
}
/^\w+$/ is the literal notation for new RegExp(). Since the .test function returns a boolean, you only need to return its result. This also reads better than new RegExp("^\\w+$"), and you're less likely to goof up (thanks @x3ro for pointing out the need for two backslashes in strings).

In JavaScript, \w is equivalent to [A-Za-z0-9_], so besides letters and digits it also matches the underscore. In some other regex flavors, \w or a class such as [[:alnum:]] can also match characters outside ASCII, which may or may not be what you want. If what you really intend to match is [0-9A-Za-z], then that's what you should use.
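A quick check of how \w and an explicit character class differ in JavaScript, as a sketch with made-up variable names:

```javascript
const wordChars = /^\w+$/;
const alnumOnly = /^[0-9A-Za-z]+$/;

console.log(wordChars.test("my_backup")); // true, \w includes the underscore
console.log(alnumOnly.test("my_backup")); // false
console.log(alnumOnly.test("31Jan2012")); // true
console.log(alnumOnly.test("hello 123")); // false, spaces are rejected
```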

When you declare the regex as a string parameter to the RegExp constructor, you need to escape it. Both
var regexRule = new RegExp("^[\\w]+$");
...and...
var regexRule = new RegExp(/^[\w]+$/);
will work.
Keep in mind, though, that client-side validation of database data will never be enough: the validation is easily bypassed by disabling JavaScript in the browser, and invalid or malicious data can then reach your DB. You need to validate the data on the server side as well. Validating on the client side is still good practice, since it prevents requests with invalid data from being sent in the first place.

This is the official spec: http://dev.mysql.com/doc/refman/5.0/en/identifiers.html but it's not very easily converted to a regular expression. Just a regular expression won't do it as there are also reserved words.
Why not just put it in the query (don't forget to escape it properly) and let MySQL give you an error? There might for instance be a bug in the MySQL version you're using, and even though your check is correct, MySQL might still refuse.
