Combine whitelist and blacklist in javascript regex expression - javascript

I am having problems constructing a regex that will allow the full range of UTF-8 characters with the exception of 2 characters: _ and %
So the whitelist is: ^[\u0000-\uFFFF] and the blacklist is: ^[^_%]
I need to combine these into one expression.
I have tried the following code, but it does not work the way I had hoped:
var input = "this%";
var patrn = /[^\u0000-\uFFFF&&[^_%]]/g;
if (input.match(patrn) == "" || input.match(patrn) == null) {
return true;
} else {
return false;
}
input: this%
actual output: true
desired output: false

If I understand correctly, one of these should be enough:
/^[^_%]*$/.test(str);
!/[_%]/.test(str);

Use negative lookahead:
(?!_blacklist_)_whitelist_
In this case:
^(?:(?![_%])[\u0000-\uFFFF])*$
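As a rough sketch of how the two suggestions above compare, the following wraps each pattern in a small helper and runs it on a few inputs. The names viaNegatedClass and viaLookahead are made up for illustration and do not come from the question.

```javascript
// Hypothetical helper names, used only for this illustration.

// Negated class: every character must be something other than _ or %.
const viaNegatedClass = (str) => /^[^_%]*$/.test(str);

// Negative lookahead: each character must be in the whitelist range
// and must not be one of the blacklisted characters.
const viaLookahead = (str) => /^(?:(?![_%])[\u0000-\uFFFF])*$/.test(str);

for (const sample of ["this", "this%", "a_b"]) {
  // Logs the sample followed by the verdict of each helper.
  console.log(sample, viaNegatedClass(sample), viaLookahead(sample));
}
```

Both helpers should agree on these inputs; the lookahead form is mainly useful when the whitelist is narrower than "every character".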

Underscore is \u005F and percent is \u0025. You can simply alter the range to exclude these two characters:
^[\u0000-\u0024\u0026-\u005E\u0060-\uFFFF]
This will be just as fast as the original regex.
But I don't think you are going to get the result you really want this way. A \uXXXX escape in JavaScript only goes up to \uFFFF; anything past that is technically two characters (a surrogate pair).
For example, the following code returns false:
/^.$/.test('💩')
You need to have a different way to see if you have characters outside that range. This answer gives the following code:
String.prototype.getCodePointLength= function() {
return this.length-this.split(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g).length+1;
};
Simply put, if the number returned by that is not the same as the value of .length, you have a surrogate pair (and thus you should return false).
If your input passes that test, you can run it up against another regex to avoid all the characters between \u0000-\uFFFF that you want to avoid.
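The two-step check described above could be sketched roughly as follows. The function names codePointLength and isAllowed are hypothetical, and the code-point count is written as a plain function rather than a String.prototype extension, which is a choice made for this example only.

```javascript
// Hypothetical names; a sketch of the two-step check, not a definitive implementation.

// Same arithmetic as getCodePointLength, but as a standalone function:
// each surrogate pair is counted once instead of twice.
function codePointLength(str) {
  return str.length - str.split(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g).length + 1;
}

function isAllowed(str) {
  // Step 1: reject anything containing a surrogate pair (a character above \uFFFF).
  if (codePointLength(str) !== str.length) {
    return false;
  }
  // Step 2: reject the two blacklisted characters.
  return !/[_%]/.test(str);
}

console.log(isAllowed("this"));
console.log(isAllowed("this%"));
```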

Related

JavaScript regex inline validation for basic calculation string with one operator

I've written a basic 2 operand calculator app (+ - * /) that uses a couple of inline regex validations to filter away invalid characters as they are typed.
An example looks like:
//check if operator is present
if(/[+\-*\/]/.test(display1.textContent)){
//validate the string each time a new character is added
if(!/^\d+\.?\d*[+\-*\/]?\d*\.?\d*$/.test(display1.textContent)){
console.log('invalid')
return false
}
//validate the string character by character before operator
} else {
if(!/^\d+\.?\d*$/.test(display1.textContent)){
console.log('invalid')
return false
}
}
In the above, a valid character doesn't return false:
23.4x0.00025 (no false returned and hence the string is typed out)
But, if an invalid character is typed the function returns false and the input is filtered away:
23.4x0.(x) the x at the end makes the function return false, so it is filtered (only one operator is allowed per calculation)
and only 23.4x0. is typed
It works pretty well but allows for the following which I would like to deal with:
2.+.1
I would prefer 2.0+0.1
My regex would need an if-then-else conditional stating that if the current character is '.' then the next character must be a number else the next char can be number|.|operator. Or if the current character is [+-*/] then the next character must be a number, else the next char can be any char (while following the overall logic).
The tricky part is that the logic must process the string as it is typed character by character and validate at each addition (and be accurate), not at the end when the string is complete.
if-then-else regex is not supported in JavaScript (which I think would satisfy my needs) so I need to use another approach whilst remaining within the JS domain.
Any suggestions about this specific problem would be really helpful.
Thanks
https://github.com/jdineley/Project-calculator
Thanks @trincot for the tips on using capturing groups and lookarounds. This helped me write what I needed:
https://regex101.com/r/khUd8H/1
The GitHub app is updated and works as desired. Now I just need to make it pretty!
For ensuring that an operator is not allowed when the preceding number ended in a point, you can insert a positive look behind in your regex that requires the character before an operator to always be a digit: (?<=\d)
Demo:
const validate = s => /^(\d+(\.\d*)?((?<=\d)[+*/-]|$))*$/.test(s);
document.querySelector("input").addEventListener("input", function () {
this.style.backgroundColor = validate(this.value) ? "" : "orange";
});
Input: <input>
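To see how that pattern behaves while a string is being typed character by character, a small sketch can check every prefix in turn. firstRejectedPrefix is a hypothetical helper added for illustration, and the code assumes a JavaScript engine that supports lookbehind.

```javascript
// Same pattern as in the demo above.
const validate = (s) => /^(\d+(\.\d*)?((?<=\d)[+*/-]|$))*$/.test(s);

// Hypothetical helper: simulates typing by validating each prefix,
// and returns the first prefix the pattern rejects (or null if none).
function firstRejectedPrefix(typed) {
  for (let i = 1; i <= typed.length; i++) {
    const prefix = typed.slice(0, i);
    if (!validate(prefix)) {
      return prefix;
    }
  }
  return null;
}

console.log(firstRejectedPrefix("2.0+0.1"));
console.log(firstRejectedPrefix("2.+.1"));
```

For "2.+.1" the rejection should happen as soon as the operator follows the trailing point, which is the case the question wanted to block.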

Using javascript or python regular expressions I want to determine if the number is 1-99 how to make it?

var category = prompt("where do you go? (1~99)", "");
hello
Using regular expressions I want to determine if the category is 1-99.
How can I solve it?
Thank you if you let me know.
You can use character classes to match digits, like this [0-9]. If you put two of them together you'll match 00 - 99. If you put a ? after one of them, then it's optional, so you'll match 0 - 99. To enforce 1-99, make the non-optional one like this [1-9]. Finally, you need to make sure there's nothing before or after the one or two digits using ^, which matches the beginning of the string, and $ which matches the end.
if (category.match(/^[1-9][0-9]?$/)){
console.log("ok")
} else {
console.log("not ok")
}
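As a quick sanity check of the pattern built up above, the following sketch runs it against a handful of sample inputs. The helper name isOneToNinetyNine and the sample values are assumptions chosen for illustration.

```javascript
// Hypothetical helper wrapping the pattern from the answer above.
const isOneToNinetyNine = (s) => /^[1-9][0-9]?$/.test(s);

// Sample inputs chosen for illustration: valid values, then edge cases.
const samples = ["1", "42", "99", "0", "07", "100", "", "4a"];

for (const sample of samples) {
  console.log(JSON.stringify(sample), isOneToNinetyNine(sample));
}
```

Because prompt() returns a string (or null when cancelled), the check works on the raw input without converting it to a number first.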
In JavaScript you can use test() method with RE for 1-99 as shown below:
var one_to_ninetynine = /^[1-9][0-9]?$/;
if(one_to_ninetynine.test(category)) {
console.log("The number is between 1-99");
} else {
console.log("The number is NOT between 1-99");
}

Formatting in Javascript

I have a question related to formatting strings.
The user should pass a string in the format XX:XX.
If the string passed by the user is in the format XX:XX, I need to return true,
else false:
app.post('/test', (req, res) => {
if (req.body.time is in the format of XX:XX) {
return true
} else {
return false
}
});
You can use the RegExp.test function for this kind of thing.
Here is an example:
var condition = /^[a-zA-Z]{2}:[a-zA-Z]{2}$/.test("XX:XX");
console.log("Condition: ", condition);
The regex that I've used in this case checks if the string is composed of two upper or lower case letters followed by a colon and another two such letters.
Based on your edits it seems that you're trying to check if a string represents an hour and minute value. If that is the case, a regex like this will be more appropriate: /^\d{2}:\d{2}$/. This regex checks if the string is composed of 2 digits followed by a colon and another 2 digits.
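A minimal sketch of how the digit-based check might be wrapped in a function follows. The names isTimeFormat and isValidTime are hypothetical, and the stricter pattern that limits hours to 00-23 and minutes to 00-59 is an assumption about what "hour and minute value" means, not something stated in the question.

```javascript
// Hypothetical helpers, shown only as an illustration.

// Loose check: exactly two digits, a colon, two digits.
function isTimeFormat(value) {
  return typeof value === "string" && /^\d{2}:\d{2}$/.test(value);
}

// Stricter variant (an assumption): hours 00-23 and minutes 00-59.
function isValidTime(value) {
  return typeof value === "string" && /^([01]\d|2[0-3]):[0-5]\d$/.test(value);
}

console.log(isTimeFormat("12:30"), isValidTime("12:30"));
console.log(isTimeFormat("99:99"), isValidTime("99:99"));
```

Inside a route handler, the result of either function could be sent back with res.send or res.json rather than returned, since a bare return value from an Express handler is not sent to the client.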
The tool you're looking for is called Regular Expressions.
It is globally supported in almost every development platform, which makes it extremely convenient to use.
I would recommend this website for working out your regular expressions.
/^[a-zA-Z]{2}:[a-zA-Z]{2}$/ is an example of a Regular Expression that will take any pattern of:
[a-zA-Z]{2} - two characters from the sets a-z and A-Z.
Followed by :
Followed by the same pattern as the first part. Essentially, this validates the pattern XX:XX. Of course, you can adjust it to whatever you want to allow for X.
^ marks the beginning of a string and $ marks the end of it, so ASD:AS would not work even though it contains the described pattern.
Try using a regex:
var str = "12:aa";
var patt = new RegExp("^([a-zA-Z]|[0-9]){2}:([a-zA-Z]|[0-9]){2}$");
var res = patt.test(str);
if(res){ //if true
//do something
}
else{}

Find longest repeating substring in JavaScript using regular expressions

I'd like to find the longest repeating string within a string, implemented in JavaScript and using a regular-expression based approach.
I have a PHP implementation that, when directly ported to JavaScript, doesn't work.
The PHP implementation is taken from an answer to the question "Find longest repeating strings?":
preg_match_all('/(?=((.+)(?:.*?\2)+))/s', $input, $matches, PREG_SET_ORDER);
This will populate $matches[0][X] (where X is the length of $matches[0]) with the longest repeating substring to be found in $input. I have tested this with many input strings and am confident the output is correct.
The closest direct port in JavaScript is:
var matches = /(?=((.+)(?:.*?\2)+))/.exec(input);
This doesn't give correct results:
input            Expected result   matches[0][X]
======================================================
inputinput       input             input
7inputinput      input             input
inputinput7      input             input
7inputinput7     input             7
XXinputinputYY   input             XX
I'm not familiar enough with regular expressions to understand what the regular expression used here is doing.
There are certainly algorithms I could implement to find the longest repeating substring. Before I attempt to do that, I'm hoping a different regular expression will produce the correct results in JavaScript.
Can the above regular expression be modified such that the expected output is returned in JavaScript? I accept that this may not be possible in a one-liner.
Javascript matches only return the first match -- you have to loop in order to find multiple results. A little testing shows this gets the expected results:
function maxRepeat(input) {
  var reg = /(?=((.+)(?:.*?\2)+))/g; // the whole pattern is a lookahead, so every match is zero-length
  var maxstr = ""; // our maximum length repeated string
  var sub; // somewhere to stick temp results
  while ((sub = reg.exec(input)) !== null) {
    if (sub[2].length > maxstr.length) {
      maxstr = sub[2];
    }
    reg.lastIndex++; // a zero-length match does not advance lastIndex, so start searching from the next position
  }
  return maxstr;
}
// I'm logging to console for convenience
console.log(maxRepeat("aabcd")); //a
console.log(maxRepeat("inputinput")); //input
console.log(maxRepeat("7inputinput")); //input
console.log(maxRepeat("inputinput7")); //input
console.log(maxRepeat("7inputinput7")); //input
console.log(maxRepeat("xxabcdyy")); //x
console.log(maxRepeat("XXinputinputYY")); //input
Note that for "xxabcdyy" you only get "x" back, as it returns the first string of maximum length.
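An alternative way to write the same loop, sketched here under the assumption that String.prototype.matchAll (ES2020) and the s flag (ES2018) are available, lets the iterator step past the zero-length matches. The function name longestRepeat is hypothetical; the s flag is added to mirror the /s modifier in the PHP pattern.

```javascript
// Hypothetical function name; a sketch, not the answer's original code.
function longestRepeat(input) {
  // g is required by matchAll; s makes . match newlines, like PHP's /s.
  const re = /(?=((.+)(?:.*?\2)+))/gs;
  let longest = "";
  // matchAll advances past zero-length matches on its own,
  // so there is no need to bump lastIndex manually.
  for (const match of input.matchAll(re)) {
    if (match[2].length > longest.length) {
      longest = match[2]; // group 2 is the repeated unit
    }
  }
  return longest;
}

console.log(longestRepeat("7inputinput7"));
console.log(longestRepeat("XXinputinputYY"));
```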
It seems JS regexes are a bit weird. I don't have a complete answer, but here's what I found.
Although I thought they did the same thing, re.exec() and "string".match(re) behave differently. Exec seems to only return the first match it finds, whereas match seems to return all of them (using /g in both cases).
On the other hand, exec seems to work correctly with ?= in the regex whereas match returns all empty strings. Removing the ?= leaves us with
re = /((.+)(?:.*?\2)+)/g
Using that
"XXinputinputYY".match(re);
returns
["XX", "inputinput", "YY"]
whereas
re.exec("XXinputinputYY");
returns
["XX", "XX", "X"]
So at least with match you get inputinput as one of your values. Obviously, this neither pulls out the longest, nor removes the redundancy, but maybe it helps nonetheless.
One other thing, I tested in firebug's console which threw an error about not supporting $1, so maybe there's something in the $ vars worth looking at.

Regular expressions - javascript

I know this is a simple thing, but I just can't make it work.
Req: A word which contain at least one number, alphabets (can be both cases) and at least one symbol (special character).
In C#, (?=.[0-9])(?=.[a-zA-z])(?=.*[!##$%_]) worked, but in JavaScript it's not working.
It seems like it always looks for a number at the beginning, since my condition starts with a number in the regexp.
Can anyone give me a regexp that can be used in javascript?
-Rakesh
JavaScript does support lookaheads. However, your groups expect that there's at least one character before the number and letter (because they start with just a dot .). Try adding a * to those two dots. Note also that the range A-z should be A-Z, because A-z additionally matches [, \, ], ^, _ and `:
var pattern = /(?=.*[0-9])(?=.*[a-zA-Z])(?=.*[!##$%_])/;
pattern.test('xxx'); // false
pattern.test('111'); // false
pattern.test('!!!'); // false
pattern.test('x1!'); // true
I'm seeing the same problem with this regular expression in C#, too.
Just to cover the obvious answer, given the requirements as stated I would use separate tests.
/[0-9]/.test(string) && /[a-z]/i.test(string) && /[!##$%_]/.test(string)
If you're interested in abstracting this away, one way is to store the tests in an array.
var tests = [ /[0-9]/, /[a-z]/i, /[!##$%_]/ ];
And one way to evaluate multiple tests without modifying the scope of surrounding code, simply shoehorning this into a closure, follows.
var passes = (function(){
for (var i=0; i<tests.length; i++)
if (!tests[i].test(string)) return false;
return true;
})();
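The same idea can be sketched more compactly with Array.prototype.every, which stops at the first failing test. The wrapper name passesAll is hypothetical, and the tests array is repeated here only so that the example is self-contained.

```javascript
// Same three tests as in the array above, repeated so the sketch runs on its own.
const tests = [/[0-9]/, /[a-z]/i, /[!##$%_]/];

// Hypothetical wrapper: true only if every regex finds a match.
// None of the regexes uses the g flag, so test() keeps no state between calls.
function passesAll(string) {
  return tests.every((re) => re.test(string));
}

console.log(passesAll("x1!"));
console.log(passesAll("xxx"));
```

Taking the string as a parameter avoids relying on a variable from the surrounding scope, which is what the closure version does.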
I don't think javascript supports those lookaheads. Try
/(.*[a-zA-Z].*[0-9].*[!##$%_].*|.*[a-zA-Z].*[!##$%_].*[0-9].*|.*[0-9].*[a-zA-Z].*[!##$%_].*|.*[!##$%_].*[a-zA-Z].*[0-9].*|.*[0-9].*[!##$%_].*[a-zA-Z].*|.*[!##$%_].*[0-9].*[a-zA-Z].*)/
Not expecting any points for elegance...
Edit: As bdukes pointed out, js does support lookaheads. However, this (ugly) expression does work.
You can have a very long regexp, with the three character classes repeated in different orders, or use more than one test:
function teststring(s){
return /^[\da-zA-Z!##$%_]+$/.test(s) &&
/\d/.test(s) && /[a-zA-Z]/.test(s) && /[!##$%_]/.test(s);
}
