Clarification of a specific regex

I attempted the CoderByte - Simple Symbols - challenge in JavaScript. From CoderByte:
Using the JavaScript language, have the function SimpleSymbols(str)
take the str parameter being passed and determine if it is an
acceptable sequence by either returning the string true or false. The
str parameter will be composed of + and = symbols with several letters
between them (ie. ++d+===+c++==a) and for the string to be true each
letter must be surrounded by a + symbol. So the string to the left
would be false. The string will not be empty and will have at least
one letter.
My solution:
function simpleSymbols(str) {
  var isSymbol = true;
  var output = " ";
  var symbol = " ";
  if (str.match(/[a-zA-Z]/).length != 0) {
    for (var i = 0; i <= str.length - 1; i++) {
      if ((str.charAt(i) >= 'A' && str.charAt(i) <= 'Z') ||
          (str.charAt(i) >= 'a' && str.charAt(i) <= 'z')) {
        if (i != str.length - 1) {
          symbol = str[--i] + str[++i] + str[++i];
          var rgx = new RegExp(/\+[a-zA-Z]\+/);
          if (!(rgx.test(symbol))) {
            isSymbol = false;
            break;
          }
        }
        else {
          isSymbol = false;
          break;
        }
      }
    }
  }
  else {
    isSymbol = false;
  }
  return isSymbol;
}
This worked fine for all test cases.
On reviewing code of other submissions, I came across a submission which required only a single line of code:
return ('=' + str + '=').match(/([^\+][a-z])|([a-z][^\+])/gi) === null;
I'm having trouble understanding how the RegEx used here works. Theoretically, I understand:
g modifier => checks for all matches
i modifier => case-insensitive checking
a-z => checks the string contains only letters
\+ => refers to the plus sign
| => match either alternative1 OR alternative2
Thus, if referring to the above, I understand that there are two match conditions:
([^\+][a-z])
([a-z][^\+])
So, for a test input such as "+x+y+z+", am I correct in understanding that it checks matches as follows: +x => x+ => +y => y+ => +z => z+
Further clarification on this RegEx would be really helpful.
Thanks.

https://regex101.com/ is your friend!
Technically, what you have said is right.
[^+] matches any character except the plus sign. So the regex says: "if there is a letter that is not preceded by a + or a letter that is not followed by a +, return that match".
But because of the "=== null" comparison, the function returns true only if the regex has not found anything. The '=' added to both ends of the string makes sure that a letter at the very start or end still has a non-plus neighbour for the regex to catch.

[^\+] means any character that's not a plus. [] is a character class, and a ^ at the beginning inside a character class means negate/not. The first alternative just asks "does this string contain any character that's not a plus, followed by a letter a-z?", which would mean it doesn't follow the rules. The second alternative asks the same about a letter followed by a character that's not a plus.
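To see which pairs the one-liner objects to, a small sketch can return the matches instead of comparing against null. The helper name offendingPairs is made up for illustration, and the capture groups are dropped because they do not change what match returns with the g flag.

```javascript
// The one-liner from the question, wrapped in a function.
function simpleSymbolsOneLiner(str) {
  return ('=' + str + '=').match(/([^\+][a-z])|([a-z][^\+])/gi) === null;
}

// Hypothetical helper: list the two-character pairs that break the rule.
// Padding with '=' lets a letter at either end pair with a non-plus character.
function offendingPairs(str) {
  return ('=' + str + '=').match(/[^+][a-z]|[a-z][^+]/gi) || [];
}

console.log(simpleSymbolsOneLiner('+x+y+z+'));        // true
console.log(offendingPairs('+x+y+z+'));               // []
console.log(simpleSymbolsOneLiner('++d+===+c++==a')); // false
console.log(offendingPairs('++d+===+c++==a'));        // [ '=a' ]
```

So the regex does not walk through +x, x+, +y and so on confirming each one. It searches for any violation, and the string is acceptable only when no violation exists.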

Related

Regex to find 5 consecutive letters of alphabet (ex. abcde, noprst)

I have strings containing 5 letters of the alphabet. I would like to match those whose letters are consecutive in the alphabet, for example:
abcde - return match
nopqrs - return match
cdefg - return match
fghij - return match
but
abcef - do not return match
abbcd - do not return match
I could write all combinations but as you can write in Regex [A-Z] I assumed there must be a better way.

A very simple alternative would be to just use String.prototype.includes:
function isConsecutive(string) {
  const result = 'abcdefghijklmnopqrstuvwxyz'.includes(string);
  console.log(string, result);
}
// true
isConsecutive('abcde');
isConsecutive('nopqrs');
isConsecutive('cdefg');
isConsecutive('fghij');
// false
isConsecutive('abcef');
isConsecutive('abbcd');

If you can live with Python, this function converts each character of the string to its character code and checks whether the sorted codes are consecutive (if so, the letters are also consecutive alphabetically):
def are_letters_consequtive(text):
    nums = [ord(letter) for letter in text]
    if sorted(nums) == list(range(min(nums), max(nums)+1)):
        return "match"
    return "no match"
print(are_letters_consequtive('abcde'))
print(are_letters_consequtive('cdefg'))
print(are_letters_consequtive('fghij'))
print(are_letters_consequtive('abcef'))
print(are_letters_consequtive('abbcd'))
print(are_letters_consequtive('noprst'))
Outputs:
match
match
match
no match
no match
no match

An alternative using JavaScript:
let string1 = 'abcde'
let string2 = 'fghiz'
function conletters(string) {
  if(string.length > 5 || typeof string != 'string') throw '[ERROR] not string or string greater than 5'
  for(let i = 0; i < string.length - 1; i++) {
    if(!(string.charCodeAt(i) + 1 == string.charCodeAt(i + 1)))
      return false
  }
  return true
}
console.log('string1 is consecutive: ' + conletters(string1))
console.log('string2 is consecutive: ' + conletters(string2))

You should definitely do it with code:
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
That said, you can do better than testing all the combinations when using regexes. With lookahead expressions you can basically do an "and" operation. Since you know the length, you could do:
const myRegex = /(?=^(ab|bc)...$)(?=^.(ab|bc)..$)(?=^..(ab|bc).$)(?=^...(ab|bc)$)/
You will need to replace (ab|bc) with all the possible consecutive two-letter pairs (ab, bc, cd, and so on up to yz).
For this particular case it is actually worse than listing all the possibilities (since there are only 22 of them), but it extends more easily to other situations.
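Rather than typing out every pair by hand, the lookahead pattern can be generated. This is only a sketch of the idea above, assuming lowercase input of exactly five letters; the names pairs, alternation and consecutiveFive are illustrative and not from the original answer.

```javascript
const letters = 'abcdefghijklmnopqrstuvwxyz';

// Build the alternation of consecutive pairs: ab|bc|cd|...|yz
const pairs = [];
for (let i = 0; i < letters.length - 1; i++) {
  pairs.push(letters.slice(i, i + 2));
}
const alternation = '(?:' + pairs.join('|') + ')';

// One lookahead per pair position; every lookahead must succeed ("and").
const length = 5;
let source = '';
for (let pos = 0; pos <= length - 2; pos++) {
  source += '(?=^' + '.'.repeat(pos) + alternation + '.'.repeat(length - 2 - pos) + '$)';
}
const consecutiveFive = new RegExp(source);

console.log(consecutiveFive.test('abcde')); // true
console.log(consecutiveFive.test('abcef')); // false
console.log(consecutiveFive.test('abbcd')); // false
```

Changing length adapts the pattern to other fixed sizes, which is where this approach starts to pay off compared with listing every run.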

Determine if char is in regex

I'm trying to build a custom mention system using Quill and I need to figure out how to determine whether a given char is a letter of the alphabet or not.
Example:
I have this string: Hello my #name is Luis, and this function, which takes the position of the cursor and evaluates the word to check whether it contains #:
CheckWord = function(quill, start){
  let at = false, c_char;
  for(var i = start; i > 0; --i){
    c_char = quill.getText(i, 1);
    if (c_char == '#') {
      if (quill.getText(i-1, 1) == ' ' || quill.getText(i-1, 1) == '') {
        at = true;
        break;
      }
    }
    if (c_char == ' ') {
      at = false;
      break;
    }
  }
  return at;
}
Everything is working fine, but in
if (c_char == ' ') {
  at = false;
  break;
}
I need to verify whether it is not an alphabet char (A, B, C, D, etc.).
I know that with a regex like /^[a-zA-Z]+$/ I can achieve what I want, but I don't know how to implement it just to check whether the given letter is valid.

You can use RegExp.prototype.test() to test whether the character is a letter of the alphabet (a-z or A-Z); just negate the result if you want to match any character other than a letter:
if (/[a-z]/i.test(c_char)) {
  at = false;
  break;
}
The i in the regex means case-insensitive search (think of i as insensitive), i.e. it checks for both lowercase and uppercase letters. Since c_char is a single character, you don't need the ^ (beginning of input) and $ (end of input) anchors in the regex.
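As a rough illustration of the negated check, here is a sketch that works on a plain string instead of a Quill instance. Both isLetter and isInsideMention are hypothetical names, and the scan is a simplified stand-in for CheckWord, not the asker's actual function.

```javascript
// True when ch is a single ASCII letter, in either case.
const isLetter = (ch) => /[a-z]/i.test(ch);

// Hypothetical plain-string stand-in for the Quill-based CheckWord:
// walk left from the cursor until a '#' or a non-letter is found.
function isInsideMention(text, cursor) {
  for (let i = cursor - 1; i >= 0; i--) {
    const ch = text[i];
    if (ch === '#') {
      // The '#' only counts at the start of the text or after a space.
      return i === 0 || text[i - 1] === ' ';
    }
    if (!isLetter(ch)) {
      return false; // negated test: anything that is not a letter ends the word
    }
  }
  return false;
}

const text = 'Hello my #name is Luis';
console.log(isInsideMention(text, 14)); // true  (cursor right after "#name")
console.log(isInsideMention(text, 17)); // false (cursor right after "is")
```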

Remove Any Non-Digit And Check if Formatted as Valid Number

I'm trying to figure out a regex pattern that allows a string but removes anything that is not a digit, a ., or a leading -.
I am looking for the simplest way of removing any non "number" variables from a string. This solution doesn't have to be regex.
This means that it should turn
1.203.00 -> 1.20300
-1.203.00 -> -1.20300
-1.-1 -> -1.1
.1 -> .1
3.h3 -> 3.3
4h.34 -> 4.34
44 -> 44
4h -> 4
The rule would be that the first period is a decimal point, and every following one should be removed. There should only be one minus sign in the string and it should be at the front.
I was thinking there should be a regex for it, but I just can't wrap my head around it. Most regex solutions I have figured out allow the second decimal point to remain in place.

You can use this replace approach:
In the first replace we remove all characters that are neither digits nor dots. The only exception is a leading hyphen, which we skip using a negative lookahead.
In the second replace, with a callback, we remove every dot after the first one.
Code & Demo:
var nums = ['..1', '1..1', '1.203.00', '-1.203.00', '-1.-1', '.1', '3.h3',
  '4h.34', '4.34', '44', '4h'
];
document.writeln("<pre>");
// declare i with var so the loop does not create an implicit global
for (var i = 0; i < nums.length; i++)
  document.writeln(nums[i] + " => " + nums[i].replace(/(?!^-)[^\d.]+/g, "").
    replace(/^(-?\d*\.\d*)([\d.]+)$/,
      function($0, $1, $2) {
        return $1 + $2.replace(/[.]+/g, '');
      }));
document.writeln("</pre>");
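The same two replaces can be wrapped in a function so they can be tried outside a browser page. This is a sketch: the name cleanNumber is an assumption, and the expected values are the ones listed in the question.

```javascript
// Hypothetical wrapper around the two-step replace approach.
function cleanNumber(input) {
  return input
    // Step 1: drop everything that is not a digit or a dot,
    // except a hyphen at the very start of the string.
    .replace(/(?!^-)[^\d.]+/g, '')
    // Step 2: keep the part up to and including the first dot's digits,
    // then strip any remaining dots from the rest.
    .replace(/^(-?\d*\.\d*)([\d.]+)$/, (match, head, tail) => head + tail.replace(/\./g, ''));
}

console.log(cleanNumber('1.203.00'));  // 1.20300
console.log(cleanNumber('-1.203.00')); // -1.20300
console.log(cleanNumber('-1.-1'));     // -1.1
console.log(cleanNumber('4h.34'));     // 4.34
```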

A non-regex solution, implementing a trivial single-pass parser.
Uses ES5 Array features because I like them, but will work just as well with a for-loop.
function generousParse(input) {
  var sign = false, point = false;
  return input.split('').filter(function(char) {
    if (char.match(/[0-9]/)) {
      return sign = true;
    }
    else if (!sign && char === '-') {
      return sign = true;
    }
    else if (!point && char === '.') {
      return point = sign = true;
    }
    else {
      return false;
    }
  }).join('');
}
var inputs = ['1.203.00', '-1.203.00', '-1.-1', '.1', '3.h3', '4h.34', '4.34', '4h.-34', '44', '4h', '.-1', '1..1'];
console.log(inputs.map(generousParse));
Yes, it's longer than multiple regex replaces, but it's much easier to understand and see that it's correct.

I can do it with a regex search-and-replace. num is the string passed in.
num.replace(/[^\d\-\.]/g, '').replace(/(.)-/g, '$1').replace(/\.(\d*\.)*/, function(s) {
  return '.' + s.replace(/\./g, '');
});

OK weak attempt but seems fine..
var r = /^-?\.?\d+\.?|(?=[a-z]).*|\d+/g,
    str = "1.203.00\n-1.203.00\n-1.-1\n.1\n3.h3\n4h.34\n44\n4h",
    sar = str.split("\n").map(s=> s.match(r).join("").replace(/[a-z]/,""));
console.log(sar);

Trouble with Javascript easy coderbyte challenge

I'm attempting to answer this question:
Using the JavaScript language, have the function SimpleSymbols(str) take the str parameter being passed and determine if it is an acceptable sequence by either returning the string true or false. The str parameter will be composed of + and = symbols with several letters between them (ie. ++d+===+c++==a) and for the string to be true each letter must be surrounded by a + symbol. So the string to the left would be false. The string will not be empty and will have at least one letter.
Here's my solution:
function SimpleSymbols(str) {
  var test;
  for (var i = 0; i < str.length; i++) {
    if ((str.charAt(i) !== '+' && str.charAt(i+1) === str.match(/[a-z]/))
        || (str.charAt(i+1) === str.match(/[a-z]/) && str.charAt(i+2) !== '+')) {
      test = false;
      break;
    }
    else if (str.charAt(0) === str.match(/[a-z]/)) {
      test = false;
      break;
    }
    else {
      test = true;
    }
  }
  return test;
};

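I think you can just use two regexes and then compare the lengths of the arrays returned by them
function SimpleSymbols(str){
  // match() returns null on no match; (?=\+) leaves the shared + available for the next letter
  var letters = str.match(/[a-z]/gi) || [];
  var surrounded = str.match(/\+[a-z](?=\+)/gi) || [];
  return letters.length == surrounded.length;
}
The first regex /[a-z]/gi matches all the letters, and /\+[a-z](?=\+)/gi matches every letter that is preceded by a literal + and followed by one. The lookahead (?=\+) checks for the trailing + without consuming it, so in a string like +a+b+ the shared + still counts for b.
Then we just use the Array.length property to check whether the lengths are the same and return the Boolean result. As simple as that.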
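One case worth checking with any count-based approach is two letters that share a +, as in +a+b+. The sketch below compares a pattern that consumes the trailing + with one that only looks ahead for it; the variable names are illustrative.

```javascript
const sample = '+a+b+';

// Consumes the trailing '+', so the '+' before 'b' is already used up.
const consuming = sample.match(/\+[a-z]\+/g) || [];

// Only peeks at the trailing '+', so the next match can start on it.
const lookahead = sample.match(/\+[a-z](?=\+)/g) || [];

console.log(consuming); // [ '+a+' ]
console.log(lookahead); // [ '+a', '+b' ]
```

The `|| []` fallback matters as well, because match returns null rather than an empty array when nothing matches.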

Why is my RegExp ignoring start and end of strings?

I made this helper function to find single words that are not part of bigger expressions.
It works fine on any word that is NOT first or last in a sentence. Why is that?
Is there a way to add "" to a regexp?
String.prototype.findWord = function(word) {
  var startsWith = /[\[\]\.,-\/#!$%\^&\*;:{}=\-_~()\s]/ ;
  var endsWith = /[^A-Za-z0-9]/ ;
  var wordIndex = this.indexOf(word);
  if (startsWith.test(this.charAt(wordIndex - 1)) &&
      endsWith.test(this.charAt(wordIndex + word.length))) {
    return wordIndex;
  }
  else {return -1;}
}
Also, any improvement suggestions for the function itself are welcome!
UPDATE: example: I want to find the word able in a string. I want it to work in cases like [able] able, #able1 etc., but not in cases where it is part of another word like disable, enable etc.

A different version:
String.prototype.findWord = function(word) {
  return this.search(new RegExp("\\b"+word+"\\b"));
}
Your if only evaluates to true when startsWith matches the character before the word and endsWith matches the character after it. For the first or last word of the string there is no such character: charAt returns an empty string, and neither character class can match an empty string.

Did you try the word boundary, \b?
There is also \w, which matches one word character ([A-Za-z0-9_]); this could help you too (depending on your definition of a word).
See RegExp docs for more details.
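A minimal sketch of the word-boundary idea as a standalone function follows. The escapeRegExp helper is hypothetical (it is not a built-in) and is only there so that a word containing regex metacharacters is searched for literally.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return the index of the first whole-word occurrence, or -1 if there is none.
function findWord(text, word) {
  return text.search(new RegExp('\\b' + escapeRegExp(word) + '\\b'));
}

console.log(findWord('able to go', 'able'));          // 0  (first word)
console.log(findWord('is able.', 'able'));            // 3  (last word)
console.log(findWord('disable able enable', 'able')); // 8
console.log(findWord('disable enable', 'able'));      // -1
```

Because \b also matches at the start and end of the input, the first and last words need no special handling here.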

If you want your endsWith regexp to also match the empty string, you just need to append |^$ to it:
var endsWith = /[^A-Za-z0-9]|^$/ ;
Anyway, you can easily check if it is the beginning of the text with if (wordIndex == 0), and if it is the end with if (wordIndex + word.length == this.length).
It is also possible to eliminate this issue by operating on a copy of the input string surrounded with non-alphanumeric characters. For example:
var s = "#" + this + "#";
var wordIndex = s.indexOf(word) - 1; // search the padded copy; subtract 1 to get the index in the original string
But I'm afraid there is another problem with your function:
it would never match "able" in a string like "disable able enable", since the call to indexOf would return 3, then startsWith.test(this.charAt(2)) would return false and the function would exit with -1 without searching further.
So you could try:
String.prototype.findWord = function (word) {
  var startsWith = "[\\[\\]\\.,-\\/#!$%\\^&*;:{}=\\-_~()\\s]";
  var endsWith = "[^A-Za-z0-9]";
  // In the padded string the match starts on the delimiter before the word,
  // which sits at the same index as the word in the original string.
  // search() already returns -1 when nothing matches.
  return ("#" + this + "#").search(new RegExp(startsWith + word + endsWith));
}
