How to get last letters from a string? - javascript

I have the following string
str = '"Apples" AND "Bananas" OR Gala Me'
I want to get the 'Gala Me' bit at the end. The words 'AND', 'OR', or anything between quotes can be considered a token. I have this regular expression.
regex = /AND|OR|"[^"]+"/g
It matches all my tokens but how could I get the opposite of this regex to get the unmatched substring?

You can split() the string using the tokens:
var parts = str.split(/\s*(?:AND|OR|"[^"]+")\s*/)
// ["", "", "", "", "Gala Me"]
Optionally, you can filter them by length:
var parts = str.split(/\s*(?:AND|OR|"[^"]+")\s*/).filter(function(s) {
return s.length > 0;
});
// ["Gala Me"]
Afterwards, you select the last element (if applicable):
if (parts.length) {
console.log(parts.pop());
}
// "Gala Me"

A regular expression that matches anything after the last 'AND' or 'OR':
.*(?:AND|OR)(.*)$
The leading .*(?:AND|OR) part matches greedily up to and including the last AND or OR.
Group 1 then captures the required string (with its leading space, so trim it if needed).
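A minimal sketch of how this pattern might be used in JavaScript. The helper name tailAfterLastOperator is made up for illustration, and returning null when no operator is found is an assumption about the desired behaviour:

```javascript
// Hypothetical helper: returns the text after the last AND/OR,
// or null when the input contains neither keyword.
function tailAfterLastOperator(input) {
  const m = /.*(?:AND|OR)(.*)$/.exec(input);
  // Group 1 still carries the whitespace that followed the operator.
  return m ? m[1].trim() : null;
}

console.log(tailAfterLastOperator('"Apples" AND "Bananas" OR Gala Me')); // "Gala Me"
console.log(tailAfterLastOperator('no operators here')); // null
```

The pattern is case-sensitive and does not know about tokens, so an AND or OR inside a longer word or inside quotes would also count as "the last operator". It fits inputs shaped like the one in the question.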

Related

Match words that consist of specific characters, excluding between special brackets

I'm trying to match words that consist only of characters in this character class: [A-z'\\/%], excluding cases where:
they are between < and >
they are between [ and ]
they are between { and }
So, say I've got this funny string:
[beginning]<start>How's {the} /weather (\\today%?)[end]
I need to match the following strings:
[ "How's", "/weather", "\\today%" ]
I've tried using this pattern:
/[A-z'/\\%]*(?![^{]*})(?![^\[]*\])(?![^<]*>)/gm
But for some reason, it matches:
[ "[beginning]", "", "How's", "", "", "", "/weather", "", "", "\\today%", "", "", "[end]", "" ]
I'm not sure why my pattern allows stuff between [ and ], since I used (?![^\[]*\]), and a similar approach seems to work for not matching {these cases} and <these cases>. I'm also not sure why it matches all the empty strings.
Any wisdom? :)
There are essentially two problems with your pattern:
Never use A-z in a character class if you intend to match only letters (because it will match more than just letters [1]). Instead, use a-zA-Z (or A-Za-z).
Using the * quantifier after the character class will allow empty matches. Use the + quantifier instead.
So, the fixed pattern should be:
[A-Za-z'/\\%]+(?![^{]*})(?![^\[]*\])(?![^<]*>)
[1] The [A-z] character class means "match any character with an ASCII code between 65 and 122". The problem is that codes 91 to 96 are not letters (and that's why the original pattern matches characters like '[' and ']').
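The difference between the two character classes can be checked directly. The sketch below (variable names are illustrative) lists the characters that [A-z] accepts but [A-Za-z] rejects, then runs the fixed pattern on the string from the question:

```javascript
// Collect every character that [A-z] accepts but [A-Za-z] does not.
const extraChars = [];
for (let code = 65; code <= 122; code++) {
  const ch = String.fromCharCode(code);
  if (/[A-z]/.test(ch) && !/[A-Za-z]/.test(ch)) extraChars.push(ch);
}
console.log(extraChars); // the six characters [ \ ] ^ _ `

// The fixed pattern applied to the string from the question.
// In the string literal, "\\" is a single backslash character.
const input = "[beginning]<start>How's {the} /weather (\\today%?)[end]";
const fixedPattern = /[A-Za-z'\/\\%]+(?![^{]*})(?![^\[]*\])(?![^<]*>)/g;
console.log(input.match(fixedPattern)); // ["How's", "/weather", "\\today%"]
```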
Split it with regular expression:
let data = "[beginning]<start>How's {the} /weather (\\today%?)[end]";
let matches = data.split(/\s*(?:<[^>]+>|\[[^\]]+\]|\{[^\}]+\}|[()])\s*/);
console.log(matches.filter(v => "" !== v));
You can match all the cases that you don't want using an alternation and place the character class in a capturing group to capture what you want to keep.
[^...] is a negated character class: it matches any character except those listed inside it.
(?:\[[^\][]*]|<[^<>]*>|{[^{}]*})|([A-Za-z'/\\%]+)
Explanation
(?: Non capture group
\[[^\][]*] Match from opening till closing []
| Or
<[^<>]*> Match from opening till closing <>
| Or
{[^{}]*} Match from opening till closing {}
) Close non capture group
| Or
([A-Za-z'/\\%]+) Repeat the character class 1+ times to prevent empty matches and capture in group 1
const regex = /(?:\[[^\][]*]|<[^<>]*>|{[^{}]*})|([A-Za-z'/\\%]+)/g;
const str = `[beginning]<start>How's {the} /weather (\\\\today%?)[end]`;
let m;
while ((m = regex.exec(str)) !== null) {
if (m[1] !== undefined) console.log(m[1]);
}

Trimming other characters than whitespace? (trim() for variable characters)

Is there an easy way to replace characters at the beginning and end of a string, but not in the middle? I need to trim off dashes. I know trim() exists, but it only trims whitespace.
Here's my use case:
Input:
university-education
-test
football-coach
wine-
Output:
university-education
test
football-coach
wine
You can use String#replace with a regular expression.
^-*|-*$
Explanation:
^ - start of the string
-* matches a dash zero or more times
| - or
-* - matches a dash zero or more times
$ - end of the string
function trimDashes(str){
return str.replace(/^-*|-*$/g, '');
}
console.log(trimDashes('university-education'));
console.log(trimDashes('-test'));
console.log(trimDashes('football-coach'));
console.log(trimDashes('--wine----'));
I would suggest using the trim function of lodash. It does exactly what you want. It has a second parameter which allows you to pass the characters which should be trimmed. In your case you could use it like this:
trim("-test", "-");
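If adding lodash is not an option, roughly the same behaviour can be sketched with a small helper. trimChars is a hypothetical name (it is neither a lodash nor a built-in function), and the escaping below only covers the characters that are special inside a character class:

```javascript
// Hypothetical helper: strips any of the characters in `chars`
// from both ends of `str`, leaving the middle untouched.
function trimChars(str, chars) {
  // Escape \ ] [ ^ and - so they are taken literally inside [...].
  const escaped = chars.replace(/[\\\]\[^-]/g, '\\$&');
  const pattern = new RegExp(`^[${escaped}]+|[${escaped}]+$`, 'g');
  return str.replace(pattern, '');
}

console.log(trimChars('-test', '-'));                // "test"
console.log(trimChars('--wine----', '-'));           // "wine"
console.log(trimChars('university-education', '-')); // "university-education"
console.log(trimChars('_-name-_', '-_'));            // "name"
```

Building the pattern from a parameter keeps the helper reusable for other characters, at the cost of constructing a new RegExp on every call.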
The built-in trim() is inadequate here because it only removes whitespace. You can close this gap by using a regular expression with the replace() function.
let myText = '-education';
myText = myText.replace(/^\-+|\-+$/g, ''); // output: "education"
Usage with an array:
let myTexts = [
'university-education',
'-test',
'football-coach',
'wine',
];
myTexts = myTexts.map((text/*, index*/) => text.replace(/^\-+|\-+$/g, ''));
/* output:
(4)[
"university-education",
"test",
"football-coach",
"wine"
]
*/
^\-+ beginning of the string, then a dash one or more times
| or
\-+$ a dash one or more times, then the end of the string
/g the 'g' flag makes the search global, so both the leading and the trailing run of dashes are replaced
Sample:
const removeDashes = (str) => str.replace(/^\-+|\-+$/g, '');
/* STRING EXAMPLE */
const removedDashesStr = removeDashes('-education');
console.log('removedDashesStr', removedDashesStr);
// ^^ output: "removedDashesStr education"
let myTextsArray = [
'university-education',
'-test',
'football-coach',
'wine',
];
/* ARRAY EXAMPLE */
myTextsArray = myTextsArray.map((text/*, index*/) => removeDashes(text));
console.log('myTextsArray', myTextsArray);
/* output:
myTextsArray [
"university-education",
"test",
"football-coach",
"wine"
]
*/

Validate string in regular expression

I want a regular expression in JavaScript that validates that a string contains only lowercase characters and the - character.
I use this expression:
var regex = /^[a-z][-\s\.]$/
It doesn't work. Any idea?
Just use
/^[a-z-]+$/
Explanation
^ : Match from the beginning of the string.
[a-z-] : Match any single character that is a lowercase letter a-z or a hyphen (-).
[] : Only the characters listed within the brackets are allowed.
a-z : Match any character between a and z, e.g. p, s, t.
- : Match only the hyphen (-) character.
+ : Shorthand for {1,}. It means match one or more times.
$ : Match up to the end of the string.
Example
const regex= /^[a-z-]+$/
console.log(regex.test("abc")) // true
console.log(regex.test("aBcD")) // false
console.log(regex.test("a-c")) // true
Try this:
var regex = /^[-a-z]+$/;
var strs = [
"a",
"aB",
"abcd",
"abcde-",
"-",
"-----",
"a-b-c",
"a-D-c",
" "
];
strs.forEach(str=>console.log(str, regex.test(str)));
Try this
/^[a-z-]*$/
It should match the letters a-z or - any number of times (including zero).
What your regex does is try to match a-z one single time, followed by one of -, a whitespace character, or a dot one single time. It then expects the string to end.
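The difference is easy to see by running both patterns side by side. This is only an illustrative sketch, and the sample strings are made up:

```javascript
// The pattern from the question: exactly one lowercase letter,
// then exactly one of "-", whitespace or ".", then end of string.
const originalPattern = /^[a-z][-\s\.]$/;
// The suggested pattern: any number of lowercase letters or hyphens.
const suggestedPattern = /^[a-z-]*$/;

for (const sample of ['a-', 'a.', 'abc', 'a-b-c', 'aBc']) {
  console.log(sample, originalPattern.test(sample), suggestedPattern.test(sample));
}
// a-     true  true
// a.     true  false
// abc    false true
// a-b-c  false true
// aBc    false false
```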
Use this regular expression:
let regex = /^[a-z\-]+$/;
Then:
regex.test("abcd") // true
regex.test("ab-d") // true
regex.test("ab3d") // false
regex.test("") // false
PS: If you want to allow the empty string "" to pass, use /^[a-z\-]*$/. There's a * instead of a + at the end. See Regex Cheat Sheet: https://www.rexegg.com/regex-quickstart.html
I hope this helps
var str = 'asdadWW--asd';
console.log(str.match(/[a-z]|\-/g));
This will work:
var regex = /^[a-z\-\s]+$/ // inside a character class every listed character is already an alternative, so no | is needed (a | there would match a literal pipe)
var str = 'test- ';
str.match(regex); //["test- ", index: 0, input: "test- ", groups: undefined]
str = 'testT- ' // string now contains an uppercase letter, so it shouldn't match anymore
str.match(regex) //null

why condition is always true in javascript?

Could you please tell me why my condition is always true? I am trying to validate a value using a regex. I have a few conditions:
Name should not contain test "text"
Name should not contain three consecutive characters, for example "abc", "pqr", "xyz"
Name should not contain the same character three times, for example "aaa", "ccc", "zzz"
This is what I tried:
https://jsfiddle.net/aoerLqkz/2/
var val = 'ab dd'
if (/test|[^a-z]|(.)\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz/i.test(val)) {
alert( 'match')
} else {
alert( 'false')
}
I tested my code with the following strings and got some unexpected results:
input string "abc": output "match" (as expected)
input string "aaa": output "match" (as expected)
input string "aa a": output "match". Why does it match? There is a space between the characters.
input string "sa c": output "match". Why does it match? The characters are different and there is a space between them.
The string sa c includes a space, the pattern [^a-z] (not a to z) matches the space.
Possibly you want to use ^ and $ so your pattern also matches the start and end of the string instead of looking for a match anywhere inside it.
there is space between them why it matched ????
Because of the [^a-z] part of your regular expression, which matches the space:
> /[^a-z]/i.test('aa a');
true
The issue is the [^a-z]. This means that any string that has a non-letter character anywhere in it will be a match. In your example, it is matching the space character.
The solution? Simply remove |[^a-z]. Without it, your regex meets all three criteria.
test checks if the value contains the word 'test'.
abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz checks if the value contains three sequential letters.
(.)\1\1 checks if any character is repeated three times.
Complete regex:
/test|(.)\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz/i
I find it helpful to use a regex tester, like https://www.regexpal.com/, when writing regular expressions.
NOTE: I am assuming that the second criterion actually means "three consecutive letters", not "three consecutive characters" as written. If that is not true, then your regex doesn't meet the second criterion, since it only checks for three consecutive letters.
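To see the effect of removing |[^a-z], here is a small sketch that builds the same alternation programmatically instead of typing it out, and runs it against the inputs from the question. The variable names are illustrative only:

```javascript
// Build "abc|bcd|...|xyz" from the alphabet.
const alphabet = 'abcdefghijklmnopqrstuvwxyz';
const triples = [];
for (let i = 0; i + 3 <= alphabet.length; i++) {
  triples.push(alphabet.slice(i, i + 3));
}

// Same pattern as above, without the [^a-z] alternative.
const invalidName = new RegExp(`test|(.)\\1\\1|${triples.join('|')}`, 'i');

for (const value of ['abc', 'aaa', 'aa a', 'sa c']) {
  console.log(JSON.stringify(value), invalidName.test(value) ? 'match' : 'false');
}
// "abc" match
// "aaa" match
// "aa a" false
// "sa c" false
```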
I would not do this with regular expressions. The expression will keep getting more complicated, and you lose the flexibility you would have if you wrote this as ordinary code.
The rules you describe suggest the concept of a string derivative: the list of differences between the character codes of each pair of successive characters. It is especially useful for password security checks and for measuring string variation in general.
const derivative = (str) => {
const result = [];
for(let i=1; i<str.length; i++){
result.push(str.charCodeAt(i) - str.charCodeAt(i-1));
}
return result;
};
//these strings have the same derivative: [0,0,0,0]
console.log(derivative('aaaaa'));
console.log(derivative('bbbbb'));
//these strings also have the same derivative: [1,1,1,1]
console.log(derivative('abcde'));
console.log(derivative('mnopq'));
//up and down: [1,-1, 1,-1, 1]
console.log(derivative('ababa'));
With this in mind, you can apply each of your rules to each string.
// Rules:
// 1. Name should not contain test "text"
// 2. Name should not contain three consecutive characters example "abc" , "pqr" ,"xyz"
// 3. Name should not contain the same character three times example "aaa", "ccc" ,"zzz"
const derivative = (str) => {
const result = [];
for(let i=1; i<str.length; i++){
result.push(str.charCodeAt(i) - str.charCodeAt(i-1));
}
return result;
};
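// True when `sub` occurs as a contiguous run inside `master`.
// Both sides are wrapped in commas so that [-1,1] or [11,1] is not mistaken for [1,1].
const arrayContains = (master, sub) =>
`,${master.join(",")},`.includes(`,${sub.join(",")},`);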
const rule1 = (text) => !text.includes('text');
const rule2 = (text) => !arrayContains(derivative(text),[1,1]);
const rule3 = (text) => !arrayContains(derivative(text),[0,0]);
const testing = [
"smthing textual",'abc','aaa','xyz','12345',
'1111','12abb', 'goodbcd', 'weeell'
];
const results = testing.map((input)=>
[input, rule1(input), rule2(input), rule3(input)]);
console.log(results);
Based on the 3 conditions in the post, the following regex should work.
Regex: ^(?:(?!test|([a-z])\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz).)*$
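A quick way to try this pattern in JavaScript is sketched below; here true means the name passes all three rules. The pattern has no i flag, so as written it only checks lowercase input, and the sample names are made up for illustration:

```javascript
const validName = /^(?:(?!test|([a-z])\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz).)*$/;

for (const name of ['ab dd', 'aa a', 'sa c', 'abc', 'aaa', 'contest']) {
  console.log(JSON.stringify(name), validName.test(name));
}
// "ab dd" true
// "aa a" true
// "sa c" true
// "abc" false
// "aaa" false
// "contest" false
```

The negative lookahead is checked before every character, so a forbidden sequence anywhere in the string makes the whole match fail.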

Why is this regex matching also words within a non-capturing group?

I have this string (notice the multi-line syntax):
var str = ` Number One: Get this
Number Two: And this`;
And I want a regex that returns (with match):
[str, 'Get this', 'And this']
So I tried str.match(/Number (?:One|Two): (.*)/g);, but that's returning:
["Number One: Get this", "Number Two: And this"]
There can be any whitespace/line-breaks before any "Number" word.
Why doesn't it return only what is inside the capturing group? Am I misunderstanding something? And how can I achieve the desired result?
Per the MDN documentation for String.match:
If the regular expression includes the g flag, the method returns an Array containing all matched substrings rather than match objects. Captured groups are not returned. If there were no matches, the method returns null.
(emphasis mine).
So, what you want is not possible with match alone.
The same page adds:
if you want to obtain capture groups and the global flag is set, you need to use RegExp.exec() instead.
So if you're willing to give up on using match, you can write your own function that repeatedly applies the regex, gets the captured substrings, and builds an array.
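Such a function could look roughly like the sketch below. collectGroups is a hypothetical name, it collects only the first capture group, and putting the whole input first in the result is an assumption taken from the output requested in the question:

```javascript
// Hypothetical helper: returns [input, group1OfMatch1, group1OfMatch2, ...].
function collectGroups(input, regex) {
  if (!regex.global) {
    // Without the g flag, exec() would return the first match forever.
    throw new TypeError('collectGroups expects a regex with the g flag');
  }
  const results = [input];
  regex.lastIndex = 0; // start from the beginning even if the regex was used before
  let m;
  while ((m = regex.exec(input)) !== null) {
    results.push(m[1]);
    if (m[0] === '') regex.lastIndex++; // avoid looping forever on empty matches
  }
  return results;
}

const sample = ` Number One: Get this
 Number Two: And this`;
const numberPattern = /Number (?:One|Two): (.*)/g;

console.log(collectGroups(sample, numberPattern));
// [sample, "Get this", "And this"]

// In ES2020 and later, String.prototype.matchAll gives the same groups.
const viaMatchAll = [sample, ...Array.from(sample.matchAll(numberPattern), (match) => match[1])];
console.log(viaMatchAll);
```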
Or, for your specific case, you could write something like this:
var these = str.split(/(?:^|\n)\s*Number (?:One|Two): /);
these[0] = str;
Replace and store the result in a new string, like this:
var str = ` Number One: Get this
Number Two: And this`;
var output = str.replace(/Number (?:One|Two): (.*)/g, "$1");
console.log(output);
which outputs:
Get this
And this
If you want the match array like you requested, you can try this:
var getMatch = function(string, split, regex) {
var match = string.replace(regex, "$1" + split);
match = match.split(split);
match = match.reverse();
match.push(string);
match = match.reverse();
match.pop();
return match;
}
var str = ` Number One: Get this
Number Two: And this`;
var regex = /Number (?:One|Two): (.*)/g;
var match = getMatch(str, "#!SPLIT!#", regex);
console.log(match);
which displays the array as desired:
[ ' Number One: Get this\n Number Two: And this',
' Get this',
'\n And this' ]
Where split (here #!SPLIT!#) should be a unique string to split the matches. Note that this only works for single groups. For multi groups add a variable indicating the number of groups and add a for loop constructing "$1 $2 $3 $4 ..." + split.
Try
var str = " Number One: Get this\n" +
" Number Two: And this";
// `/\w+\s+\w+(?=\s|$)/g` match one or more alphanumeric characters ,
// followed by one or more space characters ,
// followed by one or more alphanumeric characters ,
// if following space or end of input , set `g` flag
// return `res` array `["Get this", "And this"]`
var res = str.match(/\w+\s+\w+(?=\s|$)/g);
document.write(JSON.stringify(res));
