Split a string according to flanking characters in javascript - javascript

JavaScript lets you split a string on a regular expression. Is it possible to use this to split a string only when the delimiter is flanked by certain characters?
For example, can I split on the - character in 12-93 but not in at-13?
Using a regular expression seems promising, but "12-93".split(/[0-9]-[0-9]/) yields ["1", "3"] because the flanking digits are treated as part of the delimiter.
Can I specify the above split pattern (a dash preceded and followed by a digit) without chopping off the flanking digits?
Other Examples
"55,966,575-165,162,787" should yield ["55,966,575", "165,162,787"]
"55,966,575x-165,162,787" should yield ["55,966,575x-165,162,787"]
"sdf55,966,575-165,162,787" should yield ["sdf55,966,575", "165,162,787"]

Using two adjacent character sets seems to work.
See example at https://regex101.com/r/uFHMW1/1
([0-9,a-z]+?[0-9]+)-([0-9]+[0-9,a-z]+)
Try this (live here https://repl.it/EOOQ/0 ):
var strings = [
  "55,966,575-165,162,787",
  "55,966,575x-165,162,787",
  "sdf55,966,575-165,162,787",
];
var pattern = '^([0-9,a-z]+?[0-9]+)-([0-9]+[0-9,a-z]+)$';
var regex = new RegExp(pattern, 'i');
var matched = strings.map(function (string) {
  var matches = string.match(regex);
  if (matches) {
    return [matches[1], matches[2]];
  } else {
    return [string];
  }
});
console.log(matched);
You can also run the above expression with split(), like:
string.split(regex).filter( str => str.length )
where Array.prototype.filter() is used to get rid of the leading and trailing empty strings that split() produces when the RegExp matches your whole input.
var strings = [
"55,966,575-165,162,787",
"55,966,575x-165,162,787",
"sdf55,966,575-165,162,787",
];
var pattern = '^([0-9,a-z]+?[0-9]+)-([0-9]+[0-9,a-z]+)$';
var regex = new RegExp(pattern, 'i');
var matched = strings.map( string => string.split(regex).filter( str => str.length ) );
console.log(matched)

Try using a lookahead, which is zero-width and does not consume the characters it checks. Your current regex consumes the flanking digits as part of the match, and split() then treats the whole match as the delimiter.
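As a minimal sketch of the lookaround idea, assuming an engine that supports lookbehind (ES2018 or later), the dash can be matched only when a digit sits on each side, without consuming either digit. The helper name splitOnFlankedDash is made up for illustration.

```javascript
// Hypothetical helper: split on "-" only when it sits between two digits.
// (?<=[0-9]) and (?=[0-9]) are zero-width, so the digits stay in the output.
// Assumes lookbehind support (ES2018+).
function splitOnFlankedDash(input) {
  return input.split(/(?<=[0-9])-(?=[0-9])/);
}

console.log(splitOnFlankedDash("12-93"));                     // ["12", "93"]
console.log(splitOnFlankedDash("at-13"));                     // ["at-13"]
console.log(splitOnFlankedDash("55,966,575x-165,162,787"));   // one element, unchanged
console.log(splitOnFlankedDash("sdf55,966,575-165,162,787")); // ["sdf55,966,575", "165,162,787"]
```

Unlike the anchored match() approach, this splits at every digit-flanked dash, so "1-2-3" would yield three pieces.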

Related

How to slice optional arguments in RegEx?

I currently have the following regular expression:
/^(?:(?:\,([A-Za-z]{5}))?)+$/g
It should accept input such as ,IGORA, and ,IGORA,GIANC,LOLLI is also valid. In that case I would like to slice the string into 3 groups; in general, the number of groups should equal the number of items in the input that passes the RegExp test.
I tried the following in JavaScript, but it returns only the last value:
var str = ',GIANC,IGORA';
var arr = str.match(/^(?:(?:\,([A-Za-z]{5}))?)+$/).slice(1);
alert(arr);
The output is 'IGORA', while I want it to be 'GIANC', 'IGORA'.
Here is another example:
/^([A-Z]{5})(?:(?:\,([A-Za-z]{2}))?)+$/g
The input must contain at least one 5-character string, but it can also contain further 5-character strings separated by commas, so from the input
IGORA,CIAOA,POPOP
I would like to get the array ["IGORA","CIAOA","POPOP"].
You can capture each word in a capturing group surrounded by an optional preceding comma and an optional trailing comma.
You can test the regex here: ,?([A-Za-z]+),?
const pattern = /,?([A-Za-z]+),?/gm;
const str = `,IGORA,GIANC,LOLLI`;
const matches = [];
let m;
// Iterate until no match is found
while ((m = pattern.exec(str))) {
  // The first captured group holds the word
  matches.push(m[1]);
}
console.log(matches);
There are other ways to do this, but I found that one of the simplest is the replace method, as it calls its callback for every match of a global regex.
For example:
// Match each ",XXXXX" item separately; an anchored pattern would match the whole string only once
var regex = /,[A-Za-z]{5}/g;
var str = ',GIANC,IGORA';
var arr = [];
str.replace(regex, function(match) {
  arr[arr.length] = match;
  return match;
});
console.log(arr);
Also, in my code snippet you can see that there is an extra comma in each string. You can solve that by changing the assignment inside the callback to arr[arr.length] = match.replace(/^,/, '').
Is this what you're looking for?
Explanation:
\b word boundary (the start or end of a word)
\w a word character ([A-Za-z0-9_])
{5} exactly 5 repetitions of the preceding token
So it matches all 5-character words but not NANANANA
var str = 'IGORA,CIAOA,POPOP,NANANANA';
var arr = str.match(/\b\w{5}\b/g);
console.log(arr); //['IGORA', 'CIAOA', 'POPOP']
If you only wish to select words separated by commas and nothing else, you can test for them like so:
(?<=,\s*|^) preceded by , and any number of trailing spaces, OR is the first word in the list.
(?=,\s*|$) followed by , and any number of trailing spaces, OR is the last word in the list.
In the following code, POPOP and MOMMA are rejected because they are not separated from each other by a comma, and NANANANA fails because it is not 5 characters long.
var str = 'IGORA, CIAOA, POPOP MOMMA, NANANANA, MEOWI';
var arr = str.match(/(?<=,\s*|^)\b\w{5}\b(?=,\s*|$)/g);
console.log(arr); //['IGORA', 'CIAOA', 'MEOWI']
If spaces after the comma should not be allowed, just leave out the \s* from both (?<=,\s*|^) and (?=,\s*|$).
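The original question asks for validation and extraction at the same time. Since a repeated capturing group only retains its last iteration, one possible sketch is to validate the whole input first and split it afterwards. The function name parseCodes is an assumption for illustration, and the pattern assumes every item is exactly 5 ASCII letters preceded by a comma.

```javascript
// Hypothetical helper: returns the list of 5-letter codes, or null when the
// input does not have the form ",ABCDE,FGHIJ,...".
function parseCodes(input) {
  // Validate the whole string: one or more ",XXXXX" items and nothing else.
  if (!/^(?:,[A-Za-z]{5})+$/.test(input)) {
    return null;
  }
  // Drop the leading comma, then split the remainder on commas.
  return input.slice(1).split(',');
}

console.log(parseCodes(',GIANC,IGORA'));       // ["GIANC", "IGORA"]
console.log(parseCodes(',IGORA,GIANC,LOLLI')); // ["IGORA", "GIANC", "LOLLI"]
console.log(parseCodes(',TOOLONGX'));          // null
```

Separating the two steps keeps the validation pattern simple and avoids relying on what a repeated group captures.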

Finding ++ in Regular Expression

I want to find ++ or -- or // or ** in a string. Can anyone help me?
var str = document.getElementById('screen').innerHTML;
var res = str.substring(0, str.length);
var patt1 = ++,--,//,**;
var result = str.match(patt1);
if (result)
{
alert("you cant do this :l");
document.getElementById('screen').innerHTML='';
}
This finds doubles of the characters by a backreference:
/([+\/*-])\1/g
[from the question's comments]: I know this, but when I type var patt1 = /[++]/i; the code finds both + and ++
[++] is a character class and matches any one of the listed characters, so it is equivalent to [+]. Normally + is the quantifier "1 or more" and needs to be escaped with a leading backslash when it should be a literal, except inside brackets, where it has no special meaning.
Characters that do need to be escaped in character classes include the escape character itself (backslash), the expression delimiter (slash), the closing bracket, and the range operator (dash/minus); the dash does not need escaping at the end of the character class, as in my code example.
A character class [] matches one character. With a quantifier, e.g. [abc]{2}, it would match "aa" and "bb", but "ab" as well.
You can use a backreference to a match in parentheses:
/(abc)\1/
Here the \1 refers to the first parenthesized group (abc). The entire expression would match "abcabc".
To clarify again: we could use a quantifier on the backreference:
/([+\/*-])\1{9}/g
This matches exactly 10 identical characters from the class: the subpattern itself plus 9 more backreferences.
/.../g finds all occurrences due to the modifier global (g).
test-case on regextester.com
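To make the difference between a quantified character class and a backreference concrete, here is a small sketch; the variable names doubled and anyTwo are made up for illustration. The g flag is left out on purpose, because test() on a global regex keeps state in lastIndex between calls.

```javascript
// Backreference: the second character must equal whatever group 1 matched.
const doubled = /([+\/*-])\1/;
// Quantified class: any two characters from the class, equal or not.
const anyTwo = /[+\/*-]{2}/;

console.log(doubled.test("2++3")); // true
console.log(doubled.test("4//2")); // true
console.log(doubled.test("2+-3")); // false: "+" and "-" differ
console.log(anyTwo.test("2+-3"));  // true
console.log(doubled.test("2+3"));  // false
```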
Define your pattern like this:
var patt1 = /\+\+|--|\/\/|\*\*/;
Now it should do what you want.
More info about regular expressions: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
You can use:
/\+\+|--|\/\/|\*\*/
as your expression.
Here I have escaped the special characters by using a backslash before each (\).
I've also used .test(str) on the regular expression as all you need is a boolean (true/false) result.
See working example below:
var str = document.getElementById('screen').innerHTML;
var res = str.substring(0, str.length);
var patt1 = /\+\+|--|\/\/|\*\*/;
var result = patt1.test(res);
if (result) {
alert("you cant do this :l");
document.getElementById('screen').innerHTML = '';
}
<div id="screen">
This is some++ text
</div>
Try this:
In a regular expression,
n+ matches any string that contains at least one n
n* matches any string that contains zero or more occurrences of n
so we need to put a backslash before these special characters to match them literally.
var str = document.getElementById('screen').innerHTML;
var res = str.substring(0, str.length);
var patt1 = /\+\+|--|\/\/|\*\*/;
var result = str.match(patt1);
if (result)
{
alert("you cant do this :l");
document.getElementById('screen').innerHTML='';
}
<div id="screen">2121++</div>

Matching whole words with Javascript's Regex with a few restrictions

I am trying to create a regex that can extract all words from a given string that only contain alphanumeric characters.
Yes
yes absolutely
#no
*NotThis
orThis--
Good *Bad*
1ThisIsOkay2 ButNotThis2)
Words that should have been extracted: Yes, yes, absolutely, Good, 1ThisIsOkay2
Here is the work I have done thus far:
/(?:^|\b)[a-zA-Z0-9]+(?=\b|$)/g
I found this expression, which works in Ruby (with some tweaking), but I have not been able to convert it to a JavaScript regex.
Use /(?:^|\s)\w+(?!\S)/g to match 1 or more word chars in between start of string/whitespace and another whitespace or end of string:
var s = "Yes\nyes absolutely\n#no\n*NotThis\norThis-- \nGood *Bad*\n1ThisIsOkay2 ButNotThis2)";
var re = /(?:^|\s)\w+(?!\S)/g;
var res = s.match(re).map(function(m) {
return m.trim();
});
console.log(res);
Or another variation:
var s = "Yes\nyes absolutely\n#no\n*NotThis\norThis-- \nGood *Bad*\n1ThisIsOkay2 ButNotThis2)";
var re = /(?:^|\s)(\w+)(?!\S)/g;
var res = [];
var m; // declare m so the loop does not create an implicit global
while ((m = re.exec(s)) !== null) {
  res.push(m[1]);
}
console.log(res);
Pattern details:
(?:^|\s) - either start of string or whitespace (consumed, that is why trim() is necessary in Snippet 1)
\w+ - 1 or more word chars (in Snippet 2, captured into Group 1 used to populate the resulting array)
(?!\S) - negative lookahead that fails the match if the word chars are immediately followed by a non-whitespace char.
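Note that \w also matches the underscore, while the question asks for alphanumeric characters only. A possible variation, assuming lookbehind support (ES2018+), uses an explicit character class and a negative lookbehind so that no whitespace is consumed and no trim() is needed. The function name extractAlnumWords is made up for this sketch.

```javascript
function extractAlnumWords(text) {
  // (?<!\S) : not preceded by a non-whitespace char (start of string or whitespace)
  // (?!\S)  : not followed by a non-whitespace char (end of string or whitespace)
  // match() returns null when nothing matches, hence the fallback to [].
  return text.match(/(?<!\S)[a-zA-Z0-9]+(?!\S)/g) || [];
}

const sample = "Yes\nyes absolutely\n#no\n*NotThis\norThis-- \nGood *Bad*\n1ThisIsOkay2 ButNotThis2)";
console.log(extractAlnumWords(sample));
// ["Yes", "yes", "absolutely", "Good", "1ThisIsOkay2"]
console.log(extractAlnumWords("snake_case ok")); // ["ok"]
```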
You can do this (where s is your string) to match all the words:
var m = s.split(/\s+/).filter(function(i) { return !/\W/.test(i); });
If you want to perform a replacement instead, you can do this:
var res = s.split(/(\s+)/).map(function(i) { return i.replace(/^\w+$/, "#");}).join('');

JS - Split string into substrings by regex

Let's say I have strings that start with 7878 and end with 0d0a or 0D0A, such as:
var string1 = "78780d0101234567890123450016efe20d0a";
var string2 = "78780d0101234567890123450016efe20d0a78780d0103588990504943870016efe20d0a";
var string3 = "78780d0101234567890123450016efe20d0a78780d0103588990504943870016efe20d0a78780d0101234567890123450016efe20d0a";
How can I split it by regex so it becomes an array like:
['78780d0101234567890123450016efe20d0a']
['78780d0101234567890123450016efe20d0a','78780d0103588990504943870016efe20d0a']
['78780d0101234567890123450016efe20d0a','78780d0103588990504943870016efe20d0a','78780d0101234567890123450016efe20d0a']
You can split the string with a positive lookahead (?=7878). The regex isn't consuming any characters, so 7878 will be part of the string.
var rgx = /(?=7878)/;
console.log(string1.split(rgx));
console.log(string2.split(rgx));
console.log(string3.split(rgx));
Another option is to split on '7878' and then take all the elements except first and add '7878' to each of them. For example:
var arr = string3.split('7878').slice(1).map(function(str){
return '7878' + str;
});
That works, but it also matches strings that do NOT end in 0d0a. How can I match only those ending in 0d0a or 0D0A?
Well, then you can use String.match with a plain regex.
console.log(string3.match(/7878.*?0d0a/ig));
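As a small sketch of that match-based approach, the helper below (extractFrames is a hypothetical name) returns only the chunks that start with 7878 and end with 0d0a in either case, and falls back to an empty array because match() returns null when nothing matches. It assumes 0d0a never occurs inside a frame's payload, since the lazy .*? stops at the first terminator it finds.

```javascript
function extractFrames(hex) {
  // i: accept both 0d0a and 0D0A; g: return every frame, not just the first.
  return hex.match(/7878.*?0d0a/gi) || [];
}

const twoFrames = "78780d0101234567890123450016efe20d0a78780d0103588990504943870016efe20D0A";
console.log(extractFrames(twoFrames).length); // 2

// The trailing chunk has no 0d0a terminator, so it is dropped.
console.log(extractFrames("78780d01aaaa0d0a7878ffff")); // ["78780d01aaaa0d0a"]
console.log(extractFrames("no frames here"));           // []
```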

RegEx returns as one result instead of two

I have this string:
#{id:123}#{id:456}
and I am searching for matches using this regular expression:
/([a-z0-9\:#{}])/
It returns a match from the string, but it only returns one result instead of two.
Is this what you're looking for?
var myregex = /#{id:\d+}/g;
var theMatchObject = myregex.exec(yourString);
while (theMatchObject != null) {
  // do something with the match:
  // matched text: theMatchObject[0]
  theMatchObject = myregex.exec(yourString);
}
Explanation
#{id: matches literal chars
\d+ matches one or more ASCII digits
} matches a literal char
the code iterates through the matches
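The pattern in the question has no g flag and matches a single character at a time, which is why only one result comes back. A hedged alternative to the exec() loop, assuming an ES2020 environment where String.prototype.matchAll is available, collects every match and its captured id in one pass. The variable names are illustrative.

```javascript
const input = "#{id:123}#{id:456}";

// Braces are escaped so the pattern also stays valid under the u flag.
const idPattern = /#\{id:(\d+)\}/g;

// Full matches: with the g flag, match() returns every occurrence.
const tokens = input.match(idPattern) || [];
console.log(tokens); // ["#{id:123}", "#{id:456}"]

// Captured ids: matchAll() yields one match object per occurrence.
const ids = Array.from(input.matchAll(idPattern), (m) => m[1]);
console.log(ids); // ["123", "456"]
```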
