JS - Split string into substrings by regex - javascript

Let's say I have strings that start with 7878 and end with 0d0a or 0D0A, such as:
var string1 = "78780d0101234567890123450016efe20d0a";
var string2 = "78780d0101234567890123450016efe20d0a78780d0103588990504943870016efe20d0a";
var string3 = "78780d0101234567890123450016efe20d0a78780d0103588990504943870016efe20d0a78780d0101234567890123450016efe20d0a";
How can I split it by regex so it becomes an array like:
['78780d0101234567890123450016efe20d0a']
['78780d0101234567890123450016efe20d0a','78780d0103588990504943870016efe20d0a']
['78780d0101234567890123450016efe20d0a','78780d0103588990504943870016efe20d0a','78780d0101234567890123450016efe20d0a']

You can split the string with a positive lookahead, (?=7878). A lookahead consumes no characters, so 7878 stays at the start of each resulting piece.
var rgx = /(?=7878)/;
console.log(string1.split(rgx));
console.log(string2.split(rgx));
console.log(string3.split(rgx));
Another option is to split on '7878', take all the elements except the first, and prepend '7878' to each of them. For example:
var arr = string3.split('7878').slice(1).map(function(str){
return '7878' + str;
});
That works, BUT it also matches strings that do NOT end in 0d0a. How can I match only those ending in 0d0a OR 0D0A?
In that case, use String.match with a global, case-insensitive regex. The lazy .*? stops at the first 0d0a after each 7878.
console.log(string3.match(/7878.*?0d0a/ig));
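To see the difference between the two approaches, here is a minimal sketch. The input mixed and the helper name extractFrames are made up for illustration, and the sketch assumes a frame never contains 7878 or 0d0a in its payload.

```javascript
// Hypothetical helper: returns every substring that starts with 7878
// and ends with the nearest 0d0a (case-insensitive).
function extractFrames(input) {
  // match() returns null when nothing matches, so fall back to [].
  return input.match(/7878.*?0d0a/gi) || [];
}

// Made-up input: one complete frame followed by a truncated one.
var mixed = "78780d0101234567890123450016efe20d0a" + "78780d0103588990";

var bySplit = mixed.split(/(?=7878)/); // also keeps the truncated chunk
var byMatch = extractFrames(mixed);    // keeps only complete frames

console.log(bySplit); // two elements, the second one is incomplete
console.log(byMatch); // ["78780d0101234567890123450016efe20d0a"]
```

The split version only knows where a frame starts, while the match version also requires the terminator, which is why it drops the incomplete chunk.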

Related

JS split, keep delimiter minus its first character

Original string: "FOO,blue,FOO,yellow,red,FOO,purple,brown,blue,FOOred,orange,FOO,blue,yellow"
I'd like to convert this mixed string to an array, splitting specifically at each ,FOO and keeping FOO.
Code:
var str = "FOO,blue,FOO,yellow,red,FOO,purple,brown,blue,FOOred,orange,FOO,blue,yellow"
var regex = /(?=,FOO)/g
console.log(str.split(regex))
Desired result:
[
'FOO,blue',
'FOO,yellow,red',
'FOO,purple,brown,blue',
'FOOred,orange',
'FOO,blue,yellow',
]
Current result:
[
'FOO,blue',
',FOO,yellow,red',
',FOO,purple,brown,blue',
',FOOred,orange',
',FOO,blue,yellow',
]
As you see, each FOO instance included the preceding comma; how can I exclude the comma in the same regex operation?
At the moment the comma is inside the lookahead, so it is only asserted, never consumed. Move it outside the lookahead so that it becomes part of the match: it is then split upon and not included in the result.
var str = "FOO,blue,FOO,yellow,red,FOO,purple,brown,blue,FOOred,orange,FOO,blue,yellow"
var regex = /,(?=FOO)/g
console.log(str.split(regex))
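The difference comes down to which characters the pattern consumes. A short sketch with a shortened, made-up input (sample is not from the question):

```javascript
var sample = "FOO,blue,FOO,red,FOOgreen";

// Comma inside the lookahead: nothing is consumed, so the comma stays
// at the start of the following piece.
var commaKept = sample.split(/(?=,FOO)/);

// Comma outside the lookahead: it is consumed as the delimiter, while
// FOO is only asserted and therefore stays in the result.
var commaDropped = sample.split(/,(?=FOO)/);

console.log(commaKept);    // ["FOO,blue", ",FOO,red", ",FOOgreen"]
console.log(commaDropped); // ["FOO,blue", "FOO,red", "FOOgreen"]
```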

How to slice optional arguments in RegEx?

Currently I have the following regular expression:
/^(?:(?:\,([A-Za-z]{5}))?)+$/g
The accepted input should be something like ,IGORA, but ,IGORA,GIANC,LOLLI is also valid, and in that case I would like to slice the string into 3 groups. In general, the number of groups should equal the number of items in the user input that passes the RegExp test.
I tried something like this in JavaScript, but it returns only the last value:
var str = ',GIANC,IGORA';
var arr = str.match(/^(?:(?:\,([A-Za-z]{5}))?)+$/).slice(1);
alert(arr);
So the output is 'IGORA', while I would like it to be 'GIANC', 'IGORA'.
Here is another example:
/^([A-Z]{5})(?:(?:\,([A-Za-z]{2}))?)+$/g
The input must contain at least one 5-character string, but it can also contain further 5-character strings separated by commas, so from the input
IGORA,CIAOA,POPOP
I would like to get an array of ["IGORA","CIAOA","POPOP"].
You can capture each word in a capturing group surrounded by an optional preceding comma and an optional trailing comma.
The regex is: ,?([A-Za-z]+),?
const pattern = /,?([A-Za-z]+),?/gm;
const str = `,IGORA,GIANC,LOLLI`;
let matches = [];
let m;
// Iterate until no match found
while ((m = pattern.exec(str))) {
// The first captured group is the match
matches.push(m[1]);
}
console.log(matches);
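The original attempt returns only 'IGORA' because a capture group inside a repeated group keeps just its last iteration. One way around that, sketched here under the assumption that String.prototype.matchAll (ES2020) is available, is to validate the overall shape first and then extract the words in a second pass. The names input, isValid and words are illustrative only.

```javascript
const input = ",IGORA,GIANC,LOLLI";

// Step 1: validate the overall shape, one or more ",XXXXX" groups.
const isValid = /^(?:,[A-Za-z]{5})+$/.test(input);

// Step 2: extract every word. matchAll requires the g flag, and each
// result exposes its capture groups by index.
const words = isValid
  ? Array.from(input.matchAll(/,([A-Za-z]{5})/g), (m) => m[1])
  : [];

console.log(words); // ["IGORA", "GIANC", "LOLLI"]
```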
There are other ways to do this, but one of the simpler ones is the replace method with a callback, since with the g flag the callback runs for every match. The pattern must not be anchored with ^ and $, otherwise the only match is the whole string.
For example:
var regex = /,[A-Za-z]{5}/g;
var str = ',GIANC,IGORA';
var arr = [];
str.replace(regex, function(match) {
arr[arr.length] = match;
return match;
});
console.log(arr); // [',GIANC', ',IGORA']
Note that each collected string still starts with a comma; you can remove it by changing the assignment to arr[arr.length] = match.replace(/^,/, '').
Is this what you're looking for?
Explanation:
\b word boundary (starting or ending a word)
\w a word character ([A-Za-z0-9_])
{5} exactly 5 of the previous token
So it matches all 5-character words but not NANANANA
var str = 'IGORA,CIAOA,POPOP,NANANANA';
var arr = str.match(/\b\w{5}\b/g);
console.log(arr); //['IGORA', 'CIAOA', 'POPOP']
If you only wish to select words separated by commas and nothing else, you can test for them like so:
(?<=,\s*|^) preceded by a comma and any number of trailing spaces, OR is the first word in the list.
(?=,\s*|$) followed by a comma and any number of trailing spaces, OR is the last word in the list.
In the following code, POPOP and MOMMA are rejected because they are not separated by a comma, and NANANANA fails because it is not 5 characters long.
var str = 'IGORA, CIAOA, POPOP MOMMA, NANANANA, MEOWI';
var arr = str.match(/(?<=,\s*|^)\b\w{5}\b(?=,\s*|$)/g);
console.log(arr); //['IGORA', 'CIAOA', 'MEOWI']
If you can't have any trailing spaces after the comma, just leave out the \s* from both (?<=,\s*|^) and (?=,\s*|$).

How to replace numbers with an empty char

I need to replace the phone number in a string with a \n newline.
My string: Jhony Jhons,jhon#gmail.com,380967574366
I tried this:
var str = 'Jhony Jhons,jhon#gmail.com,380967574366'
var regex = /[0-9]/g;
var rec = str.trim().replace(regex, '\n').split(','); //Jhony Jhons,jhon#gmail.com,
The number is replaced with \n, but an extra comma remains after the e-mail and needs to be removed.
In the end, my string should look like this:
Jhony Jhons,jhon#gmail.com\n
You can try this:
var str = 'Jhony Jhons,jhon#gmail.com,380967574366';
var regex = /,[0-9]+/g;
str.replace(regex, '\n');
replace does not modify str; it returns a new string, which here should be what you want, i.e. Jhony Jhons,jhon#gmail.com\n
There are many ways to do that. One simple option is to split on the comma, drop the last element, and join the rest:
var str = 'Jhony Jhons,jhon#gmail.com,380967574366';
var splitted = str.split(","); //split them by comma
splitted.pop(); //removes the last element
var rec = splitted.join() + '\n'; //join them
You need a regex to select the complete phone number and also the preceding comma. Your current regex selects each digit and replaces each one with an "\n", resulting in a lot of "\n" in the result. Also the regex does not match the comma.
Use the following regex:
var str = 'Jhony Jhons,jhon#gmail.com,380967574366'
var regex = /,[0-9]+$/;
// it replaces all consecutive digits with the condition at least one digit exists (the "[0-9]+" part)
// placed at the end of the string (the "$" part)
// and also the digits must be preceded by a comma (the "," part in the beginning);
// also no need for global flag (/g) because of the $ symbol (the end of the string) which can be matched only once
var rec = str.trim().replace(regex, '\n'); //the result will be this string: Jhony Jhons,jhon#gmail.com\n
var str = "Jhony Jhons,jhon#gmail.com,380967574366";
var result = str.replace(/,\d+/g,'\\n');
console.log(result)
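The answers above differ in small but visible ways. The following sketch puts the variants side by side; the variable names are made up, and JSON.stringify is used only so that line breaks show up as \n in the console.

```javascript
var contact = "Jhony Jhons,jhon#gmail.com,380967574366";

// One newline per digit; the comma before the number is left in place.
var perDigit = contact.replace(/[0-9]/g, "\n");

// The comma plus the whole trailing run of digits becomes one newline.
var wholeNumber = contact.replace(/,[0-9]+$/, "\n");

// "\\n" inserts a backslash followed by the letter n, not a line break.
var literal = contact.replace(/,[0-9]+$/, "\\n");

console.log(JSON.stringify(perDigit));
console.log(JSON.stringify(wholeNumber));
console.log(JSON.stringify(literal));
```

Which of the last two is right depends on whether a real line break or the two visible characters \n are wanted in the output.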

Split a string according to flanking characters in javascript

JavaScript lets you split a string according to a regular expression. Is it possible to use this functionality to split a string only when the delimiter is flanked by certain characters?
For example, can I split the string 12-93 on the - character, but leave at-13 intact? Is that possible?
Using a regular expression seems promising, but doing "12-93".split(/[0-9]-[0-9]/) yields ["1", "3"] because the flanking digits are considered to be part of the delimiter.
Can I specify the above split pattern (a dash preceded and followed by a digit) without chopping the flanking digits?
Other Examples
"55,966,575-165,162,787" should yield ["55,966,575", "165,162,787"]
"55,966,575x-165,162,787" should yield ["55,966,575x-165,162,787"]
"sdf55,966,575-165,162,787" should yield ["sdf55,966,575", "165,162,787"]
Using two adjacent character sets seems to work.
See example at https://regex101.com/r/uFHMW1/1
([0-9,a-z]+?[0-9]+)-([0-9]+[0-9,a-z]+)
Try this (live here https://repl.it/EOOQ/0 ):
var strings = [
"55,966,575-165,162,787",
"55,966,575x-165,162,787",
"sdf55,966,575-165,162,787",
];
var pattern = '^([0-9,a-z]+?[0-9]+)-([0-9]+[0-9,a-z]+)$';
var regex = new RegExp(pattern, 'i');
var matched = strings.map(function (string) {
var matches = string.match( regex );
if (matches) {
return [matches[1], matches[2]];
} else {
return [string];
}
});
console.log(matched)
You can also run the above expression as split() like:
string.split(re).filter( str => str.length )
where Array.filter() is used to get rid of the leading and trailing empty strings created when the RegExp matches your input.
var strings = [
"55,966,575-165,162,787",
"55,966,575x-165,162,787",
"sdf55,966,575-165,162,787",
];
var pattern = '^([0-9,a-z]+?[0-9]+)-([0-9]+[0-9,a-z]+)$';
var regex = new RegExp(pattern, 'i');
var matched = strings.map( string => string.split(regex).filter( str => str.length ) );
console.log(matched)
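Try using a lookahead (and, where the engine supports it, a lookbehind). Lookarounds are zero-width, so they check the flanking characters without consuming them. Your regex consumes the flanking digits as part of the match, and split then treats the entire match as the delimiter.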
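A minimal sketch of that idea, assuming an engine with lookbehind support (ES2018 or later). The names dashBetweenDigits, inputs and results are illustrative.

```javascript
// Split on a dash only when a digit sits on both sides of it.
// (?<=\d) and (?=\d) are zero-width, so the digits are not consumed.
var dashBetweenDigits = /(?<=\d)-(?=\d)/;

var inputs = [
  "12-93",
  "at-13",
  "55,966,575-165,162,787",
  "55,966,575x-165,162,787",
  "sdf55,966,575-165,162,787",
];

var results = inputs.map(function (s) {
  return s.split(dashBetweenDigits);
});

console.log(results);
// [["12", "93"],
//  ["at-13"],
//  ["55,966,575", "165,162,787"],
//  ["55,966,575x-165,162,787"],
//  ["sdf55,966,575", "165,162,787"]]
```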

How can I remove all characters up to and including the 3rd slash in a string?

I'm having trouble removing all characters up to and including the third slash in JavaScript. This is my string:
http://blablab/test
The result should be:
test
Does anybody know the correct solution?
To get the last item in a path, you can split the string on / and then pop():
var url = "http://blablab/test";
alert(url.split("/").pop());
//-> "test"
To specify an individual part of a path, split on / and use bracket notation to access the item:
var url = "http://blablab/test/page.php";
alert(url.split("/")[3]);
//-> "test"
Or, if you want everything after the third slash, split(), slice() and join():
var url = "http://blablab/test/page.php";
alert(url.split("/").slice(3).join("/"));
//-> "test/page.php"
var string = 'http://blablab/test'
string = string.replace(/[\s\S]*?\//,'').replace(/[\s\S]*?\//,'').replace(/[\s\S]*?\//,'')
alert(string)
This uses a regular expression, applied three times so that each call strips one slash-terminated segment. I will explain below.
The regex is /[\s\S]*?\//
/ is the start of the regex
[\s\S] means whitespace or non-whitespace (anything), not to be confused with . which does not match line terminators such as \r and \n.
*? means that we match zero or more of [\s\S], but as few as possible (lazy). A greedy * would run to the last slash in the string, so a longer path such as http://blablab/test/page.php would be reduced to page.php.
\/ means match a slash character
The last / is the end of the regex
var str = "http://blablab/test";
var index = 0;
for(var i = 0; i < 3; i++){
index = str.indexOf("/",index)+1;
}
str = str.substr(index);
To make it a one liner you could make the following:
str = str.substr(str.indexOf("/",str.indexOf("/",str.indexOf("/")+1)+1)+1);
You can use split to break the string into parts and slice to return all parts after the third slash.
var str = "http://blablab/test",
arr = str.split("/");
arr = arr.slice(3);
console.log(arr.join("/")); // "test"
// A longer string:
var str = "http://blablab/test/test"; // "test/test";
You could use a regular expression like this one:
'http://blablab/test'.match(/^(?:[^/]*\/){3}(.*)$/);
// -> ['http://blablab/test', 'test']
A string's match method gives you either an array (the whole match, which in this case is the whole input, followed by any capture groups) or null. We want the first capture group, so for general use you need to pull out the element at index 1 of the array, or null if no match was found:
var input = 'http://blablab/test',
re = /^(?:[^/]*\/){3}(.*)$/,
match = input.match(re),
result = match && match[1]; // With this input, result contains "test"
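The same pattern can be wrapped in a small function. This is only a sketch: the name afterNthSlash and its n parameter are hypothetical and not part of any answer above, and [\s\S]* is used for the tail so that line breaks in the remainder are kept.

```javascript
// Hypothetical helper: returns everything after the n-th slash,
// or null when the input contains fewer than n slashes.
function afterNthSlash(input, n) {
  // [^/]*/ matches one segment and its slash; {n} repeats it n times.
  var re = new RegExp("^(?:[^/]*/){" + n + "}([\\s\\S]*)$");
  var match = input.match(re);
  return match ? match[1] : null;
}

console.log(afterNthSlash("http://blablab/test", 3));          // "test"
console.log(afterNthSlash("http://blablab/test/page.php", 3)); // "test/page.php"
console.log(afterNthSlash("http://blablab", 3));               // null
```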
let str = "http://blablab/test";
let data = new URL(str).pathname.split("/").pop();
console.log(data);
