How can I obtain substrings from a string in javascript? - javascript

I have a string that looks like this:
var stringOriginal = "72157632110713449SomeDynamicText";
I want to separate this string into two substrings:
One substring is the first 17 digits
One substring is the rest of the string
I want these stored in two separate variables, like this:
var string1 = "72157632110713449"; //First static 17 digits
var string2 = "SomeDynamicText"; // Dynamic Text

Assuming the digit prefix always has the same length, you can use the substring or substr string functions. The two are very similar:
substr(start, length) obtains a value from the start index to a specified length (or to the end, if unspecified)
substring(start, end) obtains a value from the start index to the end index (or the end, if unspecified)
So, one way to do it, mixing and matching the two, is like this:
var string1 = stringOriginal.substring(0, 17);
// interestingly enough, this does the same in this case
var string1 = stringOriginal.substr(0, 17);
var string2 = stringOriginal.substr(17);
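The two calls agree above only because the start index is 0, so "length 17" and "end index 17" describe the same range. A minimal sketch of where they diverge, using a made-up sample value (`sample` is not from the question), with `slice` shown as a third option:

```javascript
// Hypothetical sample value, chosen so the two methods give different results.
var sample = "abcdefgh";

// substr(start, length): 3 characters starting at index 2.
var bySubstr = sample.substr(2, 3);        // "cde"

// substring(start, end): from index 2 up to, but not including, index 3.
var bySubstring = sample.substring(2, 3);  // "c"

// slice(start, end) takes an end index like substring and also accepts negative indices.
var bySlice = sample.slice(2, 5);          // "cde"
var tail = sample.slice(-3);               // "fgh"

console.log(bySubstr, bySubstring, bySlice, tail);
```

For the question's string, `stringOriginal.slice(0, 17)` and `stringOriginal.slice(17)` would give the same two parts.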
If, however, you need a more sophisticated solution (e.g. not a fixed length of digits), you could try using a regex:
var regex = /(\d+)(\w+)/;
var match = regex.exec(stringOriginal);
var string1 = match[1]; // Obtains match from first capture group
var string2 = match[2]; // Obtains match from second capture group
Of course, this adds to the complexity, but is more flexible.
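One thing the snippet above does not cover is that exec returns null when nothing matches, so match[1] would throw. A possible variation, assuming the digits must be at the very start of the string (the helper name `splitDigitsAndText` and the anchored pattern are illustrative, not part of the original answer):

```javascript
// Hypothetical helper: splits leading digits from the rest of the string.
// Returns null when the input does not start with at least one digit.
function splitDigitsAndText(input) {
  // ^ anchors at the start; (.*) takes whatever follows the digits on that line.
  var match = /^(\d+)(.*)$/.exec(input);
  if (match === null) {
    return null;
  }
  return { digits: match[1], text: match[2] };
}

console.log(splitDigitsAndText("72157632110713449SomeDynamicText"));
console.log(splitDigitsAndText("NoDigitsHere")); // null
```

Unlike (\w+), the (.*) group also accepts spaces and punctuation in the dynamic part; whether that is wanted depends on the data.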

Here you go:
string1 = stringOriginal.substring(0, 17);
string2 = stringOriginal.substring(17, stringOriginal.length);
or
string2 = stringOriginal.substring(17);
//Second parameter is optional. The index where to stop the extraction.
//If second parameter is omitted, it extracts the rest of the string

This will split the string into vars given that the first 17 characters always go into string1 and the remainder into string2.
var string1 = stringOriginal.substring(0,17);
var string2 = stringOriginal.substring(17,stringOriginal.length);

Assuming that you want to split the string by separating initial digits from the rest regardless of length :
string = string.match (/^(\d+)(.*)/) || [string, '', ''];
string[1] will hold the initial digits, string[2] the rest of the string, the original string will be in string[0].
If string does not start with a digit, string[0] will hold the original string and string[1] and string[2] will be empty strings.
By changing the code to :
string = string.match (/^(\d*)(.*)/);
strings containing no initial digits will have string[1] empty and string[2] will have the same value as string[0], i.e. the initial string. In this case there is no need to handle the case of a failing match.

Related

JS - Split string into substrings by regex

Let's say I have a string that starts with 7878 and ends with 0d0a or 0D0A, such as:
var string1 = "78780d0101234567890123450016efe20d0a";
var string2 = "78780d0101234567890123450016efe20d0a78780d0103588990504943870016efe20d0a";
var string3 = "78780d0101234567890123450016efe20d0a78780d0103588990504943870016efe20d0a78780d0101234567890123450016efe20d0a";
How can I split it by regex so it becomes an array like:
['78780d0101234567890123450016efe20d0a']
['78780d0101234567890123450016efe20d0a','78780d0103588990504943870016efe20d0a']
['78780d0101234567890123450016efe20d0a','78780d0103588990504943870016efe20d0a','78780d0101234567890123450016efe20d0a']
You can split the string with a positive lookahead (?=7878). The regex isn't consuming any characters, so 7878 will be part of the string.
var rgx = /(?=7878)/;
console.log(string1.split(rgx));
console.log(string2.split(rgx));
console.log(string3.split(rgx));
Another option is to split on '7878' and then take all the elements except first and add '7878' to each of them. For example:
var arr = string3.split('7878').slice(1).map(function(str){
return '7878' + str;
});
That works, BUT it also matches strings that do NOT end in 0d0a. How can I match only those ending in 0d0a OR 0D0A?
Well, then you can use String.match with a plain regex.
console.log(string3.match(/7878.*?0d0a/ig));
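As a rough sketch of how that match call behaves on the sample data, wrapped in a helper (`extractFrames` and `frames` are names made up here) that returns an empty array instead of null when nothing matches:

```javascript
// Sample input from the question: three frames back to back.
var frames = "78780d0101234567890123450016efe20d0a" +
             "78780d0103588990504943870016efe20d0a" +
             "78780d0101234567890123450016efe20d0a";

// Hypothetical helper: returns every substring that starts with 7878 and
// runs to the nearest 0d0a (case-insensitive), or [] when nothing matches.
function extractFrames(input) {
  return input.match(/7878.*?0d0a/gi) || [];
}

console.log(extractFrames(frames));       // the three frames
console.log(extractFrames("7878abcdef")); // [] because there is no 0d0a terminator
```

Because the quantifier is lazy, a frame is cut at the first 0d0a that appears after 7878. If the payload itself can contain 0d0a, a regex alone is probably not enough and the frame length would have to be taken into account.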

Why is this regex matching also words within a non-capturing group?

I have this string (notice the multi-line syntax):
var str = ` Number One: Get this
Number Two: And this`;
And I want a regex that returns (with match):
[str, 'Get this', 'And this']
So I tried str.match(/Number (?:One|Two): (.*)/g);, but that's returning:
["Number One: Get this", "Number Two: And this"]
There can be any whitespace/line-breaks before any "Number" word.
Why doesn't it return only what is inside of the capturing group? Am I misunderstanding something? And how can I achieve the desired result?
Per the MDN documentation for String.match:
If the regular expression includes the g flag, the method returns an Array containing all matched substrings rather than match objects. Captured groups are not returned. If there were no matches, the method returns null.
(emphasis mine).
So, what you want is not possible with match and the g flag.
The same page adds:
if you want to obtain capture groups and the global flag is set, you need to use RegExp.exec() instead.
so if you're willing to give up on using match, you can write your own function that repeatedly applies the regex, gets the captured substrings, and builds an array.
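A minimal sketch of such a function, assuming only the first capture group is wanted (`collectFirstGroup` is a hypothetical name, and `text` stands in for the question's str):

```javascript
// Hypothetical helper: collects the first capture group of every match.
function collectFirstGroup(input, regex) {
  if (!regex.global) {
    // Without the g flag exec() never advances and the loop would not end.
    throw new TypeError("regex must have the g flag");
  }
  var results = [];
  var match;
  regex.lastIndex = 0; // start from the beginning even if the regex was used before
  while ((match = regex.exec(input)) !== null) {
    results.push(match[1]);
    // Guard against an endless loop on zero-length matches.
    if (match[0] === "") {
      regex.lastIndex++;
    }
  }
  return results;
}

var text = " Number One: Get this\n Number Two: And this";
var groups = collectFirstGroup(text, /Number (?:One|Two): (.*)/g);
console.log([text].concat(groups)); // [text, "Get this", "And this"]
```

In environments that support it, String.prototype.matchAll gives the same match objects without the manual loop.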
Or, for your specific case, you could write something like this:
var these = str.split(/(?:^|\n)\s*Number (?:One|Two): /);
these[0] = str;
Replace and store the result in a new string, like this:
var str = ` Number One: Get this
Number Two: And this`;
var output = str.replace(/Number (?:One|Two): (.*)/g, "$1");
console.log(output);
which outputs:
Get this
And this
If you want the match array like you requested, you can try this:
var getMatch = function(string, split, regex) {
var match = string.replace(regex, "$1" + split);
match = match.split(split);
match = match.reverse();
match.push(string);
match = match.reverse();
match.pop();
return match;
}
var str = ` Number One: Get this
Number Two: And this`;
var regex = /Number (?:One|Two): (.*)/g;
var match = getMatch(str, "#!SPLIT!#", regex);
console.log(match);
which displays the array as desired:
[ ' Number One: Get this\n Number Two: And this',
' Get this',
'\n And this' ]
Where split (here #!SPLIT!#) should be a unique string to split the matches. Note that this only works for single groups. For multi groups add a variable indicating the number of groups and add a for loop constructing "$1 $2 $3 $4 ..." + split.
Try
var str = " Number One: Get this\n Number Two: And this";
// `/\w+\s+\w+(?=\s|$)/g` match one or more alphanumeric characters ,
// followed by one or more space characters ,
// followed by one or more alphanumeric characters ,
// if following space or end of input , set `g` flag
// return `res` array `["Get this", "And this"]`
var res = str.match(/\w+\s+\w+(?=\s|$)/g);
document.write(JSON.stringify(res));

Tokenize a JavaScript String depending on the characters

In JavaScript, let's say I have a String like "23+var-5/422*b".
I want to split this String so that I get [23,+,var,-,5,/,422,*,b].
I want to tokenize it so that I split the string into 3 types of tokens:
Numerical literals, [0-9].
String literals, [A-z].
Operator characters, [-+*/].
So basically, go through the string, and for each "cluster of characters" that share the same class (each with 1 or more characters), convert that into a token.
I could probably use a for loop, comparing each character with each class, and manually create a token every time the current "character class" changes... it would be very tedious and use many variables and loops.
Does anyone know a more elegant (less verbose) way to get there?
A global regexp match will do this for you:
var str = "23+var-5/422*b";
var arr = str.match(/[0-9]+|[a-zA-Z]+|[-+*/]/g); // notice the creation of one token
// per operator (even if consecutive)
However, it simply ignores invalid characters instead of erroring out.
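If erroring out is preferred, one possible approach is to match token by token with the sticky flag (y, ES2015) so that nothing can be skipped. This is only a sketch; `tokenize` and its error message are made up for illustration:

```javascript
// Hypothetical helper: tokenizes digits, letters and the operators - + * /,
// and throws when it meets any other character.
function tokenize(input) {
  // The sticky flag (y) forces each match to start exactly at lastIndex,
  // so invalid characters cannot be skipped silently.
  var tokenPattern = /[0-9]+|[a-zA-Z]+|[-+*\/]/y;
  var tokens = [];
  var position = 0;
  while (position < input.length) {
    tokenPattern.lastIndex = position;
    var match = tokenPattern.exec(input);
    if (match === null) {
      throw new SyntaxError(
        "Unexpected character '" + input[position] + "' at index " + position
      );
    }
    tokens.push(match[0]);
    position = tokenPattern.lastIndex;
  }
  return tokens;
}

console.log(tokenize("23+var-5/422*b"));
// ["23", "+", "var", "-", "5", "/", "422", "*", "b"]
```

Calling tokenize("2 + 3") throws, because the space at index 1 belongs to none of the three classes.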
Here's a way to do it using regexes and a loop. The code could be shortened with Underscore.js or CoffeeScript; this is a longer version in vanilla JS:
var s = "23+var-5/422*b"; // your string
var re1 = /[0-9]/; // Regex for digits
var re2 = /[a-zA-Z]/; // Regex for Latin letters
var re3 = /[-+*\/]/; // Regex for the operators you wanted
// Helper function: returns true if n is non-negative
function nonNegative(n) {
  return n >= 0;
}
// Helper function: adds n to the array arr if n is non-negative
function addNonNegative(n, arr) {
  if (nonNegative(n)) { arr.push(n); }
}
// The main function to split string s
function split(s) {
  var result = []; // The result array
  // Loop while string s is non-empty.
  while (s.length > 0) {
    // Indices at which each character class first occurs in s
    var order = [];
    // search() returns -1 when there is no match, so only non-negative indices are kept
    addNonNegative(s.search(re1), order);
    addNonNegative(s.search(re2), order);
    addNonNegative(s.search(re3), order);
    // Sort numerically; the default sort() compares as strings, so 10 would sort before 2
    order.sort(function (a, b) { return a - b; });
    // start is the index of the first class found (0 when s begins with a valid character)
    var start = order.shift();
    // end is where the next character class begins (undefined if there is none)
    var end = order.shift();
    result.push(s.slice(start, end)); // slice the string s from start to end
    // Update s to drop what was just sliced off
    s = s.slice(end);
    // Boundary condition: when no second class remains, the rest of s was the last token
    if (end == null) { s = ""; }
  }
  return result;
}
console.log(split(s));

How to match a number at the start of a string

I would like to match a number at the start of each string:
1000_lang sorting_1 ghhgf_1002
1001_lang
100_abcdefg_sgdga_10001_321gg hjdshjdg
So, I will have numbers: 1000, 1001, 100 respectively. Basically, I want to match a number from a string until that number meets first underscore. But numbers can be any length, so if it is 12345_eyquyewuq_32136 df_1999 I need 12345. Don't need any other numbers coming after the first underscore.
^\d+
Get all numbers from the start of the line up to the first non-number
str = "123456_wibble";
patt = /^\d+/;
result = str.match( patt);
result is an array of matches, or null if nothing matched; when it is not null, result[0] holds the number as a string
See Mozilla Regular Expressions
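Applied to the three sample lines from the question, that could look like the sketch below. The helper name `leadingNumber` and the extra "no_number_here" line are additions for illustration:

```javascript
// Hypothetical helper: returns the leading number of a line, or null.
function leadingNumber(line) {
  var match = line.match(/^\d+/);
  // match() returns null, not an empty array, when nothing matches.
  return match === null ? null : Number(match[0]);
}

var lines = [
  "1000_lang sorting_1 ghhgf_1002",
  "1001_lang",
  "100_abcdefg_sgdga_10001_321gg hjdshjdg",
  "no_number_here"
];
console.log(lines.map(leadingNumber)); // [1000, 1001, 100, null]
```

If the numbers can be longer than about 15 digits, keeping match[0] as a string avoids losing precision in the Number conversion.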
This answer uses plain JavaScript string methods, which may be useful if you'd rather not use a regex:
var str = "1000_lang sorting_1 ghhgf_1002";
var result = str.split("_")[0];
result will hold the first number.
Something like this....
var str = '1000_lang sorting_1 ghhgf_1002',
matches = str.match(/^\d+/)
console.log(matches)

How can I remove all characters up to and including the 3rd slash in a string?

I'm having trouble removing all characters up to and including the third slash in JavaScript. This is my string:
http://blablab/test
The result should be:
test
Does anybody know the correct solution?
To get the last item in a path, you can split the string on / and then pop():
var url = "http://blablab/test";
alert(url.split("/").pop());
//-> "test"
To specify an individual part of a path, split on / and use bracket notation to access the item:
var url = "http://blablab/test/page.php";
alert(url.split("/")[3]);
//-> "test"
Or, if you want everything after the third slash, split(), slice() and join():
var url = "http://blablab/test/page.php";
alert(url.split("/").slice(3).join("/"));
//-> "test/page.php"
var string = 'http://blablab/test'
string = string.replace(/[\s\S]*?\//,'').replace(/[\s\S]*?\//,'').replace(/[\s\S]*?\//,'')
alert(string)
This applies a regular expression three times, removing everything up to and including one slash each time. I will explain below
The regex is /[\s\S]*?\//
/ is the start of the regex
Where [\s\S] means whitespace or non whitespace (anything), not to be confused with . which does not match line breaks (. is the same as [^\r\n]).
*? means that we match as few [\s\S] as possible (zero or more), so the match stops at the first slash. A plain * is greedy and would match up to the last slash instead
\/ Means match a slash character
The last / is the end of the regex
var str = "http://blablab/test";
var index = 0;
for(var i = 0; i < 3; i++){
index = str.indexOf("/",index)+1;
}
str = str.substr(index);
To make it a one liner you could make the following:
str = str.substr(str.indexOf("/",str.indexOf("/",str.indexOf("/")+1)+1)+1);
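The loop above quietly starts over from index 0 when the string has fewer than three slashes, because indexOf returns -1. A more defensive sketch of the same idea, generalized to any separator and count (`afterNth` is a hypothetical helper, not part of the original answer):

```javascript
// Hypothetical helper: returns everything after the n-th occurrence of
// separator, or null when the input has fewer than n occurrences.
function afterNth(input, separator, n) {
  var start = 0;
  for (var i = 0; i < n; i++) {
    var index = input.indexOf(separator, start);
    if (index === -1) {
      return null;
    }
    // Continue searching just past the separator that was found.
    start = index + separator.length;
  }
  return input.slice(start);
}

console.log(afterNth("http://blablab/test", "/", 3));          // "test"
console.log(afterNth("http://blablab/test/page.php", "/", 3)); // "test/page.php"
console.log(afterNth("http://blablab", "/", 3));               // null
```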
You can use split to split the string into parts and use slice to return all parts after the third slash.
var str = "http://blablab/test",
arr = str.split("/");
arr = arr.slice(3);
console.log(arr.join("/")); // "test"
// A longer string:
var str = "http://blablab/test/test"; // "test/test";
You could use a regular expression like this one:
'http://blablab/test'.match(/^(?:[^/]*\/){3}(.*)$/);
// -> ['http://blablab/test', 'test']
A string’s match method gives you either an array (the whole match, which here is the whole input, followed by any capture groups; we want the first capture group) or null. So, for general use you need to pull out the element at index 1 of the array, or null if a match wasn’t found:
var input = 'http://blablab/test',
re = /^(?:[^/]*\/){3}(.*)$/,
match = input.match(re),
result = match && match[1]; // With this input, result contains "test"
let str = "http://blablab/test";
let data = new URL(str).pathname.split("/").pop();
console.log(data);
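Note that pop() returns only the last path segment. If everything after the third slash is wanted, a sketch along the same lines, assuming the input is always an absolute URL (`afterHost` is a made-up helper name), could be:

```javascript
// Hypothetical helper: returns everything after the third slash of an
// absolute URL, i.e. path, query string and fragment without the leading "/".
function afterHost(urlString) {
  var url = new URL(urlString); // throws a TypeError on invalid or relative input
  return (url.pathname + url.search + url.hash).slice(1);
}

console.log(afterHost("http://blablab/test"));              // "test"
console.log(afterHost("http://blablab/test/page.php?x=1")); // "test/page.php?x=1"
```

The URL parser normalizes its input (for example by percent-encoding some characters), so the result is not always a byte-for-byte copy of the original string.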
