Traversing a string in JavaScript

I have a number of strings concatenated together:
"[thing 1,thing 2,cat in the hat,Dr. Suese]"
I would like to traverse this string to stop at a specific comma (given an index) and return the substring immediately after the comma and before the next comma. The problem is I need to do it in JavaScript. I assume it would be something like this
function returnSubstring(i,theString){
var j,k = 0;
while(theString.charCodeAt(k) != ','){
while(i > 0){
if (theString.charCodeAt(j) == ','){
i--;
}
j++;
}
k++;
}
return theString.substring(j,k);
}
Is this what it should look like, or is there some syntax issue here?
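For reference, two things stop the loop above from working: charCodeAt returns a numeric character code, so comparing it with ',' is never true, and j is declared without a value. A minimal corrected sketch, assuming i counts commas starting from 1 and the surrounding brackets are left in the string:

```javascript
// Returns the text between the i-th comma (1-based) and the next comma.
// With i = 0 it returns everything before the first comma.
function returnSubstring(i, theString) {
  var j = 0; // ends up just past the i-th comma
  while (i > 0 && j < theString.length) {
    if (theString.charAt(j) === ',') {
      i--;
    }
    j++;
  }
  var k = j; // ends up at the next comma, or at the end of the string
  while (k < theString.length && theString.charAt(k) !== ',') {
    k++;
  }
  return theString.substring(j, k);
}

console.log(returnSubstring(1, "[thing 1,thing 2,cat in the hat,Dr. Suese]")); // "thing 2"
```

If the string has fewer than i commas, this sketch returns an empty string.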

I would like to traverse this string to stop at a specific comma (given an index) and return the substring immediately after the comma and before the next comma.
--> Let's assume the index of the given comma is 8, i.e. the index of the first comma. Then you can do:
var givenCommaIndex = 8;
var value = "[thing 1,thing 2,cat in the hat,Dr. Suese]";
var subString = value.substring(givenCommaIndex+1, value.indexOf(",", givenCommaIndex+1));
console.log(subString);
// Output :
"thing 2"
I can write a reusable function like the one below; it works not just for commas but for other delimiters as well:
function getSubString(str, delimiter, indexOfDelimiter) {
// TODO : handle specific cases like str is undefined or delimiter is null
return str.substring(indexOfDelimiter+1, str.indexOf(delimiter, indexOfDelimiter+1));
}
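One case the function above does not cover is the last token: when no further delimiter follows, indexOf returns -1 and substring then returns the start of the string instead. A hedged sketch that handles this, where getSubStringSafe is a hypothetical name and returning an empty string for non-string input is an assumption:

```javascript
// Hypothetical variant of getSubString that also works for the last token.
function getSubStringSafe(str, delimiter, indexOfDelimiter) {
  // Assumption: non-string input yields an empty string.
  if (typeof str !== 'string' || typeof delimiter !== 'string') {
    return '';
  }
  var start = indexOfDelimiter + delimiter.length;
  var next = str.indexOf(delimiter, start);
  // No further delimiter: take everything up to the end of the string.
  return next === -1 ? str.substring(start) : str.substring(start, next);
}

var value = "[thing 1,thing 2,cat in the hat,Dr. Suese]";
console.log(getSubStringSafe(value, ",", 8)); // "thing 2"
console.log(getSubStringSafe(value, ",", value.lastIndexOf(","))); // "Dr. Suese]"
```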

You may split :
var token = "[thing 1,thing 2,cat in the hat,Dr. Suese]"
.slice(1,-1) // remove [ and ]
.split(',')
[2]; // the third token
Or use a regular expression :
var token = "[thing 1,thing 2,cat in the hat,Dr. Suese]"
.match(/([^\]\[,]+)/g)
[2];

Related

Splitting a CSV string but not certain elements javascript [duplicate]

This question already has answers here:
How to parse CSV data?
Given a comma-separated string such as
'UserName,Email,[a,b,c]'
I want to split it into an array of only the outermost elements, so the expected result is
['UserName','Email', '[a,b,c]']
string.split(',') splits on every comma, including the ones inside the brackets, so it won't work here. Any suggestions? This is breaking a CSV reader I have.
I wrote two similar answers, so I might as well make this a third instead of referring you there. It's a stateful split. It doesn't support nested arrays, but it can easily be extended to do so.
var str = 'UserName,Email,[a,b,c]'
function tokenize(str) {
var state = "normal";
var tokens = [];
var current = "";
for (var i = 0; i < str.length; i++) {
var c = str[i];
if (state == "normal") {
if (c == ',') {
if (current) {
tokens.push(current);
current = "";
}
continue;
}
if (c == '[') {
state = "quotes";
current = "";
continue;
}
current += c;
}
if (state == "quotes") {
if (c == ']') {
state = "normal";
tokens.push(current);
current = "";
continue;
}
current += c;
}
}
if (current) {
tokens.push(current);
current = "";
}
return tokens;
}
console.log(tokenize(str))
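As a sketch of how the stateful split could be extended to nested arrays, a depth counter can replace the two-state flag. This is an assumed variant, not the code above: splitTopLevel is a hypothetical name, it keeps the brackets in the output, and it keeps empty fields rather than dropping them.

```javascript
// Split on commas that are not inside any [ ... ] pair.
function splitTopLevel(str) {
  var tokens = [];
  var depth = 0;    // how many brackets are currently open
  var current = "";
  for (var i = 0; i < str.length; i++) {
    var c = str[i];
    if (c === '[') {
      depth++;
    } else if (c === ']' && depth > 0) {
      depth--;
    }
    if (c === ',' && depth === 0) {
      // Top-level comma: finish the current token.
      tokens.push(current);
      current = "";
    } else {
      current += c;
    }
  }
  tokens.push(current);
  return tokens;
}

console.log(splitTopLevel('UserName,Email,[a,b,c]')); // ["UserName", "Email", "[a,b,c]"]
console.log(splitTopLevel('x,[a,[b,c]],y'));          // ["x", "[a,[b,c]]", "y"]
```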
You can do this by matching the string to this Regex:
/(^|(?<=,))\[[^[]+\]|[^,]+((?=,)|$)/
let string = '[a,b,c],UserName,[1,2],Email,[a,b,c],password'
let regex = /(^|(?<=,))\[[^[]+\]|[^,]+((?=,)|$)/g
let output = string.match(regex);
console.log(output)
The regex can be summarized as:
Match either an array or a string that's enclosed by commas or at the start/end of our input
The key token we're using is the alternation operator |, which works as "either this or that". Since the regex engine is eager, once one alternative matches it moves on. So if we match an array, we move on and don't consider what's inside it.
We can break it down to 3 main sections:
(^|(?<=,))
^ Match from the beginning of our string
| Alternatively
(?<=,) Match a string that's preceded by a comma without returning the comma. Read more about positive lookaround here.
\[[^[]+\] | [^,]+
\[[^[]+\] Match a string that starts with [ and ends with ] and contains one or more characters that aren't [
This is because, without that restriction, [1,2],[a,b] could match as a whole, since it starts with [ and ends with ]. Excluding [ from the contents prevents that, because a second [ indicates the start of the next array.
| Alternatively
[^,]+ Match a string of any length that doesn't contain a comma. The reason is the same as for the brackets above: in ,asd,qwe, technically all of asd,qwe is enclosed by commas.
((?=,)|$)
(?=,) Match any string that's followed by a comma
| Alternatively
$ Match a string that ends with the end of the main string. Read here for a better explanation.
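Some older JavaScript engines do not support lookbehind, so a simpler pattern without lookarounds may be worth considering. The sketch below assumes that brackets are never nested and that a field is either fully bracketed or contains no brackets at all; splitOutermost is a hypothetical name.

```javascript
// Match either a whole [ ... ] group or a run of non-comma characters.
function splitOutermost(input) {
  // match() returns null when nothing matches, so fall back to an empty array.
  return input.match(/\[[^\]]*\]|[^,]+/g) || [];
}

console.log(splitOutermost('UserName,Email,[a,b,c]'));
// ["UserName", "Email", "[a,b,c]"]
console.log(splitOutermost('[a,b,c],UserName,[1,2],Email,[a,b,c],password'));
// ["[a,b,c]", "UserName", "[1,2]", "Email", "[a,b,c]", "password"]
```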

Create regular expression to separate each word from . (dot)

I am trying to create a regular expression to separate each word from . (dot) in JavaScript.
function myFunction() {
var url = "in.k1.k2.k3.k4.com"
var result
var match
if (match = url.match(/^[^\.]+\.(.+\..+)$/)) {
console.log(match);
}
}
Actual result is:
match[0] : in.k1.k2.k3.k4.com
match[1] : k1.k2.k3.k4.com
Expected result is:
match[0] : in.k1.k2.k3.k4.com
match[1] : k1.k2.k3.k4.com
match[2] : k2.k3.k4.com
match[3] : k3.k4.com
match[4] : k4.com
Please help me to create perfect regular expression.
Using a regex, in this case, might not be the best choice. You could simply split your string at the . and then join them when you need it.
function recursivelySplitText(arrayOfString, output) {
// check if the output is set, otherwise create it.
if(!output) {
output = [];
}
// we add the current value to the output
output.push(arrayOfString.join('.'));
// we remove the first element of the array
arrayOfString.splice(0, 1);
// if we have just one element left in the array ( com ) we return the output
// otherwise, we call the function again with the shortened array.
return arrayOfString.length === 1 ? output : recursivelySplitText(arrayOfString, output);
}
const text = 'in.k1.k2.k3.k4.com';
// we need to split it first to have an array of string rather than a string.
console.log(recursivelySplitText(text.split('.')));
Here I used recursion because it is fun, but it is not the only way to get this result.
Instead of a regex you could use a combination of split() and map() to create the output array you require. Try this:
function myFunction() {
var arr = "in.k1.k2.k3.k4.com".split('.');
var output = arr.map((v, i) => arr.slice(i).join('.'));
console.log(output);
}
myFunction();
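Note that the map() version also produces a final "com" entry, which is not in the expected result from the question. A small sketch that leaves it out, where domainSuffixes is a hypothetical name:

```javascript
// Build every suffix of a dotted name that still contains at least one dot.
function domainSuffixes(host) {
  var parts = host.split('.');
  // Iterate over all parts except the last, so the bare "com" is never emitted.
  return parts.slice(0, -1).map(function (_, i) {
    return parts.slice(i).join('.');
  });
}

console.log(domainSuffixes('in.k1.k2.k3.k4.com'));
// ["in.k1.k2.k3.k4.com", "k1.k2.k3.k4.com", "k2.k3.k4.com", "k3.k4.com", "k4.com"]
```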

Tokenize a JavaScript String depending on the characters

In JavaScript, let's say I have a String like "23+var-5/422*b".
I want to split this String so that I get [23,+,var,-,5,/,422,*,b].
I want to tokenize it so that I split the string into 3 types of tokens:
Numerical literals, [0-9].
String literals, [A-z].
Operator characters, [-+*/].
So basically, go through the string, and for each "cluster of characters" that share the same class (each with 1 or more characters), convert that into a token.
I could probably use a for loop, comparing each character with each class, and manually create a token every time the current "character class" changes... it would be very tedious and use many variables and loops.
Does anyone know a more elegant (less verbose) way to get there?
A global regexp match will do this for you:
var str = "23+var-5/422*b";
var arr = str.match(/[0-9]+|[a-zA-Z]+|[-+*/]/g); // notice the creation of one token
// per operator (even if consecutive)
However, it simply ignores invalid characters instead of erroring out.
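If erroring out on invalid characters is preferred, one possible approach is to add a catch-all alternative and throw when it matches. This is a sketch under that assumption; tokenizeStrict is a hypothetical name and it relies on String.prototype.matchAll (ES2020).

```javascript
// Tokenize into numbers, identifiers and operators; throw on anything else.
function tokenizeStrict(str) {
  var tokens = [];
  // The last alternative captures any single character the first three reject.
  var re = /[0-9]+|[a-zA-Z]+|[-+*\/]|([\s\S])/g;
  for (var m of str.matchAll(re)) {
    if (m[1] !== undefined) {
      throw new SyntaxError('Unexpected character "' + m[1] + '" at index ' + m.index);
    }
    tokens.push(m[0]);
  }
  return tokens;
}

console.log(tokenizeStrict("23+var-5/422*b"));
// ["23", "+", "var", "-", "5", "/", "422", "*", "b"]
```

With this version, whitespace also counts as invalid, so "2 + 3" throws.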
Here's a way to do it using Regex. Obviously the code can be simplified more if you use Underscore.js or CoffeeScript. So here's a longer version using vanilla JS:
var s = "23+var-5/422*b"; // your string
var re1 = /[0-9]/; // Regex for numerals
var re2 = /[a-zA-Z]/; // Regex for roman chars
var re3 = /[-+*\/]/; // Regex you wanted for operators
// Helper function: return true if n is non-negative
function nonNegative(n) {
return n >= 0;
}
// Helper function: add any non-negative n to array arr
function addNonNegative(n, arr) {
if (nonNegative(n)) { arr.push(n); }
}
// The main function to split string s
function split(s) {
var result = []; // The result array, initialized
// Loop while string s is non-empty.
while(s.length > 0) {
// The order of indices of regex found
var order = [];
// search for the index at which each regex first occurs; if that index is non-negative, add it to the 'order' array
addNonNegative(s.search(re1), order);
addNonNegative(s.search(re2), order);
addNonNegative(s.search(re3), order);
// sort the order array numerically (the default sort compares elements as strings, so 10 would sort before 2)
order.sort(function (a, b) { return a - b; });
// variables to slice the string s.
// start is always 0. Marks the starting index of the first matched regex
var start = order.shift();
// Marks the starting index of the second matched regex
var end = order.shift(); // end is the second result in order
result.push(s.slice(start, end)); // slice the string s from start to end
// update s to exclude what was sliced off
s = s.slice(end);
// boundary condition: when end is undefined (no other regex matched), everything has been consumed, so set s = ""
if (end == null) { s = ""; }
}
return result;
}

Separate value from string using javascript

I have a string in which every value is between [] and it has a . at the end. How can I separate all values from the string?
This is the example string:
[value01][value02 ][value03 ]. [value04 ]
//want something like this
v1 = value01;
v2 = value02;
v3 = value03;
v4 = value04
The number of values is not constant. How can I get all values separately from this string?
Use regular expressions to specify multiple separators. Please check the following posts:
How do I split a string with multiple separators in javascript?
Split a string based on multiple delimiters
var str = "[value01][value02 ][value03 ]. [value04 ]"
var arr = str.split(/[\[\]\.\s]+/);
arr.shift(); arr.pop(); //discard the first and last "" elements
console.log( arr ); //output: ["value01", "value02", "value03", "value04"]
How This Works
.split(/[\[\]\.\s]+/) splits the string wherever it finds one or more of the following characters: [, ], ., or whitespace. Since these characters also occur at the beginning and end of the string, the resulting array starts and ends with an empty string; .shift() discards the first element and .pop() discards the last. Alternatively, you may want to use .filter(), in which case you can replace lines 2 and 3 with:
var arr = str.split(/[\[\]\.\s]+/).filter(function(elem) { return elem.length > 0; });
Now you can use jQuery/JS to iterate through the values:
$.each( arr, function(i,v) {
console.log( v ); // outputs the i'th value;
});
And arr.length will give you the number of elements you have.
If you want to get the characters between "[" and "]" and the data is regular and always has the pattern:
'[chars][chars]...[chars]'
then you can get the chars using match to get sequences of characters that aren't "[" or "]":
var values = '[value01][value02 ][value03 ][value04 ]'.match(/[^\[\]]+/g)
which returns an array, so values is:
["value01", "value02 ", "value03 ", "value04 "]
Match is very widely supported, so no cross browser issues.
Here's a fiddle: http://jsfiddle.net/5xVLQ/
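When the string is not perfectly regular (for example, the ". " between groups in the question's sample), capturing only what sits inside each pair of brackets avoids picking up the stray text. A minimal sketch, assuming the trailing spaces should be trimmed; extractBracketed is a hypothetical name and the code relies on String.prototype.matchAll (ES2020).

```javascript
// Collect the trimmed contents of every [ ... ] group in the string.
function extractBracketed(str) {
  return Array.from(str.matchAll(/\[([^\]]*)\]/g), function (m) {
    return m[1].trim(); // m[1] is the text between the brackets
  });
}

console.log(extractBracketed('[value01][value02 ][value03 ]. [value04 ]'));
// ["value01", "value02", "value03", "value04"]
```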
Regex pattern: /(\w)+/ig
Matches all words using \w (letters, digits and underscore). Whitespace, dots and square brackets don't match, so they are not returned.
What I do is create an object to hold the results as key/value pairs such as v1:'value01'. You can iterate through this object, or you can access the values directly using objRes.v1
var str = '[value01][value02 ][value03 ]. [value04 ]';
var myRe = /(\w)+/ig;
var res;
var objRes = {};
var i=1;
while ( ( res = myRe.exec(str) ) != null )
{
objRes['v'+i] = res[0];
i++;
}
console.log(objRes);

How to extract a string using JavaScript Regex?

I'm trying to extract a substring from a file with JavaScript Regex. Here is a slice from the file :
DATE:20091201T220000
SUMMARY:Dad's birthday
the field I want to extract is "Summary". Here is the approach:
extractSummary : function(iCalContent) {
/*
input : iCal file content
return : Event summary
*/
var arr = iCalContent.match(/^SUMMARY\:(.)*$/g);
return(arr);
}
function extractSummary(iCalContent) {
var rx = /\nSUMMARY:(.*)\n/g;
var arr = rx.exec(iCalContent);
return arr[1];
}
You need these changes:
Put the * inside the parentheses, as suggested above. Otherwise your matching group will contain only one character.
Get rid of the ^ and $. Without the m (multiline) flag they match only at the start and end of the full string, not at the start and end of each line. Match on explicit newlines instead.
I suppose you want the matching group (what's inside the parentheses) rather than the full array? arr[0] is the full match ("\nSUMMARY:...") and the following indexes contain the group matches.
String.match(regexp) with the g flag returns an array of the full matches only, without the groups (this is what I see in Safari on Mac), whereas RegExp.exec(string) returns the groups as well.
You need to use the m flag:
multiline; treat beginning and end characters (^ and $) as working
over multiple lines (i.e., match the beginning or end of each line
(delimited by \n or \r), not only the very beginning or end of the
whole input string)
Also put the * in the right place:
"DATE:20091201T220000\r\nSUMMARY:Dad's birthday".match(/^SUMMARY\:(.*)$/gm);
// Note the * inside the group and the m flag at the end.
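Putting the m flag and the capture group together, a null-safe version of the function might look like the sketch below. Returning null when there is no SUMMARY line is an assumption, not something the question specifies.

```javascript
// Return the text after "SUMMARY:" on its line, or null if there is no such line.
function extractSummary(iCalContent) {
  // No g flag, so exec() returns the capture group; m makes ^ and $ work per line.
  var match = /^SUMMARY:(.*)$/m.exec(iCalContent);
  return match ? match[1] : null;
}

console.log(extractSummary("DATE:20091201T220000\r\nSUMMARY:Dad's birthday\r\nEND:VEVENT"));
// "Dad's birthday"
```

Because . does not match line terminators in JavaScript, the captured text stops before a trailing \r or \n.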
Your regular expression most likely wants to be
/\nSUMMARY:(.*)$/g
A helpful little trick I like to use is to default assign on match with an array.
var arr = iCalContent.match(/\nSUMMARY:(.*)$/g) || [""]; //could also use null for empty value
return arr[0];
This way you don't get annoying type errors when you go to use arr.
This code works:
let str = "governance[string_i_want]";
let res = str.match(/[^governance\[](.*)[^\]]/g);
console.log(res);
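res will be ["string_i_want"]. Note that res is an array, so do not treat it like a string.
Be aware that [^governance\[] is a negated character class: it excludes the individual characters g, o, v, e, r, n, a, c and [, not the word "governance". The pattern therefore works for this input, but it would drop the leading characters of a wanted string that starts with any of those letters.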
You can try it out here: https://www.w3schools.com/jsref/tryit.asp?filename=tryjsref_match_regexp
Good luck.
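A capture group is a common alternative for pulling out the text between brackets, and it does not depend on which letters precede them. A minimal sketch, where betweenBrackets is a hypothetical helper name and returning null for "no brackets" is an assumption:

```javascript
// Return the text inside the first [ ... ] pair, or null if there is none.
function betweenBrackets(str) {
  var m = /\[([^\]]*)\]/.exec(str);
  return m ? m[1] : null;
}

console.log(betweenBrackets("governance[string_i_want]")); // "string_i_want"
console.log(betweenBrackets("governance[name]"));          // "name"
```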
(.*) instead of (.)* would be a start. The latter will only capture the last character on the line.
Also, no need to escape the :.
You should use this :
var arr = iCalContent.match(/^SUMMARY\:(.)*$/g);
return(arr[0]);
This is how you can parse iCal files with JavaScript:
function calParse(str) {
function parse() {
var obj = {};
while(str.length) {
var p = str.shift().split(":");
var k = p.shift(); p = p.join(":"); // rejoin with ":" so values that contain colons stay intact
switch(k) {
case "BEGIN":
obj[p] = parse();
break;
case "END":
return obj;
default:
obj[k] = p;
}
}
return obj;
}
str = str.replace(/\n /g, " ").split("\n");
return parse().VCALENDAR;
}
var example =
'BEGIN:VCALENDAR\n'+
'VERSION:2.0\n'+
'PRODID:-//hacksw/handcal//NONSGML v1.0//EN\n'+
'BEGIN:VEVENT\n'+
'DTSTART:19970714T170000Z\n'+
'DTEND:19970715T035959Z\n'+
'SUMMARY:Bastille Day Party\n'+
'END:VEVENT\n'+
'END:VCALENDAR\n'
var cal = calParse(example);
alert(cal.VEVENT.SUMMARY);
