String to dictionary in JavaScript

I have a string that I want to transform into a dictionary, the string looks like this:
const str = '::student{name="this is the name" age="21" faculty="some faculty"}'
I want to transform that string into a dictionary that looks like this:
const dic = {
"name": "this is the name",
"age": "21",
"faculty": "some faculty"
}
So the string is formatted this way: ::name{parameters...}, and it can have any parameters, not only name, faculty, and so on. How can I take any string that looks like that and transform it into a dictionary?
Also, is there a way to check that the string I'm parsing follows the structure ::name{parameters...}, so that I can throw an error when it does not?
Any help would be greatly appreciated!

Assuming that the quoted values contain only alphanumeric characters and spaces (or are empty), and that there is never a space between the key and the value in key="value", you can match with the following regular expression and then iterate over the matches to construct your desired object.
const str = '::student{name="this is the name" age="21" faculty="some faculty"}'
const matches = str.match(/[\w]+\=\"[\w\s]*(?=")/g)
const result = {}
matches.forEach(match => {
const [key, value] = match.split('="')
result[key] = value
})
console.log(result)
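The regex is composed of the following parts:
[\w]+ One or more word characters (the key)
\=\" A literal = followed by the opening double quote
[\w\s]* Zero or more word or whitespace characters (the value)
(?=") A lookahead that requires the closing double quote without including it in the match
g The global flag, so that match returns every key/value pair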
You can use https://regexr.com to experiment with your regular expressions. Depending on the string to process, you'll need to refine your regex.

This example uses exec to look for matches in the string using a regular expression.
const str = '::student{name="this is the name" age="21" faculty="some faculty"}';
const regex = /([a-z]+)="([a-z0-9 ?]+)"/g;
let match, output = {};
while (match = regex.exec(str)) {
output[match[1]] = match[2];
}
console.log(output);

If the input does not follow a consistent pattern, this is hard, but when the pattern is always the same, you can split the string:
var str = '::student{name="this is the name" age="21" faculty="some faculty"}'
console.log(createObj(str));
function createObj(myString){
if(myString.indexOf("::") !== 0 || myString.indexOf("{") === -1 || myString.indexOf("}") === -1){
console.log("format error")
return null;
}
var a = myString.split("{")[1];
var c = a.replace('{','').replace('}','');
if(c.search('= ""') !== -1){
console.log("format incorrect");
return null;
}
var d = c.split('="')
var keyValue = [];
for(var item of d){
var e = item.split('" ')
if(e.length === 1){
keyValue.push(e[0].replace('"',''));
}else{
for(var item2 of e){
keyValue.push(item2.replace('"',''));
}
}
}
var myObj = {}
if(keyValue.length % 2 === 0){
for(var i = 0; i<keyValue.length; i=i+2){
myObj[keyValue[i]] = keyValue[i+1]
}
}
return myObj;
}

You could use 2 patterns. The first pattern to match the format of the string and capture the content between the curly braces in a single capture group. The second pattern to get the key value pairs using 2 capture groups.
For the full match you can use
::\w+{([^{}]*)}
::\w+ match :: and 1+ word characters
{ Match the opening curly
([^{}]*) Capture group 1, match from opening till closing curly
} Match the closing curly
For the keys and values you can use
(\w+)="([^"]+)"
(\w+) Capture group 1, match 1+ word chars
=" Match literally
([^"]+) Capture group 2, match from an opening till closing double quote
" Match closing double quote
const str = '::student{name="this is the name" age="21" faculty="some faculty"}';
const regexFullMatch = /::\w+{([^{}]*)}/;
const regexKeyValue = /(\w+)="([^"]+)"/g;
const m = str.match(regexFullMatch);
if (m) {
const dic = Object.fromEntries(
Array.from(m[1].matchAll(regexKeyValue), v => [v[1], v[2]])
);
console.log(dic)
}
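To cover the "throw an error" part of the question, the two patterns above can be wrapped in a function. This is only a sketch: the name parseDirective is made up for illustration, the full-match pattern is anchored with ^ and $ (an assumption, the pattern above is unanchored), and the value pattern uses [^"]* instead of [^"]+ so that empty values such as age="" are accepted.

```javascript
// Hypothetical helper: parses '::name{key="value" ...}' or throws.
function parseDirective(input) {
  // Assumption: the whole string must be a single directive, hence ^ and $.
  const full = /^::(\w+)\{([^{}]*)\}$/.exec(input);
  if (!full) {
    throw new SyntaxError('Expected ::name{key="value" ...} but got: ' + input);
  }
  // Collect every key="value" pair found between the curly braces.
  const attributes = Object.fromEntries(
    Array.from(full[2].matchAll(/(\w+)="([^"]*)"/g), m => [m[1], m[2]])
  );
  return { name: full[1], attributes };
}

console.log(parseDirective('::student{name="this is the name" age="21"}'));
```

Note that this only validates the outer ::name{...} shape. Text between the braces that does not look like key="value" is silently ignored, so a stricter check would be needed if that matters.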

Related

How to split string with multiple semi colons javascript

I have a string like below
var exampleString = "Name:Sivakumar ; Tadisetti;Country:India"
I want to split above string with semi colon, so want the array like
var result = [ "Name:Sivakumar ; Tadisetti", "Country:India" ]
But as the name contains another semi colon, I am getting array like
var result = [ "Name:Sivakumar ", "Tadisetti", "Country:India" ]
Here Sivakumar ; Tadisetti value of the key Name
I just wrote the code like exampleString.split(';')... other than this not able to get an idea to proceed further to get my desired output. Any suggestions?
Logic to split: I want to split the string as array with key:value pairs
Since .split accepts regular expressions as well, you can use one that matches only semicolons followed by word characters and a : (in effect, semicolons that are followed by another key):
/;(?=\w+:)/
var exampleString = "Name:Sivakumar ; Tadisetti;Country:India";
var result = exampleString.split(/;(?=\w+:)/);
console.log(result)
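If the goal is an object rather than an array of "key:value" strings, each segment can then be split on its first colon. A minimal sketch, assuming keys consist of word characters only; the function name parsePairs is hypothetical:

```javascript
// Hypothetical helper built on the lookahead split shown above.
function parsePairs(input) {
  const result = {};
  // Split only on semicolons that are directly followed by "key:".
  for (const segment of input.split(/;(?=\w+:)/)) {
    const colon = segment.indexOf(':');
    if (colon === -1) continue; // segment without a key, skip it
    // Split on the first colon only, so values may contain ':' or ';'.
    result[segment.slice(0, colon)] = segment.slice(colon + 1);
  }
  return result;
}

console.log(parsePairs('Name:Sivakumar ; Tadisetti;Country:India'));
```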
Here is an approach in which we first split the input on ; and then join any element that has no : back onto the previous one, since it should not have been split off.
let exampleString = "Name:Sivakumar ; Tadisetti;Country:India"
let reverseSplited = exampleString.split(";").reverse();
let prevoiusString;
let regex = /[^:]+:.*/;
let result = reverseSplited.map( str => {
if(regex.test(str)) {
if(prevoiusString){
let returnValue = str + ";" + prevoiusString;
prevoiusString = null;
return returnValue
}
return str
}
prevoiusString = str;
}).filter(e=>e).reverse(); // restore the original order after processing in reverse
console.log(result);

Split and replace text by two rules (regex)

I am trying to split text by two rules:
Split by whitespace
Split words greater than 5 symbols into two separate words like (aaaaawww into aaaaa- and www)
I created a regex that can detect these rules (https://regex101.com/r/fyskB3/2) but I can't work out how to make both rules work in text.split(/REGEX/).
Currently regex - (([\s]+)|(\w{5})(?=\w))
For example initial text is hello i am markopollo and result should look like ['hello', 'i', 'am', 'marko-', 'pollo']
It would probably be easier to use .match: match up to 5 characters that aren't whitespace:
const str = 'wqerweirj ioqwejr qiwejrio jqoiwejr qwer qwer';
console.log(
str.match(/[^ ]{1,5}/g)
)
My approach would be to process the string before splitting (I'm a big fan of RegEx):
1- Search and replace all the 5 consecutive non-last characters with \1-.
The pattern (\w{5}\B) will do the trick, \w{5} will match 5 exact characters and \B will match only if the last character is not the ending character of the word.
2- Split the string by spaces.
var text = "hello123467891234 i am markopollo";
var regex = /(\w{5}\B)/g;
var processedText = text.replace(regex, "$1- ");
var result = processedText.split(" ");
console.log(result)
Hope it helps!
Something like this should work:
const str = "hello i am markopollo";
const words = str.split(/\s+/);
const CHUNK_SIZE=5;
const out = [];
for(const word of words) {
if(word.length > CHUNK_SIZE) {
let chunks = chunkSubstr(word,CHUNK_SIZE);
let last = chunks.pop();
out.push(...chunks.map(c => c + '-'),last);
} else {
out.push(word);
}
}
console.log(out);
// credit: https://stackoverflow.com/a/29202760/65387
function chunkSubstr(str, size) {
const numChunks = Math.ceil(str.length / size)
const chunks = new Array(numChunks)
for (let i = 0, o = 0; i < numChunks; ++i, o += size) {
chunks[i] = str.slice(o, o + size) // slice instead of the deprecated substr
}
return chunks
}
i.e., first split the string into words on spaces, and then find words longer than 5 chars and 'chunk' them. I popped off the last chunk to avoid adding a - to it, but there might be a more efficient way if you patch chunkSubstr instead.
regex.split doesn't work so well because it will basically remove those items from the output. In your case, it appears you want to strip the whitespace but keep the words, so splitting on both won't work.
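The same "split into words, then chunk the long ones" idea can be written more compactly with flatMap and match. This is a sketch rather than a drop-in replacement; the name splitWords and the size parameter are made up for illustration.

```javascript
// Hypothetical helper: split on whitespace, then break long words into
// chunks of at most `size` characters, hyphenating all but the last chunk.
function splitWords(text, size = 5) {
  const chunker = new RegExp('.{1,' + size + '}', 'g');
  return text
    .split(/\s+/)
    .filter(Boolean) // drop empty strings from leading/trailing whitespace
    .flatMap(word => {
      const chunks = word.match(chunker) || [];
      return chunks.map((c, i) => (i < chunks.length - 1 ? c + '-' : c));
    });
}

console.log(splitWords('hello i am markopollo'));
```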
This uses @CertainPerformance's regular expression ([^ ]{1,5}), applies regex.exec in a loop, and appends a - to every 5-character match that is followed by more of the same word.
See the demo below:
const str = 'wqerweirj ioqwejr qiwejrio jqoiwejr qwer qwer'
let regex1 = RegExp('[^ ]{1,5}', 'g')
function customSplit(targetString, regexExpress) {
let result = []
let matchItem = null
while ((matchItem = regexExpress.exec(targetString)) !== null) {
result.push(
matchItem[0] + (
matchItem[0].length === 5 && targetString[regexExpress.lastIndex] && targetString[regexExpress.lastIndex] !== ' '
? '-' : '')
)
}
return result
}
console.log(customSplit(str, regex1))
console.log(customSplit('hello i am markopollo', regex1))

Javascript Regex to split line of log with key value pairs

I have a log like
t=2016-08-03T18:47:26+0000 lvl=dbug msg="Event Received" Service=SomeService
and I want to turn it into a javascript object like
{
t: "2016-08-03T18:47:26+0000",
lvl: "dbug",
msg: "Event Received",
Service: "SomeService"
}
But I am having trouble coming up with a regex that will detect the string "Event Received" in the log line.
I want to split the log line by space but because of the string it is much more difficult.
I am trying to come up with a regex that will detect the fields and parameters so that I can isolate them and split with the equal sign.
I suggest a regex without any lookahead:
var re = /(\w+)=(?:"([^"]*)"|(\S*))/g;
The point is that the first group ((\w+)) captures the attribute name and the 2nd and 3rd are placed into a non-capturing "container" as alternative branches. Their values can be checked and then either one will be used to fill out the object.
Pattern details:
(\w+) - Group 1 (attribute name) matching 1+ word chars (from [a-zA-Z0-9_] ranges)
= - an equal sign
(?:"([^"]*)"|(\S*)) - a non-capturing "container" group matching either of the two alternatives:
"([^"]*)" - a quote, then Group 2 capturing 0+ chars other than ", and a quote
| - or
(\S*) - Group 3 capturing 0+ non-whitespace symbols.
var rx = /(\w+)=(?:"([^"]*)"|(\S*))/g;
var s = "t=2016-08-03T18:47:26+0000 lvl=dbug msg=\"Event Received\" Service=SomeService";
var obj = {}, m;
while((m=rx.exec(s))!==null) {
  // Group 2 is set for quoted values (possibly empty), group 3 for unquoted ones
  if (m[2] !== undefined) {
    obj[m[1]] = m[2];
  } else {
    obj[m[1]] = m[3];
  }
}
console.log(obj);
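For reuse across many log lines, the same pattern can be wrapped in a function. A sketch, assuming an environment with String.prototype.matchAll; the name parseLogLine is hypothetical:

```javascript
// Hypothetical helper around the pattern above.
function parseLogLine(line) {
  const pairs = /(\w+)=(?:"([^"]*)"|(\S*))/g;
  const entry = {};
  for (const m of line.matchAll(pairs)) {
    // m[2] holds a quoted value (may be the empty string), m[3] an unquoted one.
    entry[m[1]] = m[2] !== undefined ? m[2] : m[3];
  }
  return entry;
}

console.log(parseLogLine('t=2016-08-03T18:47:26+0000 lvl=dbug msg="Event Received" Service=SomeService'));
```

All values come back as strings; converting t to a Date or similar would be a separate step.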
You can use this regex to capture various name=value pairs:
/(\w+)=(.*?)(?= \w+=|$)/gm
Code:
var re = /(\w+)=(.*?)(?= \w+=|$)/gm;
var str = 't=2016-08-03T18:47:26+0000 lvl=dbug msg="Event Received" Service=SomeService';
var m;
var result = {};
while ((m = re.exec(str)) !== null) {
if (m.index === re.lastIndex)
re.lastIndex++;
result[m[1]] = m[2];
}
console.log(result);
Use this pattern:
/^t=([^ ]+) lvl=([^ ]+) msg=(.*?[a-z]") Service=(.*)$/gm
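A simpler approach is to replace every = with : and split on spaces. Note that this builds a string rather than an object, and that it also splits the quoted value "Event Received" at its space: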
var x = 't=2016-08-03T18:47:26+0000 lvl=dbug msg="Event Received" Service=SomeService';
var y = x.replace(/=/g,':').split(' ');
var z = '{'+ y+'}';
console.log(z);
http://codepen.io/nagasai/pen/oLPRAy

Get numbers and characters after 3 expression with regex

I'm using a regular expression to get the next value of a particular word in some huge text.
Example:
Money: 0,00 0,00 0,00 50,00
Currently I'm taking the first value (0,00) with the following regex:
var obj =
var text = 'HUGE TEXT HERE'
var reg = new RegExp('Money' + '.*?(\\d\\S*)');
var match = reg.exec(text);
if (match === null) {
obj[key] = '';
continue;
}
obj[key] = match[1];
output:
object.money = '0,00'
This value is dynamic and sometimes the word 'Money' changes, so I need to be able to pass the word in.
I would like to extend my regular expression to skip the next three values and then capture the one after them. Is it possible?
Thanks.
You can look for (and ignore) the three repeating numbers, then catch the last one:
(?:\d\S*\s+){3}(\d\S*)
Where:
(?:...)
means "don't include this group in the returned list of captured groups" and
{3}
means "match three of them".
We're including \s+ at the end to catch the whitespace between the numbers.
Something like (simplified):
var obj = {};
var key = 'Money';
var text = "Something something:\n" +
"Money: 1,00 2,00 3,00 3,00\n" +
"Something else";
var reg = new RegExp(key + '\\b.*?(?:\\d\\S*\\s+){3}(\\d\\S*)');
var match = reg.exec(text);
if (match === null)
obj[key] = '';
else
obj[key] = match[1];
console.log(obj[key]);
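Because the question mentions that the keyword changes, the pattern can be built from parameters. The sketch below is one way to do that; valueAfter and escapeRegExp are hypothetical names, and it assumes the key ends in a word character (otherwise the \b after it would not match).

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: return the value that follows `key` after skipping
// `skip` numeric values, or '' when nothing matches.
function valueAfter(text, key, skip = 3) {
  const pattern = new RegExp(
    escapeRegExp(key) + '\\b.*?(?:\\d\\S*\\s+){' + skip + '}(\\d\\S*)'
  );
  const match = pattern.exec(text);
  return match === null ? '' : match[1];
}

const sample = 'Something something:\nMoney: 0,00 0,00 0,00 50,00\nSomething else';
console.log(valueAfter(sample, 'Money'));
```

Passing skip = 0 gives the behaviour of the original expression from the question, which returns the first value after the key.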

Javascript get query string values using non-capturing group

Given this query string:
?cgan=1&product_cats=mens-jeans,shirts&product_tags=fall,classic-style&attr_color=charcoal,brown&attr_size=large,x-small&cnep=0
How can I extract the values from only these param types 'product_cat, product_tag, attr_color, attr_size', returning only 'mens-jeans,shirts,fall,classic-style,charcoal,brown,large,x-small'?
I tried using a non-capturing group for the param types and a capturing group for just the values, but it's returning both.
(?:product_cats=|product_tags=|attr\w+=)(\w|,|-)+
You can collect the values using
(?:product_cats|product_tags|attr\w+)=([\w,-]+)
Mind that a character class ([\w,-]+) is much more efficient than a list of alternatives ((\w|,|-)*), and we avoid the issue of capturing just the last single character.
Here is a code sample:
var re = /(?:product_cats|product_tags|attr\w+)=([\w,-]+)/g;
var str = '?cgan=1&product_cats=mens-jeans,shirts&product_tags=fall,classic-style&attr_color=charcoal,brown&attr_size=large,x-small&cnep=0';
var res = [], m;
while ((m = re.exec(str)) !== null) {
res.push(m[1]);
}
document.getElementById("res").innerHTML = res.join(",");
<div id="res"/>
You can always use a jQuery method param.
You can use following simple regex :
/&\w+=([\w,-]+)/g
You need to return the result of capture group and split them with ,.
var mystr = "?cgan=1&product_cats=mens-jeans,shirts&product_tags=fall,classic-style&attr_color=charcoal,brown&attr_size=large,x-small&cnep=0";
var myStringArray = mystr.match(/&\w+=([\w,-]+)/g);
var arrayLength = myStringArray.length - 1; // skip the last match, which is &cnep=0
var indices = [];
for (var i = 0; i < arrayLength; i++) {
  // match() with the g flag returns whole matches, so drop the "&key=" prefix first
  indices.push(myStringArray[i].split('=')[1].split(','));
}
Something like
/(?:product_cats|product_tag|attr_color|attr_size)=[^,]+/g
(?:product_cats|product_tag|attr_color|attr_size) will match product_cats or product_tag or attr_color or attr_size
= Matches an equals
[^,] Negated character class matches anything other than a ,. Basically it matches till the next ,
Test
string = "?cgan=1&product_cats=mens-jeans,shirts&product_tags=fall,classic-style&attr_color=charcoal,brown&attr_size=large,x-small&cnep=0";
matched = string.match(/(product_cats|product_tag|attr_color|attr_size)=[^,]+/g);
for (i in matched)
{
console.log(matched[i].split("=")[1]);
}
will produce output as
mens-jeans
charcoal
large
There is no need for regular expressions. just use splits and joins.
var s = '?cgan=1&product_cats=mens-jeans,shirts&product_tags=fall,classic-style&attr_color=charcoal,brown&attr_size=large,x-small&cnep=0';
var query = s.split('?')[1],
pairs = query.split('&'),
allowed = ['product_cats', 'product_tags', 'attr_color', 'attr_size'],
results = [];
$.each(pairs, function(i, pair) {
var key_value = pair.split('='),
key = key_value[0],
value = key_value[1];
if (allowed.indexOf(key) > -1) {
results.push(value);
}
});
console.log(results.join(','));
($.each is from jQuery, but can easily be replaced if jQuery is not around)
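In environments that provide the built-in URLSearchParams class (modern browsers and Node.js), the splitting can be left to the platform. A sketch, with extractValues and isWanted as hypothetical names; the key filter below is an assumption based on the param types listed in the question.

```javascript
// Hypothetical helper: collect the values of the wanted query parameters.
function extractValues(queryString, isWanted) {
  // URLSearchParams ignores a leading "?" and decodes keys and values.
  const params = new URLSearchParams(queryString);
  const values = [];
  for (const [key, value] of params) {
    if (isWanted(key)) values.push(value);
  }
  return values.join(',');
}

const query = '?cgan=1&product_cats=mens-jeans,shirts&product_tags=fall,classic-style&attr_color=charcoal,brown&attr_size=large,x-small&cnep=0';
console.log(extractValues(query, key => /^(?:product_cats|product_tags|attr_\w+)$/.test(key)));
```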
