I have some sample code, but I am looking for the most efficient solution. Sure, I can loop twice through the array and the string, but I was wondering whether I could do a prefix search character by character to identify the elements to be replaced. My code does not really do any of that, since my regex is broken.
const dict = {
'\\iota': 'ι',
'\\nu': 'ν',
'\\omega': 'ω',
'\\\'e': 'é',
'^e': 'ᵉ'
}
const value = 'Ko\\iota\\nu\\omega L\'\\\'ecole'
const replaced = value.replace(/\b\w+\b/g, ($m) => {
console.log($m)
const key = dict[$m]
console.log(key)
return (typeof key !== 'undefined') ? key : $m
})
Your keys are not fully word characters, so \b\w+\b will not match them. Construct the regex from the keys instead:
// https://stackoverflow.com/questions/3446170/escape-string-for-use-in-javascript-regex
const escapeRegExp = string => string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
const dict = {
'\\iota': 'ι',
'\\nu': 'ν',
'\\omega': 'ω',
'\\\'e': 'é',
'^e': 'ᵉ'
}
const value = 'Ko\\iota\\nu\\omega L\'\\\'ecole'
const pattern = new RegExp(Object.keys(dict).map(escapeRegExp).join('|'), 'g');
const replaced = value.replace(pattern, match => dict[match]);
console.log(replaced);
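One caveat with an alternation built from keys: the regex tries alternatives from left to right, so a key that is a prefix of another key can shadow the longer one. A minimal sketch of a guard, assuming a hypothetical extra key '\\o' (not in the original dictionary) and a hypothetical helper name buildReplacer, sorts the keys longest first before joining them:

```javascript
// Escape characters that have a special meaning in a regular expression.
const escapeRegExp = s => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical helper: builds a one-pass replacer from a dictionary.
function buildReplacer(dict) {
  // Longest keys first, so '\\omega' is tried before its prefix '\\o'.
  const keys = Object.keys(dict).sort((a, b) => b.length - a.length);
  const pattern = new RegExp(keys.map(escapeRegExp).join('|'), 'g');
  return text => text.replace(pattern, match => dict[match]);
}

// '\\o' is an assumed extra key, added only to show the prefix problem.
const replaceSymbols = buildReplacer({
  '\\o': 'ø',
  '\\iota': 'ι',
  '\\nu': 'ν',
  '\\omega': 'ω'
});

console.log(replaceSymbols('Ko\\iota\\nu\\omega \\o')); // Koινω ø
```

Without the sort, '\\o' would match first inside '\\omega' and leave 'mega' behind.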
I have a string like,
const string =
"DEVICE_SIZE IN ('036','048','060','070') AND DEVICE_VOLTAGE IN ('1','3') AND NOT DEVICE_DISCHARGE_AIR IN ('S') AND NOT DEVICE_REFRIGERANT_CIRCUIT IN ('H','C')";
From this, I need to map the respective key and value like, DEVICE_SIZE: ["036", "048", "060", "070"]
Current Result:
const string =
"DEVICE_SIZE IN ('036','048','060','070') AND DEVICE_VOLTAGE IN ('1','3') AND NOT DEVICE_DISCHARGE_AIR IN ('S') AND NOT DEVICE_REFRIGERANT_CIRCUIT IN ('H','C')";
const res = string.split('IN ');
const regExp = /\(([^)]+)\)/;
const Y = 'AND';
const data = res.map((item) => {
if (regExp.exec(item)) {
return {
[item.slice(item.indexOf(Y) + Y.length)]: regExp.exec(item)[1],
};
}
});
console.log('data ', data);
Expected Result:
[
{ "DEVICE_SIZE": ["036", "048", "060", "070"] },
{ "DEVICE_VOLTAGE": ["1", "3"] },
{ "NOT DEVICE_DISCHARGE_AIR": ["S"] },
{ "NOT DEVICE_REFRIGERANT_CIRCUIT": ["H", "C"] },
];
My attempt under Current Result does not produce this output. Could you please help me achieve the expected result given above?
Note: I am trying the above to achieve the end result mentioned in my previous question
How to get valid object from the string matching respective array?
You could make use of a single regex with 2 capture groups to match the key and the value.
((?:\bNOT\s+)?\w+)\s+IN\s+\('([^()]*)'\)
The pattern matches:
( Capture group 1
(?:\bNOT\s+)? Optionally match the word NOT followed by 1+ whitespace chars
\w+ Match 1 or more word characters
) Close group
\s+IN\s+ Match the word IN between whitespace characters
\(' Match ('
([^()]*) Capture group 2, match 0+ occurrences of any char except ( and )
'\) Match ')
To create the dynamic keys for the object, you can use computed property names in an object initializer, and you can split on ',' (including the quotes) to create the resulting array for the value.
const regex = /((?:\bNOT\s+)?\w+)\s+IN\s+\('([^()]*)'\)/g;
const string = "DEVICE_SIZE IN ('036','048','060','070') AND DEVICE_VOLTAGE IN ('1','3') AND NOT DEVICE_DISCHARGE_AIR IN ('S') AND NOT DEVICE_REFRIGERANT_CIRCUIT IN ('H','C')";
const data = Array.from(
string.matchAll(regex), m =>
({
[m[1]]: m[2].split("','")
})
);
console.log(data);
You could have split by AND first and then split again with IN to separate the key and the value part.
This would also work:
const string =
"DEVICE_SIZE IN ('036','048','060','070') AND DEVICE_VOLTAGE IN ('1','3') AND NOT DEVICE_DISCHARGE_AIR IN ('S') AND NOT DEVICE_REFRIGERANT_CIRCUIT IN ('H','C')";
const output = string
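// split on the keywords including their surrounding spaces, so that names containing "AND" or "IN" stay intact
.split(" AND ")
.map((item) => item.split(" IN ").map((text) => text.trim()))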
.map(([key, value]) => ({
[key]: value.replace(/[\(\)\']/g, "").split(","),
}));
console.log(output);
You can use Object.fromEntries, which transforms a list of key-value tuples [string, any] into a single object with those keys and values:
const data = Object.fromEntries(res.map((item) => {
if (regExp.exec(item)) {
return [item.slice(item.indexOf(Y) + Y.length).trim(), regExp.exec(item)[1].split(',').map(value => value.slice(1, -1))];
}
return undefined;
}).filter(Boolean));
and then continue from there.
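Note that the snippet above reuses res, regExp and Y from the question, where splitting on 'IN ' leaves each piece holding the values of one condition next to the key of the following one. A self-contained sketch that pairs each key with its own list could go through matchAll instead. The pattern is a variation of the one from the earlier answer and is an assumption here, not part of this answer:

```javascript
const string =
  "DEVICE_SIZE IN ('036','048','060','070') AND DEVICE_VOLTAGE IN ('1','3') AND NOT DEVICE_DISCHARGE_AIR IN ('S') AND NOT DEVICE_REFRIGERANT_CIRCUIT IN ('H','C')";

// Group 1: the key, with an optional leading NOT. Group 2: the raw list inside the parentheses.
const conditionRegex = /((?:\bNOT\s+)?\w+)\s+IN\s+\(([^()]*)\)/g;

const data = Object.fromEntries(
  Array.from(string.matchAll(conditionRegex), ([, key, list]) => [
    key,
    // Drop the surrounding single quotes from every value.
    list.split(',').map(value => value.trim().slice(1, -1))
  ])
);

console.log(data);
```

This produces one object keyed by condition name rather than the array of single-key objects from the expected result, which is what Object.fromEntries is suited for.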
I have a string that I want to transform into a dictionary, the string looks like this:
const str = '::student{name="this is the name" age="21" faculty="some faculty"}'
I want to transform that string into a dictionary that looks like this:
const dic = {
"name": "this is the name",
"age": "21",
"faculty": "some faculty"
}
So the string is formatted this way: ::name{parameters...}, and it can have any parameters, not only name, faculty, ... How can I take any string that looks like that and transform it into a dictionary?
Also, is there a way to check that the string I'm parsing follows this structure ::name{parameters...}, so that I can throw an error when it does not?
Any help would be greatly appreciated!
Assuming that the values contain only alphanumeric characters and spaces (or are empty) within the double quotes, and that there is never a space between the key and the value in key="value", you can match with the following regular expression and then iterate over the matches to construct your desired object.
const str = '::student{name="this is the name" age="21" faculty="some faculty"}'
const matches = str.match(/[\w]+\=\"[\w\s]*(?=")/g)
const result = {}
matches.forEach(match => {
const [key, value] = match.split('="')
result[key] = value
})
console.log(result)
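The regex is composed of the following parts:
[\w]+ Match 1 or more word characters (the key)
\=\" Match =" literally
[\w\s]* Match 0 or more word or whitespace characters (the value)
(?=") Lookahead asserting that a double quote follows, without including it in the match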
You can use https://regexr.com to experiment with your regular expressions. Depending on the string to process, you'll need to refine your regex.
This example uses exec to look for matches in the string using a regular expression.
const str = '::student{name="this is the name" age="21" faculty="some faculty"}';
const regex = /([a-z]+)="([a-z0-9 ?]+)"/g;
let match, output = {};
while (match = regex.exec(str)) {
output[match[1]] = match[2];
}
console.log(output);
If there is no consistent pattern, this is hard, but when the pattern is always the same, you can split the string:
var str = '::student{name="this is the name" age="21" faculty="some faculty"}'
console.log(createObj(str));
function createObj(myString){
if(!myString.startsWith("::") || myString.indexOf("{") === -1 || myString.indexOf("}") === -1){
console.log("format error")
return null;
}
var a = myString.split("{")[1];
var c = a.replace('{','').replace('}','');
if(c.search('= ""') !== -1){
console.log("format incorrect");
return null;
}
var d = c.split('="')
var keyValue = [];
for(var item of d){
var e = item.split('" ')
if(e.length === 1){
keyValue.push(e[0].replace('"',''));
}else{
for(var item2 of e){
keyValue.push(item2.replace('"',''));
}
}
}
var myObj = {}
if(keyValue.length % 2 === 0){
for(var i = 0; i<keyValue.length; i=i+2){
myObj[keyValue[i]] = keyValue[i+1]
}
}
return myObj;
}
You could use 2 patterns. The first pattern to match the format of the string and capture the content between the curly braces in a single capture group. The second pattern to get the key value pairs using 2 capture groups.
For the full match you can use
::\w+{([^{}]*)}
::\w+ match :: and 1+ word characters
{ Match the opening curly
([^{}]*) Capture group 1, match from opening till closing curly
} Match the closing curly
For the keys and values you can use
(\w+)="([^"]+)"
(\w+) Capture group 1, match 1+ word chars
=" Match literally
([^"]+) Capture group 2, match from an opening till closing double quote
" Match closing double quote
const str = '::student{name="this is the name" age="21" faculty="some faculty"}';
const regexFullMatch = /::\w+{([^{}]*)}/;
const regexKeyValue = /(\w+)="([^"]+)"/g;
const m = str.match(regexFullMatch);
if (m) {
const dic = Object.fromEntries(
Array.from(m[1].matchAll(regexKeyValue), v => [v[1], v[2]])
);
console.log(dic)
}
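Since the question also asks for an error when the string does not follow the structure, the two patterns can be wrapped in one function. This is a minimal sketch, assuming the whole string must match (hence the added ^ and $ anchors) and that empty values are allowed; the function name parseDirective and the shape of the returned object are hypothetical:

```javascript
// Hypothetical helper: validates '::name{key="value" ...}' and returns its parts.
function parseDirective(input) {
  // Anchored so that leading or trailing text makes the input invalid.
  const full = /^::(\w+)\{([^{}]*)\}$/.exec(input);
  if (!full) {
    throw new SyntaxError('Expected ::name{key="value" ...} but got: ' + input);
  }
  // Collect every key="value" pair found between the curly braces.
  const pairs = Array.from(
    full[2].matchAll(/(\w+)="([^"]*)"/g),
    m => [m[1], m[2]]
  );
  return { name: full[1], params: Object.fromEntries(pairs) };
}

const parsedStudent = parseDirective(
  '::student{name="this is the name" age="21" faculty="some faculty"}'
);
console.log(parsedStudent.name);   // student
console.log(parsedStudent.params); // { name: 'this is the name', age: '21', faculty: 'some faculty' }
```

Text between the braces that is not a key="value" pair is silently ignored in this sketch; a stricter check would need a tighter pattern for group 2.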
How can I convert a string with commas to a string with hashtags? Could anyone please help?
What I have tried so far is below. With map it gives me the individual elements. How do I make it a single string with # prepended to each item in the array?
const tags = 'caramel,vanilla,chocolate';
const splitArray = tags.split(",").map(val=> console.log('#'+val));
You can use .replace() with a global regex instead, and add a # at the beginning.
Extend with template literals, using # in the first character's place:
const tags = 'caramel,vanilla,chocolate';
const result = `#${tags.replace(/,/g, '#')}`;
console.log(result);
Or as Patrick suggested in the comment section with /^|,/g in the RegExp:
const tags = 'caramel,vanilla,chocolate';
const result = tags.replace(/^|,/g, '#');
console.log(result);
You can use .join(''):
const tags = 'caramel,vanilla,chocolate';
const result = tags.split(",").map(val=> "#"+val).join('');
console.log(result);
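If the input may contain spaces after the commas or empty entries, the split/map/join chain is easy to extend. A minimal sketch, assuming such input should be cleaned up rather than rejected (the function name toHashtags is hypothetical):

```javascript
// Hypothetical helper: turns 'a, b,,c' into '#a#b#c'.
function toHashtags(tags) {
  return tags
    .split(',')
    .map(tag => tag.trim())   // drop whitespace around each tag
    .filter(Boolean)          // skip empty entries such as the one in 'a,,b'
    .map(tag => '#' + tag)
    .join('');
}

console.log(toHashtags('caramel, vanilla,,chocolate')); // #caramel#vanilla#chocolate
```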
Using reduce:
const tags = 'caramel,vanilla,chocolate';
const splitArray = tags.split(",").reduce( (res,val) => (res+'#'+val) ,'');
console.log(splitArray);
I believe that basic replace will make it faster.
Check this:
const tags = 'caramel,vanilla,chocolate';
var hastags = (tags, sp) => {
// replaceAll is needed here: replace with a string pattern only replaces the first occurrence
return typeof tags === "string" && tags.trim() ? "#" + tags.replaceAll(sp || ",", "#") : "";
};
console.log(hastags(tags, ","));
I need to parse an email template for custom variables that occur between pairs of dollar signs, e.g:
$foo$bar$baz$foo$bar$baz$wtf
So I would want to start by extracting 'foo' above, since it comes between the first pair (1st and 2nd) of dollar signs. And then skip 'bar' but extract 'baz' as it comes between the next pair (3rd and 4th) of dollar signs.
I was able to accomplish this with split and filter as below, but am wondering, if there's a way to accomplish the same with a regular expression instead? I presume some sort of formal parser, recursive or otherwise, could be used, but that would seem like overkill in my opinion
const body = "$foo$bar$baz$foo$bar$baz$wtf";
let delimitedSegments = body.split('$');
if (delimitedSegments.length % 2 === 0) {
// discard last segment when length is even since it won't be followed by the delimiter
delimitedSegments.pop();
}
const alternatingDelimitedValues = delimitedSegments.filter((segment, index) => {
return index % 2;
});
console.log(alternatingDelimitedValues);
OUTPUT: [ 'foo', 'baz', 'bar' ]
Code also at: https://repl.it/#dexygen/findTextBetweenDollarSignDelimiterPairs
Just match the delimiter twice in the regexp
const body = "$foo$bar$baz$foo$bar$baz$wtf";
const result = body.match(/\$[^$]*\$/g).map(s => s.replace(/\$/g, ''));
console.log(result);
You could use this regex /\$\w+\$/g to get the expected output:
let regex = /\$\w+\$/g;
let str = '$foo$bar$baz$foo$bar$baz$wtf';
let result = str.match(regex).map( item => item.replace(/\$/g, ''));
console.log(result);
You can use capturing group in the regex.
const str1 = '$foo$bar$baz$foo$bar$baz$wtf';
const regex1 = /\$(\w+)\$/g;
const str2 = '*foo*bar*baz*foo*bar*baz*wtf';
const regex2 = /\*(\w+)\*/g;
// matchAll yields every match with its capture groups; m[1] is the text between the delimiters
const find = (str, regex) =>
Array.from(str.matchAll(regex), m => m[1]);
console.log('delimiters($)', JSON.stringify(find(str1, regex1)));
console.log('delimiters(*)', JSON.stringify(find(str2, regex2)));
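Both $ and * had to be escaped by hand in the patterns above. A small sketch that builds the pattern from any single-character delimiter, assuming the delimiter is exactly one character (the helper name betweenPairs is hypothetical):

```javascript
// Escape characters that have a special meaning in a regular expression.
const escapeRegExp = s => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical helper: returns the text between successive pairs of `delimiter`.
// Assumes `delimiter` is a single character.
function betweenPairs(text, delimiter) {
  const d = escapeRegExp(delimiter);
  // delimiter, then anything that is not the delimiter (captured), then delimiter
  const regex = new RegExp(`${d}([^${d}]*)${d}`, 'g');
  return Array.from(text.matchAll(regex), m => m[1]);
}

console.log(betweenPairs('$foo$bar$baz$foo$bar$baz$wtf', '$')); // [ 'foo', 'baz', 'bar' ]
console.log(betweenPairs('*foo*bar*baz*foo*bar*baz*wtf', '*')); // [ 'foo', 'baz', 'bar' ]
```

Because each match consumes its closing delimiter, the segment after it ('bar' after the first pair) is skipped, which is the alternating behaviour asked for in the question.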
Is there a way to split a CSV string with JavaScript where the separator can also occur as an escaped value? Other regex implementations solve this problem with a lookbehind, but since JavaScript does not support lookbehind, I wonder how I could accomplish this neatly using a regular expression.
A csv line might look like this
"This is\, a value",Hello,4,'This is also\, possible',true
This must be split into (strings containing)
[0] => "This is\, a value"
[1] => Hello
[2] => 4
[3] => 'This is also\, possible'
[4] => true
Instead of trying to split you can try a global match for all that is not a , with this pattern:
/"[^"]+"|'[^']+'|[^,]+/g
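Applied to the sample line from the question, a minimal sketch of this global match looks as follows. String.raw is used only so the backslashes can be written as they appear in the CSV:

```javascript
// The sample line from the question, with its backslashes kept literally.
const line = String.raw`"This is\, a value",Hello,4,'This is also\, possible',true`;

// Quoted chunks are matched whole; anything else is a run of non-comma characters.
const fields = line.match(/"[^"]+"|'[^']+'|[^,]+/g) || [];

console.log(fields.length); // 5
console.log(fields);
```

Two limitations of this sketch: empty fields (as in a,,b) are dropped, and an escaped comma in an unquoted field would still split it.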
For example, you can use this regex:
(.*?[^\\])(,|$)
The regex takes everything (.*?) up to the first comma that does not have a \ in front of it, or up to the end of the line.
Here's some code that converts CSV to JSON (assuming the first row is the property names). You can take the first part (array2d) and do other things with it very easily.
// split rows by \r\n. Not sure if all csv has this, but mine did
const rows = rawCsvFile.split("\r\n");
// find all commas, or chunks of text in quotes. If not in quotes, consider it a split point
const splitPointsRegex = /"(""|[^"])+?"|,/g;
const array2d = rows.map((row) => {
let lastPoint = 0;
const cols: string[] = [];
let match: RegExpExecArray | null;
while ((match = splitPointsRegex.exec(row)) !== null) {
if (match[0] === ",") {
cols.push(row.substring(lastPoint, match.index));
lastPoint = match.index + 1;
}
}
cols.push(row.slice(lastPoint));
// remove leading commas, wrapping quotes, and unneeded \r
return cols.map((datum) =>
datum.replace(/^,?"?|"$/g, "")
.replace(/""/g, `\"`)
.replace(/\r/g, "")
);
})
// assuming the first row is the prop names, create an array of objects with the prop names mapped to the given values
const out = [];
const propsRow = array2d[0];
array2d.forEach((row, i) => {
if (i === 0) { return; }
const addMe: any = {};
row.forEach((datum, j) => {
let parsedData: any;
if (isNaN(Number(datum)) === false) {
parsedData = Number(datum);
} else if (datum === "TRUE") {
parsedData = true;
} else if (datum === "FALSE") {
parsedData = false;
} else {
parsedData = datum;
}
addMe[propsRow[j]] = parsedData;
});
out.push(addMe);
});
console.log(out);
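Lookbehind assertions were added to JavaScript in ES2018, so this only works in engines that support them (at the time of writing that meant Chrome and Edge, but not Firefox):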
"abc\\,cde,efg".split(/(?<!\\),/) will result in ["abc\,cde", "efg"].
You will need to remove all (unescaped) escapes in a second step.
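A minimal sketch of both steps together, assuming the engine supports lookbehind and that a backslash only ever escapes the single character that follows it:

```javascript
const fields = 'abc\\,cde,efg'
  .split(/(?<!\\),/)                            // step 1: split on commas not preceded by a backslash
  .map(field => field.replace(/\\(.)/g, '$1')); // step 2: drop the escaping backslashes

console.log(fields); // [ 'abc,cde', 'efg' ]
```

An escaped backslash directly before a separator (\\,) is not handled by this sketch: the lookbehind only checks one character back, so that comma would be treated as escaped.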