Find inner to outer match of element by applying regex - javascript

I am trying to implement a replacement mechanism for strings, similar to prepared statements, that is evaluated dynamically in JavaScript. I have replacements like
[{username:"Max",age:10}]
For example, given the input string (username) is (age), a simple find-and-replace by attribute name and value is easy.
However, I want something more advanced, where parentheses are identified and evaluated from the inner to the outer, e.g. for the input:
[{username:"Max",age:10,myDynamicAttribute:"1",label1:'awesome', label2:'ugly'}]
and string
(username) is (age) and (label(myDynamicAttribute)). In the first iteration of replacements the string should become
(username) is (age) and (label1)
and in the second, Max is 10 and awesome. Is there any tool or pattern that I can use to evaluate the inner parentheses first and then the outer ones? I tried regexes, but I wasn't able to create one that matches the inner parentheses first and then the outer.

You could tokenise the string and use a recursive replacer that traverses the tokens in one pass. If the text within parentheses does not match an object property, it is left as it is. When parentheses occur in a string that is retrieved from the object, they are taken as literals, and no attempt is made to perform a lookup on them again.
function interpolate(encoded, lookup) {
const tokens = encoded.matchAll(/[^()]+|./g);
function* dfs(end="") {
while (true) {
const {value, done} = tokens.next();
if (value == end || done) return;
if (value != "(") yield value;
else {
const key = [...dfs(")")].join("");
yield lookup[key] ?? `(${key})`;
}
}
}
return [...dfs()].join("");
}
// Example run
const lookup = {
username: "Max",
age: 10,
myDynamicAttribute: "1",
label1019: 'awesome(really)', // These parentheses are treated as literals
really: "not", // ...so that this will not be retrieved
label2: 'ugly',
};
const str = "1) (username) is (age) (uh!) and (label(myDynamicAttribute)0(myDynamicAttribute)9)"
const res = interpolate(str, lookup);
console.log(res);

We can write a regular expression that finds a parenthesized expression which contains no internal parentheses, use the expression's internals as a key for our data object, replace the whole expression with that value, and then recur. We would stop when the string contains no such parenthesized expressions, and return the string intact.
Here's one way:
const interpolate = (data, str, parens = str .match (/\(([^\(\)]+)\)/)) =>
parens ? interpolate (data, str. replace (parens [0], data [parens [1]])) : str
const data = {username: 'Max', age: 10, myDynamicAttribute: '1', label1: 'awesome', label2: 'ugly'}
const str = `(username) is (age) and (label(myDynamicAttribute))`
console .log (interpolate (data, str))
This would lead to a sequence of recursive calls with these strings:
"(username) is (age) and (label(myDynamicAttribute))",
"Max is (age) and (label(myDynamicAttribute))",
"Max is 10 and (label(myDynamicAttribute))",
"Max is 10 and (label1)",
"Max is 10 and awesome"

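As a variation on the regex approach above, here is a minimal sketch that resolves the innermost groups with a loop instead of recursion and leaves unknown keys untouched rather than inserting "undefined". The names interpolateInnerFirst and maxPasses are hypothetical, not part of either answer. Unlike the tokenising answer, parentheses that appear inside a looked-up value are evaluated again on the next pass, which is why the number of passes is capped.

```javascript
// Hypothetical helper: resolves innermost "(key)" groups pass by pass.
function interpolateInnerFirst(template, data, maxPasses = 100) {
  const innermost = /\(([^()]+)\)/g; // a group with no nested parentheses
  let current = template;
  for (let pass = 0; pass < maxPasses; pass++) {
    // Replace known keys; keep the whole "(key)" text when the key is unknown.
    const next = current.replace(innermost, (whole, key) =>
      Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : whole
    );
    if (next === current) break; // nothing left to resolve
    current = next;
  }
  return current;
}

const data = {username: "Max", age: 10, myDynamicAttribute: "1", label1: "awesome", label2: "ugly"};
console.log(interpolateInnerFirst("(username) is (age) and (label(myDynamicAttribute))", data));
// Max is 10 and awesome
console.log(interpolateInnerFirst("(username) says (uh!)", data));
// Max says (uh!)
```
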
Related

How to replace all substrings enclosed by {} in a string?

I have to address an algo problem with vanilla JS.
I have a string:
let stringExample = "my {cat} has {2} ears";
And I want to replace it (in order) with an array of replacements
const replaceArray = ["dog", "4"];
So that after the replacement, the stringExample should be:
"my dog has 4 ears"
One approach that can be used is as follows:
// a sample of Strings and their corresponding replacement-arrays:
let string1 = "my {cat} has {2} ears",
arr1 = ["dog", "4"],
string2 = "They call me ~Ishmael~",
arr2 = ['Susan'],
string3 = "It is a ##truth## ##universally acknowledged##...",
arr3 = ['questionable assertion', 'oft cited'];
// here we use a named Arrow function which takes three arguments:
// haystack: String, the string which features the words/characters to
// be replaced,
// needles: Array of Strings with which to replace the identified
// groups in the 'haystack',
// delimiterPair: Array of Strings, these strings represent the characters with
// which the words/characters to be replaced may be identified;
// the first String (index 0) is the character marking the beginning
// of the capture group, and the second (index 1) indicates the
// end of the captured group; this is supplied with the default
// curly-braces:
const replaceWords = (haystack, needles, delimiterPair = ['{', '}']) => {
// we use destructuring assignment, to assign the string at
// index 0 to the 'start' variable, the string at index 1 to
// the 'end' variable; if no argument is provided the default
// curly-braces are used/assigned:
const [start, end] = delimiterPair,
// here we construct a regular expression, using a template
// string which interpolates the variables within the string;
// the regular expression is composed of:
// 'start' (eg: '{')
// .+ : any character that appears one or more times,
// ? : lazy quantifier so the expression matches the
// shortest possible string,
// 'end' (eg: '}'),
// matched with the 'g' (global) flag to replace all
// matches within the supplied string.
// this gives a regular expression of: /{.+?}/g
regexp = new RegExp(`${start}.+?${end}`, 'g')
// here we compose a String using the template literal, to
// interpolate the 'haystack' variable and also a tab-character
// concatenated with the result returned by String.prototype.replace()
// we use an anonymous function to supply the replacement-string:
//
return `"${haystack}":\t` + haystack.replace(regexp, () => {
// here we remove the first element from the array of replacements
// provided to the function, and return that to the string as the
// replacement:
return needles.shift();
})
}
console.log(replaceWords(string1, arr1));
console.log(replaceWords(string2, arr2, ['~', '~']));
console.log(replaceWords(string3, arr3, ['##', '##']));
This has no sanity checks at all, nor have I explored to find any edge-cases. If at all possible I would seriously recommend using an open source – and well-tested, well-proofed – framework or templating library.
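Two of the edge cases hinted at above can be sketched as follows: delimiters that are regex metacharacters (such as parentheses) need escaping before they go into the RegExp constructor, and needles.shift() empties the caller's array. This is only an illustration under those assumptions; the names escapeRegExp and fillPlaceholders are hypothetical.

```javascript
// Escape characters that have a special meaning inside a regular expression.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Hypothetical variant of replaceWords: escapes the delimiters and reads the
// replacements by index instead of mutating the array with shift().
function fillPlaceholders(haystack, needles, [start, end] = ["{", "}"]) {
  const pattern = new RegExp(`${escapeRegExp(start)}.+?${escapeRegExp(end)}`, "g");
  let index = 0;
  // When the replacements run out, the placeholder is left as it is.
  return haystack.replace(pattern, (placeholder) =>
    index < needles.length ? String(needles[index++]) : placeholder
  );
}

const replacements = ["dog", "4"];
console.log(fillPlaceholders("my {cat} has {2} ears", replacements));
// my dog has 4 ears
console.log(fillPlaceholders("sum (a) + (b)", ["1"], ["(", ")"]));
// sum 1 + (b)
```
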

Processing and replacing text inside double curly braces

I have a URL string:
var url = "https://url.com/{{query}}/foo/{{query2}}";
I have a line of code that is able to take in a string, then get an array of all the queries inside the braces:
var queries = String(url).match(/[^{\}]+(?=})/g);
Returns:
queries = ['query', 'query2']
I have a function, parse(queries), which processes these queries and returns a list of their results:
results = ['resultOfQuery', 'resultOfQuery2']
I want to be able to take this list, and then replace the queries in the URL string with their results. The final result of this example would be:
url = https://url.com/resultOfQuery/foo/resultOfQuery2
I have two separate problems:
The regex in the String.match line of code only accounts for one set of curly braces, {something}. How can I modify it to look for a set of double curly braces, {{something}}?
I already have the array of results. What is the best way to do the string replacement so that the queries and each of their accompanying set of double braces are replaced with their corresponding result?
You can use replace with the following pattern:
{{(.+?)}}
{{ - matches a literal {{
(.+?) - lazily matches one or more characters other than a newline and captures them in group 1
}} - matches a literal }}
let url = "https://url.com/{{query}}/foo/{{query2}}"
let result = {'query': 'queryResult1', 'query2':'queryResult2' }
let replaceDoubleBraces = (str,result) =>{
return str.replace(/{{(.+?)}}/g, (_,g1) => result[g1] || g1)
}
console.log(replaceDoubleBraces(url,result))
Note: I am using result as an object here so that it is easy to look up replacement values. If you can change your parse function, consider returning an object from it.
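If parse has to keep returning an array, one possible sketch is to pair each extracted query with the result at the same position and replace from that lookup. The results array below is a stand-in for the output of parse(queries), and the names byQuery and filled are hypothetical.

```javascript
const url = "https://url.com/{{query}}/foo/{{query2}}";
const doubleBraces = /\{\{(.+?)\}\}/g;

// Collect the text inside each pair of double braces, in order.
const queries = [...url.matchAll(doubleBraces)].map((m) => m[1]);

// Stand-in for parse(queries); assumed to return results in the same order.
const results = ["resultOfQuery", "resultOfQuery2"];

// Pair each query with its result so the replacement is a simple lookup.
const byQuery = new Map(queries.map((q, i) => [q, results[i]]));

const filled = url.replace(doubleBraces, (whole, q) =>
  byQuery.has(q) ? byQuery.get(q) : whole
);
console.log(filled);
// https://url.com/resultOfQuery/foo/resultOfQuery2
```
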
A generalized solution that also works with nested objects:
function replaceText(text, obj, start = '{{', end = '}}') {
return text.replace(new RegExp(`${start}(.+?)${end}`, 'g'), (_, part) => {
return part.split('.')
.reduce((o, k) => (
o || {}
)[k], obj);
});
}
console.log(replaceText(
'Hello my name is {{name.first}} {{name.last}}, age: {{age}}',
{
name: {
first: 'first', last: 'last'
},
age: 20,
}
));
To use characters that have a special meaning in regular expressions as delimiters, put a double backslash \\ before each such character. This is needed because the function above builds the regular expression from a string, as in new RegExp('\\^(.+?)\\^', 'g'), where the backslash itself must be escaped. In a regular expression literal such as /\^(.+?)\^/g, a single backslash is enough.
For example
( => \\(
) => \\)
(( => \\(\\(
)) => \\)\\)
^ => \\^

why condition is always true in javascript?

Could you please tell me why my condition is always true? I am trying to validate a value using a regex. I have a few conditions:
Name should not contain test "text"
Name should not contain three consecutive characters, for example "abc", "pqr", "xyz"
Name should not contain the same character three times, for example "aaa", "ccc", "zzz"
This is what I tried:
https://jsfiddle.net/aoerLqkz/2/
var val = 'ab dd'
if (/test|[^a-z]|(.)\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz/i.test(val)) {
alert( 'match')
} else {
alert( 'false')
}
I tested my code with the following strings and got an unexpected result:
input string "abc": output is fine: "match"
input string "aaa": output is fine: "match"
input string "aa a": output is "match". Why does it match? There is a space between the characters.
input string "sa c": output is "match". Why does it match? The characters are different and there is a space between them.
The string sa c includes a space, and the pattern [^a-z] (anything other than a to z) matches the space.
Possibly you want to use ^ and $ so your pattern also matches the start and end of the string instead of looking for a match anywhere inside it.
there is space between them why it matched ????
Because of the [^a-z] part of your regular expression, which matches the space:
> /[^a-z]/i.test('aa a');
true
The issue is the [^a-z]. This means that any string that has a non-letter character anywhere in it will be a match. In your example, it is matching the space character.
The solution? Simply remove |[^a-z]. Without it, your regex meets all three criteria.
test checks if the value contains the word 'test'.
abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz checks if the value contains three sequential letters.
(.)\1\1 checks if any character is repeated three times.
Complete regex:
/test|(.)\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz/i
I find it helpful to use a regex tester, like https://www.regexpal.com/, when writing regular expressions.
NOTE: I am assuming that the second criterion actually means "three consecutive letters", not "three consecutive characters" as it is written. If that is not true, then your regex doesn't meet the second criterion, since it only checks for three consecutive letters.
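As a quick check of the corrected pattern, the sketch below builds the same alternation of three-letter runs programmatically and tests it against the inputs from the question. The names runs, forbidden and isRejected are hypothetical.

```javascript
const letters = "abcdefghijklmnopqrstuvwxyz";

// Build "abc", "bcd", ..., "xyz" instead of typing the alternation by hand.
const runs = [];
for (let i = 0; i + 3 <= letters.length; i++) {
  runs.push(letters.slice(i, i + 3));
}

// Same three checks as the complete regex above: the word "test",
// any character repeated three times in a row, or three sequential letters.
const forbidden = new RegExp(`test|(.)\\1\\1|${runs.join("|")}`, "i");
const isRejected = (name) => forbidden.test(name);

for (const name of ["abc", "aaa", "aa a", "sa c", "ab dd", "mytestname"]) {
  console.log(JSON.stringify(name), isRejected(name) ? "match" : "false");
}
// "abc" match, "aaa" match, "aa a" false, "sa c" false, "ab dd" false, "mytestname" match
```
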
I would not do this with regular expressions. The expression will keep getting more complicated, and you lose the flexibility you would have if you wrote this as code.
The rules you describe suggest the concept of a string derivative. The derivative of a string is the distance between each pair of successive characters. It is especially useful for password security checks and for analysing string variation in general.
const derivative = (str) => {
const result = [];
for(let i=1; i<str.length; i++){
result.push(str.charCodeAt(i) - str.charCodeAt(i-1));
}
return result;
};
//these strings have the same derivative: [0,0,0,0]
console.log(derivative('aaaaa'));
console.log(derivative('bbbbb'));
//these strings also have the same derivative: [1,1,1,1]
console.log(derivative('abcde'));
console.log(derivative('mnopq'));
//up and down: [1,-1,1,-1]
console.log(derivative('ababa'));
With this in mind you can apply each of your rules to each string.
// Rules:
// 1. Name should not contain test "text"
// 2. Name should not contain three consecutive characters example "abc" , "pqr" ,"xyz"
// 3. Name should not contain the same character three times example "aaa", "ccc" ,"zzz"
const derivative = (str) => {
const result = [];
for(let i=1; i<str.length; i++){
result.push(str.charCodeAt(i) - str.charCodeAt(i-1));
}
return result;
};
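// true when the sequence `sub` occurs contiguously inside `master`;
// the surrounding commas stop "1,1" from matching inside "11,1" or "1,10"
const arrayContains = (master, sub) =>
`,${master.join(",")},`.includes(`,${sub.join(",")},`);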
const rule1 = (text) => !text.includes('text');
const rule2 = (text) => !arrayContains(derivative(text),[1,1]);
const rule3 = (text) => !arrayContains(derivative(text),[0,0]);
const testing = [
"smthing textual",'abc','aaa','xyz','12345',
'1111','12abb', 'goodbcd', 'weeell'
];
const results = testing.map((input)=>
[input, rule1(input), rule2(input), rule3(input)]);
console.log(results);
Based on the 3 conditions in the post, the following regex should work.
Regex: ^(?:(?!test|([a-z])\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz).)*$

How to get value in $1 in regex to a variable for further manipulation [duplicate]

You can backreference like this in JavaScript:
var str = "123 $test 123";
str = str.replace(/(\$)([a-z]+)/gi, "$2");
This would (rather pointlessly) replace "$test" with "test". But suppose I'd like to pass the string matched by $2 into a function that returns another value. I tried doing this, but instead of getting the string "test", I get "$2". Is there a way to achieve this?
// Instead of getting "$2" passed into somefunc, I want "test"
// (i.e. the result of the regex)
str = str.replace(/(\$)([a-z]+)/gi, somefunc("$2"));
Like this:
str.replace(regex, function(match, $1, $2, offset, original) { return someFunc($2); })
Pass a function as the second argument to replace:
str = str.replace(/(\$)([a-z]+)/gi, myReplace);
function myReplace(str, group1, group2) {
return "+" + group2 + "+";
}
This capability has been around since JavaScript 1.3, according to mozilla.org.
Using ESNext, here is a rather contrived link replacer, just to showcase how it works:
let text = 'Visit http://lovecats.com/new-posts/ and https://lovedogs.com/best-dogs NOW !';
text = text.replace(/(https?:\/\/[^ ]+)/g, (match, link) => {
// remove ending slash if there is one
link = link.replace(/\/?$/, '');
return link.slice(link.lastIndexOf('/') + 1);
});
document.body.innerHTML = text;
I needed something a bit more flexible for a regex replace to decode the unicode in my incoming JSON data:
var text = "some string with an encoded '&#115;' in it";
text.replace(/&#(\d+);/g, function() {
return String.fromCharCode(arguments[1]);
});
// "some string with an encoded 's' in it"
If you have a variable number of backreferences, then the argument count (and positions) also vary. The MDN Web Docs describe the following syntax for specifying a function as the replacement argument:
function replacer(match[, p1[, p2[, p...]]], offset, string)
For instance, take these regular expressions:
var searches = [
'test([1-3]){1,3}', // 1 backreference
'([Ss]ome) ([A-z]+) chars', // 2 backreferences
'([Mm][a#]ny) ([Mm][0o]r[3e]) ([Ww][0o]rd[5s])' // 3 backreferences
];
for (var i in searches) {
  "Some string chars and many m0re w0rds in this test123".replace(
    new RegExp(searches[i]),
    function(...args) {
      var match = args[0];
      var backrefs = args.slice(1, args.length - 2);
      // in loop order: ['3'], ['Some', 'string'], ['many', 'm0re', 'w0rds']
      // (a repeated group such as ([1-3]){1,3} keeps only its last iteration)
      var offset = args[args.length - 2];
      var string = args[args.length - 1];
    }
  );
}
You can't use the 'arguments' variable directly here because it is of type Arguments, not Array, so it doesn't have a slice() method.
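One caveat with counting from the end of the argument list: when the pattern contains named capture groups, the replacer receives an extra trailing object holding those groups, so args.length - 2 no longer points at the offset. A possible workaround, sketched below, relies on the fact that captures are always strings or undefined while the offset is a number. The helper name splitReplacerArgs is hypothetical.

```javascript
// Hypothetical helper: splits replacer arguments into their parts by
// locating the offset, which is the first numeric argument after the match.
function splitReplacerArgs(args) {
  const offsetIndex = args.findIndex((arg, i) => i > 0 && typeof arg === "number");
  return {
    match: args[0],
    backrefs: args.slice(1, offsetIndex),
    offset: args[offsetIndex],
    input: args[offsetIndex + 1],
  };
}

const seen = [];
const record = (...args) => {
  const parts = splitReplacerArgs(args);
  seen.push(parts);
  return parts.match; // leave the text unchanged
};

"Some string chars".replace(/([Ss]ome) ([A-z]+) chars/, record);
"2024-05".replace(/(?<year>\d{4})-(\d{2})/, record); // named group adds a trailing object

console.log(seen[0].backrefs); // [ 'Some', 'string' ]
console.log(seen[1].backrefs); // [ '2024', '05' ]
```
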

