I am trying to extract certain values from a URL. Suppose I have the following URL:
let url = "https://gk.example.com/my-path/to/some+more/multiple.variables/moreparams";
I am trying to extract the some+more and multiple.variables parts from the URL using a Regular Expression. I came up with the following expressions:
/(?<=/)([^/]*\+[^/]*)(?=/)/g (for the + separator) and /(?<=/)([^/]*\.[^/]*)(?=/)/g (for the . separator)
results = url.match(/(?<=/)([^/]*\+[^/]*)(?=/)/g); // result: ['some+more']
results = url.match(/(?<=/)([^/]*\.[^/]*)(?=/)/g); // result: ['gk.example.com', 'multiple.variables']
This returns ['some+more'] and ['gk.example.com', 'multiple.variables'], which is a result I can work with. However, instead of using an if statement to switch between expressions, I would rather inject a variable into a generic regular expression. I tried the following (using backticks (`) to be able to inject the variable):
function getSplitUrl(sep, url) {
if (!url) url = window.location.href;
let regex = new RegExp('(?<=/)([^/]*'+sep+'[^/]*)(?=/)', `g`),
results = [];
results[0] = url.match(`(?<=/)([^/]*${sep}[^/]*)(?=/)`, `g`);
results[1] = regex.exec(url);
console.log('Regex 1: ', regex); // logs `Regex 1: /(?<=/)([^/]*${sep}[^/]*)(?=\/)/g`, where ${sep} is replaced by either \. or \+ (seemingly correct expression)
console.log(url, sep, results);
return null;
}
From the console.log(regex) it seems that it is the correct regular expression but the result is still wrong. The result is now ['gk.example.com', 'gk.example.com'].
Am I missing something obvious here?
Edit:
Somehow, url.match(regex) returns a correct result, whereas regex.exec(url) does not.
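The difference can be reproduced with a small sketch. With the g flag, match() returns every full match as a flat array, while exec() returns one match per call in the form [fullMatch, group1, ...], which would explain a result like ['gk.example.com', 'gk.example.com']. The helper escapeRegExp and the returned object shape are assumptions made for this illustration, and the sketch assumes sep is passed as a plain character such as '+' or '.' rather than pre-escaped:

```javascript
// Escape regex metacharacters so the separator is matched literally.
// (Hypothetical helper, not part of the original code.)
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function getSplitUrl(sep, url) {
  const regex = new RegExp(`(?<=/)([^/]*${escapeRegExp(sep)}[^/]*)(?=/)`, 'g');

  // match() with a global regex: all full matches, no capture groups.
  const viaMatch = url.match(regex) || [];

  // exec() with a global regex: one match per call, so loop until null.
  const viaExec = [];
  let m;
  while ((m = regex.exec(url)) !== null) {
    viaExec.push(m[1]); // m[0] is the full match, m[1] the first group
  }
  return { viaMatch, viaExec };
}

const sampleUrl = "https://gk.example.com/my-path/to/some+more/multiple.variables/moreparams";
console.log(getSplitUrl('+', sampleUrl));
console.log(getSplitUrl('.', sampleUrl));
```

Note that String.prototype.match() takes a single argument, so a second 'g' argument passed to it is ignored and a string pattern is compiled without the global flag.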
Keep it simple and don't use regex, but rather the URL API to parse urls:
const url = new URL("https://gk.example.com/my-path/to/some+more/multiple.variables/moreparams?limit=5&more=conf/fusion");
console.log(url.hostname);
console.log(url.pathname)
const parts = url.pathname.slice(1).split('/');
console.log(parts);
const find = (substring) => parts.filter(p => p.includes(substring));
console.log(find('+'));
console.log(find('.'));
I have a slash-separated string that contains function names.
e.g.
my_doc/desktop/customer=getCustomer()/getCsvFileName()/controller=getControllerName()
From the above string I want only the function names, i.e. getCustomer(), getControllerName() and getCsvFileName()
I searched some regex like:
let res = myString.match(/(?<=(function\s))(\w+)/g);
but it returns null.
Update:
Now I want to get the function names without the parentheses (), i.e. getCustomer, getControllerName and getCsvFileName
Please help me with this.
const str = "my_doc/desktop/customer=getCustomer()/getCsvFileName()/controller=getControllerName()"
let tokens = [];
for (const element of str.split("/"))
if (element.endsWith("()"))
tokens.push(element.split("=")[1] ?? element.split("=")[0])
console.log(tokens);
General idea: split the string along slashes, and for each of these tokens, if the token ends with () (as per Nick's suggestion), split the token along =. Append the second index of the token split along = if it exists, otherwise append the first.
A "smaller" version (using purely array methods) could be:
const str = "my_doc/desktop/customer=getCustomer()/getCsvFileName()/controller=getControllerName()"
let tokens = str.split("/")
.filter(element => element.endsWith("()"))
.map(element => element.split("=")[1] ?? element.split("=")[0]);
console.log(tokens);
You can first match each chunk that ends in a pair of parentheses, using /.*?\([^)]*\)/g.
This gives an array of results. You can then iterate over it and, for each item, split on the = and / that precede the function name with item.split(/=|\//).
Then push the last piece, which is the function name, into the empty array functionNames.
Working Example:
const string = `my_doc/desktop/customer=getCustomer()/getCsvFileName()/controller=getControllerName()`;
const functionNames = [];
string.match(/.*?\([^)]*\)/g).forEach(item => {
const splitString = item.split(/=|\//);
const functionName = splitString[splitString.length - 1];
functionNames.push(functionName);
});
console.log(functionNames);
As per the MDN docs, the match() method returns null if it does not find a match for the provided regex in the search string.
The regular expression you provided, /(?<=(function\s))(\w+)/g, matches any word that is preceded by 'function ' (note the space after the word function).
Your search string my_doc/desktop/customer=getCustomer()/getCsvFileName()/controller=getControllerName() does not include 'function ' before any characters. That is why you got null as result of match() method.
let yourString = 'my_doc/desktop/customer=getCustomer()/getCsvFileName()/controller=getControllerName()';
let myReferenceString = 'SAMPLETEXTfunction sayHi()/function sayHello()';
let res = yourString.match(/(?<=(function\s))(\w+)/g);
let res2 = myReferenceString.match(/(?<=(function\s))(\w+)/g);
console.log("Result of your string", res);
console.log("Result of my string", res2);
My solution:
let myreferenceString = 'my_doc/desktop/customer=getCustomer()/getCsvFileName()/controller=getControllerName()'
let res = myreferenceString.match(/((?<==)(\w+\(\)))|((?<=\/)(\w+\(\)))/g);
console.log("Result", res);
NOTE: I have used a positive lookbehind, which is not supported in IE or in older versions of Safari. Please do some research on this before considering this approach.
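For the update in the question (names without the parentheses), one possible sketch is to match a word only when it is directly followed by (), using a lookahead instead of a lookbehind. The variable names here are illustrative only, and the sketch assumes every function call in the string is written with empty parentheses:

```javascript
const input = "my_doc/desktop/customer=getCustomer()/getCsvFileName()/controller=getControllerName()";

// \w+ matches the name; (?=\(\)) requires "()" right after it
// without including the parentheses in the match.
const names = input.match(/\w+(?=\(\))/g) || [];

console.log(names);
```

Since lookahead has been part of JavaScript regular expressions for a long time, this variant also sidesteps the lookbehind compatibility concern mentioned above.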
I have a string in Node.js Runtime e.g
var content = "my content contain some URL like https://this.me/36gD6d3 or https://this.me/39Jwjd";
How can I read each https://this.me/36gD6d3 and https://this.me/39Jwjd to replace it with another URL?
A forEach loop or something similar would be best. :-)
What I need is to make a request to each of those URLs to get the real URL behind the shortened URL. That's not the problem.
Each of those URLs is preceded and followed by either whitespace or a period (.).
Domain https://this.me/ is constant but the IDs 39Jwjd, 36gD6d3 are changing.
Looking forward to your answers! :)
You can use regex to find occurrences of this URL.
var content = "my content contain some URL like https://this.me/36gD6d3 or https://this.me/39Jwjd";
console.log(content.match(/https:\/\/this\.me\/[a-zA-Z0-9]+/g))
This outputs:
[
"https://this.me/36gD6d3",
"https://this.me/39Jwjd"
]
To replace the found occurrences, use the replace() function.
var content = "my content contain some URL like https://this.me/36gD6d3 or https://this.me/39Jwjd";
console.log(content.replace(/https:\/\/this\.me\/[a-zA-Z0-9]+/g, "<Replaced URL here>"))
Output:
my content contain some URL like <Replaced URL here> or <Replaced URL here>
If you want the replacement to depend on the matched value, you can either use a substitution pattern (such as $&) in the replacement string or pass a replacement function as the second argument.
Learn more about the String.prototype.replace function at MDN.
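As a rough illustration of the replacement-function variant: the callback receives each matched URL and returns whatever should take its place. The lookup table and its example.com targets are placeholders invented for this sketch, not real expansions of the short URLs:

```javascript
const text = "my content contain some URL like https://this.me/36gD6d3 or https://this.me/39Jwjd";

// Hypothetical lookup table: short URL -> full URL (placeholder values).
const fullUrls = new Map([
  ["https://this.me/36gD6d3", "https://example.com/first"],
  ["https://this.me/39Jwjd", "https://example.com/second"],
]);

// The callback is invoked once per match; unknown URLs are left unchanged.
const replaced = text.replace(
  /https:\/\/this\.me\/[a-zA-Z0-9]+/g,
  (match) => fullUrls.get(match) ?? match
);

console.log(replaced);
```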
If you want your replace to be asynchronous (which I'm guessing is the case when you lookup the full URL), you could do something like this:
(async () => {
const str = "my content contain some URL like https://this.me/36gD6d3 or https://this.me/39Jwjd",
res = await replaceAllUrls(str);
console.log(res);
})();
function replaceAllUrls(str) {
const regex = /https?:\/\/this\.me\/[a-zA-Z0-9_-]+/g,
matches = str.match(regex) || [];
return Promise.all(matches.map(getFullUrl)).then(values => {
return str.replace(regex, () => values.shift());
});
}
function getFullUrl(u) {
// Just for the demo, use your own
return new Promise((r) => setTimeout(() => r(`{{Full URL of ${u}}}`), 100));
// If it fails (you cannot get the full URL),
// don't forget to catch the error and return the original URL!
}
I am designing a regular expression tester in HTML and JavaScript. The user will enter a regex, a string, and choose the function they want to test with (e.g. search, match, replace, etc.) via radio button and the program will display the results when that function is run with the specified arguments. Naturally there will be extra text boxes for the extra arguments to replace and such.
My problem is getting the string from the user and turning it into a regular expression. If I say that they don't need to have //'s around the regex they enter, then they can't set flags, like g and i. So they have to have the //'s around the expression, but how can I convert that string to a regex? It can't be a literal since it's a string, and I can't pass it to the RegExp constructor since it's not a string without the //'s. Is there any other way to make a user input string into a regex? Will I have to parse the string and flags of the regex with the //'s and then construct it another way? Should I have them enter a string, and then enter the flags separately?
Use the RegExp object constructor to create a regular expression from a string:
var re = new RegExp("a|b", "i");
// same as
var re = /a|b/i;
var flags = inputstring.replace(/.*\/([gimy]*)$/, '$1');
var pattern = inputstring.replace(new RegExp('^/(.*?)/'+flags+'$'), '$1');
var regex = new RegExp(pattern, flags);
or
var match = inputstring.match(new RegExp('^/(.*?)/([gimy]*)$'));
// sanity check here
var regex = new RegExp(match[1], match[2]);
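The "sanity check" could look something like the sketch below. The function name parseRegexInput and the choice to return null on failure are assumptions for illustration; it handles both input that does not have the /pattern/flags shape and patterns or flags that the RegExp constructor rejects:

```javascript
// Parse "/pattern/flags" into a RegExp, or return null if that is not possible.
function parseRegexInput(input) {
  const parts = input.match(/^\/(.*?)\/([gimy]*)$/);
  if (!parts) {
    return null; // not wrapped in slashes, or contains unknown flag letters
  }
  try {
    return new RegExp(parts[1], parts[2]);
  } catch (e) {
    return null; // invalid pattern (e.g. "(") or duplicate flags (e.g. "gg")
  }
}

console.log(parseRegexInput('/a|b/i'));
console.log(parseRegexInput('a|b'));
console.log(parseRegexInput('/(/g'));
```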
Here is a one-liner that escapes regex special characters so the string is matched literally: str.replace(/[|\\{}()[\]^$+*?.]/g, '\\$&')
I got it from the escape-string-regexp NPM module.
Trying it out:
escapeStringRegExp.matchOperatorsRe = /[|\\{}()[\]^$+*?.]/g;
function escapeStringRegExp(str) {
return str.replace(escapeStringRegExp.matchOperatorsRe, '\\$&');
}
console.log(new RegExp(escapeStringRegExp('example.com')));
// => /example\.com/
Using tagged template literals with flags support:
function str2reg(flags = 'u') {
return (...args) => new RegExp(escapeStringRegExp(evalTemplate(...args))
, flags)
}
function evalTemplate(strings, ...values) {
let i = 0
return strings.reduce((str, string) => `${str}${string}${
i < values.length ? values[i++] : ''}`, '')
}
console.log(str2reg()`example.com`)
// => /example\.com/u
Use the JavaScript RegExp object constructor.
var re = new RegExp("\\w+");
re.test("hello");
You can pass flags as a second string argument to the constructor. See the documentation for details.
In my case the user input was sometimes surrounded by delimiters and sometimes not, so I added another case:
var regParts = inputstring.match(/^\/(.*?)\/([gim]*)$/);
if (regParts) {
// the parsed pattern had delimiters and modifiers. handle them.
var regexp = new RegExp(regParts[1], regParts[2]);
} else {
// we got pattern string without delimiters
var regexp = new RegExp(inputstring);
}
Try using the following function:
const stringToRegex = str => {
// Main regex
const main = str.match(/\/(.+)\/.*/)[1]
// Regex options
const options = str.match(/\/.+\/(.*)/)[1]
// Compiled regex
return new RegExp(main, options)
}
You can use it like so:
"abc".match(stringToRegex("/a/g"))
//=> ["a"]
Here is my one liner function that handles custom delimiters and invalid flags
// One liner
var stringToRegex = (s, m) => (m = s.match(/^([\/~#;%#'])(.*?)\1([gimsuy]*)$/)) ? new RegExp(m[2], m[3].split('').filter((i, p, s) => s.indexOf(i) === p).join('')) : new RegExp(s);
// Readable version
function stringToRegex(str) {
const match = str.match(/^([\/~#;%#'])(.*?)\1([gimsuy]*)$/);
return match ?
new RegExp(
match[2],
match[3]
// Filter redundant flags, to avoid exceptions
.split('')
.filter((char, pos, flagArr) => flagArr.indexOf(char) === pos)
.join('')
)
: new RegExp(str);
}
console.log(stringToRegex('/(foo)?\/bar/i'));
console.log(stringToRegex('#(foo)?\/bar##gi')); //Custom delimiters
console.log(stringToRegex('#(foo)?\/bar##gig')); //Duplicate flags are filtered out
console.log(stringToRegex('/(foo)?\/bar')); // Treated as string
console.log(stringToRegex('gig')); // Treated as string
I suggest you also add separate checkboxes or a textfield for the special flags. That way it is clear that the user does not need to add any //'s. In the case of a replace, provide two textfields. This will make your life a lot easier.
Why? Because otherwise some users will add //'s while others will not. And some will make a syntax error. Then, after you have stripped the //'s, you may end up with a syntactically valid regex that is nothing like what the user intended, leading to strange behaviour (from the user's perspective).
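A minimal sketch of that idea, assuming the pattern comes from one text field and the flags from separate checkboxes. The function name buildRegex, the option names, and the { regex, error } return shape are all made up for this example:

```javascript
// Build a RegExp from a bare pattern plus separately chosen flags.
// Returns the regex, or the error message so the UI can show it to the user.
function buildRegex(pattern, { global = false, ignoreCase = false, multiline = false } = {}) {
  const flags = (global ? 'g' : '') + (ignoreCase ? 'i' : '') + (multiline ? 'm' : '');
  try {
    return { regex: new RegExp(pattern, flags), error: null };
  } catch (e) {
    return { regex: null, error: e.message };
  }
}

console.log(buildRegex('a|b', { global: true, ignoreCase: true }));
console.log(buildRegex('('));
```

Because the flags never pass through the same string as the pattern, there is no delimiter to strip and no ambiguity about where the pattern ends.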
This will work also when the string is invalid or does not contain flags etc:
function regExpFromString(q) {
// Split "/pattern/flags" in one step; this also handles "/pattern/" with no flags
const parts = q.match(/^\/(.*)\/([gimuy]*)$/);
const pattern = parts ? parts[1] : q;
const flags = parts ? parts[2] : '';
// Invalid patterns or flags yield null instead of throwing
try { return new RegExp(pattern, flags); } catch (e) { return null; }
}
console.log(regExpFromString('\\bword\\b'));
console.log(regExpFromString('\/\\bword\\b\/gi'));
Thanks to earlier answers, this block serves well as a general-purpose solution for turning a configurable string into a RegExp, here for filtering text:
var permittedChars = '^a-z0-9 _,.?!#+<>';
permittedChars = '[' + permittedChars + ']';
var flags = 'gi';
var strFilterRegEx = new RegExp(permittedChars, flags);
log.debug ('strFilterRegEx: ' + strFilterRegEx);
strVal = strVal.replace(strFilterRegEx, '');
// this replaces the hard-coded solution:
// strVal = strVal.replace(/[^a-z0-9 _,.?!#+]/ig, '');
You can ask for flags using checkboxes then do something like this:
var userInput = formInput;
var flags = '';
if(formGlobalCheckboxChecked) flags += 'g';
if(formCaseICheckboxChecked) flags += 'i';
var reg = new RegExp(userInput, flags);
Safer, but not safe. (A version of Function that didn't have access to any other context would be good.)
const regexp = Function('return ' + string)()
I found @Richie Bendall's solution very clean. I added a few small modifications because it falls apart and throws an error (maybe that's what you want) when passed non-regex strings.
const stringToRegex = (str) => {
const re = /\/(.+)\/([gim]?)/
const match = str.match(re);
if (match) {
return new RegExp(match[1], match[2])
}
}
Using [gim]? in the pattern will ignore any match[2] value if it's invalid. You can omit the [gim]? pattern if you want an error to be thrown when the regex options are invalid.
I use eval to solve this problem.
For example:
function regex_exec() {
// Important! As @Samuel Faure mentioned, calling eval on user input is a serious security risk, so address that risk before using this method.
var regex = $("#regex").val();
// eval() turns a string such as "/a|b/i" into a RegExp object
var patt = eval(regex);
$("#result").val(patt.exec($("#textContent").val()));
}
Is it possible to escape a parameterized regex when a parameter contains multiple symbols that need to be escaped?
const _and = '&&', _or = '||';
let reString = `^(${_and}|${_or})`; // ${_or} needs to be escaped
const reToken = new RegExp(reString);
Working but not optimal:
_or = '\\|\\|';
Or:
let reString = `^(${_and}|\\|\\|)`;
I would prefer to reuse the _or variable and keep the regex parameterized.
You can write your own function that escapes your parameters so that they work in the final regexp. To save you time, I found one already written in this answer. With that function, you can write clean parameters without escaping everything by hand. I would avoid modifying built-in classes (RegExp), though, and would instead make a wrapper around it or a separate function. In the example below I use the exact function I found in the other answer, which extends the built-in RegExp.
RegExp.escape = function(s) {
return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
};
const and = RegExp.escape('&&');
const or = RegExp.escape('||');
const andTestString = '1 && 2';
const orTestString = '1 || 2';
const regexp = `${and}|${or}`;
console.log(new RegExp(regexp).test(andTestString)); // true
console.log(new RegExp(regexp).test(orTestString)); // true
EDITED
https://jsfiddle.net/ao4t0pzr/1/
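If you would rather not attach anything to the built-in RegExp, the same idea works as a standalone function. This is only a sketch applied to the pattern from the question; the name escapeRegExp is an assumption, and the character class is the commonly used set of regex metacharacters:

```javascript
// Standalone escape helper; nothing is added to the built-in RegExp object.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const _and = '&&', _or = '||';

// Both parameters stay unescaped at the point of definition
// and are escaped only when the pattern is assembled.
const reToken = new RegExp(`^(${escapeRegExp(_and)}|${escapeRegExp(_or)})`);

console.log(reToken.test('&& b')); // true
console.log(reToken.test('|| b')); // true
console.log(reToken.test('a && b')); // false, the pattern is anchored with ^
```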
You can use a tagged template function to escape the characters within the string using a regular expression. You can then use that string to build a new RegExp from the escaped characters:
function escape(s) {
return s[0].replace(/[-&\/\\^$*+?.()|[\]{}]/g, '\\$&');
};
var or = escape`||`;
var and = escape`&&`;
console.log(new RegExp(`${and}|${or}`)); // "/\&\&|\|\|/"