JavaScript regex - match between slashes

I have a URL: https://api.example.com/main/user1/collection1/contents/folder1/test.js
To capture only collection1, I tried the following, but the match also includes /user1/. How can I exclude it?
(?:\/user1\/)[^\/]*
To capture only folder1/test.js, I tried the following, but the match also includes contents/, which I want to exclude:
contents\/(.*)$

If you want to use a regex, you could try (?:\/user1\/)([^\/]+) to capture only collection1.
This captures the part after /user1/ in the capturing group ([^\/]+), which is available as matches[1]; matches[0] still holds the whole match, including /user1/.
const pattern = /\/user1\/([^\/]+)/;
const str = "https://api.example.com/main/user1/collection1/contents/folder1/test.js";
const matches = str.match(pattern);
console.log(matches[1]);
To capture only folder1/test.js you could use folder1\/.*$
const pattern = /folder1\/.*$/;
const str = "https://api.example.com/main/user1/collection1/contents/folder1/test.js";
const matches = str.match(pattern);
console.log(matches[0]);
Without regex you might use URL:
const str = "https://api.example.com/main/user1/collection1/contents/folder1/test.js";
const url = new URL(str);
const parts = url.pathname.split('/');
console.log(parts[3]);
console.log(parts[5] + '/' + parts[6]);
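If the user and collection names vary, the same split can be keyed on a segment name instead of fixed indexes. This is a minimal sketch, assuming the path always contains a contents segment and that the collection is the segment right before it; splitAtContents is a hypothetical helper name, not part of any API:

```javascript
// Hypothetical helper: locate the "contents" segment, then return the
// segment before it and everything after it.
function splitAtContents(href) {
  const segments = new URL(href).pathname.split('/').filter(Boolean);
  const i = segments.indexOf('contents');
  if (i < 1) return null; // no "contents" segment, or nothing before it
  return {
    collection: segments[i - 1],
    rest: segments.slice(i + 1).join('/'),
  };
}

const info = splitAtContents(
  'https://api.example.com/main/user1/collection1/contents/folder1/test.js'
);
console.log(info.collection); // collection1
console.log(info.rest);       // folder1/test.js
```

Returning null for paths without a contents segment keeps the caller from reading an undefined index.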

Related

How can I replace all duplicated paths of a url with a JS Regex

For the following URL:
https://www.google.es/test/test/hello/world
I want to replace all the occurrences of "/test/", and it's important that "test" starts with and ends with a "/".
I tried with:
let url = "https://www.google.es/test/test/hello/world"
url.replace(/\/test\//g, "/");
But it returns:
'https://www.google.es/test/hello/world'
It doesn't replace the second "/test/"
Any clues on how I could do this with a regex?
I basically want to replace the content that lives between the slashes, but not the slashes themselves.
Something like this would work:
/(\/[^\/]+)(?:\1)+\//g
( - open capture group #1
\/ - find a literal slash
[^\/]+ - capture at least one non-slash char
) - close capture group #1
(?:\1)+ - match the same text that capture group #1 captured, one or more times
\/ - ensure a closing slash
/g - global modifier
https://regex101.com/r/NgJA3X/1
var regex = /(\/[^\/]+)(?:\1)+\//g;
var str = `https://www.google.es/test/test/hello/world
https://www.google.es/test/test/test/test/hello/world
https://www.google.es/test/test/hello/test/hello/hello/world
https://www.google.es/test/hello/world/world
https://www.google.es/test/hello/helloworld/world`;
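// Replace each repeated run with a single slash; an empty replacement
// would also delete the slash that separates the neighbouring segments.
var subst = `/`;
// The substituted value will be contained in the result variable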
var result = str.replace(regex, subst);
console.log(result);
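The replacement string decides what happens to the repeated run. A short sketch on one sample URL, comparing a replacement that keeps one copy of the segment with one that drops it entirely (the variable names are illustrative):

```javascript
const repeated = /(\/[^\/]+)(?:\1)+\//g;
const sample = 'https://www.google.es/test/test/hello/world';

// "$1/" keeps one copy of the repeated segment.
const collapsed = sample.replace(repeated, '$1/');

// "/" drops the repeated segment and leaves only the separator.
const removed = sample.replace(repeated, '/');

console.log(collapsed); // https://www.google.es/test/hello/world
console.log(removed);   // https://www.google.es/hello/world
```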
You can do this with a regular expression, but it sounds like your intent is to replace only individual parts of the pathname component of a URL.
A URL has other components (such as the fragment identifier) which could contain the pattern that you describe, and handling that distinction with a regular expression is more challenging.
The URL class is designed to help solve problems just like this, and you can replace just the path parts using a functional technique like this:
function replacePathParts (url, targetStr, replaceStr = '') {
  url = new URL(url);
  const updatedPathname = url.pathname
    .split('/')
    .map(str => str === targetStr ? replaceStr : str)
    .filter(Boolean)
    .join('/');
  url.pathname = updatedPathname;
  return url.href;
}
const inputUrl = 'https://www.google.es/test/test/hello/world';
const result1 = replacePathParts(inputUrl, 'test');
console.log(result1); // "https://www.google.es/hello/world"
const result2 = replacePathParts(inputUrl, 'test', 'message');
console.log(result2); // "https://www.google.es/message/message/hello/world"
Based on conversation in comment, you can use this solution:
let url = "https://www.google.es/test/test/hello/world"
console.log( url.replace(/(?:\/test)+(?=\/)/g, "/WORKS") );
//=> https://www.google.es/WORKS/hello/world
RegEx Breakdown:
(?:\/test)+: Match 1+ repeats of text /test
(?=\/): Lookahead to assert that we have a / ahead
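The attempt in the question misses the second "/test/" because matches cannot overlap: the first match consumes the slash that the second one would need to start with. A small sketch of the difference, using a lookahead so the trailing slash is checked but not consumed:

```javascript
const input = 'https://www.google.es/test/test/hello/world';

// Consumes both slashes, so the second "/test/" has no leading slash left.
const consuming = input.replace(/\/test\//g, '/');

// Consumes only "/test"; the lookahead leaves the following slash in place.
const lookahead = input.replace(/\/test(?=\/)/g, '');

console.log(consuming); // https://www.google.es/test/hello/world
console.log(lookahead); // https://www.google.es/hello/world
```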

Call Substring multiple times

I have a string and I want to remove equations such as \(f(x) = (a+b)\) from it.
My idea was to get all the substrings and then do some operation on the resulting array, but the code below gives me only one strippedHtml, so I can't see how to clean the string. I just want to remove these equations.
The result should be: Here’s the evidence
const filter_data = `<p>\(f(x) = (a+b)\)</p><p>\(f(x) = (a+db)\)</p><p>\(f(x) = (a+d+c+b)\)</p>
<p>Here’s the evidence.</p>`
var strippedHtml = filter_data.substring(
filter_data.lastIndexOf("\(") + 1,
filter_data.lastIndexOf("\)")
);
console.log(strippedHtml)
JS has a replace method for this that accepts RegExp:
const filter_data = `<p>\(f(x) = (a+b)\)</p><p>\(f(x) = (a+db)\)</p><p>\(f(x) = (a+d+c+b)\)</p>
<p>Here’s the evidence.</p>`;
var strippedHtml = filter_data.replace(/\<.*?\(.*?>/g, "");
console.log(strippedHtml);
The RegExp matches a <, then as little as possible up to a (, then as little as possible up to the next >, and replaces every such match with an empty string.
In your string it matches three times, once per formula paragraph, and leaves the last <p> element untouched.
You may have to modify the RegExp to fit your real string, as it would also match text nodes containing (, but that's what I would do at this point with the given data.
You can use the following regular expressions, but they only work for data shaped like the sample you provided:
const filterData1 = `<p>\(f(x) = (a+b)\)</p><p>\(f(x) = (a+db)\)</p><p>\(f(x) = (a+d+c+b)\)</p><p>Here’s the evidence.</p>`
const filterData2 = `<p>\(f(x) = (a+b)\)</p><p>\(f(x) = (a+db)\)</p><p>\(f(x) = (a+d+c+b)\)</p><p>Here’s the evidence.</p><p>\(f(x) = (a+b)\)</p>`
const regEx1 = /<[^>]*>/g //regular expression to remove all html tags
const regEx2 = /\([^\)]*\)/g //regular expression to remove everything between \( and \)
const regEx3 = /[=)]/g //regular expression to remove = and )
const result1 = filterData1.replace(regEx1,'').replace(regEx2,'').replace(regEx3,'').trim()
const result2 = filterData2.replace(regEx1,'').replace(regEx2,'').replace(regEx3,'').trim()
console.log("Result1 : ",result1);
console.log("Result2 : ",result2);
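One detail worth checking: inside an ordinary string or template literal, \( is just (, so the backslashes in the sample data never reach the regex. The following sketch assumes the real input does contain the \( and \) delimiters; it uses String.raw to keep them and removes every paragraph that consists only of a formula (the shortened sample and the variable names are illustrative):

```javascript
// String.raw keeps the backslashes that a normal template literal would drop.
const html = String.raw`<p>\(f(x) = (a+b)\)</p><p>\(f(x) = (a+db)\)</p><p>Here’s the evidence.</p>`;

// <p>, a literal "\(", a lazily matched formula body, a literal "\)", </p>.
const formulaParagraph = /<p>\\\(.*?\\\)<\/p>/g;

const cleaned = html.replace(formulaParagraph, '');
console.log(cleaned); // <p>Here’s the evidence.</p>

// Optionally strip the remaining tags to get plain text.
const text = cleaned.replace(/<[^>]*>/g, '').trim();
console.log(text); // Here’s the evidence.
```

Anchoring on the delimiters rather than on bare parentheses means ordinary text that happens to contain ( is left alone.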

URL regular expression pattern

I would like to parse URLs with regular expressions and find a pattern that matches https://*.global.
Here is my URL test string on regex101.
Ideally, the regex would return only https://app8.global and not the other https URLs in the string.
const URL = `https://temp/"https://app8.global"https://utility.localhost/`;
const regex = /https:\/\/(.+?)\.global(\/|'|"|`)/gm;
const found = URL.match(regex);
console.log(found);
How would I manipulate the regex so it will return the https://*.global?
First of all, you need to exclude slashes from the starting part, otherwise it'll match things from the previous url:
const regex = /https:\/\/([^\/]+?)\.global(\/|'|"|`)/gm;
Now, you can replace the four-way alternation of single characters with a character class:
const regex = /https:\/\/([^\/]+?)\.global[\/'"`]/gm;
And now you can get the matches and trim off that last character:
const matches = URL.match(regex).map(v => v.slice(0, -1));
Then, matches would evaluate to ["https://app8.global"].
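Using a capturing group. Keep the trailing delimiter outside the group, and read the group from the match result, since the static RegExp.$1 is a deprecated legacy property:
const URL = `https://temp/"https://app8.global"https://utility.localhost/`;
const regex = /(https:\/\/[^\/]+?\.global)[\/'"`]/;
const found = URL.match(regex);
console.log(found[1]); // https://app8.global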
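Another option is a lookahead, so that the delimiter is checked but never becomes part of the match and nothing needs to be trimmed afterwards. A minimal sketch on the question's test string (the variable names are illustrative):

```javascript
const input = `https://temp/"https://app8.global"https://utility.localhost/`;

// [^\/]+? keeps the host from running across a slash into the next URL;
// the lookahead requires a delimiter after ".global" without consuming it.
const globalUrl = /https:\/\/[^\/]+?\.global(?=[\/'"`])/g;

// match() returns null when nothing matches, so fall back to an empty array.
const urls = input.match(globalUrl) ?? [];
console.log(urls); // [ 'https://app8.global' ]
```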

Ignore character in regex

I want to ignore 0 in the regex below. Currently, the regex splits the string into chunks of up to n characters and returns them as an array. I want it to skip the character 0.
var n = 2
var str = '123045';
var regex2 = new RegExp(`.{1,${n}}`,'g');
var reg = str.match(regex2)
One way you could achieve this is by removing the 0 before you perform your match. This can be done by using .replace() like so:
const n = 2
const str = '123045';
const regex2 = new RegExp(`.{1,${n}}`, 'g');
const reg = str.replace(/0/g, '').match(regex2);
console.log(reg); // ['12', '34', '5']
To ignore only leading zeros, match a 0 followed by one or more digits in each element of the matched array (using .map()), .replace() it with the digits captured in the group, then join the pieces and split them into chunks again:
const n = 2
const str = '123045';
const regex2 = new RegExp(`.{1,${n}}`, 'g');
const reg = str.match(regex2).map(m => m.replace(/0(\d+)/g, '$1')).join('').match(regex2);
console.log(reg); // ['12', '30', '45']
The best way is to use replace.
But if you want to do it with a regex alone, try (?!0)\d (equivalent to [1-9]). With the g flag it gives you the matches 1, 2, 3, 4, 5.
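The single-digit matches still have to be grouped into chunks of n. A small sketch of that last step, assuming the goal is the same output as the replace-based version:

```javascript
const n = 2;
const str = '123045';

// Collect every digit except 0; match() returns null when nothing matches.
const digits = str.match(/[1-9]/g) ?? [];

// Group the digits into chunks of n characters.
const chunks = [];
for (let i = 0; i < digits.length; i += n) {
  chunks.push(digits.slice(i, i + n).join(''));
}

console.log(chunks); // [ '12', '34', '5' ]
```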

Regex working fine in C# but not in Javascript

I have the following javascript code:
var markdown = "I have \(x=1\) and \(y=2\) and even \[z=3\]"
var latexRegex = new RegExp("\\\[.*\\\]|\\\(.*\\\)");
var matches = latexRegex.exec(markdown);
alert(matches[0]);
matches has only matches[0] = "x=1 and y=2" and should be:
matches[0] = "\(x=1\)"
matches[1] = "\(y=2\)"
matches[2] = "\[z=3\]"
But this regex works fine in C#.
Any idea why this happens?
Thank You,
Miguel
Specify g flag to match multiple times.
Use String.match instead of RegExp.exec.
Using regular expression literal (/.../), you don't need to escape \.
* matches greedily. Use non-greedy version: *?
var markdown = "I have \(x=1\) and \(y=2\) and even \[z=3\]"
var latexRegex = /\[.*?\]|\(.*?\)/g;
var matches = markdown.match(latexRegex);
matches // => ["(x=1)", "(y=2)", "[z=3]"]
Try non-greedy: \\\[.*?\\\]|\\\(.*?\\\). You need to also use a loop if using the .exec() method like so:
var res, matches = [], string = 'I have \(x=1\) and \(y=2\) and even \[z=3\]';
var exp = new RegExp('\\\[.*?\\\]|\\\(.*?\\\)', 'g');
while (res = exp.exec(string)) {
matches.push(res[0]);
}
console.log(matches);
Try using the match function instead of the exec function. A single call to exec returns only one match, while match returns all of them if the global flag is set.
var markdown = "I have \(x=1\) and \(y=2\) and even \[z=3\]";
var latexRegex = new RegExp("\\\[.*\\\]|\\\(.*\\\)", "g");
var matches = markdown.match(latexRegex);
alert(matches[0]);
alert(matches[1]);
If you don't want to get \(x=1\) and \(y=2\) as a single match, you will need to use non-greedy operators (*?) instead of greedy operators (*). Keep the g flag so that match still returns every occurrence. Your RegExp will become:
var latexRegex = new RegExp("\\\[.*?\\\]|\\\(.*?\\\)", "g");
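Note that in an ordinary JavaScript string literal "\(" is just "(", so the sample string above contains no backslashes at all. If the real markdown does contain the backslash delimiters, the pattern has to match them too. A sketch under that assumption, using String.raw to build the test string:

```javascript
// String.raw keeps the backslashes; in a normal string literal "\(" is just "(".
const markdown = String.raw`I have \(x=1\) and \(y=2\) and even \[z=3\]`;

// \\ matches a literal backslash, \( and \[ match the literal brackets,
// and .*? stops at the first closing delimiter.
const latexRegex = /\\\(.*?\\\)|\\\[.*?\\\]/g;

// match() returns null when nothing matches, so fall back to an empty array.
const matches = markdown.match(latexRegex) ?? [];
for (const m of matches) console.log(m);
// \(x=1\)
// \(y=2\)
// \[z=3\]
```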
