URL regular expression pattern

URL regular expression pattern - javascript

I would like to parse URLs with Regular Expressions and find a pattern that matches with https://*.global.
Here is my URL test string on regex101.
Ideally, the regex would return only https://app8.global, without also covering the other https strings.
const URL = `https://temp/"https://app8.global"https://utility.localhost/`;
const regex = /https:\/\/(.+?)\.global(\/|'|"|`)/gm;
const found = URL.match(regex);
console.log(found);
How should I change the regex so that it returns only the https://*.global URL?

First of all, you need to exclude slashes from the host part; otherwise the match can start at an earlier URL:
const regex = /https:\/\/([^\/]+?)\.global(\/|'|"|`)/gm;
Next, you can replace the four-way alternation with a character class:
const regex = /https:\/\/([^\/]+?)\.global[\/'"`]/gm;
And now you can get the matches and trim off that last character:
const matches = URL.match(regex).map(v => v.slice(0, -1));
Then, matches would evaluate to ["https://app8.global"].
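As an alternative to slicing off the trailing delimiter, the capture group can be moved so that it covers only the URL, and the matches can then be read with matchAll. This is a minimal sketch; the helper name findGlobalUrls is made up for illustration and is not part of the original answer.

```javascript
// Hypothetical helper: returns every https://*.global URL found in the text.
function findGlobalUrls(text) {
  // The group ends before the delimiter, so no trimming is needed afterwards.
  const regex = /(https:\/\/[^\/]+?\.global)[\/'"`]/g;
  // matchAll yields one match object per hit; index 1 is the capture group.
  return Array.from(text.matchAll(regex), (m) => m[1]);
}

const sample = `https://temp/"https://app8.global"https://utility.localhost/`;
console.log(findGlobalUrls(sample)); // an array containing "https://app8.global"
```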

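Using a capture group (RegExp.$1 also holds the first group, but it is a deprecated legacy property)
const URL = `https://temp/"https://app8.global"https://utility.localhost/`;
// The group stops before the delimiter, so it holds only the URL.
const regex = /(https:\/\/[^\/]+?\.global)[\/'"`]/;
const found = URL.match(regex);
console.log(found[1]); // "https://app8.global"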

Related

How can I replace all duplicated paths of a url with a JS Regex

For the following URL:
https://www.google.es/test/test/hello/world
I want to replace all the occurrences of "/test/", and it's important that "test" starts and ends with a "/".
I tried with:
let url = "https://www.google.es/test/test/hello/world"
url.replace(/\/test\//g, "/");
But it returns:
'https://www.google.es/test/hello/world'
It doesn't replace the second "/test/"
Any clues on how I could do this with a regex?
I basically want to replace the content that lives between the slashes, but not the slashes themselves.

Something like this would work:
/(\/[^\/]+)(?:\1)+\//g
( - open capture group #1
\/ - find a literal slash
[^\/]+ - capture at least one non-slash char
) - close capture group #1
(?:\1)+ - repeat capture group #1 one or more times
\/ - ensure a closing slash
/g - global modifier
https://regex101.com/r/NgJA3X/1
var regex = /(\/[^\/]+)(?:\1)+\//g;
var str = `https://www.google.es/test/test/hello/world
https://www.google.es/test/test/test/test/hello/world
https://www.google.es/test/test/hello/test/hello/hello/world
https://www.google.es/test/hello/world/world
https://www.google.es/test/hello/helloworld/world`;
var subst = `/`; // keep one slash so the neighbouring parts stay separated
// The substituted value will be contained in the result variable
var result = str.replace(regex, subst);
console.log(result);

You can do this with a regular expression, but it sounds like your intent is to replace only individual parts of the pathname component of a URL.
A URL has other components (such as the fragment identifier) which could contain the pattern that you describe, and handling that distinction with a regular expression is more challenging.
The URL class is designed to help solve problems just like this, and you can replace just the path parts using a functional technique like this:
function replacePathParts (url, targetStr, replaceStr = '') {
  url = new URL(url);
  const updatedPathname = url.pathname
    .split('/')
    .map(str => str === targetStr ? replaceStr : str)
    .filter(Boolean)
    .join('/');
  url.pathname = updatedPathname;
  return url.href;
}
const inputUrl = 'https://www.google.es/test/test/hello/world';
const result1 = replacePathParts(inputUrl, 'test');
console.log(result1); // "https://www.google.es/hello/world"
const result2 = replacePathParts(inputUrl, 'test', 'message');
console.log(result2); // "https://www.google.es/message/message/hello/world"

Based on conversation in comment, you can use this solution:
let url = "https://www.google.es/test/test/hello/world"
console.log( url.replace(/(?:\/test)+(?=\/)/g, "/WORKS") );
//=> https://www.google.es/WORKS/hello/world
RegEx Breakdown:
(?:\/test)+: Match 1+ repeats of text /test
(?=\/): Lookahead to assert that we have a / ahead
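The same pattern can be built for any segment name rather than a hard-coded "test". The following is a sketch only: collapseSegment is a hypothetical helper, and it assumes the replacement string contains no special "$" sequences.

```javascript
// Hypothetical helper: replaces one or more consecutive "/segment" runs
// that are followed by a slash.
function collapseSegment(url, segment, replacement) {
  // Escape regex metacharacters so the segment is matched literally.
  const escaped = segment.replace(/[.*+?^${}()|[\]\\\/]/g, "\\$&");
  // Same shape as the pattern above: (?:\/segment)+(?=\/)
  const pattern = new RegExp(`(?:\\/${escaped})+(?=\\/)`, "g");
  return url.replace(pattern, replacement);
}

const duplicated = "https://www.google.es/test/test/hello/world";
console.log(collapseSegment(duplicated, "test", ""));       // https://www.google.es/hello/world
console.log(collapseSegment(duplicated, "test", "/WORKS")); // https://www.google.es/WORKS/hello/world
```

Because the lookahead does not consume the closing slash, that slash is still available to separate the next path segment.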

How to regex replace a query string with matching 2 words?

I have a url and I want to replace the query string. For example
www.test.com/is/images/383773?wid=200&hei=200
I want to match the wid= and hei= and the numbers don't have to be 200 to replace the whole thing so it should look like this.
Expected
www.test.com/is/images/383773?#HT_dtImage
So I've tried the following, but it only replaced the matching wid and hei.
const url = "www.test.com/is/images/383773?wid=200&hei=200"
url.replace(/(wid)(hei)_[^\&]+/, "#HT_dtImage")

You can match either wid= or hei= until the next optional ampersand and then remove those matches, and then append #HT_dtImage to the result.
\b(?:wid|hei)=[^&]*&?
The pattern matches:
\b A word boundary to prevent a partial word match
(?:wid|hei)= Non capture group, match either wid or hei followed by =
[^&]*&? Match 0+ times a char other than &, and then match an optional &
See a regex demo.
let url = "www.test.com/is/images/383773?wid=200&hei=200"
url = url.replace(/\b(?:wid|hei)=[^&]*&?/g, "") + "#HT_dtImage";
console.log(url)

I would just use string split here:
var url = "www.test.com/is/images/383773?wid=200&hei=200";
var output = url.split("?")[0] + "?#HT_dtImage";
console.log(output);
If you only want to target query strings having both keys wid and hei, then use a regex approach:
var url = "www.test.com/is/images/383773?wid=200&hei=200";
var output = url.replace(/(.*)\?(?=.*\bwid=\d+)(?=.*\bhei=\d+).*/, "$1?#HT_dtImage");
console.log(output);
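The "both keys must be present" check can also be sketched without lookaheads by parsing the query with URLSearchParams. The helper name replaceSizeQuery is hypothetical, and unlike the regex above this version only checks that the keys exist, not that their values are numeric. It also assumes the input has no "#" fragment.

```javascript
// Hypothetical helper: swaps the whole query for a marker when both
// "wid" and "hei" are present; otherwise returns the input unchanged.
function replaceSizeQuery(url, marker = "#HT_dtImage") {
  const queryStart = url.indexOf("?");
  if (queryStart === -1) return url; // no query string at all
  const params = new URLSearchParams(url.slice(queryStart + 1));
  if (!params.has("wid") || !params.has("hei")) return url;
  return url.slice(0, queryStart) + "?" + marker;
}

console.log(replaceSizeQuery("www.test.com/is/images/383773?wid=200&hei=200"));
// www.test.com/is/images/383773?#HT_dtImage
```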

You can match everything from the question mark onward using the regex /\?.*/
const url = 'www.test.com/is/images/383773?wid=200&hei=200';
const result = url.replace(/\?.*/, '?#HT_dtImage');
console.log(result);

Try
url.replace(/\?.*/, "?#HT_dtImage")

JavaScript - more concise way to grab string between two characters in URL?

Is there a more concise or more standard way to grab a string between the last slash and query question mark of a URL than this?
const recordId= window.location.href.split("item/")[1].split("?")[0]
In this case I'm using item/ because my URLs are always:
mysite.com/item/recordIdIwantToGrab?foo=bar&life=42

A regular expression can do the trick - match a /, followed by word characters, up until a ?.
const str = 'mysite.com/item/recordIdIwantToGrab?foo=bar&life=42';
const result = str.match(/\/(\w+)\?/)[1];
console.log(result);
\/ - match a literal /
(\w+) - capturing group, match word characters
\? - match a literal ?
[1] - extract the value matched by the capturing group

We can achieve this with the URL class in JavaScript.
let url = 'http://example.com/item/recordIdIwantToGrab?foo=bar&life=42';
url = new URL(url);
let result = url.pathname.split('/').at(-1);
console.log(result);

You can achieve the result using lastIndexOf and indexOf
const str = `mysite.com/item/recordIdIwantToGrab?foo=bar&life=42`;
const result = str.slice(str.lastIndexOf("/") + 1, str.indexOf("?"));
console.log(result);
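One edge case with this approach: when the string has no "?", indexOf returns -1, and slice(start, -1) silently drops the last character. A guarded variant might look like the sketch below; lastSegment is a made-up name for illustration.

```javascript
// Hypothetical helper: text between the last slash before the query and the "?".
function lastSegment(str) {
  const queryStart = str.indexOf("?");
  // Fall back to the end of the string when there is no query part.
  const end = queryStart === -1 ? str.length : queryStart;
  // Search backwards from `end` so slashes inside the query are ignored.
  return str.slice(str.lastIndexOf("/", end) + 1, end);
}

console.log(lastSegment("mysite.com/item/recordIdIwantToGrab?foo=bar&life=42")); // recordIdIwantToGrab
console.log(lastSegment("mysite.com/item/recordIdIwantToGrab"));                 // recordIdIwantToGrab
```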

Manipulate a string containing a file's path to get only the file name

I have a file path such as
falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV
The file extension changes with the file type, and the path itself also changes.
How can I manipulate the string to get just the file name, which in this example is
BD6FE729-70F1-48B0-83EB-8E7D956E599E
Here is a second example, with a different path and file type:
falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540ppphrx%252Fiia-mas-app-new//IIAMASATTCHMENTS/DD6FE729-60F2-58B0-8M8B-8E759R6E547K.jpeg

You can simply do the following (this assumes the only dot in the string is the one before the extension):
let str="falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV"
console.log(str.split(".")[0].split("/").pop())
Just remember: split, split, pop.

Some variation of slice/split would work
const str = 'falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV'
console.log(
str.slice(str.lastIndexOf("/")+1).split(".")[0]
)
// or
console.log(
str.split("/").pop().split(".")[0]
)
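Both variants cut the file name at its first dot, so a name such as archive.tar.gz would come back as archive. If cutting at the last dot is preferred, a small helper along these lines could be used. This is a sketch; baseName is a hypothetical name, and treating dotfiles as having no extension is an assumption made here.

```javascript
// Hypothetical helper: file name without its final extension.
function baseName(path) {
  const file = path.slice(path.lastIndexOf("/") + 1);
  const dot = file.lastIndexOf(".");
  // dot <= 0 means no extension (or a dotfile such as ".env"): keep the name whole.
  return dot > 0 ? file.slice(0, dot) : file;
}

console.log(baseName("/docs/IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV"));
// BD6FE729-70F1-48B0-83EB-8E7D956E599E
```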

You can use a regular expression, for example.
The first thing that comes to mind is:
const filepath = 'falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV'
const filenameWithoutExtension = filepath.match(/IIAMASATTCHMENTS\/(.*)\./)[1] // "BD6FE729-70F1-48B0-83EB-8E7D956E599E"
console.log(filenameWithoutExtension)

If you know the format of the value you want to capture, you might get a more exact match using a regex and capture your value in the first capturing group.
You might use the /i flag to make the match case insensitive.
([A-Z0-9]+(?:-[A-Z0-9]+){4})\.\w+$
That will match:
( Capturing group
[A-Z0-9]+ Match 1+ times what is listed in the character class
(?:-[A-Z0-9]+){4} Repeat 4 times matching a hyphen and 1+ times what is listed in the character class
) Close capturing group
\.\w+$ Match a dot, 1+ times a word char and assert the end of the string
Regex demo
let strs = [
`falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV`,
`falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540ppphrx%252Fiia-mas-app-new//IIAMASATTCHMENTS/DD6FE729-60F2-58B0-8M8B-8E759R6E547K.jpeg`
];
let pattern = /([A-Z0-9]+(?:-[A-Z0-9]+){4})\.\w+$/i;
strs.forEach(str => console.log(str.match(pattern)[1]));

You could use regular expressions like here:
function get_filename(str) {
const regex = /\/([A-Z0-9\-_]+)\.[\w\d]+/gm;
let m = regex.exec(str);
return m[1];
}
console.log(
get_filename(`falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540ppphrx%252Fiia-mas-app-new//IIAMASATTCHMENTS/DD6FE729-60F2-58B0-8M8B-8E759R6E547K.jpeg`)
)

var filpath = "falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV"
var fileName = filpath.substring(filpath.lastIndexOf('/') + 1);
// Start at index 0: starting at 1 would drop the first character of the name.
console.log(fileName.substring(0, fileName.lastIndexOf('.')));

var str = "falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV",
re = /[\w-]*\.\w*/, // word characters or hyphens, a dot, then the extension
stringNameWithExt = str.match(re)[0],
stringNameWithoutExt = stringNameWithExt.split(".")[0];
console.log(stringNameWithoutExt)

javascript regex - match between slashes

I have a URL: https://api.example.com/main/user1/collection1/contents/folder1/test.js
To capture only collection1, I tried the following but it returns user1 along with it, how can I exclude user1?
(?:\/user1\/)[^\/]*
To capture only folder1/test.js, the following also returns contents and I want to exclude it
contents\/(.*)$

If you want to use a regex, you could try (?:\/user1\/)([^\/]+) to capture only collection1.
This captures the part after /user1/ in a capturing group ([^\/]+).
const pattern = /\/user1\/([^\/]+)/;
const str = "https://api.example.com/main/user1/collection1/contents/folder1/test.js";
const matches = str.match(pattern);
console.log(matches[1]);
To capture only folder1/test.js you could use folder1\/.*$
const pattern = /folder1\/.*$/;
const str = "https://api.example.com/main/user1/collection1/contents/folder1/test.js";
const matches = str.match(pattern);
console.log(matches[0]);
Without regex you might use URL:
const str = "https://api.example.com/main/user1/collection1/contents/folder1/test.js";
const url = new URL(str);
const parts = url.pathname.split('/');
console.log(parts[3]);
console.log(parts[5] + '/' + parts[6]);
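The fixed indexes above depend on the exact depth of the path. A variant that looks segments up by name might be sketched as follows; segmentAfter and pathAfter are hypothetical helpers, and both assume the named segment appears at most once in the path.

```javascript
// Hypothetical helper: path segments without the empty leading entry.
function pathParts(urlString) {
  return new URL(urlString).pathname.split("/").filter(Boolean);
}

// The single segment that follows `name`, or undefined if `name` is absent.
function segmentAfter(urlString, name) {
  const parts = pathParts(urlString);
  const i = parts.indexOf(name);
  return i === -1 ? undefined : parts[i + 1];
}

// Everything that follows `name`, joined back with slashes.
function pathAfter(urlString, name) {
  const parts = pathParts(urlString);
  const i = parts.indexOf(name);
  return i === -1 ? undefined : parts.slice(i + 1).join("/");
}

const apiUrl = "https://api.example.com/main/user1/collection1/contents/folder1/test.js";
console.log(segmentAfter(apiUrl, "user1")); // collection1
console.log(pathAfter(apiUrl, "contents")); // folder1/test.js
```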
