Regex replace last part of url path if condition matches? - javascript

I am basically trying to remove the last part of a URL if the URL contains the path /ice/flag/. Example:
Input:
https://test.com/plants/ice/flag/237468372912873
Desired Output:
Because the above URL has /ice/flag/ in its path, I want the last part of the URL to be replaced with redacted.
https://test.com/plants/ice/flag/redacted
However, if the URL did not have /ice/flag (ex: https://test.com/plants/not_ice/flag/237468372912873), it shouldn't be replaced.
What I tried to do is to use the answer mentioned here to change the last part of the path:
var url = 'https://test.com/plants/ice/flag/237468372912873'
url = url.replace(/\/[^\/]*$/, '/redacted')
This works in doing the replacement, but I am unsure how to modify this so that it only matches if /ice/flag is in the path. I tried putting \/ice\/flag in certain parts of the regex to change the behavior to only replace if that is in the string, but nothing has been working. Any tips from those more experienced with regex on how to do this would be greatly appreciated, thank you!
Edit: The URL can be formed in different ways, so there may be other paths before or after /ice/flag/. So all of these are possibilities:
Input:
https://test.com/plants/ice/flag/237468372912873
https://test.com/plants/extra/ice/flag/237468372912873
https://test.com/plants/ice/flag/extra/237468372912873
https://test.com/plants/ice/flag/extra/extra/237468372912873
https://test.com/plants/ice/flag/extra/237468372912873?paramOne=1&paramTwo=2#someHash
Desired Output:
https://test.com/plants/ice/flag/redacted
https://test.com/plants/extra/ice/flag/redacted
https://test.com/plants/ice/flag/extra/redacted
https://test.com/plants/ice/flag/extra/extra/redacted
https://test.com/plants/ice/flag/extra/redacted?paramOne=1&paramTwo=2#someHash

You may search for this regex:
(\/ice\/flag\/(?:[^?#]*\/)?)[^\/#?]+
and replace it with:
$1redacted
RegEx Breakup:
(: Start capture group #1
\/ice\/flag\/: Match /ice/flag/
(?:[^?#]*\/)?: Match 0 or more of any char that is not # and ? followed by a / as an optional match
): End capture group #1
[^\/#?]+ Match 1+ of any char that is not / and # and ?
Code:
var arr = [
'https://test.com/plants/ice/flag/237468372912873',
'https://test.com/plants/ice/flag/a/b/237468372912873',
'https://test.com/a/ice/flag/e/237468372912873?p=2/12#aHash',
'https://test.com/plants/not_ice/flag/237468372912873'];
var rx = /(\/ice\/flag\/(?:[^?#\n]*\/)?)[^\/#?\n]+/;
var subst = '$1redacted';
arr.forEach(el => console.log(el.replace(rx, subst)));

Here is functional code with test input strings based on your requirements:
const input = [
'https://test.com/plants/ice/flag/237468372912873',
'https://test.com/plants/extra/ice/flag/237468372912873',
'https://test.com/plants/ice/flag/extra/237468372912873',
'https://test.com/plants/ice/flag/extra/extra/237468372912873',
'https://test.com/plants/ice/flag/extra/237468372912873#someHash',
'https://test.com/plants/ice/flag/extra/237468372912873?paramOne=1&paramTwo=2#someHash',
'https://test.com/plants/not_ice/flag/237468372912873'
];
const re = /(\/ice\/flag\/([^\/#?]+\/)*)[^\/#?]+/;
input.forEach(str => {
console.log('str: ' + str + '\n => ' + str.replace(re, '$1redacted'));
});
Output:
str: https://test.com/plants/ice/flag/237468372912873
=> https://test.com/plants/ice/flag/redacted
str: https://test.com/plants/extra/ice/flag/237468372912873
=> https://test.com/plants/extra/ice/flag/redacted
str: https://test.com/plants/ice/flag/extra/237468372912873
=> https://test.com/plants/ice/flag/extra/redacted
str: https://test.com/plants/ice/flag/extra/extra/237468372912873
=> https://test.com/plants/ice/flag/extra/extra/redacted
str: https://test.com/plants/ice/flag/extra/237468372912873#someHash
=> https://test.com/plants/ice/flag/extra/redacted#someHash
str: https://test.com/plants/ice/flag/extra/237468372912873?paramOne=1&paramTwo=2#someHash
=> https://test.com/plants/ice/flag/extra/redacted?paramOne=1&paramTwo=2#someHash
str: https://test.com/plants/not_ice/flag/237468372912873
=> https://test.com/plants/not_ice/flag/237468372912873
Regex:
( - capture group start
\/ice\/flag\/ - expect /ice/flag/
([^\/#?]+\/)* - zero or more patterns of chars other than /, #, ?, followed by /
) - capture group end
[^\/#?]+ - discard anything that is not /, #, ? but expect at least one char (this will force stuff after the last /)

You can use a ternary operator: check whether the URL contains /ice/flag with url.includes('/ice/flag'); if it does, apply url.replace(/\/[^\/]*$/, '/redacted'), otherwise return the URL unchanged.
function replace(url) {
return url.includes('/ice/flag') ? url.replace(/\/[^\/]*$/, '/redacted') : url;
}
console.log(replace("https://test.com/plants/ice/flag/237468372912873"))
console.log(replace("https://test.com/plants/not_ice/flag/237468372912873"));
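The includes() check above looks at the whole string, and /\/[^\/]*$/ replaces everything after the last slash, so a query string or hash would be swallowed together with the ID. Here is a minimal sketch that works on the pathname only, through the URL class. The helper name redactLastSegment is hypothetical, and the sketch assumes the input is an absolute URL that new URL() can parse:

```javascript
// Hypothetical helper: redact the last path segment only when the
// pathname contains /ice/flag/. Query string and hash are left untouched.
function redactLastSegment(input) {
  const url = new URL(input);
  // Test the pathname only, so a query string or hash cannot trigger the rule.
  if (!url.pathname.includes('/ice/flag/')) return input;
  // Replace whatever follows the last slash of the path.
  url.pathname = url.pathname.replace(/\/[^\/]*$/, '/redacted');
  return url.href;
}

console.log(redactLastSegment('https://test.com/plants/ice/flag/extra/237468372912873?paramOne=1&paramTwo=2#someHash'));
console.log(redactLastSegment('https://test.com/plants/not_ice/flag/237468372912873'));
```

Because the URL object keeps the search and hash components separate, only the path changes. URLs that do not match are returned exactly as they were passed in.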

Related

regex for ignoring character if inside () parenthesis?

I was working on a regex and ran into this bug:
For the string "+1/(1/10)+(1/30)+1/50" I used the regex /\+.[^\+]*/g,
and it works fine, giving me ['+1/(1/10)', '+(1/30)', '+1/50'].
The real problem appears when a + is inside parentheses,
like this: "+1/(1+10)+(1/30)+1/50"
because the result is ['+1/(1', '+10)', '+(1/30)', '+1/50'],
which isn't what I want. What I want is ['+1/(1+10)', '+(1/30)', '+1/50'],
so when the regex sees \(.*\) it should skip over it as if it weren't there.
How can I make the regex ignore the parentheses?
My code (JS):
const tests = {
correct: "1/(1/10)+(1/30)+1/50",
wrong : "1/(1+10)+(1/30)+1/50"
}
function getAdditionArray(string) {
const REGEX = /\+.[^\+]*/g; // change this to ignore the () even if they have the + sign
const firstChar = string[0];
if (firstChar !== "-") string = "+" + string;
return string.match(REGEX);
}
console.log(
getAdditionArray(tests.correct),
getAdditionArray(tests.wrong),
)
You can exclude parentheses from the first part of the match, and then optionally match (...)
\+[^+()]*(?:\([^()]*\))?
The pattern matches:
\+ Match a +
[^+()]* Match optional chars other than + ( )
(?: Non capture group to match as a whole part
\([^()]*\) Match from (...)
)? Close the non capture group and make it optional
See a regex101 demo.
Another option could be to be more specific about the digits and the + and / and use a character class to list the allowed characters.
\+(?:\d+[+/])?(?:\(\d+[/+]\d+\)|\d+)
See another regex101 demo.
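As a sketch, here is the first pattern plugged into the question's getAdditionArray function. The constant name TERM is made up for illustration, and the `|| []` fallback for inputs with no match is an addition. Note that the pattern handles at most one parenthesised group per term, without nesting, and only at the end of the term:

```javascript
// First pattern from the answer: a "+", then characters other than + ( ),
// then optionally one parenthesised group without nested parentheses.
const TERM = /\+[^+()]*(?:\([^()]*\))?/g;

function getAdditionArray(string) {
  // Same rule as in the question: prefix a "+" unless the string starts with "-".
  if (string[0] !== '-') string = '+' + string;
  // match() returns null when nothing matches; fall back to an empty array.
  return string.match(TERM) || [];
}

console.log(getAdditionArray('1/(1/10)+(1/30)+1/50')); // [ '+1/(1/10)', '+(1/30)', '+1/50' ]
console.log(getAdditionArray('1/(1+10)+(1/30)+1/50')); // [ '+1/(1+10)', '+(1/30)', '+1/50' ]
```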

How can I replace all duplicated paths of a url with a JS Regex

For the following URL:
https://www.google.es/test/test/hello/world
I want to replace all the occurrences of "/test/", and it's important that "test" starts and ends with a "/".
I tried with:
let url = "https://www.google.es/test/test/hello/world"
url.replace(/\/test\//g, "/");
But it returns:
'https://www.google.es/test/hello/world'
It doesn't replace the second "/test/".
Any clues on how I could do this with a regex?
I basically want to replace the content that lives between the slashes, but not the slashes themselves.
Something like this would work:
/(\/[^\/]+)(?:\1)+\//g
( - open capture group #1
\/ - find a literal slash
[^\/]+ - capture at least one non-slash char
) - close capture group #1
(?:\1)+ - repeat capture group #1 one or more times
\/ - ensure a closing slash
/g - global modifier
https://regex101.com/r/NgJA3X/1
var regex = /(\/[^\/]+)(?:\1)+\//g;
var str = `https://www.google.es/test/test/hello/world
https://www.google.es/test/test/test/test/hello/world
https://www.google.es/test/test/hello/test/hello/hello/world
https://www.google.es/test/hello/world/world
https://www.google.es/test/hello/helloworld/world`;
// Replace each run of duplicated segments with a single slash so the URL stays well formed
var subst = `/`;
// The substituted value will be contained in the result variable
var result = str.replace(regex, subst);
console.log(result);
You can do this with a regular expression, but it sounds like your intent is to replace only individual parts of the pathname component of a URL.
A URL has other components (such as the fragment identifier) which could contain the pattern that you describe, and handling that distinction with a regular expression is more challenging.
The URL class is designed to help solve problems just like this, and you can replace just the path parts using a functional technique like this:
function replacePathParts (url, targetStr, replaceStr = '') {
url = new URL(url);
const updatedPathname = url.pathname
.split('/')
.map(str => str === targetStr ? replaceStr : str)
.filter(Boolean)
.join('/');
url.pathname = updatedPathname;
return url.href;
}
const inputUrl = 'https://www.google.es/test/test/hello/world';
const result1 = replacePathParts(inputUrl, 'test');
console.log(result1); // "https://www.google.es/hello/world"
const result2 = replacePathParts(inputUrl, 'test', 'message');
console.log(result2); // "https://www.google.es/message/message/hello/world"
Based on conversation in comment, you can use this solution:
let url = "https://www.google.es/test/test/hello/world"
console.log( url.replace(/(?:\/test)+(?=\/)/g, "/WORKS") );
//=> https://www.google.es/WORKS/hello/world
RegEx Breakdown:
(?:\/test)+: Match 1+ repeats of text /test
(?=\/): Lookahead to assert that we have a / ahead
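The lookahead pattern above can be built for any segment name. The following is only a sketch: the helper names escapeRegExp and removeSegmentRuns are hypothetical, and it assumes the host name is not identical to the segment being removed, since the pattern is applied to the whole URL string:

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: remove every run of "/segment" that is followed by a slash.
function removeSegmentRuns(url, segment) {
  const pattern = new RegExp('(?:/' + escapeRegExp(segment) + ')+(?=/)', 'g');
  return url.replace(pattern, '');
}

console.log(removeSegmentRuns('https://www.google.es/test/test/hello/world', 'test'));
// https://www.google.es/hello/world
```

The lookahead keeps the slash that follows the run, so the remaining path stays joined by a single slash.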

Regex to extract city and city code from a dynamic URL in React

I am storing the URL in a location variable, and it is dynamic, for example:
1- https://abc.go.com/United_States,US/
2- https://abc.go.com/US/
3- https://abc.go.com/Uganda,UG/
4- https://abc.go.com/United_States,US
5- https://abc.go.com/
The URLs are completely dynamic.
I was able to extract the state code with the following regex:
const cityCode = location.split(/(?=[A-Z][A-Z])/)[1].split('/')[0];
// this gets the two consecutive uppercase characters and removes the trailing "/" if present
Is there a regex that extracts the country and state code into a single variable, if they are present in the URL?
You can use a capture group with an optional part for the first part and the comma
https?:\/\/\S+?\/((?:\w+,)?[A-Z][A-Z])
https?:\/\/ Match the protocol with optional s
\S+?\/ Match 1+ non-whitespace chars, as few as possible, followed by /
( Capture group 1 (Which is accessed by m[1] in the example code)
(?:\w+,)? Optionally match 1+ word characters and a comma
[A-Z][A-Z] Match 2 uppercase chars
) Close group 1
const pattern = /https?:\/\/\S+?\/((?:\w+,)?[A-Z][A-Z])/;
[
"https://abc.go.com/United_States,US/",
"https://abc.go.com/US/",
"https://abc.go.com/Uganda,UG/",
"https://abc.go.com/United_States,US",
"https://abc.go.com/"
].forEach(s => {
const m = s.match(pattern);
console.log(m ? m[1] : "no match");
});
Or if it is the part after the first forward slash
https?:\/\/[^\s\/]+\/((?:\w+,)?[A-Z][A-Z])
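A small sketch of this second pattern in use. The names firstSegmentPattern and extractLocation are hypothetical, and returning null when nothing matches is an assumption about what the caller wants:

```javascript
// Second pattern from the answer: the capture must start right after the host.
const firstSegmentPattern = /https?:\/\/[^\s\/]+\/((?:\w+,)?[A-Z][A-Z])/;

// Hypothetical helper: return "Name,CODE" or "CODE", or null when absent.
function extractLocation(url) {
  const m = url.match(firstSegmentPattern);
  return m ? m[1] : null;
}

console.log(extractLocation('https://abc.go.com/United_States,US/')); // United_States,US
console.log(extractLocation('https://abc.go.com/US/'));               // US
console.log(extractLocation('https://abc.go.com/'));                  // null
```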
Reference: String.prototype.match()
urls = ['https://abc.go.com/United_States,US/',
'https://abc.go.com/US/',
'https://abc.go.com/Uganda,UG/',
'https://abc.go.com/United_States,US',
'https://abc.go.com/'
];
urls.forEach(url => {
console.log(url.match(/.*:\/\/.*?\/(.*)/m)[1].replace(/\/$/, ''))
});
var pathName = location.pathname;
var firstSegment = pathName.split('/')[1];
var countryCode = firstSegment ? firstSegment.split(',') : [];
console.log(countryCode);
// if the Window URL is - "https://test.org/United_States,US" then result is - ["United_States", "US"]
// if the Window URL is - "https://test.org/United_States" then result is - ["United_States"]

Manipulate a string containing a file's path to get only the file name

I have a file path like this:
falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV
The file extension changes with the file type, and the path can change as well.
How can I manipulate the string to get just the file name, which in this example is
BD6FE729-70F1-48B0-83EB-8E7D956E599E
A second example, with a different path and file type:
falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540ppphrx%252Fiia-mas-app-new//IIAMASATTCHMENTS/DD6FE729-60F2-58B0-8M8B-8E759R6E547K.jpeg
You can simply do:
let str="falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV"
console.log( str.split(".")[0].split("/").pop()
)
Just remember: split, split, pop.
Some variation of slice/split would work
const str = 'falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV'
console.log(
str.slice(str.lastIndexOf("/")+1).split(".")[0]
)
// or
console.log(
str.split("/").pop().split(".")[0]
)
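The split-based variants above cut at the first dot they find, which assumes that neither a directory name nor the file name itself contains an extra dot. Here is a minimal sketch that looks only at the last path segment and its last dot. The helper name fileNameWithoutExtension is hypothetical:

```javascript
// Hypothetical helper: take the text after the last "/" and drop the final extension.
function fileNameWithoutExtension(path) {
  const name = path.slice(path.lastIndexOf('/') + 1);
  const dot = name.lastIndexOf('.');
  // dot > 0 keeps names without an extension, and dotfiles such as ".hidden", intact.
  return dot > 0 ? name.slice(0, dot) : name;
}

console.log(fileNameWithoutExtension('/IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV'));
// BD6FE729-70F1-48B0-83EB-8E7D956E599E
```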
You can use a regular expression, for example.
The first thing that comes to mind is:
const filepath = 'falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV'
const filenameWithoutExtension = filepath.match(/IIAMASATTCHMENTS\/(.*)\./)[1] // "BD6FE729-70F1-48B0-83EB-8E7D956E599E"
console.log(filenameWithoutExtension)
If you know the format of the value you want to capture, you might get a more exact match using a regex and capture your value in the first capturing group.
You might use the /i flag to make the match case insensitive.
([A-Z0-9]+(?:-[A-Z0-9]+){4})\.\w+$
That will match:
( Capturing group
[A-Z0-9]+ Match 1+ times what is listed in the character class
(?:-[A-Z0-9]+){4} Repeat 4 times matching a hyphen and 1+ times what is listed in the character class
) Close capturing group
\.\w+$ Match a dot, 1+ times a word char and assert the end of the string
let strs = [
`falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV`,
`falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540ppphrx%252Fiia-mas-app-new//IIAMASATTCHMENTS/DD6FE729-60F2-58B0-8M8B-8E759R6E547K.jpeg`
];
let pattern = /([A-Z0-9]+(?:-[A-Z0-9]+){4})\.\w+$/i;
strs.forEach(str => console.log(str.match(pattern)[1]));
You could use regular expressions like here:
function get_filename(str) {
const regex = /\/([A-Z0-9\-_]+)\.[\w\d]+/gm;
let m = regex.exec(str);
return m[1];
}
console.log(
get_filename(`falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540ppphrx%252Fiia-mas-app-new//IIAMASATTCHMENTS/DD6FE729-60F2-58B0-8M8B-8E759R6E547K.jpeg`)
)
var filpath = "falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV"
// Take everything after the last slash, then drop the extension.
var fileName = filpath.substring(filpath.lastIndexOf('/') + 1);
console.log(fileName.substring(0, fileName.lastIndexOf('.')));
var str = "falsefile:///var/mobile/Containers/Data/Application/D4B6F6CD-5E5C-4459-90CC-0C649B3B31B8/Documents/ExponentExperienceData/%2540hherax%252Fiia-mas-app-new//IIAMASATTCHMENTS/BD6FE729-70F1-48B0-83EB-8E7D956E599E.MOV",
re = /[\w-]*\.\w*/, // file name characters, a dot, then the extension
stringNameWithExt = str.match(re)[0],
stringNameWithoutExt = stringNameWithExt.split(".")[0];
console.log(stringNameWithoutExt)

regex to extract numbers starting from second symbol

Sorry for one more to the tons of regexp questions but I can't find anything similar to my needs. I want to output the string which can contain number or letter 'A' as the first symbol and numbers only on other positions. Input is any string, for example:
---INPUT--- -OUTPUT-
A123asdf456 -> A123456
0qw#$56-398 -> 056398
B12376B6f90 -> 12376690
12A12345BCt -> 1212345
What I tried is replace(/[^A\d]/g, '') (I use JS), which almost does the job except the case when there's A in the middle of the string. I tried to use ^ anchor but then the pattern doesn't match other numbers in the string. Not sure what is easier - extract matching characters or remove unmatching.
I think you can do it like this, using a negative lookahead and then replacing the match with an empty string.
In a non-capturing group (?:, use a negative lookahead (?! to assert that what follows is neither an A at the beginning of the string (^A) nor a digit (\d). If the assertion holds, match any character with . and repeat the group one or more times.
(?:(?!^A|\d).)+
var pattern = /(?:(?!^A|\d).)+/g;
var strings = [
"A123asdf456",
"0qw#$56-398",
"B12376B6f90",
"12A12345BCt"
];
for (var i = 0; i < strings.length; i++) {
console.log(strings[i] + " ==> " + strings[i].replace(pattern, ""));
}
You can match and capture desired and undesired characters within two different sides of an alternation, then replace those undesired with nothing:
^(A)|\D
JS code:
var inputStrings = [
"A-123asdf456",
"A123asdf456",
"0qw#$56-398",
"B12376B6f90",
"12A12345BCt"
];
console.log(
inputStrings.map(v => v.replace(/^(A)|\D/g, "$1"))
);
You can use the following regex: /(^A|\d)/g. It matches either an A at the start of the string or any digit; join the matches to get the result:
var arr = ['A123asdf456','0qw#$56-398','B12376B6f90','12A12345BCt', 'A-123asdf456'],
result = arr.map(s => s.match(/(^A|\d)/g).join(''));
console.log(result);
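One caveat with the match-and-join approach: match() returns null when the input contains neither a leading A nor any digit, so calling .join on the result would throw. The sketch below adds a fallback for that case. The function name keepAllowed is hypothetical:

```javascript
// Hypothetical helper: keep a leading "A" and every digit, drop everything else.
function keepAllowed(s) {
  // match() returns null when nothing matches; fall back to an empty array.
  const kept = s.match(/^A|\d/g) || [];
  return kept.join('');
}

console.log(keepAllowed('A123asdf456')); // A123456
console.log(keepAllowed('12A12345BCt')); // 1212345
console.log(keepAllowed('xyz'));         // (empty string)
```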
