Extracting certain values from a large URL string in javascript? - javascript

I have a big URL hash that is given in string form and need to extract each part of it:
type=recovery&access_token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9eyJhdWQiOiJhdXRoZW50aWNhdGV&expires_in=3600&refresh_token=sYlurmTtfrAhyHl39Oqwww&token_type=bearer&type=recovery
I have tried substr() but it isn't reliable because each item may have a differing amount of characters each time.
What's the best way to extract type, access_token, expires_in, refresh_token, token_type reliably?

Please use URLSearchParams
var urlSearchParams = new URLSearchParams("type=recovery&access_token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9eyJhdWQiOiJhdXRoZW50aWNhdGV&expires_in=3600&refresh_token=sYlurmTtfrAhyHl39Oqwww&token_type=bearer&type=recovery")
const params = Object.fromEntries(urlSearchParams.entries());
console.log(params)
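One caveat: Object.fromEntries keeps only the last value when a key repeats, and the sample string contains type twice. The following is a minimal sketch of reading each field by name instead. The helper name parseHashParams and the short placeholder tokens are assumptions made for illustration, not part of the original question.

```javascript
// Hypothetical helper: parse a hash- or query-style string into named fields.
function parseHashParams(hash) {
  // Strip a leading "#" or "?" if the string still carries one.
  const cleaned = hash.replace(/^[#?]/, "");
  const params = new URLSearchParams(cleaned);
  return {
    type: params.get("type"),              // first value when the key repeats
    allTypes: params.getAll("type"),       // every value for a repeated key
    accessToken: params.get("access_token"),
    expiresIn: Number(params.get("expires_in")),
    refreshToken: params.get("refresh_token"),
    tokenType: params.get("token_type"),
  };
}

// Placeholder tokens, shortened for readability.
const parsed = parseHashParams(
  "#type=recovery&access_token=abc123&expires_in=3600&refresh_token=xyz789&token_type=bearer&type=recovery"
);
console.log(parsed);
```

get() returns null for a missing key, so callers can check for absent fields before using them.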

You could use URL:
// Adding a dummy domain so the string is a valid url
let x = new URL('http://dummydomain.tpl/dummyfile.ext?' +
'type=recovery&access_token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9eyJhdWQiOiJhdXRoZW50aWNhdGV&expires_in=3600&refresh_token=sYlurmTtfrAhyHl39Oqwww&token_type=bearer&type=recovery');
x.searchParams.get("refresh_token")

Related

How do I split a website URL with multiple separators in JavaScript?

I'm trying to create a Custom JavaScript Variable in Google Tag Manager to split up information from a page url that has multiple separators. For example, in https://website.com/item/2006-yellow-submarine, I would want to capture the 2006. I've been using the code below to separate a URL based on one separator at a time (- or /). But if I used the code below to pull 2006, it would pull 2006 and everything after (so the data pulled would be 2006-yellow-submarine and not just 2006).
function() {
var pageUrl = window.location.href;
return pageUrl.split("/")[4];
}
Is there a way to extract only the 2006, or to essentially use a combination of - and / separators to pull single path points from a URL, without having to specify the URL in the code each time?
Since it's a variable meant to be used to automatically capture the year on each page, I can't make a variable for every individual page of the website. Therefore the solution can't involve specifying the URL.
You can split by RegExp, splitting by either / or -, i.e. /[/-]/:
const u = new URL("https://website.com/item/2006-yellow-submarine");
const pathname = u.pathname;
console.log(pathname);
// skip the first item... it will always be empty
const [, ...pathParts] = pathname.split(/[/-]/);
console.log(pathParts);
Make a new URL object from the string, extract the last part of the pathname, and then match the year with a small regular expression.
// Create a new URL object from the string
const url = new URL('https://website.com/item/2006-yellow-submarine');
// `split` the pathname on "/", and `pop` off
// the last element of that array
const end = url.pathname.split('/').pop();
// Then match the year
console.log(end.match(/[0-9]{4}/)[0]);
You could wrap the two splits in a small function:
function splitWebURL(url, loc1, loc2) {
return url.split("/")[loc1].split("-")[loc2];
}
var test = splitWebURL("https://website.com/item/2006-yellow-submarine", 4, 0);
console.log(test);
You can adapt it by changing the two index arguments and the URL that is passed in.
You could do something like this:
let year = window.location.href.match(/item\/([0-9]*)-/)[1];
assuming the URL looks like item/year-something.
You will have to handle the case where the URL does not match, though, so perhaps:
// Declare `year` outside the block so it is still visible afterwards
let year = null;
let match = window.location.href.match(/item\/([0-9]+)-/);
if (match) {
year = match[1];
}
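The answers above can be combined into one function that takes the URL as an argument and never throws on a non-matching page. This is only a sketch: the name extractYear is hypothetical, and it assumes the year is a four-digit prefix of the last path segment, followed by a hyphen.

```javascript
// Hypothetical helper: return the four-digit prefix of the last path
// segment (e.g. "2006" in /item/2006-yellow-submarine), or null.
function extractYear(href) {
  const { pathname } = new URL(href);
  // Drop empty segments so a trailing "/" does not matter.
  const segments = pathname.split("/").filter(Boolean);
  const last = segments[segments.length - 1] || "";
  const match = last.match(/^(\d{4})-/);
  return match ? match[1] : null;
}

console.log(extractYear("https://website.com/item/2006-yellow-submarine")); // "2006"
console.log(extractYear("https://website.com/about")); // null
```

In the browser you would pass window.location.href, so no URL is hard-coded.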

Split URL into params

I'm having trouble splitting the url into the following format for my URL.
Url = ?Point%20Cook-VIC-3030
desired format: Point Cook, VIC, 3030
So far I've tried the below to get rid of the "-" but I'm not sure how to get rid of the "?" and "%" and "20"
const url = "?Point%20Cook-VIC-3030"
const queryParams = url.split("-")
I tried using URLSearchParams but realized it doesn't have browser support for older browsers
which is a bummer
If your URL is encoded (due to space character) you need to properly decode it.
const url = '?Point%20Cook-VIC-3030';
// Decode URL and remove initial '?'
const decodedUrl = decodeURIComponent(url).substring(1);
// Split parameters
const parameters = decodedUrl.split('-');
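To get all the way to the desired "Point Cook, VIC, 3030" format, the decoded parts can be joined again. A minimal sketch, with formatLocation as a hypothetical helper name; it assumes hyphens appear only as separators, so a place name that itself contains a hyphen would be split as well.

```javascript
// Hypothetical helper: "?Point%20Cook-VIC-3030" -> "Point Cook, VIC, 3030"
function formatLocation(query) {
  // Remove the leading "?" (if any), then decode %20 and other escapes.
  const decoded = decodeURIComponent(query.replace(/^\?/, ""));
  // Assumption: "-" is used only as the separator between the parts.
  return decoded.split("-").join(", ");
}

console.log(formatLocation("?Point%20Cook-VIC-3030")); // "Point Cook, VIC, 3030"
```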

Using split to get only the URL without query string in javascript

I want to split the website name and get only URL without the query string
example:
www.xyz.com/.php?id=1
the URL can be of any length so I want to split to get the URL to only
xyz.com
I am able to split the URL and get xyz.com/php?id=1,
but how do I end the split and get only xyz.com?
var domain2 = document.getElementById("domain_id").value.split("w.")[1];
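You can use new URL(), for example:
var urlData = new URL("http://www.example.org/.php?id=1")
and then read
urlData.hostname
which returns only the hostname (here www.example.org). Note that urlData.host also includes the port when one is given, and that new URL() needs a scheme such as http:// at the start of the string.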
You can use a simple regex with match to capture the host in the way you want:
var url = 'www.xyz.com/.php?id=1';
// Escape the dot and stop at the first "/" so a longer path is not captured
var host = url.match(/www\.([^\/]+)\//)[1];
console.log(host)
To add to the other answers, you can also use this regular expression to capture everything up to the query string "?".
It returns the host together with any sub-pages that come before the query string:
var exp = new RegExp('^.*(?=([\?]))');
var url = exp.exec("www.xyz.com/.php?id=1");
var host = url[0];
console.log(host);
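The URL-based and regex-based answers above can be combined to get exactly xyz.com from the question's input. This is a sketch under two assumptions: an input without a scheme should be treated as http://, and only a leading "www." should be removed. The name bareHost is hypothetical.

```javascript
// Hypothetical helper: "www.xyz.com/.php?id=1" -> "xyz.com"
function bareHost(input) {
  // new URL() throws without a scheme, so assume http:// when none is given.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  const { hostname } = new URL(hasScheme ? input : "http://" + input);
  // Remove a leading "www." only; other subdomains are kept.
  return hostname.replace(/^www\./, "");
}

console.log(bareHost("www.xyz.com/.php?id=1")); // "xyz.com"
```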

Get base url from string with Regex and Javascript

I'm trying to get the base url from a string (So no window.location).
It needs to remove the trailing slash
It needs to be regex (No New URL)
It needs to work with query parameters and anchor links
In other words all the following should return https://apple.com or https://www.apple.com for the last one.
https://apple.com?query=true&slash=false
https://apple.com#anchor=true&slash=false
http://www.apple.com/#anchor=true&slash=true&whatever=foo
These are just examples, urls can have different subdomains like https://shop.apple.co.uk/?query=foo should return https://shop.apple.co.uk - It could be any url like: https://foo.bar
The closest I got is:
const baseUrl = url.replace(/^((\w+:)?\/\/[^\/]+\/?).*$/,'$1').replace(/\/$/, ""); // Base Path & Trailing slash
But this doesn't work with anchor links and queries which start right after the url without the / before
Any idea how I can get it to work on all cases?
You could add # and ? to your negated character class. You don't need .* because that will match until the end of the string.
For your example data, you could match:
^https?:\/\/[^#?\/]+
const strings = [
"https://apple.com?query=true&slash=false",
"https://apple.com#anchor=true&slash=false",
"http://www.apple.com/#anchor=true&slash=true&whatever=foo",
"https://foo.bar/?q=true"
];
strings.forEach(s => {
console.log(s.match(/^https?:\/\/[^#?\/]+/)[0]);
})
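Calling [0] directly on the result of match throws when a string does not match at all. A small wrapper around the same pattern avoids that; getBaseUrl is a hypothetical name, and returning null for non-matching input is an assumption about the desired behaviour.

```javascript
// Hypothetical helper around the regex from this answer.
function getBaseUrl(url) {
  // Match the scheme and everything up to the first "/", "?" or "#".
  const match = url.match(/^https?:\/\/[^#?\/]+/);
  return match ? match[0] : null;
}

console.log(getBaseUrl("https://shop.apple.co.uk/?query=foo")); // "https://shop.apple.co.uk"
console.log(getBaseUrl("not a url")); // null
```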
You could use Web API's built-in URL for this. URL will also provide you with other parsed properties that are easy to get to, like the query string params, the protocol, etc.
Regex is a painful way to do something that the browser makes otherwise very simple.
I know that you asked about using regex, but in the event that you (or someone coming here in the future) really just cares about getting the information out and isn't committed to using regex, maybe this answer will help.
let one = "https://apple.com?query=true&slash=false"
let two = "https://apple.com#anchor=true&slash=false"
let three = "http://www.apple.com/#anchor=true&slash=true&whatever=foo"
let urlOne = new URL(one)
console.log(urlOne.origin)
let urlTwo = new URL(two)
console.log(urlTwo.origin)
let urlThree = new URL(three)
console.log(urlThree.origin)
// Stop at the first "/", "?" or "#" after the scheme so no trailing slash is kept
const baseUrl = url.replace(/^(\w+:\/\/[^?\/#]+).*$/, '$1');
This matches everything up to the .com part, so it only works for .com domains. You will have to append .com to the result afterwards.
^http.*?(?=\.com)
Or maybe you could do:
myUrl.replace(/\/?[#?].*$/, "")
To remove everything after the host name.

Is it possible to get this part of a string

I wonder if it's possible to get this part of a string.
Here is my string:
var string = "www.somesite.com/o/images%2Fc834vePyJ3SFVk2iO4rU0ke1cSa2%2F12391381_10205760647243398_2385261683139818614_n.jpg?alt=media&token=7a692a38-6982-474f-bea5-459c987ae575";
Now I want to be able to grab just this part of the string, the file name:
12391381_10205760647243398_2385261683139818614_n.jpg
I tried:
var result = /[^/]*$/.exec(""+url+"")[0];
but it returns
user%2Fc834vePyJ3SFVk2iO4rU0ke1cSa2%2F12391381_10205760647243398_2385261683139818614_n.jpg?alt=media&token=4c92c4d7-8979-4478-a63d-ea190bec87cf
My Regex is wrong.
Another thing: the file extension can be .png or .jpg, so it's not fixed to .jpg.
You could use a regex to isolate the part you want. This works:
var string = "www.somesite.com/o/images%2Fc834vePyJ3SFVk2iO4rU0ke1cSa2%2F12391381_10205760647243398_2385261683139818614_n.jpg?alt=media&token=7a692a38-6982-474f-bea5-459c987ae575";
// The match starts with the "2F" of the preceding "%2F", so substring(2) drops it
console.log((string.match(/[A-Za-z0-9_]+\.(jpg|png|bmp)/))[0].substring(2));
Note that this may have to be adapted depending on how much the URL string changes:
var string = "www.somesite.com/o/images%2Fc834vePyJ3SFVk2iO4rU0ke1cSa2%2F12391381_10205760647243398_2385261683139818614_n.jpg?alt=media&token=7a692a38-6982-474f-bea5-459c987ae575";
var out = string.split('?')[0].split('%2F')[2];
console.log(out); // "12391381_10205760647243398_2385261683139818614_n.jpg"
Assuming you always have a URL, I would first decode the encoded / (%2F) characters via:
var string = "www.somesite.com/o/images%2Fc834vePyJ3SFVk2iO4rU0ke1cSa2%2F12391381_10205760647243398_2385261683139818614_n.jpg?alt=media&token=7a692a38-6982-474f-bea5-459c987ae575";
var decodedUrl = decodeURIComponent(string);
and then use a regex:
decodedUrl.match(/[^/]*(?=[?])/)
Note that this regex assumes the parameters (the part starting with ?) are present, so if that's not the case, you might have to alter it to your needs.
If the filename always has a .jpg extension:
var url = decodeURIComponent(string);
// Add the extension's length so ".jpg" is kept in the result
var filename = url.substring(url.lastIndexOf("/") + 1, url.lastIndexOf(".jpg") + ".jpg".length);
If not:
url = url.substring(url.lastIndexOf("/") + 1);
var filename = url.substring(0, url.indexOf("?"));
Looking at the string, it appears that the file name is between the second occurrence of "%2F" and the first occurrence of "?" in the string.
The first step is to get rid of the part of the string before the second "%2F". This can be done by splitting the string at every "%2F" and taking the third element in the resulting array.
var intermediate = string.split("%2F")[2]
Then, we need to get rid of everything after the "?":
var file_name = intermediate.split("?")[0]
This should give you the file name from the URL
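The same idea can be written without depending on the number of "%2F" separators or on a particular extension. The sketch below strips the query string first, decodes the rest, and takes whatever follows the last "/". The function name fileNameFromUrl is hypothetical.

```javascript
// Hypothetical helper: extract the file name from a URL whose path
// uses encoded slashes (%2F) and may carry a query string.
function fileNameFromUrl(url) {
  // Drop the query string before decoding, so an encoded "?" or "/"
  // inside a parameter value cannot confuse the split.
  const withoutQuery = url.split("?")[0];
  const decodedPath = decodeURIComponent(withoutQuery);
  // Everything after the last "/" is the file name.
  return decodedPath.substring(decodedPath.lastIndexOf("/") + 1);
}

const sample = "www.somesite.com/o/images%2Fc834vePyJ3SFVk2iO4rU0ke1cSa2%2F12391381_10205760647243398_2385261683139818614_n.jpg?alt=media&token=7a692a38-6982-474f-bea5-459c987ae575";
console.log(fileNameFromUrl(sample)); // "12391381_10205760647243398_2385261683139818614_n.jpg"
```

Because nothing here depends on the extension, the same call works for .png files and for URLs without a query string.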
