Extracting a portion of the a href path using Javascript - javascript

I need to extract a portion of an href attribute using JavaScript.
I've come up with a solution, but I'm not sure it's the best way. Below is my sample code. I've got a variable with the complete href path, and all I'm interested in extracting is the "Subcategory-Foobaz" portion.
I can assume it will always be sandwiched between the 2nd "/" and the "?".
I'm terrible with regex, so I came up with what seems like a hokey solution using 2 "splits".
var path = "/Category-Foobar/Subcategory-Foobaz?cm_sp=a-bunch-of-junk-i-dont-care-about";
var subcat = path.split("/")[2].split("?")[0];
console.log(subcat);
Is this horrible? How would you do it?
Thanks

var path = "/Category-Foobar/Subcategory-Foobaz?cm_sp=a-bunch-of-junk-i-dont-care-about";
var subcat = path.split("/").pop().split("?")[0];
console.log(subcat);
Just a small change: use pop() to get the last element, no matter how many slashes you have in your string.

var re = /\/[^\/]+\/([^?]+)/;
var str = '/Category-Foobar/Subcategory-Foobaz?cm_sp=a-bunch-of-junk-i-dont-care-about';
var matches = str.match(re);
console.log(matches[1]); // the first capture group holds "Subcategory-Foobaz"
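If you go the regex route, it may be worth guarding against paths that don't have a second segment, since match() and exec() return null in that case. Here is a minimal sketch; the helper name secondSegment is made up for illustration, and it assumes the path always starts with "/".

```javascript
// Hypothetical helper: returns the second path segment, or null if there is none.
// Assumes the input is a root-relative path such as "/a/b?query".
function secondSegment(path) {
  // Skip the first segment, then capture everything up to the next "/", "?" or "#".
  var match = /^\/[^\/]+\/([^\/?#]+)/.exec(path);
  return match ? match[1] : null;
}

console.log(secondSegment("/Category-Foobar/Subcategory-Foobaz?cm_sp=a-bunch-of-junk-i-dont-care-about"));
// prints "Subcategory-Foobaz"
console.log(secondSegment("/Category-Foobar"));
// prints null
```

Returning null instead of throwing lets the caller decide what to do with malformed hrefs.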

Related

JS / Node.js string().replace() help also help editing URL

I am trying to edit a string in Node.js. I am developing a web proxy, and I need to rewrite some stuff.
What I need to rewrite is a URL that contains two query strings.
I just need the first one to stay and the rest to be changed to "&".
This is what I got.
.replace(new RegExp(/src="(.*?)?(.*?)?&_get=/gi),'src="$1' + '$2' + '&_get=')
The replace call runs, but it isn't replacing the query string.
I also need this in string.replace() specifically. If this is not possible, I would like to know how I can redirect to a link in which every "?" except the first one is replaced.
Based on your input/output example in the comments, it seems like there are lots of ways you could do this.
One way would be to find the index of the last "?" and use the substring method to "replace" it.
let str = "balalala?query?_get=https://example.org"
let ind = str.lastIndexOf("?")
let newStr = str.substring(0,ind) + "&" + str.substring(ind+1)
If balalala? is always going to be constant, you can separate that out.
var source = 'balalala?'
var str = 'balalala?query?_get=https://example.org'
var rest = str.replace(source, '');
console.log(`${source}${rest.split('?').join('&')}`);
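If the prefix is not constant and there may be more than one extra "?", a variation is to split at the first "?" and replace only what follows. This is just a sketch: keepFirstQuestionMark is a made-up name, and it assumes that every later "?" really should become "&", which would not hold if the embedded target URL carries its own query string that must stay intact.

```javascript
// Hypothetical helper: keeps the first "?" and turns every later "?" into "&".
function keepFirstQuestionMark(str) {
  var first = str.indexOf("?");
  if (first === -1) {
    return str; // no query string at all, nothing to rewrite
  }
  var head = str.slice(0, first + 1);                      // up to and including the first "?"
  var tail = str.slice(first + 1).replace(/\?/g, "&");     // remaining "?" become "&"
  return head + tail;
}

console.log(keepFirstQuestionMark("balalala?query?_get=https://example.org"));
// prints "balalala?query&_get=https://example.org"
```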

Extract characters in URL after certain character up to certain character

I'm trying to extract a certain piece of a URL using regex (JavaScript) and having trouble excluding characters after a certain piece. Here's what I have so far:
URL: http://www.somesite.com/state-de
Using url.match(/\/[^\/]+$/)[0] I can extract the state-de like I want.
However, when the URL becomes http://www.somesite.com/state-de?page=r and I use the same regex, it pulls everything including the "?page=r", which I don't want. I want to extract only the state-de regardless of what's after it (it looks like a "?" usually follows it).
This might work:
var arr = url.split("/")
arr[arr.length - 1].split("?")[0]
I'd recommend reading up on regular expressions in general. What you want to do here is make the regular expression stop when it hits the ? in the URL.
Using capturing groups to select which part of the match that you want might also be useful here.
Example:
url.match(/(\/[^\/?]+)(?:\?.*)?$/)[1]
I avoid overly complex RegExs when possible, so I tend to do this in multiple steps (with .replace()):
var stripped = url.replace(/[?#].*/, ''); // Strips anything after ? or #
You can now do the simpler transform to get the state, e.g.:
var state = stripped.split('/').pop()
If you want to do it with a regex, try this one:
url.match(/https?:\/\/([a-z0-9-]+\.)+[a-z]+\/([a-z0-9_-]+)\/?(\?.*)?/)[2]
Or you could do it using jQuery:
var url = 'http://www.somesite.com/state-de?page=r#mark4';
// Create a special anchor element, set the URL to it
var a = $('<a>', { href:url } )[0]; // [0] is the underlying DOM element
console.log(a.hostname);
console.log(a.pathname);
console.log(a.search);
console.log(a.hash);
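The same breakdown is available without jQuery or the DOM through the built-in URL constructor, which exists in modern browsers and in Node.js. A small sketch, assuming the input is always an absolute URL (the constructor throws otherwise); parseUrlParts is a name made up for this example.

```javascript
// Hypothetical helper: splits an absolute URL into the same parts the anchor element exposes.
function parseUrlParts(url) {
  var parsed = new URL(url); // throws a TypeError if the URL is not absolute
  return {
    hostname: parsed.hostname,
    pathname: parsed.pathname,
    search: parsed.search,
    hash: parsed.hash,
    // last path segment, with the query string and fragment already excluded
    lastSegment: parsed.pathname.split("/").pop()
  };
}

var parts = parseUrlParts("http://www.somesite.com/state-de?page=r#mark4");
console.log(parts.pathname);    // prints "/state-de"
console.log(parts.lastSegment); // prints "state-de"
```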

Get portion of a string using Javascript

I have a string (from the pathname in the url) and I am trying to pull out part of it, but I'm having trouble.
This is what I have so far:
^(/svc_2/pub/(.*?).php)
The string is:
/svc_2/pub/stats/dashboard.php?ajax=1
How can I get a regex that returns /pub/stats/dashboard only?
If it's always that format (I'm assuming the /svc_2/ is always there) this should do it.
var s = "/svc_2/pub/stats/dashboard.php?ajax=1";
var match = s.match(/\/svc_2(.+)\./)[1];
But not if anything comes before that.
For this, using a regex is too much, but here it is:
var string = "/svc_2/pub/stats/dashboard.php?ajax=1";
console.log(string.replace(/.*(\/pub.*)\.php.*$/,"$1"));
But you can do it without a regex, like this:
console.log(string.substr(6,string.indexOf(".php")-6));
In both cases, the console.log will give you /pub/stats/dashboard
This is not very flexible, but should work in this specific case.
var basedir = '/svc_2', str = '/svc_2/pub/stats/dashboard.php?ajax=1'; // no trailing slash, so the result keeps its leading "/"
str = str.substring(basedir.length, str.indexOf('.'));
alert(str);
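A slightly more general take on the same idea, in case the extension is not always .php: drop the query string, drop a known prefix, then drop whatever extension is on the last segment. This is only a sketch; stripPrefixAndExtension is a made-up name, and it assumes the prefix ("/svc_2" here) is known in advance.

```javascript
// Hypothetical helper: removes the query string, a known prefix, and the file extension.
function stripPrefixAndExtension(pathname, prefix) {
  var noQuery = pathname.split("?")[0];
  var withoutPrefix = noQuery.indexOf(prefix) === 0
    ? noQuery.slice(prefix.length)
    : noQuery; // leave the path alone if the prefix is absent
  // Remove a trailing ".ext" on the last segment only.
  return withoutPrefix.replace(/\.[^\/.]+$/, "");
}

console.log(stripPrefixAndExtension("/svc_2/pub/stats/dashboard.php?ajax=1", "/svc_2"));
// prints "/pub/stats/dashboard"
```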

What's the best way to get an id from a URL written in different ways?

I'm trying to find a quick way to get an ID from URLs like these in JavaScript or jQuery:
https://plus.google.com/115025207826515678661/posts
https://plus.google.com/115025207826515678661/
https://plus.google.com/115025207826515678661
http://plus.google.com/115025207826515678661/posts
http://plus.google.com/115025207826515678661/
http://plus.google.com/115025207826515678661
plus.google.com/115025207826515678661/posts
plus.google.com/115025207826515678661/
plus.google.com/115025207826515678661
I just want to get 115025207826515678661 from the URL.
Is there a sure way to always get the ID regardless of the way it's typed?
You could use this javascript which works on all the urls you posted:
var url, patt, matches, id;
url = 'https://plus.google.com/115025207826515678661/posts';
patt = /\/(\d+)(?:\/|$)/
matches = patt.exec(url);
id = matches[1];
Use the following regular expression to extract the value:
\/([0-9]+)\/?
Tested on all of your input strings and it worked on each.
The first and only group will have the number you're looking for
var pos=url.indexOf('com/');
var str1=url.substr(pos+4,21); // skip past "com/" so the ID starts at the first digit
if the 21 is not constant, then just check the indexOf('/') in the substr:
var pos=url.indexOf('com/');
var str1=url.substr(pos+4);
pos=str1.indexOf('/');
var str2=(pos===-1) ? str1 : str1.substr(0,pos); // no trailing slash means the rest is the ID
alert(str2);
or, use some regexp magic as was written in a different answer here.
I believe you could create a new string, and then loop through the URL string, and copy any numbers into the new string, especially if the URLs never have other numerical characters.
Basically, stripping all but numerical characters.
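The digit-stripping idea doesn't need an explicit loop; a global replace on non-digit characters does the same thing. A quick sketch, with the caveat already mentioned: it only works as long as nothing else in the URL contains a digit. The names digitsOnly and sampleUrls are made up for this example.

```javascript
// Hypothetical helper: removes every character that is not a digit.
function digitsOnly(url) {
  return url.replace(/\D/g, "");
}

// The nine URL variants from the question.
var sampleUrls = [
  "https://plus.google.com/115025207826515678661/posts",
  "https://plus.google.com/115025207826515678661/",
  "https://plus.google.com/115025207826515678661",
  "http://plus.google.com/115025207826515678661/posts",
  "http://plus.google.com/115025207826515678661/",
  "http://plus.google.com/115025207826515678661",
  "plus.google.com/115025207826515678661/posts",
  "plus.google.com/115025207826515678661/",
  "plus.google.com/115025207826515678661"
];

var allMatch = sampleUrls.every(function (url) {
  return digitsOnly(url) === "115025207826515678661";
});
console.log(allMatch); // prints true
```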

Javascript Regex - How to extract last word before path to image

I want to use regex to extract the last word from a file path. For example, I have:
/xyz/blahblah/zzz/abc-blah/def-xyz-color.jpg
I want to extract the "color" out of the path. The path can have different syntax. The only thing that is consistent is the ending, which is always -color.jpg, where color would be any [a-z] word.
Is there an elegant way to do this?
I would really appreciate any help here. Thanks
Why couldn't you just take a substring rather than using a regex?
var path="/xyz/blahblah/zzz/abc-blah/def-xyz-color.jpg";
var lastHyphen = path.lastIndexOf("-");
var lastDot = path.lastIndexOf(".");
var extractedValue=path.substring(lastHyphen + 1, lastDot);
a more compact version will be
var extractedValue=path.substring(path.lastIndexOf("-") + 1, path.lastIndexOf("."));
var matched = /-(\w+)\.jpg/i.exec('/xyz/blahblah/zzz/abc-blah/def-xyz-color.jpg')[1];
Why use regex?
var a = '/xyz/blahblah/zzz/abc-blah/def-xyz-color.jpg'
.split('/').pop()
.split('-').pop()
.split('.')[0];
console.log(a);
What about
-(\w+)\.jpg
?
If you don't want to hardcode the extension, you can do:
-(\w+)\.\w+\b
Of course, that will match lots of things, but I'm assuming the text to be matched will be the url ;)
Edit:
The result array holds the full match at index 0 and the captured group at index 1, so just read index 1:
var text = '/xyz/blahblah/zzz/abc-blah/def-xyz-color.jpg';
var pattern = /-(\w+)\.\w+\b/;
var match = pattern.exec(text);
alert(match[1]); // color
Or do it in one line like @Ryan suggested.
Why not just
var path= "/xyz/blahblah/zzz/abc-blah/def-xyz-color.jpg";
var match = /([^-.]+)\.[^.]+$/.exec(path);
var color = match[1]; // avoids the legacy RegExp.$1 static property
alert(color);
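One thing to watch with the unanchored patterns above: if a directory name earlier in the path contains both a hyphen and a dot, they can match there first. Anchoring the pattern to the end of the string avoids that. A minimal sketch, assuming the color is always letters only and the file name is the last thing in the string (no query string); colorFromPath is a made-up name.

```javascript
// Hypothetical helper: returns the word between the last "-" and the extension, or null.
function colorFromPath(path) {
  // "$" ties the match to the end of the string, so earlier "-x.y" sequences are ignored.
  var match = /-([a-z]+)\.[a-z0-9]+$/i.exec(path);
  return match ? match[1] : null;
}

console.log(colorFromPath("/xyz/blahblah/zzz/abc-blah/def-xyz-color.jpg"));
// prints "color"
console.log(colorFromPath("/a-b.c/def-xyz-color.jpg"));
// prints "color" (the unanchored /-(\w+)\.\w+\b/ would capture "b" here)
```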
