How to get youku video id from url by regex? - javascript

I need to get youku video id from url by regex, for example:
http://v.youku.com/v_show/id_XNTg3OTc3MzY4.html
I only need XNTg3OTc3MzY4 to keep in a variable.
How can I write it in function below
var youkuEmbed = "[[*supplier-video]]";
var youkuUrl = youkuEmbed.match(/http://v\.youku\.com/v_show/id_(\w+)\.html/);
I tried this but it didn't work.
Thanks!

You can use a simple regex like this:
id_(\w+)
Working demo
The idea is to match id_ and then capture the word characters (letters, digits, and underscores) that follow.
MATCH 1
1. [29-42] `XNTg3OTc3MzY4`
If you go to the Code Generator section, you can get the code. Alternatively, you can use something like this:
var myString = 'http://v.youku.com/v_show/id_XNTg3OTc3MzY4.html';
var myRegexp = /id_(\w+)/;
var match = myRegexp.exec(myString);
alert(match[1]);
//Shows: XNTg3OTc3MzY4

You can use this regex:
http://v\.youku\.com/v_show/id_(\w+)\.html
Your match is in the first capturing group.
Here is a regex demo.

If the id always follows id_, you could split the string instead.
'http://v.youku.com/v_show/id_XNTg3OTc3MzY4.html'.split(/.*id_|\./)[1]
//=> 'XNTg3OTc3MzY4'
For this specific string, you could just do:
'http://youku.com/id_XNTg3OTc3MzY4.html'.split(/id_|\./)[2]
//=> 'XNTg3OTc3MzY4'

It looks like you need to escape all the slashes because that's the delimiter for the regex itself:
var youkuUrl = youkuEmbed.match(/http:\/\/v\.youku\.com\/v_show\/id_(\w+)\.html/);
Then use the first capture group, as Unihedron stated.
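Putting the answers above together, here is a minimal sketch of a reusable helper. The name extractYoukuId is hypothetical, and the sketch assumes the id consists only of word characters, as in the question's example. It returns null instead of throwing when the URL has no id_ segment.

```javascript
// Hypothetical helper: returns the id that follows "id_", or null if the
// URL does not contain an "/id_....html" segment.
function extractYoukuId(url) {
  // The slash is escaped because "/" delimits the regex literal.
  const match = /\/id_(\w+)\.html/.exec(url);
  // exec() returns null on failure, so guard before reading group 1.
  return match ? match[1] : null;
}

console.log(extractYoukuId("http://v.youku.com/v_show/id_XNTg3OTc3MzY4.html"));
```

Checking for null matters here because the input comes from a template placeholder that may be empty.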

Related

how to extract url id from string with regex?

Suppose I have this string:
google.com/:id/:category
How can I extract only id and category from this string?
I need to use a regex.
This match doesn't work:
match(/\/:([a-zA-Z0-9]*)/g);
You may try the following:
var url = "google.com/:id/:category";
var parts = url.match(/(?<=\/:)[a-zA-Z0-9]+/g);
console.log(parts);
This approach uses the positive lookbehind (?<=\/:) to get around the problem of matching the unwanted leading /: portion. Instead, this leading marker is asserted but not matched in the version above.
Well, capture groups are ignored in match with /g. You might go with matchAll like this:
const url = "google.com/:id/:category"
const info = [...url.matchAll(/\/:([a-zA-Z0-9]*)/g)].map(match => match[1])
console.log(info)
Credit: Better access to capturing groups (than String.prototype.match())
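If lookbehind is not available in your target environment, the matchAll idea can be wrapped in a small function. This is only a sketch: the name extractParamNames is made up for illustration, and it uses + rather than * so that a bare "/:" does not produce an empty name.

```javascript
// Hypothetical helper: collects every ":name" placeholder that follows a slash.
function extractParamNames(pattern) {
  const names = [];
  // matchAll() requires the g flag and yields one match array per hit;
  // index 1 of each array is the first capture group.
  for (const match of pattern.matchAll(/\/:([a-zA-Z0-9]+)/g)) {
    names.push(match[1]);
  }
  return names;
}

console.log(extractParamNames("google.com/:id/:category"));
```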

How would I write a Regular Expression to capture the value between Last Slash and Query String?

Problem:
Extract image file name from CDN address similar to the following:
https://cdnstorage.api.com/v0/b/my-app.com/o/photo%2FB%_2.jpeg?alt=media&token=4e32-a1a2-c48e6c91a2ba
Two-stage Solution:
I am using two regular expressions to retrieve the file name:
var postLastSlashRegEx = /[^\/]+$/,
preQueryRegEx = /^([^?]+)/;
var fileFromURL = urlString.match(postLastSlashRegEx)[0].match(preQueryRegEx)[0];
// fileFromURL = "photo%2FB%_2.jpeg"
Question:
Is there a way I can combine both regular expressions?
I've tried using capture groups, but haven't been able to produce a working solution.
From my comment
You can use a lookahead to find the "?" and use [^/] to match any non-slash characters.
/[^/]+(?=\?)/
To remove the dependency on the URL containing a "?", you can make the lookahead match either a question mark or the end of the string (represented by $), but make sure the first quantifier is non-greedy. Otherwise it would run past the "?" all the way to the end.
/[^/]+?(?=\?|$)/
You don't have to use a regex; you can just use split and substr.
var str = "https://cdnstorage.api.com/v0/b/my-app.com/o/photo%2FB%_2.jpeg?alt=media&token=4e32-a1a2-c48e6c91a2ba".split("?")[0];
var fileName = str.substr(str.lastIndexOf('/')+1);
but if regex is important to you, then:
str.match(/[^?]*\/([^?]+)/)[1]
The code using the substring method would look like the following -
var fileFromURL = urlString.substring(urlString.lastIndexOf('/') + 1, urlString.lastIndexOf('?'))
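For reference, the single-regex answer can be wrapped so that a failed match does not throw. This is a sketch under two assumptions: the function name fileNameFromUrl is hypothetical, and the path is expected to end in a file name rather than a slash.

```javascript
// Hypothetical helper: returns the text between the last slash of the path
// and the query string (or the end of the string), or null if none is found.
function fileNameFromUrl(urlString) {
  // [^/]+? lazily consumes non-slash characters; the lookahead stops the
  // match at the first "?" or, failing that, at the end of the string.
  const match = /[^/]+?(?=\?|$)/.exec(urlString);
  return match ? match[0] : null;
}

const cdnUrl = "https://cdnstorage.api.com/v0/b/my-app.com/o/photo%2FB%_2.jpeg?alt=media&token=4e32-a1a2-c48e6c91a2ba";
console.log(fileNameFromUrl(cdnUrl));
```

The result is left percent-encoded on purpose: the sample name contains "%_2", which is not a valid escape sequence, so passing it to decodeURIComponent would throw.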

JavaScript String test with array of RegEx

I have some doubts regarding RegEx in JavaScript as I am not good in RegEx.
I have a String and I want to compare it against some array of RegEx expressions.
First I tried for one RegEx and it's not working. I want to fix that also.
function check(str){
var regEx = new RegEx("(users)\/[\w|\W]*");
var result = regEx.test(str);
if(result){
//do something
}
}
It is not working properly.
If I pass users, it doesn't match. If I pass users/ or users/somestring, it is matching.
If I change the RegEx to (usersGroupList)[/\w|\W]*, then it is matching for any string that contains the string users
fdgdsfgguserslist/data
I want it to match if the string is either users, users/, or users/something.
I also want to compare the string against a similar array of regexes.
I want to compare the string str with users, users/something, list, list/something, anothermatch, anothermatch/something. If it matches any of these expressions, I want to do something.
How can I do that?
Thanks
Then, you'll have to make the last group optional. You do that by capturing the /something part in a group and following it with ? which makes the previous token, here the captured group, optional.
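// Backslashes are doubled because the pattern is a string literal; a single "\w" would reach the regex as a plain "w".
var regEx = new RegExp("(users)(\\/[\\w\\W]*)?");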
What about making:
the last group optional
starting from beginning of the string
Like this:
var regEx = new RegExp("^(users)(\\/[\\w\\W]*)?");
The same applies to all the other cases, e.g. for list:
var regEx = new RegExp("^(list)(\\/[\\w\\W]*)?");
All in One Approach
var regEx = new RegExp("^(users|list|anothermatch)(\\/[\\w\\W]*)?");
Even More Generic
var keyw = ["users", "list", "anothermatch"];
var keyws = keyw.join("|");
var regEx = new RegExp("^("+keyws+")(\\/[\\w\\W]*)?");
You haven't made the / optional. Try this instead
(users)\/?[\w|\W]*
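One caveat with the patterns in this thread: without an end anchor they still accept strings such as userslist, because the optional group can simply match nothing. The sketch below adds $ and escapes the keywords before joining them. The names escapeRegExp, buildKeywordRegExp, and routeRegExp are illustrative, not part of any library.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: matches "<keyword>", "<keyword>/" or "<keyword>/anything".
function buildKeywordRegExp(keywords) {
  const alternatives = keywords.map(escapeRegExp).join("|");
  // Backslashes are doubled because the pattern is built from a string.
  // ^ and $ force the whole input to match, not just a prefix of it.
  return new RegExp("^(?:" + alternatives + ")(?:\\/[\\w\\W]*)?$");
}

const routeRegExp = buildKeywordRegExp(["users", "list", "anothermatch"]);
console.log(routeRegExp.test("users/something"));
console.log(routeRegExp.test("userslist"));
```

Non-capturing groups are used because only the true/false result of test() is needed here.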

How to find in javascript with regular expression string from url?

Good evening. How can I use a regular expression in JavaScript to extract a string from a URL? For example, I have the URL http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/ and I need only the string between the last two slashes, as in http://something.cz/something/string/. In this example, the word I need is mikronebulizer. Thank you very much for your help.
You could use a regex match with a group.
Use this:
/([\w\-]+)\/$/.exec("http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/")[1];
Here's a jsfiddle showing it in action
This part: ([\w\-]+)
Means at least 1 or more of the set of alphanumeric, underscore and hyphen and use it as the first match group.
Followed by a /
And then finally the: $
Which means the line should end with this
The .exec() call returns an array where the first value is the full match (i.e. "mikronebulizer/"), followed by each match group.
So .exec()[1] returns your value: mikronebulizer
Simply:
url.match(/([^\/]*)\/$/);
Should do it.
If you want to match (optionally) without a trailing slash, use:
url.match(/([^\/]*)\/?$/);
See it in action here: http://regex101.com/r/cL3qG3
If you have the url provided, then you can do it this way:
var url = 'http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/';
var urlsplit = url.split('/');
var urlEnd = urlsplit[urlsplit.length- (urlsplit[urlsplit.length-1] == '' ? 2 : 1)];
This will match everything after the last slash if there's any content there; otherwise, it will match the part between the second-last and the last slash.
Something else to consider - yes, a pure RegEx approach might be easier (heck, and faster), but I wanted to include this simply to point out window.location.pathname.
function getLast(){
// Strip trailing slash if present
var path = window.location.pathname.replace(/\/$/, '');
return path.split('/').pop();
}
Alternatively, you could get it using split:
var pieces = "http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/".split("/");
var lastSegment = pieces[pieces.length - 2];
// lastSegment == mikronebulizer
var url = 'http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/';
if (url.slice(-1)=="/") {
url = url.substr(0,url.length-1);
}
var lastSegment = url.split('/').pop();
document.write(lastSegment+"<br>");
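When the URL is a string rather than the current page, the built-in URL class exposes the same pathname property as window.location. The following is a sketch, assuming an environment with a global URL constructor (modern browsers and Node.js). The name lastPathSegment is hypothetical.

```javascript
// Hypothetical helper: returns the last non-empty path segment, or null
// when the path is just "/".
function lastPathSegment(url) {
  // pathname excludes the scheme, host, query string and fragment.
  const segments = new URL(url).pathname.split("/").filter(Boolean);
  return segments.length > 0 ? segments[segments.length - 1] : null;
}

console.log(lastPathSegment("http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/"));
```

Filtering out empty strings handles the trailing slash without a separate check. Note that new URL() throws a TypeError if the string cannot be parsed as an absolute URL.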

Simple javascript regex

I need: www.mydomain.com:1235 from the text var below:
var text = 'http://www.mydomain.com:1235/;image.jpg';
alert(text.match(/\/[^]+\//));
output is: //www.mydomain.com:1235/
How do I exclude the delimiters?
You need to use parens to group what you want to match. Then, the call to .match() will let you use indexers. Index 0 is the whole string match, and index 1 is the first paren grouping.
var text = 'http://www.mydomain.com:1235/;image.jpg';
alert(text.match(/\/([^\/]+)\//)[1]);
Not a regex, but you could do this:
Example: http://jsfiddle.net/nTmv9/
text = text.split('http://')[1].split('/')[0];
or with a regex:
Example: http://jsfiddle.net/nTmv9/1/
text = text.match(/http:\/\/([^\/]+)\//)[1];
This will capture the domain without the http or the url slugs.
https?:\/\/([^\/]+)\/
If you need help figuring out regex here is a great tool I use all of the time.
http://gskinner.com/RegExr/
Cheers
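Another option that avoids regex entirely is the built-in URL class, whose host property is the hostname plus the port. This is a sketch, assuming a global URL constructor is available. The name hostWithPort is made up for illustration.

```javascript
// Hypothetical helper: returns "hostname:port" for an absolute URL,
// or null if the string cannot be parsed.
function hostWithPort(urlString) {
  try {
    return new URL(urlString).host;
  } catch (error) {
    // new URL() throws a TypeError on input it cannot parse.
    return null;
  }
}

console.log(hostWithPort("http://www.mydomain.com:1235/;image.jpg"));
```

One difference from the regex answers: host omits the port when it is the default for the scheme, so "http://example.com:80/" yields "example.com".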
