Better RegEx to extract GoogleVideo ID from URL - javascript

Hi!
I use the following regex with JS to extract the ID 6321890784249785097 from this URL:
http://video.google.com/googleplayer.swf?docId=6321890784249785097
url.replace(/^[^\$]+.(.{19}).*/,"$1");
But this only cuts the last 19 characters from the tail. How can I make it more bullet-proof? An explanation would be welcome, so that I learn something.

This should work a bit better:
/^.*docId=(\d+)$/
This matches all characters up to the 'docId=', then gives you all digits after that up to the end of the url.
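As a minimal sketch of how that pattern could be used with a guard for URLs that don't match (the helper name extractDocId is made up for illustration), something like this should work:

```javascript
// Hypothetical helper; the pattern is the one from the answer above.
function extractDocId(url) {
  // exec() returns null when the URL does not end in docId=<digits>
  var m = /^.*docId=(\d+)$/.exec(url);
  return m ? m[1] : null;
}

console.log(extractDocId("http://video.google.com/googleplayer.swf?docId=6321890784249785097"));
```

Because of the trailing $, a URL with further parameters after the ID yields null rather than a partial match.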

video[.]google[.]com/googleplayer[.]swf[?]docId=(\d+)
The ID will be captured in group #1. If you want to match exactly 19 digits, you can change it to this:
video[.]google[.]com/googleplayer[.]swf[?]docId=(\d{19})

url.replace(/.*docId=(\d{19}).*/i,"$1");
This extracts the 19 digits that follow docId=.

Here is the function I use in our app to read URL parameters. So far it hasn't let me down ;)
urlParam:function(name, w){
w = w || window;
// [?&] matches the '?' that starts the query string or the '&' between parameters
var rx = new RegExp('[?&]'+name+'=([^&#]+)'),
val = w.location.href.match(rx);
// match() returns null when the parameter is absent
return !val ? '':val[1];
}
For the explanation of the regexp:
[?&] takes either the start of the query string '?' or the separator between parameters '&'. Inside a character class no | is needed; a | there would match a literal pipe.
'name' will be the name of the parameter, 'docId' in your case
([^&#]+) takes any characters that are not & or #. The hash is often used in one-page apps. The parentheses capture the content so it can be referenced later.
val will be an array, or null if nothing matched, and val[1] is the value you are looking for
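For trying the same idea outside a browser, here is a hedged standalone sketch. The function name getUrlParam and the href argument are assumptions for illustration, and escaping the parameter name is an addition that the function above does not do:

```javascript
// Hypothetical standalone variant: takes the URL as a string instead of reading window.location.
function getUrlParam(name, href) {
  // Escape regex metacharacters so a name like "a.b" is matched literally
  var escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var rx = new RegExp("[?&]" + escaped + "=([^&#]+)");
  var val = href.match(rx);
  return val ? val[1] : "";
}

console.log(getUrlParam("docId", "http://video.google.com/googleplayer.swf?docId=6321890784249785097"));
```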

Related

javascript regex insert new element into expression

I am passing a URL to a block of code in which I need to insert a new element into the regex. Pretty sure the regex is valid and the code seems right but no matter what I can't seem to execute the match for regex!
//** Incoming url's
//** url e.g. api/223344
//** api/11aa/page/2017
//** Need to match to the following
//** dir/api/12ab/page/1999
//** Hence the need to add dir at the front
var url = req.url;
//** pass in: /^\/api\/([a-zA-Z0-9-_~ %]+)(?:\/page\/([a-zA-Z0-9-_~ %]+))?$/
var re = myregex.toString();
//** Insert dir into regex: /^dir\/api\/([a-zA-Z0-9-_~ %]+)(?:\/page\/([a-zA-Z0-9-_~ %]+))?$/
var regVar = re.substr(0, 2) + 'dir' + re.substr(2);
var matchedData = url.match(regVar);
matchedData === null ? console.log('NO') : console.log('Yay');
I hope I am just missing the obvious but can anyone see why I can't match and always returns NO?
Thanks
Let's break down your regex
^\/api\/ matches the beginning of the string, followed by exactly the string "/api/"
([a-zA-Z0-9-_~ %]+) is a capturing group: it captures one or more of the characters listed inside the brackets (the + means one or more), so for example this section will match abAB25-_ %
(?:\/page\/([a-zA-Z0-9-_~ %]+)) also groups multiple tokens together, but does not create a capturing group like the one above (the ?: makes it non-capturing). It first matches the exact string "/page/", followed by a capturing group just like the one described above (matching a-z, A-Z, 0-9, etc.).
?$ is at the end: the ? makes the preceding group optional (zero or one occurrence), and the $ matches the end of the string
This regex will match this string, for example: /api/abAB25-_ %/page/abAB25-_ %
You may be able to take advantage of capturing groups, however, and use something like this instead to get similar results: ^\/api\/([a-zA-Z0-9-_~ %]+)\/page\/\1?$. Here, we are using \1 to reference that first capturing group and match exactly the same tokens it is matching. EDIT: actually, this probably won't work, since the text after /api/ and the text after /page/ will most likely be different, carrying on...
Afterwards, you are adding "dir" to the beginning of the pattern, so you can now match something like this: dir/api/abAB25-_ %/page/abAB25-_ %
You have also converted the regex to a string, so, as Crayon Violent pointed out in their comment, this will break your expected functionality: a string passed to .match() is compiled as a new pattern, and the slashes that toString() adds become literal characters. You can fix this by building the modified pattern from the .source of your original regex (myregex.source, which has no surrounding slashes) and passing the result to new RegExp() before calling url.match(). https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/source
Now you can properly match a string like this: dir/api/11aa/page/2017. See this example: https://repl.it/Mj8h
As mentioned by Crayon Violent in the comments, it seems you're passing a string rather than a regular expression to the .match() function. Maybe try the following:
url.match(new RegExp(regVar.slice(1, -1), "i"));
to convert the string to a regular expression (the slice strips the leading and trailing / that toString() added). The "i" is for ignore case; I don't know if that's what you want. Learn more here:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
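Putting the two answers together, a rough sketch of inserting "dir" via .source and rebuilding a real RegExp might look like this. The helper withPrefix is hypothetical, and it assumes the incoming pattern starts with ^ and that the prefix contains no regex metacharacters:

```javascript
// The pattern passed in, as shown in the question.
var myregex = /^\/api\/([a-zA-Z0-9-_~ %]+)(?:\/page\/([a-zA-Z0-9-_~ %]+))?$/;

// Hypothetical helper: assumes the pattern is anchored with ^.
function withPrefix(regex, prefix) {
  var src = regex.source; // pattern text without the surrounding slashes or flags
  if (src.charAt(0) !== "^") {
    throw new Error("expected an anchored pattern");
  }
  // Re-insert the anchor, then the prefix, then the rest of the pattern
  return new RegExp("^" + prefix + src.slice(1), regex.flags);
}

var re = withPrefix(myregex, "dir");
var matchedData = "dir/api/12ab/page/1999".match(re);
console.log(matchedData === null ? "NO" : "Yay");
```

Here matchedData is a real match array, so the captured segments are available as matchedData[1] and matchedData[2].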

Use only one of the characters in regular expression javascript

I guess this should be something very easy, but I've been stuck on it for at least 2 hours, so I think it's better to ask the question here.
So, I've got a regular expression /&t=(\d*)$/g, and it works fine unless the URL has ?t instead of &t. I've tried different combinations like /\?|&t=(\d*)$/g ; /\?t=(\d*)$|/&t=(\d*)$/g ; /(&|\?)t=(\d*)$/g and various others, but haven't got the expected result, which is the /\?t=(\d*)$/g or /&t=(\d*)$/g part of the URL (whichever is entered in the input).
Thanks for the responses. I think I need to add some details here. I'm actually working on this piece of code:
var formValue = $.trim($("#v").val());
var formValueTime = /&t=(\d*)$/g.exec(formValue);
if (formValueTime && formValueTime.length > 1) {
formValueTime = parseInt(formValueTime[1], 10);
formValue = formValue.replace(/&t=\d*$/g, "");
}
and I want to get the t value whether it is passed with &t or ?t, in links like youtu.be/hTWKbfoikeg?t=82 or the similar youtu.be/hTWKbfoikeg&t=82
To replace, you may use
var formValue = "some?some=more&t=1234"; // $.trim($("#v").val());
var formValueTime;
formValue = formValue.replace(/[&?]t=(\d*)$/g, function($0,$1) {
formValueTime = parseInt($1,10);
return '';
});
console.log(formValueTime, formValue);
To grab the value, you may use
/[?&]t=(\d*)$/g.exec(formValue);
Pattern details
[?&] - a character class matching ? or &
t= - t= substring
(\d*) - Group 1 matching zero or more digits
$ - end of string
/\?t=(\d*)|\&t=(\d*)$/g
You inverted the escape character in the second RegEx.
http://regexr.com/3gcnu
I want to thank you all for trying to help. Special thanks to @Wiktor Stribiżew, who gave the closest answer.
Now the piece of code I needed looks exactly like this:
/[?&]t=(\d*)$/g.exec(formValue);
So that's the [?&] part that solved the problem.
I use the array later, so /\?t=(\d*)|\&t=(\d*)$/g doesn't help: I get an array like ["&t=50", undefined, "50"] when the link uses & and ["?t=50", "50", undefined] when it uses ?, just because of the order of the alternatives in the RegExp.
Now, if you're looking for a piece of RegExp that accepts either character in one place while the rest of the RegExp stays the same, you can use a character class like [?&], for the example where the wanted characters are ? and &.
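To make the difference concrete, here is a small sketch comparing the two patterns on one of the sample links from the question (the variable names are illustrative, and the g flag is left out so that exec() keeps no state between calls):

```javascript
var link = "youtu.be/hTWKbfoikeg&t=82";

// Alternation: each branch has its own group, so the value moves between slots.
var viaAlternation = /\?t=(\d*)|&t=(\d*)$/.exec(link);

// Character class: a single group, so the value is always at index 1.
var viaCharClass = /[?&]t=(\d*)$/.exec(link);

console.log(viaAlternation[1], viaAlternation[2]);
console.log(viaCharClass[1]);
```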

Any way to extract string within 2 different special characters using javascript?

Hi, I have a varying URL similar to:
http://farm4.staticflickr.com/3877/[image_id]_[secret].jpg
e.g. http://farm4.staticflickr.com/3877/14628998490_233a15c423_q.jpg
I need to extract image_id, the first set of digits (i.e. 14628998490) before the underscore in 14628998490_233a15c423_q.jpg, from the whole URL.
Is there a good way to extract image_id?
Right now I am going to use:
var image_id = image_url.match(/[\/]([0-9]+)_/)[1]
As I said in the comment, you don't need to escape the / symbol inside a character class, and you don't even need a character class here; just \/ is enough. The regex below captures one or more digits that are preceded by a / symbol and followed by a _ symbol.
\/(\d+)_
DEMO
> var image_id = image_url.match(/\/(\d+)_/)[1]
undefined
> image_id
'14628998490'
OR
You could also try this, if you don't want to use \d+ in your pattern.
\/([^/]*?)_
DEMO
> var image_id = image_url.match(/\/([^/]*?)_/)[1]
undefined
> image_id
'14628998490'
Not sure that it's a better way, but you can do it like this:
var str = 'http://farm4.staticflickr.com/3877/14628998490_233a15c423_q.jpg';
// take the file name, drop the extension, then keep the part before the first underscore
var image_id = str.split('/').pop().split('.')[0].split('_')[0];
If the special character is always the same (_), you could first obtain the last part (with substring+lastIndexOf) and then use split():
var url = "http://farm4.staticflickr.com/3877/14628998490_233a15c423_q.jpg";
var splittedUrl = url.substr(url.lastIndexOf('/')+1).split("_");
var image_id = splittedUrl[0];
console.log(image_id);
I've read somewhere that string functions are faster than regexp, so it's an option you might consider.
String splitting is often faster than a regex. You can just take the last index of / and then the string between it and the first occurrence of _ after it. I think that would be a better idea.
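A minimal sketch of that index-based approach, with no regex at all, could look like the following. The function name imageIdFromUrl is made up for illustration, and returning null when there is no underscore is an assumption about the desired behaviour:

```javascript
// Hypothetical helper: text between the last "/" and the first "_" that follows it.
function imageIdFromUrl(url) {
  var start = url.lastIndexOf("/") + 1;  // first character of the file name
  var end = url.indexOf("_", start);     // first underscore inside the file name
  return end === -1 ? null : url.substring(start, end);
}

console.log(imageIdFromUrl("http://farm4.staticflickr.com/3877/14628998490_233a15c423_q.jpg"));
```

Whether this is measurably faster than the regex depends on the engine and the input, so it is worth measuring before choosing on performance alone.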

How to find in javascript with regular expression string from url?

Good evening. How can I use a regular expression in JavaScript to find a string in a URL? For example, I have the URL http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/ and I need only the string between the last two slashes (/ /), as in http://something.cz/something/string/. In this example the word I need is mikronebulizer. Thank you very much for your help.
You could use a regex match with a group.
Use this:
/([\w\-]+)\/$/.exec("http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/")[1];
Here's a jsfiddle showing it in action
This part: ([\w\-]+)
means one or more characters from the set of alphanumerics, underscore and hyphen, captured as the first match group.
It is followed by a /
And then finally the: $
which means the string must end here.
.exec() returns an array where the first value is the full match (i.e. "mikronebulizer/"), followed by each match group.
So .exec()[1] returns your value: mikronebulizer
Simply:
url.match(/([^\/]*)\/$/);
Should do it.
If you want to match (optionally) without a trailing slash, use:
url.match(/([^\/]*)\/?$/);
See it in action here: http://regex101.com/r/cL3qG3
If you have the url provided, then you can do it this way:
var url = 'http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/';
var urlsplit = url.split('/');
var urlEnd = urlsplit[urlsplit.length- (urlsplit[urlsplit.length-1] == '' ? 2 : 1)];
This will select either everything after the last slash, if there's any content there, or otherwise the part between the second-last and the last slash.
Something else to consider: yes, a pure RegEx approach might be easier (heck, and faster), but I wanted to include this simply to point out window.location.pathname.
function getLast(){
// Strip trailing slash if present
var path = window.location.pathname.replace(/\/$/, '');
return path.split('/').pop();
}
Alternatively, you could get it using split:
var pieces = "http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/".split("/");
var lastSegment = pieces[pieces.length - 2];
// lastSegment == mikronebulizer
var url = 'http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/';
if (url.slice(-1)=="/") {
url = url.substr(0,url.length-1);
}
var lastSegment = url.split('/').pop();
document.write(lastSegment+"<br>");
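As one more variation on the split-based answers, here is a sketch that lets the standard URL constructor isolate the path first. It assumes an environment where URL is available as a global (modern browsers and Node.js), and the function name lastPathSegment is illustrative:

```javascript
// Hypothetical helper: last non-empty path segment, with or without a trailing slash.
function lastPathSegment(href) {
  // pathname drops the scheme, host, query string and hash
  var segments = new URL(href).pathname.split("/").filter(Boolean);
  return segments.length ? segments[segments.length - 1] : "";
}

console.log(lastPathSegment("http://www.odsavacky.cz/blog/wpcproduct/mikronebulizer/"));
```

Filtering out empty strings handles the trailing slash without a separate check, and a query string or hash never leaks into the result.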

Find and get only number in string

Please help me solve this strange situation.
Here is the code:
The link looks like this: www.blablabla.ru#3
The regex looks like this:
var id = window.location.href.replace(/\D/, '' );
alert(id);
The regular expression looks correct to me and should show only the numbers, but it doesn't :-(
Can you please advise me on how to get only the numbers in the string?
Thanks
You're replacing only the first non-digit character with empty string. Try using:
var id = window.location.href.replace(/\D+/g, '' ); alert(id);
(Notice the "global" flag at the end of regex).
Consider using location.hash, which holds just the hash fragment at the end of the URL: "#42".
You can write:
var id = location.hash.substring(1);
Edit: See Kobi's answer. If you really are using the hash part of things, just use location.hash! (To self: Doh!)
But I'll leave the below in case you're doing something more complex than your example suggests.
Original answer:
As the others have said, you've left out the global flag in your replacement. But I'm worried about the expression, it's really fragile. Consider: www.37signals.com#42: Your resulting numeric string will be 3742, which probably isn't what you want. Other examples: www.blablabla.ru/user/4#3 (43), www2.blablabla.ru#3 (23), ...
How 'bout:
var m = window.location.href.match(/#(\d+)/);
var id = m ? m[1] : undefined;
...which gets you the contiguous set of digits immediately following the hash mark (or undefined if there aren't any; match() itself returns null in that case, hence the guard).
Use the g flag, as in /\D/g, to replace all instances globally:
var id = window.location.href.replace(/\D/g, '' );
alert(id);
According to Justin Johnson, /\D+/ performs better than /\D/g, which I think is because \D+ can match and replace a whole run of non-digits in one shot. Without the g flag it only replaces the first run, which is enough for a URL like www.blablabla.ru#3.
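The differences between these variants are easy to see on the 37signals example mentioned earlier. This is just an illustrative sketch on a plain string rather than on window.location:

```javascript
var href = "www.37signals.com#42"; // sample value from the answer above

var firstOnly = href.replace(/\D/, "");   // removes only the first non-digit
var allDigits = href.replace(/\D/g, "");  // removes every non-digit, merging 37 and 42
var m = href.match(/#(\d+)/);             // digits right after the hash mark
var hashId = m ? m[1] : null;

console.log(firstOnly, allDigits, hashId);
```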
